Thursday, February 19, 2009

SnTT - NIF means NOT INDEXING FULLY

Sometimes I need to be more granular than I am when troubleshooting problems.
I don't like to be because I feel it inhibits my ability to grok the total issue.
But some servers just demand it.

Admittedly NIF stands for Notes Indexing Facility. But my explanation is more accurate when it is not functioning.

So I ran DCT, the excellent Domino Configuration Tuner, on the client's Domino Windows server running 7.0.3 FP1, after temporarily installing the 8.5 client on the server.

It of course told me the basics I already know: multithread the full-text index and tune various other notes.ini settings, like:
UPDATE_FULLTEXT_THREAD=1
SERVER_NAME_LOOKUP_NO_UPDATE=1
DEBUG_ENABLE_UPDATE_FIX=8191
NSF_BACKUP_MEMORY_CONSTRAINED=1
NSF_BACKUP_MEMORY_LIMIT=104857600

and my personal favorite: view_rebuild_dir=e:\lotustemp\

This last line lets you specify where the temporary space required for view rebuilds is located. That is always helpful when one of the hard drives is running low, which can lead to other issues.
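For anyone who wants to check a notes.ini against a list like this without eyeballing it, here is a rough sketch in Python. The setting names and values are the ones from this post. The function names are made up for illustration, and I am assuming notes.ini keys are not case sensitive and that lines starting with a semicolon can be skipped. It only reports; it changes nothing.

```python
# Settings and values as listed in this post; adjust for your own server.
RECOMMENDED = {
    "UPDATE_FULLTEXT_THREAD": "1",
    "SERVER_NAME_LOOKUP_NO_UPDATE": "1",
    "DEBUG_ENABLE_UPDATE_FIX": "8191",
    "NSF_BACKUP_MEMORY_CONSTRAINED": "1",
    "VIEW_REBUILD_DIR": "e:\\lotustemp\\",
}


def parse_ini(text):
    """Return a dict of KEY -> value from notes.ini style text.

    Assumption: keys are compared case-insensitively, so they are upper-cased.
    Lines without '=' (such as the [Notes] header) and ';' lines are skipped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip().upper()] = value.strip()
    return settings


def audit(text, recommended=RECOMMENDED):
    """Report recommended settings that are missing or set differently."""
    current = parse_ini(text)
    report = {}
    for key, wanted in recommended.items():
        actual = current.get(key)
        if actual is None:
            report[key] = "missing"
        elif actual.lower() != wanted.lower():
            report[key] = f"differs: {actual}"
    return report


if __name__ == "__main__":
    sample = "[Notes]\nUpdate_Fulltext_Thread=1\nDEBUG_ENABLE_UPDATE_FIX=0\n"
    for name, status in audit(sample).items():
        print(name, "->", status)
```

To run it against a real server, read the file contents into a string first and pass that to audit(); the path to notes.ini depends on your install, so I have left it out.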

The bottom line is that the problem still occurs, and it seems to be a stuck agent. I had suspected, and DCT confirmed, that web agents were not running concurrently, so I am trying that change too.

Enable agents to run asynchronously through the Server document -> Internet Protocols -> Domino Web Engine tab. Under Web Agents, enable the "Run Web Agents Concurrently" setting.

I also found a badly corrupted file, which turned out to be a bad restored file anyway.
Cleared it all out, and what happened? A dead server in 2 days instead of 1. Improvement, yes; fixed, no.

Called IBM Support; I had already put in more than my usual effort and was tired of trying.

They suggested adding the line ftg_use_sys_memory=1, which came out in 5.0.9 (link to the technote).
This will instruct the full-text (FT) engine to use direct malloc calls from the operating system for large allocations, instead of requesting memory through the Notes Memory Manager.


And we wait to see if this indeed fixes it, as IBM suggested. Update, late Wed. night: it did not resolve it, although the server hung on longer.

The issue seems to occur when users are logging in during the morning, so I wonder about peak load or utilization.