This week I was setting up a Sitecore application on my computer for my current project and found that the application was hanging. After setting it up with the help of Sitecore Instance Manager (SIM), I launched the application and the browser just kept spinning indefinitely. We use SIM all the time because it takes care of the headache of installing a Sitecore application, and it works most of the time without any problem. This time, though, SIM had failed to add permissions for the ‘NETWORK SERVICE’ account to the website folders, and it didn’t show any error. The missing permission turned out to be the problem. The Sitecore version I used was 7.2 rev. 140228.
Not knowing what the actual problem was, I decided to do some debugging. I launched the application and took a memory dump using ProcDump. Visual Studio is a good place to start dump analysis. I really like debugging in Visual Studio because of its summary reports and the tabular/pictorial way it presents debug information. I opened the memory dump from the File menu (File->Open->File...) and selected the .dmp file. Visual Studio then shows a summary report, including which modules are loaded.
To see what was going on in the application, I needed the thread report. I ran ‘Debug Managed Memory’ and opened the Parallel Stacks window from the Debug->Windows->Parallel Stacks menu. It showed 27 active managed threads: one of them was sitting in Thread.Sleep and the other 26 were waiting on that thread (thread id = 6588).
Next I needed WinDbg to look at the CLR stack of thread 6588 and see what that thread was doing. Here is the CLR stack of that thread. It looks like Sitecore search was trying to obtain a lock from the CreateDirectory() method.
0:006> !clrstack -n
OS Thread Id: 0x19bc (6)
IP Call Site
[HelperMethodFrame: 000000187835b5d8] System.Threading.Thread.SleepInternal(Int32)
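A small aside on matching thread ids between the two tools: Parallel Stacks reported the thread as 6588 (decimal), while WinDbg prints the OS thread id in hexadecimal (0x19bc). The Python snippet below is only a quick sanity check that both refer to the same thread; the helper name `same_thread` is made up for illustration.

```python
# Thread id as reported by Visual Studio's Parallel Stacks window (decimal).
vs_thread_id = 6588

# OS thread id as printed by WinDbg in "OS Thread Id: 0x19bc (6)" (hexadecimal).
windbg_os_thread_id = "0x19bc"


def same_thread(decimal_id: int, hex_id: str) -> bool:
    """Return True when a decimal id and a hex id refer to the same thread."""
    return decimal_id == int(hex_id, 16)


print(hex(vs_thread_id))                               # 0x19bc
print(same_thread(vs_thread_id, windbg_os_thread_id))  # True
```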
I loaded the Sitecore and Lucene assemblies in a decompiler to check the methods in the stack above.
The CreateDirectory method instantiates Sitecore.Search.IndexLocker:
using (new IndexLocker(fsDirectory.MakeLock("write.lock")))
The IndexLocker constructor calls the Lucene.Net.Store.Lock.Obtain method, passing int.MaxValue as the lock wait time, which, in milliseconds, is about 597 hours (almost 25 days). The code in the Obtain method waits that long until either it acquires the lock or the lock-acquiring code returns a failureReason.
if (!this.lockInstance.Obtain((long) int.MaxValue))
    throw new Exception("Could not obtain lock " + (object) this.lockInstance);
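To double-check the 597-hour figure, here is the arithmetic as a short Python snippet. It assumes the wait time is interpreted as milliseconds, which is what makes int.MaxValue come out at roughly 597 hours.

```python
INT_MAX_VALUE = 2**31 - 1  # value of int.MaxValue in .NET (2,147,483,647)

MS_PER_HOUR = 60 * 60 * 1000  # assumption: the lock wait time is in milliseconds

hours = INT_MAX_VALUE / MS_PER_HOUR
days = hours / 24

print(f"{hours:.1f} hours")  # 596.5 hours
print(f"{days:.1f} days")    # 24.9 days
```

So for all practical purposes the request sits there "forever": nobody is going to wait almost 25 days for a page to load.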
The Obtain method above calls the abstract Obtain() method, which lands in the implementation in the Sitecore.Search.FSLock class. This implementation calls the FSLock.FileLock constructor, which creates a Stream by calling File.Create. It looks like neither the Obtain() method nor the constructor bubbles up an error if one occurs. Obtain() simply returns false on any error, and that causes Lucene.Net.Store.Lock.Obtain to keep waiting, effectively forever (597 hours). So I wondered: was there an error in the FSLock.FileLock constructor? I went back to WinDbg and dumped all the exceptions on the heap.
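To make the failure mode easier to see, here is a minimal Python sketch of the pattern described above: a low-level "try to lock" call that swallows the error and returns false, wrapped in a polling loop that keeps sleeping and retrying until the wait time runs out. This is not the decompiled Sitecore or Lucene code. The names `try_obtain` and `obtain`, the 1000 ms poll interval, and the injected `sleep` function are all assumptions for illustration.

```python
from typing import Callable, Optional, Tuple

POLL_INTERVAL_MS = 1000  # assumed poll interval, not taken from the decompiled code


def try_obtain(path: str) -> Tuple[bool, Optional[Exception]]:
    """Hypothetical stand-in for the file-based lock: the error is swallowed."""
    try:
        # Simulates File.Create failing because the account has no access.
        raise PermissionError(f"Access to the path '{path}' is denied.")
    except PermissionError as error:
        return False, error  # the caller only learns "not locked"


def obtain(path: str, wait_ms: int, sleep: Callable[[float], None]) -> bool:
    """Poll try_obtain until it succeeds or wait_ms has been used up."""
    locked, failure_reason = try_obtain(path)
    max_polls = wait_ms // POLL_INTERVAL_MS
    polls = 0
    while not locked:
        if polls >= max_polls:
            raise TimeoutError(f"Could not obtain lock {path}") from failure_reason
        polls += 1
        sleep(POLL_INTERVAL_MS / 1000)  # this is where the thread sits in Sleep
        locked, failure_reason = try_obtain(path)
    return locked


# Use a tiny wait time and a fake sleep so the demo finishes immediately.
sleeps = []
timeout_error = None
try:
    obtain("write.lock", 3000, sleeps.append)
except TimeoutError as error:
    timeout_error = error

print(len(sleeps))                   # 3
print(repr(timeout_error.__cause__))
```

With a wait time of int.MaxValue instead of 3000, the same loop would keep sleeping for weeks before the underlying access error ever surfaced, which matches the thread sitting in Thread.Sleep in the dump.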
0:006> !dumpheap -type Exception
Address MT Size
000000187897f370 00007ffa8ebcced8 64
0000001a78a300c0 00007ffa8ebcced8 64
0000001a78a3aa38 00007ffa8ebcced8 64
0000001a78a3aab8 00007ffa8ebcced8 64
0000001a792e03e8 00007ffa8ebd1e58 160
0000001a792e1290 00007ffa8ebd1e58 160
0000001a792e2138 00007ffa8ebd1e58 160
0000001a792e3170 00007ffa8ebd1e58 160
0000001b78871048 00007ffa8ebbfe70 160
0000001b788710e8 00007ffa8ebc0058 160
0000001b78871188 00007ffa8ebc00d0 160
0000001b78871228 00007ffa8ebc0148 160
0000001b788712c8 00007ffa8ebc01c0 160
0000001b78871368 00007ffa8ebc01c0 160
0000001b78874ca0 00007ffa8ebcced8 64
0000001b7887b3d0 00007ffa8ebee550 32
0000001b78884fe0 00007ffa8ebcc120 24
0000001b78885010 00007ffa8ebcc198 24
0000001b788aa990 00007ffa8ebcc120 24
0000001b788aa9c0 00007ffa8ebcc198 24
MT Count TotalSize Class Name
00007ffa8ebee550 1 32 System.Collections.Generic.KeyValuePair`2[[System.String, mscorlib],[System.Runtime.ExceptionServices.ExceptionDispatchInfo, mscorlib]]
00007ffa8ebcc198 2 48 System.Text.DecoderExceptionFallback
00007ffa8ebcc120 2 48 System.Text.EncoderExceptionFallback
00007ffa8ebc0148 1 160 System.ExecutionEngineException
00007ffa8ebc00d0 1 160 System.StackOverflowException
00007ffa8ebc0058 1 160 System.OutOfMemoryException
00007ffa8ebbfe70 1 160 System.Exception
00007ffa8ebcced8 5 320 System.UnhandledExceptionEventHandler
00007ffa8ebc01c0 2 320 System.Threading.ThreadAbortException
00007ffa8ebd1e58 4 640 System.UnauthorizedAccessException
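Most of this list is noise. As far as I know, the CLR pre-allocates a handful of exception objects at startup (System.Exception, OutOfMemoryException, StackOverflowException, ExecutionEngineException, ThreadAbortException), and the type filter also matches non-exception classes whose names merely contain "Exception". A rough Python sketch for filtering the statistics section could look like this; the function name and the pre-allocated list are my own assumptions, not WinDbg features.

```python
# Statistics section copied from the !dumpheap -type Exception output above.
STATS = r"""
00007ffa8ebee550 1 32 System.Collections.Generic.KeyValuePair`2[[System.String, mscorlib],[System.Runtime.ExceptionServices.ExceptionDispatchInfo, mscorlib]]
00007ffa8ebcc198 2 48 System.Text.DecoderExceptionFallback
00007ffa8ebcc120 2 48 System.Text.EncoderExceptionFallback
00007ffa8ebc0148 1 160 System.ExecutionEngineException
00007ffa8ebc00d0 1 160 System.StackOverflowException
00007ffa8ebc0058 1 160 System.OutOfMemoryException
00007ffa8ebbfe70 1 160 System.Exception
00007ffa8ebcced8 5 320 System.UnhandledExceptionEventHandler
00007ffa8ebc01c0 2 320 System.Threading.ThreadAbortException
00007ffa8ebd1e58 4 640 System.UnauthorizedAccessException
"""

# Assumption: these are created by the runtime itself and are usually not a clue.
PREALLOCATED = {
    "System.Exception",
    "System.OutOfMemoryException",
    "System.StackOverflowException",
    "System.ExecutionEngineException",
    "System.Threading.ThreadAbortException",
}


def interesting_exceptions(stats_text: str) -> dict:
    """Return {class name: count} for exception types worth a closer look."""
    found = {}
    for line in stats_text.strip().splitlines():
        # Columns: MT, Count, TotalSize, Class Name (the name may contain spaces).
        _mt, count, _total_size, class_name = line.split(None, 3)
        if class_name.endswith("Exception") and class_name not in PREALLOCATED:
            found[class_name] = int(count)
    return found


print(interesting_exceptions(STATS))  # {'System.UnauthorizedAccessException': 4}
```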
One exception type caught my eye: UnauthorizedAccessException. Further digging showed that this error occurred in the FileLock constructor. It looks like there was an Access Denied error on the index’s write.lock file:
0:006> !pe 0000001a792e03e8
Exception object: 0000001a792e03e8
Exception type: System.UnauthorizedAccessException
Message: Access to the path 'C:\inetpub\wwwroot\MyPracticeSite72\Data\indexes\sitecore_core_index\lucene-2ff1b7a3c092e73366175703b8ae9146-write.lock' is denied.
The security settings of C:\inetpub\wwwroot\MyPracticeSite72\Data\indexes\sitecore_core_index showed that the ‘NETWORK SERVICE’ account didn’t have any permission to access this folder. As soon as I granted the ‘NETWORK SERVICE’ account proper access, the website started working.
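In hindsight, a quick write probe against the index folder would have surfaced the problem much faster than a memory dump. Below is a minimal Python sketch of such a probe: it tries to create and delete a throwaway file in a folder, much like the lock code tries to create write.lock. The function name `can_create_file` is made up, and the check only means something when it runs under the same account as the application pool (here ‘NETWORK SERVICE’), which is an assumption about how you would run it.

```python
import os
import tempfile
import uuid
from typing import Optional, Tuple


def can_create_file(directory: str) -> Tuple[bool, Optional[str]]:
    """Try to create and remove a throwaway file; return (ok, error message)."""
    probe_path = os.path.join(directory, f"probe-{uuid.uuid4().hex}.tmp")
    try:
        # Mode "x" fails if the file already exists, so nothing is overwritten.
        with open(probe_path, "x"):
            pass
        os.remove(probe_path)
    except OSError as error:  # includes PermissionError and FileNotFoundError
        return False, str(error)
    return True, None


# Demo against a temporary folder instead of a real index folder.
with tempfile.TemporaryDirectory() as writable_dir:
    ok_result = can_create_file(writable_dir)
    missing_result = can_create_file(os.path.join(writable_dir, "does-not-exist"))

print(ok_result)          # (True, None)
print(missing_result[0])  # False
```

Pointing the function at the index folder from the exception message, while running as the application pool account, should report the same "access denied" error without any debugger.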