Author Topic: BSOD 0x7e System Thread Exception Not Handled  (Read 4495 times)

Offline TheKLF

  • Newbie
  • *
  • Posts: 24
  • Karma: +1/-0
    • View Profile
BSOD 0x7e System Thread Exception Not Handled
« on: January 02, 2015, 11:25:03 am »
Since upgrading to the latest release of tRAID, I have been getting 0x7e BSODs about once a week.  The server is never rebooted on purpose, and it writes to the storage pool regularly (averaging about 200GB per day).  Specs are:

Windows Home Server 2011
Intel i3-3240
8GB RAM
Z77 motherboard with a combination of 2 LSI controllers (Dell H200 and LSI 1068e) and the onboard Intel controllers
22 DRUs and 2 PPUs (12 in the server and 12 in a SAS expander)
Storage Pooling enabled - no separately mounted drives
Everything is on a pure sine wave UPS
I am using low-voltage memory (1.35V), which has run 72 hours of memtest and 24 hours of prime with no errors.
I am not seeing any SMART errors on my drives.
I use IE to manage the RAID - I only log into the management console to make changes and then log out / close the browser.
I have been searching the forums fairly regularly but have only seen a few posts on BSODs, all of which are fairly old now.

I purchased my license in September, and at that time I was also seeing BSODs on the then-current release, so I rolled back to the (July?) release and was stable until December 8th, when I got the first of the current batch of BSODs.  I upgraded to the latest release (11.16 / 10.29) and have been getting weekly BSODs since.  I haven't tried downgrading again because I wanted to see whether there was some way to determine the cause of the errors, but I have been unsuccessful, so I am hoping for some help.

I have attached my RAID config, as well as the tRAID logs and minidumps.  I probably made a mistake in clearing flexraid.NZFS.log: it seemed to contain only entries from August / September 6, and I was hoping to see some new info, but the new log only seems to reflect the service starting, which happens after the BSOD / reboot.

Memory usage seems to be normal, and the broker service never crashes.  The only thing I have observed is that the BSOD always happens when writing to the storage pool.  It has never BSOD'd from just idling or reading.  I am typically writing 10-25GB files.  Please let me know what other info I can provide to help troubleshoot, as I am sure I missed something.

Thanks in advance!

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,547
  • Karma: +204/-16
    • View Profile
Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #1 on: January 03, 2015, 12:20:51 am »
You will need to analyze your minidumps: http://www.osronline.com/page.cfm?name=analyze

Offline TheKLF

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #2 on: January 03, 2015, 05:56:12 pm »
Here is the most recent:

Crash Dump Analysis provided by OSR Open Systems Resources, Inc. (http://www.osr.com)
Online Crash Dump Analysis Service
See http://www.osronline.com for more information
Windows 7 Kernel Version 7601 (Service Pack 1) MP (4 procs) Free x64
Product: Server, suite: TerminalServer SingleUserTS WHServer
Built by: 7601.18409.amd64fre.win7sp1_gdr.140303-2144
Machine Name:
Kernel base = 0xfffff800`02210000 PsLoadedModuleList = 0xfffff800`02453890
Debug session time: Fri Jan  2 00:07:32.697 2015 (UTC - 5:00)
System Uptime: 6 days 14:12:05.838
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M (1000007e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003.  This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG.  This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG.  This will let us see why this breakpoint is
happening.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff88003832e40, The address that the exception occurred at
Arg3: fffff88017b99868, Exception Record Address
Arg4: fffff88017b990c0, Context Record Address

Debugging Details:
------------------

TRIAGER: Could not open triage file : e:\dump_analysis\program\triage\modclass.ini, error 2

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

FAULTING_IP:
NZFSD+32e40
fffff880`03832e40 480fc311        movnti  qword ptr [rcx],rdx

EXCEPTION_RECORD:  fffff88017b99868 -- (.exr 0xfffff88017b99868)
ExceptionAddress: fffff88003832e40 (NZFSD+0x0000000000032e40)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000001
   Parameter[1]: 0000000000000000
Attempt to write to address 0000000000000000

CONTEXT:  fffff88017b990c0 -- (.cxr 0xfffff88017b990c0)
rax=0000000000000000 rbx=fffffa8017c0f2b0 rcx=0000000000000000
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000001000400
rip=fffff88003832e40 rsp=fffff88017b99aa8 rbp=fffffa8013157bf0
 r8=0000000000000000  r9=00000000007001c0 r10=fffffa8006815180
r11=fffff88017b998a0 r12=0000000000000001 r13=00000020729c1000
r14=0000000000000000 r15=0000000001000000
iopl=0         nv up ei pl nz na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010206
NZFSD+0x32e40:
fffff880`03832e40 480fc311        movnti  qword ptr [rcx],rdx ds:002b:00000000`00000000=????????????????
Resetting default scope

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  NULL_DEREFERENCE

PROCESS_NAME:  System

CURRENT_IRQL:  0

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

EXCEPTION_PARAMETER1:  0000000000000001

EXCEPTION_PARAMETER2:  0000000000000000

WRITE_ADDRESS: GetPointerFromAddress: unable to read from fffff800024bd100
GetUlongFromAddress: unable to read from fffff800024bd1c0
 0000000000000000 Nonpaged pool

FOLLOWUP_IP:
NZFSD+32e40
fffff880`03832e40 480fc311        movnti  qword ptr [rcx],rdx

BUGCHECK_STR:  0x7E

LAST_CONTROL_TRANSFER:  from fffff8800380903f to fffff88003832e40

STACK_TEXT: 
fffff880`17b99aa8 fffff880`0380903f : 00000000`00000001 fffffa80`17c0f2b0 00000000`4c414342 fffffa80`00000000 : NZFSD+0x32e40
fffff880`17b99ab0 00000000`00000001 : fffffa80`17c0f2b0 00000000`4c414342 fffffa80`00000000 00000000`01000400 : NZFSD+0x903f
fffff880`17b99ab8 fffffa80`17c0f2b0 : 00000000`4c414342 fffffa80`00000000 00000000`01000400 fffff880`0381d8fc : 0x1
fffff880`17b99ac0 00000000`4c414342 : fffffa80`00000000 00000000`01000400 fffff880`0381d8fc fffffa80`09bae6e0 : 0xfffffa80`17c0f2b0
fffff880`17b99ac8 fffffa80`00000000 : 00000000`01000400 fffff880`0381d8fc fffffa80`09bae6e0 00000000`00000000 : 0x4c414342
fffff880`17b99ad0 00000000`01000400 : fffff880`0381d8fc fffffa80`09bae6e0 00000000`00000000 fffff800`02194080 : 0xfffffa80`00000000
fffff880`17b99ad8 fffff880`0381d8fc : fffffa80`09bae6e0 00000000`00000000 fffff800`02194080 00000000`00000000 : 0x1000400
fffff880`17b99ae0 fffffa80`09bae6e0 : 00000000`00000000 fffff800`02194080 00000000`00000000 fffff880`00000000 : NZFSD+0x1d8fc
fffff880`17b99ae8 00000000`00000000 : fffff800`02194080 00000000`00000000 fffff880`00000000 fffffa80`0a6619f0 : 0xfffffa80`09bae6e0


SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  NZFSD+32e40

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: NZFSD

IMAGE_NAME:  NZFSD.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  5468c8e7

STACK_COMMAND:  .cxr 0xfffff88017b990c0 ; kb

FAILURE_BUCKET_ID:  X64_0x7E_NZFSD+32e40

BUCKET_ID:  X64_0x7E_NZFSD+32e40

Followup: MachineOwner
---------
« Last Edit: January 03, 2015, 06:20:32 pm by TheKLF »
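
A side note on reading the dump above: the DEBUG_FLR_IMAGE_TIMESTAMP field is the link timestamp of NZFSD.sys, stored as hexadecimal seconds since 1970-01-01 UTC. Decoding it is a quick way to check which driver build was loaded at the time of the crash. A minimal Python sketch (the function name is just illustrative):

```python
from datetime import datetime, timezone


def image_timestamp_to_utc(hex_stamp):
    """Convert a hex image timestamp (seconds since the Unix epoch) to a UTC datetime."""
    return datetime.fromtimestamp(int(hex_stamp, 16), tz=timezone.utc)


# Value copied from the DEBUG_FLR_IMAGE_TIMESTAMP line in the dump
built = image_timestamp_to_utc("5468c8e7")
print(built.isoformat())
```

This decodes to 16 November 2014, which would be consistent with the driver coming from the release labelled 11.16 in the first post, although that mapping is an assumption and not something the dump itself confirms.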

Offline TheKLF

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #3 on: January 03, 2015, 05:58:16 pm »
Here is the one just prior to the most recent:
Crash Dump Analysis provided by OSR Open Systems Resources, Inc. (http://www.osr.com)
Online Crash Dump Analysis Service
See http://www.osronline.com for more information
Windows 7 Kernel Version 7601 (Service Pack 1) MP (4 procs) Free x64
Product: Server, suite: TerminalServer SingleUserTS WHServer
Built by: 7601.18409.amd64fre.win7sp1_gdr.140303-2144
Machine Name:
Kernel base = 0xfffff800`0225a000 PsLoadedModuleList = 0xfffff800`0249d890
Debug session time: Fri Dec 26 02:48:32.755 2014 (UTC - 5:00)
System Uptime: 5 days 1:35:23.755
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M (1000007e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003.  This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG.  This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG.  This will let us see why this breakpoint is
happening.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff8800469be40, The address that the exception occurred at
Arg3: fffff88017b64868, Exception Record Address
Arg4: fffff88017b640c0, Context Record Address

Debugging Details:
------------------

TRIAGER: Could not open triage file : e:\dump_analysis\program\triage\modclass.ini, error 2

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

FAULTING_IP:
NZFSD+32e40
fffff880`0469be40 480fc311        movnti  qword ptr [rcx],rdx

EXCEPTION_RECORD:  fffff88017b64868 -- (.exr 0xfffff88017b64868)
ExceptionAddress: fffff8800469be40 (NZFSD+0x0000000000032e40)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000001
   Parameter[1]: 0000000000000000
Attempt to write to address 0000000000000000

CONTEXT:  fffff88017b640c0 -- (.cxr 0xfffff88017b640c0)
rax=0000000000000000 rbx=fffffa80180fe7e0 rcx=0000000000000000
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000001003e00
rip=fffff8800469be40 rsp=fffff88017b64aa8 rbp=fffffa800a641840
 r8=0000000000000000  r9=0000000000701b20 r10=fffffa8006814ac8
r11=fffff88017b648a0 r12=0000000000000001 r13=00000073d73d7000
r14=0000000000000000 r15=0000000001000000
iopl=0         nv up ei pl nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
NZFSD+0x32e40:
fffff880`0469be40 480fc311        movnti  qword ptr [rcx],rdx ds:002b:00000000`00000000=????????????????
Resetting default scope

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  NULL_DEREFERENCE

PROCESS_NAME:  System

CURRENT_IRQL:  0

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

EXCEPTION_PARAMETER1:  0000000000000001

EXCEPTION_PARAMETER2:  0000000000000000

WRITE_ADDRESS: GetPointerFromAddress: unable to read from fffff80002507100
GetUlongFromAddress: unable to read from fffff800025071c0
 0000000000000000 Nonpaged pool

FOLLOWUP_IP:
NZFSD+32e40
fffff880`0469be40 480fc311        movnti  qword ptr [rcx],rdx

BUGCHECK_STR:  0x7E

LAST_CONTROL_TRANSFER:  from fffff8800467203f to fffff8800469be40

STACK_TEXT: 
fffff880`17b64aa8 fffff880`0467203f : 00000000`00000001 fffffa80`180fe7e0 00000000`4c414342 fffffa80`00000000 : NZFSD+0x32e40
fffff880`17b64ab0 00000000`00000001 : fffffa80`180fe7e0 00000000`4c414342 fffffa80`00000000 00000000`01003e00 : NZFSD+0x903f
fffff880`17b64ab8 fffffa80`180fe7e0 : 00000000`4c414342 fffffa80`00000000 00000000`01003e00 fffff880`046868fc : 0x1
fffff880`17b64ac0 00000000`4c414342 : fffffa80`00000000 00000000`01003e00 fffff880`046868fc fffffa80`0a5ae010 : 0xfffffa80`180fe7e0
fffff880`17b64ac8 fffffa80`00000000 : 00000000`01003e00 fffff880`046868fc fffffa80`0a5ae010 00000000`00003000 : 0x4c414342
fffff880`17b64ad0 00000000`01003e00 : fffff880`046868fc fffffa80`0a5ae010 00000000`00003000 00000000`00000000 : 0xfffffa80`00000000
fffff880`17b64ad8 fffff880`046868fc : fffffa80`0a5ae010 00000000`00003000 00000000`00000000 fffffa80`0aa60700 : 0x1003e00
fffff880`17b64ae0 fffffa80`0a5ae010 : 00000000`00003000 00000000`00000000 fffffa80`0aa60700 fffff880`00000000 : NZFSD+0x1d8fc
fffff880`17b64ae8 00000000`00003000 : 00000000`00000000 fffffa80`0aa60700 fffff880`00000000 fffffa80`00000001 : 0xfffffa80`0a5ae010
fffff880`17b64af0 00000000`00000000 : fffffa80`0aa60700 fffff880`00000000 fffffa80`00000001 00000001`00000000 : 0x3000


SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  NZFSD+32e40

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: NZFSD

IMAGE_NAME:  NZFSD.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  5468c8e7

STACK_COMMAND:  .cxr 0xfffff88017b640c0 ; kb

FAILURE_BUCKET_ID:  X64_0x7E_NZFSD+32e40

BUCKET_ID:  X64_0x7E_NZFSD+32e40

Followup: MachineOwner
---------
« Last Edit: January 03, 2015, 06:21:42 pm by TheKLF »
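
The two dumps show different absolute fault addresses, but that appears to be only because the driver was loaded at a different base address after each reboot. Subtracting the module offset from the exception address gives the load base, and the offset itself is identical in both. A small Python sketch of that check, using the two ExceptionAddress lines copied from the dumps (the function and variable names are illustrative):

```python
import re

# Matches lines such as:
#   ExceptionAddress: fffff88003832e40 (NZFSD+0x0000000000032e40)
FAULT_RE = re.compile(
    r"ExceptionAddress:\s*([0-9a-f]+)\s*\((\w+)\+0x([0-9a-f]+)\)",
    re.IGNORECASE,
)


def fault_location(dump_text):
    """Return (module, offset, load_base) for the first ExceptionAddress line."""
    match = FAULT_RE.search(dump_text)
    if match is None:
        raise ValueError("no ExceptionAddress line found")
    address = int(match.group(1), 16)
    offset = int(match.group(3), 16)
    return match.group(2), offset, address - offset


dump_jan2 = "ExceptionAddress: fffff88003832e40 (NZFSD+0x0000000000032e40)"
dump_dec26 = "ExceptionAddress: fffff8800469be40 (NZFSD+0x0000000000032e40)"

for label, text in (("Jan 2", dump_jan2), ("Dec 26", dump_dec26)):
    module, offset, base = fault_location(text)
    print(f"{label}: {module}+{offset:#x}, module base {base:#x}")
```

Both crashes land on the same instruction (NZFSD+0x32e40, a write through a null rcx), so this looks like one repeatable failure path and not random corruption.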

Offline Brahim

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #4 on: January 04, 2015, 11:35:09 am »
Thanks.

I will take a deeper look at the dump within the next few days.

Offline TheKLF

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #5 on: January 10, 2015, 10:05:07 am »
Just an update on this - I am now getting blue screens about every other day.  To rule out the RAM, I replaced it with another 8GB set that I tested for 72 hours, and after 1 day I received another blue screen.

One interesting thing I have observed is that OS caching seems to be enabled even though I have it set to disabled.  As I recall, with OS caching disabled I should see a steady copy rate of about 50MB/s (depending on how full the disk is, etc.), with no caching to RAM first.  However, with the latest release and both OS caching and TCQ disabled, it still caches and starts with a copy speed in the 200+MB/s range.  I can use Task Manager / Resource Monitor to watch memory fill up by 2-3GB.  I feel like this is related, since I only get the BSOD when copying to the array.

Any luck with the minidumps?

Offline Brahim

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #6 on: January 10, 2015, 02:23:18 pm »
Hi, I haven't had the time to analyze the dump yet. Please upload all the minidumps you have somewhere and I will analyze them to get a more consistent picture of the issue.
Thanks.

Offline TheKLF

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #7 on: January 10, 2015, 03:20:12 pm »
I attached them to this post - please let me know if you are not able to open them and I will look for another way to link to them...

Thanks!

Offline TheKLF

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #8 on: February 15, 2015, 05:50:25 pm »
Since posting this thread, I have been keeping a much closer eye on what is going on, and it seems clear to me that my problem is directly related to OS caching and running out of memory.  Using Task Manager to monitor memory usage, if I start with less than 3GB in use, I am typically safe.  Memory usage will climb to between 6 and 7GB and slowly start decreasing as it flushes the data back to the array.  However, if I have over 4GB in use, the caching continues to climb until it overflows my available RAM (8GB), at which point the machine blue screens.  I have therefore been able to mostly avoid blue screens by rebooting to flush memory and by shutting down all other programs before attempting to copy files to the array, but it is getting tiresome.

How can I get OS caching to turn off?  Do I need to delete and recreate the array?  Do I also need to uninstall and reinstall?  The OS caching setting has no effect.  Why is OS caching exceeding the available RAM in my system?

Offline Brahim

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #9 on: February 15, 2015, 06:49:05 pm »
Indeed, looking through the dumps, it appears that it runs out of memory at some point.
Try reducing the CQ Depth to 16 and see if that helps.

Offline TheKLF

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #10 on: February 16, 2015, 02:10:00 pm »
I set the CQ depth down and will let you know how it goes.  Thanks.

Offline TheKLF

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #11 on: March 06, 2015, 10:58:24 pm »
With CQ depth lowered, I am back to about a week between crashes, but I am still getting blue screens as it runs out of memory.  However, I have scheduled a weekly reboot, since that seems to help with memory usage, so I am hoping I have a workaround for now.

Offline Brahim

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #12 on: March 07, 2015, 04:14:03 am »
Are you monitoring your overall system memory usage?
Try monitoring your system resources and correlating them with the crashes.
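
One possible way to do that correlation is to log physical memory usage to a CSV file at a fixed interval and compare the last rows against the crash time in each minidump. The sketch below is only an illustration, not part of tRAID: `log_memory` and `windows_memory_sample` are made-up names, the sampler uses the Win32 `GlobalMemoryStatusEx` call through ctypes, and the file path and interval are assumptions to adjust.

```python
import csv
import ctypes
import os
import time
from datetime import datetime


class MEMORYSTATUSEX(ctypes.Structure):
    # Field layout of the Win32 MEMORYSTATUSEX structure
    _fields_ = [
        ("dwLength", ctypes.c_ulong),
        ("dwMemoryLoad", ctypes.c_ulong),
        ("ullTotalPhys", ctypes.c_ulonglong),
        ("ullAvailPhys", ctypes.c_ulonglong),
        ("ullTotalPageFile", ctypes.c_ulonglong),
        ("ullAvailPageFile", ctypes.c_ulonglong),
        ("ullTotalVirtual", ctypes.c_ulonglong),
        ("ullAvailVirtual", ctypes.c_ulonglong),
        ("ullAvailExtendedVirtual", ctypes.c_ulonglong),
    ]


def windows_memory_sample():
    """Return (used_mb, total_mb) of physical memory. Windows only."""
    status = MEMORYSTATUSEX()
    status.dwLength = ctypes.sizeof(MEMORYSTATUSEX)
    if not ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(status)):
        raise ctypes.WinError()
    total_mb = status.ullTotalPhys // (1024 * 1024)
    avail_mb = status.ullAvailPhys // (1024 * 1024)
    return total_mb - avail_mb, total_mb


def log_memory(sample, path, interval_s=60, samples=None, sleep=time.sleep):
    """Append timestamped (used_mb, total_mb) rows to a CSV file.

    Runs until `samples` rows are written, or forever if `samples` is None.
    """
    written = 0
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        while samples is None or written < samples:
            used_mb, total_mb = sample()
            stamp = datetime.now().isoformat(timespec="seconds")
            writer.writerow([stamp, used_mb, total_mb])
            # Push each row out to disk so it is less likely to be lost in a bugcheck
            f.flush()
            os.fsync(f.fileno())
            written += 1
            if samples is None or written < samples:
                sleep(interval_s)
    return written


# On the server (hypothetical path):
#   log_memory(windows_memory_sample, r"C:\memory_log.csv", interval_s=60)
```

The last few rows before each reboot should show whether memory in use was really approaching the 8GB limit when the machine went down.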

Offline TheKLF

Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #13 on: March 09, 2015, 05:37:20 pm »
Hi Brahim,

I have been monitoring it in a general sense: I have found that, regardless of the CQ depth, if I have over 4GB in use, memory overflows and the machine BSODs when copying a file over 4GB in size.  If I reboot, I can free a couple of gigs, and then copies use 4GB of RAM but don't overflow.  So my plan is to map the actual increase over time so I can see what is causing memory usage to climb from 2 to 4GB.  If it matters, I only use IE to manage the RAID and I don't leave it open.  I have not seen any dramatic usage increase by processes such as NZFSB.exe or nzfs.exe, and those never crash.  Most of my memory usage is from Media Browser.

One odd thing to note - I have noticed that if I open Google Chrome in the middle of a copy to the storage pool, it will usually cause the copy to fail.  The copy will stop progressing, memory will empty out, and it will throw an error.  I will post a screenshot next time I get it to happen.
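
For mapping the increase over time, one option is to save the output of `tasklist /fo csv` at intervals and compare two snapshots to see which processes grew the most. The Python sketch below assumes the English-locale column names ("Image Name", "Mem Usage") and values like "51,200 K". The sample snapshots and the process name example.exe are made up purely for illustration, and the function names are hypothetical.

```python
import csv
import io


def parse_tasklist_csv(text):
    """Map image name -> total memory in KB from `tasklist /fo csv` output."""
    usage = {}
    for row in csv.DictReader(io.StringIO(text)):
        # "Mem Usage" looks like "51,200 K"; strip the separator and unit
        kb = int(row["Mem Usage"].replace(",", "").replace("K", "").strip())
        name = row["Image Name"]
        usage[name] = usage.get(name, 0) + kb  # sum multiple instances
    return usage


def growth(before, after, top=5):
    """Return the largest per-process increases (in KB) between two snapshots."""
    deltas = {name: kb - before.get(name, 0) for name, kb in after.items()}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:top]


# Made-up snapshots in the tasklist CSV layout (not real measurements)
SAMPLE_BEFORE = '''"Image Name","PID","Session Name","Session#","Mem Usage"
"System","4","Services","0","1,024 K"
"nzfs.exe","1200","Services","0","51,200 K"
"example.exe","2400","Services","0","204,800 K"
'''
SAMPLE_AFTER = '''"Image Name","PID","Session Name","Session#","Mem Usage"
"System","4","Services","0","1,024 K"
"nzfs.exe","1200","Services","0","51,712 K"
"example.exe","2400","Services","0","1,048,576 K"
'''

before = parse_tasklist_csv(SAMPLE_BEFORE)
after = parse_tasklist_csv(SAMPLE_AFTER)
for name, delta_kb in growth(before, after):
    print(f"{name}: {delta_kb:+d} KB")
```

Note that memory held by the OS file cache or by a kernel driver is not attributed to any process in this listing, so a total that climbs without any single process growing would itself be a useful clue.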

Offline john9871

  • Newbie
  • *
  • Posts: 3
  • Karma: +0/-0
    • View Profile
Re: BSOD 0x7e System Thread Exception Not Handled
« Reply #14 on: March 06, 2016, 01:46:53 pm »
I have had this problem for two days now and I'm sick of it.
Did anyone find a solution?