[oi-dev] Fmd out of memory crash
Jon Tibble
meths at btinternet.com
Thu Sep 22 22:29:17 UTC 2011
Hi Steve,
I'm cross-posting this to the illumos developer list as I think you'll
find your target audience is probably there.
You don't seem to mention which version of OI you're using (unless I
missed it).
Regards,
Jon
On 22/09/2011 21:57, Steve Gonczi wrote:
> Greetings folks,
>
> I have recently seen some fmd crashes, where fmd runs out of memory
> and shuts down.
>
> That prompted me to look at the fmd code, and I am wondering whether
> this notification works the way the designer intended:
>
> From <http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/fm/fmd/common/fmd_module.c#632>:
>
> void
> fmd_module_gc(fmd_module_t *mp)
> {
> 	fmd_hdl_info_t *info;
> 	fmd_event_t *e;
>
> 	if (mp->mod_error != 0)
> 		return; /* do not do anything if the module has failed */
>
> 	fmd_module_lock(mp);
>
> 	if ((info = mp->mod_info) != NULL) {
> 		fmd_serd_hash_apply(&mp->mod_serds,
> 		    (fmd_serd_eng_f *)fmd_serd_eng_gc, NULL);
> 	}
>
> 	fmd_module_unlock(mp);
>
> !!!!!! The question is: what is this FMD_EVT_GC event supposed to signify?
> My sense is that it just records the fact that garbage collection
> was kicked off.
> If that is the case, it is clearly generated too many times,
> since fmd_module_gc() is a callback function, and
> fmd_modhash_apply() calls it repeatedly from a double loop.
> !!!!!!
>
> 	if (info != NULL) {
> 		e = fmd_event_create(FMD_EVT_GC, FMD_HRT_NOW, NULL, NULL);
> 		fmd_eventq_insert_at_head(mp->mod_queue, e);
> 	}
> }
>
>
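A toy model may help with the "double loop" question. This is only a
sketch, assuming fmd_modhash_apply() walks a chained hash table (outer
loop over buckets, inner loop over each chain); that assumption has not
been checked against the fmd source here. Every name in it (module_t,
modhash_t, module_gc, modhash_apply, modhash_insert, simulate_gc) is a
hypothetical stand-in, not an fmd API.

```c
#include <assert.h>
#include <stddef.h>

#define	NBUCKETS	4	/* hypothetical table size */

/* Hypothetical stand-in for a module; only what the sketch needs. */
typedef struct module {
	struct module *next;	/* next module on the same hash chain */
	int has_info;		/* models mp->mod_info != NULL */
	int gc_events;		/* models GC events queued for this module */
} module_t;

typedef struct modhash {
	module_t *buckets[NBUCKETS];
} modhash_t;

/* Callback: models the tail of the quoted function, one event per call. */
static void
module_gc(module_t *mp)
{
	if (mp->has_info)
		mp->gc_events++;
}

/* Assumed shape of the double loop: buckets outside, chain inside. */
static void
modhash_apply(modhash_t *mhp, void (*func)(module_t *))
{
	size_t i;
	module_t *mp;

	for (i = 0; i < NBUCKETS; i++) {
		for (mp = mhp->buckets[i]; mp != NULL; mp = mp->next)
			func(mp);
	}
}

/* Push a module onto the front of the given bucket's chain. */
static void
modhash_insert(modhash_t *mhp, module_t *mp, size_t bucket)
{
	assert(bucket < NBUCKETS);
	mp->next = mhp->buckets[bucket];
	mhp->buckets[bucket] = mp;
}

/* Run `passes` GC passes over three modules; return total events queued. */
int
simulate_gc(int passes)
{
	modhash_t mh = { { NULL } };
	module_t mods[3] = { { NULL, 1, 0 }, { NULL, 1, 0 }, { NULL, 0, 0 } };
	int i, total = 0;

	modhash_insert(&mh, &mods[0], 1);	/* two modules share a chain */
	modhash_insert(&mh, &mods[1], 1);
	modhash_insert(&mh, &mods[2], 3);	/* this one has no info */

	for (i = 0; i < passes; i++)
		modhash_apply(&mh, module_gc);

	for (i = 0; i < 3; i++)
		total += mods[i].gc_events;
	return (total);
}
```

In this model each module is visited once per pass, so simulate_gc(5)
counts 10 events for the two modules that have info. If the real code
behaves the same way, whether the queue grows would depend on how fast
each module drains its queue relative to the GC timer.
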
> The stack looks like:
>
> debugging core file of fmd (32-bit) from nrchbs-s1194
> file: /usr/lib/fm/fmd/fmd
> initial argv: /usr/lib/fm/fmd/fmd
> ..
> status: process terminated by SIGABRT (Abort), pid=533 uid=0 code=-1
> > ::stack
> ...
> fmd_vpanic+0x125(8086f5c, fdb4ee44, fdb4ee48, c)
> fmd_panic+0x12(8086f5c, c, 3e8, 8070c8c)
> fmd_alloc+0xc7(c)
> fmd_eventq_insert_at_head+0x2c(8413d88, 907df18, 0, feeb9c76)
> fmd_module_gc+0x6b(86eb440, 0, 1284885e, feec2c1f)
> fmd_modhash_apply+0x3c(83f6428, 8075fe4, fdb4ef48, 85fa248)
> fmd_gc+0x27(809d840)
> fmd_timerq_exec+0x109(85fa240, 0, 8, fef62440)
> ...
>
> TIA for any comments, should someone have any.
>
> Steve
>
>
>
> _______________________________________________
> oi-dev mailing list
> oi-dev at openindiana.org
> http://openindiana.org/mailman/listinfo/oi-dev