[OpenIndiana-discuss] Problem with high cpu load (oi_151a)

Gernot Wolf gw.inet at chello.at
Thu Oct 20 19:48:23 UTC 2011


Results are up, see other post...

Regards,
Gernot Wolf


On 20.10.11 21:00, Michael Stapleton wrote:
> +1
>
> Mike
>
> On Thu, 2011-10-20 at 11:47 -0700, Rennie Allen wrote:
>
>> I'd like to see a run of the script I sent earlier.  I don't trust
>> intrstat (not for any particular reason, other than that I have never used
>> it)...
>>
>>
>> On 10/20/11 11:33 AM, "Michael Stapleton"
>> <michael.stapleton at techsologic.com>  wrote:
>>
>>> Don't know. I don't like to troubleshoot by guessing if I can avoid
>>> it; I'd rather follow the evidence to capture the culprit: use what
>>> we know to discover what we do not know.
>>>
>>> We know the CS rate in vmstat is high, we know Sys time is high, we
>>> know the syscall rate is low, and we know it is not a user process,
>>> therefore it is in the kernel. Likely a driver.
>>>
>>> So what kernel code is running the most?
>>>
>>> What's causing that code to run?
>>>
>>> Does that code belong to a driver?
>>>
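>>> A kernel profile answers the first question directly. A minimal
>>> sketch (a standard DTrace one-liner, not one run in this thread; the
>>> 997 Hz sampling rate and the 10 s window are arbitrary choices):
>>>
>>> # dtrace -n 'profile-997 /arg0/ { @[func(arg0)] = count(); }
>>>     tick-10s { exit(0); }'
>>>
>>> arg0 is the kernel PC when a sample lands in the kernel, so the
>>> aggregation counts samples per kernel function, and the
>>> module`function names show whether the hot code belongs to a driver.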
>>>
>>> Mike
>>>
>>>
>>>
>>> On Thu, 2011-10-20 at 20:25 +0200, Michael Schuster wrote:
>>>
>>>> Hi,
>>>>
>>>> just found this:
>>>> http://download.oracle.com/docs/cd/E19253-01/820-5245/ghgoc/index.html
>>>>
>>>> does it help?
>>>>
>>>> On Thu, Oct 20, 2011 at 20:23, Michael Stapleton
>>>> <michael.stapleton at techsologic.com>  wrote:
>>>>> My understanding is that it is not supposed to be a loaded system. We
>>>>> want to know what the load is.
>>>>>
>>>>>
>>>>> gernot at tintenfass:~# intrstat 30
>>>>>
>>>>>       device |      cpu0 %tim      cpu1 %tim
>>>>> -------------+------------------------------
>>>>>     e1000g#0 |         1  0,0         0  0,0
>>>>>       ehci#0 |         0  0,0         4  0,0
>>>>>       ehci#1 |         3  0,0         0  0,0
>>>>>    hci1394#0 |         0  0,0         2  0,0
>>>>>      i8042#1 |         0  0,0         4  0,0
>>>>>       i915#1 |         0  0,0         2  0,0
>>>>>    pci-ide#0 |        15  0,1         0  0,0
>>>>>       uhci#0 |         0  0,0         2  0,0
>>>>>       uhci#1 |         0  0,0         0  0,0
>>>>>       uhci#2 |         3  0,0         0  0,0
>>>>>       uhci#3 |         0  0,0         2  0,0
>>>>>       uhci#4 |         0  0,0         4  0,0
>>>>>
>>>>>       device |      cpu0 %tim      cpu1 %tim
>>>>> -------------+------------------------------
>>>>>     e1000g#0 |         1  0,0         0  0,0
>>>>>       ehci#0 |         0  0,0         3  0,0
>>>>>       ehci#1 |         3  0,0         0  0,0
>>>>>    hci1394#0 |         0  0,0         1  0,0
>>>>>      i8042#1 |         0  0,0         6  0,0
>>>>>       i915#1 |         0  0,0         1  0,0
>>>>>    pci-ide#0 |         3  0,0         0  0,0
>>>>>       uhci#0 |         0  0,0         1  0,0
>>>>>       uhci#1 |         0  0,0         0  0,0
>>>>>       uhci#2 |         3  0,0         0  0,0
>>>>>       uhci#3 |         0  0,0         1  0,0
>>>>>       uhci#4 |         0  0,0         3  0,0
>>>>>
>>>>> gernot at tintenfass:~# vmstat 5 10
>>>>>  kthr      memory            page            disk          faults      cpu
>>>>>  r b w   swap  free  re  mf pi po fr de sr cd s0 s1 s2   in   sy    cs us sy id
>>>>>  0 0 0 4243840 1145720  1  6  0  0  0  0  2  0  1  1  1 9767  121 37073  0 54 46
>>>>>  0 0 0 4157824 1059796  4 11  0  0  0  0  0  0  0  0  0 9752  119 37132  0 54 46
>>>>>  0 0 0 4157736 1059752  0  0  0  0  0  0  0  0  0  0  0 9769  113 37194  0 54 46
>>>>>  0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9682  104 36941  0 54 46
>>>>>  0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9769  105 37208  0 54 46
>>>>>  0 0 0 4157728 1059772  0  1  0  0  0  0  0  0  0  0  0 9741  159 37104  0 54 46
>>>>>  0 0 0 4157728 1059772  0  0  0  0  0  0  0  0  0  0  0 9695  127 36931  0 54 46
>>>>>  0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9762  105 37188  0 54 46
>>>>>  0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9723  102 37058  0 54 46
>>>>>  0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9774  105 37263  0 54 46
>>>>>
>>>>> Mike
>>>>>
>>>>>
>>>>> On Thu, 2011-10-20 at 11:02 -0700, Rennie Allen wrote:
>>>>>
>>>>>> Sched is the scheduler itself.  How long did you let this run?  If
>>>>>> only for a couple of seconds, then that number is high, but not
>>>>>> ridiculous for a loaded system, so I think that this output rules
>>>>>> out a high context switch rate.
>>>>>>
>>>>>> Try this command to see if some process is making an excessive
>>>>>> number of syscalls:
>>>>>>
>>>>>> dtrace -n 'syscall:::entry { @[execname]=count()}'
>>>>>>
>>>>>> If not, then I'd try looking at interrupts...
>>>>>>
>>>>>>
>>>>>> On 10/20/11 10:52 AM, "Gernot Wolf"<gw.inet at chello.at>  wrote:
>>>>>>
>>>>>>> Yeah, I've been able to run these diagnostics on another OI box
>>>>>>> (at my office, so much for OI not being used in production ;)),
>>>>>>> and noticed that several values were quite different. I just don't
>>>>>>> have any idea what these figures mean...
>>>>>>>
>>>>>>> Anyway, here are the results of the dtrace command (I executed the
>>>>>>> command twice, hence two result sets):
>>>>>>>
>>>>>>> gernot at tintenfass:~# dtrace -n 'sched:::off-cpu { @[execname]=count()}'
>>>>>>> dtrace: description 'sched:::off-cpu ' matched 3 probes
>>>>>>> ^C
>>>>>>>
>>>>>>>   ipmgmtd                  1
>>>>>>>   gconfd-2                 2
>>>>>>>   gnome-settings-d         2
>>>>>>>   idmapd                   2
>>>>>>>   inetd                    2
>>>>>>>   miniserv.pl              2
>>>>>>>   netcfgd                  2
>>>>>>>   nscd                     2
>>>>>>>   ospm-applet              2
>>>>>>>   ssh-agent                2
>>>>>>>   sshd                     2
>>>>>>>   svc.startd               2
>>>>>>>   intrd                    3
>>>>>>>   afpd                     4
>>>>>>>   mdnsd                    4
>>>>>>>   gnome-power-mana         5
>>>>>>>   clock-applet             7
>>>>>>>   sendmail                 7
>>>>>>>   xscreensaver             7
>>>>>>>   fmd                      9
>>>>>>>   fsflush                 11
>>>>>>>   ntpd                    11
>>>>>>>   updatemanagernot        13
>>>>>>>   isapython2.6            14
>>>>>>>   devfsadm                20
>>>>>>>   gnome-terminal          20
>>>>>>>   dtrace                  23
>>>>>>>   mixer_applet2           25
>>>>>>>   smbd                    39
>>>>>>>   nwam-manager            60
>>>>>>>   svc.configd             79
>>>>>>>   Xorg                   100
>>>>>>>   sched               394078
>>>>>>>
>>>>>>> gernot at tintenfass:~# dtrace -n 'sched:::off-cpu { @[execname]=count()}'
>>>>>>> dtrace: description 'sched:::off-cpu ' matched 3 probes
>>>>>>> ^C
>>>>>>>
>>>>>>>   automountd               1
>>>>>>>   ipmgmtd                  1
>>>>>>>   idmapd                   2
>>>>>>>   in.routed                2
>>>>>>>   init                     2
>>>>>>>   miniserv.pl              2
>>>>>>>   netcfgd                  2
>>>>>>>   ssh-agent                2
>>>>>>>   sshd                     2
>>>>>>>   svc.startd               2
>>>>>>>   fmd                      3
>>>>>>>   hald                     3
>>>>>>>   inetd                    3
>>>>>>>   intrd                    3
>>>>>>>   hald-addon-acpi          4
>>>>>>>   nscd                     4
>>>>>>>   gnome-power-mana         5
>>>>>>>   sendmail                 5
>>>>>>>   mdnsd                    6
>>>>>>>   devfsadm                 8
>>>>>>>   xscreensaver             9
>>>>>>>   fsflush                 10
>>>>>>>   ntpd                    14
>>>>>>>   updatemanagernot        16
>>>>>>>   mixer_applet2           21
>>>>>>>   isapython2.6            22
>>>>>>>   dtrace                  24
>>>>>>>   gnome-terminal          24
>>>>>>>   smbd                    39
>>>>>>>   nwam-manager            58
>>>>>>>   zpool-rpool             65
>>>>>>>   svc.configd             79
>>>>>>>   Xorg                    82
>>>>>>>   sched               369939
>>>>>>>
>>>>>>> So, quite obviously there is one executable standing out here:
>>>>>>> "sched". Now, what do these figures mean?
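>>>>>>>
>>>>>>> ("sched" is pid 0, the kernel itself; kernel threads have no user
>>>>>>> executable, so they all report that name. And at the ~37,000/s cs
>>>>>>> rate vmstat shows, a run of ten seconds or so would yield a count
>>>>>>> right in this range. A standard DTrace drill-down, not one run in
>>>>>>> this thread, would aggregate the kernel stacks of those switches:
>>>>>>>
>>>>>>> # dtrace -n 'sched:::off-cpu /pid == 0/ { @[stack()] = count(); }
>>>>>>>     tick-10s { exit(0); }'
>>>>>>>
>>>>>>> The most frequent stacks should name the kernel module responsible
>>>>>>> for all the switching.)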
>>>>>>>
>>>>>>> Regards,
>>>>>>> Gernot Wolf
>>>>>>>
>>>>>>>
>>>>>>>> On 20.10.11 19:22, Michael Stapleton wrote:
>>>>>>>> Hi Gernot,
>>>>>>>>
>>>>>>>> You have a high context switch rate.
>>>>>>>>
>>>>>>>> Try
>>>>>>>>
>>>>>>>> #dtrace -n 'sched:::off-cpu { @[execname]=count()}'
>>>>>>>>
>>>>>>>> for a few seconds to see if you can get the name of an executable.
>>>>>>>>
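>>>>>>>> If stopping it by hand is awkward, the same one-liner can time
>>>>>>>> itself out (a sketch; the 10 s window is an arbitrary choice):
>>>>>>>>
>>>>>>>> #dtrace -n 'sched:::off-cpu { @[execname] = count(); } tick-10s { exit(0); }'
>>>>>>>>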
>>>>>>>> Mike
>>>>>>>> On Thu, 2011-10-20 at 18:44 +0200, Gernot Wolf wrote:
>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> I have a machine here at my home running OpenIndiana oi_151a,
>>>>>>>>> which serves as a NAS on my home network. The original install
>>>>>>>>> was OpenSolaris 2009.06, which was later upgraded to snv_134b and
>>>>>>>>> recently to oi_151a.
>>>>>>>>>
>>>>>>>>> So far this OSOL (now OI) box has performed excellently, with one
>>>>>>>>> major exception: sometimes, after a reboot, the cpu load was
>>>>>>>>> about 50-60% although the system was doing nothing. Until
>>>>>>>>> recently, another reboot solved the issue.
>>>>>>>>>
>>>>>>>>> That no longer works. The system now always has a cpu load of
>>>>>>>>> 50-60% when idle (and higher, of course, when there is actually
>>>>>>>>> some work to do).
>>>>>>>>>
>>>>>>>>> I've already googled the symptoms. This didn't turn up much
>>>>>>>>> useful info, and the few things I found didn't apply to my
>>>>>>>>> problem. The most promising hit was a problem that could be
>>>>>>>>> solved by disabling cpupm in /etc/power.conf, but trying that had
>>>>>>>>> no effect on my system.
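>>>>>>>>>
>>>>>>>>> (For reference, a sketch of that change under the standard
>>>>>>>>> power.conf(4) syntax: add the line
>>>>>>>>>
>>>>>>>>> cpupm disable
>>>>>>>>>
>>>>>>>>> to /etc/power.conf and apply it with /usr/sbin/pmconfig; no
>>>>>>>>> reboot is needed.)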
>>>>>>>>>
>>>>>>>>> So I'm finally out of my depth. I have to admit that my
>>>>>>>>> knowledge of Unix is superficial at best, so I decided to try
>>>>>>>>> looking for help here.
>>>>>>>>>
>>>>>>>>> I've run several diagnostic commands like top, powertop, lockstat
>>>>>>>>> etc. and attached the results to this email (I've zipped the
>>>>>>>>> results of kstat because they were >1MB).
>>>>>>>>>
>>>>>>>>> One important thing: when I boot the oi_151a live dvd instead of
>>>>>>>>> the installed system, I also get the high cpu load. I mention
>>>>>>>>> this because I have installed several things on my OI box like
>>>>>>>>> vsftpd, svn, netstat etc. I first thought the problem might be
>>>>>>>>> caused by some of this extra stuff, but seeing the same symptoms
>>>>>>>>> when booting the live dvd ruled that out (I think).
>>>>>>>>>
>>>>>>>>> The machine is a custom-built medium tower:
>>>>>>>>> S-775 Intel DG965WHMKR ATX mainboard
>>>>>>>>> Intel Core 2 Duo E4300 CPU 1.8GHz
>>>>>>>>> 1x IDE DVD recorder
>>>>>>>>> 1x IDE HD 200GB (serves as system drive)
>>>>>>>>> 6x SATA II 1.5TB HD (configured as zfs raidz2 array)
>>>>>>>>>
>>>>>>>>> I have to solve this problem. Although the system runs fine and
>>>>>>>>> absolutely serves its purpose, having the cpu at 50-60% load
>>>>>>>>> constantly is a waste of energy and surely rather unhealthy
>>>>>>>>> stress on the hardware.
>>>>>>>>>
>>>>>>>>> Anyone have any ideas...?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Gernot Wolf