[OpenIndiana-discuss] Problem with high cpu load (oi_151a)
Gernot Wolf
gw.inet at chello.at
Thu Oct 20 19:48:23 UTC 2011
Results are up, see other post...
Regards,
Gernot Wolf
On 20.10.11 21:00, Michael Stapleton wrote:
> +1
>
> Mike
>
> On Thu, 2011-10-20 at 11:47 -0700, Rennie Allen wrote:
>
>> I'd like to see a run of the script I sent earlier. I don't trust
>> intrstat (not for any particular reason, other than that I have never used
>> it)...
>>
>>
>> On 10/20/11 11:33 AM, "Michael Stapleton"
>> <michael.stapleton at techsologic.com> wrote:
>>
>>> Don't know. I don't like to troubleshoot by guessing if I can avoid it.
>>> I'd rather follow the evidence to capture the culprit: use what we know
>>> to discover what we do not know.
>>>
>>> We know the context switch rate in vmstat is high (~37,000/s against
>>> only ~120 syscalls/s), we know sys time is high, and we know the syscall
>>> rate is low. So it is not a user process; it is the kernel. Likely a
>>> driver.
>>>
>>> So what kernel code is running the most?
>>>
>>> What's causing that code to run?
>>>
>>> Does that code belong to a driver?
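>>>
>>> A sketch of how I'd attack the first question with the profile provider
>>> (plain DTrace, nothing system-specific assumed): sample at ~1000 Hz,
>>> keep only samples that land in kernel context, and aggregate on the
>>> on-CPU function:
>>>
>>> # kernel functions most often on-CPU; Ctrl-C prints the aggregation
>>> dtrace -n 'profile-1001 /arg0/ { @[func(arg0)] = count(); }'
>>>
>>> # same idea with full kernel stacks, so a driver's interrupt or
>>> # callout path shows up as the caller
>>> dtrace -n 'profile-1001 /arg0/ { @[stack()] = count(); }'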
>>>
>>>
>>> Mike
>>>
>>>
>>>
>>> On Thu, 2011-10-20 at 20:25 +0200, Michael Schuster wrote:
>>>
>>>> Hi,
>>>>
>>>> just found this:
>>>> http://download.oracle.com/docs/cd/E19253-01/820-5245/ghgoc/index.html
>>>>
>>>> does it help?
>>>>
>>>> On Thu, Oct 20, 2011 at 20:23, Michael Stapleton
>>>> <michael.stapleton at techsologic.com> wrote:
>>>>> My understanding is that it is not supposed to be a loaded system. We
>>>>> want to know what the load is.
>>>>>
>>>>>
>>>>> gernot at tintenfass:~# intrstat 30
>>>>>
>>>>>       device |      cpu0 %tim      cpu1 %tim
>>>>> -------------+------------------------------
>>>>>     e1000g#0 |         1  0,0         0  0,0
>>>>>       ehci#0 |         0  0,0         4  0,0
>>>>>       ehci#1 |         3  0,0         0  0,0
>>>>>    hci1394#0 |         0  0,0         2  0,0
>>>>>      i8042#1 |         0  0,0         4  0,0
>>>>>       i915#1 |         0  0,0         2  0,0
>>>>>    pci-ide#0 |        15  0,1         0  0,0
>>>>>       uhci#0 |         0  0,0         2  0,0
>>>>>       uhci#1 |         0  0,0         0  0,0
>>>>>       uhci#2 |         3  0,0         0  0,0
>>>>>       uhci#3 |         0  0,0         2  0,0
>>>>>       uhci#4 |         0  0,0         4  0,0
>>>>>
>>>>>       device |      cpu0 %tim      cpu1 %tim
>>>>> -------------+------------------------------
>>>>>     e1000g#0 |         1  0,0         0  0,0
>>>>>       ehci#0 |         0  0,0         3  0,0
>>>>>       ehci#1 |         3  0,0         0  0,0
>>>>>    hci1394#0 |         0  0,0         1  0,0
>>>>>      i8042#1 |         0  0,0         6  0,0
>>>>>       i915#1 |         0  0,0         1  0,0
>>>>>    pci-ide#0 |         3  0,0         0  0,0
>>>>>       uhci#0 |         0  0,0         1  0,0
>>>>>       uhci#1 |         0  0,0         0  0,0
>>>>>       uhci#2 |         3  0,0         0  0,0
>>>>>       uhci#3 |         0  0,0         1  0,0
>>>>>       uhci#4 |         0  0,0         3  0,0
>>>>>
>>>>> gernot at tintenfass:~# vmstat 5 10
>>>>>  kthr      memory            page            disk          faults      cpu
>>>>>  r b w   swap    free   re mf pi po fr de sr cd s0 s1 s2   in  sy    cs us sy id
>>>>>  0 0 0 4243840 1145720  1  6  0  0  0  0  2  0  1  1  1 9767 121 37073  0 54 46
>>>>>  0 0 0 4157824 1059796  4 11  0  0  0  0  0  0  0  0  0 9752 119 37132  0 54 46
>>>>>  0 0 0 4157736 1059752  0  0  0  0  0  0  0  0  0  0  0 9769 113 37194  0 54 46
>>>>>  0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9682 104 36941  0 54 46
>>>>>  0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9769 105 37208  0 54 46
>>>>>  0 0 0 4157728 1059772  0  1  0  0  0  0  0  0  0  0  0 9741 159 37104  0 54 46
>>>>>  0 0 0 4157728 1059772  0  0  0  0  0  0  0  0  0  0  0 9695 127 36931  0 54 46
>>>>>  0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9762 105 37188  0 54 46
>>>>>  0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9723 102 37058  0 54 46
>>>>>  0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9774 105 37263  0 54 46
>>>>>
>>>>> Mike
>>>>>
>>>>>
>>>>> On Thu, 2011-10-20 at 11:02 -0700, Rennie Allen wrote:
>>>>>
>>>>>> Sched is the scheduler itself. How long did you let this run? If only
>>>>>> for a couple of seconds, then that number is high, but not ridiculous
>>>>>> for a loaded system, so I think that this output rules out a high
>>>>>> context switch rate.
>>>>>>
>>>>>> Try this command to see if some process is making an excessive number
>>>>>> of syscalls:
>>>>>>
>>>>>> dtrace -n 'syscall:::entry { @[execname]=count()}'
>>>>>>
>>>>>> If not, then I'd try looking at interrupts...
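>>>>>>
>>>>>> For interrupts, something along these lines might do it (just a
>>>>>> sketch; it assumes the sdt:::interrupt-start probe is available on
>>>>>> oi_151a):
>>>>>>
>>>>>> # count interrupts by the kernel stack that services them
>>>>>> dtrace -n 'sdt:::interrupt-start { @[stack()] = count(); }'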
>>>>>>
>>>>>>
>>>>>> On 10/20/11 10:52 AM, "Gernot Wolf"<gw.inet at chello.at> wrote:
>>>>>>
>>>>>>> Yeah, I've been able to run these diagnostics on another OI box (at
>>>>>>> my office, so much for OI not being used in production ;)), and
>>>>>>> noticed that several values were quite different. I just don't have
>>>>>>> any idea what these figures mean...
>>>>>>>
>>>>>>> Anyway, here are the results of the dtrace command (I executed the
>>>>>>> command twice, hence two result sets):
>>>>>>>
>>>>>>> gernot at tintenfass:~# dtrace -n 'sched:::off-cpu { @[execname]=count()}'
>>>>>>> dtrace: description 'sched:::off-cpu ' matched 3 probes
>>>>>>> ^C
>>>>>>>
>>>>>>>   ipmgmtd                 1
>>>>>>>   gconfd-2                2
>>>>>>>   gnome-settings-d        2
>>>>>>>   idmapd                  2
>>>>>>>   inetd                   2
>>>>>>>   miniserv.pl             2
>>>>>>>   netcfgd                 2
>>>>>>>   nscd                    2
>>>>>>>   ospm-applet             2
>>>>>>>   ssh-agent               2
>>>>>>>   sshd                    2
>>>>>>>   svc.startd              2
>>>>>>>   intrd                   3
>>>>>>>   afpd                    4
>>>>>>>   mdnsd                   4
>>>>>>>   gnome-power-mana        5
>>>>>>>   clock-applet            7
>>>>>>>   sendmail                7
>>>>>>>   xscreensaver            7
>>>>>>>   fmd                     9
>>>>>>>   fsflush                11
>>>>>>>   ntpd                   11
>>>>>>>   updatemanagernot       13
>>>>>>>   isapython2.6           14
>>>>>>>   devfsadm               20
>>>>>>>   gnome-terminal         20
>>>>>>>   dtrace                 23
>>>>>>>   mixer_applet2          25
>>>>>>>   smbd                   39
>>>>>>>   nwam-manager           60
>>>>>>>   svc.configd            79
>>>>>>>   Xorg                  100
>>>>>>>   sched              394078
>>>>>>>
>>>>>>> gernot at tintenfass:~# dtrace -n 'sched:::off-cpu { @[execname]=count()}'
>>>>>>> dtrace: description 'sched:::off-cpu ' matched 3 probes
>>>>>>> ^C
>>>>>>>
>>>>>>>   automountd              1
>>>>>>>   ipmgmtd                 1
>>>>>>>   idmapd                  2
>>>>>>>   in.routed               2
>>>>>>>   init                    2
>>>>>>>   miniserv.pl             2
>>>>>>>   netcfgd                 2
>>>>>>>   ssh-agent               2
>>>>>>>   sshd                    2
>>>>>>>   svc.startd              2
>>>>>>>   fmd                     3
>>>>>>>   hald                    3
>>>>>>>   inetd                   3
>>>>>>>   intrd                   3
>>>>>>>   hald-addon-acpi         4
>>>>>>>   nscd                    4
>>>>>>>   gnome-power-mana        5
>>>>>>>   sendmail                5
>>>>>>>   mdnsd                   6
>>>>>>>   devfsadm                8
>>>>>>>   xscreensaver            9
>>>>>>>   fsflush                10
>>>>>>>   ntpd                   14
>>>>>>>   updatemanagernot       16
>>>>>>>   mixer_applet2          21
>>>>>>>   isapython2.6           22
>>>>>>>   dtrace                 24
>>>>>>>   gnome-terminal         24
>>>>>>>   smbd                   39
>>>>>>>   nwam-manager           58
>>>>>>>   zpool-rpool            65
>>>>>>>   svc.configd            79
>>>>>>>   Xorg                   82
>>>>>>>   sched              369939
>>>>>>>
>>>>>>> So, quite obviously, there is one executable standing out here:
>>>>>>> "sched". Now what do these figures mean?
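>>>>>>>
>>>>>>> (If it helps, I can also drill down further; just a sketch, reusing
>>>>>>> the same probe but keying on kernel stacks whenever "sched", i.e. a
>>>>>>> kernel thread, goes off CPU:)
>>>>>>>
>>>>>>> dtrace -n 'sched:::off-cpu /execname == "sched"/ { @[stack()] = count(); }'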
>>>>>>>
>>>>>>> Regards,
>>>>>>> Gernot Wolf
>>>>>>>
>>>>>>>
>>>>>>> On 20.10.11 19:22, Michael Stapleton wrote:
>>>>>>>> Hi Gernot,
>>>>>>>>
>>>>>>>> You have a high context switch rate.
>>>>>>>>
>>>>>>>> Try
>>>>>>>> #dtrace -n 'sched:::off-cpu { @[execname]=count()}'
>>>>>>>>
>>>>>>>> for a few seconds to see if you can get the name of an executable.
>>>>>>>> (sched:::off-cpu fires each time a thread is switched off a CPU,
>>>>>>>> so the counts show who is doing the context switching.)
>>>>>>>>
>>>>>>>> Mike
>>>>>>>> On Thu, 2011-10-20 at 18:44 +0200, Gernot Wolf wrote:
>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> I have a machine here at my home running OpenIndiana oi_151a, which
>>>>>>>>> serves as a NAS on my home network. The original install was
>>>>>>>>> OpenSolaris 2009.06, which was later upgraded to snv_134b, and
>>>>>>>>> recently to oi_151a.
>>>>>>>>>
>>>>>>>>> So far this OSOL (now OI) box has performed excellently, with one
>>>>>>>>> major exception: sometimes, after a reboot, the CPU load was about
>>>>>>>>> 50-60% although the system was doing nothing. Until recently,
>>>>>>>>> another reboot solved the issue.
>>>>>>>>>
>>>>>>>>> That no longer works. The system now always has a CPU load of
>>>>>>>>> 50-60% when idle (and higher, of course, when there is actually
>>>>>>>>> some work to do).
>>>>>>>>>
>>>>>>>>> I've already googled the symptoms. This didn't turn up much useful
>>>>>>>>> info, and the few things I found didn't apply to my problem. Most
>>>>>>>>> notable was a similar problem that could be solved by disabling
>>>>>>>>> cpupm in /etc/power.conf, but trying that had no effect on my
>>>>>>>>> system.
>>>>>>>>>
>>>>>>>>> So I'm finally out of my depth. I have to admit that my knowledge
>>>>>>>>> of Unix is superficial at best, so I decided to try looking for
>>>>>>>>> help here.
>>>>>>>>>
>>>>>>>>> I've run several diagnostic commands like top, powertop, lockstat
>>>>>>>>> etc. and attached the results to this email (I've zipped the
>>>>>>>>> results of kstat because they were >1MB).
>>>>>>>>>
>>>>>>>>> One important thing: when I boot the oi_151a live DVD instead of
>>>>>>>>> the installed system, I also get the high CPU load. I mention this
>>>>>>>>> because I have installed several things on my OI box like vsftpd,
>>>>>>>>> svn, netstat etc. I first thought that this problem might be caused
>>>>>>>>> by some of this extra stuff, but seeing the same behavior when
>>>>>>>>> booting the live DVD ruled that out (I think).
>>>>>>>>>
>>>>>>>>> The machine is a custom-built medium tower:
>>>>>>>>> S-775 Intel DG965WHMKR ATX mainboard
>>>>>>>>> Intel Core 2 Duo E4300 CPU 1.8GHz
>>>>>>>>> 1x IDE DVD recorder
>>>>>>>>> 1x IDE HD 200GB (serves as system drive)
>>>>>>>>> 6x SATA II 1.5TB HD (configured as ZFS raidz2 array)
>>>>>>>>>
>>>>>>>>> I have to solve this problem. Although the system runs fine and
>>>>>>>>> absolutely serves its purpose, having the CPU at 50-60% load
>>>>>>>>> constantly is a waste of energy and surely puts unhealthy stress
>>>>>>>>> on the hardware.
>>>>>>>>>
>>>>>>>>> Anyone any ideas...?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Gernot Wolf
More information about the OpenIndiana-discuss mailing list