[OpenIndiana-discuss] Problem with high cpu load (oi_151a)
Gernot Wolf
gw.inet at chello.at
Thu Oct 20 19:18:58 UTC 2011
Here are the results (I let the script run for a few seconds):
CPU     ID                    FUNCTION:NAME
  1      2                             :END      DEVICE        TIME (ns)
                                                 i915 1            22111
                                                 heci 0            23119
                                              pci-ide 0            38700
                                                 uhci 1            47277
                                              hci1394 0            50554
                                                 uhci 3            63145
                                                 uhci 0            64232
                                                 uhci 4           103429
                                                 ehci 1           107272
                                                 ehci 0           108445
                                                 uhci 2           112589
                                               e1000g 0           160024
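
(A rough sanity check, assuming the script ran for roughly five seconds: the
largest entry, e1000g instance 0 at about 160,000 ns, is only ~0.16 ms of
interrupt time over that interval, i.e. far below 0.01% of one CPU, so
interrupt handling alone clearly wouldn't explain a 50-60% load.)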
Regards,
Gernot Wolf
On 20.10.11 20:22, Rennie Allen wrote:
> Try the following script, which will identify any drivers with high
> interrupt load
>
> ---------------------
> #!/usr/sbin/dtrace -s
>
> sdt:::interrupt-start { self->ts = vtimestamp; }
> sdt:::interrupt-complete
> /self->ts && arg0 != 0/
> {
> this->devi = (struct dev_info *)arg0;
> self->name = this->devi != 0 ?
> stringof(`devnamesp[this->devi->devi_major].dn_name) : "?";
> this->inst = this->devi != 0 ? this->devi->devi_instance : 0;
> @num[self->name, this->inst] = sum(vtimestamp - self->ts);
> self->name = 0;
> }
> sdt:::interrupt-complete { self->ts = 0; }
> dtrace:::END
> {
> printf("%11s %16s\n", "DEVICE", "TIME (ns)");
> printa("%10s%-3d %@16d\n", @num);
> }
> ---------------------
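>
> (To run it, one way -- the file name here is just an example:
>
> # chmod +x intrtime.d
> # ./intrtime.d
>
> let it run for a few seconds, then press Ctrl-C; the END clause prints the
> per-device interrupt-time totals.)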
>
>
>
>
>
> On 10/20/11 11:07 AM, "Michael Stapleton"
> <michael.stapleton at techsologic.com> wrote:
>
>> That rules out userland.
>>
>> Sched tells me that it is not a user process. If kernel code is
>> executing on a cpu, tools will report the sched process. The count was
>> how many times the process was taken off the CPU while dtrace was
>> running.
>>
>>
>>
>> Let's see what kernel code is running the most:
>>
>> #dtrace -n 'sched:::off-cpu { @[stack()]=count()}'
>>
>> #dtrace -n 'profile-1001 { @[stack()] = count() }'
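>>
>> If the stack output gets too long, a variant of the second one-liner that
>> keeps only kernel-mode samples, trims to the five hottest stacks and stops
>> by itself (just a sketch of the same idea):
>>
>> #dtrace -n 'profile-1001 /arg0/ { @[stack()] = count() } tick-10s { trunc(@, 5); exit(0) }'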
>>
>>
>>
>> On Thu, 2011-10-20 at 19:52 +0200, Gernot Wolf wrote:
>>
>>> Yeah, I've been able to run these diagnostics on another OI box (at my
>>> office, so much for OI not being used in production ;)), and noticed
>>> that several values were quite different. I just don't have any idea
>>> what these figures mean...
>>>
>>> Anyway, here are the results of the dtrace command (I executed the
>>> command twice, hence two result sets):
>>>
>>> gernot at tintenfass:~# dtrace -n 'sched:::off-cpu { @[execname]=count()}'
>>> dtrace: description 'sched:::off-cpu ' matched 3 probes
>>> ^C
>>>
>>> ipmgmtd 1
>>> gconfd-2 2
>>> gnome-settings-d 2
>>> idmapd 2
>>> inetd 2
>>> miniserv.pl 2
>>> netcfgd 2
>>> nscd 2
>>> ospm-applet 2
>>> ssh-agent 2
>>> sshd 2
>>> svc.startd 2
>>> intrd 3
>>> afpd 4
>>> mdnsd 4
>>> gnome-power-mana 5
>>> clock-applet 7
>>> sendmail 7
>>> xscreensaver 7
>>> fmd 9
>>> fsflush 11
>>> ntpd 11
>>> updatemanagernot 13
>>> isapython2.6 14
>>> devfsadm 20
>>> gnome-terminal 20
>>> dtrace 23
>>> mixer_applet2 25
>>> smbd 39
>>> nwam-manager 60
>>> svc.configd 79
>>> Xorg 100
>>> sched 394078
>>>
>>> gernot at tintenfass:~# dtrace -n 'sched:::off-cpu { @[execname]=count()}'
>>> dtrace: description 'sched:::off-cpu ' matched 3 probes
>>> ^C
>>>
>>> automountd 1
>>> ipmgmtd 1
>>> idmapd 2
>>> in.routed 2
>>> init 2
>>> miniserv.pl 2
>>> netcfgd 2
>>> ssh-agent 2
>>> sshd 2
>>> svc.startd 2
>>> fmd 3
>>> hald 3
>>> inetd 3
>>> intrd 3
>>> hald-addon-acpi 4
>>> nscd 4
>>> gnome-power-mana 5
>>> sendmail 5
>>> mdnsd 6
>>> devfsadm 8
>>> xscreensaver 9
>>> fsflush 10
>>> ntpd 14
>>> updatemanagernot 16
>>> mixer_applet2 21
>>> isapython2.6 22
>>> dtrace 24
>>> gnome-terminal 24
>>> smbd 39
>>> nwam-manager 58
>>> zpool-rpool 65
>>> svc.configd 79
>>> Xorg 82
>>> sched 369939
>>>
>>> So, quite obviously there is one executable standing out here, "sched".
>>> Now, what's the meaning of these figures?
>>>
>>> Regards,
>>> Gernot Wolf
>>>
>>>
>>> On 20.10.11 19:22, Michael Stapleton wrote:
>>>> Hi Gernot,
>>>>
>>>> You have a high context switch rate.
>>>>
>>>> Try
>>>> #dtrace -n 'sched:::off-cpu { @[execname]=count()}'
>>>>
>>>> for a few seconds to see if you can get the name of an executable.
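>>>>
>>>> (If you want to see the raw rate itself, something like
>>>>
>>>> #mpstat 1 5
>>>>
>>>> works too; the csw and icsw columns are voluntary and involuntary
>>>> context switches per second on each CPU.)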
>>>>
>>>> Mike
>>>> On Thu, 2011-10-20 at 18:44 +0200, Gernot Wolf wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> I have a machine here at my home running OpenIndiana oi_151a, which
>>>>> serves as a NAS on my home network. The original install was OpenSolaris
>>>>> 2009.06, which was later upgraded to snv_134b, and recently to oi_151a.
>>>>>
>>>>> So far this OSOL (now OI) box has performed excellently, with one major
>>>>> exception: sometimes, after a reboot, the cpu load was about 50-60%,
>>>>> although the system was doing nothing. Until recently, another reboot
>>>>> solved the issue.
>>>>>
>>>>> This does not work any longer. The system now always has a cpu load of
>>>>> 50-60% when idle (and higher of course when there is actually some work
>>>>> to do).
>>>>>
>>>>> I've already googled the symptoms. This didn't turn up very much useful
>>>>> info, and the few things I found didn't apply to my problem. Most notable
>>>>> was a problem that could be solved by disabling cpupm in /etc/power.conf,
>>>>> but trying that didn't show any effect on my system.
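>>>>>
>>>>> (The change in question, as I understand it, is adding the line
>>>>>
>>>>>    cpupm disable
>>>>>
>>>>> to /etc/power.conf and then running pmconfig as root (or rebooting) so
>>>>> it takes effect; that's what I tried here, with no visible change.)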
>>>>>
>>>>> So I'm finally out of my depth. I have to admit that my knowledge of
>>>>> Unix is superficial at best, so I decided to try looking for help here.
>>>>>
>>>>> I've run several diagnostic commands like top, powertop, lockstat etc.
>>>>> and attached the results to this email (I've zipped the results of kstat
>>>>> because they were >1MB).
>>>>>
>>>>> One important thing is that when I boot into the oi_151a live dvd
>>>>> instead of booting into the installed system, I also get the high cpu
>>>>> load. I mention this because I have installed several things on my OI
>>>>> box like vsftpd, svn, netstat etc. I first thought that this problem
>>>>> might be caused by some of this extra stuff, but seeing the same behavior
>>>>> when booting the live dvd ruled that out (I think).
>>>>>
>>>>> The machine is a custom-built medium tower:
>>>>> S-775 Intel DG965WHMKR ATX mainboard
>>>>> Intel Core 2 Duo E4300 CPU 1.8GHz
>>>>> 1x IDE DVD recorder
>>>>> 1x IDE HD 200GB (serves as system drive)
>>>>> 6x SATA II 1.5TB HD (configured as zfs raidz2 array)
>>>>>
>>>>> I have to solve this problem. Although the system runs fine and
>>>>> absolutely serves its purpose, having the cpu at 50-60% load constantly
>>>>> is a waste of energy and surely a rather unhealthy stress on the hardware.
>>>>>
>>>>> Anyone any ideas...?
>>>>>
>>>>> Regards,
>>>>> Gernot Wolf