[OpenIndiana-discuss] Problem with high cpu load (oi_151a)
Michael Stapleton
michael.stapleton at techsologic.com
Thu Oct 20 19:00:20 UTC 2011
+1
Mike
On Thu, 2011-10-20 at 11:47 -0700, Rennie Allen wrote:
> I'd like to see a run of the script I sent earlier. I don't trust
> intrstat (not for any particular reason, other than that I have never used
> it)...
>
>
> On 10/20/11 11:33 AM, "Michael Stapleton"
> <michael.stapleton at techsologic.com> wrote:
>
> >Don't know. I don't like to troubleshoot by guessing if possible. I'd rather
> >follow the evidence to capture the culprit. Use what we know to discover
> >what we do not know.
> >
> >We know the CS rate in vmstat is high, we know Sys time is high, we know
> >the syscall rate is low, we know it is not a user process, therefore it is
> >the kernel. Likely a driver.
> >
> >So what kernel code is running the most?
> >
> >What's causing that code to run?
> >
> >Does that code belong to a driver?
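One common way to approach the first of these questions is to sample the program counter with DTrace's profile provider and count the kernel functions that show up. The wrapper below is only a sketch: the function name `kernel_hot_functions`, the 997 Hz sampling rate and the 10 second run time are arbitrary choices for illustration, not something taken from this thread.

```shell
# Hypothetical wrapper; needs root and a kernel with DTrace support.
kernel_hot_functions() {
    # profile-997 fires 997 times per second on every CPU.
    # arg0 is the kernel program counter and is 0 when the CPU was
    # running user code, so the /arg0/ predicate keeps kernel samples only.
    # func(arg0) resolves the sampled address to a kernel function name.
    # tick-10s ends the run after ten seconds and prints the aggregation.
    dtrace -n 'profile-997 /arg0/ { @[func(arg0)] = count(); } tick-10s { exit(0); }'
}
```

The aggregation is printed in ascending order, so the busiest functions end up at the bottom. Replacing `func(arg0)` with `mod(arg0)` groups the samples by kernel module instead, which speaks to the question of whether the code belongs to a driver.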
> >
> >
> >Mike
> >
> >
> >
> >On Thu, 2011-10-20 at 20:25 +0200, Michael Schuster wrote:
> >
> >> Hi,
> >>
> >> just found this:
> >> http://download.oracle.com/docs/cd/E19253-01/820-5245/ghgoc/index.html
> >>
> >> does it help?
> >>
> >> On Thu, Oct 20, 2011 at 20:23, Michael Stapleton
> >> <michael.stapleton at techsologic.com> wrote:
> >> > My understanding is that it is not supposed to be a loaded system. We
> >> > want to know what the load is.
> >> >
> >> >
> >> > gernot at tintenfass:~# intrstat 30
> >> >
> >> >       device |      cpu0 %tim      cpu1 %tim
> >> > -------------+------------------------------
> >> >     e1000g#0 |         1  0,0         0  0,0
> >> >       ehci#0 |         0  0,0         4  0,0
> >> >       ehci#1 |         3  0,0         0  0,0
> >> >    hci1394#0 |         0  0,0         2  0,0
> >> >      i8042#1 |         0  0,0         4  0,0
> >> >       i915#1 |         0  0,0         2  0,0
> >> >    pci-ide#0 |        15  0,1         0  0,0
> >> >       uhci#0 |         0  0,0         2  0,0
> >> >       uhci#1 |         0  0,0         0  0,0
> >> >       uhci#2 |         3  0,0         0  0,0
> >> >       uhci#3 |         0  0,0         2  0,0
> >> >       uhci#4 |         0  0,0         4  0,0
> >> >
> >> >       device |      cpu0 %tim      cpu1 %tim
> >> > -------------+------------------------------
> >> >     e1000g#0 |         1  0,0         0  0,0
> >> >       ehci#0 |         0  0,0         3  0,0
> >> >       ehci#1 |         3  0,0         0  0,0
> >> >    hci1394#0 |         0  0,0         1  0,0
> >> >      i8042#1 |         0  0,0         6  0,0
> >> >       i915#1 |         0  0,0         1  0,0
> >> >    pci-ide#0 |         3  0,0         0  0,0
> >> >       uhci#0 |         0  0,0         1  0,0
> >> >       uhci#1 |         0  0,0         0  0,0
> >> >       uhci#2 |         3  0,0         0  0,0
> >> >       uhci#3 |         0  0,0         1  0,0
> >> >       uhci#4 |         0  0,0         3  0,0
> >> >
> >> > gernot at tintenfass:~# vmstat 5 10
> >> > kthr      memory              page            disk         faults       cpu
> >> > r b w    swap    free re mf pi po fr de sr cd s0 s1 s2   in  sy    cs us sy id
> >> > 0 0 0 4243840 1145720  1  6  0  0  0  0  2  0  1  1  1 9767 121 37073  0 54 46
> >> > 0 0 0 4157824 1059796  4 11  0  0  0  0  0  0  0  0  0 9752 119 37132  0 54 46
> >> > 0 0 0 4157736 1059752  0  0  0  0  0  0  0  0  0  0  0 9769 113 37194  0 54 46
> >> > 0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9682 104 36941  0 54 46
> >> > 0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9769 105 37208  0 54 46
> >> > 0 0 0 4157728 1059772  0  1  0  0  0  0  0  0  0  0  0 9741 159 37104  0 54 46
> >> > 0 0 0 4157728 1059772  0  0  0  0  0  0  0  0  0  0  0 9695 127 36931  0 54 46
> >> > 0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9762 105 37188  0 54 46
> >> > 0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9723 102 37058  0 54 46
> >> > 0 0 0 4157744 1059788  0  0  0  0  0  0  0  0  0  0  0 9774 105 37263  0 54 46
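To compare samples like the ones above at a glance, the faults columns can be averaged with a few lines of awk. This is only a sketch: `vmstat_summary` is a made-up helper name, and it assumes a POSIX awk and the usual vmstat layout in which the last six fields of each data row are in, sy, cs, us, sy and id.

```shell
# Hypothetical helper: summarize vmstat samples read from stdin.
vmstat_summary() {
    awk '
        # Data rows start with a number; the two header lines do not.
        $1 ~ /^[0-9]+$/ && NF >= 6 {
            intr += $(NF-5)   # in: interrupts
            sysc += $(NF-4)   # sy: system calls
            cs   += $(NF-3)   # cs: context switches
            n++
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            ratio = 0
            if (sysc > 0) ratio = cs / sysc
            printf "samples=%d in/s=%.0f sy/s=%.0f cs/s=%.0f cs_per_syscall=%.0f\n", n, intr/n, sysc/n, cs/n, ratio
        }'
}
```

Something like `vmstat 5 10 | vmstat_summary` should print a single line. A cs/s figure several hundred times larger than sy/s is consistent with the switching not being driven by user processes making system calls. Keep in mind that the first row vmstat prints is an average since boot.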
> >> >
> >> > Mike
> >> >
> >> >
> >> > On Thu, 2011-10-20 at 11:02 -0700, Rennie Allen wrote:
> >> >
> >> >> Sched is the scheduler itself. How long did you let this run? If only
> >> >> for a couple of seconds, then that number is high, but not ridiculous for
> >> >> a loaded system, so I think that this output rules out a high context
> >> >> switch rate.
> >> >>
> >> >> Try this command to see if some process is making an excessive number of
> >> >> syscalls:
> >> >>
> >> >> dtrace -n 'syscall:::entry { @[execname]=count()}'
> >> >>
> >> >> If not, then I'd try looking at interrupts...
> >> >>
> >> >>
> >> >> On 10/20/11 10:52 AM, "Gernot Wolf" <gw.inet at chello.at> wrote:
> >> >>
> >> >> >Yeah, I've been able to run these diagnostics on another OI box (at my
> >> >> >office, so much for OI not being used in production ;)), and noticed
> >> >> >that there were several values that were quite different. I just don't
> >> >> >have any idea what these figures mean...
> >> >> >
> >> >> >Anyway, here are the results of the dtrace command (I executed the
> >> >> >command twice, hence two result sets):
> >> >> >
> >> >> >gernot at tintenfass:~# dtrace -n 'sched:::off-cpu { @[execname]=count()}'
> >> >> >dtrace: description 'sched:::off-cpu ' matched 3 probes
> >> >> >^C
> >> >> >
> >> >> >   ipmgmtd                1
> >> >> >   gconfd-2               2
> >> >> >   gnome-settings-d       2
> >> >> >   idmapd                 2
> >> >> >   inetd                  2
> >> >> >   miniserv.pl            2
> >> >> >   netcfgd                2
> >> >> >   nscd                   2
> >> >> >   ospm-applet            2
> >> >> >   ssh-agent              2
> >> >> >   sshd                   2
> >> >> >   svc.startd             2
> >> >> >   intrd                  3
> >> >> >   afpd                   4
> >> >> >   mdnsd                  4
> >> >> >   gnome-power-mana       5
> >> >> >   clock-applet           7
> >> >> >   sendmail               7
> >> >> >   xscreensaver           7
> >> >> >   fmd                    9
> >> >> >   fsflush               11
> >> >> >   ntpd                  11
> >> >> >   updatemanagernot      13
> >> >> >   isapython2.6          14
> >> >> >   devfsadm              20
> >> >> >   gnome-terminal        20
> >> >> >   dtrace                23
> >> >> >   mixer_applet2         25
> >> >> >   smbd                  39
> >> >> >   nwam-manager          60
> >> >> >   svc.configd           79
> >> >> >   Xorg                 100
> >> >> >   sched             394078
> >> >> >
> >> >> >gernot at tintenfass:~# dtrace -n 'sched:::off-cpu { @[execname]=count()}'
> >> >> >dtrace: description 'sched:::off-cpu ' matched 3 probes
> >> >> >^C
> >> >> >
> >> >> >   automountd             1
> >> >> >   ipmgmtd                1
> >> >> >   idmapd                 2
> >> >> >   in.routed              2
> >> >> >   init                   2
> >> >> >   miniserv.pl            2
> >> >> >   netcfgd                2
> >> >> >   ssh-agent              2
> >> >> >   sshd                   2
> >> >> >   svc.startd             2
> >> >> >   fmd                    3
> >> >> >   hald                   3
> >> >> >   inetd                  3
> >> >> >   intrd                  3
> >> >> >   hald-addon-acpi        4
> >> >> >   nscd                   4
> >> >> >   gnome-power-mana       5
> >> >> >   sendmail               5
> >> >> >   mdnsd                  6
> >> >> >   devfsadm               8
> >> >> >   xscreensaver           9
> >> >> >   fsflush               10
> >> >> >   ntpd                  14
> >> >> >   updatemanagernot      16
> >> >> >   mixer_applet2         21
> >> >> >   isapython2.6          22
> >> >> >   dtrace                24
> >> >> >   gnome-terminal        24
> >> >> >   smbd                  39
> >> >> >   nwam-manager          58
> >> >> >   zpool-rpool           65
> >> >> >   svc.configd           79
> >> >> >   Xorg                  82
> >> >> >   sched             369939
> >> >> >
> >> >> >So, quite obviously there is one executable standing out here, "sched".
> >> >> >Now what's the meaning of these figures?
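As a rough way to put such counts in proportion, the aggregation can be piped through awk to show how much of the total the largest entry accounts for. This is a sketch, assuming a POSIX awk and output saved as plain `name count` lines; `offcpu_share` is a hypothetical helper name.

```shell
# Hypothetical helper: report the largest entry of a "name count" listing
# and its share of the sum of all counts.
offcpu_share() {
    awk '
        # Keep only lines that look like "name count".
        NF == 2 && $2 ~ /^[0-9]+$/ {
            count[$1] = $2 + 0
            total += $2
        }
        END {
            max = 0
            top = ""
            for (name in count)
                if (count[name] > max) { max = count[name]; top = name }
            if (total > 0)
                printf "%s %d %.2f%%\n", top, max, 100 * max / total
        }'
}
```

Fed the first result set above, this should report sched at more than 99% of all off-cpu events, which is why the other entries can be ignored for now.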
> >> >> >
> >> >> >Regards,
> >> >> >Gernot Wolf
> >> >> >
> >> >> >
> >> >> >On 20.10.11 19:22, Michael Stapleton wrote:
> >> >> >> Hi Gernot,
> >> >> >>
> >> >> >> You have a high context switch rate.
> >> >> >>
> >> >> >> try
> >> >> >> #dtrace -n 'sched:::off-cpu { @[execname]=count()}'
> >> >> >>
> >> >> >> For a few seconds to see if you can get the name of an executable.
> >> >> >>
> >> >> >> Mike
> >> >> >> On Thu, 2011-10-20 at 18:44 +0200, Gernot Wolf wrote:
> >> >> >>
> >> >> >>> Hello all,
> >> >> >>>
> >> >> >>> I have a machine here at my home running OpenIndiana oi_151a, which
> >> >> >>> serves as a NAS on my home network. The original install was
> >> >> >>> OpenSolaris 2009.6 which was later upgraded to snv_134b, and recently
> >> >> >>> to oi_151a.
> >> >> >>>
> >> >> >>> So far this OSOL (now OI) box has performed excellently, with one major
> >> >> >>> exception: Sometimes, after a reboot, the cpu load was about 50-60%,
> >> >> >>> although the system was doing nothing. Until recently, another reboot
> >> >> >>> solved the issue.
> >> >> >>>
> >> >> >>> This does not work any longer. The system now always has a cpu load of
> >> >> >>> 50-60% when idle (and higher of course when there is actually some work
> >> >> >>> to do).
> >> >> >>>
> >> >> >>> I've already googled the symptoms. This didn't turn up very much useful
> >> >> >>> info, and the few things I found didn't apply to my problem. The most
> >> >> >>> notable was a problem which could be solved by disabling cpupm in
> >> >> >>> /etc/power.conf, but trying that didn't show any effect on my system.
> >> >> >>>
> >> >> >>> So I'm finally out of my depth. I have to admit that my knowledge of
> >> >> >>> Unix is superficial at best, so I decided to try looking for help here.
> >> >> >>>
> >> >> >>> I've run several diagnostic commands like top, powertop, lockstat etc.
> >> >> >>> and attached the results to this email (I've zipped the results of
> >> >> >>> kstat because they were >1MB).
> >> >> >>>
> >> >> >>> One important thing is that when I boot into the oi_151a live dvd
> >> >> >>> instead of booting into the installed system, I also get the high cpu
> >> >> >>> load. I mention this because I have installed several things on my OI
> >> >> >>> box like vsftpd, svn, netstat etc. I first thought that this problem
> >> >> >>> might be caused by some of this extra stuff, but getting the same
> >> >> >>> symptom when booting the live dvd ruled that out (I think).
> >> >> >>>
> >> >> >>> The machine is a custom-built mid-tower:
> >> >> >>> S-775 Intel DG965WHMKR ATX mainboard
> >> >> >>> Intel Core 2 Duo E4300 CPU 1.8GHz
> >> >> >>> 1x IDE DVD recorder
> >> >> >>> 1x IDE HD 200GB (serves as system drive)
> >> >> >>> 6x SATA II 1.5TB HD (configured as zfs raidz2 array)
> >> >> >>>
> >> >> >>> I have to solve this problem. Although the system runs fine and
> >> >> >>> absolutely serves its purpose, having the cpu at 50-60% load constantly
> >> >> >>> is a waste of energy and surely a rather unhealthy stress on the
> >> >> >>> hardware.
> >> >> >>>
> >> >> >>> Anyone any ideas...?
> >> >> >>>
> >> >> >>> Regards,
> >> >> >>> Gernot Wolf
> >> >> >>> _______________________________________________
> >> >> >>> OpenIndiana-discuss mailing list
> >> >> >>> OpenIndiana-discuss at openindiana.org
> >> >> >>> http://openindiana.org/mailman/listinfo/openindiana-discuss
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >>
> >>
> >>
> >
> >
>
>
>