[OpenIndiana-discuss] How to get a usable console on OI?
Joshua M. Clulow
josh at sysmgr.org
Mon Jan 25 23:36:33 UTC 2021
On Mon, 25 Jan 2021 at 11:52, Chris <oidev at bsdos.info> wrote:
> On 2021-01-24 23:10, Joshua M. Clulow via openindiana-discuss wrote:
> > # dtrace -x stackframes=100 -n '
> > profile-997 /arg0/ { @[stack()] = count(); }
> > tick-60s { exit(0); }' -o out.kern_stacks
> >
> OK, I simply created an sh script (DTRACE) with the contents above and
> fired it off as: sudo ./DTRACE &
> followed by: ls -Cla /usr/include
> which created out.kern_stacks (attached).
>
> > That will capture the stack of what's running in the kernel (if the
> > kernel is running at the time) on each CPU, 997 times per second, for
> > 60 seconds. While that's running, kick off the "time ls" again. Take
> > the "out.kern_stacks" file and pass it through the flame graph
> > generator; e.g., something like:
> >
> > $ ./stackcollapse.pl out.kern_stacks | ./flamegraph.pl > output.svg
> The results of the above are attached as: out_kern_stacks.svg
> I had somehow expected a longer spike on the graph, as the output of
> ls -Cla /usr/include took the same ~20 seconds to finish writing to the
> screen as before.
That's great! Thank you.
I expected a bit more as well, but I think I can see what's happening.
It looks like the "nvidia" driver is closed source and built in a way
that doesn't correctly maintain the frame pointer so DTrace is not
able to walk up the stack past that point. On a machine that isn't
using the nvidia driver, it looks more like...
    gfx_private`bitmap_cons_display()
    gfx_private`do_gfx_ioctl+0x272
    gfx_private`gfxp_fb_ioctl+0x63
    vgatext`vgatext_ioctl+0xc0
    genunix`cdev_ioctl+0x2b
    genunix`ldi_ioctl+0x89
    tem`tems_display_layered+0x37
    tem`tems_safe_display+0x2d
    tem`tem_safe_pix_cls_range+0x152
    tem`tem_safe_pix_cls+0x4d
    tem`tem_safe_clear_chars+0xb0
    tem`tem_safe_scroll+0xdc
    tem`tem_safe_lf+0xbd
    tem`tem_safe_control+0x18d
    tem`tem_safe_parse+0x53
    tem`tem_safe_input_byte+0x109
    tem`tem_safe_terminal_emulate+0x84
    tem`tem_write+0x73
    wc`wcuwsrv+0xc7
    genunix`runservice+0x49
    genunix`queue_service+0x41
It looks like one would only get to bitmap_cons_display() by making a
VIS_CONSDISPLAY ioctl(), perhaps via tems_display_layered(). This
routine ends up copying memory around, basically. That it's doing it
100% of the time on one CPU seems like the obvious bottleneck here.
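If you want a number rather than eyeballing the flame graph, something like the sketch below could total the samples in out.kern_stacks and report how many stacks contain a given frame. This is only a rough helper: "frame_share" is a name I made up, and it assumes the usual layout of a stack() aggregation, where each stack is a run of frame lines followed by a line holding just the sample count.

```shell
#!/bin/sh
# frame_share FRAME FILE (hypothetical helper, not part of DTrace or FlameGraph)
# Prints "<matching samples> <total samples> <percent>" for stacks containing FRAME.
# Assumes each stack in FILE is a run of frame lines followed by a count-only line.
frame_share() {
    awk -v frame="$1" '
        # A line holding only a number ends one stack and carries its sample count.
        NF == 1 && $1 ~ /^[0-9]+$/ {
            total += $1
            if (seen) hits += $1
            seen = 0
            next
        }
        # Any other line is a frame; note whether this stack has the target.
        index($0, frame) { seen = 1 }
        END {
            pct = (total > 0) ? 100 * hits / total : 0
            printf "%d %d %.1f\n", hits, total, pct
        }
    ' "$2"
}
```

For example, "frame_share bitmap_cons_display out.kern_stacks" on a machine without the nvidia driver, or "frame_share 'nvidia`' out.kern_stacks" on yours, where the stacks appear to stop at the driver.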
It'd be good to know, perhaps, at what _rate_ calls to
bitmap_cons_display() are being made. You could try something like:
    dtrace -q -n '
        bitmap_cons_display:return { @ = count(); }
        tick-1s {
            printf("%Y ", walltimestamp); printa("%@d", @);
            printf("\n"); trunc(@);
        }'
I ran that on my system, and then did "echo a >/dev/wscons"
simultaneously and was able to count 91 firings...
    2021 Jan 25 15:28:46
    2021 Jan 25 15:28:47
    2021 Jan 25 15:28:48 91
    2021 Jan 25 15:28:49
    ...
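If you redirect that per-second output to a file while the "ls" runs, a small awk sketch like the one below could total it up for you. "rate_summary" is just a name I picked, and it assumes lines shaped like the ones above: a timestamp in four fields, with the count as a fifth field only on seconds where the probe fired.

```shell
#!/bin/sh
# rate_summary [FILE...] (hypothetical helper)
# Summarises per-second counts from output shaped like:
#   2021 Jan 25 15:28:48 91
# Seconds with no firings have no fifth field and are skipped.
rate_summary() {
    awk '
        NF == 5 && $5 ~ /^[0-9]+$/ {
            total += $5          # calls seen overall
            busy++               # seconds with at least one call
            if ($5 > peak) peak = $5
        }
        END {
            printf "calls=%d busy_seconds=%d peak_per_second=%d\n", total, busy, peak
        }
    ' "$@"
}
```

With no file argument it reads standard input, so the dtrace output can be piped straight into it.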
Another thing that would be interesting to know is: if you disable the
nvidia driver completely, is performance better? Because you're not
currently using X11, I don't believe you technically need it. I think
you could try, at the boot loader, hitting escape to get to the "ok"
prompt and then...
    set disable-nvidia=true
    boot
It should then give you a WARNING about the "nvidia" module being
disabled at boot. Hopefully performance is at least different, if not
better, if you do that.
Cheers.
--
Joshua M. Clulow
http://blog.sysmgr.org