[OpenIndiana-discuss] Shell to use?

Joshua M. Clulow josh at sysmgr.org
Thu Jan 21 05:35:28 UTC 2021


In the words of the late Joe Armstrong: Goodness, what a lot of mails!

On Tue, 19 Jan 2021 at 08:42, Hung Nguyen Gia <gh_origin at zohomail.com> wrote:
> Both systems are ZFS and have the same configuration (memory, cpu cores,...).
>
> I don't know how to use Solaris specific tools. But I could check some of the value via top.

If you're interested in working on an illumos distribution, you'll
need to learn to use the tools the system provides.  While top is
available, it is a coarse-grained tool which gathers a lot of
expensive information and can impact the very performance you're
trying to measure.  Some examples of tools to look at instead include:

    https://illumos.org/man/1M/mpstat  -- CPU statistics

    https://illumos.org/man/1M/iostat  -- I/O subsystem statistics

    https://illumos.org/man/1M/vmstat  -- memory statistics, including
        whether you're swapping to disk

    https://illumos.org/man/1M/prstat  -- process statistics, a bit
        like top, but including threading and microstate statistics

These are also relatively coarse, but are good gauges to look at while
you're searching for obvious bottlenecks or points of resource
exhaustion.  There are other tools as well.

There is a pair of books written for (Open)Solaris, and a lot of
what's inside is still relevant to illumos systems:

    Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture (2nd Ed.)
        https://www.amazon.com/gp/product/0134185978

    Solaris Performance and Tools: DTrace and MDB Techniques for
    Solaris 10 and OpenSolaris
        https://www.amazon.com/gp/product/0131568191

There are also lots of blogs and other articles out there on the Internet.

> It seemed the system did the best it can. All cores are used. Memory are not left free, too. Nothing was wasted. It's in heavy load.
>
> But the system is just... slow.

This thread has ballooned to include a lot of guesses at what issues
you might be seeing, and not all of the advice is completely accurate.
Unfortunately, when the system is just not going as fast as you
believe it should, guesswork is exactly the opposite of what you
need.

The most productive way to move forward, and often the _only_ way, is
to examine what the computer is actually doing.  Fortunately, DTrace
and (K)MDB are tools that allow you to do exactly that:

    DTrace:
        https://illumos.org/books/dtrace/

        https://illumos.org/man/1M/dtrace

    The Modular Debugger (MDB):
        https://illumos.org/books/mdb/

        https://illumos.org/man/1/mdb
        https://illumos.org/man/1/kmdb

Of particular note is that you can use DTrace to ask questions about
what the system is doing in real time; e.g.,

  - the latency of various system calls, individually or in aggregate,
for one process or all processes

  - how many processes are being forked, and at what rate

  - what commands are being run, in what order

  - how long each of them executes before exiting

  - if they are blocked waiting for resources, which resources, and why
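
As a rough sketch of how a few of the questions above might be
phrased, the Python snippet below keeps some D one-liners as strings
and builds a dtrace(1M) command line for each.  The dictionary keys
and the helper name dtrace_argv are made up for illustration; the
providers used (syscall, proc, and the tick probes) are described in
the DTrace guide linked above.  Nothing is executed here, and the
resulting commands would need to be run with sufficient privileges on
an illumos system.

```python
import shlex

# Hypothetical mapping from a question to a D program; the key names
# are illustrative only.
D_PROGRAMS = {
    # Latency distribution (in nanoseconds) of each system call,
    # aggregated across all processes.
    "syscall-latency": (
        "syscall:::entry { self->ts = timestamp; } "
        "syscall:::return /self->ts/ "
        "{ @[probefunc] = quantize(timestamp - self->ts); self->ts = 0; }"
    ),
    # Processes created each second, keyed by the name of the parent.
    "fork-rate": (
        "proc:::create { @[execname] = count(); } "
        "tick-1s { printa(@); trunc(@); }"
    ),
    # Each command as it is successfully exec'd, in order.
    "commands-run": (
        'proc:::exec-success { printf("%d %s\\n", pid, curpsinfo->pr_psargs); }'
    ),
}


def dtrace_argv(name, quiet=True):
    """Build an argv list for dtrace(1M).

    -q suppresses the default per-probe output; -n takes the probe
    description together with its D clauses.
    """
    argv = ["dtrace"]
    if quiet:
        argv.append("-q")
    argv += ["-n", D_PROGRAMS[name]]
    return argv


if __name__ == "__main__":
    for name in D_PROGRAMS:
        # shlex.join quotes the D text so the line can be pasted into a shell.
        print(shlex.join(dtrace_argv(name)))
```

To narrow any of these to a single process, one option is to add a
predicate such as /pid == 1234/ (the PID here is a placeholder) to the
relevant clause.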

You can use DTrace for profiling hot code as well, through stack
sampling.  If the CPUs in the system are very busy and spending a lot
of time in the kernel (SYS time, etc.), you might generate a flame
graph to see where most of the work is being performed:

    https://github.com/brendangregg/FlameGraph#dtrace
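
To give a feel for what happens between sampling and drawing, here is
a minimal Python sketch of the "folding" step.  It assumes the usual
output shape of a DTrace stack() aggregation: indented frames with the
leaf first, followed by a line holding only the sample count.  The
FlameGraph repository ships its own stackcollapse.pl for this, so the
function below (fold_stacks, a made-up name) is only an illustration,
and the sample text is invented rather than captured from a real
system.

```python
import re
from collections import Counter


def fold_stacks(text):
    """Collapse DTrace stack() aggregation output into 'a;b;c count' lines."""
    folded = Counter()
    frames = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.isdigit():
            # A bare number closes one stack: it is the sample count.
            if frames:
                # DTrace prints the leaf first; the folded format is root first.
                folded[";".join(reversed(frames))] += int(line)
            frames = []
        else:
            # Drop the "+0x1a" offset so samples within one function merge.
            frames.append(re.sub(r"\+0x[0-9a-fA-F]+$", "", line))
    return [f"{stack} {count}" for stack, count in sorted(folded.items())]


# Invented sample in the assumed format; not real profiling data.
SAMPLE = """

              unix`mutex_enter+0x10
              genunix`syscall_mstate+0x5c
              unix`sys_syscall+0x1a1
               12

              unix`mutex_enter+0x24
              genunix`syscall_mstate+0x5c
              unix`sys_syscall+0x1a1
                3
"""

if __name__ == "__main__":
    for folded_line in fold_stacks(SAMPLE):
        print(folded_line)
```

The two sample stacks differ only in their offsets, so they fold into
a single line with a combined count of 15.  Lines in this form are
what flamegraph.pl takes as input.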

Note that this stuff is not merely theoretical.  The folks who work
on the OS, professionally or as a hobby, use techniques like this when
they hit a problem they need to solve.  One concrete example of a
performance improvement from the last couple of years is:

    9936 atomic ops in syscall_mstate() induce significant overhead
        https://www.illumos.org/issues/9936

In this case, some statistics gathering we introduced caused a
reduction in performance.  John explored what the system was doing and
made an improvement to alleviate the issue.

I am 100% confident that whatever issues are causing your builds to
run slower than you would like can be diagnosed and improved -- and
all by you!  It takes time and effort, and absolutely rewards a
methodical engineering approach.  The most important property to
exhibit in systems software work is tenacity.

On the other hand, if you're happier running FreeBSD or Linux, you
should do that!  It's certainly true that illumos is a niche operating
system, and that we have a relatively small community.  This is true
to some extent for many other operating system projects as well.  If
illumos solves a problem for you to the extent that it's worth your
engaging with that community and helping to maintain the software,
we're here to help.


Cheers.

-- 
Joshua M. Clulow
http://blog.sysmgr.org


