smf(5): a view from the moon

One interesting aspect of smf(5) is that we have pulled apart many of the assumed interrelationships between system services, and made them explicit. Doing this makes building availability and failure models much easier, but it also lets us see one projection of Solaris’s shambling shape. (There’s another interesting technique for dynamic discovery of relationships via DTrace, but I’ll let Bryan show the image from those experiments when he’s ready.) Everyone wanted to visualize the service graph that results, so Dan Price and David Bustos came up with a way to generate one. Here’s the result, generated on my two-way Opteron system earlier today:

Because we’ll be tweaking the graph a bit more, I’m only showing this scaled-down version, but we can take a bit of a tour just from the gross features:

  • Generally, earliest to latest proceeds from right to left.
  • The large structure just below the centre is network/rpc/bind and its dependents, the various RPC services.
  • Furthest to the right is network/pfil, which prepares the network interfaces for IP Filter.
  • Furthest to the left is system/fmd, the new Solaris Fault Manager.
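For anyone who wants to draw a picture like this, here is a minimal sketch of turning a dependency table into Graphviz DOT text. It is not the tool Dan and David wrote. The `to_dot` function and the `SAMPLE_DEPS` table are hypothetical, and the edges in the table are made up for illustration rather than taken from a real repository. Setting `rankdir=RL` lays the graph out so that a dependency sits to the right of whatever depends on it, matching the right-to-left ordering described above.

```python
# Hypothetical dependency table: service -> services it depends on.
# The service names are real, but these edges are illustrative only.
SAMPLE_DEPS = {
    "network/pfil": [],
    "network/physical": ["network/pfil"],
    "network/rpc/bind": ["network/physical"],
    "system/fmd": ["network/rpc/bind"],
}


def to_dot(deps):
    """Render a {service: [dependencies]} mapping as Graphviz DOT text."""
    lines = ["digraph services {", "    rankdir=RL;"]
    for service in sorted(deps):
        # Declare every node so services without edges still appear.
        lines.append(f'    "{service}";')
        for dep in sorted(deps[service]):
            # Edge runs from the dependency to the dependent service.
            lines.append(f'    "{dep}" -> "{service}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(SAMPLE_DEPS))
```

Saving the output to a file and running it through the Graphviz `dot` command should produce an image, though a real system has far more services than this toy table.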

As you might guess, we’ve had to write numerous graph-aware diagnosis algorithms to make a large structure like this one navigable. We’re looking forward to further enhancing our reporting and visualization tools to make troubleshooting easier still.
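To give a flavour of what “graph-aware diagnosis” can mean, here is a small sketch that walks a dependency graph to find the services ultimately responsible for another service not running. It is an illustration under simplifying assumptions, not one of the algorithms we actually ship: the `root_causes` function, the sample table, and the rule that anything other than `online` counts as blocked are all hypothetical.

```python
def root_causes(service, deps, states):
    """Return the services that ultimately keep `service` from running.

    deps maps a service to the services it depends on; states maps a
    service to a state string. Any state other than "online" is treated
    as blocked (a simplifying assumption for this sketch).
    """
    causes = set()
    seen = set()
    stack = [service]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        blocked = [d for d in deps.get(current, [])
                   if states.get(d) != "online"]
        if blocked:
            # Something underneath is unhappy; keep descending.
            stack.extend(blocked)
        elif states.get(current) != "online":
            # Not online, yet every dependency is: this is a root cause.
            causes.add(current)
    return sorted(causes)


# Hypothetical example: "web" is offline because "net" is in maintenance.
EXAMPLE_DEPS = {"web": ["rpc", "fs"], "rpc": ["net"], "fs": [], "net": []}
EXAMPLE_STATES = {"web": "offline", "rpc": "offline",
                  "fs": "online", "net": "maintenance"}

if __name__ == "__main__":
    print(root_causes("web", EXAMPLE_DEPS, EXAMPLE_STATES))
```

The point of the walk is that an administrator is told about the one service that needs attention rather than every service that happens to be waiting on it.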

(Of course, knowing all the dependencies in Solaris doesn’t protect you from the occupational hazard of overdiagnosis: John and I spent an hour poking at every possible aspect of his system, which ultimately needed nothing more than a new network cable. I figure that about once a year I still end up following an “all possible software causes” algorithm and find myself in mdb(1) poking around the kernel, rather than checking the cable connections or looking for a bad software install.)