Tag Archives: manifestly

Bespoke services: site/supervisord

Recently, I’ve been experimenting with supervisor, which is a Python-based process restarter for Unix/Linux. Lincoln Loop recently offered instructions on running supervisor under upstart, which is applicable to some of the current Linux distributions. On OpenSolaris and related systems, the service management facility, smf(5), can be used to ensure your supervisors stay online. Below is a simple manifest that starts (and restarts) supervisord after a small set of services becomes available.

If you don’t provide a supervisord.conf in one of the standard locations, enabling this service instance will send it straight to the maintenance state, because the start method fails repeatedly. You can use svcs -x to diagnose this:

$ svcs -x
svc:/site/supervisord:default (supervisor process control system)
 State: maintenance since Sat Sep 04 16:34:14 2010
Reason: Start method failed repeatedly, last exited with status 2.
   See: http://sun.com/msg/SMF-8000-KS
   See: utmpd(1M)
   See: utmpx(4)
   See: /var/svc/log/site-supervisord:default.log
Impact: This service is not running.

The log file will contain a message, with some amount of repetition, like

[ Sep  4 16:34:13 Enabled. ]
[ Sep  4 16:34:13 Rereading configuration. ]
[ Sep  4 16:34:13 Executing start method ("/usr/bin/supervisord"). ]
Error: No config file found at default paths (/usr/etc/supervisord.conf, /usr/supervisord.conf, supervisord.conf, etc/supervisord.conf, /etc/supervisord.conf); use the -c option to specify a config file at a different path
For help, use /usr/bin/supervisord -h
[ Sep  4 16:34:13 Method "start" exited with status 2. ]
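To avoid the trip to maintenance, you can check for a configuration file before enabling the instance. The following is a minimal Python sketch, not part of supervisor itself; the path list is copied from the error message above, and the function name find_config is made up for illustration.

```python
import os

# Paths copied from the error message above; supervisord derives some of
# them from its install prefix, so they may differ on other systems.
DEFAULT_PATHS = (
    "/usr/etc/supervisord.conf",
    "/usr/supervisord.conf",
    "supervisord.conf",
    "etc/supervisord.conf",
    "/etc/supervisord.conf",
)


def find_config(candidates=DEFAULT_PATHS, exists=os.path.isfile):
    """Return the first candidate path that exists, or None."""
    for path in candidates:
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    found = find_config()
    if found is None:
        print("no supervisord.conf found; expect the start method to fail")
    else:
        print("supervisord would read", found)
```

The relative entries are resolved against the current working directory, so the result depends on where you run the check.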

It’s worth noting that all of the programs run by a single instance of supervisord will be in the same process contract. If you know the fault characteristics of your programs, you may wish to use multiple instances of supervisord to keep programs with “sympathetic” failure modes and frequencies together. You may also need to ignore core dumps and external signals, depending on the programs you are running; on recent systems, you can see /var/svc/manifest/network/http-apache22.xml for an example of a startd property group that does so. Alternatively, you could modify your supervisord configuration to run each of the programs in an independent contract using ctrun(1).

Exercises

  1. We should really provide a property group that contains the key invocation settings as properties. I’ve omitted it here, particularly for the configuration file, because the method token expansion outlined in smf_method(5) lacks handling for unset property values. (*)
  2. Extend supervisord to understand process contracts. This exercise would include constructing a Python module to interact with the contract filesystem. (***)

Bespoke services: network/rmi/registry

Gary and I were recently prototyping an application that uses Java RMI, and so I ended up searching around to see if anyone has done a service conversion for rmiregistry(1). (rmiregistry(1) is the daemon that lets RMI clients find the available remote objects being served by various virtual machines on a given system.) Turns out no one has (or no one’s published it), which means it’s time to rev up the convert-o-tron.

Since we’re still developing our application and it’s likely we’ll change a definition or two, and since we need to restart the registry to pick up the changed remote objects, we’re going to make our prototype service restart automatically whenever we restart the registry. That means our prototype service has a dependency on network/rmi/registry with specific restart_on behaviour, so its service description has a fragment like the following:

<!--
As an RMI server application, we expect to be able to
register our RMI classes with the registry server.
-->
<dependency
name='rmi-registry'
grouping='require_all'
restart_on='restart'
type='service'>
<service_fmri value='svc:/network/rmi/registry' />
</dependency>

Inject that fragment into your various RMI servers’ descriptions (or the equivalent property group into the repository) and you’ll save a bit of time on application reinitializations.

So, if you’re interested, please feel free to take a copy of network/rmi/registry; comments and corrections welcome.

[ T: OpenSolaris smf RMI ]

Bespoke services: application/vncserver

In honour of the “Mugs for Manifests” contest, I thought I would spin out another custom service description I wrote some months ago.

My setup for working from home (key during the last six months of Solaris 10) is to tunnel into Sun’s network via one implementation or another of a virtual private network (VPN). In all cases, the VPN solution runs on Solaris. Although the VPN lets your system participate more or less like a regular host, I find it’s easier to use VNC to remotely present an X11 display from my main workstation, muskoka. But, of course, machines running pre-production bits can fail, be rebooted, or be reinstalled regularly, so I wanted the VNC server on my system to always be up: I wanted a VNC service.

What’s distinct about running the VNC server is that it should run as me, with my environment, and not as root with init(1M)’s. svc.startd(1M), while it can run methods according to smf_method(5), doesn’t populate the environment fully in the sense of login(1). So we will need to extract some data from the name service, which is cumbersome to perform in a shell script. We’ll write our method in Perl, which implies

Tip 1: Methods need not be shell scripts.

In fact, the start method and the stop method can be totally separate commands: you could write one in Python, and one can be an executable Java .jar archive, or some even more bizarre combination.
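As an illustration of Tip 1, here is how the environment lookup described above might be sketched in Python rather than Perl. This is an assumption-laden sketch, not the method I actually run: the function name login_environment and its lookup parameter are invented for the example, and it sets only the three variables taken from the password entry.

```python
import os
import pwd


def login_environment(uid=None, lookup=pwd.getpwuid):
    """Build USER, HOME and SHELL from the password database entry."""
    if uid is None:
        uid = os.getuid()
    entry = lookup(uid)  # consults whatever name service is configured
    return {
        "USER": entry.pw_name,
        "HOME": entry.pw_dir,
        "SHELL": entry.pw_shell,
    }


if __name__ == "__main__":
    # A real method would go on to exec the server with this environment.
    os.environ.update(login_environment())
    print("environment prepared for", os.environ["USER"])
```

The lookup argument exists only so the function can be exercised without a real password entry.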

The other trick is that, if VNC fails for some reason, I want to be aggressive about cleaning up its various leftover temporary files. For this purpose, I run the stop method with a different credential—the default of root—than the start method, which is done in our brief manifest by locating the <method_context> element on only the start method.

Tip 2: Methods need not be run with identical method contexts. Credentials, privileges, and the like may all differ from method to method.

Our manifest then looks like:

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='export'>
<service name='application/vncserver' type='service' version='0'>
<single_instance/>
<instance name='sch' enabled='true'>
<dependency name='milestone' grouping='require_all' restart_on='none' type='service'>
<service_fmri value='svc:/milestone/multi-user:default'/>
</dependency>
<dependency name='autofs' grouping='require_all' restart_on='none' type='service'>
<service_fmri value='svc:/system/filesystem/autofs:default'/>
</dependency>
<dependency name='nis' grouping='require_all' restart_on='none' type='service'>
<service_fmri value='svc:/network/nis/client:default'/>
</dependency>
<exec_method name='stop' type='method' exec='/home/sch/bin/vncserver_method stop' timeout_seconds='60'/>
<exec_method name='start' type='method' exec='/home/sch/bin/vncserver_method start' timeout_seconds='300'>
<method_context>
<method_credential user='sch' group='staff' />
</method_context>
</exec_method>
</instance>
</service>
</service_bundle>

The dependencies above are needed if you use NFS for home directories and NIS for name services; they could be reduced for less networked setups.

And, for the method, we have a short Perl program. The complete list of environment variables in login(1) would include LOGNAME, PATH, MAIL, and TZ (timezone), and exclude my silly setting of LANG, but most of these will be set up by the shell that runs the VNC startup script (its analogue to .xinitrc). The various print calls are just to let the service log show a little activity, and could be removed.

#!/usr/perl5/bin/perl

require 5.8.3;
use strict;
use warnings;
use locale;

my ($name, $passwd, $uid, $gid, $quota, $comment, $gcos, $dir, $shell,
    $expire) = getpwuid "$<";

$ENV{USER} = $name;
$ENV{HOME} = $dir;
$ENV{SHELL} = $shell;
$ENV{LANG} = "en_CA"; # Just to create havoc (i.e. expose bugs).

#
# The stop method is run as root so that it can cleanup.
#
if (defined($ARGV[0]) && $ARGV[0] eq "stop") {
    # ksh and sh specific
    print "stop method\n";
    system("$ENV{SHELL}", "-c", "/opt/csw/bin/vncserver -kill :1");
    if (-S "/tmp/.X11-unix/X1") {
        unlink("/tmp/.X11-unix/X1");
        unlink("/tmp/.X1-lock");
    }
    exit 0;
}

#
# The start method is run with the user's identity.
#
print "start method\n";
if (-f "/tmp/.X1-lock") {
    unlink("/tmp/.X1-lock");
}
if (-S "/tmp/.X11-unix/X1") {
    system("logger -p 1 application/vncserver requires " .
        "/tmp/.X11-unix/X1 be removed");
    exit 0;
}

# ksh and sh specific
{ exec "$ENV{SHELL}", "-c", "/opt/csw/bin/vncserver -pn -geometry 1600x1200 -depth 24 :1" };
system("logger -p 1 application/vncserver can't exec /opt/csw/bin/vncserver");
exit 1;

And now we have always-on VNC service for the regular telecommuter:

$ svcs -p vncserver
STATE          STIME    FMRI
online         13:01:01 svc:/application/vncserver:sch
13:01:00   100577 Xvnc
13:01:17   100625 xwrits
13:01:17   100626 ctrun
13:01:17   100632 xautolock
13:11:18   102348 xlock
$ uptime
12:00pm  up 23 hr(s),  4 users,  load average: 0.04, 0.07, 0.07

Exercises

  1. Remove the hard-coded display numbering (“:1”, “X1”, etc.).
  2. Make the resolution, display depth, RGB encoding, and other standard options into properties.
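As a starting point for both exercises, the display-dependent strings in the method above can be derived from a single number. The Python sketch below is only illustrative: the function names are hypothetical, and where the display number, geometry, and depth come from (for instance, service properties) is left open.

```python
def display_paths(display):
    """Return the (lock file, socket) pair for an X display number."""
    number = int(display)
    if number < 0:
        raise ValueError("display number must be non-negative")
    return ("/tmp/.X%d-lock" % number, "/tmp/.X11-unix/X%d" % number)


def start_command(display, geometry="1600x1200", depth=24):
    """Build the vncserver command line used by the start method."""
    return "/opt/csw/bin/vncserver -pn -geometry %s -depth %d :%d" % (
        geometry, int(depth), int(display))


if __name__ == "__main__":
    lock, socket_path = display_paths(1)
    print(lock, socket_path)
    print(start_command(1))
```

The defaults reproduce the values hard-coded in the method; a property-driven version would pass in whatever it read from the repository.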

[ T: smf ]

Bespoke services: application/catman

For various reasons—some reasonable, some suspect—Solaris doesn’t ship with a compiled set of windex databases for its manual pages. The unfortunate result is that helpful commands like apropos(1) or man -k are unhelpful:

$ apropos sort
/usr/man/windex: No such file or directory

smf(5) provides one way to address this shortcoming, via a transient service to be run during startup. Our service description would be roughly equivalent to the following:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type='manifest' name='sch:catman'>
<service
name='application/catman'
type='service'
version='1'>
<create_default_instance enabled='false' />
<single_instance />
<!--
  By default, application/catman will run in the background
  during boot.  If you want to run it periodically, execute

    /usr/sbin/svcadm restart catman

  If you wish to augment the default MANPATH, use the setenv
  subcommand to svccfg(1M).  For instance, to add the Java manual
  pages to the build:

    /usr/sbin/svccfg -s application/catman
    > setenv MANPATH /usr/share/man:/usr/java/man
    > exit
    /usr/sbin/svcadm refresh catman

  If MANPATH is not defined, the default manual path is
  /usr/share/man, as per catman(1M).
-->
<dependency
  name='local-filesystems'
  type='service'
  grouping='require_all'
  restart_on='none'>
  <service_fmri value='svc:/system/filesystem/local' />
</dependency>
<dependency
  name='remote-filesystems'
  type='service'
  grouping='optional_all'
  restart_on='none'>
  <service_fmri value='svc:/network/nfs/client' />
  <service_fmri value='svc:/system/filesystem/autofs' />
</dependency>
<exec_method
  type='method'
  name='start'
  exec='/usr/bin/catman -w'
  timeout_seconds='0' />
<exec_method
  type='method'
  name='stop'
  exec=':true'
  timeout_seconds='0' />
<property_group name='startd' type='framework'>
  <propval name='duration' type='astring' value='transient' />
</property_group>
<stability value='Unstable' />
<template>
  <common_name>
    <loctext xml:lang='C'>
      manual page index generation
    </loctext>
  </common_name>
  <documentation>
    <manpage title='catman' section='1M' manpath='/usr/share/man' />
  </documentation>
</template>
</service>
</service_bundle>

Following my own instructions in the comment block, I defined a value for MANPATH and refreshed the service. My setting can be double-checked with svcprop(1) like so:

$ svcprop -p start application/catman
start/exec astring /usr/bin/catman\ -w
start/timeout_seconds count 0
start/type astring method
start/environment astring MANPATH=/usr/share/man:/usr/openwin/man:/usr/sfw/man:/usr/dt/man:/usr/perl5/man:/usr/java/man:/usr/apache/man:/usr/X11/man:/opt/sfw/man:/opt/csw/man
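If you want to perform this double-check from a script, the output above is easy to pick apart. The following Python sketch assumes the three-column "name type value" layout shown here, with embedded spaces escaped by a backslash, and a start/environment property holding a single MANPATH assignment; the function names are invented for the example.

```python
import re


def parse_svcprop(text):
    """Parse svcprop output lines of the form 'name type value'."""
    properties = {}
    for line in text.splitlines():
        fields = line.strip().split(None, 2)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        raw = fields[2] if len(fields) == 3 else ""
        # Undo the backslash escaping of embedded whitespace.
        properties[fields[0]] = (fields[1], re.sub(r"\\(.)", r"\1", raw))
    return properties


def manpath_components(properties):
    """Return the MANPATH directories, assuming a single-valued environment."""
    _, value = properties.get("start/environment", ("astring", ""))
    key, _, paths = value.partition("=")
    return paths.split(":") if key == "MANPATH" and paths else []


if __name__ == "__main__":
    import sys
    print(manpath_components(parse_svcprop(sys.stdin.read())))
```

You could feed it with something like "svcprop -p start application/catman | python parse.py", where parse.py is whatever you name the file.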

Issuing “svcadm enable catman” will cause the service to be executed immediately, and upon each subsequent boot. Our earlier query becomes fecund:

$ apropos sort
FcFontSort      FcFontSort (3fontconfig)    - Return list of matching fonts
aclsort         aclsort (3sec)  - sort an ACL
alphasort       scandir (3c)    - scan a directory
alphasort       scandir (3ucb)  - scan a directory
bsearch         bsearch (3c)    - binary search a sorted table
bunzip2         bzip2 (1)       - a block-sorting file compressor and associated utilities
bzcat           bzip2 (1)       - a block-sorting file compressor and associated utilities
bzip2           bzip2 (1)       - a block-sorting file compressor and associated utilities
bzip2recover    bzip2 (1)       - a block-sorting file compressor and associated utilities
disksort        disksort (9f)   - single direction elevator seek sort for buffers
ldap_sort       ldap_sort (3ldap)   - LDAP entry sorting functions
ldap_sort_entries               ldap_sort (3ldap)   - LDAP entry sorting functions
ldap_sort_strcasecmp            ldap_sort (3ldap)   - LDAP entry sorting functions
ldap_sort_values                ldap_sort (3ldap)   - LDAP entry sorting functions
libbz2          libbz2 (3)      - library for block-sorting data compression
look            look (1)        - find words in the system dictionary or lines in a sorted list
qsort           qsort (3c)      - quick sort
sort            sort (1)        - sort, merge, or sequence check text files
sortbib         sortbib (1)     - sort a bibliographic database
tsort           tsort (1)       - topological sort
...

Exercises

  1. Add a configuration property that makes the service also rebuild the nroffed versions of the manual pages, if set to true.
  2. Make the service regenerate only in the case that components in the path have changed.
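One possible starting point for the second exercise is a staleness check that runs before catman. The Python sketch below is an assumption about how such a check could work rather than anything catman provides: it treats a MANPATH component as stale if its windex file is missing or older than one of its subdirectories. The function name needs_rebuild is made up for the example.

```python
import os


def needs_rebuild(manpath, index_name="windex"):
    """Return True if any MANPATH directory has a missing or outdated index."""
    for directory in manpath.split(":"):
        if not os.path.isdir(directory):
            continue  # skip components that do not exist
        index = os.path.join(directory, index_name)
        if not os.path.exists(index):
            return True
        built = os.path.getmtime(index)
        for entry in os.listdir(directory):
            path = os.path.join(directory, entry)
            # A directory's mtime changes when pages are added or removed.
            if os.path.isdir(path) and os.path.getmtime(path) > built:
                return True
    return False


if __name__ == "__main__":
    path = os.environ.get("MANPATH", "/usr/share/man")
    print("rebuild needed" if needs_rebuild(path) else "indexes are current")
```

Note the limitation: editing an existing page in place does not update its directory’s modification time, so a stricter check would have to look at the individual files.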

Tie knot: Knot 54 (Hanover).