Category Archives: Software

@tomww points out that the Spec Files Extra (SFE) repository is packaging recent node.js versions, built with their GCC 4.6.3 compilers. That means installation is as simple as a pair of pkg(5) invocations

# pkg set-publisher -p http://pkg.openindiana.org/sfe
# pkg install runtime/javascript/nodejs

The current version is 0.8.16, built less than 24 hours ago. Cool.
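
If the delivered node is on your PATH, a quick check of what landed looks like this (the output shown is simply what I’d expect from this particular build):

$ node --version
v0.8.16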

Building node.js on OpenIndiana

More specifically, these instructions should let you build node 0.8.16, the current stable version, on oi_151a5:

$ uname -a
SunOS cooler 5.11 oi_151a5 i86pc i386 i86pc Solaris

(cooler is in its seventh year of service, having run many builds of Solaris, OpenSolaris, and, now, OpenIndiana.) First, you’ll need a GCC 4.x compiler. If you attempt to use the 3.4.3 gcc compiler, you’ll get

cc1: error: unrecognized command line option "-fno-tree-vrp"
cc1: error: unrecognized command line option "-fno-tree-sink"

in the output from your failed build. So, use the Illumos GCC 4.4.4 build, which you can install via

$ sudo pkg install developer/illumos-gcc developer/gnu-binutils

which, on my system, led to the installation of 3 packages and the download of 56.9MiB of content. Include these new tools in your path for the build:

$ export PATH=/opt/gcc/4.4.4/bin:/usr/gnu/bin:$PATH

To help the node build find the appropriate Standard C++ library for linking, we set the linker run path via the environment. (By having a correct run path, our node binary won’t need LD_LIBRARY_PATH to be set to pick up libstdc++.so.6.) We can then configure and invoke (GNU) make to start the build:

$ export LD_RUN_PATH=/opt/gcc/4.4.4/lib
$ CC=gcc ./configure --prefix=$HOME
$ CC=gcc gmake

You can test the resulting binary

$ ./node
> process.version;
'v0.8.16'
> ^D

and install the node platform to the configured location.

$ CC=gcc gmake install

And now you have a working node.js for your OpenIndiana system. npm is installed as well, so you can begin downloading the modules needed for your development. (If you’re running OmniOS, it looks like the “managed services” repository includes a pkg(5)-installable node.js package, so you can install that interpreter directly. Maybe that’s what cooler should run next.)
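
As a quick smoke test of npm, you could pull down a module and list it; the module name here is just an example:

$ npm install express
$ npm ls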

post-review returns HTTP 500 with cribbed repository configuration

We run ReviewBoard at work; we set up each hosted Git repository as new projects are started. I made a silly error a few weeks ago: when I created a new repository, I filled in the Mirror Path setting with that of a previous repository. This mistake leads to client output like

$ post-review 7fdd345e22783b289be128205cc0c47935057e20
Error creating review request: HTTP 500

Your administrator mail address should receive a message with a body like

Traceback (most recent call last):

File "/usr/lib/python2.7/site-packages/Django-1.3.3-py2.7.egg/django/core/handlers/base.py", line 111, in get_response
  response = callback(request, *callback_args, **callback_kwargs)

File "/usr/lib/python2.7/site-packages/Django-1.3.3-py2.7.egg/django/views/decorators/cache.py", line 79, in _wrapped_view_func
  response = view_func(request, *args, **kwargs)

File "/usr/lib/python2.7/site-packages/Django-1.3.3-py2.7.egg/django/views/decorators/vary.py", line 22, in inner_func
  response = func(*args, **kwargs)

File "/usr/lib/python2.7/site-packages/Djblets-0.6.22-py2.7.egg/djblets/webapi/resources.py", line 397, in __call__
  result = view(request, api_format=api_format, *args, **kwargs)

File "/usr/lib/python2.7/site-packages/Djblets-0.6.22-py2.7.egg/djblets/webapi/resources.py", line 581, in post
  return self.create(*args, **kwargs)

File "/usr/lib/python2.7/site-packages/ReviewBoard-1.6.11-py2.7.egg/reviewboard/webapi/decorators.py", line 127, in _check
  return view_func(*args, **kwargs)

File "/usr/lib/python2.7/site-packages/Djblets-0.6.22-py2.7.egg/djblets/webapi/decorators.py", line 88, in _checklogin
  return view_func(*args, **kwargs)

File "/usr/lib/python2.7/site-packages/Djblets-0.6.22-py2.7.egg/djblets/webapi/decorators.py", line 62, in _call
  return view_func(*args, **kwargs)

File "/usr/lib/python2.7/site-packages/Djblets-0.6.22-py2.7.egg/djblets/webapi/decorators.py", line 231, in _validate
  return view_func(*args, **new_kwargs)

File "/usr/lib/python2.7/site-packages/ReviewBoard-1.6.11-py2.7.egg/reviewboard/webapi/resources.py", line 5984, in create
  Q(local_site=local_site))

File "/usr/lib/python2.7/site-packages/Django-1.3.3-py2.7.egg/django/db/models/manager.py", line 132, in get
  return self.get_query_set().get(*args, **kwargs)

File "/usr/lib/python2.7/site-packages/Django-1.3.3-py2.7.egg/django/db/models/query.py", line 351, in get
  % (self.model._meta.object_name, num, kwargs))

MultipleObjectsReturned: get() returned more than one Repository -- it returned 2! Lookup parameters were {}

each time a review is submitted against one of these repositories.

You can fix this condition by correcting the Mirror Path of the new repository; it’s not a ReviewBoard issue.

Bespoke services: site/redis

For prototyping web applications, I have recently come to rely on having Redis handy. In various sketches or early versions, I’ve used it to store event logs, to persist a collection of simple objects, or to conveniently manage a particularly large dictionary.

To make it easy to have a redis-server running on an OpenSolaris-derived system, I’ve written an smf(5) service manifest:

The default configuration of Redis is good enough for most prototyping scenarios, so this manifest assumes (a) that you’ve built and installed Redis to /usr/local, its default install location, and (b) that you’re happy with the default configuration. In its default configuration, redis-server does not daemonize, and writes a log message every 5 seconds—you’ll very much want to change the latter if you move to production.
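
Assuming you’ve saved the manifest as site-redis.xml (the filename is my own choice), importing and enabling the service looks like the following sketch; the redis-cli ping at the end is just a convenient way to confirm the server is answering:

# svccfg import site-redis.xml
# svcadm enable site/redis
$ /usr/local/bin/redis-cli ping
PONG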

Exercises

  1. Add a property group and property to store a configuration location, and modify the start method appropriately. This enhancement should be on the service, such that it can be easily overridden on each instance. (*)
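
(A hedged hint for the property-group half, with names that are entirely my own choosing: set the default on the service, and an instance-level value will override it.

# svccfg -s site/redis addpg config application
# svccfg -s site/redis setprop config/file = astring: /usr/local/etc/redis.conf
# svcadm refresh site/redis:default

The start method would then reference the value via the %{config/file} token expansion described in smf_method(5).)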

A government agency I interact with has updated their web-based client software. The original application was a basic sequence of web forms. Its replacement? A roughly 50MiB Silverlight-based application. In the process of the update, they discarded my original web account and password. The backend service that the application must communicate with is still slow, operating costs now include the bandwidth to update cached copies (for performance reasons), and the application itself has new usability issues. Because of the switch from standardized Web technologies to Silverlight, the majority of their customers can’t run the application on their phone or tablet. (If it were Flash, iPads would still be excluded.) How was this change an upgrade, again?

Bespoke services: site/supervisord

Recently, I’ve been experimenting with supervisor, which is a Python-based process restarter for Unix/Linux. Lincoln Loop recently offered instructions on running supervisor under upstart, which is applicable to some of the current Linux distributions. On OpenSolaris and related systems, the service management facility, smf(5), can be used to ensure your supervisors stay online. Below is a simple manifest that starts (and restarts) supervisord after a small set of services becomes available.
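
With the manifest saved locally (call it site-supervisord.xml; again, the name is my own), importing and enabling it follows the usual pattern:

# svccfg import site-supervisord.xml
# svcadm enable site/supervisord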

If you don’t provide a supervisord.conf in one of the standard locations, enabling this service instance will result in it heading immediately to the maintenance state, as the start method will fail repeatedly. You can use svcs -x to perform this diagnosis:

$ svcs -x
svc:/site/supervisord:default (supervisor process control system)
 State: maintenance since Sat Sep 04 16:34:14 2010
Reason: Start method failed repeatedly, last exited with status 2.
   See: http://sun.com/msg/SMF-8000-KS
   See: utmpd(1M)
   See: utmpx(4)
   See: /var/svc/log/site-supervisord:default.log
Impact: This service is not running.

The log file will contain a message, with some amount of repetition, like

[ Sep  4 16:34:13 Enabled. ]
[ Sep  4 16:34:13 Rereading configuration. ]
[ Sep  4 16:34:13 Executing start method ("/usr/bin/supervisord"). ]
Error: No config file found at default paths (/usr/etc/supervisord.conf, /usr/supervisord.conf, supervisord.conf, etc/supervisord.conf, /etc/supervisord.conf); use the -c option to specify a config file at a different path
For help, use /usr/bin/supervisord -h
[ Sep  4 16:34:13 Method "start" exited with status 2. ]
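
Once a supervisord.conf is in place in one of those standard locations (or the start method is taught to pass -c), the instance can be taken out of maintenance and retried:

# svcadm clear site/supervisord:default
$ svcs site/supervisord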

It’s worth noting that all of the programs run by a single instance of supervisord will be in the same process contract. If you know the fault characteristics of your programs, you may wish to use multiple instances of supervisord to keep programs with “sympathetic” failure modes and frequencies together. You may also need to ignore core dumps and external signals, depending on the programs you are running; on recent systems, you can see /var/svc/manifest/network/http-apache22.xml for an example of a startd property group that does so. Alternatively, you could modify your configuration to run each of the programs in an independent contract using ctrun(1).
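
For the core-dump/signal case, the same effect can be had from the command line instead of the manifest. The sketch below assumes the startd property group doesn’t already exist (drop the addpg if it does), and the ctrun(1) line only shows the shape of the alternative, with a placeholder program path:

# svccfg -s site/supervisord addpg startd framework
# svccfg -s site/supervisord setprop startd/ignore_error = astring: "core,signal"
# svcadm refresh site/supervisord:default

$ ctrun -l child /path/to/program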

Exercises

  1. We should really provide a property group that contains the key invocation settings as properties. I’ve omitted it here, particularly for the configuration file, because the method token expansion outlined in smf_method(5) lacks handling for unset property values. (*)
  2. Extend supervisord to understand process contracts. This exercise would include constructing a Python module to interact with the contract filesystem. (***)

adirent.[ch]: Adding d_type to struct dirent on OpenSolaris

An occasional porting problem you may encounter when compiling programs for OpenSolaris is the absence of d_type in the directory entry structure returned by readdir(3C). I hit this issue when experimenting with mu as a search solution for my accumulated email.

A trivial example of the failure can be produced by the following program:

#include <sys/types.h>
#include <dirent.h>
#include <err.h>
#include <stdio.h>

int
main(int argc, char *argv[])
{
        DIR *d;
        struct dirent *e;

        if ((d = opendir("/")) == NULL)
                err(1, "opendir failed");

        for (e = readdir(d); e != NULL; e = readdir(d)) {
                if (e->d_type != DT_UNKNOWN)
                        (void) printf("recognized filetype for '%s'\n",
                            e->d_name);
        }

        (void) closedir(d);

        return (0);
}

When we attempt to compile this program with gcc, we get something like

$ gcc a.c
a.c: In function `main':
a.c:16: error: structure has no member named `d_type'
a.c:16: error: `DT_UNKNOWN' undeclared (first use in this function)
a.c:16: error: (Each undeclared identifier is reported only once
a.c:16: error: for each function it appears in.)

Studio cc will give similar output:

$ /opt/SunStudioExpress/bin/cc a.c
"a.c", line 16: undefined struct/union member: d_type
"a.c", line 16: undefined symbol: DT_UNKNOWN
cc: acomp failed for a.c

The addition of d_type to struct dirent came first for the BSD Unixes and was later added to Linux. Because it’s not easy to add members to well-known structures and preserve binary compatibility, OpenSolaris and Solaris lack this field, as well as the DT_* constant definitions. (If d_type were to become part of the Unix standards, Solaris would likely have to introduce a second family of opendir()/readdir()/closedir() functions and a second version of the structure, similar to how large files were introduced for 32-bit programs.)

Because we fail at compilation time, our workaround has to modify either the program’s source code or its build environment. (Preloading is too late.) It’s probably possible to combine a few definitions with a shared object included via LD_PRELOAD, but it seems easier to just provide a C wrapper around readdir(3C) and an alternate struct dirent. We develop this approach in the next section.

DIRENT and READDIR

The approach we take is

  1. Introduce DIRENT and READDIR via adirent.h.
  2. Change the source program such that each call to readdir() is replaced by READDIR() and each use of struct dirent is replaced by DIRENT. In each file so modified, add a #include <adirent.h>.
  3. Compile adirent.c via gcc -I. -O2 -c adirent.c or equivalent.
  4. Add adirent.o to the link line for each binary that includes one of the files modified in step 2.

If we apply these steps to our example above, we get

#include <sys/types.h>
#include <adirent.h>
#include <err.h>
#include <stdio.h>

int
main(int argc, char *argv[])
{
        DIR *d;
        DIRENT *e;

        if ((d = opendir("/")) == NULL)
                err(1, "opendir failed");

        for (e = READDIR(d); e != NULL; e = READDIR(d)) {
                if (e->d_type != DT_UNKNOWN)
                        (void) printf("recognized filetype for '%s'\n",
                            e->d_name);
        }

        (void) closedir(d);

        return (0);
}

with the result that compilation and execution now work

$ gcc -O2 -I. -c adirent.c
$ gcc -I. a.c adirent.o
$ ./a.out

This shim function and its definitions should be sufficient for working around this incompatibility in most ports, but there are a few additional comments worth making.

Performance. Because many programs expect d_type to be one of DT_REG or DT_DIR to save on a stat(2) call, this shim will force those programs into an alleged “slow” path. The actual impact of returning DT_UNKNOWN on every call will be program- and situation-dependent; it didn’t seem to affect my mail indexing.

Multithreaded programs. The current implementation does not protect the static structure defined in adirent.c. Programs with multiple threads performing readdir(3C) calls through READDIR() will get unexpected results. It should be relatively straightforward to dynamically allocate one struct adirent for each thread coming through READDIR() for the first time.

Downloads

I suppose these should be in a repository on Bitbucket or GitHub. For now, they’re just simple downloads:

Acknowledgments

I discussed this problem with Dan, who in particular noted that DT_UNKNOWN was always a legal return value for d_type. Bart looked over my shoulder and spied at least one error during the debugging phase.

~4100MiB

I seeded the 2008.05 release candidate for about 45 hours, ultimately shipping a little over 4100 megabytes. I’m going to take a break, because I want to update my DP2-based workstation and get some work done, but, once we have new bits, I’ll start seeding again.

(I found the actual result: 4237MiB sent up, so a bit over 4GiB.)

Stopping Firefox's restore session dialog

I’m sure that there are Firefox users out there who want to restore their previous session; I never do, and so deactivating that dialog is a big timesaver. If you search, you’ll find a few writeups on how to adjust the configuration, but I want to document the minimum steps to suppress the restore session dialog.

  1. Enter about:config into the URL field.
  2. Enter sessionstore into the displayed filter field and press Enter.
  3. Double-click the value cell of the browser.sessionstore.enabled row, so that the value field reads false.
  4. Restart Firefox. One click saved.


OpenSolaris: Defect tracking relationships

Just prior to the Board election, we ran a poll of the core contributors to get some sense of what one active subset felt were the five most pressing obstacles to open development. Dan just issued an initial Beta of a webrev-based approach, derived from his earlier experiment on http://cr.grommit.com, so that’s a starting point for Priority 3. The Board is tackling, in full public view on ogb-discuss, Priority 4: there are OGB/2007/001, Project creation enhancements, and OGB/2007/002, Community and Project Reorganization, as two significant chunks of a stabilized reorganization. Priority 2, the deployment of a “request to integrate” system, is somewhat gated on ON and sister consolidations’ switch to Mercurial, being pursued in the scm-migration project—it’s an aspect of workflow that isn’t required by all of the hosted consolidations. Priority 5, the deployment of an opensolaris.org-hosted wiki, is in a requirements gathering phase over on website-discuss.

That leaves Priority 1, the deployment of a public bug tracking system. Bug tracking has loomed over the OpenSolaris effort for pretty much its entire implementation phase; we’ve known that aspects of the current bug tracking methodologies impact many parts of Sun’s business, and that any solution will require the identification of which entanglements are strategic—meaning that there’s a requirement for any new system—and which are accidental—meaning that there’s only some transition cost, as the entangled system can be adjusted to consume information in some designed fashion. So, as part of getting a set of draft requirements together for discussion, I thought I would work through some of the issues facing defect tracking as we move to open development. Most of these points are about the developer–distribution (or upstream–downstream) interface that a defect tracking system (DTS) or defect tracking architecture represents.

The primary requirement that a distribution has of its DTS is some ability to maintain confidential data associated with each tracked defect. Let’s call that database a proprietary annotation system (PAS)—the data within it capture the customer histories associated with various defects or collections of defects (“products”?) represented in the system. The DTS, meanwhile, is meant to be neutral across all participants, developers and distribution assemblers, and unconcerned with non-technical characteristics of the defect.

This contrast allows us to postulate a set of relationships among various active DTSs and PASs for an open development community.

The association of confidential information with an existing defect looks something like

[Figure “p simple”: a PAS annotating an existing defect in the DTS]

Of course, in the OpenSolaris case, SMI’s annotations become just one potential PAS; other distributions may also choose to annotate publicly known (“community tracked”) defects:

[Figure “p multiple”: several PASs annotating the same community-tracked defects]

One requirement that we’ve heard during preliminary discussions with the various teams is that there must be some ability to search the entirety of the product-relevant portion of the DTS. One possibility is that each PAS operator builds a search corpus that combines the upstream DTS with the PAS content. A potentially more economical alternative would be to allow the associations to be bidirectional (so that an indexer with authorizations allowing it to access one or more proprietary annotation systems can present a complete defect corpus). Making the existence of the annotations public does not seem like a significant leak of proprietary information, while the existence of annotations might be a useful measure of defect significance. (It is probably worth explicitly stating that having a complete defect corpus for searching does not imply that use of a single DTS is the only means of obtaining such a corpus.)

These associations are more complex objects than the current See Also links in typical monolithic DTSs, in that they carry one or more mappings of status and responsibility between the DTS and each PAS. Potentially, this capability could lead to more precise handling of release readiness, in that a query involving a group of PAS-tracked defects could indicate that one or more stoppers are blocking the release of a certain version of the tracked product. Of course, prior to open development, the Solaris organization historically managed that fairly well for its products, so why is it worth discussing?

Well, for OpenSolaris today, the set of defects in our own DTS isn’t the complete set of relevant defects for product release. In fact, as examples, Solaris tracks the defects, branches, and patches for GNOME, Mozilla Firefox, and Perl, among others. That is, Sun’s current combined DTS/PAS, Bugster, is in the following relationships with an upstream source

[Figure “p us relns 1”: Bugster tracking an upstream DTS defect that affects the product]

when the product is affected by an upstream-sourced component, and

[Figure “p us relns 2”: Bugster tracking an upstream DTS defect that affects a customer]

when a customer is affected by an upstream-sourced component. There’s also an awkward local cost since upstream state is usually tracked manually (or possibly via custom scripting).

We can choose to apply this tracking need to our second defect/annotation figure above, in two ways: we can track the OpenSolaris consolidation defects as a peer of other upstream DTSs

[Figure “p full relns 1”: OpenSolaris consolidation defects tracked as a peer of other upstream DTSs]

or, more interestingly, we can allow community management of the relevance of upstream defects, like the following

[Figure “p full relns 2”: community management of the relevance of upstream defects]

(A distribution may choose not to annotate the public DTS record on an upstream bug, but may instead choose to annotate the upstream DTS record directly. Ignoring the community signal when it has a mismatch with distribution priorities seems likely; using it as a guide when no conflicting principle exists seems like a safe course as well.)

I think that introducing these kinds of associations allows us to solve a large class of defect tracking problems that are currently impacting the OpenSolaris space. (Obviously there are issues, too; for instance, one could introduce association cycles, which clients must detect.) I would expect that the confidentiality issue impacts all of the distributions pulling from OpenSolaris consolidations (and probably, more generally, all groups looking at open development): I believe every distribution has customers of some kind. The importance of upstream relationships may presently be operating environment-specific, although I know other groups are also dependent on some number of OSS components.