Monthly Archives: December 2008

collie 0.13 released

I fixed a few bugs, and I was hitting one of them often enough to be annoying, so I decided to cut a release and do an official build. A tarball is available for this one.

Changes in this release:

  • KILLED tags should return non-zero too
  • Fixed signal handling so test programs don’t ignore SIGINT
  • Fixed sequence padding in the nanny scheduler
  • Fixed buffering when tag names collide

Introducing qarsh

This tool is a little more dangerous than the other two.  The name is an acronym for Quality Assurance Remote SHell, and it is a replacement for ssh in testing environments.  Let me say that again: this is for testing environments on a secured network.  This is NOT something you want to run on a production machine or on an Internet-facing server, because it is essentially a back door.

How is it different from ssh?

Logs all commands run

Every time you run a command through qarsh, the daemon logs the full command to syslog.  When something goes wrong on the remote system, you can look through /var/log/messages to see which commands were run, mixed in with the system log messages.

No authentication, encryption, or related setup required

In a testing environment, authentication and encryption are not important features.  We’re more interested in running command A on host B as user C.  Being able to do that without jumping through hoops allows us to be more productive in our testing.  By leaving these features out, we also save a lot of overhead per connection.

Exits the same way as the remote command

To make qarsh more transparent, we made it exit exactly the same way as the remote command.  That means a wait() on qarsh gets the same status back, including which signal killed the remote command.
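Here is a rough sketch of why that matters to a test harness. It is not part of qarsh, and describe_exit is a name I made up for illustration. The function runs a command as a child and reports whether it exited normally or was killed by a signal. The argv could just as well be a qarsh invocation (its command-line syntax is not shown here, so I am not guessing at it); a local sh stands in for the remote command.

```python
import signal
import subprocess


def describe_exit(argv):
    """Run argv and report how it ended, as the parent sees it after wait()."""
    proc = subprocess.run(argv)
    # subprocess reports death by signal N as a return code of -N.
    if proc.returncode < 0:
        return ("signal", signal.Signals(-proc.returncode).name)
    return ("exit", proc.returncode)
```

Because qarsh exits the same way the remote command did, a harness written like this should not need a special case for remote commands.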

Relays signals to remote commands

We also pass signals forward, so when you kill qarsh, it in turn kills the remote command.  This makes it easier to clean up a test run that is spread across multiple nodes.  We forward INT, TERM, HUP, USR1, and USR2.
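The general pattern looks something like the sketch below. This is my own illustration, not qarsh's implementation: run_with_forwarding is a hypothetical name, and a local child process stands in for the remote command, so the relay is a plain send_signal() rather than anything sent over the network.

```python
import signal
import subprocess

# The same five signals listed above.
FORWARDED = (signal.SIGINT, signal.SIGTERM, signal.SIGHUP,
             signal.SIGUSR1, signal.SIGUSR2)


def run_with_forwarding(argv):
    """Run argv as a child and pass the listed signals on to it."""
    child = subprocess.Popen(argv)

    def forward(signum, _frame):
        # Relay the signal only while the child is still running.
        if child.poll() is None:
            child.send_signal(signum)

    # Install the relay, remembering the handlers that were there before.
    previous = {sig: signal.signal(sig, forward) for sig in FORWARDED}
    try:
        return child.wait()
    finally:
        # Put the original handlers back once the child is gone.
        for sig, handler in previous.items():
            signal.signal(sig, handler)
```

This has to be called from the main thread, since that is the only place Python allows signal handlers to be installed. A signal that arrives in the brief window before the handlers are in place is not forwarded; a real tool would want to close that gap.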

Does not hang when system hangs or reboots

One problem with ssh is that it can take a long time to detect when a remote host reboots or hangs, because it waits for TCP to return an error.  This can take 10 minutes or more.  Qarsh uses a secondary daemon to detect when a remote host is no longer responding or has rebooted.
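I have not described how the secondary daemon works, so take the following only as a generic heartbeat sketch of the idea and not as qarsh's protocol. The HostMonitor class, the timeout, and the boot_id value are all assumptions for illustration: the host is assumed to send periodic heartbeats carrying some identifier that changes on every boot.

```python
class HostMonitor:
    """Track heartbeats from one host and classify its state (hypothetical)."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before "unresponsive"
        self.last_seen = None    # time of the most recent heartbeat
        self.boot_id = None      # boot identifier from the latest heartbeat

    def heartbeat(self, now, boot_id):
        """Record a heartbeat; return True if the host rebooted since the last one."""
        rebooted = self.boot_id is not None and boot_id != self.boot_id
        self.boot_id = boot_id
        self.last_seen = now
        return rebooted

    def is_unresponsive(self, now):
        """True when no heartbeat has arrived within the timeout."""
        return self.last_seen is None or now - self.last_seen > self.timeout
```

The point of a scheme like this is that the client decides the host is gone after a few seconds of silence, instead of waiting for TCP to give up.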

No tty support, don’t try running vi

This is not a complete replacement for ssh, as it does not provide the tty services needed to run interactive commands.  For system investigation, you should still use ssh.

Where to get it

The source code for this project is in my public_git directory.

git clone git://