Wednesday, June 30, 2010

First Post


On a more serious note... if anyone stumbles across this page, I will be using it for discussing my project Pushy. Pushy is a Python (now Java too) package for connecting to a remote Python interpreter, and accessing objects therein as if they were local. In other words, it's a sort of RPC package.

So why another RPC package? While I was working on test automation, I identified a couple of things I didn't like about existing RPC frameworks:
  1. Invariably, a nailed-up (i.e. long-running) server must be running for you to connect to. This leads to problem #2.
  2. Custom software needs to be maintained on both the client and the server.
  3. The security mechanisms in existing frameworks have a tendency to suck. For nailed-up servers running as an arbitrary user, the server program must perform its own authentication/authorisation to ensure the user can't access resources it isn't supposed to.
My thinking was this: rather than implementing RPC services and maintaining them on all of these different servers, why can't I put all of the logic in the client? The Python Standard Library is rather extensive, so why don't we just expose that to the client, and let the client define the "service"? That's the basis of Pushy.

So first I started hacking away to develop a proof of concept, using XML-RPC to transparently access objects in a remote interpreter, automatically creating proxy objects to represent remote objects and performing method/operator calls by sending requests. Then I found RPyC, which did essentially the same thing, only better. 
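To make the proxy idea concrete, here is a minimal sketch. It is not the proof-of-concept code itself: the names Proxy and FakeRemote are made up for illustration, and the "remote interpreter" is just an in-process object standing in for whatever transport (XML-RPC or otherwise) would carry the requests.

```python
class Proxy:
    """Local stand-in for an object that lives in another interpreter."""

    def __init__(self, send, object_id):
        self._send = send      # callable that delivers a request to the remote side
        self._id = object_id   # identifies the remote object

    def __getattr__(self, name):
        # Only reached for attributes not found locally, so any method call
        # becomes a request naming the remote object, the method, and the args.
        def remote_method(*args):
            return self._send(self._id, name, args)
        return remote_method

    def __add__(self, other):
        # Operators are looked up on the class, not the instance, so each
        # one needs an explicit forwarding method like this.
        return self._send(self._id, "__add__", (other,))


class FakeRemote:
    """Hypothetical in-process 'remote interpreter' (no network involved)."""

    SIMPLE = (type(None), bool, int, float, str)

    def __init__(self):
        self.objects = {}

    def export(self, obj):
        # Keep the object alive on the "remote" side and hand back a proxy.
        self.objects[id(obj)] = obj
        return Proxy(self.handle, id(obj))

    def handle(self, object_id, name, args):
        result = getattr(self.objects[object_id], name)(*args)
        if isinstance(result, self.SIMPLE):
            return result           # simple values are returned as-is
        return self.export(result)  # everything else comes back as a new proxy


if __name__ == "__main__":
    remote = FakeRemote()
    items = remote.export([3, 1, 2])
    items.append(4)          # forwarded to the list held by FakeRemote
    items.sort()
    combined = items + [5]   # the result is a list, so we get another proxy
    print(type(combined).__name__, combined.count(5))
```

The interesting part is handle(): anything that isn't a simple value stays on the remote side and is represented locally by another proxy, which is what makes the remote objects feel local.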

So at this stage I've moved the "service" from server to client, but I still need to run some code on the server, and it's still "nailed-up". What's more, the server is running as a single user, which poses a massive security risk. How can we do better? Enter SSH... SSH is prevalent on Linux and UNIX operating systems, and one of the cool things you can do with it is remotely execute a command and pipe to/from its standard I/O. Maybe we could do something with that?

Using SSH solves all three problems, in fact. Pushy works as follows: the client application imports the Pushy package, and invokes a function to "connect" to a remote host. What this does is create an SSH connection to the remote host, using the username and password (or public-key authentication) specified by the caller. After the connection is created, Pushy executes Python on the remote system, passing it a command-line program, i.e. something like "python -c 'run_server()'". This command-line program is a short one, which reads a larger program off its standard input stream, and executes it to start the Pushy server program. The program didn't exist on disk before the connection, and won't exist after.
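The bootstrap trick can be sketched in a few lines. This is an illustration of the idea rather than Pushy's actual bootstrap: BOOTSTRAP, SERVER_SOURCE and launch are names invented for the example, and a local Python interpreter stands in for the one that would be started over SSH.

```python
import subprocess
import sys

# The short command-line program: read a larger program from standard
# input and execute it. Nothing is written to disk on the "server".
BOOTSTRAP = "import sys; exec(sys.stdin.read())"

# The larger program. Here it is a trivial stand-in for a real server.
SERVER_SOURCE = """
import os
print("server started, pid %d" % os.getpid())
"""


def launch(interpreter_command, source):
    """Start an interpreter with the bootstrap and feed it `source` on stdin.

    `interpreter_command` is a list such as [sys.executable]; in the real
    setup the interpreter would be started on the remote host over SSH.
    """
    command = list(interpreter_command) + ["-c", BOOTSTRAP]
    result = subprocess.run(
        command,
        input=source,          # sent down the pipe, then stdin is closed
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(launch([sys.executable], SERVER_SOURCE), end="")
```

Note that this sketch reads standard input until end-of-file, which is fine for a demonstration but would leave no channel for further requests. A long-lived server would presumably need to read a known number of bytes instead, so that the same pipe can carry messages afterwards.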

So let's revisit the three problems now:
  1. (Problem: a nailed-up server is required.) Unless you count sshd, no nailed-up services are required on the server. I think it's fair to discount sshd, as it is so commonplace.
  2. (Problem: custom software must be maintained on both client and server.) As described in the paragraph above, there is no longer any custom code to maintain on the server. So we can implement programs to access arbitrary portions of the Python Standard Library on a remote system, with nary a change on the server. There is an added benefit here: if the client/server protocol changes, there's still nothing to upgrade on the server.
  3. (Problem: server code must perform application-level authentication/authorisation.) Did I mention I'm lazy? Doing authentication/authorisation properly is a pain in the neck, and I'd like to avoid it if I can. Turns out I can if I'm using SSH, as the "server" program is running as the user specified by the client program. The usual operating system authentication and access controls ensue.

Over time I decided to drop RPyC in favour of writing my own protocol. I was using RPyC for things it wasn't intended for, and it showed in various areas, such as exception handling. One thing remains though: the auto-importing feature of RPyC is lovingly imitated by Pushy.
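For anyone who hasn't seen auto-importing, the idea is a namespace object that imports a module the first time you ask for it as an attribute. The sketch below is local-only and purely illustrative: AutoImporter is a made-up name, not a class from RPyC or Pushy, and in the remote case the import would happen in the remote interpreter with a proxy coming back.

```python
import importlib


class AutoImporter:
    """Namespace that imports modules on first attribute access."""

    def __getattr__(self, name):
        # Only called when `name` is not already an attribute of this object.
        try:
            module = importlib.import_module(name)
        except ImportError:
            raise AttributeError(name)
        setattr(self, name, module)  # cache so later lookups skip __getattr__
        return module


if __name__ == "__main__":
    modules = AutoImporter()
    # No explicit "import os" anywhere: the attribute access triggers it.
    print(modules.os.path.basename("/tmp/example.txt"))
```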