Sunday, September 25, 2011

cmonster update

I've been quietly beavering away at cmonster, and thought I should share an update.

Note: the changes I describe below are not yet part of a cmonster release. I'll release something once I've stabilised the API and tested it more thoroughly. In the meantime you can pull the source from GitHub.

In the last month or so I have been adding C/C++ parsing capabilities to cmonster by exposing the underlying Clang parser API. I've wanted a simple, scriptable framework for analysing and manipulating C++ source for a few years now, so that I, and others, can more rapidly develop tools for writing better C++ and eliminate some of the drudgery. I've only just made a start, but cmonster now provides an API for parsing a C/C++ source file and returning a Python object with which to inspect the result.

So what does cmonster look like now? It has both a preprocessor and a parser interface, the former now incorporated into the latter. The parser interface parses a single source file and returns its Abstract Syntax Tree (AST). As is typical with parsers, many classes are involved in describing each declaration, statement, type, and so on, so I've added Cython into the mix to speed up the process of defining Python equivalents for each of these classes.

Unfortunately Cython does not yet support PEP 384 (Py_LIMITED_API), so at the moment cmonster is back to requiring the full Python API, and thus must be rebuilt for each new Python release. I've had a tinker with Cython to get its output to compile with Py_LIMITED_API, and hope to provide a patch in the near future.

What's next? Once I get the AST classes mapped out, I intend to introduce a source-to-source translation layer. I'm not entirely sure how this'll work yet, but I think ideally you'd just modify the AST and call some function to rewrite the main source file. Elements of the AST outside of the main source file would be immutable. That's the hope, but it may end up being something a little more crude, using Clang's "Rewriter" interface directly to replace source ranges with some arbitrary string. I expect this will be a ways off yet, though.

Monday, September 5, 2011

Hello, Mr. Hooker.

I've been procrastinating on cmonster. I have some nasty architectural decisions to make, and I keep putting them off. In the meantime I've been working on a new little tool called "Mr. Hooker" (or just mrhooker).

Introducing Mr. Hooker

The idea behind mrhooker is very simple: I wanted to be able to write LD_PRELOAD hooks in Python. If you're not familiar with LD_PRELOAD, it's a mechanism employed by various UNIX and UNIX-like operating systems for "preloading" some specified code in a shared library. You can use this to provide your own version of native functions, including those in standard libraries such as libc.

Anyway, I occasionally need an LD_PRELOAD library to change the behaviour of a program that I can't easily recompile. Often these libraries are throw-away, so writing one in C can take as long as the job it's meant to do. So I wrote mrhooker to simplify this.

It turns out there's very little to do, since Cython (and friends) do most of the hard work. Cython is a programming language that extends Python to simplify building Python extensions. It also has an interface for building these extensions on-the-fly. So mrhooker doesn't need to do much: it takes a .pyx (Pyrex/Cython source) file, compiles it to a shared library using Cython, and loads that library, along with some common support code, into a child process using LD_PRELOAD.

Example - Hooking BSD Sockets

Let's look at an example of how to use mrhooker. Hooks are defined as external functions in a Cython script. Say we want to hook the BSD sockets "send" function. First we'd find the signature of send (man 2 send), which is:

ssize_t send(int sockfd, const void *buf, size_t len, int flags);

Given this, we can produce a wrapper in Cython, like so:

cdef extern ssize_t send(int sockfd, char *buf, size_t len, int flags) with gil:
    pass  # body to come

There are a couple of important things to note here. First, the parameter type for "buf" drops const, since Cython doesn't know about const-ness. Second, and crucially, the function must be defined "with gil". This ensures that the function acquires the Python Global Interpreter Lock before calling any Python functions. Okay, with that out of the way, let's go on...

We'll want to do something vaguely useful with this wrapper. Let's make it print out the argument values, and then continue on with calling the original "send" function. To do that we'll use dlsym/RTLD_NEXT to find the next function called "send".

cdef extern ssize_t send(int sockfd, char *buf, size_t len, int flags) with gil:
    print "====> send(%r, %r, %r, %r)" % (sockfd, buf[:len], len, flags)
    real_send = dlsym(RTLD_NEXT, "send")
    if real_send:
        with nogil:
            res = (<ssize_t(*)(int, void*, size_t, int) nogil>real_send)(
                sockfd, buf, len, flags)
        return res
    return -1

We'll also need to declare dlsym and RTLD_NEXT. Let's do that.

# Import stuff from <dlfcn.h>
cdef extern from "dlfcn.h":
    void* dlsym(void*, char*)
    void* RTLD_NEXT
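Incidentally, you can sketch the same kind of dynamic lookup from plain Python with ctypes. This isn't part of mrhooker, and it uses a dlopen(NULL)-style global lookup rather than RTLD_NEXT, but it shows what dlsym is doing for the hook above:

```python
import ctypes

# CDLL(None) is like dlopen(NULL): it searches the process's global
# symbol table, much as dlsym does in the Cython hook.
libc = ctypes.CDLL(None, use_errno=True)
real_send = libc.send  # ctypes resolves this with dlsym under the hood

# Declare the C signature so ctypes marshals arguments correctly.
real_send.restype = ctypes.c_ssize_t
real_send.argtypes = [ctypes.c_int, ctypes.c_void_p,
                      ctypes.c_size_t, ctypes.c_int]
```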

Now you just run:

mrhooker <script.pyx> <command>

And there we go. This is trivial - it would also be fairly trivial to write a C program to do this. But if we wanted to do anything more complex, or if we were frequently changing the wrapper function, I'd much rather write it in Python - or Cython, as it were.


Edit: I just noticed that it's broken if you don't have a certain config file. I always had one while testing... until I got to work.
You'll get an error "ConfigParser.NoSectionError: No section: 'default'". I'll fix the code at home, but in the meantime you can do this:

$ mkdir ~/.mrhooker
$ echo [default] > ~/.mrhooker/mrhooker.config

P.S. If you add "build_dir = <path>" in that section, or in a per-module section, mrhooker/Cython will store the shared library it builds there. Then, if you don't change the source, it'll be reused without rebuilding.
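For instance, a ~/.mrhooker/mrhooker.config along these lines would enable caching (the build_dir path here is just an example):

```ini
[default]
build_dir = /home/me/.mrhooker/build
```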

Thursday, September 1, 2011

Google App Engine Agent

A couple of months ago I wrote about my foray into the world of Google App Engine. More recently I got the itch again, and had some ideas for fixing the problems I found when attempting to get Pushy to work in Google App Engine.

The root of most of the problems is that Google App Engine is stateless in nature. Server instances can be spun up or spun down without notice, and so we can't store the complex state that Pushy really requires. So a couple of weeks ago I set about implementing a server-initiated RPC mechanism that is asynchronous and (mostly) stateless.

How would it work? Well, earlier this year I read that ProtoRPC was released, which brought RPC services to Google App Engine. In our case, Google App Engine is the client, and is calling the agent - but we can at least reuse the API to minimise dependencies and hopefully simplify the mechanism. Okay, so we have a ProtoRPC service running on a remote machine, consumed by our Google App Engine application. How do they talk?

One thing I wanted to avoid was the need for polling, as that's both slow and expensive. Slow in that there will necessarily be delays between polls, and expensive in that unnecessary polls will burn CPU cycles in Google App Engine, which aren't free. Long-polling isn't possible, either, since HTTP requests are limited to 30 seconds of processing time. If you read my last post, you probably already know what I'm going to say: we'll use XMPP.

What's XMPP? That's the Extensible Messaging and Presence Protocol, which is the protocol underlying Jabber. It is also the primary protocol that Google Talk is built on. It's an XML-based, client-server protocol, so peers do not talk directly to each other. It's also asynchronous. So let's look at the picture so far...

  • The client (agent) and server (GAE application) talk to each other via XMPP.
  • The agent serves a ProtoRPC service, and the GAE application will consume it.
Because our RPC mechanism will be server-initiated, we'll need something else: agent availability discovery. Google App Engine provides XMPP handlers for agent availability (and unavailability) notification. When an agent starts up, it registers its presence with the application. When an agent is discovered, the application requests the agent's service descriptor. The agent responds, and the application stores the descriptor away in Memcache.

We (ab)use Memcache to share data between instances of the application. When you make enough requests to the application, Google App Engine may dynamically spin up a new instance to handle them. By storing the service descriptor in Memcache, any instance can access it. I say abuse because Memcache is not guaranteed to keep the data you put in it - entries may be evicted when memory is constrained. Really we should use Datastore, but I was too lazy to deal with cleaning it up. "Left as an exercise for the reader." One thing I did make a point of using was the new Python Memcache CAS API, which allows safe concurrent updates to Memcache.
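The gets/cas dance looks roughly like this. It's sketched here against a tiny in-memory stand-in, since the real client lives in google.appengine.api.memcache (and, unlike this toy, needs memcache.add to create a missing key); the gets/cas method names mirror the real API:

```python
class FakeMemcacheClient(object):
    """In-memory stand-in for GAE memcache.Client's gets/cas API."""
    def __init__(self):
        self._data = {}      # key -> value
        self._version = {}   # key -> version counter
        self._seen = {}      # key -> version observed by the last gets()

    def gets(self, key):
        # Remember the version we handed out, for the later cas() check.
        self._seen[key] = self._version.get(key, 0)
        return self._data.get(key)

    def cas(self, key, value):
        # Succeed only if nobody updated the key since our gets().
        if self._version.get(key, 0) != self._seen.get(key):
            return False
        self._data[key] = value
        self._version[key] = self._version.get(key, 0) + 1
        return True

def register_agent(client, jid, descriptor):
    # Retry loop: a safe concurrent update of the shared agent registry.
    while True:
        agents = client.gets('agents') or {}
        agents[jid] = descriptor
        if client.cas('agents', agents):
            return
```

If another instance updates the key between our gets and cas, the cas fails and we simply retry with the fresh value.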

Orrite. So now we have an agent and application which talk to each other via XMPP, using ProtoRPC. The application discovers the agent, and, upon request, the agent describes its service to the application. How can we use it? Well the answer is really "however you like", but I have created a toy web UI for invoking the remote service methods.

Wot 'ave we 'ere then? The drop-down selection has all of the available agent JIDs (XMPP IDs). The textbox has some Python code, which will be executed by the Google App Engine application. Yes, security alert! This is just a demonstration of how we can use the RPC mechanism - not a best practice. When you hit "Go!", the code will be run by the application. But before doing so, the application will set a local variable "agent", which is an instance of the ProtoRPC service stub bound to the agent selected in the drop-down.

ProtoRPC is intended to be synchronous (judging from the comments in the code, anyway), but there is an asynchronous API for clients. Given that the application has at most 30 seconds to service a request, though, it can't actively wait for a response. What to do? Instead, we complete the request asynchronously when the client responds, conveying some context to the response handler so it knows what to do with the response.
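The context-passing idea can be sketched independently of GAE. The message format and method here are made up for illustration (the real code uses ProtoRPC over XMPP), but the shape is the same: the context rides along untouched and tells the response handler where to deliver the result.

```python
import json

def make_request(method, params, channel_id):
    # The channel ID rides along as opaque context; the agent echoes it back.
    return json.dumps({'method': method, 'params': params,
                       'context': channel_id})

def agent_handle(message):
    # Agent side: run the method, return the result plus the untouched context.
    request = json.loads(message)
    result = sum(request['params'])  # stand-in for the real service method
    return json.dumps({'result': result, 'context': request['context']})

def server_handle_response(message, channels):
    # Server side: the context tells us which channel gets the result.
    response = json.loads(message)
    channels[response['context']].append(response['result'])
```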

In the demo, I've done something fairly straightforward with regards to response handling. When the UI is rendered, we create an asynchronous channel using the Channel API. We use this to send the response back to the user. So when the code is executed, the service stub is invoked, and the channel ID is passed as context to the client. When the client responds, it includes the context. Once again, security alert. We could fix security concerns by encrypting the context to ensure the client doesn't tamper with it. Let's just assume the client is friendly though, okay? Just this once!

So we finally have an application flow that goes something like this:
  1. Agent registers its service.
  2. Server detects the agent's availability, and requests its service descriptor.
  3. Agent sends its service descriptor; the server receives it and stores it in Memcache.
and then...
  1. User hits web UI, which server renders with a new channel.
  2. User selects an agent and clicks "Go!".
  3. Server instantiates a service stub, and invokes it with the channel ID as context. The invocation sends an XMPP message to the agent.
  4. Agent receives XMPP message, decodes and executes the request. The response is sent back to the server as an XMPP message, including the context set by the server.
  5. The server receives the response, and extracts the response and channel ID (context). The response is formatted and sent to the channel.
  6. The web UI's channel Javascript callback is invoked and the response is rendered.


I've put my code up on GitHub - feel free to fork and/or have a play. I hope this can be of use to someone. If nothing else, I've learnt a few new tricks!