Saturday, June 11, 2011

C-Preprocessor Macros in Python

TL;DR: I've started a new project, csnake, which allows you to write your C preprocessor macros in Python.

Long version ahead...

You want to do what now?

A couple of years ago I had a silly idea: create a C preprocessor in which macros can be defined in Python. The idea was born out of getting sick of hacking build scripts to generate code from data, though I pursued it more for fun.

I started playing around with Boost Wave, which is a "Standards conformant, and highly configurable implementation of the mandated C99/C++ preprocessor functionality packed behind an easy to use iterator interface". With a little coding skulduggery, I managed to define macros as C++ callable objects that take and return tokens. After that, it was a simple matter of adding a Python API on top.


The Result

What we end up with is a Python API that looks something like this:

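import math
import sys
from _preprocessor import *

# A macro is a plain Python function: it receives its argument from the
# preprocessor and returns a list of replacement tokens.
def factorial(n):
    # Convert the argument to text, then to an int, before computing.
    return [Token(T_INTLIT, math.factorial(int(str(n))))]

p = Preprocessor("test.cpp")
p.define(factorial)
# Write each output token as it is produced.
for t in p.preprocess():
    sys.stdout.write(str(t))
sys.stdout.write("\n")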

Which will take...

int main(){return factorial(3);}

And give you...

int main(){return 6;}

In case it's not immediately clear: the preprocessor replaces each "factorial(n)" call with an integer literal holding the factorial of the argument token. This isn't a very interesting example, so if you can imagine a useful application, let me know ;)
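To make the idea concrete without building csnake, here is a toy sketch in pure Python. It is not how csnake works: csnake operates on real tokens from Boost Wave, while this uses a regular expression on raw text and only handles a single, non-nested argument. The names MACROS, define and expand are made up for this illustration.

```python
import math
import re

# Hypothetical registry of macro name -> Python callable (not csnake's API).
MACROS = {}

def define(func):
    """Register a Python function as a macro under its own name."""
    MACROS[func.__name__] = func
    return func

@define
def factorial(n):
    # The argument arrives as source text, so convert it before computing.
    return str(math.factorial(int(n)))

# Matches NAME(ARG), where ARG contains no parentheses or commas.
_CALL = re.compile(r"\b([A-Za-z_]\w*)\s*\(\s*([^(),]*?)\s*\)")

def expand(source):
    """Replace calls to registered macros with the text they return."""
    def substitute(match):
        name, arg = match.group(1), match.group(2)
        macro = MACROS.get(name)
        # Anything that is not a registered macro is left untouched.
        return macro(arg) if macro else match.group(0)
    return _CALL.sub(substitute, source)

print(expand("int main(){return factorial(3);}"))
```

A text-based approach like this would also rewrite matches inside string literals and comments, which is one reason to work on a proper token stream instead.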

The above script works with the current code, using Python 3.2 and compiling with GCC. If you'd like to play with it, grab the code from the csnake GitHub repository. Once you've got it, run "python3.2 setup.py build". Currently there is just an extension module ("csnake._preprocessor"), so set your PYTHONPATH to the build directory and play with that directly.

I have chosen to make csnake Python 3.2+ only, for a couple of major reasons:

  • All the cool kids are doing it: it's the way of the future. But seriously, Python 3.x needs more projects to become more mainstream.
  • Python 3.2 implements PEP 384, which defines a stable ABI so that extension modules built against it can be used across Python 3.x versions without recompiling. Finally. I always hated that I had to recompile for each version.
... and one very selfish (but minor) reason: I wanted to modernise my Python knowledge. I've been ignoring Python 3.x for far too long.



The Road Ahead

What I've done so far is far from complete, and not immediately useful. It may never be very useful. For it to become useful, it would need at least:
  • A way of detecting (or at least configuring) pre-defined macros and include paths for a target compiler/preprocessor. A standalone C preprocessor isn't worth much. It needs to act like, or delegate to, a real preprocessor such as GCC's.
  • A #pragma to define Python macros in source, or perhaps if I'm feeling adventurous, something like #pydefine.
  • A simple, documented Python API.
  • A simple command line interface with the look and feel of a standard C preprocessor.
  • Some unit tests.
I hope to add these in the near future. I've had code working for the first two points, and the remaining points are relatively simple. I will post again when I have made some significant progress.
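For the first point, one possible approach (not necessarily the one csnake will use) is to ask the target compiler itself. GCC prints its predefined macros when run with "-dM -E" on empty input. The sketch below assumes a "gcc" executable on the PATH; parse_defines and predefined_macros are names invented for this example.

```python
import subprocess

def parse_defines(text):
    """Parse '#define NAME VALUE' lines into a dict of name -> value."""
    macros = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[0] == "#define":
            # Function-like macros keep their parameter list in the key.
            macros[parts[1]] = parts[2] if len(parts) == 3 else ""
    return macros

def predefined_macros(compiler="gcc"):
    """Ask the compiler to dump the macros it predefines for C."""
    result = subprocess.run(
        [compiler, "-dM", "-E", "-x", "c", "-"],  # preprocess empty stdin
        input="", capture_output=True, text=True, check=True,
    )
    return parse_defines(result.stdout)
```

Calling predefined_macros() on a machine with GCC installed should return entries such as "__GNUC__". Include paths would need a separate query.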