Saturday, April 28, 2012

An llgo runtime emerges

It's been a long time coming, but I'm now starting to put together pieces of the llgo runtime. Don't expect much any time soon, but I am zeroing in on a design at least. The sleuths in the crowd will find that only string concatenation has been implemented thus far, which is pretty boring. Next up, I hope, will be interface-to-interface conversions, and interface-to-value conversions, both of which require (for a sane implementation) a runtime library.

I had previously intended to write the runtime largely in C, as I expected that would be the easiest route. I started down this road by writing a basic thread creation routine in C, using pthread. The code was compiled using Clang, emitting LLVM IR which could be easily linked with the code generated by llgo. It's more or less the same idea implemented by the gc Go compiler (the linking of C and Go code, that is, not the reliance on pthread). Even so, I'd like to write the runtime in Go as much as possible.

Why write the runtime in Go? Well, for one, it will make llgo much more self-contained, which will make distribution much easier since there won't be a reliance on Clang. Another reason is based on a lofty, but absolutely necessary goal: that llgo will one day be able to compile itself. If llgo compiles itself, and compiles its own runtime, then we have a great target for compiler optimisations: the compiler itself. In other words, "compiler optimisations should pay for themselves."

In my last post I mentioned that LLVM 3.1 is coming up fast, and this release has the changes required by llgo. Unfortunately, I've just found that the C API lacks an interface for linking modules, so I'm going to have to submit a patch to LLVM again, and the window for inclusion in 3.1 has certainly passed. Rather than break gollvm/llgo's trunk again, I'll create a branch for work on the runtime. I'll post again when I've submitted a patch to LLVM, assuming the minor addition is accepted.

Sunday, April 8, 2012

llgo update: Go1, automated tests

This week I finished up Udacity CS373: Programming a Robotic Car, and also finally finished reading GEB. So I'll hopefully be able to commit some more time to llgo again.

I moved on to Go's weekly builds a while back, and updated both llgo and gollvm to conform. I'm now on Go 1, as I hope most people are by now, and llgo is in good shape for Go 1 too. That's not to say that it compiles all of the Go 1 language, just that it runs in Go 1. Apart from that, I've just been working through some sample programs to increase the compiler's capability.

One of the things that I've been a bit lazy about with llgo is automated testing, something I'm usually pretty keen on. I've grown anxious about regressions as development has gone on, so I've spent a little bit of time this week putting together an automated test suite, which I mentioned on golang-nuts a few days ago. The test suite doesn't cover a great deal yet, but it has picked up a couple of bugs already.

One of the numerous things I like about Go is its well-integrated tooling. For testing, Go provides the "testing" package and the "go test" tool. So you write your unit tests according to the specifications in the "testing" package, run "go test", and your tests are all run. This is comparable to, say, Python, which has a similar "unittest" package. It is vastly more friendly than the various C++ unit test frameworks; that's in large part due to the way the Go language is designed, particularly with regard to how it fits into build systems and is parsed.

In Go, everything you need to build a package is in the source (assuming you use the "go" command).
  • The only external influences on the build process (environment variables GOOS, GOARCH, GOROOT, etc.) apply to the entire build procedure, not to single compilation units. Each variant will end up in a separate location when built: ${GOPATH}/pkg/${GOOS}_${GOARCH}/<pkgname>.
  • Platform-specific code is separated into multiple files (xxx_linux.go, xxx_windows.go, ...), and they're automatically matched with the OS/architecture by the "go" command.
  • Package dependencies are automatically and unambiguously resolved. Compare this with C/C++ headers, which might come from anywhere in the preprocessor's include path.
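To make the file-name convention above concrete, here is a deliberately simplified sketch of matching a source file against a target OS. It is not the "go" command's actual algorithm: it only looks at a trailing _GOOS suffix and ignores _GOARCH suffixes and build constraints. The function name matchesOS and the short knownOS table are hypothetical, made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// knownOS is a small illustrative subset of operating system names,
// not the full list the go command recognises.
var knownOS = map[string]bool{
	"linux":   true,
	"windows": true,
	"darwin":  true,
	"freebsd": true,
}

// matchesOS (a hypothetical helper) reports whether a file such as
// "xxx_linux.go" would be selected when building for goos. Files with
// no recognised OS suffix are treated as platform-independent.
func matchesOS(name, goos string) bool {
	base := strings.TrimSuffix(name, ".go")
	i := strings.LastIndex(base, "_")
	if i < 0 {
		return true // no suffix at all, e.g. "xxx.go"
	}
	suffix := base[i+1:]
	if !knownOS[suffix] {
		return true // suffix is not an OS name, e.g. "xxx_test.go"
	}
	return suffix == goos
}

func main() {
	for _, name := range []string{"xxx.go", "xxx_linux.go", "xxx_windows.go"} {
		fmt.Printf("%-16s built for %s: %v\n", name, runtime.GOOS, matchesOS(name, runtime.GOOS))
	}
}
```

The point is only that the decision can be made from the file name and the target, with no per-file build configuration.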
So anyway, back to llgo's testing. It works just like this: I've created a separate program for each test case in the llgo/llgo/testdata directory. Each of these programs corresponds to a test case written against the "testing" package, which does the following:
  1. Run the program using "go run", and store the output.
  2. Redirect stdout to a pipe, and run a goroutine to capture the output to a string.
  3. Compile the program using llgo's Compile API, and then interpret the resultant bitcode using gollvm's ExecutionEngine API.
  4. Restore the original stdout, and compare the output with that of the original "go run".
Pretty obvious I guess, but I was happy with how easy it was to do. Defer made the job of redirecting, restoring and closing file descriptors pain free; the go statement and channels made capturing and communicating the resulting data a cinch.

This is getting a little ramble-ish, so I'll finish up. While testing, I discovered a problem with the way LLVM types are generated from types.Type values: they need to be cached and reused, rather than generated afresh each time. At the same time I intend to remove all references to LLVM from my clone of the "types" package, and offer my updates back to the Go team. It's not fully functional yet, but there are at least a few gaps that I've filled in.
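The caching idea amounts to memoising the Go-type-to-LLVM-type translation. Below is a minimal sketch of that shape, assuming nothing about llgo's real code: GoType, NamedType, LLVMType and TypeMap are all hypothetical stand-ins, so that the example runs without gollvm.

```go
package main

import "fmt"

// GoType stands in for types.Type; LLVMType stands in for an LLVM type
// handle. Both are hypothetical and exist only for this sketch.
type GoType interface{ String() string }

type NamedType struct{ Name string }

func (t *NamedType) String() string { return t.Name }

type LLVMType struct{ repr string }

// TypeMap translates each distinct GoType once and reuses the result.
type TypeMap struct {
	cache map[GoType]*LLVMType
	built int // number of translations actually performed
}

func NewTypeMap() *TypeMap {
	return &TypeMap{cache: make(map[GoType]*LLVMType)}
}

// ToLLVM returns the cached translation of t, creating it on first use.
// Pointer-typed GoType values are keyed by identity.
func (m *TypeMap) ToLLVM(t GoType) *LLVMType {
	if lt, ok := m.cache[t]; ok {
		return lt
	}
	lt := &LLVMType{repr: "%" + t.String()}
	m.built++
	m.cache[t] = lt
	return lt
}

func main() {
	tm := NewTypeMap()
	point := &NamedType{Name: "Point"}
	a, b := tm.ToLLVM(point), tm.ToLLVM(point)
	fmt.Println(a == b, tm.built)
}
```

With the cache in place, every use of the same Go type gets the same LLVM type object back, rather than a fresh and possibly distinct one.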

One last thing: LLVM 3.1 is due out May 14, so gollvm and llgo will no longer require LLVM from SVN. I really want to eliminate the dependency on llvm-config from the build of gollvm. I'm considering a dlopen/dlsym shim and removing the cgo dependency on LLVM. I'd be keen to hear some opinions, suggestions or alternatives.

Until next time.