I'm a programming language geek. And every programming language geek has to, at some point, attempt to design a language. I've been trying since the late 1980s, going through quite a few different things in that time. This is my latest project.
I work mostly in C++. I wanted something that would interface nicely with standard libraries (libc on Unix, and also add-ons like Qt and SQLite). But I find C++ lacking in several areas.
Here's my list of primary goals (which can also be read as a list of shortcomings of C++):
- Garbage collection (GC)
- Easy-to-use string, list, and dictionary (aka hash, associative array) types
- An OS-independent I/O library, e.g., an easy-to-use file type
- Decent support for regular expressions (this may seem minor, but it's a huge part of what makes Perl useful)
- A better module system: one file per module instead of separate .h and .cpp files, and a build system that doesn't require writing makefiles (Java gets these two right)
- Some way to run script-type programs without going through separate step(s) to compile them
I want a language that does all of that, while maintaining the ability to interface easily with C/C++ libraries.
STL does some of those, but I really don't like the syntax, e.g., using the shift operators for stream I/O.
The Boehm garbage collector adds GC, but conservative collectors make me nervous.
The name "Quip" originally came from "Quick Programming". It's also nice that the ".q" file name extension is not in common use.
I considered two approaches. The first is a new language, with a syntax similar to C++, with a full compiler (and maybe an interpreter, too). To be able to call C and C++ APIs, this language would need to maintain compatibility at the object layout / method call / name mangling level, which would be pretty difficult to do. In addition, writing a compiler is a big project.
The second approach is to build a framework around C++, along with some build tools. This is what Quip does.
This section explains how Quip addresses each of the goals listed earlier.
GC is done via reference counting, using a "smart pointer" type. Heap-allocated classes do "typedef SmartPtr<Foo> P" to define a pointer type, and "Foo::P" is used in place of the usual "Foo*". In complex systems, where there are loops in the object graph, the programmer must use "Foo*" for backward pointers. This allows reference counts to decrement to zero, avoiding memory leaks.
(Note that there's no actual requirement that all heap-allocated objects be garbage collected. Quip code can interface to C++ libraries that have their own memory management scheme.)
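Here is a minimal sketch of how such a scheme can work, assuming an intrusive count stored in a base class. Only SmartPtr and the "P" typedef come from the description above. RefCounted, retain, release, get, Parent, Child, and build_and_drop_pair are hypothetical names used for illustration, and Quip's actual SmartPtr may be implemented differently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical base class that stores the reference count inside the object.
class RefCounted {
public:
    RefCounted() : refs_(0) {}
    virtual ~RefCounted() {}
    void retain() { ++refs_; }
    void release() { if (--refs_ == 0) delete this; }
private:
    RefCounted(const RefCounted&) = delete;
    RefCounted& operator=(const RefCounted&) = delete;
    std::size_t refs_;
};

// Sketch of a reference-counting smart pointer (not Quip's real one).
template <typename T>
class SmartPtr {
public:
    SmartPtr(T* p = nullptr) : p_(p) { if (p_) p_->retain(); }
    SmartPtr(const SmartPtr& other) : p_(other.p_) { if (p_) p_->retain(); }
    ~SmartPtr() { if (p_) p_->release(); }
    SmartPtr& operator=(const SmartPtr& other) {
        T* old = p_;
        p_ = other.p_;
        if (p_) p_->retain();      // retain before release: safe on self-assignment
        if (old) old->release();
        return *this;
    }
    T* operator->() const { assert(p_ != nullptr); return p_; }
    T* get() const { return p_; }
private:
    T* p_;
};

static int live_nodes = 0;         // counts objects that are currently alive

class Parent;

class Child : public RefCounted {
public:
    typedef SmartPtr<Child> P;
    Child() : parent(nullptr) { ++live_nodes; }
    ~Child() { --live_nodes; }
    Parent* parent;                // backward pointer: raw, does not count
};

class Parent : public RefCounted {
public:
    typedef SmartPtr<Parent> P;
    Parent() { ++live_nodes; }
    ~Parent() { --live_nodes; }
    Child::P child;                // forward pointer: counted
};

// Builds a parent/child loop, drops it, and reports how many objects remain.
int build_and_drop_pair() {
    {
        Parent::P parent(new Parent);
        Child::P child(new Child);
        parent->child = child;
        child->parent = parent.get();
    }
    return live_nodes;
}
```

Because the child points back with a plain "Parent*", the parent's count reaches zero when the last "Parent::P" goes away, and destroying the parent then releases the child. If the back pointer were a "Parent::P" instead, each object would keep the other alive and neither would be freed.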
String, list, and dictionary types; I/O library
These are all pretty straightforward. There are lots of examples to look at (C++ STL, Java, .NET, Perl, Python), and Quip tries to take the best bits from each.
I ended up choosing PEGs instead of regular expressions. For a description of PEGs and an explanation of why they're better than REs, see Bryan Ford's paper from POPL '04.
Quip provides a PEG class, as well as simpler PEG functions on the String class.
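To give a feel for how PEG matching differs from regular expressions, here is a tiny combinator sketch. It is not Quip's PEG class, whose API is not shown here. Parser, kFail, lit, seq, choice, star, and matches are all hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// A parser takes the input and a start position. It returns the position
// just past the match, or kFail if it does not match there.
typedef std::function<std::size_t(const std::string&, std::size_t)> Parser;
const std::size_t kFail = std::string::npos;

// Matches a literal string.
Parser lit(const std::string& s) {
    return [s](const std::string& in, std::size_t pos) -> std::size_t {
        assert(pos <= in.size());
        return in.compare(pos, s.size(), s) == 0 ? pos + s.size() : kFail;
    };
}

// Sequence: a followed by b.
Parser seq(Parser a, Parser b) {
    return [a, b](const std::string& in, std::size_t pos) -> std::size_t {
        std::size_t mid = a(in, pos);
        return mid == kFail ? kFail : b(in, mid);
    };
}

// Ordered choice: try a first; b is tried only if a fails at this position.
Parser choice(Parser a, Parser b) {
    return [a, b](const std::string& in, std::size_t pos) -> std::size_t {
        std::size_t end = a(in, pos);
        return end != kFail ? end : b(in, pos);
    };
}

// Greedy zero-or-more repetition; it never gives back what it consumed.
Parser star(Parser a) {
    return [a](const std::string& in, std::size_t pos) -> std::size_t {
        for (;;) {
            std::size_t next = a(in, pos);
            if (next == kFail || next == pos) return pos;
            pos = next;
        }
    };
}

// True if the parser consumes the whole input.
bool matches(const Parser& p, const std::string& in) {
    return p(in, 0) == in.size();
}
```

With these definitions, seq(choice(lit("a"), lit("ab")), lit("c")) does not match "abc": the first alternative succeeds, so the second is never tried, even though the following "c" then fails. A backtracking regular expression such as (a|ab)c would match. Putting the longer alternative first, choice(lit("ab"), lit("a")), makes the PEG version match as well. This ordered, committed choice is what makes PEG grammars unambiguous.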
Module system and scripts
Each module is written as a single .q file. Interface and implementation code are denoted by @interface and @implementation sections. Module imports use the @import directive. The Quip build tool reads .q files and generates .cpp and .h files, which are handed off to the C++ compiler. All dependencies are handled by the build tool. It can either run the code immediately (like a script) or build an executable. (Actually, it always builds an executable; running it immediately is optional.) On Unix systems, you can start a script with a "#!" line.
(Yes, this looks a bit like Objective-C, though I'm trying to be a bit less kludgey.)
Pros and Cons of the C++ Framework Approach
A C++ framework has some advantages:
- Calling C and C++ libraries is easy. Building libraries that can be called by other C/C++ code is also easy.
- Performance-critical code can be written in plain old C++ (or C). It can avoid garbage collection, heap-allocated types, etc. where it makes sense to do so.
There are also disadvantages:
- Reference counting requires more effort from the programmer (treating backward pointers differently). And unlike mark-sweep GC (and other more sophisticated collectors), it can't do memory compaction.
- Pointers require ugly "Foo::P" syntax.
- It's impossible to provide a truly good string type: literals are stored by the compiler as C strings, and must be converted at run-time.
- It's impossible to provide list and dictionary literals. (That may change with the C++11 standard.)
- There's no way to do "<<EOF"-style literal strings (which are really handy in scripts).
- It's impossible to replace libc functions (printf, exit, etc.) without using a prefix (such as "something::").
- It's hard to override operators, since they would have to be defined on the SmartPtr<Foo> class, rather than on Foo itself.
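The operator problem in the last item can be sketched as follows. To keep the example short, std::shared_ptr stands in for Quip's SmartPtr, which is an assumption made only for illustration. Money, cents, and add_cents are hypothetical names.

```cpp
#include <cassert>
#include <memory>

class Money {
public:
    typedef std::shared_ptr<Money> P;   // stand-in for SmartPtr<Money>
    explicit Money(long cents) : cents_(cents) {}
    long cents() const { return cents_; }
private:
    long cents_;
};

// A member operator+ on Money would apply only to Money values, not to
// Money::P. Since user code holds Money::P everywhere, the operator has to
// be written against the pointer type instead.
Money::P operator+(const Money::P& a, const Money::P& b) {
    assert(a && b);
    return Money::P(new Money(a->cents() + b->cents()));
}

// Adds two amounts through the pointer-level operator.
long add_cents(long x, long y) {
    Money::P a(new Money(x));
    Money::P b(new Money(y));
    Money::P sum = a + b;               // resolves to the free operator+ above
    return sum->cents();
}
```

Each such operator has to be written per pointer type, outside the class it logically belongs to, which is why overriding operators is awkward in this approach.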
This is an ongoing project. I'm planning to publish source code and documentation when it's ready for public consumption.