Game Scripting Languages
Many games use scripting languages for animation and game play logic. This has the advantage of quick prototyping, and better organization of code. Almost every non-trivial game engine uses some scripting language.
There are many scripting languages suited to such a task. Lua is perhaps the most popular game scripting language. Other choices include: AngelScript, GameMonkey, Io, Pawn, Squirrel, and Scheme. Sometimes heavy-weight languages are also used, like Python or Ruby. These languages are usually quite a bit harder to embed, and aren't known for their speed.
In this post I compare AngelScript, GameMonkey, Pawn, Lua, and Squirrel. I'll also take a quick look at TinyScheme, although I won't run it through all the comparisons since it is interpreted. Io looks very interesting, but unfortunately it is severely lacking in documentation and won't be compared here.
Pawn is the odd one in the group, so we'll start there.
Pawn (formerly called Small) is small and simple. It uses a C-like syntax and has only one data type, the cell. A cell is usually an integer, but it can also be treated as a character, boolean, or floating point value. Pawn has no support for structures or classes, but structures can be faked using named array positions.
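The "named array positions" idea can be pictured with a small C sketch. This is not Pawn code: the `cell` typedef (assumed here to be 32 bits), the field names, and the `actor_damage` helper are all hypothetical, and only illustrate treating a flat array of cells as a record.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a cell modeled as a 32-bit integer. */
typedef int32_t cell;

/* Hypothetical field names: each one is just an index into a flat cell array. */
enum { ACTOR_X, ACTOR_Y, ACTOR_HEALTH, ACTOR_FIELDS };

/* Subtract damage from the health slot, clamping the result at zero. */
void actor_damage(cell actor[], cell amount)
{
    assert(amount >= 0);
    actor[ACTOR_HEALTH] -= amount;
    if (actor[ACTOR_HEALTH] < 0)
        actor[ACTOR_HEALTH] = 0;
}
```

A caller would declare `cell actor[ACTOR_FIELDS]` and read or write `actor[ACTOR_X]` much as it would a struct member, with the enum supplying the names.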
Pawn is the only language in this roundup that completely separates its compiler from its virtual machine. It also has, by far, the most static compile time checks. All variables must be declared, and all native functions must have forward declarations. This is nice, because it alleviates run time checking on native function parameters. And of course, it's much quicker to find an error at compile time than at run time. I also found the compiler's warnings very thorough; for example, it even warns about inconsistent indentation. My only grievance with the compiler is that it requires a leading zero for floating point values. For example, it won't accept .5 as a constant, but instead requires 0.5.
Pawn has very thorough documentation, has enjoyed some widespread use, and has an almost inactive forum. Pawn is the quickest and lightest scripting language I've seen. I'm using it for low level scripting in my game, including character and object animations.
Lua has its own unique syntax that's a bit reminiscent of Basic (i.e. no curly brackets). It's also very fast and compiles into fairly small byte-code. It's very dynamic: the compiler and abstract machine are tightly coupled, variables don't need to be declared before use, and functions are first class values (meaning that functions can be stored in variables). Lua makes extensive use of tables (associative arrays), which are its only complex data type. Tables are able to mimic classes and objects by using some entries to store functions (first class values, remember) and some entries to store data.
Lua has been used extensively in the industry, has an active online community, and has a large base of open source modules. Lua predates and has influenced every other language in this roundup. I found that Lua's reference API documentation seems a bit vague until you get the hang of it (but there are many examples that make up for it). Lua also has a book available, Programming in Lua, that covers both the language itself and embedding in detail.
GameMonkey borrows some concepts from Lua, but uses a C-like syntax. It makes heavy use of tables, and can fake objects using tables the same way as Lua. It has finite state machine support, variables don't need to be declared, and functions are first class values.
GameMonkey's speed and small byte-code size really surprised me (it's lean and mean). However, its API reference documentation is quite lacking. The source code is commented with Doxygen statements so running Doxygen helps (I couldn't find a distribution of the generated reference online). It comes with a remote script debugger. I only played around with debugging briefly, but it's quite neat.
There doesn't seem to be a lot of hype surrounding GameMonkey, but they do have an active forum. Several community members seem to contribute back to GameMonkey; community contributions include many bindings, an even better debugger, etc.
Squirrel is a high level, dynamically typed, object oriented language that supports classes and inheritance with a C-like syntax. It also borrows tables from Lua. It was very easy to compile (one makefile) and seems to have top notch documentation.
Although Squirrel is still young, it has already been used in some commercial applications and has an active online forum.
AngelScript is a statically typed language with a C++ like syntax and classes. AngelScript has the best native binding in the bunch. Usually, a function or class only needs to be registered with the AngelScript virtual machine to be usable by a script. All the other languages in this roundup require intermediate helper functions for binding. However, the scripts cannot be compiled unless each native function is first registered. This adds an extra step to anyone wanting to ship pre-compiled AngelScript byte-code.
AngelScript doesn't support tables, and in fact they wouldn't be terribly useful because of AngelScript's static typing.
AngelScript has an active online forum, pretty good documentation, and seems to be updated often.
TinyScheme is also worth a mention. It is contained in one C source file. Unlike the other languages in this review, it is interpreted, and so is about an order of magnitude or two slower. However, if speed isn't an issue and you just want to easily add Scheme to your project, I would highly recommend it. I tried some other Scheme implementations, but always had trouble compiling them. TinyScheme is easy to compile, has many options, and is easily hackable.
Versions Tested and Licenses
Embedded scripting languages are useless without being able to call native C functions or C++ methods. AngelScript easily wins here: just tell the library about your function and it's instantly available to the script. The other languages need some glue code.
Pawn passes its function parameters as an array of ints. Floats simply need a cast, but for strings you'll need to call a couple special functions first.
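A rough sketch of what decoding such a parameter array looks like in C, under some loud assumptions: cells are 32 bits, `native_scale` and both conversion helpers are hypothetical, and the real Pawn native signature also receives a pointer to the abstract machine, which is left out here. As I understand Pawn's convention, `params[0]` holds the size of the argument list in bytes and the arguments start at `params[1]`. The float conversion below copies the bits with `memcpy` as a stand-in for whatever conversion macro the Pawn headers provide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t cell;   /* assumption: 32-bit cells, same size as a float */

/* Reinterpret the bits of a cell as a float (stand-in helper, not Pawn's API). */
float cell_to_float(cell c)
{
    float f;
    memcpy(&f, &c, sizeof f);
    return f;
}

/* Reinterpret the bits of a float as a cell (stand-in helper, not Pawn's API). */
cell float_to_cell(float f)
{
    cell c;
    memcpy(&c, &f, sizeof c);
    return c;
}

/* Hypothetical native taking an integer value and a float factor.
 * params[0] is assumed to be the byte count of the arguments that follow. */
cell native_scale(const cell *params)
{
    assert(params[0] == 2 * (cell)sizeof(cell));
    cell value = params[1];
    float factor = cell_to_float(params[2]);
    return (cell)((float)value * factor);
}
```

Strings are deliberately not shown, since they need Pawn's own helper functions to unpack.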
GameMonkey, Lua, and Squirrel use a stack to pass parameters. Because they are dynamic languages, values can be popped off the stack as different types (e.g. int, float, char*). The idea is a bit tricky to grasp, but it works quite well once you get the hang of it. Each language defines several macros to make using the stack a bit less verbose.
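To make the stack idea concrete, here is a toy value stack in C. It is not the API of Lua, GameMonkey, or Squirrel; every type and function name below is hypothetical. It only shows the pattern: the script side pushes tagged values, and the native function pops them as whichever C type it wants, then pushes its result.

```c
#include <assert.h>

typedef enum { VAL_INT, VAL_FLOAT } ValueTag;

/* A dynamically typed value: a tag plus the matching payload. */
typedef struct {
    ValueTag tag;
    union { int i; double f; } as;
} Value;

#define STACK_MAX 16

typedef struct {
    Value slots[STACK_MAX];
    int top;                    /* number of values currently on the stack */
} Stack;

void push_int(Stack *s, int i)
{
    assert(s->top < STACK_MAX);
    s->slots[s->top].tag = VAL_INT;
    s->slots[s->top].as.i = i;
    s->top++;
}

void push_float(Stack *s, double f)
{
    assert(s->top < STACK_MAX);
    s->slots[s->top].tag = VAL_FLOAT;
    s->slots[s->top].as.f = f;
    s->top++;
}

/* Pop the top value as a double, converting if it was pushed as an int. */
double pop_float(Stack *s)
{
    assert(s->top > 0);
    Value v = s->slots[--s->top];
    return v.tag == VAL_INT ? (double)v.as.i : v.as.f;
}

/* Pop the top value as an int, truncating if it was pushed as a float. */
int pop_int(Stack *s)
{
    assert(s->top > 0);
    Value v = s->slots[--s->top];
    return v.tag == VAL_INT ? v.as.i : (int)v.as.f;
}

/* Hypothetical native function: pops two numbers, pushes their sum,
 * and returns how many results it left on the stack. */
int native_add(Stack *s)
{
    double b = pop_float(s);    /* arguments come off in reverse order */
    double a = pop_float(s);
    push_float(s, a + b);
    return 1;
}
```

Returning the result count rather than the result itself is one common convention for stack-based bindings; the caller then pops that many values.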
Each library also has third party libraries available to make binding easier. For example, here is a list of binding libraries for Lua. Personally, I feel you're better off using native bindings, maybe with some custom macros. I question the reasoning behind wanting to use a bloated template binding library to expose your whole engine to a scripting language automatically.
AngelScript, Lua, GameMonkey, and Squirrel all support some form of concurrency. This allows scripts to create threads that appear to run in parallel in the virtual machine. This is useful when a script needs to run a long algorithm without disrupting its other responsibilities. These threads don't give a performance advantage, but rather a programming advantage.
GameMonkey has native support for blocking threads until another event happens. For example, if one thread needs to wait for a door to open, it can go to sleep until another thread throws a "door open" event. I don't think this would be hard to add with the other languages, but it's nice that GameMonkey already has it.
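The bookkeeping behind "sleep until an event fires" is small, which is part of why it should not be hard to add elsewhere. The sketch below is not GameMonkey's implementation or API; the scheduler, thread records, and the `EVENT_DOOR_OPEN` id are all hypothetical. It only tracks which script threads are blocked on which event and wakes them when that event is signalled.

```c
#include <assert.h>

#define MAX_THREADS 8
#define NO_EVENT (-1)

/* Hypothetical event id used only for illustration. */
enum { EVENT_DOOR_OPEN = 1 };

typedef enum { THREAD_READY, THREAD_BLOCKED } ThreadState;

typedef struct {
    ThreadState state;
    int waiting_for;            /* event id, or NO_EVENT when ready */
} ScriptThread;

typedef struct {
    ScriptThread threads[MAX_THREADS];
    int count;
} Scheduler;

/* Register a new ready thread and return its id. */
int scheduler_spawn(Scheduler *s)
{
    assert(s->count < MAX_THREADS);
    s->threads[s->count].state = THREAD_READY;
    s->threads[s->count].waiting_for = NO_EVENT;
    return s->count++;
}

/* Put a thread to sleep until the given event is signalled. */
void thread_block(Scheduler *s, int id, int event)
{
    assert(id >= 0 && id < s->count);
    s->threads[id].state = THREAD_BLOCKED;
    s->threads[id].waiting_for = event;
}

/* Wake every thread blocked on this event; returns how many woke up. */
int scheduler_signal(Scheduler *s, int event)
{
    int woken = 0;
    for (int i = 0; i < s->count; ++i) {
        ScriptThread *t = &s->threads[i];
        if (t->state == THREAD_BLOCKED && t->waiting_for == event) {
            t->state = THREAD_READY;
            t->waiting_for = NO_EVENT;
            ++woken;
        }
    }
    return woken;
}

/* Count the threads the virtual machine could resume right now. */
int scheduler_ready_count(const Scheduler *s)
{
    int ready = 0;
    for (int i = 0; i < s->count; ++i)
        if (s->threads[i].state == THREAD_READY)
            ++ready;
    return ready;
}
```

A host loop would resume only the ready threads each frame, so a blocked thread costs nothing until its event arrives.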
I did a few speed tests in each language. These tests in no way reflect actual real world scripts, but they should give a comparison of each language's basic overhead. The Fibonacci test measures function call overhead by having a script function call itself recursively 1,402,817,464 times. The prime test exercises repeated iteration and basic integer math. The native string test has the script call a native (C or C++) function ten million times while passing a 12 character string argument. The native number test calls a native function a billion times while passing a single numeric parameter. Lua only supports doubles, but the other languages are tested by passing an integer type variable.
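For a sense of where a call count like that comes from, here is the shape of a naive recursive Fibonacci in C with a call counter. This is an illustration only, not one of the benchmark scripts. The text above does not say which input was used; the stated count is consistent with a naive fib(43) with base cases at n < 2, which makes 1,402,817,465 invocations in total (the first call plus 1,402,817,464 recursive ones), but that is an inference.

```c
#include <assert.h>

/* Naive recursive Fibonacci. *calls is incremented on every invocation,
 * including the initial one made by the caller. */
unsigned long long fib(unsigned n, unsigned long long *calls)
{
    assert(calls != 0);
    ++*calls;
    return n < 2 ? n : fib(n - 1, calls) + fib(n - 2, calls);
}

/* Total invocations made by fib(n), computed iteratively from the
 * recurrence c(n) = 1 + c(n-1) + c(n-2) with c(0) = c(1) = 1, so the
 * count can be checked without running the full recursion. */
unsigned long long fib_total_calls(unsigned n)
{
    unsigned long long prev = 1, cur = 1;
    for (unsigned i = 2; i <= n; ++i) {
        unsigned long long next = 1 + cur + prev;
        prev = cur;
        cur = next;
    }
    return cur;
}
```

Because the function body does almost no work besides calling itself, the running time is dominated by call and return overhead, which is what the test is meant to expose.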
Each test was first compiled to byte-code, so the test time includes loading byte-code from disk, but it does not include compiling times for any language. Pawn, Lua, and Squirrel come with stand-alone compilers. AngelScript and GameMonkey use byte-code, but I had to actually throw together programs to save compiled scripts (which wasn't a huge deal in either case).
It's worth mentioning that Lua, Pawn, and Squirrel have Just-In-Time (JIT) compilers available. I didn't test any of them. Pawn comes with a pre-compiled assembly implementation, so I did test it.
Since scripting languages are often used for string processing, I also benchmarked a script passing variable length strings to native functions. Pawn represents strings in a special way inside of its virtual machine, and it can be costly to convert longer strings. The other languages don't need any special conversion, and so have a relatively constant speed regardless of the string's length.
It's likely that a published game may distribute compiled byte-code instead of source code scripts. Each tested language produces fairly small byte-code, which makes most of these suitable for use as simple configuration scripts. In many cases, the size and speed may be less costly than traditional configuration files, like XML (which wasn't designed for that purpose either).
This is probably rarely an actual issue, but these are the virtual machine library sizes. In other words, if you want your program to run scripts, expect your executable to bloat by this amount.
I'm including the scripting code used in the benchmarks here. The tests should give you a basic feel for each language's syntax.
AngelScript, GameMonkey, Pawn, and Squirrel each use a C-like syntax with brackets. Lua has its own syntax, which works quite well. Non-programmers may find Lua's syntax easier.
With the exception of Scheme (.scm), I feel that the syntaxes are basically interchangeable. If you know C, you should be able to learn 90% of the syntax for any of these languages in ten minutes.
Each language has its own strengths and weaknesses. If you need an embeddable scripting language, I hope this post gave you a bit of a head start.
I tried to be as fair as possible, but I'm sure I made mistakes. Any comments/suggestions/corrections are welcome.