Hmm. That looks interesting, though not quite the same as what I had in mind. The sort of thing I was thinking of might look something like:
array-of-double coefficients; // coefficients of a polynomial
array-of-double input, output; // desired evaluations of the polynomial
// Construct a function to evaluate this polynomial without loop overhead
jit_function newfunc = new jit_function [ double x -> double ];
newfunc.mainblock.declare { double ret = 0; }
for (i = coefficients.length; i-- > 0 ;)
    newfunc.mainblock.append { ret = ret * x + @{coefficients[i]}; }
newfunc.mainblock.append { return ret; }
function [ double -> double ] realfunc = newfunc.compile();
// Now run that function over all our inputs
for (i = 0; i < input.length; i++)
    output[i] = realfunc(input[i]);
(Disclaimer: syntax is 100% made up on the spot for illustrative purposes and almost certainly needs major reworking to not have ambiguities, infelicities, and no end of other cockups probably including some that don't have names yet. I'm only trying to illustrate the sort of thing that I'd like to be possible, and about how easy it should be for the programmer.)
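To make the intent concrete, here is roughly what realfunc would amount to after the build loop above has appended one statement per coefficient, written out as ordinary JavaScript. (This is a hypothetical hand-unrolling for illustration; the coefficient values and the assumption that coefficients[i] holds the coefficient of x^i are mine, not from the snippet.)

```javascript
// Hypothetical result of newfunc.compile() for coefficients = [5, -1, 0, 2]
// (i.e. 2x^3 - x + 5, stored constant term first). Each appended statement
// `ret = ret * x + @{coefficients[i]}` becomes one line, with the coefficient
// already baked in as a constant -- Horner's rule, fully unrolled:
function realfunc(x) {
    var ret = 0;
    ret = ret * x + 2;   // i = 3: leading coefficient
    ret = ret * x + 0;   // i = 2
    ret = ret * x + -1;  // i = 1
    ret = ret * x + 5;   // i = 0: constant term
    return ret;
}

console.log(realfunc(2)); // 2*8 - 2 + 5 = 19
```

Note there is no loop, no coefficient array, and no per-call indirection left: the structure of the data has been compiled into the structure of the code.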
So an important aspect of this is that parsing and semantic analysis are still done at compile time – the code snippets we're adding to newfunc are not quoted strings, they're their own special kind of entity which the compile-time parser breaks down at the same time as the rest of the code. We want to keep runtime support as small as we can, so we want to embed a code generator at most, not a front end. The idea is that we could figure out statically at compile time the right sequence of calls to an API such as libjit, and indeed that might be a perfectly sensible way for a particular compiler to implement this feature. The smallest possible runtime for highly constrained systems would do no code generation at all – you'd just accumulate a simple bytecode and then execute it – but any performance-oriented implementation would want to do better than that.
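The "smallest possible runtime" option might look like this sketch (all the names — makeJitFunction, emitMulxAdd — are invented for illustration): the compile-time machinery lowers each appended snippet to a simple bytecode op, and compile() just closes an interpreter loop over the accumulated array, with no code generation at all.

```javascript
// Sketch of the minimal no-codegen runtime: appended statements accumulate
// as trivial bytecode, and "compiling" wraps an interpreter around it.
// Single invented opcode: emitMulxAdd(k) records "ret = ret * x + k".
function makeJitFunction() {
    var code = []; // the accumulated bytecode
    return {
        emitMulxAdd: function (k) { code.push(k); }, // one op per appended statement
        compile: function () {
            return function (x) {
                var ret = 0;
                for (var pc = 0; pc < code.length; pc++)
                    ret = ret * x + code[pc];
                return ret;
            };
        }
    };
}

// Usage: the same polynomial build as the pseudocode above, 2x^3 - x + 5.
var coefficients = [5, -1, 0, 2];
var f = makeJitFunction();
for (var i = coefficients.length; i-- > 0;) f.emitMulxAdd(coefficients[i]);
var realfunc = f.compile();
console.log(realfunc(2)); // 19
```

A real implementation would target libjit or similar instead of an interpreter, but the programmer-facing shape — append, append, compile, call — stays the same.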
Importantly, the function we're constructing gets its own variable scope (ret in the above snippet is scoped to only exist inside newfunc and wouldn't clash with another ret in the outer function), but it's easy to import values from the namespace in which a piece of code is constructed (as I did above with the @ syntax to import coefficients[i]). It should be just as easy to import by reference, so that you end up with a runnable function which changes the outer program's mutable state.
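JavaScript closures illustrate the two capture modes: an ordinary closure gives you reference-like behaviour (the constructed function sees later mutations of the outer variable), while snapshotting the value at construction time — via an extra function call here — is what the @{...} splice above is meant to do.

```javascript
// Capture by reference (ordinary closure): tracks later mutations.
var scale = 2;
var byRef = function (x) { return x * scale; };

// "Capture by value", as @{...} would work: snapshot the value now.
var byVal = (function (s) { return function (x) { return x * s; }; })(scale);

scale = 10;
console.log(byRef(3)); // 30 - sees the new value of scale
console.log(byVal(3)); // 6  - baked in the value 2 at construction time
```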
Example uses for this sort of facility include the above (JIT-optimising a computation that we know we're about to do a zillion times), and also evaluation of user-provided code. My vision is that any program which embeds an expression grammar for users to specify what they want done (e.g. gnuplot, or convert -fx) should find that actually the easiest way to implement that grammar is to parse and semantically analyse it, then code-gen by means of calls to the above language feature, and end up with a runnable function that does exactly and only what the user asked for, fast, without the overhead of bytecode evaluation or traversing an AST.
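A toy version of that pipeline, in plain JavaScript rather than the hypothetical feature (the grammar, function names, and use of the Function constructor as the code-gen back end are all my own invention): parse a user expression over x into emitted source, compile it once, and every later call runs straight-line code with no AST walk.

```javascript
// Toy user-expression compiler. Grammar:
//   expr   := term ('+' term)*
//   term   := factor ('*' factor)*
//   factor := number | 'x' | '(' expr ')'
// Code-gen here is string emission + one Function-constructor compile.
function compileExpr(src) {
    var toks = src.match(/\d+\.?\d*|[x+*()]/g), pos = 0;
    function peek() { return toks[pos]; }
    function next() { return toks[pos++]; }
    function factor() {
        if (peek() === "(") { next(); var e = expr(); next(); return e; } // skip ')'
        var t = next();
        return t === "x" ? "x" : String(parseFloat(t)); // numbers become constants
    }
    function term() {
        var code = factor();
        while (peek() === "*") { next(); code = "(" + code + "*" + factor() + ")"; }
        return code;
    }
    function expr() {
        var code = term();
        while (peek() === "+") { next(); code = "(" + code + "+" + term() + ")"; }
        return code;
    }
    // One compiled function that does exactly and only what the user asked for.
    return new Function("x", "return " + expr() + ";");
}

var f = compileExpr("2*x*x + 3*x + 1");
console.log(f(2)); // 15
```

The proposed language feature would replace the string emission with structured append calls, keeping parsing of the host program at compile time while still letting the user's grammar drive what gets generated.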
If you're going to parse it at compile time, then any language with first-class functions will do something much simpler than this, unless I'm missing something. In C#:
Func<int, int> doubler = x => x * 2;
In JavaScript:
var doubler = function(x) { return x * 2 };
I know, there's no "compile time" in JS. But it's equivalent syntax anyway.
If it's deferred until runtime, then the C# syntax is far more complex and unwieldy but probably more flexible: http://blogs.msdn.com/b/csharpfaq/archive/2009/09/14/generating-dynamic-methods-with-expression-trees-in-visual-studio-2010.aspx
any language with first-class functions will do something much simpler than this
Indeed, it's simpler and hence less flexible. Both those examples are more or less fixed in form at compile time; you get to plug in some implicit parameters (e.g. capturing variables from the defining scope) but you can't change the number of statements in the function, as I demonstrated in my half-baked polynomial example above. I don't know C# well at all, but I know that in JS you'd only be able to do my polynomial example by building the function source up in a string and doing an eval.
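The string-and-eval route referred to here would look roughly like this (using the Function constructor, eval's slightly tidier sibling; the name compilePoly is invented):

```javascript
// Build the unrolled Horner-rule source as a string, then compile it once.
// This is the "build the function source up in a string and eval it" approach.
function compilePoly(coeffs) { // coeffs[i] is the coefficient of x^i
    var body = "var ret = 0;\n";
    for (var i = coeffs.length; i-- > 0;)
        body += "ret = ret * x + " + coeffs[i] + ";\n";
    body += "return ret;";
    return new Function("x", body);
}

var poly = compilePoly([5, -1, 0, 2]); // 2x^3 - x + 5
console.log(poly(2)); // 19
```

It works, and modern JS engines will genuinely compile the result — but it is stringly-typed code generation: nothing is checked until runtime, which is exactly the gap the proposed feature's compile-time parsing would close.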
(OK, I suppose you can build one up in JS by composing smaller functions, along the lines of
var poly = function(x) { return 0; }
for (i = ncoeffs; i-- > 0 ;) {
    builder = function(p, coeff) {
        return function(x) { return x*p(x)+coeff; };
    };
    poly = builder(poly, coeffs[i]);
}
but I've no confidence that running it wouldn't still end up with n function call overheads every time a degree-n polynomial was evaluated. Also, I had to try several times to get the recursion to do the right thing in terms of capturing everything by value rather than reference, so even if that does turn out to work efficiently it still fails the puzzle-game test.)
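The capture difficulty alluded to is most likely the classic var-in-a-loop trap, which is why the builder indirection is needed at all. A minimal demonstration of the trap and its fix:

```javascript
// Broken: all three closures capture the same function-scoped `i`
// by reference, and i is 3 by the time any of them runs.
var fns = [];
for (var i = 0; i < 3; i++) fns.push(function () { return i; });
console.log(fns.map(function (f) { return f(); })); // [3, 3, 3]

// Fixed, the same way the builder function fixes the polynomial case:
// pass the loop variable through a call so each closure snapshots its value.
var fns2 = [];
for (var j = 0; j < 3; j++)
    fns2.push((function (k) { return function () { return k; }; })(j));
console.log(fns2.map(function (f) { return f(); })); // [0, 1, 2]
```

(ES6's let gives each iteration its own binding and makes the extra call unnecessary, but at the time of this thread var was the only option.)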
I see - you could vary the number of statements with the "generating dynamic methods with expression trees" method above, but it would be a) checked at runtime, not at compile time, and b) fugly. Roslyn may address the second issue somewhat, but probably not the first.
The next version of .NET should have something like this: http://en.wikipedia.org/wiki/Microsoft_Roslyn