It's not ‘OK because we're going to do X’ unless you actually do X
Yeah. It seems like a special case of "Your argument is reasonable, but your premises are blatantly false. But I'm embarrassed to suggest you're that self-deluded" :)
there isn't a Law of Conservation of Blame
Yes, this, very much so. It's a perennial fallacy that the blame must add up to exactly 100%, rather than 0%, 200% or more, or anywhere in between.
the less often they guess wrong, the more painful it is when they do
Yeah, in retrospect I've often seen that, but didn't notice until you put it in those terms.
Write the hard part first
Hm. I think it depends how sure you are that the goal is the actual goal. I definitely agree that in the case of your puzzle-game contributors, "hi, I've done all the easy bits, can you finish off the impossible 10% for me" isn't helpful.
But there are many other cases where the hardest bit is impossible, yet one of the other bits turns out to be amazingly useful anyway, so you don't achieve what you set out to do, but you succeed at something else.
Treating your programming language as a puzzle game is a sign that the language isn't powerful enough. ... even if the puzzles are quite fun!
I think it definitely needs the disclaimer that it's not a problem with the puzzles as such; it's just that if you _always_ have to solve them, it would be nice if the language took care of it for you.
I'm curious what your ideal language would be. I'm currently thinking I'd like something like Python but with optional static checking that can turn it into C when you need it. But I'm not sure if that's sensible :) And I want people to still need C or C++ programmers so I still have a career, though I obviously can't base my predictions on what would be convenient for me :)
Simon's Law of Google
Hm. This seems like a hint that asking people "how would you google for X" is a necessary step before giving up?
Yes, but literally so – you only need to ask, since SLoG states that it isn't generally necessary for them to answer :-)
I'm curious what your ideal language would be.
I have a giant pile of notes that I've always vaguely intended to turn into an oversized LJ post on the subject, but I rather doubt they'd come together into a coherent whole. I have lots of things I'd like out of my ideal language, some of which can be pretty convincingly specified and others of which are so vague as to be more like blue-sky pipe dreams than serious design proposals, but every time I write down the entire list it becomes impossible to avoid writing "AND A PONY" at the end, because it's that kind of totally infeasible wishlist-of-everything. I'll post the list in a separate comment due to size.
Multi-paradigm is good, Do Everything My One True Way is bad. Some things are OO-shaped, some functional-shaped, some plain procedural, and you want to be able to switch back and forth within different parts of the same program without undue faff.
Compile to nice fast native code, or at the very least, don't design any language so that you can't sensibly do that. Requiring users to wait for Moore's Law is a mug's game.
Complete and correct bindings to all OS APIs. This is a major thing that keeps me using C and occasionally C++ in spite of their many flaws: whenever you use a different language, you sooner or later end up having to do some OS interaction that that language's abstraction didn't bother to include. But it's not really a language feature as such; I think what I really want is for all OS maintainers to publish cross-language specs for their APIs so that every language can auto-process them into its own appropriate form.
Define the language semantics better than C/C++. Undefined behaviour can bite you in so many different ways, and you have to know all the gotchas to reliably avoid it. Each language feature should be safely usable without having to be an expert in the rest of the language first.
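For example, two of the classic traps, as a minimal illustrative sketch (the function names are made up):

int times_two(int x) {
    return x * 2;                  // undefined behaviour if x > INT_MAX / 2: signed overflow
}

int sum_first_four(const int *a) {
    int total = 0;
    for (int i = 0; i <= 4; i++)   // off-by-one: a[4] is past the end of a 4-element array
        total += a[i];             // undefined behaviour: out-of-bounds read
    return total;
}

Both compile without a murmur; the compiler is entitled to assume neither case ever happens, and to optimise accordingly.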
Check as much as possible at compile time; in Python it's really annoying to find a code path your tests didn't encounter crashing in the field due to a typo or type-o that static type checking would have picked up instantly.
Make it possible to extend the language in a metaprogramming-type way, but try to do better than C++ at making the simple things simple, and try to avoid the need for puzzle games (which C++ is no better at doing than C – the CRTP is a good example).
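For anyone who hasn't met it, the CRTP is the trick of deriving a class from a base template instantiated on the derived class itself, so you get compile-time dispatch without virtual functions; a minimal sketch with invented class names:

#include <iostream>

// The base class is parametrised on its own derived class, so the call can
// be resolved statically, with no vtable in sight.
template <typename Derived>
struct Shape {
    double area() const {
        return static_cast<const Derived &>(*this).area_impl();
    }
};

struct Square : Shape<Square> {
    double side;
    explicit Square(double s) : side(s) {}
    double area_impl() const { return side * side; }
};

int main() {
    Square sq(3.0);
    std::cout << sq.area() << "\n";   // prints 9
    return 0;
}

It works, but you have to already know the trick; nothing about the problem points you towards it.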
Managed languages (GC, bounds checking etc.) versus non-managed (C/C++ with feet firmly in the line of fire at all times): I'm not sure. Sometimes I think what I'd really like is a hybrid (e.g. GC with an optional explicit free operation which causes an allocated thing to become instantly invalid, so that refs from it become GCable in turn and refs to it throw a StalePointerException; and also a feature to let me conveniently treat a big byte array as an unmanaged sandbox), and other times I think what I'd really like is a means of writing unmanaged code which includes a static compile-time proof that you don't overrun any array etc. Probably neither is actually feasible.
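The explicit-free half of that can be roughly approximated even in today's C++ using weak references; a minimal sketch (StalePointerException and Handle are invented names, and of course there's no real GC underneath):

#include <memory>
#include <stdexcept>

struct StalePointerException : std::runtime_error {
    StalePointerException() : std::runtime_error("access through explicitly freed object") {}
};

// A non-owning handle: behaves like a reference until the owner explicitly
// frees the object, after which any access throws instead of touching freed memory.
template <typename T>
class Handle {
    std::weak_ptr<T> ref;
public:
    explicit Handle(const std::shared_ptr<T> &owner) : ref(owner) {}
    T &operator*() const {
        if (auto p = ref.lock())
            return *p;
        throw StalePointerException();
    }
};

int main() {
    std::shared_ptr<int> owner = std::make_shared<int>(42);   // the allocated thing
    Handle<int> h(owner);
    *h = 99;              // fine while the object is alive
    owner.reset();        // the "explicit free": the object dies immediately
    try {
        *h = 7;           // now throws StalePointerException instead of corrupting memory
    } catch (const StalePointerException &) {}
    return 0;
}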
I'm vaguely intrigued by aspect-oriented programming, in that it seems to be a reaction to a thing that's annoyed me a lot, but I've never actually tried it so who knows if it would work well in practice. I think ideally I'd just want a metaprogramming facility strong enough to let me do it myself if I happened to feel like it in a particular program.
Extending 'define the semantics clearly' above, a slightly silly idea that I can't quite convince myself is actually wrong would be to specify that all intermediate values in integer arithmetic expressions have the semantics of true mathematical integers, to the extent that the runtime will link in GMP or equivalent if it really has to (which I wanted available anyway, since arbitrary-precision integers are a really nice facility to have available and the best thing about Python is that it provides them as a native type). Narrowing should occur only as a result of an assignment or explicit cast. If you need speed, you can request a compiler warning when the expression you've written requires GMP to code-generate, and then manually insert casts to eliminate the need; then you'll be in control of where the casts go and won't be startled by them.
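For a concrete picture of what this would eliminate, a minimal C++-flavoured sketch of the status quo:

#include <cstdint>
#include <iostream>

int main() {
    std::int32_t a = 2000000000;
    std::int64_t bad  = a * 2;                              // intermediate is still 32-bit: signed overflow, undefined behaviour
    std::int64_t good = static_cast<std::int64_t>(a) * 2;   // you have to remember the widening cast yourself
    std::cout << bad << " " << good << "\n";
    return 0;
}

Under the proposal, the intermediate in the first case would be a true mathematical integer and any narrowing would happen only at the assignment, so it would simply be correct.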
My biggest blue-sky wish is that I'd like a built-in language feature to construct pieces of code on the fly and JIT them into native code, so you could implement all sorts of user-scripting of things in a way that was actually reasonably efficient, and also increase the use of the approach "first laboriously work out exactly what you want to do, and then do it in a loop a zillion times really fast".
I was about to say, it's possibly surprising more languages aren't designed to compile to C, and then let the C compiler writers do all the heavy lifting of making them fast on every platform. But I guess I've just reinvented Java, oops :)
hybrid eg. GC with an optional explicit free operation which causes an allocated thing to become instantly invalid
Yeah, except I always think of the reverse: it seems most variables go out of scope at fairly clearly defined places, and I've grown very fond of having RAII work the C++ way (always) rather than the C# way (every time you use an IDisposable class, you can be leak-safe and exception-safe, provided you remember to add the "don't leak this" boilerplate). But it would be nice to have a garbage collector for things left over. Although I suppose the GC needs to be aware of any other variables which might still hold a pointer to them.
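To illustrate what I mean by the C++ way, a minimal sketch (the File class is just an invented example):

#include <cstdio>
#include <stdexcept>

// The cleanup lives in the destructor, so every caller is leak-safe and
// exception-safe with no per-use boilerplate (in C# each use site would
// need its own `using` block).
class File {
    std::FILE *fp;
public:
    explicit File(const char *name) : fp(std::fopen(name, "r")) {
        if (!fp) throw std::runtime_error("open failed");
    }
    ~File() { std::fclose(fp); }
    std::FILE *get() const { return fp; }
    File(const File &) = delete;             // exactly one owner
    File &operator=(const File &) = delete;
};

void count_lines(const char *name) {
    File f(name);          // acquired here
    // ... work that may return early or throw ...
}                          // released here automatically, whatever happens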
Compiling to C: yes, I've had that thought too. It's certainly attractive as a means of writing an initial implementation of an experimental language or one you aren't sure will see wide uptake yet, because as you say, you can get all the portability of C (up to OS API integration) and its performance too by using existing compilers. I think where it falls down worst is debugging – it'd be almost impossible to take a language compiled like that and get sensible integration with gdb or your other debugger of choice, and sooner or later that will become a serious pain. So sooner or later you'll want your language to have an end-to-end compiling and debugging story.
provided you remember to add the "don't leak this" boilerplate
Argh, yes, dammit, any time you find that you always have to remember to add the "work properly" option it's a sign that someone has cocked something up. My usual example of that is find -print0 or grep -Z piped into xargs -0.
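The next version of .Net should have something like this - http://en.wikipedia.org/wiki/Microsoft_Roslyn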
Hmm. That looks interesting, though not quite the same as what I had in mind. The sort of thing I was thinking of might look something like:
array-of-double coefficients;     // coefficients of a polynomial
array-of-double input, output;    // desired evaluations of the polynomial

// Construct a function to evaluate this polynomial without loop overhead
jit_function newfunc = new jit_function [ double x -> double ];
newfunc.mainblock.declare { double ret = 0; }
for (i = coefficients.length; i-- > 0 ;)
    newfunc.mainblock.append { ret = ret * x + @{coefficients[i]}; }
newfunc.mainblock.append { return ret; }
function [ double -> double ] realfunc = newfunc.compile();

// Now run that function over all our inputs
for (i = 0; i < input.length; i++)
    output[i] = realfunc(input[i]);
(Disclaimer: syntax is 100% made up on the spot for illustrative purposes and almost certainly needs major reworking to not have ambiguities, infelicities, and no end of other cockups probably including some that don't have names yet. I'm only trying to illustrate the sort of thing that I'd like to be possible, and about how easy it should be for the programmer.)
So an important aspect of this is that parsing and semantic analysis are still done at compile time – the code snippets we're adding to newfunc are not quoted strings, they're their own special kind of entity which the compile-time parser breaks down at the same time as the rest of the code. We want to keep runtime support as small as we can, so we want to embed a code generator at most, not a front end. The idea is that we could figure out statically at compile time the right sequence of calls to an API such as libjit, and indeed that might be a perfectly sensible way for a particular compiler to implement this feature. The smallest possible runtime for highly constrained systems would do no code generation at all – you'd just accumulate a simple bytecode and then execute it – but any performance-oriented implementation would want to do better than that.
Importantly, the function we're constructing gets its own variable scope (ret in the above snippet is scoped to only exist inside newfunc and wouldn't clash with another ret in the outer function), but it's easy to import values from the namespace in which a piece of code is constructed (as I did above with the @ syntax to import coefficients[i]). It should be just as easy to import by reference, so that you end up with a runnable function which changes the outer program's mutable state.
Example uses for this sort of facility include the above (JIT-optimising a computation that we know we're about to do a zillion times), and also evaluation of user-provided code. My vision is that any program which embeds an expression grammar for users to specify what they want done (e.g. gnuplot, or convert -fx) should find that actually the easiest way to implement that grammar is to parse and semantically analyse it, then code-gen by means of calls to the above language feature, and end up with a runnable function that does exactly and only what the user asked for, fast, without the overhead of bytecode evaluation or traversing an AST.
If you're going to parse it at compile time, then any language with first-class functions will do something much simpler than this, unless I'm missing something. In C#:
Func<int, int> doubler = x => x * 2;
In JavaScript:
var doubler = function(x) { return x * 2 };
I know, there's no "compile time" in JS. But it's equivalent syntax anyway.
If it's deferred until runtime, then the C# syntax is far more complex and unwieldy but probably more flexible: http://blogs.msdn.com/b/csharpfaq/archive/2009/09/14/generating-dynamic-methods-with-expression-trees-in-visual-studio-2010.aspx
any language with first-class functions will do something much simpler than this
Indeed, it's simpler and hence less flexible. Both those examples are more or less fixed in form at compile time; you get to plug in some implicit parameters (e.g. capturing variables from the defining scope) but you can't change the number of statements in the function, as I demonstrated in my half-baked polynomial example above. I don't know C# well at all, but I know that in JS you'd only be able to do my polynomial example by building the function source up in a string and doing an eval.
(OK, I suppose you can build one up in JS by composing smaller functions, along the lines of
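var poly = function(x) { return 0; };
for (i = ncoeffs; i-- > 0 ;) {
    builder = function(p, coeff) {
        return function(x) { return x*p(x)+coeff; };
    };
    poly = builder(poly, coeffs[i]);
}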
but I've no confidence that running it wouldn't still end up with n function call overheads every time a degree-n polynomial was evaluated. Also, I had to try several times to get the recursion to do the right thing in terms of capturing everything by value rather than reference, so even if that does turn out to work efficiently it still fails the puzzle-game test.)
I see - you could vary the number of statements with the "generating dynamic methods with expression trees" method above, but it would be (a) checked at runtime rather than at compile time, and (b) fugly. Roslyn may address the second issue somewhat, but probably not the first.
Fast compilers: check. More or less. Common Lisp is dynamically typed, but with optional type annotations. Good compilers can do extensive type inference. By default, Lisp is fairly safe (array bounds are checked, for example), but this can be turned off locally. Common Lisp was always intended to be compiled, and designed by people who'd already made Lisp implementations competitive with the Fortran compilers of the day.
OS bindings: hmm. Every decent implementation has an FFI. The last time I looked, the CFFI-POSIX project wasn't going anywhere, though.
Semantics: yep. Pretty good ANSI standard; a draft of it is available online as an excellently hyperlinked document -- the `HyperSpec' -- which, to the tiny extent that it differs from the official standard, probably reflects real life better. Common Lisp nails down a left-to-right order of evaluation, which eliminates a lot of C annoyance; aliasing just works correctly; and while runtime type-checking isn't mandated, all implementations I know of will do it unless you wind the `safety' knob down.
Compile-time checking: hmm. A decent implementation will surprise you with how much it checks, even in the absence of explicit annotations. Unlike Python, Lisp implementations will want you to declare variables explicitly or they'll give you lots of warnings. SBCL's compiler diagnostics are less than excellent. The usual clue is that it emits a warning explaining how it elided the entire body of your function, and then you notice that it proved that you'd passed an integer to `cdr' somewhere near the beginning. If you like writing type declarations then you can do that and get better diagnostics.
Metaprogramming: Lisp's most distinctive feature is that it's its own metalanguage.
Explicit free: no, sorry. The FFI will let you allocate and free stuff manually, but it's a separate world filled with danger.
Slabs of bytes: a standard part of the FFI. Don't expect this to be enjoyable, though.
AOP: you can probably make one out of macros. Maybe you'll need a code walker.
Integer arithmetic: Lisp integers always have the semantics of true mathematical integers. Lisp systems typically have lots of different integer representations (heap-allocated bignums; immediate fixnums which have type-tag bits and live in the descriptor space; and various sizes of unboxed integer used for intermediate results, and as array or structure elements) and use whichever is appropriate in any given case (so `narrowing' occurs automatically, but only when it's safe). A good compiler, e.g., SBCL, will do hairy interval arithmetic in its type system in order to work out which arithmetic operations might overflow. If you wind up SBCL's `speed' knob you get notes about where the compiler couldn't prove that overflow was impossible. SBCL also has some nonportable features for declaring variables which should have wrap-on-overflow semantics instead.
Runtime compiler: got that too. It compiles Lisp code to native machine code. And it's used to program-generated code, because of all the macro expansions.
It is certainly true that some of the things on my wish list are things Lisp has long been famous for having. Unfortunately, I'm sorry to say, I would really like them in a language that isn't syntactically Lisplike!
I'm not so wedded to the C/C++ style of syntax as to tolerate no divergence from even the bad parts (e.g. C's declarator syntax would probably be the first thing to go if I ever did sit down at the drawing board for serious), but I do think that one or two basic amenities such as an infix expression grammar are not things I'm prepared to do without in the language I use all the time for everything. I tolerate Lispy syntax in my .emacs because I don't spend my whole life writing .emacs; I'd lose patience if I did.