Layer-free shell syntax
simont


Wed 2011-03-16 11:36
[personal profile] simont Wed 2011-03-16 13:21
The other nice feature of brackets, of course, is that they nest. Half the escaping problem arises because the 'treat this literally' syntax can't be nested: if shells used the PostScript approach of enclosing literal strings in parens instead of quotes, then you could trivially quote any piece of shell syntax you liked (by induction, that piece would contain matched parens if it quoted anything in turn) without having to worry about escaping the escapes.

In fact, doesn't Tcl do that? And also GNU m4, IIRC.
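
To make the nesting argument concrete, here is a minimal Python sketch of a reader for paren-delimited literals. The function names `read_paren_literal` and `quote` are made up for illustration, and the sketch assumes every literal is balanced; real PostScript additionally offers backslash escapes for unbalanced parens, which are left out here.

```python
def read_paren_literal(text, start=0):
    """Read a (...) literal beginning at text[start].

    Returns (contents, index_after_closing_paren). Inner parentheses are
    kept verbatim as long as they balance; no escape character is used.
    """
    if text[start] != "(":
        raise ValueError("expected '(' at position %d" % start)
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
    raise ValueError("unbalanced parentheses")


def quote(literal):
    """Quote a balanced string by wrapping it; nothing inside is rewritten."""
    return "(" + literal + ")"


inner = "echo (hello world)"
outer = quote("run " + quote(inner))          # two layers of quoting

# Peel the layers off again, one reader call per layer.
layer1, _ = read_paren_literal(outer)
layer2, _ = read_paren_literal(layer1, layer1.index("("))
```

Each call strips exactly one layer and leaves the inner text untouched, so `layer2` comes back identical to `inner` without any escape characters having been added on the way in.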
[identity profile] bjh21.me.uk Wed 2011-03-16 13:45
Yes. Tcl's original quoting characters are { and }. These days it has " as well.
[personal profile] pne Wed 2011-03-16 14:00
Though they're not interchangeable, of course, due to the different interpolation behaviour.

What I found amusing, though, was when I learned that except for the interpolation, they are identical - in particular, you can use quotation marks for "blocks" (arguments to things like if or proc), as long as there is nothing interpolatable in the block (or you escape it yourself). Even multi-line ones.

That was simply something I wasn't used to, coming from C-like languages: blocks had braces and strings had quotes, and never the twain shall meet - but in Tcl, essentially everything's a string. (Or everything's a list of strings? At any rate, "blocks" aren't special.)
[personal profile] fanf Wed 2011-03-16 14:08
And in fact Tcl's syntax is remarkably shallow, especially compared with the shell.

A lot of the problem with quotation syntax is that unquoting the string implies a rewrite, which leads to problems with repeated doubling of \\\\ and suchlike. Perhaps quotations should be passed down to the next layer verbatim, so that they are only unquoted at the last possible moment.
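
As a rough illustration of the doubling, the Python sketch below pushes a single backslash through four quoting layers. The `escape` and `unescape` helpers are hypothetical stand-ins for one layer's rules (backslash is the only escape character); they are not any particular shell's quoting algorithm.

```python
def escape(s):
    """One quoting layer: protect backslashes and double quotes."""
    return s.replace("\\", "\\\\").replace('"', '\\"')


def unescape(s):
    """Undo one layer: a backslash makes the next character literal."""
    out = []
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s):
            i += 1              # drop the escape, keep the escaped character
        out.append(s[i])
        i += 1
    return "".join(out)


payload = "\\"                  # one literal backslash
layers = [payload]
for _ in range(4):              # pass it down through four layers
    layers.append(escape(layers[-1]))

backslash_counts = [len(s) for s in layers]   # 1, 2, 4, 8, 16

# Every layer has to unescape once to recover the original payload.
recovered = layers[-1]
for _ in range(4):
    recovered = unescape(recovered)
```

The count doubles at every layer because each unquoting step rewrites the string, which is exactly the cost that passing quotations down verbatim would avoid.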

One way of looking at the problem is that it arises from being "stringly typed". Perhaps a command line should be parsed once, at which point more specific types are inferred for the various parts of the command, and subsequent processing of the command happens in a type-safe manner. So for example, when expanding a glob the resulting list is a list of filenames, not a list of undistinguished command line arguments, so there can be no confusion if one of the files is named -rf.
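
A small Python sketch of that idea, under stated assumptions: `Option`, `Filename`, `expand_glob` and `to_argv` are invented names, and prefixing "./" is just one possible way to keep the distinction alive once the arguments are flattened back into plain strings for an ordinary program.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Option:
    text: str       # typed by the user as an option, e.g. "-i"


@dataclass(frozen=True)
class Filename:
    path: str       # produced by glob expansion, never an option


Arg = Union[Option, Filename]


def expand_glob(matches: List[str]) -> List[Filename]:
    """Stand-in for glob expansion: tag every match as a Filename."""
    return [Filename(m) for m in matches]


def to_argv(command: str, args: List[Arg]) -> List[str]:
    """Lower typed arguments to a flat argv.

    A filename that would look like an option gets a "./" prefix, so the
    receiving program cannot mistake it for one.
    """
    argv = [command]
    for arg in args:
        if isinstance(arg, Filename):
            path = arg.path
            argv.append("./" + path if path.startswith("-") else path)
        else:
            argv.append(arg.text)
    return argv


typed = [Option("-i")] + expand_glob(["-rf", "notes.txt"])
argv = to_argv("rm", typed)
# argv is ["rm", "-i", "./-rf", "notes.txt"]
```

The file named -rf stays a filename throughout, because the decision about what is an option was made once, at parse time, rather than re-guessed from the string's first character.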
[personal profile] pne Wed 2011-03-16 14:20
> A lot of the problem with quotation syntax is that unquoting the string implies a rewrite, which leads to problems with repeated doubling of \\\\ and suchlike. Perhaps quotations should be passed down to the next layer verbatim, so that they are only unquoted at the last possible moment.

But then how can you tell at which level a \} (or \\\} or \\\\\}, etc.) is to be interpreted? (That is, where the "last possible moment" for unquoting that character sequence is: the bottom-most layer, or somewhere in between?)
[personal profile] fanf Wed 2011-03-16 14:30
If it isn't the bottom-most, it isn't the last possible moment.

The problem, of course, is that this leads to an un-Unixy design where each program has to unquote its own arguments if necessary, rather than relying on the shell to handle all metacharacters, and this in turn inevitably leads to incompatibilities.
[identity profile] bjh21.me.uk Wed 2011-03-16 15:57
It's not really Unixy, but POSIX already has a kind of shared dequoting system in the form of getopt(), which handles the quoting of operands using --. Of course, this leads to precisely the kind of incompatibilities you refer to.
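
For a quick look at the -- convention, here is a short sketch using Python's standard `getopt` module, which is modelled on the C function. The option string "rf" and the argument lists are made up for the example.

```python
import getopt

# With "--", everything after it is an operand, even if it starts with "-".
options, operands = getopt.getopt(["-r", "--", "-rf"], "rf")
# options  is [("-r", "")]
# operands is ["-rf"]

# Without "--", the same string is parsed as two more option letters.
options_bare, operands_bare = getopt.getopt(["-r", "-rf"], "rf")
# options_bare  is [("-r", ""), ("-r", ""), ("-f", "")]
# operands_bare is []
```

The dequoting happens inside the program rather than in the shell, which is why a program that parses its arguments some other way may not honour -- at all.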
[identity profile] bjh21.me.uk Wed 2011-03-16 14:22
I'd been pondering the "only unquote once, as late as possible" rule, since that's effectively what URIs do -- there's one quoting scheme, and effectively each layer only treats non-quoted characters as special and passes quoted ones on to the next layer down. This only works because the layers don't conflict over their special characters, but I wonder if you could combine this with Tclish nestable quotes to get something useful.
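
The URI behaviour can be sketched with Python's `urllib.parse`. The path segments are invented for the example; the point is only the order of operations: split on the layer's own unquoted special character first, and unquote afterwards.

```python
from urllib.parse import quote, unquote

segments = ["reports/2011", "q&a.txt"]     # the first segment contains a "/"

# Quote each segment fully, then join with the path layer's separator.
path = "/".join(quote(s, safe="") for s in segments)
# path is "reports%2F2011/q%26a.txt"

# Late unquoting: the path layer splits on unquoted "/" only, and each
# segment is unquoted once, at the last moment.
late = [unquote(p) for p in path.split("/")]

# Early unquoting: the quoted "/" becomes indistinguishable from a separator.
early = unquote(path).split("/")
```

`late` recovers the two original segments, while `early` yields three, because unquoting before splitting lets a quoted character leak into a layer that treats it as special.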

It also occurs to me that you mostly don't pass shell commands to programs as a single quoted string, but as the tail of another command, which is kind of reminiscent of the way the URI syntax is defined to allow an entire URI to be used with no additional quoting as a query string.