A few ideas for a Lisp-Shell-Language

Bash is one of those languages that seem to have been made for just a few small purposes and are now used for virtually everything, by way of ugly hacks that produce unreadable, incomprehensible and undebuggable code.

There are other shell languages, and of course there is the Clisp shell. Except for Clisp, they are not "lispy", and most of them have a lot of syntactic constructs. I myself don't like too much syntax. Infix operators can make code more readable if they fit into the language, as they do in Standard ML, and having different kinds of brackets can help as well. But having a lot of special syntax for several purposes makes a language more complicated and harder to use, at least for me. This is why I mostly don't like Perl: you can write beautiful Perl code, but most people don't. To me, it is no argument that a Perl programmer can express in one line something that takes five lines in Scheme; the latter is usually more readable.

And Clisp as a shell language does not sound right to me, because it is a Common Lisp implementation, which means it is powerful, too powerful for a shell. A shell should be Turing-complete and simple, and it should make it possible to call other programs with parameters it obtains in various ways, but its primary use remains calling other programs.

Efficiency is not the point (other than perhaps preferring a polynomial-time algorithm over an exponential-time one); the language should be simple, easy to understand, and serve its purpose.

So, thinking of a lispy shell-language, I had the following ideas:

The REPL should be Line-Aware

That is, you don't always have to write parentheses around every instruction to call programs. A line without any parentheses that is not inside another construct should be interpreted as if parentheses were put around it. This applies only at the top level, outside any construct like let, loop, etc.; it is a convenience for small function calls and nothing more. As soon as you are inside another lispy context (like let, if, begin, etc.), this should fail. Whether to allow this inside shell scripts is a good question, since some programs may then rely on syntax without parentheses. I would allow it under the same conditions. And above all: as soon as a line contains a single parenthesis, parentheses are no longer optional. That prevents ambiguities. Having one pair of parentheses around a line should mean exactly the same as having none, while having two pairs should always mean two pairs.
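The wrapping rule can be sketched in a few lines of Python. This is only an illustration: the function name normalize_line is made up here, and the sketch assumes that a parenthesis inside a quoted string counts like any other, which a real reader would probably handle more carefully.

```python
def normalize_line(line: str) -> str:
    """Return a top-level REPL line in fully parenthesized form."""
    stripped = line.strip()
    if not stripped:
        return ""
    if "(" in stripped or ")" in stripped:
        # As soon as one parenthesis appears, parentheses are mandatory,
        # so the line is passed on to the reader unchanged.
        return stripped
    # No parentheses at all: treat the line as one implicit call.
    return "(" + stripped + ")"


print(normalize_line("ls -l /tmp"))    # bare line gets wrapped
print(normalize_line("(ls -l /tmp)"))  # one pair means the same as none
print(normalize_line("((ls))"))        # two pairs stay two pairs
```

A bare line and the same line with one pair of parentheses end up identical, while two pairs are never collapsed.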

There should be only one type of token

Most Lisps distinguish strings, symbols and several kinds of numbers. For a shell language, I don't think this is really necessary. Strings are enough; they can carry all the information of the other types. Numbers can always be encoded as decimal strings. Arithmetic will get a lot slower that way, but for simple calculations it should be fully sufficient, and for complex ones one shouldn't use a shell language anyway. There should be two ways of writing strings: when they don't contain any special characters, they can be written bare, like symbols in other Lisps; with special characters, they have to be put in quotes. And, above all, no unquoting (interpolation) inside strings as in bash or PHP. A function like string-append should be provided, and inside quotation marks there may be an escape syntax for special characters, but nothing more.
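To make the single-token idea concrete, here is a rough Python sketch. The names read_tokens and add and the particular escape set are assumptions for illustration; the sketch covers only tokens, not parentheses, and it relies on Python's integers to do arithmetic on decimal strings.

```python
# Assumed escape syntax inside quotes; nothing else is interpreted there.
ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def read_tokens(text: str) -> list[str]:
    """Split text into tokens; bare and quoted tokens both become plain strings."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == '"':
            i += 1
            chars = []
            while i < len(text) and text[i] != '"':
                if text[i] == "\\":
                    if i + 1 >= len(text) or text[i + 1] not in ESCAPES:
                        raise ValueError("bad escape sequence")
                    chars.append(ESCAPES[text[i + 1]])
                    i += 2
                else:
                    chars.append(text[i])
                    i += 1
            if i >= len(text):
                raise ValueError("unterminated string")
            i += 1  # skip the closing quote
            tokens.append("".join(chars))
        else:
            start = i
            while i < len(text) and not text[i].isspace() and text[i] != '"':
                i += 1
            tokens.append(text[start:i])
    return tokens


def add(a: str, b: str) -> str:
    """Add two numbers that are passed around as decimal strings."""
    return str(int(a) + int(b))


print(read_tokens('echo "hello world" 42'))
print(add("2", "40"))
```

Note that foo and "foo" read as the same token, and that the result of add is again a string.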

One problem that arises with this approach is how to handle variables. If you don't distinguish between numbers, symbols and strings, there is no way to tell whether a token is meant as a variable reference or as the literal value. For the global environment variables, there should be special commands getenv and setenv anyway.

For local variables, the problem is the same as the quoting and unquoting problem that already arises in other Lisps. In most cases one could simply say "don't give your local variables the same names as strings and other symbols you use". Maybe that would be the best way, leaving the rest to the programmer, but it could also cause confusion. So for variables, maybe a bash-like syntax like $"…" would be good, i.e. a $ in front of a string denotes a variable. But then one would either have to enforce quotation marks after $ or add further conventions for strings, which could get complicated. I think simply defining a function-like object called $ is sufficient. Then you can access your variables with ($ var). You can pass strings in both syntaxes, and in the end ${varname} is not much less verbose than ($ varname). Maybe this is not the most convenient thing one can imagine, but in my opinion a simple, easily understandable programming model is more important than adding a lot of extra syntax to save a few characters.
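A tiny evaluator shows how ($ var) could work without any extra syntax. This is a sketch under assumptions: expressions are represented as nested Python lists that some reader has already produced, and evaluate, variables and functions are illustrative names, not part of any existing shell.

```python
def evaluate(expr, variables, functions):
    """Evaluate a nested-list expression in which every atom is a string."""
    if isinstance(expr, str):
        # A bare or quoted token is always its own value,
        # never an implicit variable reference.
        return expr
    head, *args = expr
    name = evaluate(head, variables, functions)
    values = [evaluate(arg, variables, functions) for arg in args]
    if name == "$":
        # ($ var): the only way to read a local variable.
        (var_name,) = values
        return variables[var_name]
    return functions[name](*values)


variables = {"greeting": "hello"}
functions = {"string-append": lambda *parts: "".join(parts)}

# (string-append ($ greeting) " world")
print(evaluate(["string-append", ["$", "greeting"], " world"],
               variables, functions))
```

Because the argument of $ is evaluated like any other argument, the variable name itself can be computed, for example with (string-append …).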

There should be no high-level-objects

Lists can sometimes be useful in a shell, too: lists of files, for example, or (if you want streaming) lazy lists of characters. So the basic instructions cons, car and cdr should be there (maybe named "pair", "first", "second"). But I don't see any use for cyclic data structures in a shell language. Without cycles, a simple reference counter suffices for memory management instead of a full-blown garbage collector. Good garbage collectors can get very complex, while bad ones can be extremely poor in performance and memory usage, and neither suits a shell language, which should do its job quickly and then be quiet and keep out of the way. A simple reference counter, with cyclic data structures disallowed outright, sounds like the best approach. So, for lists, car, cdr and cons exist, but there is no way of setting the car or cdr of an existing cons.
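The no-mutation rule can be illustrated with immutable pairs in Python. This sketch does not model the reference counter itself; it only shows that a pair which cannot be changed after construction can refer only to values that existed before it, so no cycle can be built. The names Pair, from_items and to_items are made up for the example.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional, Union


@dataclass(frozen=True)
class Pair:
    """An immutable cons cell: there is no set-car or set-cdr."""
    first: "Value"
    second: "Value"


Value = Union[str, Pair, None]


def from_items(items: list[str]) -> Optional[Pair]:
    """Build a proper list of pairs, using None as the empty list."""
    result = None
    for item in reversed(items):
        result = Pair(item, result)
    return result


def to_items(lst: Optional[Pair]) -> list:
    """Walk a proper list and collect the first element of every pair."""
    out = []
    while lst is not None:
        out.append(lst.first)
        lst = lst.second
    return out


files = from_items(["a.txt", "b.txt"])
print(to_items(files))
try:
    files.first = "other.txt"
except FrozenInstanceError:
    print("pairs cannot be modified after construction")
```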

There should be C-like scope-declarations rather than let-declarations

One thing that sometimes really annoys me when using Scheme or Common Lisp is that local variables cannot be declared at the place where they are used without adding another let block. So either the variables have to be declared at the top of the block, or you get code that is hard to read because of the many nested sublists. Or you just use one long let* declaration that stores all intermediate results, so you can write code the way you would in a C-like language.

Of course, inside a real Lisp, the let approach gives you a lot of benefits, like complete control over which variables are currently declared in your scope, so there it is a trade-off you gladly accept. But shell scripting is mostly about quickly coding a few basic computations. Therefore, I think it's better to allow declaring variables and scopes the way C does with curly braces and assignments with an equals sign. One could define the commands (block …) to start a new scope and (var … = …) to declare a variable, and additionally (set … = …) to assign to one.
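One way to picture block, var and set is as a chain of scopes. The following Python sketch is an assumption about how this could behave, not a specification: the class Scope and its method names are invented here, and the sketch assumes that var always declares in the innermost block while set changes the nearest existing declaration.

```python
class Scope:
    """One (block …): a table of variables plus a link to the enclosing block."""

    def __init__(self, parent=None):
        self.parent = parent
        self.bindings = {}

    def block(self):
        """Open a nested scope, like entering (block …)."""
        return Scope(parent=self)

    def declare(self, name, value):
        """(var name = value): declare in the current block, shadowing outer ones."""
        self.bindings[name] = value

    def _owner(self, name):
        # Search from the innermost block outwards.
        scope = self
        while scope is not None:
            if name in scope.bindings:
                return scope
            scope = scope.parent
        raise KeyError(f"undeclared variable: {name}")

    def assign(self, name, value):
        """(set name = value): change the nearest existing declaration."""
        self._owner(name).bindings[name] = value

    def lookup(self, name):
        """($ name): read the nearest existing declaration."""
        return self._owner(name).bindings[name]


outer = Scope()
outer.declare("count", "1")
inner = outer.block()
inner.declare("tmp", "scratch")   # declared where it is used
inner.assign("count", "2")        # changes the outer variable
print(outer.lookup("count"))
```

Variables declared inside the inner block disappear with it, which gives the C-like behavior without nesting a let form for each declaration.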

Above, I suggested using arbitrary strings as variable names. Without the restrictions a fixed let syntax would impose, you can define variables named by arbitrary strings, including computed ones. This approach has some disadvantages: you can produce very bad-looking code, and of course you can use it as a hack to get arrays and dictionaries. On the one hand, if you quickly want to collect a few elements into indexed variables, there is nothing wrong with this. On the other hand, if you exploit it to the maximum, you will get ugly, unreadable code, but that's your problem then.

There should be a convenient piping- and streaming-API

Yes, I said a shell language shouldn't be too powerful. But piping, partially interpreting the output of a process and generating its input are things that have to be done so often that there should be an API for them in the shell language. What happens if you don't add such an API can be seen in the various hacks that have found their way into bash scripts. To be more versatile, tools like awk and sed evolved, but, honestly, much of the code written with them is pretty unreadable. And it's still hard to handle binary data, which is needed at times, too.

So I think forbidding the handling of binary data, and forbidding reading from a program's stdout and writing to its stdin directly, creates more complexity than just adding a simple API that can handle all of this. In the end, apart from the return values, it's the only real way for the shell to communicate with the programs it calls.
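As a rough idea of how small such an API could be, here is a Python sketch built on the standard subprocess module. The functions run and pipeline are hypothetical names; the sketch passes raw bytes, so binary data survives, but it buffers each program's whole output instead of streaming it, which a real shell would want to avoid.

```python
import subprocess
import sys


def run(argv, stdin=b""):
    """Run a program, feed it bytes on stdin, return (exit code, stdout bytes)."""
    completed = subprocess.run(argv, input=stdin,
                               stdout=subprocess.PIPE, check=False)
    return completed.returncode, completed.stdout


def pipeline(commands, stdin=b""):
    """Feed each program's output to the next; return the last exit code and output."""
    code, data = 0, stdin
    for argv in commands:
        code, data = run(argv, data)
    return code, data


# Example child process (the Python interpreter itself, for portability):
# it upper-cases whatever bytes arrive on stdin.
UPPER = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"]

code, output = pipeline([UPPER, UPPER], b"binary\x00safe")
print(code, output)
```

The return value carries both channels the shell has for talking to a program: the exit code and the data on the pipe.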

Conclusion

A lot more could be said about shell languages, I think. And of course, every point I mentioned is arguable. Tcl seems to go in the direction I described, but as far as I can see, Tcl is more a scripting language than really a shell language, even though its interpreters are called "tclsh" and "wish".

2 responses to A few ideas for a Lisp-Shell-Language

  1. Leslie P. Polzer says:

    Lots of good ideas in that proposal.
    When’s the prototype to be expected? :)

    Leslie

  2. dasuxullebt says:

    Either after a long night of bash-hacking when I am so annoyed that I have to show the world "I can do it better", or … after my jump 'n' run (which I am currently focusing on) gets somewhat release-ready.

    But … feel free to produce one ;-)
