A few ideas for a Lisp-Shell-Language

Wed, 28 Apr 2010 00:08:42 +0000

Bash is one of those languages that seem to have been made for just a few small purposes and are now used for virtually everything, relying on ugly hacks and producing code that is unreadable, incomprehensible and undebuggable.

There are other shell languages, and of course there is the Clisp shell. Except for Clisp they are not „lispy“, and most of them have a lot of syntactic structures. I myself don't like too much syntax. Infix operators can make code more readable if they fit into the language, as in Standard ML. Having different kinds of brackets can help, too. But having a lot of special syntax for several purposes makes a language more complicated and harder to use, at least for me. This is why I mostly don't like Perl – you can write beautiful Perl code, but most people don't. To me, it is not an argument that a Perl programmer can express in one line something that takes five lines in Scheme – the latter is usually more readable.

And well, Clisp as a shell language doesn't sound good, because it is a Common Lisp implementation, which means it is mighty – too mighty for a shell. A shell should be Turing-complete and simple; it should make it possible to call other programs with parameters it obtains in various ways – but its primary use remains calling other programs.

Efficiency is not the point (other than maybe having a polynomial-time algorithm instead of an exponential-time one); the language should be simple, easy to understand, and serve its purpose.

So, thinking of a lispy shell-language, I had the following ideas:

The REPL should be Line-Aware

That is, you don't always have to write parentheses around every instruction to call programs. A line without parentheses and without context should be interpreted as if parentheses were put around it. That is, outside any construct like let, loop, etc. – just for small function calls outside any other context, just for convenience. As soon as you are inside another lispy context (like let, if, begin, etc.), this should fail. Whether to allow this inside shell scripts or not is a good question – some programs may rely on syntax without parentheses. I would allow it under the same conditions. And above all: as soon as there is one parenthesis, parentheses are not optional anymore. This prevents ambiguities. Having one pair of parentheses around a line should really mean the same as having none, while having two pairs should always really mean two pairs.
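The rule above can be sketched in a few lines of Common Lisp. This is only an illustration, not part of any existing shell: the function name normalize-line is made up, and the sketch assumes that parentheses inside quoted strings do not need special treatment, which a real reader would have to handle.

```lisp
;; Hypothetical helper: decides how one line typed at the REPL is read.
;; Simplifying assumption: parentheses inside quoted strings are ignored.
(defun normalize-line (line)
  "Wrap LINE in parentheses when it contains none; otherwise return it trimmed."
  (let ((trimmed (string-trim '(#\Space #\Tab) line)))
    (cond ((string= trimmed "") trimmed)
          ;; No parenthesis anywhere: treat the line as one command call.
          ((notany (lambda (c) (member c '(#\( #\)))) trimmed)
           (concatenate 'string "(" trimmed ")"))
          ;; At least one parenthesis: nothing is optional any more.
          (t trimmed))))
```

With this, (normalize-line "ls -l /tmp") and (normalize-line "(ls -l /tmp)") both give "(ls -l /tmp)", while "((ls))" is left with its two pairs.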

There should be only one type of token

Most Lisps distinguish strings, symbols and several kinds of numbers. For a shell language, this isn't really necessary, I think. Strings are enough. They can carry all the information of the others. Numbers can always be encoded as decimal strings – well, arithmetic will get a lot slower then, but for simple calculations it should be fully sufficient, and for complex ones one shouldn't use a shell language anyway. There should be two ways of writing them: when they don't contain any special character, they can be written like the symbols known from other Lisps. With special characters, they have to be put in quotes. And – above all – no unquoting (variable interpolation) inside strings, like in bash or PHP. Some function like string-append should be provided; inside quotation marks there may be a syntax for special characters, but nothing more.
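To make the "strings only" idea concrete, here is a minimal Common Lisp sketch of arithmetic and concatenation on strings. The name str+ is an assumption made for this example, only integers are handled, and string-append is simply one possible definition of the function mentioned above.

```lisp
;; Hypothetical primitives for a shell whose only data type is the string.
(defun str+ (&rest decimal-strings)
  "Add integers given as decimal strings; return the sum as a decimal string."
  (format nil "~d" (reduce #'+ (mapcar #'parse-integer decimal-strings))))

(defun string-append (&rest strings)
  "Concatenate STRINGS; there is no interpolation inside the strings themselves."
  (apply #'concatenate 'string strings))
```

For example, (str+ "40" "2") gives "42", and (string-append "foo" "bar") gives "foobar". Every call parses and prints its numbers again, which is the slowdown accepted above.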

One problem which arises with this approach is how to handle variables. If you don't distinguish between numbers, symbols and strings, you have no way of determining whether a variable or the literal value is meant. Well, for the global environment variables, there should be special commands getenv and setenv anyway.

For local variables, this is the same problem of quoting and unquoting that already arises in other Lisps. In most cases one could just say „just don't give your local variables names that also occur as strings and other symbols“. Maybe that would be the best way, leaving the rest to the programmer. But that could also cause confusion. So for variables, maybe a bash-like syntax like $"…" would be good, i.e. having a $ in front of a string to denote a variable. But then one would either have to enforce quotation marks after the $, or add some additional conventions for strings, which could get complicated. I think just defining a function-like object called $ is also sufficient. Then you can access your variables with ($ var). You can pass strings in both syntaxes, and in the end, ${varname} is not much less verbose than ($ varname). Of course, this may not be the most convenient thing one can imagine, but in my opinion, a simple, easily understandable programming model is more important than adding a lot of additional syntax to save a few characters.
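A function-like $ is easy to picture. The following Common Lisp sketch assumes a single flat table keyed by strings; the table *variables* and the helper store-variable are names invented for this example.

```lisp
;; Hypothetical variable table; keys are arbitrary strings.
(defvar *variables* (make-hash-table :test #'equal))

(defun store-variable (name value)
  "Bind the variable called NAME (a string) to VALUE."
  (setf (gethash name *variables*) value))

(defun $ (name)
  "Return the value of the variable called NAME, or signal an error."
  (multiple-value-bind (value found) (gethash name *variables*)
    (if found
        value
        (error "Unbound shell variable: ~a" name))))
```

After (store-variable "file name" "notes.txt"), the call ($ "file name") returns "notes.txt", while the bare string "file name" stays an ordinary value.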

There should be no high-level-objects

Lists can sometimes be useful in a shell, too. Like lists of files, or – if you want streaming – lazy lists of characters. So the basic instructions cons, car and cdr should be there (maybe named „pair“, „first“, „second“). But I don't see any use for cyclic data structures in a shell language. Without cycles, a simple reference counter suffices for memory management instead of a full-blown garbage collector. Good garbage collectors can get very complex, while bad garbage collectors can be extremely bad in performance and memory usage; neither suits a shell language – a shell language should do its stuff quickly and then be quiet and keep out of the way. A simple reference counter, with cyclic data structures not allowed at all, sounds like the best way to go. So, for lists, car, cdr and cons exist, but there is no way of setting the car or cdr of an existing cons.
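As a rough illustration in Common Lisp (which itself uses a garbage collector, so only the immutability is shown): a pair with read-only slots can only point at objects that existed before it was built, so no cycle can ever be formed. The slot names head and tail and the helper list-of are assumptions made for this sketch.

```lisp
;; Immutable pair: both slots are read-only, so no setter is generated.
(defstruct (pair (:constructor pair (head tail)))
  (head nil :read-only t)
  (tail nil :read-only t))

(defun list-of (&rest items)
  "Build a proper list out of immutable pairs, ending in NIL."
  (reduce #'pair items :from-end t :initial-value nil))
```

Here (pair-head (list-of "a" "b")) is "a". Since the reference graph is acyclic by construction, counting references would be enough to free such pairs.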

There should be C-like scope-declarations rather than let-declarations

One thing that sometimes really annoys me when using Scheme or Common Lisp is that local variables cannot be declared at the place where they are used without adding another let block. Therefore, either the variables need to be declared at the top of the block, or you get code which isn't readable because of the many nested sublists it contains. Or you just use one long let* declaration which stores all intermediate results, so you can code as you would in a C-like language.

Of course, inside a real Lisp, the let approach gives you a lot of benefits, like complete control over which variables are currently declared in your scope, so in that case it's a trade-off that you will gladly accept. But shell scripting is mostly about quickly coding a few basic computations. Therefore, I think it's better to make it possible to declare variables and scopes the way C does with curly braces and assignments. One could define the commands (block …) for declaring that a new scope starts here, (var … = …) for declaring variables, and additionally (set … = …) for setting them.
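One way to picture these three commands is a stack of string-keyed tables. The Common Lisp sketch below uses the made-up names with-block, declare-var, set-var and get-var (block is already taken in Common Lisp) and leaves out the = token; it is an assumption about how such a shell might behave, not a specification.

```lisp
;; Innermost scope first; each scope maps variable names (strings) to values.
(defvar *scopes* (list (make-hash-table :test #'equal)))

(defmacro with-block (&body body)
  "Run BODY in a fresh scope that disappears afterwards, like { ... } in C."
  `(let ((*scopes* (cons (make-hash-table :test #'equal) *scopes*)))
     ,@body))

(defun declare-var (name value)
  "Like (var name = value): declare NAME in the innermost scope."
  (setf (gethash name (first *scopes*)) value))

(defun find-scope (name)
  "Return the nearest scope that declares NAME, or NIL."
  (find-if (lambda (scope) (nth-value 1 (gethash name scope))) *scopes*))

(defun set-var (name value)
  "Like (set name = value): update the nearest existing declaration."
  (let ((scope (find-scope name)))
    (unless scope (error "Undeclared variable: ~a" name))
    (setf (gethash name scope) value)))

(defun get-var (name)
  "Look NAME up, innermost scope first."
  (let ((scope (find-scope name)))
    (unless scope (error "Undeclared variable: ~a" name))
    (values (gethash name scope))))
```

A variable declared inside with-block vanishes when the block ends, while set-var on an outer variable stays visible afterwards, which is the C-like behaviour described above.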

Above, I suggested allowing arbitrary strings as variable names. Without the restrictions that a fixed let syntax would impose, you can define variables named by arbitrary strings. This approach does have some disadvantages: you can produce very bad-looking code, and of course you can use it as a hack for getting arrays and dictionaries. But if you quickly want to collect a few elements into indexed variables, there is nothing wrong with this; if you really exploit it to the maximum, you will get ugly, unreadable code, but that's your problem then.

There should be a convenient piping- and streaming-API

Yes, I said a shell language shouldn't be too mighty. But piping, partially interpreting the output and generating the input of a process is something that has to be done so often that there should be an API for it included in the shell language. What happens if you don't add such an API can be seen in the various hacks that have found their way into bash scripts. To be more versatile, tools like awk and sed evolved, but – honestly – much of the code written with them is pretty unreadable. And it's still hard to handle binary data, which is sometimes needed, too.

So I think forbidding the handling of binary data and forbidding reading from a stdout and writing into a stdin directly produces more complexity than just adding a simple API that can handle all of this. I mean, in the end, it's the only real way for the shell to communicate with the programs it calls, except for the return values.

Conclusion

A lot more could be said about shell languages, I think. And of course, every point I mentioned is arguable. Something that goes in the direction I described seems to be Tcl, but as far as I can see, Tcl is more a scripting language than really a shell language, even though the interpreter is called „wish“.


The five hundred and fifty-fifth article

Sun, 14 Feb 2010 01:10:59 +0000

I would hereby like to point out that this is the five hundred and fifty-fifth article on this blog.

Now, what does this ominous number, five hundred and fifty-five, tell us? What does Wikipedia, for example, find about it? Well, among other things Wikipedia suggests the year five hundred and fifty-five, followed by the telephone number five hundred and fifty-five, which seems to be reserved for fictional telephone numbers in the USA, which is why those so often start with 555. There is even a list of such telephone numbers. Very important, that. I wonder whether there is a copyright on them, too? Who knows. Do we actually have something like that in Germany as well? If not, a law must be passed immediately. It cannot be that the space of fictional telephone numbers is a lawless space.

In any case, in Germany we have a Bundesautobahn 555, and the German Wikipedia also knows of an integrated circuit. Well, for how much longer is of course the question. Because without a doubt none of this is relevant, so it will surely be deleted at some point.

From the English Wikipedia one also learns that five hundred and fifty-five is the natural number that lies between five hundred and fifty-four and five hundred and fifty-six. And that it is a sphenic number, that is, a number which can be written as a product of three distinct primes, each to the first power, in this case three, five and thirty-seven. It is also a Harshad number, that is, a number divisible by its digit sum, with respect to the bases 2, 10, 11, 13 and 16; the respective representations are 1000101011(2), 555(10), 465(11), 339(13), 22B(16), so the digit sums are five and fifteen. So much for Wikipedia. More precise is the following program, which is not necessarily written efficiently, but suffices as a quick hack with minimal thought:

clisp -x '
;; Digits are stored least significant first.
(defun basednum-to-int (base digits)
  (if digits
      (+ (car digits) (* base (basednum-to-int base (cdr digits))))
      0))
;; Add one to a digit list, carrying where needed.
(defun basednum-inc (base digits)
  (if digits
      (if (< (car digits) (1- base))
          (cons (1+ (car digits)) (cdr digits))
          (cons 0 (basednum-inc base (cdr digits))))
      (list 1)))
;; Count up from zero until the digit list represents INT.
(defun convert-to-base (base int &optional (sofar nil))
  (if (equal int (basednum-to-int base sofar))
      sofar
      (convert-to-base base int (basednum-inc base sofar))))
(defun sum-of-digits (base int)
  (let ((ret 0))
    (dolist (i (convert-to-base base int)) (incf ret i))
    ret))
;; Try the bases 2 through 554.
(dotimes (i 553)
  (if (zerop (mod 555 (sum-of-digits (+ 2 i) 555)))
      (format t "~d, " (+ 2 i))))'

whose output 2, 10, 11, 13, 16, 19, 21, 23, 37, 38, 46, 55, 61, 75, 91, 109, 111, 112, 136, 149, 181, 185, 186, 223, 260, 271, 276, 277, 371, 445, 519, 541, 551, 553 should be all the bases (except those greater than or equal to 555, for which this holds trivially anyway) in which this number is a Harshad number.
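For comparison, the same check can be written more directly by peeling off digits with repeated division instead of counting up to the number. This is a sketch, not the program that produced the list above; the names digit-sum and harshad-bases and the keyword parameters low and high are made up for it.

```lisp
(defun digit-sum (n base)
  "Sum of the digits of the non-negative integer N written in BASE."
  (loop while (plusp n)
        sum (multiple-value-bind (quotient digit) (floor n base)
              (setf n quotient)
              digit)))

(defun harshad-bases (n &key (low 2) (high (1- n)))
  "Bases from LOW to HIGH in which N is divisible by its digit sum."
  (loop for base from low to high
        when (zerop (mod n (digit-sum n base)))
          collect base))
```

For instance, (digit-sum 555 16) is 15, matching the digits 2, 2 and B, and (harshad-bases 555) starts with 2, 10, 11, 13, 16.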

Interesting.


Software Namings

Wed, 03 Feb 2010 22:52:18 +0000

Have you ever searched for „Elephant“ on Google? Well, if you do, you will probably find a lot of films and articles about the well-known animal we call „elephant“. So if somebody told you about a piece of software called „Elephant“, you would have to add a lot of additional search terms or you would have no real chance of finding it. Besides maybe other projects, there is this persistence library for Common Lisp called „Elephant“. I first heard of this library when reading the Reddit entry „Comparison of CL-SQL and Elephant“, and actually, this title pretty much shows the problem with software namings – it simply sounds stupid!

Now search for „Python“. Yeah, the Python programming language is more common than the animal – you will find a lot of stuff for both.

I can remember a lecturer talking about a project named „ant“. Of course, ants are insects. And Ant is a build tool for Java. But he meant an example of an agent-oriented programming language. Calling it Ant may fit agent-oriented programming, but since the word is so generic, you will find „ant“ mentioned on almost every website about agent-oriented programming – even without any reference to this system.

It seems that animal names (Elephant, Python, Ant) are generally liked by people writing software. But of course, it's not limited to animal names. In general, ambiguous names seem to magically attract people.

At least Pidgin – which alludes to pigeons – takes its name from the kind of language. Before that, it was called GAIM. And the library behind it was called libgaim. OK, they had to change the name for trademark reasons. But did they really have to call it Pidgin? A name which has almost nothing to do with IM? But OK – they called it Pidgin. Let's accept that. So the backend library could have been called libpidgin, and everybody would be happy. But no, the backend is called libpurple now – so that it is nearly impossible to find that name if you don't know it already. Great!

Calling an operating system Windows already made some people forget what „Windows“ are.

Does anybody remember what an „Apache“ originally was? Well, at least the inventors of the Cherokee web server obviously did …

In the Lisp world, words sounding like „Closure“ seem to be popular. There is the Closure web browser, the Common Lisp compiler named Clozure, and – of course – Clojure. Why not start more projects like this?

How about Closhure? Or Clochure? Or Closchure? Or Cloyure? Or Cloşure? Or Cloжure? Or Clošure? Or Cloʃure? Or Cloש‎ure? Or Cloシュre?

Never mind. About two years ago I was very annoyed by the state of FastCGI and CLOS streams under Common Lisp, so I began to write a FastCGI library which should not depend on any implementation-specific features (except for a simple socket binding). I ran into the problem that this is complicated without CLOS streams, so I began trying to write something portable for this as well (well, today I think differently, but in those days I was still a beginner – and doing experiments is a good thing). I called it „Fastcgi Usable in Common-lisp Kits Yachting through the Oceans of Unportability“ (which is – of course – mostly shortened to an acronym), and the stream library was called „Commonlisp User Mode Binary ALLpurpose Streams“ (which is also shortened to an acronym). A pity that I haven't released it – maybe Reddit would now have a title „comparison between clos-streams and …“.


The finale of my project thesis …

Wed, 06 Jan 2010 10:13:39 +0000

This is how it feels:

(For lack of time this was drawn in 5 minutes … so it doesn't look good …)


What „Lispy“ means to me

Sat, 12 Dec 2009 03:48:02 +0000

Well, it's been quite a long time since my last real meta-Lisp post, so why not write one now, while waiting to get tired enough to go to bed.

So well, „Lispyness“ is something which is sometimes discussed when talking about solutions to problems. „Lispyness“ is the reason why it is so hard to create widely accepted and widely used FFI bindings, „Lispyness“ is the reason why some software is less efficient than it could be, and in general, „Lispyness“ can be the reason for quite a lot of design decisions. But even though there is a consensus about some aspects of „Lispyness“, the concept „Lispy“ is not clearly defined.

I will now list a few things which are essential for a programming language (or concept) to be „Lispy“. They are just my opinions, and they are likely to change in the future – at least partially.

Simple – one thing which is essential. Having a small core which can be easily implemented. And removing boundaries rather than adding new features. An example of such a Lisp feature is the reader-macro technology of Common Lisp (even though Common Lisp doesn't really have a small core) – you don't have to change the language standard to add new syntax structures to it. The syntax in general – bare S-expressions – is such a thing. Macros in general are such a feature. In fact, this was one aspect which won me over to Java (at a time when I didn't know Lisp). Java had about 50 keywords, a simple reflection API, and a huge library – but these keywords were well-defined, and you could basically do everything with them, while the library was just a bunch of classes, which is a nice thing to have but is not part of the core of Java. With a simple parser, one could produce one's own Java interpreter and compiler – unfortunately, the Java bytecode is a lot less simple, and most of the library was already compiled.

Liberal – remove boundaries if there is no need for them. Something I often notice in computer science in general is that boundaries are built into systems even though they do not really make any sense. They are just there because nobody cares, but sometimes they can cause problems. For example, it really took a long while until someone finally added drawing canvases to webpages – for a rendering engine, such an object is not hard to implement. The same goes for video and sound. But instead of just providing it, and maybe extending its API, a whole plugin technology evolved to make up for this lack. Most Lisps are liberal in what you can do. Macros let you generate code, and in theory you can even download that code while compiling – which is not nice and shouldn't be done, but can be useful in certain situations, in which nobody then has to find a hack around the problem. The point is to understand the difference between the things one can do and the things one should do.

Dynamic – also essential. You don't build blocks of reinforced concrete, you build lumps of mud. You use dynamic structures like lists, trees and structured objects, rather than some static structure which is rigidly fixed. Comparing the Apache web server with the Hunchentoot web server is maybe the best way of expressing this difference: Apache is a static web server – it has a plugin technology, so it can be extended, at least if you know how to adapt the configuration files and you restart it every time you change them. Under Hunchentoot, you can have a REPL running inside the Lisp process running your web server. You can add and modify sites on the fly, try new settings without restarting the server, and change its behavior while it is running. Outside of Lisp, this can be found in JavaScript, for example – objects can simply be extended with new slots, almost nothing is static. I think this is something all the Lisp dialects share somehow. Another example is CLOS – actually, what you do by extending generic functions is building dynamic case decisions.
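The remark about generic functions can be illustrated with a small Common Lisp sketch. The generic function render is hypothetical and exists only for this example; each defmethod adds one more branch to what would otherwise be a fixed case statement.

```lisp
(defgeneric render (thing)
  (:documentation "Hypothetical generic function, used only for illustration."))

(defmethod render ((thing string))
  (format nil "text: ~a" thing))

(defmethod render ((thing integer))
  (format nil "number: ~d" thing))

;; Added later, possibly at the REPL of a running image: a new "case"
;; that leaves the two methods above untouched.
(defmethod render ((thing list))
  (format nil "list of ~d items" (length thing)))
```

Calling (render 42) picks the integer branch and (render '(1 2 3)) the list branch, and further branches can be added without restarting anything.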

Imperative – arguable, but my opinion. I don't understand why Clojure and Scheme try to get rid of the imperative features they have. Imperative languages became discredited in the scientific world, maybe because many imperative languages lack modern features. Anyway, to me Lisp shows how imperative programming can be done well. In many situations, a functional approach is the most natural way of doing something. But sometimes, when doing tail-recursive loops etc., it just gets artificial. Sometimes an imperative algorithm is just more natural than trying to put it into a tail-recursive form. And sometimes prog/go forms („goto“ – for the C programmer) are simply the easiest way of programming something. To quote one of my professors: programming languages are not made for the computer, they are made for humans. And they are also not made for some strict type system, they are made to give an easy and convenient way of expressing algorithms. Standard ML also has imperative features. And even Haskell has them – encapsulated inside monads. Monads may solve formal problems, but in the end, to me they are as artificial as forbidding the use of „goto“ commands – trying not to use something which can cause problems when not used correctly. Lispyness means imperativeness to me. I think I wouldn't consider a language which is not imperative in some sense to be a Lisp.

Chaotic – also arguable, but to me also essential. If it is completed, it is not Lisp. There is always something that can be made better. There is always some edge which needs an additional hack. There is always a library which doesn't run on some platform. That's software – but most software tries to hide it. To me, it's the spirit of Lisp to just accept it and try to get along with it as well as possible. Well, there is a huge standard for Common Lisp – but outside this official standard, few unofficial standards have really evolved, and the different implementations of Common Lisp really do behave differently. This can get on one's nerves, but you can always find a simple workaround – and minimize the changes needed to the code you have already written. In C/C++, you basically have one means – namely preprocessor directives – of influencing the compiled code and adapting it to the compiler you currently use – but often you have to write the same thing twice, and as far as I can see, more often than under Common Lisp. For Scheme, there is only a small standard anyway, and the compilers differ – as far as I can see – in almost everything beyond this standard (and sometimes even inside it). Portability is very important to me – but it can only be achieved when everybody tries to make their own software as portable as possible, since there are situations in which portability is not possible or not even useful. When somebody writes a binding for some elementary Linux syscall to make file access more efficient, of course this cannot necessarily be used under Windows. So anybody using this binding will have to write a wrapper around it, using normal file access (or something similar for Windows), to run it under Windows. But as long as everybody keeps trying to write code as portably as possible, and in such a way that it can easily be ported if somebody needs it, portability of the software will in the end be only a small issue.

So. Some aspects of lispyness, or what it means to me.


Yeah, just read there is a Scheme-Interpreter in Clojure …

Tue, 01 Dec 2009 22:28:15 +0000

Just read on Reddit that there is a Scheme-Interpreter for Clojure. What a nice thingy.

I mean, one could just use JScheme, but hey, who cares. It's Clojure. It's fresh. It's new. It's Web 2.0.


Best lisp-server-framework for blog-hosting?

Tue, 13 Oct 2009 19:54:15 +0000

Since I finally decided to get my own little VPS, I want to begin hosting this blog on it asap.

Well, I could use WordPress there, and certainly, I will do so in the beginning, because it seems to me the easiest way of porting the old data from this blog to some storage I possess.

And actually, in the beginning, I want to continue posting my stuff here, while posting it in parallel on the other server.

Still, in the long term, I hope that I can completely switch to this new server. Anyway, there will be a lot of work to do. At the moment I wonder about the Software I will use.

I would like to use Kompottkin’s blog software (if he allows me), which is based on Common Lisp, using Clisp with CGI as far as I know. It looks nice, and – well, it's the only CL blog software I know anyway. But actually, I don't think that Clisp+CGI is good, since I want to use Lisp primarily.

Maybe Hunchentoot is a better choice (and it shouldn't be too hard to port that software). But at least the last time I tried it, Hunchentoot didn't work well on CLISP: it couldn't handle more than one connection at a time. I saw a new thread management in the API docs, and as far as I understood it should be possible to make the server single-threaded, but anyway, I don't know if this is a good idea. Then there is SBCL, which has a strange memory management. On my VPS, I can use at most 2 gigabytes of RAM with it, and can only run one SBCL process at a time. At least SBCL seems to be very stable, as long as there is enough memory. Maybe it is possible to tell SBCL to stop all functions that consume too much memory, etc., since SBCL doesn't crash on heap exhaustion, but signals an error. Then there is CCL. CCL appears to use memory more efficiently, as far as I tested, but it segfaults when you allocate too much of it. Then there is ECL, which shouldn't have these problems at all, but I don't know whether Hunchentoot runs on ECL yet.

Well, there are a lot of possibilities. All should work, but all must be tested. I know that „begging for comments“ is not a good thing for a blogger to do, but … well, I would appreciate any comment by any person knowing more.