Get a free PDF Reader

Thu, 27 May 2010 01:11:55 +0000

While looking for instructions on embedding Evince via MozPlugger, besides the solution I found here, I also came across a nice link to this campaign by the FSF Europe.

It is an appeal to use a free PDF reader. Well, under Linux and other free systems there are a lot of them, and they are mostly good. I actually do not understand why there are still people who prefer the Adobe Reader under Linux. Not only are there a lot of alternatives, they are also mostly much better (faster, easier to use). Only a few PDFs do not work in them – the ones created with some strange WMF tools (M$ for the win) and of course the ones encrypted such that explicitly only Adobe Reader can open them. I have run into this exactly twice in my whole life – once with a PDF created with some strange settings in Scientific Workplace, and once with a professor who wasn't allowed to publish parts of his book without encryption. Even commercial PDF providers usually don't use this, because it's basically useless – it is a crude form of DRM, and modern ebook formats have much better techniques for that.

Under Windows, too, I don't want to use the Adobe Reader; there I mostly use the (non-free) Foxit Reader. The FSFE's list names Evince for Windows – but Evince for Windows was in a beta state, and I wouldn't have recommended it to normal people. Okular was stable but needed a full-blown KDE installation, and KDE for Windows is still no fun. I never tried Sumatra PDF, though. I will have to do that.

Well, actually, I don't like PDF much. Many modern PDF files are bloated. I liked early versions of PostScript much better. And at the moment I like djvu very much – at least for ebooks, djvu seems to be a good format. As a comparably simple format, I like SVG. I mean, it's bloated with XML stuff, but at least the inner structure is simple.

It's a pity that only few pieces of free software work properly under Windows. Windows is still the main platform for most people, and to convince them of free software, it would help to let them work with it under Windows first.

Comment Feeds, Please! (and other things about blogging)

Wed, 12 May 2010 18:46:05 +0000

Well, there may be a lot of "professional" and "famous" bloggers out there who can tell you what they like or dislike when reading blogs, and if you want to create a "professional" blog rather than a private little blog about the things you are interested in, you had better ask them instead of reading what follows. Because I will now tell you about a few things I don't like on some blogs – reasons why a blog might be thrown out of my RSS feed list.

Have and Maintain a Feed

Yes, there are still people who proudly write their own blog software but don't provide any feed. Even if your site has interesting content, there are thousands of other sites that provide interesting content too, and at least for me, it's rather hard to produce something so interesting that I am willing to visit your site periodically and watch for news.

Those times are gone. There are too many people writing their opinion online. I just counted 439 newsfeeds in my feed reader, and at least half of them provide information that interests me, but most of them don't do so often. I cannot manage to check 439 websites by hand, especially since I mostly read this stuff in my free time, without getting anything out of it I really need – i.e. just for fun.

And something that especially gets on my nerves: when I have already subscribed to a feed and then the blogger changes his software, and with it the feed URL, without leaving a note in the old newsfeed. So I only notice it through the error messages of my feed reader. This is annoying!

Ah, and especially: make the feed easy to find. Provide visible feed links as well as link tags that feed readers can recognize. I don't want to have to "search" your site for them.
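
For illustration, such autodiscovery link tags in the page head look like this – the URLs are placeholders, of course:

<link rel="alternate" type="application/rss+xml"
      title="RSS feed" href="http://example.com/feed/" />
<link rel="alternate" type="application/atom+xml"
      title="Atom feed" href="http://example.com/feed/atom/" />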

I don't care much about design – just make your site work with as little as possible

Many people like websites that put great effort into their design. There is nothing wrong with that, except that this effort often leads to huge demands on the visitor's browser.

In particular: if I go to your site, don't expect me to activate JavaScript unless there is an explicit reason. If you use jsMath because you cannot run LaTeX on your provider's server, or you are writing browser games and therefore need JavaScript, then kindly excuse yourself when I visit your site and ask me to activate JavaScript, rather than commanding me to do so. JavaScript uses my system resources and might introduce additional security vulnerabilities – and if you are just too lazy to provide an interface that doesn't need JS, without really needing it, I am not willing to trust you enough to let your code execute on my computer!
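
A friendly way of asking is a noscript fallback – a minimal sketch, with placeholder wording:

<noscript>
  <p>Sorry – the comment form needs JavaScript to filter spam.
  If you prefer to keep scripts disabled, you can reach me by mail.</p>
</noscript>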

Same for Flash animations. Flash is like a colander when it comes to security. There are a few domains I trust – for example large video portals like YouTube, Vimeo or Dailymotion, because if they became vulnerable, they would fix it as fast as they could. I don't want to see Flash on your website except when it's really necessary. Ok, it's still necessary for embedding videos or sound – I hope those times will end soon, but there is no other possibility that really works right now. So yes, I can understand that you might use Flash when it's impossible not to.

Advertisement services also sometimes use Flash. I don't see why they do, instead of just using GIF animations, but well, I can understand that you want to earn back the money you pay your provider – so keep your Flash advertisements. I will block Flash anyway if you don't give me a reason not to.

But as soon as your site has some fancy-looking sidebar or other shit programmed in Flash, I will certainly not use it.

Ah, and don't use cookies if there is no reason for them. Some ad services might require them, but I will block them if I don't see a reason to give you the opportunity to store data on my PC!

You may use Cascading Style Sheets, and well, if you really want, you may provide additional functionality using JavaScript and cookies – yes, these technologies are nice for some purposes, and if I have been reading your website for a long time, I might feel comfortable giving you the opportunity to store small pieces of data and execute small pieces of code on my PC. But if you try to force me to do so, I will not give an inch.

Oh, and a note on CSS: CSS is there to separate the design from the content, so that your site stays viewable with many technologies. Maybe I want to visit your site using lynx. So please put the boilerplate elements below the interesting stuff in the markup. I don't want to scroll through five screens of stupid login, blogroll and linklist information before I finally get to the content I want to see.
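
The usual trick, sketched here with made-up ids: the content comes first in the markup, and CSS moves the sidebar next to it, so text browsers, which ignore the CSS, show the content first:

<style>
  #content { margin-right: 14em; }
  #sidebar { position: absolute; top: 0; right: 0; width: 13em; }
</style>
<div id="content">The actual article ...</div>
<div id="sidebar">Login, blogroll, linklists ...</div>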

Allow comments from all people

There is a lot of comment spam, so I can understand that you may want to look at the comments I write before publishing them. I can understand that you want me to enter a captcha; I can even understand requiring JavaScript for commenting to prevent spam. But if so, don't just assume I have JavaScript turned on – tell me that comments need JavaScript before producing strange errors (or just doing nothing).

You want my name (or a representative nickname) and of course an e-mail address of mine. Maybe you are even kindly adding my Gravatar icon. But don't forget to give me the possibility to put a website URL of mine on top of my comments. Maybe you are not interested, but other people reading my comment might want to learn more about me – I help you keep your blog alive, so in exchange you can help me. Fair is fair.

Don't expect me to register anywhere or have an OpenID. Yes, I have an OpenID, and if you kindly ask me to provide one, I might think about it. But requiring such a service, or even registration on your blog, before I can post is arrogant, and unless you give me something really awesome in return, I just won't post comments on your site. And if I cannot discuss what you write, well, your site gets less interesting to me.

Of course I can understand you if your blog provider doesn't allow this. But then you might consider changing the provider. At least WordPress allows comments in general. If a blogging service doesn't allow them, just don't use it.

Have thread-based comment feeds or at least mail notifications

So, you have managed to make me post a comment on your page. Congrats! Now maybe I expect some reaction from you or some other person. If you are using one of the larger blogging services like WordPress, the thread I just posted in has a comment feed telling me about new comments there. Some blogs don't provide this but offer mail notifications when new comments come up. I can live with that (I gave you my mail address anyway).

But you have to give me something. Otherwise I would have to keep the tab with the comments open – and since I am working on at least three distinct computers, partially with distinct browsers, I will certainly not follow those comments for long.

And if I can't follow the reactions to my comment, I will think twice before posting one in the first place.

Don't be professional

Unless you are a real journalist who has already worked for newspapers or plans to do something like a newspaper, or you host a science blog, don't be professional. I am sick of all this "professional blogging" stuff. For me, a weblog must not be professional, except maybe when it's about science – if it's professional, it becomes an online newspaper, but then it should be labeled as such and compete with others of its kind. In blogs, I want to read the opinions of many unprofessional people.

Conclusion

Well, that's what you should do if you want me to read your blog. If you are a famous blogger, then you might as well ignore me, because you have so many other followers. But surprisingly, most famous bloggers meet my requirements – a coincidence?

I like private blogs. I like scientific blogs. I like small blogs that post little more than once a month, as well as bloggers who write five articles a day. I wouldn't read your blog because it's special; I would read it because it's one of many.

Always remember that you are unique – just like everybody else.


Personal Notes on LaTeX

Fri, 07 May 2010 02:34:08 +0000

I am making this post to finally have everything written down in a central place.

Firstly, whenever I can, I use TeXmacs. I don't like LaTeX; I never found a real introduction that goes into the implementation details of LaTeX or explains how to really write code in it – I mean, it's said to be Turing-complete, but I don't actually know how to do some quite simple things with it.

However, TeXmacs is said to be capable of virtually anything LaTeX is, and it's scriptable with Scheme – but well, sometimes it's not capable of some things, and it's not as well-documented as LaTeX. If I just want to write lecture notes, I use TeXmacs, because with it you can typeset the material fast and it looks very good.

But for "larger" things like my project thesis, I decided to use LaTeX. Before I knew TeXmacs (which is – unfortunately – barely known), I also used LaTeX, and before I knew LaTeX, I used the OpenOffice.org math editor (which isn't that bad at all).

So, here are a few notes on problems I had, and their solutions:

Sometimes one wants a page without a page number. Setting \pagestyle{empty} should work in most cases – but not in all. \maketitle changes the page style of the title page on its own, so you have to use \thispagestyle{empty} after it instead. Setting the page style back to the "default" value is done by \pagestyle{plain}, which is the default, afaik.
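
A minimal sketch of that title-page case (title and author are placeholders):

\documentclass{article}
\title{My Project Thesis}
\author{Me}

\begin{document}
\pagestyle{empty}     % no page numbers anywhere ...
\maketitle            % ... but \maketitle sets this page back to "plain"
\thispagestyle{empty} % so override it again, for the title page only

Some text.
\end{document}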

Then there are dedication pages. What I want is a single page which is empty except for one small text in the center. Well, there are the commands \hfill and \vfill, filling up a line horizontally and a page vertically; using these should make it possible. So I tried something like \vfill \hfill my-dedication \hfill \vfill \newpage. It didn't work. After a lot of trying around, I finally "hacked" around it by inserting empty formulas, which make LaTeX think it has to reserve space: $ $\vfill$ $ \hfill my-dedication \hfill $ $ \vfill $ $. Not perfect, but it works.
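
For what it's worth, the bare \vfill seems to do nothing because TeX discards stretchable glue at the top of a fresh page, so the empty formulas act as boxes for the glue to work against. The starred \vspace*{\fill} survives at the page top, so the same page can also be written as:

\newpage
\thispagestyle{empty}
\vspace*{\fill}       % the starred version is not discarded at the page top
\begin{center}
my-dedication
\end{center}
\vspace*{\fill}
\newpage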

For code listings I finally found the LaTeX package "listings" and this nice tutorial (which is in German). This is yet another of those "I can do everything" packages of which LaTeX has so many. In my opinion, a language should give you the possibility of defining your own routines and merely help you with that, not keep you, as well as it can, from doing anything yourself while providing packages for "everything".
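
For reference, a minimal listings setup – the option values are just the ones I would start from:

\usepackage{listings}

\lstset{
  language=C,                  % default language of the listings
  basicstyle=\ttfamily\small,  % typewriter font, slightly smaller
  numbers=left,                % line numbers in the margin
  frame=single                 % a thin frame around each listing
}

% later, in the document body:
\begin{lstlisting}
int main (void) {
    return 0;
}
\end{lstlisting}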

By now I always use UTF-8. I don't see any reason to use anything else for my documents, especially when I want to include special characters like Hiragana or Katakana – just to avoid the encoding hell. Actually, I don't quite understand why anybody still uses anything other than UTF-8. Ok, some software needs fixed-width encodings, but those are special needs. For virtually everything the user has to deal with, UTF-8 should be best.
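
With pdflatex this amounts to two preamble lines; note that inputenc's utf8 option only covers characters the loaded encodings know about – for Hiragana or Katakana one additionally needs something like the CJK package or a Unicode-aware engine such as XeTeX:

\usepackage[utf8]{inputenc} % read the source file as UTF-8
\usepackage[T1]{fontenc}    % fonts with proper European glyphs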

Including graphics is another problem that always comes up. There may be a lot of packages which promise to place the graphics somewhere special, etc. – but none of them actually worked everywhere. Using PDF files with \includegraphics from the graphicx package has been sufficient for me so far – especially because I couldn't find anything that really worked better.
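
That is, nothing more than this (the file name is a placeholder; the width option just keeps the image inside the text area):

\usepackage{graphicx}

% in the document body:
\includegraphics[width=0.8\textwidth]{diagram.pdf}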

Then, line breaks. If I have a large formula, a large word, or a large \tt form, LaTeX either runs over the margins or gives up completely. I already used \emergencystretch=10000pt, which sort of solved this problem (that is, it stretched some lines pretty hard, but I didn't mind), but it created widows and orphans (it seems to undermine the prevention mechanisms somehow). Ok, it is a hard problem to choose what to do then. But the default answer I found was "just do it by hand", and seriously, that cannot be a solution. Especially since the solution seems clear to me: use the justification algorithm where you can, but if a line would become too empty to stretch, simply set that one line flush left. In my opinion that is the only thing one really can do about it – even if I did it by hand, I would do it that way. But I couldn't find any pre-defined package or instruction implementing this behaviour. So what I basically did was to use \flushleft everywhere. It doesn't look that "pretty", but it also doesn't look that "bad" – at least it looks consistent.
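
The two workarounds side by side – a sketch, with a made-up long identifier:

\emergencystretch=10000pt % allow very hard stretching before overflowing

\begin{flushleft} % or drop justification for the problematic paragraph
A paragraph with \texttt{some-very-long-unbreakable-name} that would
otherwise stick out into the margin.
\end{flushleft}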


Less JavaScript Is Better

Thu, 06 May 2010 01:24:39 +0000

Since I am currently auctioning off my MacBook on Ebay (just to mention that once more), I have had plenty of opportunity to use the Ebay interface.

Actually, companies like Ebay, Amazon, Microsoft, etc. in particular should have a strong interest in the internet becoming "secure" – at least in the sense that information only reaches the people they want it to reach, e.g. their advertising customers.

One security hole that gets exploited time and again is, for example, cookies that are not properly restricted. It is not good to walk through the internet accepting all cookies. There are good cookie filters. How about, say, only having to whitelist "*.microsoft.com", or "*.msn.de", to log in to MSN? Can't they create subdomains that forward to their advertising customers?

Even worse than cookies are scripts – the modern "AJAX" interfaces. Recently I had to wrestle with the Ebay interface. First, again, the problem with cookies. There is ebay.com and ebay.de – fine, that can be forgiven. But then there is also ebaystatic.com, ebayobjects.com, and goodness knows what else I cannot think of right now. Again and again something failed, again and again my form data was reset, because I did not know which domains I had to allow cookies and JavaScript for. The PayPal integration was yet another huge annoyance. In the end I had allowed scripting and cookies for a whole pile of domains. And I ask myself: what for? How about static.ebay.com, de.ebay.com, objects.ebay.com instead? Then I would have to whitelist *.ebay.com and ebay.de, and that would be that.

Apart from that, I do not understand what this interface uses JavaScript for in the first place. Only the HTML editor seems to use JS in a genuinely sensible way. Otherwise, some values are updated on the fly – sure, that may be more "comfortable", but every Ebay user should be trusted to manage clicking a "recalculate value" button.

The problem here is: where there is a complex interface, there is also a pile of complex bugs. For example, you are allowed to put HTML code into your listings. Just one of the webmasters has to overlook something once, and people can inject their own scripts and, in the worst case, get at session IDs and other user data. That is why it would actually make a lot of sense to build the interface on iframes instead, to expect a few more button clicks from the user, and to recommend disabling JavaScript – above all because JavaScript provides no really substantial benefit here.

But if JavaScript it must be, then please only from the same domain – say, *.ebay.com. Only the user-supplied content could be put on a domain of its own, for which one can then disable JavaScript in the browser.


A few ideas for a Lisp shell language

Wed, 28 Apr 2010 00:08:42 +0000

Bash is one of those languages that seem to be made for a few small purposes and are now used for virtually everything, relying on ugly hacks and producing unreadable, incomprehensible and undebuggable code.

There are other shell languages, and of course there is the CLISP shell. Except for CLISP, they are not "lispy" and mostly have a lot of syntax structures. I myself don't like too much syntax. Infixes can make code more readable if they fit into the language, as in Standard ML. Different kinds of brackets can help, too. But a lot of special syntax for special purposes makes a language more complicated and harder to use, at least for me. This is why I mostly don't like Perl – you can write beautiful Perl code, but most people don't. To me, it is no argument that a Perl programmer can express in one line what takes five lines of Scheme – the readability of the latter is usually better.

And well, CLISP as a shell language does not sound right, because it is a Common Lisp implementation, which means it is mighty – too mighty for a shell. A shell should be Turing-complete and simple; it should give the possibility to call other programs with parameters it gets in various ways – but its primary use remains calling other programs.

There is no point in efficiency (other than maybe having a polynomial-time algorithm instead of an exponential-time one); it should be simple, easy to understand, and serve its purpose.

So, thinking about a lispy shell language, I had the following ideas:

The REPL should be line-aware

That is, you should not always have to write parentheses around every instruction just to call programs. A line without parentheses and outside any context should be interpreted as if parentheses were put around it – that is, outside any construct like let, loop, etc., just for the convenience of small function calls. As soon as you are inside another lispy context (like let, if, begin, etc.), this should fail. Whether to allow this inside shell scripts is a good question – some programs may rely on syntax without parentheses; I would allow it under the same conditions. And above all: as soon as there is one parenthesis, parentheses are not optional anymore. This prevents ambiguities: having one pair of parentheses around a line should mean exactly the same as having none, while having two pairs should really always mean two pairs.
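
A purely hypothetical session, just to illustrate the proposal (no such shell exists, and all commands are placeholders):

echo hello world       ; a bare line, read as (echo hello world)
(echo hello world)     ; one pair of parentheses: exactly the same call
(if (test -e "foo")    ; inside a lispy form, parentheses are mandatory
    (cat "foo")
    (echo "missing"))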

There should be only one type of token

Most Lisps distinguish strings, symbols and several kinds of numbers. For a shell language, I think this isn't really necessary: strings are enough. They can carry all the information of the other types. Numbers can always be encoded as decimal strings – well, arithmetic gets a lot slower then, but for simple calculations it should be fully sufficient, and for complex ones one shouldn't use a shell language anyway. There should be two ways of writing strings: when they don't contain any special character, they can be written bare, like symbols in other Lisps; with special characters, they have to be put in quotes. And – above all – no unquoting inside strings as in bash or PHP. A function like string-append should be provided, and inside quotation marks there may be a syntax for special characters, but nothing more.

One problem that arises with this approach is how to handle variables. If you don't distinguish between numbers, symbols and strings, you have no way of determining whether the variable or the literal value is meant. Well, for the global environment variables there should be special commands getenv and setenv anyway.

For local variables, it is the same problem as quoting and unquoting in other Lisps. On the one hand, in most cases one could just say "don't name your local variables like your strings". Maybe that would be the best way, leaving the rest to the programmer – but it could also cause confusion. So for variables, a bash-like syntax such as $"…" might be good, i.e. a $ in front of a string to denote a variable. But then one would either have to enforce quotation marks after $, or add additional conventions for strings, which could get complicated. I think just defining a function-like object called $ is sufficient. Then you access your variables with ($ var). You can pass strings in both syntaxes, and in the end ${varname} is not so much less verbose than ($ varname). Of course, this may not be the most convenient thing imaginable, but in my opinion a simple, easily understandable programming model is more important than adding a lot of extra syntax to spare a few characters.
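
Again purely hypothetical, how that might read ($ and the variable names are just the proposal from above; a binding form for locals follows in the next section):

setenv EDITOR "emacs"           ; globals go through setenv/getenv
getenv EDITOR                   ; -> "emacs"
echo ($ greeting) world         ; local variable access via the $ function
echo ($ "a name with spaces")   ; both string syntaxes work after $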

There should be no high-level objects

Lists can sometimes be useful in a shell, too – lists of files, or, if you want streaming, lazy lists of characters. So the basic instructions cons, car and cdr should be there (maybe named "pair", "first", "second"). But I don't see any use for cyclic data structures in a shell language. Without cycles, simple reference counting suffices for memory management, rather than a full-blown garbage collector. Good garbage collectors can get very complex, while bad garbage collectors can be extremely bad in performance and memory usage; neither fits a shell language – a shell language should do its job quickly and then be quiet and keep out of the way. Reference counting, and simply not allowing cyclic data structures, sounds like the best way to go. So, for lists, car, cdr and cons exist, but there is no way of setting the car or cdr of an existing cons.

There should be C-like scope declarations rather than let-declarations

One thing that sometimes really annoys me when using Scheme or Common Lisp is that local variables cannot be declared at the place they are used without adding an additional let-block. So either the variables are declared at the top of the block, or you get code which is unreadable because of the many sublists it contains – or you just use one long let*-declaration which saves all intermediate results, so you can code as you would in a C-like language.

Of course, inside a real Lisp the let-approach gives you a lot of benefits, like complete control over which variables are currently in scope, so there it is a trade you kindly accept. But shell scripting is mostly about quickly coding a few basic computations. Therefore, I think it's better to provide a way to declare variables and scopes the way C does with curly braces and assignments. One could introduce the command (block …) for opening a new scope, (var … = …) for declaring variables, and additionally (set … = …) for assigning to them.
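
In the same hypothetical syntax, such a scope might look like this (echo and + stand in for whatever builtins the shell would provide):

(block
  (var i = "0")             ; declared right where it is needed
  (set i = (+ ($ i) "1"))   ; the numbers travel as decimal strings
  (echo ($ i)))             ; prints 1; i disappears with the block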

Above, arbitrary strings were suggested as variable names; without the bounds a fixed let-syntax would impose, you can thus name variables by arbitrary strings. This approach has some disadvantages: you can produce very bad-looking code, and of course you can abuse it as a hack to get arrays and dictionaries. But if you just quickly want to collect a few elements into indexed variables, there is nothing wrong with that – and if you really exploit it to the maximum, you will get ugly, unreadable code, but that's your problem then.

There should be a convenient piping and streaming API

Yes, I said a shell language shouldn't be too mighty. But piping, and partially interpreting the output and generating the input of a process, is something that has to be done so often that an API for it should be included in the shell language. What happens when you don't provide such an API can be seen in the various hacks that have found their way into bash scripts. To be more versatile, tools like awk and sed evolved, but – honestly – much of the code written with them is pretty unreadable. And it's still hard to handle binary data, which is sometimes needed, too.

So I think forbidding the handling of binary data, and forbidding reading from a stdout and writing into a stdin directly, produces more complexity than just adding a simple API that can handle all of this. In the end, it's the only real way for the shell to communicate with the programs it calls, apart from their return values.

Conclusion

A lot more could be said about shell languages, I think, and of course every point I mentioned is arguable. Something that goes in the direction I sketched seems to be Tcl, but as far as I can see, Tcl is more a scripting language than really a shell language, even though its interpreter is called "wish".


"Lazy Evaluation" in POSIX C, using sigaction(2)

Wed, 31 Mar 2010 23:32:16 +0000

It amazes me time and again how flexible POSIX (and other low-level machinery in common unix-like environments) is. It is surprising enough that fork(2)-ing does not in general double memory consumption, because the pages are mapped copy-on-write – which is a nice thing, but still done by the kernel – yet POSIX even gives userspace processes a lot of control over paging. So I wonder why most programmers just use malloc(3) – maybe because it's the most portable non-GC way to organize memory. I know that SBCL uses libsigsegv to optimize its memory management.

For a long time I have had the idea of implementing lazy evaluation in pure C using libsigsegv. The plan: allocate some address space and mprotect(2) it; as soon as someone accesses it, a SIGSEGV is raised, and the handler calculates the contents of the accessed memory block, unprotects it, stores the calculated values, and returns, so the values can be accessed from that point on. Ok, this is not quite lazy evaluation – for example, you cannot (at least not trivially) create infinite lists with it – but it goes in that direction; it is "calculation on demand".

So I wrote a program doing this with libsigsegv, but unfortunately I couldn't work recursively with it, i.e. if I accessed a protected part of the memory from within the SIGSEGV handler, the program would exit. But libsigsegv is only a wrapper, probably around sigaction(2), and sigaction itself allows recursive handling of signals when the flag SA_NODEFER is used. So I wrote the program below. It calculates the Fibonacci sequence "blockwise": an array fib is allocated and – on an x86 processor – split into blocks of 1024 integers. It also prints the registers from the context of the fault (that is, it produces a lot of output – be careful when compiling it). I originally tested it only on x86 Linux 2.6.26 and x86_64 Linux 2.6.32; on the latter it got stuck in an infinite loop (the reason turned out to be a pair of casts – see below).

#define _GNU_SOURCE /* expose the REG_* register names in the ucontext headers */
#include <sys/mman.h>
#include <sys/user.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <signal.h>
#include <ucontext.h>

#ifndef PAGE_SIZE /* newer libcs no longer define this in sys/user.h */
#define PAGE_SIZE 4096
#endif

int * fib;
int * fib_e;

/* the number of ints that fit into one page */
#define PAGE_FIBNUM (PAGE_SIZE / sizeof(int))

void sa_sigac (int sig, siginfo_t * siginfo, void * vcontext) {
    ucontext_t * context = (ucontext_t *) vcontext;

    /* print the registers of the faulting context, in hex and in decimal
     * (printf is not async-signal-safe, but this is only a demo) */
    size_t nregs = sizeof(context->uc_mcontext.gregs) / sizeof(greg_t);
    size_t i;
    printf("Regshex:");
    for (i = 0; i < nregs; i++)
        printf(" %lx", (unsigned long) context->uc_mcontext.gregs[i]);
    printf("\nRegsdec:");
    for (i = 0; i < nregs; i++)
        printf(" %lu", (unsigned long) context->uc_mcontext.gregs[i]);
    printf("\n");

    /* which element of fib was accessed? plain pointer arithmetic,
     * so this also works where pointers are wider than int */
    int number = (int *) siginfo->si_addr - fib;
    printf("Accessed: %d\n", number);

    int firstcalc = number - (number % (int) PAGE_FIBNUM);
    int lastcalc = firstcalc + (int) PAGE_FIBNUM;
    printf("Calculating Fibonacci-Sequence from %d to %d\n",
           firstcalc, lastcalc);
    /* unprotect exactly the page the access fell into
     * (PAGE_FIBNUM ints are PAGE_SIZE bytes) */
    mprotect(fib + firstcalc, PAGE_SIZE, PROT_READ | PROT_WRITE);

    if (firstcalc == 0) {
        /* initial elements of fibonacci sequence */
        fib[0] = 0;
        fib[1] = 1;
    } else {
        /* reading the previous page may fault recursively --
         * that is what SA_NODEFER is for */
        fib[firstcalc] = fib[firstcalc-1] + fib[firstcalc-2];
        fib[firstcalc+1] = fib[firstcalc] + fib[firstcalc-1];
    }

    int * ccalc;
    for (ccalc = fib + firstcalc + 2; ccalc < fib + lastcalc; ccalc++)
        *ccalc = *(ccalc-1) + *(ccalc-2);
}

int main (int argc, char* argv[]) {
    int fnum;

    if (argc == 1 || sscanf(argv[1], "%d", &fnum) != 1) {
        printf("Please supply a number.\n");
        return -1;
    }
    if (fnum < 0 || fnum >= 20 * PAGE_SIZE) {
        printf("The number must be less than %ld.\n", (long) (20 * PAGE_SIZE));
        return -1;
    }

    struct sigaction myaction;
    memset(&myaction, 0, sizeof(myaction));
    myaction.sa_sigaction = sa_sigac;
    sigemptyset(&myaction.sa_mask);
    myaction.sa_flags = SA_NODEFER | SA_SIGINFO;
    sigaction(SIGSEGV, &myaction, NULL);

    /* carve a page-aligned array of 20*PAGE_SIZE ints out of a slightly
     * larger buffer; uintptr_t keeps the address intact on 64 bit */
    int fib_begin[24 * PAGE_SIZE];
    uintptr_t fb = (uintptr_t) fib_begin;
    fib = (int *) (fb - (fb % PAGE_SIZE) + PAGE_SIZE);
    fib_e = fib + 20 * PAGE_SIZE;

    if (mprotect(fib, 20 * PAGE_SIZE * sizeof(int), PROT_NONE) != 0) {
        perror("mprotect");
        return -1;
    }

    /* the ints overflow, so this is the fibonacci number modulo 2^32 */
    printf("fib(%d) %% 2^32 := %u\n", fnum, (unsigned) fib[fnum]);
    return 0;
}

There are a few flaws in this source. The version I first wrote did not work under x86_64, and the part that was not independent of the bus width was most likely the casting of pointers to int, which truncates 64-bit addresses (and so sent the handler chasing a garbage address forever); the listing above uses uintptr_t and plain pointer differences instead. Of course, handling memory this way is highly unportable anyway, but most Unix derivatives should be able to do at least something like it.

The major flaw I don't like is that I am bound to calculating a whole block at once, and cannot mprotect the block again afterwards. In theory, it should be possible to find out into which register the value should have been read, then manipulate the context and jump back into it (setcontext(3) would be the tool for that), so the faulting instruction is not re-executed. But I cannot even access the registers by name: in sys/ucontext.h there is an enumeration with index names like REG_EIP, yet they seemed not to be defined when I tried to use them. Presumably that is because glibc guards them with __USE_GNU, and defining that macro directly before including sys/ucontext.h is too late once another header has already pulled the file in – defining _GNU_SOURCE before the first include, as the listing above does, is the proper way to get them. Apart from that, I just print all the registers in the hope of seeing some structure in them, but so far I haven't seen much. I guess one has to know a lot of low-level details about x86 to understand what is actually going on, which I don't.

Anyway, I think this is very interesting. It may not be useful at all, but it's a nice thing.


Is It Time for a New Programming Language?

Tue, 09 Mar 2010 23:52:12 +0000

Whoops – a dangerous security hole in Opera allows the execution of arbitrary code. It is somehow epic that this kind of thing still happens nowadays. In the days when every bit had to be used three times over, I can understand that mistakes crept in now and then – you had to program at a pretty low level in order to optimize.

Well, at the latest since the DOM and dynamic HTML, those days should slowly be over, at least for browsers. Of course I do not know what exactly Opera is written in, but I strongly suspect a rather large C++ core with quite a lot of low-level calls (I am, notoriously, no friend of C++, but even C++ lets you have array bounds checked automatically if you use appropriate wrapper classes – which are, admittedly, less efficient).

Personally, though, I lean towards the opinion that a browser should be written in a sensible high-level language, on top of a good core engine. Given that one keeps hearing about memory leaks in all browsers – which, considering the need to handle dynamically growing and mutating DOM trees, does not really surprise me – a garbage collector is probably even appropriate here: in the end, anything built to solve this problem efficiently would already amount to a garbage collector anyway (all the more so since JavaScript requires a garbage collector in any case). There are plenty of high-level languages that provide this. Common Lisp and Scheme come to mind, Perl (ok, sort of) and Python, Haskell, *ml, Java – and even a conservative Boehm-Demers-Weiser GC should do, with appropriate wrapping, in C++. Sure, runtimes can have bugs too. But it makes more sense to rely on a runtime that – as long as it works correctly – guarantees safety, to patch it when necessary, and thus secure a great deal of software at once.

But if you look at the languages listed above, one thing stands out quickly: somehow, by now all languages have quite a lot in common. It may be that imperative programming is considerably harder in Haskell (though thanks to the syntactic sugar around monads not really impossible – yes, one can argue about whether that is really imperative programming, but... it feels like it, it looks like it, so to me it is imperative), while C++ has only just received lambda constructs, which – as far as I can judge C++ – are barely usable. Still: at bottom, most programming languages are currently acquiring the features of pretty much all the others. That, in turn, is probably because most such features require only rather small adaptations.

Be that as it may, what emerges is more or less a uniform mush – at least that is my impression at the moment. All programming languages get more or less all existing features fiddled in somehow. That is not really a bad thing in itself – trying to integrate different features with one another to make them compatible is good as such. I just personally miss real innovations at the moment. On Freshmeat, for example, I constantly see some new interpreters, but I rarely see anything really new.

Logic programming languages, for example, do go a genuinely different way – though I must confess that apart from one lecture and a little documentation about LogTalk I know little to nothing about them. Logic programming seems to combine two aspects which, with my limited knowledge, I predict to be trends: context-oriented evaluation and program verification and extraction.

Context-oriented evaluation goes in the direction of lazy evaluation and offers an ideal way to distribute computation processes – very sensible, now that processors tend to become more numerous rather than faster. Moreover, it can be integrated very well into existing program code and does not stand in the way of type systems and verification.

Program verification and program extraction, on the other hand, are techniques that modern software projects are going to need. At the latest when you want to do cloud computing on a really relevant scale, you will run into the problem of executing another machine's computation processes sensibly and efficiently without having to fear exploits. And you will run into the problem of verifying a finished computation in significantly less time than the computation itself took.

That said, program verification is not a way to prevent all exploits – that stops, at the latest, when the hardware itself makes mistakes. But it is a way to prevent known security holes, by writing specifications that a program must fulfill. Problems like buffer overflows can be fixed effectively this way. Program extraction, in turn, is in my opinion a very nice way to generate prototypes from specifications, which can then be made more efficient later. I suspect the higher effort in producing the software would soon amortize itself.

Those are the trends I predict. Of course, all of this already exists, but for now it leads more of a niche existence, while functional concepts have by now spread quite widely – which, of course, also has to do with the decades of research in that area. These fields still have to catch up there.

So much for my opinion. I would be interested in other opinions and additions.