Are shared libraries still appropriate?

Thu, 20 May 2010 18:13:59 +0000

Currently, I am trying to remove some dependencies of Uxul-World. I was thinking of dropping LTK completely, though I like LTK, but as it is only part of the level editor, so far I thought I should keep it. However, it pulls in additional dependencies: lisp-magick right now, though maybe I will switch to cl-gd or to my own little FFI binding. If I did all of that directly inside SDL, without LTK, I would only need sdl-gfx to stretch and caption images.

However, static linking with SBCL against FFI bindings is hard to impossible, and as far as I remember the license of SDL forbids this for free software anyway. Under Linux, SDL is more or less a default library that is nearly always installed, while under Windows I don't think it is. Under Linux, there is no problem with providing a simple list of package dependencies, as long as the packages are not too exotic and can be installed easily. But of course I also want the game to be playable under Windows without having to install a whole Unix-like environment first. So maybe I should use OpenGL instead under Windows. Well, I will see.

I am currently not concentrating on portability but on finally getting some playable content into the game. In general, though, it's good to think about it already: I don't want to produce a dependency hell. I hate dependency hells. A lot of additional dependencies in a software package can really make me sad. Mostly this leads to a lot of strange download and installation procedures, since every project has its own policies, and in the end the only thing I have gained is additional knowledge about small libraries which I didn't even want to know about.

Having libraries like zlib or libpng linked dynamically really sounds anachronistic to me. Maybe this makes sense on embedded devices, but on every modern PC the additional memory footprint of linking them statically should be negligible. However, a real dependency monster depends on thousands of such small libraries, so the footprint can get remarkably large. With dynamic libraries, the kernel can map the same executable code into several processes at once, which needs less memory and makes the code really "shared".

But in the end, the only real bottleneck when just statically linking against everything and deploying large binaries with few dependencies is RAM usage. Neither hard disk space nor the additional download traffic should be an issue.

And again, the solution I would suggest for this could come from deduplication technologies. Assume you download a binary and execute it. The kernel has to read it anyway, and can therefore create an index of checksums of the memory blocks the binary contains. Assuming that mostly the same libraries are statically linked, and thus the same or very similar binary code occurs, the kernel will notice that it has already loaded equivalent blocks into memory, and can map them together, as it would do with shared libraries. A main difference would be that the pages would have to be mapped copy-on-write, since some software may change its executable code (deliberately or through a bug). The binary could additionally provide hints for the kernel: for example, flags that tell the kernel not to try to deduplicate certain parts of the loaded process image, because they may change or will only be used in extremely rare cases, or flags telling which library (and source file) some memory pages belong to, so the kernel can optimize the memory deduplication.
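
To make the idea a bit more concrete, here is a small user-space simulation in Python. It is only a sketch and not how any real kernel works: the names `PageIndex`, `split_pages` and `load` are made up for illustration, the page size of 4096 bytes is an assumption, and so is the choice of SHA-256 as the checksum. It also assumes that the statically linked library code ends up page-aligned and byte-identical in both binaries, which a real linker would not guarantee without extra effort.

```python
from __future__ import annotations

import hashlib

PAGE_SIZE = 4096  # assumed page size


def split_pages(image: bytes, page_size: int = PAGE_SIZE) -> list[bytes]:
    """Cut a binary image into page-sized blocks, zero-padding the last one."""
    return [
        image[off:off + page_size].ljust(page_size, b"\0")
        for off in range(0, len(image), page_size)
    ]


class PageIndex:
    """Hypothetical stand-in for the kernel's checksum index of loaded pages."""

    def __init__(self) -> None:
        # checksum -> page content (stands in for a physical memory frame)
        self.frames: dict[bytes, bytes] = {}

    def load(self, image: bytes) -> tuple[int, int]:
        """Simulate loading an image; return (shared_pages, fresh_pages)."""
        shared = fresh = 0
        for page in split_pages(image):
            digest = hashlib.sha256(page).digest()
            # Compare the bytes as well, so a checksum collision never
            # causes two different pages to be mapped together.
            if self.frames.get(digest) == page:
                shared += 1  # would be mapped copy-on-write onto the old frame
            else:
                self.frames.setdefault(digest, page)
                fresh += 1
        return shared, fresh


# Two made-up programs that statically link the same four-page "library".
fake_lib = b"".join(bytes([i]) * PAGE_SIZE for i in range(1, 5))
app_a = b"A" * PAGE_SIZE + fake_lib
app_b = b"B" * PAGE_SIZE + fake_lib

index = PageIndex()
print(index.load(app_a))  # (0, 5): nothing to share yet
print(index.load(app_b))  # (4, 1): the library pages are found in the index
```

The second program only needs one fresh page; the four library pages are found through the index, which is the effect shared libraries give you today.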

Just to emphasize this: I am not talking about deduplicating all of RAM, only about a small procedure run at the start of a new process, which searches for identical pages that are already mapped somewhere. I am sure this would take longer than just linking dynamically. But it shouldn't take too much additional time, and one could add heuristics that skip deduplication entirely for very small process images, to make them load faster.
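
Such a load-time policy could look roughly like the following Python sketch. Everything in it is hypothetical: the threshold `MIN_PAGES_FOR_DEDUP`, the `ImageHints` structure standing in for the flags a binary might carry, and the function `pages_to_scan` are names I made up, and the numbers are arbitrary.

```python
from __future__ import annotations

from dataclasses import dataclass, field

PAGE_SIZE = 4096           # assumed page size
MIN_PAGES_FOR_DEDUP = 16   # hypothetical threshold; smaller images are not scanned


@dataclass
class ImageHints:
    """Hypothetical per-binary flags: page numbers the kernel should not deduplicate."""
    no_dedup: set[int] = field(default_factory=set)


def pages_to_scan(image_size: int, hints: ImageHints,
                  min_pages: int = MIN_PAGES_FOR_DEDUP) -> list[int]:
    """Return the page numbers worth checking against the checksum index."""
    total_pages = -(-image_size // PAGE_SIZE)  # ceiling division
    if total_pages < min_pages:
        return []  # tiny image: skip deduplication so it loads faster
    return [n for n in range(total_pages) if n not in hints.no_dedup]


# A small image is skipped completely; a larger one is scanned except
# for the pages its hints mark as "do not deduplicate".
print(pages_to_scan(8 * PAGE_SIZE, ImageHints()))
print(pages_to_scan(20 * PAGE_SIZE, ImageHints(no_dedup={0, 1})))
```

The scan only happens once, at process start, and only over the pages this function returns, so the extra cost stays bounded by the size of the image.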

In any case, I think it would make working with binaries easier, both deploying and using them, especially outside of a package manager. For example, it would be an easier way of maintaining multiarch systems.

And, imo, it fits better into the world of free software, where you have a lot of chaotic dependencies and a developer cannot keep track of the newest versions and installation procedures of all of them, so he would just put everything directly into his project.

It's basically giving up a bit of infrastructure while getting a new way of solving the problems this infrastructure was created for. And it sounds like everything needed to implement this is already there. Of course, I am not a kernel developer, so I can't say how hard it really is. I am pretty sure there won't ever be such a thing in Linux, but maybe more innovative operating systems like OpenSolaris could provide it, as Solaris is known for its propensity for new technologies.