Monday, May 31, 2004
I just spent a couple of days writing up a short case study comparing two implementations of a DNS resolver decoding library in Java and in Lisp. One of the things that first attracted me to Lisp was Paul Graham's claim that Lisp was a more efficient language and that people could write programs faster in Lisp. I wrote my resolver library as my first "larger-than-factorial" Lisp program because I had just been working on a similar library for Java, and I wanted to compare the two languages. This case study provides a pseudo-academic comparison of the two implementations.
The results are quite favorable for Lisp. Line count and method/function count were about 45% lower than in Java when I normalized the programs for equivalent functionality (see the paper for more info). This is quite a difference. While I didn't write either of the programs under controlled conditions that would let me compare development time directly, by implication shorter programs get written faster, and Lisp wins that comparison.
Sunday, May 23, 2004
I have been consuming Lisp books lately at quite a clip. My latest purchase just arrived yesterday: The Art of the Metaobject Protocol. I'm looking forward to this one and will post a review as soon as I get done. I'm hopping on a coast-to-coast (Oakland -> New York) flight tomorrow that should give me some time to sink my teeth into it.
Believe it or not, I'm still working my way through Lisp in Small Pieces. As I said before, there are parts of that book that are just deep and require you to be very much awake to consume them. Unfortunately, I have just not been up to it for the past couple of weeks. I'm coming to the conclusion that that particular book may have to wait until I build up some more Lisp experience and background. It's just too hard of a slog without that.
Kevin Rosenberg has been working quite heavily on UFFI for the last couple of weeks. In the course of that work, Kevin tightened up how type information is used in the API. As a result of the changes, a Resolver bug surfaced: I was specifying an array type incorrectly, but it hadn't been caught before because the type was simply ignored under SBCL and CMUCL. Now that Kevin is using that type information for something, it was causing Resolver to compile with errors. Anyway, if you are tracking Resolver development, it should now compile and run cleanly.
Thanks to Dave Pearson for reporting the problem and helping test the fix.
Friday, May 21, 2004
There is a battle taking place in the Linux OSS community about which high-level programming language should be used for developing the next generation of Linux applications. While people are generally in agreement that the system should be a language with garbage collection, some level of dynamism, and general resistance to various security risks such as buffer overruns, there are many possible choices. On one side of the debate, we have the Ximian folks, now at Novell. They have developed Mono, an open-source clone of Microsoft's .Net platform, including a C# compiler, runtime libraries, and virtual machine. On the other side, you have the folks at Red Hat, who are very worried about Microsoft's intellectual property rights with respect to .Net, and therefore Mono.
To their credit, the Ximian guys seem to have done some really nice development here. Mono is starting to be used in a few open source projects and Novell wants to develop all new applications in C#/Mono rather than C/C++.
The trouble is that Microsoft seems to have some patents that cover various portions of the .Net system. Microsoft has stated that they will offer these patents under reasonable and non-discriminatory (RAND) license terms. This is required by ECMA as a part of the standardization work on .Net that Microsoft is pushing. The trouble is, RAND doesn't necessarily mean free, and it doesn't necessarily mean that you can just use the technology. You may still have to engage with Microsoft to hash out the legal terms.
The folks at Red Hat are, rightfully in my opinion, concerned that Microsoft hasn't stated categorically what the RAND terms are and that Microsoft is obviously not an open source friend. Red Hat fears that if too much open source development is done with Mono before Microsoft clarifies its position on the .Net patent licensing issue, Microsoft could announce terms that, while "RAND," are not amenable to FOSS development. Something that is RAND and acceptable to a company like Novell, with a whole legal staff, is not necessarily acceptable to a solo developer who finds it difficult to negotiate the Microsoft bureaucracy. Such terms would disrupt a whole bunch of projects and could jeopardize some important pieces of development. Red Hat has suggested using Java, or an open source clone based on GCJ and Classpath, instead of Mono. And then there is always Python, which Red Hat already uses extensively.
My hunch is that the guys at Microsoft are laughing like crazy at this controversy and are probably stalling deliberately. Why clarify things when you can turn the two leading Linux distributions upside down and against one another?
Of course, my own feeling is, why go with any of those? Lisp has been around more than long enough for any intellectual property rights to expire, and it's an international standard. It meets all the criteria. The only thing missing is a set of good Gnome bindings, but that could be remedied far faster than Mono's continued development.
In any case, this will be interesting to watch...
Monday, May 17, 2004
I never got a very deep answer as to why Lispers seem to prefer Arch. I had figured that there was something about Arch that makes it really good for Lisp code. Not so, it seems.
The best thing I heard was from Christian Neukirchen who wrote and said it was "...because Lispers like to do it the Right Way." Somehow, that sort of makes sense. Lispers certainly don't follow the crowd. They look for good solutions to problems and they aren't afraid to stick with something that isn't winning the technology popularity contest. So, while the rest of the masses struggle on with CVS, it seems like a lot of Lispers are turning to Arch for its superior approach to change sets, branching and merging, etc.
I exchanged a few emails with Christian and as a result of his urging, I have started using Arch myself. It's too early to tell you my experiences, but if CVS has you down and you want to look for something that may be a little less popular and a little bit better, try Arch. I'll write another blog entry summarizing my experiences once I have enough of them.
Sunday, May 16, 2004
I have released another update to my Resolver library, bringing the version number to 0.4. The basic changes this time around are:
- Support has been added for signaling error conditions when the decoding logic finds something amiss. This could be caused by corruption of the reply packet, for instance.
- General small bug fixes having to do with the handling of packet corruption.
- Got some self-tests working correctly. These were previously left in the source but did not function.
- Clarified the release licensing. I chose the LGPL. This was indicated before in the resolver.asd file, but I have now made it a bit more obvious with the inclusion of a COPYING file containing the LGPL license.
- There is now a documentation file, DOCS, that explains a bit about how to use the library.
You can install Resolver using ASDF-Install. See http://www.findinglisp.com/packages for more information.
Thanks to Nikodemus Siivola for his help with ASDF packaging.
Thursday, May 13, 2004
I saw a link to this article, titled Extensible Programming for the 21st Century, posted today on OSNews. The author, Dr. Gregory V. Wilson, is basically suggesting that next-generation programming systems need to be extensible. Specifically, he says:
This article argues that next-generation programming systems will accomplish this by combining three specific technologies:
- compilers, linkers, debuggers, and other tools will be plugin frameworks, rather than monolithic applications;
- programmers will be able to extend the syntax of programming languages; and
- programs will be stored as XML documents, so that programmers can represent and process data and meta-data uniformly.
The author goes on to describe "frameworks," which allow extension of the compiler/linker, and "extensible syntax," better known as macros, as being key to the next-generation language.
I found this quite funny. When I read it, all I could think of is that back-to-the-future, next-generation language called Lisp. Interestingly, the author seems to like Lisp as he mentions it (Scheme, in particular) several times during the paper. However, the author concludes that some XML-based storage format will be required to realize these things.
The author then recognizes that a pure XML-based programming language would, well, suck. So instead he suggests that XML will just be the storage format and special editors will transform the XML into something more human-readable at edit time. The author suggests this as a sort of "model/view" issue: the model is XML while the view is whatever the editor wants it to be. It could be something Lisp-like or something Java-like, for instance, depending on the programmer's preference.
The author seems to want to recreate Lisp in XML. My main question was, if you want Lisp, why not just program in Lisp? Toward the end, the author suggests that this is, in fact, Lisp (Scheme), but that Lisp failed because of the parentheses:
Scheme proves by example that everything described in this article could have been done twenty years ago, and could be done today without XML. However, the fact is that it didn't happen: as attractive as parenthesized lists are conceptually, they failed to win programmers' hearts and minds.
In contrast, it has only taken HTML and XML a decade to become the most popular data format in history. Every large application today can handle it; every programming language contains libraries for manipulating it; and every young programmer is as familiar with it as the previous generation was with streams of strings. S-expressions might have deserved to win, but XML has.
And yes, there are better (i.e. more succinct, and hence easier to process) ways to represent the semantics of programs than XML, but we believe that will turn out in practice to be irrelevant. XML can do the job, and is becoming universal; it is therefore difficult to imagine that anything else will be so compelling as to displace it.
Hmmm.... I think I'll stick with Lisp.
Friday, May 07, 2004
In surfing through all the various Lisp sites, I have noticed a preponderance of Arch being used as a source-code control system. The percentage of Arch users seems very high in the Lisp user base versus the rest of the open source world, which seems predominantly CVS. So, my question is, why?
I would suspect that the vast majority of Lispers also use Emacs rather than vi because of how well Emacs works with Lisp source code (no offense to the vi Lispers out there; I know people use vi on Lisp, too). At least, that's true of those not using one of the commercial products with an IDE editor, etc. That correlation I can understand.
If you know, feel free to drop me a line.
Last night I posted a question on c.l.l asking about modernizing Common Lisp. I had found a paper in a Google search that suggested some new areas for CL standardization. The features under consideration were basically high-level library features, the two notable ones being threading and network APIs. I personally think this would be useful to encourage more portable complex library code. While you can share portable CL library code today, it gets more difficult when you get into higher-level "OS interface features."
Put another way, we all standardized CAR, CDR, LIST, and MAPCAR a while ago. Now it's possible to have completely portable functions at the level of FACTORIAL. ;-) But we don't have a portable networking API, so it's more difficult to have portable HTTP libraries. Can those libraries be written? Yes, but the implementor either has to retrieve a compatibility library and build on that (and then which one?), or do all the implementation-specific coding herself.
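To make the friction concrete, here is a hedged sketch of the sort of implementation-specific glue a portable networking library has to carry today. The function name OPEN-TCP-STREAM is invented for illustration; the SBCL branch uses the sb-bsd-sockets module and the CMUCL branch uses ext:connect-to-inet-socket, and the details here are a sketch of those APIs, not authoritative usage.

```lisp
;; Hypothetical portability shim; OPEN-TCP-STREAM is an invented name.
#+sbcl (require :sb-bsd-sockets)

(defun open-tcp-stream (host port)
  "Open a bidirectional character stream to HOST:PORT, using
whatever socket API the host Lisp provides."
  #+sbcl
  (let ((socket (make-instance 'sb-bsd-sockets:inet-socket
                               :type :stream :protocol :tcp)))
    ;; Resolve the hostname, connect, and wrap the socket in a stream.
    (sb-bsd-sockets:socket-connect
     socket
     (sb-bsd-sockets:host-ent-address
      (sb-bsd-sockets:get-host-by-name host))
     port)
    (sb-bsd-sockets:socket-make-stream socket :input t :output t))
  #+cmu
  ;; CMUCL returns a file descriptor, which we wrap in a stream.
  (sys:make-fd-stream (ext:connect-to-inet-socket host port)
                      :input t :output t)
  #-(or sbcl cmu)
  (error "OPEN-TCP-STREAM: no socket support for this implementation."))
```

Every portable library ends up either writing this sort of read-time-conditional dispatch itself or depending on somebody else's compatibility layer; a standardized API would eliminate the duplication.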
Does this mean that it's impossible to write portable code? No, absolutely not. But the overall friction means that less portable code gets written and thus people spend more time porting code between implementations or rewriting code entirely because they are unaware of code that works with another platform.
Thursday, May 06, 2004
I have built out a bit more of this site, including the home page and a page for each of the hopefully many CL packages that may reside here.
Nobody sent me any leads on a good CSS template, so I have stuck with this just-better-than-ugly standard template from Blogger.com. If you want to save yourself the eyestrain, somebody will have to either point me at something free on the 'net or design something wonderful and send it to me.
Over time, I'll be filling in more of the site and adding some other sections. Check the blog or homepage to be notified of changes.
Tuesday, May 04, 2004
Now that my basic resolver library is "out there" in the wild, it's time to start learning some parts of CL that will enhance it. In particular, I'm starting to work with the condition system. From what I can glean, this is a bit like Java's exception-handling mechanism, but with some additional features that make it far more powerful. In particular, the CL condition system has the concept of restarts. When a condition is signaled, a handler (the analog of a Java catch block) has the ability to restart the computation in one of several ways. Often, you see this capability when your program has an error and breaks into the debugger. The restarts are then offered to the user to select, but any one of those choices could just as well have been selected under program control. (In this case, it really is being selected under program control; the debugger is the handler, since your program didn't specify its own. It's just that the debugger is a pretty dumb handler: it simply displays the available restarts and lets the user choose, rather than implementing more sophisticated logic to select among them itself.)
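Here is a minimal sketch of what this looks like in code; all the names (MALFORMED-RECORD, DECODE-RECORD, and so on) are invented for illustration rather than taken from Resolver:

```lisp
;; A condition type for a decoder to signal on bad input.
(define-condition malformed-record (error)
  ((octets :initarg :octets :reader malformed-record-octets)))

(defun decode-record (octets)
  "Pretend decoder: succeeds only if OCTETS starts with 0.
RESTART-CASE declares the recovery strategies this code offers."
  (restart-case
      (if (zerop (first octets))
          (rest octets)
          (error 'malformed-record :octets octets))
    (skip-record ()
      ;; One restart: ignore this record entirely.
      nil)
    (use-value (value)
      ;; Another restart: substitute a caller-supplied value.
      value)))

(defun decode-all (records)
  "Decode RECORDS, skipping malformed ones under program control."
  (handler-bind ((malformed-record
                  (lambda (c)
                    (declare (ignore c))
                    ;; The handler -- the analog of a catch block --
                    ;; picks a restart instead of just unwinding.
                    (invoke-restart 'skip-record))))
    (remove nil (mapcar #'decode-record records))))

;; (decode-all '((0 1 2) (7 9) (0 3))) => ((1 2) (3))
```

Note the division of labor: the decoder decides which recovery strategies exist (the restarts), while the caller decides which one to use (the handler); the debugger is just the handler of last resort.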
The biggest problem I'm having is that most of the books and resources I have don't provide a very thorough treatment of the condition system. Some mention it but give very cursory treatment. If you know of a resource which really handles it well, feel free to pass it along.
Kent Pitman's Condition Handling in the Lisp Language Family provides great background on the condition system: why it is the way it is, and the earlier systems that influenced it. Unfortunately, it isn't a good tutorial; it describes the why of the condition system, but not so much the what.
Saturday, May 01, 2004
David Steuber spawned this thread the other day on comp.lang.lisp. The thread started off asking about multi-threading in SBCL but degenerated into a discussion of special (aka "dynamic") vs. lexical variables. I, myself, had spawned a huge thread earlier this year that covered much of the same ground. In the process, I managed to step on the toes of just about every comp.lang.lisp guru, including Erik Naggum and Erann Gat. Yup, sometimes I don't know when to shut up... ;-)
Well, special variables are a difficult concept; they are very different from anything I have encountered in my decades of programming. Coming from a language like C, you're likely to think of special variables as globals, which they are on one level, but with some special properties. Further, the distinction between binding and assignment is different from what you might first expect, and that distinction affects how special variables behave.
The easiest way that I have found to think about special variables is to imagine each global variable as a stack of slots storing values. The top of the stack holds the value that is returned when the variable is evaluated. You create a special variable binding using DEFVAR, or LET/LAMBDA with a (DECLARE (SPECIAL ...)) form. The first such form creates the conceptual stack of value slots. When you assign to a special variable using SETQ or SETF, you are just changing the value of the top slot on the stack. The interesting part is when you create a new binding for the variable using LET (or LAMBDA, since LET is just syntactic sugar for an immediately applied LAMBDA form). The LET form has the effect of pushing a new value slot onto the stack and setting its value (conceptually, that is; the implementation is probably radically different). From that point on in the dynamic execution of the program "beneath" the LET form, evaluating the variable retrieves the new value at the top of the value stack. Setting the variable with SETQ/SETF (simple assignment) changes the value of the top slot and does not push or pop the stack. As soon as the LET form terminates, however, it pops the stack and the variable returns to the value it had prior to the execution of the LET form.
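The stack model is easy to demonstrate. This is a minimal sketch; *DEPTH*, CURRENT-DEPTH, and DEMO are names invented for illustration:

```lisp
(defvar *depth* 0)       ; creates the special variable; top slot holds 0

(defun current-depth ()
  ;; Reads whatever slot is currently on top of *DEPTH*'s stack --
  ;; the dynamic binding, not a lexical one.
  *depth*)

(defun demo ()
  (let ((results '()))
    (push (current-depth) results)     ; 0: the global slot
    (let ((*depth* 1))                 ; LET pushes a new slot holding 1
      (push (current-depth) results)   ; 1: the new top slot
      (setq *depth* 2)                 ; SETQ changes the TOP slot only
      (push (current-depth) results))  ; 2
    ;; The LET has exited, popping its slot; the global slot was
    ;; never touched by the SETQ above.
    (push (current-depth) results)     ; 0 again
    (nreverse results)))

;; (demo) => (0 1 2 0)
```

The last value is the whole point: the SETQ inside the LET modified the pushed slot, so when the LET's slot was popped, the original global value reappeared untouched.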
Another way to think about this is that each activation record created by the thread of execution carries a list of the bindings created in that context. When the system needs to retrieve or set a special variable, it searches backward through the activation records, stops at the most recent dynamic binding, and performs the appropriate action (retrieving or setting the value).
Now, the thing that is really strange is when you have a Lisp with multiple threads of execution. This is what David Steuber and I both had trouble with initially. There is no standard for multi-threaded Lisp; the ANSI spec doesn't consider threading at all. With a single thread of execution, special variables are relatively easy to understand once you have a good conceptual model for how they work. When we extend the model to multiple threads, most Lisps seem to implement the behavior as each thread conceptually having its own "value stack" for each special variable. That is, when a thread is created, it's as if all the dynamic variables associated with the original thread are duplicated (this is obviously optimized in an implementation), with their initial values set to the current values in the original thread. (UPDATE: Dan Barlow wrote and told me that this is SBCL's behavior, but many other Lisps set the initial values to the original value of the special variable, ignoring all the subsequent dynamic binding.) Rebindings of a special variable in a given thread only affect the values seen by that particular thread; other threads have their own sets of special variables, separate from all the others. This effectively gives each thread a set of thread-local variables, much like you could create manually under various other threading systems (Windows, Java, etc.).
In the old days, they say (I wasn't there), many Lisps had only special variables (are they really that special when they're all you have?). The problem is that bugs associated with the dynamic/special behavior are quite difficult to debug. You can have one part of the program affect another part and it isn't clear what the connection between the two is unless you're looking at the call graph.
Scheme and Common Lisp seemed to push the Lisp community toward lexical variables, which are much easier to handle. Indeed, it seems the convention of adding "*" characters to special variable names was adopted specifically to highlight their special properties and reduce the chance that a programmer would accidentally write a clashing "(LET ((VARNAME ...)))" form. Any time you write "(LET ((*VARNAME* ...)))", you know you are dealing with a special variable and are creating another dynamic binding.
Yes, I know that it's confusing. Understand the semantics, however, as they are important. For starters, you can just think of special variables as roughly similar to global variables in a language like C, and that's true as long as you simply set them and never rebind them using a LET/LAMBDA. You'll soon want to understand the differences, though, as the special nature of these variables can be very helpful in some situations.
A great resource that might help you understand more is Erann Gat's Idiot's Guide to Special Variables.