Monday, June 28, 2004
One of the nice things about posting code on your blog is that there is a wealth of people on the 'net who will keep you honest. After posting my automaton code yesterday, Tim Moore wrote me with a style tip. Basically, Tim suggested I clean up the CONS/APPEND nightmare that existed in the innermost LOOP form. I had been thinking the same thing the night before when I did the post, so the suggestion was spot on. In particular, Tim said that his personal rule is that when he sees lots of CONS/APPEND/LIST action, he takes it as a sign to replace the mess with a backquoted form.
Without further delay...
(defmacro define-automaton (name states &key (stop 'stop) (debug nil))
  (let ((event-func (gensym "func")))
    `(defun ,name (,event-func)
       (tagbody
          ,@(loop for (state-name . transitions) in states
                  appending
                    (list state-name
                          `(case (funcall ,event-func)
                             ,@(loop for (match next . actions) in transitions
                                     collecting
                                       `(,match
                                          ,@actions
                                          ,@(when debug
                                              `((format t "Matched ~A. Transitioning to state ~A.~%"
                                                        ',match ',next)))
                                          (go ,next))))
                          `(go ,state-name)))
          ,stop))))
Ahhhh... much better.
Sunday, June 27, 2004
After I wrote the blog entry on the simple automaton macro I have been playing with, I got a couple of good comments back from astute readers. Bill Clagett wrote to show me his version of the macro which he had also been playing with. Bill had used LOOP instead of the mapping functions I used in my first version. Further, Bill gently pointed out that my first version used a
(reduce #'append (mapcar f lst)) form, which is a very verbose way of saying
(mapcan f lst). Doh! He was right.
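For anyone who hasn't bumped into MAPCAN before, here is a small sketch of the equivalence. The function names are mine, purely for illustration, and note the one caveat: MAPCAN splices its results together destructively with NCONC, so the mapped function should return fresh lists.

```lisp
;; Two ways to map a function over a list and flatten one level of
;; the results. MAPCAN splices destructively with NCONC, so the
;; mapped function should return fresh lists.
(defun duplicate-each-verbose (lst)
  (reduce #'append (mapcar (lambda (x) (list x x)) lst)))

(defun duplicate-each-concise (lst)
  (mapcan (lambda (x) (list x x)) lst))

;; (duplicate-each-verbose '(a b c)) => (A A B B C C)
;; (duplicate-each-concise '(a b c)) => (A A B B C C)
```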
Anyway, after correcting my first version to use MAPCAN, I also decided that enough was enough. I had been avoiding the LOOP macro for a while and just decided that I had to dive in, do the reading, and then start using it. On another one of my numerous plane rides, I pulled out my laptop and started going through the HyperSpec section on LOOP. I was pretty blown away by the power of the LOOP macro. There are so many nifty features that can really make your code short and sweet.
In particular, I was amazed at how easy it is to destructure a sublist as you're iterating over the top-level list. I had done a bunch of destructuring manually with CAR, CDR, CADR, and CDDR in the first version of the automaton macro. LOOP promised to make this much easier.
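As a quick illustration (a toy example, not code from the macro itself), LOOP can pull apart each (name . rest) sublist right in the FOR clause:

```lisp
;; LOOP destructures each sublist directly in the FOR clause,
;; replacing manual CAR/CDR surgery.
(defun state-summary (states)
  (loop for (state-name . transitions) in states
        collecting (list state-name (length transitions))))

;; Given a list shaped like the automaton syntax:
;; (state-summary '((start (a s1) (b s2))
;;                  (s1 (c stop))))
;; => ((START 2) (S1 1))
```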
So, I finally rewrote a version of the macro, shown below. Thanks to Bill, both for his education about MAPCAN as well as giving me the encouragement to dive into LOOP. As a result, the macro is now six lines shorter than the first attempt. Notice how nicely the LOOP form destructures the nested list structure of the macro syntax and then collects the results back up for inclusion in the final macro expansion.
If LOOP has you scared (like it had me), two good references are the HyperSpec itself and an appendix of Peter Seibel's upcoming book, Practical Common Lisp, titled "LOOP for Black Belts." I don't know that I have quite reached black-belt status yet, but I'm feeling a lot more comfortable with LOOP now.
(defmacro define-automaton (name states &key (stop 'stop) (debug nil))
  (let ((event-func (gensym "func")))
    `(defun ,name (,event-func)
       (tagbody
          ,@(loop for (state-name . transitions) in states
                  appending
                    (list state-name
                          `(case (funcall ,event-func)
                             ,@(loop for (match next . actions) in transitions
                                     collecting
                                       (cons match
                                             (append actions
                                                     (when debug
                                                       `((format t "Matched ~A. Transitioning to state ~A.~%"
                                                                 ,match ',next)))
                                                     `((go ,next))))))
                          `(go ,state-name)))
          ,stop))))
Update, 28-Jun: The macro as first posted had a slight error in it. The first LOOP form had been using a COLLECTING clause, which caused an extra set of parentheses in the expansion. The LOOP should have been using an APPENDING clause.
Tuesday, June 22, 2004
The last few months, I have been (slowly) reading Lisp in Small Pieces. The book is a survey of different implementation techniques for Scheme, written in Lisp (sometimes a hybrid of Scheme/CL). The book describes a number of interpreters and compilers for the language. I'm still not done (yet!), but I'm making slow but steady progress.
When I was in college, compilers was one area of computer science that I did not take any classes in, instead concentrating on computer hardware engineering, graphics, and networking. I've always had this blind spot regarding compilers that I'd like to fill. When I was 17, I implemented a compiler for Pascal using 6502 assembly language on an Apple II. It compiled down to a bytecode/p-code instruction set which was then interpreted. The reality was, though, that I had no formal understanding of what I was doing and was learning as I went.
My long term goal here is to be able to learn enough to make useful contributions to a project like SBCL. I'd really like to understand how a sophisticated compiler works and be able to engineer improvements. Right now, I'm very far away from that.
Anyway, last night I decided to check out some other papers on Scheme and Lisp implementation and compilation. The Read Scheme site has a wealth of archived papers on various language implementations (Scheme, obviously, but also Lisp in general, Haskell, etc.). The implementation page has a whole bunch of links to the seminal papers on Scheme compilers, including Guy Steele's master's thesis on RABBIT and David Kranz's Ph.D. thesis on ORBIT.
Last night, I spent some time reading Steele's thesis. One of the things that struck me is that Lisp, more than any other language, lends itself to source-level transformations, thanks to its lambda-calculus underpinnings and the convenient notation provided by s-expressions. Macros are the most obvious user-level expression of this capability, but Lisp compilers also make heavy use of it to reshape code symbolically before they actually start generating object code. At one point (page 53), Steele describes transformations related to common subexpression elimination. Many of these transformations involve creating lambda expressions for each of the common expressions and then passing the results as parameters into a new lambda expression that holds the resulting expression after the common subexpressions have been removed.
For instance, you can take this,
(lambda (a b c)
  (list (/ (+ (- b) (sqrt (- (^ b 2) (* 4 a c)))) (* 2 a))
        (/ (- (- b) (sqrt (- (^ b 2) (* 4 a c)))) (* 2 a))))
and transform it into this,
(lambda (a b c)
  ((lambda (q1 q2 q3)
     (list (/ (+ q1 q2) q3)
           (/ (- q1 q2) q3)))
   (- b)
   (sqrt (- (^ b 2) (* 4 a c)))
   (* 2 a)))
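To convince yourself the two forms compute the same thing, here is a runnable sketch in Common Lisp, with EXPT standing in for the ^ notation in the thesis; the function names are mine, not Steele's.

```lisp
;; Steele's example before and after common subexpression elimination.
;; EXPT stands in for the ^ notation used in the thesis.
(defun roots-original (a b c)
  (list (/ (+ (- b) (sqrt (- (expt b 2) (* 4 a c)))) (* 2 a))
        (/ (- (- b) (sqrt (- (expt b 2) (* 4 a c)))) (* 2 a))))

(defun roots-transformed (a b c)
  ;; The common subexpressions become arguments to a new lambda.
  ((lambda (q1 q2 q3)
     (list (/ (+ q1 q2) q3)
           (/ (- q1 q2) q3)))
   (- b)
   (sqrt (- (expt b 2) (* 4 a c)))
   (* 2 a)))

;; Both return the roots of the quadratic; e.g. x^2 - 3x + 2 has
;; roots 2 and 1, and both functions agree on (1 -3 2).
```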
After these transformations are done, the compiler can then go on to simply produce object code (simply?), using all the various other techniques for doing so. While other compilers do common subexpression elimination (and RABBIT didn't), they do it using other mechanisms, typically by analyzing expression parse trees directly. This transformation is a bit more general and can operate on anything that doesn't have side effects, whether a traditional numeric expression or not. Steele goes on to say,
We point out these possibilities despite the fact that they have not been implemented in RABBIT because the problem of isolating common subexpressions seems not to have been expressed in quite this way in the literature on compilation strategies. We might speculate that this is because most compilers which use complex optimization strategies have been for ALGOL-like languages which do not treat functions as full-fledged data objects, or even permit the writing of "anonymous" functions in function calls as LISP does.
This just made me realize that maybe the big difference in Lisp is how easily code (source, intermediate, whatever) can be generated and reshaped by the system at any point. That's an amazingly powerful capability. Previously, I had some idea of this from reading a bit of Paul Graham's writings about the power of macros, but Steele's thesis really drove it home.
Monday, June 14, 2004
In contrast to ANSI Common LISP, which I just reviewed, Peter Seibel's upcoming book, Practical Common Lisp will cover CLOS in some detail. Peter has been writing his book since late last year and has been putting chapters up on the web for public comment as he completes them. Peter's chapters were one of the first web resources I found when I started learning CL and they have earned a spot in my Links for Newbies over there on the side of the blog page (for those not reading on an aggregator). In any case, check out Peter's book and be sure to tell him what you think of the chapters now so the book is solid when it arrives. Then, when it arrives, be sure to buy a copy. (For what it's worth, I have written three books previously and it's very hard work. You need all the moral support you can find. The hours are long and the pay is horrible. Even a pat on the back can make a big difference.)
I just took a look at my blog on a Windows machine (I have totally switched to Linux here at home). Yuck! That preformatted macro from a couple days ago blows the whole CSS layout with the menu on the right side. Argh! If only Microsoft would catch a clue and fix the horrible CSS bugs in IE.
As a side note, more than 68% of you don't care about this. I just checked my web stats and slightly over 50% of you seem to use Mozilla, even though Windows is the more common OS (almost 57%). The second most popular browser is Safari with 18% share. IE comes up third with 16%. Needless to say, that surprises me. Heh, the Mac beats Windows by a nose on browsers, even though 50% of the OS pie is Windows.
Sunday, June 13, 2004
Additionally, I just finished a review of Paul Graham's ANSI Common LISP. The summary is that I give this book high marks. While it doesn't give an exhaustive treatment of every major feature of Common Lisp, it provides a great introduction and is well worth your time and money. Graham is an engaging writer and does a fine job with this book.
Saturday, June 12, 2004
After getting my dosage of The Swine Before Perl, I figured I would code up a macro to generate a basic automaton. This really shows the power of Lisp on a number of levels. First, the macro:
(defmacro define-automaton (name states &key (stop 'stop) (debug nil))
  (let ((event-func (gensym "func")))
    `(defun ,name (,event-func)
       (tagbody
          ,@(reduce #'append
                    (mapcar
                     (lambda (state)
                       (let ((state-name (car state))
                             (transitions (cdr state)))
                         (list state-name
                               `(case (funcall ,event-func)
                                  ,@(mapcar
                                     (lambda (trans)
                                       (let ((match (car trans))
                                             (next (cadr trans))
                                             (actions (cddr trans)))
                                         (cons match
                                               (append actions
                                                       (when debug
                                                         `((format t "Matched ~A. Transitioning to state ~A.~%"
                                                                   ,match (quote ,next))))
                                                       `((go ,next))))))
                                     transitions))
                               `(go ,state-name))))
                     states))
          ,stop))))
This allows you to define a state machine like this:
(define-automaton look-for-lisp
    ((start ('l found-l)
            ('x stop))
     (found-l ('i found-i)
              ('l found-l)
              (otherwise start))
     (found-i ('s found-s)
              ('l found-l)
              (otherwise start))
     (found-s ('p start
                  (format t "Found LISP~%")
                  (return-from look-for-lisp t))
              ('l found-l)
              (otherwise start))))
Each state has a name and a set of transitions. Each transition includes an input to match, a next state to transition to when that input appears, and a set of forms that should be executed when that transition occurs. In this case, we include forms only when we match the final "P" symbol in the input. STOP is a default stop state which exits the machine with a NIL return value. You can override the name of the default stop state if it clashes with another state name (handy if you're programming traffic signal state machines, for instance). Finally, you can return from the machine at any time with a RETURN-FROM form.
The macro expands the previous state machine definition to the following code that uses a TAGBODY to define states. Basically, the current state is held in the program counter. This makes a very fast state machine for doing something like scanning an input stream. Each state first calls the function passed to the state machine to retrieve the next input and then uses CASE to dispatch to the next event.
(DEFUN LOOK-FOR-LISP (#:|func3231|)
  (TAGBODY
   START
     (CASE (FUNCALL #:|func3231|)
       ((QUOTE L) (GO FOUND-L))
       ((QUOTE X) (GO STOP)))
     (GO START)
   FOUND-L
     (CASE (FUNCALL #:|func3231|)
       ((QUOTE I) (GO FOUND-I))
       ((QUOTE L) (GO FOUND-L))
       (OTHERWISE (GO START)))
     (GO FOUND-L)
   FOUND-I
     (CASE (FUNCALL #:|func3231|)
       ((QUOTE S) (GO FOUND-S))
       ((QUOTE L) (GO FOUND-L))
       (OTHERWISE (GO START)))
     (GO FOUND-I)
   FOUND-S
     (CASE (FUNCALL #:|func3231|)
       ((QUOTE P)
        (FORMAT T "Found LISP~%")
        (RETURN-FROM LOOK-FOR-LISP T)
        (GO START))
       ((QUOTE L) (GO FOUND-L))
       (OTHERWISE (GO START)))
     (GO FOUND-S)
   STOP))
We can then use the state machine like this:
CL-USER> (look-for-lisp #'read)
l i s p
Found LISP
T
CL-USER> (look-for-lisp #'read)
x
NIL
CL-USER> (look-for-lisp #'read)
l i p l i s p
Found LISP
T
CL-USER> (look-for-lisp #'read)
l i l i s p
Found LISP
T
CL-USER> (look-for-lisp #'read)
l i s x x
NIL
One of the neat things about Lisp is how easily you can do incremental debugging by passing in functions like READ and then just using all the Lisp machinery to do most of your work for you.
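Any zero-argument function will do as the event source, not just READ. For instance, here is a hypothetical helper (not part of the original machine) that feeds the automaton from a list, which is handy for non-interactive testing:

```lisp
;; A hypothetical event source: yields successive elements of INPUTS,
;; then the symbol X forever (the input that drives the sample
;; machine from START to STOP).
(defun make-list-feeder (inputs)
  (lambda ()
    (if inputs
        (pop inputs)
        'x)))

;; With the look-for-lisp machine defined above:
;;   (look-for-lisp (make-list-feeder '(l i s p))) => T
;;   (look-for-lisp (make-list-feeder '(l i)))     => NIL
```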
Now, this macro also has a debugging mode. Simply add the :DEBUG keyword to the definition and you're in business:
(define-automaton look-for-lisp
    ((start ('l found-l)
            ('x stop))
     (found-l ('i found-i)
              ('l found-l)
              (otherwise start))
     (found-i ('s found-s)
              ('l found-l)
              (otherwise start))
     (found-s ('p start
                  (format t "Found LISP~%")
                  (return-from look-for-lisp t))
              ('l found-l)
              (otherwise start)))
  :debug t)

CL-USER> (look-for-lisp #'read)
l
Matched L. Transitioning to state FOUND-L.
i
Matched I. Transitioning to state FOUND-I.
s
Matched S. Transitioning to state FOUND-S.
p
Found LISP
T
Tuesday, June 08, 2004
Cullen O'Neill wrote to me to remind me of Shriram Krishnamurthi's presentation titled The Swine Before Perl (PDF, PPT, audio). Krishnamurthi is a Schemer from Brown University who works on PLT Scheme. In his talk, Krishnamurthi uses the short example of a pattern matching automaton to show how Scheme (and by implication Lisp) macros can be used to develop custom-syntax for things like state machines. References to this talk have been posted a few times on comp.lang.lisp and it is very good. If you haven't done so already, I highly encourage you to download the presentation and go through it with the audio. The audio track is excellent and Krishnamurthi is a very entertaining speaker. I'm sure he's an excellent lecturing professor and his students must really enjoy every minute with him (well, except exam time, etc...).
Monday, June 07, 2004
Steven H. Rogers wrote to me the other day and let me know that the J programming language has state machines as part of its native syntax. Steven said, "J tends to look like line noise, but it's quite powerful." Judging from the J website, I'd have to agree on both counts. It seems like a very high-level programming language, quite suitable for a number of disciplines. Some of the success stories on the web site talk about it being a good replacement for APL (which people also say looks like line noise). In any case, many of the success stories sound like Lisp success stories: high level programming language saves time, money, the world, etc...
No wonder I had never heard of it...
For grins, I checked out Finding Lisp's ranking on Google. Finding Lisp made page 12. I think that's good judging by the various high quality resources that were on pages 1 - 11. Interestingly, the first reference to the HyperSpec occurred on page 5. That was odd. Of course, Google can't delve into people's browser bookmarks, just the cross-links (every Lisp user has the HyperSpec bookmarked, right?). Pascal Costanza's Highly Opinionated Guide to Lisp also made page 5. (All this with the default 10 links per page, of course. I suppose I could have made page 2 with 100 links per page. ;-) )
Friday, June 04, 2004
One of the most fundamental concepts in all of computer engineering is the state machine. Whether implemented in hardware or software, state machines are core to the whole field of digital information processing. State machines are the foundation for every CPU design in existence, network protocols, regular expressions, most GUI architectures, and just about everything else you can imagine. Yet, I often find that software engineers haven't been trained to think strongly in terms of state machines, but rather in terms of sequential code. Sure, when pressed everybody knows what a state machine is, but they implement them infrequently, preferring sequential code instead. Why is that? Maybe it's my combination of both a hardware and software background, but I find state machines a quite natural expression of many problem solutions.
A secondary question is, why do all popular programming languages have such poor support for representing such a foundational abstraction? The Gang of Four devote eight pages to documenting the State pattern in Design Patterns. While this is helpful, it seems to me that it would be easier to simply add some syntax to the language to help out with something this basic. Why force everybody to reimplement the same idea of states and a driver machine every time we solve a problem?
In pondering these questions, I decided to do something about that. The cool thing is that Lisp, of course, makes this easy with macros. I'm now starting to implement a state machine mini-language as a macro library. I'll publish some code snippets as I get them better fleshed out.