Friday, March 16, 2007
If you haven't spent any time with Factor, the best way to describe it is "high-level Forth." Forth is a stack-based language that is great for embedded work because you can do a whole lot in a very small footprint. Forth, like Lisp, is interactive. Unfortunately (or perhaps fortunately, if you're a big Forth fan), Forth is fairly low-level in terms of its operation. Forth likes to think in terms of machine words. A lot of things like string handling are done with pointers. Data stored on the stack is untyped, and if you put parameters in the wrong order, it's easy to blow things up. In general, if you're an embedded programmer, Forth rocks. If you're an application programmer, I think Forth is the wrong tool. (Please, Forth people, don't write to me and tell me that Charles Moore, the creator of Forth, wrote his own CAD system all in Forth to design Forth machine chips. I know all that. Moore is a genius and there are few people in the world who could do what he has done.)
But what would happen if you took some of Forth's ideas (interactivity, a high-level compiler that travels with the application, implicit stack-oriented parameters in function calls) and married them with some high-level data types? What if those high-level data types carried their type with them, as in Lisp, so the system could detect when you're trying to add a number and a string, or making some other type error? Well, you'd end up with Factor.
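To make the "stack plus typed values" idea concrete, here is a minimal sketch in Python rather than Factor. The word names ("+", "dup", "swap") and the `StackTypeError` class are made up for illustration and are not Factor's actual vocabulary. The point is only that when values on the stack carry their types, a bad addition can be caught instead of silently blowing things up.

```python
# Hypothetical sketch: a tiny stack evaluator whose values carry their types.
# The words "+", "dup" and "swap" are illustrative, not Factor's real words.

class StackTypeError(Exception):
    """Raised when a word receives operands of the wrong type."""


def run(program):
    """Evaluate a list of tokens against a data stack and return the stack."""
    stack = []
    for token in program:
        if token == "+":
            b = stack.pop()
            a = stack.pop()
            # Each value knows its own type, so the mismatch is detectable.
            if not (isinstance(a, (int, float)) and isinstance(b, (int, float))):
                raise StackTypeError(f"cannot add {a!r} and {b!r}")
            stack.append(a + b)
        elif token == "dup":
            stack.append(stack[-1])
        elif token == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            stack.append(token)  # any other token is treated as a literal
    return stack
```

With this sketch, `run([2, 3, "+"])` leaves a single number on the stack, while `run([2, "three", "+"])` raises `StackTypeError` rather than adding a number to whatever bits happen to be there.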
Factor is the creation of Slava Pestov, another genius. Factor runs on just about any OS that runs on x86 and on several different processors under Linux. Factor has a great GUI development environment that takes some cues from Lisp Machines and CLIM (think lots of hyper-linked documentation and help, along with presentations). Factor has lots of example code, including such things as a web server on which the Factor web site runs. Check out the Factor web site for more info. There is a lot of goodness here.
As good as Factor is, however, I'm not sure it's my cup of tea for general application programming. I generally like Reverse Polish Notation (RPN) and have used HP calculators all my life. That said, I just find it difficult to keep track of stack parameters through a long set of Forth or Factor function calls. It isn't that working this way is wrong in any sense of the word, but I find that I prefer named parameters where I can attach a description to a value, as in something like Lisp.
So, I decided to look at Erlang. Erlang was developed by Joe Armstrong at Ericsson to address problems in the telecom field. Luke Gorrie (another programming genius) has been telling me for years now that I should get some Erlang experience. Luke and I both work in the telecom/datacom field. Luke was at Bluetail with some of the key Erlang folks and later wound up at Nortel through a series of acquisitions. I had left Nortel a while earlier.
Erlang's big claim to fame is concurrency. An Erlang program is composed of multiple "processes," similar in function to OS-level processes, but running in one or more virtual machines. Processes communicate by message passing. When the processes are all located in the same VM, this happens very quickly, but the nice thing is that the communication semantics are identical if the processes are running on multiple VMs, possibly on multiple computers. This makes it very easy to write Erlang programs that run in a distributed fashion.
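Here is a rough Python imitation of that mailbox style, assuming threads and `queue.Queue` as stand-ins for Erlang processes and their mailboxes. The names `echo_server` and `ask` are invented for this sketch. Python threads share memory, so this only mimics the shape of the communication, not Erlang's isolation or its cross-machine transparency.

```python
import queue
import threading


def echo_server(mailbox):
    """Receive (reply_to, payload) messages until told to stop."""
    while True:
        message = mailbox.get()
        if message == "stop":
            break
        reply_to, payload = message
        # Reply by putting a message in the sender's own mailbox.
        reply_to.put(("echo", payload))


def ask(server_mailbox, payload):
    """Send one request and block until the reply arrives."""
    my_mailbox = queue.Queue()
    server_mailbox.put((my_mailbox, payload))
    return my_mailbox.get(timeout=5)


server_mailbox = queue.Queue()
server = threading.Thread(target=echo_server, args=(server_mailbox,))
server.start()

reply = ask(server_mailbox, "hello")

server_mailbox.put("stop")
server.join()
```

The two sides never call each other directly; everything goes through a mailbox, which is the property that lets Erlang keep the same semantics when the other process lives on another VM.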
Erlang is functional and keeps the memory of each process completely separate from the others ("shared nothing"). This improves the reliability of programs for several reasons. First, you don't have to worry about mutable data structures messing you up, violating an assumption without you knowing about it. Second, processes can't interact with each other except through message passing. If a process crashes for whatever reason, the other processes around it can generally continue to function until the crashed process is restarted.
And Erlang processes can crash a lot. This isn't because the programs are necessarily buggy, though that's one reason, but rather because Erlang actually encourages you to program only for the common case and to crash the moment your program detects any violation of its assumptions. The theory here is that rather than trying to continue a failed computation, it's often better for overall system reliability for a process to give up and let other processes outside the failed process clean up the resulting mess. This is an interesting philosophy, and it has a lot of merit.
Think about your typical PC. If you're a Linux user and you encounter a buggy program, how many of you will restart it in order to try to clean up the mess? If you have a buggy Windows system, how many of you reboot it? And generally, this works. Erlang simply takes what we all know intuitively to be true (rebooting often fixes problems) and applies it to processes within a larger program.
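The "program for the common case and let something else clean up" idea can be sketched in a few lines of Python. This is only an analogy: `parse_reading` and `supervise` are hypothetical names, and a plain function call stands in for what would be a separate, restartable process in Erlang.

```python
def parse_reading(text):
    """Handle only the common case: text is a decimal integer."""
    return int(text)  # raises ValueError on anything else; no defensive code


def supervise(worker, jobs, max_restarts=3):
    """Run worker on each job; on a crash, record it and carry on."""
    results, crashes = [], []
    for job in jobs:
        try:
            results.append(worker(job))
        except Exception as exc:
            # The cleanup lives outside the worker, not inside it.
            crashes.append((job, repr(exc)))
            if len(crashes) > max_restarts:
                raise RuntimeError("too many crashes, giving up") from exc
    return results, crashes


results, crashes = supervise(parse_reading, ["1", "x", "3"])
```

The worker stays tiny because it never tries to recover; the bad input `"x"` is recorded in `crashes` with its error, and the remaining jobs still run.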
Now, one of the neat things about Erlang crashes is that they produce a stack trace, much like you'd have in a Lisp system, such that a programmer isn't left with no data, scratching his head, wondering why the process crashed. You at least have some data to go on when you start debugging. Erlang aims to have systems that operate non-stop for years. To support this, Erlang supports hot code replacement.
So imagine you have the scenario where a customer reports a bug. Your support person asks the customer to send the log file, which contains the stack trace. A programmer examines the data and determines a fix. You recompile the program and send the customer the new version. The customer loads the new version while the old one is still running and the new version takes over seamlessly, with no downtime. Yes, Lisp has had many of these same ideas, and Erlang incorporates them.
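A toy illustration of the underlying trick, late binding, in Python. This is not how Erlang's code loading is implemented; the `handlers` table and `handle` function are assumptions made up for the sketch. It only shows that if running code looks up a function each time it calls it, a replacement takes effect without a restart.

```python
# Hypothetical registry mapping message kinds to their current handler.
handlers = {"greet": lambda name: "Hello " + name}


def handle(kind, payload):
    """Look up the handler at call time, so a replacement takes effect at once."""
    return handlers[kind](payload)


before = handle("greet", "Dave")

# The "fix" is loaded while the program keeps running.
handlers["greet"] = lambda name: "Hello, " + name + "!"

after = handle("greet", "Dave")
```

Callers of `handle` never notice the swap; they simply start getting the new behavior on their next call.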
Erlang is not all a bed of roses. I don't like the syntax. It's scary to say this, but I really do like and appreciate Lisp sexprs. Erlang has primitive macros, à la C, but nothing approaching the power of Lisp macros.
Pet peeve: Modern software reliability is horrible. I think Erlang at least gets its philosophy right. It's built to make highly reliable systems and it has the features to support that goal. From the get-go, it says, "Okay, we're going to be building systems that will operate non-stop for years. Of course we'll find bugs, but we need ways to be able to debug the system and then introduce changes to it without taking the system down. Further, bugs should only result in partial failures if at all possible. Where there are other tasks in the system unaffected by the bugs, they should remain available through the whole problem period." There are very few languages that could rise to that challenge. Lisp and Smalltalk come the closest, I think, but even with those there are issues of corrupted data structures hanging around and causing problems.
I know it's all in fashion right now, but I'm starting to contemplate my dream language. It looks a lot like CL or Scheme, but with some rather nice ideas borrowed from Erlang (and possibly Smalltalk and maybe even Ruby). In particular, it would have the ability to run large numbers of concurrent processes, with message passing semantics. Keep the same sexpr syntax as Lisp and leave in all the introspection and meta-programming facilities. While there have been attempts to merge Erlang and Lisp concepts before, notably with Erlisp (and another one on the Scheme side whose name escapes me right now), I think what's needed is a ground-up rethink. Erlisp attempts to capture some of Erlang's process and message passing ideas in standard Common Lisp. Unfortunately, most Common Lisps don't have great multiprocessing capabilities, and none have thought through the "shared nothing" semantics that Erlang uses to increase reliability of the overall system. Semantically, in an Erlang system, sending a message to another process always creates a copy (of course, the copy may be optimized away by the compiler if the semantics are preserved). SETF has all sorts of abilities to trip you up in standard Common Lisp if you aren't careful.
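The copy-on-send semantics are easy to demonstrate in a language with mutable data. Below is a small Python sketch, where `send` is a hypothetical helper and `copy.deepcopy` stands in for whatever copying a real runtime would do. After the send, the sender can mutate its data as much as it likes without the receiver ever seeing it.

```python
import copy
import queue


def send(mailbox, message):
    """Deliver a deep copy so sender and receiver never share mutable state."""
    mailbox.put(copy.deepcopy(message))


mailbox = queue.Queue()
order = {"item": "widget", "qty": 1}

send(mailbox, order)
order["qty"] = 99  # the sender mutates its own data after sending

received = mailbox.get()
```

Here `received` still holds the quantity that was sent, not 99. Without the copy, the receiver would see the sender's later mutation, which is exactly the kind of surprise destructive assignment can cause when data is shared.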
Maybe to spare myself the embarrassment of trying to implement this (I'm not a Slava Pestov, a Charles Moore, or a Luke Gorrie), I'll base it on Arc. As soon as Paul is done, I'll get cracking on this new thing...
(Oh, and I'm not a Paul Graham either.)
About Erlang and Macros:
Erlang _does_ have metaprogramming facilities, see this:
The Smerl library (used for metaprogramming in Erlang):
The problem with Factor is that that guy Slava is... how shall we put this... just a little bit prickly. Long term, I think that kind of thing has a bad influence on a language's community.
Let's face it, at some point you're gonna be working on your code at 2am, after a loooong day. At that point you're gonna need a language that is easy to read, but with all the big-boy features. For me, that language is Ada95. It's easy to read, encourages modularity, and has a full-featured tasking environment.
You might be interested in Termite, which is based on Gambit Scheme and provides a distributed framework similar to Erlang on top of a high-performance compiled Scheme system.
The Factor community seems to be growing reasonably well - I don't think Slava's all that 'prickly' personally.
It's getting a large number of contributed libraries including:
- Unicode support
- Ogg Vorbis Playback
- Erlang style concurrency
- Various crypto libraries
- continuation based web framework
Dealing with a stack-based language takes some getting used to, I agree. But so does dealing with Monads and IO in Haskell, OTP in Erlang, etc.
It's pretty early days for Factor though. It'll be interesting to see where it ends up in a couple of years.
What don't you like about Termite? I've never used it, but if you want something like Erlang and Lisp, the obvious thing would be to take an existing library and modify a current Scheme/Lisp implementation to provide better threading support, or alternatively to get Lisp/Scheme to sit atop Erlang.
You didn't mention cl-muproc in your blog, so I'll just point you towards it. http://www.mu.dk/cl-muproc. From the header: "CL-MUPROC — Erlang-inspired multiprocessing in Common Lisp".
"That said, I just find it difficult to keep track of stack parameters through a long set of Forth or Factor function calls."
Dave, I'm a Lisper myself, but I just had to grin at that comment. That sounded so much like the typical Lisp newbie complaint about keeping track of all the parentheses ;-)
May I recommend Leo Brodie's classic Thinking Forth, now freely available online in PDF form. Not only is that one of the best books about Forth, but also one of the better ones about software development in general. Despite having first been published over two decades ago, it has a certain timelessness about it not dissimilar to Fred Brooks's works.
Also, Joe Armstrong's new book Programming Erlang - Software for a Concurrent World is to be published by the Pragmatic Programmers this July. You can already buy a beta PDF copy of the book - I'm looking forward to reading my copy this weekend.
Arto, yes, you're right. That was a bit lame and does sound like keeping track of parentheses. But I have to say that with parentheses, I'd still be lost if not for Emacs. With stack parameters, there is no such help. Sure, you can have stack comments in Forth and Factor, but you still have to parse it all and keep it in your head when you're reading code.
I agree with you about Brodie's books and I have read them. As I said, when handed a standard calculator, I think in HP RPN, so on one level Forth and Factor make perfect sense to me. I just find that I can't program "in the large" that way.
And I have already bought Armstrong's Erlang book in beta PDF form. Way ahead of you there. ;-)