Saturday, February 12, 2005

Escaping CL string characters in Emacs 

I have continued to play around with Edi Weitz's Regex Coach since I wrote about it the other day. As I said earlier, Regex Coach makes it easy to develop a regular expression for CL-PPCRE interactively and then cut and paste the regex string into your code.

Now, the code I happened to be working on the other day was for log-file parsing for a web server. The format happens to have a lot of double-quote marks embedded in it. Further, I was using a bunch of special backslash operators to match digits, words, etc. In fact, here's the whole regular expression in CL source form:

(defparameter *log-regex*
  (concatenate 'string
        "(\\S+)\\s+" ; ip address
        "\\S+\\s+\\S+\\s+" ; two dashes for what??
        "\\[(\\d+)/(\\w+)/(\\d+)" ; date
        ":(\\d+):(\\d+):(\\d+)\\s+([+-]?\\d+)\\]\\s+" ; time
        "\"(\\S+)\\s+" ; method
        "(\\S*)\\s*" ; url
        "(\\S*)\"\\s+" ; protocol
        "(\\d+)\\s+" ; response
        "(\\S+)\\s+" ; length
        "(\\S+)\\s+" ; site
        "\"([^\"]*)\"\\s+" ; referrer
        "\"([^\"]*)\"\\s+" ; agent
        "\"[^\"]*\"")) ; ??

Every backslash and double quote in the regex needs to be escaped with a backslash inside a CL string. In this case, that leads to a lot of backslashes. (Once I had the regex the way I wanted it, I chopped it into separate lines so that I could add some documentation in comments.) Needless to say, copying strings between Regex Coach and my CL source was getting to be a pain, adding and removing backslashes each time, so I automated...

I came up with a couple of Emacs Lisp functions that can escape all the backslashes and embedded double-quote characters in a string and the reverse, instantly.

(defun escape-lisp-string-region (start end)
  "Escape special characters in the region as if a CL string.
Inserts backslashes in front of special characters (namely backslash
and double quote) in the region, according to the Common Lisp string
escape requirements.

Note that region should only contain the characters actually
comprising the string, without the surrounding quotes."
  (interactive "*r")
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      (goto-char start)
      (while (search-forward "\\" nil t)
        (replace-match "\\\\" nil t))
      (goto-char start)
      (while (search-forward "\"" nil t)
        (replace-match "\\\"" nil t)))))


(defun unescape-lisp-string-region (start end)
  "Unescape special characters from the CL string specified by the region.
This amounts to removing preceding backslashes from the characters
they escape.

Note that region should only contain the characters actually
comprising the string, without the surrounding quotes."
  (interactive "*r")
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      (goto-char start)
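      (while (search-forward "\\" nil t)
        (replace-match "" nil t)
        ;; Skip the escaped character so an escaped backslash is kept;
        ;; guard against a stray backslash at the end of the region.
        (unless (eobp)
          (forward-char))))))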

Simply bind these functions to a couple of open keys in Emacs and you're set.
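
As a sketch of what such bindings might look like in an init file: the keys `C-c e` and `C-c u` below are only an assumption for illustration, chosen because `C-c` followed by a letter is conventionally left free for user bindings. Any unused keys would do.

```elisp
;; Hypothetical key choices; substitute any keys that are free in your setup.
(global-set-key (kbd "C-c e") #'escape-lisp-string-region)
(global-set-key (kbd "C-c u") #'unescape-lisp-string-region)
```

With something like this in place, you mark the contents of a string (without the surrounding quotes) and press the corresponding key.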

I should note that these functions are generic. They are handy whenever you have any text to cut-and-paste into a CL string from another source. They are particularly handy with strings that will be processed by downstream engines that use backslash quoting conventions.
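
To make the transformation concrete, here is a minimal sketch of the same escaping rules applied to strings rather than to a region. The helper names `my-escape-lisp-string` and `my-unescape-lisp-string` are hypothetical and are not part of the functions above. They only illustrate the round trip on one fragment of the log regex.

```elisp
;; Hypothetical string-level helpers; the names are illustrative only.
(defun my-escape-lisp-string (string)
  "Return STRING with each backslash and double quote backslash-escaped."
  ;; The character class matches a backslash or a double quote.
  ;; The final two t arguments mean FIXEDCASE and LITERAL, so the
  ;; backslash returned by the lambda is inserted as-is.
  (replace-regexp-in-string "[\\\"]"
                            (lambda (match) (concat "\\" match))
                            string t t))

(defun my-unescape-lisp-string (string)
  "Return STRING with one level of backslash escaping removed."
  ;; Match a backslash plus the character it escapes and keep only
  ;; that character.
  (replace-regexp-in-string "\\\\\\(.\\|\n\\)" "\\1" string t))

;; The protocol fragment of the log regex in its unescaped form
;; (written here as an Emacs Lisp string, hence the escapes).
(let* ((raw "(\\S*)\"\\s+")
       (escaped (my-escape-lisp-string raw)))
  (message "raw:     %s" raw)
  (message "escaped: %s" escaped)
  (message "round trip ok: %s"
           (string= raw (my-unescape-lisp-string escaped))))
```

The escaped form is the text that sits between the quotes on the protocol line of `*log-regex*`, and unescaping it gives back the unescaped fragment.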


Comments:


You might also try CL-INTERPOL, which defines reader syntax similar to this and can do some other useful things as well.
 
