Emacs Lisp

In An Introduction to Programming in Emacs Lisp, Robert J. Chassell writes the following.

I prefer to learn from reference manuals. I “dive into” each paragraph, and “come up for air” between paragraphs.

When I get to the end of a paragraph, I assume that that subject is done, finished, that I know everything I need (with the possible exception of the case when the next paragraph starts talking about it in more detail). I expect that a well written reference manual will not have a lot of redundancy, and that it will have excellent pointers to the (one) place where the information I want is.

1 IO

princ is meant for human readers: it prints the object without quoting. prin1 prints the object in read syntax, with quotes and escapes. print is the most verbose: it is prin1 with a newline before and after. If you evaluate a print form interactively, its return value is the object that was printed, so the echo area shows the object twice.

message takes a format string (followed by its arguments) and is used exclusively for the echo area.
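To see the difference concretely, here is a small sketch that captures what each printing function writes by using with-output-to-string, so nothing depends on the echo area. The helper name my-notes-printed is made up for this note.

```elisp
;; Hypothetical helper: return what PRINTER writes for OBJECT as a string.
(defun my-notes-printed (printer object)
  (with-output-to-string
    (funcall printer object)))

(my-notes-printed #'princ "hi")   ; => "hi"
(my-notes-printed #'prin1 "hi")   ; => "\"hi\""
(my-notes-printed #'print "hi")   ; => "\n\"hi\"\n"
```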

2 Symbol

Since Elisp is a Lisp-2, a symbol can name both a variable and a function at the same time. Macros and functions share the same namespace.

Elisp uses nil in three ways: as a symbol, as logical false, and as the empty list.

Elisp also has #'. It is not special syntax of its own but the read syntax for function quoting: #'foo is short for (function foo).

By default, Elisp uses dynamic binding with dynamic extent for local variables. This means a variable refers to its most recent local binding, and the binding exists only as long as the binding form (e.g. the body of let) is executing. setq works on the most recent binding.

Thus, when you rely on a local dynamic binding, you have to make sure yourself (unfortunately) that the variable is bound. When you really want a global variable, declare it at the top level with defvar or defconst. defvar initializes the variable only if it is void, while defconst initializes it unconditionally. Other than that there is little difference: Emacs does not stop you from changing a "constant". Either way the variable is marked as "special", meaning that it always has dynamic binding. There is a third way to create a global binding, defcustom. It creates a customizable variable, also called a user option. It is special in that it is shown in the Customize interface, and the defcustom form specifies how it should be displayed and which values it can take.

On the other hand, lexical scoping establishes lexical bindings with indefinite extent. This means a variable reference must be lexically (textually) within the scope of its binding form. The binding stays available even after the binding form has finished executing, if a closure captured it. To enable lexical binding, set the buffer-local variable lexical-binding to non-nil, usually through the file's first-line cookie. Even then, special variables are still dynamically bound.
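The two binding disciplines can be compared in a short sketch. All names prefixed with my-notes- are hypothetical. The closure is built with eval and a second argument of t, which requests lexical binding for that form, so the example behaves the same regardless of how the surrounding file is loaded.

```elisp
;; A special (dynamically bound) variable; the name is hypothetical.
(defvar my-notes-level 1)

(defun my-notes-get-level ()
  my-notes-level)

(defun my-notes-dynamic-demo ()
  ;; The let rebinds the special variable while its body runs, so the
  ;; callee sees 2 even though it is defined elsewhere.
  (let ((my-notes-level 2))
    (my-notes-get-level)))

(defun my-notes-make-counter ()
  ;; Evaluate the form with lexical binding (second argument t), so the
  ;; lambda closes over N and N outlives the let.
  (eval '(let ((n 0))
           (lambda () (setq n (1+ n))))
        t))

(my-notes-dynamic-demo)                  ; => 2
(let ((counter (my-notes-make-counter)))
  (funcall counter)
  (funcall counter))                     ; => 2
```

After my-notes-dynamic-demo returns, my-notes-level is back to 1, because the dynamic binding ends with the let.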

Emacs supports another kind of binding, called buffer-local binding. As the name suggests, the binding is in effect while that buffer is the current buffer and goes out of effect when it is not. This is most useful in major modes. There are two ways to make a variable buffer-local. make-local-variable makes the variable local to the current buffer only, while make-variable-buffer-local makes it automatically buffer-local in every buffer where it is set.
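A minimal sketch of a buffer-local binding, using a hypothetical variable my-notes-width. It uses setq-local, which is not mentioned above; it makes the variable local to the current buffer and sets it in one step.

```elisp
;; Hypothetical global variable with a default value.
(defvar my-notes-width 80)

(defun my-notes-buffer-local-demo ()
  (with-temp-buffer
    ;; Only the temporary buffer sees 40; the default stays 80.
    (setq-local my-notes-width 40)
    (list my-notes-width (default-value 'my-notes-width))))

(my-notes-buffer-local-demo)   ; => (40 80)
```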

3 Regular Expression

You can use the basic . * + ?, as well as the non-greedy counterparts *?, +?, ??.

Brackets are special in Elisp regexps. Character classes can be used inside [], e.g. [[:ascii:]]. Possible classes include

  • ascii: 0-127
  • alnum: letter or digit
  • alpha: letter
  • blank: space and tab
  • digit: 0-9
  • lower: lower case
  • upper: upper case
  • punct: punctuation
  • space: white space
  • word: same as \w

Parentheses and braces are not special, so they match literally. To use parentheses for grouping, escape them: \( \) forms a capturing group, while unescaped ( ) are literal. Non-capturing groups are also supported via \(?: ... \). \1 is a back reference to the first group.

Some backslash sequences have special meanings, e.g. \w and \b. The uppercase variant is the negation.

  • \w: word character
  • \b: word boundary
  • \s-: whitespace
  • \sw: same as \w
  • \s.: punctuation

When constructing a regexp that must match literal strings, use regexp-quote and regexp-opt so that no character is interpreted specially. regexp-quote returns a regular expression whose only exact match is the given string. regexp-opt returns an efficient regular expression that matches any of the supplied strings.

The most used functions are re-search-forward and re-search-backward, which search in the buffer. You can also search in a string with string-match or string-match-p. string-match sets the match data; string-match-p does not.

After a search, you can retrieve the matched text with match-string, or with match-string-no-properties for a clean string without text properties. After string-match, pass the searched string as the second argument. You can also use match-beginning and match-end to get the positions of the match instead of its content.

Finally, replace-regexp-in-string replaces all matches in a string.
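The pieces above fit together roughly as follows. This is only a sketch: my-notes-parse-setting is a made-up helper, and the "key = number" line format is an assumption chosen for illustration.

```elisp
;; Hypothetical helper: parse "key = 123" into (KEY . NUMBER), or nil.
(defun my-notes-parse-setting (line)
  (when (string-match
         "\\`\\([[:alpha:]]+\\)[[:blank:]]*=[[:blank:]]*\\([[:digit:]]+\\)\\'"
         line)
    ;; After string-match, match-string needs the searched string.
    (cons (match-string 1 line)
          (string-to-number (match-string 2 line)))))

(my-notes-parse-setting "width = 80")                       ; => ("width" . 80)
(my-notes-parse-setting "width = wide")                     ; => nil
(replace-regexp-in-string "[[:blank:]]+" " " "a   b\t c")   ; => "a b c"
(regexp-quote "a.b")                                        ; => "a\\.b"
(string-match-p (regexp-opt '("foo" "bar")) "a bar")        ; => 2
```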

4 Lisp Common Sense

eq (same object), equal (same structure and contents), and = (numeric equality) are available.

Numeric function:

  • comparison: max, min, abs
  • rounding: truncate, floor, ceiling, round
  • arithmetic: %, mod
  • bit-wise: lsh, ash, logand, logior, logxor, lognot
  • math: expt, exp, sin, cos, log, sqrt
  • random: random
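A few evaluations illustrate how the equality predicates and the numeric functions listed above differ. The values are chosen so that the results do not depend on tie-breaking rules in round.

```elisp
;; Equality: = compares numeric values, equal also compares types.
(list (eq 'a 'a) (equal "ab" "ab") (= 1 1.0) (equal 1 1.0))
;; => (t t t nil)

;; Rounding toward zero, down, up, and to nearest.
(list (truncate -2.7) (floor -2.7) (ceiling 2.1) (round 2.7))
;; => (-2 -3 3 3)

;; % takes the sign of the dividend, mod the sign of the divisor.
(list (% -7 2) (mod -7 2))
;; => (-1 1)

(list (expt 2 10) (logand 12 10) (logior 12 10) (logxor 12 10))
;; => (1024 8 14 6)
```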

5 string

Create a string from scratch with make-string. More often we derive strings from existing ones, e.g. with substring, concat, or split-string. Strings are compared using string= and string< (string> exists too in newer Emacs versions). Conversions are done by number-to-string and string-to-number, and case operations by downcase, upcase, and capitalize.

Of course, the most powerful string construction function is formatting, with format and format-message. The format string follows C style, using %s for the printed representation without quoting (as princ), %S for the representation with quoting (as prin1), and %c for a character.
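Some sample evaluations of the string functions named above. The %d directive (decimal integer) is not mentioned in these notes but is part of the same C-style family.

```elisp
(format "%s has %d items: %S" "list" 2 '("a" "b"))
;; => "list has 2 items: (\"a\" \"b\")"
(format "%c" ?A)                           ; => "A"
(make-string 3 ?x)                         ; => "xxx"
(split-string "a,b,c" ",")                 ; => ("a" "b" "c")
(substring "hello" 1 3)                    ; => "el"
(concat "foo" "-" (number-to-string 42))   ; => "foo-42"
(string-to-number "42")                    ; => 42
(capitalize "hello world")                 ; => "Hello World"
(string< "apple" "banana")                 ; => t
```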

6 list

A list is a chain of cons cells whose last cdr is nil. If the last cdr is not nil, Elisp calls it a dotted list rather than an improper list.

  • append: the interesting part is that all arguments except the last one are copied. To force a copy of the last one as well, pass nil as the final argument to append.
  • reverse

list generation:

  • number-sequence: inclusive from a to b

Apart from car and cdr, Elisp has car-safe and cdr-safe, which return nil if the argument is not a cons cell. nth, nthcdr, and last are available.

Destructive means that existing cons cells are modified in place.

pop and push modify the variable that holds the list. pop removes the first element and returns it. push is the counterpart that conses a new element onto the list. add-to-list only adds the element if it is not there already. There are also the very bare-bones functions setcar and setcdr. Note that sort is also destructive.

Lists can, of course, be used as sets. member is the membership predicate, remove removes an item non-destructively, and delete removes it destructively. They compare with equal and have the eq counterparts memq, remq, and delq. Finally, delete-dups removes duplicates.
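The list operations above can be tried in one small function. This is a sketch with a hypothetical name; the list is built with list rather than a quoted constant because some of the operations modify it.

```elisp
;; Hypothetical demo collecting the results of several list operations.
(defun my-notes-list-demo ()
  (let ((xs (list 1 2 3))
        popped)
    (push 0 xs)                  ; xs is now (0 1 2 3)
    (setq popped (pop xs))       ; returns 0, xs is back to (1 2 3)
    (list popped
          xs
          (member 2 xs)          ; the tail starting at the match
          (remove 2 xs)          ; a new list, xs is untouched
          (number-sequence 1 4)
          (delete-dups (list 1 2 1 3 2)))))

(my-notes-list-demo)
;; => (0 (1 2 3) (2 3) (1 3) (1 2 3 4) (1 2 3))
```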

Association list is same as scheme, a list of pairs. assoc can be used to retrieve by car, while rassoc retrieve by cdr.

A property list is a flat list. The odd elements are property names and the even elements are their values. The property names must be unique. The order of the "pairs" does not matter. plist-get reads a property, and plist-put sets one, modifying the list. plist-member is useful because it can distinguish a missing property from a property whose value is nil.

A symbol can have a property list. It has a simpler syntax, get and put with the symbol as argument. symbol-plist can retrieve the plist from symbol, setplist gives a plist to a symbol.
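A short sketch of association lists, property lists, and symbol properties. The function name, the sample data, and the symbol my-notes-sym are all invented for illustration.

```elisp
;; Hypothetical demo of alist, plist, and symbol property lookups.
(defun my-notes-alist-plist-demo ()
  (let ((alist '((apple . red) (banana . yellow)))
        (plist (list :name "emacs" :gui nil)))
    (put 'my-notes-sym 'color 'blue)
    (list (cdr (assoc 'apple alist))                ; lookup by car
          (car (rassoc 'yellow alist))              ; lookup by cdr
          (plist-get plist :name)
          (plist-get plist :gui)                    ; nil: value is nil
          (plist-get plist :missing)                ; nil: no such property
          (and (plist-member plist :gui) t)         ; present
          (and (plist-member plist :missing) t)     ; absent
          (get 'my-notes-sym 'color))))

(my-notes-alist-plist-demo)
;; => (red banana "emacs" nil nil t nil blue)
```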

7 Sequence

A sequence is more general than a list; specifically, it also covers arrays. elt retrieves an element from a sequence by position. copy-sequence creates a new sequence, but the elements themselves are not copied.

An array is a fixed-length sequence, such as a vector or a string. make-vector or vector constructs a vector, and aref and aset access it.
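As a rough illustration, the following hypothetical function copies a vector, changes the copy, and reads elements from both a vector and a string with the generic accessors.

```elisp
;; Hypothetical demo of sequence and array access.
(defun my-notes-sequence-demo ()
  (let* ((v (vector 1 2 3))
         (copy (copy-sequence v)))
    (aset copy 0 'changed)       ; only the copy is modified
    (list (elt v 0)              ; 1: the original is untouched
          (aref copy 0)
          (elt "abc" 1)          ; a character, i.e. the integer 98
          (length v)
          (make-vector 2 0))))

(my-notes-sequence-demo)
;; => (1 changed 98 3 [0 0])
```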

8 Hash Table

make-hash-table constructs a table, which is accessed with gethash, puthash, remhash, and clrhash. Count the entries with hash-table-count instead of length, and iterate with maphash instead of mapcar.
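A word counter is a typical use of a hash table. The sketch below uses a hypothetical function name and passes :test #'equal so that strings with the same contents share one key.

```elisp
;; Hypothetical helper: count occurrences of each string in WORDS and
;; return an alist sorted by word.
(defun my-notes-word-count (words)
  (let ((table (make-hash-table :test #'equal))
        result)
    (dolist (w words)
      ;; The third argument of gethash is the default for missing keys.
      (puthash w (1+ (gethash w table 0)) table))
    (maphash (lambda (key value) (push (cons key value) result)) table)
    (sort result (lambda (a b) (string< (car a) (car b))))))

(my-notes-word-count '("b" "a" "b"))   ; => (("a" . 1) ("b" . 2))
```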

9 Function

Functions are defined as follows. To specify optional arguments, put &optional before them. Collect the remaining arguments by putting &rest before the final argument. A lambda expression evaluates to a function object.

(defun name (var ...) body ...)
(lambda (arg ...) body ...)
(required-var ...
   [&optional op-var ...]
   [&rest rest-var])

apply calls the function with the given arguments, where the last argument must be a list whose elements are spliced in as individual arguments. funcall simply calls the function with the remaining arguments as they are.

mapcar is the typical map and returns the list of results. mapc is used for side effects. mapconcat is a shorthand that concatenates the results into a string.
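The following sketch shows optional and rest arguments together with apply, funcall, and the mapping functions. my-notes-greet is a made-up function.

```elisp
;; Hypothetical function with one required, one optional, and rest arguments.
(defun my-notes-greet (name &optional greeting &rest extras)
  ;; GREETING is nil when omitted; EXTRAS is a (possibly empty) list.
  (format "%s, %s%s" (or greeting "Hello") name (apply #'concat extras)))

(my-notes-greet "Ann")                 ; => "Hello, Ann"
(my-notes-greet "Ann" "Hi" "!" "!")    ; => "Hi, Ann!!"

(apply #'+ 1 2 '(3 4))                 ; => 10
(funcall #'+ 1 2 3)                    ; => 6
(mapcar #'1+ '(1 2 3))                 ; => (2 3 4)
(mapconcat #'number-to-string '(1 2 3) ", ")   ; => "1, 2, 3"
```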

A function with an (interactive) form is a command, i.e. it can be executed with M-x. This applies to both defun and lambda. Although interactive is often used without an argument, it can actually do very interesting stuff. It defines what kind of arguments the user should provide to the command. Most often the argument is a string, possibly spanning several lines, in which each line consists of a code letter saying what kind of value to read, followed by a prompt string. The numeric prefix argument code "p" is just one of them; it hands the command the numeric value of a C-u prefix.
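A minimal command using the "p" code could look like this. The command name is hypothetical. Called via M-x it receives 1; with C-u 5 before it, it receives 5. From Lisp it is called like any other function.

```elisp
;; Hypothetical command: insert COUNT asterisks at point.
(defun my-notes-insert-stars (count)
  "Insert COUNT asterisks at point."
  (interactive "p")
  (insert (make-string count ?*)))

;; Called from Lisp, the argument is passed explicitly.
(with-temp-buffer
  (my-notes-insert-stars 3)
  (buffer-string))               ; => "***"
```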

10 Macro

(defmacro name (args) body...)

A macro is very simple: the arguments are passed to the macro body unevaluated, and the body builds an expression from them. That expression, the expansion, is then evaluated to produce the result.
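A tiny macro makes the two steps visible: macroexpand shows the expression the macro builds, and a normal call evaluates that expression. The macro name is invented for this sketch.

```elisp
;; Hypothetical macro: run BODY unless N is zero.
(defmacro my-notes-unless-zero (n &rest body)
  ;; The backquote builds the expansion; N and BODY are spliced in as
  ;; unevaluated forms.
  `(if (= ,n 0) nil ,@body))

(macroexpand '(my-notes-unless-zero x (message "hi")))
;; => (if (= x 0) nil (message "hi"))

(my-notes-unless-zero 2 'ran)    ; => ran
(my-notes-unless-zero 0 'ran)    ; => nil
```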

11 Control Structure

Sequential structure has progn, prog1, prog2.

if, when, unless, not, and, or are common.

cond takes the following form

(cond (condition body ...) ...)

pcase takes

(pcase exp (pat code ...) ...)

Loops take the following forms. while always returns nil. dolist returns the value of result, which defaults to nil. dotimes binds var to each integer in [0, count).

(while condition forms ...)
(dolist (var list [result]) body ...)
(dotimes (var count [result]) body ...)
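The control structures above might be used as in the following sketch. All three function names are hypothetical, and the pcase patterns shown (pred and backquote patterns) are only a small part of what pcase supports.

```elisp
;; Hypothetical example of cond: the first true condition wins.
(defun my-notes-classify (x)
  (cond ((not (numberp x)) 'not-a-number)
        ((< x 0) 'negative)
        ((= x 0) 'zero)
        (t 'positive)))

;; Hypothetical example of pcase: match on the shape of the value.
(defun my-notes-describe (x)
  (pcase x
    ((pred null) "empty")
    ((pred stringp) "string")
    (`(,first . ,_) (format "list starting with %S" first))
    (_ "other")))

;; Hypothetical example of dotimes: I runs over 0 .. N-1.
(defun my-notes-sum-below (n)
  (let ((sum 0))
    (dotimes (i n)
      (setq sum (+ sum i)))
    sum))

(my-notes-classify -3)         ; => negative
(my-notes-describe '(1 2))     ; => "list starting with 1"
(my-notes-sum-below 5)         ; => 10
```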

12 Packages

12.1 Dash.el

https://github.com/magnars/dash.el

This is a list manipulation library.

  • -map takes a function and maps it over the list. The anaphoric form, with a double dash, takes an expression instead and evaluates it with it bound to the current list item.

     ;; normal version
     (-map (lambda (n) (* n n)) '(1 2 3 4))
     ;; also works for defun, of course
     (defun square (n) (* n n))
     (-map 'square '(1 2 3 4))
     ;; anaphoric version
     (--map (* it it) '(1 2 3 4))
    
  • -update-at: (-update-at N FUNC LIST): Return a list with the element at the Nth position in LIST replaced with (FUNC (nth N LIST)).
  • -flatten: (-flatten L): Take a nested list L and return its contents as a single, flat list.

12.2 s.el

https://github.com/magnars/s.el

The string manipulation library

12.3 cl-lib.el loop

This package ports many Common Lisp facilities to Elisp, most importantly the loop facility. So this section, at least for now, focuses on cl-loop.

12.3.1 general loop form

(cl-loop clauses...)

The clauses can be:

  • for clauses
  • TODO

12.3.2 for clauses

for VAR from FROM to TO by STEP
  • FROM defaults to 0. STEP must be positive and defaults to 1.
  • inclusive [from,to]
  • from can be upfrom or downfrom. I think it is weird to use this.
  • to can be upto or downto. This makes more sense.
  • above and below can be used instead of to, but they are exclusive, e.g. for var below 10
for VAR in LIST by FUNCTION
FUNCTION is used to traverse the list, defaults to cdr
for VAR on LIST by FUNCTION
VAR is bound to the cons cells of the list instead of the elements.
for VAR across ARRAY
iterates over all elements of the array
for VAR = EXPR1 then EXPR2
this is the most general form. VAR is bound to EXPR1 initially, and is set by evaluating EXPR2 in successive iterations. EXPR2 can refer to the old VAR
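Each of the for clauses above, tried with a collect clause so that the visited values become the result:

```elisp
(require 'cl-lib)

(cl-loop for i from 1 to 3 collect i)               ; => (1 2 3)
(cl-loop for i below 3 collect i)                   ; => (0 1 2)
(cl-loop for i from 10 downto 4 by 3 collect i)     ; => (10 7 4)
(cl-loop for x in '(a b c d) by #'cddr collect x)   ; => (a c)
(cl-loop for tail on '(1 2 3) collect tail)         ; => ((1 2 3) (2 3) (3))
(cl-loop for c across "abc" collect c)              ; => (97 98 99)
(cl-loop for x = 1 then (* 2 x)
         while (< x 10)
         collect x)                                 ; => (1 2 4 8)
```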

12.3.3 iteration clauses

repeat integer
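repeat the loop that many times
while condition
stops the loop when the condition becomes nil
until condition
stops the loop when the condition becomes non-nil
always condition
like while, except that the loop returns nil when it stops this way, and finally clauses are not executed. If the condition never fails, the loop returns t.
never condition
counterpart of always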

12.3.4 accumulation clauses

collect form
collect into a list and return the list in the end
append form
collect the lists into one list by appending, and return it in the end
concat form
like append, but for strings
count form
count how many times form evaluates to non-nil.
sum form
sum all the values
maximize form
get the max. If the form is never executed, result is undefined
minimize form
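The accumulation clauses, and the always clause from the previous section, can be checked with a few one-liners:

```elisp
(require 'cl-lib)

(cl-loop for x in '(1 2 3 4) sum x)                      ; => 10
(cl-loop for x in '(1 2 3 4) count (zerop (% x 2)))      ; => 2
(cl-loop for x in '(3 1 4) maximize x)                   ; => 4
(cl-loop for x in '(3 1 4) minimize x)                   ; => 1
(cl-loop for x in '((1 2) (3)) append x)                 ; => (1 2 3)
(cl-loop for s in '("a" "b") concat s)                   ; => "ab"
(cl-loop for x in '(1 2 3) always (> x 0))               ; => t
(cl-loop for x in '(1 -2 3) always (> x 0))              ; => nil
```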

12.3.5 Other clauses

with var = value
set the value once at the beginning of the loop. Often used for the return variable. The spaces around = are essential!
if condition clause [else clause]
when condition clause
same as if
unless condition clause
the opposite of when
initially [do] forms...
execute before the loop begins, but after the for and with variable bindings. do is optional.
finally [do] forms...
execute after the loop finishes
finally return form
finally return it …
do forms...
execute as an implicit progn in the body
return form
this is mostly used inside if or unless, because putting it at the top level makes the loop always execute only once.
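A sketch combining with, if/else, do, return, and finally return. The three function names are made up. The "collect ... into" form used in the second function is not described above; it accumulates into a named variable instead of the loop's return value.

```elisp
(require 'cl-lib)

;; Hypothetical: a conditional return ends the loop at the first match.
(defun my-notes-first-even (numbers)
  (cl-loop for n in numbers
           when (zerop (% n 2)) return n))

;; Hypothetical: if/else with two named accumulators.
(defun my-notes-partition (numbers)
  (cl-loop for n in numbers
           if (zerop (% n 2)) collect n into evens
           else collect n into odds
           finally return (list evens odds)))

;; Hypothetical: with initializes once, do runs the body each iteration.
(defun my-notes-sum-squares (numbers)
  (cl-loop with total = 0
           for n in numbers
           do (setq total (+ total (* n n)))
           finally return total))

(my-notes-first-even '(1 3 4 6))      ; => 4
(my-notes-partition '(1 2 3 4 5))     ; => ((2 4) (1 3 5))
(my-notes-sum-squares '(1 2 3))       ; => 14
```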

12.4 cl-lib other

Of course, cl-lib provides much more than just loops …

cl-incf PLACE
is like ++i: it increments PLACE and returns the new value

13 Debugging

13.1 lisp debugger

The simplest debugger is called the Lisp debugger. You can turn on the debug-on-error flag, but I found inserting a (debug) call more useful. Simply insert (debug) where you want the program to suspend, and run it. You will enter the debugger at that point. In the debugger buffer, the following commands are available:

c
continue running the program
d
step
e
evaluate an expression read from the minibuffer
R
like e, but also save the result in *Debugger-record*
q
quit
v
toggle display of local variables

13.2 Edebug

For this to work, you first need to instrument the code. You can instrument a defun with C-u C-M-x. This is eval-defun with a prefix argument, which instruments the defun and then evaluates it.

After instrumentation, running the defun will cause the program to stop at the first stop point of the function. The stop points are

  • before and after each subexpression that is a list
  • after each variable reference

13.2.1 breakpoints

b
set a breakpoint
u
unset a breakpoint
x CONDITION
set a conditional breakpoint

You can also set the source breakpoints, by adding (edebug).

13.2.2 Moving of point

B
move point to the next breakpoint
w
move point back to the current stop point

13.2.3 executions

<SPC>
run to next stop point
g
execute until next breakpoint
q
exit
S
stop and wait for Edebug commands
n
evaluate a sexp and stop at stop point
t
trace, pause one second at each stop point …
T
rapid trace. Update the display at each stop point but don't actually pause …
c
pause one second at each breakpoint
C
rapid continue.
G
run and ignore breakpoints (but you can stop it by S)
h
proceed to the stop point near the point …
f
run one expression
o
step out the containing expression
i
step in

13.2.4 evaluation

e EXP
evaluate a prompt expression
C-x C-e
evaluate an expression at point

13.2.5 other commands

?
show help
r
redisplay the most recent sexp result
d
display the backtrace

14 Unit Testing

Use ert for unit testing.

14.1 Write test

(ert-deftest addition-test()
  "Outline docstring."
  (should (= (+ 1 2) 4)))

The family of functions:

  • should
  • should-not
  • should-error

expected failure:

(ert-deftest addition-test()
  "Outline docstring."
  :expected-result :failed
  (should (= (+ 1 2) 4)))

skip test

(ert-deftest addition-test()
  "Outline docstring."
  (skip-unless (featurep 'dbusbind))
  (should (= (+ 1 2) 4)))

14.2 Run test

M-x ert runs the tests. The test selector accepts some fancier stuff, like regular expression matching. When testing from the scratch buffer, I first need to evaluate the ert-deftest form and then call ert.

The nice thing is it supports interactive debugging. In the ert buffer, the following commands are available:

r
re-run the test
.
jump to the source code of this test
b
show back-trace
m
show the message this test printed
d
re-run the test with debugger enabled
instrumentation
go to the source code, type C-u C-M-x, and re-run the test. You are now able to step through it!

You can also run a single test programmatically:

(ert-run-test (ert-get-test 'my-defined-test))
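For comparison with the deliberately failing examples above, here is a sketch of a test that passes and exercises all three assertion forms, followed by a programmatic run. The test name is hypothetical.

```elisp
(require 'ert)

(ert-deftest my-notes-arithmetic-test ()
  "Hypothetical passing test that uses all three assertion forms."
  (should (= (+ 1 2) 3))
  (should-not (= (+ 1 2) 4))
  ;; Integer division by zero signals arith-error.
  (should-error (/ 1 0) :type 'arith-error))

;; Run the test without the interactive ERT buffer and check the outcome.
(ert-test-passed-p
 (ert-run-test (ert-get-test 'my-notes-arithmetic-test)))   ; => t
```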

15 Some random code snippets

(cl-prettyprint (font-family-list)) ;; see all font family available on this system

15.0.1 Url retrieval

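  ;; Fetch a search result page and copy the whole response to the kill ring.
  ;; The query is URL-encoded, since a URL must not contain raw spaces.
  (with-current-buffer (url-retrieve-synchronously
                        (concat "http://scholar.google.com/scholar?q="
                                (url-hexify-string "segmented symbolic analysis")))
    (goto-char (point-min))
    (kill-ring-save (point-min) (point-max)))
  ;; Follow a frame link. This assumes a previous search left the URL in
  ;; match group 1.
  (let ((framed-url (match-string 1)))
    (with-current-buffer (url-retrieve-synchronously framed-url)
      (goto-char (point-min))
      ;; NOERROR is t, so a failed search returns nil instead of signaling.
      (when (re-search-forward "<frame src=\"\\(http[[:ascii:]]*?\\)\"" nil t)
        (match-string 1))))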

16 Emacs Related

16.1 Buffer

  • with-temp-buffer (with-temp-buffer &rest BODY) Create a temporary buffer, and evaluate BODY there like progn.
  • (insert-file-contents FILENAME &optional VISIT BEG END REPLACE): Insert contents of file FILENAME after point.
  • (secure-hash ALGORITHM OBJECT &optional START END BINARY): the object can be a buffer. This can be used to compare if a file has changed.
  • (current-buffer): Return the current buffer as a Lisp object.
  • (message FORMAT-STRING &rest ARGS): Display a message at the bottom of the screen.

There are many buffers in an Emacs session, and current-buffer returns the current one, which is the default target for most commands. When you want to do something to some other buffer, you need set-buffer to make that buffer current. You will likely want to switch back to the original buffer after those operations. Don't use set-buffer to switch back, because that is not error-safe. Instead, use save-current-buffer or, better, with-current-buffer. with-temp-buffer doesn't need a buffer object but creates a temporary one, which is killed when the body finishes executing. None of these three forms displays the buffer; they just make it current.
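Two small sketches of these forms, with hypothetical function names. The first does its work in a throwaway buffer; the second reads another buffer without leaving the current one.

```elisp
;; Hypothetical helper: count the lines of TEXT using a temporary buffer.
(defun my-notes-line-count (text)
  (with-temp-buffer
    (insert text)
    (count-lines (point-min) (point-max))))

;; Hypothetical helper: return the contents of BUFFER.
(defun my-notes-buffer-text (buffer)
  ;; with-current-buffer restores the previous current buffer afterwards,
  ;; even if the body signals an error.
  (with-current-buffer buffer
    (buffer-string)))

(my-notes-line-count "a\nb\nc")   ; => 3
```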

A buffer has a name, retrieved by buffer-name. The name can be set using rename-buffer. A buffer can be obtained by name via get-buffer. Buffers are also likely to be associated with a file, whose full file name is given by buffer-file-name. You can also get the buffer from a file name via get-file-buffer. Several buffers may visit the same file, in which case this function returns the first one.

To create a buffer, use get-buffer-create, which returns a new buffer or an existing buffer of that name. It does not make that buffer current. generate-new-buffer-name creates a new unique buffer name, but it is not typically used directly. The function generate-new-buffer always creates a new buffer and uses that function to generate a new name (by appending <N>) if the provided name is in use.

Obtain all the live buffers using buffer-list. The order of the list matters. A newly created buffer is added to the end of the list, and the currently displayed buffer moves to the front. When a buffer is buried, it is moved to the end. other-buffer returns the first buffer in the list that is not the current one. last-buffer returns the last one (the end of the list). bury-buffer moves a buffer to the end, and unbury-buffer switches to the last buffer. A buffer is killed by kill-buffer, in which case it is removed from the list.
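The life cycle of a buffer can be followed in a short sketch. The function name and the buffer name "my-notes-scratch" are invented; unwind-protect makes sure the buffer is killed even if something in between fails.

```elisp
;; Hypothetical demo: create a buffer, look it up, and kill it again.
(defun my-notes-buffer-lifecycle ()
  (let ((buf (generate-new-buffer "my-notes-scratch")))
    (unwind-protect
        (list (eq (get-buffer (buffer-name buf)) buf)   ; lookup by name
              (buffer-live-p buf)
              (and (memq buf (buffer-list)) t)          ; listed while alive
              (buffer-file-name buf))                   ; nil: visits no file
      (kill-buffer buf))))

(my-notes-buffer-lifecycle)   ; => (t t t nil)
```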

16.2 Position

A position is an index into a buffer. There is a character before and a character after each position. When we say the character "at position", we mean the one after the position. Positions in a buffer start from 1, while positions in a string start from 0.

The point is the current cursor position. point returns the current point, and point-min and point-max return the beginning and end of the accessible portion of the buffer.

There are many commands to move point. goto-char moves to a given position, and all other commands build upon it. I'm omitting the opposite versions, e.g. forward vs. backward, up vs. down, beginning vs. end.

  • moves by characters: forward-char
  • moves by word: forward-word
  • buffer: beginning-of-buffer moves to point-min
  • line: beginning-of-line and end-of-line, forward-line (a negative argument moves backward; there is no backward-line)
  • screen: you can also count the current vertical screen lines, and move the corresponding lines accordingly.
  • balanced expression: forward-list, up-list, forward-sexp, end-of-defun
  • skipping: skip-chars-forward skips over a list of chars represented by a pattern string. It is like regular expression, but is put implicitly inside brackets. Thus you can use for example "a-zA-Z".

It is useful to temporarily move to some position, do some tasks, and move back. This is called an excursion and is done via save-excursion.

Narrowing works with two positions. narrow-to-region does the narrowing, and widen undoes it. This has the following effects:

  1. it determines the accessible portion of the buffer, but positions still count from the real beginning of the buffer.
  2. The point cannot move outside the positions
  3. no texts outside are displayed
  4. most (?) functions refuse to operate on outside text
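Excursions and narrowing can be observed in a temporary buffer. This sketch uses a hypothetical function name and also save-restriction, which is not mentioned above; it undoes the narrowing when its body ends.

```elisp
;; Hypothetical demo of save-excursion and narrowing.
(defun my-notes-narrow-demo ()
  (with-temp-buffer
    (insert "one\ntwo\nthree\n")
    (goto-char (point-min))
    (forward-line 1)                       ; point is now before "two"
    (let ((start (point)))
      (save-excursion
        (goto-char (point-max)))           ; point is restored afterwards
      (list (= (point) start)
            (save-restriction
              (narrow-to-region (line-beginning-position)
                                (line-end-position))
              ;; Positions still count from the real buffer start.
              (list (point-min) (point-max) (buffer-string)))
            (point-min)))))                ; narrowing has been undone

(my-notes-narrow-demo)   ; => (t (5 8 "two") 1)
```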

16.3 Marker

A marker has two components: the buffer it is in, and the position in that buffer. They can be retrieved by marker-buffer and marker-position.

The position is updated automatically when the text changes. The invariant is the two surrounding characters. Updating marker positions takes time, especially when there are a lot of them. Thus, detach a marker (by setting it to point nowhere) if you know you won't use it any more.

You can make a marker with four functions, which differ only in the initial position: make-marker, point-marker, point-min-marker, and point-max-marker. You can also copy an existing one with copy-marker. A marker can be moved by set-marker.

There's one special marker, designated the mark, whose position is returned by mark. To get the actual marker, use mark-marker, but this is dangerous, so try to avoid it. The mark is mainly used to provide a default region for a command. The text between point and the mark is called the region. Its beginning and end can be obtained by region-beginning and region-end. When using (interactive) to define a command, the "r" code gives the command two numeric arguments, point and the mark, smaller first. This region is what most region-based commands operate on by default.

Some commands set the mark, and when they do, they typically save the old mark on the mark ring. set-mark sets the position of the mark, but it is not commonly used, because it discards the previous mark. Instead, push-mark and pop-mark handle the mark ring automatically.
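The automatic updating of a marker can be seen in the following sketch (hypothetical function name). Text is inserted in front of the marker, and the marker keeps pointing at the same character. At the end the marker is detached with (set-marker m nil).

```elisp
;; Hypothetical demo: a marker follows its text when the buffer changes.
(defun my-notes-marker-demo ()
  (with-temp-buffer
    (insert "hello world")
    (let ((m (copy-marker 7)))             ; just before the "w"
      (goto-char (point-min))
      (insert ">>> ")                      ; four characters before the marker
      (prog1 (list (marker-position m)
                   (char-after m)          ; still the "w", i.e. 119
                   (eq (marker-buffer m) (current-buffer)))
        (set-marker m nil)))))             ; detach: no more updating cost

(my-notes-marker-demo)   ; => (11 119 t)
```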

16.4 Process

Elisp can create asynchronous or synchronous processes. There are three primitives to create a subprocess: make-process for asynchronous ones, and call-process and call-process-region for synchronous ones. All others are built upon them.

list-processes displays the current live asynchronous processes, while process-list returns them as process objects. You can also get a process by its name via get-process. Process information can be retrieved by process-command, process-id, process-name, process-status, process-live-p, process-type, and process-exit-status.

You also want to communicate with the subprocess: send input, receive output, or send signals. To send input, use process-send-string, process-send-region, or process-send-eof. To send signals, use interrupt-process, kill-process, quit-process, stop-process, continue-process, or the general signal-process.

The output of a subprocess is inserted into an associated buffer, called the process buffer. This buffer serves two purposes: it receives the output, and killing the buffer kills the process. process-buffer returns the buffer of a particular process, and get-buffer-process returns the process object associated with a buffer. The insertion position is determined by the process mark, which normally sits at the end of the output inserted so far. You can set the process buffer with set-process-buffer.

A network connection is also represented by a process object, but it is not a child process: it has no process id and cannot be killed or sent signals. You can only send and receive data, or close the connection. make-network-process creates a network connection. It is the primitive, able to create a TCP or UDP connection or a server. Alternatively, open-network-stream creates a TCP connection specifically.
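For the synchronous case, a minimal sketch with call-process could look like this. The helper name is hypothetical, and the usage line assumes a Unix-like system with an echo program on the PATH.

```elisp
;; Hypothetical helper: run PROGRAM with ARGS synchronously and return
;; (EXIT-STATUS . OUTPUT).
(defun my-notes-run-sync (program &rest args)
  (with-temp-buffer
    ;; INFILE is nil (no input), DESTINATION t means the current buffer,
    ;; DISPLAY is nil (no redisplay while output arrives).
    (let ((status (apply #'call-process program nil t nil args)))
      (cons status (buffer-string)))))

;; Assumes an "echo" executable is available:
;; (my-notes-run-sync "echo" "hello")
```

On such a system the call returns the exit status together with whatever the program printed.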

16.5 File System Related

16.5.1 Traversing

(directory-files DIRECTORY &optional FULL MATCH NOSORT)

Return a list of names of files in DIRECTORY.

Usage example:

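;; A let-binding fragment. Backslashes must be doubled inside a Lisp string,
;; and \' anchors the match at the end of the file name.
(bib-files (directory-files bib-dir t "\\.bib\\'"))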

16.5.2 Predicates

directory-files signals an error if the directory does not exist. So a safe way is to check that the directory exists first. This predicate does that:

(file-exists-p FILENAME)

A directory is also a file.

Other predicates include:

file-readable-p
file-executable-p
file-writable-p
file-accessible-directory-p
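Putting the predicate check and the listing together gives something like the sketch below. The function name is hypothetical, and it uses file-directory-p (not listed above), which is true only for existing directories.

```elisp
;; Hypothetical helper: full names of the .bib files in DIR, or nil if
;; DIR is not an existing directory.
(defun my-notes-bib-files (dir)
  (when (file-directory-p dir)
    (directory-files dir t "\\.bib\\'")))
```

Because of the guard, a missing directory yields nil instead of an error.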

16.6 Other

  • (defalias SYMBOL DEFINITION &optional DOCSTRING): Set SYMBOL's function definition to DEFINITION. E.g. (defalias 'helm-bibtex-get-value 'bibtex-completion-get-value) serves as a temporary patch while helm-bibtex updates its API to bibtex-completion.

16.6.1 make-obsolete-variable

(make-obsolete-variable OBSOLETE-NAME CURRENT-NAME WHEN &optional ACCESS-TYPE)

Make the byte-compiler warn that OBSOLETE-NAME is obsolete.

helm-bibtex used it when it factored the "helm" part out into a module, to support backends other than helm. As a result, most helm-bibtex- prefixes were changed to bibtex-completion- ones. But the authors wanted end users' configurations not to break, and at the same time to warn them to update to the new names. Here's the code; the last line is what actually uses the function. The effect is that uses in the user's configuration are flagged with a warning, and the minibuffer describes the obsolescence details.

  (cl-loop
   for var in '("bibliography" "library-path" "pdf-open-function"
                "pdf-symbol" "format-citation-functions" "notes-path"
                "notes-template-multiple-files"
                "notes-template-one-file" "notes-key-pattern"
                "notes-extension" "notes-symbol" "fallback-options"
                "browser-function" "additional-search-fields"
                "no-export-fields" "cite-commands"
                "cite-default-command"
                "cite-prompt-for-optional-arguments"
                "cite-default-as-initial-input" "pdf-field")
   for oldvar = (intern (concat "helm-bibtex-" var))
   for newvar = (intern (concat "bibtex-completion-" var))
   do
   (defvaralias newvar oldvar)
   (make-obsolete-variable oldvar newvar "2016-03-20"))