SXEmacs Lisp Reference Manual: Formatting Strings

10.10 Formatting Strings

Formatting means constructing a string by substitution of computed values at various places in a constant string. This string controls how the other values are printed as well as where they appear; it is called a format string.

Formatting is often useful for computing messages to be displayed. In fact, the functions message and error provide the same formatting feature described here; they differ from format only in how they use the result of formatting.

Function: format string &rest objects: This function returns a new string that is made by copying string and then replacing any format specification in the copy with encodings of the corresponding objects. The arguments objects are the computed values to be formatted.

A format specification is a sequence of characters beginning with a ‘%’. Thus, if there is a ‘%d’ in string, the format function replaces it with the printed representation of one of the values to be formatted (one of the arguments objects). For example:

(format "The value of fill-column is %d." fill-column)
     ⇒ "The value of fill-column is 72."

If string contains more than one format specification, the format specifications correspond with successive values from objects. Thus, the first format specification in string uses the first such value, the second format specification uses the second such value, and so on. Any extra format specifications (those for which there are no corresponding values) cause unpredictable behavior. Any extra values to be formatted are ignored.
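As a small illustration of the correspondence rule described above, the following sketch pairs specifications with values from left to right and passes one surplus value. The strings and numbers are invented for illustration only.

```elisp
;; Two specifications consume the two values in order.
(format "%s has %d entries" "alist" 3)
     ;; ⇒ "alist has 3 entries"

;; The second value has no matching specification and is ignored.
(format "%d" 1 2)
     ;; ⇒ "1"
```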

Certain format specifications require values of particular types. However, no error is signaled if the value actually supplied fails to have the expected type. Instead, the output is likely to be meaningless.

Here is a table of valid format specifications:

‘%s’

Replace the specification with the printed representation of the object, made without quoting. Thus, strings are represented by their contents alone, with no ‘"’ characters, and symbols appear without ‘\’ characters. This is equivalent to printing the object with princ.

If there is no corresponding object, the empty string is used.

‘%S’

Replace the specification with the printed representation of the object, made with quoting. Thus, strings are enclosed in ‘"’ characters, and ‘\’ characters appear where necessary before special characters. This is equivalent to printing the object with prin1.

If there is no corresponding object, the empty string is used.

‘%o’

Replace the specification with the base-eight representation of an integer.

‘%d’

‘%i’

Replace the specification with the base-ten representation of an integer.

‘%x’

Replace the specification with the base-sixteen representation of an integer, using lowercase letters.

‘%X’

Replace the specification with the base-sixteen representation of an integer, using uppercase letters.

‘%b’

Replace the specification with the base-two representation of an integer.

‘%c’

Replace the specification with the character which is the value given.

‘%e’

Replace the specification with the exponential notation for a floating point number (e.g. ‘7.85200e+03’).

‘%f’

Replace the specification with the decimal-point notation for a floating point number.

Bear in mind that floating point numbers have a limited and fixed precision, even though the printed output may suggest otherwise. The precision varies between 12 and 38 digits, depending on the machine. This means that if you use a specifier like ‘%.60f’ on ‘1.0’ or ‘1.5’, only the first 12 to 38 digits are meaningful. Note also that numbers are processed internally in binary (base-two) arithmetic, so you may see strange rounding effects, e.g. with ‘%.60f’ on ‘1.2’ or ‘%f’ on ‘1e+40’. These effects arise because you force the printer to show more digits than the value actually carries. No error is signaled in these cases.

‘%g’

Replace the specification with notation for a floating point number, using a “pretty format”. Either exponential notation or decimal-point notation will be used (usually whichever is shorter), and trailing zeroes are removed from the fractional part.

‘%%’

A single ‘%’ is placed in the string. This format specification is unusual in that it does not use a value. For example, (format "%% %d" 30) returns "% 30".

If ENT support is compiled in, several additional specifiers may become available, depending on which additional libraries are provided. See Enhanced Number Types.

Any other format character results in an ‘Invalid format operation’ error.

Here are several examples:

(format "The name of this buffer is %s." (buffer-name))
     ⇒ "The name of this buffer is strings.texi."

(format "The buffer object prints as %s." (current-buffer))
     ⇒ "The buffer object prints as #<buffer strings.texi>."

(format "The octal value of %d is %o,
         and the hex value is %x." 18 18 18)
     ⇒ "The octal value of 18 is 22,
         and the hex value is 12."
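The next sketch gathers a few more of the specifications from the table into short calls: the difference between ‘%s’ and ‘%S’, the integer bases, ‘%c’, and ‘%%’. The argument values are arbitrary and chosen only so that the results are easy to check by hand.

```elisp
;; %s prints like princ (no quoting); %S prints like prin1 (with quoting).
(format "%s and %S" "foo" "foo")
     ;; ⇒ "foo and \"foo\""

;; The same integer in decimal, octal, lowercase hex, uppercase hex, binary.
(format "%d %o %x %X %b" 255 255 255 255 255)
     ;; ⇒ "255 377 ff FF 11111111"

;; %c inserts the character given as the value.
(format "%c%c%c" ?a ?b ?c)
     ;; ⇒ "abc"

;; %% consumes no value.
(format "%d%%" 100)
     ;; ⇒ "100%"
```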

There are many additional flags and specifications that can occur between the ‘%’ and the format character, in the following order:

An optional repositioning specification, which is a positive integer followed by a ‘$’.
Zero or more of the optional flag characters ‘-’, ‘+’, ‘ ’, ‘0’, and ‘#’.
An optional asterisk (‘*’), meaning that the field width is assumed to have been specified as an argument.
An optional minimum field width.
An optional precision, preceded by a ‘.’ character.

A repositioning specification changes which argument to format is used by the current and all following format specifications. Normally the first specification uses the first argument, the second specification uses the second argument, and so on. A repositioning specification lets you change this: by placing a number n followed by a ‘$’ between the ‘%’ and the format character, you cause the specification to use the nth argument. The next specification then uses the (n+1)th argument, and so on.

For example:

(format "Can't find file `%s' in directory `%s'."
        "ignatius.c" "loyola/")
     ⇒ "Can't find file `ignatius.c' in directory `loyola/'."

(format "In directory `%2$s', the file `%1$s' was not found."
        "ignatius.c" "loyola/")
     ⇒ "In directory `loyola/', the file `ignatius.c' was not found."

(format
    "The numbers %d and %d are %1$x and %x in hex and %1$o and %o in octal."
    37 12)
     ⇒ "The numbers 37 and 12 are 25 and c in hex and 45 and 14 in octal."

As you can see, this lets you reprocess arguments more than once or reword a format specification (thereby moving the arguments around) without having to actually reorder the arguments. This is especially useful in translating messages from one language to another: Different languages use different word orders, and this sometimes entails changing the order of the arguments. By using repositioning specifications, this can be accomplished without having to embed knowledge of particular languages into the location in the program’s code where the message is displayed.

All the specification characters allow an optional numeric prefix between the ‘%’ and the character, and following any repositioning specification or flag. The optional numeric prefix defines the minimum width for the object. If the printed representation of the object contains fewer characters than this, then it is padded. The padding is normally on the left, but will be on the right if the ‘-’ flag character is given. The padding character is normally a space, but if the ‘0’ flag character is given, zeros are used for padding.

(format "%06d is padded on the left with zeros" 123)
     ⇒ "000123 is padded on the left with zeros"

(format "%-6d is padded on the right" 123)
     ⇒ "123    is padded on the right"

format never truncates an object’s printed representation, no matter what width you specify. Thus, you can use a numeric prefix to specify a minimum spacing between columns with no risk of losing information.

In the following three examples, the numeric prefix specifies a minimum width of 7. In the first case, the string inserted in place of ‘%7s’ has only 3 letters, so 4 blank spaces are inserted on the left for padding. In the second case, the string "specification" is 13 letters wide but is not truncated. In the third case, the ‘-’ flag in ‘%-7s’ puts the padding on the right.

(format "The word `%7s' actually has %d letters in it."
        "foo" (length "foo"))
     ⇒ "The word `    foo' actually has 3 letters in it."

(format "The word `%7s' actually has %d letters in it."
        "specification" (length "specification"))
     ⇒ "The word `specification' actually has 13 letters in it."

(format "The word `%-7s' actually has %d letters in it."
        "foo" (length "foo"))
     ⇒ "The word `foo    ' actually has 3 letters in it."
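Minimum widths can be combined to line up columns, as mentioned above. The following is a minimal sketch, assuming a hypothetical list of (name . count) pairs invented for this illustration. Each row is formatted with a left-justified name of width 8 followed by a right-justified count of width 5.

```elisp
;; The row data is hypothetical and serves only to show the alignment.
(mapconcat
 (lambda (row)
   ;; %-8s pads the name on the right; %5d pads the count on the left.
   (format "%-8s%5d" (car row) (cdr row)))
 '(("apple" . 3) ("kiwi" . 12) ("cherry" . 140))
 "\n")
     ;; ⇒ "apple       3
     ;;     kiwi       12
     ;;     cherry    140"
```

Every row comes out 13 characters wide, so the counts end in the same column.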

After any minimum field width, a precision may be specified by preceding it with a ‘.’ character. The precision specifies the minimum number of digits to appear in ‘%d’, ‘%i’, ‘%o’, ‘%x’, and ‘%X’ conversions (the number is padded on the left with zeroes as necessary); the number of digits printed after the decimal point for ‘%f’, ‘%e’, and ‘%E’ conversions; the number of significant digits printed in ‘%g’ and ‘%G’ conversions; and the maximum number of non-padding characters printed in ‘%s’ and ‘%S’ conversions. The default precision for floating-point conversions is six.
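The effect of a precision on each kind of conversion can be sketched as follows. The values are arbitrary; the results shown follow from the rules just stated and assume ordinary double-precision floats.

```elisp
;; Digits after the decimal point for %f.
(format "%.2f" 3.14159)
     ;; ⇒ "3.14"

;; Minimum width 8 combined with precision 3.
(format "%8.3f" 3.14159)
     ;; ⇒ "   3.142"

;; Minimum number of digits for an integer conversion.
(format "%.5d" 42)
     ;; ⇒ "00042"

;; Maximum number of characters for %s; unlike a width, this can shorten
;; the output.
(format "%.3s" "abcdef")
     ;; ⇒ "abc"
```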

The other flag characters have the following meanings:

The ‘ ’ flag means prefix non-negative numbers with a space.
The ‘+’ flag means prefix non-negative numbers with a plus sign.
The ‘#’ flag means print numbers in an alternate, more verbose format: octal numbers begin with zero; hex numbers begin with a ‘0x’ or ‘0X’; a decimal point is printed in ‘%f’, ‘%e’, and ‘%E’ conversions even if no numbers are printed after it; and trailing zeroes are not omitted in ‘%g’ and ‘%G’ conversions.
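The following short sketch applies each of these flags to a small integer. The values are arbitrary, and the results shown are those implied by the descriptions above.

```elisp
;; ‘+’ prefixes a non-negative number with a plus sign.
(format "%+d" 5)
     ;; ⇒ "+5"

;; Negative numbers keep their minus sign.
(format "%+d" -5)
     ;; ⇒ "-5"

;; ‘ ’ prefixes a non-negative number with a space.
(format "% d" 5)
     ;; ⇒ " 5"

;; ‘#’ adds a leading zero to octal and ‘0x’ or ‘0X’ to hex.
(format "%#o %#x %#X" 8 255 255)
     ;; ⇒ "010 0xff 0XFF"
```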