A note about the performance of Printf and Format.

Alain Frisch

The goal is to display the following to stdout:

(1,1) (2,2) … (1000000,1000000), one pair per line.

How would you implement that in OCaml? For such a simple task, we would probably expect the program to be I/O bound, right? OK, let’s try the idiomatic way, which is to use format strings as provided by the Printf module from the OCaml standard library:

(* Number of pairs to print; shared by all the versions in this note. *)
let n = 1_000_000

let printf () =
  for i = 1 to n do
    Printf.printf "(%d,%d)\n" i i
  done

Or maybe the same with the Format module:

let formatf () =
  for i = 1 to n do
    Format.printf "(%d,%d)@." i i
  done

On my machine, compiling with ocamlopt and running these functions (with stdout redirected to a local file) takes 0.94s and 3.33s respectively. So the first lesson is:

If performance matters, don’t use Format if Printf is enough.

OK, can we beat the Printf version? That version parses and interprets the format string at runtime, and then dispatches to low-level output functions. So let’s try using those functions directly:

let direct () =
  let oc = stdout in
  for i = 1 to n do
    output_char oc '(';
    output_string oc (string_of_int i);
    output_char oc ',';
    output_string oc (string_of_int i);
    output_char oc ')';
    output_char oc '\n';
  done

This version takes 0.51s (0.94s for printf).

If performance matters, use direct output functions instead of Printf.

Sometimes you are stuck with the Format module, because you must call existing printers of type Format.formatter -> foo -> unit, or because you really do make use of formatting boxes. Still, nothing forces you to rely on format strings to use Format. Here is a more direct version:

let format () =
  let open Format in
  for i = 1 to n do
    print_char '(';
    print_int i;
    print_char ',';
    print_int i;
    print_char ')';
    print_newline ()  (* like "@." in formatf: newline, then flush *)
  done

This one takes 2.58s in native code (3.33s for formatf).

If performance matters and you are stuck with using Format, avoid the use of format strings.
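
The same idea applies when you write a printer of type Format.formatter -> foo -> unit yourself. Here is a minimal sketch, assuming a hypothetical printer pp_pair for pairs of integers; it only uses the pp_print_* functions of Format and no format string. The helper to_string is also hypothetical and is only there to show how such a printer can be called.

```ocaml
(* Hypothetical printer for pairs of ints, written without format strings.
   It has the usual shape Format.formatter -> 'a -> unit, so it can be
   combined with existing printers of that shape. *)
let pp_pair ppf (a, b) =
  Format.pp_print_char ppf '(';
  Format.pp_print_int ppf a;
  Format.pp_print_char ppf ',';
  Format.pp_print_int ppf b;
  Format.pp_print_char ppf ')'

(* Hypothetical helper: render a value to a string with any such printer. *)
let to_string pp x =
  let buf = Buffer.create 16 in
  let ppf = Format.formatter_of_buffer buf in
  pp ppf x;
  Format.pp_print_flush ppf ();  (* push pending output into the buffer *)
  Buffer.contents buf
```

For example, to_string pp_pair (1, 2) builds the text for one pair. This sketch has not been timed, so the figures in this note do not apply to it directly.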

Note that the direct version could still be optimised a little bit if we had a direct output_int which did not allocate its result as a string. Also note that in bytecode, the relative slowdowns are even more impressive.
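
To make the remark about output_int concrete, here is one possible sketch. The standard library has no such function, so both output_int and its helper output_digits are hypothetical names. The digits are written one by one with output_char, so no intermediate string is allocated. Whether this is actually faster than output_string oc (string_of_int i) has not been measured here.

```ocaml
(* Helper for the hypothetical output_int below.
   Invariant: n <= 0. Working on non-positive numbers avoids the
   overflow of (- min_int). Leading digits are printed first. *)
let rec output_digits oc n =
  if n <= -10 then output_digits oc (n / 10);
  (* n mod 10 is in [-9, 0] here, so this is a character in '0'..'9' *)
  output_char oc (Char.chr (Char.code '0' - n mod 10))

(* Write the decimal representation of [i] to [oc] without allocating
   a string. The recursion depth is bounded by the number of digits. *)
let output_int oc i =
  if i < 0 then begin
    output_char oc '-';
    output_digits oc i
  end else
    output_digits oc (-i)
```

In the direct version, each output_string oc (string_of_int i) would then become output_int oc i.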

Here is a table which summarises those results (also showing the number of bytes allocated, as returned by Gc.allocated_bytes):

[              direct]:     0.51 sec,             32000128 bytes
[              printf]:     0.94 sec,           1704075840 bytes
[              format]:     2.58 sec,            728008488 bytes
[             formatf]:     3.33 sec,           2664063504 bytes
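
Figures of this kind can be collected with a small harness along the following lines. This is a sketch, not the exact code used for the table: the name measure is hypothetical, Sys.time (processor time) is an assumption about how the time was taken, and the format string is only a guess at the layout above. The report goes to stderr so that it does not end up in the file that receives stdout.

```ocaml
(* Run [f], then report elapsed processor time and allocated bytes.
   [measure] is a hypothetical helper name. *)
let measure name f =
  let bytes0 = Gc.allocated_bytes () in
  let t0 = Sys.time () in
  f ();
  flush stdout;  (* include the cost of writing out buffered data *)
  let t1 = Sys.time () in
  let bytes1 = Gc.allocated_bytes () in
  (* stderr, so the report stays separate from the benchmark output *)
  Printf.eprintf "[%20s]: %8.2f sec, %20.0f bytes\n%!"
    name (t1 -. t0) (bytes1 -. bytes0)
```

Each version is then run as, for instance, measure "direct" direct.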

How often does the performance of printing textual data matter? Well, actually, quite rarely, but sometimes it does, and the numbers above suggest that it might pay to rewrite the most intensively used pieces of the pretty-printing code. For instance, we’ve observed a total speedup of about 15% for compiling our entire code tree by optimising the -annot printer of ocamlc (by avoiding some uses of Format and format strings). I suspect that people writing textual logs could also be interested in these performance remarks.
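
For textual logs, the same approach as in the direct version might look like the sketch below. The function log_line and its "[level] message (code)" layout are purely illustrative; the point is only that the line is emitted with output_char and output_string rather than with a format string such as "[%s] %s (%d)\n".

```ocaml
(* Hypothetical log-line writer using only direct output functions.
   Writes "[level] msg (code)" followed by a newline to [oc]. *)
let log_line oc level msg code =
  output_char oc '[';
  output_string oc level;
  output_string oc "] ";
  output_string oc msg;
  output_string oc " (";
  output_string oc (string_of_int code);
  output_string oc ")\n"
```

A call such as log_line stderr "warn" "disk almost full" 17 writes one line to stderr. No timing has been done for this sketch.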