Migrating to floatarray: experience report.

Nicolás Ojeda Bär

This post is about our use of OCaml inside LexiFi and how we managed a large refactoring of our codebase that we enterprised in order to take advantage of a new feature of OCaml.

Namely, in OCaml 4.08 a specific type floatarray for arrays of floating-point numbers was introduced which has a number of advantages relative to the usual type float array (see below for some details or this blog post to learn more about the motivation for this change).

At LexiFi we had been wanting to switch to the new type for a long time to profit from the advantages it offers, but the magnitude of the task had kept us from doing so until now. Luckily for us, Nicolas Chataing, who interned at LexiFi from Sep 2020 to Feb 2021 came along with the right skill set at the right time to tackle this task and was able to make the migration happen (thank you Nicolas!).

In the sections that follow we will recall the background for this migration, the challenges we faced and the solutions we came up with. We hope this will be useful to others who are thinking of doing the same.

Before going any further, note that the issues discussed here are only relevant if you have a medium-to-large codebase making heavy use of float array. If your codebase is small or if your use of float array is light, then migration to floatarray is easy: just change the signatures in your code to use floatarray instead of float array, and let the compiler guide you in fixing the rest of the code.

However, for larger codebases (LexiFi’s codebase is ~600k LOC, including third-party dependencies), this manual approach is not feasible. To give an idea of the size of the migration, the final patch represented ~10k LOC, most of it produced by an automatic tool.

The tool is available online at https://github.com/nchataing/caml-migrate-floatarray.

Background: runtime representation of float arrays

Let us explain the “why” of the migration. For more details, head to this previous post where you will find an in-depth discussion.

Recall that at runtime every OCaml value is represented either by a word-sized integer, or as a pointer to a memory block:

          │
 ┌─────┬──▼──┬─────┬─────┬──────┬─────┐
 │ TAG │     │     │     │ ...  │     │
 └─────┴─────┴─────┴─────┴──────┴─────┘

In order for the GC to differentiate between the two kinds of representations, values encoded as integers have their least significant bit set to one (pointers are always aligned, so their least significant bit is zero). In particular, integer-encoded values must fit in 31 or 63 bits (depending on machine architecture). Otherwise, they must be encoded as pointers to a memory block.

So bool (1 bit), char (8 bits) and int (31/63 bits) can all be represented as integers. On the other hand, float (64 bits) cannot, and must be represented as pointer:

          │
 ┌─────┬──▼────────┐
 │ 253 │  1.92e12  │
 └─────┴───────────┘

An array a of length n must also be represented as a pointer:

          │
 ┌─────┬──▼──┬─────┬─────┬──────┬───────┐
 │  0  │a.(0)│a.(1)│a.(2)│ ...  │a.(n-1)│
 └─────┴─────┴─────┴─────┴──────┴───────┘

Putting these two representations together, it would mean that a value of type float array would be represented thus:

          │
 ┌─────┬──▼──┬─────┬─────┬──────┬───────┐
 │  0  │a.(0)│a.(1)│a.(2)│ ...  │a.(n-1)│
 └─────┴──┬──┴──┬──┴──┬──┴──────┴──┬────┘
          │     │     │            │
 ┌─────┬──▼─────┴──┐  │            │
 │ 253 │  1.92e12  │  │            │
 └─────┴────────┬──┘  │            │
       ┌─────┬──▼─────┴──┐         │
       │ 253 │  1.92e12  │         │
       └─────┴────────┬──┘         │
             ┌─────┬──▼────────┐   │
             │ 253 │  1.92e12  │   │
             └─────┴───────────┘   │
                          ┌─────┬──▼────────┐
                          │ 253 │  1.92e12  │
                          └─────┴───────────┘

This is inefficient in terms of memory use and locality, but that is not the only issue. For example, this memory layout does not match the C memory layout for an array of floats. This means that interacting with C libraries (a common use-case in numerical code) requires converting float array back-and-forth into the C representation.

To mitigate this problem, OCaml has had an optimization in place since the beginning of time: values of type float array are represented specially by embedding the float values directly in the memory block:

             │
 ┌────────┬──▼──────┬─────────┬─────────┬──────┬──────────┐
 │  254   │ 1.92e12 │ 1.92e12 │ 1.92e12 │ ...  │ 1.92e12  │
 └────────┴─────────┴─────────┴─────────┴──────┴──────────┘

(we refer to this representation as “flat”). This solves the most of the issues mentioned above, but introduces some problems of its own.

  • Every runtime operation on arrays is less efficient, as it must check to see which representation it is dealing with.

    The compiler is able to optimize away this check in some cases: for example, if it is able to determine that the type of the array elements is float or that it is not float. But this still leaves many other cases which are not covered: for example, polymorphic functions are never able to opitimize the check away, or if the type of elements is abstract, etc.

  • Reading an element from a float array involves an allocation of a float value, not just a memory read.

  • The different runtime representation can inhibit certain other optimizations (see the blog post referred to above for more on that).

Enter floatarray

Given these issues, an alternative that would still allow to profit from the optimized, “flat” representation for arrays of floats, without incurring in the downsides was desired.

With this objective in mind, three changes were made to the compiler:

  • a new type floatarray was introduced in the standard library. Values of this type are always represented using the “flat” layout.

  • a module Float.Array was added to the standard library which mirrors the Array module but that acts of values of type floatarray

  • a configure-time option was added to OCaml to disable the “flat float array” mode.

    Incidentally, note that code compiled with the mode disabled is incompatible with code compiled with it enabled, so if you are thinking about disabling it, make sure all your code is compiled the same way!

Migrating to floatarray

The objective for us was to migrate all or most of the code that used float array into similar code using floatarray, and to disable the “flat float array” mode.

As a starting point, we considered the following naïve approach to carry out the migration:

  • replace every instance float array by floatarray (in type annotations, type signatures, type declarations, etc).

  • replace every array operation (get, set, length, map, etc) on a value of type float array by the corresponding operation on floatarray.

The expectation was that after doing these two changes, the resulting code would be close to compilable and we would be able to fix the remaining issues by hand.

One sees right away that while the the first step can be done mechanically, the second step is harder. Indeed, most of the time, OCaml code has few or no type annotations, and finding expressions of type float array requires typechecking information, which has to be extracted from the compiler.

With this in mind, we refined our initial approach into an iterative workflow:

  • An automatic tool will handle the mechanical part of the migration: replacing every instance of float array in type annotations, signatures or type definitions by floatarray, replacing calls to Array module acting on float array values by the corresponding functions Float.Array, etc. It will do this in two passes.

  • An annotation pass tasked with adding type annotations to make functions which are polymorphic on 'a array but are only ever used on float array values effectively monomorphic. The objective is to maximize the code that can be handled by the following pass.

  • A rewriting pass that rewrites code that is statically known to be acting or returning values of type float array into similar code involving floatarray. Note that polymorphic functions cannot be rewritten in this way, as the resulting code would no longer be capable of being used with values of type 'a array.

  • Whenever the tool cames across instances of such polymorphic functions or any other case it cannot handle, the issue will be logged into a file, together with location information that would allow to locate the issue in the codebase.

  • After the run, the log file is investigated manually. Once an issue is identified, changes are made to the codebase to put it into a form that will allow the tool to handle the code in question the next time around.

A key point is that the tool requires typing information to work, so it can only be run in a codebase that can be compiled. In particular, it cannot be run in the codebase that results from applying the tool itself during an intermediate stage of the migration. Indeed, this code is not typically in a compilable state due to the spots in the code that the tool is not yet able to handle.

Because of this, we organized the above iterative workflow around two copies of the codebase, “source” and “test”. The tool is always run on the “source” repo and modifies files in the “test” repo. The manual fixes done by the developer are always done in the “source” repo, in a way that the code remains compilable, so that the process can be iterated. In a diagram:

              adapt "test"
           ┌──────────────────┐
    ┌──────┴─────┐      ┌─────┴─────┐       ┌─────────┐
    │    run     │      │   run     │       │         │
    │            │      │           │       │ examine │
 ┌──► "annotate" ├──────► "rewrite" ├───────►         ├──┐
 │  │            │      │           │       │   log   │  │
 │  │    tool    │      │   tool    │       │         │  │
 │  └────────────┘      └───────────┘       └─────────┘  │
 └───────────────────────────────────────────────────────┘
                        adapt "source"

The annotation pass

As mentioned, the goal of this pass is to add type annotations to functions which are only ever used with float array values but which have a definition polymorphic on 'a array. By constraining their signature to float array, the effectiveness of the following rewriting pass is increased considerably.

For example, this pass will transform the following polymorphic function (assumed to only ever be used with x of type float array),

let get_re x i = x.(2 * i)

into

let get_re (x : float array) i = x.(2 * i)

The annotation logic also works on the return type: if the function

let create x = Array.make 10 x

is ever only used in contexts where x is of type float, then the this pass will annotate the function as

let create x : float array = Array.make 10 x

Both this and the rewriting pass need access to the typing information produced by the compiler. This information is available in the .cmt[i] files produced by the compiler during compilation. These files contain a version of the source AST of every module where each node is annotated with typing information.

Armed with this information, the tool scans the AST looking for function calls the signature of which is polymorphic on 'a array. It considers two cases:

  • If the function is local to the module (ie not exposed in the interface), all callsites in the same module can be found and the tool can decide whether the function is only used at float array or not. If this is the case then it syntactically modifies the source code to add a type annotation constraining the signature of the function.

    If the function is used both at float array and at some other type 'a array, an entry is made in the log for manual examination.

  • If the function is exposed in the module interface, then it is not possible to know if it is used at some other type other than float array without doing a global analysis. To keep things simple, in this case we simply logged the function for manual examination.

An interesting aspect of this transformation is that by “monomorphising” functions which were polymorphic on 'a array, this pass is effectively optimizing the code relative to the original version, independently of the migration to floatarray. Indeed, as we mentioned in the background section, by removing the polymorphism, the compiler would then be able to optimize away the representation check that is required for polymorphic functions.

The rewriting pass

For the rewriting pass, the tool needed to identify expressions of type float array in the code. Again it made use of the “typed” AST stored in .cmt[i] files to do so.

This pass performs the following syntactic rewritings:

  • For calls to functions such as Array.get, Array.length, Array.map, etc the type of the argument is examined. If it is float array, then the function call is rewritten into Float.Array.get, Float.Array.length, Float.Array.map, etc.

    For example, this pass will rewrite the function get_re from the previous section to

    let get_re x = Float.Array.get x (2 * i)
    

    (The tool has some logic to remove the type annotations it inserts in the annotation pass during rewriting, in order to keep the code clean.)

  • For functions such as Array.make or Array.init, the type of the result is examined. If it is found to be float array then the function is rewritten into Float.Array.make, Float.Array.init, etc.

    For example, this pass will rewrite the function create from the previous section to

    let create x = Float.Array.make 10 x
    

    Sometimes the corresponding function in the Float.Array module did not exist. For example, Array.iter2 applied to a float array in the first argument and a (non-float) 'a array in the second argument. In this case, the missing function was written, and the tool adapted to use it during rewriting. After the migration was complete, a PR was submitted upstream to add the missing functions to the standard library.

  • Instances of float array appearing in type signatures and type definitions are changed to floatarray. Note that this includes the signature of externals. Normally this should not cause problems if the C bindings are written using the usual Double_field and Store_double_field macros (see here for more on this).

  • float array literals are rewritten. As there is no syntax for floatarray literals, we added functions

    val make1: float -> floatarray
    val make2: float -> float -> floatarray
    val make3: float -> float -> float -> floatarray
    val make4: float -> float -> float -> float -> floatarray
    

    that the tool uses to rewrite small literals. For maximum efficiency, these functions are defined as follows:

    type float2 = {mutable x1: float; mutable x2: float}
    let make2 x1 x2 : floatarray = Obj.magic {x1; x2}
    

    taking advantange of the fact that floatarray values share the same representation at runtime as values of record types where all the fields have type float. For larger literals, the tool rewrites it into a call to Float.Array.map_from_array Fun.id applied to the float array literal.

Note that previous annotation pass increases the chance that an expression may be given the type float array, thereby increasing the chances that it could be handled by this pass.

Dealing with polymorphism

The previous passes still left the case of really polymorphic functions (ie functions that could not be made monomorphic by adding type annotations) unhandled. This case was treated manually. We did so in two different ways.

Duplicating code

For small library functions, we simply duplicated the code. For example, the function:

let map_sum f xs =
  let len = Array.length xs in
  let sum = ref 0.0 in
  for i = 0 to len - 1 do
    sum := !sum +. f (Array.unsafe_get xs i);
  done;
  !sum

would be duplicated to a specific function acting on float array:

let map_sum_float f xs =
  let len = Array.length xs in
  let sum = ref 0.0 in
  for i = 0 to len - 1 do
    sum := !sum +. f (Array.unsafe_get xs i);
  done;
  !sum

and all places where map_sum was called with arguments of type float array would be changed to calls to map_sum_float. Note that map_sum_float is identical to map_sum, which made the change easy to review. Once this no-op change was done, the annotation/rewriting passes of the tool would detect that map_sum_float could be made monomorphic and could then proceed to rewrite the function to its final form:

let map_sum_float f xs =
  let len = Float.Array.length xs in
  let sum = ref 0.0 in
  for i = 0 to len - 1 do
    sum := !sum +. f (Float.Array.unsafe_get xs i);
  done;
  !sum

Note that it was important that map_sum_float act on float array and not floatarray so that the “source” repo continued to be in a compilable state. As already mentioned, this was necessary in order to run the tool, since it depends on typing information produced during compilation.

Ad-hoc polymorphism

For more complicated functions, or functions polymorphic on complex types 'a t that included in their structure 'a array, we opted to maintain the polymorphism. To do so, we utilized a GADT type to express polymorphism over both 'a array and floatarray, as follows:

type (_, _) array_kind =
  | Array : ('a, 'a array) array_kind
  | Floatarray : (float, floatarray) array_kind

This allowed to define array operations which could act on both 'a array and floatarray, eg:

let get : type a arr. (a, arr) kind -> arr -> int -> a =
  fun kind arr n ->
  match kind with
  | Array -> Array.get arr n
  | Floatarray -> Float.Array.get arr n

We would then simply thread an extra argument of type (a, arr) kind through the polymorphic function in question allowing the same code to be used both for 'a array and for floatarray.

Odds and ends: floatarray pattern matching

There is no syntax to pattern match on floatarray values, so instances of pattern matching on values of type float array were flagged by the automatic tool. As an example, consider the following function found in our codebase:

let add_float xs_ ys_ =
  match xs_, ys_ with
  | [| x |], ys
  | ys, [| x |] ->
      let zs = Array.create_float (Array.length ys) in
      A.shift ys x ~dst:zs;
      zs
  | xs, ys ->
    map2_add xs ys

We systematically rewrote this by a pattern match on the length of the arrays:

let swap_args xs ys =
  if Array.length ys = 1 then ys, xs else xs, ys

let add_float xs_ ys_ =
  let xs, ys = swap_args xs_ ys_ in
  match Array.length xs with
  | 1 ->
      let x = A.unsafe_get xs 0 in
      let zs = Array.create_float (Array.length ys) in
      A.shift ys x ~dst:zs;
      zs
  | _ ->
      map2_add xs_ ys_

Odds and ends: low-level uses of Obj

For low-level code that uses Obj, one needs to be careful when disabling the “flat float array” mode. Indeed, Obj.field and Obj.set_field do the right thing for both flat and normal blocks when “flat float array” mode is enabled.

However, when the mode is disabled, these two functions assume that the argument is a normal block, and will segfault if passed a flat block. In order to get or set the field of a “flat” block, one must use Obj.double_field and Obj.set_double_field.

Another instance of this problem is the idiom (Obj.magic x : float array), used to access the elements of x, a block representing a flat block. After disabling the “flat float array” mode, this will lead to a segfault. Instead, the code should be changed to (Obj.magic x : floatarray).

See https://github.com/ocsigen/js_of_ocaml/pull/1083 for an example of such a bug in the js_of_ocaml compiler, discovered during our migration.

Odds and ends: C bindings

There are three sets of macros to read and write to arrays of floating-point numbers that can be used in C bindings:

  • Double_flat_field and Store_double_flat_field. These macros are to be used for floatarray values only.

  • Double_array_field and Store_double_array_field. These macros are to be used for float array values only. They will work correctly regardless of whether the “flat float array” mode is enabled or not.

  • Double_field and Store_double_field. These macros are for backwards compatibility and will work correctly on both floatarray and float array values, regardless of whether the “flat float array” mode is enabled or not. However, since they perform check at runtime to branch according to the representation of the array, they are a bit less efficient that the other macros.

In pratice, LexiFi’s code did not use any of these macros. Instead, array values are casted on the C side to (double *) and accessed directly. This amounts to assuming a “flat” representation for these values.

This will work correctly with float array values if the “flat float array” mode is enabled, and with floatarray values regardless of whether the mode is anabled or not. However, this kind of code will not work with float array values if the mode is disabled (since the representation is no longer “flat” in that case).

In LexiFi’s codebase, all the C externals were rewritten to use floatarray instead of float array, so this problem did not come up.

Odds and ends: a runtime warning

In order to increase our confidence on the migration tool, a runtime warning was also implemented that would print a backtrace to standard output whenever a float array was allocated by the OCaml runtime.

By running our testsuite and looking for such runtime warnings we were able to to gain confidence that the tool was working well and not missing any instances where float array was being used.

The runtime warning has also been submitted for inclusion upstream https://github.com/ocaml/ocaml/pull/10236.

A few additional remarks

  • In some places of our code, we decided to stick with float array instead of floatarray. We did this in places that were not performance-sensitive and where switching to floatarray would have compexified the code too much. A feature was added to the tool to allow “whitelisting” such places so that they would not be automatically rewritten.

  • Globally, we did not observe a significant efficiency increase, but disabling the “flat float array” mode of course makes some GC tag operations a bit quicker, and the simplification in runtime semantics was more than enough justification for us to consider this project.

  • We disabled the “flat float array” mode shortly after the migration. However, the code obtained by the migration process could have been used with that mode enabled as well. Indeed, the memory layout of floatarray values is the same in both cases. The only difference is that if we had kept the mode enabled we would not have profited from the simplification in runtime semantics.

  • For floatarray values, there is no built-in indexing operators similar to the ones for usual arrays: a.(n) to read an element from an array, and a.(n) <- x to write an element to it.

    The corresponding expressions for floatarray would be Float.Array.get a n and Float.Array.set a n x. As this notation was a bit too heavy, we defined a pair of custom infix operators:

    let (.!()) = Float.Array.get
    let (.!()<-) = Float.Array.set
    

    This makes it possible to write a.!(n), and a.!(n) <- x when dealing with floatarray, which is almost as good as the notation for usual arrays.

Conclusion

We migrated LexiFi’s codebase to use floatarray in place of float array with an iterative workflow and the help of a semi-automatic tool:

              adapt "test"
           ┌──────────────────┐
    ┌──────┴─────┐      ┌─────┴─────┐       ┌─────────┐
    │    run     │      │   run     │       │         │
    │            │      │           │       │ examine │
 ┌──► "annotate" ├──────► "rewrite" ├───────►         ├──┐
 │  │            │      │           │       │   log   │  │
 │  │    tool    │      │   tool    │       │         │  │
 │  └────────────┘      └───────────┘       └─────────┘  │
 └───────────────────────────────────────────────────────┘
                        adapt "source"
  • The annotation pass adds type annotations to those functions which can be guaranteed to be monorphic on float array, thus increasing the applicability of the rewriting pass.

  • The rewriting pass uses type information to mechanically rewrite most instances of uses of float array and functions operating on values of this type to the corresponding code in terms of floatarray.

  • Any unhandled problem is then investigated by the tool operator, who adapts the code in question into a form that can be handled by the tool. The process is then repeated.

Additionally, a runtime warning was implemented to track allocations of float array values. By using it, we could be confident that the tool had not missed any rewriting opportunities.

While the tool was developed with LexiFi’s needs in mind, if you have a big code base to migrate, you may be interested in checking it out, either to use it directly or to draw inspiration from it for your own solution.

If you have any questions about this blog post, you can get in touch at nicolas.ojeda.bar AT lexifi.com.

Thanks for reading!