Dune watch mode support for Windows.

Nicolás Ojeda Bär

Between May and August 2022, Uma Kothuri from the US interned with us at LexiFi and worked on adding native Windows support for Dune’s watch mode (dune build -w). The internship was a big success, but unfortunately we ran out of time before the contribution could be polished for submission upstream, which delayed its integration in Dune.

This is now done, and the support has been included in the latest release of Dune. To mark the occasion, we are writing this note to give a semi-technical overview of how the feature is implemented. And, of course, we warmly invite all Windows Dune users to give it a go and report any issues that come up in the official bug tracker.

About Dune’s watch mode

Dune’s watch mode (triggered with dune build -w) causes Dune to continue to run and monitor for changes in the sources after the initial build. When a change is detected, Dune triggers a rebuild. If another change is detected before the rebuild has finished, it is aborted, and a new one is triggered, etc.

Apart from being convenient when iterating on a codebase, this mode of operation can be more efficient, as the initial “startup” cost of launching Dune (scanning the file system, interpreting Dune files, etc.) is paid only once, at the start, and later only the incremental building cost is incurred. Another big upside is that when in this mode, Dune acts as an “RPC server”, allowing it to interface with external tools: this is how the integration with OCaml-LSP for showing live diagnostics works, for example.

Before we go on to the technical portion of this note, let us mention that Dune already had a generic fallback to make watch mode work on Windows by using the fswatch command-line tool. However, this backend had a number of downsides: it was less efficient than having something built into Dune itself, one needed to install an external tool in addition to Dune, the event filtering mechanism implemented by the tool was unreliable at times, etc.

For all these reasons, it was important for Dune to support watch mode natively on Windows (that is, by integrating directly with the operating system API), especially since this support already existed for other common operating systems such as Linux (using the inotify API) and macOS (using the FSEvents API).

Implementing watch mode on Windows

Adding “native” support for watch mode means hooking into the operating system’s own API for file watching, so that Dune can register a set of directories to be monitored and be notified of any changes in them.

On Windows, there are two file-watching APIs:

  • FindFirstChangeNotification

  • ReadDirectoryChangesW

The first API, FindFirstChangeNotification, does not provide information about which change caused the notification (i.e. whether the file was added, removed, modified, etc.). Since this information is very much needed by Dune, we quickly focused on ReadDirectoryChangesW instead. Having settled on a choice of API, we had to figure out how to receive notifications from the operating system. On this point, there is a panoply of choices: various flavors of synchronous, as well as asynchronous, mechanisms.

Below we will explain our final design, but for those that would like to understand the possible choices better, we recommend reading the blog post Understanding ReadDirectoryChangesW, by Jim Beveridge, a veritable treasure trove of information about this API (even if in the end we did not use any of the approaches discussed in that article!).

A quick search with the excellent grep.app revealed two existing bindings of this API in OCaml:

  • one in Unison

  • one in Flow

However, each project makes some specific technical choices: Unison uses Asynchronous Procedure Calls, while Flow uses the API synchronously by spinning one native thread per watched directory. As we didn’t have a good picture of Dune’s requirements at the outset of the project, we preferred to start afresh instead of basing the work on these existing implementations. That said, they were an invaluable source of inspiration.

Our final design uses ReadDirectoryChangesW in combination with I/O completion ports. I/O completion ports are Windows' native high-performance mechanism for asynchronous I/O.

I/O completion ports can be used in many of the same contexts where one would use select, poll or epoll under Linux. The main difference in usage is that after registering a request for an I/O operation, I/O completion ports notify the program on completion of the operation, while the Linux APIs notify the program when the system is ready to perform the operation. In that sense, I/O completion ports are closer in spirit to the more modern io_uring API.

In any case, for us the main attraction of I/O completion ports was that they are highly performant and they have a very simple API (if you have never encountered them before, check out this short and sweet tutorial).

The OCaml library fswatch_win

To interface Dune with ReadDirectoryChangesW we added a small library to Dune called fswatch_win. Below is the main part of the interface, with comments:

type t
(* A value of type `t` represents a "file watcher". Each file watcher can watch
   an arbitrary number of directories, and multiple file watchers may be used
   simultaneously (this possibility is not used by Dune). *)

val create : unit -> t
(** Create a file watcher. *)

val add : t -> string -> unit
(** Add a new directory to the "watched set" of the given file watcher. *)

val wait : t -> sleep:int -> Event.t list
(** Wait for notifications from the given file watcher. The [sleep] argument is
    the number of milliseconds to wait before returning when a notification is
    received, to reduce the number of times Dune is woken up when there are many
    notifications that arrive close together. *)

val shutdown : t -> unit
(** Shut down the file watcher and release all used resources. *)

The wait function itself is blocking; Dune runs it in a separate thread. The ~sleep parameter helps “batch” many notifications in one go, so that Dune does not repeatedly restart a build when many notifications arrive very close together. Currently it is set to 500 ms in Dune. The downside is that after making a change to a file, there is a small lag before Dune reacts.

The C implementation

As we will explain further in the next section, most of the work is done in a native Windows thread created by fswatch_win that runs completely independently of the OCaml runtime. This thread takes care, in particular, of adding new directories to the watched set and waiting for notifications from the operating system. When the notifications arrive they are stored, waiting to be retrieved by the OCaml thread.

The sequence of operations that need to happen to start watching a new directory for changes using ReadDirectoryChangesW and I/O completion ports is as follows:

  • Create a HANDLE pointing to the directory that you want to watch using CreateFileW.

  • Register the handle with the I/O completion port using CreateIoCompletionPort. As a side-effect this tells the operating system that future asynchronous I/O operations on this handle should be reported through the I/O completion port.

  • Request to be notified of any changes in the directory by passing the handle to ReadDirectoryChangesW. As mentioned in the previous point, this call will return immediately, and the actual notifications will arrive through the I/O completion port.
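The three steps above can be sketched in C roughly as follows. This is a minimal illustration and not the code used in Dune: the struct watch type, the helper names, the buffer size, the notification filter and the choice to watch the whole subtree are all assumptions made for the example. The Windows calls are guarded by _WIN32, so on other systems the entry point is only a stub that returns -1.

```c
#ifdef _WIN32
#include <windows.h>

/* Hypothetical per-directory state. ReadDirectoryChangesW requires a
   DWORD-aligned buffer, hence the DWORD array. */
struct watch {
  HANDLE dir;
  OVERLAPPED overlapped;
  DWORD buffer[4096];
};

/* Step 3: ask to be told about changes. The call returns immediately;
   the result is delivered later through the completion port. */
static BOOL request_changes(struct watch *w) {
  ZeroMemory(&w->overlapped, sizeof w->overlapped);
  return ReadDirectoryChangesW(
      w->dir, w->buffer, (DWORD)sizeof w->buffer,
      TRUE, /* also watch subdirectories (an assumption of this sketch) */
      FILE_NOTIFY_CHANGE_FILE_NAME | FILE_NOTIFY_CHANGE_DIR_NAME |
          FILE_NOTIFY_CHANGE_LAST_WRITE,
      NULL, &w->overlapped, NULL);
}

/* Returns 0 on success, -1 on failure. */
static int start_watching(HANDLE port, struct watch *w, LPCWSTR path) {
  /* Step 1: FILE_FLAG_BACKUP_SEMANTICS is needed to open a directory,
     FILE_FLAG_OVERLAPPED makes I/O on the handle asynchronous. */
  w->dir = CreateFileW(path, FILE_LIST_DIRECTORY,
                       FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                       NULL, OPEN_EXISTING,
                       FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED, NULL);
  if (w->dir == INVALID_HANDLE_VALUE) return -1;

  /* Step 2: attach the handle to the port. The completion key (here a
     pointer to our state) comes back with every notification. */
  if (CreateIoCompletionPort(w->dir, port, (ULONG_PTR)w, 0) == NULL ||
      !request_changes(w)) {
    CloseHandle(w->dir);
    return -1;
  }
  return 0;
}
#endif

/* Demo entry point: starts watching the current directory. Handles are
   deliberately left open, as a real watcher would keep them alive. */
int watch_current_directory(void) {
#ifdef _WIN32
  static struct watch w;
  HANDLE port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);
  if (port == NULL) return -1;
  return start_watching(port, &w, L".");
#else
  return -1; /* not available outside Windows */
#endif
}
```

Using a pointer to the per-directory state as the completion key is one common way to find out which directory a notification belongs to; other schemes, such as an index into a table, work just as well.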

The rest of the time, the native Windows thread repeatedly runs a “notification loop” that works as follows:

  • Wait for notifications to arrive on the I/O completion port by using GetQueuedCompletionStatus.

  • When notifications arrive, they are stored in a list waiting to be retrieved by the OCaml thread.

  • The request to receive notifications on the directory is resubmitted using ReadDirectoryChangesW in order to receive further notifications.
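When a notification for a directory is dequeued, the buffer that was passed to ReadDirectoryChangesW contains a chain of variable-length records (FILE_NOTIFY_INFORMATION in the Windows headers), each giving the offset of the next record, the kind of change and the file name. The sketch below walks such a buffer. To keep it compilable on any system it mirrors the documented record header with fixed-width types instead of including windows.h; the names notify_header and decode_notifications are hypothetical and not taken from Dune.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirror of the fixed-size header of FILE_NOTIFY_INFORMATION. The file
   name (UTF-16, not NUL-terminated) follows immediately after it. */
struct notify_header {
  uint32_t next_entry_offset; /* bytes to the next record, 0 for the last */
  uint32_t action;            /* 1 = added, 2 = removed, 3 = modified, ... */
  uint32_t file_name_length;  /* length of the name in bytes */
};

/* Walk the records in buf[0..len) and copy the action of each record
   into actions[] (at most max entries). Returns the number of records
   seen. A length of 0 means the system buffer overflowed and the
   individual changes were lost, so nothing can be decoded. */
size_t decode_notifications(const unsigned char *buf, size_t len,
                            uint32_t *actions, size_t max) {
  size_t count = 0, off = 0;
  while (off + sizeof(struct notify_header) <= len) {
    struct notify_header h;
    memcpy(&h, buf + off, sizeof h); /* avoids alignment assumptions */
    if (off + sizeof h + h.file_name_length > len) break; /* truncated */
    if (count < max) actions[count] = h.action;
    count++;
    if (h.next_entry_offset == 0) break; /* last record in the chain */
    off += h.next_entry_offset;
  }
  return count;
}
```

In a real notification loop, each decoded record would be appended to the list of pending events before the watch is re-armed with a new call to ReadDirectoryChangesW.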

The GetQueuedCompletionStatus API is actually flexible enough to be used to receive messages from other threads in addition to the file change notifications from the operating system, and we use it in this way, to receive messages from the OCaml thread, as we explain in the next section.

To see the gory details, you can read the implementation: it is self-contained, not very long and, we hope, readable.

Communicating between OCaml and C

As mentioned in the previous section, most of the work takes place in a separate native Windows thread that is completely independent of the OCaml runtime. The main advantage of this approach is simplicity: no need to acquire and release the OCaml runtime lock, register and unregister roots with the garbage collector for values that are used from C, etc.

The main downside is that one needs to implement a communication mechanism between the OCaml thread and the native Windows thread. But since we were already using GetQueuedCompletionStatus to receive notifications from the operating system about file system changes, and this same API could be used to receive messages from other threads, in our case, the “cost” of this design was substantially reduced.

Concretely, when calling the OCaml functions Fswatch_win.add, Fswatch_win.wait, etc., the OCaml thread does not actually do any work: it just sends a corresponding message to the native Windows thread using PostQueuedCompletionStatus. Upon receipt of the message, the Windows thread performs the actual work: adding a new directory to the “watched set” or returning the list of notifications that have been accumulated to the OCaml thread.
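As a rough illustration of this message passing, the sketch below posts a message to a completion port and reads it back with GetQueuedCompletionStatus. The message codes and function names are hypothetical, and encoding the message in the completion key with a NULL OVERLAPPED pointer is only one possible convention, not necessarily the one used by fswatch_win. For brevity both ends run in the same thread. Outside Windows the function is a stub that returns -1.

```c
#ifdef _WIN32
#include <windows.h>

/* Hypothetical message codes, carried in the completion key. */
enum message { MSG_ADD = 1, MSG_SHUTDOWN = 2 };

/* Sender side (the OCaml thread in the text). Posting with a NULL
   OVERLAPPED pointer lets the receiver tell messages apart from real
   I/O completions, which always carry a non-NULL pointer. */
static BOOL send_message(HANDLE port, enum message m) {
  return PostQueuedCompletionStatus(port, 0, (ULONG_PTR)m, NULL);
}
#endif

/* Post one message and dequeue it again. Returns 0 if the message came
   back intact, -1 otherwise (including on non-Windows systems). */
int message_roundtrip(void) {
#ifdef _WIN32
  OVERLAPPED dummy;
  OVERLAPPED *ov = &dummy; /* overwritten with NULL on success */
  DWORD bytes = 0;
  ULONG_PTR key = 0;
  int result = -1;
  HANDLE port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);
  if (port == NULL) return -1;

  /* Receiver side (the native thread in the text). A timeout of 0 is
     enough here because the message is already queued. */
  if (send_message(port, MSG_SHUTDOWN) &&
      GetQueuedCompletionStatus(port, &bytes, &key, &ov, 0) &&
      ov == NULL && key == (ULONG_PTR)MSG_SHUTDOWN) {
    result = 0;
  }
  CloseHandle(port);
  return result;
#else
  return -1;
#endif
}
```

In a real watcher the receiving side would pass INFINITE as the timeout and handle both kinds of packets in the same loop.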

Returning the accumulated notifications needs a bit of care, since the data structure that keeps track of received notifications is mutated from two different threads: new notifications are added from the Windows thread, and existing notifications are removed from the OCaml thread. This means that access to this data structure must be synchronized. One usual way of doing this is by using mutexes, but instead we used a form of “lockless” synchronization that we learned from Flow’s bindings mentioned above:

/* Retrieve the list of events from the shared list. */
static struct events* pop_events(struct fsenv* fsenv) {
  struct events* res;

  /* Perform [res = fsenv->events; fsenv->events = NULL] atomically */
  do {
    res = fsenv->events;
  } while (InterlockedCompareExchangePointer(&(fsenv->events), NULL, res) != res);

  return res;
}

The idea behind this code is to perform res = fsenv->events followed by fsenv->events = NULL atomically: it repeatedly sets res = fsenv->events and then sets fsenv->events = NULL, but only if the value of fsenv->events has not changed in between. Tricky code, but rather pleasant when it works!
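For readers without a Windows toolchain at hand, the same compare-and-swap pattern can be tried out with portable C11 atomics. The sketch below is an assumption-laden illustration rather than Dune's code: the struct event and struct queue types are made up, and the producer side (push_event) is not shown in this post, so its shape here is a guess at how such a list could be filled.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical event list node and shared list head. */
struct event {
  int id;
  struct event *next;
};

struct queue {
  _Atomic(struct event *) head;
};

/* Producer: prepend a new event. Returns 0 on success, -1 if out of
   memory. When the compare-exchange fails, e->next is refreshed with
   the current head and the loop simply tries again. */
int push_event(struct queue *q, int id) {
  struct event *e = malloc(sizeof *e);
  if (e == NULL) return -1;
  e->id = id;
  e->next = atomic_load(&q->head);
  while (!atomic_compare_exchange_weak(&q->head, &e->next, e)) {
  }
  return 0;
}

/* Consumer: take the whole list and leave NULL behind, i.e. perform
   [res = q->head; q->head = NULL] atomically. */
struct event *pop_events(struct queue *q) {
  struct event *res = atomic_load(&q->head);
  while (!atomic_compare_exchange_weak(&q->head, &res, NULL)) {
  }
  return res;
}
```

Because events are prepended, the list returned by pop_events is ordered from newest to oldest. In C11 the consumer could also be written as a single atomic_exchange(&q->head, NULL); the loop is kept here to mirror the code above.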

Summary

In this post we gave a technical overview of the work of Uma Kothuri during her internship at LexiFi between May and August 2022, which consisted in adding native Windows support for Dune’s watch mode. This support is included in the recently released Dune 3.7.0, and was achieved by leveraging the ReadDirectoryChangesW API in conjunction with I/O completion ports.

We hope you enjoyed this technical note. Do not hesitate to get in touch at nicolas.ojeda.bar AT lexifi.com if you have any questions. Thanks for reading!