Towards a native toplevel for the OCaml language


📝 Original Info

  • Title: Towards a native toplevel for the OCaml language
  • ArXiv ID: 1110.1029
  • Date: 2011-06-01
  • Authors: Alain Frisch, Jean-Christophe Filliâtre, Jacques Garrigue, Xavier Leroy, David Monniaux, Didier Rémy, François Trahay

📝 Abstract

This paper presents the current state of our work on an interactive toplevel for the OCaml language based on the optimizing native code compiler and runtime. Our native toplevel is up to 100 times faster than the default OCaml toplevel, which is based on the byte code compiler and interpreter. It uses Just-In-Time techniques to compile toplevel phrases to native code at runtime, and currently works with various Unix-like systems running on x86 or x86-64 processors.

📄 Full Content

The OCaml [17,32] system is the main implementation of the Caml language [6], featuring a powerful module system combined with a full-fledged object-oriented layer. It ships with an optimizing native code compiler ocamlopt, for high performance; a byte code compiler ocamlc and interpreter ocamlrun, for increased portability; and an interactive top-level ocaml based on the byte code compiler and runtime, for interactive use of OCaml through a read-eval-print loop.

ocamlc and ocaml translate the source code into a sequence of byte code instructions for the OCaml virtual machine ocamlrun, which is based on the ZINC machine [16] originally developed for Caml Light [18]. The optimizing native code compiler ocamlopt produces fast machine code for the supported targets (at the time of this writing, these are Alpha, ARM, Itanium, Motorola 68k, MIPS, PA-RISC, PowerPC, Sparc, and x86/x86-64), but is currently only applicable to static program compilation. For example, it cannot yet be used with multi-stage programming in MetaOCaml [34,35], or with the interactive toplevel ocaml.
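The split between the tools can be made concrete with a minimal sketch. The file name fib.ml, the helper time, and the choice of a naive Fibonacci function are illustrative assumptions and do not come from the paper; the commands in the header comment are the standard ways to invoke the three tools on a single source file.

```ocaml
(* fib.ml (hypothetical file name): a small compute-bound program.

   Possible ways to run it with the tools named above:
     ocaml fib.ml                             byte code toplevel, as a script
     ocamlc -o fib.byte fib.ml && ./fib.byte  byte code, executed by ocamlrun
     ocamlopt -o fib.opt fib.ml && ./fib.opt  native code
*)

(* Naive Fibonacci, deliberately exponential to give the CPU work. *)
let rec fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)

(* Run [f x] once and return its result with the CPU time used, in seconds. *)
let time f x =
  let start = Sys.time () in
  let result = f x in
  (result, Sys.time () -. start)

let () =
  let n = 30 in
  let result, seconds = time fib n in
  Printf.printf "fib %d = %d (%.3f s CPU time)\n" n result seconds
```

Only the first two invocations go through the byte code path. The third is a static, ahead-of-time compilation, which is the restriction that a native toplevel is meant to lift. Timings from a sketch like this depend on the machine and are not a substitute for the measurements the paper reports.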

This paper presents our work on a new native OCaml toplevel, called ocamlnat, which is based on the native runtime, the compilation engine of the optimizing native code compiler, and an earlier prototype implementation of a native toplevel by Alain Frisch. Our implementation currently supports x86 and x86-64 processors [1,10] and should work with any POSIX-compliant operating system supported by the OCaml native code compiler. It has been verified to work with Mac OS X 10.6 and 10.7, Debian GNU/Linux 6.0 and above, and CentOS 5.6 and 5.7. The full source code is available from the ocamljit-nat branch of the ocaml-experimental Git repository hosted on GitHub at [28].

The paper is organized as follows: Section 2 motivates the need for a usable native OCaml toplevel. Section 3 presents an overview of the OCaml compilers and Section 4 describes the previous ocamlnat prototype which inspired our work, while Section 5 presents our work on ocamlnat. Performance measures are given in Section 6. Sections 7 and 8 conclude with possible directions for further work.

Interactive toplevels are quite popular among dynamic and scripting languages like Perl, Python, Ruby and Shell, but also among functional programming languages like OCaml, Haskell and LISP. In the case of scripting languages, the interactive toplevel is usually the only frontend to the underlying interpreter or Just-In-Time compiler.

In the case of OCaml, the interactive toplevel is only one possible interface to the byte code interpreter; it is also possible to separately compile source files to byte or native code object files, link them into libraries or executables, and deploy these libraries or executables. The OCaml toplevel is therefore mostly used for interactive development, rapid prototyping, teaching and learning, as an interactive program console, or for scripting purposes.

The byte code runtime is the obvious candidate to drive the interactive toplevel, because the platform-independent byte code is very portable and easy to generate compared to native machine code. In fact, the byte code toplevel has served users and developers well over the years. Nevertheless, there are valid reasons to have a native code toplevel instead of, or in addition to, the byte code toplevel:

Performance. This is probably the main reason to want a native code toplevel. While the performance of the byte code interpreter is acceptable in many cases (and can be improved by using one of the available Just-In-Time compilers [27,26,29,33]), it is not always sufficient for the necessary computations. Sometimes one needs the execution speed of the native runtime, which can be up to a hundred times faster than the byte code runtime, as we will show in Section 6.

For example, the Mancoosi project [24] has developed a library for analysing large sets of packages in free software distributions. These analyses run acceptably fast with the native code compiler and runtime, but are too slow in byte code. For interactive analysis (i.e., selecting packages with particular properties, analysing them, ...), a native toplevel is really the only way to go, as it combines the flexibility of toplevel interaction with the speed of native code.

Tools such as ocamlscript [25] try to combine the performance of the code generated by the optimizing native code compiler with the flexibility of a “scripting language interface”. But this is basically just a workaround with several limitations. A native toplevel would address this issue in a much cleaner and simpler way.

Native runtime. There are scenarios where only the native code runtime is available and hence the byte code toplevel, which depends on the byte code runtime, cannot be used. One recent example here is the Mirage cloud operating system [21,22,23], which compiles OCaml programs to Xen microkernels [

Reference

This content is AI-processed based on open access ArXiv data.
