The MLOPE system: a Standard ML Offline Partial Evaluator
Partial Evaluation
Partial evaluation is a form of Program Specialization based on aggressive constant propagation and
function unfolding
Both the terms "specialization" and "partial evaluation" are freely used to denote each other,
but it should be stressed that the
latter is usually considered an instance of the former. There exist many other specialization techniques, although
partial evaluation, is the probably the most popular one since it is fairly efficient, controlable and yields
satisfactory results.
The idea behind partial evaluation is to provide a program with only parts of its input (usually called the static input) and transform it into
a new specialized program (the so-called residual program) which has as much of its static input been computed away as possible. This residual program
can consecutively be given the rest of the original input (the dynamic input in specialization terminology) and will return the same
output as when the original program would have been executed with both inputs at the same time.
The following equations show the principle:
[P] (in1, in2) = out
[mix] (P, in1) = P-res
[P-res] in2 = out
P is the original program, in1 the static input, in2 the dynamic input, P-res the residual program and mix
the program specializer (this name is historical).
These equations together constitute the 1st Futamura Projection after Yoshihito Futamura who was the first to come up with the idea
in the early 70's.
There also exist a 2nd and 3rd Futamura projection, which have to do with self-application of the specializer. More details on that can be
found in the literature.
A crucial observation is that the residual program is likely to run faster than the original program (although
this is in general not guaranteed!) Then, if the dynamic input varies considerably more often than the static input an overall speedup may
be obtained. Since specialization merely removes interpretation layers in a program, it actually tranforms code into data.
An obvious application for program specialization is an interpreter which takes two inputs:
a program and the input of that program. If we specialize the
interpreter with respect to the program, we obtain a compiled program which can be run many times on its input. Indeed, partial evaluation can
be considered a more general form of compilation and interpretation.
Other example applications of partial evaluation can be found in compiler technology, scientific computing, computer graphics (e.g. ray tracing),
operating systems, neural networking, domain-specific language applications (in particular database applications), etc.
A small overview and some references on partial evaluation can be found
here,
although the systems presented on this page are not state of the art anymore and research in this area is making rapid progress.
Online vs Offline
Partial evaluators come in two flavors, the online and offline systems. The difference between these two is the method they use to figure out
which parts of the code can be specialized away and which parts have to be residualised. An online system will just take the static input and
execute the code as much as it can, generating code along the way. An offline system, on the other hand, will first perform a program analysis to
figure out the binding time of every construct in the input program. Such
a binding-time analysis results in an annotated program that only says if an expression is supposed to be static or dynamic. In other words,
the static analysis has separated the binding times.
Afterwards, the specializer will use the actual static values to specialize the statically annotated code away and residualise the dynamic code.
An online system often yields more accurate results since it can exploit the actual static values, whereas an offline system is bound
to make conservative assumptions (and undecidability of the halting problem does the rest).
However, offline systems tend to be much more efficient and allow better control on the finiteness of the
specialization process. This makes them more popular for functional and imperative programming languages. The logic programming community tend
to use online methods. However, partial evaluation for logic programming languages (better known as partial deduction) is more a flavor
of supercompilation which is a different kind of specialization technology altogether.
Standard ML for partial evaluation
Partial evaluation is, like every kind of specialization technique, a program transformation. In order to make it usable in practice, it is critical
that such systems preserves the semantics of the source language. If the behaviour of a piece of program text is not well defined, the partial
evaluator may need to defer its execution to runtime, even though the programmer-intended semantics may be clear.
This implies that a specialization system works only well if it builds on a well-defined source language with a clear semantics.
Another problem is that partial evaluation tries to evaluate a program without having all its input available. This causes problems if the thread of
execution is difficult to estimate which is typically the case for programming languages with implicit state.
For these reasons, functional programming languages have been the main focus of program specialization research. Not only are they
formally defined, but they are, at least in their pure subsets, "state"-free. The latter is a
grateful property for program transformation. Recent partial evaluation research, however, is also concentrating on state aspects of functional
languages. This may eventually bridge the gap to specialization of imperative languages.
Standard ML,
the source language of our system, is a statically typed polymorphic functional programming language with references, exceptions,
modules, algebraic datatypes etc. It had birth at the end of the 70s and was standardised in a formal Definition in 1990. A small revision
took place in 1997 and SML can now be considered a matured general-purpose programming language. Until now, two partial evaluation systems
for SML have been developed: Sml-Mix and Pell-Mell. However, they only address a subset of the language and are not under development anymore.
The MLOPE System
We intend to design and implement an offline partial evaluator for the full Standard ML programming language
using state of the art techniques in partial
evaluation (and inventing a few others along the way ;-) We have two main goals:
-
Develop a tool to study the pragmatics and effectiveness of new partial evaluation techniques based on Hindley-Milner typing. Up to date, no satisfactory solutions have been proposed to specialize polymorphic programs. This is the main focus of this project. We also aim to develop a tool which is easy extensible so that it can be used to
experiment with new techniques.
-
Next to that, we also want to implement a useful, user-friendly and practical tool for researchers and
programmers who would like
to exploit partial evaluation in their work and are prepared to use SML as an underlying language.
The MLOPE System is under development at
the Programming Languages Group of the University of Freiburg with funding from the German Fund for Scientific Research. It is mainly implemented by Simon Helsen and supervised by
Peter Thiemann and Michael Sperber.
The system is built on the front-end of an early version of the
MLton system,
a whole-optimizing compiler for Standard ML.
Project Status
- The current development version has a simply-typed binding-time analysis with a simple HTML-interface for
binding-time debugging purposes
- A Program Generator Generator (PGG or COGEN) based on this binding-time analysis is being implemented - the only failing component at the moment (August 2002) is a memoization mechanism.
- The first working version with a COGEN that supports memoization is to be expected in November or December 2002.
This site is maintained by Simon Helsen.
Last update: 7th of September 2002.