The mzc compiler takes MzScheme (or MrEd) source code and produces
either platform-independent byte-code compiled files (.zo
files) or platform-specific native-code libraries (.so or
.dll files) to be loaded into MzScheme (or MrEd). In the
latter mode, mzc provides limited support for interfacing directly
to C libraries.
mzc works on either individual files or on collections. (A
collection is a group of files that conform to MzScheme's
library collection system; see section 16 in PLT MzScheme: Language Manual). In general, mzc
works best with code using the module form.
As a convenience for programmers writing low-level MzScheme
extensions, mzc can compile and link plain C files that use
MzScheme's escheme.h header. This facility is described in
Inside PLT MzScheme.
Finally, mzc can perform miscellaneous tasks, such as embedding
Scheme code in a copy of the MzScheme (or MrEd) binary to produce a
stand-alone executable, or creating .plt distribution
archives.
A byte-code file typically uses the file extension
.zo. The file starts with #~ followed by
the byte-code data.
Byte-code files are loaded into MzScheme in the same way as regular
Scheme source files (e.g., with load). The
#~ marker causes MzScheme's reader to load byte
codes instead of normal Scheme expressions. When a .zo file
exists in a compiled subdirectory, it is sometimes loaded in
place of a source file; see section 3.3 for details.
Byte-code programs produced by mzc run exactly the same as source
code compiled by MzScheme directly (assuming the same set of bindings
is in place at compile time and load time). In other words,
byte-code compilation does not optimize the code any more than
MzScheme's normal evaluator. However, a byte-code file can be loaded
into MzScheme much faster than a source-code file.
A native-code file is a platform-specific shared library. Under
Windows, native-code files typically use the extension
.dll. Under Unix and MacOS, native-code files typically
use the extension .so.
Native-code files are loaded into MzScheme with the
load-extension procedure (see section 14.7 in PLT MzScheme: Language Manual). When a
native-code file exists in a compiled subdirectory, it is
sometimes loaded in place of a source file; see section 3.3
for details.
The native-code compiler attempts to optimize a source program so that
it runs faster than the source-code or byte-code version of the
program. See section 1.4 for information on obtaining the best
possible performance from mzc-compiled programs.
The cffi.ss library of the compiler collection
defines Scheme forms, such as c-lambda, for accessing C
functions from Scheme. The forms produce run-time errors when
interpreted directly or compiled to byte code. See section 2 for
further information.
Native-code compilation produces C source code in an intermediate
stage; your system must provide an external C compiler to produce
native code. The mzc compiler cannot produce native code directly
from Scheme code.
Under Unix, gcc is used as the C compiler if it can be
found in any of the directories listed in the PATH environment
variable. If gcc is not found, cc is used if it can be
found.
Under Windows, cl.exe (Microsoft Visual C) is used as
the C compiler if it can be found in any of the directories listed in
the PATH environment variable. If cl.exe is not found, then
gcc.exe is used if it can be found. If neither cl.exe
nor gcc.exe is found, then bcc32.exe (Borland) is used
if it can be found.
Under MacOS, Metrowerks CodeWarrior is used as the C compiler
if it can be found.
Except under MacOS, the C compiler and compiler flags used by mzc can
be adjusted via command-line flags.
mzc does not generally produce stand-alone executables from Scheme
source code. The compiler's output is intended to be loaded into
MzScheme (or MrEd or DrScheme). However, see also section 5
for information about embedding code into a copy of the MzScheme (or
MrEd) executable.
mzc does not translate Scheme code into similar C code. Native-code
compilation produces C code that relies on MzScheme to provide
run-time support, which includes memory management, closure creation,
procedure application, and primitive operations.
Under Unix and Windows, run mzc from a shell, passing in flags and
arguments on the command line.
Under MacOS, double-click on the mzc launcher application with the
Command key pressed, then provide arguments in the command line
dialog that appears. (Close the MzScheme application first if
it is already running, since mzc is itself a MzScheme-based
application.) If the Command key is not pressed while mzc is
started, the command-line dialog will not appear. If a file is
dragged onto the mzc icon, then the command line will contain the
file's path; this is useful for compiling a Scheme file directly to
an extension. If a file is dragged onto the mzc icon, additional
command-line arguments can be provided by holding down the Command
key, but the arguments will go after the file name, which is almost
never useful (since the order of command-line arguments is
important).
In this manual, each example command line is shown as follows:
mzc --extension --prefix macros.ss file.ss
To run this example under Unix or Windows, type the command line into
a shell (replacing mzc with the path to mzc on your
system, if necessary). Under MacOS, launch mzc with the Command
key pressed, and enter everything after mzc into the
dialog that appears.
Simple on-line help is available for mzc's command-line
arguments by running mzc with the -h or
--help flag.
Compiling a program to native code with mzc can provide significant
speedups compared to interpreting byte code (or running the program
directly from source code), but only for certain kinds of
programs. The speedup from native-code compilation is typically due
to two optimizations:
Loop Optimization -- When mzc statically detects a
tail-recursive loop, it compiles the Scheme loop to a C loop that has
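no interpreter overhead. For example, given a program in which odd and
even are bound lexically (for example, by letrec) and call each other
only in tail position, mzc can detect the odd-even loop and produce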
native code that runs twice as fast as byte-code interpretation. In
contrast, given a similar program using top-level definitions,
(define (odd x) ...)
(define (even x) ...)
the compiler cannot assume an odd-even loop,
because the global variables odd and even can be
redefined at any time. Note that defined variables in a
module expression are lexically scoped like letrec
variables, and module definitions therefore permit loop
optimizations.1
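The lexically bound case described above can be sketched in Scheme as follows. This is an illustration, not the manual's own listing; the variable name result and the argument 40001 are assumptions chosen for the example.

```scheme
;; Illustrative sketch only (names and argument are assumptions).
;; odd and even are bound by letrec, so no other code can rebind them,
;; and every mutual call is in tail position.
(define result
  (letrec ((odd  (lambda (x) (if (zero? x) #f (even (sub1 x)))))
           (even (lambda (x) (if (zero? x) #t (odd (sub1 x))))))
    (odd 40001)))   ; result is #t, since 40001 is odd
```

Only the bindings of odd and even are lexical here. The references to zero? and sub1 still go through global variables, which matters for the second optimization.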
Primitive Inlining -- When mzc encounters the
application of certain primitives, it inlines the primitive
procedure. However, the compiler must be certain that a variable
reference will resolve to a primitive procedure when the code is
loaded into MzScheme. In the preceding example, the compiler cannot
inline the application of sub1 because the global variable
sub1 might be redefined. To encourage the inlining of
primitives -- which produces native code that runs 30 times
faster than byte-code interpretation for the preceding example -- the
programmer has three options:
Use module -- If the original example is
encapsulated in a module that imports mzscheme, then each
primitive name, such as sub1, is guaranteed to access the
primitive procedure (assuming that the name is not lexically
bound). In the ``modulized'' version of the preceding program, the
code is wrapped in a module named oe; to run the program, the oe
module must be required at the top level.
Use a (require mzscheme) prefix -- If the
preceding example is prefixed with (require mzscheme),
then sub1 refers not to the global variable, but to the
sub1 export of the mzscheme module. See section 3.2
for more information about prefixing compilation.
Use the --prim flag -- The --prim flag
alters the semantics of the language for compilation such that
every reference to a global variable that is built into MzScheme is
converted to its keyword form. Actually, specifying the
--prim flag causes mzc to automatically prefix the program
with (require mzscheme).
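As a sketch of the first option, the odd-even program might be wrapped in a module like this. The body is illustrative, not the manual's original listing: only the module name oe and the mzscheme import come from the text, the argument 40001 is an assumption, and the module and require syntax is assumed to be the one used elsewhere in this section.

```scheme
;; Illustrative sketch only (body and argument are assumptions).
(module oe mzscheme
  ;; Module-level definitions are lexically scoped, so mzc can treat
  ;; odd and even as a loop; zero? and sub1 refer to the mzscheme
  ;; exports rather than to redefinable global variables.
  (define (odd x)
    (if (zero? x) #f (even (sub1 x))))
  (define (even x)
    (if (zero? x) #t (odd (sub1 x))))
  ;; Evaluated when the module is instantiated; displays #t.
  (display (odd 40001))
  (newline))

;; Requiring the module at the top level runs the program.
(require oe)
```

The other two options leave the program body unchanged: the second adds (require mzscheme) as a prefix, and the third has mzc add that prefix when --prim is given.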
Programs that permit these optimizations also encourage a host of
other optimizations, such as procedure inlining (for
programmer-defined procedures) and static closure detection. In
general, module-based programs provide the most opportunities
for optimization.
Native-code compilation rarely produces significant speedup for
programs that are not loop-intensive, programs that are heavily
object-oriented, programs that are allocation-intensive, or programs
that exploit built-in procedures (e.g., list operations, regular
expression matching, or file manipulations) to perform most of the
program's work.
1 The compiler cannot always prove that
module definitions have been evaluated before the
corresponding variable is used in an expression. Use the -v or
--verbose flag to check whether mzc reports a ``last known
module binding'' warning when compiling a module expression,
which indicates that definitions after a particular line in the
source file might be referenced before they are defined.