Java Supercompiler Version 0.1.x README
Java Supercompiler (JScp) version 0.1.x is a technology preview (alpha version) of a supercompiler for the Java programming language. It is a
global optimizer based on the supercompilation method conceived by
Valentin Turchin. JScp performs source-to-source transformation of (part of) a Java
program.
The current JScp is capable of replacing some methods in a source Java program with their optimized versions. What JScp does now may be
characterized as inlining method bodies to some depth and specializing the code thus obtained. This is a limited form of supercompilation (driving
is implemented to a large extent, while configuration analysis is rather simplified). Future JScp versions will gradually cover more and more
advanced supercompilation techniques.
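To give a feel for what "inlining method bodies to some depth and specializing the code thus obtained" can mean, here is a small hand-written illustration. It is not JScp output: the class PowerExample and all of its methods are hypothetical, and the specialized version was derived manually by unfolding the loop for a known exponent.

```java
public class PowerExample {
    // General method: computes x to the power n by repeated multiplication.
    static int power(int x, int n) {
        int result = 1;
        for (int i = 0; i < n; i++) {
            result *= x;
        }
        return result;
    }

    // Original caller: the exponent is a known constant.
    static int cube(int x) {
        return power(x, 3);
    }

    // Hand-derived, hypothetical result of optimizing cube:
    // the call to power is inlined and the loop is unfolded because n == 3 is known.
    static int cubeSpecialized(int x) {
        return 1 * x * x * x;
    }

    public static void main(String[] args) {
        System.out.println(cube(5));            // prints 125
        System.out.println(cubeSpecialized(5)); // prints 125
    }
}
```

Both methods compute the same value for every argument; the specialized one simply no longer contains the call or the loop.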
JScp is intended for use on Windows NT 4.0 or Windows 2000 operating systems running on Intel hardware. (Windows 95, 98 and Me are not
supported.) Versions running on Unix operating systems may be supplied on request.
A Pentium 200MHz or faster processor and hundreds of megabytes of RAM are required to run the supercompiler. The more the better.
Supercompilation can consume considerable time and memory, depending on the Java program. JScp does not use disk space for temporary data and gradually requests more
and more virtual memory while working. Lack of physical RAM causes disk swapping, which has a severe effect on performance.
The JScp installation occupies about 1 MB of disk space.
An installation of Sun JDK 1.2.x or later is a prerequisite.
bin — subdirectory with JScp executables
  camlrt.dll* — the MosML run-time library used by jscp.exe
  camlrunm.exe* — the MosML run-time used by jscp.exe
  jarithm.bat — a .bat file to start the "arithmetic server" used by jscp.exe
  jarithm.jar — the classes of the "arithmetic server"
  javap-patch.jar — a .jar file with patched classes of javap.exe used by jscp-javap.bat
  jscp.exe — the JScp executable
  jscp-javac.bat — jscp.exe calls javac.exe via this .bat file
  jscp-javap.bat — jscp.exe calls the patched version of javap.exe via this .bat file
  libmregex.so* — a run-time library used by jscp.exe
  libmsocket.so* — a run-time library used by jscp.exe
sample — subdirectory with a test to check JScp installation
  Hello.java — a sample Java program subject to supercompilation
  jscp.bat — a .bat file that checks environment variables settings and calls jscp.exe
  run.bat — a .bat file to compile and run Hello.java (without supercompilation)
  supercompile.bat — a .bat file to supercompile Hello.java
  supercompile-and-run-result.bat — a .bat file to supercompile Hello.java and run the resulting program
README.htm — this file
Note: the 4 files marked with * constitute the MosML run-time
used by jscp.exe. These files have been taken from http://www.dina.kvl.dk/~sestoft/mosml.html.
The executable jscp.exe takes several .java files, whose names are given as command-line arguments or which are loaded on demand, and outputs transformed
.java files to a specified directory. The process is controlled by a supercompilation task defined by command-line options
or by a task file in XML format. The task
tells JScp which methods to supercompile and how. E.g.,
jscp.exe Hello.java -method main -destdir res -invoke10
Here Hello.java is the file name of a source Java program. Option -method main specifies that method main is to be replaced by its
supercompiled version. Option -destdir res specifies the destination directory where the resulting file Hello.java is put. Option -invoke10
specifies that during supercompilation methods are invoked (inlined) recursively no more than 10 times.
When jscp.exe needs to evaluate arithmetic and other operations on known data, it calls a JVM running the so-called arithmetic server, which
must be started in advance. (The rationale behind using a separate JVM is that JScp is written in SML
rather than in Java.) The server is launched once by calling jarithm.bat and then stays resident in memory. If
you use the
supercompiler often, it is convenient to put a shortcut to jarithm.bat in Windows Startup and forget about the server.
During its execution, jscp.exe calls the Java compiler javac.exe and a patched
version of the Java class file disassembler javap.exe to check that the Java program under
supercompilation is correct and to gather information about the .class files it uses. These programs are
invoked by jscp.exe via the .bat files jscp-javac.bat and jscp-javap.bat,
which lie in the same directory as jscp.exe. This directory must be in the system path when
jscp.exe is invoked.
- Install Sun Java 2 SDK 1.2 or later, which can be downloaded from
http://java.sun.com/j2se. Version 1.4.x is recommended.
Set the following environment variable:
set JAVA_HOME=path-to-Java-installation
- Unpack the downloaded .zip file, which contains the JScp executable and the other files listed above, into any directory.
Set the environment variable JSCP_HOME to this directory:
set JSCP_HOME=path-to-JScp-installation
- Include the JScp installation directory in the system path or copy file %JSCP_HOME%\sample\jscp.bat
to a directory which is in your system path already.
- This step is optional. Perform it if you do not want to think about starting the arithmetic server each time before using JScp. Put a shortcut to jarithm.bat in
your Windows Startup directory:
  - Open the directory with the JScp installation.
  - Right-click jarithm.bat in this directory.
  - Choose Create Shortcut from the menu. File jarithm.bat.lnk will be created in this directory.
  - Open the Windows Startup directory: right-click the Start button and select Open in the menu, which opens Start Menu; then open the Programs folder and the Startup folder in it.
  - Move jarithm.bat.lnk from the JScp installation directory to the Startup folder.
Now each time you log in to Windows, the arithmetic server starts without your intervention.
- The .bat file jarithm.bat also checks that the environment variables JAVA_HOME
and JSCP_HOME are set properly. You may additionally want to check the installation
by calling the following files in the %JSCP_HOME%\bin directory without parameters:
jscp.exe
jscp-javac.bat
jscp-javap.bat
Each program will print usage information about itself.
- Go to the directory with the JScp installation and start the arithmetic server by executing jarithm.bat, if it has not been started yet. (Don't
worry if you call it a second time: you will receive the message "Server can't start. Perhaps the one is working
already" and nothing else happens.)
- Go to subdirectory sample that contains Hello.java along with .bat files to run and supercompile the sample:
public class Hello {
    void test() {
        System.out.println("Hello!");
    }
    public static void main(String[] args) {
        new Hello().test();
    }
}
Execute run.bat to check the Java installation. Hello.java will be compiled and run. You will see "Hello!" in a console window
before "Press any key to continue . . .".
- Execute supercompile-and-run-result.bat. It contains the following lines:
call jscp -invoke -m main Hello -d res %*
cd res
call "%JAVA_HOME%\bin\javac" Hello.java
call "%JAVA_HOME%\bin\java" Hello
@pause
In a console window you will see a printout of the JScp options and the result of supercompiling method main, enclosed in comment
lines with timestamps. These lines are output when (1) supercompilation is started, (2) supercompilation proper has finished and post-processing is
started, and (3) the whole of supercompilation is done:
//-------------------------------------- 0 sec - method Hello.main(java.lang.String[])
//-------------------------------------- 0 sec - method Hello.main(java.lang.String[]) postprocessing...
public static void main (final java.lang.String[] args_1)
{
java.lang.System.out.println("Hello!") /*virtual*/;
return;
}
}
//-------------------------------------- 0 sec - JScp version 0.1.99 ---
Here the method invocation test() has been inlined and the instance creation expression new Hello() has been discarded, as if garbage collection
had been performed at supercompilation time.
- You may want to look at the resulting file Hello.java in subdirectory res.
Simplest case
The simplest scenario for using the Java supercompiler is as follows:
- A Java program subject to supercompilation must compile successfully with a Java compiler. (The use of Sun's javac is strongly
recommended.) To check this, call Sun's javac, having set the classpath environment variable if needed:
javac A.java B.java C.java
You may want to check that the original program runs successfully. E.g., if the method main is located in class A and
the Java program requires no arguments, execute the following command:
java A
- Then call JScp with the same arguments and additional options for JScp:
jscp A.java B.java C.java -d res -m test -cons
Here -m, -d and -cons are shorthand for options -method, -destdir and -conservative.
- Go to directory res and compile the resulting file. Indicate the directory where the other .class files lie in the classpath
environment variable or with the -classpath option, e.g.,
cd res
javac -classpath .;.. A.java
Syntax errors may be reported by the Java compiler. Some of these may result from our incomplete support
for Java 1.1, which will be fixed soon. Other messages may complain that access modifiers (private, protected or default)
do not allow access to members of other classes from within supercompiled code. This is a known bug. For now, you have to
edit the source Java files manually and change the access to public for the required members.
- Run the supercompiled version of the Java program, e.g., if method main is located in class A:
java -classpath .;.. A
Compare the results for equivalence.
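The access-modifier problem mentioned in the compile step can be pictured with the following hand-written sketch. The classes Counter and AccessExample are hypothetical and are not taken from JScp; the "inlined" variant is shown only as a comment because it would not compile.

```java
class Counter {
    private int value = 41;   // private: visible only inside Counter

    int next() {              // accessor callable from other classes in the package
        return value + 1;
    }
}

public class AccessExample {
    // Original code: compiles, because only next() is used from outside Counter.
    static int original(Counter c) {
        return c.next();
    }

    // If the body of next() were inlined into this class, the method would read
    //     return c.value + 1;
    // which javac rejects, since value is private in Counter.
    // The manual workaround described above is to declare the field public.

    public static void main(String[] args) {
        System.out.println(original(new Counter())); // prints 42
    }
}
```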
Further experiments
Repeat supercompilation with other options that control supercompilation strategies. We recommend experimenting with the following options for a
start:
- Remove option -conservative. Supercompilation may take more time but the result may be better.
- Use option -all instead of -method identifier to supercompile all methods in the given compilation units.
Such supercompilation may take rather long; for the first run, set the most conservative mode with options -conservative and -invoke0:
jscp A.java B.java C.java -d res -all -cons -i0
Here -i0 is shorthand for -invoke0. Option -invoken, where the number n is written directly after the option name, specifies the inlining depth, that is, the number of
recursive method invocations performed at supercompilation time. -i0 means no method invocations and hence no inlining.
Experiment with different values of n and with no limit, which is set by option -invoke without a parameter. The default is -invoke1.
Exclude "bad" methods (e.g., those that take too long to supercompile) from supercompilation with the -except... options described below.
- As an alternative to listing the names of all .java files that should be known to the supercompiler, use dynamic loading. Specify option
-dynamicLoading or -dl; then, when a class C belonging to a package p1.p2.p3 is needed, file p1\p2\p3\C.java is
sought and loaded if found. Otherwise, the methods of the class are considered unknown and supercompilation continues as it would without dynamic loading.
By default, the file path is relative to the current working directory. Use option -sourcepath d1;d2;d3 to specify other directories in which to
look for .java files.
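The lookup rule for dynamic loading described above can be sketched as follows. This is only an illustration of the rule, not JScp's implementation: the class SourceLookup and its methods are hypothetical names, and details such as nested classes are ignored.

```java
import java.io.File;
import java.util.Optional;

public class SourceLookup {
    // Maps a fully qualified class name such as p1.p2.p3.C to a relative
    // file path such as p1\p2\p3\C.java (with the platform's separator).
    static String relativePath(String qualifiedClassName) {
        return qualifiedClassName.replace('.', File.separatorChar) + ".java";
    }

    // Searches the directories of a source path (separated by File.pathSeparator,
    // which is ';' on Windows) and returns the first existing file, if any.
    static Optional<File> find(String qualifiedClassName, String sourcePath) {
        for (String dir : sourcePath.split(File.pathSeparator)) {
            File candidate = new File(dir, relativePath(qualifiedClassName));
            if (candidate.isFile()) {
                return Optional.of(candidate);
            }
        }
        // Not found: the caller would treat the class as unknown.
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(relativePath("p1.p2.p3.C"));
        // "." stands for the default, the current working directory.
        System.out.println(find("p1.p2.p3.C", "."));
    }
}
```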
Main options to control supercompilation
- The depth of inlining is controlled by options -invoken and -recurn. While -invoken limits
the total number of recursive invocations, option -recurn limits the number of recursive invocations of the same method:
inlining stops at the (n+1)th invocation of any method. Option -recur without a parameter sets no limit.
The default is -recur1.
Options -invoken and -recurn affect supercompilation jointly: inlining is not performed when the limit set by
either of the options is reached. -invoke0 and -recur0 are equivalent (no inlining). Options -invoke -recur set
no inlining limit. Inlining may stop for other reasons as well, e.g., if the .java file containing a required method is not given to JScp.
- The set of methods to be supercompiled may be specified in more detail by the following options:
-all, -allmethods — supercompile all methods in all top-level classes of all compilation units, except those excluded by options -exceptMethod, -exceptClass, -exceptUnit.
-un, -unitn — supercompile the nth compilation unit. Units are counted in the order listed on the command line. The default is -unit1.
-xun, -exceptUnitn — do not supercompile the nth compilation unit.
-c classIdentifier, -class classIdentifier — supercompile the methods of top-level class(es) with the given identifier.
-xc classIdentifier, -exceptClass classIdentifier — do not supercompile classes with the given identifier.
-m methodIdentifier, -method methodIdentifier — supercompile method(s) with the given identifier.
-xm methodIdentifier, -exceptMethod methodIdentifier — do not supercompile methods with the given identifier.
Run jscp.exe without arguments to see the list of all command-line options. Most of
them set supercompilation modes and strategies. The command-line options set default values for the
whole of supercompilation. More precise options at the level of individual classes and methods may be set in a
separate JScp task file in XML format. This will be described elsewhere.
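The notion of inlining depth behind options -invoken and -recurn can be pictured by hand. The following sketch is not JScp output and does not claim to reproduce exactly what a particular option value yields; the class DepthExample and its methods are hypothetical and only show what "inlining the same method once or twice" looks like.

```java
public class DepthExample {
    // A recursive method whose invocations could be inlined to some depth.
    static int fact(int n) {
        return n <= 1 ? 1 : n * fact(n - 1);
    }

    // Original caller.
    static int caller(int n) {
        return fact(n);
    }

    // fact inlined once: the first invocation is replaced by the method body,
    // the nested invocation of the same method is left as an ordinary call.
    static int callerInlinedOnce(int n) {
        return n <= 1 ? 1 : n * fact(n - 1);
    }

    // fact inlined twice: the nested invocation is replaced by the body as well.
    static int callerInlinedTwice(int n) {
        return n <= 1 ? 1 : n * (n - 1 <= 1 ? 1 : (n - 1) * fact(n - 2));
    }

    public static void main(String[] args) {
        System.out.println(caller(5));             // prints 120
        System.out.println(callerInlinedOnce(5));  // prints 120
        System.out.println(callerInlinedTwice(5)); // prints 120
    }
}
```

Each further level of inlining makes the code larger, which is one reason why a depth limit is useful.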
The main limitation of versions 0.1.x is simplified configuration analysis. This means that no new methods are generated by
JScp. Driving has been implemented to a large extent. Our current goal is to complete the development, testing and debugging of the ground level
of supercompilation, which relies mainly on driving, and to apply it to optimizing and specializing those Java programs for which this collection of program
transformation techniques is sufficient. Then we will continue developing the next levels of configuration analysis.
The most unpleasant known bugs in the current version of the Java Supercompiler are as follows:
- JScp composes the resulting Java program by merging supercompiled methods with fragments of the original Java files that are
synthesized from the internal parsed representation. Some constructs of Java 1.1 are currently output incorrectly. This results in syntax errors when
compiling the resulting Java program. The Java subset corresponding to Java 1.0 is output correctly.
- Java code is moved from one class to another by inlining. In the new class, access to the old members from the moved
code may be disallowed by the modifiers private, protected or default, and hence the resulting program contains errors reported
by the Java compiler. For now, the user has to change the modifiers to public manually. In the future, JScp will output Java files with modified access
by itself.
- The try statement is currently implemented "too approximately". More special cases should be considered. This means that
rather little information is now propagated from its body to the statements after try. This noticeably limits the depth of program
specialization when the try statement is used often, which is the case in a lot of real programs.
- Moreover, even the current approximation of the try statement is not always correct, being "not general enough". The main
shortcoming of try that may result in incorrect program transformation is the absence of a check whether the values of local variables
may change inside try in such a way that they have different values on exit from try on an exception than on exit by normal control flow.
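The situation described in the last item can be shown with a small, self-contained Java program. It is a hand-written illustration (the class TryExample is hypothetical, not a JScp test case): the local variable x has different values after the try statement depending on whether the body finished normally or was left on an exception.

```java
public class TryExample {
    static int run(boolean fail) {
        int x = 1;
        try {
            x = 2;
            if (fail) {
                throw new IllegalStateException("left the body early");
            }
            x = 3;
        } catch (IllegalStateException e) {
            // Reached with x == 2: the assignment x = 3 was never executed.
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints 3 (normal control flow)
        System.out.println(run(true));  // prints 2 (exit on an exception)
    }
}
```

A transformation that assumed x == 3 after the try statement in all cases would change the behavior of this program.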
Send questions, comments and bug reports to info@supercompilers.com.
Thank you!