To start optimising bytecodes we need some tools. Fortunately, a number of useful ones are freely available on the Internet.
Firstly, to convert a Java class file into human-readable text we need a disassembler. The D-Java disassembler, by Shawn Silverman, is available at:
http://www.cat.nyu.edu/meyer/jvm/djava/
Add putchar('\n'); after line 66 of shjasmin.c. This prevents the disassembled code for interfaces which extend other interfaces but do nothing else from having an incomplete last line.
Add
    if ((label_num = getLabel(labels, num_labels, i)))
        printf("Label%u:\n", label_num);
after the line if (opcode == WIDE) (line 113 of shjasmin.c). This ensures that a label is generated even if the opcode at the label is wide. (The wide instruction isn't necessary in the Jasmin source, as the assembler works out for itself whether a wide opcode is needed. Taking out the opcode also removed the label.)
Change 'offset' on line 300 of shjasmin.c to 'default_offset'. Otherwise the wrong label is generated for the default case of lookupswitch.
Secondly, we need an optimiser. I found copt, a generic peephole optimiser by Chris Fraser, at:
ftp://ftp.cs.princeton.edu/pub/packages/lcc/contrib/copt.shar
This is a small part of a much larger package, a retargetable C compiler, described in the book 'A Retargetable C Compiler: Design and Implementation' (Addison-Wesley, 1995, ISBN 0-8053-1670-1). Check out this site for further details.
My own contribution to this enterprise is a set of rules which the generic optimiser uses to rewrite the assembly language source file. These rules are contained in three small text files: rules, rules.s and rules.f. (Actually, rules.f is currently empty because I haven't found any suitable optimisations to put in it.)
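To make the idea of rule-driven rewriting concrete, here is a minimal sketch in Java of how a peephole optimiser applies rules to a list of assembly lines. It is not copt and does not use copt's rule file syntax; the class name PeepholeSketch, the Rule type and the four rules are all hypothetical, chosen only because each rewrite leaves the operand stack unchanged for single-word values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PeepholeSketch {
    // Hypothetical rule: a window of consecutive lines and the lines that replace it.
    static final class Rule {
        final List<String> pattern;
        final List<String> replacement;
        Rule(List<String> pattern, List<String> replacement) {
            this.pattern = pattern;
            this.replacement = replacement;
        }
    }

    // Illustrative rules only; these are assumptions, not the contents of the author's rule files.
    static final List<Rule> RULES = Arrays.asList(
        new Rule(Arrays.asList("dup", "pop"), Collections.<String>emptyList()),
        new Rule(Arrays.asList("swap", "swap"), Collections.<String>emptyList()),
        new Rule(Arrays.asList("bipush 0"), Arrays.asList("iconst_0")),
        new Rule(Arrays.asList("nop"), Collections.<String>emptyList()));

    // Repeatedly rewrites the first matching window until no rule applies.
    static List<String> optimise(List<String> code) {
        List<String> out = new ArrayList<>(code);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int i = 0; i < out.size() && !changed; i++) {
                for (Rule r : RULES) {
                    int end = i + r.pattern.size();
                    if (end <= out.size() && out.subList(i, end).equals(r.pattern)) {
                        out.subList(i, end).clear();    // remove the matched window
                        out.addAll(i, r.replacement);   // splice in the replacement
                        changed = true;                 // rescan from the top
                        break;
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
            "iload_1", "dup", "pop", "bipush 0", "iadd", "swap", "swap", "ireturn");
        System.out.println(optimise(code)); // prints [iload_1, iconst_0, iadd, ireturn]
    }
}
```

Because a label such as Label1: is just another line, a pattern never matches across it, so a jump target in the middle of a window blocks the rewrite. Rescanning after every change lets one rewrite expose another, for example dup, swap, swap, pop collapsing to nothing in two passes.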
There's also a small shell script, jopt, which runs the optimiser and massages its output slightly. jopt takes two arguments, the input and output files, and accepts one of two optional flags:
-s flag: optimise for size, possibly at the expense of performance: smaller, slower.
-f flag: optimise for speed, possibly at the expense of code size: faster, fatter.
Finally, we need an assembler to turn our optimised source file back into a class file. My choice here is the Jasmin assembler, by Jon Meyer. Jasmin is written in Java and is described in the book 'Java Virtual Machine' (O'Reilly, 1996, ISBN 1-56592-194-1). The home page for Jasmin, from which it can be downloaded, is:
http://mrl.nyu.edu/meyer/jvm/jasmin.html
The D-Java disassembler can output source code in the format required as input by Jasmin, so these tools make a good combination.
Also, Jasmin didn't support Unicode escape sequences (\uXXXX) in string literals, so I added this method to Scanner.java:
/* Reads the four hex digits that follow a backslash-u escape and returns their value. */
int readHex() throws java.io.IOException {
    char d[] = new char[4];
    for (int k = 0; k < 4; k++) {
        advance();                  // move on to the next input character
        d[k] = (char) next_char;
    }
    try {
        return Integer.parseInt(new String(d), 16);
    } catch (NumberFormatException nfe) {
        return 0;                   // malformed escape: fall back to character 0
    }
}
The readHex method is called from the switch statement which handles backslashes in quoted strings. I also took the opportunity to add support for escaped backslashes:
    case '\\': next_char = '\\'; break;
    case 'u':
        next_char = readHex();
        break;
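The effect of these two cases can be tried outside Jasmin with a small standalone sketch. The class EscapeSketch and its methods decode and hex4 are hypothetical and work on a String rather than on Jasmin's Scanner input stream; the sketch assumes, like readHex, that a malformed escape should yield character 0. Unlike readHex it checks each digit with Character.digit, so a sign character is not accepted as part of the number.

```java
class EscapeSketch {
    // Returns the value of the four hex digits starting at index 'from', or 0 if malformed.
    static int hex4(String s, int from) {
        if (from + 4 > s.length()) return 0;
        int value = 0;
        for (int k = from; k < from + 4; k++) {
            int digit = Character.digit(s.charAt(k), 16);
            if (digit < 0) return 0;          // not a hex digit
            value = value * 16 + digit;
        }
        return value;
    }

    // Decodes escaped backslashes and backslash-u escapes; other escapes pass through unchanged.
    static String decode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i++);
            if (c != '\\' || i >= s.length()) {
                out.append(c);                // ordinary character or trailing backslash
                continue;
            }
            char escape = s.charAt(i++);
            switch (escape) {
                case '\\':
                    out.append('\\');
                    break;
                case 'u':
                    out.append((char) hex4(s, i));
                    i = Math.min(i + 4, s.length());   // skip the digits just consumed
                    break;
                default:
                    out.append('\\').append(escape);   // left for other handlers
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("a\\u0041b")); // prints aAb
    }
}
```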
Finally, it doesn't do any harm to increase the size of the chars array in the same file. If you intend to have a go at optimising the JDK classes.zip file, you'll need to give it 35,000 elements so that it can handle some of the LocaleElements and ByteToChar classes.