The examples on the previous page related to instance variables. The same techniques can be applied to class variables and local variables. Here's some sample code:
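public class test {
    static int k = 0;

    public void f(int i, int j) {
    }

    public void g() {
        int k1, k2, k3, k4;    // local variable slots 1 to 4 (slot 0 holds this)
        k1 = k2 = k3 = k4 = 0;
        f(test.k, test.k);     // the same class variable loaded twice
        f(k4, k4);             // the same local variable loaded twice
    }
}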
The bytecode generated for the method calls is:
    ; f(test.k, test.k)
    aload_0
    getstatic test/k I
    getstatic test/k I
    invokevirtual test/f(II)V

    ; f(k4, k4)
    aload_0
    iload 4
    iload 4
    invokevirtual test/f(II)V
As before, we can replace the second instance of a variable load with a dup. Notice that I used the local variable k4 rather than one of the others. k1, for example, would result in the one-byte iload_1 instruction being used, so no saving in the number of bytes generated is possible. Indeed, if the iload_1 is replaced by a dup, the execution time increases slightly.
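The rewrite described above can be sketched as a small peephole pass. This is only an illustration and not jopt's actual implementation: the class name DupPeephole, the method names, and the idea of modelling instructions as plain strings are all assumptions made for the example. It replaces the second of two adjacent, identical loads with dup, but only for the multi-byte forms, and it leaves long and double values alone because they would need dup2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not jopt's real code. Instructions are modelled as
// plain strings in the same notation as the bytecode listing above.
final class DupPeephole {

    // True for single-slot loads that carry an explicit operand and so occupy
    // more than one byte: "iload 4", "fload 18", "aload 10", "getstatic test/k I".
    // One-byte forms such as "iload_1" contain no space and never match.
    // getstatic of a long (J) or double (D) is excluded: it would need dup2.
    static boolean isWideSingleSlotLoad(String insn) {
        return insn.startsWith("iload ")
            || insn.startsWith("fload ")
            || insn.startsWith("aload ")
            || (insn.startsWith("getstatic ")
                && !insn.endsWith(" J") && !insn.endsWith(" D"));
    }

    // Replace the second of two adjacent identical loads with "dup".
    static List<String> optimise(List<String> code) {
        List<String> out = new ArrayList<>();
        String previous = null;
        for (String insn : code) {
            if (insn.equals(previous) && isWideSingleSlotLoad(insn)) {
                out.add("dup");
            } else {
                out.add(insn);
            }
            previous = insn; // compare against the original, not the rewrite
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = List.of(
            "aload_0", "iload 4", "iload 4", "invokevirtual test/f(II)V");
        optimise(code).forEach(System.out::println);
    }
}
```

Run on the second call from the listing, the sketch emits aload_0, iload 4, dup and then the invokevirtual. A sequence using iload_1 passes through unchanged.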
As with instance variables there are a number of other cases to consider:
- The variable fetched by the getstatic instruction can be of any of the nine other types described previously, and two of these types (long and double) require the use of the dup2 instruction.
- As well as iload there are also aload, fload, lload and dload, for object references, floats, longs and doubles respectively. Each of these has an efficient one-byte form, which we can't optimise, for the first four local variables. And as before we need to use dup2 for long and double.
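The choice between dup and dup2 in the cases above depends only on whether the value occupies one or two operand-stack slots. A minimal sketch of that decision follows. The class DupChooser and its method names are hypothetical helpers invented for this illustration and are not taken from jopt.

```java
// Hypothetical helper, not taken from jopt.
final class DupChooser {

    // Picks the duplicate instruction for a value with the given JVM type
    // descriptor. long (J) and double (D) occupy two operand-stack slots,
    // so they need dup2. Every other type, including object and array
    // references such as "[J", occupies one slot and uses dup.
    static String dupForDescriptor(String descriptor) {
        return descriptor.equals("J") || descriptor.equals("D") ? "dup2" : "dup";
    }

    // The same decision keyed on the local-variable load opcode.
    static String dupForLoad(String opcode) {
        switch (opcode) {
            case "lload":
            case "dload":
                return "dup2";
            case "iload":
            case "fload":
            case "aload":
                return "dup";
            default:
                throw new IllegalArgumentException("not a load opcode: " + opcode);
        }
    }
}
```

For example, dupForDescriptor("D") and dupForLoad("lload") both return "dup2", while dupForDescriptor("I") and dupForLoad("aload") return "dup".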
Here are the results of some timing tests:
Instruction    Bytes saved    Unoptimised time (ms)    Optimised time (ms)
getstatic I    2              521                      459
getstatic F    2              733                      462
getstatic D    2              573                      796
getstatic J    2              584                      782
iload_1        0              368                      405
iload 4        1              437                      415
fload 18       1              638                      413
aload 10       1              430                      413
lload 24       1              669                      641
dload 22       1              668                      656
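To make the rows easier to compare, the relative change in run time can be recomputed from the figures in the table. The snippet below is just a convenience for the reader. The class name TimingChange and the method percentFaster are made up for this illustration, and the numbers are copied from the table rather than measured again.

```java
// Illustrative only: recomputes relative changes from the timings in the table.
final class TimingChange {

    // Percentage reduction in run time. A negative result means the
    // "optimised" code was slower than the unoptimised code.
    static double percentFaster(int unoptimisedMs, int optimisedMs) {
        return 100.0 * (unoptimisedMs - optimisedMs) / unoptimisedMs;
    }

    public static void main(String[] args) {
        String[] names = {
            "getstatic I", "getstatic F", "getstatic D", "getstatic J", "iload_1",
            "iload 4", "fload 18", "aload 10", "lload 24", "dload 22"};
        // {unoptimised ms, optimised ms}, copied from the table
        int[][] ms = {
            {521, 459}, {733, 462}, {573, 796}, {584, 782}, {368, 405},
            {437, 415}, {638, 413}, {430, 413}, {669, 641}, {668, 656}};
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-12s %+6.1f%%%n",
                names[i], percentFaster(ms[i][0], ms[i][1]));
        }
    }
}
```

On these figures the static float case comes out roughly 37% faster, while the static double case comes out roughly 39% slower.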
I haven't reported the times for static byte, char, short, boolean, object reference or array reference variables, as they're identical to those for integers.
The interesting thing to note about getstatic is that replacing a duplicate fetch of a long or double variable with a dup2 actually reduces performance (from 573 ms to 796 ms for double, and from 584 ms to 782 ms for long). The optimisation rules for these cases have therefore been placed in the rules.s file, so they're only invoked if the -s flag to jopt is used. (The -s flag optimises for code size, possibly at the expense of performance.)
As discussed above, the efficient one-byte form of the iload instruction can't be improved upon by replacing it with a dup. For all types, though, replacing the less efficient two-byte form with a dup (or dup2 for long and double) gives a slight increase in performance.
Also of interest is the fact that the performance improvement for float variables is considerably larger than for other types: fload 18 drops from 638 ms to 413 ms, and getstatic F from 733 ms to 462 ms.