JMH(Java Microbenchmark Harness),作为Java中一款优秀的微基准测试框架。其可以很方便地进行方法级的基准测试,便于开发者更好的测试、验证方法的性能
依赖 JMH的POM依赖如下所示
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 <dependencies > <dependency > <groupId > org.openjdk.jmh</groupId > <artifactId > jmh-core</artifactId > <version > 1.19</version > </dependency > <dependency > <groupId > org.openjdk.jmh</groupId > <artifactId > jmh-generator-annprocess</artifactId > <version > 1.19</version > </dependency > </dependencies >
实践 事实上对于JMH而言,其还是很容易上手的。下面通过两个例子来帮助大家快速入门上手
Demo1 这里通过一个Demo来入门JMH的基本使用,示例程序如下所示
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 @BenchmarkMode( Mode.AverageTime ) @OutputTimeUnit( TimeUnit.SECONDS ) @Fork(1) @Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS) @Measurement(iterations = 6,time = 2, timeUnit = TimeUnit.SECONDS) public class Demo1 { public static void main (String[] args) throws RunnerException { Options options = new OptionsBuilder () .include( Demo1.class.getName() ) .build(); new Runner (options).run(); } @Benchmark public void helloWorld () throws Exception { Thread.sleep(700 ); } }
其中,涉及到的相关注解说明如下
@BenchmarkMode :指定基准测试的模式。具体地有
AverageTime:方法调用的平均时间
Throughput:吞吐量模式,即每秒调用的次数
SampleTime:采样模式,即方法调用的耗时分布
SingleShotTime:上述模式在一次Iteration的时间内会调用方法。而在该模式中一次Iteration仅会调用一次方法。与此同时也会将Warmup预热次数设为0,用于测试方法冷启动时的耗时
All:其会分别使用上述四种模式进行测试
@Warmup, @Measurement :设置预热、测试的次数及每次的持续时间
@Benchmark :标识这是一个需要测试的方法
测试结果如下所示,helloWorld方法平均耗时703ms,可以看到测试结果与我们在helloWorld中的700ms延时基本吻合
对于Throughput吞吐量模式而言,其测试的是1秒内方法平均调用次数。显然其与AverageTime平均时间模式而言,是如出一辙的。故这里就不进行赘述了。这里重点对于SampleTime采样模式的输出结果进行介绍,示例如下
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 # JMH version: 1.19 # VM version: JDK 1.8 .0 _251, VM 25.251 -b08 ... # Measurement: 6 iterations, 2 s each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Sampling time # Benchmark: com.aaron.JMH.Demo1.helloWorld # Run progress: 0.00 % complete, ETA 00 :00 :17 # Fork: 1 of 1 # Warmup Iteration 1 : 0.701 s/op # Warmup Iteration 2 : 0.704 s/op # Warmup Iteration 3 : 0.703 s/op # Warmup Iteration 4 : 0.703 s/op # Warmup Iteration 5 : 0.705 s/op Iteration 1 : 0.702 ±(99.9 %) 0.022 s/op helloWorld·p0.00 : 0.700 s/op helloWorld·p0.50 : 0.703 s/op helloWorld·p0.90 : 0.703 s/op helloWorld·p0.95 : 0.703 s/op helloWorld·p0.99 : 0.703 s/op helloWorld·p0.999 : 0.703 s/op helloWorld·p0.9999 : 0.703 s/op helloWorld·p1.00 : 0.703 s/op ... Iteration 6 : 0.703 ±(99.9 %) 0.011 s/op helloWorld·p0.00 : 0.703 s/op helloWorld·p0.50 : 0.704 s/op helloWorld·p0.90 : 0.704 s/op helloWorld·p0.95 : 0.704 s/op helloWorld·p0.99 : 0.704 s/op helloWorld·p0.999 : 0.704 s/op helloWorld·p0.9999 : 0.704 s/op helloWorld·p1.00 : 0.704 s/op Result "com.aaron.JMH.Demo1.helloWorld" : N = 18 mean = 0.702 ±(99.9 %) 0.001 s/op Histogram, s/op: [0.700 , 0.700 ) = 0 [0.700 , 0.701 ) = 4 [0.701 , 0.701 ) = 0 [0.701 , 0.701 ) = 0 [0.701 , 0.701 ) = 0 [0.701 , 0.702 ) = 1 [0.702 , 0.702 ) = 0 [0.702 , 0.702 ) = 0 [0.702 , 0.702 ) = 0 [0.702 , 0.703 ) = 0 [0.703 , 0.703 ) = 7 [0.703 , 0.703 ) = 0 [0.703 , 0.703 ) = 0 [0.703 , 0.704 ) = 0 [0.704 , 0.704 ) = 6 Percentiles, s/op: p(0.0000 ) = 0.700 s/op p(50.0000 ) = 0.703 s/op p(90.0000 ) = 0.704 s/op p(95.0000 ) = 0.704 s/op p(99.0000 ) = 0.704 s/op p(99.9000 ) = 0.704 s/op p(99.9900 ) = 0.704 s/op p(99.9990 ) = 0.704 s/op p(99.9999 ) = 0.704 s/op p(100.0000 ) = 0.704 s/op # Run complete. Total time: 00 :00 :20 Benchmark Mode Cnt Score Error Units JMH.Demo1.helloWorld sample 18 0.702 ± 0.001 s/op JMH.Demo1.helloWorld:helloWorld·p0.00 sample 0.700 s/op JMH.Demo1.helloWorld:helloWorld·p0.50 sample 0.703 s/op JMH.Demo1.helloWorld:helloWorld·p0.90 sample 0.704 s/op JMH.Demo1.helloWorld:helloWorld·p0.95 sample 0.704 s/op JMH.Demo1.helloWorld:helloWorld·p0.99 sample 0.704 s/op JMH.Demo1.helloWorld:helloWorld·p0.999 sample 0.704 s/op JMH.Demo1.helloWorld:helloWorld·p0.9999 sample 0.704 s/op JMH.Demo1.helloWorld:helloWorld·p1.00 sample 0.704 s/op
Demo2 现在我们通过SingleShotTime模式测试下方法的冷启动性能。示例代码如下所示
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 import org.openjdk.jmh.annotations.Param;@BenchmarkMode( Mode.SingleShotTime ) @OutputTimeUnit( TimeUnit.SECONDS ) @Fork(1) @Measurement(iterations = 1) @State(Scope.Thread) public class Demo2 { public static void main (String[] args) throws RunnerException { Options options = new OptionsBuilder () .include( Demo2.class.getName() ) .build(); new Runner (options).run(); } @Param({"2","7"}) private Integer num; @Setup public void init () { System.out.println("----------- <init: " + num +" > -----------" ); } @TearDown public void shutdown () { System.out.println("----------- <shutdown: " + num +" > -----------" ); } @Benchmark public void test1 () throws Exception { System.out.println("<test1> : " + num); } @Benchmark public void test2 () throws Exception { System.out.println("<test2> : " + num); } }
这里我们通过@Param注解实现测试多个不同的参数条件。测试结果的部分内容如下所示
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 # Run progress: 0.00 % complete, ETA 00 :00 :00 # Fork: 1 of 1 Iteration 1 : ----------- <init: 2 > ----------- <test1> : 2 ----------- <shutdown: 2 > ----------- ≈ 10 ⁻⁴ s/op ... # Run progress: 25.00 % complete, ETA 00 :00 :02 # Fork: 1 of 1 Iteration 1 : ----------- <init: 7 > ----------- <test1> : 7 ----------- <shutdown: 7 > ----------- ≈ 10 ⁻⁴ s/op ... # Run progress: 50.00 % complete, ETA 00 :00 :01 # Fork: 1 of 1 Iteration 1 : ----------- <init: 2 > ----------- <test2> : 2 ----------- <shutdown: 2 > ----------- ≈ 10 ⁻⁴ s/op ... # Run progress: 75.00 % complete, ETA 00 :00 :00 # Fork: 1 of 1 Iteration 1 : ----------- <init: 7 > ----------- <test2> : 7 ----------- <shutdown: 7 > ----------- ≈ 10 ⁻⁴ s/op ... # Run complete. Total time: 00 :00 :02 Benchmark (num) Mode Cnt Score Error Units JMH.Demo2.test1 2 ss ≈ 10 ⁻⁴ s/op JMH.Demo2.test1 7 ss ≈ 10 ⁻⁴ s/op JMH.Demo2.test2 2 ss ≈ 10 ⁻⁴ s/op JMH.Demo2.test2 7 ss ≈ 10 ⁻⁴ s/op
Note DCE 死码消除 DCE(Dead code elimination)死码消除,作为编译器中一项常见的优化技术。当我们进行基准测试时,稍有不慎即有可能影响我们的测试结果。下面是一个示例程序,用于测试Math.log方法的性能
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 @BenchmarkMode( Mode.AverageTime ) @OutputTimeUnit( TimeUnit.NANOSECONDS ) @Fork(1) @Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS) @Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS) @State(Scope.Thread) public class DCETest { public static void main (String[] args) throws RunnerException { Options options = new OptionsBuilder () .include( DCETest.class.getName() ) .build(); new Runner (options).run(); } private double x = Math.PI; @Benchmark public void base () { } @Benchmark public void test1 () { Math.log(x); } @Benchmark public double test2 () { return Math.log(x); } @Benchmark public void test3 (Blackhole blackhole) { blackhole.consume( Math.log(x) ); } }
可以看到,我们使用了三种形式(test1、test2、test3)来进行测试,测试结果如下所示
可以发现,当我们使用test1的形式进行测试,发现其测试结果明显偏低,几乎等同于一个空方法的耗时。原因就在于我们未显式地使用该计算结果,被检测为死码优化掉了。为了避免该问题一方面我们可以通过return语句来显式的使用,另一方面JMH提供了一个blackhole.consume方法以供对返回结果进行消费来避免发生DCE
常量折叠 所谓常量折叠值得的是,编译器会在编译期间就完成常量间的计算并将结果存储下来,而不再保留常量的计算表达式。对于Java而言,使用final修饰的变量即会在编译期被特殊处理
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 @BenchmarkMode( Mode.AverageTime ) @OutputTimeUnit( TimeUnit.NANOSECONDS ) @Fork(1) @Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS) @Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS) @State(Scope.Thread) public class ConstantTest { public static void main (String[] args) throws RunnerException { Options options = new OptionsBuilder () .include( ConstantTest.class.getName() ) .build(); new Runner (options).run(); } private double x = Math.PI; private final double x_final = Math.PI; @Benchmark public double base () { return Math.PI; } @Benchmark public double test1 () { return Math.log( Math.PI ); } @Benchmark public double test2 () { return Math.log( x_final ); } @Benchmark public double test3 () { return Math.log( x ); } }
测试结果如下所示,可以看到只有test3的测试结果才是可靠的