Parsing the Code in Java Bytecode (.class Files)

Java bytecode instructions are packed tightly in the following format (the opcode occupies one byte):

opcode operand*

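Apart from tableswitch and lookupswitch, which contain padding bytes, no instruction has any padding, not even between two consecutive instructions. Each instruction therefore has to be read according to its own definition.

As the analysis of the Java instruction set above shows, a large part of the instruction set takes no operands. For those instructions it is enough to read the one-byte opcode and map it to its mnemonic.

Instructions that do take operands have to be handled by category. (BCEL, the Apache Byte Code Engineering Library, already supports bytecode, so the opcode-to-mnemonic mapping can be done with the lookup arrays provided in com.sun.org.apache.bcel.internal.Constants.)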
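
As a rough illustration of the opcode-to-mnemonic step, the sketch below maps a few operand-free opcodes by hand. The class name OpcodeNames and the method mnemonic are hypothetical and not part of BCEL; a real disassembler would use a complete table.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lookup table covering only a few operand-free opcodes. */
class OpcodeNames {
    private static final Map<Integer, String> NAMES = new HashMap<>();

    static {
        NAMES.put(0x00, "nop");
        NAMES.put(0x01, "aconst_null");
        NAMES.put(0x03, "iconst_0");
        NAMES.put(0x57, "pop");
        NAMES.put(0x59, "dup");
        NAMES.put(0x60, "iadd");
        NAMES.put(0xb1, "return");
    }

    /** Accepts a raw (possibly negative) byte value and returns its mnemonic. */
    static String mnemonic(int opcodeByte) {
        int opcode = opcodeByte & 0xff; // opcodes are unsigned bytes
        return NAMES.getOrDefault(opcode, "<unknown 0x" + Integer.toHexString(opcode) + ">");
    }
}
```

Masking with 0xff matters because Java bytes are signed, while opcodes such as return (0xb1) lie above 0x7f.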

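1. Handle the two special instructions, tableswitch and lookupswitch.

For both instructions, first skip the padding bytes so that the address of defaultbyte1 is a multiple of four bytes from the start of the method's code.

private static void make4ByteAlignment(ByteSequence codes) throws IOException {
    // getIndex() is the offset of the next byte to read, i.e. the byte right after the opcode
    int usedBytes = codes.getIndex() % 4;
    int paddingBytes = (usedBytes == 0) ? 0 : 4 - usedBytes;
    for (int i = 0; i < paddingBytes; i++) {
        codes.readByte(); // discard one padding byte
    }
}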
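
The padding rule can be checked in isolation with a small sketch. The class and method names here are hypothetical; the argument is assumed to be the offset, counted from the start of the method's code, of the byte right after the switch opcode.

```java
/** Hypothetical helper: number of padding bytes before defaultbyte1. */
class SwitchPadding {
    static int paddingBytes(int indexAfterOpcode) {
        // Round up to the next multiple of 4; already aligned means 0 padding bytes.
        return (4 - (indexAfterOpcode % 4)) % 4;
    }
}
```

A tableswitch whose opcode sits at offset 0 therefore carries three padding bytes, while one whose opcode sits at offset 3 carries none.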

For tableswitch, read the default offset, the low value, the high value, and one offset for every entry from low to high. Each offset read is added to the base address of the current instruction (the position of its opcode):

int defaultOffset1 = baseOffset + codes.readInt();
builder.append("\tdefault = #" + defaultOffset1);
int low = codes.readInt();
int high = codes.readInt();
int nPair1 = high - low + 1; // number of jump-table entries
builder.append(", npairs = " + nPair1 + "\n");
for (int i = low; i <= high; i++) {
    int match = i;
    offset = baseOffset + codes.readInt();
    builder.append(String.format("\tcase %d : #%d\n", match, offset));
}
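
To make the layout concrete, here is a self-contained sketch that decodes a tableswitch from a raw byte array. It uses java.nio.ByteBuffer instead of BCEL's ByteSequence, and the class name TableSwitchSketch and its output format are assumptions made only for this example.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-alone decoder for one tableswitch instruction. */
class TableSwitchSketch {
    static String decode(byte[] code, int opcodeIndex) {
        ByteBuffer buf = ByteBuffer.wrap(code); // ByteBuffer is big-endian by default
        int pos = opcodeIndex + 1;              // first byte after the opcode
        pos += (4 - pos % 4) % 4;               // skip 0 to 3 padding bytes
        buf.position(pos);

        int defaultTarget = opcodeIndex + buf.getInt();
        int low = buf.getInt();
        int high = buf.getInt();
        StringBuilder sb = new StringBuilder("default -> " + defaultTarget + "\n");

        // Counting entries avoids an endless loop when high is Integer.MAX_VALUE.
        int count = high - low + 1;
        for (int i = 0; i < count; i++) {
            int target = opcodeIndex + buf.getInt();
            sb.append(low + i).append(" -> ").append(target).append('\n');
        }
        return sb.toString();
    }
}
```

All targets are relative to the address of the tableswitch opcode itself, not to the position of the operand being read.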

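For lookupswitch, read the default offset, the number of match-offset pairs (npairs), and then the npairs pairs themselves. Each offset read is added to the base address of the current instruction:

int defaultOffset2 = baseOffset + codes.readInt();
builder.append("\tdefault = #" + defaultOffset2);
int nPairs2 = codes.readInt();
builder.append(", npairs = " + nPairs2 + "\n");
for (int i = 0; i < nPairs2; i++) {
    int match = codes.readInt();
    // the rest of the loop mirrors the tableswitch case above
    offset = baseOffset + codes.readInt();
    builder.append(String.format("\tcase %d : #%d\n", match, offset));
}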
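
The pair layout of lookupswitch can be sketched the same way on a raw byte array. The class LookupSwitchSketch and the choice to return a map from match value to absolute target are assumptions for illustration, not the article's implementation.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-alone decoder for the match-offset pairs of a lookupswitch. */
class LookupSwitchSketch {
    static Map<Integer, Integer> decodePairs(byte[] code, int opcodeIndex) {
        ByteBuffer buf = ByteBuffer.wrap(code); // big-endian by default
        int pos = opcodeIndex + 1;
        pos += (4 - pos % 4) % 4;               // skip padding up to a 4-byte boundary
        buf.position(pos);

        buf.getInt();                           // default offset, not needed here
        int npairs = buf.getInt();
        Map<Integer, Integer> targets = new LinkedHashMap<>();
        for (int i = 0; i < npairs; i++) {
            int match = buf.getInt();
            int target = opcodeIndex + buf.getInt(); // offsets are relative to the opcode
            targets.put(match, target);
        }
        return targets;
    }
}
```

A LinkedHashMap keeps the pairs in the order they appear in the class file, where they are stored sorted by match value.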

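2. All conditional branch instructions (if<cond>, if_icmp<cond>, ifnull, ifnonnull, if_acmp<cond>) take a two-byte signed offset as their operand. The unconditional branch goto and the subroutine jump jsr also take a two-byte offset.

offset = baseOffset + codes.readShort();
builder.append(String.format("\t\t#%d\n", offset));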
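
Because the two-byte offset is signed, backward jumps come out as negative values. The hypothetical helper below shows the arithmetic on the two raw operand bytes.

```java
/** Hypothetical helper: absolute target of a branch with a 16-bit signed offset. */
class BranchTarget {
    static int target16(int opcodeIndex, int branchbyte1, int branchbyte2) {
        // Combine the bytes big-endian, then narrow to short to sign-extend.
        short offset = (short) (((branchbyte1 & 0xff) << 8) | (branchbyte2 & 0xff));
        return opcodeIndex + offset;
    }
}
```

For a goto at offset 10 with operand bytes 0xff 0xfb, the offset is -5 and the target is offset 5.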

3. The wide-offset branch goto_w and the wide subroutine jump jsr_w take a four-byte offset as their operand.

offset = baseOffset + codes.readInt();
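
The four-byte case differs only in operand width. A minimal sketch, with a hypothetical class name, that assembles the offset by hand from the code array:

```java
/** Hypothetical helper: absolute target of goto_w or jsr_w. */
class WideBranchTarget {
    static int target32(byte[] code, int opcodeIndex) {
        int i = opcodeIndex + 1; // first operand byte
        int offset = ((code[i] & 0xff) << 24)
                   | ((code[i + 1] & 0xff) << 16)
                   | ((code[i + 2] & 0xff) << 8)
                   | (code[i + 3] & 0xff);
        return opcodeIndex + offset;
    }
}
```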

4. For the wide instruction, continue by reading the next instruction, with the wide parameter set to true.

bytecodeToString(codes, pool, verbose, true);

5. Some instructions take a one-byte local variable index as their operand; when modified by wide, the index takes two bytes instead. These instructions are: aload, iload, fload, lload, dload, astore, istore, fstore, lstore, dstore, ret.

if (wide) {
    index = codes.readUnsignedShort();
} else {
    index = codes.readUnsignedByte();
}
builder.append(String.format("\t\t%%%d\n", index));
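
The effect of the wide prefix on a local variable index can be sketched without recursion. The class LocalIndexSketch is hypothetical and assumes pc points either at a plain load/store opcode or at a wide prefix followed by one.

```java
/** Hypothetical helper: local variable index of a load/store/ret, with or without wide. */
class LocalIndexSketch {
    static final int WIDE = 0xc4;

    static int localIndex(byte[] code, int pc) {
        if ((code[pc] & 0xff) == WIDE) {
            // Layout: wide, <opcode>, indexbyte1, indexbyte2
            return ((code[pc + 2] & 0xff) << 8) | (code[pc + 3] & 0xff);
        }
        // Layout: <opcode>, index (unsigned byte)
        return code[pc + 1] & 0xff;
    }
}
```

Without wide only locals 0 to 255 are addressable; the prefixed form reaches up to 65535.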

6. The iinc instruction takes a one-byte local variable index and a one-byte signed constant as operands. When modified by wide, both the index and the constant take two bytes.

if (wide) {
    index = codes.readUnsignedShort();
    constValue = codes.readShort();
} else {
    index = codes.readUnsignedByte();
    constValue = codes.readByte();
}
builder.append(String.format("\t\t%d %d\n", index, constValue));
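
The index of iinc is unsigned while its constant is signed, which is easy to get wrong. The hypothetical sketch below decodes both forms from a raw byte array and returns the pair {index, constant}.

```java
/** Hypothetical helper: decodes iinc and wide iinc into {index, constant}. */
class IincSketch {
    static int[] decode(byte[] code, int pc) {
        if ((code[pc] & 0xff) == 0xc4) {
            // Layout: wide, iinc, indexbyte1, indexbyte2, constbyte1, constbyte2
            int index = ((code[pc + 2] & 0xff) << 8) | (code[pc + 3] & 0xff);
            int constant = (short) (((code[pc + 4] & 0xff) << 8) | (code[pc + 5] & 0xff));
            return new int[] {index, constant};
        }
        // Layout: iinc, index (unsigned), const (signed)
        return new int[] {code[pc + 1] & 0xff, code[pc + 2]};
    }
}
```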

7. The object manipulation instructions take a two-byte constant pool index as their operand, pointing to a CONSTANT_Class_info structure. These instructions are new, checkcast, instanceof, and anewarray.

index = codes.readUnsignedShort();
builder.append("\t\t" + pool.getClassInfo(index).toInstructionString(verbose) + "\n");

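8. All field access instructions take a two-byte constant pool index as their operand, pointing to a CONSTANT_Fieldref_info structure. These instructions are getfield, putfield, getstatic, and putstatic.

index = codes.readUnsignedShort(); // two-byte constant pool index
builder.append("\t\t" + pool.getFieldRefInfo(index).toInstructionString(verbose) + "\n");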

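9. The non-interface method invocation instructions also take a two-byte index as their operand, pointing to a CONSTANT_Methodref_info structure in the constant pool. These instructions are invokespecial, invokevirtual, and invokestatic. (From class file version 52.0 on, invokespecial and invokestatic may also reference a CONSTANT_InterfaceMethodref_info.)

index = codes.readUnsignedShort(); // two-byte constant pool index
builder.append("\t\t" + pool.getMethodRefInfo(index).toInstructionString(verbose) + "\n");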
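
Items 7 to 9 all share the same operand shape: an unsigned two-byte index into the constant pool. The sketch below uses a plain String array as a stand-in for the constant pool; that array, the class name, and the output format are assumptions made only for this example.

```java
/** Hypothetical helper: renders an instruction whose operand is a 2-byte pool index. */
class PoolIndexSketch {
    static int poolIndex(byte[] code, int pc) {
        // indexbyte1 and indexbyte2 form an unsigned big-endian 16-bit value
        return ((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff);
    }

    static String render(String mnemonic, byte[] code, int pc, String[] pool) {
        int index = poolIndex(code, pc);
        return mnemonic + " #" + index + " // " + pool[index];
    }
}
```

Entry 0 of the stand-in array is left unused because constant pool indices start at 1.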

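10. The interface method invocation instruction invokeinterface has four bytes of operands. The first two bytes are a constant pool index pointing to a CONSTANT_InterfaceMethodref_info. The third byte is count, the number of argument slots including the object reference (a long or double counts as two); it is not a byte count. The last byte is always zero.

index = codes.readUnsignedShort(); // two-byte constant pool index
int nargs = codes.readUnsignedByte(); // historical, redundant
builder.append("\t\t" + pool.getInterfaceMethodRefInfo(index).toInstructionString(verbose));
builder.append(" : " + nargs + "\n");
codes.readUnsignedByte(); // reserved, should be zero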
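
A minimal sketch of the four operand bytes of invokeinterface, including the two sanity checks the format allows. The class name and the choice to throw IllegalArgumentException are assumptions for illustration.

```java
/** Hypothetical helper: decodes invokeinterface into {poolIndex, count}. */
class InvokeInterfaceSketch {
    static int[] decode(byte[] code, int pc) {
        int index = ((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff);
        int count = code[pc + 3] & 0xff;    // argument slots, including objectref
        int reserved = code[pc + 4] & 0xff; // must be zero
        if (count == 0 || reserved != 0) {
            throw new IllegalArgumentException("malformed invokeinterface at " + pc);
        }
        return new int[] {index, count};
    }
}
```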

11. The primitive array creation instruction newarray takes a one-byte type code as its operand.

String type = Constants.TYPE_NAMES[codes.readByte()];
builder.append(String.format("\t\t(%s)\n", type));
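
The type codes accepted by newarray run from 4 (boolean) to 11 (long). A small stand-alone mapping, with a hypothetical class name, makes the numbering explicit:

```java
/** Hypothetical helper: maps a newarray type code (atype) to its Java type name. */
class NewArrayTypes {
    // atype values: 4=boolean, 5=char, 6=float, 7=double, 8=byte, 9=short, 10=int, 11=long
    private static final String[] NAMES = {
        "boolean", "char", "float", "double", "byte", "short", "int", "long"
    };

    static String typeName(int atype) {
        if (atype < 4 || atype > 11) {
            throw new IllegalArgumentException("invalid atype: " + atype);
        }
        return NAMES[atype - 4];
    }
}
```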

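12. The multidimensional array creation instruction multianewarray has three bytes of operands. The first two bytes are an index pointing to a CONSTANT_Class_info that gives the array type, and the last byte specifies the number of dimensions.

index = codes.readUnsignedShort(); // two-byte constant pool index
int dimensions = codes.readUnsignedByte();
builder.append(String.format("\t\t%s (%d)\n", pool.getClassInfo(index).getName(), dimensions));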

13. The constant push instruction ldc takes a one-byte index as its operand, pointing to a CONSTANT_Integer_info, CONSTANT_Float_info, CONSTANT_String_info, or CONSTANT_Class_info. It identifies the constant to push (an int value, a float value, a String reference, or a class object reference).

index = codes.readUnsignedByte();
builder.append("\t\t" + pool.getPoolItem(index).toInstructionString(verbose) + "\n");

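14. The wide-index constant push instruction ldc_w takes a two-byte index as its operand, pointing to a CONSTANT_Integer_info, CONSTANT_Float_info, CONSTANT_String_info, or CONSTANT_Class_info. It identifies the constant to push (an int value, a float value, a String reference, or a class object reference).

15. The wide-index constant push instruction ldc2_w takes a two-byte index as its operand, pointing to a CONSTANT_Long_info or CONSTANT_Double_info. It identifies the constant to push (a long value or a double value).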
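
The only decoding difference among ldc, ldc_w, and ldc2_w is the width of the index. A hypothetical sketch that selects the width by opcode:

```java
/** Hypothetical helper: constant pool index of ldc (0x12), ldc_w (0x13), or ldc2_w (0x14). */
class LdcSketch {
    static int poolIndex(byte[] code, int pc) {
        switch (code[pc] & 0xff) {
            case 0x12: // ldc: one unsigned index byte
                return code[pc + 1] & 0xff;
            case 0x13: // ldc_w
            case 0x14: // ldc2_w: both use an unsigned two-byte index
                return ((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff);
            default:
                throw new IllegalArgumentException("not an ldc-family opcode at " + pc);
        }
    }
}
```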

16. The bipush instruction takes a one-byte signed constant as its operand.

byte constByte = codes.readByte();
builder.append("\t" + constByte);

17. The sipush instruction takes a two-byte signed constant as its operand.

short constShort = codes.readShort();
builder.append("\t" + constShort);
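
Both push instructions sign-extend their operand to an int. The hypothetical helper below shows the value that ends up on the operand stack in each case.

```java
/** Hypothetical helper: value pushed by bipush (0x10) or sipush (0x11). */
class PushSketch {
    static int pushedValue(byte[] code, int pc) {
        switch (code[pc] & 0xff) {
            case 0x10: // bipush: a Java byte is already signed, so widening sign-extends
                return code[pc + 1];
            case 0x11: // sipush: combine big-endian, then narrow to short to sign-extend
                return (short) (((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff));
            default:
                throw new IllegalArgumentException("not bipush or sipush at " + pc);
        }
    }
}
```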

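Some of the code above is still unfinished: field and method signatures and descriptors are not parsed yet, and some of the output formatting still needs adjusting. Even so, this is the overall structure; the rest are details that are not discussed here.

See also the org.apache.bcel.classfile.Utility class in the BCEL project.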
