Java source code (.java files) is typically compiled to bytecode (.class files). Bytecode is more compact than Java source code, but it may still contain a lot of unused code, especially if it includes program libraries. Shrinking programs such as ProGuard can analyze bytecode and remove unused classes, fields, and methods. The program remains functionally equivalent, including the information given in exception stack traces.
By default, compiled bytecode still contains a lot of debugging information: source file names, line numbers, field names, method names, argument names, variable names, etc. This information makes it straightforward to decompile the bytecode and reverse-engineer entire programs. Sometimes, this is not desirable. Obfuscators such as ProGuard can remove the debugging information and replace all names by meaningless character sequences, making it much harder to reverse-engineer the code. It further compacts the code as a bonus. The program remains functionally equivalent, except for the class names, method names, and line numbers given in exception stack traces.
When loading class files, the class loader performs some sophisticated verification of the byte code. This analysis makes sure the code can't accidentally or intentionally break out of the sandbox of the virtual machine. Java Micro Edition and Java 6 introduced split verification. This means that the JME preverifier and the Java 6 compiler add preverification information to the class files (StackMap and StackMapTable attributes, respectively), in order to simplify the actual verification step for the class loader. Class files can then be loaded faster and in a more memory-efficient way. ProGuard can perform the preverification step too, for instance allowing to retarget older class files at Java 6.
Apart from removing unused classes, fields, and methods in the shrinking step, ProGuard can also perform optimizations at the bytecode level, inside and across methods. Thanks to techniques like control flow analysis, data flow analysis, partial evaluation, static single assignment, global value numbering, and liveness analysis, ProGuard can:
The positive effects of these optimizations will depend on your code and on the virtual machine on which the code is executed. Simple virtual machines may benefit more than advanced virtual machines with sophisticated JIT compilers. At the very least, your bytecode may become a bit smaller.
Some notable optimizations that aren't supported yet:
Yes, you can. ProGuard itself is distributed under the GPL, but this doesn't affect the programs that you process. Your code remains yours, and its license can remain the same.
Yes, ProGuard supports all JDKs from 1.1 up to and including 8.0. Java 2 introduced some small differences in the class file format. Java 5 added attributes for generics and for annotations. Java 6 introduced optional preverification attributes. Java 7 made preverification obligatory and introduced support for dynamic languages. Java 8 added more attributes and default methods. ProGuard handles all versions correctly.
Yes. ProGuard itself runs in Java Standard Edition, but you can freely specify the run-time environment at which your programs are targeted, including Java Micro Edition. ProGuard then also performs the required preverification, producing more compact results than the traditional external preverifier.
ProGuard also comes with an obfuscator plug-in for the JME Wireless Toolkit.
dx compiler converts Java bytecode into the Dalvik bytecode that runs on Android devices. By preprocessing the original bytecode, ProGuard can significantly reduce the file sizes and boost the run-time performance of the code. It is distributed as part of the Android SDK. DexGuard, ProGuard's closed-source sibling for Android, offers additional optimizations and more application protection.
It should. RIM's proprietary
rapc compiler converts ordinary JME jar files into cod files that run on Blackberry devices. The compiler performs quite a few optimizations, but preprocessing the jar files with ProGuard can generally still reduce the final code size by a few percent. However, the
rapc compiler also seems to contain some bugs. It sometimes fails on obfuscated code that is valid and accepted by other JME tools and VMs. Your mileage may therefore vary.
Yes. ProGuard provides an Ant task, so that it integrates seamlessly into your Ant build process. You can still use configurations in ProGuard's own readable format. Alternatively, if you prefer XML, you can specify the equivalent XML configuration.
Yes. ProGuard also provides a Gradle task, so that it integrates into your Gradle build process. You can specify configurations in ProGuard's own format or embedded in the Groovy configuration.
ProGuard's jar files are also distributed as artefacts from the Maven Central repository. There are some third-party plugins that support ProGuard, such as the android-maven-plugin and the IDFC Maven ProGuard Plug-in.
DexGuard also comes with a Maven plugin.
Yes. First of all, ProGuard is perfectly usable as a command-line tool that can easily be integrated into any automatic build process. For casual users, there's also a graphical user interface that simplifies creating, loading, editing, executing, and saving ProGuard configurations.
Yes. ProGuard automatically handles constructs like
SomeClass.class. The referenced classes are preserved in the shrinking phase, and the string arguments are properly replaced in the obfuscation phase.
With variable string arguments, it's generally not possible to determine their possible values. They might be read from a configuration file, for instance. However, ProGuard will note a number of constructs like" "
(SomeClass)Class.forName(variable).newInstance()". These might be an indication that the class or interface
SomeClass and/or its implementations may need to be preserved. The developer can adapt his configuration accordingly.
Yes. ProGuard copies all non-class resource files, optionally adapting their names and their contents to the obfuscation that has been applied.
No. String encryption in program code has to be perfectly reversible by definition, so it only improves the obfuscation level. It increases the footprint of the code. However, by popular demand, ProGuard's closed-source sibling for Android, DexGuard, does provide string encryption, along with more protection techniques against static and dynamic analysis.
Not explicitly. Control flow obfuscation injects additional branches into the bytecode, in an attempt to fool decompilers. ProGuard does not do this, except to some extent in its optimization techniques. ProGuard's closed-source sibling for Android, DexGuard, does offer control flow obfuscation, as one of the many additional techniques to harden Android apps.
Yes. This feature allows you to specify a previous obfuscation mapping file in a new obfuscation step, in order to produce add-ons or patches for obfuscated code.
Yes. You can specify your own obfuscation dictionary, such as a list of reserved key words, identifiers with foreign characters, random source files, or a text by Shakespeare. Note that this hardly improves the obfuscation. Decent decompilers can automatically replace reserved keywords, and the effect can be undone fairly easily, by obfuscating again with simpler names.
Yes. ProGuard comes with a companion tool, ReTrace, that can 'de-obfuscate' stack traces produced by obfuscated applications. The reconstruction is based on the mapping file that ProGuard can write out. If line numbers have been obfuscated away, a list of alternative method names is presented for each obfuscated method name that has an ambiguous reverse mapping. Please refer to the ProGuard User Manual for more details.
Erik André at Badoo has written a tool to de-obfuscate HPROF memory dumps.
DexGuard is a commercial extension of ProGuard: