Implementing Spring-like Classpath Scanning in Android

One of the things that Spring 2.5 introduced back in 2007 was component scanning, a feature which removed the need for XML bean configuration and instead allowed developers to declare their beans using Java annotations. Rather than this:

We can do this:

It’s a pretty simple idea since Java makes it very easy to introspectively check a class’s annotations at runtime through its reflection API. Spring’s component scan feature also allows you to specify the base package(s) to scan for beans.

The big question is how do we get access to the classes in the classpath, specifically, those in the desired package? Java SE doesn’t provide an API for doing it, but there are ways to accomplish this. The most common (if not the only) approach is to load classes by relying on the file system. We know that we can use the ClassLoader to load a class by its package-qualified name, so it becomes a matter of retrieving the file names.

Getting the classpath itself in Java SE is easy:

This will yield something that looks like “/Users/Tyler/Workspace/Test/bin:/Users/Tyler/Workspace/Test/lib/gson-2.1.jar”. Loading the files from here is pretty straightforward, as is filtering on the package name since it maps to a directory one-to-one.

Another similar approach is to use the ClassLoader to load the resources directly:

Transition to Android

Unfortunately, these solutions don’t lend themselves to Android, which made implementing classpath scanning a little more difficult for Infinitum. The reason for this is, more or less, because of the way Android’s Dalvik VM is designed. When an Android application is compiled, the Dalvik bytecode is packaged into a file called “classes.dex” inside the APK. The good news is that the Android SDK provides an API for interacting with DEX files through the DexFile class.

In order to access classes.dex, we need a handle on the APK itself, which is actually quite easy to do:

The above code opens a DexFile for the running APK. Of course, this can have some performance implications. Opening the DexFile will potentially cause the VM to pass classes.dex through a process known as “dexopt”, which is a program that performs bytecode verification and optimization. This is an expensive process, but since we’re opening a DexFile for the APK itself, classes.dex should have already undergone this process, meaning dexopt won’t be run again.

The DexFile gives us access to the classes contained in classes.dex as an enumeration of strings representing the package-qualified class names. With this, we can iterate over the class names and load any which match the desired package.

This gets the job done, and it’s essentially how Infinitum accomplishes component scanning. However, it’s a very expensive operation. DexFile.entries() yields every class in the classpath — that is, every class in classes.dex — which includes not just application binaries, but also those of any libraries included.

It’s great that we can introspect every class in the classpath, but if we’re only interested in classes of a particular package, we’re out of luck. Every class is compiled into classes.dex and, short of decompiling it ((Tools for decompiling DEX files exist, such as Baksmali, but doing such a thing at runtime — if it’s even possible — would arguably not gain you any performance benefits. Still, this is something worth exploring.)),  there’s no way to pull out the classes we want without iterating over the entire classpath.

So, for now we settle with this somewhat inefficient solution. Nonetheless, it accomplishes what it needs to at the cost of maybe a few hundred milliseconds ((On the emulator running on my MacBook Pro, the classpath scanning takes about 600 milliseconds, while on my Galaxy Nexus, it takes about 200 milliseconds.)), so maybe it’s not such a bad approach in the grand scheme of things.

Dalvik Bytecode Generation

Earlier, I discussed the use of dynamic proxies and how they can be implemented in Java. As we saw, a necessary part of proxying classes is bytecode generation. From its onset, something I wanted to include in Infinitum was lazy loading. I also wanted to provide support for AOP down the road. Consequently, it was essential to include some way to generate bytecode at runtime.

The obvious choice would be to use a library like Cglib or Javassist, but sadly neither of those would work. That’s because Android does not use a Java VM, it uses its own virtual machine called Dalvik. As a result, Java source code isn’t compiled into Java bytecode (.class files), but rather Dalvik bytecode (.dex files). Since Cglib and Javassist are designed for Java bytecode manipulation, they do not work on the Android platform. ((ASMDEX, a Dalvik-compatible bytecode-manipulation library was released in March 2012, meaning Cglib could, in theory, be ported to Android since it relies on ASM.))

What’s a programmer to do? Fortunately, some Googlers developed a new library for runtime code generation targeting the Dalvik VM called Dexmaker.

It has a small, close-to-the-metal API. This API mirrors the Dalvik bytecode specification giving you tight control over the bytecode emitted. Code is generated instruction-by-instruction; you bring your own abstract syntax tree if you need one. And since it uses Dalvik’s dx tool as a backend, you get efficient register allocation and regular/wide instruction selection for free.

Even better, Dexmaker provides an API for directly creating proxies called ProxyBuilder. If you followed my previous post on generating proxies, then using ProxyBuilder is a piece of cake. Similar to Java’s Proxy class, ProxyBuilder relies on an InvocationHandler to specify a proxy’s behavior.

Dexmaker enabled me to implement lazy loading and AOP within the Infinitum framework. It also opens up the possibility of using Mockito for unit testing in an Android environment because Mockito relies on proxies for generating mocks. ((Infinitum is actually unit tested using Robolectric, which allows for testing Android code in a standard JVM.))

Proxies: Why They’re Useful and How They’re Implemented

I wanted to write about lazy loading, but doing so requires some background on proxies. Proxies are such an interesting and useful concept that I decided it would be worthwhile to write a separate post discussing them. I’ve talked about them in the past, for instance on StackOverflow, so this will be a bit of a rehash, but I will go into a little more depth here.

What is a proxy? Fundamentally, it’s a broker, or mediator, between an object and that object’s user, which I will refer to as its client. Specifically, a proxy intercepts calls to the object, performs some logic, and then (typically) passes the call on to the object itself. I say typically because the proxy could simply intercept without ever calling the object.

proxy

A proxy works by implementing an object’s non-final methods. This means that proxying an interface is pretty simple because an interface is merely a list of method signatures that need to be implemented. This facilitates the interception of method invocations quite nicely. Proxying a concrete class is a bit more involved, and I’ll explain why shortly.

Proxies are useful, very useful. That’s because they allow for the modification of an object’s behavior and do so in a way that’s completely invisible to the user. Few know about them, but many use them, usually without even being aware of it. Hibernate uses them for lazy loading, Spring uses them for aspect-oriented programming, and Mockito uses them for creating mocks. Those are just three (huge) use cases of many.

JDK Dynamic Proxies

Java provides a Proxy class which implements a list of interfaces at runtime. The behavior of a proxy is specified through an implementation of InvocationHandler, an interface which has a single method called invoke. The signature for the invoke method looks like the following:

The proxy argument is the proxy instance the method was invoked on. The method argument is the Method instance corresponding to the interface method invoked on the object.  The last argument, args, is an array of objects which consists of the arguments passed in to the method invocation, if any.

Each proxy has an InvocationHandler associated with it, and it’s this handler which is responsible for delegating method calls made on the proxy to the object being proxied. This level of indirection means that methods are not invoked on an object itself but rather on its proxy. The example below illustrates how an InvocationHandler would be implemented such that “Hello World” is printed to the console before every method invocation.

This is pretty easy to understand. The invoke method will intercept any method call by printing “Hello World” before delegating the invocation to the proxied object. It’s not very useful, but it does lend some insight into why proxies are useful for AOP.

An interesting observation is that invoke provides a reference to the proxy itself, meaning if you were to instead call the method on it, you would receive a StackOverflowError because it would lead to an infinite recursion.

Note that the InvocationHandler alone is of no use. In order to actually create a proxy, we need to use the Proxy class and provide the InvocationHandler. Proxy provides a static method for creating new instances called newProxyInstance. This method takes three arguments, a class loader, an array of interfaces to be implemented by the proxy, and the proxy behavior in the form of an InvocationHandler. An example of creating a proxy for a List is shown below.

The client invoking methods on the List can’t tell the difference between a proxy and its underlying object representation, nor should it care.

Proxying Classes

While proxying an interface dynamically is relatively straightforward, the same cannot be said for proxying a class. Java’s Proxy class is merely a runtime implementation of an interface or set of interfaces, but a class does not have to implement an interface at all. As a result, proxying classes requires bytecode manipulation. Fortunately, there are libraries available which help to facilitate this through a high-level API. For example, Cglib (short for code-generation library) provides a way to extend Java classes at runtime and Javassist (short for Java Programming Assistant) allows for both class modification and creation at runtime. It’s worth pointing out that Spring, Hibernate, Mockito, and various other frameworks make heavy use of these libraries.

Cglib and Javassist provide support for proxying classes because they can dynamically generate bytecode (i.e. class files), allowing us to extend classes at runtime in a way that Java’s Proxy can implement an interface at runtime.

At the core of Cglib is the Enhancer class, which is used to generate dynamic subclasses. It works in a similar fashion to the JDK’s Proxy class, but rather than using a JDK InvocationHandler, it uses a Callback for providing proxy behavior. There are various Callback extensions, such as InvocationHandler (which is a replacement for the JDK version), LazyLoader, NoOp, and Dispatcher.

This code is essentially the same as the earlier example in that every method invocation on the proxied object will first print “Hello World” before being delegated to the actual object. The difference is that MyClass does not implement an interface, so we didn’t need to specify an array of interfaces for the proxy.

Proxies are a very powerful programming construct which enables us to implement things like lazy loading and AOP. In general, they allow us to alter the behavior of objects transparently. In the future, I’ll dive into the specific use cases of lazy loading and AOP.