Thursday, March 14, 2013

Custom Class Loaders

Custom class loaders are a pretty interesting subject area.
Now, why would anyone want to write their own custom class loader? The question is, why not? Loading the classes your application uses through your own class loader means that when the application is un-deployed and the loader is no longer referenced, the classes it loaded become eligible for garbage collection, and the world is a better place. It's also just darn good practice: it helps you avoid the notorious class loader exceptions, linkage errors and, most of all, out-of-memory errors. Unlike some languages, Java manages its own memory; as a developer you just have to do your part by letting go of the objects you no longer need.

But we often take this for granted, sometimes because we are just too darn lazy and sometimes because we just don't get it. If you haven't already, you should read up on Java class loaders and how they load classes.

Let's see what sort of situations demand custom class loaders.


1. To allow class loading from alternative repositories
This is the most common scenario: you can load classes over the network, from different directories, over FTP, etc.

2. To partition code
Used mostly in Servlet engines: separate class loaders let you run multiple instances of the same application in the same JVM without them interfering with each other.

3. To allow unloading of classes
Class loaders maintain a cache of the classes they load, so you can de-reference the class loader to unload all the classes that it loaded. This only works once nothing else holds on to the loader, its classes or their instances. It is quite handy: you can unload your classes when you are done with them, and that way avoid perm gen errors.

4. To change the way bytecode is loaded
For example, you can load encrypted bytecode over the wire.

5. To modify the loaded byte-code
Used mostly with AOP.

6. Automatically verify a digital signature before executing un-trusted code

7. Transparently decrypt code with a user-supplied password
    Related to point 4 above

8. To support hot deployment on an application server

9. To use different versions of the same class side by side at run time
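
To make the first scenario concrete, here is a minimal sketch that uses the JDK's own URLClassLoader instead of a hand-written loader. The Plugin class and the temporary "repository" directory are made up for illustration: the sketch copies Plugin's bytecode from the class path into that directory and then loads it back from there. It assumes the demo itself runs from an ordinary class path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

class AlternativeRepositoryDemo {

    /** Hypothetical class that we pretend lives outside the class path. */
    public static class Plugin { }

    /** Copies Plugin's bytecode into a temp directory standing in for the repository. */
    static Path createRepository() throws IOException {
        String resource = Plugin.class.getName().replace('.', '/') + ".class";
        Path repository = Files.createTempDirectory("repository");
        Path target = repository.resolve(resource);
        Files.createDirectories(target.getParent());
        try (InputStream in = Plugin.class.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("Cannot find " + resource + " on the class path");
            }
            Files.copy(in, target);
        }
        return repository;
    }

    /** Loads Plugin from the repository directory instead of the class path. */
    static Class<?> loadFromRepository(Path repository) throws Exception {
        URL[] urls = { repository.toUri().toURL() };
        // A null parent means only the bootstrap loader is asked first,
        // so the application class path is bypassed.
        URLClassLoader loader = new URLClassLoader(urls, null);
        return loader.loadClass(Plugin.class.getName());
    }

    public static void main(String[] args) throws Exception {
        Class<?> loaded = loadFromRepository(createRepository());
        System.out.println(loaded.getName() + " loaded by " + loaded.getClassLoader());
        System.out.println(loaded == Plugin.class); // false: not the class path copy
    }
}
```

The same idea carries over to jars and remote locations, since URLClassLoader accepts any URL it has a protocol handler for.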

The possibilities are endless. Now let's see when class loaders become a huge issue:


1. Say a developer accidentally loads two different versions of the same class. There is no guarantee which version of the class will be invoked, and the funny thing is the JVM won't complain.

2. An application may involve several class loaders that are unrelated to each other; you get this a lot in application servers and web servers. Dependent libraries may require different versions of the same class, in which case you might get linkage errors and class version conflicts.

Let's now see how you can write your own. Remember, when writing your own class loader it's always better to chain it to a parent class loader. If your class loader cannot load a class, the chances are that its parent can, and besides, it's just good practice.


import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Hashtable;

public class CustomClassLoader extends ClassLoader {
    // Cache of the classes this loader has defined itself
    private final Hashtable<String, Class<?>> classes = new Hashtable<String, Class<?>>();

    public CustomClassLoader() {
        // chaining is important here
        super(CustomClassLoader.class.getClassLoader());
    }

    private byte[] getClassFromExternalLocation(String className) {
        System.out.println("Fetching class " + className + " from sampledirectory");
        try {
            // Reads the whole file and closes it again
            return Files.readAllBytes(Paths.get("sampledirectory", className + ".ext"));
        } catch (IOException e) {
            System.out.println("Unable to load class " + className);
            return null;
        }
    }

    // Main function called by client to resolve class
    @Override
    public synchronized Class<?> loadClass(String className, boolean resolveIt)
            throws ClassNotFoundException {
        System.out.println("Attempting to load class " + className);

        /* Check our local cache for the class */
        Class<?> result = classes.get(className);
        if (result != null) {
            System.out.println("Returning a cached instance");
            return result;
        }

        /* Call super class loader, which delegates to the parent we chained to */
        try {
            System.out.println("Calling the super class loader");
            return super.loadClass(className, resolveIt);
        } catch (ClassNotFoundException e) {
            System.out.println("Not a class parent classloader can load");
        }

        /* Try to load it from our repository */
        byte[] classData = getClassFromExternalLocation(className);
        if (classData == null) {
            throw new ClassNotFoundException(className);
        }

        /* Define the class; defineClass throws ClassFormatError on invalid bytecode */
        result = defineClass(className, classData, 0, classData.length);

        /* load reference classes as well if required */
        if (resolveIt) {
            resolveClass(result);
        }
        /* Store the loaded class in the cache */
        classes.put(className, result);
        return result;
    }
}

Notice how the method "getClassFromExternalLocation" loads the class from a different directory. You can read more about class loaders in one of my previous posts here.
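
For comparison, the ClassLoader documentation encourages subclasses to override findClass rather than loadClass, and to leave the cache check and parent delegation to the inherited loadClass. Here is a minimal sketch of that variant. DirectoryClassLoader is a made-up name, and the sketch assumes ordinary .class files laid out by package under the given directory, which is not quite the ".ext" layout used above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class DirectoryClassLoader extends ClassLoader {
    private final Path directory;

    DirectoryClassLoader(Path directory, ClassLoader parent) {
        super(parent); // chain to the parent; null means the bootstrap loader
        this.directory = directory;
    }

    // Only called by the inherited loadClass after the loaded-class cache
    // and the parent loader have both come up empty.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        Path file = directory.resolve(name.replace('.', '/') + ".class");
        try {
            byte[] bytes = Files.readAllBytes(file);
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

With this shape, something like new DirectoryClassLoader(Paths.get("sampledirectory"), parent).loadClass("Foo") asks the parent first and only reads sampledirectory/Foo.class when the parent cannot find Foo (a hypothetical class name).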

Java Classloaders



If someone asks you what class loaders are, you would probably say with a smile on your face, well, they load classes into the JVM. And you would probably go on to explain what the perm gen space is and how it runs out if you have too many classes hogging it. It's not serious business, you might think; chances are you have gone through your whole life without interfering with them. It's like that weird kid in class you don't mess with. He does his business, you do yours, but sooner or later you're going to have to talk to him. Well, class loaders are like that.

How many times have you hot deployed a web application and gotten the dreaded out-of-memory error, and how many times have you cursed before restarting to get it working again? Well, the problem, I am afraid, usually does not lie in the application server; it lies in how you have written your code. Do you ever look to see what the problem really is? Most people choose not to.

Now before we go any further let's see what class loaders do.

Class loaders do a pretty important job. When you compile your application you turn your Java code into bytecode and save it in .class files; you go a step further and bundle your .class files into a jar. That's all good. Now when you want to use the classes in this jar file, you add it to the class path of the application in question. And voilà, you can use those classes in your application. What happens here is not magic. Behind the scenes, your .class files are being loaded into the JVM by class loaders.

Let’s see when classes are loaded into the JVM.

Classes are loaded on demand: when you create an instance of one, or when you make a static reference to one using dot notation. Curious minds would now ask how the classes it references get loaded. They go through the same class loader as the class that references them, as and when they are needed. But what about when you have multiple classes declaring instances of the same class? Class loaders keep a cache of the classes they load, so if the class you require is already loaded they won’t load it again. There's a catch here, though, and it brings us to an interesting revelation: how does the situation change when you have multiple class loaders? Multiple unrelated (isolated) class loaders can each load the same class, so it’s possible for multiple copies of the same class to exist in the JVM, and the funny thing is they will never be equal.
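
Both halves of that are easy to see for yourself. The sketch below is only an illustration: Widget and IsolatedLoader are made-up names, and the loader cheats by reading Widget's bytecode from the class path and defining it itself, with no application parent. It assumes Java 9 or later for InputStream.readAllBytes.

```java
import java.io.IOException;
import java.io.InputStream;

class ClassIdentityDemo {

    /** Hypothetical class that gets loaded more than once. */
    public static class Widget { }

    /** A loader with no application parent that defines classes from class path bytes. */
    static class IsolatedLoader extends ClassLoader {
        IsolatedLoader() {
            super(null); // only the bootstrap loader sits above this one
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            ClassLoader source = ClassIdentityDemo.class.getClassLoader();
            try (InputStream in = source.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String name = Widget.class.getName();
        ClassLoader first = new IsolatedLoader();
        ClassLoader second = new IsolatedLoader();

        // Same loader, same name: the second call is answered from the cache.
        System.out.println(first.loadClass(name) == first.loadClass(name));  // true
        // Different loaders: same bytes, same name, yet two distinct classes.
        System.out.println(first.loadClass(name) == second.loadClass(name)); // false
        // Neither is the Widget that the application loader knows.
        System.out.println(first.loadClass(name) == Widget.class);           // false
    }
}
```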

You might ask, what of that singleton class I wrote last week, the one I was so proud of? Well, what of it: your singleton is only guaranteed to be single within the class loader that loads it. How to protect your singleton from being loaded twice into the JVM by multiple class loaders is a question that deserves a separate blog post. In short, you need to make sure the singleton is always loaded by one common parent, either by writing your own class loader or by always chaining your custom class loaders so that they share that parent.
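
Here is a small sketch of the singleton problem, under a few assumptions: Registry is a made-up singleton, IsolatedLoader is a made-up loader that defines classes from class path bytes without an application parent, and Java 9 or later is needed for InputStream.readAllBytes.

```java
import java.io.IOException;
import java.io.InputStream;

class SingletonPerLoaderDemo {

    /** Hypothetical, perfectly ordinary singleton. */
    public static class Registry {
        private static final Registry INSTANCE = new Registry();

        private Registry() { }

        public static Registry getInstance() {
            return INSTANCE;
        }
    }

    /** A loader with no application parent that defines classes from class path bytes. */
    static class IsolatedLoader extends ClassLoader {
        IsolatedLoader() {
            super(null);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            ClassLoader source = SingletonPerLoaderDemo.class.getClassLoader();
            try (InputStream in = source.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Calls Registry.getInstance() on the copy of Registry owned by the given loader. */
    static Object instanceFrom(ClassLoader loader) throws Exception {
        Class<?> registry = loader.loadClass(Registry.class.getName());
        return registry.getMethod("getInstance").invoke(null);
    }

    public static void main(String[] args) throws Exception {
        ClassLoader a = new IsolatedLoader();
        ClassLoader b = new IsolatedLoader();
        System.out.println(instanceFrom(a) == instanceFrom(a));         // true
        System.out.println(instanceFrom(a) == instanceFrom(b));         // false
        System.out.println(instanceFrom(a) == Registry.getInstance());  // false
    }
}
```

If both loaders were chained to a common parent that could see Registry, delegation would hand back the parent's copy each time and the singleton would hold.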

Let’s now move on to the types of class loaders in Java, of which there are mainly three.
  1. Bootstrap Class Loader
  2. Extensions Class Loader
  3. System Class Loader
The bootstrap class loader loads the Java core classes, typically present in the jre/lib folder, while the extensions class loader, as you may have guessed already, loads Java extensions, typically present in jre/lib/ext; you can also define your own extension locations by setting the system property “java.ext.dirs”. The system class loader loads classes from the class path, which is exposed through the “java.class.path” system property.
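
If you are curious where each of the three looks on your own machine, a tiny sketch like this prints the relevant system properties. LoaderPathsDemo is a made-up name. Note that "sun.boot.class.path" and "java.ext.dirs" belong to the Java 8 and earlier layout described here; on Java 9 and later they are no longer set, so the sketch prints a placeholder instead.

```java
class LoaderPathsDemo {

    /** Formats a system property, with a placeholder when it is not set. */
    static String describe(String property) {
        String value = System.getProperty(property);
        return property + " = " + (value == null ? "(not set)" : value);
    }

    public static void main(String[] args) {
        System.out.println(describe("sun.boot.class.path")); // bootstrap loader
        System.out.println(describe("java.ext.dirs"));       // extensions loader
        System.out.println(describe("java.class.path"));     // system loader
    }
}
```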

And of course you can write your own custom class loaders; do this if you are pretty finicky, and most Java geeks do like to be very hands-on. What you also need to understand is that these class loaders are consulted in the order listed: first the bootstrap, then the extensions and finally the system class loader, which is implemented by sun.misc.Launcher$AppClassLoader.

Every class in Java can be referred to through a class literal, written as the class name followed by “.class”, and every object can hand you its class through getClass(). The reason I bring up this seemingly irrelevant subject is that from the resulting Class object you can get the class loader that loaded the class by calling getClassLoader(), which is pretty handy. You can try this out with a class you wrote. The interesting thing to note here is that you can go up the hierarchy with getParent() to reach the parent class loaders; when you get null you know you have hit the bootstrap class loader, because the bootstrap class loader is always represented by null.
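
A quick sketch of that walk up the hierarchy is shown below. LoaderChainDemo and loaderChain are made-up names, and the loaders you see printed depend on your Java version and on how the program was launched.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {

    /** Collects the loaders from the class's own loader upwards, stopping at the bootstrap loader (null). */
    static List<ClassLoader> loaderChain(Class<?> type) {
        List<ClassLoader> chain = new ArrayList<>();
        for (ClassLoader loader = type.getClassLoader(); loader != null; loader = loader.getParent()) {
            chain.add(loader);
        }
        return chain;
    }

    public static void main(String[] args) {
        for (ClassLoader loader : loaderChain(LoaderChainDemo.class)) {
            System.out.println(loader);
        }
        // Core classes come from the bootstrap loader, which shows up as null.
        System.out.println(String.class.getClassLoader()); // prints "null"
    }
}
```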

Before I wrap this post up, let me list the steps a class loader has to go through when loading a class into the JVM; they are important if you are writing your own.

1.     Verify Class Name


Important if you write your own class loader: you don’t want people to use your class loader to load classes you didn’t intend it to.

2.     Check to see if it has been loaded already


If the class is already loaded, which the class loader tracks using a map of some sort, return the class already loaded.

3.     Check to see if the class is a systems class


If it is, then the chances are that it’s already loaded. Don’t take chances though; pass control to your parent class loader.

4.     Attempt to fetch the class from the class loader repository


This is where the actual loading happens. Your repository may reside across a network or in a separate folder; load your class as a byte stream.

5.     Define the class for the JVM


The first step is to verify that the bytecode is valid; if it is not, throw a class format error.

6.     Resolve the class


 What this means is that you need to load the references (any classes that the class you are trying to load is using). You also need to verify the legitimacy of these references; failing this step is what leads to the infamous linkage errors.

7.     Return the class to the caller


Finally return your class to the caller. I'll show you in another post how to write your own class loader.
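
In the meantime, here is a rough sketch that maps the seven steps onto code. Everything specific in it is an assumption made for illustration: the StepwiseClassLoader name, the "blocked prefix" rule used for step 1, and the in-memory map that stands in for a real repository. Treat it as a skeleton rather than a finished class loader.

```java
import java.util.Map;
import java.util.Objects;

class StepwiseClassLoader extends ClassLoader {
    private final String blockedPrefix;           // hypothetical policy for step 1
    private final Map<String, byte[]> repository; // stands in for a folder or network source

    StepwiseClassLoader(String blockedPrefix, Map<String, byte[]> repository, ClassLoader parent) {
        super(Objects.requireNonNull(parent, "parent"));
        this.blockedPrefix = blockedPrefix;
        this.repository = repository;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // 1. Verify the class name
        if (name.startsWith(blockedPrefix)) {
            throw new ClassNotFoundException("Refusing to load " + name);
        }
        synchronized (getClassLoadingLock(name)) {
            // 2. Check to see if this loader has loaded it already
            Class<?> result = findLoadedClass(name);
            if (result == null) {
                try {
                    // 3. Let the parent chain handle system classes (and anything else it knows)
                    result = getParent().loadClass(name);
                } catch (ClassNotFoundException notFoundByParent) {
                    // 4. Fetch the bytes from our own repository
                    byte[] bytes = repository.get(name);
                    if (bytes == null) {
                        throw new ClassNotFoundException(name);
                    }
                    // 5. Define the class; invalid bytecode makes this throw ClassFormatError
                    result = defineClass(name, bytes, 0, bytes.length);
                }
            }
            // 6. Resolve (link) the class if the caller asked for it
            if (resolve) {
                resolveClass(result);
            }
            // 7. Return the class to the caller
            return result;
        }
    }
}
```

In everyday code you would normally override only findClass and inherit steps 2, 3, 6 and 7 from ClassLoader.loadClass; spelling them out here is just to line the code up with the list.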

And that’s a wrap.