Cajo, the easiest way to accomplish distributed computing in Java
“Distributed computing is becoming increasingly important in the world of enterprise application development. Today, developers continuously need to address questions like: How do you enhance scalability by scaling the application beyond a single node? How can you guarantee high-availability, eliminate single points of failure, and make sure that you meet your customer SLAs?
For many developers, the most natural way of tackling the problem would be to divide up the architecture into groups of components or services that are distributed among different servers. While this is not surprising, considering the heritage of CORBA, EJB, COM and RMI that most developers carry around, if you decide to go down this path then you are in for a lot of trouble. Most of the time it is not worth the effort and will give you more problems than it solves.”
On the other hand, distributed computing and Java go together naturally. As the first language designed from the bottom up with networking in mind, Java makes it very easy for computers to cooperate. Even the simplest applet running in a browser is a distributed application, if you think about it. The client running the browser downloads and executes code that is delivered by some other system. But even this simple applet wouldn’t be possible without Java’s guarantees of portability and security: the applet can run on any platform, and can’t sabotage its host.
The cajo project is a small library that enables powerful, dynamic multi-machine cooperation. It is surprisingly easy to use yet, according to its author, unmatched in performance. It is a ‘drop-in’ distributed computing framework, meaning that it imposes no structural requirements on your applications and requires no source changes. It allows multiple remote JVMs to work together seamlessly, as one.
The project owner, John Catherino, claims “King Of the Mountain! ;-)” and challenges anyone to show that there is a distributed computing framework in Java that is as flexible and as fast as cajo.
To tell you the truth, I am personally convinced by John’s claim, and I strongly believe that you will be too if you let me walk you through this client-server example. You will be amazed at how easy and flexible the cajo framework is:
The Server.java
import gnu.cajo.Cajo; // The cajo implementation of the Grail

public class Server {
   public static class Test { // remotely callable classes must be public
      // though not necessarily declared in the same class
      private final String greeting;
      // no silly requirement to have no-arg constructors
      public Test(String greeting) { this.greeting = greeting; }
      // all public methods, instance or static, will be remotely callable
      public String foo(Object bar, int count) {
         System.out.println("foo called w/ " + bar + ' ' + count + " count");
         return greeting;
      }
      public Boolean bar(int count) {
         System.out.println("bar called w/ " + count + " count");
         return Boolean.TRUE;
      }
      public boolean baz() {
         System.out.println("baz called");
         return true;
      }
      public String other() { // functionality not needed by the test client
         return "This is extra stuff";
      }
   } // arguments and return objects can be custom or common to server and client

   public static void main(String args[]) throws Exception { // unit test
      Cajo cajo = new Cajo(0);
      System.out.println("Server running");
      cajo.export(new Test("Thanks"));
   }
}
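Compile via (the ";" classpath separator shown here and in the commands that follow is the Windows one; on Linux or macOS use ":" instead):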
javac -cp cajo.jar;. Server.java
Execute via:
java -cp cajo.jar;. Server
As you can see, with just two statements:
Cajo cajo = new Cajo(0);
cajo.export(new Test("Thanks"));
we can expose any POJO (Plain Old Java Object) as a distributed service!
And now the Client.java
import gnu.cajo.Cajo;
import java.rmi.RemoteException; // caused by network related errors

interface SuperSet { // client method sets need not be public
   void baz() throws RemoteException;
} // declaring RemoteException is optional, but a nice reminder

interface ClientSet extends SuperSet {
   boolean bar(Integer quantum) throws RemoteException;
   Object foo(String barbaz, int foobar) throws RemoteException;
} // the order of the client method set does not matter

public class Client {
   public static void main(String args[]) throws Exception { // unit test
      Cajo cajo = new Cajo(0);
      if (args.length > 0) { // either approach must work...
         int port = args.length > 1 ? Integer.parseInt(args[1]) : 1198;
         cajo.register(args[0], port);
         // find server by registry address & port, or...
      } else Thread.sleep(100); // allow some discovery time

      Object refs[] = cajo.lookup(ClientSet.class);
      if (refs.length > 0) { // compatible server objects found
         System.out.println("Found " + refs.length);
         ClientSet cs = (ClientSet)cajo.proxy(refs[0], ClientSet.class);
         cs.baz();
         System.out.println(cs.bar(Integer.valueOf(77)));
         System.out.println(cs.foo(null, 99));
      } else System.out.println("No server objects found");
      System.exit(0); // nothing else left to do, so we can shut down
   }
}
Compile via:
javac -cp cajo.jar;. Client.java
Execute via:
java -cp cajo.jar;. Client
The client can find server objects either by providing the server address and port (if available) or by using multicast. To locate the appropriate server object “Dynamic Client Subtyping” is used. For all of you who do not know what “Dynamic Client Subtyping” stands for, John Catherino explains in his relevant blog post :
“Oftentimes service objects implement a large, rich interface. Other times service objects implement several interfaces, grouping their functionality into distinct logical concerns. Quite often, a client needs only to use a small portion of an interface; or perhaps some methods from a few of the logical grouping interfaces, to satisfy its own needs.
The ability of a client to define its own interface, from ones defined by the service object, is known as subtyping in Java. (in contrast to subclassing) However, unlike conventional Java subtyping; Dynamic Client Subtyping means creating an entirely different interface. What makes this subtyping dynamic, is that it works with the original, unmodified service object.
This can be a very potent technique, for client-side complexity management.”
Isn’t that really cool? We just have to define the interface our client “needs” and then locate a server object that complies with that client specification. The following statement from our example accomplishes just that:
Object refs[] = cajo.lookup(ClientSet.class);
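To make the idea behind this lookup more concrete, here is a minimal local sketch, using only JDK reflection, of how a client-defined interface might be matched against unmodified objects. It is not cajo's implementation: the names LookupSketch, Greeter, ClientView, satisfies and lookup are hypothetical, and the matching rule (same method name and same parameter count) is a simplifying assumption. It needs Java 8 or later.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LookupSketch {
    // Hypothetical service object; note that it implements no interface at all.
    public static class Greeter {
        public String foo(Object bar, int count) { return "Thanks"; }
        public boolean baz() { return true; }
        public String other() { return "This is extra stuff"; }
    }

    // Hypothetical client-defined view: only the methods this client cares about.
    interface ClientView {
        Object foo(String text, int count);
        void baz();
    }

    // True if every method of the client interface has a public counterpart on the
    // candidate with the same name and parameter count (an assumed, loose rule).
    static boolean satisfies(Object candidate, Class<?> clientInterface) {
        for (Method wanted : clientInterface.getMethods()) {
            boolean found = false;
            for (Method offered : candidate.getClass().getMethods()) {
                if (offered.getName().equals(wanted.getName())
                        && offered.getParameterCount() == wanted.getParameterCount()) {
                    found = true;
                    break;
                }
            }
            if (!found) return false;
        }
        return true;
    }

    // Keeps only the candidates that satisfy the client interface.
    static List<Object> lookup(List<Object> candidates, Class<?> clientInterface) {
        List<Object> matches = new ArrayList<>();
        for (Object candidate : candidates) {
            if (satisfies(candidate, clientInterface)) matches.add(candidate);
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Object> exported = Arrays.<Object>asList(new Greeter(), "just a string");
        System.out.println("Found " + lookup(exported, ClientView.class).size()); // prints "Found 1"
    }
}
```

The String candidate is rejected because it has no foo or baz method, while Greeter is accepted even though it was written without any knowledge of ClientView.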
Last but not least, we can create a client-side “proxy” of the server object and invoke its methods remotely, just as we would through an ordinary local object reference, with the following statement:
ClientSet cs = (ClientSet)cajo.proxy(refs[0], ClientSet.class);
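For readers curious how a typed proxy over an unrelated object can work at all, the sketch below shows the general technique with the JDK's own java.lang.reflect.Proxy, entirely in one JVM. It is an illustration under stated assumptions, not cajo's code: ProxySketch, Service, ClientView and the proxy helper are hypothetical names, there is no networking, and calls are forwarded by method name and parameter count only. It needs Java 8 or later.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class ProxySketch {
    // Hypothetical service object; it knows nothing about ClientView.
    public static class Service {
        public String foo(Object bar, int count) { return "Thanks"; }
        public boolean baz() { return true; }
        public String other() { return "This is extra stuff"; }
    }

    // Hypothetical client-defined view of the service.
    interface ClientView {
        Object foo(String text, int count);
        boolean baz();
    }

    // Wraps target in a dynamic proxy implementing view. Each call is forwarded to
    // the first public target method with the same name and parameter count.
    static <T> T proxy(Object target, Class<T> view) {
        InvocationHandler handler = (self, called, args) -> {
            int argc = (args == null) ? 0 : args.length;
            for (Method m : target.getClass().getMethods()) {
                if (m.getName().equals(called.getName()) && m.getParameterCount() == argc) {
                    try {
                        return m.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow what the target method threw
                    }
                }
            }
            throw new UnsupportedOperationException(called.getName());
        };
        return view.cast(Proxy.newProxyInstance(
                view.getClassLoader(), new Class<?>[] { view }, handler));
    }

    public static void main(String[] args) {
        ClientView cs = proxy(new Service(), ClientView.class);
        System.out.println(cs.baz());         // prints "true"
        System.out.println(cs.foo("hi", 99)); // prints "Thanks"
    }
}
```

Matching on name and parameter count keeps the sketch short, but it would pick the wrong overload if the target had two methods with the same name and arity; a fuller version would also compare parameter types.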
That’s it. These few calls (export, lookup and proxy) are all it takes for distributed JVMs to interoperate. It just can’t get any easier than this.
As far as performance is concerned, I ran some preliminary tests on the example above and measured an average of 12,000 TPS (transactions per second, where each transaction is one remote call to baz()) on a Sony Vaio with the following characteristics:
- System : openSUSE 11.1 (x86_64)
- Processor (CPU) : Intel(R) Core(TM)2 Duo CPU T6670 @ 2.20GHz
- Processor Speed : 1,200.00 MHz
- Total memory (RAM) : 2.8 GB
- Java : OpenJDK 1.6.0_0 64-Bit
For your convenience I provide the code snippet that I used to perform the stress test :
int repeats = 1000000;
long start = System.currentTimeMillis();
for (int i = 0; i < repeats; i++)
    cs.baz();
System.out.println("TPS : " + repeats / ((System.currentTimeMillis() - start) / 1000d));
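If you want to repeat the measurement, a slightly more careful variant of the same loop is sketched below. It is not the code that produced the figure above: ThroughputSketch, measureTps and microsPerCall are hypothetical names, the warm-up phase and the use of System.nanoTime() are my own additions, and a local counter stands in for the remote cs.baz() call so that the sketch runs without a server.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicLong;

class ThroughputSketch {
    // Runs the call 'warmup' times untimed (to let the JIT compile the hot path),
    // then times 'repeats' further calls and returns calls per second.
    static double measureTps(Runnable call, int warmup, int repeats) {
        for (int i = 0; i < warmup; i++) call.run();
        long start = System.nanoTime();
        for (int i = 0; i < repeats; i++) call.run();
        long elapsedNanos = System.nanoTime() - start;
        return repeats / (elapsedNanos / 1e9);
    }

    // Converts a throughput figure into the average time per call, in microseconds.
    static double microsPerCall(double tps) {
        return 1e6 / tps;
    }

    public static void main(String[] args) {
        // Local stand-in for the remote call; replace with cs::baz in the real client.
        AtomicLong counter = new AtomicLong();
        double tps = measureTps(counter::incrementAndGet, 100000, 1000000);
        System.out.println(String.format(Locale.ROOT, "TPS : %.0f", tps));
        // Puts the 12,000 TPS figure from the text into per-call terms.
        System.out.println(String.format(Locale.ROOT,
                "12000 TPS is about %.1f microseconds per call", microsPerCall(12000)));
    }
}
```

The second line printed shows that 12,000 TPS corresponds to roughly 83.3 microseconds per remote call. The TPS value for the local stand-in will of course be far higher than anything measured over a network.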
Happy coding, and don’t forget to share!
Justin