Dynamic Languages and Java/Language Integration Techniques
Introduction
Hopefully by now you're beginning to see why, despite what the marketing literature says, there is no such thing as a silver bullet. Different languages have different goals. Even if the world did converge onto one uber-language that could do everything well -- one language to rule them all -- there is no reason to suspect that it could maintain this monopoly. The barrier to entry is too low, and the software development industry is too susceptible to buzzwords and fads. Instead, we need to learn to live in a world of diversity.
At the moment, programming languages tend to exist in little boxes containing their most devoted supporters, a myriad of components that have been painstakingly ported from other programming environments, and a half dozen truly interesting things that make that particular language unique. If we're going to cope with the diversity, we need to start drawing lines between those boxes. We need to open up new ways of seamlessly sharing components between these languages, so that every tool does not need to be written in 12 different languages and so that developers do not need to pledge their allegiance to any particular language based on the existing code that they may need to integrate with in the future.
This is exactly what we will be doing in this chapter. We'll be introducing three methods of integrating components written in different programming languages, and we'll be creating some notation and terminology to describe these relationships. In effect, we'll be drawing lines between the boxes.
Interpreters
How do computers work? What is it about an electronic device that makes it a "computer"? The essence of the computer is best summed up by the Universal Turing Machine, a theoretical model dreamed up by Alan Turing in 1936. A UTM is a machine consisting of an infinitely long tape upon which symbols can be read or written, and a head that can move along the tape and react to the symbols according to some internal set of rules. At each step, the UTM reads the symbol at the current position of the tape and checks its own internal state, then uses the combination of the two to determine what the next course of action will be. It can move forward or backward, erase the current symbol and write a new one, and optionally change its state for future operations. The particularly interesting part about this is that the set of rules that drive the UTM can also be stored on the tape. In effect, the same UTM could carry out an infinite number of different operations depending on the rules, or program, that was used to initialize it.
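The read-symbol, check-state, act loop described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names (TuringSketch, Rule, run, incrementRules) are hypothetical, the rule table is a made-up unary incrementer, and a sparse map stands in for the infinite tape. Unlike a true UTM, the rules are held in a Java map rather than on the tape itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: every name in this class is hypothetical.
class TuringSketch {
    // One rule: the symbol to write, the direction to move (+1 or -1), and the next state.
    static final class Rule {
        final char write;
        final int move;
        final String next;
        Rule(char write, int move, String next) {
            this.write = write;
            this.move = move;
            this.next = next;
        }
    }

    // Runs the rule table on the input until the machine enters the "halt" state,
    // then returns the non-blank contents of the tape from left to right.
    static String run(Map<String, Rule> rules, String input) {
        TreeMap<Integer, Character> tape = new TreeMap<>(); // sparse stand-in for the infinite tape
        for (int i = 0; i < input.length(); i++) {
            tape.put(i, input.charAt(i));
        }
        int head = 0;
        String state = "scan";
        while (!state.equals("halt")) {
            char symbol = tape.getOrDefault(head, '_');   // '_' is the blank symbol
            Rule rule = rules.get(state + ":" + symbol);  // rules are keyed by (state, symbol)
            if (rule == null) {
                throw new IllegalStateException("no rule for " + state + ":" + symbol);
            }
            tape.put(head, rule.write);
            head += rule.move;
            state = rule.next;
        }
        StringBuilder out = new StringBuilder();
        for (char c : tape.values()) {
            if (c != '_') out.append(c);
        }
        return out.toString();
    }

    // Rule table for a unary incrementer: skip over the 1s, then write one more and halt.
    static Map<String, Rule> incrementRules() {
        Map<String, Rule> rules = new HashMap<>();
        rules.put("scan:1", new Rule('1', +1, "scan"));
        rules.put("scan:_", new Rule('1', +1, "halt"));
        return rules;
    }

    public static void main(String[] args) {
        System.out.println(run(incrementRules(), "111")); // prints 1111
    }
}
```

Swapping in a different rule table changes what the machine computes without touching the run loop, which is the property the paragraph above is pointing at.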
The Universal Turing Machine is an example of an interpreter, which means that it reads a program in some specific language and it carries out the instructions that this program represents. A more common example of an interpreter is a modern microprocessor, or CPU. Unlike a UTM, CPUs are generally implemented in hardware and are therefore subject to certain worldly constraints (time, space, etc.), but they are otherwise based on the same principles as the UTM. They read in a program, often written in some form of architecture-specific machine code, and execute that program one step at a time until the program completes.
Low-level programs such as the computer's operating system need to be implemented in machine code so that the CPU can understand them. But traditionally, most applications have been stored as machine code as well. To depict this situation graphically we'll use the following diagram, where the container represents an interpreter (in this case, the CPU) and the applications and libraries shown inside it are being executed directly by the CPU.
- Insert Figure 1. Running Java
Since very few of us are capable of writing programs directly in machine code, we employ a compiler to convert our programs from a source language into machine code. This technique often gives the best possible performance for CPU-intensive applications. However, certain languages cannot be compiled directly into machine code. Often these languages support features like dynamic typing which make static compilation difficult, and therefore they tend to be interpreted rather than compiled. Other times, interpretation is desired not because of language constraints, but because of when or how the source code is provided. Perhaps the code will be supplied at runtime by the user, or will change with each run. Or perhaps it is important that the development cycle be as short as possible. Compilation can be an expensive operation, and it doesn't always pay to spend time compiling the code into something that the CPU can understand directly.
When talking about interpreted languages it is useful to talk about layers of interpretation. For example, let's take the TCL language, which is a scripting language that became popular for rapid application development because of its simplicity and tight integration with the Tk GUI library. Let's assume that we want to run an application that is implemented entirely in TCL. To describe this graphically we will show two containers, one of which is "instantiated" by a component in the interpreter directly below it. In Figure 2, our application (app.tcl) is executed within the TCL interpreter, which is itself being executed by the CPU.
- Insert Figure 2. Running a TCL Script
The term "interpreted" here doesn't necessarily indicate that the TCL source code is being executed directly; very few languages actually do this. In most cases, an intermediate representation (often called opcodes or bytecode) is used. Perl, for example, tends to alternate between compiling a set of source code into concise opcodes, and then executing those opcodes. At a minimum, most languages will parse the source code to obtain an abstract syntax tree once, and then interpret the syntax tree rather than operating on the source code directly.
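To make the "parse once, then interpret the syntax tree" idea concrete, here is a minimal sketch of a tree-walking interpreter for a toy arithmetic language. It is not taken from any of the interpreters mentioned above; the grammar (non-negative integers, + and *) and all names (AstSketch, Node, Num, BinOp, Parser) are assumptions made for illustration.

```java
// Illustrative sketch only: a toy language with non-negative integers, '+' and '*'.
class AstSketch {
    // Every node of the abstract syntax tree knows how to evaluate itself.
    interface Node {
        int eval();
    }

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public int eval() { return value; }
    }

    static final class BinOp implements Node {
        final char op;
        final Node left;
        final Node right;
        BinOp(char op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        public int eval() {
            return op == '+' ? left.eval() + right.eval() : left.eval() * right.eval();
        }
    }

    // A small recursive-descent parser; '*' binds more tightly than '+'.
    static final class Parser {
        private final String src;
        private int pos;

        Parser(String src) { this.src = src.replace(" ", ""); }

        // expr := term ('+' term)*
        Node expr() {
            Node node = term();
            while (pos < src.length() && src.charAt(pos) == '+') {
                pos++;
                node = new BinOp('+', node, term());
            }
            return node;
        }

        // term := number ('*' number)*
        Node term() {
            Node node = number();
            while (pos < src.length() && src.charAt(pos) == '*') {
                pos++;
                node = new BinOp('*', node, number());
            }
            return node;
        }

        Node number() {
            int start = pos;
            while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
            if (start == pos) {
                throw new IllegalArgumentException("expected a number at position " + pos);
            }
            return new Num(Integer.parseInt(src.substring(start, pos)));
        }
    }

    // Parses the source text once; the returned tree can be evaluated any number of times.
    static Node parse(String source) {
        Parser parser = new Parser(source);
        Node tree = parser.expr();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + parser.pos);
        }
        return tree;
    }

    public static void main(String[] args) {
        Node tree = parse("2 + 3 * 4");
        System.out.println(tree.eval()); // prints 14
    }
}
```

The source text is only touched inside parse; every later call to eval walks the tree, so the cost of parsing is paid once.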
Often the interpreter is acting on an intermediate representation of the program that was generated earlier through compilation. A good example of this is the Java Virtual Machine (JVM). The JVM is an interpreter that is implemented in machine code and knows how to execute Java bytecode. However, as we'll see in a moment, in a modern JVM this is only half of the story.
- Insert Figure 3. Running a Java Application
Embedded Interpreters
Up until now we've dealt with only a single language, so everything has been straightforward. Problems begin to arise when we need to integrate components written in two different languages. Let's say, for example, that we have an application written in Java but we want it to call a function written in TCL (perhaps to perform some calculation for the user).
One way to do this is to load a TCL interpreter like we did before, but communicate with it through the Java Native Interface (JNI). To do this, we would need to write a bit of glue code in C that wraps around the interpreter and provides native methods that we can call from Java. We would need functions to initialize the interpreter with our TCL source code, and to call the desired procedure. Any arguments passed into the TCL procedure would need to be converted into TCL-specific data structures, and the return value would need to be extracted from its TCL structure.
If all of this seems like a lot of work to you, you're absolutely right. A much better solution would be to use an existing component that solves this problem generically. In this case, there is a project called TclBlend which includes an embedded TCL interpreter and a Java API for it. To use TclBlend, you will still need to wrap your arguments in structures that TCL will understand, but now you can use Java objects rather than having to write JNI code.
- Insert Figure 4. Running TCL Code with TclBlend
Unfortunately, now that we've introduced code written in C we have lost the ability to run on any platform. Both the TCL interpreter and the TclBlend JNI code must be compiled specifically for each architecture that we want to use, and we need to deploy the resulting libraries along with our application.
Hosted Interpreters
An alternative to embedding an existing interpreter is to re-implement the interpreter in a higher-level language. For example, instead of using TclBlend for our example above we could have used Jacl, which is a TCL interpreter written entirely in Java. Unfortunately Jacl is restricted by what can be done from a Java program, so certain pieces of TCL functionality are not supported (e.g., environment variables[1]). On the other hand, Jacl has no native code and is therefore just as portable as Java.
- Insert Figure 5. Running TCL Code with Jacl
There are quite a few advantages to hosted interpreters over embedded interpreters. In particular, it is often convenient for the host language to have control over every operation that the hosted language can perform. This provides naturally for sandboxing, a security technique which we will cover in detail in Chapter 12. Hosted interpreters are also generally easier to suspend, or to serialize out to a file for future use. This can be used to migrate programs between processes, or to add transactional support to any language.
However, hosted interpreters also carry a heavy cost. They add an additional layer of interpretation, which translates directly into a fixed performance overhead. If that weren't true, we would write interpreters for everything. It's certainly possible to write an interpreter for C. You could then write a Java interpreter in C, and a Ruby interpreter in Java, etc. However, having so many levels of interpretation would clearly not be efficient. Instead, we write compilers. You can think of a compiler as a way to bypass one or more layers of interpretation.
- Insert Figure 6. Interpreted vs. Compiled Python
In Chapter 5 we will discuss interpreters in more detail.
Translators
The term translator refers to a wide class of tools that translate computer programs from one format into another. These formats can either be the source code grammar for various programming languages, or the compiled representation of these languages. For example, source-to-binary translators are generally referred to as compilers. In addition to compilers, we'll be discussing binary-to-binary translators and various types of source code generators.
Compilers
Compilers are one specific type of translator that converts from a source language into an executable language. In this book we'll be focusing on compilers that convert from the source code of one language into an executable format that is generally associated with one or more other programming languages. In particular, the Java bytecode format is generally thought of as an intermediate representation for Java source code before it is interpreted by the Java VM. However, there are many compilers available which can convert from other languages, such as Python or Scheme, into Java bytecode. In this way the standard Java interpreters and translators can act on the other language in the same way they would act on Java source code.
Compilers are typically invoked as a separate stage in your application's build process, long before your project is deployed. Sometimes compilers are invoked as part of an interpreter, immediately before execution. Some interpreters even alternate between compiling and evaluating, as in the case of Perl. Finally, some compilers are invoked and re-invoked at runtime and actually use runtime information as hints during the compilation and optimization process.
Static Compilers
Static compilers transform your program from source code into some binary representation. These are the compilers that you are probably familiar with. For example, all C and C++ compilers, such as the GNU CC compiler, are static compilers.
- Insert Figure 7. Various Compilers
Static compilers are available for many languages, including Java. One of the more interesting static compilers for Java is GCJ, which is a front end for GCC. Compiling your Java code with GCJ allows your Java methods to call C and C++ methods directly, without the need for a bridge like JNI. Most other static Java compilers are commercial products that are used either for efficiency reasons or to avoid the need for a Java Virtual Machine to be installed on the client machine. Static compilers could also be used for obfuscation, to make the distributed code more difficult to analyze. But obviously, if it is possible to write a compiler it is also possible (though usually more difficult) to write a decompiler, so, to paraphrase a popular open source tagline, this type of security through obscurity is generally not much security at all.
Just-In-Time Compilers
Just-in-Time (JIT) compilers, on the other hand, are often tied to an interpreter and use information about how the program executes to influence the way that the program is compiled. For example, rather than simply eliminating so-called "dead code" that can never be executed, a dynamic compiler can compile only the individual blocks of code that have been executed and can replace all others with a stub to compile or interpret that code.
Java's HotSpot compiler is an example of a dynamic compiler that compiles Java bytecode into machine code. The HotSpot compiler comes bundled with Sun's Java Runtime Environment, and works alongside the interpreter. When a piece of code has been called a certain number of times, HotSpot will kick in and will compile the bytecodes directly to machine code.
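The "called a certain number of times" trigger can be illustrated with a toy model. This is not how HotSpot is implemented; it is a minimal sketch, assuming a simple invocation counter, in which the "interpreted" and "compiled" versions of a method are both ordinary Java lambdas. All names (HotMethodSketch, call, isCompiled) and the threshold value are hypothetical.

```java
import java.util.function.IntUnaryOperator;
import java.util.function.Supplier;

// Illustrative sketch only: models an invocation-count trigger, not a real JIT compiler.
class HotMethodSketch {
    private final IntUnaryOperator interpreted;          // slow path, always available
    private final Supplier<IntUnaryOperator> compiler;   // produces the fast path on demand
    private final int threshold;                         // calls needed before compiling
    private IntUnaryOperator compiled;                   // null until the method becomes "hot"
    private int calls;

    HotMethodSketch(IntUnaryOperator interpreted, Supplier<IntUnaryOperator> compiler, int threshold) {
        this.interpreted = interpreted;
        this.compiler = compiler;
        this.threshold = threshold;
    }

    int call(int arg) {
        if (compiled != null) {
            return compiled.applyAsInt(arg);   // already hot: use the compiled version
        }
        calls++;
        if (calls >= threshold) {
            compiled = compiler.get();         // compile now, use it from the next call onward
        }
        return interpreted.applyAsInt(arg);
    }

    boolean isCompiled() {
        return compiled != null;
    }

    public static void main(String[] args) {
        HotMethodSketch square = new HotMethodSketch(x -> x * x, () -> x -> x * x, 3);
        for (int i = 1; i <= 5; i++) {
            System.out.println(square.call(i) + " compiled=" + square.isCompiled());
        }
    }
}
```

Code that is called only once or twice never pays the compilation cost, which is the trade-off a counter-based trigger is meant to capture.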
- Insert Figure 8. Java's JIT Compiler
JIT compilers should not be confused with static compilers that are executed at runtime, like those in Perl, Python, and other similar languages. These compilers do not selectively re-compile code based on runtime performance, and they must at a minimum compile all methods the first time they are executed. There is no reason that Java could not have an interpreter that took a .java source file and automatically compiled it before executing it (in fact, BeanShell can now do something similar to this), but that would be very different from the HotSpot JIT compiler that is being mentioned here.
- Mention compiling to hardware (e.g., C source code to FPGA translation) ?
Binary-to-Binary Translators
Rather than compiling source code into some executable format, programs are often written to translate between executable formats. For example, the three primary intermediate representation formats -- Java bytecode, MSIL, and Parrot bytecode -- are all approximately equivalent in terms of their functionality. It would be possible to write translators to convert between any two of these formats.
In fact, there has been a lot of interest in Java bytecode to CIL translators. J#, Microsoft's latest attempt to reach out to Java programmers, includes a converter that can be used to turn compiled Java code into CIL code that can be used in the .NET CLR. There are also other commercial translators available.
Code Generators
It is also possible to write a program that generates source code in some language. This can be done by parsing source code in another language, resulting in a source-to-source translator. These tend not to be complete translators, because that would involve getting the parsing rules for the language exactly right, but a Perl script with a few regular expressions can often be extremely useful in translating code between languages.
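A regular-expression translator of the kind described above might look like the following. The text mentions a Perl script; this sketch uses Java's java.util.regex instead to keep the examples in one language. It handles only two made-up cases (Perl-style print statements with a plain string literal, and # comments), and every name in it is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: a deliberately incomplete Perl-to-Java line translator.
class RegexTranslatorSketch {
    // Matches lines such as:  print "hello";
    private static final Pattern PRINT = Pattern.compile("^\\s*print\\s+(\".*\");\\s*$");
    // Matches lines such as:  # a comment
    private static final Pattern COMMENT = Pattern.compile("^(\\s*)#(.*)$");

    static String translateLine(String line) {
        Matcher m = PRINT.matcher(line);
        if (m.matches()) {
            return "System.out.print(" + m.group(1) + ");";
        }
        m = COMMENT.matcher(line);
        if (m.matches()) {
            return m.group(1) + "//" + m.group(2);
        }
        return line; // anything unrecognized is passed through for a human to fix
    }

    public static void main(String[] args) {
        System.out.println(translateLine("print \"hello\";"));
        System.out.println(translateLine("# greet the user"));
    }
}
```

The gaps are easy to find: a string containing an interpolated variable such as "$name" would be copied through unchanged and mean something different in Java, which is exactly why such tools rarely amount to complete translators.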
Code generators can also be driven off of the compiled representation of a language. The resulting source code tends to be inefficient and less understandable to humans, but it usually provides a more complete mapping between the languages.
There are a number of situations in which these techniques would be preferred over a binary-to-binary translator. The code produced by such a translator can be difficult to debug, since there is no source code to refer to. For this reason code generators are also preferred when you want to maintain the code in the new language, rather than in the old language. Finally, the ultimate compiled representation is typically more efficient, because the compiler still has a chance to optimize the code for the new language.
Bridges
Rather than implementing one language in terms of another, you could simply leave the two languages as is and provide a bridge between them. The above techniques are often helpful when language integration is built in from the ground up, but when both languages are already implemented when the time comes to integrate them, they are usually joined by a bridge.
Some languages provide support for proxies, which are objects that appear to have some API but in reality are simply forwarding method calls on to something else. For example, Java's Dynamic Proxy API allows for the creation of objects that implement one or more interfaces, but whose implementation can be delegated to another object with a generic interface, InvocationHandler. The CGLIB project provides very similar functionality, but will instead generate classes on the fly so that proxies are not limited to implementing interfaces -- they can extend existing classes as well. For languages that do not have support for proxies and do not provide a catch-all method (like Perl's AUTOLOAD), stub code needs to be auto-generated which can serve as an adapter between the desired object interface and the bridge.
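As a rough illustration of the Dynamic Proxy API, the sketch below creates a proxy for a small interface and forwards every call through an InvocationHandler. The java.lang.reflect.Proxy and InvocationHandler types are standard; everything else (the Calculator interface, the callForeign stand-in for "the other side of the bridge", and the log parameter) is an assumption made for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: Calculator and callForeign are hypothetical.
class ProxyBridgeSketch {
    // The API that callers see.
    public interface Calculator {
        int add(int a, int b);
    }

    // Stand-in for the far side of a bridge, such as a procedure in another language.
    static Object callForeign(String name, Object[] args) {
        if (name.equals("add")) {
            return (Integer) args[0] + (Integer) args[1];
        }
        throw new UnsupportedOperationException(name);
    }

    // Builds a Calculator whose methods are all forwarded by name to callForeign.
    static Calculator newCalculator(List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());                   // record which method crossed the bridge
            return callForeign(method.getName(), args);  // forward by name with the boxed arguments
        };
        return (Calculator) Proxy.newProxyInstance(
                Calculator.class.getClassLoader(),
                new Class<?>[] { Calculator.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Calculator calc = newCalculator(log);
        System.out.println(calc.add(2, 3)); // prints 5
        System.out.println(log);            // prints [add]
    }
}
```

The caller only ever sees the Calculator interface; the handler is the single generic point through which every call passes, which is what makes it a convenient place to attach a bridge.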
Bridge is a fairly vague term that can be used to describe a lot of different techniques. We'll divide them up into two categories: internal bridges and external bridges. Internal bridges connect two different languages or environments within the same process space. External bridges, on the other hand, connect components running in two different processes, possibly even on two different machines.
Internal Bridges
Because internal bridges live in the same process as both languages that they need to connect, they can often share memory between the two.
We've already discussed one bridge in this chapter. JNI, or the Java Native Interface, is a generic bridge between Java and C code.
- Insert Figure 9. Using JNI
We've seen another internal bridge in this chapter as well. TclBlend can be thought of as a Java-TCL bridge.
Most scripting languages have native bridges as well. For example, Perl has a native interface called XS that is very similar to JNI. A package called SWIG can be used to generate skeleton integration code for many languages including Java, Perl, Python, Ruby, TCL, and many more.
External Bridges
External bridges, on the other hand, operate across process boundaries and therefore require some way of transferring data between processes. This is most often done by serializing, or marshalling, the data types into a sequence of bytes and streaming them across a TCP socket connection to the other process.
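Marshalling a call into bytes can be sketched with the standard DataOutputStream and DataInputStream classes. The wire layout used here (procedure name, argument count, then the integer arguments) is invented for the example and does not correspond to any real protocol, and a byte array stands in for the TCP socket. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative sketch only: the message layout is made up for this example.
class MarshalSketch {
    // Sender side: turn a procedure call into a sequence of bytes.
    static byte[] marshalCall(String procedure, int... args) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(procedure);     // procedure name
            out.writeInt(args.length);   // number of arguments
            for (int arg : args) {
                out.writeInt(arg);       // each argument as a 4-byte big-endian integer
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Receiver side: rebuild the call from the bytes and describe it as text.
    static String unmarshalCall(byte[] message) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(message))) {
            StringBuilder call = new StringBuilder(in.readUTF()).append('(');
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                if (i > 0) call.append(", ");
                call.append(in.readInt());
            }
            return call.append(')').toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] message = marshalCall("add", 2, 3);
        System.out.println(unmarshalCall(message)); // prints add(2, 3)
    }
}
```

In a real external bridge the receiver would be a different process, possibly written in a different language, so both sides have to agree on the byte layout rather than on any in-memory representation.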
Two languages can also be transparently integrated in much the same way that two applications written in Java can be integrated via a remote messaging mechanism like RMI. Communication between two JVMs is typically implemented via socket communication, so there needs to be a representation for the objects in terms of byte streams.
In the case where this data representation format is language-independent, and there is an implementation of it in two or more languages, the communication between the languages happens transparently. This is the idea behind the OMG's Common Object Request Broker Architecture (CORBA). Modern versions of CORBA use a language-neutral binary protocol called IIOP.
- Insert Figure 10. Using CORBA
Another popular type of external bridge is a COM-CORBA bridge, which translates messages and allows them to be passed between CORBA components (written in any language) and COM components (written in C++, VB, etc.).
Web Services provide a more modern example of an external bridge. The communication format for Web Services is typically XML, which acts as a language-neutral data format, and specifications such as SOAP and XML-RPC provide a mapping between the document format and the semantics of a remote procedure call.
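To show how a procedure call maps onto an XML document, here is a minimal sketch that builds the body of an XML-RPC request by hand. The element names (methodCall, methodName, params, param, value, int) come from the XML-RPC format; the class name, the restriction to integer parameters, and the lack of escaping are simplifications made for this example. A real client would use an XML-RPC library and would send the document over HTTP.

```java
// Illustrative sketch only: builds an XML-RPC request body for integer parameters.
class XmlRpcSketch {
    static String methodCall(String methodName, int... params) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>");
        xml.append("<methodCall>");
        xml.append("<methodName>").append(methodName).append("</methodName>");
        xml.append("<params>");
        for (int p : params) {
            // Each parameter is wrapped in <param><value>, with the type as the innermost tag.
            xml.append("<param><value><int>").append(p).append("</int></value></param>");
        }
        xml.append("</params>");
        xml.append("</methodCall>");
        return xml.toString();
    }

    public static void main(String[] args) {
        System.out.println(methodCall("add", 2, 3));
    }
}
```

Because the request is plain text, any language with an XML parser can play the role of the server, which is what makes the format language-neutral.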
The use of bridges can have a profound impact on the performance of a system. At a minimum, internal bridges will require some mapping of data formats, and therefore the overhead added will be proportional to the amount and complexity of the data that is crossing the bridge, as well as to how frequently the bridge is crossed. In extreme cases this can be even worse than the cost of an interpreter. However, the use of a bridge may also remove the need for an interpreter, and using a bridge in order to remove a layer of interpretation could be a win. For example, the cost of crossing the JNI bridge may be expensive, but if it allows you to use a TCL interpreter implemented in C instead of one implemented in Java, this may actually provide better performance, depending on how much communication between languages is required.
External bridges will require context switching or network communication, so they tend to be the least efficient of the methods discussed here. However, they can be the most flexible, safest, and least intrusive method of integration.
In Chapter 7 we will discuss bridges in more detail.
Continue on to Chapter 4: Integration Frameworks.

