Dynamic Languages and Java/Integration Frameworks

Introduction

In the previous chapter we examined a number of techniques that can be used to integrate other programming languages into Java, and we discussed a few of the open source projects that make use of these techniques. Many of these projects (like JACL and TclBlend) provide similar functionality, but through very different means (e.g., compilers, interpreters, and bridges).

However, in most cases the Java programmers who use these projects don't particularly care which method is employed. They simply want a consistent API that hides the details of each particular language. These programmers shouldn't need to be experts in every language; there are too many details to remember (data types, language features, method of execution, and so on). To address this problem and make it easy for programmers to support more than one language, a number of scripting frameworks have been created. These frameworks abstract many separate projects behind a common interface.

There are a number of advantages to these scripting frameworks, including:

  • For user-supplied functions or scripts, the decision of which specific languages to support can be deferred until run-time. Your program simply needs to code to one API and implementations for new programming languages can be added at a later point (often with no code changes).
  • Because your program uses one language-independent API, you do not need to be aware of the internals of any of the languages that you support (e.g., how to initialize an interpreter, or what data types are supported).
  • Many integration frameworks help to convert data between Java's native data types and those of the target languages. For example, the framework may automatically unwrap primitives from language-specific wrappers.
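
The first advantage can be made concrete with a small sketch. This is not BSF or any other real framework: the ScriptRunner interface and the RunnerRegistry class are hypothetical names invented here, purely to show how a host program can code against one interface while language implementations are plugged in at run-time.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, framework-neutral sketch: none of these names come from BSF.
interface ScriptRunner {
    Object eval(String code);
}

final class RunnerRegistry {
    private final Map<String, ScriptRunner> runners = new ConcurrentHashMap<>();

    // New languages can be plugged in without touching the calling code.
    void register(String language, ScriptRunner runner) {
        runners.put(language, runner);
    }

    Optional<ScriptRunner> lookup(String language) {
        return Optional.ofNullable(runners.get(language));
    }

    // The host program only ever calls this method; which language is
    // used is decided by a string that can come from configuration.
    Object eval(String language, String code) {
        ScriptRunner runner = lookup(language).orElseThrow(
            () -> new IllegalArgumentException("no runner for " + language));
        return runner.eval(code);
    }
}
```

A real framework adds much more (bean passing, error reporting, engine discovery), but the shape of the client code stays roughly this small.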

Bean Scripting Framework

The most commonly used abstraction layer is the Bean Scripting Framework (BSF). The purpose of BSF is to provide a Java application with access to various scripting languages via a common API. The main entry point to this API is the BSFManager class. One or more scripting engines register with the BSFManager class and provide an implementation of the BSFEngine interface. Instances of BSFManager can be used to evaluate expressions or execute blocks of code in any of the supported languages. Figure 1 shows the most important parts of the BSF architecture.

Insert Figure 1. BSF Architecture

Each BSFEngine implementation provides support for a single programming language by adapting from the BSF API to a stand-alone component. There are quite a few methods on the BSFEngine interface, so BSF includes a BSFEngineImpl abstract class that provides default implementations for many of these methods.

Here is the minimum amount of code needed to provide BSF support for a new programming language:

   package com.oreilly.javalangint.mylang.bsf;

   import java.util.Vector;
   import org.apache.bsf.*;
   import org.apache.bsf.util.BSFEngineImpl;
   import com.oreilly.javalangint.mylang.MyInterpreter;

   public class MyEngine extends BSFEngineImpl
   {
       private MyInterpreter myInterp;

       public void initialize(BSFManager man, String name, Vector beans)
           throws BSFException
       {
           super.initialize(man, name, beans);

           myInterp = new MyInterpreter();
           // initialize myInterp
       }

       public void declareBean(BSFDeclaredBean bean)
           throws BSFException
       {
           myInterp.setVariable(bean.name, bean.bean);
       }

       public void undeclareBean(BSFDeclaredBean bean)
           throws BSFException
       {
           myInterp.unsetVariable(bean.name);
       }

       public Object eval(String source, int lineNo, int columnNo, Object expr)
           throws BSFException
       {
           // evaluate expr with myInterp, return the result
       }

       public Object call(Object obj, String methodName, Object[] args)
           throws BSFException
       {
           // call obj.methodName(args) through myInterp
       }

       public void terminate()
       {
           super.terminate();

           // destroy myInterp...
           myInterp = null;
       }
   }    

BSF actually makes a distinction between evaluating an expression and executing a block of code. However, the BSFEngineImpl abstract class implements exec() by delegating to eval(). If your language needs to treat statements differently from expressions, you can also override exec():

       public void exec(String source, int lineNo, int columnNo, Object statements)
           throws BSFException
       {
           // evaluate statements with myInterp
       }

There are also two other sets of methods that you can override, to control the generation of source code and debugging support. These will be discussed in more detail in Chapter 11.

History of BSF

BSF was created in 1999 as a research project at IBM. As interest in the project grew, it moved through the IBM alphaWorks and developerWorks websites until version 2.2 when it was donated to the Apache Software Foundation's Jakarta Project. Version 2.3 was the first version released by Jakarta, and remains the current version at the time of this writing.

When the code moved from IBM to Apache, the package structure changed as well -- from com.ibm.bsf to org.apache.bsf. This is something to be aware of when you look for sample code or BSFEngine implementations, since engines written for the IBM version of BSF will not work with version 2.3. For the rest of this book we will assume the use of Apache's BSF.

Registering Engines

As mentioned earlier, support for each programming language is provided by an implementation of the BSFEngine interface. One or more BSFEngine instances are registered with the BSFManager by calling the static method registerScriptingEngine. Each language is given a unique name and one or more file extensions, which are used to guess the language from a file name.

For example, we could register the MyEngine class given in the example above with the following code:

   BSFManager.registerScriptingEngine(
       "mylang",                                       // Language name
       "com.oreilly.javalangint.mylang.bsf.MyEngine",  // BSFEngine class
       new String[] {"my"}                             // extensions (e.g., script.my)
   );

A number of engines are distributed along with BSF and do not have to be registered explicitly. They are listed in a file called Languages.properties, which is distributed in bsf.jar. Note, however, that only the engines are provided; you will still need to download the interpreter for each language that you want to use. The languages whose engines are distributed along with BSF include JavaScript, JACL, NetREXX, Jython/JPython, and XSLT. Other languages (like Pnuts, BeanShell, BASIC, Ruby, and JudoScript) do not have their engines distributed along with BSF, but are registered in the base Languages.properties anyway.

   # List of script types and their associated scripting engines
   #
   # languageDescriptor = engineClass, ext1|ext2|... {, codebaseURL, ...}
   #
   # where exti are extensions for the language. Note that we leave 
   # all the engines enabled now and allow them to fail at load time.
   # This way engines can be added by just adding to the classpath 
   # without having to edit this file. Cheating, really, but it works.
   #
   javascript = org.apache.bsf.engines.javascript.JavaScriptEngine, js
   jacl = org.apache.bsf.engines.jacl.JaclEngine, jacl
   netrexx = org.apache.bsf.engines.netrexx.NetRexxEngine, nrx
   java = org.apache.bsf.engines.java.JavaEngine, java
   javaclass = org.apache.bsf.engines.javaclass.JavaClassEngine, class
   bml = org.apache.bml.ext.BMLEngine, bml
   vbscript = org.apache.bsf.engines.activescript.ActiveScriptEngine, vbs
   jscript = org.apache.bsf.engines.activescript.ActiveScriptEngine, jss
   perlscript = org.apache.bsf.engines.activescript.ActiveScriptEngine, pls
   perl = org.apache.bsf.engines.perl.PerlEngine, pl
   jpython = org.apache.bsf.engines.jpython.JPythonEngine, py
   jython = org.apache.bsf.engines.jython.JythonEngine, py
   lotusscript = org.apache.bsf.engines.lotusscript.LsEngine, lss
   xslt = org.apache.bsf.engines.xslt.XSLTEngine, xslt
   pnuts = pnuts.ext.PnutsBSFEngine, pnut
   beanbasic = org.apache.bsf.engines.beanbasic.BeanBasicEngine, bb
   beanshell = bsh.util.BeanShellBSFEngine, bsh
   ruby = org.jruby.javasupport.bsf.JRubyEngine, rb
   judoscript = com.judoscript.BSFJudoEngine, judo|jud
   groovy = org.codehaus.groovy.bsf.GroovyEngine, groovy|gy
   objectscript = oscript.bsf.ObjectScriptEngine, os
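
To show how the "languageDescriptor = engineClass, ext1|ext2|..." format described in the header comment can be read, here is a minimal sketch that builds an extension-to-language index from such text. It is not BSF's own parser; the LanguageTable class and its byExtension method are names made up for this illustration, and optional trailing fields such as codebaseURL are simply ignored.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of BSF.
final class LanguageTable {
    // Builds an extension -> language-names index from text in the
    // "language = engineClass, ext1|ext2|..." format.
    static Map<String, SortedSet<String>> byExtension(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        Map<String, SortedSet<String>> index = new TreeMap<>();
        for (String language : props.stringPropertyNames()) {
            String[] fields = props.getProperty(language).split(",");
            if (fields.length < 2) {
                continue; // entry has no extension list
            }
            for (String ext : fields[1].trim().split("\\|")) {
                index.computeIfAbsent(ext.trim(), k -> new TreeSet<>()).add(language);
            }
        }
        return index;
    }
}
```

The index maps each extension to a set rather than a single name because the listing itself contains a collision: both jpython and jython claim the py extension. How BSF resolves that case is not something this sketch tries to reproduce.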

NOTE: As of the upcoming version 2.3.1 of BSF, you can have multiple Languages.properties files, all with the same path but in different JARs. This allows language interpreters to provide both a BSFEngine implementation and a Languages.properties file in their JAR. When BSF is initialized it will search through all Languages.properties files contained in all JARs in the classpath, and your language will be registered automatically.

Running Scripts on the Command Line

BSF includes a main class, org.apache.bsf.Main, that can invoke scripts written in any language specified in BSF's Languages.properties file. One of the arguments given to this class is -in FILE, where FILE is the name of the script that you want to execute. Another argument, -lang LANGUAGE, specifies which language the script is written in. If this argument is not specified, BSF will attempt to guess the language based on the file's extension. If a matching engine is found, that engine is invoked to process the supplied script.

For example, the BeanShell script given below can be executed with the following command-line:

   $ java -classpath bsf.jar:bsh.jar org.apache.bsf.Main -in primes.bsh
   2
   3
   5
   7
   11
   13
   17
   ...

primes.bsh, a BeanShell script to find prime numbers:

   primes = new ArrayList();

   isPrime (number) {
       for (i : primes) {
           if ((number % i) == 0) return false;
           if ((i * i) > number)  return true;
       }
       return true;
   }

   for (number = 2; number > 0; number++) {
       if (isPrime(number)) {
           primes.add(number);
           print(number);
       }
   }

There is one more argument supported by BSF's Main, -mode, which can have one of three values: eval, exec, or compile. eval and exec invoke the methods of the same name on the BSFEngine. compile is used for generating source code -- this will be described in more detail later.

Registering and Declaring Beans

The ability to run a script inside of Java would not be very useful without a way to pass information into the script. To facilitate this, BSF will populate a single variable called bsf in the script's language. This variable contains an instance of the org.apache.bsf.util.BSFFunctions class, which provides various callbacks that allow the script to access BSF. One of the callbacks provided is the lookupBean() method, which can be used to retrieve objects that have been registered by calling BSFManager.registerBean from Java. This allows you to store objects by a logical name, and retrieve them through the bsf variable.

   bsfManager.registerBean("foo", new ComplexObject());

   my $foo = $bsf->lookupBean("foo");
   print "Got: $foo\n";

You can also "declare" a bean by calling BSFManager.declareBean, which forces the creation of a variable with the specified name. This has the advantage of not requiring the script to call bsf.lookupBean(String), but at the expense of clarity. It is no longer obvious from looking at the interpreted code where the variable originated, and there is no guarantee that declared variables will not conflict with variables that are used privately by the script.

   bsfManager.declareBean("foo", new ComplexObject(), ComplexObject.class);

   # $foo is already declared -- use it
   print "Got: $foo\n";
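
The name-collision risk that comes with declared beans can be illustrated with a toy model. The class below is not BSF code: its method names only echo BSF's, and the two maps are an assumption about how a registry and a script's variable namespace relate conceptually, not a description of any engine's internals.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: the method names echo BSF's, but this is not BSF code.
final class BeanScopesSketch {
    // Reachable from the script only through an explicit lookup.
    private final Map<String, Object> registry = new HashMap<>();
    // Stands in for the script's own variable namespace.
    private final Map<String, Object> variables = new HashMap<>();

    void registerBean(String name, Object bean) { registry.put(name, bean); }
    Object lookupBean(String name) { return registry.get(name); }

    // Returns whatever the script already had under that name (null if nothing).
    Object declareBean(String name, Object bean) { return variables.put(name, bean); }

    void scriptAssign(String name, Object value) { variables.put(name, value); }
    Object scriptRead(String name) { return variables.get(name); }
}
```

In this model a registered bean named foo never disturbs a script variable named foo, whereas declaring the bean silently replaces it.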

NOTE: The object that the engine stores in the bsf variable should be an instance of the org.apache.bsf.util.BSFFunctions class (or some kind of proxy to it). However, some engines will actually populate this with the BSFManager object that invoked the script. If you stick to the subset of methods available on the BSFFunctions class, you won't notice the difference. However, it's interesting to note that if the BSFManager itself is made available, you may be able to invoke one BSF language from another (e.g., eval Jython code from Jacl or vice versa).

Evaluating Code

The primary method of evaluating code through BSF is by calling the eval method on BSFManager. You will need to pass in a language name, which is used to retrieve the appropriate BSFEngine. This name must match the key (the part before the equals sign) of an entry in the Languages.properties file, and the match is case-sensitive.

Unfortunately, there are no helper methods for evaluating files, so you will need to obtain a string containing your entire source code, and pass in the name of the source file and line and column numbers to be used for debugging purposes. BSF also provides a method to guess the language based on a file extension, so we can take advantage of this as well.

Here's a utility class that integrates all of these things:

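   import java.io.BufferedReader;
   import java.io.File;
   import java.io.FileReader;
   import java.io.IOException;
   import org.apache.bsf.BSFException;
   import org.apache.bsf.BSFManager;

   public class BSFFileUtils
   {
       // Static, so that it can be called as BSFFileUtils.evalFile(...)
       public static Object evalFile (BSFManager manager, File file)
           throws IOException, BSFException
       {
           String fileName = file.getPath();

           // Guess the language from the file extension
           String language = manager.getLangFromFilename(fileName);
           String code = extractFileContents(file);

           // The file name and the 0, 0 offsets are only used for error reporting
           return manager.eval(language, fileName, 0, 0, code);
       }

       // Reads the whole file into a single string
       private static String extractFileContents (File file)
           throws IOException
       {
           StringBuffer buf = new StringBuffer();

           BufferedReader r = new BufferedReader(new FileReader(file));
           try {
               int len;
               char[] chars = new char[8192];

               while ((len = r.read(chars, 0, chars.length)) != -1) {
                   buf.append(chars, 0, len);
               }

               return buf.toString();
           } finally {
               r.close();
           }
       }
   }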
       

Now that we have a way to feed our script into BSF with the proper code and language specified, we can take an ordinary Perl script, like this one:

   use File::Find;

   my $count = 0;

   find(sub {
     $count++ if /\.java$/;
   }, $directory);

   return $count;

...and run it through Java like this:

   public static void main (String[] args)
       throws IOException, BSFException
   {
     String directory = args[0];
     File file = new File("count-java.pl");

     BSFManager manager = new BSFManager();

     manager.declareBean("directory", directory, String.class);
     Number numFiles = (Number)BSFFileUtils.evalFile(manager, file);

     if (numFiles.intValue() > 0) {
         System.err.println("Encountered " + numFiles + " Java source files.");
     } else {
         System.err.println("No Java source files found.");
     }
   }
     

Return Values

The exact format of the data returned from eval is somewhat inconsistent between languages. For the common Java types (ints, doubles, strings, booleans, etc.), the behavior is generally one of the following:

  • A few languages, such as BeanShell and Groovy, use the Java classes exclusively (String, Integer, Double, etc.) and always return these through BSF.
  • Some languages have their own language-specific data structures but the BSFEngine implementation is responsible for converting from these data structures to ordinary Java objects. For example, JRuby uses classes like RubyFixnum and RubyString internally and exposes them as part of its public API, but also provides a helper class that can convert between Java classes and Ruby classes. The BSFEngine implementation is responsible for invoking the type converter so that no Ruby details leak to the outside world.
  • Other languages use their own language-specific data structures and make no attempt to do type conversions, even through BSF. For example, Jython uses PyInteger, PyFloat, PyString, etc. internally and will return them from BSF, even though there are no Java utilities to convert these types to Java types. Since the primary goal of BSF is to provide a language-independent API, it doesn't make much sense to return language-specific objects, because the client would need to introduce unwanted dependencies in order to do anything useful with them. [1]

To help explore what kind of objects are returned from BSF, I've written a simple application that can evaluate expressions in any BSF-supported language.

Insert Figure 2. Screenshot of BSF Calculator

As you can see, most languages return the types that you would expect: subclasses of Number for numeric values, String for strings, etc. However, languages are not consistent about which specific class they return (Integer, Long, Double, etc.). As mentioned above, Jython is the exception to this rule: it returns its own data types.
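
Because engines disagree on whether a number comes back as an Integer, a Long, or a Double, client code often ends up normalizing results before using them. The following is one possible sketch of that step; ResultNormalizer is a hypothetical helper, not something BSF provides, and it deliberately leaves language-specific objects (such as Jython's) untouched, because converting them would require exactly the dependency discussed above.

```java
// Hypothetical helper, not part of BSF.
final class ResultNormalizer {
    // Collapses the common Java wrapper types into Long, Double, or String.
    // Anything else (including language-specific objects) is returned as-is.
    static Object normalize(Object result) {
        if (result instanceof Float || result instanceof Double) {
            return Double.valueOf(((Number) result).doubleValue());
        }
        if (result instanceof Byte || result instanceof Short
                || result instanceof Integer || result instanceof Long) {
            return Long.valueOf(((Number) result).longValue());
        }
        if (result instanceof CharSequence) {
            return result.toString();
        }
        return result;
    }
}
```

With this in place the caller can compare normalize(a) and normalize(b) with equals() without caring which integer width each engine happened to choose.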

The source code for the BSF Calculator can be downloaded from the website for this book, at XXX.

Interacting with Complex Objects

In addition to returning the basic data types, it is also possible to construct complex objects and return them from BSFManager.eval(). In this case, most BSFEngine implementations will return a language-specific placeholder for the object. Luckily, BSF provides for a language-independent way to interact with these objects. The BSFEngine.call() function can be used to call methods on an object that was previously returned from eval.

For example:

   BSFManager manager = new BSFManager();
   BSFEngine engine = manager.loadScriptingEngine("beanshell");

   String expr = 
     "someClass() {" +
       "int getValue() { return 42; }" +
       "return this;" +
     "}" +
     "return someClass();";

   Object object = engine.eval("(eval)", 0, 0, expr);
   Integer value = (Integer)engine.call(object, "getValue", new Object[0]);     

The above code declares a class in BeanShell, called someClass, and then returns an instance of it. This instance is actually a bsh.XThis object, which we could invoke methods on through its reflection-like API. However, the call method on the BeanShell engine will forward method calls for us without introducing any dependencies on the bsh package.

Sometimes, though, even the BSF call() API is inconvenient. You may want to interact with the object as if it were written directly in Java. Because of the compile-time dependencies, you will still need a Java interface to code against, but it is fairly easy to create a proxy that implements this interface and forwards method calls through call(). In fact, using the dynamic proxy API (java.lang.reflect.Proxy) we can automate this process. The BSFProxyFactory class given below contains a factory that can wrap around BSF and provide objects that implement Java interfaces. For example, it can be used to ...:

   BSFManager mgr = new BSFManager();
   BSFEngine engine = mgr.loadScriptingEngine("beanshell");

   BSFProxyFactory factory = new BSFProxyFactory(engine);
   ...
   public class BSFProxyFactory
   {
       private BSFEngine engine;
       private Map<String,Class> classMap;

       public BSFProxyFactory (BSFEngine engine)
       {
           this.engine = engine;
           this.classMap = new HashMap<String,Class>();
       }

       public Object eval (String expr)
           throws BSFException
       {
           return mapOut(engine.eval("(expression)", 0, 0, expr));
       }

       private Object mapIn (Object obj)
       {
           if (obj instanceof BSFProxy) {
               return ((BSFProxy)obj).getProxiedObject();
           } else {
               return obj;
           }
       }

       private Object mapOut (Object obj)
       {
           Class javaClass = classMap.get(...); // TODO

           if (javaClass != null) {
               Class[] interfaces = new Class[] {
                   javaClass,
                   BSFProxy.class
               };

               return Proxy.newProxyInstance(
                   BSFProxyFactory.class.getClassLoader(),
                   interfaces,
                   new BSFInvocationHandler(obj));
           } else {
               return obj;
           }
       }

       private interface BSFProxy
       {
           public Object getProxiedObject();
       }

       private class BSFInvocationHandler implements InvocationHandler
       {
           private Object proxiedObject;

           public BSFInvocationHandler (Object proxiedObject)
           {
               this.proxiedObject = proxiedObject;
           }

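           public Object invoke(Object obj, Method method, Object[] args)
               throws Throwable
           {
               if (method.getDeclaringClass().equals(BSFProxy.class)) {
                   return proxiedObject;
               }

               // args is null when the method takes no parameters
               Object[] mapped = new Object[args == null ? 0 : args.length];
               for (int i = 0; i < mapped.length; i++) {
                   mapped[i] = mapIn(args[i]);
               }

               // forward to the wrapped script object, not to the proxy itself
               return mapOut(engine.call(proxiedObject, method.getName(), mapped));
           }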
       }
   }
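
The dynamic proxy technique at the heart of this factory can be tried without BSF on the classpath. The sketch below is a simplified stand-in, not the factory itself: the BiFunction parameter plays the role that BSFEngine.call() plays above, and the ProxySketch and Answer names are invented for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.function.BiFunction;

// Simplified stand-in for the proxy idea; no BSF classes are used.
final class ProxySketch {
    // Example interface that Java code wants to program against.
    interface Answer {
        int getValue();
    }

    // "call" receives the method name and arguments, much like
    // BSFEngine.call(object, methodName, args) does for a script object.
    static <T> T wrap(Class<T> iface, BiFunction<String, Object[], Object> call) {
        InvocationHandler handler = (proxy, method, args) ->
            // args is null for methods without parameters
            call.apply(method.getName(), args == null ? new Object[0] : args);

        Object instance = Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[] { iface }, handler);
        return iface.cast(instance);
    }

    public static void main(String[] argv) {
        Answer answer = wrap(Answer.class,
            (name, args) -> name.equals("getValue") ? 42 : null);
        System.out.println(answer.getValue());
    }
}
```

Note that every method on the proxy, including toString() and hashCode(), is routed to the handler, so a production version would want to handle the java.lang.Object methods explicitly.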
     

Handling Exceptions

All exceptions thrown from within BSF will be wrapped with the checked exception BSFException. In addition to the usual message field (which will contain the specified file name and line number offset that was reported to BSF), a BSFException also contains a reason field that specifies what kind of error occurred. The possible values are listed as static constants on the BSFException class:

     public static int REASON_INVALID_ARGUMENT = 0;
     public static int REASON_IO_ERROR = 10;
     public static int REASON_UNKNOWN_LANGUAGE = 20;
     public static int REASON_EXECUTION_ERROR = 100;
     public static int REASON_UNSUPPORTED_FEATURE = 499;
     public static int REASON_OTHER_ERROR = 500;

Often the BSFException that is thrown will actually be wrapped around an exception that originated in an engine. In this case, the reason will be set to REASON_EXECUTION_ERROR and the specific exception will be accessible through the getTargetException method.

WARNING: BSFException does not play well with Java 1.4 wrapper exceptions, so if you wrap a BSFException with your own, you will very likely lose the real exception. The workaround for this is to extract the target exception from BSFException explicitly:

       public void runScript (File file)
           throws MyOwnException
       {
           try {
               manager.eval(...);
           } catch (BSFException ex) {
               if (ex.getTargetException() != null) {
                   throw new MyOwnException("Error executing " + file, ex.getTargetException());
               } else {
                   throw new MyOwnException("Error executing " + file, ex);
               }
           }
       }
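
The reason the workaround is needed can be demonstrated with a stand-in class. LegacyWrapperException below is hypothetical and only mimics what the warning implies about BSFException, namely that the underlying error is kept in a private field and is not handed to Throwable as a cause; that assumption is mine, not something confirmed from the BSF source.

```java
// Hypothetical stand-in for a pre-1.4 style wrapper exception.
final class UnwrapSketch {
    static class LegacyWrapperException extends Exception {
        private final Throwable target;

        LegacyWrapperException(String message, Throwable target) {
            super(message);        // the cause is NOT passed to Throwable
            this.target = target;
        }

        Throwable getTargetException() {
            return target;
        }
    }

    // Chooses the most informative throwable to chain onto a new exception.
    static Throwable bestCause(LegacyWrapperException ex) {
        return ex.getTargetException() != null ? ex.getTargetException() : ex;
    }
}
```

With such a class, getCause() returns null even though a target exception exists, so a stack trace printed from your own wrapper would stop at the wrapper unless the target is extracted explicitly.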
           

Generating Compiled Code

In this section we've only discussed the most commonly used features of BSF. However, it also has support for invoking compiled languages, and for languages with integrated debuggers. We will touch more on this in Chapter 11.

JSR-223 (javax.script.*)

BSF has been around for a long time, and has gained quite a bit of grassroots support. However, it has a much younger competitor that has been endorsed by the Java Community Process, and thus may have the power to supplant BSF. Java Specification Request (JSR) 223, "Scripting for the Java Platform," was announced at the 2003 JavaOne conference and has since been approved as a standard. Although an early release was made available in the summer of 2004, it was not until February of 2005 that the public draft was made available. This draft supports JavaScript, Groovy, and PHP.

The main focus of JSR-223 thus far has been on integrating scripting languages with J2EE and web technologies. In particular, one of the goals of JSR-223 is to allow PHP and JSP scripts to coexist peacefully on a single web server. It seems that Sun is working with Zend, a commercial PHP vendor, on this initiative.

However, JSR-223 also includes a generic abstraction layer around all scripting languages (javax.script.*), in much the same way that BSF does. In many respects the API that it provides is superior to BSF; however, the quality of the end result will depend on how many scripting languages are ultimately supported and how seamlessly each of them is integrated.

JSR-223 shares much of its architecture and terminology with BSF. It provides a ScriptEngineManager, which is aware of one or more ScriptEngine instances, each of which has a unique language name. An engine can have objects register with it, which are made available as variables in the target language.

Evaluating Code

Evaluating code in JSR-223 is very straightforward. Unlike BSF, there is no distinction between evaluating expressions and executing code. There is also no source information (line numbers and file names) passed in to JSR-223 to facilitate debugging. There is simply a single set of eval() methods, which can take either a String or a java.io.Reader.

       ScriptEngineManager manager = new ScriptEngineManager();
       ScriptEngine engine = manager.getEngineByName("groovy");
       engine.eval(new FileReader(new File("foo.groovy")));
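
One detail the snippet above glosses over is that getEngineByName() returns null when no matching engine is found. The sketch below, written against the javax.script API as it ships with the JDK, guards against that case; EvalSketch and evalIfAvailable are names chosen for this example, and wrapping ScriptException in an unchecked exception is just one possible design choice.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Illustrative helper; not part of javax.script.
final class EvalSketch {
    // Evaluates code with the named engine, or returns Optional.empty()
    // when no such engine is available on the classpath.
    static Optional<Object> evalIfAvailable(String engineName, String code) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            return Optional.empty();
        }
        try {
            // eval(String) would work equally well; the Reader overload
            // is used here to mirror the file-based example.
            return Optional.ofNullable(engine.eval(new StringReader(code)));
        } catch (ScriptException e) {
            throw new IllegalStateException("script failed: " + e.getMessage(), e);
        }
    }
}
```

Which engine names resolve depends entirely on what is installed, so a caller should treat an empty result as a normal outcome rather than an error.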

Running Scripts on the Command Line

Unlike BSF, JSR-223 has no Main class that can be used to run scripts from the command line. But the API is simple, so it's relatively straightforward to create a Main class, as I've done here.

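   import java.io.File;
   import java.io.FileReader;
   import java.io.IOException;
   import javax.script.*;

   public class ScriptMain
   {
       public static void main (String[] args)
           throws IOException, ScriptException
       {
           File f = new File(args[0]);

           // Pick the engine based on the script's file extension
           ScriptEngineManager manager = new ScriptEngineManager();
           ScriptEngine engine = manager.getEngineByExtension(getExtension(f));
           if (engine == null) {
               throw new RuntimeException("no engine registered for " + f);
           }

           Object obj = engine.eval(new FileReader(f));
           System.err.println("return value: " + obj);
       }

       // Returns the text after the last dot in the file's path
       private static String getExtension (File f)
       {
           String fileName = f.getPath();
           int i = fileName.lastIndexOf(".");
           if (i > -1) {
               return fileName.substring(i + 1);
           } else {
               throw new RuntimeException("unable to determine ext of " + f);
           }
       }
   }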
       

Registering Engines

One of the areas where JSR-223 is far more advanced than BSF is in engine discovery. Unlike BSF, JSR-223 uses the Java Service Provider API to discover scripting engines that are present on the classpath. This alleviates the need to explicitly register engines that are not provided along with the framework (as you may need to do in BSF). Conceptually this is very similar to BSF's technique of reading multiple Languages.properties that was discussed above, but it searches the classpath for files named META-INF/services/javax.script.ScriptEngineFactory.

Not only is this a more elegant way for engine authors to plug into the common architecture, but JSR-223 also exposes this information as part of its API. In addition to looking up engines by name and by extension, you can also iterate through all of the factories for each script engine, and obtain a richer set of information about them. For example, this simple Groovy script:

   import javax.script.*

   for (info : (new ScriptEngineManager()).getEngineFactories())
       System.err.println("${info.languageName} is implemented by ${info.engineName}")
       

... produces the following output:

   ECMA Script is implemented by Mozilla Rhino
   Groovy is implemented by Groovy Script Engine
   PHP is implemented by PHP
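
For readers who prefer to stay in Java, roughly the same report can be produced with the code below. It is a sketch using the javax.script API as shipped in the JDK; the EngineLister class name is made up, and the output is intentionally not shown because it depends on which engines happen to be on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

// Illustrative helper; not part of javax.script.
final class EngineLister {
    // Returns one line per discovered engine factory.
    static List<String> describeEngines() {
        List<String> lines = new ArrayList<>();
        for (ScriptEngineFactory factory : new ScriptEngineManager().getEngineFactories()) {
            lines.add(factory.getLanguageName()
                + " is implemented by " + factory.getEngineName()
                + " (extensions: " + factory.getExtensions() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        describeEngines().forEach(System.out::println);
    }
}
```

On a runtime with no engines installed the list is simply empty, which is itself a quick way to check whether discovery is working.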
       

Registering and Declaring Beans

       engine.put("name", object);
Talk about Namespace and ScriptContext.
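
As a placeholder illustration of scoping, here is a sketch based on the javax.script API as it ships with the JDK, where the map of names to objects is called Bindings; I am assuming that this corresponds to the Namespace mentioned above. The ScopeSketch class is a made-up name. The sketch shows that a ScriptContext searches the engine scope before the global scope when resolving a name.

```java
import javax.script.ScriptContext;
import javax.script.SimpleBindings;
import javax.script.SimpleScriptContext;

// Illustrative helper; not part of javax.script.
final class ScopeSketch {
    static ScriptContext newContext() {
        ScriptContext ctx = new SimpleScriptContext();
        // SimpleScriptContext starts without a global scope, so supply one.
        ctx.setBindings(new SimpleBindings(), ScriptContext.GLOBAL_SCOPE);
        return ctx;
    }

    public static void main(String[] args) {
        ScriptContext ctx = newContext();
        ctx.setAttribute("name", "shared default", ScriptContext.GLOBAL_SCOPE);
        ctx.setAttribute("name", "engine override", ScriptContext.ENGINE_SCOPE);

        // The engine scope is consulted first.
        System.out.println(ctx.getAttribute("name"));

        // Once the engine-level value is removed, the global one shows through.
        ctx.removeAttribute("name", ScriptContext.ENGINE_SCOPE);
        System.out.println(ctx.getAttribute("name"));
    }
}
```

Calling engine.put("name", object) is specified to have the same effect as putting the value into the engine's ENGINE_SCOPE bindings, so it corresponds to the second setAttribute call here.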

Return Values

I've written a version of the BSF Calculator that uses JSR-223 instead.

Insert Figure 3. Screenshot of JSR-223 Calculator

Interacting with Complex Objects

ScriptEngines which implement the javax.script.Invocable interface can be used to call methods on returned objects. They can also be used to coerce returned objects to a specific interface. This allows scripting engines to return proxies, as we did for BSF in Example 6.
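
A defensive way to use this optional interface might look like the following sketch. It relies on Invocable.getInterface() and Invocable.invokeMethod() from the javax.script API as shipped with the JDK; the InvocableSketch class and its two helpers are hypothetical names, and turning the checked exceptions into unchecked ones is only one possible choice.

```java
import java.util.Optional;
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptException;

// Illustrative helper; not part of javax.script.
final class InvocableSketch {
    // Asks the engine to present a script object as a Java interface.
    // Empty when the engine is not Invocable or cannot do the coercion.
    static <T> Optional<T> asInterface(ScriptEngine engine, Object scriptObject, Class<T> type) {
        if (!(engine instanceof Invocable) || scriptObject == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(((Invocable) engine).getInterface(scriptObject, type));
    }

    // Calls a named method on an object previously returned by the engine.
    static Object callMethod(ScriptEngine engine, Object scriptObject, String name, Object... args) {
        if (!(engine instanceof Invocable)) {
            throw new UnsupportedOperationException("engine cannot invoke methods");
        }
        try {
            return ((Invocable) engine).invokeMethod(scriptObject, name, args);
        } catch (ScriptException | NoSuchMethodException e) {
            throw new IllegalStateException("call to " + name + " failed", e);
        }
    }
}
```

The instanceof check matters because Invocable is optional: an engine is free not to implement it.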

Handling Exceptions

Many methods in JSR-223 are declared to throw a javax.script.ScriptException if they encounter an unexpected error. This exception can contain line and column number information, and a file name.
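
The location information can be read back through the accessors on ScriptException. The sketch below formats it as file:line:column; ScriptErrorSketch is a hypothetical helper name, and the code relies on the documented convention that the line and column are -1 and the file name is null when they are not available.

```java
import javax.script.ScriptException;

// Illustrative helper; not part of javax.script.
final class ScriptErrorSketch {
    // Formats the location carried by a ScriptException as file:line:column,
    // leaving out whichever parts the engine did not supply.
    static String describe(ScriptException ex) {
        StringBuilder sb = new StringBuilder();
        sb.append(ex.getFileName() != null ? ex.getFileName() : "<unknown source>");
        if (ex.getLineNumber() >= 0) {
            sb.append(':').append(ex.getLineNumber());
        }
        if (ex.getColumnNumber() >= 0) {
            sb.append(':').append(ex.getColumnNumber());
        }
        return sb.toString();
    }
}
```

For instance, describe(new ScriptException("bad token", "foo.groovy", 12, 4)) yields foo.groovy:12:4.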

Generating Compiled Code

Like BSF, JSR-223 also has support for compiled languages, which we will explore in Chapter 11.

Other Integration Frameworks

In Chapter 8, we will discuss Inversion of Control (IoC) containers like Spring and Jelly. These containers are generally used to hook up independent Java components that should not depend on one another, but they can also be very useful for integrating components written in other languages, without introducing a direct dependency between any two languages. Coupled with BSF and JSR-223's support for creating proxy objects that implement a Java interface, this can be an extremely flexible and unintrusive way to add support for components written in another programming language to an existing Java project.

There are also integration frameworks in languages other than Java. The best example of this is Perl's Inline module, which allows you to write code in other languages right alongside your Perl code. Inline supports Java as one of these languages, so it is possible to execute Java code from inside a Perl script. This is exactly the opposite of the technique used by BSFPerl, which will be discussed in more detail in Chapter 7.

Continue on to Chapter 5: Interpreters in Java.