Dynamic Languages and Java/Programming Language Safari
From JVMLanguages
Introduction
Safaris are no longer about hunting, but in a lot of ways they never were. Going on a safari has always been primarily about being closer to nature, and becoming a part of another environment for a short time. They allow you to see wild animals in their natural habitat, and to interact with them in a way that would otherwise be impossible. You're not necessarily going to learn anything that will translate directly to your everyday life, but there's a good chance that you'll gain a better appreciation for the world that you see every day. Nothing builds respect for your neighbor's cat better than watching a pack of lions stalk a gazelle.
In this chapter, we'll be going on safari. I'll be trying to show you as many programming languages as possible, with a focus on the individual characteristics that make each language unique. I'll also try to demonstrate each of these languages in its "natural habitat"; that is, I'll use each one to solve the types of problems that it was designed to solve, and in very much the same way that it is commonly used. You're not necessarily going to go home and start writing code in some obscure language, but, as with the pack of lions, hopefully you'll begin to see some ways that these ideas and techniques could be useful in your own programming environment.
The one factor that all of the best and most creative programmers I know have in common is that they have all extensively studied several programming languages. When discussing new design patterns or coding styles, they can always make an analogy to an existing language where that pattern is commonplace, or even built into the language itself. Why is this beneficial?
In much the same way that knowledge of object-oriented Design Patterns provides a common set of terms and architectural structures that allow developers to communicate efficiently and easily understand each other's code, understanding the features of several programming languages allows developers not only to learn new languages more efficiently, but also to appreciate the limitations of their more familiar languages.
There are many books available that can teach you more than one programming language, but most of them focus on the similarities rather than the differences. Like a first-year German textbook, they'll teach you a few words you can use to get around, but don't really get at the nuances that make each language special. Instead of simply teaching you a few basic concepts of each language, this chapter will try to give you a taste of what it's like to be fluent in that language. It should be more like living in Hamburg for a week, rather than taking German lessons for a year.
Language Type
The first thing to focus on when you're on safari is the types of animals that you'll be seeing -- in other words, the taxonomy of life. We need to do something similar; we need to outline the taxonomy of programming languages. This shows us how closely the languages are related to one another.
Imperative
Imperative languages emphasize how to answer a question or accomplish a task. A program written in an imperative language would tend to list each step that needs to be accomplished, the order those steps must follow, and the algorithm to use for each step.
There are two main types of imperative languages, and they are probably the ones most familiar to you. Procedural languages, like C, and object-oriented languages, like C++, Java, and Ruby, are all imperative. Other languages straddle the line between procedural and object-oriented, like Perl and Python, but still fit comfortably under the umbrella of imperative languages.
Declarative
Unlike imperative languages, declarative languages simply describe a problem or state a goal and leave it up to the implementation of the language (e.g., the compiler or the interpreter) to decide the specific set of steps that should be taken.
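To make the contrast concrete, here is a minimal sketch in Java (assuming Java 8 or later; the class and method names are hypothetical and chosen only for illustration). The first method spells out every step in the imperative style. The second uses the java.util.stream API to state which result is wanted and leaves the iteration strategy to the library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; not part of any library.
class SumOfEvenSquares {
    // Imperative: list every step and the order in which it happens.
    static int imperative(List<Integer> numbers) {
        int total = 0;
        for (int n : numbers) {
            if (n % 2 == 0) {
                total += n * n;
            }
        }
        return total;
    }

    // Declarative: describe the result; the stream library decides how to iterate.
    static int declarative(List<Integer> numbers) {
        return numbers.stream()
                      .filter(n -> n % 2 == 0)
                      .mapToInt(n -> n * n)
                      .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(imperative(numbers));  // 4 + 16 + 36 = 56
        System.out.println(declarative(numbers)); // 56
    }
}
```

Both methods compute the same value; only the second leaves the "how" to the implementation.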
Functional languages like LISP, Scheme, ML, and Haskell often include aspects of both types of languages, but tend to be more declarative than imperative. A specific type of functional language, called an applicative language, is purely declarative and has no imperative constructs at all (e.g., no loops, blocks, or mutable variables).
Logical (e.g., rule-based) languages are also declarative. There are a number of languages integrated into rule-engines that we will discuss in Chapter 9. XSLT also falls into this category, as it does not present a specific algorithm to use, is lacking in many imperative constructs, and provides a set of rules that can match various segments of an XML tree.
Expression (JEL, XPath, JSTL-EL)
Domain-Specific Languages
Language Scope
General Purpose
This is the type of language that you are probably most familiar with. These languages are always Turing-complete, which means that they can compute anything that any other programming language can compute. General purpose languages differ in terms of their feature sets, but their design is typically made up of various compromises and hedges. They are designed with implementation factors like performance and long-term maintenance in mind.
Scripting/Extension Languages
Although they are typically Turing-complete as well, I think it's important to make a distinction between general purpose languages and extension languages, or scripting languages as they are usually known. These languages tend to be less concerned about performance, security, and constraining their developers and more concerned with rapid development and making their developers' jobs easier (at least in the short term). These languages are often interpreted, though this is by no means a requirement.
Extension languages tend towards being declarative, so that the algorithm being applied is clearly visible in the source code and is not obscured by implementation details.
There are certain roles where extension languages are especially useful. One particular role where these languages excel is that of a "glue language", meaning that they can cross language boundaries more easily and integrate diverse components (whether they be implemented in other languages, compiled as shared libraries, or used as external executables). Extension languages are also useful for embedding inside of a larger application so that programs can be written to control the application.
Domain-Specific
Domain-specific languages are sometimes Turing-complete, but more often they are not. These languages are usually expected to serve only one purpose and therefore their syntax, features, and performance characteristics are influenced directly by this purpose. These languages are usually very declarative, so much so that the intent of the code should not be buried within either implementation details or even the particular algorithm that will be used. These languages are designed to be written by non-programmers, who are often experts in the language's domain.
Turing Tarpit
Some languages exist for no other reason than to simply prove that they can. These languages are usually called "Turing tarpits" (why?). Although there is no reason to use such languages in a production environment, they can serve as useful tools for exploring what computers can and cannot do.
Language Features
We also need to focus on particular features that make each of these languages unique. These are typically more fine-grained details, and discussing them helps to differentiate otherwise-similar languages.
Primitive Types
Some languages have separate integer and floating-point types. Some languages have a wide range of numeric types to fit several different levels of precision.
Some languages have strings as a data type. Others simply use character arrays.
Some languages support data structures like maps or sets as a first-order data type. Others have no language support for these structures.
Argument Passing Styles
Overloading
Argument Naming Schemes
Most languages simply allow a sequence of arguments to be passed to a method. In some cases, the number and types of arguments make up the method's signature -- in others, all methods are simply passed an array of values and are free to deal with them in any way they wish, as in Perl.
Even languages in which the number of arguments is part of a method's signature often provide some syntactic sugar to allow a variable number of homogeneous arguments to be passed. For example, Java 1.5 has the new void someMethod (int... args) {} syntax -- which is simply short for void someMethod (int[] args), although clients can call it without explicitly constructing an array (someMethod(2, 3, 5, 7, 11)). C and C++ UNIX code often uses the stdarg.h header file to gain support for variable arguments via a series of preprocessor macros.
In other languages, the caller of the method must refer to the argument names as well. Sometimes this is only done for constructors, as in two Java derivative languages, Groovy and Nice. This allows clients to omit specific arguments, or to pass the arguments in an arbitrary order. An exception to this is Objective-C, where all method calls can optionally refer to each argument by name, but the arguments must always be present and in the correct order.
Pass-By-Reference vs. Pass-By-Value
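As a small runnable sketch of the Java syntax just described (the class and method names are hypothetical), the method below accepts any number of int arguments and sees them as an ordinary array:

```java
// Hypothetical example class illustrating Java's variable-argument syntax.
class VarargsDemo {
    // Inside the method, args is an ordinary int[].
    static int sum(int... args) {
        int total = 0;
        for (int value : args) {
            total += value;
        }
        return total;
    }

    public static void main(String[] unused) {
        System.out.println(sum(2, 3, 5, 7, 11));             // compiler builds the array: 28
        System.out.println(sum(new int[] {2, 3, 5, 7, 11})); // an explicit array is also accepted: 28
        System.out.println(sum());                           // zero arguments: 0
    }
}
```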
Subtyping
Single vs. Multiple Inheritance
Interfaces
Mixins
Object- vs. Class-based Inheritance
Most object-oriented languages that you hear about today use what is called "class-based inheritance." This means that it is classes that define behavior, and classes that inherit behavior from other classes. However, some languages instead use "object-based inheritance." In these languages, objects can declare behavior, and inherit behavior directly from other objects. Classes are not present as a language construct, but there are often objects specifically designated as "generic" or "prototype" objects which define behavior but are not used themselves.
Some languages do not support inheritance at all, but simply clone these prototype objects. After they are cloned, changing the parent object in some way will not affect the child.
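Java itself is class-based, so the following is only a rough simulation of the two object-based styles, using a hypothetical ProtoObject class whose "slots" are stored in a map. An object made with extend() keeps a live link to its parent and sees later changes to it; an object made with copy() is a clone that does not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of object-based inheritance; not a real library class.
class ProtoObject {
    private final Map<String, Object> slots = new HashMap<>();
    private final ProtoObject parent; // null for a root object

    ProtoObject(ProtoObject parent) {
        this.parent = parent;
    }

    void set(String name, Object value) {
        slots.put(name, value);
    }

    // Look in this object first, then delegate up the parent chain.
    Object get(String name) {
        if (slots.containsKey(name)) {
            return slots.get(name);
        }
        return parent == null ? null : parent.get(name);
    }

    // Inheritance by delegation: the child keeps a live link to this object.
    ProtoObject extend() {
        return new ProtoObject(this);
    }

    // Cloning: the copy gets its own slots and no link back to the original.
    ProtoObject copy() {
        ProtoObject result = new ProtoObject(parent);
        result.slots.putAll(slots);
        return result;
    }

    public static void main(String[] args) {
        ProtoObject animal = new ProtoObject(null);
        animal.set("sound", "growl");
        ProtoObject delegating = animal.extend();
        ProtoObject cloned = animal.copy();
        animal.set("sound", "roar");                 // change the prototype afterwards
        System.out.println(delegating.get("sound")); // roar  (inherits the change)
        System.out.println(cloned.get("sound"));     // growl (unaffected)
    }
}
```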
Which one is Self? Mention LambdaMOO?
Scoping
One of the key ways that two programming languages can differ is in the way that they bind, or resolve, variable references to a specific variable declaration.
There are two main types of scoping, lexical and dynamic:
- Lexical Scoping
- Lexical scoping -- or static scoping as it is sometimes called -- means that variable references are bound to a declaration in one of the enclosing blocks of code. This means that the list of variables available to any block of code can be determined from the syntactic structure of the code itself before the program even executes.
- Dynamic Scoping
- Dynamic scoping means that variables are bound to the declaration that was encountered most recently in terms of the program's execution. With dynamic scoping, you cannot tell ahead of time whether all variable references will be resolved at run-time, or to which declarations they will bind.
Lexical scoping tends to be much more popular among statically compiled languages because it simplifies the creation of a compiler.
Lexical scoping also tends to lead to more understandable code. Dynamic scoping is more frequently associated with scripting languages, because it often simplifies the creation of an interpreter and also makes the code a bit less verbose.
Some languages, like Perl, support both lexical and dynamic scoping.
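The difference can be sketched in Java, with one caveat: Java is lexically scoped only, so the dynamic half below is a simulation that keeps an explicit stack of bindings for a single variable called "x". All names here are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical sketch: dynamic scoping is simulated with an explicit binding stack.
class ScopingDemo {
    private static final Deque<Integer> dynamicX = new ArrayDeque<>();

    // Resolves "x" dynamically: whichever binding was pushed most recently wins.
    static int readDynamicX() {
        return dynamicX.peek();
    }

    // Runs body with a new dynamic binding for "x", then restores the old one.
    static int withDynamicX(int value, Supplier<Integer> body) {
        dynamicX.push(value);
        try {
            return body.get();
        } finally {
            dynamicX.pop();
        }
    }

    // Lexical: the lambda is bound to the x declared in this enclosing method.
    static Supplier<Integer> makeLexicalReader() {
        int x = 1;
        return () -> x;
    }

    public static void main(String[] args) {
        Supplier<Integer> lexical = makeLexicalReader();
        int x = 2; // a different x; the lambda above never sees it
        System.out.println(lexical.get() + " " + x); // 1 2

        int seen = withDynamicX(1, () ->
            // The same reader sees 1 here, but 2 inside the nested binding.
            readDynamicX() * 10 + withDynamicX(2, ScopingDemo::readDynamicX));
        System.out.println(seen); // 12
    }
}
```

The lexical reader's answer is fixed by where it was written; the dynamic reader's answer depends on who called it and when.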
Functions as Data
Many languages support a way to refer to a piece of code as a piece of data. C provides function pointers, which can be passed around in the same way you would use data pointers, but at a later time these pointers can be invoked.
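Java has no function pointers, but since Java 8 a method reference plays a similar role. The sketch below (hypothetical class and method names) stores a reference to a method in a variable, passes it to another method, and invokes it later:

```java
import java.util.function.IntBinaryOperator;

// Hypothetical example: passing behavior around as a value in Java.
class FunctionsAsData {
    static int add(int a, int b) { return a + b; }
    static int multiply(int a, int b) { return a * b; }

    // Receives a function as data and invokes it later, once per element.
    static int fold(int[] values, int initial, IntBinaryOperator op) {
        int result = initial;
        for (int v : values) {
            result = op.applyAsInt(result, v);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4};
        IntBinaryOperator op = FunctionsAsData::add; // stored like any other value
        System.out.println(fold(values, 0, op));                        // 10
        System.out.println(fold(values, 1, FunctionsAsData::multiply)); // 24
    }
}
```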
Blocks
Blocks are arbitrary pieces of code that can be stored in a variable or passed into a function. They do not necessarily have to be functions themselves -- you can think of them as anonymous functions if you'd like. They may or may not be able to take arguments.
Closures
Closures are a special type of block, which inherits any lexically scoped variables from its parent's frame. This means that inside of the closure you can refer to the same variables you would be able to reference if you were not creating a closure, and the references to those variables will translate correctly even if the closure is passed into a new function.
sub foo {
    my ($self) = shift;
    my @array = $self->get_array;
    # The anonymous sub captures @array, which stays alive after foo returns.
    return sub {
        my ($i) = shift;
        return $array[$i];
    };
}
# Call foo, then invoke the closure it returned with an index of 10.
$obj->foo->(10);
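For comparison, here is a rough Java counterpart of the Perl example above, assuming Java 8 or later. The class name and constructor are hypothetical; only the shape of foo() mirrors the original.

```java
import java.util.function.IntFunction;

// Hypothetical Java counterpart of the Perl closure; names are illustrative only.
class ClosureDemo {
    private final String[] items;

    ClosureDemo(String... items) {
        this.items = items;
    }

    // Returns a closure over the local variable `array`.
    IntFunction<String> foo() {
        String[] array = items.clone(); // local to this call, like `my @array`
        return i -> array[i];           // still valid after foo() has returned
    }

    public static void main(String[] args) {
        ClosureDemo obj = new ClosureDemo("a", "b", "c");
        IntFunction<String> lookup = obj.foo();
        System.out.println(lookup.apply(2)); // c
    }
}
```

One difference from Perl is that a Java lambda may only capture local variables that are effectively final, so the closure cannot reassign `array` itself.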
Generics
Generics, or parameterized-types as they are more formally known, allow you to take a single "generic" class definition and apply it to more than one situation. You could, for example, define a "generic" data structure that contains one or more elements of an arbitrary type. Later, you could use this single data structure class to hold integers in one situation, and strings in another situation.
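public interface List<T>
{
    public T get (int index);
    public void add (T element);
    ...
}
// Type arguments must be reference types, so Integer is used rather than int.
List<Integer> intList = new ArrayList<Integer>();
intList.add(42); // the int is boxed to an Integer automatically
List<String> stringList = new ArrayList<String>();
stringList.add("test");
intList.add("42"); // compile error!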
Parameterized types can often be used on individual methods in addition to entire classes.
public class ListUtils
{
    <T> T randomElement (List<T> list)
    {
        ...
    }
}
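The body of randomElement is elided above. One possible way to write such a generic method is sketched below; the class name ListUtilsSketch, the extra Random parameter, and the empty-list check are assumptions made for this illustration rather than part of the original.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of a generic method, for illustration only.
class ListUtilsSketch {
    // T is inferred at each call site from the argument's element type.
    static <T> T randomElement(List<T> list, Random random) {
        if (list.isEmpty()) {
            throw new IllegalArgumentException("list must not be empty");
        }
        return list.get(random.nextInt(list.size()));
    }

    public static void main(String[] args) {
        Random random = new Random();
        // No casts are needed: the return type follows the list's element type.
        String word = randomElement(Arrays.asList("lion", "gazelle", "cat"), random);
        Integer number = randomElement(Arrays.asList(2, 3, 5, 7, 11), random);
        System.out.println(word + " " + number);
    }
}
```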
SML has a similar concept. Any type prefixed by a single quote (') character becomes a type variable. All instances of the same type variable must bind to the same type. Thus, the following function can take an int list and return an int, or it can take a string list and return a string, etc.
fun randomElement (xs : 'a list) : 'a = ...;
Java took this quite a bit further, and actually parameterized many of the core types of the language. For example, you can now declare a variable as follows:
Class<T> foo;
Here foo can only hold the Class object that represents the type T; a variable declared as Class<String>, for example, can only hold String.class. This is actually very useful for creating factories. Consider this example:
<T> T createObject (Class<T> cl)
    throws InstantiationException, IllegalAccessException
{
    // newInstance() needs an accessible no-argument constructor
    return cl.newInstance();
}
Foo a = createObject(Foo.class); // no casting needed!
Bar b = createObject(Bar.class); // no casting needed!
Foo c = createObject(Bar.class); // compile error
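A self-contained variant of this factory is sketched below. Foo and Bar are empty placeholder classes, and the sketch uses getDeclaredConstructor().newInstance(), since Class.newInstance() is deprecated in recent Java releases; treat it as one possible arrangement rather than the only one.

```java
// Hypothetical sketch of a class-token factory; Foo and Bar are placeholder classes.
class Foo { }
class Bar { }

class Factory {
    // Class<T> ties the token to the return type, so callers need no cast.
    static <T> T createObject(Class<T> cl) throws ReflectiveOperationException {
        // Requires an accessible no-argument constructor on cl.
        return cl.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Foo a = createObject(Foo.class);
        Bar b = createObject(Bar.class);
        // Foo c = createObject(Bar.class); // would not compile: Bar is not a Foo
        System.out.println(a.getClass().getSimpleName() + " "
                + b.getClass().getSimpleName()); // Foo Bar
    }
}
```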
Dynamic Dispatch
This is what C++ and Java programmers usually mean when they say "polymorphism." It is the ability to select the method to invoke based on the runtime type of the object rather than on its compile-time type. C++ supports both dynamically and statically dispatched methods and uses the virtual keyword to distinguish the two. In Java, all instance methods are dynamically dispatched unless they are private; static methods are resolved at compile time.
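A minimal Java sketch of the difference follows; Animal and Lion are hypothetical classes. The instance method speak() is chosen by the runtime type of the object, while the static method kind() is chosen by the compile-time type of the variable.

```java
// Hypothetical classes illustrating dynamic versus static dispatch in Java.
class Animal {
    String speak() { return "..."; }            // instance method: chosen at run time
    static String kind() { return "animal"; }   // static method: chosen at compile time
}

class Lion extends Animal {
    @Override
    String speak() { return "roar"; }           // overrides Animal.speak()
    static String kind() { return "lion"; }     // hides Animal.kind(); it does not override it
}

class DispatchDemo {
    public static void main(String[] args) {
        Animal animal = new Lion();             // compile-time type Animal, runtime type Lion
        System.out.println(animal.speak());     // roar
        System.out.println(animal.kind());      // animal
    }
}
```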
Double Dispatch
Double dispatch is the ability of a language to select a function by using the runtime types of its arguments. In object-oriented languages, this means that it is not only the class of the target object that determines which method is called, but also the class of one of the arguments. (The term multi-dispatch is usually used when speaking about all of the arguments rather than just one.)
The Visitor pattern is often used to simulate double-dispatch in languages that have no built-in support for it. However, this can be a hassle. With a visitor, both the class defining the method and each subclass of the type passed as an argument must participate in the pattern. The Visitor pattern also becomes overwhelmingly complex if you try to extend it to consider the types of three or more classes (multi-dispatch).
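A compact Java sketch of the pattern is shown below, using hypothetical Shape classes. Notice how every shape class has to carry an accept() method and the visitor interface has to name every shape, which is the participation cost mentioned above.

```java
// Hypothetical shapes example: the Visitor pattern simulating double dispatch.
interface ShapeVisitor<R> {
    R visitCircle(Circle circle);
    R visitSquare(Square square);
}

interface Shape {
    // First dispatch: the runtime type of the shape selects the accept() body.
    <R> R accept(ShapeVisitor<R> visitor);
}

class Circle implements Shape {
    public <R> R accept(ShapeVisitor<R> visitor) { return visitor.visitCircle(this); }
}

class Square implements Shape {
    public <R> R accept(ShapeVisitor<R> visitor) { return visitor.visitSquare(this); }
}

// Second dispatch: the runtime type of the visitor selects the visit method body.
class NameVisitor implements ShapeVisitor<String> {
    public String visitCircle(Circle circle) { return "circle"; }
    public String visitSquare(Square square) { return "square"; }
}

class VisitorDemo {
    public static void main(String[] args) {
        Shape shape = new Square(); // static type Shape, runtime type Square
        System.out.println(shape.accept(new NameVisitor())); // square
    }
}
```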
Another case where double dispatch is extremely useful is when doing operator overloading. Without this, it is very difficult to implement operators that behave the same when their two operands are of different types. If you've ever had two different classes in Java that you wanted to compare to each other (either by compareTo() or equals()), you've probably run into the problem that both classes need to check if their argument is of the opposite type, because it is very difficult to guarantee which object will be chosen to compare to the other. Double-dispatch would allow you to define both methods in one place.
Recursion
Recursion is simply the ability for a function to call itself, either directly or indirectly. Some languages do not allow recursion at all, either for safety reasons (to avoid infinite loops) or due to assumptions in their implementation. Other languages rely exclusively on recursion and do not provide any looping constructs (like for or while).
One particular form of recursion, tail-recursion, lends itself particularly well to optimization. Some languages support frame reuse, where a call from a function to itself at the very end of its life does not need to grow the stack.
Java does not currently optimize tail-recursion. This means that the following example:
public void loopForever () {
    doSomething();
    loopForever();
}
will very quickly throw a StackOverflowError.
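To show what frame reuse amounts to, the sketch below (hypothetical names) pairs a tail-recursive method with the loop that a tail-call-optimizing implementation could effectively turn it into. Because Java does not perform this rewrite, only the loop version is safe for very large inputs.

```java
// Hypothetical example: a tail-recursive method and its hand-written loop equivalent.
class TailCallDemo {
    // Tail-recursive: the recursive call is the last thing the method does.
    static long sumTo(long n, long accumulator) {
        if (n == 0) {
            return accumulator;
        }
        return sumTo(n - 1, accumulator + n);
    }

    // The same computation with the frame reused by hand: the parameters become
    // loop variables and the tail call becomes a jump back to the top.
    static long sumToLoop(long n, long accumulator) {
        while (n != 0) {
            accumulator += n;
            n -= 1;
        }
        return accumulator;
    }

    public static void main(String[] args) {
        System.out.println(sumTo(1000, 0));            // 500500
        System.out.println(sumToLoop(10_000_000L, 0)); // 50000005000000
    }
}
```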
Other languages, in particular functional languages, rely on recursion much more heavily because they have limited or no support for iteration statements like for and while. As a result, these languages tend to optimize recursion as much as possible. If you run the following code in Scheme:
...
it will run quite happily forever, without the stack growing beyond a single frame.
Continue on to Chapter 3: Language Integration Techniques.

