This article looks at Clojure, one of the new languages developed to run on the JVM. Clojure is a general-purpose, functional language that is dynamically and strongly typed. If you have come in contact with Lisp, you will notice that Clojure is a Lisp dialect that embraces the code-as-data philosophy; it is said to be homoiconic. Clojure's emphasis on side-effect-free functions and immutability, together with explicit support for concurrent programming, makes it an attractive language for scaling code out onto multiple cores.
I will not spend more time introducing Clojure and its potential here; instead I suggest you visit the Clojure website when you are ready to learn more. But now, let's start exploring Clojure and its Java interoperability features by looking at some code. This code is written for Clojure 1.0.
The Java method we are about to convert to Clojure takes a list of character strings as input and returns a new list of the unique strings, sorted first by frequency and then, for entries with the same frequency, alphabetically.
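Our JUnit test looks like this:
import java.util.Arrays;
import java.util.List;
import junit.framework.TestCase;

// Assumes the test can see Orderer and JavaOrderer, e.g. by living in package pnehm.
public class OrdererTestUsingJava extends TestCase {
    private void doTestOrderByFreq(Orderer orderer) {
        List unordered = Arrays.asList("b", "d", "b", "b", "a", "c", "a");
        List ordered = orderer.orderByFreq(unordered);
        // "b" occurs three times and "a" twice; "c" and "d" tie and are ordered alphabetically.
        assertEquals(Arrays.asList("b", "a", "c", "d"), ordered);
    }

    public void testJavaImpl() {
        doTestOrderByFreq(new JavaOrderer());
    }
}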
The Java implementation looks like this:
package pnehm;

import org.apache.commons.collections.Bag;
import org.apache.commons.collections.bag.HashBag;
import java.util.List;
import java.util.Comparator;
import java.util.Collections;
import java.util.ArrayList;

public class JavaOrderer implements Orderer {
    public List orderByFreq(List in) {
        // A Bag keeps track of how many times each distinct element occurs.
        final Bag bag = new HashBag(in);
        List<String> out = new ArrayList<String>(bag.uniqueSet());
        Collections.sort(out, new Comparator<String>() {
            public int compare(String a, String b) {
                // Higher frequency first; ties are broken alphabetically.
                int freq = Integer.valueOf(bag.getCount(b))
                        .compareTo(bag.getCount(a));
                return freq != 0 ? freq : a.compareTo(b);
            }
        });
        return out;
    }
}
JavaOrderer implements the following interface:
package pnehm;

import java.util.List;

public interface Orderer {
    List orderByFreq(List in);
}
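To make the Clojure implementation callable from Java, we let Clojure generate a Java class that implements the interface. This is declared with the :gen-class option of the ns macro:
(ns step1.pnehm.clojure-orderer
  (:gen-class
    :name step1.pnehm.ClojureOrderer
    :implements [pnehm.Orderer]))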
With the first script file ready, it's now time to implement the interface. By default the generated class will look for a Clojure function with the same name as the interface method, prefixed by '-'. The prefix can be changed by supplying the :prefix option to :gen-class.
Our first iteration will be a direct port of the Java version. In other words, we will use the same Java classes, but call them from Clojure. This may seem like a weird thing to do; if we want to write Java, why take on extra work by doing it from Clojure? Well, we have to start somewhere, we have the Java version and we know that it works, and it gives us a nice way of exploring how Java code can be called from Clojure.
So, our next step will be to import the Java classes that we need. java.lang is already imported for us, just as in Java, but if we want more stuff we have to explicitly specify it. Let's add the imports we need to our ns macro call:
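(ns step1.pnehm.clojure-orderer
  (:gen-class
    :name step1.pnehm.ClojureOrderer
    :implements [pnehm.Orderer])
  (:import [org.apache.commons.collections Bag]
           [org.apache.commons.collections.bag HashBag]
           [java.util List]
           [java.util Comparator]
           [java.util Collections]
           [java.util ArrayList]))
Next we define the function that implements the interface method. The generated class passes the object itself (this) as the first argument; we don't need it, so we name it _:
(defn -orderByFreq [_ arg]
  ;; function's body goes here
  )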
With the function definition taken care of, all that remains is to write the function's body, and we're done. This is what the body looks like:
(defn -orderByFreq [_ arg]
  (let [bag  (HashBag. arg)
        out  (ArrayList. (.uniqueSet bag))
        cmpr (proxy [Comparator] []
               (compare [a b]
                 (let [freq (.compareTo (.getCount bag b) (.getCount bag a))]
                   (if-not (zero? freq) freq (.compareTo a b)))))]
    (do
      (Collections/sort out, cmpr)
      out)))

First we use the let macro to bind a new HashBag instance to the name bag, and an ArrayList of the unique character strings to the name out. This corresponds to the first two lines of our Java implementation. (HashBag. arg) is Clojure for new HashBag(arg), and (ArrayList. (.uniqueSet bag)) calls the method uniqueSet() on bag and passes the result to the ArrayList constructor.
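The interop forms described above can be tried at the REPL with plain JDK classes. The following is a minimal sketch that uses java.util.ArrayList instead of HashBag, so that it runs without Commons Collections on the classpath; the name `letters` is only illustrative.

```clojure
(import '(java.util ArrayList))

;; (ArrayList. coll) is Clojure for new ArrayList(coll).
(def letters (ArrayList. ["b" "d" "b"]))

;; (.method object args...) calls an instance method,
;; here letters.size() and letters.get(0).
(.size letters)   ; => 3
(.get letters 0)  ; => "b"

;; Calls nest just as in Java: new ArrayList(letters).size()
(.size (ArrayList. letters)) ; => 3
```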
The third let binding is cmpr, our java.util.Comparator implementation, created here with Clojure's proxy macro. proxy is commonly used in Clojure to create, on the fly, an object that implements one or more Java interfaces; here it implements java.util.Comparator. (compare [a b] ...) is our Clojure implementation of the Comparator's compare method.
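To see proxy in isolation, here is a small sketch that builds a Comparator ordering strings by length and hands it to Collections.sort(). The names `by-length` and `words` are made up for this example and are not part of the article's code.

```clojure
(import '(java.util ArrayList Collections Comparator))

;; proxy takes a vector of interfaces (or a class) to implement, a vector
;; of constructor arguments (empty for an interface), and method bodies.
(def by-length
  (proxy [Comparator] []
    (compare [a b]
      ;; negative, zero or positive, as Comparator.compare() requires
      (- (count a) (count b)))))

(def words (ArrayList. ["ccc" "a" "bb"]))

;; The proxy object can be passed wherever Java expects a Comparator.
(Collections/sort words by-length)

(vec words) ; => ["a" "bb" "ccc"]
```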
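if is a Clojure special form for branching. If the test evaluates to logical true (anything other than false or nil), the first expression is evaluated and returned; otherwise the second expression is evaluated and returned. Our comparator uses if-not, a macro that works the same way with the test inverted: when freq is not zero it is returned, otherwise the strings are compared alphabetically.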
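A short REPL sketch of the two forms follows. The helper `first-non-zero` is hypothetical; it only mirrors the tie-breaking logic of the comparator.

```clojure
;; if: the first branch is chosen when the test is logically true.
(if (zero? 0) "equal" "different")      ; => "equal"

;; if-not: the first branch is chosen when the test is logically false.
(if-not (zero? 3) "different" "equal")  ; => "different"

;; Use the primary result unless it is zero, otherwise the fallback.
(defn first-non-zero [primary fallback]
  (if-not (zero? primary) primary fallback))

(first-non-zero -1 5) ; => -1
(first-non-zero 0 5)  ; => 5
```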
So, with our ArrayList set up as out and the Comparator in place as cmpr, we can use the java.util.Collections.sort() method to sort our list according to cmpr. The Clojure special form do evaluates the expressions in order and returns the result of the last one. We need this sequencing because Collections.sort() sorts the list in place and doesn't return the sorted collection. The body of let would evaluate several expressions in order even without the explicit do, which is spelled out here for clarity.
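The in-place behaviour of Collections.sort() is easy to check at the REPL. This is a minimal sketch using natural ordering instead of our comparator; `letters` and `sorted-copy` are illustrative names only.

```clojure
(import '(java.util ArrayList Collections))

(def letters (ArrayList. ["d" "b" "a" "c"]))

;; Collections.sort() is void, so the call itself evaluates to nil...
(Collections/sort letters) ; => nil

;; ...but the list has been sorted in place.
(vec letters) ; => ["a" "b" "c" "d"]

;; do runs the side effect first and then returns the last expression.
(defn sorted-copy [coll]
  (let [out (ArrayList. coll)]
    (do
      (Collections/sort out)
      out)))

(vec (sorted-copy ["b" "c" "a"])) ; => ["a" "b" "c"]
```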
And with that, we have a working implementation. The solution, however, doesn't feel very Clojure-ish. Let's continue and see if we can approach some more idiomatic Clojure.
First let us try to get rid of the HashBag and ArrayList. Functional languages like Clojure typically have very strong support for manipulating lists. In our alternative implementation we create a map with the character strings as keys and their frequencies as values. We can then sort it with a slightly modified version of our comparator. We start by creating a function called count-words:
(defn count-words [coll]
  (reduce #(merge-with + %1 {%2 1}) {} coll))
A simple example with reduce, the function +, and a vector could be: (reduce + 0 [1 2 3 4 5]). The result of this would be 15. In this case the init value 0 doesn't really contribute anything, so we can just skip it, using a different version of reduce to get the same result: (reduce + [1 2 3 4 5]).
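The two variants of reduce mentioned above can be compared directly at the REPL. The last expression is an extra illustration, not taken from the article, of a case where the init value does matter.

```clojure
;; With an init value: ((((0 + 1) + 2) + 3) + 4) + 5
(reduce + 0 [1 2 3 4 5]) ; => 15

;; Without one, the first element serves as the starting value.
(reduce + [1 2 3 4 5])   ; => 15

;; The init value matters when the result has a different type than
;; the elements, for example when building a map from strings.
(reduce (fn [acc s] (assoc acc s (count s))) {} ["a" "bb" "ccc"])
;; => {"a" 1, "bb" 2, "ccc" 3}
```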
Our supplied init value in the code fragment above is {}, an empty map. The function, #(merge-with + %1 {%2 1}), is an anonymous function using merge-with. merge-with merges any number of maps into one map, from left to right. In the case of a key collision, the supplied function is used to calculate the new value for that key from the two colliding values. In our anonymous function %1 refers to the first parameter and %2 to the second, as supplied by reduce.
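Calling merge-with by hand shows both the plain merge and the collision case. In this sketch `add-word` and `add-word-long` are illustrative names for the anonymous function and its long-hand equivalent. The printed order of the map entries may differ from the comments.

```clojure
;; No collision: the maps are simply combined.
(merge-with + {"b" 1} {"d" 1})        ; => {"b" 1, "d" 1}

;; Collision on "b": the two values are combined with +.
(merge-with + {"b" 1, "d" 1} {"b" 1}) ; => {"b" 2, "d" 1}

;; #(...) with %1 and %2 is shorthand for a two-argument fn.
(def add-word #(merge-with + %1 {%2 1}))
(def add-word-long (fn [counts word] (merge-with + counts {word 1})))

(add-word {"b" 2, "d" 1} "a")      ; => {"b" 2, "d" 1, "a" 1}
(add-word-long {"b" 2, "d" 1} "b") ; => {"b" 3, "d" 1}
```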
Let us see what happens when the count-words function is called on our test list: "b", "d", "b", "b", "a", "c", "a".
The very first call to merge-with will get the empty map and the first item in our collection, "b", as parameters from reduce. The empty map is bound to %1 and "b" is bound to %2. The expression {%2 1} evaluates to a new map with "b" as key and 1 as value. merge-with will merge these two into a new map: {"b" 1}. The second time merge-with is called, it will be with our new map, {"b" 1}, and the next item from our collection, "d", wrapped as {"d" 1}, and it will create a new map: {"b" 1, "d" 1}. The third time, {"b" 1, "d" 1} will be merged with {"b" 1}. This time a key collision occurs, since we already have a key "b", so the function supplied to merge-with will be used; in our example that function is +. 1 + 1 is 2 even in Clojure, so what we get is the following map: {"b" 2, "d" 1}. This will go on until all items are processed and we have a complete map with character strings mapped to frequencies.
With a slightly modified Comparator, and making sure we only return the keys from the sorted entries, we're good to go. We have to replace Collections/sort with the Clojure equivalent sort, since it is able to sort our map:
(defn count-words [coll]
  (reduce #(merge-with + %1 {%2 1}) {} coll))

(def cmpr
  (proxy [Comparator] []
    (compare [a b]
      (let [freq (.compareTo (.val b) (.val a))]
        (if-not (zero? freq) freq (.compareTo (.key a) (.key b)))))))

(defn -orderByFreq [_ arg]
  (let [out (count-words arg)]
    (if (empty? out) () (keys (sort cmpr out)))))
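Clojure functions already implement java.util.Comparator, so the proxy can be replaced by an ordinary function that destructures each map entry into its key and value:
(defn cmpr [[val1 freq1] [val2 freq2]]
  (let [freq (compare freq2 freq1)]
    (if-not (zero? freq) freq (compare val1 val2))))

(defn -orderByFreq [_ coll]
  (if (empty? coll) () (keys (sort cmpr (count-words coll)))))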
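The two ingredients of this version, destructuring of map entries and a plain function used as comparator, can be explored separately. The following sketch uses a simplified comparator, `by-value-desc` (an illustrative name), that looks only at the values.

```clojure
;; sort sees a map as a sequence of [key value] entries, so each
;; argument can be destructured like a two-element vector.
(defn by-value-desc [[k1 v1] [k2 v2]]
  (compare v2 v1))

(sort by-value-desc {"a" 2, "b" 3, "c" 1})
;; => (["b" 3] ["a" 2] ["c" 1])

;; keys also works on the sorted sequence of entries.
(keys (sort by-value-desc {"a" 2, "b" 3, "c" 1}))
;; => ("b" "a" "c")

;; A Clojure function can be used wherever a Comparator is expected.
(instance? java.util.Comparator by-value-desc) ; => true
```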
There is a unit-testing framework in the Clojure user contrib libraries called test-is. It is not part of the Clojure 1.0 core and must be downloaded and built separately from clojure-contrib. In the next release of Clojure the testing framework will move into core, where it can be found in the clojure.test namespace.
Re-implementing our test with test-is looks like this:
(ns step3.pnehm.clojure-orderer-test
  (:use step3.pnehm.clojure-orderer
        clojure.contrib.test-is))

(deftest test-order-by-freq-1
  ;; :a is a dummy value for the unused "this" argument
  (is (= ["b","a","c","d"]
         (-orderByFreq :a ["b", "d", "b", "b", "a", "c", "a"]))))
The Java implementation can be tested in the same way:
(ns step5.pnehm.java-orderer-test
  (:use clojure.contrib.test-is)
  (:import [pnehm JavaOrderer]))

(deftest test-order-by-freq-1
  (is (= ["b","a","c","d"]
         (.orderByFreq (JavaOrderer.) ["b", "d", "b", "b", "a", "c", "a"]))))
Without the gen-class declaration and the '-' prefix, the pure Clojure version looks like this:
(ns step4.pnehm.clojure-orderer)

(defn count-words [coll]
  (reduce #(merge-with + %1 {%2 1}) {} coll))

(defn cmpr [[val1 freq1] [val2 freq2]]
  (let [freq (compare freq2 freq1)]
    (if-not (zero? freq) freq (compare val1 val2))))

(defn order-by-freq [coll]
  (keys (sort cmpr (count-words coll))))
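A test for this last version under clojure.test might look like the sketch below. It assumes a Clojure release that ships clojure.test (which Clojure 1.0 does not), repeats the three functions so that it runs on its own, and uses a test name chosen only for illustration.

```clojure
(use 'clojure.test)

;; The functions under test, repeated from the listing above.
(defn count-words [coll]
  (reduce #(merge-with + %1 {%2 1}) {} coll))

(defn cmpr [[val1 freq1] [val2 freq2]]
  (let [freq (compare freq2 freq1)]
    (if-not (zero? freq) freq (compare val1 val2))))

(defn order-by-freq [coll]
  (keys (sort cmpr (count-words coll))))

;; "b" occurs three times and "a" twice; "c" and "d" tie and are
;; therefore ordered alphabetically.
(deftest test-order-by-freq
  (is (= ["b" "a" "c" "d"]
         (order-by-freq ["b" "d" "b" "b" "a" "c" "a"]))))

;; Runs all tests in the current namespace and prints a summary.
(run-tests)
```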
Rerunning the micro benchmark from Peter Backlund's previous article shows that the Clojure implementation gives the following numbers for sorting 100 characters with 10000 samples on my Core 2 Duo:
Java - 111 ms
Groovy - 436 ms
Clojure - 970 ms
So, the Java implementation is almost nine times as fast as the Clojure implementation, and four times as fast as the Groovy implementation. As always, take these kinds of measurements with a grain of salt.
By using functions from clojure-contrib, the performance of the Clojure implementation can be increased to match the Groovy implementation; see the follow-up blog entry.
All code used in this article can be found in this Google code project: http://code.google.com/p/pnehm-java-to-cool-language-x/
Here you will also find the Groovy implementation from the previous article.
Many thanks to the friendly members of the Clojure Google group.
Clojure development is an ongoing effort. If you are curious about what is in store for the future, have a look at http://clojure.org/todo and the Clojure space at Assembla.

Patrik Fredriksson is a consultant and mentor in software development at Citerus. Patrik is especially interested in design and architecture in Java, and in productivity in development projects. You can email Patrik at patrik dot fredriksson at citerus dot se.