Kurser i Domain-Driven Design - VĂ¥ren 2012




Tuesday, September 01, 2009

Clojure Crash Course For Java Developers

Updated, fixed bug in test-vec def

Looking at Clojure code coming from a Java background may be quite a daunting experience at first. So I have put together a short introduction that I hope can facilitate going from Java to Clojure.

Clojure is a functional programming language. As such we use functions to manipulate data, we avoid mutable state and side-effects. In a way it can be viewed as programming Java with only static methods and immutable objects. At first this may sound like a huge limitation, but in contrast to Java, functions in Clojure are are first-class citizens and Clojure fully support higher-order functions. That is, we can treat functions like any other piece of data, functions can be assigned to variables, functions can be passed as arguments to other functions and we can write functions that return functions. This can be very powerful and by keeping data immutable and strive for side-effect free behavior the idea is that we can write more robust programs that  more easily can scale to multiple cores.

Clojure is a homoiconic language, that is, data and program instructions are represented the same way. This may look weird at first, and could need some time getting used to. Clojure programs typically starts out as text (in a text file) and is converted by a reader into data structures used by the compiler. The reader reads the text form by form. A form is a chunk of data that can be translated into a Clojure data structure. Numbers, like 123, are forms, so are lists, (+ 1 2) and vectors, [1 2 3].

Program and data are expressed as s-expressions, symbolic expressions, in a parenthesized prefix-notation. I.e. the function comes first and the data follows. The Clojure expression for adding the numbers 1, 2, and 3 is: (+ 1 2 3) This is a list and as such it will be evaluated as a function call. The function is + and the data to apply the function on comes after that.

Expressions can be, and typically are nested: (+ 1 2 3 (- 2 3)) => 5

REPL
The Read Evaluate Print Loop is an excellent tool to use when experimenting with Clojure. It is part of the Clojure distribution, so download and set it up before we proceed, and you will be able to try out things live as we look closer at the language:

1) Download Clojure (examples below are for Clojure 1.0.0).
2) Extract the zip and run java -cp clojure-1.0.0.jar clojure.lang.Repl

You should see something like:

Clojure 1.0.0-

user=>

Where user indicates the current namespace.

For a somewhat more comfortable REPL experience, add the JLine ConsoleRunner:

java -cp jline-0.9.94.jar:clojure-1.0.0.jar jline.ConsoleRunner clojure.lang.Repl

Forms
We have already met the numeric form, which is a literal, the vector, and the list but there are other forms as well:

Symbols:
Symbols are used to name things such as functions, namespaces (a Clojure equivalent of Java packages) and Java classes. We have already seen one: the + operator, which is a function.

Literals:
Strings, numbers, characters, nil (Java null), booleans (true, false) and keywords. Keywords are like symbols but begins with a colon e.g. :color.

Map:
Much like Java Maps, Clojure maps are zero or more key/value pairs enclosed in braces: {:language "Clojure", :kind "Functional"}
(commas are treated like witespace, leave them out if you like)

Set:
Much like Java Set, Clojure sets are zero or more forms enclosed in braces with a leading #: #{:red :green :orange}

These forms are translated in to data structures by the reader and compiler. More on Clojure data structures, their characteristics, related functions and relation to Java can be found on the Clojure Data Structures page.

Special Forms
In addition to the forms we have mentioned so far we have what are called "special forms", these are forms evaluated in a special way to do things normal forms can not do. There are a few particularily important ones:

(def symbol init?)
Binds a symbol to a global var in the current namespace, this is used to define things we need access to, e.g. a function or a value. Try it out using the REPL:

user=> (def a 42)
> #'user/a

user=> a
> 42

(if test then else?)
Evaluate test, if not nil or false, evaluate then.

user=> (if (> 3 2) "TRUE" "FALSE")
> "TRUE"

(let [bindings* ] exprs*)
Create local bindings used when evaluating exprs*

user=> (let [a 2] (> a 3))
> false

(fn name? [params* ] exprs*)
Defines a function. If we don't need to refer to the function, we can make it anonymous by not supplying a name:

Create a function that adds two values, and immediately call it on 1 and 2:

user=> ((fn [a b] (+ a b)) 1 2)
> 3

More special forms: http://clojure.org/special_forms

Macros
Clojure macros make it possible to add new functionality to the language, for instance by combining primitive forms. As a Clojure beginner you can safely ignore Clojure's macro functionality until you feel more confident in the language. But it can be good to know that many constructs in Clojure are macros, but they look just like ordinary functions. One of the most commonly used ones are probably (defn name doc-string? attr-map? [params*] body), used to create and name functions globally:

user=> (defn add [a b] (+ a b))
> #'user/add

user=> (add 2 3)
> 5

Using the function macroexpand we can see that defn is infact a combination of def and fn:

user=> (macroexpand '(defn add [a b] (+ a b)))
> (def add (clojure.core/fn ([a b] (+ a b))))

Immutable data.
The set of data structures in Clojure is extensive. We met a few of the structures when we talked about forms above. An important of property of Clojure data structures is that they are immutable. For example, operations on a list returns a new list with the result of the operation. The original list is untouched. We'll see examples of this below.

Functions
There is quite an extensive set of functions in the core Clojure libraries. Mathematical functions are named after their common symbols, as you would expect:

user=> (+ 1 2)
> 3

user=> (* 2 5)
> 10

user=> (> 1 4)
> false

There are also many functions to work on collections:

user=> (def test-vec [1 2 3 4 5 6 7 8 9 10])
#'user/test-vec

conj adds an element to the collection:

user=> (conj test-vec 11)
> [1 2 3 4 5 6 7 8 9 10 11]

map applies a given function to all elements of a collection, returning a new collection with the result:

(user=> (map #(* % 10) test-vec)      
> (10 20 30 40 50 60 70 80 90 100)

The construct #(* % 10) is a reader macro for anonymous functions, the % is short for %1 and is bound to the first (and here, only) argument. We could also have written this using the fn form:

user=> (map (fn [a] (* a 10)) test-vec)
> (10 20 30 40 50 60 70 80 90 100)

filter will keep all elements for which the given function evaluates to true:
user=> (filter even? test-vec)
(2 4 6 8 10)

Again, all these functions return a new collection, our initial test-vec is untouched.

See the complete Clojure APIs for all core functions. In addition to Clojure core there is also the core clojure-contrib, a user maintained library that also acts as an incubator for Clojure core.

Java integration
Clojure runs on the JVM and is designed to make it easy to call Java code from Clojure, and Clojure code from Java. This is a big benefit as we can always drop down to Java if there is something we need that is missing from the Clojure libraries.

Calling Java is done using the . (dot) special form:

Accessing a static field:

user=> (. Math PI)
> 3.141592653589793

Calling a static method:

user=> (. Math min 3 4)
> 3

Creating a new instance is done using new:

user=> (def l (new java.util.ArrayList))
> #'user/l

Accessing an instance method, again using the . form:

user=> (. l size)
> 0

user=> (. l add "Cheese")
> true

user=> (. l get 0)
> "Cheese"

Note here that as we access the ArrayList instance it behaves as expected in Java. Clojure's immutability is not extended to Java objects, they keep their original behavior.

This is the basics of Java interoperability, but there is a lot of syntactic sugar available to make Java easier to work with from Clojure:

Direct static member access can be written with /:

user=> Math/PI
> 3.141592653589793

If it's a static method, make it a call:

user=> (Math/min 3 4)
> 3

Object creation can be done with Classname.:

user=> (def l (java.util.ArrayList.))
> #'user/l

If it feels annoying having to fully qualify ArrayList, we can import it (Math, as used above, is in java.lang and is imported by default):

user=> (import '(java.util ArrayList))
> java.util.ArrayList

user=> (def l (ArrayList.))
> #'user/l

The ' (quote) used above is another reader macro preventing the list to be evaluated as a function call.

Instance calls can be shoretened by placing the thing we want to call, the method, first, prefix Clojure style:

user=> (.add l "Cheese")
> true

Chaining Java calls in Clojure looks a bit messy by default:

user=> (. (. (. l get 0)  (subSequence 1 3)) length)
> 2

Using the .. notation we can do this:

user=> (.. l (get 0) (subSequence 1 3) length)
> 2

The .. is, again, a Clojure macro, that expands to our first version above.

More on Java interoperability.

Tooling
REPL is all good for trying out things at the promt, but for writing larger Clojure programs you'd typicall want to use something else. Comming from a Java background all three major IDE:s have some kind of Clojure support at this point:

IntelliJ/IDEA: La Clojure plug-in
NetBeans: enclojure plug-in
Eclipse: clojure-dev

Hopefully this will get you started exploring Clojure. Have fun!

2 comments:

Adriano Bonat said...

(def test-vec [1 2 3 4 5 6 7 8 9 0])

following your examples, the last value should be 10, I think you mistyped that definition.

Very good intro to Clojure!

Patrik said...

You are correct, I have updated the text. Thank you for pointing it out!