Clojure

Concurrent Programming

Today’s systems have to deal with many simultaneous tasks and leverage the power of multi-core CPUs. Doing so with threads can be very difficult due to the complexities of synchronization. Clojure simplifies multi-threaded programming in several ways. Because the core data structures are immutable, they can be shared readily between threads. However, it is often necessary to have state change in a program. Clojure, being a practical language, allows state to change but provides mechanism to ensure that, when it does so, it remains consistent, while alleviating developers from having to avoid conflicts manually using locks etc. The software transactional memory system (STM), exposed through dosync, ref, ref-set, alter, et al, supports sharing changing state between threads in a synchronous and coordinated manner. The agent system supports sharing changing state between threads in an asynchronous and independent manner. The atoms system supports sharing changing state between threads in a synchronous and independent manner. The dynamic var system, exposed through def, binding, et al, supports isolating changing state within threads.

In all cases, Clojure does not replace the Java thread system, rather it works with it. Clojure functions are java.util.concurrent.Callable, therefore they work with the Executor framework etc.

Most of this is covered in more detail in the concurrency screencast.

Refs are mutable references to objects. They can be ref-set or altered to refer to different objects during transactions, which are delimited by dosync blocks. All ref modifications within a transaction happen or none do. Also, reads of refs within a transaction reflect a snapshot of the reference world at a particular point in time, i.e. each transaction is isolated from other transactions. If a conflict occurs between 2 transactions trying to modify the same reference, one of them will be retried. All of this occurs without explicit locking.

In this example a vector of Refs containing integers is created (refs), then a set of threads are set up (pool) to run a number of iterations of incrementing every Ref (tasks). This creates extreme contention, but yields the correct result. No locks!

(import '(java.util.concurrent Executors))

(defn test-stm [nitems nthreads niters]
  (let [refs  (map ref (repeat nitems 0))
        pool  (Executors/newFixedThreadPool nthreads)
        tasks (map (fn [t]
                      (fn []
                        (dotimes [n niters]
                          (dosync
                            (doseq [r refs]
                              (alter r + 1 t))))))
                   (range nthreads))]
    (doseq [future (.invokeAll pool tasks)]
      (.get future))
    (.shutdown pool)
    (map deref refs)))

(test-stm 10 10 10000)
-> (550000 550000 550000 550000 550000 550000 550000 550000 550000 550000)

In typical use refs can refer to Clojure collections, which, being persistent and immutable, efficiently support simultaneous speculative 'modifications' by multiple transactions. Mutable objects should not be put in refs.

By default Vars are static, but per-thread bindings for Vars defined with metadata mark them as dynamic. Dynamic vars are also mutable references to objects. They have a (thread-shared) root binding which can be established by def, and can be set using *set!*, but only if they have been bound to a new storage location thread-locally using binding. Those bindings and any subsequent modifications to those bindings will only be seen within the thread by code in the dynamic scope of the binding block. Nested bindings obey a stack protocol and unwind as control exits the binding block.

(def ^:dynamic *v*)

(defn incv [n] (set! *v* (+ *v* n)))

(defn test-vars [nthreads niters]
  (let [pool (Executors/newFixedThreadPool nthreads)
        tasks (map (fn [t]
                     #(binding [*v* 0]
                        (dotimes [n niters]
                          (incv t))
                        *v*))
                   (range nthreads))]
      (let [ret (.invokeAll pool tasks)]
        (.shutdown pool)
        (map #(.get %) ret))))

(test-vars 10 1000000)
-> (0 1000000 2000000 3000000 4000000 5000000 6000000 7000000 8000000 9000000)
(set! *v* 4)
-> java.lang.IllegalStateException: Can't change/establish root binding of: *v* with set

Dynamic vars provide a way to communicate between different points on the call stack without polluting the argument lists and return values of the intervening calls. In addition, dynamic vars support a flavor of context-oriented programming. Because fns defined with defn are stored in vars, they too can be dynamically rebound:

(defn ^:dynamic say [& args]
  (apply str args))

(defn loves [x y]
  (say x " loves " y))

(defn test-rebind []
  (println (loves "ricky" "lucy"))
  (let [say-orig say]
    (binding [say (fn [& args]
                      (println "Logging say")
                      (apply say-orig args))]
      (println (loves "fred" "ethel")))))

(test-rebind)

ricky loves lucy
Logging say
fred loves ethel