user=> (map (fn [x] (.toUpperCase x)) (.split "Dasher Dancer Prancer" " "))
("DASHER" "DANCER" "PRANCER")
Clojure has a rich set of data structures. They share a set of properties:
They are immutable
They are read-able
They support proper value equality semantics in their implementation of equals
They provide good hash values
In addition, the collections:
Are manipulated via interfaces.
Support sequencing
Support persistent manipulation.
Support metadata
Implement java.lang.Iterable
Implement the non-optional (read-only) portion of java.util.Collection or java.util.Map
nil is a possible value of any data type in Clojure. nil has the same value as Java null. The Clojure conditional system is based around nil and false, with nil and false representing the values of logical falsity in conditional tests - anything else is logical truth. In addition, nil is used as the end-of-sequence sentinel value in the sequence protocol.
Clojure provides full support for JVM primitive values by default, allowing high performance, idiomatic Clojure code for numeric applications.
Clojure also supports the Java boxed number types derived from java.lang.Number, including BigInteger and BigDecimal, plus its own Ratio type. There is some special handling:
By default Clojure operates with natural numbers as instances of Java’s long primitive type. When a primitive integer operation results in a value that is too large to be contained in a primitive value, a java.lang.ArithmeticException is thrown. Clojure provides a set of alternative math operators suffixed with an apostrophe: +', -', *', inc', and dec'. These operators auto-promote to BigInt upon overflow, but are less efficient than the regular math operators.
Represents a ratio between integers. Division of integers that can’t be reduced to an integer yields a ratio, i.e. 22/7 = 22/7, rather than a floating point or truncated value.
BigInts and floating point types are "contagious" across operations. That is, any integer operation involving a BigInt will result in a BigInt, and any operation involving a double or float will result in a double.
Numeric literals for BigInt and BigDecimal are specified using a postfix N and M respectively.
Example expression | Return value |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Computation: + - * / inc dec quot rem min max
Auto-promoting computation: +' -' *' inc' dec'
Comparison: == < <= > >= zero? pos? neg?
Bitwise operations: bit-and bit-or bit-xor bit-not bit-shift-right bit-shift-left
Ratios: numerator denominator
Coercions: int bigdec bigint double float long num short
Clojure strings are Java Strings. See also Printing.
user=> (map (fn [x] (.toUpperCase x)) (.split "Dasher Dancer Prancer" " "))
("DASHER" "DANCER" "PRANCER")
Keywords are symbolic identifiers that evaluate to themselves. They provide very fast equality tests. Like Symbols, they have names and optional namespaces, both of which are strings. The leading ':' is not part of the namespace or name.
Keywords implement IFn for invoke() of one argument (a map) with an optional second argument (a default value). For example (:mykey my-hash-map :none)
means the same as (get my-hash-map :mykey :none)
. See get.
Symbols are identifiers that are normally used to refer to something else. They can be used in program forms to refer to function parameters, let bindings, class names, class members, and global vars. They have names and optional namespaces, both of which are strings. Symbols can have metadata (see with-meta).
Symbols, just like Keywords, implement IFn for invoke() of one argument (a map) with an optional second argument (a default value). For example ('mysym my-hash-map :none)
means the same as (get my-hash-map 'mysym :none)
. See get.
All of the Clojure collections are immutable and persistent. In particular, the Clojure collections support efficient creation of 'modified' versions, by utilizing structural sharing, and make all of their performance bound guarantees for persistent use. The collections are efficient and inherently thread-safe. Collections are represented by abstractions, and there may be one or more concrete realizations. In particular, since 'modification' operations yield new collections, the new collection might not have the same concrete type as the source collection, but will have the same logical (interface) type.
All the collections support count for getting the size of the collection, conj for 'adding' to the collection, and seq to get a sequence that can walk the entire collection, though their specific behavior is slightly different for different types of collections.
Because collections support the seq function, all of the sequence functions can be used with any collection.
Clojure provides its own hash computations that provide better hash properties for collections (and other types), known as the hasheq value.
The IHashEq
interface marks collections that provide the hasheq()
function to obtain the hasheq value. In Clojure, the hash function can be used to compute the hasheq value.
Ordered collections (vector, list, seq, etc) must use the following algorithm for calculating hasheq (where hash computes hasheq). Note that unchecked-add-int and unchecked-multiply-int are used to get integer overflow calculations.
(defn hash-ordered [collection]
(-> (reduce (fn [acc e] (unchecked-add-int
(unchecked-multiply-int 31 acc)
(hash e)))
1
collection)
(mix-collection-hash (count collection))))
Unordered collections (maps, sets) must use the following algorithm for calculating hasheq. A map entry is treated as an ordered collection of key and value. Note that unchecked-add-int is used to get integer overflow calculations.
(defn hash-unordered [collection]
(-> (reduce unchecked-add-int 0 (map hash collection))
(mix-collection-hash (count collection))))
The mix-collection-hash algorithm is an implementation detail subject to change.
A Vector is a collection of values indexed by contiguous integers. Vectors support access to items by index in log32N hops. count is O(1). conj puts the item at the end of the vector. Vectors also support rseq, which returns the items in reverse order. Vectors implement IFn, for invoke() of one argument, which they presume is an index and look up in themselves as if by nth, i.e. vectors are functions of their indices. Vectors are compared first by length, then each element is compared in order.
A Map is a collection that maps keys to values. Two different map types are provided - hashed and sorted. Hash maps require keys that correctly support hashCode and equals. Sorted maps require keys that implement Comparable, or an instance of Comparator. Hash maps provide faster access (log32N hops) vs (logN hops), but sorted maps are, well, sorted. count is O(1). conj expects another (possibly single entry) map as the item, and returns a new map which is the old map plus the entries from the new, which may overwrite entries of the old. conj also accepts a MapEntry or a vector of two items (key and value). seq returns a sequence of map entries, which are key/value pairs. Sorted map also supports rseq, which returns the entries in reverse order. Maps implement IFn, for invoke() of one argument (a key) with an optional second argument (a default value), i.e. maps are functions of their keys. nil keys and values are ok.
Create a new map: hash-map sorted-map sorted-map-by
'change' a map: assoc dissoc select-keys merge merge-with zipmap
Examine a map: get contains? find keys vals map?
Examine a map entry: key val
Most uses of StructMaps would now be better served by records. |
Often many map instances have the same base set of keys, for instance when maps are used as structs or objects would be in other languages. StructMaps support this use case by efficiently sharing the key information, while also providing optional enhanced-performance accessors to those keys. StructMaps are in all ways maps, supporting the same set of functions, are interoperable with all other maps, and are persistently extensible (i.e. struct maps are not limited to their base keys). The only restriction is that you cannot dissociate a struct map from one of its base keys. A struct map will retain its base keys in order.
StructMaps are created by first creating a structure basis object using create-struct or defstruct, then creating instances with struct-map or struct.
(defstruct desilu :fred :ricky)
(def x (map (fn [n]
(struct-map desilu
:fred n
:ricky 2
:lucy 3
:ethel 4))
(range 100000)))
(def fred (accessor desilu :fred))
(reduce (fn [n y] (+ n (:fred y))) 0 x)
-> 4999950000
(reduce (fn [n y] (+ n (fred y))) 0 x)
-> 4999950000
StructMap setup: create-struct defstruct accessor
Create individual struct: struct-map struct
When doing code form manipulation it is often desirable to have a map which maintains key order. An array map is such a map - it is simply implemented as an array of key val key val… As such, it has linear lookup performance, and is only suitable for very small maps. It implements the full map interface. New ArrayMaps can be created with the array-map function. Note that an array map will only maintain sort order when un-'modified'. Subsequent assoc-ing will eventually cause it to 'become' a hash-map.
Sets are collections of unique values.
There is literal support for hash-sets:
#{:a :b :c :d}
-> #{:d :a :b :c}
You can create sets with the hash-set and sorted-set functions:
(hash-set :a :b :c :d)
-> #{:d :a :b :c}
(sorted-set :a :b :c :d)
-> #{:a :b :c :d}
You can also get a set of the values in a collection using the set function:
(set [1 2 3 2 1 2 3])
-> #{1 2 3}
Sets are collections:
(def s #{:a :b :c :d})
(conj s :e)
-> #{:d :a :b :e :c}
(count s)
-> 4
(seq s)
-> (:d :a :b :c)
(= (conj s :e) #{:a :b :c :d :e})
-> true
Sets support 'removal' with disj, as well as contains? and get, the latter returning the object that is held in the set which compares equal to the key, if found:
(disj s :d)
-> #{:a :b :c}
(contains? s :b)
-> true
(get s :a)
-> :a
Sets are functions of their members, using get:
(s :b)
-> :b
(s :k)
-> nil
Clojure provides basic set operations like union / difference / intersection, as well as some pseudo-relational algebra support for 'relations', which are simply sets of maps - select / index / rename / join.