### How LongAccumulator and DoubleAccumulator classes work?

Two classes new in Java 8 deserve some attention: `LongAccumulator` and `DoubleAccumulator`. They are designed to accumulate (more on what does that mean later) values across threads safely while being extremely fast. A test is worth a thousand words, so here is how it works:

```class AccumulatorSpec extends Specification {

public static final long A = 1
public static final long B = 2
public static final long C = 3
public static final long D = -4
public static final long INITIAL = 0L

def 'should add few numbers'() {
given:
LongAccumulator accumulator = new LongAccumulator({ long x, long y -> x + y }, INITIAL)
when:
accumulator.accumulate(A)
accumulator.accumulate(B)
accumulator.accumulate(C)
accumulator.accumulate(D)
then:
accumulator.get() == INITIAL + A + B + C + D
}
```
So the accumulator takes a binary operator and combines initial value with every accumulated value. That means `((((0 + 1) + 2) + 3) + -4)` equals to `2`. Don't go away yet, there's much more than that. Accumulator can take other operators as well, as illustrated by this use case:

```def 'should accumulate numbers using operator'() {
given:
LongAccumulator accumulator = new LongAccumulator(operator, initial)
when:
accumulator.accumulate(A)
accumulator.accumulate(B)
accumulator.accumulate(C)
accumulator.accumulate(D)
then:
accumulator.get() == expected
where:
operator                 | initial           || expected
{x, y -> x + y}          | 0                 || A + B + C + D
{x, y -> x * y}          | 1                 || A * B * C * D
{x, y -> Math.max(x, y)} | Integer.MIN_VALUE || max(A, B, C, D)
{x, y -> Math.min(x, y)} | Integer.MAX_VALUE || min(A, B, C, D)
}
```
Obviously accumulator would work just as well under heavy multi-threaded environment - which it was designed for. Now the question is, what other operations are permitted in `LongAccumulator` (this applies to `DoubleAccumulator` as well) and why? JavaDoc is not very formal this time (bold mine):

The order of accumulation within or across threads is not guaranteed and cannot be depended upon, so this class is only applicable to functions for which the order of accumulation does not matter. The supplied accumulator function should be side-effect-free, since it may be re-applied when attempted updates fail due to contention among threads. The function is applied with the current value as its first argument, and the given update as the second argument.

In order to understand how `LongAccumulator` works, what type of operations are permitted and why it's so fast (because it is, compared to e.g `AtomicLong`), let's start from the back, the `get()` method:

```transient volatile long base;
transient volatile Cell[] cells;

private final LongBinaryOperator function;

public long get() {
Cell[] as = cells; Cell a;
long result = base;
if (as != null) {
for (int i = 0; i < as.length; ++i) {
if ((a = as[i]) != null)
result = function.applyAsLong(result, a.value);
}
}
return result;
}
```
Which can be rewritten to not-exactly-equivalent but easier to read:

```public long get() {
long result = base;
for (Cell cell : cells)
result = function.applyAsLong(result, cell.value);
return result;
}
```
Or even more functionally without internal state:

```public long get() {
return Arrays.stream(cells)
.map(s -> s.value)
.reduce(base, function::applyAsLong);
}
```
We clearly see that there is some internal `cells` array and that in the end we must go through that array and apply our operator function sequentially on each element. Turns out `LongAccumulator` has two mechanisms for accumulating values: a single `base` counter and an array of values in case of high lock thread contention. If `LongAccumulator` is used under no lock contention, only a single `volatile base` variable and CAS operations are used, just like in `AtomicLong`. However if CAS fails, this class falls back to an array of values. You don't want to see the implementation, it's 90 lines long, occasionally with 8 levels of nesting. What you need to know is that it uses simple algorithm to always assign given thread to the same cell (improves cache locality). From now on this thread has its own, almost private copy of counter. It shares this copy with couple of other threads, but not with all of them - they have their own cells. So what you end up in the end is an array of semi-calculated counters which must be aggregated. This is what you saw in `get()` method.

This brings us again to the question, what kind of operators (`op`) are permitted in `LongAccumulator`. We know that the same sequence of accumulations under low load will result e.g. in:

```((I op A) op B)  //get()
```
Which means all values are aggregated in base variable and no counter array is used. However under high load, `LongAccumulator` will split work e.g. into two buckets (cells) and later accumulate buckets as well:

```(I op A)              //cell 1
(I op B)              //cell 2

(I op A) op (I op B)  //get()
```
or vice-versa:

```(I op B)              //cell 1
(I op A)              //cell 2

(I op B) op (I op A)  //get()
```
Clearly all invocations of `get()` should yield the same result, but it all depends on the properties of `op` operator being provided (`+`, `*`, `max`, etc.)

# Commutative

We have no control over the order of cells and how they are assigned. That's why `((I op A) op (I op B))` and `((I op B) op (I op A))` must return the same result. More compactly we are looking for such operators `op` where `X op Y = Y op X` for every `X` and `Y`. This means `op` must be commutative.

# Neutral element (identity)

Cells are logically initialized with identity (initial) value `I`. We have no control over the number and order of cells, thus the identity value can be applied numerous times in any order. However this is an implementation detail, so it shouldn't affect the result. More precisely, for every `X` and any `op`:

```X op I = I op X = X
```
Which means the identity (initial) value `I` must be a neutral value for every argument `X` to operator `op`.

# Associativity

Assume we have the following cells:

```I op A                              // cell 1
I op B                              // cell 2
I op C                              // cell 3
((I op A) op (I op B)) op (I op C)  //get()
```
but the next time they were arranged differently

```I op C                              // cell 1
I op B                              // cell 2
I op A                              // cell 2
((I op C) op (I op B)) op (I op A)  //get()
```
Knowing that `op` is commutative and `I` is a neutral element, we can prove that (for every `A`, `B` and `C`):

```((I op A) op (I op B)) op (I op C) = ((I op C) op (I op B)) op (I op A)
(A op B) op C = (C op B) op A
(A op B) op C = A op (B op C)
```
Which proves that `op` must be associative in order for `LongAccumulator` to actually work.

# Wrap up

`LongAccumulator` and `DoubleAccumulator` are highly specialized classes new in JDK 8. JavaDoc is quite vaque but we tried to prove properties that an operator and initial value must fullfil in order for them to do their job. We know that the operator must be associative, commutative and have a neutral element. It would have been so much better if JavaDoc clearly stated that it must be an abelian monoid ;-). Nevertheless for practical purposes these accumulators work only for adding, multiplying, min and max, as these are the only useful operators (with appropriate neutral element) that play well. Subtracting and dividing for example is not associative and commutative, thus can't possibly work. To make matters worse, accumulators would simply behave undeterministically.

1. 1. 