* Use Integer calculation instead of double, add unit-test
* Bring back StationIdent optimization
Originally, StationIdent was using byte[] to store names, so the extra
String allocation could be avoided. However, that produced incorrect
sorting.
Sorting is now moved to the result merging step. Here, names are
converted to Strings.
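A minimal sketch of the "sort at merge time" idea described above. The source does not say why sorting on `byte[]` was incorrect, so this only shows the shape of the fix. `MergeSketch`, `NameKey`, and the `{min, max, sum, count}` layout are hypothetical names chosen for illustration, not the actual StationIdent code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MergeSketch {
    /** Hypothetical key type: wraps the raw UTF-8 bytes of a station name. */
    static final class NameKey {
        final byte[] utf8;

        NameKey(byte[] utf8) {
            this.utf8 = utf8;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof NameKey && Arrays.equals(utf8, ((NameKey) o).utf8);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(utf8);
        }
    }

    /**
     * Merges per-worker results. Each value is assumed to be {min, max, sum, count}.
     * Strings are created only here, not in the parsing loop, and the TreeMap
     * orders the names by String.compareTo.
     */
    static TreeMap<String, long[]> merge(List<Map<NameKey, long[]>> partials) {
        TreeMap<String, long[]> sorted = new TreeMap<>();
        for (Map<NameKey, long[]> partial : partials) {
            for (Map.Entry<NameKey, long[]> e : partial.entrySet()) {
                String name = new String(e.getKey().utf8, StandardCharsets.UTF_8);
                sorted.merge(name, e.getValue().clone(), (a, b) -> new long[] {
                        Math.min(a[0], b[0]), Math.max(a[1], b[1]), a[2] + b[2], a[3] + b[3] });
            }
        }
        return sorted;
    }
}
```

With this shape, the number of String allocations is bounded by the number of distinct names per worker rather than by the number of input lines.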
* Implement readStationName with SIMD 256bit
* Rebase and cleanup test code, now that it's in the project
* Fix seijikun formatting
* Fix test failure in specific jobCnt edge-cases
* Also switch to graalvm
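The integer calculation mentioned in the first item above could look roughly like this: a value such as `-12.3` is parsed straight from bytes into tenths (`-123`), so sums can be kept in `int`/`long` without floating-point parsing. `FixedPointSketch` and `parseTenths` are hypothetical names, and the sketch assumes well-formed input with exactly one fractional digit.

```java
final class FixedPointSketch {
    /**
     * Parses the bytes in [start, end) as a decimal number with one fractional
     * digit and returns it scaled by 10, e.g. "-12.3" becomes -123.
     * Assumes well-formed input; no validation is done.
     */
    static int parseTenths(byte[] buf, int start, int end) {
        int i = start;
        boolean negative = buf[i] == '-';
        if (negative) {
            i++;
        }
        int value = 0;
        for (; i < end; i++) {
            byte b = buf[i];
            if (b != '.') {
                value = value * 10 + (b - '0'); // skip the dot, accumulate digits
            }
        }
        return negative ? -value : value;
    }
}
```

The division by 10 then happens once per station when results are printed, not once per measurement.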
If keys collide, a broken implementation will likely attribute
measurements to the wrong key, so the sample should use non-zero
values: a misattributed measurement then yields a wrong average.
When all measurements are zero, the averages are also zero even
when attributed to the wrong keys.
Updates #91
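As an illustration only, a sample with non-zero values could be generated along these lines. `SampleGenSketch`, the `id<i>` key scheme, and the value formula are assumptions made for this sketch, not the generator actually used for the test sample.

```java
final class SampleGenSketch {
    /** Builds one "id<i>;<value>" line with a deterministic, never-zero value. */
    static String line(int i) {
        int tenths = (i % 1999) - 999; // -999..999, i.e. -99.9..99.9
        if (tenths == 0) {
            tenths = 1; // avoid 0.0 so every measurement carries a non-zero value
        }
        int abs = Math.abs(tenths);
        return "id" + i + ";" + (tenths < 0 ? "-" : "") + (abs / 10) + "." + (abs % 10);
    }

    public static void main(String[] args) {
        // Prints the sample to stdout; redirect it to a file as needed.
        for (int i = 1; i <= 10000; i++) {
            System.out.println(line(i));
        }
    }
}
```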
* Added tests for endian-calculations (had these in a different class, perhaps handy for others to see as well)
Inlined the hash function; runs locally in 2.4 sec now. Hopefully the endian issues are fixed
Added equals to support any city name up to 1024 in length; don't rely on the hash alone
* For clarity I've updated the code so endianness doesn't change the hashes; easier to debug.
* Fixing bug in array check
Simple is faster
* Also spotted the diff, not just the big exception
Fixed buffer limit issue
Input created via
```sh
bash -c 'for i in {1..10000} ; do echo "id$i;0.0" ; done' >./src/test/resources/samples/measurements-10000-unique-keys.txt
```
and output via baseline implementation.
Keys are short and very similar, which increases the chance of
collisions and makes them good for testing.
Fixes #91
The script tests all implementations and prints PASS or FAIL status.
In case of failure it also prints implementation output to stderr.
This will be handy for adding new test samples.
Show test statuses and omit failing output:
```sh
$ ./test_all.sh 2>/dev/null
PASS artsiomkorzun
PASS baseline
PASS bjhara
PASS criccomini
FAIL ddimtirov
FAIL ebarlas
PASS filiphr
FAIL itaske
PASS khmarbaise
FAIL kuduwa-keshavram
FAIL lawrey
PASS padreati
FAIL palmr
PASS richardstartin
FAIL royvanrijn
FAIL seijikun
PASS spullara
PASS truelive
```
Show only passing implementations:
```sh
$ ./test_all.sh 2>/dev/null | grep PASS | cut -d' ' -f2
artsiomkorzun
baseline
bjhara
criccomini
filiphr
khmarbaise
padreati
richardstartin
spullara
truelive
```
For #61
* Use open-addressing scheme to deal with hash table collisions. Reduce concurrency from 16 to 8. Use bit mask rather than mod operator to confine hash code to table range.
* Properly handle file partitions that reside entirely within a line.
* Reorder statements in doProcessBuffer.
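A minimal sketch of the open-addressing scheme with a bit mask mentioned above, assuming linear probing and a fixed power-of-two capacity. `OpenAddressingSketch` and its fields are hypothetical and not taken from the actual implementation. There is no resizing, so the sketch assumes fewer distinct keys than slots; a full table would make the probe loop spin forever.

```java
import java.util.Arrays;

final class OpenAddressingSketch {
    private static final int CAPACITY = 1 << 4; // must be a power of two for the mask to work
    private static final int MASK = CAPACITY - 1;

    private final byte[][] keys = new byte[CAPACITY][];
    private final long[] sums = new long[CAPACITY];
    private final int[] counts = new int[CAPACITY];

    /** Returns the slot holding key, or the first empty slot on its probe path. */
    private int probe(byte[] key, int hash) {
        int slot = hash & MASK; // unlike hash % CAPACITY, never negative
        while (keys[slot] != null && !Arrays.equals(keys[slot], key)) {
            slot = (slot + 1) & MASK; // collision: try the next slot, wrapping around
        }
        return slot;
    }

    void add(byte[] key, int hash, int value) {
        int slot = probe(key, hash);
        if (keys[slot] == null) {
            keys[slot] = key.clone(); // first time this key is seen
        }
        sums[slot] += value;
        counts[slot]++;
    }

    /** Sum for key, or 0 if the key was never added. */
    long sum(byte[] key, int hash) {
        return sums[probe(key, hash)];
    }

    /** Count for key, or 0 if the key was never added. */
    int count(byte[] key, int hash) {
        return counts[probe(key, hash)];
    }
}
```

Because the probe compares full keys with `Arrays.equals`, two names with the same hash end up in different slots instead of sharing one aggregate.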