* Use open-addressing scheme to deal with hash table collisions. Reduce concurrency from 16 to 8. Use bit mask rather than mod operator to confine hash code to table range.
* Properly handle file partitions that reside entirely within a line.
* Reorder statements in doProcessBuffer.
Added SWAR (SIMD Within A Register) code to increase bytebuffer processing/throughput
Delaying the creation of the String by comparing hash, segmenting like spullara, improved EOL finding
Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
* Initial implementation using Shenandoah GC and parallel iteration
* Use memory mapped files
* Iterate the buffer once and use BigDecimal parsing instead of Double.parseDouble
* Add information about Graal
* Add sdk use to calculate script
* Memory mapped file, single-pass parsing, custom hash map, fixed thread pool
The threading was a hasty addition and needs work
* Used arraylist instead of treemap to reduce a little overhead
We only need it sorted for output, so only construct a treemap for output
* Attempt to speed up double conversion
* Cap core count for low-core systems
* Fix wrong exponent
* Accumulate measurement value in double, seems marginally faster
Benchmark Mode Cnt Score Error Units
DoubleParsingBenchmark.ourToDouble thrpt 10 569.771 ± 7.065 ops/us
DoubleParsingBenchmark.ourToDoubleAccumulateInToDouble thrpt 10 648.026 ± 7.741 ops/us
DoubleParsingBenchmark.ourToDoubleDivideInsteadOfMultiply thrpt 10 570.412 ± 9.329 ops/us
DoubleParsingBenchmark.ourToDoubleNegative thrpt 10 512.618 ± 8.580 ops/us
DoubleParsingBenchmark.ourToDoubleNegativeAccumulateInToDouble thrpt 10 565.043 ± 18.137 ops/us
DoubleParsingBenchmark.ourToDoubleNegativeDivideInsteadOfMultiply thrpt 10 511.228 ± 13.967 ops/us
DoubleParsingBenchmark.stringToDouble thrpt 10 52.310 ± 1.351 ops/us
DoubleParsingBenchmark.stringToDoubleNegative thrpt 10 50.785 ± 1.252 ops/us