* single thread memory mapped file reader, pool of processors
* cleanup of inner classes of MetricProcessor
* doubles are parsed without external functions, strings are lazily created from byte arrays
* remove load() MappedByteBuffer in memory
* fixed handling of newline
* fix a bug & correct locale used
* MappedByteBuffer size set to 1MB
* fixed rounding
* Do not use ArrayBlockingQueue.offer since it drops elements when queue is full
* MappedByteBuffer size = 32 MB
* Adding kgeri's solution
* parallelizing CalculateAverage_kgeri
* fixing aggregation bugs, chunk size calc for small files
* removed GC logging
Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
* fix for when there's no newline at end of input
* fix for when the final record ends on the chunk boundary
---------
Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
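The note above about ArrayBlockingQueue.offer can be illustrated with a minimal sketch. The class and method names here (QueueOfferVsPut, enqueueWithOffer, transferWithPut) are hypothetical and not taken from the submission; the sketch only assumes the documented behaviour that offer() returns false on a full queue while put() blocks until space is available.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class, not part of any submission.
final class QueueOfferVsPut {

    // Counts how many of `count` items are accepted when the producer uses
    // offer() and nothing drains the queue: everything beyond `capacity`
    // is rejected (offer returns false) and would be lost if ignored.
    static int enqueueWithOffer(int capacity, int count) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < count; i++) {
            if (queue.offer(i)) {
                accepted++;
            }
        }
        return accepted;
    }

    // The producer uses put(), which blocks while the queue is full, so a
    // slower consumer still receives every item. Returns the number consumed.
    static int transferWithPut(int capacity, int count) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int[] consumed = new int[1];
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.take();
                    consumed[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            for (int i = 0; i < count; i++) {
                queue.put(i);
            }
            consumer.join(); // join makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while transferring", e);
        }
        return consumed[0];
    }
}
```

With a capacity of 2 and 10 items, the offer() variant keeps only the items that fit, while the put() variant delivers all of them.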
* improve double reading by eliminating intermediate string parsing; do calculations on integers instead of doubles and convert to double only once at the end
* more improvements, sharing a single StringBuilder to build all toStrings, minor performance gain.
* micro optimizations on reading temperature
* a small skip for redundant traversals, micro optimization
* micro optimization, eliminate some if cases, saves 0.5 seconds more
* micro optimization, calculating the key hash ahead of time eliminates one more loop, saves 0.5 seconds more :)
* optimize key equals and handling the case when a region is larger than max integer size
---------
Co-authored-by: Yavuz Tas <yavuz.tas@ing.com>
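The idea of reading temperatures as integers and converting to double only once at the end could look roughly like the sketch below. It is not the submission's code: TemperatureParser, parseTenths and toDegrees are hypothetical names, and the sketch assumes each value has an optional minus sign, at least one integer digit, a '.', and exactly one fractional digit.

```java
// Hypothetical helper, not part of any submission.
final class TemperatureParser {

    // Parses the bytes in [offset, end) as a temperature in tenths of a
    // degree, e.g. "-12.3" becomes -123. Assumes the format described in the
    // lead-in (optional '-', digits, '.', one fractional digit); no
    // validation is performed.
    static int parseTenths(byte[] buf, int offset, int end) {
        int i = offset;
        boolean negative = buf[i] == '-';
        if (negative) {
            i++;
        }
        int value = 0;
        for (; i < end; i++) {
            byte b = buf[i];
            if (b != '.') { // skip the decimal point, accumulate digits
                value = value * 10 + (b - '0');
            }
        }
        return negative ? -value : value;
    }

    // Converts an integer amount of tenths back to degrees; intended to be
    // called once per station when results are printed.
    static double toDegrees(long tenths) {
        return tenths / 10.0;
    }
}
```

Min, max and sum can then be kept as int/long values per station, with toDegrees applied only when formatting the output.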
* Initial version with multiple ideas
* Added virtual thread implementation based on certain task size
* Removed evaluate file
* Fixed test issues
* Added a custom input split
* first try
* format
* Update calculate_average_imrafaelmerino.sh
Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
* Update src/main/java/dev/morling/onebrc/CalculateAverage_imrafaelmerino.java
Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
---------
Co-authored-by: Rafael Merino García <imrafaelmerino@gmail.com>
Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
* Implementation of 1brc - felix19350
* Added license header
* Fixed failing tests
* Replaced parsing of doubles with a custom parser and integer arithmetic
---------
Co-authored-by: Bruno Felix <bruno.felix@klarna.com>
* Initial version.
* Make PGO feature optional off-by-default. Needs PGO_MODE environment
variable to be set. Add -O3 -march=native tuning flags for better
performance.
* Adjust script to be more quiet.
* Adjust max city length. Fix an issue when accumulating results.
* Tune thomaswue submission.
mmap the entire file, use Unsafe directly instead of ByteBuffer, avoid byte[] copies.
These tricks give a ~30% speedup, over an already fast implementation.
* Optimize parsing of numbers based on specific given constraints.
* Fix for segment calculation for case of very small input.
* Minor shell script fixes.
* Separate out build step into file additional_build_step_thomaswue.sh,
simplify run script and remove PGO option for now.
* Minor corrections to the run script.
---------
Co-authored-by: Alfonso² Peterssen <alfonso.peterssen@oracle.com>
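Several entries in this log deal with segment boundaries: very small inputs, a record ending exactly on a boundary, and input without a trailing newline. The following is a minimal sketch of newline-aligned splitting over a plain byte array. ChunkSplitter and boundaries are hypothetical names, and a real submission would work on a memory-mapped file instead of a byte[].

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any submission.
final class ChunkSplitter {

    // Returns offsets b[0] = 0 < b[1] < ... < b[n] = data.length such that
    // every chunk [b[i], b[i+1]) ends right after a '\n' or at end of input.
    // For empty input the list contains only 0, i.e. there are no chunks.
    static List<Integer> boundaries(byte[] data, int desiredChunks) {
        List<Integer> result = new ArrayList<>();
        result.add(0);
        // Guard against a zero chunk size when the input is smaller than
        // the number of requested chunks.
        int chunkSize = Math.max(1, data.length / Math.max(1, desiredChunks));
        int start = 0;
        while (start < data.length) {
            int end = Math.min(data.length, start + chunkSize);
            // Extend the chunk until it ends just past a newline, so that no
            // record is split. If the record already ends on the boundary,
            // the loop does not run; if there is no final newline, the chunk
            // simply runs to the end of the input.
            while (end < data.length && data[end - 1] != '\n') {
                end++;
            }
            result.add(end);
            start = end;
        }
        return result;
    }
}
```

For tiny inputs this may yield fewer chunks than requested, which is the intended behaviour in this sketch: a single short line stays in one chunk.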
* A solution with Actor Model concurrency and MappedByteBuffer
* fix test cases
* revert back the file name to original
* cache String hashCode calculation via composing with Key object
* fix wrong key caching and eliminate duplicate String creation between actors
* update possible char count in a line, fix calculate_average.sh
* increase possible line length to 256 bytes, much safer to cover 100 chars I hope
---------
Co-authored-by: Yavuz Tas <yavuz.tas@ing.com>
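Caching the hash code in a key object, as mentioned in the entries above, might be sketched as follows. StationKey is a hypothetical class and not the submission's Key; the sketch assumes the name bytes are not modified after the key is constructed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical key type, not part of any submission.
final class StationKey {
    private final byte[] name; // raw station name bytes, assumed immutable
    private final int hash;    // computed once in the constructor
    private String text;       // created lazily on first toString() call

    StationKey(byte[] name) {
        this.name = name;
        this.hash = Arrays.hashCode(name);
    }

    @Override
    public int hashCode() {
        return hash; // no recomputation on each map lookup
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof StationKey)) {
            return false;
        }
        StationKey other = (StationKey) o;
        // Compare the cached hashes first as a cheap early exit.
        return other.hash == hash && Arrays.equals(other.name, name);
    }

    @Override
    public String toString() {
        if (text == null) {
            text = new String(name, StandardCharsets.UTF_8);
        }
        return text;
    }
}
```

Used as a HashMap key, such an object avoids rehashing the name on every lookup and defers String creation until the result is printed.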
* artpar's attempt
* artpar's attempt
* remove int -> Integer conversions, custom parsing for measurements
* remove allocations by caching station names
also remove Integer and use int instead to remove valueOf calls
* Fix result mismatch errors
* parse int instead of double
* reduce time spent reading the mapped buffer
* cleanup unused memory
* less is faster ? vector addition doesn't look worth it
* backout from virtual threads as well
* Fix breaking tests
Somewhat mixed collection of multiple ideas, mostly based initially
on using the new JDK Vector API for extracting offsets of newlines
and semicolons.
Runs locally in just under 11 seconds on 1B rows of input on a
2020 M1 MacBook Air.
* isolgpus: submission 1
* isolgpus: fix min value bug (breaks if a negative temperature never appears)
* isolgpus: remove unused collector
* isolgpus: fix split on chunk bug
* isolgpus: change name equality algo to a cheaper check.
* isolgpus: fix chunking state to cope with last byte of last chunk
* isolgpus: hash as we go, instead of at the end
* isolgpus: adjust thread count to core count
* isolgpus: change cores to 8 statically
---------
Co-authored-by: Jamie Stansfield <jalstansfield@gmail.com>
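"Hash as we go, instead of at the end" can be read as folding the hash computation into the scan for the ';' separator, so the name bytes are visited only once. The sketch below is one possible reading, not the submission's code: ScanAndHash and its methods are hypothetical, the hash is the same polynomial that java.util.Arrays.hashCode uses for byte arrays, and the input is assumed to contain a ';' at or after the start offset.

```java
// Hypothetical helper, not part of any submission.
final class ScanAndHash {

    // Scans from `start` to the next ';' and computes the name hash in the
    // same loop. Returns the separator index in the high 32 bits and the
    // hash in the low 32 bits. Assumes a ';' exists; otherwise the loop
    // runs past the end of the array.
    static long scanName(byte[] buf, int start) {
        int hash = 1;
        int i = start;
        while (buf[i] != ';') {
            hash = 31 * hash + buf[i];
            i++;
        }
        return ((long) i << 32) | (hash & 0xFFFFFFFFL);
    }

    // Extracts the index of the ';' from a packed result.
    static int semicolonIndex(long packed) {
        return (int) (packed >>> 32);
    }

    // Extracts the name hash from a packed result.
    static int hash(long packed) {
        return (int) packed;
    }
}
```

Packing both values into a long is just one way to return two ints without allocating; a small result object would work as well.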
* First performance tweaks
* further tweaks
* collect into a treemap
* Tweak JVM options
* Inline rounding into collector
* reduce some operations
* oops, add missing braces
* tweak JVM options
* small fixes
* add min and max to processing
* fix min
* remove compact strings
* replace sumWithCompensation with naive sum implementation
* use UseShenandoahGC
* integrate mmap
* integrate mmap
* Fix messed up array logic
* Set jdk version