* Initial submission for jonathan_aotearoa
* Fixing typos
* Adding hyphens to the prepare and calculate shell script names so that they're aligned with my GitHub username.
* Making chunk processing more robust in an attempt to fix the cause of the build error.
* Fixing typo.
* Fixed the handling of files less than 8 bytes in length.
* Additional assertion, comment improvements.
* Refactoring to improve testability. Additional assertion and comments.
* Updating collision checking to also verify that the station names are equal (see the sketch below).
* Minor refactoring to make param ordering consistent.
* Adding a custom toString method for the results map.
* Fixing collision checking bug
* Fixing rounding bug.
* Fixing collision bug.
---------
Co-authored-by: jonathan <jonathan@example.com>
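Several of the fixes above revolve around collision handling, so a minimal sketch of the idea may help: a matching hash alone is not enough, the stored station-name bytes must also be compared before a slot is treated as the same station. The `StationTable` class, its sizing, and linear probing below are illustrative, not the submission's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Minimal open-addressing table illustrating the collision double check.
final class StationTable {
    private final byte[][] names = new byte[1 << 14][];
    private final int[] counts = new int[1 << 14];

    private int slotFor(byte[] name, int hash) {
        int mask = names.length - 1;
        int slot = hash & mask;
        // Advance past occupied slots whose name bytes differ,
        // even when the hash collides.
        while (names[slot] != null && !Arrays.equals(names[slot], name)) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    void add(byte[] name, int hash) {
        int slot = slotFor(name, hash);
        names[slot] = name;
        counts[slot]++;
    }

    int count(byte[] name, int hash) {
        return counts[slotFor(name, hash)];
    }

    public static void main(String[] args) {
        StationTable t = new StationTable();
        byte[] a = "Auckland".getBytes(StandardCharsets.UTF_8);
        t.add(a, 42);
        // Same hash, different name: must land in a different slot.
        t.add("Wellington".getBytes(StandardCharsets.UTF_8), 42);
        t.add(a, 42);
        System.out.println(t.count(a, 42)); // 2
    }
}
```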
* CalculateAverage_pdrakatos
* Rename to comply with the rules
* Changes to script execution
* Fixing bugs that caused the scripts not to be executed
* Changes to the prepare script to make it compatible
* Fixing the submission so that all tests pass
* Increase the direct memory allocation buffer (see the sketch below)
* Fixing the memory problem that caused a heap-space exception
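A hedged sketch of the direct-memory issue: direct `ByteBuffer`s live outside the Java heap and their total size is capped by `-XX:MaxDirectMemorySize`, so a too-small cap fails allocation with an `OutOfMemoryError` even when the heap itself has room. The buffer size and flag value below are illustrative, not the submission's actual settings.

```java
import java.nio.ByteBuffer;

// If the direct-memory cap is too small, allocateDirect throws
// "OutOfMemoryError: Direct buffer memory" regardless of heap headroom,
// hence raising the cap in the calculate script.
public class DirectBufferDemo {
    public static void main(String[] args) {
        // e.g. run with: java -XX:MaxDirectMemorySize=2g DirectBufferDemo
        ByteBuffer chunk = ByteBuffer.allocateDirect(256 * 1024 * 1024);
        System.out.println("direct buffer capacity: " + chunk.capacity());
    }
}
```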
* Initial impl
* Fix bad file descriptor error in the `calculate_average_serkan-ozal.sh`
* Disable Epsilon GC and rely on the default GC, because JIT and Epsilon GC apparently don't play well together on the eval machine for short-lived Vector API `ByteVector` objects
* Take care of byte order before processing the key length with bit-shift operators (see the sketch below)
* Fix key equality check for long keys
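A minimal sketch of the byte-order point, assuming keys are loaded eight bytes at a time into a `long` (the class name and constants below are illustrative): the bit-shift math that locates the `;` separator and derives the key length assumes a fixed, little-endian byte layout, so on a platform with a different native order the word must be swapped first.

```java
import java.nio.ByteOrder;

// Normalize a word loaded from the mapped file to a little-endian view
// before any shift-based length computation.
public class ByteOrderDemo {
    static long asLittleEndian(long word) {
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN
                ? word
                : Long.reverseBytes(word);
    }

    public static void main(String[] args) {
        long word = asLittleEndian(0x0102030405060708L);
        // With a little-endian view, byte 0 of the key is the low byte,
        // so shifts and masks can scan for the ';' separator reliably.
        System.out.printf("low byte: 0x%02X%n", word & 0xFF);
    }
}
```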
/**
* Solution based on thomaswue's solution, commit:
* commit d0a28599c2
* Author: Thomas Wuerthinger <thomas.wuerthinger@oracle.com>
* Date: Sun Jan 21 20:13:48 2024 +0100
*
* Changes:
* 1) Use LinkedBlockingQueue to store partial results, which
* will then be merged into the final map later.
* As different chunks finish at different times, this allows
* them to be processed as they finish, instead of joining the
* threads sequentially (see the sketch after this comment).
* This change seems more useful for the 10k dataset, as the
* runtime difference of each chunk is greater.
* 2) Use only 4 threads if the file is >= 14GB.
* This showed much better results on my local test, but I only
* ran with 200 million rows (because of limited RAM), and I have
* no idea how it will perform on the 1brc HW.
*/
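A minimal sketch of change (1), with a stand-in computation in place of real chunk processing; the class and variable names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;

// Each worker pushes its partial map onto the queue as soon as it finishes,
// and the main thread merges results in completion order instead of joining
// the worker threads one by one.
public class MergeInCompletionOrder {
    public static void main(String[] args) throws InterruptedException {
        int chunks = 8;
        var queue = new LinkedBlockingQueue<Map<String, Integer>>();
        for (int i = 0; i < chunks; i++) {
            final int chunk = i;
            Thread.ofPlatform().start(() -> {
                // Stand-in for processing one file chunk.
                queue.add(Map.of("chunk-" + chunk, chunk));
            });
        }
        Map<String, Integer> merged = new HashMap<>();
        for (int i = 0; i < chunks; i++) {
            merged.putAll(queue.take()); // blocks until the next chunk is done
        }
        System.out.println(merged.size() + " partial results merged");
    }
}
```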
* fix test rounding, pass 10K station names
* improved integer conversion, delayed string creation.
* new hash algorithm, use ConcurrentHashMap
* fix rounding test
* added the length of the string to the hash initialization.
* fix hash code collisions
* cleanup prepare script
* native image options
* fix quadratic probing (no change to perf)
* mask to get the last chunk of the name
* extract hash functions
* tweak the probing loop (-100ms)
* fiddle with native image options
* Reorder conditions in the hope that it makes the branch predictor happier
* extracted constant
* Improve hash function
* remove limit on number of cores
* fix calculation of boundaries between chunks (see the sketch below)
* fix IOOBE (IndexOutOfBoundsException)
---------
Co-authored-by: Jason Nochlin <hundredwatt@users.noreply.github.com>
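For the chunk-boundary fix above, here is a sketch of the usual splitting scheme, with hypothetical names and a toy in-memory input (real solutions do this over the mapped file): cut at even byte offsets, then push each cut forward to the next newline so no record straddles two chunks. Getting that advance wrong, especially near the end of the file, is exactly the kind of bug an IndexOutOfBoundsException fix addresses.

```java
// Split the input among N workers without splitting any line.
public class ChunkBoundaries {
    static long[] boundaries(byte[] file, int parts) {
        long[] cuts = new long[parts + 1];
        cuts[parts] = file.length;
        for (int i = 1; i < parts; i++) {
            int pos = (int) ((long) file.length * i / parts);
            while (pos < file.length && file[pos] != '\n') {
                pos++; // advance to the end of the current line
            }
            cuts[i] = Math.min(pos + 1, file.length); // start of next line
        }
        return cuts;
    }

    public static void main(String[] args) {
        byte[] data = "Oslo;3.5\nParis;10.2\nRome;18.0\n".getBytes();
        for (long cut : boundaries(data, 3)) {
            System.out.println(cut);
        }
    }
}
```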
* Contribution by albertoventurini
* Shave off a couple hundred milliseconds by making an assumption about temperature readings (see the sketch below)
* Parse the reading without a loop, inspired by other solutions
* Use all cores
* Small improvements, only allocate 247 positions instead of 256
---------
Co-authored-by: Alberto Venturini <alberto.venturini@accso.de>
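A sketch of the temperature assumption: the challenge guarantees readings between -99.9 and 99.9 with exactly one decimal digit, so a value can be parsed into integer tenths with fixed offsets instead of a general parsing loop. The method below is illustrative, not the submission's exact code.

```java
// Readings are always -?d?d.d, so branch once on the sign and once on the
// dot position; no digit loop needed.
public class FixedFormatTemperature {
    static int parseTenths(byte[] line, int off) {
        boolean negative = line[off] == '-';
        if (negative) off++;
        int value;
        if (line[off + 1] == '.') {                  // d.d
            value = (line[off] - '0') * 10 + (line[off + 2] - '0');
        } else {                                     // dd.d
            value = (line[off] - '0') * 100
                  + (line[off + 1] - '0') * 10
                  + (line[off + 3] - '0');
        }
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        System.out.println(parseTenths("-12.3".getBytes(), 0)); // -123
        System.out.println(parseTenths("4.5".getBytes(), 0));   // 45
    }
}
```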
* Update with Rounding Bugfix
* Simplification of Merging Results
* More Plain Java Code for Value Storage
* Improve Performance by Stupid Hash
Drop around 3 seconds on my machine by
simplifying the hash to be ridiculously stupid,
but faster (see the sketch below).
* Fix outdated comment
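A hedged illustration of the trade-off; the exact hash below is made up, but the shape of the idea is the same: far fewer instructions per key at the cost of more collisions, which the name-equality check already handles for correctness.

```java
// "Ridiculously stupid" hash: mix just the first byte and the length.
// More collisions than a real mixer, but only a couple of instructions.
public class StupidHash {
    static int stupidHash(byte[] name) {
        return (name[0] * 31 + name.length) & 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.println(stupidHash("Hamburg".getBytes()));
        System.out.println(stupidHash("Helsinki".getBytes()));
    }
}
```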
* Dmitry challenge
* Dmitry submit 2.
Use MemorySegment of FileChannel and Unsafe
to read bytes from disk. A 4-second speedup in a local test,
from 20s to 16s (see the sketch below).
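A minimal sketch of the technique, assuming Java 21 with `--enable-preview` (the FFM API was still a preview there); the file name and structure are illustrative: map the whole file as a `MemorySegment`, then read raw bytes via `sun.misc.Unsafe` at the segment's base address, skipping per-access bounds checks.

```java
import java.io.IOException;
import java.lang.foreign.Arena;
import java.lang.reflect.Field;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import sun.misc.Unsafe;

public class UnsafeMappedRead {
    private static final Unsafe UNSAFE = loadUnsafe();

    // Unsafe is not directly constructible; grab the singleton by reflection.
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String[] args) throws IOException {
        try (FileChannel ch = FileChannel.open(Path.of("measurements.txt"),
                StandardOpenOption.READ)) {
            long base = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size(),
                    Arena.global()).address();
            // Read the first byte of the file straight from the mapping.
            System.out.println((char) UNSAFE.getByte(base));
        }
    }
}
```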
* tonivade improved by not using HashMap
* use java 21.0.2
* same hash same station
* remove unused parameter in sameStation
* use length too
* refactor parallelization
* use parallel GC
* refactor
* refactor
1. Use Unsafe
2. Fit the hashtable in the L2 cache (see the sketch below).
3. With a good enough hash function, it could even fit in the L1 cache.
4. Improve temperature parsing by using a lookup table
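Back-of-the-envelope for point (2), with assumed cache and slot sizes (neither number is taken from the submission):

```java
// With a 1 MiB L2 cache (hypothetical) and a fixed 32-byte slot (hash,
// name pointer, min, max, sum, count), about 32K slots fit, comfortably
// above the challenge's 10K-station cap.
public class CacheBudget {
    public static void main(String[] args) {
        int l2Bytes = 1 << 20;   // assumed L2 size
        int slotBytes = 32;      // assumed packed slot layout
        System.out.println("slots that fit in L2: " + (l2Bytes / slotBytes));
    }
}
```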
* Go implementation by AlexanderYastrebov
This is a proof of concept to demonstrate a non-Java submission.
It requires Docker with the BuildKit plugin to build and export the binary.
Updates
* #67
* #253
* Use collision-free id lookup
* Use a number of buckets greater than the max number of keys (see the sketch below)
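The submission itself is Go, but the bucket-sizing idea is language-neutral; a sketch in Java with hypothetical numbers (the challenge rules cap distinct stations at 10,000):

```java
// Oversizing the table keeps the load factor tiny, so probe chains stay
// short and, with a decent hash, collisions are effectively eliminated.
public class BucketSizing {
    public static void main(String[] args) {
        int maxKeys = 10_000;    // per the challenge rules
        int buckets = 1 << 17;   // 131072, a comfortable power of two
        System.out.printf("load factor: %.3f%n", (double) maxKeys / buckets);
    }
}
```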
* Init Push
* organize imports
* Add OpenMap
* Best outcome yet
* Create prepare and calculate scripts for the native image; also add comments on the calculation
* Remove extra hashing and the need for the set array
* Commit formatting changes from build
* Remove unneeded device information
* Make shell scripts executable, add a hash-collision double check for equality
* Add a hash-collision double check for equality
* Skip multithreading for small files to improve their performance
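A sketch of that last change; the 1 MiB threshold below is a placeholder, not the submission's actual cutoff:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Below some size, the cost of spawning and joining workers outweighs the
// parallel speedup, so small inputs take a single-threaded path.
public class SmallFileFastPath {
    static final long SMALL_FILE_BYTES = 1 << 20; // hypothetical threshold

    static int workerCount(Path file) throws IOException {
        return Files.size(file) < SMALL_FILE_BYTES
                ? 1
                : Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(workerCount(Path.of("measurements.txt")));
    }
}
```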