1. Use Unsafe
2. Fit hashtable in L2 cache.
3. If we can find a good hash function, it can fit in L1 cache even.
4. Improve temperature parsing by using a lookup table
* Go implementation by AlexanderYastrebov
This is a proof-of-concept to demonstrate non-java submission.
It requires Docker with BuildKit plugin to build and export binary.
Updates
* #67
* #253
* Use collision-free id lookup
* Use number of buckets greater than max number of keys
* Init Push
* organize imports
* Add OpenMap
* Best outcome yet
* Create prepare script and calculate script for native image, also add comments on calculation
* Remove extra hashing, and need for the set array
* Commit formatting changes from build
* Remove unneeded device information
* Make shell scripts executable, add hash collision double check for equality
* Add hash collision double check for equality
* Skip multithreading for small files to improve small file performance
* final comit
changing using mappedbytebuffer
changes before using unsafe address
using unsafe
* using graalvm,correct unsafe mem implementation
---------
Co-authored-by: Karthikeyans <karthikeyan.sn@zohocorp.com>
* Inline parsing name and station to avoid constantly updating the offset field (-100ms)
* Remove Worker class, inline the logic into lambda
* Accumulate results in an int matrix instead of using result row (-50ms)
* Use native image
* Deploy v2 for parkertimmins
Main changes:
- fix hash which masked incorrectly
- do station equality check in simd
- make station array length multiple of 32
- search for newline rather than semicolon
* Fix bug - entries were being skipped between batches
At the boundary between two batches, the first batch would stop after
crossing a limit with a padding of 200 characters applied. The next
batch should then start looking for the first full entry after the
padding. This padding logic had been removed when starting a batch. For
this reason, entries starting in the 200 character padding between
batches were skipped.
* fast-path for keys<16 bytes
* fix off by one error
the mask is wrong for he 2nd word when len == 16
* less chunks per thread
seems like compact code wins. on my test box anyway.
* Some clean up, small-scale tuning, and reduce complexity when handling longer names.
* Do actual work in worker subprocess. Main process returns immediately
and OS clean up of the mmap continues in the subprocess.
* Update minor Graal version after CPU release.
* Turn GC back to epsilon GC (although it does not seem to make a
difference).
* Minor tuning for another +1%.
- It avoids creating unnecessary Strings objects and handles with the station names with its djb2 hashes instead
- Initializes hashmaps with capacity and load factor
- Adds -XX:+AlwaysPreTouch
* final comit
changing using mappedbytebuffer
changes before using unsafe address
using unsafe
* using graalvm,correct unsafe mem implementation
---------
Co-authored-by: Karthikeyans <karthikeyan.sn@zohocorp.com>