* Calculate average by vaidhy
* Calculate average by vaidhy
* More changes
* remove worker log
* Pass -Dparellelism and switch back to open
* Try out mmap
* Improve mmap solution
* no copy version
* reduce threads
* hash code computed on the fly
* Reuse the char (Do not know if it helps)
* primitive hash map
* Primite HashMap
* Micro optimizations to push for optimizations
* Revert "Micro optimizations to push for optimizations"
This reverts commit ea333e2821ebb5c1d6d71a4e87e569a8f2f8f7f0.
* Micro optimizations to get the juice
* floorMod fixes
* findSemi and findNewLine as separate functions
* Optimized parseDouble
* More micro changes
* Aligned equal check
* more small changes
* XOR instead of compare
* Reduce loop length
* Revert changes
* Loop optimization and added native build
* Hand unrolled findSemi loop.
* Remove incorrect comments
* Taking care fo PR comments
* Add prepare script
* Missing header error fix
* remove wrong comment
---------
Co-authored-by: Anita S V <anitasvasu@gmail.com>
Co-authored-by: Anita SV <anitvasu@amazon.com>
* Use graal
* Use dynamic cores computer
* Use stream API to cleanup code
* Use max processors
* Use hash to avoid init string
* optimize concurrentmap init
* Smaller hash size
* Avoid checking concurrentmap
* Optimize data type
* string dedup
* Faster write
* Change base
* Remove time
* Use mul instead of div
* remove unneeded check
* slightly improved hash code perf
* Use unsafe to access memory + untangle the code a bit
* Adhoc cache that works a bit better
* Store station names as offset into the memory segment + length; slightly change how the hash is calculated
Squashed commit of the following:
commit 44d3736de87834b41118d45831e59fc2b052117c
Merge: fcf795f 3127962
Author: Andrew Sun <as-com@users.noreply.github.com>
Date: Thu Jan 11 20:01:13 2024 -0500
Merge branch 'gunnarmorling:main' into as-com
commit fcf795fbabacbd91891d11d21450ee4b1c479dc5
Author: Andrew Sun <me@andrewsun.com>
Date: Wed Jan 10 21:14:01 2024 -0500
Optimizations to Andrew Sun's entry
commit 4203924711bab5252ff3cbb50a90f4ce4e8e67c2
Merge: 9aed05a 085168a
Author: Andrew Sun <me@andrewsun.com>
Date: Wed Jan 10 19:40:19 2024 -0500
Merge remote-tracking branch 'upstream/main' into as-com
commit 9aed05a04bd27fe7323e66c347b1011c77da322c
Merge: 3f8df58 c2d120f
Author: Andrew Sun <me@andrewsun.com>
Date: Sun Jan 7 16:45:27 2024 -0500
Merge remote-tracking branch 'origin/as-com' into as-com
# Conflicts:
# calculate_average_asun.sh
# src/main/java/dev/morling/onebrc/CalculateAverage_asun.java
commit c2d120f0cb7f18c720a81a7f898102b310f9ecb9
Author: Andrew Sun <me@andrewsun.com>
Date: Sat Jan 6 00:45:47 2024 -0500
Add entry by Andrew Sun
commit 3f8df5803bcc8f3e29ed8bfff3077eb0e8cdab15
Author: Andrew Sun <me@andrewsun.com>
Date: Sat Jan 6 00:45:47 2024 -0500
Add entry by Andrew Sun
* Graal Native
* I need a GC :(
* Fix slash lolz
* Fix god damn output lol
I forgot java :D
* Custom hash, custom key
* More optimisations
* I don't need "optimize-build"
I don't care about image size! :D
- Eliminate redundant object creations in between
- Custom HashMap on purpose - Inspired by @spullara
- More performant temperature parsing - Inspired by @yemreinci
- JVM tweaks, decreased heap memory, and remove AlwaysPreTouch
Co-authored-by: Yavuz Tas <yavuz.tas@ing.com>
* Create clones
* Cleanup code and add memory mapping to read file
* Fix chunks reading logic to fit linebreak
* Remove unused
* Sequential
* Multi thread process chunks
* Add new line in output
* Remove unnecessary operation with map & reducer memory
* Reduce mem usage by using only 1 map
* formatting
* Remove unnecessary length check
* Remove trycatch
* Optimize double parsing
* check full hash before comparing
* implement merykitty suggestions to simplify temperature masking; required refactoring to little-endian
* standalone script for offline Perfect Hash seed searching
* stop using an oversized hash table
---------
Co-authored-by: Jason Nochlin <hundredwatt@users.noreply.github.com>
* divide the reading of the file by parts
* fix format
* add number of core partition
* fix format
* implement strToDouble
* fix strtodouble
* add locale, fix read file, tests pass
* delete unnecessary method clean
* First Version
First draft; stole chunking but it's bad
Forgot my changes
No regex building
Clean & optim
I was not benchmarking myself T_T
Faaaster
First Version
* Update calculate_average_samuelyvon.sh
Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
* Add prepare script
* Fix rounding
* Fix format
* Fixing casing
* Formats of sorts?
* Rename
---------
Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
* calculate_average_mtopolnik
* short hash (just first 8 bytes of name)
* Remove unneeded checks
* Remove archiving classes
* 2x larger hashtable
* Add "set" to setters
* Simplify parsing temperature, remove newline search
* Reduce the size of the name slot
* Store name length and use to detect collision
* Reduce memory loads in parseTemperature
* Use short for min/max
* Extract constant for semicolon
* Fix script header
* Explicit bash shell in shebang
* Inline usage of broadcast semicolon
* Try vectorization
* Remove vectorization
* Go Unsafe
* Use SWAR temperature parsing by merykitty
* Inline some things
* Remove commented-out MemorySegment usage
* Inline namesMem.asSlice() invocation
* Try out JVM JIT flags
* Implement strcmp
* Remove unused instance variables
* Optimize hashing
* Put station name into hashtable
* Reorder method
* Remove usage of MemorySegment.getUtf8String
Replace with UNSAFE.copyMemory() and new String()
* Fix hashing bug
* Remove outdated comments
* Fix informative constants
* Use broadcastByte() more
* Improve method naming
* More hashing
* Revert more hashing
* Add commented-out code to hash 16 bytes
* Slight cleanup
* Align hashtable at cacheline boundary
* Add Graal Native image
* Revert Graal Native image
This reverts commit d916a42326d89bd1a841bbbecfae185adb8679d7.
* Simplify shell script (no SDK selection)
* Move a constant, zero out hashtable on start
* Better name comparison
* Add prepare_mtopolnik.sh
* Cleaner idiom in name comparison
* AND instead of MOD for hashtable indexing
* Improve word masking code
* Fix formatting
* Reduce memory loads
* Remove endianness checks
* Avoid hash == 0 problem
* Fix subtle bug
* MergeSort of parellel results
* Touch up perf
* Touch up perf
* Remove -Xmx256m
* Extract result printing method
* Print allocation details on OOME
* Single mmap
* Use global allocation arena