* Combine <8 and 8-16 cases into one case.
* Adopt mask-based approach for the <16 length city fast path (idea of Van Phu Do).
* Slightly improved code layout.
* Update perf number.
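
A minimal sketch of what a mask-based fast path for names shorter than 16 bytes could look like. This is not the committed code: the class name `MaskedNameSketch`, the helpers `semicolonBits` and `maskedName`, and the use of a little-endian `ByteBuffer` are assumptions made for illustration. The idea is to load two 8-byte words, locate the `;` delimiter with a bitwise trick, and zero out everything after the name so that two words can be compared directly.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; names and layout are assumptions.
final class MaskedNameSketch {
    // MASKS[n] keeps the lowest n bytes of a little-endian long (n = 0..8).
    static final long[] MASKS = new long[9];
    static {
        for (int n = 1; n <= 8; n++) {
            MASKS[n] = (n == 8) ? -1L : (1L << (8 * n)) - 1;
        }
    }

    // Sets the high bit of the lowest byte of 'word' that equals ';'.
    // Higher bytes may carry false positives, so only the lowest set bit is used.
    static long semicolonBits(long word) {
        long x = word ^ 0x3B3B3B3B3B3B3B3BL; // bytes equal to ';' become 0
        return (x - 0x0101010101010101L) & ~x & 0x8080808080808080L;
    }

    // Returns the name before ';' packed into two longs with trailing bytes
    // masked to zero, or null when no ';' occurs in the first 16 bytes.
    // Assumes buf is little-endian and has at least 16 readable bytes at offset.
    static long[] maskedName(ByteBuffer buf, int offset) {
        long w0 = buf.getLong(offset);
        long w1 = buf.getLong(offset + 8);
        long m0 = semicolonBits(w0);
        if (m0 != 0) {
            int len = Long.numberOfTrailingZeros(m0) >>> 3; // bytes before ';'
            return new long[] { w0 & MASKS[len], 0L };
        }
        long m1 = semicolonBits(w1);
        if (m1 != 0) {
            int len = Long.numberOfTrailingZeros(m1) >>> 3;
            return new long[] { w0, w1 & MASKS[len] };
        }
        return null; // 16 bytes or more: fall back to a slower path
    }

    static ByteBuffer wrap(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8)).order(ByteOrder.LITTLE_ENDIAN);
    }

    public static void main(String[] args) {
        long[] a = maskedName(wrap("Paris;12.3\nOslo;-1.0\n"), 0);
        long[] b = maskedName(wrap("Paris;-7.5\nRome;30.1\n"), 0);
        System.out.println(Arrays.equals(a, b));
    }
}
```

Because the bytes after the delimiter are masked away, the same name always yields the same pair of words regardless of the temperature that follows it, so a table lookup can compare two longs instead of looping over bytes.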
- use smaller regions (and therefore more of them) so that workers who finish their tasks early spend less time idle
- remove some configuration-related work during initialization, which might save a few tens of milliseconds
- reorder the temperature value parsing instructions to benefit more from instruction-level parallelism (ILP), hopefully
* decrease instruction level parallelism
It turns out doing 2 things at once was too much: perf annotate showed register spilling.
* more trickery with latency hiding
* work-stealing, lookup tables, credits
* do not assume gender
* Disable The GC
Sometimes cuts up to 1 second
off the runtime on my machine.
* Remove Confusing Byte-Order Parameter
Bytes have no Byte-Order ;)
* Provide More Memory to Run the 10K set
* Fix Comparison Function
* Justin's implementation
* Rename justin to Judekeyser
* Back to previous implementation of vectors
* Reading names as sequences of integers
* Fixing tests
* Scale down the number of NIO workers
---------
Co-authored-by: Justin Dekeyser <justin.dekeyser@Justins-MacBook-Pro.local>
Instead of writing the result line by line, use random.choices to randomise multiple stations at once and write large batches to the disk. Also, instead of "round", just use :.1f formatting, which is probably quicker at large scale because it's not a mathematical function.
* added code
* Fixed pointers bugs
* removed my own benchmark
* added comment on how I handle hash collisions
* executed mvnw clean verify
* made scripts executable & fixed rounding issues
* Fixed way of dealing with hash collisions
* changed method name sameNameBytes to isSameNameBytes
* changed script from sh to bash
* fixed chunking bug
* Fixed bug in chunking when file size is too small
* added Runtime.getRuntime().availableProcessors
* added improvements to string copying and to the calculation of the next Map index in case of collision & improved string comparison
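
For readers unfamiliar with the collision handling mentioned in these commits, here is a minimal sketch of one common approach: open addressing with linear probing. It is an assumption that the submission works this way. The class `ProbingTableSketch`, the hash function, and the signature given to `isSameNameBytes` are hypothetical; only the method name comes from the commit message.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not the committed implementation.
final class ProbingTableSketch {
    // Power-of-two capacity so "& (CAPACITY - 1)" wraps the index cheaply.
    static final int CAPACITY = 1 << 4;

    private final byte[][] names = new byte[CAPACITY][];
    private final int[] counts = new int[CAPACITY];

    // Simple polynomial hash over the name bytes (an assumed choice).
    static int hash(byte[] name) {
        int h = 0;
        for (byte b : name) {
            h = 31 * h + b;
        }
        return h ^ (h >>> 16);
    }

    // Assumed signature: true when both arrays hold the same bytes.
    static boolean isSameNameBytes(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    // Finds the slot for 'name', claiming the first free slot if it is new.
    // On a collision with a different name, the next index is simply i + 1
    // (wrapped). Assumes the table never becomes completely full.
    int slotFor(byte[] name) {
        int i = hash(name) & (CAPACITY - 1);
        while (names[i] != null && !isSameNameBytes(names[i], name)) {
            i = (i + 1) & (CAPACITY - 1);
        }
        if (names[i] == null) {
            names[i] = name;
        }
        return i;
    }

    void add(byte[] name) {
        counts[slotFor(name)]++;
    }

    int countOf(byte[] name) {
        return counts[slotFor(name)];
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ProbingTableSketch table = new ProbingTableSketch();
        table.add(bytes("Aa"));
        table.add(bytes("BB")); // same hash as "Aa" under this hash function
        table.add(bytes("Aa"));
        System.out.println(table.countOf(bytes("Aa")) + " " + table.countOf(bytes("BB")));
    }
}
```

The byte comparison is what keeps two colliding names apart; the hash alone only chooses where probing starts.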
* Some clean up, fine tuning, removing non-supported options, added credit
section and additional comments.
* Put license header year back to 2023 to pass checks.
* Remove static linking (as it requires some more setup on the target
machine).
- split big regions into smaller shared tasks, so that workers who complete their own tasks can pick up remaining ones instead of leaving their cores idle
- reduce number of executed instructions in the hot path
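
The shared-task idea above can be sketched roughly as follows. This is an assumed design, not the committed code: a single atomic cursor hands out fixed-size segments, and each worker keeps claiming the next one until the input is exhausted. The class `SharedSegmentsSketch` and the method `countNewlines` are hypothetical, and counting newline bytes stands in for the real per-line work, which would also need segments aligned to line boundaries.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; names and the workload are assumptions.
final class SharedSegmentsSketch {

    // Counts '\n' bytes using 'workers' threads that pull segments from a
    // shared cursor, so a fast worker never waits on a slow one.
    static long countNewlines(byte[] data, int segmentSize, int workers) {
        AtomicLong cursor = new AtomicLong(0); // next unclaimed offset
        AtomicLong total = new AtomicLong(0);
        Thread[] threads = new Thread[workers];

        for (int t = 0; t < workers; t++) {
            threads[t] = new Thread(() -> {
                long local = 0;
                while (true) {
                    long start = cursor.getAndAdd(segmentSize); // claim a segment
                    if (start >= data.length) {
                        break; // nothing left to claim
                    }
                    int end = (int) Math.min(start + segmentSize, data.length);
                    for (int i = (int) start; i < end; i++) {
                        if (data[i] == '\n') {
                            local++;
                        }
                    }
                }
                total.addAndGet(local); // merge once per worker
            });
            threads[t].start();
        }

        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        byte[] data = "Oslo;1.0\nRome;2.0\n".repeat(1000).getBytes(StandardCharsets.UTF_8);
        System.out.println(countNewlines(data, 64, 4));
    }
}
```

Smaller segments balance the load better but cost more atomic operations, so the segment size is a tuning knob rather than a fixed value.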
/**
* Solution based on thomaswue's solution, commit:
* commit d0a28599c2
* Author: Thomas Wuerthinger
* Date: Sun Jan 21 20:13:48 2024 +0100
*
* The goal here was to try to improve the runtime of his 10k
* solution of: 00:04.516
*
* With Thomas's latest changes, his time is probably much better
* already, and maybe even 1st place for the 10k too.
* See: https://github.com/gunnarmorling/1brc/pull/606
*
* But as I was already coding something, I'll submit just to
* see if it will be faster than his *previous* 10k time of
* 00:04.516
*
* Changes:
* The idea is similar to my previous solution: if you split
* the chunks evenly, some threads might finish much faster and
* stay idle, so:
* 1) Create more chunks than threads, so the ones that finish first
* can do something;
* 2) Decrease chunk sizes as we get closer to the end of the file.
*/
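
The two changes listed in the comment above could be planned along these lines. This is a sketch under stated assumptions, not the submitted code: the sizing rule (each chunk takes `remaining / (2 * threads)` bytes, with a floor of `minChunk`) and the names `ShrinkingChunksSketch` and `plan` are invented for illustration. A real solution would also move each chunk end forward to the next newline.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the sizing rule is an assumption.
final class ShrinkingChunksSketch {

    // Returns [start, end) offsets covering the file. Early chunks are large,
    // later ones shrink, so threads finish at roughly the same time.
    static List<long[]> plan(long fileSize, int threads, long minChunk) {
        List<long[]> chunks = new ArrayList<>();
        long floor = Math.max(1L, minChunk); // guard against zero-sized chunks
        long pos = 0;
        while (pos < fileSize) {
            long remaining = fileSize - pos;
            long size = Math.max(floor, remaining / (2L * threads));
            size = Math.min(size, remaining); // last chunk takes what is left
            chunks.add(new long[] { pos, pos + size });
            pos += size;
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (long[] chunk : plan(1_000_000L, 4, 10_000L)) {
            System.out.println(chunk[0] + " .. " + chunk[1] + " (" + (chunk[1] - chunk[0]) + " bytes)");
        }
    }
}
```

Threads would then take chunks from this list in order, so the small chunks at the end fill whatever gaps remain.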
* CalculateAverage_pdrakatos
* Rename to be valid with rules
* Changes on scripts execution
* Fixing bugs causing scripts not to be executed
* Changes to the prepare script to make it compatible
* Fixing passing all tests
* Increase direct memory allocation buffer
* Fixing memory problem that causes a heap space exception
* Fresh solution to optimize performance of the execution
* Solution without unsafe
* Solution without unsafe, remove the usage of bytebuffer, passes the create_measurements3 test
* bug fix for the 10k test; also update CreateMeasurements3.java to use '\n' as the newline instead of the OS value (on Windows it uses CRLF, which "breaks" the file format)
* new version that should perform way better than the previous one
* removed prepare script for giovannicuccu
* removed some comments
---------
Co-authored-by: Giovanni Cuccu <gcuccu@imolainformatica.it>