Commit Graph

8 Commits

Author SHA1 Message Date
Thomas Wuerthinger
036f9a01b1
Clean up, fine tuning, credit section for thomaswue (#646)
* Some clean up, fine tuning, removing non-supported options, added credit
section and additional comments.

* Put license header year back to 2023 to pass checks.

* Remove static linking (as it requires some more setup on the target
machine).
2024-01-29 22:19:23 +01:00
Thomas Wuerthinger
7e525c5992
Some fine tuning for thomaswue (#606)
* Some fine tuning.

* Process 2MB segments to make all threads finish at the same time.
Process with 3 scanners in parallel in the same thread.
2024-01-28 17:59:57 +01:00
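The 2MB-segment idea in the commit above can be illustrated with a shared cursor from which workers claim fixed-size chunks, so that no thread is left with one large tail of work. This is only a sketch under assumptions: the class `SegmentScheduler`, the method `processInSegments`, and the use of an `AtomicLong` are hypothetical and not taken from the commit, and the actual per-segment work is replaced by a byte count.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration; names and structure are assumptions, not the submission's code.
final class SegmentScheduler {
    static final long SEGMENT_SIZE = 2L * 1024 * 1024; // 2MB, as named in the commit message

    // Returns how many bytes each worker claimed; the total equals fileSize.
    static long[] processInSegments(long fileSize, int workers) {
        AtomicLong cursor = new AtomicLong(); // next unclaimed byte offset
        long[] bytesPerWorker = new long[workers];
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                long claimed = 0;
                while (true) {
                    long start = cursor.getAndAdd(SEGMENT_SIZE); // claim the next segment
                    if (start >= fileSize) {
                        break; // nothing left to claim
                    }
                    long end = Math.min(start + SEGMENT_SIZE, fileSize);
                    // A real scanner would align start/end to line breaks and parse here.
                    claimed += end - start;
                }
                bytesPerWorker[id] = claimed;
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return bytesPerWorker;
    }
}
```

Because every worker keeps claiming small segments until the cursor passes the end of the input, the slowest worker finishes at most one segment later than the others.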
Thomas Wuerthinger
d0a28599c2
Tuning and subprocess spawn for thomaswue (#533)
* Some clean up, small-scale tuning, and reduce complexity when handling longer names.

* Do actual work in worker subprocess. Main process returns immediately
and OS clean up of the mmap continues in the subprocess.

* Update minor Graal version after CPU release.

* Turn GC back to epsilon GC (although it does not seem to make a
difference).

* Minor tuning for another +1%.
2024-01-21 20:13:48 +01:00
Thomas Wuerthinger
be179dcf07
Improve scheduling for thomaswue (#358)
* Improve scheduling for another 6%.

* Tune hash function and collision handling.
2024-01-15 20:43:12 +01:00
Thomas Wuerthinger
bd4cff945d
Adding Scanner object and also tuning for better branch prediction for about +6%. (#341)
2024-01-12 20:51:22 +01:00
Thomas Wuerthinger
af66ac145f
Second tuning for thomaswue
* Optimize checking for collisions by doing this a long at a time always.

* Use a long at a time scanning for delimiter.

* Minor tuning. Now below 0.80s on Intel i9-13900K.

* Add number parsing code from Quan Anh Mai. Fix name length issue.

* Include suggestion from Alfonso Peterssen for another 1.5%.

* Optimize hash collision check compare for ~4% gain.

* Add perf stats based on latest version.
2024-01-10 19:42:51 +01:00
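Scanning for the delimiter "a long at a time", as the commit above puts it, is commonly done with a SWAR (SIMD within a register) zero-byte test. The sketch below shows that general technique on a plain `byte[]`; it assumes `;` as the delimiter and little-endian byte order, and the class and method names are hypothetical. It is not the code from the commit, which works on memory-mapped data.

```java
// Hypothetical illustration of a SWAR delimiter search; not the submission's code.
final class SwarDelimiter {
    private static final long SEMICOLON_PATTERN = 0x3B3B3B3B3B3B3B3BL; // ';' in every byte

    // Index (0..7) of the first ';' byte in a little-endian word, or 8 if there is none.
    static int firstSemicolon(long word) {
        long x = word ^ SEMICOLON_PATTERN; // bytes equal to ';' become 0x00
        // Classic zero-byte test: the lowest set bit marks the first zero byte of x.
        long mask = (x - 0x0101010101010101L) & ~x & 0x8080808080808080L;
        return Long.numberOfTrailingZeros(mask) >>> 3; // 64 >>> 3 == 8 when mask is 0
    }

    // Reads up to 8 bytes starting at offset as a little-endian long, zero-padded at the end.
    static long wordAt(byte[] data, int offset) {
        long w = 0;
        for (int i = 0; i < 8 && offset + i < data.length; i++) {
            w |= (data[offset + i] & 0xFFL) << (8 * i);
        }
        return w;
    }

    // Position of the first ';' in data, or -1 if absent.
    static int indexOfSemicolon(byte[] data) {
        for (int off = 0; off < data.length; off += 8) {
            int i = firstSemicolon(wordAt(data, off));
            if (i < 8 && off + i < data.length) {
                return off + i;
            }
        }
        return -1;
    }
}
```

For example, `SwarDelimiter.indexOfSemicolon("Hamburg;12.0".getBytes(java.nio.charset.StandardCharsets.US_ASCII))` returns 7. The byte-wise `wordAt` helper exists only to keep the sketch self-contained; an implementation reading from mapped memory would load the 8 bytes in a single access.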
Thomas Wuerthinger
243388ad7b
Use SIMD for search for delimiter and name compare
2024-01-07 20:50:11 +01:00
Thomas Wuerthinger
a53aa2e6fd
Initial version for thomaswue with Oracle GraalVM Native Image
* Initial version.

* Make PGO feature optional off-by-default. Needs PGO_MODE environment
variable to be set. Add -O3 -march=native tuning flags for better
performance.

* Adjust script to be more quiet.

* Adjust max city length. Fix an issue when accumulating results.

* Tune thomaswue submission.
mmap the entire file, use Unsafe directly instead of ByteBuffer, avoid byte[] copies.
These tricks give a ~30% speedup, over an already fast implementation.

* Optimize parsing of numbers based on specific given constraints.

* Fix for segment calculation for case of very small input.

* Minor shell script fixes.

* Separate out build step into file additional_build_step_thomaswue.sh,
simplify run script and remove PGO option for now.

* Minor corrections to the run script.

---------

Co-authored-by: Alfonso² Peterssen <alfonso.peterssen@oracle.com>
2024-01-06 10:55:07 +01:00
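The bullet "Optimize parsing of numbers based on specific given constraints" in the commit above does not spell out the constraints. The sketch below assumes values with an optional minus sign, one or two integer digits, and exactly one fractional digit (for example `-12.3`), and returns the value in tenths as an `int`. The class name, method name, and the assumed input format are illustrative, not taken from the commit.

```java
// Hypothetical illustration; assumes inputs such as "7.5", "-7.5", "12.3", "-99.9".
final class TemperatureParser {
    // Parses the number starting at pos and returns it in tenths (e.g. "-12.3" -> -123).
    static int parseTenths(byte[] data, int pos) {
        boolean negative = data[pos] == '-';
        if (negative) {
            pos++;
        }
        int value = data[pos++] - '0'; // first integer digit
        if (data[pos] != '.') {
            value = value * 10 + (data[pos++] - '0'); // optional second integer digit
        }
        pos++; // skip the decimal point
        value = value * 10 + (data[pos] - '0'); // the single fractional digit
        return negative ? -value : value;
    }
}
```

Fixing the format in advance removes the general-purpose loop and floating-point conversion, leaving a few predictable branches and integer operations per value.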