143132e8df
Runs with standard JDK 21.
On my computers (i5 13500, 20 cores, 32GB ram) my best run is (file fully in page cache):
49.78user 0.69system 0:02.81elapsed 1795%CPU
A bit older version of the code on Mac pro M1 32 GB:
real 0m2.867s
user 0m23.956s
sys 0m1.329s
As I wrote in comments in the code, I have a few different roundings that the reference implementation. I have seend that there is an issue about that, but no specific rule yet.
Main points:
- use MemorySegment, it's faster than ByteBuffer
- split the work in a lot of chunks and distribute to a thread pool
- fast measurement parser by using a lot of domain knowledge
- very low allocation
- visit each byte only once
Things I tried that were in fact pessimizations:
- use some internal JDK code to vectorize the hashCode computation
- use a MemorySegment to represent the keys instead of byte[], to avoid
copying
Hope I won't have a bad surprise when running on the target server 😱
32 lines
1.4 KiB
Bash
Executable File
32 lines
1.4 KiB
Bash
Executable File
#!/bin/sh
|
|
#
|
|
# Copyright 2023 The original authors
|
|
#
|
|
# Licensed under the Apache License, Version 2.0 (the "License");
|
|
# you may not use this file except in compliance with the License.
|
|
# You may obtain a copy of the License at
|
|
#
|
|
# http://www.apache.org/licenses/LICENSE-2.0
|
|
#
|
|
# Unless required by applicable law or agreed to in writing, software
|
|
# distributed under the License is distributed on an "AS IS" BASIS,
|
|
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
# See the License for the specific language governing permissions and
|
|
# limitations under the License.
|
|
#
|
|
|
|
# runs with -Xmx24m on my machine, playing it safe with a larger heap
|
|
JAVA_OPTS="-Xmx64m --enable-preview"
|
|
# to use some black magic options
|
|
JAVA_OPTS="$JAVA_OPTS -XX:+UnlockExperimentalVMOptions"
|
|
# no GC, not needed
|
|
JAVA_OPTS="$JAVA_OPTS -XX:+UseEpsilonGC -XX:+AlwaysPreTouch"
|
|
# my finals are really final
|
|
JAVA_OPTS="$JAVA_OPTS -XX:+TrustFinalNonStaticFields"
|
|
# to get CalculateAverage_obourgain$OpenAddressingMap::getOrCreate to inline. A compile command wasn't enough, it was still hitting 'already compiled into a big method'
|
|
JAVA_OPTS="$JAVA_OPTS -XX:InlineSmallCode=10000"
|
|
# seems to be a bit faster
|
|
JAVA_OPTS="$JAVA_OPTS -XX:-TieredCompilation -XX:CICompilerCount=2 -XX:CompileThreshold=1000"
|
|
|
|
time java $JAVA_OPTS --class-path target/average-1.0.0-SNAPSHOT.jar dev.morling.onebrc.CalculateAverage_obourgain
|