My take on the one billion row challenge

1️⃣🐝🏎️ The One Billion Row Challenge

Status Jan 12: As there has been such a large number of entries to this challenge so far (100+), and this is becoming hard to manage, please only create new submissions if you expect them to run in 10 seconds or less on the evaluation machine.

Status Jan 1: This challenge is open for submissions!

The One Billion Row Challenge (1BRC) is a fun exploration of how far modern Java can be pushed for aggregating one billion rows from a text file. Grab all your (virtual) threads, reach out to SIMD, optimize your GC, or pull any other trick, and create the fastest implementation for solving this task!


The text file contains temperature values for a range of weather stations. Each row is one measurement in the format <string: station name>;<double: measurement>, with the measurement value having exactly one fractional digit. The following shows ten rows as an example:

Hamburg;12.0
Bulawayo;8.9
Palembang;38.8
St. John's;15.2
Cracow;12.6
Bridgetown;26.9
Istanbul;6.2
Roseau;34.4
Conakry;31.2
Istanbul;23.0

The task is to write a Java program which reads the file, calculates the min, mean, and max temperature value per weather station, and emits the results on stdout like this (i.e. sorted alphabetically by station name, and the result values per station in the format <min>/<mean>/<max>, rounded to one fractional digit):

{Abha=-23.0/18.0/59.2, Abidjan=-16.2/26.0/67.3, Abéché=-10.0/29.4/69.0, Accra=-10.1/26.4/66.4, Addis Ababa=-23.7/16.0/67.0, Adelaide=-27.8/17.3/58.5, ...}
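The task described above can be sketched in a few lines of plain Java. This is a minimal, unoptimized illustration and not the repository's baseline: the class name `AggregateSketch`, the helper `Stats`, and the method `aggregate` are hypothetical names chosen for this example, rows are passed in as strings instead of being read from measurements.txt, and stations are ordered by Java's natural `String` ordering as an approximation of "sorted alphabetically".

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names and structure are assumptions, not the official baseline.
final class AggregateSketch {

    // Running min / max / sum / count for one station.
    static final class Stats {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum;
        long count;

        void add(double value) {
            min = Math.min(min, value);
            max = Math.max(max, value);
            sum += value;
            count++;
        }

        @Override
        public String toString() {
            // <min>/<mean>/<max>, each rounded to one fractional digit
            return round(min) + "/" + round(sum / count) + "/" + round(max);
        }
    }

    // Rounds to one fractional digit; ties go toward positive infinity.
    static double round(double value) {
        return Math.round(value * 10.0) / 10.0;
    }

    // Aggregates rows of the form "<station>;<measurement>".
    static String aggregate(String... rows) {
        Map<String, Stats> byStation = new TreeMap<>(); // TreeMap keeps keys sorted
        for (String row : rows) {
            int sep = row.indexOf(';'); // station names never contain ';'
            String station = row.substring(0, sep);
            double value = Double.parseDouble(row.substring(sep + 1));
            byStation.computeIfAbsent(station, k -> new Stats()).add(value);
        }
        return byStation.toString(); // renders as {name=min/mean/max, ...}
    }

    public static void main(String[] args) {
        System.out.println(aggregate("Hamburg;12.0", "Istanbul;6.2", "Istanbul;23.0"));
    }
}
```

A `TreeMap` is used here only because it yields the sorted `{key=value, ...}` rendering for free; faster entries typically aggregate into a hash table and sort once at the end.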

Submit your implementation by Jan 31 2024 and become part of the leaderboard!

Results

These are the results from running all entries into the challenge on eight cores of a Hetzner AX161 dedicated server (32 core AMD EPYC™ 7502P (Zen2), 128 GB RAM).

# Result (m:s.ms) Implementation JDK Submitter Notes
1 00:02.195 link 21.0.1-graal Thomas Wuerthinger, Quan Anh Mai, Alfonso² Peterssen GraalVM native binary, uses Unsafe
2 00:02.248 link 21.0.1-graal Artsiom Korzun GraalVM native binary, uses Unsafe
3* 00:02.313 link 21.0.1-graal Roy van Rijn GraalVM native binary, uses Unsafe
3* 00:02.336 link 21.0.1-graal Van Phu DO GraalVM native binary, uses Unsafe
00:02.575 link 21.0.1-open Quan Anh Mai uses Unsafe
00:02.909 link 21.0.1-graal Jaromir Hamala uses Unsafe
00:03.258 link 21.0.1-open Quan Anh Mai
00:03.376 link 21.0.1-graal Marko Topolnik uses Unsafe
00:03.714 link 21.0.1-graal Jason Nochlin
00:03.718 link 21.0.1-graal zerninv uses Unsafe
00:03.902 link 21.0.1-open Juan Parera
00:03.959 link 21.0.1-open gonix
00:03.966 link 21.0.1-open Jin Cong Ho uses Unsafe
00:03.990 link 21.0.1-graal Elliot Barlas uses Unsafe
00:04.066 link 21.0.1-open JesseVanRooy uses Unsafe
00:04.154 link 21.0.1-open John Ziamos uses Unsafe
00:04.551 link 21.0.1-graal Roman Musin uses Unsafe
00:04.741 link 21.0.1-open Cliff Click uses Unsafe
00:04.823 link 21.0.1-graal Jamal Mulla uses Unsafe
00:04.920 link 21.0.1-graal Subrahmanyam
00:04.959 link 21.0.1-graal Yavuz Tas uses Unsafe
00:05.142 link 21.0.1-open Arjen Wisse
00:05.235 link 21.0.1-open unbounded
00:05.336 link 21.0.1-tem Peter Levart
00:05.478 link 21.0.1-open Olivier Bourgain uses Unsafe
00:05.887 link 21.0.1-graal Charlie Evans uses Unsafe
00:05.960 link 21.0.1-graal Vaidhy Mayilrangam uses Unsafe
00:05.979 link 21.0.1-graal Sam Pullara
00:06.166 link 21.0.1-open Jamie Stansfield
00:06.257 link 21.0.1-graal Stefan Sprenger uses Unsafe
00:06.415 link 21.0.1-open Arman Sharif uses Unsafe
00:06.654 link 21.0.1-graal Jaroslav Bachorik
00:06.576 link 21.0.1-open Andrew Sun uses Unsafe
00:06.715 link 21.0.1-open Algirdas Raščius
00:06.872 link 21.0.1-open Dr Ian Preston
00:07.240 link java giovannicuccu
00:07.680 link 21.0.1-graal Xylitol uses Unsafe
00:07.730 link 21.0.1-open Johannes Schüth
00:07.925 link 21.0.1-graal Ricardo Pieper
00:07.913 link 21.0.1-open parkertimmins
00:08.167 link 21.0.1-tem Dimitar Dimitrov
00:08.214 link 21.0.1-open deemkeen
00:08.398 link 21.0.1-open Parth Mudgal uses Unsafe
00:08.489 link 21.0.1-graal Bang NGUYEN
00:08.517 link 21.0.1-graal ags uses Unsafe
00:08.557 link 21.0.1-graal Adrià Cabeza
00:08.622 link 21.0.1-graal Keshavram Kuduwa
00:08.689 link 21.0.1-open Roman Stoffel
00:08.752 link 21.0.1-graal Anita SV
00:08.892 link 21.0.1-open Roman Romanchuk
00:09.020 link 21.0.1-open yemreinci
00:09.071 link 21.0.1-open Gabriel Reid
00:09.352 link 21.0.1-graal Filip Hrisafov
00:09.867 link 21.0.1-graal Ricardo Pieper
00:09.945 link 21.0.1-open Anthony Goubard
00:10.092 link 21.0.1-graal Pratham
00:10.127 link 21.0.1-open Parth Mudgal uses Unsafe
00:11.577 link 21.0.1-open Eve
00:10.473 link 21.0.1-open Anton Rybochkin
00:11.119 link 21.0.1-open lawrey
00:11.167 link 21.0.1-open Nick Palmer
00:11.405 link 21.0.1-graal Rafael Merino García
00:11.433 link 21.0.1-graal Jatin Gala
00:11.805 link 21.0.1-graal Cool_Mineman
00:11.878 link 21.0.1-open karthikeyan97 uses Unsafe
00:11.934 link 21.0.1-open arjenvaneerde
00:12.051 link 21.0.1-open Dmitry Bufistov
00:12.102 link java Yann Moisan
00:12.220 link 21.0.1-open Richard Startin
00:12.495 link 21.0.1-graal Samuel Yvon GraalVM native binary
00:12.568 link 21.0.1-graal Vlad
00:12.800 link java Yonatan Graber
00:13.013 link 21.0.1-graal Thanh Duong
00:13.071 link 21.0.1-open Dr Ian Preston
00:13.817 link 21.0.1-open Carlo
00:14.502 link 21.0.1-graal eriklumme
00:14.772 link 21.0.1-open Kevin McMurtrie
00:14.867 link 21.0.1-open Michael Berry
00:15.662 link 21.0.1-open Serghei Motpan
00:17.179 link 21.0.1-open Jairo Graterón
00:17.490 link 21.0.1-open Gergely Kiss
00:17.255 link 21.0.1-open tkosachev
00:17.520 link 21.0.1-open Farid
00:17.717 link 21.0.1-open Oleh Marchenko
00:17.815 link 21.0.1-open Hallvard Trætteberg
00:17.932 link 21.0.1-open Bartłomiej Pietrzyk
00:18.251 link 21.0.1-graal Markus Ebner
00:18.448 link 21.0.1-open Moysés Borges Furtado
00:18.771 link 21.0.1-graal David Kopec
00:18.902 link 21.0.1-graal Maxime
00:19.357 link 21.0.1-graalce Roman Schweitzer
00:20.691 link 21.0.1-graal Kidlike GraalVM native binary
00:21.989 link 21.0.1-open couragelee
00:22.457 link 21.0.1-open Ramzi Ben Yahya
00:22.471 link 21.0.1-open Shivam Agarwal
00:24.986 link 21.0.1-open kumarsaurav123
00:26.500 link 21.0.1-open Bruno Félix
00:28.381 link 21.0.1-open Hampus
00:29.741 link 21.0.1-open Matteo Vaccari
00:32.018 link 21.0.1-open Aurelian Tutuianu
00:34.388 link 21.0.1-tem Tobi
00:35.875 link 21.0.1-open MahmoudFawzyKhalil
00:36.180 link 21.0.1-open Horia Chiorean
00:38.340 link 21.0.1-open AbstractKamen
00:41.982 link 21.0.1-open Chris Riccomini
00:42.893 link 21.0.1-open javamak
00:46.597 link 21.0.1-open Maeda-san
00:58.811 link 21.0.1-open Ujjwal Bharti
01:05.094 link 21.0.1-open Mudit Saxena
01:06.790 link 21.0.1-open Karl Heinz Marbaise
01:06.944 link 21.0.1-open santanu
01:07.014 link 21.0.1-open pedestrianlove
01:08.811 link 21.0.1-open Aleš Justin
01:08.908 link 21.0.1-open itaske
01:09.595 link 21.0.1-tem Antonio Goncalves
01:09.882 link 21.0.1-open Prabhu R
01:14.815 link 21.0.1-open twohardthings
01:25.801 link 21.0.1-open ivanklaric
01:33.594 link 21.0.1-open Gaurav Mathur
01:56.607 link 21.0.1-open Abhilash
03:43.521 link 21.0.1-open 김예환 Ye-Hwan Kim (Sam)
03:59.760 link 21.0.1-open Samson
---
04:49.679 link (Baseline) 21.0.1-open Gunnar Morling

* These two entries have such similar runtimes (the difference is below the error margin I can reliably measure) that they share position #3 in the leaderboard.

Note that I am not super-scientific in the way I'm running the contenders (see Evaluating Results for the details). This is not a high-fidelity micro-benchmark and there can be variations of about ±5% between runs. So don't be too hung up on the exact ordering of your entry compared to others in close proximity. The primary purpose of this challenge is to learn something new, have fun along the way, and inspire others to do the same. The leaderboard is only a means to an end for achieving this goal. If you observe drastically different results though, please open an issue.

See Entering the Challenge for instructions on how to enter the challenge with your own implementation. The Show & Tell features a wide range of 1BRC entries built using other languages, databases, and tools.

Bonus Results

This section lists results from running the fastest N entries with different configurations. As entries have been optimized towards the specific conditions of the original challenge description and set-up (such as size of the key set), challenge entries may perform very differently across different configurations. These bonus results are provided here for informational purposes only. For the 1BRC challenge, only the results in the previous section are of importance.

32 Cores / 64 Threads

For officially evaluating entries into the challenge, each contender is run on eight cores of the evaluation machine (AMD EPYC™ 7502P). Here are the results from running the top 15 entries (as of commit 2c26b511) on all 32 cores / 64 threads (i.e. SMT is enabled) of the machine:

# Result (m:s.ms) Implementation JDK Submitter Notes
1 00:00.799 link 21.0.1-graal Thomas Wuerthinger GraalVM native binary
2 00:00.933 link 21.0.1-graal Roy van Rijn GraalVM native binary
3 00:01.236 link 21.0.1-graal Artsiom Korzun
00:01.380 link 21.0.1-open merykittyunsafe
00:01.383 link 21.0.1-open Cliff Click
00:01.429 link 21.0.1-open John Ziamos
00:01.464 link 21.0.1-open Olivier Bourgain
00:01.603 link 21.0.1-open Van Phu DO
00:01.748 link 21.0.1-graal Yavuz Tas
00:01.778 link 21.0.1-open Quan Anh Mai
00:01.942 link 21.0.1-graal Marko Topolnik
00:01.972 link 21.0.1-graal Elliot Barlas
00:02.111 link 21.0.1-graal Jamal Mulla
00:02.644 link 21.0.1-graal Vaidhy Mayilrangam
00:03.697 link 21.0.1-graal Jason Nochlin

10K Key Set

The 1BRC challenge data set contains 413 distinct weather stations, whereas the rules allow for 10,000 different station names to occur. Here are the results from running the top 15 entries (as of commit 2c26b511) against 1,000,000,000 measurement values across 10K stations (created via ./create_measurements3.sh 1000000000), using eight cores on the evaluation machine:

# Result (m:s.ms) Implementation JDK Submitter Notes
1 00:04.589 link 21.0.1-graal Artsiom Korzun
2 00:05.296 link 21.0.1-graal Roy van Rijn GraalVM native binary
3 00:05.308 link 21.0.1-graal Thomas Wuerthinger GraalVM native binary
00:05.881 link 21.0.1-graal Marko Topolnik
00:07.120 link 21.0.1-graal Jamal Mulla
00:07.915 link 21.0.1-open Cliff Click
00:08.979 link 21.0.1-graal Yavuz Tas
00:10.052 link 21.0.1-open merykittyunsafe
00:10.134 link 21.0.1-graal Vaidhy Mayilrangam
00:10.599 link 21.0.1-graal Elliot Barlas
00:12.750 link 21.0.1-open Quan Anh Mai
---
DNF link 21.0.1-graal Jason Nochlin Didn't complete in 60 sec
DNF link 21.0.1-open Van Phu DO Didn't complete in 60 sec
DNF link 21.0.1-open John Ziamos Didn't complete in 60 sec
DNF link 21.0.1-open Olivier Bourgain Failed with java.lang.OutOfMemoryError: Java heap space

Prerequisites

Java 21 must be installed on your system.

Running the Challenge

This repository contains two programs:

  • dev.morling.onebrc.CreateMeasurements (invoked via create_measurements.sh): Creates the file measurements.txt in the root directory of this project with a configurable number of random measurement values
  • dev.morling.onebrc.CalculateAverage (invoked via calculate_average_baseline.sh): Calculates the average values for the file measurements.txt

Execute the following steps to run the challenge:

  1. Build the project using Apache Maven:

    ./mvnw clean verify
    
  2. Create the measurements file with 1B rows (just once):

    ./create_measurements.sh 1000000000
    

    This will take a few minutes. Attention: the generated file has a size of approx. 12 GB, so make sure you have enough disk space.

    If you're running the challenge with a non-Java language, there's a non-authoritative Python script to generate the measurements file at src/main/python/create_measurements.py. The authoritative method for generating the measurements is the Java program dev.morling.onebrc.CreateMeasurements.

  3. Calculate the average measurement values:

    ./calculate_average_baseline.sh
    

    The provided naive example implementation uses the Java streams API for processing the file and completes the task in just under 5 min on the environment used for result evaluation (see the baseline entry in the results table). It serves as the baseline for comparing your own implementation.

  4. Optimize the heck out of it:

    Adjust the CalculateAverage program to speed it up, in any way you see fit (just sticking to a few rules described below). Options include parallelizing the computation, using the (incubating) Vector API, memory-mapping different sections of the file concurrently, using AppCDS, GraalVM, CRaC, etc. for speeding up the application start-up, choosing and tuning the garbage collector, and much more.
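One of the options named above, processing different sections of the file concurrently, depends on cutting the input at row boundaries so that no measurement is split between workers. The following is a minimal sketch of that idea only. The class `ChunkSketch` and the method `chunkOffsets` are hypothetical names, and a `byte[]` stands in for the memory-mapped file to keep the example self-contained; a real entry would work with `long` offsets into a mapped region.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: splits a buffer into chunks that end on '\n' boundaries.
final class ChunkSketch {

    /**
     * Returns chunks + 1 offsets; chunk i covers [offsets[i], offsets[i + 1]).
     * Every boundary sits at the start of a row (or at the end of the data),
     * so trailing chunks may be empty when there are fewer rows than chunks.
     */
    static int[] chunkOffsets(byte[] data, int chunks) {
        int[] offsets = new int[chunks + 1];
        offsets[chunks] = data.length;
        for (int i = 1; i < chunks; i++) {
            // Start from the ideal even split, never before the previous boundary.
            int ideal = (int) ((long) data.length * i / chunks);
            int pos = Math.max(offsets[i - 1], ideal);
            // Move forward until the previous byte is a newline, i.e. a row starts here.
            while (pos > 0 && pos < data.length && data[pos - 1] != '\n') {
                pos++;
            }
            offsets[i] = pos;
        }
        return offsets;
    }

    public static void main(String[] args) {
        byte[] data = "a;1.0\nbb;2.0\nccc;3.0\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(chunkOffsets(data, 2)));
    }
}
```

Each chunk can then be handed to its own thread, with the per-thread results merged once all workers have finished.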

Flamegraph/Profiling

Tip: if you have jbang installed, you can get a flamegraph of your program by running async-profiler via ap-loader:

jbang --javaagent=ap-loader@jvm-profiling-tools/ap-loader=start,event=cpu,file=profile.html -m dev.morling.onebrc.CalculateAverage_yourname target/average-1.0.0-SNAPSHOT.jar

or directly on the .java file:

jbang --javaagent=ap-loader@jvm-profiling-tools/ap-loader=start,event=cpu,file=profile.html src/main/java/dev/morling/onebrc/CalculateAverage_yourname.java

When you run this, it will generate a flamegraph in profile.html. You can then open this in a browser and see where your program is spending its time.

Rules and limits

  • Any of these Java distributions may be used:
    • Any builds provided by SDKMan
    • Early access builds available on openjdk.net may be used (including EA builds for OpenJDK projects like Valhalla)
    • Builds on builds.shipilev.net
    If you want to use a build not available via these channels, reach out to discuss whether it can be considered.
  • No external library dependencies may be used
  • Implementations must be provided as a single source file
  • The computation must happen at application runtime, i.e. you cannot process the measurements file at build time (for instance, when using GraalVM) and just bake the result into the binary
  • Input value ranges are as follows:
    • Station name: non null UTF-8 string of min length 1 character and max length 100 bytes, containing neither ; nor \n characters. (i.e. this could be 100 one-byte characters, or 50 two-byte characters, etc.)
    • Temperature value: non null double between -99.9 (inclusive) and 99.9 (inclusive), always with one fractional digit
  • There is a maximum of 10,000 unique station names
  • Line endings in the file are \n characters on all platforms
  • Implementations must not rely on specifics of a given data set, e.g. any valid station name as per the constraints above and any data distribution (number of measurements per station) must be supported
  • The rounding of output values must be done using the semantics of IEEE 754 rounding-direction "roundTowardPositive"
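
The value-range and rounding rules above can be illustrated with a small sketch. It is not taken from any submission, and the names MeasurementRules, parseTenths and round are hypothetical. Two assumptions are baked in: temperature values always carry exactly one fractional digit (as stated above), and the rounding rule is interpreted as rounding to one fractional digit with ties going toward positive infinity, which is what Java's Math.round does.

```java
// Hypothetical helpers illustrating the input and rounding rules.
final class MeasurementRules {

    /**
     * Parses a value such as "-12.3" into tenths of a degree (-123).
     * Relies on the guarantee that there is exactly one fractional digit,
     * so the decimal point can simply be skipped.
     */
    static int parseTenths(String value) {
        boolean negative = value.charAt(0) == '-';
        int result = 0;
        for (int i = negative ? 1 : 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c != '.') {
                result = result * 10 + (c - '0');
            }
        }
        return negative ? -result : result;
    }

    /**
     * Rounds to one fractional digit. Assumption: ties go toward positive
     * infinity, i.e. 1.25 becomes 1.3 and -1.25 becomes -1.2.
     */
    static double round(double value) {
        return Math.round(value * 10.0) / 10.0;
    }
}
```

Keeping values as integer tenths avoids floating-point parsing in the hot loop; only the final min, mean and max need converting back to a decimal representation.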

Entering the Challenge

To submit your own implementation to 1BRC, follow these steps:

  • Create a fork of the onebrc GitHub repository.
  • Run ./create_fork.sh <your_GH_user> to copy the baseline implementation to your personal files, or do this manually:
    • Create a copy of CalculateAverage_baseline.java, named CalculateAverage_<your_GH_user>.java, e.g. CalculateAverage_doloreswilson.java.
    • Create a copy of calculate_average_baseline.sh, named calculate_average_<your_GH_user>.sh, e.g. calculate_average_doloreswilson.sh.
    • Adjust that script so that it references your implementation class name. If needed, provide any JVM arguments via the JAVA_OPTS variable in that script. Make sure that script does not write anything to standard output other than calculation results.
    • (Optional) OpenJDK 21 is used by default. If a custom JDK build is required, create a copy of prepare_baseline.sh, named prepare_<your_GH_user>.sh, e.g. prepare_doloreswilson.sh. Include the SDKMAN command sdk use java [version] in your prepare script.
    • (Optional) If you'd like to use native binaries (GraalVM), add all the required build logic to your prepare_<your_GH_user>.sh script.
  • Make that implementation fast. Really fast.
  • Run the test suite by executing ./test.sh <your_GH_user>; if any differences are reported, fix them before submitting your implementation.
  • Create a pull request against the upstream repository, clearly stating
    • The name of your implementation class.
    • The execution time of the program on your system and specs of the same (CPU, number of cores, RAM). This is for informative purposes only; the official runtime will be determined as described below.
  • I will run the program and determine its performance as described in the next section, and enter the result into the scoreboard.

Note: I reserve the right to not evaluate specific submissions if I feel doubtful about the implementation (i.e. I won't run your Bitcoin miner ;).

If you'd like to discuss any potential ideas for implementing 1BRC with the community, you can use the GitHub Discussions of this repository. Please keep it friendly and civil.

The challenge runs until Jan 31 2024. Any submissions (i.e. pull requests) created after Jan 31 2024 23:59 UTC will not be considered.

Evaluating Results

Results are determined by running the program on a Hetzner AX161 dedicated server (32 core AMD EPYC™ 7502P (Zen2), 128 GB RAM).

Programs are run from a RAM disk (i.e. the IO overhead for loading the file from disk is not relevant), using 8 cores of the machine. Each contender must pass the 1BRC test suite (./test.sh). The hyperfine program is used for measuring execution times of the launch scripts of all entries, i.e. end-to-end times are measured. Each contender is run five times in a row. The slowest and the fastest runs are discarded. The mean value of the remaining three runs is the result for that contender and will be added to the results table above. The exact same measurements.txt file is used for evaluating all contenders. See the script evaluate.sh for the exact implementation of the evaluation steps.
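
The arithmetic of that scoring rule (five runs, drop the slowest and the fastest, average the remaining three) can be sketched as follows. This is only an illustration with hypothetical names (TrimmedMean, score); the actual evaluation is done by evaluate.sh with hyperfine, not by this code.

```java
import java.util.Arrays;

// Hypothetical illustration of the scoring arithmetic; not part of evaluate.sh.
final class TrimmedMean {

    /** Returns the mean of the three middle values out of five run times (in seconds). */
    static double score(double[] runSeconds) {
        if (runSeconds.length != 5) {
            throw new IllegalArgumentException("expected exactly five runs");
        }
        double[] sorted = runSeconds.clone();
        Arrays.sort(sorted);
        // Index 0 is the fastest run and index 4 the slowest; both are discarded.
        return (sorted[1] + sorted[2] + sorted[3]) / 3.0;
    }
}
```

Discarding both extremes makes the result less sensitive to a single outlier, such as one run disturbed by background activity on the machine.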

Prize

If you enter this challenge, you may learn something new, get to inspire others, and take pride in seeing your name listed in the scoreboard above. Rumor has it that the winner may receive a unique 1🐝🏎️ t-shirt, too!

FAQ

Q: Can I use Kotlin or other JVM languages other than Java?
A: No, this challenge is focussed on Java only. Feel free to unofficially share implementations significantly outperforming any listed results, though.

Q: Can I use non-JVM languages and/or tools?
A: No, this challenge is focussed on Java only. Feel free to unofficially share interesting implementations and results, though. For instance, it would be interesting to see how DuckDB fares with this task.

Q: I've got an implementation—but it's not in Java. Can I share it somewhere?
A: Whilst non-Java solutions cannot be formally submitted to the challenge, you are welcome to share them over in the Show and tell GitHub discussion area.

Q: Can I use JNI?
A: Submissions must be completely implemented in Java, i.e. you cannot write JNI glue code in C/C++. You could use AOT compilation of Java code via GraalVM though, either by AOT-compiling the entire application, or by creating a native library (see here).

Q: What is the encoding of the measurements.txt file?
A: The file is encoded with UTF-8.

Q: Can I make assumptions on the names of the weather stations showing up in the data set?
A: No, while only a fixed set of station names is used by the data set generator, any solution should work with arbitrary UTF-8 station names (for the sake of simplicity, names are guaranteed to contain no ; or \n characters).

Q: Can I copy code from other submissions?
A: Yes, you can. The primary focus of the challenge is about learning something new, rather than "winning". When you do so, please give credit to the relevant source submissions. Please don't re-submit other entries with no or only trivial improvements.

Q: Which operating system is used for evaluation?
A: Fedora 39.

Q: My solution runs in 2 sec on my machine. Am I the fastest 1BRC-er in the world?
A: Probably not :) 1BRC results are reported in wallclock time, thus results of different implementations are only comparable when obtained on the same machine. If for instance an implementation is faster on a 32 core workstation than on the 8 core evaluation instance, this doesn't allow for any conclusions. When sharing 1BRC results, you should also always share the result of running the baseline implementation on the same hardware.

Q: Why 1🐝🏎️ ?
A: It's the abbreviation of the project name: One Billion Row Challenge.

1BRC on the Web

A list of external resources such as blog posts and videos, discussing 1BRC and specific implementations:

Sponsorship

A big thank you to my employer Decodable for funding the evaluation environment and supporting this challenge!

License

This code base is available under the Apache License, version 2.

Code of Conduct

Be excellent to each other! More than winning, the purpose of this challenge is to have fun and learn something new.