Playground/1brc - 1brc

Author	SHA1	Message	Date
Markus Ebner	093bd35c44	seijikun: Fix new unit-test introduced with #125	2024-01-06 10:39:19 +01:00
Marko Topolnik	35b90992aa	More detailed attribution	2024-01-06 10:35:44 +01:00
Marko Topolnik	7ec968d3bb	One more sample in test file	2024-01-06 10:35:44 +01:00
Marko Topolnik	eccc8f9097	Improve name generation	2024-01-06 10:35:44 +01:00
Marko Topolnik	e09cb7deea	Limit names to 100 bytes	2024-01-06 10:35:44 +01:00
Marko Topolnik	a094d07925	Move attribution into weather_stations.csv	2024-01-06 10:35:44 +01:00
Marko Topolnik	d7456c6ff9	Add test sample with a worst-case UTF-8 name	2024-01-06 10:35:44 +01:00
Marko Topolnik	816e59b678	Eliminate duplicate station names	2024-01-06 10:35:44 +01:00
Marko Topolnik	0f1f204a0d	Generate measurements with random names Name length goes from 1 to 100.	2024-01-06 10:35:44 +01:00
Abhilash	ba1999cddf	1 brc - 2gb memory 1 min 31 secs	2024-01-06 00:25:15 +01:00
twobiers	c24bcac047	Adjust buffer size to solve test failure in #125	2024-01-06 00:12:36 +01:00
Yavuz Tas	f6acc6f3d5	A solution with Actor Model concurrency and MappedByteBuffer * A solution with Actor Model concurrency and MappedByteBuffer * fix test cases * revert back the file name to original * cache String hashCode calculation via composing with Key object * fix wrong key caching and eliminate duplicate String creation between actors * update possible char count in a line, fix calculate_average.sh * increase possible line length to 256 bytes, much safer to cover 100 chars I hope --------- Co-authored-by: Yavuz Tas <yavuz.tas@ing.com>	2024-01-05 23:50:58 +01:00
Roman Schweitzer	5f4ed31fec	CalculateAverage_truelive second attempt * cleanup * getDouble new double parser * parseBuffer more reliable * use graalvm to execute * cleanup * cleanup * fix formatting * fix graalvm init and launch script	2024-01-05 23:40:03 +01:00
Parth Mudgal	72ad94d6c2	artpar's attempt * artpar's attempt * artpar's attempt * remove int -> Integer conversions, custom parsing for measurements * remove allocations by caching station names also remove Integer and use int instead to remove valueOf calls * Fix result mismatch errors * parse int instead of double * reduce time spent reading the mapped buffer * cleanup unused memory * less is faster ? vector addition doesn't look worth it * backout from virtual threads as well * Fix breaking tests	2024-01-05 23:28:38 +01:00
greid	d3e88219f0	gabrielreid's first attempt Somewhat mixed collection of multiple ideas, mostly based initially on using the new JDK Vector API for extracting offsets of newlines and semicolons. Runs locally in just under 11 seconds on 1B rows of input on a 2020 M1 Macbook Air.	2024-01-05 23:19:02 +01:00
Jamie Stansfield	4614b81eb6	isolgpus: submission 1 * isolgpus: submission 1 * isolgpus: fix min value bug (breaks if a negative temperature never appears) * isolgpus: remove unused collector * isolgpus: fix split on chunk bug * isolgpus: change name equality algo to a cheaper check. * isolgpus: fix chunking state to cope with last byte of last chunk * isolgpus: hash as we go, instead of at the end * isolgpus: adjust thread count to core count * isolgpus: change cores to 8 statically --------- Co-authored-by: Jamie Stansfield <jalstansfield@gmail.com>	2024-01-05 23:10:43 +01:00
Yunus Inci	4fe00bf7a7	Improve spullara's solution	2024-01-05 23:00:43 +01:00
Johannes Schüth	22b5435893	Adding Johannes Schüth's submission	2024-01-05 22:52:39 +01:00
Ramzi Ben Yahya	e8a3011aca	rby: Has some interesting optimisations but could be improved further with a custom hash map * rby: Could be improved with a custom hashmap * Flag not needed * Fixes the tests when running ./test.sh rby	2024-01-05 20:25:51 +01:00
Tobi	d617039d10	Twobier's submission * First performance tweaks * further tweaks * collect into a treemap * Tweak JVM options * Inline rounding into collector * reduce some operations * oops, add missing braces * tweak JVM options * small fixes * add min and max to processing * fix min * remove compact strings * replace sumWithCompensation with naive sum implementation * use UseShenandoahGC * integrate mmap * integrate mmap * Fix messed up array logic * Set jdk version	2024-01-05 20:18:27 +01:00
Markus Ebner	36dac255cf	Update seijikun implementation * Use Integer calculation instead of double, add unit-test * Bring back StationIdent optimization Originally, StationIdent was using byte[] to store names, so the extra String allocation could be avoided. However, that produced incorrect sorting. Sorting is now moved to the result merging step. Here, names are converted to Strings. * Implement readStationName with SIMD 256bit * Rebase and cleanup test code, now that it's in the project * Fix seijikun formatting * Fix test failure in specific jobCnt edge-cases * Also switch to graalvm	2024-01-05 19:35:15 +01:00
deemkeen	e3f6c3aaf7	initial deemkeen	2024-01-05 19:30:02 +01:00
Artsiom Korzun	cec579b506	improved artsiomkorzun solution	2024-01-05 19:02:44 +01:00
Keshavram Kuduwa	a53549ae50	Resolves #102 and Code Optimizations	2024-01-05 18:35:31 +01:00
Filip Hrisafov	c4879d4104	Use proper key for CalculateAverage_filiphr; * Revert using hash as a key * Use custom key with Arrays#equals as a key in the Map of measurements * Add sdk use java in the calculate script	2024-01-05 17:54:08 +01:00
Roy van Rijn	3a2e0ed267	Adding more speed improvements, going for first again. Updating script	2024-01-05 17:44:36 +01:00
Ujjwal Bharti	631722158c	Added implementation for calculating average	2024-01-05 17:23:13 +01:00
Elliot Barlas	99b453334c	Implement imperative state machine for floating point parser rather than generic, adaptive loop.	2024-01-05 17:11:22 +01:00
Samson Yeung	a1a9a19324	Custom atoi/atof parser logic, plus math changes. This commit uses a custom atoi function that converts 12.4 to 124 so we can do integer math instead of using doubles.	2024-01-05 16:59:29 +01:00
David Kopec	8a282ab71b	Add davecom Entry	2024-01-05 16:35:05 +01:00
Roman Romanchuk	15cceae81b	fatroom's initial attempt * Initial attempt * Fixed temperature parsing * Switched to memory mapped files * Fixed rounding issues * Inline of temperature reading * Fixed output rounding	2024-01-05 11:30:18 +01:00
anandmattikopp	0d33213dc6	feat: first version of the 1brc solution	2024-01-05 11:24:14 +01:00
Arman Sharif	951c06e051	armandino: first submission	2024-01-05 00:13:44 +01:00
Alexander Yastrebov	69ff290d9d	jgrateron: fix formatting Followup on #69	2024-01-04 23:56:47 +01:00
Nick Palmer	39c421d520	Pass newly added tests :fingers-crossed:	2024-01-04 23:54:04 +01:00
Nick Palmer	6aa63e1bd5	Attempt nicer threading via streams and spliterators	2024-01-04 23:54:04 +01:00
Richard Startin	b2cd84c6bc	make aggregation state grow dynamically	2024-01-04 23:48:54 +01:00
Alexander Yastrebov	b467319e58	test: use temperature value of 1.0 In case of key collision broken implementation will likely attribute measurements to the wrong key and therefore it is better to have non-zero value to end up with a wrong average value. When all measurements are zero then averages are also zero even when attributed to the wrong keys. Updates #91	2024-01-04 23:46:46 +01:00
jairo	a17ab05d4b	add implementation jgrateron	2024-01-04 23:43:43 +01:00
Nils Semmelrock	12ae36ade1	Adding Nils Semmelrock's submission nothing fancy, just work on chunks in parallel and optimize bottlenecks	2024-01-04 23:31:47 +01:00
Roy van Rijn	1c74049991	Updating Roy's submission * Added tests for endian-calculations (had these in a different class, perhaps handy for others to see as well) Inlined the hash function, runs locally in 2.4sec now, hopefully endian issues fix Added equals to support any city name up to 1024 in length, don't rely on hash * For clarity I've updated the code so endian doesn't change the hashes, easier to debug. * Fixing bug in array check Simple is faster * Also spotted the diff, not just the big exception Fixed buffer limit issue	2024-01-04 23:22:48 +01:00
Moysés Borges Furtado	acb6510a02	Adding Moysés Borges Furtado's submission	2024-01-04 23:15:22 +01:00
Alexander Yastrebov	723cc6a33b	test: add sample with 10k unique keys Input created via `bash -c 'for i in {1..10000} ; do echo "id$i;0.0" ; done' >./src/test/resources/samples/measurements-10000-unique-keys.txt` and output via baseline implementation. Keys are short and very similar which improves chances for collision and hence are good for testing. Fixes #91	2024-01-04 21:39:04 +01:00
Gunnar Morling	e1a6832837	Adding a missing new line	2024-01-04 21:32:02 +01:00
Elliot Barlas	a8bd6b58ce	Elliot Barlas: Use proper hash key collision detection scheme * Use open-addressing scheme to deal with hash table collisions. Reduce concurrency from 16 to 8. Use bit mask rather than mod operator to confine hash code to table range. * Properly handle file partitions that reside entirely within a line. * Reorder statements in doProcessBuffer.	2024-01-04 21:06:19 +01:00
Sam Pullara	4af3253d53	Updating Sam Pullara's entry	2024-01-04 19:14:06 +01:00
Gunnar Morling	c1954f6a3f	Formatting	2024-01-04 19:03:42 +01:00
artsiomkorzun	9b0b10f101	Adding artsiomkorzun's solution	2024-01-04 19:01:28 +01:00
Filip Hrisafov	8c5aaf2db9	Manually compute temperature value instead of using Long.parseLong	2024-01-04 18:50:45 +01:00
Filip Hrisafov	f5f3a41045	Use a hash key for the city as a key in the map	2024-01-04 18:50:45 +01:00

140 Commits
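
Several commit messages in this log (a1a9a19324, 36dac255cf, a8bd6b58ce) describe two recurring ideas: parsing a reading such as 12.4 into the integer 124 so that aggregation uses integer math instead of doubles, and keeping per-station state in an open-addressing hash table whose index is confined with a bit mask rather than the mod operator. The Java sketch below illustrates both. It is not taken from any submission. The class name `FixedPointAggregation`, the table capacity, and the use of `String.hashCode()` are hypothetical choices made for illustration. It also assumes exactly one fractional digit per reading and fewer distinct station names than table slots.

```java
import java.nio.charset.StandardCharsets;

public class FixedPointAggregation {

    // Capacity is a power of two so that (hash & MASK) always lands inside the table.
    // Assumption: the input has fewer than CAPACITY distinct station names.
    static final int CAPACITY = 1 << 14;
    static final int MASK = CAPACITY - 1;

    final String[] names = new String[CAPACITY];
    final int[] min = new int[CAPACITY];
    final int[] max = new int[CAPACITY];
    final long[] sum = new long[CAPACITY];
    final int[] count = new int[CAPACITY];

    // Parses bytes such as "12.4" or "-0.5" into tenths of a degree ("12.4" -> 124).
    // Assumption: exactly one fractional digit and no whitespace.
    static int parseTenths(byte[] buf, int start, int end) {
        int i = start;
        boolean negative = buf[i] == '-';
        if (negative) {
            i++;
        }
        int value = 0;
        for (; i < end; i++) {
            byte b = buf[i];
            if (b != '.') {
                value = value * 10 + (b - '0');
            }
        }
        return negative ? -value : value;
    }

    // Returns the slot holding `name`, or the first empty slot on its probe sequence.
    int probe(String name) {
        int slot = name.hashCode() & MASK; // bit mask instead of the mod operator
        while (names[slot] != null && !names[slot].equals(name)) {
            slot = (slot + 1) & MASK; // linear probing resolves collisions
        }
        return slot;
    }

    void add(String name, int tenths) {
        int slot = probe(name);
        if (names[slot] == null) {
            names[slot] = name;
            min[slot] = tenths;
            max[slot] = tenths;
        } else {
            min[slot] = Math.min(min[slot], tenths);
            max[slot] = Math.max(max[slot], tenths);
        }
        sum[slot] += tenths;
        count[slot]++;
    }

    // Mean in degrees; the division by 10 happens once, at output time.
    double mean(String name) {
        int slot = probe(name);
        return sum[slot] / (10.0 * count[slot]);
    }

    public static void main(String[] args) {
        FixedPointAggregation agg = new FixedPointAggregation();
        String[] lines = { "Hamburg;12.4", "Hamburg;-0.5", "Oslo;3.0" };
        for (String line : lines) {
            byte[] bytes = line.getBytes(StandardCharsets.UTF_8);
            int sep = line.indexOf(';');
            // Station names here are ASCII, so the char index equals the byte index.
            agg.add(line.substring(0, sep), parseTenths(bytes, sep + 1, bytes.length));
        }
        System.out.println("Hamburg mean: " + agg.mean("Hamburg"));
        System.out.println("Oslo mean: " + agg.mean("Oslo"));
    }
}
```

Comparing the stored name on every probe, rather than trusting the hash alone, is what lets a table like this pass collision-heavy inputs such as the 10k unique keys sample added in 723cc6a33b.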