Commit Graph

88 Commits

Author SHA1 Message Date
Horia Chiorean
1a9b1cb7da
hchiorean - Initial version
* Initial version

* Removed some System out messages and tweaked some config values

* Added some fixes and some tweaks
2024-01-06 17:59:49 +01:00
Quan Anh Mai
4fc6034812
merykitty's attempt
* first commit

* fix test

* concurrency

* format for easier to follow explanation

* fix large keys

* fix overlapping ranges

* prefetch file

* add comments, remove prefetching

* typo
2024-01-06 17:01:12 +01:00
Bruno Félix
6eb8b49b7c
Implementation of 1brc - felix19350 (#53)
* Implementation of 1brc - felix19350

* Added license header

* Fixed failing tests

* Replaced parsing of doubles with a custom parser and integer arithmetic

---------

Co-authored-by: Bruno Felix <bruno.felix@klarna.com>
2024-01-06 11:27:30 +01:00
Michael Berry
38fc317073 Fix #159 2024-01-06 11:05:36 +01:00
Thomas Wuerthinger
a53aa2e6fd
Initial version for thomaswue with Oracle GraalVM Native Image
* Initial version.

* Make PGO feature optional off-by-default. Needs PGO_MODE environment
variable to be set. Add -O3 -march=native tuning flags for better
performance.

* Adjust script to be more quiet.

* Adjust max city length. Fix an issue when accumulating results.

* Tune thomaswue submission.
mmap the entire file, use Unsafe directly instead of ByteBuffer, avoid byte[] copies.
These tricks give a ~30% speedup, over an already fast implementation.

* Optimize parsing of numbers based on specific given constraints.

* Fix for segment calculation for case of very small input.

* Minor shell script fixes.

* Separate out build step into file additional_build_step_thomaswue.sh,
simplify run script and remove PGO option for now.

* Minor corrections to the run script.

---------

Co-authored-by: Alfonso² Peterssen <alfonso.peterssen@oracle.com>
2024-01-06 10:55:07 +01:00
Markus Ebner
093bd35c44 seijikun: Fix new unit-test introduced with #125 2024-01-06 10:39:19 +01:00
Marko Topolnik
35b90992aa More detailed attribution 2024-01-06 10:35:44 +01:00
Marko Topolnik
eccc8f9097 Improve name generation 2024-01-06 10:35:44 +01:00
Marko Topolnik
e09cb7deea Limit names to 100 bytes 2024-01-06 10:35:44 +01:00
Marko Topolnik
a094d07925 Move attribution into weather_stations.csv 2024-01-06 10:35:44 +01:00
Marko Topolnik
816e59b678 Eliminate duplicate station names 2024-01-06 10:35:44 +01:00
Marko Topolnik
0f1f204a0d Generate measurements with random names
Name length goes from 1 to 100.
2024-01-06 10:35:44 +01:00
Abhilash
ba1999cddf 1 brc - 2gb memory 1 min 31 secs 2024-01-06 00:25:15 +01:00
twobiers
c24bcac047 Adjust buffer size to solve test failure in #125 2024-01-06 00:12:36 +01:00
Yavuz Tas
f6acc6f3d5
A solution with Actor Model concurrency and MappedByteBuffer
* A solution with Actor Model concurrency and MappedByteBuffer

* fix test cases

* revert back the file name to original

* cache String hashCode calculation via composing with Key object

* fix wrong key caching and eleminate duplicate String creation between actors

* update possible char count in a line, fix calculate_average.sh

* increase possible line length to 256 bytes, much safer to cover 100 chars I hope

---------

Co-authored-by: Yavuz Tas <yavuz.tas@ing.com>
2024-01-05 23:50:58 +01:00
Roman Schweitzer
5f4ed31fec
CalculateAverage_truelive second attempt
* cleanup

* getDouble new double parser

* parseBuffer more reliable

* use graalvm to execute

* cleanup

* cleanup

* fix formatting

* fix graalvm init and launch script
2024-01-05 23:40:03 +01:00
Parth Mudgal
72ad94d6c2
artpar's attempt
* artpar's attempt

* artpar's attempt

* remove int -> Integer conversions, custom parsing for measurements

* remove allocations by caching station names
also remove Integer and use int instead to remove valueOf calls

* Fix result mismatch errors

* parse int instead of double

* reduce time spent reading the mapped buffer

* cleanup unused memory

* less is faster ? vector addition doesn't look worth it

* backout from virtual threads as well

* Fix breaking tests
2024-01-05 23:28:38 +01:00
greid
d3e88219f0 gabrielreid's first attempt
Somewhat mixed collection of multiple ideas, mostly based initially
on using the new JDK Vector API for extracting offsets of newlines
and semicolons.

Runs locally in just under 11 seconds on 1B rows of input on a
2020 M1 Macbook Air.
2024-01-05 23:19:02 +01:00
Jamie Stansfield
4614b81eb6
isolgpus: submission 1
* isolgpus: submission 1

* isolgpus: fix min value bug (breaks if a negative temperature never appears)

* isolgpus: remove unused collector

* isolgpus: fix split on chunk bug

* isolgpus: change name equality algo to a cheaper check.

* isolgpus: fix chunking state to cope with last byte of last chunk

* isolgpus: hash as we go, instead of at the end

* isolgpus: adjust thread count to core count

* isolgpus: change cores to 8 statically

---------

Co-authored-by: Jamie Stansfield <jalstansfield@gmail.com>
2024-01-05 23:10:43 +01:00
Yunus Inci
4fe00bf7a7 Improve spullara's solution 2024-01-05 23:00:43 +01:00
Johannes Schüth
22b5435893 Adding Johannes Schüth's submission 2024-01-05 22:52:39 +01:00
Ramzi Ben Yahya
e8a3011aca
rby: Has some interesting optimisations but could be improved further with a custom hash map
* rby: Could be improved with a custom hashmap

* Flag not needed

* Fixes the tests when running ./test.sh rby
2024-01-05 20:25:51 +01:00
Tobi
d617039d10
Twobier's submission
* First performance tweaks

* further tweaks

* collect into a treemap

* Tweak JVM options

* Inline rounding into collector

* reduce some operations

* oops, add missing braces

* tweak JVM options

* small fixes

* add min and max to processing

* fix min

* remove compact strings

* replace sumWithCompensation with naive sum implementation

* use UseShenandoahGC

* integrate mmap

* integrate mmap

* Fix messed up array logic

* Set jdk version
2024-01-05 20:18:27 +01:00
Markus Ebner
36dac255cf
Update seijikun implementation
* Use Integer calculation instead of double, add unit-test

* Bring back StationIdent optimization

Originally, StationIdent was using byte[] to store names, so the extra
String allocation could be avoided. However, that produced incorrect
sorting.
Sorting is now moved to the result merging step. Here, names are
converted to Strings.

* Implement readStationName with SIMD 256bit

* Rebase and cleanup test code, now that it's in the project

* Fix seijikun formatting

* Fix test failure in specific jobCnt edge-cases

* Also switch to graalvm
2024-01-05 19:35:15 +01:00
deemkeen
e3f6c3aaf7
initial deemkeen 2024-01-05 19:30:02 +01:00
Artsiom Korzun
cec579b506 improved artsiomkorzun solution 2024-01-05 19:02:44 +01:00
Keshavram Kuduwa
a53549ae50
Resolves #102 and Code Optimizations 2024-01-05 18:35:31 +01:00
Filip Hrisafov
c4879d4104
Use proper key for CalculateAverage_filiphr;
* Revert using hash as a key
* Use custom key with Arrays#equals as a key in the Map of measurements
* Add sdk use java in the calculate script
2024-01-05 17:54:08 +01:00
Roy van Rijn
3a2e0ed267 Adding more speed improvements, going for first again.
Updating script
2024-01-05 17:44:36 +01:00
Ujjwal Bharti
631722158c Added implementation for calculating average 2024-01-05 17:23:13 +01:00
Elliot Barlas
99b453334c Implement imperative state machine for floating point parser rather then generic, adaptive loop. 2024-01-05 17:11:22 +01:00
Samson Yeung
a1a9a19324 Custom atoi/atof parser logic, plus math changes.
This commit uses a custom atoi function that converts 12.4 to 124 so we
can do integer math instead of using doubles.
2024-01-05 16:59:29 +01:00
David Kopec
8a282ab71b
Add davecom Entry 2024-01-05 16:35:05 +01:00
Roman Romanchuk
15cceae81b
fatroom's initial attempt
* Initial attempt

* Fixed temperature parsing

* Switched to memory mapped files

* Fixed rounding issues

* Inline of temperature reading

* Fixed output rounding
2024-01-05 11:30:18 +01:00
anandmattikopp
0d33213dc6 feat: first version of the 1brc solution 2024-01-05 11:24:14 +01:00
Arman Sharif
951c06e051 armandino: first submission 2024-01-05 00:13:44 +01:00
Alexander Yastrebov
69ff290d9d jgrateron: fix formatting
Followup on #69
2024-01-04 23:56:47 +01:00
Nick Palmer
39c421d520 Pass newly added tests :fingers-crossed: 2024-01-04 23:54:04 +01:00
Nick Palmer
6aa63e1bd5 Attempt nicer threading via streams and spliterators 2024-01-04 23:54:04 +01:00
Richard Startin
b2cd84c6bc make aggregation state grow dynamically 2024-01-04 23:48:54 +01:00
jairo
a17ab05d4b add implementation jgrateron 2024-01-04 23:43:43 +01:00
Nils Semmelrock
12ae36ade1
Adding Nils Semmelrock's submission
nothing fancy, just work on chunks in parallel and optimize bottlenecks
2024-01-04 23:31:47 +01:00
Roy van Rijn
1c74049991
Updating Roy's submission
* Added tests for endian-calculations (had these in a different class, perhaps handy for others to see as well)

Inlined the hash function, runs locally in 2.4sec now, hopefully endian issues fix

Added equals to support any city name up to 1024 in length, don't rely on hash

* For clarity I've updated the code so endian doesn't change the hashes, easier to debug.

* Fixing bug in array check

Simple is faster

* Also spotted the diff, not just the big exception

Fixed buffer limit issue
2024-01-04 23:22:48 +01:00
Moysés Borges Furtado
acb6510a02
Adding Moysés Borges Furtado's submission 2024-01-04 23:15:22 +01:00
Gunnar Morling
e1a6832837 Adding a missing new line 2024-01-04 21:32:02 +01:00
Elliot Barlas
a8bd6b58ce
Elliot Barlas: Use proper hash key collision detection scheme
* Use open-addressing scheme to deal with hash table collisions. Reduce concurrency from 16 to 8. Use bit mask rather than mod operator to confine hash code to table range.

* Properly handle file partitions that reside entirely within a line.

* Reorder statements in doProcessBuffer.
2024-01-04 21:06:19 +01:00
Sam Pullara
4af3253d53
Updating Sam Pullara's entry 2024-01-04 19:14:06 +01:00
Gunnar Morling
c1954f6a3f Formatting 2024-01-04 19:03:42 +01:00
artsiomkorzun
9b0b10f101
Adding artsiomkorzun's solution 2024-01-04 19:01:28 +01:00
Filip Hrisafov
8c5aaf2db9 Manually compute temperature value instead of using Long.parseLong 2024-01-04 18:50:45 +01:00