Commit Graph

326 Commits

Author SHA1 Message Date
greid
d3e88219f0 gabrielreid's first attempt
Somewhat mixed collection of multiple ideas, mostly based initially
on using the new JDK Vector API for extracting offsets of newlines
and semicolons.

Runs locally in just under 11 seconds on 1B rows of input on a
2020 M1 Macbook Air.
2024-01-05 23:19:02 +01:00
Jamie Stansfield
4614b81eb6
isolgpus: submission 1
* isolgpus: submission 1

* isolgpus: fix min value bug (breaks if a negative temperature never appears)

* isolgpus: remove unused collector

* isolgpus: fix split on chunk bug

* isolgpus: change name equality algo to a cheaper check.

* isolgpus: fix chunking state to cope with last byte of last chunk

* isolgpus: hash as we go, instead of at the end

* isolgpus: adjust thread count to core count

* isolgpus: change cores to 8 statically

---------

Co-authored-by: Jamie Stansfield <jalstansfield@gmail.com>
2024-01-05 23:10:43 +01:00
Yunus Inci
4fe00bf7a7 Improve spullara's solution 2024-01-05 23:00:43 +01:00
Johannes Schüth
22b5435893 Adding Johannes Schüth's submission 2024-01-05 22:52:39 +01:00
Ramzi Ben Yahya
e8a3011aca
rby: Has some interesting optimisations but could be improved further with a custom hash map
* rby: Could be improved with a custom hashmap

* Flag not needed

* Fixes the tests when running ./test.sh rby
2024-01-05 20:25:51 +01:00
Tobi
d617039d10
Twobier's submission
* First performance tweaks

* further tweaks

* collect into a treemap

* Tweak JVM options

* Inline rounding into collector

* reduce some operations

* oops, add missing braces

* tweak JVM options

* small fixes

* add min and max to processing

* fix min

* remove compact strings

* replace sumWithCompensation with naive sum implementation

* use UseShenandoahGC

* integrate mmap

* integrate mmap

* Fix messed up array logic

* Set jdk version
2024-01-05 20:18:27 +01:00
Markus Ebner
36dac255cf
Update seijikun implementation
* Use Integer calculation instead of double, add unit-test

* Bring back StationIdent optimization

Originally, StationIdent was using byte[] to store names, so the extra
String allocation could be avoided. However, that produced incorrect
sorting.
Sorting is now moved to the result merging step. Here, names are
converted to Strings.

* Implement readStationName with SIMD 256bit

* Rebase and cleanup test code, now that it's in the project

* Fix seijikun formatting

* Fix test failure in specific jobCnt edge-cases

* Also switch to graalvm
2024-01-05 19:35:15 +01:00
deemkeen
e3f6c3aaf7
initial deemkeen 2024-01-05 19:30:02 +01:00
Artsiom Korzun
cec579b506 improved artsiomkorzun solution 2024-01-05 19:02:44 +01:00
Keshavram Kuduwa
a53549ae50
Resolves #102 and Code Optimizations 2024-01-05 18:35:31 +01:00
Filip Hrisafov
c4879d4104
Use proper key for CalculateAverage_filiphr;
* Revert using hash as a key
* Use custom key with Arrays#equals as a key in the Map of measurements
* Add sdk use java in the calculate script
2024-01-05 17:54:08 +01:00
Roy van Rijn
3a2e0ed267 Adding more speed improvements, going for first again.
Updating script
2024-01-05 17:44:36 +01:00
Ujjwal Bharti
631722158c Added implementation for calculating average 2024-01-05 17:23:13 +01:00
Elliot Barlas
99b453334c Implement imperative state machine for floating point parser rather then generic, adaptive loop. 2024-01-05 17:11:22 +01:00
Samson Yeung
a1a9a19324 Custom atoi/atof parser logic, plus math changes.
This commit uses a custom atoi function that converts 12.4 to 124 so we
can do integer math instead of using doubles.
2024-01-05 16:59:29 +01:00
David Kopec
8a282ab71b
Add davecom Entry 2024-01-05 16:35:05 +01:00
Roman Romanchuk
15cceae81b
fatroom's initial attempt
* Initial attempt

* Fixed temperature parsing

* Switched to memory mapped files

* Fixed rounding issues

* Inline of temperature reading

* Fixed output rounding
2024-01-05 11:30:18 +01:00
anandmattikopp
0d33213dc6 feat: first version of the 1brc solution 2024-01-05 11:24:14 +01:00
Arman Sharif
951c06e051 armandino: first submission 2024-01-05 00:13:44 +01:00
Alexander Yastrebov
69ff290d9d jgrateron: fix formatting
Followup on #69
2024-01-04 23:56:47 +01:00
Nick Palmer
39c421d520 Pass newly added tests :fingers-crossed: 2024-01-04 23:54:04 +01:00
Nick Palmer
6aa63e1bd5 Attempt nicer threading via streams and spliterators 2024-01-04 23:54:04 +01:00
Richard Startin
b2cd84c6bc make aggregation state grow dynamically 2024-01-04 23:48:54 +01:00
Alexander Yastrebov
b467319e58 test: use temperature value of 1.0
In case of key collision broken implementation will likely attribute
measurements to the wrong key and therefore it is better to have
non-zero value to end up with a wrong average value.

When all measurements are zero then averages are also zero even
when attributed to the wrong keys.

Updates #91
2024-01-04 23:46:46 +01:00
jairo
a17ab05d4b add implementation jgrateron 2024-01-04 23:43:43 +01:00
Nils Semmelrock
12ae36ade1
Adding Nils Semmelrock's submission
nothing fancy, just work on chunks in parallel and optimize bottlenecks
2024-01-04 23:31:47 +01:00
Roy van Rijn
1c74049991
Updating Roy's submission
* Added tests for endian-calculations (had these in a different class, perhaps handy for others to see as well)

Inlined the hash function, runs locally in 2.4sec now, hopefully endian issues fix

Added equals to support any city name up to 1024 in length, don't rely on hash

* For clarity I've updated the code so endian doesn't change the hashes, easier to debug.

* Fixing bug in array check

Simple is faster

* Also spotted the diff, not just the big exception

Fixed buffer limit issue
2024-01-04 23:22:48 +01:00
Moysés Borges Furtado
acb6510a02
Adding Moysés Borges Furtado's submission 2024-01-04 23:15:22 +01:00
Alexander Yastrebov
723cc6a33b test: add sample with 10k unique keys
Input created via
```sh
bash -c 'for i in {1..10000} ; do echo "id$i;0.0" ; done' >./src/test/resources/samples/measurements-10000-unique-keys.txt
```
and output via baseline implementation.

Keys are short and very similar which improves chances for collision
and hence are good for testing.

Fixes #91
2024-01-04 21:39:04 +01:00
Gunnar Morling
e1a6832837 Adding a missing new line 2024-01-04 21:32:02 +01:00
Elliot Barlas
a8bd6b58ce
Elliot Barlas: Use proper hash key collision detection scheme
* Use open-addressing scheme to deal with hash table collisions. Reduce concurrency from 16 to 8. Use bit mask rather than mod operator to confine hash code to table range.

* Properly handle file partitions that reside entirely within a line.

* Reorder statements in doProcessBuffer.
2024-01-04 21:06:19 +01:00
Sam Pullara
4af3253d53
Updating Sam Pullara's entry 2024-01-04 19:14:06 +01:00
Gunnar Morling
c1954f6a3f Formatting 2024-01-04 19:03:42 +01:00
artsiomkorzun
9b0b10f101
Adding artsiomkorzun's solution 2024-01-04 19:01:28 +01:00
Filip Hrisafov
8c5aaf2db9 Manually compute temperature value instead of using Long.parseLong 2024-01-04 18:50:45 +01:00
Filip Hrisafov
f5f3a41045 Use a hash key for the city as a key in the map 2024-01-04 18:50:45 +01:00
Filip Hrisafov
4a483b4097 Use long parse and use char array instead of CharBuffer for adding to it 2024-01-04 18:50:45 +01:00
Gunnar Morling
88b1c30db8 Leaderboard update 2024-01-04 18:22:13 +01:00
Richard Startin
c3411f6023
Richard Startin: Adopt @spullara's double parsing code;
* increase chunk size
* simplify and tune parameters
2024-01-04 18:19:56 +01:00
Peter Lawrey
a09fa928db
Adding Peter Lawrey's submission 2024-01-04 17:59:40 +01:00
Gunnar Morling
5c219e7b7a Fixing wrong expected value 2024-01-04 17:43:01 +01:00
Gunnar Morling
b6d33fd8fe Expanding tests and eval infra 2024-01-04 17:22:00 +01:00
Filip Hrisafov
a503362c36 Auto reformat classes 2024-01-04 15:35:34 +01:00
Alexander Yastrebov
c9400bc1ce test: add test samples
Adds test samples that can be used for unit tests or to verify
implementations via:
```bash
for sample in $(ls src/test/resources/samples/*.txt)
do
  echo "Validating $sample"
  rm -f measurements.txt
  ln -s $sample measurements.txt

  diff <(./calculate_average.sh) ${sample%.txt}.out
done
rm measurements.txt
```

For #61
2024-01-04 13:18:29 +01:00
Dimitar Dimitrov
d73457872f ddimtirov - switched to the foreign memory access preview API for another 10% speedup 2024-01-03 21:04:44 +01:00
Dimitar Dimitrov
1923fc65a8 ddimtirov - lifted parallel mmapped i/o from Sam Pullara's implementation 2024-01-03 21:04:44 +01:00
Dimitar Dimitrov
57cfa54c68 ddimtirov - single-threaded datastructures tuning - reading to char buffers, one pass, no allocation processing 2024-01-03 21:04:44 +01:00
Dimitar Dimitrov
2458f056d6 ddimtirov - fixpoint, objects, streams and strings 2024-01-03 21:04:44 +01:00
Roy van Rijn
5570f1b60a
Roy van Rijn: memory mapped files, branchless parsing, bitwiddle magic
Added SWAR (SIMD Within A Register) code to increase bytebuffer processing/throughput
Delaying the creation of the String by comparing hash, segmenting like spullara, improved EOL finding

Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
2024-01-03 20:44:24 +01:00
Richard Startin
0ba5cf33d4 richardstartin submission 2024-01-03 20:42:18 +01:00
Filip Hrisafov
d57cf78faa
Adding filiphr's submission;
* Initial implementation using Shenandoah GC and parallel iteration

* Use memory mapped files

* Iterate the buffer once and use BigDecimal parsing instead of Double.parseDouble

* Add information about Graal

* Add sdk use to calculate script
2024-01-03 20:32:16 +01:00
Karl Heinz Marbaise
95c9d091ef
Adding khmarbaise 2024-01-03 20:21:23 +01:00
Markus Ebner
580c4ac214
Adding seijikun's submission 2024-01-03 20:06:14 +01:00
Nick Palmer
8e6298cd2a
Adding Nick Palmer's submission;
* Memory mapped file, single-pass parsing, custom hash map, fixed thread pool
The threading was a hasty addition and needs work

* Used arraylist instead of treemap to reduce a little overhead
We only need it sorted for output, so only construct a treemap for output

* Attempt to speed up double conversion

* Cap core count for low-core systems

* Fix wrong exponent

* Accumulate measurement value in double, seems marginally faster

Benchmark                                                           Mode  Cnt    Score    Error   Units
DoubleParsingBenchmark.ourToDouble                                 thrpt   10  569.771 ±  7.065  ops/us
DoubleParsingBenchmark.ourToDoubleAccumulateInToDouble             thrpt   10  648.026 ±  7.741  ops/us
DoubleParsingBenchmark.ourToDoubleDivideInsteadOfMultiply          thrpt   10  570.412 ±  9.329  ops/us
DoubleParsingBenchmark.ourToDoubleNegative                         thrpt   10  512.618 ±  8.580  ops/us
DoubleParsingBenchmark.ourToDoubleNegativeAccumulateInToDouble     thrpt   10  565.043 ± 18.137  ops/us
DoubleParsingBenchmark.ourToDoubleNegativeDivideInsteadOfMultiply  thrpt   10  511.228 ± 13.967  ops/us
DoubleParsingBenchmark.stringToDouble                              thrpt   10   52.310 ±  1.351  ops/us
DoubleParsingBenchmark.stringToDoubleNegative                      thrpt   10   50.785 ±  1.252  ops/us
2024-01-03 17:21:56 +01:00
Elliot Barlas
1b048c876d
Adding Elliot Barlas' submission 2024-01-03 16:25:24 +01:00
Roman Schweitzer
ea035790fd
Using DoubleAccumulators to save on a measurment creation (#26) 2024-01-03 16:10:31 +01:00
Chris Riccomini
e9f062ce4d Adding Chris Riccomini's submission 2024-01-03 15:56:31 +01:00
Sam Pullara
c832d64afe
Adding Sam Pullara's submission; 2024-01-03 15:35:51 +01:00
Rene Schwietzke
70fcbf9c27 Removed changes to formatting 2024-01-03 13:03:37 +01:00
Rene Schwietzke
04bd2d69b6 Faster version of the data generator 2024-01-03 13:03:37 +01:00
Karl Heinz Marbaise
7d485d0e8b Usage of try-with-resources
pom file cleanup
2024-01-03 13:03:03 +01:00
Aurelian Tutuianu
1721848570 - implementation by padreati 2024-01-02 21:16:49 +01:00
Gunnar Morling
7b7a7d1667 Evaluating bjhara's entry 2024-01-02 20:43:46 +01:00
Hampus Ram
6b13d52b67 Implementation using memory mapped file 2024-01-02 20:41:33 +01:00
Gunnar Morling
5e80d8a7b0 Evaluating Kuduwa Keshavram's submission 2024-01-02 20:30:41 +01:00
Keshavram Kuduwa
bea9cfdbd1 Keshavram Kuduwa's Submission 2024-01-02 20:25:56 +01:00
Gunnar Morling
dfd26e8168 Evaluating itaske's submission 2024-01-02 20:07:39 +01:00
itaske
48a6e49e47
Adding itaske's implementation 2024-01-02 20:01:33 +01:00
Roy van Rijn
2155286d7a
Initial implementation, using BufferedReader, parallel processing, combining everything in a single go, sorting afterwards (unoptimized) 2024-01-01 18:33:40 +01:00
Gunnar Morling
5e2657d809 README update 2024-01-01 14:39:46 +01:00
Nicolai Parlog
b3d23eb8b2 Sort weather stations 2023-12-29 21:24:02 +01:00
Nicolai Parlog
d53b3aac03 Don't reformat weather station comment 2023-12-29 18:30:42 +01:00
Gunnar Morling
3854500520 📈 More stations 2023-12-28 22:33:15 +01:00
Gunnar Morling
fd9a5f9799 📈 More stations 2023-12-28 22:12:43 +01:00
Gunnar Morling
28ed5722cf 🚧 Removing limit 2023-12-28 15:42:23 +01:00
Gunnar Morling
4226200b43 🏆 Initial import 2023-12-28 12:08:03 +01:00