Playground/1brc - 1brc - Gitea: Git with a cup of tea

Author SHA1 Message Date


tivrfoa

f4a0039a59

Try more chunks than threads, and of different sizes (#644)

/**
 * Solution based on thomaswue's solution, commit:
 * commit d0a28599c2
 * Author: Thomas Wuerthinger
 * Date:   Sun Jan 21 20:13:48 2024 +0100
 *
 * The goal here was to improve on the runtime of his 10k
 * solution: 00:04.516
 *
 * With Thomas's latest changes, his time is probably much better
 * already, and maybe even 1st place for the 10k too.
 * See: https://github.com/gunnarmorling/1brc/pull/606
 *
 * But as I was already coding something, I'll submit it just to
 * see if it is faster than his *previous* 10k time of
 * 00:04.516
 *
 * Changes:
 *   The idea is similar to my previous solution: if you split
 * the file into even chunks, some threads might finish much earlier
 * and sit idle, so:
 *   1) Create more chunks than threads, so the threads that finish
 * first can pick up more work;
 *   2) Decrease chunk sizes as we get closer to the end of the file.
 */
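The two changes listed in this commit message could be sketched roughly as below. This is only an illustration, not the code from the commit: the class `ChunkPlanner`, the `minChunk` parameter, and the rule "each chunk takes `remaining / (2 * threads)` bytes" are all assumptions chosen to show chunk sizes shrinking toward the end of the file. A real solution would also move each boundary forward to the next newline, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkPlanner {

    /** Half-open byte range [start, end) of the input file. */
    public record Chunk(long start, long end) {
        public long size() {
            return end - start;
        }
    }

    /**
     * Splits [0, fileSize) into contiguous chunks whose sizes never grow.
     * Hypothetical rule: each chunk takes remaining / (2 * threads) bytes,
     * but at least minChunk; the last chunk takes whatever is left.
     */
    public static List<Chunk> plan(long fileSize, int threads, long minChunk) {
        if (fileSize < 0 || threads <= 0 || minChunk <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0; threads and minChunk must be > 0");
        }
        List<Chunk> chunks = new ArrayList<>();
        long start = 0;
        while (start < fileSize) {
            long remaining = fileSize - start;
            // Early chunks are large; later ones shrink as 'remaining' shrinks.
            long size = Math.max(minChunk, remaining / (2L * threads));
            long end = Math.min(fileSize, start + size);
            chunks.add(new Chunk(start, end));
            start = end;
        }
        return chunks;
    }
}
```

Worker threads would then pull chunks from this list in order (for example through a shared `AtomicInteger` index), so a thread that finishes early picks up the next, smaller chunk instead of sitting idle.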

2024-01-29 21:24:04 +01:00

tivrfoa

d9604d9258

Use LinkedBlockingQueue to process results - based on thomaswue (#603)

/**
 * Solution based on thomaswue's solution, commit:
 * commit d0a28599c2
 * Author: Thomas Wuerthinger <thomas.wuerthinger@oracle.com>
 * Date:   Sun Jan 21 20:13:48 2024 +0100
 *
 * Changes:
 *   1) Use a LinkedBlockingQueue to store partial results, which
 *   are then merged into the final map.
 *   As different chunks finish at different times, this allows
 *   processing them as they finish, instead of joining the
 *   threads sequentially.
 *     This change seems more useful for the 10k dataset, as the
 *   difference in runtime between chunks is greater.
 *   2) Use only 4 threads if the file is >= 14GB.
 *   This showed much better results in my local tests, but I only
 *   ran with 200 million rows (because of limited RAM), and I have
 *   no idea how it will perform on the 1brc hardware.
 */
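Change 1 in this commit message can be illustrated with a minimal sketch, assuming each worker produces a partial map and hands it to the main thread through a `java.util.concurrent.LinkedBlockingQueue`. The names `QueueMerge`, `Stats`, `processChunk`, and `aggregate` are hypothetical, and chunks are modelled as in-memory lists of `station;temperature` lines rather than memory-mapped file segments as in the actual solution.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueMerge {

    /** Running aggregate for one station; temperatures are kept in tenths of a degree. */
    public static final class Stats {
        private int min = Integer.MAX_VALUE;
        private int max = Integer.MIN_VALUE;
        private long sum;
        private long count;

        void add(int v) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
            count++;
        }

        void merge(Stats other) {
            min = Math.min(min, other.min);
            max = Math.max(max, other.max);
            sum += other.sum;
            count += other.count;
        }

        public int min() { return min; }
        public int max() { return max; }
        public long sum() { return sum; }
        public long count() { return count; }
    }

    /** Aggregates one chunk into its own map; no state is shared between workers. */
    static Map<String, Stats> processChunk(List<String> lines) {
        Map<String, Stats> partial = new HashMap<>();
        for (String line : lines) {
            int sep = line.indexOf(';');
            String station = line.substring(0, sep);
            int tenths = (int) Math.round(Double.parseDouble(line.substring(sep + 1)) * 10);
            partial.computeIfAbsent(station, k -> new Stats()).add(tenths);
        }
        return partial;
    }

    /** Starts one worker per chunk and merges partial maps in completion order. */
    public static Map<String, Stats> aggregate(List<List<String>> chunks) {
        BlockingQueue<Map<String, Stats>> results = new LinkedBlockingQueue<>();
        for (List<String> chunk : chunks) {
            new Thread(() -> results.add(processChunk(chunk))).start();
        }
        Map<String, Stats> merged = new TreeMap<>();
        try {
            // take() blocks until some worker has finished, whichever one it is.
            // Error handling is omitted: a worker that throws would leave this loop waiting.
            for (int i = 0; i < chunks.size(); i++) {
                results.take().forEach((station, stats) ->
                        merged.merge(station, stats, (a, b) -> { a.merge(b); return a; }));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while merging partial results", e);
        }
        return merged;
    }
}
```

Compared with calling `join()` on the threads one by one, the main thread here never waits on a slow worker while finished results from other workers are already available to merge.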

2024-01-27 19:41:00 +01:00

2 Commits