Commit Graph

13 Commits

Author SHA1 Message Date
Jason Nochlin
eff73db9fe
evaluate2.sh: Check output of warmup run and abort early if failed (#333)
* refactor: replace xtrace with "print_and_execute" function

* nit: stylize error messages

* replace out_expected.txt with measurements_1B.out

* print

* prevent errors on cleanup

* run tests and check warmup run output before running benchmark

* move "git diff" pretty diff output to test.sh

* Ensure "set -e" is re-enabled if we followed a "continue" branch

* add timeouts to test.sh invocations

* use diff with tocsv.sh to show differences on failed test

* add --quiet mode to test.sh

* move prepare_$fork.sh invocation to right below hyperfine since test.sh also invokes it

* Revert "add --quiet mode to test.sh"

This reverts commit 13e9fb7f395c1bd64a62528b8349803bc1366941.

* use tee to capture test output to a temp file and print contents on failure

---------

Co-authored-by: Jason Nochlin <hundredwatt@users.noreply.github.com>
2024-01-13 12:19:29 +01:00
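
Note on the commit above: the idea of checking the warmup run before benchmarking can be sketched as follows. This is a minimal illustration, not the code from evaluate2.sh. The helper name `warmup_ok`, the temp files, and the sample text are all hypothetical.

```shell
# Hypothetical helper: run a command once, capture its stdout, and compare it
# with an expected-output file. Returns non-zero if the command fails or the
# output differs, so the caller can skip the (much longer) benchmark.
warmup_ok() {
  local expected="$1"; shift
  local actual status=0
  actual=$(mktemp)
  "$@" > "$actual" 2>/dev/null || status=$?
  if [ "$status" -ne 0 ]; then
    echo "warmup run exited with status $status" >&2
  elif ! diff -q "$expected" "$actual" > /dev/null; then
    echo "warmup output differs from expected output" >&2
    status=1
  fi
  rm -f "$actual"
  return "$status"
}

# Illustrative usage with stand-in commands instead of a real fork script.
expected=$(mktemp)
printf 'expected line\n' > "$expected"

good=0; warmup_ok "$expected" printf 'expected line\n' || good=$?
bad=0;  warmup_ok "$expected" printf 'something else\n' || bad=$?
rm -f "$expected"
echo "good=$good bad=$bad"
```

Collecting the status with `|| status=$?` keeps the check usable in a script that runs under `set -e`.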
Gunnar Morling
4a5eda70fd Logging 2024-01-11 09:31:36 +01:00
Jason Nochlin
007298b4d3
evaluate2.sh: Add Time Limit for Runs (#293)
* evaluate2.sh: Add Time Limit for Runs

* check that bc is installed

---------

Co-authored-by: Jason Nochlin <hundredwatt@users.noreply.github.com>
2024-01-11 09:25:58 +01:00
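
Note on the commit above: one common way to enforce a per-run time limit and to verify that a required tool is present is sketched below. It assumes GNU coreutils `timeout`, which exits with status 124 when the limit is hit. The commit itself may implement the limit differently. `require_tool` and `run_with_limit` are hypothetical names.

```shell
# Hypothetical helper: fail with a clear message if a tool is missing.
require_tool() {
  if ! command -v "$1" > /dev/null 2>&1; then
    echo "Error: $1 is not installed" >&2
    return 1
  fi
}

# Hypothetical helper: run a command under a time limit (in seconds).
# GNU timeout exits with status 124 when the limit is exceeded.
run_with_limit() {
  local limit="$1"; shift
  local status=0
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "run exceeded the ${limit}s time limit" >&2
  fi
  return "$status"
}

require_tool timeout || exit 1

fast=0; run_with_limit 5 true || fast=$?
slow=0; run_with_limit 1 sleep 3 || slow=$?
echo "fast=$fast slow=$slow"
```

The same `require_tool` pattern applies to any dependency, such as `bc`, that a script needs before it starts long-running work.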
Gunnar Morling
e084c36760 Leaderboard update, better failure output 2024-01-11 09:21:04 +01:00
Alexander Yastrebov
c9183a5aeb
Remove additional_build_steps_*.sh support (#301)
There is no need to have it as preparation steps could be fit into prepare_*.sh
2024-01-11 09:05:13 +01:00
Jason Nochlin
e45c338f0e more robust error message 2024-01-10 16:01:07 +01:00
Jason Nochlin
a8ebaf1a59 catch hyperfine command failed 2024-01-10 16:01:07 +01:00
Jason Nochlin
6ee7f2d0b0 remove debug line 2024-01-10 15:26:19 +01:00
Jason Nochlin
08d99c38e5 Validate that ./calculate_average_<fork>.sh exists for each fork 2024-01-10 15:26:19 +01:00
Jason Nochlin
7c81bfec70 grep returns exit code 1 when no match, || true prevents the script from exiting early 2024-01-10 15:26:19 +01:00
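
Note on the commit above: the interaction between `grep` and `set -e` can be reproduced with the small sketch below. The helper `count_matches` and the sample file are hypothetical and only illustrate the behavior the commit message describes.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: count the lines in a file that match a pattern.
count_matches() {
  local pattern="$1" file="$2"
  local matches
  # grep exits with status 1 when nothing matches. Without "|| true",
  # "set -e" would abort the whole script at this assignment.
  matches=$(grep -- "$pattern" "$file" || true)
  if [ -z "$matches" ]; then
    echo 0
  else
    printf '%s\n' "$matches" | wc -l | tr -d ' '
  fi
}

tmp=$(mktemp)
printf 'alpha\nbeta\n' > "$tmp"
found=$(count_matches 'alpha' "$tmp")
missing=$(count_matches 'gamma' "$tmp")
rm -f "$tmp"
echo "found=$found missing=$missing"
```

One trade-off: `|| true` also hides real grep errors (exit status 2, for example an unreadable file), so it fits best where "no match" is the only expected failure.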
Gunnar Morling
8fac59de42 #281 Trimming slowest/fastest run, not first/last in evaluate2.sh 2024-01-10 11:43:16 +01:00
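
Note on the commit above: trimming by value rather than by position means sorting the run times first and then dropping the smallest and largest. The sketch below uses `sort` and `awk` for illustration only. The actual script may do this with other tools, and `trimmed_mean` and the sample times are hypothetical.

```shell
# Hypothetical helper: mean of the run times after discarding the fastest
# and the slowest run, regardless of the order in which the runs happened.
trimmed_mean() {
  printf '%s\n' "$@" | LC_ALL=C sort -n | LC_ALL=C awk '
    { t[NR] = $1 }
    END {
      if (NR < 3) exit 1            # nothing left after trimming
      for (i = 2; i < NR; i++) sum += t[i]
      printf "%.2f\n", sum / (NR - 2)
    }'
}

# The slow outlier (9.50) is the third run, not the first or last one,
# so positional trimming would have kept it in the average.
result=$(trimmed_mean 4.10 4.00 9.50 4.20 3.90)
echo "$result"
```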
Jason Nochlin
c92c88b9fb
evaluate2.sh improvements - leaderboard, default SDK
* reset the JDK to the default (21.0.1-open) when no prepare script is provided

* leaderboard improvements - sorting and content

* run sdk install once at the beginning of the script for all the SDKs detected in any of the evaluated prepare scripts

* remove unnecessary code and tweak doc comments

* one more nit

* Don't print rankings values when only 1 fork is being evaluated

* It's been a few hours, so I now have some more rate limit :)

---------

Co-authored-by: Jason Nochlin <hundredwatt@users.noreply.github.com>
2024-01-10 10:05:16 +01:00
Jason Nochlin
42e5ca1435
Use hyperfine and jq to improve evaluate.sh
* create new version of evaluate.sh using hyperfine + jq

* output the raw times for each command

* nit: s/command/fork/

* update evaluate2.sh for new fork file structure

* review changes

* use numactl on linux

* 1 warmup

* verify output

* leaderboard

* do not early exit on hyperfine error

* check if SMT and turbo boost are disabled

* fix bug

---------

Co-authored-by: Jason Nochlin <hundredwatt@users.noreply.github.com>
2024-01-09 20:51:59 +01:00