Changing benchmark_scale to 5

Rather than chasing individual tests, I think we simply have to average more than 3 trials per test.
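
To illustrate why more trials should help, here is a rough sketch of how a single noisy trial moves the average less as the trial count goes up. All names here (`average_trials`, `replay`, the `trials` parameter) and the timings are made up for illustration; this is not the actual benchmark harness, and it does not assume anything about how `benchmark_scale` is wired in.

```python
import statistics
from typing import Callable, Iterable


def average_trials(run_once: Callable[[], float], trials: int) -> float:
    """Call run_once `trials` times and return the mean measurement."""
    if trials < 1:
        raise ValueError("trials must be at least 1")
    samples = [run_once() for _ in range(trials)]
    return statistics.fmean(samples)


def replay(values: Iterable[float]) -> Callable[[], float]:
    """Hypothetical stand-in for a benchmark run: replays canned timings."""
    it = iter(values)
    return lambda: next(it)


# Made-up timings: four quiet runs at 10.0 and one noisy run at 13.0.
timings = [10.0, 10.0, 13.0, 10.0, 10.0]

mean_3 = average_trials(replay(timings), trials=3)  # noisy run is 1 of 3 samples
mean_5 = average_trials(replay(timings), trials=5)  # noisy run is 1 of 5 samples
```

With these made-up numbers the outlier shifts the 3-trial mean by 1.0 and the 5-trial mean by 0.6, so the same amount of noise is less likely to flip an individual test.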