I was just recently going through some old episodes from Software Engineering Radio when I came across this one episode featuring Casey Muratori, where he goes through some of his thoughts around his video from February 2023, titled "‘Clean’ Code, Horrible Performance". I was actually already aware of the video by this time, but listening through the episode gave me an itch to see these concepts in my reality, experiment them by myself.
so if (somehow) the accumulator was an integer, this loop would autovectorize and the performance differences would be smaller ?
Very likely yes