This is a continuation of my previous post, and I recommend reading that one first. In the last post, we stopped after parallelizing the for-loop that was the performance bottleneck, and we got a decent performance boost thanks to OpenMP. With all that good stuff, we might have…