If you have not already done so, you must collect performance data for omptest and open the first experiment, omptest.1.er. See Collecting Data for the omptest Example for instructions.
This section compares the performance of two routines, critsum_() and redsum_(), which use the CRITICAL directive and the REDUCTION clause, respectively. Both routines parallelize the same pair of assignment statements, nested inside a pair of do loops, whose purpose is to sum the contents of three two-dimensional arrays:
    t = (a(j,i)+b(j,i)+c(j,i))/k
    sum = sum+t
The inclusive user CPU time for critsum_() is very large because critsum_() uses a critical section parallelization strategy. Although the summing operation is spread over all four CPUs, only one CPU at a time is allowed to add its value of t to sum, so the updates to sum are effectively serialized. This is not an efficient parallelization strategy for this kind of coding construct.
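The critical section strategy can be illustrated with a short sketch. The source shows only the two assignment statements, not the full Fortran routine, so the following is a C analogue written under assumptions: the function name critsum_sketch, its parameters, and the flat n-by-n array layout are all hypothetical and are not taken from omptest.

```c
#include <assert.h>

/* Hypothetical C analogue of the critical section strategy.
   a, b, and c are n-by-n arrays stored as flat buffers; the inner
   index j varies fastest, mirroring a(j,i) in the Fortran source. */
double critsum_sketch(long n, const double *a, const double *b,
                      const double *c, double k)
{
    assert(n >= 0 && k != 0.0);
    double sum = 0.0;   /* shared by all threads */

    /* Iterations of the outer loop are divided among the threads. */
    #pragma omp parallel for
    for (long i = 0; i < n; i++) {
        for (long j = 0; j < n; j++) {
            /* t is private to each thread and is computed in parallel. */
            double t = (a[i * n + j] + b[i * n + j] + c[i * n + j]) / k;

            /* Only one thread at a time may execute this update, so
               every single addition to sum is serialized. */
            #pragma omp critical
            sum = sum + t;
        }
    }
    return sum;
}
```

With GCC, the pragmas take effect only when the code is compiled with an OpenMP option such as -fopenmp. Without it they are ignored and the loop runs serially, producing the same sum.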
The inclusive user CPU time for redsum_() is much smaller than for critsum_(). This is because redsum_() uses a reduction strategy to evaluate the sum. In this strategy, each processor accumulates its own partial sum in parallel, and the partial sums are then added sequentially to sum. This strategy makes much more efficient use of the available CPUs because it minimizes the need for sequential access to sum.
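The reduction strategy can be sketched in the same way. As before, this is a C analogue under assumptions, not the omptest source: the function name redsum_sketch, its parameters, and the flat n-by-n array layout are hypothetical.

```c
#include <assert.h>

/* Hypothetical C analogue of the reduction strategy.
   a, b, and c are n-by-n arrays stored as flat buffers; the inner
   index j varies fastest, mirroring a(j,i) in the Fortran source. */
double redsum_sketch(long n, const double *a, const double *b,
                     const double *c, double k)
{
    assert(n >= 0 && k != 0.0);
    double sum = 0.0;

    /* reduction(+:sum) gives each thread a private copy of sum,
       initialized to zero. The private copies are combined into the
       shared sum once, when the loop finishes. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++) {
        for (long j = 0; j < n; j++) {
            double t = (a[i * n + j] + b[i * n + j] + c[i * n + j]) / k;
            sum = sum + t;   /* updates the thread's private partial sum */
        }
    }
    return sum;
}
```

The loop body is the same pair of statements as in the critical section approach. The only difference is where synchronization happens: once per thread at the end of the loop, rather than once per array element. Because the partial sums are combined in an unspecified order, floating-point results can differ slightly from a serial sum when the values are not exactly representable.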