The first problem was defined over a square with one boundary at 100° and the others at 0°. The inital state was essentially a linear interpolation across the grid. SOR iteration was used until the residual (step size per iteration) fell below a threshold. The temperature field was then loaded into Matlab where it could be compared with the Fourier series solution. This was done by taking the error = T_fourier - T_sor and hence finding the mean squared error (element-wise).
The following figure shows the temperature field on a 256x256 grid after running SOR to reasonable convergence: a mean squared error of 0.014 after 17500 iterations.
Matlab was used to visualise the error surface, eg the following 64x64 grid. Clearly this is dominated by oscillations on the hot edge, which is a truncation error in the Fourier solution (the Gibbs phenomenon).
The convergence process on a 100x100 grid is shown in the following animation. Each frame is a snapshot of the error surface with increasing iterations. It can be seen that the higher-frequency error near the hot boundary is reduced faster than the lower-frequency error at the cold end.
Residuals were computed during the computation as simply the relaxation step size: the sum of absolute values of grid cell changes. This gives an indication of the error, and was used as one stopping criterium. By comparing residuals with error for various grid sizes, it seems that the same residual corresponds to a similar mean squared error over several grid sizes:
| grid | iterations | final residual | mean squared error | CPU time (secs) |
|---|---|---|---|---|
| 64x64 | 1,040 | 10-8 | 0.021 | 0 |
| 256x256 | 17,500 | 10-8 | 0.014 | 32 |
| 512x512 | 70,300 | 10-8 | 0.025 | 504 (8:24) |
These errors are within the truncation error of the Fourier series solution using 150 terms. This was verified by observing that the SOR solution was closer as the number of terms was increased.
The three dimensional problem was directly analogous: a cube with one face at 100° and the others at 0°. The initial condition was again a linear ramping between the boundaries, forming a layered pyramid, as in the following slice plot.
The temperature volume was visualised using slices and isosurfaces (surfaces where the temperature is constant). These can also be combined effectively as in the following example. On a 64x64x64 grid, a slice was taken perpendicular to the hot boundary, through the midpoint of that face. Translucent isosurfaces were then overlaid for 20° and 80°, and viewed in profile.
Residuals were again compared with the error (derived from the Fourier solution) for various grid sizes. Grids of length greater than 128 were resampled down to that level for calculating the error.
| grid | iterations | final residual | mean squared error | CPU time (secs) |
|---|---|---|---|---|
| 64x64x64 | 1,160 | 10-8 | 0.040 | 14 |
| 128x128x128 | 5,170 | 10-8 | 0.071 | 485 (8:05) |
| 256x256x256 | 10,000 | 3.5 * 10-2 | 1.67 | 9120 (2:32:) |
| 512x512x512 | 1,000 | 2.6 * 10+5 | 6840 (1:54:) |
A graph of residuals changing with number of iterations is shown below, for several 3D grid sizes: Red = 2563, Green = 1283, Blue = 643. The exponential decay of residuals is apparent, but the time scale increases dramatically with grid size. The early peak in residuals is due to ramping up of the over-relaxation coefficient -- this increases step sizes by definition, but leads to faster convergence.
We would like to know the structure of error in the 3D solution. This slice plot shows error on a 20x20x20 grid after 50 iterations. The dominant values are adjacent to the hot boundary, probably because the initial error there was large.

The same error was visualised with an isosurface representing 0 error. On one side there is an underestimation (towards centre of hot face), and on the other side the temperature is overestimated. It also appears that two other faces have a change of error sign, though this is probably trivial.
Red-black updating was used for a consistent parallel scheme. One dimension was partitioned evenly between the processors, and the boundaries of each portion were exchanged with the adjacent nodes twice on each time step (after red update, and black update). The speedups on a 512
| CPUs | parallel time | serial time | speedup |
|---|---|---|---|
| 4 | 0:31:33 | 01:54:00 | 3.61 |
| 8 | 0:17:13 | 01:54:00 | 6.62 |
Fortran 90 serial code:
| sor.f90 | Main program. |
| iterate.f90 | Code for doing 2D and 3D SOR. |
| sorglobals.f90 | Variable declarations. |
| sorinfin.f90 | Configuration setup, initial conditions, output. |
Fortran 90 parallel (MPI) code:
| mpi_sor.f90 | Main program. |
| mpi_iterate.f90 | 2D and 3D SOR code with distributed workload, using red-black updating. |
| mpi_sorglobals.f90 | Variable declarations. |
| mpi_sorinfin.f90 | Configuration setup, initial conditions, output. |