A Look at Parallel Computers
HERA-B Bologna
Parallel Machine = an assembly of processors and memory modules
communicating and cooperating to solve a problem.
Parallel-processing Architectures
- SMP (Symmetric Multiprocessing).
Design also known as "tightly coupled" or "shared everything".
- Multiple processors share RAM and system bus
[Cray CS6400 Enterprise Server (up to 64 CPUs), SGI Power Challenge (up to 18
CPUs), HP T-Class Servers (up to 14 CPUs), HP K-Class Servers (up to 4 CPUs),
HP D-Class Servers (up to 2 CPUs), HP J-Class Workstations (up to 2 CPUs)]
PROS
- Simple system and application programming.
- The single memory space lets a threaded OS distribute its tasks among
the various processors.
- A High-Level Optimizer (HLO) may easily produce parallel code from
"simple" source code, without message-passing instructions.
CONS
- As more processors are added, memory traffic increases until the bus
saturates (some bus traffic may, however, be reduced by adding cache
memory to every processor).
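The SMP programming model above can be sketched in a few lines: several threads (stand-ins for processors) cooperate through one shared memory space, with no message passing. This is an illustrative sketch, not code from the text; the names `parallel_sum` and `worker` are invented for the example.

```python
import threading

def parallel_sum(data, n_threads=4):
    """Sum `data` by splitting it among threads that share one memory space."""
    partial_sums = [0] * n_threads            # shared memory, visible to all threads
    chunk = (len(data) + n_threads - 1) // n_threads

    def worker(i):
        # Each thread writes its result directly into the shared list:
        # no message-passing instructions are needed.
        partial_sums[i] = sum(data[i * chunk:(i + 1) * chunk])

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial_sums)
```

Note how the source code contains only ordinary reads and writes; this is the "simple" style that a High-Level Optimizer can parallelize automatically on an SMP machine.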
- MPP (Massively Parallel Processing).
Design also known as "loosely coupled" or "shared nothing".
- RAM is distributed among the processors
[IBM SP2 RS/6000 Scalable Powerparallel System (up to 512 CPUs), APE,
ASCI]
PROS
- Avoids the memory-bus bottleneck. MPP systems reduce bus traffic because
each section of memory sees only those accesses that are bound for it, rather
than every memory access, as in SMP systems.
- Allows very large parallelization. The U.S. Department of Energy ASCI
(Accelerated Strategic Computing Initiative) uses more than 9000 Intel
Pentium Pro (P6) CPUs.
CONS
- To access the memory outside its own RAM, a processor must use a
message-passing scheme, analogous to network packets.
- Programming is more difficult: message-passing commands must be inserted
into the source code.
- The source code may become hardware-dependent; the use of the public-domain
PVM (Parallel Virtual Machine) message-passing package and the development of
a standard Message Passing Interface (MPI) may, however, solve this problem.
- Time delays may occur when data migrates from one processor to a distant one.
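The message-passing scheme described above can be sketched with two processes that own separate memory and exchange data over a pipe, much like network packets. This POSIX-only sketch is illustrative, not code from the text; the pipe messages stand in for MPI_Send/MPI_Recv calls, and the name `distributed_sum` is invented for the example.

```python
import json
import os

def distributed_sum(data):
    """Sum `data` on two "nodes" that communicate only by messages."""
    half = len(data) // 2
    down_r, down_w = os.pipe()   # parent -> child: the work to do
    up_r, up_w = os.pipe()       # child -> parent: the partial result

    if os.fork() == 0:           # the "remote node" with its own RAM
        os.close(down_w)
        os.close(up_r)
        chunk = json.loads(os.read(down_r, 65536))        # message in
        os.write(up_w, json.dumps(sum(chunk)).encode())   # message out
        os._exit(0)

    os.close(down_r)
    os.close(up_w)
    os.write(down_w, json.dumps(data[half:]).encode())    # ship half the data
    os.close(down_w)
    local = sum(data[:half])     # the local half needs no messages at all
    remote = json.loads(os.read(up_r, 64))
    os.wait()
    return local + remote
```

The explicit send/receive calls are exactly the extra burden the CONS list refers to: the algorithm itself is trivial, but the data movement must be spelled out by the programmer.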
- SPP (Scalable Parallel Processing).
- Uses a two-tier memory hierarchy. The first memory tier consists
of a hypernode, which is, essentially, an SMP system, complete with multiple
processors and their globally shared memory. Large SPP systems are built by
interconnecting several hypernodes via a second memory tier, so that this tier
appears logically as one global shared-memory space to the nodes.
[Convex SPP 1200 CD (up to 16 CPUs), Convex SPP 1200 XA (up to 128 CPUs); for
both machines, several systems may be tied together to obtain more processing
power; furthermore, you still have one unified memory space, even though the
RAM is physically located in different machines]
PROS
- Reduced bus traffic: only the updates needed to keep memory coherent among
the nodes occur.
- Easy-to-program SMP programming model.
- Scalability similar to that of an MPP design.
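The two-tier idea can be sketched by combining the two previous models: within a "hypernode" (one process), threads cooperate through shared memory as in SMP; between hypernodes, partial results travel as messages over pipes as in MPP. This POSIX-only sketch is illustrative, not code from the text; the names `hypernode_sum` and `spp_sum` are invented for the example.

```python
import json
import os
import threading

def hypernode_sum(data, n_threads=2):
    """First tier: threads share the `partials` list in one memory space."""
    partials = [0] * n_threads
    chunk = (len(data) + n_threads - 1) // n_threads

    def worker(i):
        partials[i] = sum(data[i * chunk:(i + 1) * chunk])

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials)

def spp_sum(data, n_nodes=2):
    """Second tier: one pipe per hypernode carries its partial sum back."""
    chunk = (len(data) + n_nodes - 1) // n_nodes
    readers = []
    for i in range(n_nodes):
        r, w = os.pipe()
        if os.fork() == 0:                       # a hypernode
            os.close(r)
            part = hypernode_sum(data[i * chunk:(i + 1) * chunk])
            os.write(w, json.dumps(part).encode())
            os._exit(0)
        os.close(w)
        readers.append(r)
    total = sum(json.loads(os.read(r, 64)) for r in readers)
    for _ in range(n_nodes):
        os.wait()
    return total
```

Only the small per-node totals cross the second tier, mirroring the PROS above: inter-node traffic is limited to what is needed to combine (or keep coherent) the nodes' results, while the bulk of the work uses the easy shared-memory model inside each node.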
February 22, 1996 Domenico Galli