© CCLRC
6.1 DL_POLY_3 Parallelisation
DL_POLY_3 is a distributed parallel molecular dynamics package based on the Domain Decomposition
parallelisation strategy [2, 8, 9, 4, 5]. This section briefly outlines the basic methodology.
Users wishing to add new features to DL_POLY_3 will need to be familiar with the underlying
techniques as they are described in the above references.
6.1.1 The Domain Decomposition Strategy
The Domain Decomposition (DD) strategy [2, 4] is one of several ways to achieve parallelisation
in MD. Its name derives from the division of the simulated system into equi-geometrical spatial
blocks or domains, each of which is allocated to a specific processor of a parallel computer. That
is, the arrays defining the atomic coordinates r_i, velocities v_i and forces f_i, for all N atoms
in the simulated system, are divided into sub-arrays of approximate size N/P, where P is the
number of processors, and allocated to specific processors. In DL_POLY_3 the domain allocation is
handled by the routine domains_module, and the approximate sizes of the various bookkeeping
arrays are decided in set_bounds. The division of the configuration data in this way is based on
the location of the atoms in the simulation cell; such a geometric allocation of system data is the
hallmark of DD algorithms. Note that for this strategy to work efficiently, the simulated system
must possess a reasonably uniform density, so that each processor is allocated an equal portion of
atom data (as far as possible). Through this approach the force computation and the integration of
the equations of motion are shared (reasonably) equally between processors and to a large extent
can be computed independently on each processor. The method is conceptually simple though tricky
to program and is particularly suited to large scale simulations, where efficiency is highest.
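
The geometric allocation described above can be illustrated with a minimal sketch. It is not taken
from the DL_POLY_3 source: the function names domain_of and decompose, the orthorhombic cell and
the regular grid of npx x npy x npz domains are all assumptions made for illustration only.

```python
def domain_of(position, cell, grid):
    """Return the (ix, iy, iz) index of the domain that owns an atom.

    position: (x, y, z) coordinates of the atom
    cell:     (Lx, Ly, Lz) edge lengths of an orthorhombic cell (assumption)
    grid:     (npx, npy, npz) number of domains along each axis (assumption)
    """
    index = []
    for x, length, n in zip(position, cell, grid):
        frac = (x / length) % 1.0                 # wrap into the periodic cell
        index.append(min(int(frac * n), n - 1))   # guard against rounding up to n
    return tuple(index)


def decompose(positions, cell, grid):
    """Group atom indices by owning domain (hypothetical helper)."""
    domains = {}
    for i, r in enumerate(positions):
        domains.setdefault(domain_of(r, cell, grid), []).append(i)
    return domains


if __name__ == "__main__":
    cell = (10.0, 10.0, 10.0)
    grid = (2, 2, 1)                              # P = 4 processors
    atoms = [(1.0, 1.0, 1.0), (6.0, 1.0, 1.0), (1.0, 6.0, 9.0), (9.9, 9.9, 0.0)]
    for dom, members in sorted(decompose(atoms, cell, grid).items()):
        print(dom, members)
```

With a uniform density each domain receives roughly N/P atoms; a strongly non-uniform system
would leave some of the lists much longer than others, which is the load imbalance the text warns
about.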
The DD strategy underpinning DL_POLY_3 is based on the link cell algorithm of Hockney and
Eastwood [43] as implemented by various authors (e.g. Pinches et al. [8] and Rapaport [9]).
This requires that the cutoff applied to the interatomic potentials is relatively short ranged. In
DL_POLY_3 the link-cell list is built by the routine link_cell_pairs. As with all DD algorithms,
there is a need for the processors to exchange 'halo data', which in the context of link-cells means
sending the contents of the link cells at the boundaries of each domain to the neighbouring
processors, so that each has all the information necessary to compute the pair forces acting on the
atoms belonging to its allotted domain. In DL_POLY_3 this is handled by the set_halo_particles
routine. Systems containing complex molecules present several difficulties.
They often contain ionic species, which usually require Ewald summation methods [16, 44], and
intramolecular interactions in addition to intermolecular forces. Intramolecular interactions are
handled in the same way as in DL_POLY_2, where each processor is allocated a subset of
intramolecular bonds to deal with. The allocation in this case is based on the atoms present in the
processor's domain. The SHAKE and RATTLE algorithms [38, 17] require significant modification.
Each processor must deal with the constraint bonds present in its own domain, but it must also
deal with bonds it effectively shares with its neighbouring processors. This requires each processor
to inform its neighbours whenever it updates the position of a shared atom during every SHAKE
(RATTLE_VV1) cycle (for RATTLE_VV2 it is the velocity that needs to be updated), so that all
relevant processors may incorporate this update into their own iterations. In the case of the DD
strategy the SHAKE (RATTLE) algorithm is simpler than for the Replicated Data method of DL_POLY_2,
where global updates of the atom positions (merging and splicing) are required [45]. The absence of
the merge requirement means that the DD tailored SHAKE and RATTLE are less communications
dependent and thus more efficient, particularly with large processor counts.
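
The halo exchange described earlier in this subsection can be sketched in a simplified form. The
code below is not the set_halo_particles routine: the name halo_exports and the per-atom distance
test are assumptions for illustration. A link-cell implementation would send whole boundary link
cells rather than test each atom, and would typically forward already received halo atoms in later
passes so that edge and corner neighbours are also covered.

```python
def halo_exports(positions, lo, hi, rcut):
    """List, per domain face, the local atoms a neighbouring domain would need.

    positions: list of (x, y, z) for the atoms owned by this domain
    lo, hi:    (x, y, z) lower and upper corners of the domain (assumption:
               an axis-aligned box)
    rcut:      short-ranged potential cutoff, assumed no larger than the
               domain width along any axis
    Returns a dict keyed by (axis, side), with side -1 for the lower face
    and +1 for the upper face.
    """
    exports = {(axis, side): [] for axis in range(3) for side in (-1, +1)}
    for i, r in enumerate(positions):
        for axis in range(3):
            if r[axis] - lo[axis] < rcut:     # within rcut of the lower face
                exports[(axis, -1)].append(i)
            if hi[axis] - r[axis] < rcut:     # within rcut of the upper face
                exports[(axis, +1)].append(i)
    return exports


if __name__ == "__main__":
    atoms = [(0.5, 2.5, 2.5), (2.5, 2.5, 2.5), (4.8, 4.9, 2.5)]
    result = halo_exports(atoms, lo=(0.0, 0.0, 0.0), hi=(5.0, 5.0, 5.0), rcut=1.0)
    for face, members in sorted(result.items()):
        print(face, members)
```

Only atoms near a face are exported, so the communication volume scales with the surface of a
domain rather than its volume, which is why a short-ranged cutoff matters for this strategy.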
The DD strategy is applied to complex molecular systems as follows: