Introduction to Programming by MPI for Parallel FEM
Report S1 & S2 in Fortran
Kengo Nakajima
Programming for Parallel Computing (616-2057)
Seminar on Advanced Computing (616-4009)
Motivation for Parallel Computing (and this class)
• Large-scale parallel computer enables fast computing in large-scale scientific simulations with detailed models. Computational science develops new frontiers of science and engineering.
• Why parallel computing ?
– faster & larger
– “larger” is more important from the viewpoint of “new frontiers of science & engineering”, but “faster” is also important.
– + more complicated
– Ideal: Scalable
• Solving Nx scale problem using Nx computational resources during same computation time.
MPI Programming
Overview
• What is MPI ?
• Your First MPI Program: Hello World
• Global/Local Data
• Collective Communication
• Peer-to-Peer Communication
What is MPI ? (1/2)
• Message Passing Interface
• “Specification” of message passing API for distributed memory environment
– Not a program, Not a library
• http://phase.hpcc.jp/phase/mpi-j/ml/mpi-j-html/contents.html
• History
– 1992 MPI Forum
– 1994 MPI-1
– 1997 MPI-2: MPI I/O
– 2012 MPI-3: Fault Resilience, Asynchronous Collective
• Implementation
– mpich ANL (Argonne National Laboratory), OpenMPI, MVAPICH
– H/W vendors
– C/C++, Fortran, Java ; Unix, Linux, Windows, Mac OS
What is MPI ? (2/2)
• “mpich” (free) is widely used
– supports MPI-2 spec. (partially)
– MPICH2 after Nov. 2005.
– http://www-unix.mcs.anl.gov/mpi/
• Why MPI is widely used as de facto standard ?
– Uniform interface through MPI forum
• Portable, can work on any types of computers
• Can be called from Fortran, C, etc.
– mpich
• free, supports every architecture
• PVM (Parallel Virtual Machine) was also proposed in early 90’s but not so widely used as MPI
References
• W. Gropp et al., Using MPI, second edition, MIT Press, 1999.
• M.J. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill, 2003.
• W. Gropp et al., MPI: The Complete Reference Vol. I, II, MIT Press, 1998.
• http://www-unix.mcs.anl.gov/mpi/www/
– API (Application Interface) of MPI
How to learn MPI (1/2)
• Grammar
– 10-20 functions of MPI-1 will be taught in the class
• although there are many convenient capabilities in MPI-2
– If you need further information, you can find information from web, books, and MPI experts.
• Practice is important
– Programming
– “Running the codes” is the most important
• Be familiar with or “grab” the idea of SPMD/SIMD op’s
– Single Program/Instruction Multiple Data
– Each process does same operation for different data
• Large-scale data is decomposed, and each part is computed by each process
– Global/Local Data, Global/Local Numbering
SPMD
mpirun -np M <Program>
(Figure: PE #0, PE #1, PE #2, ..., PE #M-1; every PE runs the same Program on its own Data #0, Data #1, Data #2, ..., Data #M-1)
You understand 90% MPI, if you understand this figure.
PE: Processing Element (Processor, Domain, Process)
Each process does same operation for different data.
Large-scale data is decomposed, and each part is computed by each process.
It is ideal that parallel program is not different from serial one except communication.
Some Technical Terms
• Processor, Core
– Processing Unit (H/W), Processor=Core for single-core proc’s
• Process
– Unit for MPI computation, nearly equal to “core”
– Each core (or processor) can host multiple processes (but not efficient)
• PE (Processing Element)
– PE originally meant “processor”, but it is sometimes used as “process” in this class. Moreover it means “domain” (next)
• In multicore proc’s: PE generally means “core”
• Domain
– domain=process (=PE), each of “MD” in “SPMD”, each data set
• Process ID of MPI (ID of PE, ID of domain) starts from “0”
– if you have 8 processes (PE’s, domains), ID is 0~7
How to learn MPI (2/2)
• NOT so difficult.
• Therefore, 2-3 lectures are enough for just learning grammar of MPI.
• Grab the idea of SPMD !
Schedule
• MPI
– Basic Functions
– Collective Communication
– Point-to-Point (or Peer-to-Peer) Communication
• 90 min. x 4-5 lectures
– Collective Communication
• Report S1
– Point-to-Point/Peer-to-Peer Communication
• Report S2: Parallelization of 1D code
– At this point, you are almost an expert of MPI programming.
Login to Oakleaf-FX
ssh t85**@oakleaf-fx.cc.u-tokyo.ac.jp
Create directory
>$ cd
>$ mkdir fem2 (your favorite name)
>$ cd fem2
In this class this top-directory is called <$O-fem2>.
Files are copied to this directory.
Under this directory, S1, S2, S1-ref are created:
<$O-S1> = <$O-fem2>/mpi/S1
<$O-S2> = <$O-fem2>/mpi/S2
(Figure labels: Oakleaf-FX, ECCS2012)
Copying files on Oakleaf-FX
Copy
>$ cd <$O-fem2>
>$ cp /home/z30088/fem2/F/s1.tar .
>$ tar xvf s1.tar
Confirm directory
>$ ls
mpi
>$ cd mpi/S1
This directory is called <$O-S1>.
<$O-S1> = <$O-fem2>/mpi/S1
First Example
hello.f
      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      integer :: PETOT, my_rank, ierr
      call MPI_INIT (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
      write (*,'(a,2i8)') 'Hello World FORTRAN', my_rank, PETOT
      call MPI_FINALIZE (ierr)
      stop
      end
hello.c
#include "mpi.h"
#include <stdio.h>
int main(int argc, char **argv)
{
    int n, myid, numprocs, i;
    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD,&myid);
    printf ("Hello World %d\n", myid);
    MPI_Finalize();
}
Compiling hello.f/c
>$ cd <$O-S1>
>$ mpifrtpx -Kfast hello.f
>$ mpifccpx -Kfast hello.c
FORTRAN
$> mpifrtpx -Kfast hello.f
“mpifrtpx”: the compiler, libraries, etc. required for compiling a FORTRAN90+MPI program are bound to this command.
C
$> mpifccpx -Kfast hello.c
“mpifccpx”: the compiler, libraries, etc. required for compiling a C+MPI program are bound to this command.
Running Job
• Batch Jobs
– Only batch jobs are allowed.
– Interactive executions of jobs are not allowed.
• How to run
– writing job script
– submitting job
– checking job status
– checking results
• Utilization of computational resources
– 1-node (16 cores) is occupied by each job.
– Your node is not shared by other jobs.
Job Script
• <$O-S1>/hello.sh
• Scheduling + Shell Script
#!/bin/sh
#PJM -L "node=1"              Number of Nodes
#PJM -L "elapse=00:10:00"     Computation Time
#PJM -L "rscgrp=lecture5"     Name of “QUEUE”
#PJM -g "gt85"                Group Name (Wallet)
#PJM -j
#PJM -o "hello.lst"           Standard Output
#PJM --mpi "proc=4"           MPI Process #
mpiexec ./a.out               Execs
Settings for other numbers of processes:
8 processes      "node=1"     "proc=8"
16 processes     "node=1"     "proc=16"
32 processes     "node=2"     "proc=32"
64 processes     "node=4"     "proc=64"
192 processes    "node=12"    "proc=192"
Submitting Jobs
>$ cd <$O-S1>
>$ pjsub hello.sh
>$ cat hello.lst
Hello World 0
Hello World 3
Hello World 2
Hello World 1
Available QUEUE’s
• Following 2 queues are available.
• 1 Tofu (12 nodes) can be used
– lecture
• 12 nodes (192 cores), 15 min., valid until the end of October, 2014
• Shared by all “educational” users
– lecture2
• 12 nodes (192 cores), 15 min., active during class time
• More jobs (compared to lecture) can be processed upon availability.
Tofu Interconnect
• Node Group
– 12 nodes
– A-/C-axis: 4 nodes in system board, B-axis: 3 boards
• 6D: (X,Y,Z,A,B,C)
– ABC 3D Mesh: in each node group: 2×2×3
– XYZ 3D Mesh: connection of node groups: 10×5×8
• Job submission according to network topology is possible:
– Information about used “XYZ” is available after execution.
Submitting & Checking Jobs
• Submitting Jobs                 pjsub SCRIPT NAME
• Checking status of jobs         pjstat
• Deleting/aborting               pjdel JOB ID
• Checking status of queues       pjstat --rsc
• Detailed info. of queues        pjstat --rsc -x
• Number of running jobs          pjstat --rsc -b
• Limitation of submission        pjstat --limit
[z30088@oakleaf-fx-6 S2-ref]$ pjstat
Oakleaf-FX scheduled stop time: 2012/09/28(Fri) 09:00:00 (Remain: 31days 20:01:46)
JOB_ID JOB_NAME STATUS PROJECT RSCGROUP START_DATE ELAPSE TOKEN NODE:COORD
334730 go.sh RUNNING gt61 lecture 08/27 12:58:08 00:00:05 0.0 1
Basic/Essential Functions
      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      integer :: PETOT, my_rank, ierr
      call MPI_INIT (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
      write (*,'(a,2i8)') 'Hello World FORTRAN', my_rank, PETOT
      call MPI_FINALIZE (ierr)
      stop
      end
#include "mpi.h"
#include <stdio.h>
int main(int argc, char **argv)
{
    int n, myid, numprocs, i;
    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD,&myid);
    printf ("Hello World %d\n", myid);
    MPI_Finalize();
}
'mpif.h', "mpi.h"    Essential Include file (“use mpi” is possible in F90)
MPI_Init             Initialization
MPI_Comm_size        Number of MPI Processes (mpirun -np XX <prog>)
MPI_Comm_rank        Process ID starting from 0
MPI_Finalize         Termination of MPI processes
Difference between FORTRAN/C
• (Basically) same interface
– In C, UPPER/lower cases are considered as different
• e.g.: MPI_Comm_size
– MPI: UPPER case
– First character of the function except “MPI_” is in UPPER case.
– Other characters are in lower case.
• In Fortran, return value ierr has to be added at the end of the argument list.
• C needs special types for variables:
– MPI_Comm, MPI_Datatype, MPI_Op etc.
• MPI_INIT is different:
– call MPI_INIT (ierr)
– MPI_Init (int *argc, char ***argv)
What is going on ?
      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      integer :: PETOT, my_rank, ierr
      call MPI_INIT (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
      write (*,'(a,2i8)') 'Hello World FORTRAN', my_rank, PETOT
      call MPI_FINALIZE (ierr)
      stop
      end
#!/bin/sh
#PJM -L "node=1"              Number of Nodes
#PJM -L "elapse=00:10:00"     Computation Time
#PJM -L "rscgrp=lecture"      Name of “QUEUE”
#PJM -g "gt64"                Group Name (Wallet)
#PJM -j
#PJM -o "hello.lst"           Standard Output
#PJM --mpi "proc=4"           MPI Process #
mpiexec ./a.out               Execs
• mpiexec starts up 4 MPI processes ("proc=4")
– A single program runs on four processes.
– each process writes a value of myid (my_rank in the Fortran code)
• Four processes do same operations, but values of myid are different.
• Output of each process is different.
• That is SPMD !
mpi.h, mpif.h
      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      integer :: PETOT, my_rank, ierr
      call MPI_INIT (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
      write (*,'(a,2i8)') 'Hello World FORTRAN', my_rank, PETOT
      call MPI_FINALIZE (ierr)
      stop
      end
#include "mpi.h"
#include <stdio.h>
int main(int argc, char **argv)
{
    int n, myid, numprocs, i;
    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD,&myid);
    printf ("Hello World %d\n", myid);
    MPI_Finalize();
}
• Various types of parameters and variables for MPI & their initial values.
• Name of each var. starts from “MPI_”
• Values of these parameters and variables cannot be changed by users.
• Users do not specify variables starting from “MPI_” in users’ programs.
MPI_INIT
• Initialize the MPI execution environment (required)
• It is recommended to put this BEFORE all statements in the program.
• call MPI_INIT (ierr)
– ierr    I    O    Completion Code
      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      integer :: PETOT, my_rank, ierr
      call MPI_INIT (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
      write (*,'(a,2i8)') 'Hello World FORTRAN', my_rank, PETOT
      call MPI_FINALIZE (ierr)
      stop
      end
MPI_FINALIZE
• Terminates MPI execution environment (required)
• It is recommended to put this AFTER all statements in the program.
• Please do not forget this.
• call MPI_FINALIZE (ierr)
– ierr    I    O    completion code
      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      integer :: PETOT, my_rank, ierr
      call MPI_INIT (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
      write (*,'(a,2i8)') 'Hello World FORTRAN', my_rank, PETOT
      call MPI_FINALIZE (ierr)
      stop
      end
MPI_COMM_SIZE
• Determines the size of the group associated with a communicator
• not required, but very convenient function
• call MPI_COMM_SIZE (comm, size, ierr)
– comm    I    I    communicator
– size    I    O    number of processes in the group of communicator
– ierr    I    O    completion code
      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      integer :: PETOT, my_rank, ierr
      call MPI_INIT (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
      write (*,'(a,2i8)') 'Hello World FORTRAN', my_rank, PETOT
      call MPI_FINALIZE (ierr)
      stop
      end
What is Communicator ?
MPI_Comm_size (MPI_COMM_WORLD, PETOT)
• Group of processes for communication
• Communicator must be specified in MPI program as a unit of communication
• All processes belong to a group, named “MPI_COMM_WORLD” (default)
• Multiple communicators can be created, and complicated operations are possible.
– Computation, Visualization
• Only “MPI_COMM_WORLD” is needed in this class.
Communicator in MPI
One process can belong to multiple communicators
(Figure: MPI_COMM_WORLD, COMM_MANTLE, COMM_CRUST, COMM_VIS)
Coupling between “Ground Motion” and “Sloshing of Tanks for Oil-Storage”
Target Application
• Coupling between “Ground Motion” and “Sloshing of Tanks for Oil-Storage”
– “One-way” coupling from “Ground Motion” to “Tanks”.
– Displacement of ground surface is given as forced displacement of bottom surface of tanks.
– 1 Tank = 1 PE (serial)
(Figure: Deformation of surface will be given as boundary conditions at bottom of tanks.)
2003 Tokachi Earthquake (M8.0)
Fire accident of oil tanks due to long period ground motion (surface waves) developed in the basin of Tomakomai
Simulation Codes
• Ground Motion (Ichimura): Fortran
– Parallel FEM, 3D Elastic/Dynamic
• Explicit forward Euler scheme
– Each element: 2m×2m×2m cube
– 240m×240m×100m region
• Sloshing of Tanks (Nagashima): C
– Serial FEM (Embarrassingly Parallel)
• Implicit backward Euler, Skyline method
• Shell elements + Inviscid potential flow
– D: 42.7m, H: 24.9m, T: 20mm,
– Frequency: 7.6 sec.
– 80 elements in circ., 0.6m mesh in height
– Tank-to-Tank: 60m, 4×4
• Total number of unknowns: 2,918,169
Three Communicators
(Figure: 13 processes in three communicators)
meshGLOBAL%MPI_COMM: basement #0 - #3 and tank #0 - #8
meshBASE%MPI_COMM: basement #0, basement #1, basement #2, basement #3
meshTANK%MPI_COMM: tank #0, tank #1, tank #2, tank #3, tank #4, tank #5, tank #6, tank #7, tank #8
Basement processes: meshGLOBAL%my_rank= 0~3, meshBASE%my_rank= 0~3, meshTANK%my_rank= -1
Tank processes: meshGLOBAL%my_rank= 4~12, meshTANK%my_rank= 0~8, meshBASE%my_rank= -1
MPI_COMM_RANK
• Determines the rank of the calling process in the communicator
– “ID of MPI process” is sometimes called “rank”
• call MPI_COMM_RANK (comm, rank, ierr)
– comm    I    I    communicator
– rank    I    O    rank of the calling process in the group of comm, starting from “0”
– ierr    I    O    completion code
      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      integer :: PETOT, my_rank, ierr
      call MPI_INIT (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
      write (*,'(a,2i8)') 'Hello World FORTRAN', my_rank, PETOT
      call MPI_FINALIZE (ierr)
      stop
      end
MPI_ABORT
• Aborts MPI execution environment
• call MPI_ABORT (comm, errcode, ierr)
– comm       I    I    communicator
– errcode    I    I    error code
– ierr       I    O    completion code
MPI_WTIME
• Returns an elapsed time on the calling processor
• time= MPI_WTIME ()
– time    R8    O    Time in seconds since an arbitrary time in the past.
      …
      real(kind=8):: Stime, Etime
      Stime= MPI_WTIME ()
      do i= 1, 100000000
        a= 1.d0
      enddo
      Etime= MPI_WTIME ()
      write (*,'(i5,1pe16.6)') my_rank, Etime-Stime
Example of MPI_Wtime
$> cd <$O-S1>
$> mpifccpx -O1 time.c
$> mpifrtpx -O1 time.f
(modify go4.sh, 4 processes)
$> pjsub go4.sh
Process ID    Time
0 1.113281E+00
3 1.113281E+00
2 1.117188E+00
1 1.117188E+00
MPI_Wtick
• Returns the resolution of MPI_Wtime
• depends on hardware, and compiler
• time= MPI_Wtick ()
– time    R8    O    Time in seconds of resolution of MPI_Wtime
      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      …
      TM= MPI_WTICK ()
      write (*,*) TM
      …
double Time;
…
Time = MPI_Wtick();
printf("%5d%16.6E\n", MyRank, Time);
…
Example of MPI_Wtick
$> cd <$O-S1>
$> mpifccpx -O1 wtick.c
$> mpifrtpx -O1 wtick.f
(modify go1.sh, 1 process)
$> pjsub go1.sh
MPI_BARRIER
• Blocks until all processes in the communicator have reached this routine.
• Mainly for debugging, huge overhead, not recommended for real code.
• call MPI_BARRIER (comm, ierr)
– comm    I    I    communicator
– ierr    I    O    completion code
Data Structures & Algorithms
• Computer program consists of data structures and algorithms.
• They are closely related. In order to implement an algorithm, we need to specify an appropriate data structure for that.
– We can even say that “Data Structures=Algorithms”
– Some people may not agree with this, I (KN) think it is true for scientific computations from my experiences.
• Appropriate data structures for parallel computing must be specified before starting parallel computing.
SPMD: Single Program Multiple Data
• There are various types of “parallel computing”, and there are many algorithms.
• Common issue is SPMD (Single Program Multiple Data).
• It is ideal that parallel computing is done in the same way as serial computing (except communications)
– It is required to specify processes with communications and those without communications.
What is a data structure which is appropriate for SPMD ?
(Figure: PE #0, PE #1, PE #2, PE #3; every PE runs the same Program on its own Data #0, Data #1, Data #2, Data #3)
Data Structure for SPMD (1/2)
• SPMD: Large data is decomposed into small pieces, and each piece is processed by each processor/process
• Consider the following simple computation for vector Vg with length of Ng (=20):
      integer, parameter :: NG= 20
      real(kind=8), dimension(20) :: VG
      do i= 1, NG
        VG(i)= 2.0 * VG(i)
      enddo
• If you compute this using four processors, each processor stores and processes 5 (=20/4) components of Vg.
Data Structure for SPMD (2/2)
• i.e.
      integer, parameter :: NL= 5
      real(kind=8), dimension(5) :: VL
      do i= 1, NL
        VL(i)= 2.0 * VL(i)
      enddo
• Thus, a “single program” can execute parallel processing.
– In each process, components of “Vl” are different: Multiple Data
– Computation using only “Vl” (as long as possible) leads to efficient parallel computation.
– Program is not different from that for serial CPU (in the previous page).
Global & Local Data
• Vg
– Entire Domain
– “Global Data” with “Global ID” from 1 to 20
• Vl
– for Each Process (PE, Processor, Domain)
– “Local Data” with “Local ID” from 1 to 5
– Efficient utilization of local data leads to excellent parallel efficiency.
Idea of Local Data
Vg: Global Data
• 1st-5th comp. on PE#0
• 6th-10th comp. on PE#1
• 11th-15th comp. on PE#2
• 16th-20th comp. on PE#3
Each of these four sets corresponds to 1st-5th components of Vl (local data) where their local ID’s are 1-5.
PE#0    VG( 1) - VG( 5)    ->    VL(1) - VL(5)
PE#1    VG( 6) - VG(10)    ->    VL(1) - VL(5)
PE#2    VG(11) - VG(15)    ->    VL(1) - VL(5)
PE#3    VG(16) - VG(20)    ->    VL(1) - VL(5)
Global & Local Data
• Vg
– Entire Domain
– “Global Data” with “Global ID” from 1 to 20
• Vl
– for Each Process (PE, Processor, Domain)
– “Local Data” with “Local ID” from 1 to 5
• Please pay attention to the following:
– How to generate Vl (local data) from Vg (global data)
– How to map components, from Vg to Vl, and from Vl to Vg.
– What to do if Vl cannot be calculated on each process in independent manner.
– Processing as localized as possible leads to excellent parallel efficiency:
• Data structures & algorithms for that purpose.
What is Collective Communication ? (also called group communication)
• Collective communication is the process of exchanging information between multiple MPI processes in the communicator: one-to-all or all-to-all communications.
• Examples
– Broadcasting control data
– Max, Min
– Summation
– Dot products of vectors
– Transformation of dense matrices
Example of Collective Communications (1/4)
Broadcast
  before:  P#0: A0 B0 C0 D0    P#1: (empty)       P#2: (empty)       P#3: (empty)
  after:   P#0: A0 B0 C0 D0    P#1: A0 B0 C0 D0   P#2: A0 B0 C0 D0   P#3: A0 B0 C0 D0
Scatter
  before:  P#0: A0 B0 C0 D0    P#1: (empty)       P#2: (empty)       P#3: (empty)
  after:   P#0: A0             P#1: B0            P#2: C0            P#3: D0
Gather
  reverse of Scatter: A0, B0, C0, D0 on P#0 - P#3 are collected on P#0
Example of Collective Communications (2/4)
Allgather
  before:  P#0: A0             P#1: B0            P#2: C0            P#3: D0
  after:   P#0: A0 B0 C0 D0    P#1: A0 B0 C0 D0   P#2: A0 B0 C0 D0   P#3: A0 B0 C0 D0
All-to-All
  before:  P#0: A0 A1 A2 A3    P#1: B0 B1 B2 B3   P#2: C0 C1 C2 C3   P#3: D0 D1 D2 D3
  after:   P#0: A0 B0 C0 D0    P#1: A1 B1 C1 D1   P#2: A2 B2 C2 D2   P#3: A3 B3 C3 D3
Example of Collective Communications (3/4)
Reduce
  before:  P#0: A0 B0 C0 D0    P#1: A1 B1 C1 D1   P#2: A2 B2 C2 D2   P#3: A3 B3 C3 D3
  after:   P#0: op.A0-A3  op.B0-B3  op.C0-C3  op.D0-D3
Allreduce
  before:  P#0: A0 B0 C0 D0    P#1: A1 B1 C1 D1   P#2: A2 B2 C2 D2   P#3: A3 B3 C3 D3
  after:   P#0 - P#3 each: op.A0-A3  op.B0-B3  op.C0-C3  op.D0-D3
Example of Collective Communications (4/4)
Reduce scatter
  before:  P#0: A0 B0 C0 D0    P#1: A1 B1 C1 D1   P#2: A2 B2 C2 D2   P#3: A3 B3 C3 D3
  after:   P#0: op.A0-A3       P#1: op.B0-B3      P#2: op.C0-C3      P#3: op.D0-D3
Examples by Collective Comm.
• Dot Products of Vectors
• Scatter/Gather
• Reading Distributed Files
• MPI_Allgatherv
Global/Local Data
• Data structure of parallel computing based on SPMD, where large scale “global data” is decomposed into small pieces of “local data”.
Domain Decomposition/Partitioning
(Figure: Large-scale Data is split by Domain Decomposition into pieces of local data, with comm. between them)
• PC with 1GB RAM: can execute FEM application with up to 10^6 meshes
– 10^3 km × 10^3 km × 10^2 km (SW Japan): 10^8 meshes by 1km cubes
• Large-scale Data: Domain decomposition, parallel & local operations
• Global Computation: Comm. among domains needed
Local Data Structure
• It is important to define proper local data structure for target computation (and its algorithm)
– Algorithms= Data Structures
• Main objective of this class !
Global/Local Data
• Data structure of parallel computing based on SPMD, where large scale “global data” is decomposed into small pieces of “local data”.
• Consider the dot product of following VECp and VECs with length=20 by parallel computation using 4 processors
Fortran:  VECp( 1)= 2, VECp( 2)= 2, VECp( 3)= 2, …, VECp(18)= 2, VECp(19)= 2, VECp(20)= 2
          VECs( 1)= 3, VECs( 2)= 3, VECs( 3)= 3, …, VECs(18)= 3, VECs(19)= 3, VECs(20)= 3
C:        VECp[ 0]= 2, VECp[ 1]= 2, VECp[ 2]= 2, …, VECp[17]= 2, VECp[18]= 2, VECp[19]= 2
          VECs[ 0]= 3, VECs[ 1]= 3, VECs[ 2]= 3, …, VECs[17]= 3, VECs[18]= 3, VECs[19]= 3
<$O-S1>/dot.f, dot.c
      implicit REAL*8 (A-H,O-Z)
      real(kind=8),dimension(20):: &
     &     VECp, VECs
      do i= 1, 20
        VECp(i)= 2.0d0
        VECs(i)= 3.0d0
      enddo
      sum= 0.d0
      do ii= 1, 20
        sum= sum + VECp(ii)*VECs(ii)
      enddo
      stop
      end
#include <stdio.h>
int main(){
    int i;
    double VECp[20], VECs[20];
    double sum;
    for(i=0;i<20;i++){
        VECp[i]= 2.0;
        VECs[i]= 3.0;
    }
    sum = 0.0;
    for(i=0;i<20;i++){
        sum += VECp[i] * VECs[i];
    }
    return 0;
}
<$O-S1>/dot.f, dot.c
>$ cd <$O-S1>
>$ cc -O3 dot.c
>$ f95 -O3 dot.f
>$ ./a.out
1 2. 3.
2 2. 3.
3 2. 3.
…
18 2. 3.
19 2. 3.
20 2. 3.
dot product 120.
MPI_REDUCE
• Reduces values on all processes to a single value
– Summation, Product, Max, Min etc.
• call MPI_REDUCE (sendbuf,recvbuf,count,datatype,op,root,comm,ierr)
– sendbuf     choice    I    starting address of send buffer
– recvbuf     choice    O    starting address of receive buffer, type is defined by "datatype"
– count       I         I    number of elements in send/receive buffer
– datatype    I         I    data type of elements of send/receive buffer
      FORTRAN  MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION, MPI_CHARACTER etc.
      C        MPI_INT, MPI_FLOAT, MPI_DOUBLE, MPI_CHAR etc.
– op          I         I    reduce operation
      MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD, MPI_LAND, MPI_BAND etc.
      Users can define operations by MPI_OP_CREATE
– root        I         I    rank of root process
– comm        I         I    communicator
– ierr        I         O    completion code
(Figure: Reduce. Before: P#0: A0 B0 C0 D0, P#1: A1 B1 C1 D1, P#2: A2 B2 C2 D2, P#3: A3 B3 C3 D3. After: P#0 holds op.A0-A3, op.B0-B3, op.C0-C3, op.D0-D3.)
Send/Receive Buffer (Sending/Receiving)
• Arrays of “send (sending) buffer” and “receive (receiving) buffer” often appear in MPI.
• Addresses of “send (sending) buffer” and “receive (receiving) buffer” must be different.
Example of MPI_Reduce (1/2)
call MPI_REDUCE (sendbuf,recvbuf,count,datatype,op,root,comm,ierr)
      real(kind=8):: X0, X1
      call MPI_REDUCE
      (X0, X1, 1, MPI_DOUBLE_PRECISION, MPI_MAX, 0, <comm>, ierr)
      real(kind=8):: X0(4), XMAX(4)
      call MPI_REDUCE
      (X0, XMAX, 4, MPI_DOUBLE_PRECISION, MPI_MAX, 0, <comm>, ierr)
Global Max. values of X0(i) go to XMAX(i) on #0 process (i=1-4)
Example of MPI_Reduce (2/2)
call MPI_REDUCE (sendbuf,recvbuf,count,datatype,op,root,comm,ierr)
      real(kind=8):: X0, XSUM
      call MPI_REDUCE
      (X0, XSUM, 1, MPI_DOUBLE_PRECISION, MPI_SUM, 0, <comm>, ierr)
Global summation of X0 goes to XSUM on #0 process.
      real(kind=8):: X0(4)
      call MPI_REDUCE
      (X0(1), X0(3), 2, MPI_DOUBLE_PRECISION, MPI_SUM, 0, <comm>, ierr)
・ Global summation of X0(1) goes to X0(3) on #0 process.
・ Global summation of X0(2) goes to X0(4) on #0 process.
MPI_BCAST
• Broadcasts a message from the process with rank "root" to all other processes of the communicator
• call MPI_BCAST (buffer,count,datatype,root,comm,ierr)
– buffer      choice    I/O    starting address of buffer, type is defined by "datatype"
– count       I         I      number of elements in send/recv buffer
– datatype    I         I      data type of elements of send/recv buffer
      FORTRAN  MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION, MPI_CHARACTER etc.
      C        MPI_INT, MPI_FLOAT, MPI_DOUBLE, MPI_CHAR etc.
– root        I         I      rank of root process
– comm        I         I      communicator
– ierr        I         O      completion code
(Figure: Broadcast. Before: P#0: A0 B0 C0 D0, P#1 - P#3 empty. After: P#0 - P#3 each hold A0 B0 C0 D0.)
MPI_ALLREDUCE
• MPI_Reduce + MPI_Bcast
• Summation (of dot products) and MAX/MIN values are likely to be utilized in each process
• call MPI_ALLREDUCE (sendbuf,recvbuf,count,datatype,op, comm,ierr)
– sendbuf     choice    I    starting address of send buffer
– recvbuf     choice    O    starting address of receive buffer, type is defined by "datatype"
– count       I         I    number of elements in send/recv buffer
– datatype    I         I    data type of elements in send/recv buffer
– op          I         I    reduce operation
– comm        I         I    communicator
– ierr        I         O    completion code
(Figure: Allreduce. Before: P#0: A0 B0 C0 D0, P#1: A1 B1 C1 D1, P#2: A2 B2 C2 D2, P#3: A3 B3 C3 D3. After: P#0 - P#3 each hold op.A0-A3, op.B0-B3, op.C0-C3, op.D0-D3.)

"op" of MPI_Reduce/Allreduce
• MPI_MAX,MPI_MIN    Max, Min
• MPI_SUM,MPI_PROD   Summation, Product
• MPI_LAND           Logical AND

call MPI_REDUCE (sendbuf,recvbuf,count,datatype,op,root,comm,ierr)

Local Data (1/2)
• Decompose vector with length=20 into 4 domains (processes)
• Each process handles a vector with length= 5

VECp( 1)= 2
    ( 2)= 2
    ( 3)= 2
    …
    (18)= 2
    (19)= 2
    (20)= 2

VECs( 1)= 3
    ( 2)= 3
    ( 3)= 3
    …
    (18)= 3
    (19)= 3
    (20)= 3

Local Data (2/2)
• 1st-5th components of original global vector go to 1st-5th components of PE#0, 6th-10th -> PE#1, 11th-15th -> PE#2, 16th-20th -> PE#3.

PE#0: VECp( 1)~VECp( 5), VECs( 1)~VECs( 5)
PE#1: VECp( 6)~VECp(10), VECs( 6)~VECs(10)
PE#2: VECp(11)~VECp(15), VECs(11)~VECs(15)
PE#3: VECp(16)~VECp(20), VECs(16)~VECs(20)

Local vectors on each of PE#0-PE#3:
VECp(1)= 2    VECs(1)= 3
    (2)= 2        (2)= 3
    (3)= 2        (3)= 3
    (4)= 2        (4)= 3
    (5)= 2        (5)= 3

But ...
• It is too easy !! Just decomposing and renumbering from 1 (or 0).
• Of course, this is not enough. Further examples will be shown in the latter part.

[Figure: global vector VG( 1)-VG(20) is decomposed into local vectors VL(1)-VL(5) on each of PE#0, PE#1, PE#2, PE#3.]
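
The decomposition and renumbering described above can be sketched as a pair of index mappings. This is only an illustrative sketch, not code from the class materials: the function names `owner_of`, `local_id` and `global_id` are hypothetical, IDs are 1-based as in the Fortran slides, and the global length is assumed to be divisible by the number of processes (20 / 4 = 5 here).

```c
#include <assert.h>

/* Block decomposition of a global vector of length ng over npe processes.
   Assumption: ng is divisible by npe. Global and local IDs are 1-based. */

/* Rank (0-based) of the process that owns global component ig. */
int owner_of(int ig, int ng, int npe)
{
    assert(npe > 0 && ng % npe == 0 && ig >= 1 && ig <= ng);
    int nl = ng / npe;              /* local vector length */
    return (ig - 1) / nl;
}

/* Local ID (1..nl) of global component ig on its owner. */
int local_id(int ig, int ng, int npe)
{
    assert(npe > 0 && ng % npe == 0 && ig >= 1 && ig <= ng);
    int nl = ng / npe;
    return (ig - 1) % nl + 1;
}

/* Global ID of local component il on process "rank". */
int global_id(int rank, int il, int ng, int npe)
{
    assert(npe > 0 && ng % npe == 0);
    int nl = ng / npe;
    assert(rank >= 0 && rank < npe && il >= 1 && il <= nl);
    return rank * nl + il;
}
```

With ng= 20 and npe= 4, global component VG(6) maps to VL(1) on PE#1, and VL(5) on PE#3 maps back to VG(20).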

Example: Dot Product (1/3)
<$O-S1>/allreduce.f

implicit REAL*8 (A-H,O-Z)
include 'mpif.h'
integer :: PETOT, my_rank, ierr
real(kind=8), dimension(5) :: VECp, VECs

call MPI_INIT (ierr)
call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )

sumA= 0.d0
sumR= 0.d0
!C-- local vectors (length= 5) on each process
do i= 1, 5
  VECp(i)= 2.d0
  VECs(i)= 3.d0
enddo

!C-- local dot product
sum0= 0.d0
do i= 1, 5
  sum0= sum0 + VECp(i) * VECs(i)
enddo

if (my_rank.eq.0) then
  write (*,'(a)') '(my_rank, sumALLREDUCE, sumREDUCE)'
endif

Local vector is generated at each local process.

Example: Dot Product (2/3)
<$O-S1>/allreduce.f

!C
!C-- REDUCE
call MPI_REDUCE (sum0, sumR, 1, MPI_DOUBLE_PRECISION, MPI_SUM, 0, &
                 MPI_COMM_WORLD, ierr)
!C
!C-- ALL-REDUCE
call MPI_Allreduce (sum0, sumA, 1, MPI_DOUBLE_PRECISION, MPI_SUM, &
                    MPI_COMM_WORLD, ierr)

write (*,'(a,i5, 2(1pe16.6))') 'before BCAST', my_rank, sumA, sumR

Dot Product: Summation of results of each process (sum0)
"sumR" has value only on PE#0.
"sumA" has value on all processes by MPI_Allreduce

Example: Dot Product (3/3)
<$O-S1>/allreduce.f

!C
!C-- BCAST
call MPI_BCAST (sumR, 1, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, &
                ierr)

write (*,'(a,i5, 2(1pe16.6))') 'after BCAST', my_rank, sumA, sumR

call MPI_FINALIZE (ierr)

stop
end

"sumR" has value on PE#1-#3 by MPI_Bcast

Execute <$O-S1>/allreduce.f/c
$> mpifccpx -Kfast allreduce.c
$> mpifrtpx -Kfast allreduce.f
(modify go4.sh for 4 processes)
$> pjsub go4.sh

(my_rank, sumALLREDUCE,sumREDUCE)
before BCAST 0 1.200000E+02 1.200000E+02
after BCAST 0 1.200000E+02 1.200000E+02
before BCAST 1 1.200000E+02 0.000000E+00
after BCAST 1 1.200000E+02 1.200000E+02
before BCAST 3 1.200000E+02 0.000000E+00
after BCAST 3 1.200000E+02 1.200000E+02
before BCAST 2 1.200000E+02 0.000000E+00
after BCAST 2 1.200000E+02 1.200000E+02
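
The value 1.200000E+02 in the output above can be checked with a serial stand-in for the reduction. The sketch below is not the class code allreduce.c: it uses no MPI at all, and `local_dot`, `emulate_allreduce_sum` and `dot_product_demo` are hypothetical names. It only reproduces the arithmetic that MPI_Allreduce with MPI_SUM performs on the four partial sums.

```c
#include <assert.h>

#define NPE 4   /* number of processes in the example */
#define NL  5   /* local vector length on each process */

/* Local dot product, as computed independently on one process. */
double local_dot(const double *p, const double *s, int n)
{
    double sum0 = 0.0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        sum0 += p[i] * s[i];
    return sum0;
}

/* Serial stand-in for MPI_Allreduce with MPI_SUM:
   adds the partial result of every (emulated) process. */
double emulate_allreduce_sum(const double *partial, int npe)
{
    double total = 0.0;
    for (int rank = 0; rank < npe; rank++)
        total += partial[rank];
    return total;
}

/* Each emulated process holds VECp= 2 and VECs= 3 with length 5. */
double dot_product_demo(void)
{
    double vecp[NL], vecs[NL], partial[NPE];
    for (int i = 0; i < NL; i++) {
        vecp[i] = 2.0;
        vecs[i] = 3.0;
    }
    for (int rank = 0; rank < NPE; rank++)
        partial[rank] = local_dot(vecp, vecs, NL);   /* 5 * 2 * 3 = 30 */
    return emulate_allreduce_sum(partial, NPE);      /* 4 * 30 = 120 */
}
```

Each process contributes 30, so the global sum is 120 on every process after MPI_Allreduce, while after MPI_Reduce only the root (#0) holds it until MPI_Bcast is called.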

Examples by Collective Comm.
• Dot Products of Vectors
• Scatter/Gather
• Reading Distributed Files
• MPI_Allgatherv

Global/Local Data (1/3)
• Parallelization of an easy process where a real number is added to each component of real vector VECg:

do i= 1, NG
  VECg(i)= VECg(i) + ALPHA
enddo

for (i=0; i<NG; i++){
  VECg[i]= VECg[i] + ALPHA;
}

Global/Local Data (2/3)
• Configuration
– NG= 32 (length of the vector)
– ALPHA=1000.
– Process # of MPI= 4
• Vector VECg has following 32 components (<$T-S1>/a1x.all):

(101.0, 103.0, 105.0, 106.0, 109.0, 111.0, 121.0, 151.0,
 201.0, 203.0, 205.0, 206.0, 209.0, 211.0, 221.0, 251.0,
 301.0, 303.0, 305.0, 306.0, 309.0, 311.0, 321.0, 351.0,
 401.0, 403.0, 405.0, 406.0, 409.0, 411.0, 421.0, 451.0)

Global/Local Data (3/3)
• Procedure
① Reading vector VECg with length=32 from one process (e.g. 0th process)
– Global Data
② Distributing vector components to 4 MPI processes equally (i.e. length= 8 for each process)
– Local Data, Local ID/Numbering
③ Adding ALPHA to each component of the local vector (with length= 8) on each process.
④ Merging the results to global vector with length= 32.
• Actually, we do not need parallel computers for such a kind of small computation.

Operations of Scatter/Gather (1/8)
Reading VECg (length=32) from a process (e.g. #0)
• Reading global data from #0 process

include 'mpif.h'
integer, parameter :: NG= 32
real(kind=8), dimension(NG):: VECg

call MPI_INIT (ierr)
call MPI_COMM_SIZE (<comm>, PETOT , ierr)
call MPI_COMM_RANK (<comm>, my_rank, ierr)

if (my_rank.eq.0) then
  open (21, file= 'a1x.all', status= 'unknown')
  do i= 1, NG
    read (21,*) VECg(i)
  enddo
  close (21)
endif

#include <mpi.h>
#include <stdio.h>
#include <math.h>
#include <assert.h>

int main(int argc, char **argv){
  int i, NG=32;
  int PeTot, MyRank;
  double VECg[32];
  char filename[80];
  FILE *fp;

  MPI_Init(&argc, &argv);
  MPI_Comm_size(<comm>, &PeTot);
  MPI_Comm_rank(<comm>, &MyRank);

  /* only process #0 opens and reads the global file */
  if(!MyRank){
    fp = fopen("a1x.all", "r");
    assert(fp != NULL);
    for(i=0;i<NG;i++){
      fscanf(fp, "%lf", &VECg[i]);
    }
    fclose(fp);
  }

Operations of Scatter/Gather (2/8)
Distributing global data to 4 processes equally (i.e. length=8 for each process)
• MPI_Scatter

MPI_SCATTER
• Sends data from one process to all other processes in a communicator
– scount-size messages are sent to each process
• call MPI_SCATTER (sendbuf, scount, sendtype, recvbuf, rcount, recvtype, root, comm, ierr)
– sendbuf   choice  I    starting address of sending buffer
                         type is defined by "datatype"
– scount    I       I    number of elements sent to each process
– sendtype  I       I    data type of elements of sending buffer
      FORTRAN MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION, MPI_CHARACTER etc.
      C       MPI_INT, MPI_FLOAT, MPI_DOUBLE, MPI_CHAR etc.
– recvbuf   choice  O    starting address of receiving buffer
– rcount    I       I    number of elements received from the root process
– recvtype  I       I    data type of elements of receiving buffer
– root      I       I    rank of root process
– comm      I       I    communicator
– ierr      I       O    completion code

[Figure: Scatter/Gather. P#0 holds A0 B0 C0 D0. Scatter delivers A0 to P#0, B0 to P#1, C0 to P#2, D0 to P#3; Gather is the reverse.]

MPI_SCATTER (cont.)
• call MPI_SCATTER (sendbuf, scount, sendtype, recvbuf, rcount, recvtype, root, comm, ierr)
• Usually
– scount = rcount
– sendtype= recvtype
• This function sends scount components starting from sendbuf (sending buffer) at process #root to each process in comm. Each process receives rcount components starting from recvbuf (receiving buffer).

Operations of Scatter/Gather (3/8)
Distributing global data to 4 processes equally
• Allocating receiving buffer VEC (length=8) at each process.
• 8 components sent from sending buffer VECg of process #0 are received at each process #0-#3 as 1st-8th components of receiving buffer VEC.

call MPI_SCATTER (sendbuf, scount, sendtype, recvbuf, rcount, recvtype, root, comm, ierr)

integer, parameter :: N = 8
real(kind=8), dimension(N ) :: VEC
...
call MPI_Scatter &
     (VECg, N, MPI_DOUBLE_PRECISION, &
      VEC , N, MPI_DOUBLE_PRECISION, &
      0, <comm>, ierr)

int N=8;
double VEC[8];
...
MPI_Scatter (VECg, N, MPI_DOUBLE, VEC, N, MPI_DOUBLE, 0, <comm>);

Operations of Scatter/Gather (4/8)
Distributing global data to 4 processes equally
• 8 components are scattered to each process from root (#0)
• 1st-8th components of VECg are stored as 1st-8th ones of VEC at #0, 9th-16th components of VECg are stored as 1st-8th ones of VEC at #1, etc.
– VECg: Global Data, VEC: Local Data

[Figure: VECg (sendbuf, global data, 4 blocks of 8 components on root PE#0) is scattered to VEC (recvbuf, local data, 8 components) on each of PE#0, PE#1, PE#2, PE#3.]

Operations of Scatter/Gather (5/8)
Distributing global data to 4 processes equally
• Global Data: 1st-32nd components of VECg at #0
• Local Data: 1st-8th components of VEC at each process
• Each component of VEC can be written from each process in the following way:

do i= 1, N
write (*,'(a, 2i8,f10.0)') 'before', my_rank, i, VEC(i)
enddo
for(i=0;i<N;i++){
printf("before %5d %5d %10.0F\n", MyRank, i+1, VEC[i]);}
PE#0
before 0 1 101.
before 0 2 103.
before 0 3 105.
before 0 4 106.
before 0 5 109.
before 0 6 111.
before 0 7 121.
before 0 8 151.
PE#1
before 1 1 201.
before 1 2 203.
before 1 3 205.
before 1 4 206.
before 1 5 209.
before 1 6 211.
before 1 7 221.
before 1 8 251.
PE#2
before 2 1 301.
before 2 2 303.
before 2 3 305.
before 2 4 306.
before 2 5 309.
before 2 6 311.
before 2 7 321.
before 2 8 351.
PE#3
before 3 1 401.
before 3 2 403.
before 3 3 405.
before 3 4 406.
before 3 5 409.
before 3 6 411.
before 3 7 421.
before 3 8 451.

Operations of Scatter/Gather (6/8)
On each process, ALPHA is added to each of 8 components of VEC
• On each process, computation is in the following way

real(kind=8), parameter :: ALPHA= 1000.
do i= 1, N
  VEC(i)= VEC(i) + ALPHA
enddo

double ALPHA=1000.;
...
for(i=0;i<N;i++){
  VEC[i]= VEC[i] + ALPHA;}

• Results:
PE#0
after 0 1 1101.
after 0 2 1103.
after 0 3 1105.
after 0 4 1106.
after 0 5 1109.
after 0 6 1111.
after 0 7 1121.
after 0 8 1151.
PE#1
after 1 1 1201.
after 1 2 1203.
after 1 3 1205.
after 1 4 1206.
after 1 5 1209.
after 1 6 1211.
after 1 7 1221.
after 1 8 1251.
PE#2
after 2 1 1301.
after 2 2 1303.
after 2 3 1305.
after 2 4 1306.
after 2 5 1309.
after 2 6 1311.
after 2 7 1321.
after 2 8 1351.
PE#3
after 3 1 1401.
after 3 2 1403.
after 3 3 1405.
after 3 4 1406.
after 3 5 1409.
after 3 6 1411.
after 3 7 1421.
after 3 8 1451.

Operations of Scatter/Gather (7/8)
Merging the results to global vector with length= 32
• Using MPI_Gather (inverse operation to MPI_Scatter)

MPI_GATHER
• Gathers together values from a group of processes, inverse operation to MPI_Scatter
• call MPI_GATHER (sendbuf, scount, sendtype, recvbuf, rcount, recvtype, root, comm, ierr)
– sendbuf   choice  I    starting address of sending buffer
– scount    I       I    number of elements sent from each process
– sendtype  I       I    data type of elements of sending buffer
– recvbuf   choice  O    starting address of receiving buffer
– rcount    I       I    number of elements received from each process
– recvtype  I       I    data type of elements of receiving buffer
– root      I       I    rank of root process
– comm      I       I    communicator
– ierr      I       O    completion code
• recvbuf is on root process.

[Figure: Scatter/Gather. P#0 holds A0 B0 C0 D0. Scatter delivers A0 to P#0, B0 to P#1, C0 to P#2, D0 to P#3; Gather is the reverse.]

Operations of Scatter/Gather (8/8)
Merging the results to global vector with length= 32
• Each process sends its components of VEC to VECg on root (#0 in this case).
• 8 components are gathered from each process to the root process.

call MPI_Gather &
     (VEC , N, MPI_DOUBLE_PRECISION, &
      VECg, N, MPI_DOUBLE_PRECISION, &
      0, <comm>, ierr)

MPI_Gather (VEC, N, MPI_DOUBLE, VECg, N, MPI_DOUBLE, 0, <comm>);

[Figure: VEC (sendbuf, local data, 8 components) on each of PE#0, PE#1, PE#2, PE#3 is gathered into VECg (recvbuf, global data, 4 blocks of 8 components) on root PE#0.]

<$O-S1>/scatter-gather.f/c example
PE#0
before 0 1 101.
before 0 2 103.
before 0 3 105.
before 0 4 106.
before 0 5 109.
before 0 6 111.
before 0 7 121.
before 0 8 151.
PE#1
before 1 1 201.
before 1 2 203.
before 1 3 205.
before 1 4 206.
before 1 5 209.
before 1 6 211.
before 1 7 221.
before 1 8 251.
PE#2
before 2 1 301.
before 2 2 303.
before 2 3 305.
before 2 4 306.
before 2 5 309.
before 2 6 311.
before 2 7 321.
before 2 8 351.
PE#3
before 3 1 401.
before 3 2 403.
before 3 3 405.
before 3 4 406.
before 3 5 409.
before 3 6 411.
before 3 7 421.
before 3 8 451.
PE#0
after 0 1 1101.
after 0 2 1103.
after 0 3 1105.
after 0 4 1106.
after 0 5 1109.
after 0 6 1111.
after 0 7 1121.
after 0 8 1151.
PE#1
after 1 1 1201.
after 1 2 1203.
after 1 3 1205.
after 1 4 1206.
after 1 5 1209.
after 1 6 1211.
after 1 7 1221.
after 1 8 1251.
PE#2
after 2 1 1301.
after 2 2 1303.
after 2 3 1305.
after 2 4 1306.
after 2 5 1309.
after 2 6 1311.
after 2 7 1321.
after 2 8 1351.
PE#3
after 3 1 1401.
after 3 2 1403.
after 3 3 1405.
after 3 4 1406.
after 3 5 1409.
after 3 6 1411.
after 3 7 1421.
after 3 8 1451.

$> mpifccpx -Kfast scatter-gather.c
$> mpifrtpx -Kfast scatter-gather.f
$> (exec. 4 proc's) go4.sh
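
The "before"/"after" listings above follow from three steps: scatter, local update, gather. The sketch below emulates those steps serially, without MPI, to make the data movement explicit. It is not the class code scatter-gather.c; the function name `scatter_add_gather` and the 2D array `vec[rank][i]` standing in for the per-process buffer VEC are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define NG  32          /* length of the global vector VECg */
#define NPE 4           /* number of (emulated) processes   */
#define N   (NG / NPE)  /* length of each local vector VEC  */

/* Serial emulation of MPI_Scatter, local update, MPI_Gather.
   vec[rank] plays the role of the local buffer VEC on process "rank". */
void scatter_add_gather(double *vecg, double alpha)
{
    double vec[NPE][N];
    assert(vecg != NULL);

    /* Scatter: block "rank" of VECg becomes VEC(1:N) on that process */
    for (int rank = 0; rank < NPE; rank++)
        memcpy(vec[rank], vecg + rank * N, N * sizeof(double));

    /* Local operation: every process adds ALPHA to its own components */
    for (int rank = 0; rank < NPE; rank++)
        for (int i = 0; i < N; i++)
            vec[rank][i] += alpha;

    /* Gather: VEC(1:N) of each process returns to block "rank" of VECg */
    for (int rank = 0; rank < NPE; rank++)
        memcpy(vecg + rank * N, vec[rank], N * sizeof(double));
}
```

Applied to the 32 components of a1x.all with alpha= 1000., the first component 101. becomes 1101. and the 9th component 201. (the first one on PE#1) becomes 1201., matching the listings.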

MPI_REDUCE_SCATTER
• MPI_REDUCE + MPI_SCATTER
• call MPI_REDUCE_SCATTER (sendbuf, recvbuf, rcount, datatype, op, comm, ierr)
– sendbuf   choice  I    starting address of sending buffer
– recvbuf   choice  O    starting address of receiving buffer
– rcount    I       I    integer array specifying the number of elements in result distributed to each process. Array must be identical on all calling processes.
– datatype  I       I    data type of elements of sending/receiving buffer
– op        I       I    reduce operation
– comm      I       I    communicator
– ierr      I       O    completion code

[Figure: Reduce scatter. Before: P#0 holds A0 B0 C0 D0, P#1 holds A1 B1 C1 D1, P#2 holds A2 B2 C2 D2, P#3 holds A3 B3 C3 D3. After: P#0 holds op.A0-A3, P#1 holds op.B0-B3, P#2 holds op.C0-C3, P#3 holds op.D0-D3.]

MPI_ALLGATHER
• MPI_GATHER + MPI_BCAST
– Gathers data from all tasks and distributes the combined data to all tasks
• call MPI_ALLGATHER (sendbuf, scount, sendtype, recvbuf, rcount, recvtype, comm, ierr)
– sendbuf   choice  I    starting address of sending buffer
– scount    I       I    number of elements sent to each process
– sendtype  I       I    data type of elements of sending buffer
– recvbuf   choice  O    starting address of receiving buffer
– rcount    I       I    number of elements received from each process
– recvtype  I       I    data type of elements of receiving buffer
– comm      I       I    communicator
– ierr      I       O    completion code

[Figure: All gather. Before: P#0 holds A0, P#1 holds B0, P#2 holds C0, P#3 holds D0. After: each of P#0-P#3 holds A0 B0 C0 D0.]

MPI_ALLTOALL
• Sends data from all to all processes: transformation of dense matrix
• call MPI_ALLTOALL (sendbuf, scount, sendtype, recvbuf, rcount, recvtype, comm, ierr)
– sendbuf   choice  I    starting address of sending buffer
– scount    I       I    number of elements sent to each process
– sendtype  I       I    data type of elements of sending buffer
– recvbuf   choice  O    starting address of receiving buffer
– rcount    I       I    number of elements received from each process
– recvtype  I       I    data type of elements of receiving buffer
– comm      I       I    communicator
– ierr      I       O    completion code

[Figure: All-to-All. Before: P#0 holds A0 A1 A2 A3, P#1 holds B0 B1 B2 B3, P#2 holds C0 C1 C2 C3, P#3 holds D0 D1 D2 D3. After: P#0 holds A0 B0 C0 D0, P#1 holds A1 B1 C1 D1, P#2 holds A2 B2 C2 D2, P#3 holds A3 B3 C3 D3.]

Operations of Distributed Local Files
• In the Scatter/Gather example, PE#0 reads global data, which is scattered to each process, then parallel operations are done.
• If the problem size is very large, a single processor may not be able to read the entire global data.
– If the entire global data is decomposed to distributed local data sets, each process can read the local data.
– If global operations are needed for certain sets of vectors, MPI functions, such as MPI_Gather etc. are available.

Reading Distributed Local Files: Uniform Vec. Length (1/2)

>$ cd <$O-S1>
>$ ls a1.*
a1.0 a1.1 a1.2 a1.3        a1x.all is decomposed to 4 files.

>$ mpifccpx -Kfast file.c
>$ mpifrtpx -Kfast file.f
(modify go4.sh for 4 processes)
>$ pjsub go4.sh

Operations of Distributed Local Files
• Local files a1.0~a1.3 are originally from global file a1x.all.

[Figure: a1x.all is split into a1.0, a1.1, a1.2, a1.3.]

Reading Distributed Local Files: Uniform Vec. Length (2/2)
implicit REAL*8 (A-H,O-Z)
include 'mpif.h'
integer :: PETOT, my_rank, ierr
real(kind=8), dimension(8) :: VEC
character(len=80) :: filename
call MPI_INIT (ierr)
call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
if (my_rank.eq.0) filename= 'a1.0'
if (my_rank.eq.1) filename= 'a1.1'
if (my_rank.eq.2) filename= 'a1.2'
if (my_rank.eq.3) filename= 'a1.3'
open (21, file= filename, status= 'unknown')
do i= 1, 8
read (21,*) VEC(i)
enddo
close (21)
call MPI_FINALIZE (ierr)
stop
end
<$O-S1>/file.f
Similar to "Hello"
Local ID is 1-8

Typical SPMD Operation

mpirun -np 4 a.out

PE #0    "a.out"    "a1.0"
PE #1    "a.out"    "a1.1"
PE #2    "a.out"    "a1.2"
PE #3    "a.out"    "a1.3"

Non-Uniform Vector Length (1/2)

>$ cd <$O-S1>
>$ ls a2.*
a2.0 a2.1 a2.2 a2.3
>$ cat a2.0
5        Number of Components at each Process
201.0    Components
203.0
205.0
206.0
209.0

>$ mpifccpx -Kfast file2.c
>$ mpifrtpx -Kfast file2.f
(modify go4.sh for 4 processes)
>$ pjsub go4.sh

Non-Uniform Vector Length (2/2)
implicit REAL*8 (A-H,O-Z)
include 'mpif.h'
integer :: PETOT, my_rank, ierr
real(kind=8), dimension(:), allocatable :: VEC
character(len=80) :: filename
call MPI_INIT (ierr)
call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
if (my_rank.eq.0) filename= 'a2.0'
if (my_rank.eq.1) filename= 'a2.1'
if (my_rank.eq.2) filename= 'a2.2'
if (my_rank.eq.3) filename= 'a2.3'
open (21, file= filename, status= 'unknown')
read (21,*) N
allocate (VEC(N))
do i= 1, N
read (21,*) VEC(i)
enddo
close(21)
call MPI_FINALIZE (ierr)
stop
end
<$O-S1>/file2.f
"N" is different at each process

How to generate local data
• Reading global data (N=NG)
– Scattering to each process
– Parallel processing on each process
– (If needed) reconstruction of global data by gathering local data
• Generating local data (N=NL), or reading distributed local data
– Generating or reading local data on each process
– Parallel processing on each process
– (If needed) reconstruction of global data by gathering local data
• In the future, the latter case is more important, but the former case is also introduced in this class to help understand operations on global/local data.

MPI_GATHERV, MPI_SCATTERV
• MPI_Gather, MPI_Scatter
– Length of message from/to each process is uniform
• MPI_XXXv extends functionality of MPI_XXX by allowing a varying count of data from each process:
– MPI_Gatherv
– MPI_Scatterv
– MPI_Allgatherv
– MPI_Alltoallv

MPI_ALLGATHERV
• Variable count version of MPI_Allgather
– creates "global data" from "local data"
• call MPI_ALLGATHERV (sendbuf, scount, sendtype, recvbuf, rcounts, displs, recvtype, comm, ierr)
– sendbuf   choice  I    starting address of sending buffer
– scount    I       I    number of elements sent to each process
– sendtype  I       I    data type of elements of sending buffer
– recvbuf   choice  O    starting address of receiving buffer
– rcounts   I       I    integer array (of length group size) containing the number of elements that are to be received from each process (array: size= PETOT)
– displs    I       I    integer array (of length group size). Entry i specifies the displacement (relative to recvbuf) at which to place the incoming data from process i (array: size= PETOT+1)
– recvtype  I       I    data type of elements of receiving buffer
– comm      I       I    communicator
– ierr      I       O    completion code

MPI_ALLGATHERV (cont.)
• call MPI_ALLGATHERV (sendbuf, scount, sendtype, recvbuf, rcounts, displs, recvtype, comm, ierr)
– rcounts   I       I    integer array (of length group size) containing the number of elements that are to be received from each process (array: size= PETOT)
– displs    I       I    integer array (of length group size). Entry i specifies the displacement (relative to recvbuf) at which to place the incoming data from process i (array: size= PETOT+1)
– These two arrays are related to size of final "global data", therefore each process requires information of these arrays (rcounts, displs)
• Each process must have same values for all components of both vectors
– Usually, stride(i)=rcounts(i)

Layout of recvbuf (m= PETOT):
  PE#0        PE#1        PE#2        ...  PE#(m-2)      PE#(m-1)
  rcounts(1)  rcounts(2)  rcounts(3)  ...  rcounts(m-1)  rcounts(m)
  stride(1)   stride(2)   stride(3)   ...  stride(m-1)   stride(m)

  displs(1)= 0
  displs(2)= displs(1) + stride(1)
  ...
  displs(m+1)= displs(m) + stride(m)

  size(recvbuf)= displs(PETOT+1)= sum(stride)

What MPI_Allgatherv is doing
Generating global data from local data

[Figure: Local Data (sendbuf): N components on each of PE#0, PE#1, PE#2, PE#3. Global Data (recvbuf): blocks of length rcounts(1)-rcounts(4) starting at displs(1)-displs(4), ending at displs(5); stride(i)= rcounts(i) for i= 1-4.]

MPI_Allgatherv in detail (1/2)
• call MPI_ALLGATHERV (sendbuf, scount, sendtype, recvbuf, rcounts, displs, recvtype, comm, ierr)
• rcounts
– Size of message from each PE: Size of Local Data (Length of Local Vector)
• displs
– Address/index of each local data in the vector of global data
– displs(PETOT+1)= Size of Entire Global Data (Global Vector)

MPI_Allgatherv in detail (2/2)
• Each process needs information of rcounts & displs
– "rcounts" can be created by gathering local vector length "N" from each process.
– On each process, "displs" can be generated from "rcounts" on each process.
• stride[i]= rcounts[i]
– Size of "recvbuf" is calculated by summation of "rcounts".

Preparation for MPI_Allgatherv
<$O-S1>/agv.f
• Generating global vector from "a2.0"~"a2.3".
• Length of each vector is 8, 5, 7, and 3, respectively. Therefore, size of final global vector is 23 (= 8+5+7+3).

a2.0~a2.3

PE#0     PE#1     PE#2     PE#3
8        5        7        3
101.0    201.0    301.0    401.0
103.0    203.0    303.0    403.0
105.0    205.0    305.0    405.0
106.0    206.0    306.0
109.0    209.0    311.0
111.0             321.0
121.0             351.0
151.0

Preparation: MPI_Allgatherv (1/4)
implicit REAL*8 (A-H,O-Z)
include 'mpif.h'
integer :: PETOT, my_rank, SOLVER_COMM, ierr
real(kind=8), dimension(:), allocatable :: VEC
real(kind=8), dimension(:), allocatable :: VEC2
real(kind=8), dimension(:), allocatable :: VECg
integer(kind=4), dimension(:), allocatable :: rcounts
integer(kind=4), dimension(:), allocatable :: displs
character(len=80) :: filename
call MPI_INIT (ierr)
call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
if (my_rank.eq.0) filename= 'a2.0'
if (my_rank.eq.1) filename= 'a2.1'
if (my_rank.eq.2) filename= 'a2.2'
if (my_rank.eq.3) filename= 'a2.3'
open (21, file= filename, status= 'unknown')
read (21,*) N
allocate (VEC(N))
do i= 1, N
read (21,*) VEC(i)
enddo
<$O-S1>/agv.f
N (NL) is different at each process

Preparation: MPI_Allgatherv (2/4)
<$O-S1>/agv.f

allocate (rcounts(PETOT), displs(PETOT+1))
rcounts= 0
write (*,'(a,10i8)') "before", my_rank, N, rcounts

!C-- rcounts on each PE
call MPI_allGATHER ( N , 1, MPI_INTEGER, &
&                    rcounts, 1, MPI_INTEGER, &
&                    MPI_COMM_WORLD, ierr)

write (*,'(a,10i8)') "after ", my_rank, N, rcounts

!C-- displs on each PE
displs(1)= 0
do ip= 1, PETOT
  displs(ip+1)= displs(ip) + rcounts(ip)
enddo

write (*,'(a,10i8)') "displs", my_rank, displs

call MPI_FINALIZE (ierr)

stop
end

[Figure: MPI_Allgather. PE#0 has N=8, PE#1 has N=5, PE#2 has N=7, PE#3 has N=3. After the call, rcounts(1:4)= {8, 5, 7, 3} on each PE.]

Preparation: MPI_Allgatherv (3/4)

> cd <$O-S1>
> mpifrtpx -Kfast agv.f
> mpifccpx -Kfast agv.c
(modify go4.sh for 4 processes)
> pjsub go4.sh

before 0 8 0 0 0 0
after 0 8 8 5 7 3
displs 0 0 8 13 20 23
FORTRAN STOP
before 1 5 0 0 0 0
after 1 5 8 5 7 3
displs 1 0 8 13 20 23
FORTRAN STOP
before 3 3 0 0 0 0
after 3 3 8 5 7 3
displs 3 0 8 13 20 23
FORTRAN STOP
before 2 7 0 0 0 0
after 2 7 8 5 7 3
displs 2 0 8 13 20 23
FORTRAN STOP

write (*,'(a,10i8)') "before", my_rank, N, rcounts
write (*,'(a,10i8)') "after ", my_rank, N, rcounts
write (*,'(a,10i8)') "displs", my_rank, displs

Preparation: MPI_Allgatherv (4/4)
• Only "recvbuf" is not defined yet.
• Size of "recvbuf" = "displs(PETOT+1)"

call MPI_allGATHERv &
     ( VEC , N, MPI_DOUBLE_PRECISION, &
       recvbuf, rcounts, displs, MPI_DOUBLE_PRECISION, &
       MPI_COMM_WORLD, ierr)
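
The construction of "displs" from "rcounts" shown above is a running sum. A minimal C sketch follows, assuming 0-based arrays and the slides' convention that displs has PETOT+1 entries; the function name `build_displs` is hypothetical and is not taken from agv.c.

```c
#include <assert.h>

/* Builds displs from rcounts by a running sum.
   displs must have room for npe + 1 entries (the slides' convention).
   Returns displs[npe], i.e. the required length of recvbuf. */
int build_displs(const int *rcounts, int npe, int *displs)
{
    assert(npe > 0);
    displs[0] = 0;
    for (int ip = 0; ip < npe; ip++) {
        assert(rcounts[ip] >= 0);
        displs[ip + 1] = displs[ip] + rcounts[ip];
    }
    return displs[npe];
}
```

For rcounts= {8, 5, 7, 3} this gives displs= {0, 8, 13, 20, 23}, the same values printed by agv.f, and a receive buffer of 23 components. MPI_Allgatherv itself reads only the first PETOT entries of displs; the extra last entry is kept because it holds the total size.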

Report S1 (1/2)
• Deadline: 17:00 February 14th (Sat), 2015.
– Send files via e-mail at nakajima(at)cc.u-tokyo.ac.jp
• Problem S1-1
– Read local files <$O-S1>/a1.0~a1.3, <$O-S1>/a2.0~a2.3.
– Develop codes which calculate norm ||x|| of global vector for each case.
• <$O-S1>file.f, <$O-S1>file2.f
• Problem S1-2
– Read local files <$O-S1>/a2.0~a2.3.
– Develop a code which constructs "global vector" using MPI_Allgatherv.

Report S1 (2/2)
• Problem S1-3
– Develop parallel program which calculates the following numerical integration using "trapezoidal rule" by MPI_Reduce, MPI_Bcast etc.

$$\int_0^1 \frac{4}{1+x^2}\,dx$$

– Measure computation time, and parallel performance
• Report
– Cover Page: Name, ID, and Problem ID (S1) must be written.
– Less than two pages including figures and tables (A4) for each of three sub-problems
• Strategy, Structure of the Program, Remarks
– Source list of the program (if you have bugs)
– Output list (as small as possible)

Peer-to-Peer Communication
Point-to-Point Communication
(one-to-one communication)
• What is P2P Communication ?
• 2D Problem, Generalized Communication Table
• Report S2

1D FEM: 12 nodes/11 elem's/3 domains
[Figure: 1D mesh with nodes 1-12 and elements 1-11, decomposed into 3 domains.]

1D FEM: 12 nodes/11 elem's/3 domains
Local ID: Starting from 1 for node and elem at each domain
[Figure: local node and element numbering in domains #0, #1, #2.]

1D FEM: 12 nodes/11 elem's/3 domains
Internal/External Nodes
[Figure: internal and external nodes of domains #0, #1, #2.]

Preconditioned Conjugate Gradient Method (CG)

Compute r(0)= b-[A]x(0)
for i= 1, 2, …
  solve [M]z(i-1)= r(i-1)
  ρ(i-1)= r(i-1) z(i-1)
  if i=1
    p(1)= z(0)
  else
    β(i-1)= ρ(i-1)/ρ(i-2)
    p(i)= z(i-1) + β(i-1) p(i-1)
  endif
  q(i)= [A]p(i)
  α(i)= ρ(i-1)/p(i)q(i)
  x(i)= x(i-1) + α(i)p(i)
  r(i)= r(i-1) - α(i)q(i)
  check convergence |r|
end

Preconditioner: Diagonal Scaling, Point-Jacobi Preconditioning

$$[M] = \begin{bmatrix} D_1 & 0 & \cdots & 0 & 0 \\ 0 & D_2 & & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & & D_{N-1} & 0 \\ 0 & 0 & \cdots & 0 & D_N \end{bmatrix}$$

Preconditioning, DAXPY
Local Operations by Only Internal Points: Parallel Processing is possible

!C
!C-- {x}= {x} + ALPHA*{p}      DAXPY: double a{x} plus {y}
!C   {r}= {r} - ALPHA*{q}
do i= 1, N
  PHI(i)= PHI(i) + ALPHA * W(i,P)
  W(i,R)= W(i,R) - ALPHA * W(i,Q)
enddo

!C
!C-- {z}= [Minv]{r}
do i= 1, N
  W(i,Z)= W(i,DD) * W(i,R)
enddo

[Figure: vector components 1-12.]
MP
I Pro
gram
min
g
Dot Products
Global Summation needed: Communication ?
  !C
  !C-- ALPHA= RHO / {p}{q}
  C1= 0.d0
  do i= 1, N
    C1= C1 + W(i,P)*W(i,Q)
  enddo
  ALPHA= RHO / C1
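The loop above only produces the partial sum on each process. A minimal sketch of how the global summation could be completed is given below, assuming `MPI_ALLREDUCE` is used so that every process obtains the same value. The local length `N = 4`, the test values in `P` and `Q`, and the name `C1_global` are illustrative assumptions, not part of the original slides.

```fortran
program global_dot
  implicit none
  include 'mpif.h'
  integer, parameter :: N = 4            ! local vector length (assumed for illustration)
  integer :: my_rank, PETOT, ierr, i
  real(kind=8) :: P(N), Q(N), C1, C1_global

  call MPI_INIT (ierr)
  call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr)
  call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr)

  ! Hypothetical local data: each process owns N entries of {p} and {q}
  do i= 1, N
    P(i)= 1.d0
    Q(i)= dble(my_rank + 1)
  enddo

  ! Local part of the dot product (internal points only)
  C1= 0.d0
  do i= 1, N
    C1= C1 + P(i)*Q(i)
  enddo

  ! Global summation: every process receives the sum over all processes
  call MPI_ALLREDUCE (C1, C1_global, 1, MPI_DOUBLE_PRECISION, MPI_SUM, &
 &                    MPI_COMM_WORLD, ierr)

  if (my_rank.eq.0) write (*,'(a,f10.2)') 'global dot product = ', C1_global
  call MPI_FINALIZE (ierr)
end program global_dot
```

Because the result is needed on every process to compute ALPHA, an all-reduce is a natural fit here; MPI_Reduce followed by MPI_Bcast would be an equivalent two-step alternative.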
Matrix-Vector Products
Values at External Points: P-to-P Communication
  !C
  !C-- {q}= [A]{p}
  do i= 1, N
    W(i,Q) = DIAG(i)*W(i,P)
    do j= INDEX(i-1)+1, INDEX(i)
      W(i,Q) = W(i,Q) + AMAT(j)*W(ITEM(j),P)
    enddo
  enddo
[Figure: a domain with internal nodes 1-4 and external nodes 5 and 6]
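The operations on the preceding slides (diagonal scaling, DAXPY, dot products, matrix-vector product) can be assembled into a complete solver. The following is a serial sketch only, with no MPI, under stated assumptions: the test matrix tridiag(-1, 2, -1), the right-hand side `RHS`, the tolerance, and the iteration limit are all chosen for illustration. The array names `DIAG`, `INDEX`, `ITEM`, `AMAT`, `PHI` and `W(:,R/Z/Q/P/DD)` follow the slides.

```fortran
program pcg_diag_scaling
  implicit none
  integer, parameter :: N = 12                 ! 12 nodes, as in the 1D example
  integer, parameter :: R = 1, Z = 2, Q = 3, P = 4, DD = 5
  integer :: INDEX(0:N), ITEM(2*N), i, j, k, iter
  real(kind=8) :: DIAG(N), AMAT(2*N), RHS(N), PHI(N), W(N,5)
  real(kind=8) :: RHO, RHO1, BETA, ALPHA, C1, RESID

  ! Assumed test matrix: A = tridiag(-1, 2, -1), diagonal stored separately
  k= 0
  INDEX(0)= 0
  do i= 1, N
    DIAG(i)= 2.d0
    if (i.gt.1) then
      k= k + 1
      ITEM(k)= i - 1
      AMAT(k)= -1.d0
    endif
    if (i.lt.N) then
      k= k + 1
      ITEM(k)= i + 1
      AMAT(k)= -1.d0
    endif
    INDEX(i)= k
  enddo

  RHS    = 0.d0
  RHS(1) = 1.d0
  RHS(N) = 1.d0                ! chosen so that the exact solution is 1 everywhere
  PHI    = 0.d0                ! initial guess x(0)= 0
  W(:,DD)= 1.d0/DIAG           ! [Minv] for diagonal scaling
  W(:,R) = RHS                 ! r(0)= b - [A]x(0)= b
  W(:,P) = 0.d0
  RHO1   = 1.d0

  do iter= 1, 100
    W(:,Z)= W(:,DD)*W(:,R)                     ! {z}= [Minv]{r}
    RHO= dot_product(W(:,R), W(:,Z))           ! RHO= {r}{z}
    if (iter.eq.1) then
      W(:,P)= W(:,Z)
    else
      BETA= RHO/RHO1
      W(:,P)= W(:,Z) + BETA*W(:,P)
    endif
    do i= 1, N                                 ! {q}= [A]{p}
      W(i,Q)= DIAG(i)*W(i,P)
      do j= INDEX(i-1)+1, INDEX(i)
        W(i,Q)= W(i,Q) + AMAT(j)*W(ITEM(j),P)
      enddo
    enddo
    C1= dot_product(W(:,P), W(:,Q))            ! ALPHA= RHO / {p}{q}
    ALPHA= RHO/C1
    PHI   = PHI    + ALPHA*W(:,P)              ! {x}= {x} + ALPHA*{p}
    W(:,R)= W(:,R) - ALPHA*W(:,Q)              ! {r}= {r} - ALPHA*{q}
    RESID= sqrt(dot_product(W(:,R), W(:,R)))
    if (RESID.le.1.d-12) exit
    RHO1= RHO
  enddo

  write (*,'(a,i4,a,1pe10.2)') 'iterations=', iter, '  |r|=', RESID
  write (*,'(6f8.4)') PHI
end program pcg_diag_scaling
```

In a parallel version, the two `dot_product` calls would need a global summation, and the matrix-vector loop would need values at external points from neighboring domains; the remaining operations stay local.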
Mat-Vec Products: Local Op. Possible
[Figure: matrix-vector product over entries 1-12, shown in three build steps]
Mat-Vec Products: Local Op. #1
[Figure: local matrix-vector product on domain #1, internal entries 1-4 and external entries 5 and 6]
What is Peer-to-Peer Communication ?
• Collective Communication
 – MPI_Reduce, MPI_Scatter/Gather etc.
 – Communications with all processes in the communicator
 – Application Area
   • BEM, Spectral Method, MD: global interactions are considered
   • Dot products, MAX/MIN: Global Summation & Comparison
• Peer-to-Peer/Point-to-Point
 – MPI_Send, MPI_Recv
 – Communication with limited processes
   • Neighbors
 – Application Area
   • FEM, FDM: Localized Method
[Figure: 1D mesh divided into domains #0, #1, #2, with local numbering starting from 0]
Collective/P2P Communications
Interactions with only Neighboring Processes/Element
Finite Difference Method (FDM), Finite Element Method (FEM)
When do we need P2P comm.: 1D-FEM
Info in neighboring domains is required for FEM operations
Matrix assembling, Iterative Method
[Figure: domains #0, #1, #2 of the 1D mesh with their internal and external nodes]
Method for P2P Comm.
• MPI_Send, MPI_Recv
• These are “blocking” functions. Deadlock can occur with these “blocking” functions.
• A “blocking” MPI call means that the program execution will be suspended until the message buffer is safe to use.
• The MPI standards specify that a blocking SEND or RECV does not return until the send buffer is safe to reuse (for MPI_Send), or the receive buffer is ready to use (for MPI_Recv).
 – Blocking comm. confirms “secure” communication, but it is very inconvenient.
• Please just remember that “there are such functions”.
MPI_Send/MPI_Recv
  if (my_rank.eq.0) NEIB_ID=1
  if (my_rank.eq.1) NEIB_ID=0
  …
  call MPI_SEND (NEIB_ID, arg’s)
  call MPI_RECV (NEIB_ID, arg’s)
  …
• This seems reasonable, but it stops at MPI_Send/MPI_Recv.
 – Sometimes it works (depending on the implementation).
[Figure: two neighboring domains, PE#0 and PE#1]
MPI_Send/MPI_Recv (cont.)
  if (my_rank.eq.0) NEIB_ID=1
  if (my_rank.eq.1) NEIB_ID=0
  …
  if (my_rank.eq.0) then
    call MPI_SEND (NEIB_ID, arg’s)
    call MPI_RECV (NEIB_ID, arg’s)
  endif
  if (my_rank.eq.1) then
    call MPI_RECV (NEIB_ID, arg’s)
    call MPI_SEND (NEIB_ID, arg’s)
  endif
  …
• It works ... but
[Figure: two neighboring domains, PE#0 and PE#1]
How to do P2P Comm. ?
• Use the “non-blocking” functions MPI_Isend & MPI_Irecv together with MPI_Waitall for synchronization
• MPI_Sendrecv is also available.
  if (my_rank.eq.0) NEIB_ID=1
  if (my_rank.eq.1) NEIB_ID=0
  …
  call MPI_Isend (NEIB_ID, arg’s)
  call MPI_Irecv (NEIB_ID, arg’s)
  …
  call MPI_Waitall (for Irecv)
  …
  call MPI_Waitall (for Isend)
• A single MPI_Waitall for both MPI_Isend and MPI_Irecv is also possible.
[Figure: two neighboring domains, PE#0 and PE#1]
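As a minimal sketch of the last remark, the program below exchanges one scalar between two processes and completes both the send and the receive with a single MPI_Waitall. It assumes exactly two processes; the combined arrays `request(2)` and `stat(MPI_STATUS_SIZE,2)` and the initial values of `VAL` are illustrative choices, not taken from the slides.

```fortran
program isend_irecv_single_waitall
  implicit none
  include 'mpif.h'
  integer :: my_rank, PETOT, NEIB, ierr
  integer :: request(2), stat(MPI_STATUS_SIZE,2)   ! one entry per pending operation
  real(kind=8) :: VAL, VALtemp

  call MPI_INIT (ierr)
  call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr)
  call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr)

  if (PETOT.ne.2) then
    if (my_rank.eq.0) write (*,*) 'please run with exactly 2 processes'
    call MPI_FINALIZE (ierr)
    stop
  endif

  NEIB= 1 - my_rank                  ! 0 <-> 1
  VAL = 10.d0 + dble(my_rank)

  ! Both calls return immediately; neither buffer may be touched yet
  call MPI_ISEND (VAL,     1, MPI_DOUBLE_PRECISION, NEIB, 0, &
 &                MPI_COMM_WORLD, request(1), ierr)
  call MPI_IRECV (VALtemp, 1, MPI_DOUBLE_PRECISION, NEIB, 0, &
 &                MPI_COMM_WORLD, request(2), ierr)

  ! One MPI_Waitall completes the send and the receive together
  call MPI_WAITALL (2, request, stat, ierr)

  VAL= VALtemp                       ! now safe: VAL may be modified, VALtemp may be read
  write (*,'(a,i2,a,f6.1)') 'rank', my_rank, '  VAL=', VAL
  call MPI_FINALIZE (ierr)
end program isend_irecv_single_waitall
```

Since neither non-blocking call waits for its partner, the order of MPI_Isend and MPI_Irecv on each process does not matter here, which is what removes the deadlock risk of the blocking version.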
MPI_ISEND
• Begins a non-blocking send
 – Send the contents of sending buffer (starting from sendbuf, number of elements: count) to dest with tag.
 – Contents of sending buffer cannot be modified before calling corresponding MPI_Waitall.
• call MPI_ISEND (sendbuf,count,datatype,dest,tag,comm,request, ierr)
 – sendbuf   choice  I  starting address of sending buffer
 – count     I       I  number of elements sent to each process
 – datatype  I       I  data type of elements of sending buffer
 – dest      I       I  rank of destination
 – tag       I       I  message tag
                        This integer can be used by the application to distinguish messages. Communication occurs if tag’s of MPI_Isend and MPI_Irecv are matched. Usually tag is set to be “0” (in this class).
 – comm      I       I  communicator
 – request   I       O  communication request array used in MPI_Waitall
 – ierr      I       O  completion code
Communication Request: request (communication identifier)
• call MPI_ISEND (sendbuf,count,datatype,dest,tag,comm,request, ierr)
 – request   I       O  communication request used in MPI_Waitall
                        Size of the array is total number of neighboring processes
• Just define the array
  allocate (request(NEIBPETOT))
MPI_IRECV
• Begins a non-blocking receive
 – Receive data into the receiving buffer (starting from recvbuf, number of elements: count) from source with tag.
 – Contents of receiving buffer cannot be used before calling corresponding MPI_Waitall.
• call MPI_IRECV (recvbuf,count,datatype,source,tag,comm,request, ierr)
 – recvbuf   choice  I  starting address of receiving buffer
 – count     I       I  number of elements in receiving buffer
 – datatype  I       I  data type of elements of receiving buffer
 – source    I       I  rank of source
 – tag       I       I  message tag
                        This integer can be used by the application to distinguish messages. Communication occurs if tag’s of MPI_Isend and MPI_Irecv are matched. Usually tag is set to be “0” (in this class).
 – comm      I       I  communicator
 – request   I       O  communication request used in MPI_Waitall
 – ierr      I       O  completion code
MPI_WAITALL
• MPI_Waitall blocks until all comm’s associated with request in the array complete. It is used for synchronizing MPI_Isend and MPI_Irecv in this class.
• At sending phase, contents of sending buffer cannot be modified before calling corresponding MPI_Waitall. At receiving phase, contents of receiving buffer cannot be used before calling corresponding MPI_Waitall.
• MPI_Isend and MPI_Irecv can be synchronized simultaneously with a single MPI_Waitall if it is consistent.
 – The same request array should be used in MPI_Isend and MPI_Irecv.
• Its operation is similar to that of MPI_Barrier, but MPI_Waitall cannot be replaced by MPI_Barrier.
 – Possible troubles using MPI_Barrier instead of MPI_Waitall: contents of request and status are not updated properly, very slow operations etc.
• call MPI_WAITALL (count,request,status,ierr)
 – count     I       I    number of requests to be synchronized (in this class: number of neighboring processes)
 – request   I       I/O  comm. request used in MPI_Waitall (array size: count)
 – status    I       O    array of status objects
                          MPI_STATUS_SIZE: defined in ‘mpif.h’, ‘mpi.h’
 – ierr      I       O    completion code
Array of status object: status (array of status objects)
• call MPI_WAITALL (count,request,status,ierr)
 – status    I       O    array of status objects
                          MPI_STATUS_SIZE: defined in ‘mpif.h’, ‘mpi.h’
• Just define the array
  allocate (stat(MPI_STATUS_SIZE,NEIBPETOT))
MPI_SENDRECV
• MPI_Send+MPI_Recv: not recommended, many restrictions
• call MPI_SENDRECV
  (sendbuf,sendcount,sendtype,dest,sendtag,recvbuf,
   recvcount,recvtype,source,recvtag,comm,status,ierr)
 – sendbuf    choice  I  starting address of sending buffer
 – sendcount  I       I  number of elements in sending buffer
 – sendtype   I       I  datatype of each sending buffer element
 – dest       I       I  rank of destination
 – sendtag    I       I  message tag for sending
 – recvbuf    choice  I  starting address of receiving buffer
 – recvcount  I       I  number of elements in receiving buffer
 – recvtype   I       I  datatype of each receiving buffer element
 – source     I       I  rank of source
 – recvtag    I       I  message tag for receiving
 – comm       I       I  communicator
 – status     I       O  array of status objects
                         MPI_STATUS_SIZE: defined in ‘mpif.h’, ‘mpi.h’
 – ierr       I       O  completion code
RECV: receiving to external nodes
Receive continuous data into the receive buffer from neighbors
• MPI_Irecv (recvbuf,count,datatype,source,tag,comm,request)
 – recvbuf   choice  I  starting address of receiving buffer
 – count     I       I  number of elements in receiving buffer
 – datatype  I       I  data type of elements of receiving buffer
 – source    I       I  rank of source
[Figure: 2D mesh distributed over PE#0-PE#3 with local node numbering]
SEND: sending from boundary nodes
Send continuous data in the send buffer to neighbors
• MPI_Isend (sendbuf,count,datatype,dest,tag,comm,request)
 – sendbuf   choice  I  starting address of sending buffer
 – count     I       I  number of elements sent to each process
 – datatype  I       I  data type of elements of sending buffer
 – dest      I       I  rank of destination
[Figure: 2D mesh distributed over PE#0-PE#3 with local node numbering]
Request, Status in Fortran
• MPI_Sendrecv: status
    integer status (MPI_STATUS_SIZE)
• MPI_Isend: request
• MPI_Irecv: request
• MPI_Waitall: request, status
    integer request(NEIBPETOT)
    integer status (MPI_STATUS_SIZE,NEIBPETOT)
Files on Oakleaf-FX
Fortran
  >$ cd <$O-TOP>
  >$ cp /home/z30088/class_eps/F/s2-f.tar .
  >$ tar xvf s2-f.tar
C
  >$ cd <$O-TOP>
  >$ cp /home/z30088/class_eps/C/s2-c.tar .
  >$ tar xvf s2-c.tar
Confirm Directory
  >$ ls
  mpi
  >$ cd mpi/S2
This directory is called <$O-S2> in this course.
  <$O-S2> = <$O-TOP>/mpi/S2
Ex.1: Send-Recv a Scalar
• Exchange VAL (real, 8-byte) between PE#0 & PE#1
  if (my_rank.eq.0) NEIB= 1
  if (my_rank.eq.1) NEIB= 0

  call MPI_Isend (VAL    ,1,MPI_DOUBLE_PRECISION,NEIB,…,req_send,…)
  call MPI_Irecv (VALtemp,1,MPI_DOUBLE_PRECISION,NEIB,…,req_recv,…)
  call MPI_Waitall (…,req_recv,stat_recv,…)     ! Recv. buf VALtemp can be used
  call MPI_Waitall (…,req_send,stat_send,…)     ! Send buf VAL can be modified
  VAL= VALtemp

  if (my_rank.eq.0) NEIB= 1
  if (my_rank.eq.1) NEIB= 0

  call MPI_Sendrecv (VAL    ,1,MPI_DOUBLE_PRECISION,NEIB,… &
                     VALtemp,1,MPI_DOUBLE_PRECISION,NEIB,…, status,…)
  VAL= VALtemp
Name of recv. buffer could be “VAL”, but not recommended.
Ex.1: Send-Recv a Scalar
Isend/Irecv/Waitall
  $> cd <$O-S2>
  $> mpifrtpx -Kfast ex1-1.f
  $> pjsub go2.sh

      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      integer(kind=4) :: my_rank, PETOT, NEIB
      real (kind=8) :: VAL, VALtemp
      integer(kind=4), dimension(MPI_STATUS_SIZE,1) :: stat_send, stat_recv
      integer(kind=4), dimension(1) :: request_send, request_recv

      call MPI_INIT (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )

      if (my_rank.eq.0) then
        NEIB= 1
        VAL = 10.d0
      else
        NEIB= 0
        VAL = 11.d0
      endif

      call MPI_ISEND (VAL,1,MPI_DOUBLE_PRECISION,NEIB,0,MPI_COMM_WORLD,request_send(1),ierr)
      call MPI_IRECV (VALtemp,1,MPI_DOUBLE_PRECISION,NEIB,0,MPI_COMM_WORLD,request_recv(1),ierr)
      call MPI_WAITALL (1, request_recv, stat_recv, ierr)
      call MPI_WAITALL (1, request_send, stat_send, ierr)
      VAL= VALtemp

      call MPI_FINALIZE (ierr)
      end
Ex.1: Send-Recv a Scalar
SendRecv
      implicit REAL*8 (A-H,O-Z)
      include 'mpif.h'
      integer(kind=4) :: my_rank, PETOT, NEIB
      real (kind=8) :: VAL, VALtemp
      integer(kind=4) :: status(MPI_STATUS_SIZE)

      call MPI_INIT (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )

      if (my_rank.eq.0) then
        NEIB= 1
        VAL = 10.d0
      endif
      if (my_rank.eq.1) then
        NEIB= 0
        VAL = 11.d0
      endif

      call MPI_SENDRECV &
     &  (VAL    , 1, MPI_DOUBLE_PRECISION, NEIB, 0, &
     &   VALtemp, 1, MPI_DOUBLE_PRECISION, NEIB, 0, MPI_COMM_WORLD, status, ierr)
      VAL= VALtemp

      call MPI_FINALIZE (ierr)
      end

  $> cd <$O-S2>
  $> mpifrtpx -Kfast ex1-2.f
  $> pjsub go2.sh
Ex.2: Send-Recv an Array (1/4)
• Exchange VEC (real, 8-byte) between PE#0 & PE#1
• PE#0 to PE#1
 – PE#0: send VEC(1)-VEC(11) (length=11)
 – PE#1: recv. as VEC(26)-VEC(36) (length=11)
• PE#1 to PE#0
 – PE#1: send VEC(1)-VEC(25) (length=25)
 – PE#0: recv. as VEC(12)-VEC(36) (length=25)
• Practice: Develop a program for this operation.
[Figure: VEC(1)-VEC(36) on PE#0 and on PE#1, with the sent and received ranges]
Practice: t1
• Initial status of VEC(:):
 – PE#0 VEC(1-36)= 101,102,103,~,135,136
 – PE#1 VEC(1-36)= 201,202,203,~,235,236
• Confirm the results on the next page
• Use the following two approaches:
 – MPI_Isend/Irecv/Waitall
 – MPI_Sendrecv
Expected Results
0 #BEFORE# 1 101.
0 #BEFORE# 2 102.
0 #BEFORE# 3 103.
0 #BEFORE# 4 104.
0 #BEFORE# 5 105.
0 #BEFORE# 6 106.
0 #BEFORE# 7 107.
0 #BEFORE# 8 108.
0 #BEFORE# 9 109.
0 #BEFORE# 10 110.
0 #BEFORE# 11 111.
0 #BEFORE# 12 112.
0 #BEFORE# 13 113.
0 #BEFORE# 14 114.
0 #BEFORE# 15 115.
0 #BEFORE# 16 116.
0 #BEFORE# 17 117.
0 #BEFORE# 18 118.
0 #BEFORE# 19 119.
0 #BEFORE# 20 120.
0 #BEFORE# 21 121.
0 #BEFORE# 22 122.
0 #BEFORE# 23 123.
0 #BEFORE# 24 124.
0 #BEFORE# 25 125.
0 #BEFORE# 26 126.
0 #BEFORE# 27 127.
0 #BEFORE# 28 128.
0 #BEFORE# 29 129.
0 #BEFORE# 30 130.
0 #BEFORE# 31 131.
0 #BEFORE# 32 132.
0 #BEFORE# 33 133.
0 #BEFORE# 34 134.
0 #BEFORE# 35 135.
0 #BEFORE# 36 136.
0 #AFTER # 1 101.
0 #AFTER # 2 102.
0 #AFTER # 3 103.
0 #AFTER # 4 104.
0 #AFTER # 5 105.
0 #AFTER # 6 106.
0 #AFTER # 7 107.
0 #AFTER # 8 108.
0 #AFTER # 9 109.
0 #AFTER # 10 110.
0 #AFTER # 11 111.
0 #AFTER # 12 201.
0 #AFTER # 13 202.
0 #AFTER # 14 203.
0 #AFTER # 15 204.
0 #AFTER # 16 205.
0 #AFTER # 17 206.
0 #AFTER # 18 207.
0 #AFTER # 19 208.
0 #AFTER # 20 209.
0 #AFTER # 21 210.
0 #AFTER # 22 211.
0 #AFTER # 23 212.
0 #AFTER # 24 213.
0 #AFTER # 25 214.
0 #AFTER # 26 215.
0 #AFTER # 27 216.
0 #AFTER # 28 217.
0 #AFTER # 29 218.
0 #AFTER # 30 219.
0 #AFTER # 31 220.
0 #AFTER # 32 221.
0 #AFTER # 33 222.
0 #AFTER # 34 223.
0 #AFTER # 35 224.
0 #AFTER # 36 225.
1 #BEFORE# 1 201.
1 #BEFORE# 2 202.
1 #BEFORE# 3 203.
1 #BEFORE# 4 204.
1 #BEFORE# 5 205.
1 #BEFORE# 6 206.
1 #BEFORE# 7 207.
1 #BEFORE# 8 208.
1 #BEFORE# 9 209.
1 #BEFORE# 10 210.
1 #BEFORE# 11 211.
1 #BEFORE# 12 212.
1 #BEFORE# 13 213.
1 #BEFORE# 14 214.
1 #BEFORE# 15 215.
1 #BEFORE# 16 216.
1 #BEFORE# 17 217.
1 #BEFORE# 18 218.
1 #BEFORE# 19 219.
1 #BEFORE# 20 220.
1 #BEFORE# 21 221.
1 #BEFORE# 22 222.
1 #BEFORE# 23 223.
1 #BEFORE# 24 224.
1 #BEFORE# 25 225.
1 #BEFORE# 26 226.
1 #BEFORE# 27 227.
1 #BEFORE# 28 228.
1 #BEFORE# 29 229.
1 #BEFORE# 30 230.
1 #BEFORE# 31 231.
1 #BEFORE# 32 232.
1 #BEFORE# 33 233.
1 #BEFORE# 34 234.
1 #BEFORE# 35 235.
1 #BEFORE# 36 236.
1 #AFTER # 1 201.
1 #AFTER # 2 202.
1 #AFTER # 3 203.
1 #AFTER # 4 204.
1 #AFTER # 5 205.
1 #AFTER # 6 206.
1 #AFTER # 7 207.
1 #AFTER # 8 208.
1 #AFTER # 9 209.
1 #AFTER # 10 210.
1 #AFTER # 11 211.
1 #AFTER # 12 212.
1 #AFTER # 13 213.
1 #AFTER # 14 214.
1 #AFTER # 15 215.
1 #AFTER # 16 216.
1 #AFTER # 17 217.
1 #AFTER # 18 218.
1 #AFTER # 19 219.
1 #AFTER # 20 220.
1 #AFTER # 21 221.
1 #AFTER # 22 222.
1 #AFTER # 23 223.
1 #AFTER # 24 224.
1 #AFTER # 25 225.
1 #AFTER # 26 101.
1 #AFTER # 27 102.
1 #AFTER # 28 103.
1 #AFTER # 29 104.
1 #AFTER # 30 105.
1 #AFTER # 31 106.
1 #AFTER # 32 107.
1 #AFTER # 33 108.
1 #AFTER # 34 109.
1 #AFTER # 35 110.
1 #AFTER # 36 111.
Ex.2: Send-Recv an Array (2/4)
  if (my_rank.eq.0) then
    call MPI_Isend (VEC( 1),11,MPI_DOUBLE_PRECISION,1,…,req_send,…)
    call MPI_Irecv (VEC(12),25,MPI_DOUBLE_PRECISION,1,…,req_recv,…)
  endif
  if (my_rank.eq.1) then
    call MPI_Isend (VEC( 1),25,MPI_DOUBLE_PRECISION,0,…,req_send,…)
    call MPI_Irecv (VEC(26),11,MPI_DOUBLE_PRECISION,0,…,req_recv,…)
  endif
  call MPI_Waitall (…,req_recv,stat_recv,…)
  call MPI_Waitall (…,req_send,stat_send,…)
It works, but the operations are complicated. It does not look like SPMD. Not portable.
Ex.2: Send-Recv an Array (3/4)
  if (my_rank.eq.0) then
    NEIB= 1
    start_send= 1
    length_send= 11
    start_recv= length_send + 1
    length_recv= 25
  endif
  if (my_rank.eq.1) then
    NEIB= 0
    start_send= 1
    length_send= 25
    start_recv= length_send + 1
    length_recv= 11
  endif
  call MPI_Isend &
    (VEC(start_send),length_send,MPI_DOUBLE_PRECISION,NEIB,…,req_send,…)
  call MPI_Irecv &
    (VEC(start_recv),length_recv,MPI_DOUBLE_PRECISION,NEIB,…,req_recv,…)
  call MPI_Waitall (…,req_recv,stat_recv,…)
  call MPI_Waitall (…,req_send,stat_send,…)
This is “SPMD” !!
Ex.2: Send-Recv an Array (4/4)
  if (my_rank.eq.0) then
    NEIB= 1
    start_send= 1
    length_send= 11
    start_recv= length_send + 1
    length_recv= 25
  endif
  if (my_rank.eq.1) then
    NEIB= 0
    start_send= 1
    length_send= 25
    start_recv= length_send + 1
    length_recv= 11
  endif
  call MPI_Sendrecv &
    (VEC(start_send),length_send,MPI_DOUBLE_PRECISION,NEIB,… &
     VEC(start_recv),length_recv,MPI_DOUBLE_PRECISION,NEIB,…, status,…)
Notice: Send/Recv Arrays
  #PE0 send: VEC(start_send)~VEC(start_send+length_send-1)
  #PE1 recv: VEC(start_recv)~VEC(start_recv+length_recv-1)
  #PE1 send: VEC(start_send)~VEC(start_send+length_send-1)
  #PE0 recv: VEC(start_recv)~VEC(start_recv+length_recv-1)
• “length_send” of sending process must be equal to “length_recv” of receiving process.
 – PE#0 to PE#1, PE#1 to PE#0
• “sendbuf” and “recvbuf”: different address
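One possible complete version of Ex.2 using MPI_Sendrecv is sketched below. It assumes exactly two processes and fills the arguments elided by “…” in the slides with tag 0 and MPI_COMM_WORLD, following the convention stated for this class. The program name and the single check line printed at the end are illustrative additions.

```fortran
program ex2_sendrecv
  implicit none
  include 'mpif.h'
  integer :: my_rank, PETOT, NEIB, ierr, i
  integer :: start_send, length_send, start_recv, length_recv
  integer :: status(MPI_STATUS_SIZE)
  real(kind=8) :: VEC(36)

  call MPI_INIT (ierr)
  call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr)
  call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr)
  if (PETOT.ne.2) then
    if (my_rank.eq.0) write (*,*) 'please run with exactly 2 processes'
    call MPI_FINALIZE (ierr)
    stop
  endif

  ! Only the table of neighbors and lengths depends on the rank (SPMD style)
  if (my_rank.eq.0) then
    NEIB= 1
    length_send= 11
    length_recv= 25
  else
    NEIB= 0
    length_send= 25
    length_recv= 11
  endif
  start_send= 1
  start_recv= length_send + 1

  ! Initial status: PE#0 holds 101..136, PE#1 holds 201..236
  do i= 1, 36
    VEC(i)= dble(100*(my_rank+1) + i)
  enddo

  ! Send and receive ranges do not overlap, as MPI_Sendrecv requires
  call MPI_SENDRECV &
 &  (VEC(start_send), length_send, MPI_DOUBLE_PRECISION, NEIB, 0, &
 &   VEC(start_recv), length_recv, MPI_DOUBLE_PRECISION, NEIB, 0, &
 &   MPI_COMM_WORLD, status, ierr)

  ! First received entry: 201. on PE#0 (index 12), 101. on PE#1 (index 26)
  write (*,'(i2,a,i3,f6.0)') my_rank, ' #AFTER #', start_recv, VEC(start_recv)
  call MPI_FINALIZE (ierr)
end program ex2_sendrecv
```

The same table of `start_*` and `length_*` values could be passed to MPI_Isend/MPI_Irecv/MPI_Waitall instead; only the communication calls would change.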
Peer-to-Peer Communication
• What is P2P Communication ?
• 2D Problem, Generalized Communication Table
 – 2D FDM
 – Problem Setting
 – Distributed Local Data and Communication Table
 – Implementation
• Report S2
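2D FDM (5-point, central difference)
$$\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} = f$$
$$\frac{\phi_E - 2\phi_C + \phi_W}{\Delta x^2} + \frac{\phi_N - 2\phi_C + \phi_S}{\Delta y^2} = f_C$$
[Figure: 5-point stencil with center C, neighbors W, E, N, S, and mesh sizes Δx, Δy]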
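A serial sketch of the 5-point central-difference operator is given below. The test field `PHI = x**2 + y**2`, the unit mesh sizes, and the one-cell border of extra points are assumptions made for illustration; for this quadratic field the central difference is exact, so the discrete Laplacian equals 4 at every mesh.

```fortran
program fdm_5point
  implicit none
  integer, parameter :: NX = 8, NY = 8               ! 8x8 meshes, as in the example
  real(kind=8), parameter :: DX = 1.d0, DY = 1.d0    ! assumed mesh sizes
  real(kind=8) :: PHI(0:NX+1,0:NY+1), F(NX,NY)
  integer :: i, j

  ! Hypothetical test field, including one layer of points around the region
  do j= 0, NY+1
    do i= 0, NX+1
      PHI(i,j)= (dble(i)*DX)**2 + (dble(j)*DY)**2
    enddo
  enddo

  ! 5-point stencil: (E - 2C + W)/dx^2 + (N - 2C + S)/dy^2
  do j= 1, NY
    do i= 1, NX
      F(i,j)= (PHI(i+1,j) - 2.d0*PHI(i,j) + PHI(i-1,j))/DX**2   &
 &          + (PHI(i,j+1) - 2.d0*PHI(i,j) + PHI(i,j-1))/DY**2
    enddo
  enddo

  write (*,'(a,2f8.3)') 'min/max of discrete Laplacian: ', minval(F), maxval(F)
end program fdm_5point
```

The extra layer of points plays the role that external points play after domain decomposition: meshes on the edge of a sub-domain need neighbor values that are owned by another process.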
Decompose into 4 domains
[Figure: 8×8 mesh with global IDs 1-64, split into four 4×4 sub-domains]
4 domains: Global ID
[Figure: sub-domains PE#0, PE#1, PE#2, PE#3 labeled with global mesh IDs]
4 domains: Local ID
[Figure: sub-domains PE#0, PE#1, PE#2, PE#3, each numbered with local IDs 1-16]
External Points: Overlapped Region
[Figure: 5-point stencil (W, C, E, N, S) applied at sub-domain edges of PE#0-PE#3, showing the meshes required from neighboring domains]
Local ID of External Points ?
[Figure: sub-domains PE#0-PE#3 with local IDs 1-16 and external points marked “?”]
Overlapped Region
[Figure: overlapped regions between sub-domains PE#0-PE#3, external points marked “?”]
Problem Setting: 2D FDM
• 2D region with 64 meshes (8x8)
• Each mesh has global ID from 1 to 64
 – In this example, this global ID is considered as dependent variable, such as temperature, pressure etc.
 – Something like computed results
[Figure: 8×8 mesh with global IDs 1-64, numbered row by row from the bottom-left corner]
Problem Setting: Distributed Local Data
• 4 sub-domains.
• Info. of external points (global ID of mesh) is received from neighbors.
 – PE#0 receives the meshes marked □ in the figure
[Figure: 8×8 mesh split into PE#0-PE#3; the columns and rows of meshes adjacent to sub-domain edges are exchanged between neighbors]
Operations of 2D FDM
$$\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} = f$$
$$\frac{\phi_E - 2\phi_C + \phi_W}{\Delta x^2} + \frac{\phi_N - 2\phi_C + \phi_S}{\Delta y^2} = f_C$$
[Figure: 5-point stencil (W, C, E, N, S) on the 8×8 mesh, first on the whole region, then across the sub-domain edges]
Computation (1/3)
• On each PE, info. of internal pts (i=1-N(=16)) is read from distributed local data, info. of boundary pts is sent to neighbors, and it is received there as info. of external pts.
[Figure: 8×8 mesh split into PE#0, PE#1, PE#2, PE#3]
Computation (2/3): Before Send/Recv
Value (= global ID) at each local ID; external points 17-24 are still unknown.
  local ID   PE#0  PE#1  PE#2  PE#3
      1        1     5    33    37
      2        2     6    34    38
      3        3     7    35    39
      4        4     8    36    40
      5        9    13    41    45
      6       10    14    42    46
      7       11    15    43    47
      8       12    16    44    48
      9       17    21    49    53
     10       18    22    50    54
     11       19    23    51    55
     12       20    24    52    56
     13       25    29    57    61
     14       26    30    58    62
     15       27    31    59    63
     16       28    32    60    64
    17-24      ?     ?     ?     ?
[Figure: 8×8 mesh split into PE#0-PE#3 with the boundary values to be exchanged]
Computation (3/3): After Send/Recv
Value (= global ID) at each local ID; external points 17-24 now hold the values received from neighbors.
  local ID   PE#0  PE#1  PE#2  PE#3
      1        1     5    33    37
      2        2     6    34    38
      3        3     7    35    39
      4        4     8    36    40
      5        9    13    41    45
      6       10    14    42    46
      7       11    15    43    47
      8       12    16    44    48
      9       17    21    49    53
     10       18    22    50    54
     11       19    23    51    55
     12       20    24    52    56
     13       25    29    57    61
     14       26    30    58    62
     15       27    31    59    63
     16       28    32    60    64
     17        5     4    37    36
     18       13    12    45    44
     19       21    20    53    52
     20       29    28    61    60
     21       33    37    25    29
     22       34    38    26    30
     23       35    39    27    31
     24       36    40    28    32
[Figure: sub-domains PE#0-PE#3 with the received external values filled in]
Overview of Distributed Local Data
Example on PE#0
[Figure, left: value at each mesh (= global ID) on PE#0, rows 25-28, 17-20, 9-12, 1-4; neighbors PE#1 and PE#2]
[Figure, right: local ID on PE#0, rows 13-16, 9-12, 5-8, 1-4; neighbors PE#1 and PE#2]
SPMD
  PE     Program   Geometry   Results
  PE#0   “a.out”   “sqm.0”    “sq.0”
  PE#1   “a.out”   “sqm.1”    “sq.1”
  PE#2   “a.out”   “sqm.2”    “sq.2”
  PE#3   “a.out”   “sqm.3”    “sq.3”
• Geometry: Dist. Local Data Sets (Neighbors, Comm. Tables)
• Results: Dist. Local Data Sets (Global ID of Internal Points)
2D FDM: PE#0
Information at each domain
[Figure: PE#0 sub-domain with local IDs 1-16 and external points (●) on the sides facing PE#1 and PE#2]
• Internal Points
 – Meshes originally assigned to the domain
• External Points
 – Meshes originally assigned to a different domain, but required for computation of meshes in the domain (meshes in overlapped regions)
 – Also called sleeves or halo
• Boundary Points
 – Internal points which are also external points of other domains (used in computations of meshes in other domains)
• Relationships between Domains
 – Communication Table: External/Boundary Points
 – Neighbors
Description of Distributed Local Data
• Internal/External Points
 – Numbering: Starting from internal pts, then external pts after that
• Neighbors
 – Shares overlapped meshes
 – Number and ID of neighbors
• External Points
 – From where, how many, and which external points are received/imported ?
• Boundary Points
 – To where, how many, and which boundary points are sent/exported ?
[Figure: PE#0 local numbering: internal points 1-16 (rows 1-4, 5-8, 9-12, 13-16), external points 17-20 in the column to the right and 21-24 in the row above]
Overview of Distributed Local Data
Example on PE#0
[Figure, left: value at each mesh (= global ID) on PE#0; neighbors PE#1 and PE#2]
[Figure, right: local ID on PE#0, internal points 1-16 and external points 17-24; neighbors PE#1 and PE#2]
Generalized Comm. Table: Send
• Neighbors
 – NEIBPETOT, NEIBPE(neib)
• Message size for each neighbor
 – export_index(neib), neib= 0, NEIBPETOT
• ID of boundary points
 – export_item(k), k= 1, export_index(NEIBPETOT)
• Messages to each neighbor
 – SENDbuf(k), k= 1, export_index(NEIBPETOT)
SEND: MPI_Isend/Irecv/Waitall
[Figure: SENDbuf divided into consecutive segments for neib#1-neib#4; the segment for neighbor neib starts at export_index(neib-1)+1, ends at export_index(neib), and has length BUFlength_e]
  do neib= 1, NEIBPETOT
    do k= export_index(neib-1)+1, export_index(neib)
      kk= export_item(k)
      SENDbuf(k)= VAL(kk)               ! Copied to sending buffers
    enddo
  enddo

  do neib= 1, NEIBPETOT
    iS_e= export_index(neib-1) + 1
    iE_e= export_index(neib  )
    BUFlength_e= iE_e + 1 - iS_e

    call MPI_ISEND &
 &    (SENDbuf(iS_e), BUFlength_e, MPI_INTEGER, NEIBPE(neib), 0, &
 &     MPI_COMM_WORLD, request_send(neib), ierr)
  enddo

  call MPI_WAITALL (NEIBPETOT, request_send, stat_recv, ierr)
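The packing step can be tried without MPI. The serial sketch below fills SENDbuf for PE#0 of the 8×8 example. The table values (`export_index = 0, 4, 8` and `export_item = 4, 8, 12, 16, 13, 14, 15, 16`) are assumptions inferred from the figures: the right-hand column of PE#0 goes to PE#1 and the top row goes to PE#2. The formula for `VAL` reproduces the global IDs of PE#0 shown in the slides.

```fortran
program pack_sendbuf
  implicit none
  integer, parameter :: N = 16, NEIBPETOT = 2
  integer :: NEIBPE(NEIBPETOT), export_index(0:NEIBPETOT), export_item(8)
  integer :: VAL(N), SENDbuf(8), neib, k, kk, i

  ! Assumed communication table for PE#0 (inferred from the figures)
  NEIBPE      = (/1, 2/)
  export_index= (/0, 4, 8/)
  export_item = (/4, 8, 12, 16, 13, 14, 15, 16/)

  ! Value at each internal point = global ID (1-4, 9-12, 17-20, 25-28 on PE#0)
  do i= 1, N
    VAL(i)= ((i-1)/4)*8 + mod(i-1,4) + 1
  enddo

  ! Copy boundary-point values into one contiguous buffer, neighbor by neighbor
  do neib= 1, NEIBPETOT
    do k= export_index(neib-1)+1, export_index(neib)
      kk= export_item(k)
      SENDbuf(k)= VAL(kk)
    enddo
  enddo

  ! Segment for PE#1 holds 4, 12, 20, 28; segment for PE#2 holds 25, 26, 27, 28
  do neib= 1, NEIBPETOT
    write (*,'(a,i2,a,4i4)') 'to PE#', NEIBPE(neib), ':', &
 &        SENDbuf(export_index(neib-1)+1:export_index(neib))
  enddo
end program pack_sendbuf
```

These segments match the values that PE#1 and PE#2 hold at their external points after the exchange (local IDs 17-20 on PE#1 and 21-24 on PE#2).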
Generalized Comm. Table: Receive
• Neighbors
 – NEIBPETOT, NEIBPE(neib)
• Message size for each neighbor
 – import_index(neib), neib= 0, NEIBPETOT
• ID of external points
 – import_item(k), k= 1, import_index(NEIBPETOT)
• Messages from each neighbor
 – RECVbuf(k), k= 1, import_index(NEIBPETOT)
REC
V: M
PI_I
send
/Irec
v/W
aita
ll
neib
#1R
ECVb
ufne
ib#2
neib
#3ne
ib#4
BU
Flen
gth_
iB
UFl
engt
h_i
BU
Flen
gth_
iB
UFl
engt
h_i
! Post one non-blocking receive per neighbor
do neib= 1, NEIBPETOT
  iS_i= import_index(neib-1) + 1
  iE_i= import_index(neib  )
  BUFlength_i= iE_i + 1 - iS_i

  call MPI_IRECV &
  &    (RECVbuf(iS_i), BUFlength_i, MPI_INTEGER, NEIBPE(neib), 0, &
  &     MPI_COMM_WORLD, request_recv(neib), ierr)
enddo

! Wait for completion of all receives before reading RECVbuf
call MPI_WAITALL (NEIBPETOT, request_recv, stat_recv, ierr)

! Copy received values from the receiving buffer to the external points
do neib= 1, NEIBPETOT
  do k= import_index(neib-1)+1, import_index(neib)
    kk= import_item(k)
    VAL(kk)= RECVbuf(k)
  enddo
enddo
The last loop copies the contents of the receiving buffer to the external points.
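A minimal serial sketch of the copy from the receiving buffer, again without MPI. The contents of RECVbuf are filled by hand with the values that PE#0 receives in the Results of this document (5, 13, 21, 29 from PE#1 and 33-36 from PE#2); in the real program they would arrive through MPI_IRECV.

```fortran
program unpack_recvbuf
  implicit none
  integer, parameter :: NP = 24, N = 16, NEIBPETOT = 2
  integer :: VAL(NP), import_index(0:NEIBPETOT), import_item(8), RECVbuf(8)
  integer :: neib, k

  VAL = 0                      ! values at external points are unknown at first

  ! Communication table of PE#0 (external points are local IDs 17-24)
  import_index = (/ 0, 4, 8 /)
  import_item  = (/ 17, 18, 19, 20, 21, 22, 23, 24 /)

  ! Assumption: stands in for the data delivered by MPI_IRECV
  RECVbuf = (/ 5, 13, 21, 29, 33, 34, 35, 36 /)

  ! Same loop structure as in the slide: scatter RECVbuf to external points
  do neib = 1, NEIBPETOT
    do k = import_index(neib-1)+1, import_index(neib)
      VAL(import_item(k)) = RECVbuf(k)
    enddo
  enddo

  write (*,'(a,8i4)') 'VAL at external points:', VAL(N+1:NP)

  ! Self-check: external points 17-24 now hold the received values
  if (any(VAL(N+1:NP) /= RECVbuf)) error stop 'unexpected VAL'
end program unpack_recvbuf
```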
Relationship SEND/RECV
do neib= 1, NEIBPETOT
  iS_i= import_index(neib-1) + 1
  iE_i= import_index(neib  )
  BUFlength_i= iE_i + 1 - iS_i

  call MPI_IRECV &
  &    (RECVbuf(iS_i), BUFlength_i, MPI_INTEGER, NEIBPE(neib), 0, &
  &     MPI_COMM_WORLD, request_recv(neib), ierr)
enddo

do neib= 1, NEIBPETOT
  iS_e= export_index(neib-1) + 1
  iE_e= export_index(neib  )
  BUFlength_e= iE_e + 1 - iS_e

  call MPI_ISEND &
  &    (SENDbuf(iS_e), BUFlength_e, MPI_INTEGER, NEIBPE(neib), 0, &
  &     MPI_COMM_WORLD, request_send(neib), ierr)
enddo
• IDs of sources/destinations, and the size and contents of messages, must be consistent between sender and receiver !
• Communication occurs when NEIBPE(neib) matches: the destination given to MPI_Isend is the rank of the receiver, and the source given to MPI_Irecv is the rank of the sender.

Relationship SEND/RECV (#0 to #3)
[Figure: Send on #0 with NEIBPE(:)= 1,3,5,9 and Recv. on #3 with NEIBPE(:)= 1,0,10. The message from #0 to #3 is matched because #3 appears in NEIBPE of #0 and #0 appears in NEIBPE of #3.]
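The consistency requirement above can be checked before any communication. The following serial sketch holds the neighbor tables of all four domains in one program (an assumption made only for illustration; in the MPI program each process knows only its own table). The neighbor lists follow the Results of this document, and all message lengths are taken to be 4.

```fortran
program check_neighbor_tables
  implicit none
  integer, parameter :: PETOT = 4, NEIBPETOT = 2
  integer :: NEIBPE(NEIBPETOT,0:PETOT-1)
  integer :: export_len(NEIBPETOT,0:PETOT-1), import_len(NEIBPETOT,0:PETOT-1)
  integer :: p, q, neib, m, partner
  logical :: ok

  ! Neighbor lists of PE#0 to PE#3 (order as in the Results listing)
  NEIBPE(:,0) = (/ 1, 2 /)
  NEIBPE(:,1) = (/ 0, 3 /)
  NEIBPE(:,2) = (/ 3, 0 /)
  NEIBPE(:,3) = (/ 2, 1 /)
  export_len = 4               ! export_index(neib)-export_index(neib-1)
  import_len = 4               ! import_index(neib)-import_index(neib-1)

  ok = .true.
  do p = 0, PETOT-1
    do neib = 1, NEIBPETOT
      q = NEIBPE(neib,p)
      ! Find the position of p in the neighbor list of q
      partner = 0
      do m = 1, NEIBPETOT
        if (NEIBPE(m,q) == p) partner = m
      enddo
      if (partner == 0) then
        write (*,'(a,i2,a,i2)') 'PE#', p, ' is missing in NEIBPE of PE#', q
        ok = .false.
      else if (export_len(neib,p) /= import_len(partner,q)) then
        write (*,'(a,i2,a,i2)') 'length mismatch between PE#', p, ' and PE#', q
        ok = .false.
      endif
    enddo
  enddo

  if (ok) write (*,'(a)') 'communication tables are consistent'
  if (.not. ok) error stop 'inconsistent communication tables'
end program check_neighbor_tables
```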
Generalized Comm. Table (1/6 to 6/6)
[Figure: local numbering on PE#0 (internal points 1-16, external points 17-24) with neighbors PE#1 and PE#2]

#NEIBPEtot
2
#NEIBPE
1 2
#NODE
24 16
#IMPORT_index
4 8
#IMPORT_items
17
18
19
20
21
22
23
24
#EXPORT_index
4 8
#EXPORT_items
4
8
12
16
13
14
15
16

• #NEIBPEtot: number of neighbors
• #NEIBPE: ID of neighbors
• #NODE: Ext/Int pts, Int pts
• #IMPORT_index, #IMPORT_items: four ext pts (1st-4th items) are imported from the 1st neighbor (PE#1), and four (5th-8th items) are from the 2nd neighbor (PE#2).
  – 17, 18, 19, 20: imported from 1st neighbor (PE#1) (1st-4th items)
  – 21, 22, 23, 24: imported from 2nd neighbor (PE#2) (5th-8th items)
• #EXPORT_index, #EXPORT_items: four boundary pts (1st-4th items) are exported to the 1st neighbor (PE#1), and four (5th-8th items) are to the 2nd neighbor (PE#2).
  – 4, 8, 12, 16: exported to 1st neighbor (PE#1) (1st-4th items)
  – 13, 14, 15, 16: exported to 2nd neighbor (PE#2) (5th-8th items)

An external point is only sent from its original domain. A boundary point could be referred from more than one domain, and sent to multiple domains (e.g. 16th mesh).
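The remark about the 16th mesh can be confirmed directly from the export table. This serial sketch counts how often each local ID appears in export_item, using the PE#0 table from this document; the program and variable names other than export_index and export_item are illustrative.

```fortran
program shared_boundary_points
  implicit none
  integer, parameter :: N = 16, NEIBPETOT = 2
  integer :: export_index(0:NEIBPETOT), export_item(8)
  integer :: ncount(N), neib, k, i, nshared

  ! Export table of PE#0
  export_index = (/ 0, 4, 8 /)
  export_item  = (/ 4, 8, 12, 16, 13, 14, 15, 16 /)

  ! Count the number of neighbors to which each internal point is exported
  ncount = 0
  do neib = 1, NEIBPETOT
    do k = export_index(neib-1)+1, export_index(neib)
      ncount(export_item(k)) = ncount(export_item(k)) + 1
    enddo
  enddo

  ! Report boundary points that are sent to more than one domain
  nshared = 0
  do i = 1, N
    if (ncount(i) > 1) then
      write (*,'(a,i3,a,i2,a)') 'local point', i, ' is exported to', ncount(i), ' neighbors'
      nshared = nshared + 1
    endif
  enddo

  ! Self-check: only the 16th mesh is exported twice in this example
  if (nshared /= 1 .or. ncount(16) /= 2) error stop 'unexpected result'
end program shared_boundary_points
```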
Notice: Send/Recv Arrays
#PE0 send: VEC(start_send) ~ VEC(start_send+length_send-1)
#PE1 recv: VEC(start_recv) ~ VEC(start_recv+length_recv-1)
#PE1 send: VEC(start_send) ~ VEC(start_send+length_send-1)
#PE0 recv: VEC(start_recv) ~ VEC(start_recv+length_recv-1)

• “length_send” of the sending process must be equal to “length_recv” of the receiving process.
  – PE#0 to PE#1, PE#1 to PE#0
• “sendbuf” and “recvbuf”: different addresses
Peer-to-Peer Communication
• What is P2P Communication ?
• 2D Problem, Generalized Communication Table
  – 2D FDM
  – Problem Setting
  – Distributed Local Data and Communication Table
  – Implementation
• Report S2

Sample Program for 2D FDM
$ cd <$O-S2>
$ mpifrtpx -Kfast sq-sr1.f
$ mpifccpx -Kfast sq-sr1.c
(modify go4.sh for 4 processes)
$ pjsub go4.sh
Example: sq-sr1.f (1/6): Initialization
implicit REAL*8 (A-H,O-Z)
include 'mpif.h'
integer(kind=4) :: my_rank, PETOT
integer(kind=4) :: N, NP, NEIBPETOT, BUFlength
integer(kind=4), dimension(:), allocatable :: VAL
integer(kind=4), dimension(:), allocatable :: SENDbuf, RECVbuf
integer(kind=4), dimension(:), allocatable :: NEIBPE
integer(kind=4), dimension(:), allocatable :: import_index, import_item
integer(kind=4), dimension(:), allocatable :: export_index, export_item
integer(kind=4), dimension(:,:), allocatable :: stat_send, stat_recv
integer(kind=4), dimension(:  ), allocatable :: request_send
integer(kind=4), dimension(:  ), allocatable :: request_recv
character(len=80) :: filename, line
!C
!C +-----------+
!C | INIT. MPI |
!C +-----------+
!C===
call MPI_INIT (ierr)
call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )
Example: sq-sr1.f (2/6): Reading distributed local data files (sqm.*)
!C
!C-- MESH
if (my_rank.eq.0) filename= 'sqm.0'
if (my_rank.eq.1) filename= 'sqm.1'
if (my_rank.eq.2) filename= 'sqm.2'
if (my_rank.eq.3) filename= 'sqm.3'
open (21, file= filename, status= 'unknown')
!C-- number and IDs of neighbors
read (21,*) NEIBPETOT
allocate (NEIBPE(NEIBPETOT))
allocate (import_index(0:NEIBPETOT))
allocate (export_index(0:NEIBPETOT))
import_index= 0
export_index= 0
read (21,*) (NEIBPE(neib), neib= 1, NEIBPETOT)
!C-- NP: internal + external meshes, N: internal meshes
read (21,*) NP, N
!C-- import table (external points)
read (21,'(a80)') line
read (21,*) (import_index(neib), neib= 1, NEIBPETOT)
nn= import_index(NEIBPETOT)
allocate (import_item(nn))
do i= 1, nn
  read (21,*) import_item(i)
enddo
!C-- export table (boundary points)
read (21,'(a80)') line
read (21,*) (export_index(neib), neib= 1, NEIBPETOT)
nn= export_index(NEIBPETOT)
allocate (export_item(nn))
do i= 1, nn
  read (21,*) export_item(i)
enddo
close (21)
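The read sequence can be tried without MPI and without preparing files by hand. The sketch below writes the PE#0 table to a scratch file and reads it back with list-directed reads. The file layout is an assumption: numbers only, one item per line, and no label lines (so the two `read (21,'(a80)') line` statements are not needed here).

```fortran
program read_comm_table
  implicit none
  integer :: u, NEIBPETOT, NP, N, neib, i, nn
  integer, allocatable :: NEIBPE(:), import_index(:), import_item(:)
  integer, allocatable :: export_index(:), export_item(:)
  integer, parameter :: exp_items(8) = (/ 4, 8, 12, 16, 13, 14, 15, 16 /)

  ! Write an assumed "sqm.0"-like table into a scratch file
  open (newunit=u, status='scratch', form='formatted', action='readwrite')
  write (u,'(a)') '2'          ! NEIBPETOT
  write (u,'(a)') '1 2'        ! NEIBPE
  write (u,'(a)') '24 16'      ! NP, N
  write (u,'(a)') '4 8'        ! import_index
  do i = 17, 24
    write (u,'(i0)') i         ! import_item, one per line
  enddo
  write (u,'(a)') '4 8'        ! export_index
  do i = 1, 8
    write (u,'(i0)') exp_items(i)
  enddo
  rewind (u)

  ! Read it back in the same order as in the slide
  read (u,*) NEIBPETOT
  allocate (NEIBPE(NEIBPETOT))
  allocate (import_index(0:NEIBPETOT), export_index(0:NEIBPETOT))
  import_index = 0
  export_index = 0
  read (u,*) (NEIBPE(neib), neib= 1, NEIBPETOT)
  read (u,*) NP, N
  read (u,*) (import_index(neib), neib= 1, NEIBPETOT)
  nn = import_index(NEIBPETOT)
  allocate (import_item(nn))
  do i = 1, nn
    read (u,*) import_item(i)
  enddo
  read (u,*) (export_index(neib), neib= 1, NEIBPETOT)
  nn = export_index(NEIBPETOT)
  allocate (export_item(nn))
  do i = 1, nn
    read (u,*) export_item(i)
  enddo
  close (u)

  write (*,'(a,2i4)') 'NP, N      :', NP, N
  write (*,'(a,8i4)') 'import_item:', import_item
  write (*,'(a,8i4)') 'export_item:', export_item
  if (NP /= 24 .or. N /= 16 .or. any(export_item /= exp_items)) error stop 'read failed'
end program read_comm_table
```

Because each `read` inside the loops starts on a new record, the items have to be stored one per line in this layout.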
Distributed local data file read by the code above (PE#0):

#NEIBPEtot
2
#NEIBPE
1 2
#NODE
24 16
#IMPORTindex
4 8
#IMPORTitems
17
18
19
20
21
22
23
24
#EXPORTindex
4 8
#EXPORTitems
4
8
12
16
13
14
15
16

• NP: number of all meshes (internal + external)
• N: number of internal meshes

RECV/Import: PE#0
[Figure: local numbering on PE#0 with neighbors PE#1 and PE#2; the external points listed in #IMPORTitems are received from the neighbors]

SEND/Export: PE#0
[Figure: local numbering on PE#0 with neighbors PE#1 and PE#2; the boundary points listed in #EXPORTitems are sent to the neighbors]
Example: sq-sr1.f (3/6): Reading distributed local data files (sq.*)
[Figure: global IDs of the meshes of PE#0 and its neighbors PE#1 and PE#2]
Values in sq.0 (global IDs of the internal meshes of PE#0): 1, 2, 3, 4, 9, 10, 11, 12, 17, 18, 19, 20, 25, 26, 27, 28
!C
!C-- VAL.
if (my_rank.eq.0) filename= 'sq.0'
if (my_rank.eq.1) filename= 'sq.1'
if (my_rank.eq.2) filename= 'sq.2'
if (my_rank.eq.3) filename= 'sq.3'
allocate (VAL(NP))
VAL= 0
open (21, file= filename, status= 'unknown')
do i= 1, N
  read (21,*) VAL(i)
enddo
close (21)
!C===
• N: number of internal points
• VAL: global ID of meshes
• VAL on external points is unknown at this stage.

Example: sq-sr1.f (4/6): Preparation of sending/receiving buffers
!C
!C +--------+
!C | BUFFER |
!C +--------+
!C===
allocate (SENDbuf(export_index(NEIBPETOT)))
allocate (RECVbuf(import_index(NEIBPETOT)))
SENDbuf= 0
RECVbuf= 0
do neib= 1, NEIBPETOT
  iS= export_index(neib-1) + 1
  iE= export_index(neib  )
  do i= iS, iE
    SENDbuf(i)= VAL(export_item(i))
  enddo
enddo
!C===
Info. of boundary points is written into the sending buffer (SENDbuf). Info. sent to NEIBPE(neib) is stored in SENDbuf(export_index(neib-1)+1:export_index(neib)).

Sending Buffer is nice ...
[Figure: local numbering on PE#0 with neighbors PE#1 and PE#2]
Numbering of these boundary nodes is not continuous, therefore the following procedure of MPI_Isend cannot be applied directly to VAL:
・Starting address of sending buffer
・XX-messages from that address
do neib= 1, NEIBPETOT
  iS_e= export_index(neib-1) + 1
  iE_e= export_index(neib  )
  BUFlength_e= iE_e + 1 - iS_e
  call MPI_ISEND &
  &    (VAL(...), BUFlength_e, MPI_INTEGER, NEIBPE(neib), 0, &
  &     MPI_COMM_WORLD, request_send(neib), ierr)
enddo
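Why VAL cannot be passed to MPI_Isend directly can be seen from the local IDs themselves. The serial sketch below prints the stride between consecutive boundary points of PE#0 and packs them with a Fortran vector subscript, which is an alternative way of writing the copy loop (not the form used in sq-sr1.f). The contents of VAL are assumed to be the global IDs of the internal meshes of PE#0.

```fortran
program pack_with_vector_subscript
  implicit none
  integer :: VAL(16), export_item(8), buf_loop(8), buf_vec(8), k

  ! Assumption: global IDs of the internal meshes of PE#0
  VAL = (/ 1, 2, 3, 4, 9, 10, 11, 12, 17, 18, 19, 20, 25, 26, 27, 28 /)
  export_item = (/ 4, 8, 12, 16, 13, 14, 15, 16 /)

  ! Boundary points for the 1st neighbor are 4 apart in VAL (not contiguous)
  write (*,'(a,3i4)') 'strides, 1st neighbor:', export_item(2:4) - export_item(1:3)
  write (*,'(a,3i4)') 'strides, 2nd neighbor:', export_item(6:8) - export_item(5:7)

  ! Copy loop as in the slides
  do k = 1, 8
    buf_loop(k) = VAL(export_item(k))
  enddo

  ! Equivalent gather with a vector subscript
  buf_vec = VAL(export_item)

  write (*,'(a,8i4)') 'contiguous buffer    :', buf_vec
  if (any(buf_loop /= buf_vec)) error stop 'loop and vector subscript differ'
end program pack_with_vector_subscript
```

MPI_Isend takes a starting address and a count, so the values must first be gathered into a contiguous buffer such as SENDbuf.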
Communication Pattern using 1D Structure
[Figure: 1D domains with halo regions exchanged between neighbors; borrowed from Dr. Osni Marques (Lawrence Berkeley National Laboratory)]

Example: sq-sr1.f (5/6): SEND/Export: MPI_Isend
!C
!C +-----------+
!C | SEND-RECV |
!C +-----------+
!C===
allocate (stat_send(MPI_STATUS_SIZE,NEIBPETOT))
allocate (stat_recv(MPI_STATUS_SIZE,NEIBPETOT))
allocate (request_send(NEIBPETOT))
allocate (request_recv(NEIBPETOT))
do neib= 1, NEIBPETOT
  iS= export_index(neib-1) + 1
  iE= export_index(neib  )
  BUFlength= iE + 1 - iS
  call MPI_ISEND (SENDbuf(iS), BUFlength, MPI_INTEGER, &
  &               NEIBPE(neib), 0, MPI_COMM_WORLD, &
  &               request_send(neib), ierr)
enddo
do neib= 1, NEIBPETOT
  iS= import_index(neib-1) + 1
  iE= import_index(neib  )
  BUFlength= iE + 1 - iS
  call MPI_IRECV (RECVbuf(iS), BUFlength, MPI_INTEGER, &
  &               NEIBPE(neib), 0, MPI_COMM_WORLD, &
  &               request_recv(neib), ierr)
enddo
[Figure: 8x8 global mesh (global IDs 1-64) partitioned into four 4x4 domains. PE#0: 1-4, 9-12, 17-20, 25-28; PE#1: 5-8, 13-16, 21-24, 29-32; PE#2: 33-36, 41-44, 49-52, 57-60; PE#3: 37-40, 45-48, 53-56, 61-64]

Example: sq-sr1.f (5/6): RECV/Import: MPI_Irecv
In the SEND-RECV block above, the second loop posts one MPI_IRECV per neighbor; the message from NEIBPE(neib) is received into RECVbuf(import_index(neib-1)+1:import_index(neib)).

Example: sq-sr1.f (6/6): Reading info of ext pts from receiving buffers
call MPI_WAITALL (NEIBPETOT, request_recv, stat_recv, ierr)
do neib= 1, NEIBPETOT
  iS= import_index(neib-1) + 1
  iE= import_index(neib  )
  do i= iS, iE
    VAL(import_item(i))= RECVbuf(i)
  enddo
enddo
call MPI_WAITALL (NEIBPETOT, request_send, stat_send, ierr)
!C===
!C
!C +--------+
!C | OUTPUT |
!C +--------+
!C===
do neib= 1, NEIBPETOT
  iS= import_index(neib-1) + 1
  iE= import_index(neib  )
  do i= iS, iE
    in= import_item(i)
    write (*,'(a, 3i8)') 'RECVbuf', my_rank, NEIBPE(neib), VAL(in)
  enddo
enddo
!C===
call MPI_FINALIZE (ierr)
stop
end
Contents of RECVbuf are copied to values at external points.

Example: sq-sr1.f (6/6): Writing values at external points
The OUTPUT block above writes my_rank, NEIBPE(neib) and the received value for each external point.
Results (PE#0, PE#1, PE#2, PE#3)
Columns: my_rank, NEIBPE(neib), value received at the external point.
RECVbuf 0 1 5
RECVbuf 0 1 13
RECVbuf 0 1 21
RECVbuf 0 1 29
RECVbuf 0 2 33
RECVbuf 0 2 34
RECVbuf 0 2 35
RECVbuf 0 2 36
RECVbuf 1 0 4
RECVbuf 1 0 12
RECVbuf 1 0 20
RECVbuf 1 0 28
RECVbuf 1 3 37
RECVbuf 1 3 38
RECVbuf 1 3 39
RECVbuf 1 3 40
RECVbuf 2 3 37
RECVbuf 2 3 45
RECVbuf 2 3 53
RECVbuf 2 3 61
RECVbuf 2 0 25
RECVbuf 2 0 26
RECVbuf 2 0 27
RECVbuf 2 0 28
RECVbuf 3 2 36
RECVbuf 3 2 44
RECVbuf 3 2 52
RECVbuf 3 2 60
RECVbuf 3 1 29
RECVbuf 3 1 30
RECVbuf 3 1 31
RECVbuf 3 1 32
[Figure: 8x8 global mesh (global IDs 1-64) partitioned into PE#0, PE#1, PE#2 and PE#3]
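The listing above can be reproduced without MPI by keeping the data of all four domains in one serial program and copying each sending segment into the matching receiving segment. This is only an emulation for checking the tables: the export tables of PE#1 to PE#3 and the row-wise local numbering are assumptions inferred from the figures and the results, and the array layout (one column per PE) is hypothetical.

```fortran
program emulate_exchange
  implicit none
  integer, parameter :: PETOT = 4, NEIBPETOT = 2, N = 16, NP = 24, NB = 8
  integer :: NEIBPE(NEIBPETOT,0:PETOT-1), idx(0:NEIBPETOT)
  integer :: export_item(NB,0:PETOT-1), import_item(NB)
  integer :: VAL(NP,0:PETOT-1), SENDbuf(NB,0:PETOT-1), RECVbuf(NB,0:PETOT-1)
  integer :: p, q, neib, m, k, i, j

  NEIBPE(:,0) = (/ 1, 2 /)
  NEIBPE(:,1) = (/ 0, 3 /)
  NEIBPE(:,2) = (/ 3, 0 /)
  NEIBPE(:,3) = (/ 2, 1 /)
  idx = (/ 0, 4, 8 /)                      ! same import/export index on every PE
  export_item(:,0) = (/ 4, 8, 12, 16, 13, 14, 15, 16 /)
  export_item(:,1) = (/ 1, 5, 9, 13, 13, 14, 15, 16 /)   ! assumed
  export_item(:,2) = (/ 4, 8, 12, 16, 1, 2, 3, 4 /)      ! assumed
  export_item(:,3) = (/ 1, 5, 9, 13, 1, 2, 3, 4 /)       ! assumed
  import_item = (/ (k, k= N+1, NP) /)

  ! Internal points hold the global ID in the 8x8 mesh (4x4 meshes per PE)
  VAL = 0
  do p = 0, PETOT-1
    do j = 1, 4
      do i = 1, 4
        VAL(4*(j-1)+i,p) = 8*(j-1 + 4*(p/2)) + i + 4*mod(p,2)
      enddo
    enddo
  enddo

  ! Copy boundary values into the sending buffers
  do p = 0, PETOT-1
    do k = 1, NB
      SENDbuf(k,p) = VAL(export_item(k,p),p)
    enddo
  enddo

  ! Emulated send/recv: segment neib of p goes to the segment of q that lists p
  do p = 0, PETOT-1
    do neib = 1, NEIBPETOT
      q = NEIBPE(neib,p)
      do m = 1, NEIBPETOT
        if (NEIBPE(m,q) == p) &
          RECVbuf(idx(m-1)+1:idx(m),q) = SENDbuf(idx(neib-1)+1:idx(neib),p)
      enddo
    enddo
  enddo

  ! Copy from the receiving buffers and print as in the OUTPUT block
  do p = 0, PETOT-1
    do neib = 1, NEIBPETOT
      do k = idx(neib-1)+1, idx(neib)
        VAL(import_item(k),p) = RECVbuf(k,p)
        write (*,'(a, 3i8)') 'RECVbuf', p, NEIBPE(neib,p), VAL(import_item(k),p)
      enddo
    enddo
  enddo

  if (VAL(17,0) /= 5 .or. VAL(24,3) /= 32) error stop 'unexpected external values'
end program emulate_exchange
```

Apart from column spacing, the printed lines are expected to match the Results listing.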
Distributed Local Data Structure for Parallel Computation
• Distributed local data structure for domain-to-domain communications has been introduced, which is appropriate for applications with sparse coefficient matrices (e.g. FDM, FEM, FVM etc.).
  – SPMD
  – Local Numbering: Internal pts to External pts
  – Generalized communication table
• Everything is easy, if a proper data structure is defined:
  – Values at boundary pts are copied into sending buffers
  – Send/Recv
  – Values at external pts are updated through receiving buffers
Three Domains
[Figure: 5x5 mesh (global IDs 1-25) partitioned into three domains #PE0, #PE1 and #PE2, shown with global numbering and with local/global numbering]
Local numbering (internal points first, then external points) and the corresponding global IDs:
• 13-point domain: local 1-8 = global 1-8 (internal); local 9-13 = global 9-13 (external)
• 14-point domain: local 1-8 = global 9, 10, 14, 15, 19, 20, 24, 25 (internal); local 9-14 = global 4, 5, 8, 13, 18, 23 (external)
• 15-point domain: local 1-9 = global 11, 12, 13, 16, 17, 18, 21, 22, 23 (internal); local 10-15 = global 6, 7, 8, 14, 19, 24 (external)

PE#0: sqm.0: fill ○'s
#NEIBPEtot
2
#NEIBPE
1 2
#NODE
13 8 (int+ext, int pts)
#IMPORTindex
○ ○
#IMPORTitems
○…
#EXPORTindex
○ ○
#EXPORTitems
○…

PE#1: sqm.1: fill ○'s
#NEIBPEtot
2
#NEIBPE
0 2
#NODE
14 8 (int+ext, int pts)
#IMPORTindex
○ ○
#IMPORTitems
○…
#EXPORTindex
○ ○
#EXPORTitems
○…

PE#2: sqm.2: fill ○'s
#NEIBPEtot
2
#NEIBPE
1 0
#NODE
15 9 (int+ext, int pts)
#IMPORTindex
○ ○
#IMPORTitems
○…
#EXPORTindex
○ ○
#EXPORTitems
○…
Procedures
• Number of Internal/External Points
• Where do External Pts come from ?
  – IMPORTindex, IMPORTitems
  – Sequence of NEIBPE
• Then check destinations of Boundary Pts.
  – EXPORTindex, EXPORTitems
  – Sequence of NEIBPE
• “sq.*” are in <$O-S2>/ex
• Create “sqm.*” by yourself
• copy <$O-S2>/a.out (by sq-sr1.f) to <$O-S2>/ex
• pjsub go3.sh

Report S2 (1/2)
• Parallelize 1D code (1d.f) using MPI
• Read entire element number, and decompose into sub-domains in your program
• Measure parallel performance
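One common way to decompose a 1D problem into sub-domains is a block distribution in which the remainder is given to the first ranks. The serial sketch below loops over the ranks instead of calling MPI; NEg (total number of elements) and the variable names are hypothetical and not taken from 1d.f.

```fortran
program block_decomposition
  implicit none
  integer, parameter :: NEg = 10, PETOT = 4   ! hypothetical problem size and process count
  integer :: my_rank, nlocal, istart, iend, total

  total = 0
  do my_rank = 0, PETOT-1
    ! Base size, plus one extra element on the first mod(NEg,PETOT) ranks
    nlocal = NEg / PETOT
    if (my_rank < mod(NEg, PETOT)) nlocal = nlocal + 1

    ! Global ID of the first and last element owned by this rank
    istart = my_rank*(NEg/PETOT) + min(my_rank, mod(NEg, PETOT)) + 1
    iend   = istart + nlocal - 1

    write (*,'(a,i2,a,i3,a,i3,a,i3)') 'rank', my_rank, ': elements', istart, &
         ' to', iend, ', count', nlocal
    total = total + nlocal
  enddo

  ! Self-check: every element is assigned exactly once
  if (total /= NEg .or. iend /= NEg) error stop 'decomposition does not cover all elements'
end program block_decomposition
```

In an MPI program the loop over my_rank disappears and each process evaluates the same formulas with its own rank.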
Report S2 (2/2)
• Deadline: 17:00 February 14th (Sat), 2015.
  – Send files via e-mail at nakajima(at)cc.u-tokyo.ac.jp
• Problem
  – Apply “Generalized Communication Table”
  – Read entire elem. #, decompose into sub-domains in your program
  – Evaluate parallel performance
    • You need a huge number of elements to get excellent performance.
    • Fix the number of iterations (e.g. 100), if computations cannot be completed.
• Report
  – Cover Page: Name, ID, and Problem ID (S2) must be written.
  – Less than eight pages including figures and tables (A4).
    • Strategy, Structure of the Program, Remarks
  – Source list of the program (if you have bugs)
  – Output list (as small as possible)
pFEM3D-2
!C
!C +-----------------+
!C | MATRIX ASSEMBLE |
!C +-----------------+
!C===
do icel= 1, NE
  in1= ICELNOD(2*icel-1)
  in2= ICELNOD(2*icel  )
  DL = dX
  cK= AREA*Young/DL
  EMAT(1,1)= cK*KMAT(1,1)
  EMAT(1,2)= cK*KMAT(1,2)
  EMAT(2,1)= cK*KMAT(2,1)
  EMAT(2,2)= cK*KMAT(2,2)
  DIAG(in1)= DIAG(in1) + EMAT(1,1)
  DIAG(in2)= DIAG(in2) + EMAT(2,2)
  if (my_rank.eq.0.and.icel.eq.1) then
    k1= INDEX(in1-1) + 1
  else
    k1= INDEX(in1-1) + 2
  endif
  k2= INDEX(in2-1) + 1
  AMAT(k1)= AMAT(k1) + EMAT(1,2)
  AMAT(k2)= AMAT(k2) + EMAT(2,1)
enddo
!C===
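The role of k1 and k2 becomes clearer on a single domain. The sketch below assembles a 5-node 1D mesh serially, so the `my_rank.eq.0` test reduces to `icel.eq.1`. Several things are assumptions for illustration: KMAT is taken as the usual 1D element matrix [[1,-1],[-1,1]], the material constants are set to 1, and INDEX is built so that the off-diagonals of each node are stored in the order (left neighbor, right neighbor).

```fortran
program assemble_1d
  implicit none
  integer, parameter :: N = 5, NE = N-1
  real(kind=8), parameter :: AREA = 1.d0, Young = 1.d0, dX = 1.d0   ! assumed values
  integer :: INDEX(0:N), ICELNOD(2*NE)
  real(kind=8) :: DIAG(N), AMAT(2*N-2), KMAT(2,2), EMAT(2,2), cK
  integer :: i, icel, in1, in2, k1, k2

  ! Assumed element matrix of a 1D linear element (without the factor cK)
  KMAT = reshape((/ 1.d0, -1.d0, -1.d0, 1.d0 /), (/ 2, 2 /))

  ! Off-diagonal counts: end nodes have one neighbor, interior nodes have two
  INDEX(0) = 0
  do i = 1, N
    if (i == 1 .or. i == N) then
      INDEX(i) = INDEX(i-1) + 1
    else
      INDEX(i) = INDEX(i-1) + 2
    endif
  enddo

  ! Element icel connects nodes icel and icel+1
  do icel = 1, NE
    ICELNOD(2*icel-1) = icel
    ICELNOD(2*icel  ) = icel + 1
  enddo

  DIAG = 0.d0
  AMAT = 0.d0
  do icel = 1, NE
    in1 = ICELNOD(2*icel-1)
    in2 = ICELNOD(2*icel  )
    cK  = AREA*Young/dX
    EMAT = cK*KMAT
    DIAG(in1) = DIAG(in1) + EMAT(1,1)
    DIAG(in2) = DIAG(in2) + EMAT(2,2)
    ! Right neighbor of in1: 1st slot only for the very first node, else 2nd slot
    if (icel == 1) then
      k1 = INDEX(in1-1) + 1
    else
      k1 = INDEX(in1-1) + 2
    endif
    ! Left neighbor of in2 is always stored in the 1st slot
    k2 = INDEX(in2-1) + 1
    AMAT(k1) = AMAT(k1) + EMAT(1,2)
    AMAT(k2) = AMAT(k2) + EMAT(2,1)
  enddo

  write (*,'(a,5f6.2)') 'DIAG:', DIAG
  write (*,'(a,8f6.2)') 'AMAT:', AMAT
  if (any(abs(AMAT + 1.d0) > 1.d-12)) error stop 'unexpected off-diagonal'
  if (abs(DIAG(1) - 1.d0) > 1.d-12 .or. abs(DIAG(3) - 2.d0) > 1.d-12) error stop 'unexpected diagonal'
end program assemble_1d
```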
MAT_ASS_MAIN visits all elements, including overlapped elements with external nodes.
[Figure: 5x5 global node numbering (1-25) partitioned into PE#0, PE#1, PE#2 and PE#3, with the local node numbering of each domain including external nodes]