Source verification of mis-identified Arabidopsis thaliana accessions
TECHNICAL ADVANCE
Alison E. Anastasio1,†, Alexander Platt2,†, Matthew Horton1, Erich Grotewold3, Randy Scholl3, Justin O. Borevitz1,
Magnus Nordborg2,4 and Joy Bergelson1,*
1Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA, 2Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA, 3Arabidopsis Biological Resource Center, Ohio State University, Columbus, OH 43210, USA, and 4Gregor Mendel Institute, 1030 Vienna, Austria
Received 9 March 2011; revised 3 April 2011; accepted 6 April 2011; published online 16 June 2011.
*For correspondence (fax 773-702-9740; e-mail [email protected]).
†These authors contributed equally to this work.
SUMMARY
A major strength of Arabidopsis thaliana as a model lies in the availability of a large number of naturally
occurring inbred lines. Recent studies of A. thaliana population structure, using thousands of accessions from
stock center and natural collections, have revealed a robust pattern of isolation by distance at several spatial
scales, such that genetically identical individuals are generally found close to each other. However, some
individual accessions deviate from this pattern. While some of these may be the products of rare long-distance
dispersal events, many deviations may be the result of mis-identification, in the sense that the data regarding
location of origin are incorrect. Here, we aim to identify such discrepancies. Of the 5965 accessions
examined, we conclude that 286 deserve special attention as being potentially mis-identified. We describe
these suspicious accessions and their possible origins, and advise caution with regard to their use in
experiments in which accurate information on geographic origin is important. Finally, we discuss possibilities
for maintaining the integrity of stock lines.
Keywords: Arabidopsis thaliana, contamination, long-distance dispersal, stock center, natural variation,
population structure.
INTRODUCTION
The Arabidopsis community has been interested in natural
variation for decades (reviewed by Alonso-Blanco and
Koornneef, 2000; Meyerowitz, 2001). In 1937, Friedrich
Laibach began collecting local ecotypes (Laibach, 1943;
Meyerowitz, 2001; Koornneef and Meinke, 2010). His personal
collection and compilation of natural accessions from
other collectors constituted the first standardized set of seed
stock used by researchers (Robbelen, 1965). As the popularity
of Arabidopsis increased, researchers focused on a few
standard lines (e.g. Ler-0, Col-0), from which mutants, and
later recombinant inbred lines, were derived. Researchers
also continued to collect new ecotypes from natural populations,
notably Laibach in Europe, Ivo Cetl in Moravia, and
Albert Kranz in Germany. Several narrowly distributed
papers described phenotypic variation in these accessions,
but, with the advent of the Arabidopsis Information Service
(AIS) in 1964, descriptions of natural accessions and their
ecology were shared among a growing community (Cetl,
1965; Cetl et al., 1965; Robbelen, 1965; Effmertova and Cetl,
1966; Effmertova, 1967). Soon after, the AIS Seed Stock
Center was established to house ecotypes and mutant lines
(Somerville and Koornneef, 2002). Currently, three stock
centers supply Arabidopsis researchers around the world
with seeds and DNA for their work: the Arabidopsis Biological
Resource Center at Ohio State University, Columbus,
OH, USA (http://abrc.osu.edu), the Nottingham Arabidopsis
Stock Centre (NASC) at the University of Nottingham, UK
(http://arabidopsis.info), and the Biological Resource Center,
part of the RIKEN Organization (formerly Sendai Arabidopsis
Seed Stock Center), in Japan
(http://www.brc.riken.go.jp/lab/epd/Eng/catalog/seed.shtml).
© 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd
The Plant Journal (2011) 67, 554–566 doi: 10.1111/j.1365-313X.2011.04606.x
The availability of so many natural accessions means that
scientists can exploit natural genetic variability within the
species to identify gene functions that mutagenesis fails
to uncover (Tonsor et al., 2005), illuminate traits that are
important in native habitats and are targets of selection
(Mauricio and Rausher, 1997; Stahl et al., 1999; Alonso-Blanco
and Koornneef, 2000; Johanson et al., 2000; Hauser
et al., 2001; Tian et al., 2002; Mauricio et al., 2003; Goss and
Bergelson, 2006; Mitchell-Olds and Schmitt, 2006; McKay
et al., 2008), test evolutionary theory (Bergelson et al., 2001;
Nordborg et al., 2005; Bakker et al., 2006; Ehrenreich and
Purugganan, 2006; Shindo et al., 2007; Novembre and
Slatkin, 2009), and understand the extent to which population
structure is important in natural populations (Bergelson
et al., 1998; Nordborg et al., 2005; Beck et al., 2008; Bomblies
et al., 2010; Platt et al., 2010). A. thaliana has a global
distribution and is found in a variety of habitats. Considerable
variation in ecologically relevant traits has been
uncovered using accessions from disparate geographical
areas (Aranzana et al., 2005; Banta et al., 2007; Shindo et al.,
2007; Bouchabke et al., 2008). This natural variation is
especially useful in the study of adaptation. Indeed, conclusions
about historical demographic events have been drawn
using these accessions and information regarding their
specific geographical locations (Nordborg et al., 2005; Beck
et al., 2008; Pico et al., 2008). These conclusions rely on the
association between genotypic and geographic information
being correct.
A way to validate the identity of accessions used for
research is to genotype a large collection of individuals,
characterize the population structure across the species
range, and identify individuals that appear to be out of place.
This is especially easy when a species has strong population
structure, such as in species that show clear isolation by
distance. In this case, misplaced individuals may be the result
of either recent long-distance migration or human error
(i.e. mislabeling or contamination of stocks). In general, we
do not expect outliers to be the result of true long-distance
migration. The mere fact that they are identifiable as outliers
means that long-distance migration is exceedingly rare;
furthermore, we would not expect to find such individuals in
studies with limited sample size. Exceptions include situations
where migration rates have recently increased (as has
clearly happened in humans), or where genotypes with a
predisposition for migration are strongly favored by selection.
However, outliers can usually be explained by mislabeling
or other human error.
Using a data set of 139 SNPs in almost 6000 common
laboratory strains and natural accessions, Platt et al. (2010)
revealed that A. thaliana conforms to a pattern of isolation
by distance at several scales, and that the scales differ
regionally. Although genetically identical individuals can be
found across North America, identical individuals are rarely
separated by more than 1 km in continental Eurasia. Given
the clear pattern of isolation by distance across large
geographical areas, there is an expectation of the degree
to which plants within a region are related. In preparing data
for our previous paper (Platt et al., 2010), we came across
a number of accessions from both our collections and the
stock centers that are outliers. Plants that are almost
identical to geographically distant plants stand out as
potentially mis-identified. In cases where local collections
are not especially diverse, a single distantly related outlier
(perhaps also found in another location) is also suspicious.
Importantly, the inverse is true as well. Plants that are closely
related to close neighbors and only distantly related to
distant neighbors are almost certainly correctly identified.
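As an illustration of this reasoning, the following Python sketch flags pairs of accessions that are genetically similar yet geographically distant. It is a minimal sketch and not the procedure used in Platt et al. (2010): the record layout, the function names, the treatment of `N` as a missing call, and the 1000 km cutoff are all assumptions made for illustration, while the 70% identity threshold echoes the definition of a relative given in the caption of Table 1.

```python
import math
from itertools import combinations


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def identity(g1, g2):
    """Fraction of SNP positions with identical calls; 'N' (assumed) marks a missing call."""
    pairs = [(a, b) for a, b in zip(g1, g2) if a != "N" and b != "N"]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def distant_relatives(accessions, min_identity=0.70, min_km=1000.0):
    """Return (name1, name2, identity, km) for similar pairs that lie far apart.

    min_km is a hypothetical cutoff chosen for this example only.
    """
    hits = []
    for a, b in combinations(accessions, 2):
        ident = identity(a["snps"], b["snps"])
        km = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"])
        if ident > min_identity and km > min_km:
            hits.append((a["name"], b["name"], ident, km))
    return hits


# Toy data (invented): A and B are neighbors, C carries A's genotype
# but is recorded on another continent, D is C's unrelated neighbor.
accessions = [
    {"name": "A", "lat": 48.0, "lon": 2.0, "snps": "AACCGGTTAC"},
    {"name": "B", "lat": 48.1, "lon": 2.1, "snps": "AACCGGTTAA"},
    {"name": "C", "lat": 41.0, "lon": -87.0, "snps": "AACCGGTTAC"},
    {"name": "D", "lat": 41.1, "lon": -87.1, "snps": "TTGGAACCGT"},
]

hits = distant_relatives(accessions)
for name1, name2, ident, km in hits:
    print(f"{name1} ~ {name2}: identity {ident:.2f}, {km:.0f} km apart")
```

In the toy data, accession C is identical to A but was recorded far away, so the pairs A/C and B/C are reported, whereas the neighbors A and B are not. Which member of such a pair is out of place still has to be judged from its local neighbors, as described above.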
RESULTS
Based on pairwise comparisons between individuals, we
identified three categories of accessions, ranging from well-corroborated
to highly suspicious. Geographic information
for individuals on the ‘green list’ was corroborated by
neighbors. These accessions are genetically similar to their
neighbors, and genetically dissimilar from accessions found
farther away. The green list included 70% of accessions
(Table S1). Accessions on the ‘red list’ are genetically
differentiated from their neighbors, genetically similar to
geographically distant individuals, and often reveal other
characteristics strongly suggestive of contamination, as
discussed below. Just under 5% of the data set (286 accessions)
were included on the red list: all of these are genetically
similar to geographically distant accessions and
dissimilar to their geographic neighbors (Table 1). The
‘yellow list’ contained the remaining 1472 accessions
(almost 25%). These include 131 accessions for which there
are insufficient nearby collections for corroboration, and
1341 accessions whose only long-distance, genetically close
relatives were on the red list.
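The three-way classification can be summarized in code. The sketch below is only one possible reading of the criteria described here, not the authors' implementation: the field names, the `MIN_NEIGHBORS` cutoff, and the toy records are hypothetical, and the green-list count is obtained by subtraction from the reported totals rather than taken from the text.

```python
MIN_NEIGHBORS = 3  # hypothetical cutoff for "sufficient nearby collections"


def assign_lists(records):
    """Map each accession name to 'green', 'yellow' or 'red'.

    Each record is a dict with (hypothetical) keys:
      n_neighbors          number of genotyped accessions collected nearby
      similar_to_neighbors True if genetically similar to those neighbors
      distant_relatives    names of genetically close accessions found far away
    """
    def corroborable(r):
        return r["n_neighbors"] >= MIN_NEIGHBORS

    # Red: differs from its neighbors and resembles geographically distant accessions.
    red = {
        name for name, r in records.items()
        if corroborable(r) and not r["similar_to_neighbors"] and r["distant_relatives"]
    }

    lists = {}
    for name, r in records.items():
        if name in red:
            lists[name] = "red"
        elif corroborable(r) and r["similar_to_neighbors"] and not r["distant_relatives"]:
            # Green: corroborated by neighbors, no close relatives far away.
            lists[name] = "green"
        else:
            # Yellow: everything else, e.g. too few neighbors, or distant
            # relatives that are themselves on the red list.
            lists[name] = "yellow"
    return lists


# Toy records (invented for illustration).
records = {
    "local-1": {"n_neighbors": 5, "similar_to_neighbors": True, "distant_relatives": []},
    "stray-1": {"n_neighbors": 5, "similar_to_neighbors": False, "distant_relatives": ["source-1"]},
    "source-1": {"n_neighbors": 5, "similar_to_neighbors": True, "distant_relatives": ["stray-1"]},
    "lonely-1": {"n_neighbors": 0, "similar_to_neighbors": False, "distant_relatives": []},
}
lists = assign_lists(records)

# Consistency check on the counts reported in the text.
TOTAL, RED, YELLOW = 5965, 286, 1472
GREEN = TOTAL - RED - YELLOW  # derived by subtraction, not stated in the text
shares = {
    name: round(100 * n / TOTAL, 1)
    for name, n in [("green", GREEN), ("yellow", YELLOW), ("red", RED)]
}
print(lists)
print(GREEN, shares)
```

With the reported totals, the two yellow-list subgroups (131 and 1341) sum to 1472, and the derived shares agree with the rounded figures quoted in the text (70%, almost 25%, and just under 5%).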
The 286 accessions on the red list were further divided
into four categories. The first category comprised 71 accessions
that were the sole member of their haplogroup. The
second category comprised 78 accessions that belonged
to haplogroups that included commonly used laboratory
strains and were extensively over-dispersed (Col-0, Kin-0,
Kondara, Ler-0 and Ws-2) (all members of such haplogroups
were included on the red list). The third category comprised
18 haplogroups in which one or more far-flung individuals
came from a different geographical area than the majority of
members in the group; 30 accessions fitted this description
and were the only members of their haplogroup placed on
the red list (although, accordingly, the remaining members
were on the yellow list). For instance, the notable Midwest
US haplogroup hg8115, known as the ‘Heartland’ haplogroup,
contains 1041 accessions, one of which is not
Table 1 Identity and categorization of all accessions on the red list, including most closely related accession, percentage identity, geographic distance to the most distant relative (>70% identity), the most related accession by name and number, percent similarity and the distance to the most related accession
Columns: Category; Haplogroup; Accession name; Native name; Collector; Accession number; Country; Latitude; Longitude; Maximum distance to relative (km); Most related accession number; Most related accession name; Similarity; Geographic distance (km)
CS
2806
61
590.
054
7154
CS
2832
2G
r-1
Kra
nz
7155
AU
T47
15.5
1116
.48
7154
CS
2832
81
6.3
471
54G
r-1
Gr-
1K
ran
z83
00A
UT
4715
.511
49.5
471
54C
S28
328
16.
34
7154
CS
2806
6B
erR
edei
7458
DE
N55
.675
12.5
687
859.
3671
54C
S28
328
159
0.05
471
54U
KS
E06
-476
UK
SE
06-4
76H
olu
b52
41U
K51
.20.
411
49.5
471
54C
S28
328
111
43.0
14
7156
CS
2832
3G
r-2
Kra
nz
7156
AU
T47
15.5
836.
6570
25C
S28
078
135
1.25
471
56C
S28
078
Bl-
1K
ran
z70
25IT
A44
.504
111
.339
665
7.08
7156
CS
2832
31
351.
254
7202
CS
2839
4K
l-5
Kra
nz
7199
GE
R50
.95
6.96
6667
93.1
572
02C
S28
380
112
0.55
472
02C
S28
380
Kb
-0K
ran
z72
02G
ER
50.1
797
8.50
861
6952
.73
7199
CS
2839
41
120.
554
7246
CS
2848
9M
a-2
Kra
nz
7246
GE
R50
.816
78.
7667
669.
8174
15C
S28
838
110
7.25
472
46C
S28
838
Wu
-0K
ran
z74
15G
ER
49.7
878
9.93
6167
0.24
7246
CS
2848
91
107.
254
7250
CS
2849
1M
e-0
Kra
nz
7250
GE
R51
.918
310
.113
870
77.1
284
03W
a-1
174
4.6
472
50C
S28
804
Wa-
1K
ran
z73
94P
OL
52.3
2172
52.1
372
50C
S28
491
174
4.6
472
50W
a-1
Wa-
1K
ran
z84
03P
OL
52.3
2172
52.1
372
50C
S28
491
174
4.6
474
29C
S28
455
Li-3
:3K
ran
z72
25G
ER
50.3
833
8.06
6668
28.9
274
29C
S28
496
141
0.94
474
29C
S28
496
Mr-
0K
ran
z74
29IT
A44
.15
9.65
9739
.57
7225
CS
2845
51
410.
944
7429
CS
2849
7M
r-0
Kra
nz
7522
ITA
44.1
59.
6513
25.3
183
38M
r-0
10
482
65B
lh-1
Blh
-1K
ran
z82
65C
ZE
4819
1667
.59
7117
CS
2822
80.
993
715.
664
8265
CS
2822
8E
l-0
Kra
nz
7117
GE
R51
.510
59.
6825
397
0.25
8265
Blh
-10.
993
715.
664
8280
CS
2820
6D
i-M
Viz
ir70
95FR
A47
524
05.6
482
80C
t-1
197
7.91
482
80C
S28
233
En
-1K
ran
z71
18G
ER
508.
520
29.4
482
80C
t-1
182
7.18
Mis-identified A. thaliana accessions 561
ª 2011 The AuthorsThe Plant Journal ª 2011 Blackwell Publishing Ltd, The Plant Journal, (2011), 67, 554–566
Tab
le1
(Co
nti
nu
ed)
Cat
ego
ryH
aplo
gro
up
Acc
essi
on
nam
eN
ativ
en
ame
Co
llect
or
Acc
essi
on
nu
mb
erC
ou
ntr
yLa
titu
de
Lon
git
ud
e
Max
imu
md
ista
nce
tore
lati
ve(k
m)
Mo
stre
late
dac
cess
ion
nu
mb
er
Mo
stre
late
dac
cess
ion
nam
eS
imila
rity
Geo
gra
ph
icd
ista
nce
(km
)
482
80C
S28
230
En
-DK
ran
z71
20G
ER
508.
520
29.4
482
80C
t-1
182
7.18
482
80E
n-1
En
-1K
ran
z82
90G
ER
508.
520
29.4
482
80C
t-1
182
7.18
482
80C
S28
196
Ct-
1K
ran
z69
10IT
A37
.315
2166
.15
8280
Ct-
11
04
8280
CS
2819
5C
t-1
Kra
nz
7067
ITA
37.3
1521
12.2
482
80C
t-1
10
482
80C
t-1
Ct-
1K
ran
z82
80IT
A37
.315
2166
.15
7118
CS
2823
31
827.
184
8341
CS
2848
8M
a-0
Kra
nz
7245
GE
R50
.816
78.
7667
1561
.25
8341
Mt-
01
1501
.97
483
41C
S28
502
Mt-
0K
ran
z72
49LI
B32
.334
22.4
626
51.9
8341
Mt-
01
04
8341
Mt-
0M
t-0
Kra
nz
8341
LIB
32.3
422
.46
1501
.97
7245
CS
2848
81
1501
.97
483
59U
KID
1U
KID
1H
olu
b57
08U
K56
.8)3
.959
00.2
983
59P
na-
171
5816
.74
483
59U
KID
23U
KID
23H
olu
b57
30U
K55
.3)1
.860
76.3
583
59P
na-
171
5993
.87
483
62C
S28
654
Pu
2-7
Cet
l69
56C
ZE
49.4
216
.36
1080
.56
8362
Pu
2-7
10
483
62P
u2-
7P
u2-
7C
etl
8362
CZ
E49
.42
16.3
611
46.1
557
27U
KID
201
1096
.69
483
62U
KID
20U
KID
20H
olu
b57
27U
K51
.31.
111
19.2
983
62P
u2-
71
1096
.69
483
82C
S28
744
Sp
r1-2
No
rdb
org
6964
SW
E56
.316
6283
.52
8382
Sp
r1-2
10
483
82S
pr1
-2S
pr1
-2N
ord
bo
rg83
82S
WE
56.3
1662
83.5
269
64C
S28
744
10
483
82U
KID
85U
KID
85H
olu
b57
90U
K51
.30.
454
78.7
883
82S
pr1
-21
1073
.59
483
86S
r:5
Sr:
5C
etl
8386
SW
E58
.911
.282
7.66
8423
Ho
v2-1
127
0.62
483
86H
ov2
-1H
ov2
-1N
ord
bo
rg84
23S
WE
56.1
13.7
467
86.6
483
86S
r:5
127
0.62
483
89C
S28
753
Ta-
0K
ran
z73
49C
ZE
49.5
14.5
8508
.283
89T
a-0
10
483
89T
a-0
Ta-
0K
ran
z83
89C
ZE
49.5
14.5
8508
.273
49C
S28
753
10
483
89C
S28
754
Tac
-0M
itch
ell-
Old
s73
50U
SA
47.2
413
)122
.459
8773
.53
8389
Ta-
01
8508
.24
8394
CS
2878
3T
u-0
Kra
nz
7375
ITA
457.
598
02.9
669
72C
S28
782
198
02.9
64
8394
Tu
-0T
u-0
Kra
nz
8395
ITA
457.
597
79.2
973
75C
S28
783
10
483
94C
S28
782
Tsu
-1H
olu
b69
72JP
N34
.43
136.
3198
02.9
673
75C
S28
783
198
02.9
64
8394
CS
2878
1T
su-1
Ho
lub
7374
JPN
34.4
313
6.31
9802
.96
7375
CS
2878
31
9802
.96
483
94T
su-1
Tsu
-1H
olu
b83
94JP
N34
.43
136.
3197
79.2
973
75C
S28
783
0.99
397
79.2
94
8399
Uo
d-7
Uo
d-7
Ko
ch83
99A
UT
48.3
14.4
563
8.44
6425
Zd
rI1-
241
149.
074
8399
Zd
rI1-
24Z
drI
1-24
Rel
ich
ova
6425
CZ
E49
.385
316
.254
454
6.39
8399
Uo
d-7
114
9.07
492
25C
S75
565
Pir
in-1
7B
eck
9224
BU
L41
.595
623
.548
314
46.5
692
25C
S75
566
126
1.5
492
25C
S75
566
Del
-1B
eck
9225
SR
B44
.944
421
.182
811
76.4
992
24C
S75
565
126
1.5
492
60C
S75
620
An
d-2
Bec
k92
79B
EL
50.8
54.
2833
380
7.32
9278
CS
7561
91
04
9260
CS
7562
1A
nd
-3B
eck
9280
BE
L50
.85
4.28
333
807.
3292
79C
S75
620
10
492
60C
S75
622
An
d-4
Bec
k92
81B
EL
50.8
54.
2833
323
27.5
392
60C
S75
601
155
5.85
492
60C
S75
619
An
d-1
Bec
k92
78B
EL
50.8
54.
2833
380
7.32
9279
CS
7562
01
04
9260
CS
7560
1A
bil-
1B
eck
9260
DE
N56
.675
29.
5891
8584
.97
9280
CS
7562
11
555.
85
ARM = Armenia, AUT = Austria, AZE = Azerbaijan, BEL = Belgium, BUL = Bulgaria, CAN = Canada, CPV = Cape Verde Islands, CRO = Croatia, CZE = Czech Republic, DEN = Denmark, ESP = Spain, EST = Estonia, FIN = Finland, FRA = France, GEO = Georgia, GER = Germany, IND = India, IRL = Ireland, ITA = Italy, JPN = Japan, KAZ = Kazakhstan, KGZ = Kyrgyzstan, LIB = Libya, LTU = Lithuania, MAR = Morocco, NED = Netherlands, NOR = Norway, NZL = New Zealand, POL = Poland, POR = Portugal, ROU = Romania, RUS = Russia, SRB = Serbia, SUI = Switzerland, SWE = Sweden, TJK = Tajikistan, TUR = Turkey, UK = United Kingdom, UKR = Ukraine, USA = United States of America, UZB = Uzbekistan
corroborated by its neighbors and was thus placed on the
red list. The fourth category comprised 107 members of 39
over-dispersed haplogroups for which it was impossible to
determine which individuals were out of place. All members
in these haplogroups were placed on the red list. Finally,
outside verification was used in several cases where knowledge
of the literature exonerated accessions from the red
list.
DISCUSSION
Patterns of isolation by distance are slow to emerge, and are
generally the result of many generations of low but steady
migration, genetic drift, and local recombination
(Lewandowska-Sabat et al., 2010; Platt et al., 2010). Such patterns
are unlikely to occur accidentally as a result of incorrectly
associating a natural accession with an arbitrary sampling
location. The vast majority of the accessions we examined fit
this pattern of isolation by distance, and thus are almost
certainly properly identified.
Of the remainder, our analysis focused primarily on
haplogroups that were geographically over-dispersed,
indicating potential contamination. In regions such as
continental Europe where A. thaliana has been established
for thousands of years, local haplotype diversity is high,
long-distance migration is rare, much of the available
habitat is already colonized, and outcrossing happens
relatively frequently (Bomblies et al., 2010; Platt et al.,
2010). Under these conditions, sampling two genetically identical
plants more than 10 km apart as a result of natural processes is
extremely unlikely. First, a long-distance migration event would have to have taken place.
The source population would have to be included in the
sample, and the individual genotype that migrated would
have to be sampled within it. The recipient population
would also have to be included in the sample, and even
though the newly entered genotype would be at very low
frequency in the new population, it too would have to be
represented in the collected sample. If sufficient time had
passed for the migrant haplotype to have increased to
appreciable frequency in its new region, it would also
have had the opportunity to backcross with the established
population, reducing the probability that it would
be identified as an outlier. A widespread haplotype that is
unrelated to the other individuals sampled in one or more
of its purported sites (particularly when those individuals
tend to be closely related to each other) is thus almost
certainly not a naturally occurring haplotype at those
sites.
A contaminant, on the other hand, need only have been
sampled once from the wild, and may spread easily from pot
to pot, vial to vial, or spreadsheet row to spreadsheet row.
It is surely no accident that many of the widely spread
haplogroups involve the most commonly used laboratory
strains. While it could be imagined that this is because
scientists have spread them worldwide, this over-estimates
our importance as dispersal agents. For example, according
to our results, Col-0 (an accession originally from Central
Europe) is found in ‘natural’ samples from the very north of
Sweden to the south of Spain. This seems unlikely compared
to the alternative explanation, which involves contamination
in Arabidopsis research facilities, practically all
of which are known to contain vast quantities of Col-0.
Similarly, accessions identical to Ws-2 (Wassilewskija, in
Russia) are found in Germany, Sweden and the UK; accessions
identical to Ler-0 are found in the Czech Republic,
France, Sweden and the UK. In all of these cases, many
different collectors were involved (including some of the
authors), suggesting that several laboratory collections are
probably contaminated. Errors may also have arisen at
the level of a single collector. We found several instances
where distant accessions sampled by the same collector
apparently shared a haplotype. The case for better growth
practices and record-keeping is clear.
Putative mislabeling events can be found when examining
genetic information for similarly named but geographically
distant accessions. For example, the nominally ‘Japanese’
accession Tsu-0 appears to be identical to the nominally
‘Italian’ accession Tu-0, and the ‘American’ accession Tac-0 is
identical to the ‘Czech’ accession Ta-0. Such mix-ups have
been noted previously: C24 was historically referred to as
Columbia, but Torjek et al. (2003) suggested that it was
clearly a different accession both phenotypically and genetically.
The case for barcode-based labeling when collecting,
sowing seed and transplanting is also clear.
Figure 1. Proportion of suspicious accessions (a) from each collection and (b) from large and small local populations.
As expected if discrepancies occurred due to handling,
older collections are often affected. For example, in the Kranz
collection, the ‘German’ accession Li-1 appears to be identical
to the ‘Spanish’ accession Bla-4 and the ‘Russian’
accession Rsch-4. Which of these origins (if any) is correct is
very difficult to ascertain. Older collections also suffer from
the well-known problem that ‘ecotypes’ were originally
propagated in bulk, and then gradually developed into
individual lines. Thus, for example, our collection contains
ten different samples from ‘Ws’ (Wassilewskija, from Russia),
with ten different stock center IDs, but only six abbreviated
names (Ws, Ws-0, Ws-1, Ws-2, Ws-3 and Ws-4). Based on our
marker data, these appear to be two distinct genotypes: the
three accessions called Ws-0 were different from all the rest, a
finding that has also been described by Aukerman et al.
(1997). This result is not surprising given that the latter four
accessions (Ws-1–Ws-4) were all derived very recently from a
single line in the B. Griffing collection. This line has itself
been separated from the Kranz set for decades, traversing
from Laibach to Langridge and Griffing in Australia, and then
to the Griffing laboratory at Ohio State University, while Ws-0
came directly from the Kranz collection.
We recommend that researchers do not use accessions
from our red list when geographic data are relevant, because
it appears unlikely that they are the same accessions that
were originally collected at these locations. Although we
have no specific reason to question the integrity of the
accessions on the yellow list, their location and genetic
information are not as well corroborated as those of accessions on the green
list. The relative sizes of the green and red lists suggest that
over 93% of accessions on the yellow list are likely to be
accurate. However, whether or not these accessions should
be included in analyses will need to be decided on a
case-by-case basis. The quality of the data accompanying the
accessions on the green list is generally excellent. However,
it is possible that the green list contains accessions that have
inadvertently been outcrossed during propagation. It is not
possible to distinguish such an accession from a properly
placed one given the level of genetic resolution in our dataset.
We must hope that such events are rare. In general, it is much
easier to definitively corroborate or question the identity of
accessions sampled in the general vicinity of other accessions
(Figure 1b). It is difficult to comment with any degree of
certainty on accessions sampled singly from remote regions.
Individual accessions that lack neighbors with which to be
compared are therefore most likely to end up on the yellow
list. Those on the red list should be treated with caution until
corroborating samples are collected. The practice of collecting
larger population samples provides researchers with
greater power to verify the integrity of a collection.
In our own collections, we found a 3% error rate in the
assigned origin of accessions (Figure 1a), demonstrating
that it is quite easy to introduce error over even a small
number of generations in the laboratory. The error rate in the
stock center collection was threefold higher (14%), although
these accessions have been propagated for several decades.
Given the increasing number of submissions to the stock
center and the increasing number of laboratories utilizing
these accessions, it is no surprise that mix-ups occur. The
early collections were even maintained through extraordinarily
difficult conditions during the Second World War
(Maarten Koornneef, Max Planck Institute for Plant Breeding
Research (Cologne), Department of Plant Breeding and
Genetics, personal communication; Reinholz, 1965; Redei,
1992). We strongly urge researchers to be diligent in their
greenhouse and labeling practices for both laboratory-specific
and community-wide collections. Fast and cheap
fingerprinting of accessions, if widely used, could ensure
that mis-identification errors of this kind are unlikely to occur
in the future. For instance, primer information for assays
of the 149 SNPs on which these samples were genotyped is
publicly available through the Bergelson laboratory website
at the University of Chicago (http://bergelson.uchicago.edu/a.thaliana-resources).
In the near future, sequencing costs
will be low enough that even small laboratories will be able
to cheaply confirm the accuracy of geographic information.
There are several ways that Arabidopsis stock centers
could ensure the integrity of propagated lineages: genetic
fingerprinting of current stocks and submitted accessions,
provision of accurate GPS coordinates, and allocation of a
unique barcode to accessions at the time of collection or
initial propagation, to avoid use of the same name abbreviations
for distant populations. Finally, in the interest of
maintaining a robust and informative collection of ecotypes
for use in future geographical and ecological studies, it would
be helpful if location attributes were collected along with
samples in the field. Clear photographs of an individual plant
and its surrounding habitat could be included, together with
altitude, land use, soil type and plant community data as
appropriate. These additional data and safeguards will
ensure that the hard work that goes into collecting and
maintaining stocks for use in the community results in a
trustworthy, informative and increasingly valuable resource.
EXPERIMENTAL PROCEDURES
Dataset
We used the dataset from Platt et al. (2010), described in detail at http://arabidopsis.usc.edu/. It contains 4942 accessions collected from natural populations (not all of which are available from the stock centers) and 1023 accessions from the Arabidopsis Biological Resource Center. Each accession obtained from the Arabidopsis Biological Resource Center was sampled as a leaf from a single reference plant, so that the distributed seed matches the genotype in this study. Notably, many of the accessions were donated decades ago and have been perpetuated via single seed descent in growth chambers (or greenhouses, in the case of common laboratory strains) at stock center facilities. The collection spans 42 countries and four continents. These accessions were genotyped at 149 SNP markers using the Sequenom MassARRAY technology (Sequenom Inc., http://www.sequenom.com). As in Platt et al. (2010), we excluded individuals that lacked a large number of genotype calls (more than 50 of 149), as this indicates poor quality of the genomic DNA. We also excluded information from 10 SNPs due to poorly performing genotype assays. However, we did retain accessions that were deemed geographically or genetically suspicious by Platt et al. (2010), leaving a dataset of 5965 accessions.
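The missing-data filter described above can be sketched in a few lines of Python. This is an illustration only: the function names (`passes_qc`, `filter_accessions`) and the use of `None` for a missing genotype call are assumptions made here, not part of the original pipeline. Only the thresholds (149 markers, more than 50 missing calls) come from the text.

```python
from typing import Dict, List, Optional

N_MARKERS = 149    # SNPs genotyped per accession (from the text)
MAX_MISSING = 50   # accessions missing MORE than this many calls are dropped

def passes_qc(calls: List[Optional[str]], max_missing: int = MAX_MISSING) -> bool:
    """Return True if the accession has at most `max_missing` missing calls.

    A missing call is represented as None (an assumption of this sketch).
    """
    missing = sum(1 for call in calls if call is None)
    return missing <= max_missing

def filter_accessions(
    genotypes: Dict[str, List[Optional[str]]]
) -> Dict[str, List[Optional[str]]]:
    """Keep only accessions whose genotype vector passes the missing-data filter."""
    return {name: calls for name, calls in genotypes.items() if passes_qc(calls)}

# Toy data with hypothetical accession names.
toy = {
    "complete": ["A"] * N_MARKERS,
    "borderline": [None] * 50 + ["A"] * 99,   # exactly 50 missing: retained
    "poor_dna": [None] * 51 + ["A"] * 98,     # 51 missing: excluded
}
kept = filter_accessions(toy)
```

Note that the filter is applied per accession; exclusion of the 10 poorly performing SNPs is a separate per-marker step that this sketch does not cover.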
How haplogroups were created
Accessions were grouped by haplotype by Platt et al. (2010), and we used the same groups. Briefly, haplogroups were created using a modified quality threshold (QT) clustering algorithm (Heyer et al., 1999). Haplogroups are maximal collections of accessions for which the observed full SNP genotype of each accession in the haplogroup differs from a common haplotype by no more than the variation attributable to genotyping error. The actual number of discrepancies allowed between an observed genotype and the presumed common haplotype is a function of the size of the haplogroup. For all values of N, the accession with the Nth most discrepant genotype is expected to have fewer discrepancies than the Nth most discrepant genotype in 95% of hypothetical samples of identical individuals (where genotyping errors occur at 0.5% of markers and are independently distributed). In rare cases involving highly heterozygous accessions that are likely to be recent crosses between other sampled accessions, haplogroups may be more diverse than desired. No accessions were included on the red list solely because of this effect, although some accessions on the red list were affected by high heterozygosity. Heterozygotes are indicated using standard nomenclature in the ‘genetic fingerprint’ column of Table S1.
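One way to read the size-dependent discrepancy allowance is as a 95% quantile of an order statistic. The sketch below is an interpretation, not the modified QT clustering of Platt et al. (2010). It assumes that each accession's discrepancy count is an independent binomial draw, that 139 markers remain after the 10 poorly performing SNPs are dropped, and that the allowance for the Nth most discrepant genotype is the smallest count t such that the Nth largest of the group's counts is at most t with probability 0.95. The function names are hypothetical.

```python
from math import comb

def binom_sf(t: int, n: int, p: float) -> float:
    """P(X > t) for X ~ Binomial(n, p)."""
    cdf = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(t + 1))
    return max(0.0, 1.0 - cdf)

def allowed_discrepancies(
    rank: int,
    group_size: int,
    n_markers: int = 139,       # assumption: 149 genotyped minus 10 excluded SNPs
    error_rate: float = 0.005,  # 0.5% genotyping error per marker (from the text)
    quantile: float = 0.95,
) -> int:
    """Smallest t with P(rank-th most discrepant count <= t) >= quantile."""
    for t in range(n_markers + 1):
        p_exceed = binom_sf(t, n_markers, error_rate)
        # The rank-th largest count is <= t exactly when fewer than `rank`
        # of the group's genotypes have more than t discrepancies.
        prob = sum(
            comb(group_size, k) * p_exceed**k * (1 - p_exceed) ** (group_size - k)
            for k in range(rank)
        )
        if prob >= quantile:
            return t
    return n_markers

# Allowance for the most discrepant member of a 50-member group,
# and for the second most discrepant member of the same group.
top = allowed_discrepancies(rank=1, group_size=50)
second = allowed_discrepancies(rank=2, group_size=50)
```

Under these assumptions the allowance grows with haplogroup size and shrinks with rank, which matches the qualitative behaviour described in the text.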
Algorithm
Accessions were divided into two major groups based on their similarity to geographic and genetic neighbors. Broadly, those accessions whose identity was corroborated by nearby individuals and without geographically distant genetic relatives were included on the green list. The vast majority of accessions fell into this category. Only individuals that were genetically similar to geographically distant accessions and whose identity was not corroborated by geographically close neighbors were included on the red or yellow lists. A number of further classifications were used to categorize the types of accessions on the red list.
Accessions were initially considered for inclusion on the green list when at least ten accessions were sampled within 10 km and either a corroborating accession existed or the accession shared alleles at more than 70% of its markers with all other accessions found within 10 km. A corroborating accession is either another accession in the same haplogroup found within 10 km or an accession, found within 100 km, that shares alleles at more than 70% of markers. Accessions from North America or the UK were included in the green list unless an accession sharing alleles at more than 70% of markers was found on another continent or in another country, respectively. Accessions from other regions were included when no other members of their haplogroup were found more than 10 km away and no accessions sharing alleles at more than 70% of markers were found more than 100 km away.
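The corroboration rule rests on two pairwise quantities: geographic distance and the fraction of markers at which two accessions share alleles. A minimal Python sketch follows. The `Accession` record, the haversine formula with a 6371 km Earth radius, the treatment of distances of exactly 10 or 100 km as "within", and the decision to compare only markers called in both accessions are all assumptions of this sketch; the source does not say how distance or allele sharing was computed.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import Iterable, Optional, Sequence

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch

@dataclass
class Accession:
    name: str
    lat: float
    lon: float
    haplogroup: int
    calls: Sequence[Optional[str]]  # one call per marker, None if missing

def distance_km(a: Accession, b: Accession) -> float:
    """Great-circle distance between two sampling sites (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def allele_sharing(a: Accession, b: Accession) -> float:
    """Fraction of markers called in both accessions at which the calls match."""
    pairs = [(x, y) for x, y in zip(a.calls, b.calls) if x is not None and y is not None]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)

def is_corroborated(focal: Accession, others: Iterable[Accession]) -> bool:
    """True if some other accession corroborates `focal` under the stated rule:
    same haplogroup within 10 km, or >70% allele sharing within 100 km."""
    for other in others:
        if other is focal:
            continue
        d = distance_km(focal, other)
        if other.haplogroup == focal.haplogroup and d <= 10:
            return True
        if d <= 100 and allele_sharing(focal, other) > 0.70:
            return True
    return False

# Hypothetical accessions: two close neighbours and one distant, unrelated sample.
a = Accession("site1-a", 55.70, 13.20, 1, ["A"] * 10)
b = Accession("site1-b", 55.72, 13.22, 1, ["A"] * 10)
c = Accession("far-c", 41.00, 2.00, 2, ["T"] * 10)
```

This covers only the corroboration test; the full green-list decision also depends on local sample density and on the regional rules for North America and the UK.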
Accessions that were not from North America or the UK were initially considered for the red list if another accession in the same haplogroup was found more than 10 km away or an accession sharing alleles at more than 70% of markers was found more than 100 km away. For accessions from North America or the UK to be considered for the red list, we required an accession sharing alleles at more than 70% of markers to be found on another continent or in another country, respectively. An accession was included in the red list unless it was part of a haplogroup of at least four accessions where another accession within 10 km shared alleles at more than 70% of markers, and more than half of the members of the haplogroup also had accessions within 10 km sharing alleles at more than 70% of markers. This avoided inclusion on the red list of an entire large, well-supported haplogroup on the basis of a small number of dubious members.
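The red-list rule and its exemption for well-supported haplogroups can be sketched as follows. This is a simplified illustration: it covers only accessions outside North America and the UK, takes pairwise comparisons as precomputed input, and treats a distance of exactly 10 km as "within 10 km". The `Pair` record and the function names are hypothetical.

```python
from typing import Dict, List, NamedTuple

class Pair(NamedTuple):
    """One comparison between a focal accession and another accession."""
    distance_km: float
    similarity: float      # fraction of markers with shared alleles, 0..1
    same_haplogroup: bool

def is_red_candidate(pairs: List[Pair]) -> bool:
    """Distant haplogroup member (>10 km) or distant close relative (>100 km)."""
    return any(
        (p.same_haplogroup and p.distance_km > 10)
        or (p.similarity > 0.70 and p.distance_km > 100)
        for p in pairs
    )

def has_local_support(pairs: List[Pair]) -> bool:
    """Some accession within 10 km shares alleles at more than 70% of markers."""
    return any(p.distance_km <= 10 and p.similarity > 0.70 for p in pairs)

def red_list(haplogroup: Dict[str, List[Pair]]) -> List[str]:
    """Members of a single haplogroup that end up on the red list."""
    supported = sum(has_local_support(pairs) for pairs in haplogroup.values())
    # Exemption applies only in haplogroups of at least four accessions in
    # which more than half of the members are locally supported.
    group_ok = len(haplogroup) >= 4 and supported > len(haplogroup) / 2
    return [
        name
        for name, pairs in haplogroup.items()
        if is_red_candidate(pairs) and not (group_ok and has_local_support(pairs))
    ]

# Hypothetical haplogroup: three members at one site, one member 500 km away.
near, far = Pair(1.0, 1.0, True), Pair(500.0, 1.0, True)
group = {
    "a": [near, near, far],
    "b": [near, near, far],
    "c": [near, near, far],
    "d": [far, far, far],
}
flagged = red_list(group)
```

In this toy haplogroup all four members are initially candidates, because each has a haplogroup member more than 10 km away, but only the isolated member lacks local support and is flagged.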
Once accessions had been relegated to the red list (because they were not similar to neighbors and were similar to one or more accessions from a great distance), we further sub-divided the list into four categories. In category 1, the flagged accession was the sole member of the haplogroup (long-distance migrants are expected to be found here). In category 2, all accessions in a haplogroup were flagged because the group contained a commonly used laboratory strain that was a likely contaminant. In category 3, one or more accessions in a large haplogroup were flagged for not looking like the others. Finally, category 4 comprised all accessions in a haplogroup when it was not clear which (if any) had accurate geographic information.
Accessions that did not qualify for the green or red lists were placed on the yellow list.
ACKNOWLEDGEMENTS
We would like to thank Luz Rivero (Ohio State University, ABRC) for comments and discussion, and two anonymous reviewers for improvements to the manuscript. This work was primarily supported by US National Science Foundation grant DEB-0519961 to J.B. and M.N.
SUPPORTING INFORMATION
Additional Supporting Information may be found in the online version of this article:
Table S1. Complete dataset of 5965 accessions used in our analysis.
Please note: As a service to our authors and readers, this journal provides supporting information supplied by the authors. Such materials are peer-reviewed and may be re-organized for online delivery, but are not copy-edited or typeset. Technical support issues arising from supporting information (other than missing files) should be addressed to the authors.
REFERENCES
Alonso-Blanco, C. and Koornneef, M. (2000) Naturally occurring variation in
Arabidopsis: an underexploited resource for plant genetics. Trends Plant
Sci. 5, 1360–1385.
Aranzana, M.J., Kim, S., Zhao, K.Y. et al. (2005) Genome-wide association
mapping in Arabidopsis identifies previously known flowering time and
pathogen resistance genes. PLoS Genet. 1, 531–539.
Aukerman, M.J., Hirschfeld, M., Wester, L., Weaver, M., Clack, T., Amasino,
R.M. and Sharrock, R.A. (1997) A deletion in the PHYD gene of the
Arabidopsis Wassilewskija ecotype defines a role for phytochrome D in
red/far-red light sensing. Plant Cell, 9, 1317–1326.
Bakker, E.G., Toomajian, C., Kreitman, M. and Bergelson, J. (2006) A
genome-wide survey of R gene polymorphisms in Arabidopsis. Plant Cell, 18,
1803–1818.
Banta, J.A., Dole, J., Cruzan, M.B. and Pigliucci, M. (2007) Evidence of local
adaptation to coarse-grained environmental variation in Arabidopsis
thaliana. Evolution, 61, 2419–2432.
Beck, J.B., Schmuths, H. and Schaal, B.A. (2008) Native range genetic
variation in Arabidopsis thaliana is strongly geographically structured and
reflects Pleistocene glacial dynamics. Mol. Ecol. 17, 902–915.
Bergelson, J., Stahl, E., Dudek, S. and Kreitman, M. (1998) Genetic variation
within and among populations of Arabidopsis thaliana. Genetics, 148,
1311–1323.
Bergelson, J., Kreitman, M., Stahl, E. and Tian, D. (2001) Evolutionary
dynamics of plant R-genes. Science, 292, 2281–2284.
Bomblies, K., Yant, L., Laitinen, R.A., Kim, S.-T., Hollister, J.D., Warthmann,
N., Fitz, J. and Weigel, D. (2010) Local-scale patterns of genetic variability,
outcrossing, and spatial structure in natural stands of Arabidopsis thaliana.
PLoS Genet. 6, e1000890.
Bouchabke, O., Chang, F., Simon, M., Voisin, R., Pelletier, G. and Durand-
Tardif, M. (2008) Natural variation in Arabidopsis thaliana as a tool for
highlighting differential drought responses. PLoS ONE, 3, e1705.
Cetl, I. (1965) Racial differences in the number of days to appearance of the
flower primordia, in the number of rosette leaves, and in the number of
rosette leaves per day in Arabidopsis thaliana Heynh. Arabidopsis
Information Service, http://www.arabidopsis.org/ais/1965/cetl—1965-aagmi.html.
Cetl, I., Dobrovolna, J. and Effmertova, E. (1965) Distribution of spring and
winter types in the local populations of Arabidopsis thaliana (L.) Heynh.
from various localities in Western Moravia. Arabidopsis Information
Service, http://www.arabidopsis.org/ais/1965/cetl—1965-aagmh.html.
Effmertova, E. (1967) The behaviour of ‘summer annual’, ‘mixed’, and ‘winter
annual’ natural populations as compared with early and late races in field
conditions. Arabidopsis Information Service,
http://www.arabidopsis.org/ais/1967/effme-1967-aagph.html.
Effmertova, E. and Cetl, I. (1966) The vernalization requirement of
‘winter-annual’ populations from Western Moravia. Arabidopsis Information
Service, http://arabidopsis.org/ais/1966/effme-1966-aagnw.html.
Ehrenreich, I.M. and Purugganan, M.D. (2006) The molecular genetic basis of
plant adaptation. Am. J. Bot. 93, 953–962.
Goss, E.M. and Bergelson, J. (2006) Variation in resistance and virulence in the
interaction between Arabidopsis thaliana and a bacterial pathogen.
Evolution, 60, 1562–1573.
Hauser, M.-T., Harr, B. and Schlotterer, C. (2001) Trichome distribution
in Arabidopsis thaliana and its close relative Arabidopsis lyrata:
molecular analysis of the candidate gene GLABROUS1. Mol. Biol. Evol. 18,
1754–1763.
Heyer, L.J., Kruglyak, S. and Yooseph, S. (1999) Exploring expression data:
identification and analysis of coexpressed genes. Genome Res. 9, 1106–
1115.
Johanson, U., West, J., Lister, C., Michaels, S., Amasino, R. and Dean, C.
(2000) Molecular analysis of FRIGIDA, a major determinant of natural
variation in Arabidopsis flowering time. Science, 290, 344–347.
Koornneef, M. and Meinke, D. (2010) The development of Arabidopsis as a
model plant. Plant J. 61, 909–921.
Laibach, F. (1943) Arabidopsis thaliana (L.) Heynh. als Objekt für
genetische und entwicklungsphysiologische Untersuchungen. Bot. Arch. 44,
439–455.
Lewandowska-Sabat, A.M., Fjellheim, S. and Rognli, O.A. (2010) Extremely
low genetic variability and highly structured local populations of
Arabidopsis thaliana at higher latitudes. Mol. Ecol. 19, 4753–4764.
Mauricio, R. and Rausher, M.D. (1997) Experimental manipulation of putative
selective agents provides evidence for the role of natural enemies in the
evolution of plant defense. Evolution, 51, 1435–1444.
Mauricio, R., Stahl, E.A., Korves, T., Tian, D., Kreitman, M. and Bergelson, J.
(2003) Natural selection for polymorphism in the disease resistance gene
Rps2 of Arabidopsis thaliana. Genetics, 163, 735–746.
McKay, J.K., Richards, J.H., Nemali, K.S., Sen, S., Mitchell-Olds, T., Boles, S.,
Stahl, E.A., Wayne, T. and Juenger, T.E. (2008) Genetics of drought
adaptation in Arabidopsis thaliana. II. QTL analysis of a new mapping
population, KAS-1 × TSU-1. Evolution, 62, 3014–3026.
Meyerowitz, E.M. (2001) Prehistory and history of Arabidopsis research. Plant
Physiol. 125, 15–19.
Mitchell-Olds, T. and Schmitt, J. (2006) Genetic mechanisms and evolutionary
significance of natural variation in Arabidopsis. Nature, 441, 947–952.
Nordborg, M., Hu, T.T., Ishino, Y. et al. (2005) The pattern of polymorphism in
Arabidopsis thaliana. PLoS Biol. 3, 1289.
Novembre, J. and Slatkin, M. (2009) Likelihood-based inference in isolation-
by-distance models using the spatial distribution of low-frequency alleles.
Evolution, 63, 2914–2925.
Pico, F.X., Mendez-Vigo, B., Martínez-Zapater, J.M. and Alonso-Blanco, C.
(2008) Natural genetic variation of Arabidopsis thaliana is geographically
structured in the Iberian peninsula. Genetics, 180, 1009–1021.
Platt, A., Horton, M., Huang, Y.S. et al. (2010) The scale of population
structure in Arabidopsis thaliana. PLoS Genet. 6, e1000843.
Redei, G.P. (1992) A heuristic glance at the past of Arabidopsis genetics. In
Methods in Arabidopsis Research (Koncz, C., Chua, N.-H. and Schell, J.,
eds). River Edge, New Jersey: World Scientific Press Inc, pp. 1–15.
Reinholz, E. (1965) Arabidopsis thaliana (L.) Heynh. als Objekt für genetische
und entwicklungsphysiologische Untersuchungen. Arabidopsis Information
Service, http://www.arabidopsis.org/ais/1965/reinh-1965-aagld.html.
Robbelen, G. (1965) The LAIBACH standard collection of natural races.
Arabidopsis Information Service,
http://www.arabidopsis.org/ais/1965/roebb-1965-xxxxx.html.
Shindo, C., Bernasconi, G. and Hardtke, C.S. (2007) Natural genetic variation
in Arabidopsis: tools, traits and prospects for evolutionary ecology. Ann.
Bot. 99, 1043–1054.
Somerville, C. and Koornneef, M. (2002) A fortunate choice: the history of
Arabidopsis as a model plant. Nat. Rev. Genet. 3, 883–889.
Stahl, E.A., Dwyer, G., Mauricio, R., Kreitman, M. and Bergelson, J. (1999)
Dynamics of disease resistance polymorphism at the Rpm1 locus of
Arabidopsis. Nature, 400, 667–671.
Tian, D., Araki, H., Stahl, E., Bergelson, J. and Kreitman, M. (2002) Signature
of balancing selection in Arabidopsis. Proc. Natl Acad. Sci. USA, 99, 11525–
11530.
Tonsor, S.J., Alonso-Blanco, C. and Koornneef, M. (2005) Gene function
beyond the single trait: natural variation, gene effects, and evolutionary
ecology in Arabidopsis thaliana. Plant Cell Environ. 28, 2–20.
Torjek, O., Berger, D., Meyer, R.C., Mussig, C., Schmid, K.J., Sorensen, T.R.,
Weisshaar, B., Mitchell-Olds, T. and Altmann, T. (2003) Establishment of a
high-efficiency SNP-based framework marker set for Arabidopsis. Plant J.
36, 122–140.