Review
Medaka genomics: a bridge between mutant phenotype and gene function
Kiyoshi Naruse a,*, Hiroshi Hori b, Nobuyoshi Shimizu c, Yuji Kohara d, Hiroyuki Takeda a,*,1
a Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
b Division of Biological Science, Graduate School of Science, Nagoya University, Nagoya 464-8602, Japan
c Department of Molecular Biology, Keio University School of Medicine, Shinjuku-ku, Tokyo 160-8582, Japan
d Genome Biology Laboratory, Center for Genetic Resource Information, National Institute of Genetics, Mishima 411-8540, Japan
Received 6 January 2004; received in revised form 5 March 2004; accepted 14 May 2004
Abstract
Recent advances in medaka genetics have proven that the medakafish is an excellent model system for developmental and evolutionary
biology studies and that it can complement similar studies in zebrafish. Large-scale mutagenesis projects are now being conducted by several
groups in Japan and are delivering a vastly expanded pool of medaka mutant stocks. This growing availability of genomic resources will
greatly accelerate progress in moving from mutant phenotypes to the elucidation of gene function. This phenotype-driven approach can be
expected to lead to the identification and characterization of novel genes and pathways in vertebrate genomes. This review discusses the
current state of medaka genomic resources, the state of medaka gene mapping and medaka genome sequencing projects.
© 2004 Elsevier Ireland Ltd. All rights reserved.
Keywords: Medaka; Genetic map; Genome sequencing; SNP mapping
1. Introduction
The medaka, Oryzias latipes (order Beloniformes), is a
small, egg-laying freshwater teleost fish found in brooks and
rice paddies in eastern Asia, primarily in Japan (Shima and
Mitani, 2004). There are two principal advantages of the
medaka, as a model for vertebrate genomics, over the more
commonly used zebrafish system. The first is that the medaka
genome is smaller (800–1000 Mb) (Uwa and Iwata, 1981;
Lamatsch et al., 2000), being about half the size of the
zebrafish genome and one-third that of the human genome.
The second is that there are highly polymorphic inbred
medaka strains available that can be used for both
mutagenesis screening and genetic mapping (Wittbrodt
et al., 2002).
Bony fish have undergone significant genome-wide gene
duplication during their evolution and both zebrafish and
medaka have been shown to have seven Hox clusters
(Amores et al., 1998; Wittbrodt et al., 1998; Naruse et al.,
2000; Taylor et al., 2003) whereas only four have been
found in mouse. Generally, duplicated genes are subject to
non-functionalization, neo-functionalization and sub-functionalization
(Force et al., 1999), resulting in greater genetic
diversity within fish species. Considering the long evolu-
tionary distance between them (diverging 110–160 million
years ago, Hedges and Kumar, 2002; Wittbrodt et al., 2002),
one would expect that the medaka and zebrafish species
would have different repertoires of gene sets that would
result in a different spectrum of mutant phenotypes (Loosli
et al., 2000; Ishikawa, 2000).
Ongoing large-scale ENU mutagenesis of the medaka
genome is providing a rapid and massive expansion of
available medaka mutant resources (Furutani-Seiki et al.,
2004). The ultimate goal of large-scale medaka mutagenesis
projects is the identification of novel genes and pathways,
thereby obtaining new insights into gene function in
vertebrates. This can be achieved through the rapid
progression from mutant phenotypes to an understanding
of specific gene functions using medaka genomics. Medaka
genomics is also providing new insights into vertebrate
genome evolution by comparative analyses with the
substantial genomic information that now exists for other
vertebrates such as human, mouse, Fugu and zebrafish. To
accelerate this progress, a medaka whole-genome shotgun
sequencing project began in late 2002 at the National
Institute of Genetics (NIG) in Mishima, Japan. This review
discusses the current state of genetic mapping analyses, the
progress in genome sequencing and the isolation of ESTs in
addition to the other available genomic resources for the
medaka.
Mechanisms of Development 121 (2004) 619–628
doi:10.1016/j.mod.2004.04.014
www.elsevier.com/locate/modo
1 Tel.: +81-3-5841-4431.
* Corresponding author. Tel.: +81-3-5841-4443.
E-mail addresses: [email protected] (K. Naruse), htakeda@biol.s.u-tokyo.ac.jp (H. Takeda).
2. The origins of inbred medaka strains and genetic
differences between wild medaka populations
In the medaka, at least 15 inbred strains have so far been
described (see Table 1) (Hyodo-Taguchi, 1996; Shimada
and Shima, 1988; Loosli et al., 2001) and all except Cab,
AA2 and Kaga were established by Hyodo-Taguchi at the
National Institute of Radiological Sciences (NIRS) (Hyodo-
Taguchi, 1996). Inbred lines were derived from three
different wild populations of medaka, southern Japanese
(HO4C, HO5, HB32C, HB32D, HB12A, HB11A, HB11C,
Hd-rR, Hd-rr, Cab and AA2), northern Japanese (HNI-I,
HNI-II and Kaga) and east-Korean (HSOK) populations.
The origins of some of these strains are well known; for
example, the Hd-rR (a target of the genome sequencing
project, see below) and Hd-rr lines were derived from a
closed colony established by the late Toki-o Yamamoto at
Nagoya University for the purposes of experiments on sex
reversal by oestrogen and androgen (Yamamoto, 1953;
Yamamoto, 1958). The Cab strain established by Witt-
brodt’s group (Winkler et al., 2000) was originally obtained
from a commercial strain (of southern Japanese origin)
available from Carolina Biological Supply (http://www.
carolina.com/). The AA2 strain, which has three recessive
pigmentation phenotypes, was established by Shimada and
Shima (1988). The HNI and Kaga strains, however,
originated from northern medaka populations; the HNI-I
and II strains originated from a wild population in Niigata
City, Niigata prefecture, whereas the Kaga strain was
established from Kaga City, Ishikawa prefecture. From a
genetic standpoint, the three wild medaka populations have
a relatively similar level of genetic divergence from each
other, with the southern and northern populations grouped
together as a sister group (Sakaizumi, 1984; Takehana et al.,
2003). Although inbred strains from either the southern or
northern populations are commonly used, the Korean
HSOK strain should be particularly useful because of its
genetic differences from these two main populations
(Sakaizumi and Joen, 1987).
It is known that the northern and southern Japanese
medaka populations differ from each other in many
morphological, behavioural and genetic characteristics. In
spite of these differences, intercrosses breed normally
producing hybrid offspring. Sequence comparisons of
orthologous loci reveal single nucleotide polymorphisms
(SNPs) between the two populations at a frequency of more
than 1% within exons and 3% in introns, in addition to many
insertions and deletions (Ohtsuka et al., 1999; Naruse et al.,
2000). Given that a 1–2% difference
exists between the human and great ape genomes (Fujiyama
et al., 2002), the SNP frequency between the two medaka
populations seems quite high. These polymorphisms are therefore
extremely useful for the genetic mapping of genes and of mutations,
which can be readily induced in medaka inbred lines. Information
on inbred lines and spontaneous mutants is available at
http://biol1.bio.nagoya-u.ac.jp:8000/.
Table 1
Medaka inbred strains and their features
Strain  Genetic background     Origin                            Special features                        Reference
HO4C    Southern population    Commercial strain in Japan        Orange-red color (b/b)                  Hyodo-Taguchi, 1996
HO5     Southern population    Commercial strain in Japan        Orange-red color (b/b)                  Hyodo-Taguchi, 1996
HB32C   Southern population    Wild population in Chiba          Wild type                               Hyodo-Taguchi, 1996
HB32D   Southern population    Wild population in Chiba          Wild type                               Hyodo-Taguchi, 1996
HB12A   Southern population    Wild population in Chiba          Wild type                               Hyodo-Taguchi, 1996
HB11A   Southern population    Wild population in Chiba          Wild type                               Hyodo-Taguchi, 1996
HB11C   Southern population    Wild population in Chiba          Wild type                               Hyodo-Taguchi, 1996
Hd-rR   Southern population    Stock at Nagoya Univ.             Orange-red in male and white in female  Hyodo-Taguchi, 1996
Hd-rr   Southern population    Stock at Nagoya Univ.             White in both male and female           Hyodo-Taguchi, 1996
Cab     Southern population    Commercial strain from Carolina   Variegated body color                   Loosli et al., 2001
                               Biological Supply
AA2     Southern population    Mutant stocks at Nagoya Univ.     b/b, gu/gu and lf/lf genotype           Shimada and Shima, 1988
HNI-I   Northern population    Wild population in Niigata        Wild type                               Hyodo-Taguchi, 1996
HNI-II  Northern population    Wild population in Niigata        Wild type                               Hyodo-Taguchi, 1996
Kaga    Northern population    Wild population in Kaga           Wild type                               Loosli et al., 2001
HSOK    East-Korea population  Wild population in Sokcho, Korea  Wild type                               Hyodo-Taguchi, 1996
3. Genetic mapping
High density genetic linkage maps are essential tools for
the identification of genes responsible for mutant phenotypes
and also for comparative and evolutionary genomics.
The markers that are commonly used for linkage map
analysis include phenotypic traits, expressed sequence tags
(ESTs), randomly amplified polymorphic DNA markers
(RAPDs), amplified fragment length polymorphism markers
(AFLPs) and microsatellite markers. A medaka linkage map
was first described by Aida (1921), who demonstrated that
the male-determining factor (Y locus) had linkage to the
gene controlling carotenoid deposition in xanthophores
(R locus). More recently, a detailed, genome-wide linkage
map of the medaka was constructed using 633 markers (488
AFLPs, 28 RAPDs, 34 IRSs, 75 ESTs, four STSs and four
phenotypic traits) (Naruse et al., 2000). This map utilized
the high degree of polymorphism between two inbred
strains, HNI and AA2, derived from the northern and
southern Japanese populations, respectively, and was
constructed with a reference typing DNA panel from 39
cell lines derived from back-cross progeny of male meioses.
The number of linkage groups (LGs) was 24, which is
equivalent to the haploid chromosome number of the
medaka.
Most of the markers used for mapping were ‘DNA
fingerprinting’ polymorphisms and were used without other
genomic information, thus allowing a genome-wide genetic
map to be established rapidly. However, many of these
markers were found to be strain-specific and therefore
difficult to apply to other typing panels. For position-based
cloning, mutants have their own genetic background and
thus ‘portable single-locus type’ markers are needed that
represent a unique region of the medaka genome. To
develop such markers, ESTs from medaka cDNA libraries
were isolated from various sources and the resulting
sequence data were used to design PCR primer sets (Naruse
et al., 2004; Kimura et al., 2004). Due to the considerable
genetic divergence between the southern and northern
medaka populations, 80% of the amplified fragments that
appeared as a single band from the two populations showed
fragment length polymorphisms either directly or following
digestion with eight commonly used restriction enzymes.
The current medaka genetic map has accumulated 1762
single-locus type DNA markers including 1722 genes and
ESTs, though the resolution of the map is around 2 cM with
the 39 meiosis panel. Theoretically, using this number of
markers, one can map a mutated locus at the resolution of
500 kb, assuming the genome size of medaka is 800 Mbp.
For position-based cloning of novel mutants, a DNA
marker within 1 cM of a mutant locus is usually obtained if over 100
meioses are analyzed.
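The arithmetic behind these figures can be sketched as follows. This is only a rough illustration, assuming that markers are spread evenly over an 800 Mbp genome and that the smallest non-zero map distance detectable with n informative meioses is one recombinant in n (100/n cM). The function names are hypothetical and do not belong to any medaka resource.

```python
def mean_marker_spacing_kb(genome_size_mbp: float, n_markers: int) -> float:
    """Average physical distance between markers, assuming even spacing."""
    return genome_size_mbp * 1000.0 / n_markers


def smallest_resolvable_cm(n_meioses: int) -> float:
    """One recombinant among n informative meioses corresponds to 100/n cM."""
    return 100.0 / n_meioses


# 1762 single-locus markers on an assumed 800 Mbp genome
print(round(mean_marker_spacing_kb(800, 1762)))   # 454 (kb), i.e. roughly 500 kb

# 39-meiosis reference panel versus a cross scored over 100 meioses
print(round(smallest_resolvable_cm(39), 1))       # 2.6 (cM)
print(smallest_resolvable_cm(100))                # 1.0 (cM)
```

Under these assumptions the marker spacing comes out close to the 500 kb quoted above, and the 39-meiosis panel cannot separate markers that lie closer than a few cM, which is in line with the stated map resolution.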
Although the recombination frequency around the
mutant locus is one of the most important factors that affect
the success of positional cloning, the relationship between
the physical and genetic map lengths in cM has not been
precisely determined yet for medaka. This is because there
are only a few reported cases of successful positional
cloning of mutated medaka genes. One such example is the
sex-determining region of the medaka Y chromosome
(Matsuda et al., 2002) that harbours the transcription factor,
DMY, which contains a highly conserved DM domain
and plays a critical role in testis development (Matsuda et al.,
2002; Nanda et al., 2002). Matsuda et al. (2002) reported
that the map distance between markers 135D12.F and
51H7.F is 0.98 cM, which corresponds to about 500 kb
(510 kb/cM). Another reported example is the medaka B
locus, which was positionally cloned by Fukamachi et al.
(2001) and encodes a novel transporter protein, AIM1 that
affects melanin formation. According to the mapping data,
two STS markers, C27F and C27R, located at either end of a
36.3 kbp cosmid insert, correspond to 0.55 cM (66 kb/cM)
in female meiosis. This region therefore shows extremely
high recombination frequency, as the average is estimated at
470 kb/cM, based on cumulative map lengths in female
meiosis. These findings indicate that if the mutant locus of
interest maps to a region in which many markers are
clustered, or to a region with a low recombination frequency,
position-based cloning may prove difficult.
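The physical-to-genetic ratios quoted above follow from a simple division, sketched here with the numbers given in the text. The helper name is hypothetical, and the implied total map length in the last lines rests on the assumed 800 Mbp genome size rather than on a measured value.

```python
def kb_per_cm(physical_kb: float, genetic_cm: float) -> float:
    """Physical distance covered by one centimorgan in a given interval."""
    return physical_kb / genetic_cm


dmy_region = kb_per_cm(500, 0.98)    # sex-determining region (Matsuda et al., 2002)
b_locus = kb_per_cm(36.3, 0.55)      # B locus cosmid (Fukamachi et al., 2001)
female_average = 470                 # kb/cM, average quoted in the text

print(round(dmy_region))                    # 510
print(round(b_locus))                       # 66
print(round(female_average / b_locus, 1))   # 7.1-fold more recombination at the B locus

# Assumption: an 800 Mbp genome at 470 kb/cM implies a female map of about 1700 cM.
print(round(800_000 / female_average))      # 1702
```

The B locus interval therefore recombines about seven times more often per kilobase than the genome-wide female average, whereas the DMY region lies close to that average.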
Table 2 summarizes statistical information for the
current map of medaka LGs. If the medaka genome is
800 Mbp, the estimated physical length of each LG would
vary from 59 to 19 Mbp; this estimation is based on the
distribution of anonymous DNA markers in each LG. The
cumulative map distances in each LG range from 104 to
26 cM in male meiosis. The largest LG, based on
the estimated physical length, is LG1 (59 Mbp) whereas the
longest LG based on recombination mapping is LG4
(104 cM). As described above, the physical lengths of the
LGs may not simply reflect map distances. The distribution
of mapped markers is not uniform; 16 out of 24 LGs have
large clusters of markers and over 30% of the markers
within each LG mapped to the same positions in a male
meiotic panel. These results suggest that recombination
events in specific regions within these LGs are restricted
during male meiosis. We are now examining whether this
phenomenon can also be observed in female meiosis (see
http://medaka.dsp.jst.go.jp/MGI/LG22/). The distribution
of both anonymous DNA markers and ESTs reveals
differences in gene density for each LG; the gene density
of LG2 is 3.2 times lower and that of LG22 is 1.7 times
higher than the average. The medaka genetic map should
provide reliable anchor points for positional cloning and a
portion of these mapping data is available at
http://mbase.bioweb.ne.jp/~dclust/medaka_top.html.
Table 2
Distribution of mapped markers in each medaka linkage group
Linkage  No. mapped      No. anonymous   Ratio b  Estimated physical  Longest segment
group    ESTs and genes  DNA markers a            length (Mb)         (cM)
1        33              41              0.80     59                  53.3
2        21              40              0.53     58                  43.6
3        32              38              0.84     55                  45.6
4        28              37              0.76     53                  104.4
5        51              29              1.76     42                  70.6
6        26              25              1.04     36                  53.5
7        49              24              2.04     35                  45.6
8        44              24              1.83     35                  80.5
9        34              23              1.48     33                  70.9
10       38              23              1.65     33                  65.6
11       42              22              1.91     32                  59.1
12       37              21              1.76     30                  80.8
13       34              21              1.62     30                  31.8
14       34              21              1.62     30                  57.9
15       29              21              1.38     30                  59.6
16       44              20              2.20     29                  75
17       39              18              2.17     26                  64.1
18       19              18              1.06     26                  61.4
19       34              17              2.00     25                  52.4
20       28              16              1.75     23                  79.5
21       38              14              2.71     20                  76.1
22       41              14              2.93     20                  26.6
23       21              14              1.50     20                  43.6
24       23              13              1.77     19                  52
Total    819             554             1.48     800                 1401.5
a Number of AFLP markers, RAPD markers and other STS markers.
b Ratio of the number of mapped EST and gene markers to the number of anonymous markers.
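The estimated physical lengths in Table 2 appear to be consistent with a simple proportional allocation: each LG receives a share of the assumed 800 Mb genome equal to its share of the 554 anonymous markers. The sketch below reproduces a few table entries under that assumption; the function and constant names are hypothetical, and the proportional rule is inferred from the numbers rather than stated as a formula in the text.

```python
GENOME_MB = 800      # genome size assumed in the text
TOTAL_ANON = 554     # total anonymous DNA markers (Table 2)

# Anonymous-marker counts for a few linkage groups, copied from Table 2.
ANON_MARKERS = {1: 41, 4: 37, 22: 14, 24: 13}


def estimated_length_mb(n_anon: int,
                        total_anon: int = TOTAL_ANON,
                        genome_mb: float = GENOME_MB) -> float:
    """Share of the genome proportional to the share of anonymous markers."""
    return genome_mb * n_anon / total_anon


for lg, n_anon in ANON_MARKERS.items():
    print(f"LG{lg}: {estimated_length_mb(n_anon):.0f} Mb")
# LG1: 59 Mb
# LG4: 53 Mb
# LG22: 20 Mb
# LG24: 19 Mb
```

Because anonymous markers are used as a proxy for physical length, the ratio column of Table 2 (ESTs and genes per anonymous marker) can then be read as a relative gene density for each LG.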
4. Evolutionary analyses
As described above, most single-locus type markers
mapped in medaka are attributable to either genes or ESTs
(Naruse et al., 2004) that have significant homology to
genes of other species. By comparison to the map positions
of these markers, we analysed the degree of synteny
conservation between different species. Fig. 1A and B show
Oxford grids for medaka, zebrafish and human, and indicate
that between medaka and human the distribution of
orthologous gene pairs seems scattered but is obviously
not random. One can easily detect clusters of orthologous
gene pairs in a medaka/human matrix, suggesting that the
medaka and human genomes share many conserved
syntenic segments even after more than four hundred
million years of divergence from a common ancestor (Kumar
and Hedges, 1998). A greater number of orthologous gene
pair clusters are found in the medaka/zebrafish matrix
(Fig. 1B) and if the criterion of conserved synteny is set to at
least five orthologous pairs located on the same LG, the
conserved syntenic segments in LG1/LG1, LG3/LG7,
LG7/LG23, etc. become apparent.
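An Oxford grid of this kind is essentially a two-way count of orthologous gene pairs per cell. The following sketch shows the idea on invented toy data; the gene names and LG assignments are hypothetical and are not taken from Naruse et al. (2004). Only the threshold of five pairs comes from the text.

```python
from collections import Counter


def oxford_grid(ortholog_pairs):
    """Count orthologous gene pairs per (LG in species A, LG in species B) cell."""
    return Counter((lg_a, lg_b) for _gene, lg_a, lg_b in ortholog_pairs)


def conserved_cells(grid, min_pairs=5):
    """Cells that meet the conserved-synteny criterion (at least min_pairs pairs)."""
    return {cell: n for cell, n in grid.items() if n >= min_pairs}


# Hypothetical toy data: (gene, medaka LG, zebrafish LG)
pairs = (
    [(f"geneA{i}", 1, 1) for i in range(6)]
    + [(f"geneB{i}", 3, 7) for i in range(5)]
    + [("geneC", 2, 9), ("geneD", 2, 11)]
)

grid = oxford_grid(pairs)
print(conserved_cells(grid))   # {(1, 1): 6, (3, 7): 5}
```

Cells with only one or two pairs, such as the two medaka LG2 entries in the toy data, fall below the threshold and are not reported as conserved segments.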
For detailed analysis on conserved synteny, mapped
genes are sorted by human chromosome numbers, followed
by the assignment of colours to each human chromosome
(i.e. 23 colours). This procedure may exclude inversion
events within each chromosome that occurred following
the divergence of the medaka, zebrafish and human
lineages, and it reveals an interesting feature of chromo-
some evolution in vertebrates. For example, medaka LG11
and LG16 and zebrafish LG19 and LG16, which harbour
the HoxA cluster, show domain structures similar to human
orthologous gene pairs (Fig. 2). They each contain blocks
that correspond to human chromosome 1 (hsa1), hsa3,
hsa6, hsa7 and hsa8. This suggests that these chromosomes
arose from duplication of a single ancestral chromosome
(Proto-chromosome) and have maintained a paired
relationship. In principle, this can be extended to all LGs
of medaka and zebrafish, although there are three notable
exceptions in terms of the paired relationship. Phylogenetically,
zebrafish and medaka have a relatively high level
of divergence from each other in terms of the ray-finned
fish lineages (Nelson, 1994; Kumar and Hedges, 1998;
Miya et al., 2003). Indeed, medaka and zebrafish are
thought to have diverged directly from a common ancestor of
nearly all euteleosts. The paired-chromosome relationship
must therefore be found in most euteleosts. Our results,
together with the current accumulated evidence, strongly
suggest that whole-genome duplication occurred in a
common ancestor of almost all euteleosts (Postlethwait
et al., 2000; Naruse et al., 2004).
Fig. 1. Oxford grid display of medaka-human (A) and medaka-zebrafish (B). Numbers in the cells depict the number of orthologous gene pairs in each matrix.
818 orthologous genes were analysed in the medaka-human matrix and 255 in the medaka-zebrafish matrix (for details, see Naruse et al., 2004).
5. Rapid mapping of medaka mutations using
the EST-marker set
The first step in the positional cloning of mutated genes is
their approximate assignment to a genomic region. To
facilitate this step, a bulked segregant analysis (BSA) with
selected EST markers (referred to as the M-marker set) has
been employed (Kimura et al., 2004). The latest version of
the M-marker set (M-marker 2003) can be applied to any
combination of HNI/Kaga and Hd-rR/AA2/Cab strains (for
details, see Kimura et al., 2004 and the website http://medaka.lab.nig.ac.jp).
A similar system was established using a mapping
cross of Kaga and Cab strains (Martinez-Morales et al.,
2004). Both systems are equally effective but the M-marker
set seems more universal in that it can be applied to most
common strains including Hd-rR, d-rR, Cab, AA2, HNI and
Kaga. High-resolution mapping then follows, using
a larger number of mapped markers and embryos,
so that the mutation site can finally be narrowed
down to a region that is covered by a small number of BACs.
Fig. 2. An example of a paired-chromosome relationship in medaka and zebrafish LGs. These four LGs have similar colour patterns, showing blocks of hsa1,
hsa3, hsa6, hsa7 and hsa8. Fifteen orthologous gene pairs were mapped to medaka LG11 and zebrafish LG19 and six orthologous gene pairs were mapped to
medaka LG16 and zebrafish LG16, suggesting that medaka LG11/zebrafish LG19 and medaka LG16/zebrafish LG16 are orthologous chromosomes,
respectively. However, two genes, RXRB and TWIST1, mapped to medaka LG16 are located on zebrafish LG19. This suggests a lineage-specific loss of the
duplicated copy. Alternatively, another copy of the duplicated gene remains to be found in one or both of the two species. As a whole, these patterns suggest a
common origin of these four LGs (for details, see Naruse et al., 2004).
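The logic of a bulked segregant screen, as used in this section for the first-pass assignment of a mutation to an LG, can be sketched as follows. This is a deliberately simplified illustration and not the published M-marker protocol: it assumes a recessive mutation induced on a southern background and crossed to a northern mapping strain, and it treats pooled genotypes as clean presence or absence of each parental allele, whereas real pooled PCR shows graded band intensities. All marker names and genotypes are hypothetical.

```python
def linked_markers(pool_genotypes, mutant_strain_allele="S"):
    """Return (marker, LG) pairs where the mutant pool shows only the allele of
    the mutagenised strain while the sibling pool shows both parental alleles."""
    hits = []
    for marker, (lg, mutant_pool, sibling_pool) in pool_genotypes.items():
        if mutant_pool == {mutant_strain_allele} and len(sibling_pool) == 2:
            hits.append((marker, lg))
    return hits


# Hypothetical data: marker -> (linkage group, alleles seen in the mutant pool,
# alleles seen in the wild-type sibling pool); S = southern, N = northern allele.
example = {
    "M01": (1, {"S", "N"}, {"S", "N"}),
    "M07": (7, {"S"}, {"S", "N"}),
    "M12": (12, {"S", "N"}, {"S", "N"}),
}

print(linked_markers(example))   # [('M07', 7)]
```

In this toy case only the marker on LG7 co-segregates with the mutant phenotype, so fine mapping would continue with additional markers from that LG.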
6. BAC libraries
In addition to a detailed linkage map, high quality
bacterial artificial chromosome (BAC) libraries are a
prerequisite for successful positional cloning of mutated
genes. The BAC system has been most commonly used for
large-insert DNAs. High-quality BAC libraries for medaka
should increase the rate of positional cloning of mutant
genes and are essential tools for subsequent genomic
analyses. At the moment, three gridded BAC libraries
have been established from different medaka strains. A
library derived from the Hd-rR southern inbred medaka
strain (HdrR library) was constructed by Matsuda et al.
(2001) and is available upon request (H. Hori, Nagoya
University). The average insert size in this library is 210 kb,
and the library is predicted to give 24-fold coverage of the medaka genome. A
genomic BAC library of the HNI northern inbred strain was
previously constructed by Kondo et al. (2002). The average
insert size of this library was 160 kb and was expected to
cover 20 genome equivalents (refer to Wittbrodt et al.,
2002). A second BAC library of the southern medaka strain,
Cab, has also been established (Wittbrodt et al., 2002 and
http://www.rzpd.de, RZPD library number 756). The
average insert size of this library was 150 kb and it is
available from RZPD. Use of these BAC libraries
enabled cloning of the B locus and sex-determining genes
(Fukamachi et al., 2001; Matsuda et al., 2001). Furthermore,
Kondo et al. (2002) have determined the complete
nucleotide sequences of DMRT genes in HNI BAC to
study the evolution of vertebrate DMRT families.
The HdrR and Cab BAC libraries have been used to construct a
first-generation BAC-based physical map. Zadeh Khorasani
et al. (2004) hybridized 35-mer oligonucleotides to 60,000
BAC clones, which correspond to 14-fold coverage of the
medaka genome, and aligned them into 902 map segments
containing 2721 markers. The BAC physical map will
greatly facilitate the position based cloning of novel mutants
and BAC-based genome sequencing.
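The coverage figures quoted for the BAC libraries relate clone number, mean insert size and genome size in a simple way. The sketch below assumes an 800 Mb genome and uses the standard Clarke-Carbon expression for the chance that a given locus is present in a library; the function names are hypothetical, and the implied clone numbers and insert sizes are back-calculations, not values reported by the cited authors.

```python
import math

GENOME_MB = 800  # assumed genome size


def fold_coverage(n_clones: int, mean_insert_kb: float, genome_mb: float) -> float:
    """Genome equivalents represented by a library."""
    return n_clones * mean_insert_kb / (genome_mb * 1000.0)


def clones_for_coverage(fold: float, mean_insert_kb: float, genome_mb: float) -> int:
    """Number of clones needed to reach a given fold coverage."""
    return math.ceil(fold * genome_mb * 1000.0 / mean_insert_kb)


def prob_locus_covered(n_clones: int, mean_insert_kb: float, genome_mb: float) -> float:
    """Clarke-Carbon estimate: P = 1 - (1 - f)^N with f = insert size / genome size."""
    f = mean_insert_kb / (genome_mb * 1000.0)
    return 1.0 - (1.0 - f) ** n_clones


# Mean insert size implied by 60,000 clones giving 14-fold coverage
print(round(14 * GENOME_MB * 1000 / 60_000))        # 187 (kb)

# Clones implied by 24-fold coverage at a 210 kb mean insert (HdrR library)
print(clones_for_coverage(24, 210, GENOME_MB))      # 91429

# At 14-fold coverage, essentially every locus should be represented
print(prob_locus_covered(60_000, 187, GENOME_MB) > 0.9999)   # True
```

The Clarke-Carbon estimate assumes random, unbiased cloning; regions that clone poorly in BAC vectors would be represented less often than this calculation suggests.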
7. Large-scale isolation of ESTs
The EST approach is a powerful technique for large scale
cloning of cDNAs as well as large-scale characterization of
cDNA sequences in functional genomic studies. Several
groups have isolated ESTs from the medaka embryo
(Kimura et al., 2004) and adult, and from specific tissues
such as the liver and ovary. As a result, by March 2004,
about 150,000 entries of medaka ESTs were found in public
databases. As described above, the mapping of isolated
ESTs is in progress and their number is increasing rapidly
(http://mbase.bioweb.ne.jp/~dclust/medaka_top.html).
Expression analyses of isolated ESTs are also under way
(http://www.embl-heidelberg.de/mepd/ and http://medaka.
lab.nig.ac.jp/).
To take full advantage of the large and rapidly growing
body of medaka EST information, new technologies will
most certainly be required. The most powerful and versatile
tool currently available is a high-density array of oligo-
nucleotides or cDNAs which can measure the levels of gene
expression for thousands of genes simultaneously. Oligo-
nucleotide microarrays with 8,091 genes isolated from
medaka embryos (Medaka Microarray 8 K) have been made
and tested for their usefulness in expression analyses of
developing embryos (Kimura et al., 2004). The use of
microarrays could also play an important role in candidate
gene identification as comparison of wild-type and mutant
embryos will permit the identification of affected transcripts
that correspond to a candidate gene or candidate pathway
for a particular mutant phenotype.
8. Perspectives: ongoing projects
8.1. Single nucleotide polymorphism (SNP) mapping:
towards a high-density map
SNPs are stable genetic variations that are spread throughout
the genome. The number of SNPs is huge and they can be
found in any region of the genome, although a non-uniform
distribution has been reported in the mouse genome (Wade
et al., 2002). Furthermore, with recent advances in genomic
technologies, SNPs can be mapped by high throughput
automated methods instead of conventional gel electrophor-
esis. The use of SNPs would thus allow for the generation of
an even higher density map, which would facilitate the fine-
scale mapping of mutants in addition to the assignment of
contigs created by the genome sequencing project (see
below) to a corresponding genomic region. Given that there
is a higher SNP rate (1–3%) found between northern and
southern inbred strains of medaka (in contrast to 0.2%
among mouse inbred strains, Waterston et al., 2002), medaka
could well be an ideal vertebrate for rapid construction of a
SNP map (Fig. 3B). We (KN, YK and HT) have therefore
commenced a SNP mapping project in the medaka.
In order to undertake SNP mapping, a reference typing
DNA panel from 94 back-cross progeny between the HNI
and Hd-rR medaka strains was generated (Fig. 3A). As a
pilot analysis, nine genes were selected from EST collec-
tions of Hd-rR and HNI strains, and were mapped either by
use of SNPs or by conventional gel-electrophoresis
methods. A high-throughput MALDI-TOF system (Jurinke
et al., 2002), was used for SNP mapping and, as shown in
Fig. 3C, the typing data that were obtained proved to be
identical to those found by the conventional method.
Furthermore, this SNP mapping system has great advantages
as it is relatively easy, faster than other techniques (a few
minutes for each SNP analysis in 96 well formats) and is
also more reliable. At present, large numbers of SNPs are
being collected by simple comparisons of sequences from
HNI strains (low-redundant coverage sequence data) with
those of a high-coverage Hd-rR strain (from the medaka
genome sequencing project, see below). The goal of this
project is to generate a set of 3000 SNP markers, and together
with a further 1700 EST markers, nearly 5000 invariant
genetic markers will be assigned throughout the medaka
genome. Theoretically, map resolution can reach the range
of 1 per 180–200 kb, the average size of a BAC insert, when
hundreds of meioses are used for positional cloning.
Fig. 3. SNP genotyping for constructing a high-resolution genetic map. (A) Construction of a new typing panel with 94 back-cross progeny using female meioses. A female of HNI and a male of Hd-rR were
mated to obtain F1 offspring. An F1 female was backcrossed to a male of Hd-rR and 94 back-cross progeny were obtained. (B) Example of SNPs in the coding regions between HNI and Hd-rR strains. Several
substitutions were observed even in the coding regions. (C) Comparison of genotyping data obtained by MALDI-TOF and PCR-RFLP (conventional) systems. Genotyping for all individuals was identical
between the two systems, except for a few missing results from the PCR-RFLP method.
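To see why a 1–3% SNP rate makes marker discovery easy, and how the projected marker set translates into map density, consider the following sketch. It assumes that SNPs occur independently and uniformly along the sequence, which the text itself notes is not strictly true in mouse, and it uses the 800–1000 Mb range given for the medaka genome. The function names and the 100 bp fragment length are hypothetical choices for illustration.

```python
def prob_at_least_one_snp(fragment_bp: int, snp_rate: float) -> float:
    """Chance that a fragment carries one or more SNPs, assuming independence."""
    return 1.0 - (1.0 - snp_rate) ** fragment_bp


def marker_spacing_kb(genome_mb: float, n_markers: int) -> float:
    """Average spacing if markers were evenly distributed."""
    return genome_mb * 1000.0 / n_markers


# A 100 bp stretch compared between strains
print(round(prob_at_least_one_snp(100, 0.01), 3))    # 0.634 (medaka, 1% rate)
print(round(prob_at_least_one_snp(100, 0.002), 3))   # 0.181 (mouse inbred strains, 0.2%)

# 3000 SNP markers plus 1700 EST markers
n_markers = 3000 + 1700
for genome_mb in (800, 1000):
    print(genome_mb, round(marker_spacing_kb(genome_mb, n_markers)))
# 800 170
# 1000 213
```

The resulting spacing of roughly 170–213 kb brackets the 180–200 kb range quoted above, which is close to the size of a single BAC insert.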
8.2. Genome sequencing
Over the past few years, more than 30 organisms
including human and mouse have had their genomes
completely sequenced and other model organisms are
currently being sequenced or are on a waiting list for future
genome sequencing projects. The complete draft sequence
of Fugu (Aparicio et al., 2002) has had a high impact on
medaka genomics, as medaka and Fugu are evolutionarily
close to each other (Miya et al., 2003), about 60 Myr apart,
in contrast to zebrafish (110–160 Myr divergence from
medaka; Wittbrodt et al., 2002; Naruse et al., 2004). In
2000, the zebrafish genome sequencing project began at the
Sanger Center (http://www.sanger.ac.uk/Info/Press/001121.
shtml) and sequencing data continues to be updated
frequently on the database web site (http://pre.ensembl.
org/Danio_rerio/). Rapid completion of the genomic
sequence of medaka will certainly be a crucial step in the
rapid movement from mutant phenotypes to characterizing
novel gene functions in addition to the analysis of genomic
evolution during fish diversification.
There are two main approaches for sequencing large,
complex genomes such as that of medaka: shotgun sequencing of
the entire genome (whole-genome shotgun, WGS) and
shotgun sequencing of BAC clones or contigs arranged by
fingerprinting or hybridization (hierarchical shotgun). The
WGS approach has the advantage of both simplicity and
rapid early coverage of the whole genome. Indeed WGS
sequencing is also useful for identifying genes; almost all
genes are identified by at least one database hit at a twofold
level of redundancy (Bouck et al., 1998). However, at even
higher redundancies (e.g. six-fold), gaps and misassembled
fragments remain that require further directed sequencing
to be resolved. Furthermore, the WGS approach may
encounter difficulties when applied to genomes that contain
highly repetitive sequences, such as human. The hierarch-
ical approach, on the other hand, overcomes such
difficulties by sequencing assembled contigs or BAC
clones and thus decreases the number of repeats within
sets of sequencing data. The hierarchical approach will
therefore be required for finishing the complete sequencing
of complex genomes.
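The relationship between sequencing redundancy and the fraction of the genome left unsampled can be illustrated with the Poisson idealisation commonly attributed to Lander and Waterman. This is a textbook approximation, not the assembly model used by the medaka project: it ignores repeats, cloning bias and read-length variation, which are exactly the factors that leave gaps and misassemblies in practice. Function names are hypothetical; the 726 Mb read total and the 800 Mb genome size are figures quoted in this review.

```python
import math


def fold_coverage(total_read_bases: float, genome_size: float) -> float:
    """Redundancy c = total sequenced bases / genome size."""
    return total_read_bases / genome_size


def fraction_uncovered(coverage: float) -> float:
    """Poisson idealisation: probability that a base is hit by no read, exp(-c)."""
    return math.exp(-coverage)


for c in (2, 6):
    missing = fraction_uncovered(c)
    print(f"{c}-fold: {missing:.4f} of the genome unsampled "
          f"(~{missing * 800:.0f} Mb of an assumed 800 Mb genome)")
# 2-fold: 0.1353 of the genome unsampled (~108 Mb of an assumed 800 Mb genome)
# 6-fold: 0.0025 of the genome unsampled (~2 Mb of an assumed 800 Mb genome)

# 726 Mb of reads over an assumed 800 Mb genome
print(round(fold_coverage(726e6, 800e6), 1))   # 0.9
```

Even under this optimistic model a six-fold shotgun leaves a small unsampled fraction, so directed finishing is still needed, as the text argues.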
Although the ultimate goal is to obtain the finished
sequence of the medaka genome, a draft sequence is needed
as rapidly as possible to accelerate the progress from
identifying mutant phenotypes to characterizing novel genes
and gene functions. Accordingly, a strategy was adopted
that is mainly based upon WGS, with the integration of
detailed map information and pair-mate sequence data from
large inserts in BAC and/or fosmid vectors. With the
support of the Grants-in-Aid for Scientific Research in
Priority Area ‘Genome Science’ from the Ministry of
Education, Culture, Sports, Science and Technology of
Japan (MEXT), this medaka genome project started in late
2002. A southern inbred strain, Hd-rR, was chosen for
sequencing, as most medaka mutants are of southern
Japanese origin. Sequencing is being carried out at the
Academia Sequencing Centre of the NIG in Mishima, Japan.
By the end of 2003, the sequence had 4-fold genome
coverage and reached 8.9-fold in May 2004
(http://dolphin.lab.nig.ac.jp/medaka/). Mapping information for ESTs and
SNPs, together with pair-mate sequences of large inserts, will be integrated
into the WGS sequence data to generate an initial draft
genome sequence. Additionally, in 2002, the National
Bioresource Project of MEXT supported medaka genome
sequencing (led by Y. Wakamatsu at Nagoya University;
http://shigen.lab.nig.ac.jp/medaka/genome/indexen.html)
and WGS data of approximately 1,000,000 reads (726 Mb,
approx. 0.9-fold genome coverage) of the Hd-rR genome
Table 3
Medaka research information and genomic resource websites
Site name | Content | URL
Medakafish homepage | Medaka resource portal site | http://biol1.bio.nagoya-u.ac.jp:8000/
MGI | Medaka genome initiative homepage | http://medaka.dsp.jst.go.jp/MGI/
ERATO (DMG) | Kondoh Differentiation Signaling Project (Developmental mutant group homepage) | http://medaka.dsp.jst.go.jp/DMG/
M base | EST and linkage database | http://mbase.bioweb.ne.jp/~dclust/ml_base.html
TIGR Medaka Gene Index | Integrated research data from international EST sequencing and gene research projects | http://www.tigr.org/tdb/tgi/olgi/
NBRP medakafish genome project | Genome sequence data | http://shigen.lab.nig.ac.jp/medaka/genome/indexen.html
MEPD | Medaka expression pattern data | http://www.embl-heidelberg.de/mepd/
Medaka EST database | EST sequence search and expression pattern data | http://medaka.lab.nig.ac.jp/
Mapping mutant with M-marker | Medaka mutant mapping system | http://medaka.lab.nig.ac.jp/
RZPD | German Resource Center for Genome Research | http://www.rzpd.de/cgi-bin/products/rzpd_products.pl.cgi
Medaka genome project | Medaka genome project | http://dolphin.lab.nig.ac.jp/medaka/
K. Naruse et al. / Mechanisms of Development 121 (2004) 619–628
produced by the Academia Sequencing Centre and
the RIKEN Institute are already available (July 2002;
http://shigen.lab.nig.ac.jp/medaka/genome/top.jsp). Initial
assembled medaka sequencing data of 6.7-fold genome
coverage will be available in the public domain via the NIG
and should be accessible by the time this review is
published. As a final step in the medaka genome project, the
isolation of ESTs will also be accelerated at the NIG to assist
in finding and annotating genes in the draft genome sequence
of the medaka. Alongside these projects, the medaka
community has launched the Medaka Genome Initiative
(Wittbrodt et al., 2002; http://medaka.dsp.jst.go.jp/MGI/),
which aims at the complete sequencing of the medaka genome.
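The coverage figures quoted in this section follow from a simple ratio of sequenced bases to genome size. The sketch below works through that arithmetic for the NBRP data set. It assumes a genome size of 800 Mb (the lower bound of the 800–1000 Mb estimate given in the Introduction), and the function names are illustrative only, not part of any project software. The covered-fraction estimate uses the idealized Poisson model of random shotgun sequencing, which ignores repeats and cloning bias and so is only a rough upper guide.

```python
import math


def fold_coverage(total_bases_mb: float, genome_size_mb: float) -> float:
    """Fold coverage = total sequenced bases / genome size (both in Mb)."""
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return total_bases_mb / genome_size_mb


def expected_fraction_covered(coverage: float) -> float:
    """Idealized Poisson estimate of the genome fraction hit by at least one read."""
    return 1.0 - math.exp(-coverage)


# 726 Mb in ~1,000,000 reads is the NBRP figure quoted in the text.
# The 800 Mb genome size is an assumption (lower bound of the cited estimate).
GENOME_SIZE_MB = 800
nbrp_coverage = fold_coverage(726, GENOME_SIZE_MB)
nbrp_fraction = expected_fraction_covered(nbrp_coverage)

# Mean read length implied by the same figures, in base pairs.
mean_read_length_bp = 726e6 / 1_000_000

print(f"coverage: {nbrp_coverage:.2f}-fold")
print(f"expected fraction covered: {nbrp_fraction:.2f}")
print(f"mean read length: {mean_read_length_bp:.0f} bp")
```

Under these assumptions the ratio reproduces the approx. 0.9-fold coverage stated for this data set; a larger assumed genome size would lower the estimate proportionally.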
9. Conclusions
The generation of hundreds of medaka mutants and the
unique phenotypes observed in some of them make this
organism an attractive model system for vertebrate genetics
and genomics that complements the zebrafish
(Wittbrodt et al., 2002; Naruse et al., 2004). As described
above, medaka genomic tools have been sufficiently
developed to permit rapid identification of mutated genes
(see Table 3). This involves the efficient determination of
approximate LG map positions by BSA with the M-marker
set, followed by fine mapping using thousands of mapped
markers (ESTs and SNPs), and finally, rapid identification
of candidate genes using existing information from other
vertebrate genomes. Genomic tools either are already in use
or will be available shortly, and the draft genome sequence
of the medaka should greatly accelerate these studies.
Furthermore, whole-genome comparisons among the zebrafish,
Fugu, medaka and mammalian draft genome sequences will
provide novel insights into the diversification of fish
species during evolution and shed light on vertebrate
genome evolution.
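The first step of the workflow summarized above, assigning a mutation to a linkage group by BSA, can be illustrated with a minimal sketch. It assumes a recessive mutation, so that a pool of homozygous mutants carries almost only the mutagenized-strain allele at linked markers, while a pool of wild-type siblings carries roughly one third; unlinked markers sit near one half in both pools. The marker names, linkage-group numbers and allele fractions are hypothetical and are not taken from the M-marker set, and the ranking function is an illustration rather than the published procedure.

```python
from collections import defaultdict


def rank_linkage_groups(markers):
    """Rank linkage groups by mean allele-frequency skew between pools.

    markers: iterable of (name, lg, mutant_pool_freq, sibling_pool_freq),
    where each freq is the fraction of the mutagenized-strain allele
    measured in that DNA pool. Returns (lg, mean_skew) pairs, highest first.
    """
    skews = defaultdict(list)
    for _name, lg, mutant_freq, sibling_freq in markers:
        skews[lg].append(mutant_freq - sibling_freq)
    means = {lg: sum(values) / len(values) for lg, values in skews.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical pooled genotyping results (not real medaka data).
markers = [
    ("mA", 1, 0.52, 0.48),   # unlinked: both pools near 0.5
    ("mB", 1, 0.47, 0.51),
    ("mC", 7, 0.96, 0.35),   # linked: mutant pool near 1.0, siblings near 1/3
    ("mD", 7, 0.91, 0.38),
    ("mE", 12, 0.55, 0.50),
]

ranking = rank_linkage_groups(markers)
best_lg, best_skew = ranking[0]
print(f"candidate linkage group: LG{best_lg} (mean skew {best_skew:.2f})")
```

In practice the candidate linkage group would then be confirmed by genotyping individual fish before fine mapping with the denser EST and SNP markers.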
Acknowledgements
The work presented here was supported in part by
Grants-in-Aid for Scientific Research in Priority Area
‘Genome Science’ (K.N., Y.K., H.H., H.T.), ‘Study of
Medaka as a Model for Organization and Evolution of the
Nuclear Genome’ (K.N.) and ‘Organized Research
Combination System’ (H.T.) from the Ministry of Education,
Culture, Sports, Science and Technology of Japan.
References
Aida, T., 1921. On the inheritance of colour in a freshwater fish,
Aplocheilus latipes Temminck and Schlegel, with special reference to
sex-linked inheritance. Genetics 6, 554–573.
Amores, A., Force, A., Yan, Y.L., Joly, L., Amemiya, C., Fritz, A., et al.,
1998. Zebrafish hox clusters and vertebrate genome evolution. Science
282, 1711–1714.
Aparicio, S., Chapman, J., Stupka, E., Putnam, N., Chia, J.M., Dehal, P.,
et al., 2002. Whole-genome shotgun assembly and analysis of the
genome of Fugu rubripes. Science 297, 1301–1310.
Bouck, J., Miller, W., Gorrell, J.H., Muzny, D., Gibbs, R.A., 1998. Analysis
of the quality and utility of random shotgun sequencing at low
redundancies. Genome Res. 8, 1074–1084.
Force, A., Lynch, M., Pickett, F.B., Amores, A., Yan, Y.L., Postlethwait, J.,
1999. Preservation of duplicate genes by complementary, degenerative
mutations. Genetics 151, 1531–1545.
Fujiyama, A., Watanabe, H., Toyoda, A., Taylor, T.D., Itoh, T., Tsai, S.F.,
et al., 2002. Construction and analysis of a human-chimpanzee
comparative clone map. Science 295, 131–134.
Fukamachi, S., Shimada, A., Shima, A., 2001. Mutations in the gene
encoding B, a novel transporter protein, reduce melanin content in
medaka. Nat. Genetics 28, 381–385.
Furutani-Seiki, M., Sasado, T., Morinaga, C., Suwa, H., Niwa, K., Yoda,
H., et al., 2004. A systematic genome-wide screen for mutations
affecting organogenesis in Medaka, Oryzias latipes. Mech. Dev. 121,
647–658.
Hedges, S.B., Kumar, S., 2002. Vertebrate genomes compared. Science 297,
1283–1285.
Hyodo-Taguchi, Y., 1996. Inbred strains of the medaka, Oryzias
latipes. The Fish Biol. J. Medaka 8, 11–14.
Ishikawa, Y., 2000. Medakafish as a model system for vertebrate
developmental genetics. Bioessays 22, 487–495.
Jurinke, C., van den Boom, D., Cantor, C.R., Koster, H., 2002. Automated
Genotyping using DNA MassARRAY Technology. Methods in Mol.
Biol. 187, 179–192.
Kimura, T., Jindo, T., Narita, T., Naruse, K., Kobayashi, D., Shin-I, T.,
et al., 2004. Large-scale isolation of ESTs from medaka embryos and its
application to medaka developmental genetics. Mech. Dev. 121,
915–932.
Kondo, M., Froschauer, A., Kitano, A., Nanda, I., Hornung, U., Volff, J.,
et al., 2002. Molecular cloning and characterization of DMRT genes
from the medaka Oryzias latipes and the platyfish Xiphophorus
maculatus. Gene 295, 213–222.
Kumar, S., Hedges, S.B., 1998. A molecular timescale for vertebrate
evolution. Nature 392, 917–920.
Lamatsch, D.K., Steinlein, C., Schmid, M., Schartl, M., 2000. Noninvasive
determination of genome size and ploidy level in fishes by flow
cytometry: detection of triploid Poecilia formosa. Cytometry 36,
91–95.
Loosli, F., Koster, R.W., Carl, M., Kuhnlein, R., Henrich, T., Mucke, M.,
et al., 2000. A genetic screen for mutations affecting embryonic
development in medaka fish (Oryzias latipes). Mech. Dev. 97,
133–139.
Loosli, F., Winkler, S., Burgtorf, C., Wurmbach, E., Ansorge, W., Henrich,
T., et al., 2001. Medaka eyeless is the key factor linking retinal
determination and eye growth. Development 128, 4035–4044.
Martinez-Morales, J., Naruse, K., Mitani, H., Shima, A., Wittbrodt, J.,
2004. Rapid chromosomal assignment of Medaka mutant by bulked
segregation analysis. Gene 329, 159–165.
Matsuda, M., Kawato, N., Asakawa, S., Shimizu, N., Nagahama, Y.,
Hamaguchi, S., et al., 2001. Construction of a BAC library derived from
the inbred Hd-rR strain of the teleost fish, Oryzias latipes. Genes Genet.
Syst. 76, 61–63.
Matsuda, M., Nagahama, Y., Shinomiya, A., Sato, T., Matsuda, C.,
Kobayashi, T., et al., 2002. DMY is a Y-specific DM-domain gene
required for male development in the medaka fish. Nature 417,
559–563.
Miya, M., Takeshima, H., Endo, H., Ishiguro, N.B., Inoue, J.G., Mukai, T.,
et al., 2003. Major patterns of higher teleostean phylogenies: a new
perspective based on 100 complete mitochondrial DNA sequences.
Mol. Phylogenet. Evol. 26, 121–138.
Nanda, I., Kondo, M., Hornung, U., Asakawa, S., Winkler, C., Shimizu, A.,
et al., 2002. A duplicated copy of DMRT1 in the sex-determining region
of the Y chromosome of the medaka, Oryzias latipes. Proc. Natl Acad.
Sci. 99, 11778–11783.
Naruse, K., Fukamachi, S., Mitani, H., Kondo, M., Matsuoka, T., Kondo,
S., et al., 2000. A detailed linkage map of medaka, Oryzias latipes:
comparative genomics and genome evolution. Genetics 154,
1773–1784.
Naruse, K., Tanaka, M., Mita, K., Shima, A., Postlethwait, J., Mitani, H.,
2004. Medaka gene map: the trace of ancestral vertebrate proto-
chromosomes revealed by comparative gene mapping. Genome Res.
14, 820–828.
Nelson, J.S., 1994. Fishes of the World, 3rd ed, Wiley, New York, NY.
Ohtsuka, M., Makino, S., Yoda, K., Wada, H., Naruse, K., Mitani, H., et al.,
1999. Construction of a linkage map of the medaka (Oryzias latipes)
and mapping of the Da mutant locus defective in dorsoventral
patterning. Genome Res. 9, 1277–1287.
Postlethwait, J.H., Woods, I.G., Ngo-Hazelett, P., Yan, Y.L., Kelly, P.D.,
Chu, F., et al., 2000. Zebrafish comparative genomics and the origins of
vertebrate chromosomes. Genome Res. 10, 1890–1902.
Sakaizumi, M., 1984. Rigid isolation between the northern population and
the southern population of the medaka, Oryzias latipes. Zool. Sci. 1,
795–800.
Sakaizumi, M., Jeon, S.R., 1987. Two divergent groups in the wild
population of medaka Oryzias latipes (Pisces: Oryziatidae) in Korea.
Korean J. Limnol. 20, 13–20.
Shima, A., Mitani, H., 2004. Medaka as a research organism: past, present
and future. Mech. Dev. 121, 599–604.
Shimada, A., Shima, A., 1988. Combination of genomic DNA fingerprint-
ing into the medaka specific-locus test system for studying environ-
mental germ-line mutagenesis. Mutation Res. 399, 149–165.
Takehana, Y., Nagai, N., Matsuda, M., Tsuchiya, K., Sakaizumi, M., 2003.
Geographic variation and diversity of the cytochrome b gene in
Japanese wild populations of medaka, Oryzias latipes. Zool. Sci. 20,
1279–1291.
Taylor, J.S., Braasch, I., Frickey, T., Meyer, A., Van de Peer, Y., 2003.
Genome duplication, a trait shared by 22000 species of ray-finned fish.
Genome Res. 13, 382–390.
Uwa, H., Iwata, A., 1981. Karyotype and cellular DNA content of Oryzias
javanicus (Oryziatidae, Pisces). Chromosome Info. Service 31, 24–26.
Wade, C.M., Kulbokas, E.J. III, Kirby, A.W., Zody, M.C., Mullikin, J.C.,
Lander, E.S., et al., 2002. The mosaic structure of variation in the
laboratory mouse genome. Nature 420, 574–578.
Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F.,
Agarwal, P., et al., 2002. Initial sequencing and comparative analysis of
the mouse genome. Nature 420, 520–562.
Winkler, S., Loosli, F., Henrich, T., Wakamatsu, Y., Wittbrodt, J., 2000.
The conditional medaka mutation eyeless uncouples patterning and
morphogenesis of the eye. Development 127, 1911–1919.
Wittbrodt, J., Meyer, A., Schartl, M., 1998. More genes in fish? BioEssays
20, 511–515.
Wittbrodt, J., Shima, A., Schartl, M., 2002. Medaka-a model organism from
the Far East. Nature Rev. Genet. 3, 53–64.
Yamamoto, T., 1953. Artificially induced sex-reversal in genotypic males
of medaka (Oryzias latipes). J. Exp. Zool. 123, 571–594.
Yamamoto, T., 1958. Artificial induction of functional sex-reversal in
genotypic females of the Medaka (Oryzias latipes). J. Exp. Zool. 137,
227–264.
Zadeh Khorasani, M., Hennig, S., Imre, G., Asakawa, S., Palczewski, S.,
Berger, A., et al., 2004. A first generation physical map of the medaka
genome in BACs essential for positional cloning and clone-by-clone
based genomic sequencing. Mech. Dev. 121, 903–913.