
1.

Genome sequence of ground tit Pseudopodoces humilis and its adaptation to high altitude

Qingle Cai Xiaoju Qian Yongshan Lang Yadan Luo Jiaohui Xu Shengkai Pan Yuanyuan Hui Caiyun Gou Yue Cai Meirong Hao Jinyang Zhao Songbo Wang Zhaobao Wang Xinming Zhang Rongjun He Jinchao Liu Longhai Luo Yingrui Li Jun Wang 《Genome biology》2013,14(3):R29

Background

The mechanism of high-altitude adaptation has been studied in certain mammals. However, in avian species like the ground tit Pseudopodoces humilis, the adaptation mechanism remains unclear. The phylogeny of the ground tit is also controversial.

Results

Using next generation sequencing technology, we generated and assembled a draft genome sequence of the ground tit. The assembly contained 1.04 Gb of sequence that covered 95.4% of the whole genome and had higher N50 values, at the level of both scaffolds and contigs, than other sequenced avian genomes. About 1.7 million SNPs were detected, 16,998 protein-coding genes were predicted and 7% of the genome was identified as repeat sequences. Comparisons between the ground tit genome and other avian genomes revealed a conserved genome structure and confirmed the phylogeny of ground tit as not belonging to the Corvidae family. Gene family expansion and positively selected gene analysis revealed genes that were related to cardiac function. Our findings contribute to our understanding of the adaptation of this species to extreme environmental living conditions.

Conclusions

Our data and analysis contribute to the study of avian evolutionary history and provide new insights into the adaptation mechanisms to extreme conditions in animals.
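The Results of this entry compare assemblies by their N50 values at scaffold and contig level. As a minimal sketch of that metric, using the standard definition rather than the authors' own pipeline, the following computes N50 from a list of sequence lengths. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of an assembly.

    N50 is the largest length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70: 80 + 70 = 150 is half of the total 300
```

A higher N50 means that half of the assembly sits in longer pieces, which is why the metric is commonly used to compare contiguity between assemblies of similar total size.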

2.

Cross-species overgo hybridization and comparative physical mapping within avian genomes

Romanov MN Dodgson JB 《Animal genetics》2006,37(4):397-399

The chicken genome sequence facilitates comparative genomics within other avian species. We performed cross-species hybridizations using overgo probes designed from chicken genomic and zebra finch expressed sequence tags (ESTs) to turkey and zebra finch BAC libraries. As a result, 3772 turkey BACs were assigned to 336 markers or genes, and 1662 zebra finch BACs were assigned to 164 genes. As expected, cross-hybridization was more successful with overgos within coding sequences than within untranslated regions, introns or flanking sequences, and more successful between chicken and turkey than between chicken and zebra finch or between zebra finch and turkey. These data contribute to the comparative alignment of avian genome maps using a 'one sequence, multiple genomes' strategy.

3.

Intrachromosomal rearrangements in avian genome evolution: evidence for regions prone to breakpoints

Skinner BM Griffin DK 《Heredity》2012,108(1):37-41

It is generally believed that the organization of avian genomes remains highly conserved in evolution as chromosome number is constant and comparative chromosome painting demonstrated there to be very few interchromosomal rearrangements. The recent sequencing of the zebra finch (Taeniopygia guttata) genome allowed an assessment of the number of intrachromosomal rearrangements between it and the chicken (Gallus gallus) genome, revealing a surprisingly high number of intrachromosomal rearrangements. With the publication of the turkey (Meleagris gallopavo) genome it has become possible to describe intrachromosomal rearrangements between these three important avian species, gain insight into the direction of evolutionary change and assess whether breakpoint regions are reused in birds. To this end, we aligned entire chromosomes between chicken, turkey and zebra finch, identifying syntenic blocks of at least 250 kb. Potential optimal pathways of rearrangements between each of the three genomes were determined, as was a potential Galliform ancestral organization. From this, our data suggest that around one-third of chromosomal breakpoint regions may recur during avian evolution, with 10% of breakpoints apparently recurring in different lineages. This agrees with our previous hypothesis that mechanisms of genome evolution are driven by hotspots of non-allelic homologous recombination.

4.

Dissecting a Hidden Gene Duplication: The Arabidopsis thaliana SEC10 Locus

Nemanja Vukašinović Fatima Cvrčková Marek Eliáš Rex Cole John E. Fowler Viktor Žárský Lukáš Synek 《PloS one》2014,9(4)


5.

Chromosome-scale haplotype-phased genome assemblies of the male and female lines of wild asparagus (Asparagus kiusianus), a dioecious plant species

Kenta Shirasawa Saki Ueta Kyoko Murakami Mostafa Abdelrahman Akira Kanno Sachiko Isobe 《DNA research》2022,29(1)

Asparagus kiusianus is a disease-resistant dioecious plant species and a wild relative of garden asparagus (Asparagus officinalis). To enhance A. kiusianus genomic resources, advance plant science, and facilitate asparagus breeding, we determined the genome sequences of the male and female lines of A. kiusianus. Genome sequence reads obtained with a linked-read technology were assembled into four haplotype-phased contig sequences (∼1.6 Gb each) for the male and female lines. The contig sequences were aligned onto the chromosome sequences of garden asparagus to construct pseudomolecule sequences. Approximately 55,000 potential protein-encoding genes were predicted in each genome assembly, and ∼70% of the genome sequence was annotated as repetitive. Comparative analysis of the genomes of the two species revealed structural and sequence variants between the two species as well as between the male and female lines of each species. Genes with high sequence similarity to the male-specific sex determinant gene in A. officinalis, MSE1/AoMYB35/AspTDF1, were present in the genomes of the male line but absent from the female genome assemblies. Overall, the genome sequence assemblies, gene sequences, and structural and sequence variants determined in this study will reveal the genetic mechanisms underlying sexual differentiation in plants, and will accelerate disease-resistance breeding in garden asparagus.

6.

Low diversity, activity, and density of transposable elements in five avian genomes

Bo Gao Saisai Wang Yali Wang Dan Shen Songlei Xue Cai Chen Hengmi Cui Chengyi Song 《Functional & integrative genomics》2017,17(4):427-439

In this study, we analyzed the activity, diversity, and density of transposable elements (TEs) across five avian genomes (budgerigar, chicken, turkey, medium ground finch, and zebra finch) to explore potential reasons for the small genome sizes of birds. We found that these avian genomes exhibited a low density of TEs, with about 10% genome coverage, and low TE diversity, with the TE landscapes dominated by CR1 and ERV elements; contrasting proliferation dynamics were observed both between TE types and between species across the five avian genomes. Phylogenetic analysis revealed that the CR1 clade was more diverse in family structure than the R2 clade in birds; avian ERVs were classified into four clades (alpha, beta, gamma, and ERV-L) belonging to three classes of ERV, with an uneven distribution across these lineages. The activities of DNA and SINE TEs were very low over the evolutionary history of avian genomes; most LINEs and LTRs were ancient copies whose activity has decreased substantially in recent times, with only LTRs and LINEs in chicken and zebra finch exhibiting weak, very recent activity, and very few TEs were intact; however, recent activity may be underestimated in some species because of the sequencing and assembly technologies used. Overall, this study demonstrates low diversity, activity, and density of TEs in the five avian species; highlights the differences in TEs among these lineages; and suggests that the current and recent activity of TEs in avian genomes is very limited, which may be one of the reasons for small genome sizes in birds.

7.

Genome assembly has a major impact on gene content: a comparison of annotation in two Bos taurus assemblies

Florea L Souvorov A Kalbfleisch TS Salzberg SL 《PloS one》2011,6(6):e21400

Gene and SNP annotation are among the first and most important steps in analyzing a genome. As the number of sequenced genomes continues to grow, a key question is: how does the quality of the assembled sequence affect the annotations? We compared the gene and SNP annotations for two different Bos taurus genome assemblies built from the same data but with significant improvements in the later assembly. The same annotation software was used for annotating both sequences. While some annotation differences are expected even between high-quality assemblies such as these, we found that a staggering 40% of the genes (>9,500) varied significantly between assemblies, due in part to the availability of new gene evidence but primarily to genome mis-assembly events and local sequence variations. For instance, although the later assembly is generally superior, 660 protein coding genes in the earlier assembly are entirely missing from the later genome's annotation, and approximately 3,600 (15%) of the genes have complex structural differences between the two assemblies. In addition, 12–20% of the predicted proteins in both assemblies have relatively large sequence differences when compared to their RefSeq models, and 6–15% of bovine dbSNP records are unrecoverable in the two assemblies. Our findings highlight the consequences of genome assembly quality on gene and SNP annotation and argue for continued improvements in any draft genome sequence. We also found that tracking a gene between different assemblies of the same genome is surprisingly difficult, due to the numerous changes, both small and large, that occur in some genes. As a side benefit, our analyses helped us identify many specific loci for improvement in the Bos taurus genome assembly.

8.

A versatile computational pipeline for bacterial genome annotation improvement and comparative analysis, with Brucella as a use case

Yu GX Snyder EE Boyle SM Crasta OR Czar M Mane SP Purkayastha A Sobral B Setubal JC 《Nucleic acids research》2007,35(12):3953-3962

We present a bacterial genome computational analysis pipeline, called GenVar. The pipeline, based on the program GeneWise, is designed to analyze an annotated genome and automatically identify missed gene calls and sequence variants such as genes with disrupted reading frames (split genes) and those with insertions and deletions (indels). For a given genome to be analyzed, GenVar relies on a database containing closely related genomes (such as other species or strains) as well as a few additional reference genomes. GenVar also helps identify gene disruptions probably caused by sequencing errors. We exemplify GenVar's capabilities by presenting results from the analysis of four Brucella genomes. Brucella is an important human pathogen and zoonotic agent. The analysis revealed hundreds of missed gene calls, new split genes and indels, several of which are species specific and hence provide valuable clues to the understanding of the genome basis of Brucella pathogenicity and host specificity.

9.

De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis

Minou Nowrousian Jason E. Stajich Meiling Chu Ines Engh Eric Espagne Karen Halliday Jens Kamerewerd Frank Kempken Birgit Knab Hsiao-Che Kuo Heinz D. Osiewacz Stefanie Pöggeler Nick D. Read Stephan Seiler Kristina M. Smith Denise Zickler Ulrich Kück Michael Freitag 《PLoS genetics》2010,6(4)


10.

Genome Sequence of the Pea Aphid Acyrthosiphon pisum

The International Aphid Genomics Consortium 《PLoS biology》2010,8(2)

Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems.

11.

Sequencing and comparative analyses of Aegilops tauschii chromosome arm 3DS reveal rapid evolution of Triticeae genomes

Jingzhong Xie Naxin Huo Shenghui Zhou Yi Wang Guanghao Guo Karin R. Deal Shuhong Ouyang Yong Liang Zhenzhong Wang Lichan Xiao Tingting Zhu Tiezhu Hu Vijay Tiwari Jianwei Zhang Hongxia Li Zhongfu Ni Yingyin Yao Huiru Peng Qixin Sun 《遗传学报》2017,44(1):51-61

Bread wheat (Triticum aestivum, AABBDD) is an allohexaploid species derived from two rounds of interspecific hybridizations. A high-quality genome sequence assembly of diploid Aegilops tauschii, the donor of the wheat D genome, will provide a useful platform to study polyploid wheat evolution. A combined approach of BAC pooling and next-generation sequencing technology was employed to sequence the minimum tiling path (MTP) of 3176 BAC clones from the short arm of Ae. tauschii chromosome 3 (At3DS). The final assembly of 135 super-scaffolds with an N50 of 4.2 Mb was used to build a 247-Mb pseudomolecule with a total of 2222 predicted protein-coding genes. Compared with the orthologous regions of rice, Brachypodium, and sorghum, At3DS contains 38.67% more genes. In comparison to At3DS, the short arm sequence of wheat chromosome 3B (Ta3BS) is 95 Mb larger, which is primarily due to the expansion of the non-centromeric region, suggesting that transposable element (TE) bursts in Ta3B likely occurred there. Also, the size increase is accompanied by a proportional increase in gene number in Ta3BS. We found that the sequence of the short arm of wheat chromosome 3D (Ta3DS) shows less than 0.27% gene loss compared to At3DS. Our study reveals divergent evolution of grass genomes and provides new insights into sequence changes in the polyploid wheat genome.

12.

Complete Chloroplast Genome of Sedum sarmentosum and Chloroplast Genome Evolution in Saxifragales

Wenpan Dong Chao Xu Tao Cheng Shiliang Zhou 《PloS one》2013,8(10)

Comparative chloroplast genome analyses are mostly carried out at lower taxonomic levels, such as the family and genus levels. At higher taxonomic levels, chloroplast genomes are generally used to reconstruct phylogenies. However, little attention has been paid to chloroplast genome evolution within orders. Here, we present the chloroplast genome of Sedum sarmentosum and take advantage of several available (or elucidated) chloroplast genomes to examine the evolution of chloroplast genomes in Saxifragales. The chloroplast genome of S. sarmentosum is 150,448 bp long and includes 82,212 bp of a large single-copy (LSC) region, 16,670 bp of a small single-copy (SSC) region, and a pair of 25,783 bp sequences of inverted repeats (IRs). The genome contains 131 unique genes, 18 of which are duplicated within the IRs. Based on a comparative analysis of chloroplast genomes from four representative Saxifragales families, we observed two gene losses and two pseudogenes in Paeonia obovata, and the loss of an intron was detected in the rps16 gene of Penthorum chinense. Comparisons among the 72 common protein-coding genes confirmed that the chloroplast genomes of S. sarmentosum and Paeonia obovata exhibit accelerated sequence evolution. Furthermore, a strong correlation was observed between the rates of genome evolution and genome size. The detected genome size variations are predominantly caused by the length of intergenic spacers, rather than losses of genes and introns, gene pseudogenization or IR expansion or contraction. The genome sizes of these species are negatively correlated with nucleotide substitution rates. Species with shorter life cycles tend to exhibit shorter chloroplast genomes than those with longer life cycles.
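The region lengths quoted in this abstract can be checked against the stated genome size, since a chloroplast genome with the usual quadripartite layout consists of one LSC, one SSC and two IR copies. The short sketch below does that arithmetic, reading the SSC length as 16,670 bp. The function name `quadripartite_length` is an illustrative assumption, not something from the paper.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a quadripartite chloroplast genome:
    one large single-copy region, one small single-copy region,
    and two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) as quoted in the abstract for Sedum sarmentosum.
total = quadripartite_length(lsc=82_212, ssc=16_670, ir=25_783)
print(total)  # 150448, matching the reported 150,448 bp
```

The sum agrees with the reported total only when the SSC is read as 16,670 bp, which supports that reading of the figure.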

13.

Evolutionary Dynamics of Overlapped Genes in Salmonella

Yingqin Luo Fabia Battistuzzi Kui Lin 《PloS one》2013,8(11)


14.

The complete sequence of the mitochondrial genome of Buteo buteo (Aves, Accipitridae) indicates an early split in the phylogeny of raptors

Haring E Kruckenhauser L Gamauf A Riesing MJ Pinsker W 《Molecular biology and evolution》2001,18(10):1892-1904

The complete sequence of the mitochondrial (mt) genome of Buteo buteo was determined. Its gene content and nucleotide composition are typical for avian genomes. Due to expanded noncoding sequences, Buteo possesses the longest mt genome sequenced so far (18,674 bp). The gene order comprising the control region and neighboring genes is identical to that of Falco peregrinus, suggesting that the corresponding rearrangement occurred before the falconid/accipitrid split. Phylogenetic analyses performed with the mt sequence of Buteo and nine other mt genomes suggest that for investigations at higher taxonomic levels (e.g., avian orders), concatenated rRNA and tRNA gene sequences are more informative than protein gene sequences with respect to resolution and bootstrap support. Phylogenetic analyses indicate an early split between Accipitridae and Falconidae, which, according to molecular dating of other avian divergence times, can be assumed to have taken place in the late Cretaceous 65-83 MYA.

15.

Extended sequence of the turkey MHC B-locus and sequence variation in the highly polymorphic B-G loci

Bauer MM Reed KM 《Immunogenetics》2011,63(4):209-221

Genetic variation in the major histocompatibility complex (MHC) is directly correlated to differences in disease resistance. Immunity is greatly dependent on highly polymorphic genes in the MHC, such as class I, class II, and class III complement genes. Preliminary studies of wild turkey populations show extreme polymorphisms in a family of genes exclusive to the avian MHC, the class IV or B-G genes. The significance of this variation is unclear as there are few and conflicting studies of the expression of these genes. Confounding understanding of B-G variation is the lack of a complete delineation of the number of loci in the turkey genome. Direct 454 sequencing of a clone from the CHORI-260 BAC library was used to extend the turkey MHC B-locus sequence, identifying five additional complete B-locus genes including two B-G loci. Sequences of the new B-G genes were compared with those of other turkey genes (BG1–3) and sequences available for other Galliformes. Phylogenetic analysis shows species-specific gene evolution supporting a birth–death model of evolution for the B-G gene family. Analysis of variation within the signal peptide sequence (exon 1) found two clusters of polymorphism among the turkey B-G genes. Resequencing of exon 1 in a diverse sample including wild, heritage, and commercial turkeys confirmed multiple alleles at each B-G gene. Future studies aim to correlate B-G variation with group and individual immunological differences.

16.

The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

Guozheng Liu Dandan Cao Shuangshuang Li Aiguo Su Jianing Geng Corrinne E. Grover Songnian Hu Jinping Hua 《PloS one》2013,8(8)

Background

Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant worldwide. Sequencing the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes.

Methodology/Principal Findings

We sequenced the Gossypium hirsutum mt genome using 454 technology combined with screening of a fosmid library and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes.

Conclusion

The G. hirsutum mt genome possesses most of the common characteristics of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and gene content among plant mt genomes, suggests that the evolution of mt genomes is consistent with plant taxonomy but independent among different species.

17.

Gene duplication and fragmentation in the zebra finch major histocompatibility complex

Christopher N Balakrishnan Robert Ekblom Martin Völker Helena Westerdahl Ricardo Godinez Holly Kotkiewicz David W Burt Tina Graves Darren K Griffin Wesley C Warren Scott V Edwards 《BMC biology》2010,8(1):1-19

Background

Due to its high polymorphism and importance for disease resistance, the major histocompatibility complex (MHC) has been an important focus of many vertebrate genome projects. Avian MHC organization is of particular interest because the chicken Gallus gallus, the avian species with the best characterized MHC, possesses a highly streamlined minimal essential MHC, which is linked to resistance against specific pathogens. It remains unclear the extent to which this organization describes the situation in other birds and whether it represents a derived or ancestral condition. The sequencing of the zebra finch Taeniopygia guttata genome, in combination with targeted bacterial artificial chromosome (BAC) sequencing, has allowed us to characterize an MHC from a highly divergent and diverse avian lineage, the passerines.

Results

The zebra finch MHC exhibits a complex structure and history involving gene duplication and fragmentation. The zebra finch MHC includes multiple Class I and Class II genes, some of which appear to be pseudogenes, and spans a much more extensive genomic region than the chicken MHC, as evidenced by the presence of MHC genes on each of seven BACs spanning 739 kb. Cytogenetic (FISH) evidence and the genome assembly itself place core MHC genes on as many as four chromosomes with TAP and Class I genes mapping to different chromosomes. MHC Class II regions are further characterized by high endogenous retroviral content. Lastly, we find strong evidence of selection acting on sites within passerine MHC Class I and Class II genes.

Conclusion

The zebra finch MHC differs markedly from that of the chicken, the only other bird species with a complete genome sequence. The apparent lack of synteny between TAP and the expressed MHC Class I locus is in fact reminiscent of a pattern seen in some mammalian lineages and may represent convergent evolution. Our analyses of the zebra finch MHC suggest a complex history involving chromosomal fission, gene duplication and translocation in the history of the MHC in birds, and highlight striking differences in MHC structure and organization among avian lineages.

18.

Long-read sequence assembly: a technical evaluation in barley

Martin Mascher Thomas Wicker Jerry Jenkins Christopher Plott Thomas Lux Chu Shin Koh Jennifer Ens Heidrun Gundlach Lori B Boston Zuzana Tulpová Samuel Holden Inmaculada Hernández-Pinzón Uwe Scholz Klaus F X Mayer Manuel Spannagl Curtis J Pozniak Andrew G Sharpe Hana Šimková Matthew J Moscou Jane Grimwood Jeremy Schmutz Nils Stein 《The Plant cell》2021,33(6):1888

Sequence assembly of large and repeat-rich plant genomes has been challenging, requiring substantial computational resources and often several complementary sequence assembly and genome mapping approaches. The recent development of fast and accurate long-read sequencing by circular consensus sequencing (CCS) on the PacBio platform may greatly increase the scope of plant pan-genome projects. Here, we compare current long-read sequencing platforms regarding their ability to rapidly generate contiguous sequence assemblies in pan-genome studies of barley (Hordeum vulgare). Most long-read assemblies are clearly superior to the current barley reference sequence based on short-reads. Assemblies derived from accurate long reads excel in most metrics, but the CCS approach was the most cost-effective strategy for assembling tens of barley genomes. A downsampling analysis indicated that 20-fold CCS coverage can yield very good sequence assemblies, while even five-fold CCS data may capture the complete sequence of most genes. We present an updated reference genome assembly for barley with near-complete representation of the repeat-rich intergenic space. Long-read assembly can underpin the construction of accurate and complete sequences of multiple genomes of a species to build pan-genome infrastructures in Triticeae crops and their wild relatives.

A greatly improved reference genome sequence of barley was assembled from accurate long reads.

19.

Molecular evolutionary genomics of birds

Ellegren H 《Cytogenetic and genome research》2007,117(1-4):120-130

Insight into the molecular evolution of birds has been offered by the steady accumulation of avian DNA sequence data, recently culminating in the first draft sequence of an avian genome, that of chicken. By studying avian molecular evolution we can learn about adaptations and phenotypic evolution in birds, and also gain an understanding of the similarities and differences between mammalian and avian genomes. In both these lineages, there is pronounced isochore structure with highly variable GC content. However, while mammalian isochores are decaying, they are maintained in the chicken lineage, which is consistent with a biased gene conversion model where the high and variable recombination rate of birds reinforces heterogeneity in GC. In Galliformes, GC is positively correlated with the rate of nucleotide substitution; the mean neutral mutation rate is 0.12-0.15% at each site per million years but this estimate comes with significant local variation in the rate of mutation. Comparative genomics reveals lower d(N)/d(S) ratios on micro- compared to macrochromosomes, possibly due to population genetic effects or a non-random distribution of genes with respect to chromosome size. A non-random genomic distribution is shown by genes with sex-biased expression, with male-biased genes over-represented and female-biased genes under-represented on the Z chromosome. A strong effect of selection is evident on the non-recombining W chromosome with high d(N)/d(S) ratios and limited polymorphism. Nucleotide diversity in chicken is estimated at 4-5 x 10(-3) which might be seen as surprisingly high given presumed bottlenecks during domestication, but is lower than that recently observed in several natural populations of other species. 
Several important aspects of the molecular evolutionary process of birds remain to be understood and it can be anticipated that the upcoming genome sequence of a second bird species, the zebra finch, as well as the integration of data on gene expression, shall further advance our knowledge of avian evolution.

20.

Gene identification in novel eukaryotic genomes by self-training algorithm

Lomsadze A Ter-Hovhannisyan V Chernoff YO Borodovsky M 《Nucleic acids research》2005,33(20):6494-6506

Finding new protein-coding genes is one of the most important goals of eukaryotic genome sequencing projects. However, the genomic organization of novel eukaryotic genomes is diverse, and ab initio gene finding tools tuned for previously studied species are rarely suitable for efficacious gene hunting in DNA sequences of a new genome. Gene identification methods based on cDNA and expressed sequence tag (EST) mapping to genomic DNA, or those using alignments to closely related genomes, rely on the existence of abundant cDNA and EST data and/or the availability of reference genomes. Conventional statistical ab initio methods require large training sets of validated genes for estimating gene model parameters. In practice, none of these types of data may be available in sufficient amounts until rather late stages of the sequencing of a novel genome. Nevertheless, we have shown that gene finding in eukaryotic genomes could be carried out in parallel with the estimation of statistical models directly from as yet anonymous genomic DNA. The suggested method of parallelizing gene prediction with model parameter estimation follows the path of iterative Viterbi training. Rounds of genomic sequence labeling into coding and non-coding regions are followed by rounds of model parameter estimation. Several dynamically changing restrictions on the possible range of model parameters are added to filter out fluctuations in the initial steps of the algorithm that could redirect the iteration process away from the biologically relevant point in parameter space. Tests on well-studied eukaryotic genomes have shown that the new method performs comparably to or better than conventional methods in which supervised model training precedes the gene prediction step. Several novel genomes have been analyzed and biologically interesting findings are discussed.
Thus, a self-training algorithm that had been assumed feasible only for prokaryotic genomes has now been developed for ab initio eukaryotic gene identification.