Similar articles
 20 similar articles found (search time: 343 ms)
1.
Copy number variations (CNVs) have provided a dynamic aspect to the apparently static human genome. We have analyzed CNVs larger than 100 kb in 477 healthy individuals from 26 diverse Indian populations of different linguistic, ethnic and geographic backgrounds. CNV regions (CNVRs) were identified using the Affymetrix 50K Xba 240 Array. We observed 1,425 and 1,337 CNVRs in the deletion and amplification sets, respectively, after pooling data from all the populations. More than 50% of the genes encompassed entirely in CNVs had both deletions and amplifications. There was wide variability across populations not only with respect to CNV extent (ranging from 0.04–1.14% of the genome under deletion and 0.11–0.86% under amplification) but also in terms of functional enrichment of processes such as keratinization, serine proteases and their inhibitors, cadherins, homeobox and olfactory receptors. These did not correlate with the linguistic, ethnic or geographic backgrounds or the sizes of the populations. Certain processes were nearly exclusive to the deletion (serine proteases, keratinization, olfactory receptors, GPCRs) or duplication (homeobox, serine protease inhibitors, embryonic limb morphogenesis) datasets. Populations with the same enriched processes were observed to contain genes from different genomic loci. Comparison of polymorphic CNVRs (5% or more) with those cataloged in the Database of Genomic Variants revealed that 78% (2473) of the genes in CNVRs in Indian populations are novel. Validation of CNVs using Sequenom MassARRAY revealed extensive heterogeneity in CNV boundaries. Exploration of CNV profiles in such diverse populations would provide a widely valuable resource for understanding diversity in phenotypes and disease.

2.
Due to the recent growth of the alien population in Korea, tracking the ethnic origin of human specimens has gained importance in genetic forensics. To address this issue, we developed a method to analyze single nucleotide polymorphisms (SNPs) based on next-generation sequencing (NGS) technology, which is now being used for personal identification in forensics. We designed a panel of 153 Korean-specific high-performance multiplexed SNPs and performed NGS using 233 DNA samples collected from eight different ethnic groups around the world. Eight Korean-specific genetic markers (rs28777, rs1010872, rs6043841, rs6034433, rs885479, rs2503107, rs4530059, and rs214955) that were screened showed significant variability among the ethnic groups. Three markers were novel SNPs that were absent from our multiplexed SNP panel, and two markers were associated with specific phenotypes. Our high-performance multiplexed SNP panel allows efficient screening of Korean-specific SNP alleles in populations that are genetically similar to the Korean population (e.g., Japanese and northeast Asians including Chinese and Eurasians), which will be useful for personal identification, paternity testing, and forensic investigations.

3.
Advances in high-throughput sequencing have promoted the collection of reference genomes and genome-wide diversity. However, the assessment of genomic variation among populations has hitherto mainly been surveyed through single-nucleotide polymorphisms (SNPs) and largely ignored the often major fraction of genomes represented by transposable elements (TEs). Despite accumulating evidence supporting the evolutionary significance of TEs, comprehensive surveys remain scarce. Here, we sequenced the full genomes of 304 individuals of Arabis alpina sampled from four nearby natural populations to genotype SNPs as well as polymorphic long terminal repeat retrotransposons (polymorphic TEs; i.e., presence/absence of TE insertions at specific loci). We identified 291,396 SNPs and 20,548 polymorphic TEs, comparing their contributions to genomic diversity and divergence across populations. Few SNPs were shared among populations and overall showed high population-specific variation, whereas most polymorphic TEs segregated among populations. The genomic context of these two classes of variants further highlighted candidate adaptive loci having a putative impact on functional genes. In particular, 4.96% of the SNPs were identified as nonsynonymous or affecting start/stop codons. In contrast, 43% of the polymorphic TEs were present next to Arabis genes enriched in functional categories related to the regulation of reproduction and responses to biotic as well as abiotic stresses. This unprecedented data set, mapping variation gained from SNPs and complementary polymorphic TEs within and among populations, will serve as a rich resource for addressing microevolutionary processes shaping genome variation.

4.
Secondary metabolites (SMs) are crucial for fungi and vary in function from beneficial antibiotics to pathogenicity factors. To generate diversified SMs that enable different functions, SM-coding regions rapidly evolve in fungal genomes. However, the driving force and genetic mechanism of fungal SM diversification in the context of host-pathogen interactions remain largely unknown. Previously, we grouped field populations of the rice blast fungus Magnaporthe oryzae (syn: Pyricularia oryzae) into three major globally distributed clades based on population genomic analyses. Here, we characterize a recent duplication of an avirulent gene-containing SM cluster, ACE1, in a clonal M. oryzae population (Clade 2). We demonstrate that the ACE1 cluster is specifically duplicated in Clade 2, a dominant clade in indica rice-growing areas. With long-read sequencing, we obtained chromosome-level genome sequences of four Clade 2 isolates, which displayed differences in genomic organization of the ACE1 duplication process. Comparative genomic analyses suggested that the original ACE1 cluster experienced frequent rearrangement in Clade 2 isolates and revealed that the new ACE1 cluster is located in a newly formed and transposable element-rich region. Taken together, these results highlight the frequent mutation and expansion of an avirulent gene-containing SM cluster through transposable element-mediated whole-cluster duplication in the context of host-pathogen interactions.

5.
Konkel MK, Wang J, Liang P, Batzer MA. Gene 2007, 390(1-2): 28-38
Mobile elements represent a relatively new class of markers for the study of human evolution. Long interspersed elements (LINEs) belong to a group of retrotransposons comprising approximately 21% of the human genome. Young LINE-1 (L1) elements that have integrated recently into the human genome can be polymorphic for insertion presence/absence in different human populations at particular chromosomal locations. To identify putative novel L1 insertion polymorphisms, we computationally compared two draft assemblies of the whole human genome (Public and Celera Human Genome assemblies). We identified a total of 148 potential polymorphic L1 insertion loci, among which 73 were candidates for novel polymorphic loci. Based on additional analyses, we selected 34 loci for further experimental studies. PCR-based assays and DNA sequence analysis were performed for these 34 loci in 80 unrelated individuals from four diverse human populations: African-American, Asian, Caucasian, and South American. All but two of the selected loci were confirmed as polymorphic in our human population panel. Approximately 47% of the analyzed loci integrated into other repetitive elements, most commonly older L1s. One of the insertions was accompanied by a BC200 sequence. Collectively, these mobile elements represent a valuable source of genomic polymorphism for the study of human population genetics. Our results also suggest that the exhaustive identification of L1 insertion polymorphisms is far from complete, and new whole genome sequences are valuable sources for finding novel retrotransposon insertion polymorphisms.

6.
The discovery of RFLPs and their utilization as genetic markers has revolutionized research in human molecular genetics. However, only a fraction of the DNA sequence polymorphisms in the human genome affect the length of a restriction fragment and hence result in an RFLP. Polymorphisms that are not detected as RFLPs are typically passed over in the screening process though they represent a potentially important source of informative genetic markers. We have used a rapid method for the detection of naturally occurring DNA sequence variations that is based on enzymatic amplification and direct sequencing of genomic DNA. This approach can detect essentially all useful sequence variations within the region screened. We demonstrate the feasibility of the technique by applying it to the human retinoblastoma susceptibility locus. We screened 3,712 bp of genomic DNA from each of nine individuals and found four DNA sequence polymorphisms. At least one of these DNA sequence polymorphisms was informative in each of three families with hereditary retinoblastoma that were not informative with any of the known RFLPs at this locus. We believe that direct sequencing is a reasonable alternative to other methods of screening for DNA sequence polymorphisms and that it represents a step forward for obtaining informative markers at well-characterized loci that have been minimally informative in the past.

7.
Microsatellites, or short tandem repeats, are useful markers for genetic analysis because of their high frequency of occurrence over the genome, high information content due to variable repeat lengths, and ease of typing. To establish a panel of microsatellite markers useful for genetic studies of the Korean population, the allele frequencies and heterozygosities of 207 microsatellite markers in 119 unrelated Korean, Indian and Pakistani individuals were compared. The average heterozygosity of the Korean population was 0.71, similar to that of the Indian and Pakistani populations. More than 80% of the markers showed heterozygosity of over 0.6 and were valuable as genetic markers for genome-wide screening for disease susceptibility loci in these populations. To identify the allelic distributions of the multilocus genetic data from these microsatellite markers, the population structures were assessed by clustering. These markers supported, with the highest probability, three clustering groups corresponding to the three geographical populations. When we assumed only two hypothetical clusters (K = 2), the Korean population was separate from the others, suggesting a relatively deep divergence of the Korean population. The present 207 microsatellite markers appear to reflect the historical and geographical origins of the different populations as well as displaying a similar degree of variation to that seen in previously published genetic data. Thus, these markers will be useful as a reference for human genetic studies on Asians.
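
The heterozygosity values quoted in the abstract above can be illustrated with a minimal sketch. It assumes that "heterozygosity" means expected heterozygosity (gene diversity, one minus the sum of squared allele frequencies), which the abstract does not state explicitly. The function name and the example allele counts are hypothetical.

```python
def expected_heterozygosity(allele_counts):
    """Expected heterozygosity H = 1 - sum(p_i^2) for one marker.

    allele_counts maps an allele label (e.g. repeat length) to the number
    of times it was observed. This is the textbook estimator without the
    small-sample correction; it is an assumption, not the study's method.
    """
    total = sum(allele_counts.values())
    if total <= 0:
        raise ValueError("at least one allele observation is required")
    freqs = [count / total for count in allele_counts.values()]
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical marker with four equally frequent alleles.
example = {"(CA)12": 10, "(CA)13": 10, "(CA)14": 10, "(CA)15": 10}
h = expected_heterozygosity(example)
```

A marker with many alleles at similar frequencies scores close to 1, which is why highly variable microsatellites tend to exceed the 0.6 threshold mentioned in the abstract.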

8.
Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low coverage. The assessed copy-number genotypes were highly concordant with our qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95-99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes across the more than 800 CNV-enriched human olfactory receptor (OR) gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ~15% and ~20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies. CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high-throughput sequencing.
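
To make the depth-of-coverage idea in the abstract above concrete, here is a deliberately naive sketch. It is not CopySeq's statistical framework, which the abstract does not describe in detail. It simply assumes that read depth scales linearly with copy number and that a set of reference regions is known to be diploid. The function name, parameters and depth values are hypothetical.

```python
from statistics import median


def naive_copy_number(region_depth, diploid_reference_depths, ploidy=2):
    """Estimate an integer copy number from mean read depth.

    Assumption: depth is proportional to copy number, and the median depth
    of the reference regions corresponds to `ploidy` copies. Real callers
    also model GC bias, mappability and sampling noise.
    """
    baseline = median(diploid_reference_depths)
    if baseline <= 0:
        raise ValueError("reference depth must be positive")
    return round(ploidy * region_depth / baseline)


# Hypothetical depths: references at about 30x, test region at about 15x.
reference = [29.0, 30.0, 31.0]
heterozygous_deletion = naive_copy_number(15.0, reference)
```

Under these assumptions, half the baseline depth maps to one copy and zero depth maps to a homozygous deletion, which is the distinction between heterozygous and homozygous CNVs that the abstract highlights.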

9.
Despite considerable excitement over the potential functional significance of copy-number variants (CNVs), we still lack knowledge of the fine-scale architecture of the large majority of CNV regions in the human genome. In this study, we used a high-resolution array-based comparative genomic hybridization (aCGH) platform that targeted known CNV regions of the human genome at approximately 1 kb resolution to interrogate the genomic DNAs of 30 individuals from four HapMap populations. Our results revealed that 1020 of 1153 CNV loci (88%) were actually smaller in size than what is recorded in the Database of Genomic Variants based on previously published studies. A reduction in size of more than 50% was observed for 876 CNV regions (76%). We conclude that the total genomic content of currently known common human CNVs is likely smaller than previously thought. In addition, approximately 8% of the CNV regions observed in multiple individuals exhibited genomic architectural complexity in the form of smaller CNVs within larger ones and CNVs with interindividual variation in breakpoints. Future association studies that aim to capture the potential influences of CNVs on disease phenotypes will need to consider how to best ascertain this previously uncharacterized complexity.

10.
India represents an amazing confluence of geographically, linguistically and socially disparate ethnic populations (Indian Genome Variation Consortium, J Genet 87:3–20, 2008). Understanding the genetic diversity of the Indian population remains a daunting task. In this paper we present a detailed analysis of genomic variations (high-depth coverage (~30×) using the Illumina HiSeq 2000 platform) from three healthy Indian male individuals, each from one of three geographically delineated regions and linguistic phyla, viz. the high-altitude region of Ladakh (Tibeto-Burman linguistic phylum), the sub-mountainous region of Kumaun (Indo-European linguistic phylum) and the sea-level region of Telangana (Dravidian linguistic phylum), for probing the extent of genetic diversity in our population. The sequencing analysis provided high-quality data (~95% of the total reads aligned to the human reference genome for each sample) and very good alignment quality (>80% of the filtered mapped reads had a quality score of 60). A total of 4.3, 3.7 and 4.3 million single nucleotide variations were identified in the high-altitude, sub-mountainous and sea-level genomes, respectively, by comparison with the human reference genome. Approximately 17.3%, 18.2% and 17.4% of the variants were unique to the three genomes, respectively. The study identified many novel variations in the three diverse genomes (132,970 in the Ladakh, 112,317 in the Kumaun and 128,881 in the Telangana individual) and is an important resource for creating a baseline and a comprehensive catalogue of human genomic variation across the Indian as well as the Asian continent.

11.
BACKGROUND/AIMS: The L1 retrotransposable element family is the most successful self-replicating genomic parasite of the human genome. L1 elements drive replication of Alu elements, and both have had far-reaching impacts on the human genome. We use L1 and Alu insertion polymorphisms to analyze human population structure. METHODS: We genotyped 75 recent, polymorphic L1 insertions in 317 individuals from 21 populations in sub-Saharan Africa, East Asia, Europe and the Indian subcontinent. This is the first sample of L1 loci large enough to support detailed population genetic inference. We analyzed these data in parallel with a set of 100 polymorphic Alu insertion loci previously genotyped in the same individuals. RESULTS AND CONCLUSION: The data sets yield congruent results that support the recent African origin model of human ancestry. A genetic clustering algorithm detects clusters of individuals corresponding to continental regions. The number of loci sampled is critical: with fewer than 50 typical loci, structure cannot be reliably discerned in these populations. The inclusion of geographically intermediate populations (from India) reduces the distinctness of clustering. Our results indicate that human genetic variation is neither perfectly correlated with geographic distance (purely clinal) nor independent of distance (purely clustered), but a combination of both: stepped clinal.

12.
13.
Retroelements (REs) occupy up to 40% of the human genome. Newly integrated REs can change the pattern of expression of pre-existing host genes and therefore might play a significant role in evolution. In particular, human- and primate-specific REs could affect the divergence of the Hominoidea superfamily. A comparative genome-wide analysis of RE sites of integration, neighboring genes, and their regulatory interplay in human and ape genomes would be of help in understanding the impact of REs on evolution and genome regulation. We have developed a technique for the genome-wide comparison of the integrations of transposable elements in genomic DNAs of closely related species. The technique, called targeted genome differences analysis (TGDA), is also useful for the detection of deletion/insertion polymorphisms of REs. The technique is based on an enhanced version of subtractive hybridization and does not require preliminary knowledge of the genome sequences under comparison. In this report, we describe its application to the detection and analysis of human-specific L1 integrations and their polymorphisms. We obtained a library highly enriched in human-specific L1 insertions and identified 24 such new insertions. Many of these insertions are polymorphic in human populations. The total number of human-specific L1 inserts was estimated to be approximately 4000. The results suggest that TGDA is a universal method that can be successfully used for the detection of evolutionary and polymorphic markers in any closely related genomes.

14.
Alu elements are transposable elements that have reached over one million copies in the human genome. Some Alu elements inserted in the genome so recently that they are still polymorphic for insertion presence or absence in human populations. Recently, there has been an increasing interest in using Alu variation for studies of human population genetic structure and inference of individual geographic origin. Currently, this requires a high number of Alu loci. Here, we used a linker-mediated polymerase chain reaction method to preferentially identify low-frequency Alu elements in various human DNA samples with different geographic origins. The candidate Alu loci were subsequently genotyped in 18 worldwide human populations (approximately 370 individuals), resulting in the identification of two new Alu insertions restricted to populations of African ancestry. Our results suggest that it may ultimately become possible to correctly infer the geographic affiliation of unknown samples with high levels of confidence without having to genotype as many as 100 Alu loci. This is desirable if Alu insertion polymorphisms are to be used for human evolution studies or forensic applications.

15.
Olfactory receptors (ORs), which are involved in odorant recognition, form the largest mammalian protein superfamily. The genomic content of OR genes is considerably reduced in humans, as reflected by the relatively small repertoire size and the high fraction (approximately 55%) of human pseudogenes. Since several recent low-resolution surveys suggested that OR genomic loci are frequently affected by copy-number variants (CNVs), we hypothesized that CNVs may play an important role in the evolution of the human olfactory repertoire. We used high-resolution oligonucleotide tiling microarrays to detect CNVs across 851 OR gene and pseudogene loci. Examining genomic DNA from 25 individuals with ancestry from three populations, we identified 93 OR gene loci and 151 pseudogene loci affected by CNVs, generating a mosaic of OR dosages across persons. Our data suggest that approximately 50% of the CNVs involve more than one OR, with the largest CNV spanning 11 loci. In contrast to earlier reports, we observe that CNVs are more frequent among OR pseudogenes than among intact genes, presumably due to both selective constraints and CNV formation biases. Furthermore, our results show an enrichment of CNVs among ORs with a close human paralog or lacking a one-to-one ortholog in chimpanzee. Interestingly, among the latter we observed an enrichment in CNV losses over gains, a finding potentially related to the known diminution of the human OR repertoire. Quantitative PCR experiments performed for 122 sampled ORs agreed well with the microarray results and uncovered 23 additional CNVs. Importantly, these experiments allowed us to uncover nine common deletion alleles that affect 15 OR genes and five pseudogenes. Comparison to the chimpanzee reference genome revealed that all of the deletion alleles are human derived, therefore indicating a profound effect of human-specific deletions on the individual OR gene content. Furthermore, these deletion alleles may be used in future genetic association studies of olfactory inter-individual differences.

16.
Chung HY, Kim TH, Choi BH, Jang GW, Lee JW, Lee KT, Ha JM. Biochemical Genetics 2006, 44(11-12): 527-541
Microsatellite loci were isolated using five repetitive probes for Korean native cattle. Eleven microsatellite loci were developed based on a biotin hybrid capture method, and enrichment of the genomic libraries (AAAT, TG, AG, T, and TGC repeats) was performed using Sau3AI adapters. The isolated markers were tested in two half-sib Korean cattle families and four imported breeds (Angus, Limousin, Holstein, and Shorthorn). Nine informative microsatellite loci were observed, and two microsatellite loci were revealed as monomorphic in Korean cattle. In the imported breeds, however, all of the markers were informative. In total, 213 alleles were obtained at the 11 loci across five breeds, and the average number of alleles found per locus, considering all populations, was 4.26. Heterozygosity was 0.71 (expected) and 0.57 (observed). The range of the polymorphic information content for the markers in all cattle populations was 0.43-0.69. Eleven percent of genetic variation was attributed to differentiation between populations as determined by the mean F_ST values. The remaining 89% corresponded to differences among individuals. The isolated markers may be used to identify and classify the local breeds on a molecular basis.

17.
Optimal integration of next-generation sequencing into mainstream research requires re-evaluation of how problems can be reasonably overcome and what questions can be asked. One potential application is the rapid acquisition of genomic information to identify microsatellite loci for evolutionary, population genetic and chromosome linkage mapping research on non-model and not previously sequenced organisms. Here, we report on results using high-throughput sequencing to obtain a large number of microsatellite loci from the venomous snake Agkistrodon contortrix, the copperhead. We used the 454 Genome Sequencer FLX next-generation sequencing platform to sample randomly ∼27 Mbp (128 773 reads) of the copperhead genome, thus sampling about 2% of the genome of this species. We identified microsatellite loci in 11.3% of all reads obtained, with 14 612 microsatellite loci identified in total, 4564 of which had flanking sequences suitable for polymerase chain reaction primer design. The random sequencing-based approach to identify microsatellites was rapid, cost-effective and identified thousands of useful microsatellite loci in a previously unstudied species.
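
The abstract above does not say how microsatellites were detected in the reads, so the following is only a minimal sketch of one common approach: scanning each read for perfect tandem repeats with a regular expression. The unit-length range (2 to 6 bp) and the minimum of five repeats are assumed thresholds, and the function name and example read are hypothetical.

```python
import re


def find_microsatellites(read, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, repeat_unit, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so "ACACACACAC" is reported as (AC)5 rather than (ACAC)2.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(read.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical read containing an (AC)6 repeat flanked by unique sequence.
hits = find_microsatellites("GGT" + "AC" * 6 + "TTG")
```

In a pipeline like the one described, a locus would presumably be kept only when enough non-repetitive flanking sequence remains on both sides for primer design, which is why far fewer loci were usable than were detected.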

18.
We developed 'clipping reveals structure' (CREST), an algorithm that uses next-generation sequencing reads with partial alignments to a reference genome to directly map structural variations at the nucleotide level of resolution. Application of CREST to whole-genome sequencing data from five pediatric T-lineage acute lymphoblastic leukemias (T-ALLs) and a human melanoma cell line, COLO-829, identified 160 somatic structural variations. Experimental validation exceeded 80%, demonstrating that CREST had a high predictive accuracy.
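
Reads with partial alignments, as described in the abstract above, are usually recorded as soft-clipped alignments. The sketch below shows how the clipped portion of a read could be recognised from a SAM-style CIGAR string. It assumes the input is in SAM format, which the abstract does not state, and it covers only this first step, not CREST's assembly and breakpoint mapping. The function name is hypothetical.

```python
import re

# One CIGAR element is a length followed by an operation code (SAM format).
_CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar):
    """Return (left, right) soft-clipped lengths for one alignment.

    Hard clips (H) may sit outside the soft clips, so they are ignored
    before the first and last remaining elements are inspected.
    """
    elements = [(int(n), op) for n, op in _CIGAR_ELEMENT.findall(cigar)]
    core = [element for element in elements if element[1] != "H"]
    left = core[0][0] if core and core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return left, right


# A 100 bp read whose first 30 bases do not align to the reference.
clips = soft_clip_lengths("30S70M")
```

Many reads clipped at the same reference position would suggest a candidate breakpoint there, which matches the nucleotide-level resolution the abstract reports.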

19.
Wang YM, Dong ZY, Zhang ZJ, Lin XY, Shen Y, Zhou D, Liu B. Genetics 2005, 170(4): 1945-1956
To study the possible impact of alien introgression on a recipient plant genome, we examined >6000 unbiased genomic loci of three stable rice recombinant inbred lines (RILs) derived from intergeneric hybridization between rice (cv. Matsumae) and a wild relative (Zizania latifolia Griseb.) followed by successive selfing. Results from amplified fragment length polymorphism (AFLP) analysis showed that, whereas the introgressed Zizania DNA comprised <0.1% of the genome content in the RILs, extensive and genome-wide de novo variations occurred in up to 30% of the analyzed loci for all three lines studied. The AFLP-detected changes were validated by DNA gel-blot hybridization and/or sequence analysis of genomic loci corresponding to a subset of the differentiating AFLP fragments. A BLAST analysis revealed that the genomic variations occurred in diverse sequences, including protein-coding genes, transposable elements, and sequences of unknown functions. Pairwise sequence comparison of selected loci between a RIL and its rice parent showed that the variations represented either base substitutions or small insertion/deletions. Genome variations were detected in all 12 rice chromosomes, although their distribution was uneven both among and within chromosomes. Taken together, our results imply that even cryptic alien introgression can be highly mutagenic to a recipient plant genome.

20.
Clones from full-length enriched cDNA libraries serve as valuable resources for functional genomic studies. We analyzed 3,210 chromatograms obtained from sequencing the 5′-ends of brainstem, liver, neocortex, and spleen clones derived from full-length enriched cDNA libraries of Korean native pigs. In addition, 50,000 pig EST sequence trace files were obtained from GenBank and combined with our sequencing information for SNP identification in silico. For the SNP analysis, neocortex and liver libraries were newly constructed, whereas the sequencing results from brainstem and spleen libraries were from previously constructed libraries. The putative SNPs from the in silico analysis were confirmed by genomic PCR from a group of 20 pigs of four different breeds. Using this approach, 86% of cSNPs identified in silico were confirmed and the SNP detection frequency was 1 SNP per 338 bp. Interestingly, we found a valine deletion at amino acid position 126 of the neuronal and endocrine protein gene in the Korean native pig. We confirmed that this deletion was caused by alternative splicing at the NAGNAG acceptors. Our study shows that large-scale EST sequencing of Korean native pigs can be effectively employed for natural polymorphism-based pig genome analysis.
