Similar literature
20 similar articles found.
1.

Background

One of the goals of genomics is to identify the genetic loci responsible for variation in phenotypic traits. The completion of the tomato genome sequence and recent advances in DNA sequencing technology allow for in-depth characterization of genetic variation present in the tomato genome. Like many self-pollinated crops, cultivated tomato accessions show a low molecular but high phenotypic diversity. Here we describe the whole-genome resequencing of eight accessions (four cherry-type and four large fruited lines) chosen to represent a large range of intra-specific variability and the identification and annotation of novel polymorphisms.

Results

The eight genomes were sequenced using the GAII Illumina platform. Comparison of the sequences with the reference genome yielded more than 4 million single nucleotide polymorphisms (SNPs). This number varied from 80,000 to 1.5 million according to the accessions. Almost 128,000 InDels were detected. The distribution of SNPs and InDels across and within chromosomes was highly heterogeneous revealing introgressions from wild species and the mosaic structure of the genomes of the cherry tomato accessions. In-depth annotation of the polymorphisms identified more than 16,000 unique non-synonymous SNPs. In addition 1,686 putative copy-number variations (CNVs) were identified.

Conclusions

This study represents the first whole genome resequencing experiment in cultivated tomato. Substantial genetic differences exist between the sequenced tomato accessions and the reference sequence. The heterogeneous distribution of the polymorphisms may be related to introgressions that occurred during domestication or breeding. The annotated SNPs, InDels and CNVs identified in this resequencing study will serve as useful genetic tools, and as candidate polymorphisms in the search for phenotype-altering DNA variations.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-14-791) contains supplementary material, which is available to authorized users.

2.

Background

The genome of the melon (Cucumis melo L.) double-haploid line DHL92 was recently sequenced, with 87.5 and 80.8% of the scaffold assembly anchored and oriented to the 12 linkage groups, respectively. However, insufficient marker coverage and a lack of recombination left several large, gene rich scaffolds unanchored, and some anchored scaffolds unoriented. To improve the anchoring and orientation of the melon genome assembly, we used resequencing data between the parental lines of DHL92 to develop a new set of SNP markers from unanchored scaffolds.

Results

A high-resolution genetic map composed of 580 SNPs was used to anchor 354.8 Mb of sequence, contained in 141 scaffolds (average size 2.5 Mb) and corresponding to 98.2% of the scaffold assembly, to the 12 melon chromosomes. Over 325.4 Mb (90%) of the assembly was oriented. The genetic map revealed regions of segregation distortion favoring SC alleles as well as recombination suppression regions coinciding with putative centromere, 45S, and 5S rDNA sites. New chromosome-scale pseudomolecules were created by adding to the previous version (v3.5) a further 38.3 Mb of anchored sequence, contained in 55 scaffolds and representing 1,837 predicted genes. Using fluorescent in situ hybridization (FISH) with BACs that produced chromosome-specific signals, the melon chromosomes corresponding to the twelve linkage groups were identified, and a standardized karyotype of melon inbred line T111 was developed.

Conclusions

By utilizing resequencing data and targeted SNP selection combined with a large F2 mapping population, we significantly improved the quantity of anchored and oriented melon scaffold genome assembly. Using genome information combined with FISH mapping provided the first cytogenetic map of an inodorus melon type. With these results it was possible to make inferences on melon chromosome structure by relating zones of recombination suppression to centromeres and 45S and 5S heterochromatic regions. This study represents the first steps towards the integration of the high-resolution genetic and cytogenetic maps with the genomic sequence in melon that will provide more information on genome organization and allow for the improvement of the melon genome draft sequence.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-014-1196-3) contains supplementary material, which is available to authorized users.

3.

Background

Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation. Identification of large numbers of SNPs is helpful for genetic diversity analysis, map-based cloning, genome-wide association analyses and marker-assisted breeding. Recently, identifying genome-wide SNPs in allopolyploid Brassica napus (rapeseed, canola) by resequencing many accessions has become feasible, due to the availability of reference genomes of Brassica rapa (2n = AA) and Brassica oleracea (2n = CC), which are the progenitor species of B. napus (2n = AACC). Although many SNPs in B. napus have been released, the objective in the present study was to produce a larger, more informative set of SNPs for large-scale and efficient genotypic screening. Hence, short-read genome sequencing was conducted on ten elite B. napus accessions for SNP discovery. A subset of these SNPs was randomly selected for sequence validation and for genotyping efficiency testing using the Illumina GoldenGate assay.

Results

A total of 892,536 bi-allelic SNPs were discovered throughout the B. napus genome. A total of 36,458 putative amino acid variants were located in 13,552 protein-coding genes, which were predicted to have enriched binding and catalytic activity as a result. Using the GoldenGate genotyping platform, 94 of 96 SNPs sampled could effectively distinguish genotypes of 130 lines from two mapping populations, with an average call rate of 92%.

Conclusions

Despite the polyploid nature of B. napus, nearly 900,000 simple SNPs were identified by whole genome resequencing. These SNPs were predicted to be effective in high-throughput genotyping assays (51% polymorphic SNPs, 92% average call rate using the GoldenGate assay, leading to an estimated >450 000 useful SNPs). Hence, the development of a much larger genotyping array of informative SNPs is feasible. SNPs identified in this study to cause non-synonymous amino acid substitutions can also be utilized to directly identify causal genes in association studies.

4.

Background

Ultra high throughput sequencing (UHTS) technologies find an important application in targeted resequencing of candidate genes or of genomic intervals from genetic association studies. Despite the extraordinary power of these new methods, they are still rarely used in routine analysis of human genomic variants, in part because of the absence of specific standard procedures. The aim of this work is to provide human molecular geneticists with a tool to evaluate the best UHTS methodology for efficiently detecting DNA changes, from common SNPs to rare mutations.

Methodology/Principal Findings

We tested the three most widespread UHTS platforms (Roche/454 GS FLX Titanium, Illumina/Solexa Genome Analyzer II and Applied Biosystems/SOLiD System 3) on a well-studied region of the human genome containing many polymorphisms and a very rare heterozygous mutation located within an intronic repetitive DNA element. We identify the qualities and the limitations of each platform and describe some peculiarities of UHTS in resequencing projects.

Conclusions/Significance

When appropriate filtering and mapping procedures are applied, UHTS technology can be safely and efficiently used as a tool for the targeted detection of human DNA variation. Unless particular, platform-dependent characteristics are needed for specific projects, the most relevant parameter to consider in mainstream human genome resequencing procedures is the cost per sequenced base pair associated with each machine.

5.

Background

Human Papillomavirus type 16 (HPV16) causes over half of all cervical cancer and some HPV16 variants are more oncogenic than others. The genetic basis for the extraordinary oncogenic properties of HPV16 compared to other HPVs is unknown. In addition, we neither know which nucleotides vary across and within HPV types and lineages, nor which of the single nucleotide polymorphisms (SNPs) determine oncogenicity.

Methods

A reference set of 62 HPV16 complete genome sequences was established and used to examine patterns of evolutionary relatedness amongst variants using a pairwise identity heatmap and HPV16 phylogeny. A BLAST-based algorithm was developed to impute complete genome data from partial sequence information using the reference database. To interrogate the oncogenic risk of determined and imputed HPV16 SNPs, odds-ratios for each SNP were calculated in a case-control viral genome-wide association study (VWAS) using biopsy confirmed high-grade cervix neoplasia and self-limited HPV16 infections from Guanacaste, Costa Rica.
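The per-SNP odds-ratio calculation described above can be illustrated with a minimal sketch. The abstract does not state how the odds ratios or their confidence intervals were computed, so the function name `snp_odds_ratio`, the 2×2 allele-count layout, the Haldane-Anscombe zero-cell correction and the Woolf log-based 95% interval are all assumptions made for illustration, not the authors' method.

```python
import math


def snp_odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds ratio and approximate 95% CI for one SNP in a case-control design.

    The four arguments are counts of samples carrying the variant (alt) or
    reference (ref) nucleotide among cases and controls. Hypothetical helper,
    not taken from the paper.
    """
    cells = [case_alt, case_ref, control_alt, control_ref]
    # Haldane-Anscombe correction (assumed): add 0.5 to every cell if any is zero.
    if any(count == 0 for count in cells):
        cells = [count + 0.5 for count in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Woolf's method: standard error of log(OR).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Illustrative (made-up) counts: 30/10 alt/ref in cases, 20/40 in controls.
or_value, ci_low, ci_high = snp_odds_ratio(30, 10, 20, 40)
print(f"OR = {or_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

In a genome-wide scan this function would be applied to every variable position, and the resulting p-values or intervals would still need correction for multiple testing.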

Results

HPV16 variants display evolutionarily stable lineages that contain conserved diagnostic SNPs. The imputation algorithm indicated that an average of 97.5±1.03% of SNPs could be accurately imputed. The VWAS revealed specific HPV16 viral SNPs associated with variant lineages and elevated odds ratios; however, individual causal SNPs could not be distinguished with certainty due to the nature of HPV evolution.

Conclusions

Conserved and lineage-specific SNPs can be imputed with a high degree of accuracy from limited viral polymorphic data due to the lack of recombination and the stochastic mechanism of variation accumulation in the HPV genome. However, determining the role of novel variants or non-lineage-specific SNPs by VWAS will require direct sequence analysis. The investigation of patterns of genetic variation and the identification of diagnostic SNPs for lineages of HPV16 variants provide a valuable resource for future studies of HPV16 pathogenicity.

6.

Purpose

FKBP51 (FKBP5) is a negative regulator of Akt. Variability in FKBP5 expression level is a major factor contributing to variation in response to chemotherapeutic agents including gemcitabine, a first-line treatment for pancreatic cancer. Genetic variation in FKBP5 could influence its function and, ultimately, the treatment response of pancreatic cancer.

Experimental Design

We set out to comprehensively study the role of genetic variation in FKBP5 identified by Next Generation DNA resequencing on response to gemcitabine treatment of pancreatic cancer by utilizing both tumor and germline DNA samples from 43 pancreatic cancer patients, including 19 paired normal-tumor samples. Next, genotype-phenotype association studies were performed with overall survival as well as with FKBP5 gene expression in tumor using the same samples in which resequencing had been performed, followed by functional genomics studies.

Results

In-depth resequencing identified 404 FKBP5 single nucleotide polymorphisms (SNPs) in normal and tumor DNA. SNPs with the strongest associations with survival or FKBP5 expression were subjected to functional genomic study. Electromobility shift assay showed that the rs73748206 “A(T)” SNP altered DNA-protein binding patterns, consistent with significantly increased reporter gene activity, possibly through its increased binding to Glucocorticoid Receptor (GR). The effect of rs73748206 was confirmed on the basis of its association with FKBP5 expression by affecting the binding to GR in lymphoblastoid cell lines derived from the same patients for whom DNA was used for resequencing.

Conclusion

This comprehensive FKBP5 resequencing study provides insights into the role of genetic variation in variation of gemcitabine response.

7.
Genome Biology, 2013, 14(7): R82

Background

The mouse inbred line C57BL/6J is widely used in mouse genetics and its genome has been incorporated into many genetic reference populations. More recently large initiatives such as the International Knockout Mouse Consortium (IKMC) are using the C57BL/6N mouse strain to generate null alleles for all mouse genes. Hence both strains are now widely used in mouse genetics studies. Here we perform a comprehensive genomic and phenotypic analysis of the two strains to identify differences that may influence their underlying genetic mechanisms.

Results

We undertake genome sequence comparisons of C57BL/6J and C57BL/6N to identify SNPs, indels and structural variants, with a focus on identifying all coding variants. We annotate 34 SNPs and 2 indels that distinguish C57BL/6J and C57BL/6N coding sequences, as well as 15 structural variants that overlap a gene. In parallel we assess the comparative phenotypes of the two inbred lines utilizing the EMPReSSslim phenotyping pipeline, a broad based assessment encompassing diverse biological systems. We perform additional secondary phenotyping assessments to explore other phenotype domains and to elaborate phenotype differences identified in the primary assessment. We uncover significant phenotypic differences between the two lines, replicated across multiple centers, in a number of physiological, biochemical and behavioral systems.

Conclusions

Comparison of C57BL/6J and C57BL/6N demonstrates a range of phenotypic differences that have the potential to impact upon penetrance and expressivity of mutational effects in these strains. Moreover, the sequence variants we identify provide a set of candidate genes for the phenotypic differences observed between the two strains.

8.
9.

Background

High-yielding cultivars of rice (Oryza sativa L.) have been developed in Japan from crosses between overseas indica and domestic japonica cultivars. Recently, next-generation sequencing technology and high-throughput genotyping systems have shown many single-nucleotide polymorphisms (SNPs) that are proving useful for detailed analysis of genome composition. These SNPs can be used in genome-wide association studies to detect candidate genome regions associated with economically important traits. In this study, we used a custom SNP set to identify introgressed chromosomal regions in a set of high-yielding Japanese rice cultivars, and we performed an association study to identify genome regions associated with yield.

Results

An informative set of 1152 SNPs was established by screening 14 high-yielding or primary ancestral cultivars for 5760 validated SNPs. Analysis of the population structure of high-yielding cultivars showed three genome types: japonica-type, indica-type and a mixture of the two. SNP allele frequencies showed several regions derived predominantly from one of the two parental genome types. Distinct regions skewed for the presence of parental alleles were observed on chromosomes 1, 2, 7, 8, 11 and 12 (indica) and on chromosomes 1, 2 and 6 (japonica). A possible relationship between these introgressed regions and six yield traits (blast susceptibility, heading date, length of unhusked seeds, number of panicles, surface area of unhusked seeds and 1000-grain weight) was detected in eight genome regions dominated by alleles of one parental origin. Two of these regions were near Ghd7, a heading date locus, and Pi-ta, a blast resistance locus. The allele types (i.e., japonica or indica) of significant SNPs coincided with those previously reported for candidate genes Ghd7 and Pi-ta.

Conclusions

Introgression breeding is an established strategy for the accumulation of QTLs and genes controlling high yield. Our custom SNP set is an effective tool for the identification of introgressed genome regions from a particular genetic background. This study demonstrates that changes in genome structure occurred during artificial selection for high yield, and provides information on several genomic regions associated with yield performance.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-346) contains supplementary material, which is available to authorized users.

10.

Background

Genotyping by sequencing, a new low-cost, high-throughput sequencing technology, was used to genotype 2,815 maize inbred accessions, preserved mostly at the National Plant Germplasm System in the USA. The collection includes inbred lines from breeding programs all over the world.

Results

The method produced 681,257 single-nucleotide polymorphism (SNP) markers distributed across the entire genome, with the ability to detect rare alleles at high confidence levels. More than half of the SNPs in the collection are rare. Although most rare alleles have been incorporated into public temperate breeding programs, only a modest amount of the available diversity is present in the commercial germplasm. Analysis of genetic distances shows population stratification, including a small number of large clusters centered on key lines. Nevertheless, an average fixation index of 0.06 indicates moderate differentiation between the three major maize subpopulations. Linkage disequilibrium (LD) decays very rapidly, but the extent of LD is highly dependent on the particular group of germplasm and region of the genome. The utility of these data for performing genome-wide association studies was tested with two simply inherited traits and one complex trait. We identified trait associations at SNPs very close to known candidate genes for kernel color, sweet corn, and flowering time; however, results suggest that more SNPs are needed to better explore the genetic architecture of complex traits.

Conclusions

The genotypic information described here allows this publicly available panel to be exploited by researchers facing the challenges of sustainable agriculture through better knowledge of the nature of genetic diversity.

11.
Dong C, Qian Z, Jia P, Wang Y, Huang W, Li Y. PLoS ONE, 2007, 2(12): e1262

Background

The high-throughput genotyping chips have contributed greatly to genome-wide association (GWA) studies to identify novel disease susceptibility single nucleotide polymorphisms (SNPs). The high-density chips are designed using two different SNP selection approaches, the direct gene-centric approach, and the indirect quasi-random SNPs or linkage disequilibrium (LD)-based tagSNPs approaches. Although all these approaches can provide high genome coverage and ascertain variants in genes, it is not clear to which extent these approaches could capture the common genic variants. It is also important to characterize and compare the differences between these approaches.

Methodology/Principal Findings

In our study, using both the Phase II HapMap data and the disease variants extracted from OMIM, a gene-centric evaluation was first performed to assess the ability of the approaches to capture the disease variants in the Caucasian population. The distribution patterns of SNPs were then characterized in genic regions, evolutionarily conserved introns and nongenic regions, ontologies and pathways. The results show that, no matter which SNP selection approach is used, the current high-density SNP chips provide very high coverage in genic regions and can capture most of the known common disease variants within the HapMap framework. The results also show that the differences between the direct and the indirect approaches are relatively small. Both have similar SNP distribution patterns in these gene-centric characteristics.

Conclusions/Significance

This study suggests that the indirect approaches not only have the advantage of high coverage but also are useful for studies focusing on various functional SNPs either in genes or in the conserved regions that the direct approach supports. The study and the annotation of characteristics will be helpful for designing and analyzing GWA studies that aim to identify genetic risk factors involved in common diseases, especially variants in genes and conserved regions.

12.

Background

There is a growing interest among geneticists in developing panels of Ancestry Informative Markers (AIMs) aimed at measuring the biogeographical ancestry of individual genomes. The efficiency of these panels is commonly tested empirically by contrasting self-reported ancestry with the ancestry estimated from these panels.

Results

Using SNP data from HapMap we carried out a simulation-based study aimed at measuring the effect of SNP coverage on the estimation of genome ancestry. For three of the main continental groups (Africans, East Asians, Europeans) ancestry was first estimated using the whole HapMap SNP database as a proxy for global genome ancestry; these estimates were subsequently compared to those obtained from pre-designed AIM panels. Panels that consider >400 AIMs capture genome ancestry reasonably well, while those containing a few dozen AIMs show a large variability in ancestry estimates. Curiously, 500-1,000 SNPs selected at random from the genome provide an unbiased estimate of genome ancestry and perform as well as any AIM panel of similar size. In simulated scenarios of population admixture, panels containing few AIMs also show important deficiencies to measure genome ancestry.

Conclusions

The results indicate that the ability to estimate genome ancestry is strongly dependent on the number of AIMs used, and not primarily on their individual informativeness. Caution should be taken when making individual (medical, forensic, or anthropological) inferences based on AIMs.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-543) contains supplementary material, which is available to authorized users.

13.

Background

Twin studies have shown that anxiety in a general population sample of children involves both domain-general and trait-specific genetic effects. For this reason, in an attempt to identify genes responsible for these effects, we investigated domain-general and trait-specific genetic associations in the first genome-wide association (GWA) study on anxiety-related behaviours (ARBs) in childhood.

Methods

The sample included 2810 7-year-olds drawn from the Twins Early Development Study (TEDS) with data available for parent-rated anxiety and genome-wide DNA markers. The measure was the Anxiety-Related Behaviours Questionnaire (ARBQ), which assesses four anxiety traits and also yields a general anxiety composite. Affymetrix GeneChip 6.0 DNA arrays were used to genotype nearly 700,000 single-nucleotide polymorphisms (SNPs), and IMPUTE v2 was used to impute more than 1 million SNPs. Several GWA associations from this discovery sample were followed up in another TEDS sample of 4804 children. In addition, Genome-wide Complex Trait Analysis (GCTA) was used on the discovery sample, to estimate the total amount of variance in ARBs that can be accounted for by SNPs on the array.

Results

No SNP associations met the demanding criterion of genome-wide significance that corrects for multiple testing across the genome (p < 5×10⁻⁸). Attempts to replicate the top associations did not yield significant results. In contrast to the substantial twin study estimates of heritability, which ranged from 0.50 (0.03) to 0.61 (0.01), the GCTA estimates of phenotypic variance accounted for by the SNPs were much lower, ranging from 0.01 (0.11) to 0.19 (0.12).

Conclusions

Taken together, these GWAS and GCTA results suggest that anxiety, similar to height, weight and intelligence, is affected by many genetic variants of small effect, but unlike these other prototypical polygenic traits, genetic influence on anxiety is not well tagged by common SNPs.

14.

Background

Genomic selection estimates genetic merit based on dense SNP (single nucleotide polymorphism) genotypes and phenotypes. This requires that SNPs explain a large fraction of the genetic variance. The objectives of this work were: (1) to estimate the fraction of genetic variance explained by dense genome-wide markers using 54 K SNP chip genotyping, and (2) to evaluate the effect of alternative marker-based relationship matrices and corrections for the base population on the fraction of the genetic variance explained by markers.

Methods

Two alternative marker-based relationship matrices were estimated using 35 706 SNPs on 1086 dairy bulls. Both pedigree- and marker-based relationship matrices were fitted simultaneously or separately in an animal model to estimate the fraction of variance not explained by the markers, i.e. the fraction explained by the pedigree. The phenotypes considered in the analysis were the deregressed estimated breeding values (dEBV) for milk, fat and protein yield and for somatic cell score (SCS).

Results

When dEBV were not sufficiently accurate (50 or 70%), the estimated fraction of the genetic variance explained by the markers was around 65% for yield traits and 45% for SCS. Scaling marker genotypes with locus-specific frequencies of heterozygotes slightly increased the variance explained by markers, compared with scaling with the average frequency of heterozygotes across loci. The estimated fraction of the genetic variance explained by the markers when the two relationship matrices were fitted separately followed the same trends, but the results were underestimated. With less accurate dEBV estimates, the fraction of the genetic variance explained by markers was underestimated, which is probably an artifact due to the dEBV being estimated by a pedigree-based animal model.

Conclusions

When using only highly accurate dEBV, the proportion of the genetic variance explained by the Illumina 54 K SNP chip was approximately 80% for Brown Swiss cattle. These results depend on the SNP chip used and the family structure of the population, i.e. more dense SNPs and closer family relationships are expected to result in a higher fraction of the variance explained by the SNPs.

15.

Background

A major concern in conservation genetics is to maintain the genetic diversity of populations. Genetic variation in livestock species is threatened by the progressive marginalisation of local breeds in favour of high-output pigs worldwide. We used high-density SNP and re-sequencing data to assess the genetic diversity of local pig breeds from Europe. In addition, we re-sequenced pigs from commercial breeds to identify potential candidate mutations responsible for phenotypic divergence among these groups of breeds.

Results

Our results point out some local breeds with low genetic diversity, whose genomes show a high proportion of regions of homozygosity (>50%) and harbour a large number of potentially damaging mutations. We also observed a high correlation between genetic diversity estimates using high-density SNP data and Next Generation Sequencing data (r = 0.96 at individual level). The study of non-synonymous SNPs that were fixed in commercial breeds and also in any local breed, but with a different allele, revealed 99 non-synonymous SNPs affecting 65 genes. Candidate mutations that may underlie differences in adaptation to the environment were exemplified by the genes AZGP1 and TAS2R40. We also observed that highly productive breeds may have lost advantageous genotypes within genes involved in immune response (e.g. IL12RB2 and STAB1), probably as a result of strong artificial selection in intensive pig production systems.

Conclusions

The high correlation between genetic diversity computed with the 60K SNP and whole genome re-sequence data indicates that the Porcine 60K SNP Beadchip provides reliable estimates of genomic diversity in European pig populations despite the expected bias. Moreover, this analysis gave insights into strategies for the genetic characterization of local breeds. The comparison between re-sequenced local pigs and re-sequenced commercial pigs made it possible to report candidate mutations that may be responsible for phenotypic divergence among those groups of breeds. This study highlights the importance of low-input breeds as a valuable genetic reservoir for the pig production industry. However, the high levels of ROHs, inbreeding and potentially damaging mutations emphasize the importance of the genetic characterization of local breeds to preserve their genomic variability.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-601) contains supplementary material, which is available to authorized users.

16.

Background

Interlocus gene conversion (IGC) is a recombination-based mechanism that results in the unidirectional transfer of short stretches of sequence between paralogous loci. Although IGC is a well-established mechanism of human disease, the extent to which this mutagenic process has shaped overall patterns of segregating variation in multi-copy regions of the human genome remains unknown. One expected manifestation of IGC in population genomic data is the presence of one-to-one paralogous SNPs that segregate identical alleles.

Results

Here, I use SNP genotype calls from the low-coverage phase 3 release of the 1000 Genomes Project to identify 15,790 parallel, shared SNPs in duplicated regions of the human genome. My approach for identifying these sites accounts for the potential redundancy of short read mapping in multi-copy genomic regions, thereby effectively eliminating false positive SNP calls arising from paralogous sequence variation. I demonstrate that independent mutation events to identical nucleotides at paralogous sites are not a significant source of shared polymorphisms in the human genome, consistent with the interpretation that these sites are the outcome of historical IGC events. These putative signals of IGC are enriched in genomic contexts previously associated with non-allelic homologous recombination, including clear signals in gene families that form tandem intra-chromosomal clusters.

Conclusions

Taken together, my analyses implicate IGC, not point mutation, as the mechanism generating at least 2.7 % of single nucleotide variants in duplicated regions of the human genome.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1681-3) contains supplementary material, which is available to authorized users.

17.
18.

Background

Numerous efforts have been made to elucidate the etiology and improve the treatment of lung cancer, but the overall five-year survival rate is still only 15%. Although cigarette smoking is the primary risk factor for lung cancer, only 7% of female lung cancer patients in Taiwan have a history of smoking. Since cancer results from progressive accumulation of genetic aberrations, genomic rearrangements may be early events in carcinogenesis.

Results

In order to identify biomarkers of early-stage adenocarcinoma, the genome-wide DNA aberrations of 60 pairs of lung adenocarcinoma and adjacent normal lung tissue in non-smoking women were examined using Affymetrix Genome-Wide Human SNP 6.0 arrays. Common copy number variation (CNV) regions were defined as those in which ≥30% of patients showed copy numbers outside 2 ± 0.5 for each single nucleotide polymorphism (SNP) across at least 100 contiguous SNP variant loci. SNPs associated with lung adenocarcinoma were identified by McNemar’s test. Loss of heterozygosity (LOH) SNPs were defined as loci showing LOH in ≥18% of patients. Aberration of SNP rs10248565 at HDAC9 in chromosome 7p21.1 was identified from concurrent analyses of CNVs, SNPs, and LOH.
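McNemar’s test, mentioned above for the paired tumor and normal samples, can be sketched as follows. The abstract does not say how the paired table was built for each SNP or whether a continuity correction was used, so the function name `mcnemar_test`, the meaning of the discordant counts and the optional correction are assumptions made for illustration. The p-value uses the chi-square approximation with one degree of freedom.

```python
import math


def mcnemar_test(discordant_b, discordant_c, continuity=False):
    """McNemar's chi-square statistic and p-value (1 degree of freedom).

    discordant_b: pairs where only the tumor sample shows the variant call.
    discordant_c: pairs where only the normal sample shows the variant call.
    Concordant pairs do not enter the statistic. Hypothetical helper; the
    pairing scheme is an assumption, not taken from the paper.
    """
    total = discordant_b + discordant_c
    if total == 0:
        # No discordant pairs: no evidence of asymmetry.
        return 0.0, 1.0
    diff = abs(discordant_b - discordant_c)
    if continuity:
        # Edwards' continuity correction.
        diff = max(diff - 1, 0)
    statistic = diff ** 2 / total
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Illustrative (made-up) counts: 15 pairs variant only in tumor, 5 only in normal.
stat, p = mcnemar_test(15, 5)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

With few discordant pairs the chi-square approximation is rough, and an exact binomial version of the test is usually preferred.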

Conclusion

The results elucidate the genetic etiology of lung adenocarcinoma by demonstrating that SNP rs10248565 may be a potential biomarker of cancer susceptibility.

19.

Background

Non-heading Chinese cabbage (NHCC), belonging to Brassica, is an important leaf vegetable in Asia. Although genetic analyses have been performed through conventional selection and breeding efforts, the domestication history of NHCC and the genetics underlying its morphological diversity remain unclear. Thus, reliable molecular markers representative of the whole genome are required for molecular-assisted selection in NHCC.

Results

A total of 20,836 simple sequence repeats (SSRs) were detected in NHCC, containing repeat types from mononucleotide to nonanucleotide. The average density was 62.93 SSRs/Mb. In gene regions, 5,435 SSRs were identified in 4,569 genes. A total of 5,008 primer pairs were designed, and 74 were randomly selected for validation. Among these, 60 (81.08%) were polymorphic in 18 Cruciferae. The number of polymorphic bands ranged from two to five, with an average of 2.70 for each primer. The average values of the polymorphism information content, observed heterozygosity, Hardy-Weinberg equilibrium, and Shannon’s information index were 0.2970, 0.4136, 0.5706, and 0.5885, respectively. Four clusters were classified according to the unweighted pair-group method with arithmetic average cluster analysis of 18 genotypes. In addition, a total of 1,228,979 single nucleotide polymorphisms (SNPs) were identified in the NHCC through a comparison with the genome of Chinese cabbage, and the average SNP density in the whole genome was 4.33/Kb. The number of SNPs ranged from 341,939 to 591,586 in the 10 accessions, and the average heterozygous SNPs ratio was ~42.53%. All analyses showed these markers were high quality and reliable. Therefore, they could be used in the construction of a linkage map and for genetic diversity studies for NHCC in future.
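Two of the marker statistics reported above, the polymorphism information content (PIC) and Shannon's information index, can be computed from the allele frequencies at a locus. The sketch below uses the standard textbook definitions; the function names and the example frequencies are illustrative assumptions, and the abstract does not say which software or exact formulas the authors used.

```python
import math
from itertools import combinations


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    sum_squares = sum(p * p for p in freqs)
    cross_terms = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_squares - cross_terms


def shannon_information_index(freqs):
    """Shannon's index I = -sum(p_i * ln p_i), skipping absent alleles."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Illustrative (made-up) locus with two equally frequent alleles.
allele_freqs = [0.5, 0.5]
print(polymorphism_information_content(allele_freqs))  # 0.375
print(shannon_information_index(allele_freqs))         # ln(2), about 0.693
```

Averaging these per-locus values over all validated primer pairs would give summary figures of the kind quoted in the abstract.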

Conclusions

This is the first systematic and comprehensive analysis and identification of SSRs in NHCC and 17 species. The development of a large number of SNP and SSR markers was successfully achieved for NHCC. These novel markers are valuable for constructing genetic linkage maps, comparative genome analysis, quantitative trait locus (QTL) mapping, genome-wide association studies, and marker-assisted selection in NHCC breeding system research.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1534-0) contains supplementary material, which is available to authorized users.

20.