Similar literature
 20 similar articles found (search time: 31 ms)
1.

Background

The determination of structural haplotypes at copy number variable regions can indicate the mechanisms responsible for changes in copy number, as well as explain the relationship between gene copy number and expression. However, obtaining spatial information at regions displaying extensive copy number variation, such as the DEFA1A3 locus, is complex, because of the difficulty in the phasing and assembly of these regions. The DEFA1A3 locus is intriguing in that it falls within a region of high linkage disequilibrium, despite its high variability in copy number (n = 3–16); hence, the mechanisms responsible for changes in copy number at this locus are unclear.

Results

In this study, a region flanking the DEFA1A3 locus was sequenced across 120 independent haplotypes with European ancestry, identifying five common classes of DEFA1A3 haplotype. Assigning DEFA1A3 class to haplotypes within the 1000 Genomes project highlights a significant difference in DEFA1A3 class frequencies between populations with different ancestry. The features of each DEFA1A3 class, for example, the associated DEFA1A3 copy numbers, were initially assessed in a European cohort (n = 599) and replicated in the 1000 Genomes samples, showing within-class similarity, but between-class and between-population differences in the features of the DEFA1A3 locus. Emulsion haplotype fusion-PCR was used to generate 61 structural haplotypes at the DEFA1A3 locus, showing a high within-class similarity in structure.

Conclusions

Structural haplotypes across the DEFA1A3 locus indicate that intra-allelic rearrangement is the predominant mechanism responsible for changes in DEFA1A3 copy number, explaining the conservation of linkage disequilibrium across the locus. The identification of common structural haplotypes at the DEFA1A3 locus could aid studies into how DEFA1A3 copy number influences expression, which is currently unclear.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-614) contains supplementary material, which is available to authorized users.

2.

Background

Several resistance traits, including the I2 resistance against tomato fusarium wilt, were mapped to the long arm of chromosome 11 of Solanum. However, the structure and evolution of this locus remain poorly understood.

Results

Comparative analysis showed that the structure and evolutionary patterns of the I2 locus vary considerably between potato and tomato. The I2 homologues from different Solanaceae species usually do not have orthologous relationships, owing to duplication, deletion and frequent sequence exchanges. At least 154 sequence exchanges were detected among 76 tomato I2 homologues, but sequence exchanges between I2 homologues in potato are less frequent. A previous study showed that I2 homologues in potato were targeted by miR482. However, our data showed that I2 homologues in tomato were targeted by miR6024 rather than miR482. Furthermore, miR6024 triggers phasiRNAs from I2 homologues in tomato. Sequence analysis showed that miR6024 originated after the divergence of Solanaceae. We hypothesized that miR6024 and miR482 might have facilitated the expansion of the I2 family in Solanaceae species, since these miRNAs can minimize the potential toxic effects of I2 homologues by down-regulating their expression.

Conclusions

The I2 locus represents a highly divergent resistance gene cluster in Solanum. Its high divergence was partly due to frequent sequence exchanges between homologues. We propose that the successful expansion of I2 homologues in Solanum is at least partially attributable to miRNA-mediated regulation.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-743) contains supplementary material, which is available to authorized users.

3.

Background

Numerous efforts have been made to elucidate the etiology and improve the treatment of lung cancer, but the overall five-year survival rate is still only 15%. Although cigarette smoking is the primary risk factor for lung cancer, only 7% of female lung cancer patients in Taiwan have a history of smoking. Since cancer results from progressive accumulation of genetic aberrations, genomic rearrangements may be early events in carcinogenesis.

Results

To identify biomarkers of early-stage adenocarcinoma, the genome-wide DNA aberrations of 60 pairs of lung adenocarcinoma and adjacent normal lung tissue in non-smoking women were examined using Affymetrix Genome-Wide Human SNP 6.0 arrays. Common copy number variation (CNV) regions were defined as regions in which ≥30% of patients had a copy number outside 2 ± 0.5 at each single nucleotide polymorphism (SNP) across at least 100 contiguous SNP loci. SNPs associated with lung adenocarcinoma were identified by McNemar’s test. Loss of heterozygosity (LOH) SNPs were defined as loci showing LOH in ≥18% of patients. Aberration of SNP rs10248565 at HDAC9 in chromosome 7p21.1 was identified from concurrent analyses of CNVs, SNPs, and LOH.

Conclusion

The results elucidate the genetic etiology of lung adenocarcinoma by demonstrating that SNP rs10248565 may be a potential biomarker of cancer susceptibility.
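The CNV-region criterion stated in the Results of this entry (a copy number outside 2 ± 0.5 in at least 30% of patients, over at least 100 contiguous SNP loci) can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' pipeline; the function name `common_cnv_regions`, its parameters and the input layout (one list of per-SNP copy numbers per patient) are assumptions.

```python
from typing import List, Tuple


def common_cnv_regions(
    copy_numbers: List[List[float]],  # copy_numbers[p][s]: patient p, SNP s (assumed layout)
    min_patient_frac: float = 0.30,   # ">=30% of patients" from the abstract
    tolerance: float = 0.5,           # "beyond 2 +/- 0.5" from the abstract
    min_run: int = 100,               # "at least 100 continuous SNP loci"
) -> List[Tuple[int, int]]:
    """Return (start, end) SNP index pairs, end exclusive, for every run of
    at least `min_run` consecutive SNPs that are aberrant in enough patients."""
    n_patients = len(copy_numbers)
    n_snps = len(copy_numbers[0]) if n_patients else 0

    # Flag each SNP whose copy number deviates from 2 in enough patients.
    flagged = []
    for s in range(n_snps):
        aberrant = sum(
            1 for p in range(n_patients)
            if abs(copy_numbers[p][s] - 2.0) > tolerance
        )
        flagged.append(aberrant / n_patients >= min_patient_frac)

    # Collect runs of consecutive flagged SNPs; the sentinel closes a trailing run.
    regions = []
    start = None
    for s, is_flagged in enumerate(flagged + [False]):
        if is_flagged and start is None:
            start = s
        elif not is_flagged and start is not None:
            if s - start >= min_run:
                regions.append((start, s))
            start = None
    return regions
```

SNP indices here stand in for array order along a chromosome; a real analysis would also have to split runs at chromosome boundaries.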

4.

Background

Although Mycobacterium tuberculosis isolates consist of several different lineages, epidemiological analyses are usually performed relative to a single reference genome, M. tuberculosis H37Rv, which might bias the results. Those analyses are essentially based on genome sequence information of M. tuberculosis and could in theory be performed in silico, using whole genome sequence (WGS) data available in databases or obtained with next-generation sequencers (NGSs). As an approach to establishing higher-resolution methods for such analyses, whole genome sequences of M. tuberculosis complex (MTBC) strains available in databases were aligned to construct a virtual reference genome sequence, called the consensus sequence (CS), and its feasibility for in silico epidemiological analyses was evaluated.

Results

The consensus sequence (CS) was successfully constructed and used to perform phylogenetic analysis, evaluation of read mapping efficacy (which is crucial for detecting single nucleotide polymorphisms (SNPs)), and various virtual MTBC typing methods, including spoligotyping, VNTR, long sequence polymorphism and Beijing typing. SNPs detected against the CS and against H37Rv were used in concatemer-based phylogenetic analysis to determine their reliability relative to a phylogenetic tree based on whole genome alignment as the gold standard. Statistical comparison of phylogenetic trees based on the CS with those based on H37Rv indicated that the former always gave better results than the latter. SNP detection and concatenation with the CS was advantageous because the frequency of crucial SNPs distinguishing among strain lineages was higher than with H37Rv. The number of SNPs detected was lower with the CS than with the H37Rv sequence, resulting in a significant reduction in computational time. The performance of each virtual typing method was satisfactory and agreed with published results where these were available.

Conclusions

These results indicate that a virtual CS constructed from genome sequence data is an ideal reference for MTBC studies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1368-9) contains supplementary material, which is available to authorized users.
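The idea of a consensus reference in this entry can be illustrated with a minimal sketch that takes the most frequent character in each column of an existing alignment. The abstract does not say how the authors built their CS, so the majority rule, the alphabetical tie-breaking, the dropping of gap-majority columns and both function names are assumptions.

```python
from collections import Counter
from typing import List


def consensus_sequence(aligned: List[str], gap: str = "-") -> str:
    """Majority-rule consensus of equal-length aligned sequences.
    Columns whose most frequent character is a gap are dropped (assumption)."""
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*aligned):
        counts = Counter(base.upper() for base in column)
        # Iterating in sorted order makes ties resolve alphabetically.
        majority = max(sorted(counts), key=counts.get)
        if majority != gap:
            consensus.append(majority)
    return "".join(consensus)


def count_snps(reference: str, sample: str) -> int:
    """Count mismatching positions between two equal-length, gap-free sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return sum(1 for r, s in zip(reference, sample) if r != s)


strains = ["ACGT", "ACGA", "TCGA"]  # toy data, not real MTBC sequence
cs = consensus_sequence(strains)
snps_vs_cs = sum(count_snps(cs, s) for s in strains)
snps_vs_first_strain = sum(count_snps(strains[0], s) for s in strains)
```

On this toy input the strains differ from the consensus at fewer positions in total than they differ from the first strain taken as reference. That matches the direction of the effect reported in the abstract, but it proves nothing about real genomes.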

5.

Background

Single copy genes are common across angiosperm genomes. With sufficiently high-quality sequenced genomes, the large-scale identification of single copy genes among multiple species is possible. Although some characteristics have been reported, our study provides novel insights into single copy genes.

Results

We identified single copy genes across 29 angiosperm genomes. A significant negative correlation was found between the number of duplicate blocks and the number of single copy genes. We found that a considerable number of single copy genes are located in organelles, showing a preference for binding and catalytic activity. The analysis of effective number of codons (Nc) illustrates that single copy genes have a stronger codon bias than non-single copy genes in eudicots. The relatively high expression level of single copy genes was partially confirmed by the RNA-seq data, rather than the Codon Adaptation Index (CAI). Unlike in most other species, a strongly negative correlation occurs between Nc and GC3 among single copy genes in grass genomes. When compared to all non-single copy genes, single copy genes show more conservation (as indicated by Ka and Ks values). However, our alternative splicing (AS) results reveal that selective constraints are weaker in single copy genes than in low copy family genes (1–10 in-paralogs) and stronger than in high copy family genes (>10 in-paralogs). Using concatenated shared single copy genes, we obtained a well-resolved phylogenetic tree. With the addition of intron sequences, the branch support is improved, but striking incongruences are also evident. Therefore, it is noteworthy that inclusion of intron sequences seems more appropriate for phylogenetic reconstruction at lower taxonomic levels.

Conclusions

Our analysis provides insight into the evolutionary characteristics of single copy genes across 29 angiosperm genomes. The results suggest that there are key differences in evolutionary constraints between single copy genes and non-single copy genes. And to some extent, these evolutionary constraints show some species-specific differences, especially between eudicots and monocots. Our preliminary evidence also suggests that the concatenated shared single copy genes are well suited for use in resolving phylogenetic relationships.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-504) contains supplementary material, which is available to authorized users.
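GC3, which this entry correlates with Nc, is the GC content at third codon positions. A minimal sketch of that statistic is given below, assuming an in-frame coding sequence and counting every codon. Some definitions exclude Met, Trp and stop codons, and the abstract does not say which variant was used. The function name `gc3` is hypothetical.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at the third position of each complete codon.
    Assumes `cds` starts in frame; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third_positions = [cds[i + 2] for i in range(0, usable, 3)]
    if not third_positions:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_positions) / len(third_positions)


# Codons ATG, GCC, AAA have third bases G, C, A, so two of three are G or C.
example = gc3("ATGGCCAAA")
```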

6.

Background

A RIL population between Solanum lycopersicum cv. Moneymaker and S. pimpinellifolium G1.1554 was genotyped with a custom made SNP array. Additionally, a subset of the lines was genotyped by sequencing (GBS).

Results

A total of 1974 polymorphic SNPs were selected to develop a linkage map of 715 unique genetic loci. We generated plots for visualizing the recombination patterns of the population relating physical and genetic positions along the genome. This linkage map was used to identify two QTLs for TYLCV resistance which contained favourable alleles derived from S. pimpinellifolium. Further GBS was used to saturate regions of interest, and the mapping resolution of the two QTLs was improved. The analysis showed highest significance on Chromosome 11 close to the region of 51.3 Mb (qTy-p11) and another on Chromosome 3 near 46.5 Mb (qTy-p3). Furthermore, we explored the population using untargeted metabolic profiling, and the most significant differences between susceptible and resistant plants were mainly associated with sucrose and flavonoid glycosides.

Conclusions

The SNP information obtained from an array allowed a first QTL screening of our RIL population. With additional SNP data for a subset of RILs, obtained through GBS, we were able to perform an in silico mapping improvement to further confirm regions associated with our trait of interest. With the combination of different -omics platforms we provide valuable insight into the genetics of S. pimpinellifolium-derived TYLCV resistance.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1152) contains supplementary material, which is available to authorized users.

7.

Background

Intrachromosomal segmental duplications provide the substrate for non-allelic homologous recombination, facilitating extensive copy number variation in the human genome. Many multi-copy gene families are embedded within genomic regions with high levels of sequence identity (>95%) and therefore pose considerable analytical challenges. In some cases, the complexity involved in analyzing such regions is largely underestimated. Rapid, cost-effective analyses of multi-copy gene regions have typically relied on quantitative approaches; however, quantitative data do not provide absolute certainty. Therefore, any technique prone to measurement error can produce ambiguous results that may lead to spurious associations with complex disease.

Results

In this study we have focused on testing the accuracy and reproducibility of quantitative analysis techniques. With reference to the C-C Chemokine Ligand-3-like-1 (CCL3L1) gene, we performed analysis using real-time Quantitative PCR (QPCR), Multiplex Ligation-dependent Probe Amplification (MLPA) and Paralogue Ratio Test (PRT). After controlling for potential outside variables on assay performance, including DNA concentration, quality, preparation and storage conditions, we find that real-time QPCR produces data that do not cluster tightly around copy number integer values, with variation substantially greater than that of the MLPA or PRT systems. We find that the method of rounding real-time QPCR measurements can potentially lead to mis-scoring of copy number genotypes and suggest that caution should be exercised in interpreting QPCR data.

Conclusions

We conclude that real-time QPCR is inherently prone to measurement error, even under conditions that would seem favorable for association studies. Our results indicate that potential variability in the physicochemical properties of the DNA samples cannot solely explain the poor performance exhibited by the real-time QPCR systems. We recommend that more robust approaches such as PRT or MLPA should be used to genotype multi-allelic copy number variation in disease association studies and suggest several approaches which can be implemented to ensure the quality of the copy number typing using quantitative methods.
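The rounding problem this entry describes can be made concrete with a small sketch that rounds a raw copy number estimate to the nearest integer and flags values lying far from any integer. The function name `score_copy_number` and the `ambiguity` cut-off of 0.25 are assumptions chosen for illustration; the abstract gives no threshold.

```python
import math
from typing import Tuple


def score_copy_number(raw: float, ambiguity: float = 0.25) -> Tuple[int, bool]:
    """Round a raw copy number estimate to the nearest integer and report
    whether it lies more than `ambiguity` away from that integer."""
    # floor(x + 0.5) rounds halves up; Python's round() rounds halves to even.
    nearest = math.floor(raw + 0.5)
    is_ambiguous = abs(raw - nearest) > ambiguity
    return nearest, is_ambiguous


# 2.4 and 2.6 differ by only 0.2 but are scored as different genotypes.
calls = [score_copy_number(value) for value in (2.1, 2.4, 2.6)]
```

Measurements such as 2.4 and 2.6 sit close together yet receive different integer calls, which is the kind of mis-scoring the abstract warns about when data do not cluster tightly around integers.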

8.

Background

Copy number variations (CNVs) are a main source of genomic structural variations underlying animal evolution and production traits. Here, with one pure-blooded Angus bull as reference, we describe a genome-wide analysis of CNVs based on comparative genomic hybridization arrays in 29 Chinese domesticated bulls and examine their effects on gene expression and cattle growth traits.

Results

We identified 486 copy number variable regions (CNVRs), covering 2.45% of the bovine genome, in 24 taurine cattle (Bos taurus), together with 161 in 2 yaks (Bos grunniens) and 163 in 3 buffaloes (Bubalus bubalis). In total, we discovered 605 integrated CNVRs, with more “loss” events than either “gain” or “both” events, and clearly clustered them into three cattle groups. Interestingly, we confirmed their uneven distribution across chromosomes, and differences in mitochondrial DNA copy number (gain: taurine; loss: yak and buffalo). Furthermore, we confirmed that approximately 41.8% (253/605) and 70.6% (427/605) of CNVRs span cattle genes and quantitative trait loci (QTLs), respectively. Finally, we confirmed 6 of 9 selected CNVRs by quantitative PCR, and further demonstrated that CNVR22 had significantly negative effects on expression of the PLA2G2D gene, and that both CNVR22 and CNVR310 were associated with body measurements in Chinese cattle, suggesting their key effects on gene expression and cattle traits.

Conclusions

The results advance our understanding of CNV as an important genomic structural variation in taurine cattle, yak and buffalo. This study provides a highly valuable resource for research on the evolution and breeding of Chinese cattle.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-480) contains supplementary material, which is available to authorized users.

9.

Background

Despite having predominately deleterious fitness effects, transposable elements (TEs) are major constituents of eukaryote genomes in general and of plant genomes in particular. Although the proportion of the genome made up of TEs varies at least four-fold across plants, the relative importance of the evolutionary forces shaping variation in TE abundance and distributions across taxa remains unclear. Under several theoretical models, mating system plays an important role in governing the evolutionary dynamics of TEs. Here, we use the recently sequenced Capsella rubella reference genome and short-read whole genome sequencing of multiple individuals to quantify abundance, genome distributions, and population frequencies of TEs in three recently diverged species of differing mating system, two self-compatible species (C. rubella and C. orientalis) and their self-incompatible outcrossing relative, C. grandiflora.

Results

We detect different dynamics of TE evolution in our two self-compatible species; C. rubella shows a small increase in transposon copy number, while C. orientalis shows a substantial decrease relative to C. grandiflora. The direction of this change in copy number is genome wide and consistent across transposon classes. For insertions near genes, however, we detect the highest abundances in C. grandiflora. Finally, we also find differences in the population frequency distributions across the three species.

Conclusion

Overall, our results suggest that the evolution of selfing may have different effects on TE evolution on a short and on a long timescale. Moreover, cross-species comparisons of transposon abundance are sensitive to reference genome bias, and efforts to control for this bias are key when making comparisons across species.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-602) contains supplementary material, which is available to authorized users.

10.
11.

Background

Chlamydia pneumoniae (Cpn) are obligate intracellular bacteria that cause acute infections of the upper and lower respiratory tract and have been implicated in chronic inflammatory diseases. Although of significant clinical relevance, complete genome sequences of only four clinical Cpn strains have been obtained. All of them were isolated from the respiratory tract and shared more than 99% sequence identity. Here we investigate genetic differences on the whole-genome level that are related to Cpn tissue tropism and pathogenicity.

Results

We have sequenced the genomes of 18 clinical isolates from different anatomical sites (e.g. lung, blood, coronary arteries) of diseased patients, and one animal isolate. In total 1,363 SNP loci and 184 InDels have been identified in the genomes of all clinical Cpn isolates. These are distributed throughout the whole chlamydial genome and enriched in highly variable regions. The genomes show clear evidence of recombination in at least one potential region but no phage insertions. The tyrP gene was always encoded as single copy in all vascular isolates. Phylogenetic reconstruction revealed distinct evolutionary lineages containing primarily non-respiratory Cpn isolates. In one of these, clinical isolates from coronary arteries and blood monocytes were closely grouped together. They could be distinguished from all other isolates by characteristic nsSNPs in genes involved in RB to EB transition, inclusion membrane formation, bacterial stress response and metabolism.

Conclusions

This study substantially expands the genomic data of Cpn and elucidates its evolutionary history. The translation of the observed Cpn genetic differences into biological functions and the prediction of novel pathogen-oriented diagnostic strategies have to be further explored.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1377-8) contains supplementary material, which is available to authorized users.

12.

Background

Previous genome-wide association analyses identified QTL regions in the X chromosome for percentage of normal sperm and scrotal circumference in Brahman and Tropical Composite cattle. These traits are important to study because they are indicators of male fertility and are correlated with female sexual precocity and reproductive longevity. The aim was to investigate candidate genes in these regions and to identify putative causative mutations that influence these traits. In addition, we tested the identified mutations for female fertility and growth traits.

Results

Using a combination of bioinformatics and molecular assay technology, twelve non-synonymous SNPs in eleven genes were genotyped in a cattle population. Three and nine SNPs explained more than 1% of the additive genetic variance for percentage of normal sperm and scrotal circumference, respectively. The SNPs with a major influence on percentage of normal sperm mapped to the LOC100138021 and TAF7L genes, and those with a major influence on scrotal circumference mapped to the TEX11 and AR genes. One SNP in TEX11 explained ~13% of the additive genetic variance for scrotal circumference at 12 months. The tested SNPs were also associated with weight measurements, but not with female fertility traits.

Conclusions

The strong association of SNPs located in X chromosome genes with male fertility traits validates the QTL. The implicated genes are good candidates for use in genetic evaluation, without detrimentally influencing female fertility traits.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1595-0) contains supplementary material, which is available to authorized users.

13.

Background

Deviations in the amount of genomic content that arise during tumorigenesis, called copy number alterations, are structural rearrangements that can critically affect gene expression patterns. Additionally, copy number alteration profiles allow insight into cancer discrimination, progression and complexity. On data obtained from high-throughput sequencing, improving quality through GC bias correction and keeping false positives to a minimum help build reliable copy number alteration profiles.

Results

We introduce seqCNA, a parallelized R package for an integral copy number analysis of high-throughput sequencing cancer data. The package includes novel methodology on (i) filtering, reducing false positives, and (ii) GC content correction, improving copy number profile quality, especially under great read coverage and high correlation between GC content and copy number. Adequate analysis steps are automatically chosen based on availability of paired-end mapping, matched normal samples and genome annotation.

Conclusions

seqCNA, available through Bioconductor, provides accurate copy number predictions in tumoural data, thanks to the extensive filtering and better GC bias correction, while providing an integrated and parallelized workflow.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-178) contains supplementary material, which is available to authorized users.
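To make the notion of GC bias correction in this entry more tangible, here is a generic sketch of one common approach: rescale each window's read count by the median count of windows with similar GC content. This is not seqCNA's method or API, which the abstract does not describe; the function name `gc_correct`, the binning scheme and the default `n_bins` are assumptions.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List


def gc_correct(
    read_counts: List[float],
    gc_fractions: List[float],  # GC fraction of each window, in [0, 1]
    n_bins: int = 20,
) -> List[float]:
    """Rescale window read counts so every GC bin shares the global median."""
    if len(read_counts) != len(gc_fractions):
        raise ValueError("read_counts and gc_fractions must have equal length")

    def bin_index(gc: float) -> int:
        # A GC fraction of exactly 1.0 falls into the last bin.
        return min(int(gc * n_bins), n_bins - 1)

    # Group the read counts of windows that fall into the same GC bin.
    bins: Dict[int, List[float]] = defaultdict(list)
    for count, gc in zip(read_counts, gc_fractions):
        bins[bin_index(gc)].append(count)

    global_median = median(read_counts)
    corrected = []
    for count, gc in zip(read_counts, gc_fractions):
        bin_median = median(bins[bin_index(gc)])
        # Windows in a bin with zero median coverage are set to 0.0.
        corrected.append(count * global_median / bin_median if bin_median > 0 else 0.0)
    return corrected
```

A median-based rescaling like this treats any GC-correlated coverage difference as technical bias, so it can flatten true signal when GC content and copy number are themselves correlated. The abstract names that situation as one its own correction is designed to handle better.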

14.

Background

High-yielding cultivars of rice (Oryza sativa L.) have been developed in Japan from crosses between overseas indica and domestic japonica cultivars. Recently, next-generation sequencing technology and high-throughput genotyping systems have shown many single-nucleotide polymorphisms (SNPs) that are proving useful for detailed analysis of genome composition. These SNPs can be used in genome-wide association studies to detect candidate genome regions associated with economically important traits. In this study, we used a custom SNP set to identify introgressed chromosomal regions in a set of high-yielding Japanese rice cultivars, and we performed an association study to identify genome regions associated with yield.

Results

An informative set of 1152 SNPs was established by screening 14 high-yielding or primary ancestral cultivars for 5760 validated SNPs. Analysis of the population structure of high-yielding cultivars showed three genome types: japonica-type, indica-type and a mixture of the two. SNP allele frequencies showed several regions derived predominantly from one of the two parental genome types. Distinct regions skewed for the presence of parental alleles were observed on chromosomes 1, 2, 7, 8, 11 and 12 (indica) and on chromosomes 1, 2 and 6 (japonica). A possible relationship between these introgressed regions and six yield traits (blast susceptibility, heading date, length of unhusked seeds, number of panicles, surface area of unhusked seeds and 1000-grain weight) was detected in eight genome regions dominated by alleles of one parental origin. Two of these regions were near Ghd7, a heading date locus, and Pi-ta, a blast resistance locus. The allele types (i.e., japonica or indica) of significant SNPs coincided with those previously reported for candidate genes Ghd7 and Pi-ta.

Conclusions

Introgression breeding is an established strategy for the accumulation of QTLs and genes controlling high yield. Our custom SNP set is an effective tool for the identification of introgressed genome regions from a particular genetic background. This study demonstrates that changes in genome structure occurred during artificial selection for high yield, and provides information on several genomic regions associated with yield performance.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-346) contains supplementary material, which is available to authorized users.

15.

Background

Carbohydrate metabolism is a key feature of vascular plant architecture, and is of particular importance in large woody species, where lignocellulosic biomass is responsible for bearing the bulk of the stem and crown. Since Carbohydrate Active enZymes (CAZymes) in plants are responsible for the synthesis, modification and degradation of carbohydrate biopolymers, the differences in gene copy number and regulation between woody and herbaceous species have been highlighted previously. There are still many unanswered questions about the role of CAZymes in land plant evolution and the formation of wood, a strong carbohydrate sink.

Results

Here, twenty-two publicly available plant genomes were used to characterize the frequency, diversity and complexity of CAZymes in plants. We find that a conserved suite of CAZymes is a feature of land plant evolution, with similar diversity and complexity regardless of growth habit and form. In addition, we compared the diversity and levels of CAZyme gene expression during wood formation in trees using mRNA-seq data from two distantly related angiosperm tree species, Eucalyptus grandis and Populus trichocarpa, highlighting the major CAZyme classes involved in xylogenesis and lignocellulosic biomass production.

Conclusions

The CAZyme domain ratio across embryophytes is maintained, and the diversity of CAZyme domains is similar in all land plants, regardless of woody habit. The stoichiometric conservation of gene expression in woody and non-woody tissues of Eucalyptus and Populus is indicative of gene balance preservation.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1571-8) contains supplementary material, which is available to authorized users.

16.

Background

Single nucleotide polymorphism (SNP) markers have a wide range of applications in crop genetics and genomics. Due to their polyploid nature, many important crops, such as wheat, cotton and rapeseed, contain a large amount of repeat and homoeologous sequences in their genomes, which poses a huge challenge for high-throughput genotyping with sequencing and/or array technologies. Allotetraploid Brassica napus (AACC, 2n = 4x = 38) comprises two highly homoeologous sub-genomes derived from its progenitor species B. rapa (AA, 2n = 2x = 20) and B. oleracea (CC, 2n = 2x = 18), and is an ideal species in which to explore methods for reducing the interference of extensive inter-homoeologue polymorphisms (mHemi-SNPs and Pseudo-simple SNPs) between closely related sub-genomes.

Results

Based on a recent B. napus 6K SNP array, we developed a bi-filtering procedure to identify unauthentic lines in a DH population, and mHemi-SNPs and Pseudo-simple SNPs in an array data matrix. The procedure utilized both monomorphic and polymorphic SNPs in the DH population and could effectively distinguish the mHemi-SNPs and Pseudo-simple SNPs that resulted from superposition of the signals from multiple SNPs. Compared with the conventional procedure for array data processing, the bi-filtering method could minimize the pseudo linkage relationships caused by the mHemi-SNPs and Pseudo-simple SNPs, thus improving the quality of the SNP genetic map. Furthermore, the improved genetic map could increase the accuracy of QTL mapping, as demonstrated by the ability to eliminate non-real QTLs in the mapping population.

Conclusions

The bi-filtering analysis of the SNP array data represents a novel approach to effectively assigning the multi-loci SNP genotypes in polyploid B. napus and may find wide applications to SNP analyses in polyploid crops.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1559-4) contains supplementary material, which is available to authorized users.

17.

Background

Septoria tritici blotch is an important leaf disease of European winter wheat. In our survey, we analyzed Septoria tritici blotch resistance in field trials with a large population of 1,055 elite hybrids and their 87 parental lines. Entries were fingerprinted with the 9 k SNP array. The accuracy of prediction of Septoria tritici blotch resistance achieved with different genome-wide mapping approaches was evaluated based on robust cross validation scenarios.

Results

Septoria tritici blotch disease severities were normally distributed, with genotypic variation being significantly (P < 0.01) larger than zero. The cross validation study revealed an absence of large effect QTL for additive and dominance effects. Application of genomic selection approaches particularly designed to tackle complex agronomic traits doubled the accuracy of prediction of Septoria tritici blotch resistance compared to calculation methods suited to detecting QTL with large effects.

Conclusions

Our study revealed that Septoria tritici blotch resistance in European winter wheat is controlled by multiple loci with small effect size. This suggests that the currently achieved level of resistance in this collection is likely to be durable, as involvement of a high number of genes in a resistance trait reduces the risk of the resistance being overcome by specific pathogen isolates or races.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-14-858) contains supplementary material, which is available to authorized users.

18.

Background

Copy number variation is an important dimension of genetic diversity and has implications in development and disease. As an important model organism, the mouse is a prime candidate for copy number variant (CNV) characterization, but this has yet to be completed for a large sample size. Here we report CNV analysis of publicly available, high-density microarray data files for 351 mouse tail samples, including 290 mice that had not been characterized for CNVs previously.

Results

We found 9634 putative autosomal CNVs across the samples affecting 6.87 % of the mouse reference genome. We find significant differences in the degree of CNV uniqueness (single sample occurrence) and the nature of CNV-gene overlap between wild-caught mice and classical laboratory strains. CNV-gene overlap was associated with lipid metabolism, pheromone response and olfaction compared to immunity, carbohydrate metabolism and amino-acid metabolism for wild-caught mice and classical laboratory strains, respectively. Using two subspecies of wild-caught Mus musculus, we identified putative CNVs unique to those subspecies and show this diversity is better captured by wild-derived laboratory strains than by the classical laboratory strains. A total of 9 genic copy number variable regions (CNVRs) were selected for experimental confirmation by droplet digital PCR (ddPCR).

Conclusion

The analysis we present is a comprehensive, genome-wide analysis of CNVs in Mus musculus, which increases the number of known variants in the species and will accelerate the identification of novel variants in future studies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1713-z) contains supplementary material, which is available to authorized users.

19.

Background

Polyploidy is a major component of eukaryote evolution. Estimation of allele copy numbers for molecular markers has long been considered a challenge for polyploid species, while this process is essential for most genetic research. With the increasing availability and whole-genome coverage of single nucleotide polymorphism (SNP) markers, it is essential to implement a versatile SNP genotyping method to assign allelic configuration efficiently in polyploids.

Scope

This work evaluates the usefulness of the KASPar method, based on competitive allele-specific PCR, for the assignment of SNP allelic configuration. Citrus was chosen as a model because of its economic importance, the ongoing worldwide polyploidy manipulation projects for cultivar and rootstock breeding, and the increasing availability of SNP markers.

Conclusions

Fifteen SNP markers were successfully designed that produced clear allele signals that were in agreement with previous genotyping results at the diploid level. The analysis of DNA mixes between two haploid lines (Clementine and pummelo) at 13 different ratios revealed a very high correlation (average = 0·9796; s.d. = 0·0094) between the allele ratio and two parameters [θ angle = tan−1 (y/x) and y′ = y/(x + y)] derived from the two normalized allele signals (x and y) provided by KASPar. Separated cluster analysis and analysis of variance (ANOVA) from mixed DNA simulating triploid and tetraploid hybrids provided 99·71 % correct allelic configuration. Moreover, triploid populations arising from 2n gametes and interploid crosses were easily genotyped and provided useful genetic information. This work demonstrates that the KASPar SNP genotyping technique is an efficient way to assign heterozygous allelic configurations within polyploid populations. This method is accurate, simple and cost-effective. Moreover, it may be useful for quantitative studies, such as relative allele-specific expression analysis and bulk segregant analysis.
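The two parameters defined in this entry, θ = tan⁻¹(y/x) and y′ = y/(x + y), can be computed from a pair of normalized allele signals as in the sketch below. The function name `allele_signal_parameters` is hypothetical, and using `math.atan2` so that x = 0 does not cause a division by zero is an implementation choice, not something stated in the abstract.

```python
import math
from typing import Tuple


def allele_signal_parameters(x: float, y: float) -> Tuple[float, float]:
    """Return (theta, y_prime) for normalized allele signals x and y.
    theta is in radians; atan2(y, x) equals arctan(y / x) whenever x > 0."""
    total = x + y
    if total == 0:
        raise ValueError("x + y must be non-zero")
    theta = math.atan2(y, x)
    y_prime = y / total
    return theta, y_prime


# Equal signals from both alleles give theta = pi/4 and y' = 0.5.
theta, y_prime = allele_signal_parameters(1.0, 1.0)
```

Both values increase monotonically with the share of the y allele in the signal, which is consistent with the abstract's report that they correlate with the allele ratio in the DNA mixes.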

20.