Similar literature
 20 similar documents found
1.

Background

In single-cell human genome analysis using whole-genome amplified product, a strong amplification bias involving allele dropout and preferential amplification hampers the quality of results. Using an oligonucleotide single nucleotide polymorphism (SNP) array, we systematically examined the nature of this amplification bias, including frequency, degree, and preference for genomic location, and we assessed the effects of this amplification bias on subsequent genotype and chromosomal copy number analyses.

Methodology/Principal Findings

We found a large variability in amplification bias among the amplified products obtained by multiple displacement amplification (MDA), and this bias had a severe effect on the genotype and chromosomal copy number analyses. We established optimal experimental conditions for pre-screening for high-quality amplified products, processing array data, and analyzing chromosomal structural alterations. Using this optimized protocol, we successfully detected previously unidentified chromosomal structural alterations in single cells from a lymphoblastoid cell line. These alterations were subsequently confirmed by karyotype analysis. In addition, we successfully obtained reproducible chromosomal copy number profiles of single cells from the cell line with a complex karyotype, indicating the applicability and potential of our optimized workflow.

Conclusions/Significance

Our results suggest that the quality of amplification products should be critically assessed before using them for genomic analyses. The method of MDA-based whole-genome amplification followed by SNP array analysis described here will be useful for exploring chromosomal alterations in single cells.

2.

Background

Tumor single nucleotide polymorphism (SNP) arrays are a common platform for investigating cancer genomic aberrations and functionally important altered genes. Raw SNP array signals are usually corrupted by noise and need to be deconvoluted into an absolute copy number profile by analytical methods. Unfortunately, despite the popularity of tumor Affymetrix SNP arrays, methods specifically designed for this platform are still limited. The complicated noise characteristics of the signals are one of the difficulties in dissecting tumor Affymetrix SNP array data, as they inevitably blur the distinction between aberrations and create an obstacle to copy number aberration (CNA) identification.

Results

We propose a tool named TAFFYS for comprehensive analysis of tumor Affymetrix SNP array data. TAFFYS introduces a wavelet-based de-noising approach and a copy number-specific signal variance model for suppressing and modelling the noise in signals. A hidden Markov model is then employed for copy number inference. Finally, using the absolute copy number profile, the statistical significance of each aberration region is calculated in terms of different aberration types, including amplification, deletion and loss of heterozygosity (LOH). The results show that the copy number-specific variance model and the wavelet de-noising algorithm fit the Affymetrix SNP array signals well, leading to more accurate estimation for diluted tumor samples (even with only 30% of cancer cells) than other existing methods. The evaluations also demonstrate good compatibility and extensibility across different Affymetrix SNP array platforms. Application to 35 breast tumor samples shows that TAFFYS can automatically dissect the tumor samples and reveal statistically significant aberration regions where cancer-related genes are located.

Conclusions

TAFFYS provides an efficient and convenient tool for identifying copy number alterations and allelic imbalance and for assessing recurrent aberrations in tumor Affymetrix SNP array data.
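The hidden Markov model step mentioned in item 2 can be illustrated with a minimal Viterbi decoder over log2 ratio signals. This is only a sketch under stated assumptions and not the TAFFYS implementation: the function name `viterbi_copy_number`, the Gaussian emission model with a single shared `sigma`, the `stay` probability, and the restriction to three copy number states are all hypothetical choices made for illustration.

```python
import math

def viterbi_copy_number(log_ratios, states=(1, 2, 3), sigma=0.2, stay=0.99):
    """Most likely copy number path for a list of log2(tumor/normal) ratios.

    Assumptions (illustrative only): each state s emits a Gaussian centred
    on log2(s / 2) with standard deviation sigma, and the chain keeps its
    state with probability `stay`. At least two states are required.
    """
    if not log_ratios:
        return []
    means = {s: math.log2(s / 2) for s in states}
    log_stay = math.log(stay)
    log_switch = math.log((1 - stay) / (len(states) - 1))

    def emit(x, s):
        # Log Gaussian density up to an additive constant shared by all states.
        return -0.5 * ((x - means[s]) / sigma) ** 2

    def trans(prev, cur):
        return log_stay if prev == cur else log_switch

    score = {s: -math.log(len(states)) + emit(log_ratios[0], s) for s in states}
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: score[p] + trans(p, s))
            new_score[s] = score[best_prev] + trans(best_prev, s) + emit(x, s)
            pointers[s] = best_prev
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

With these assumed parameters, a run of five probes near log2 ratio -1 between diploid flanks is decoded as a one-copy segment, while a single mildly low probe is absorbed as noise.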

3.

Background  

DNA copy number aberration (CNA) is one of the key characteristics of cancer cells. Recent studies demonstrated the feasibility of utilizing high density single nucleotide polymorphism (SNP) genotyping arrays to detect CNA. Compared with the two-color array-based comparative genomic hybridization (array-CGH), the SNP arrays offer much higher probe density and lower signal-to-noise ratio at the single SNP level. To accurately identify small segments of CNA from SNP array data, segmentation methods that are sensitive to CNA while resistant to noise are required.

4.

Background  

Recent developments of high-density SNP chips across a number of species require accurate genetic maps. Despite rapid advances in genome sequence assembly and availability of a number of tools for creating genetic maps, the exact genome location for a number of SNPs from these SNP chips still remains unknown. We have developed a locus ordering procedure based on linkage disequilibrium (LODE) which provides estimation of the chromosomal positions of unaligned SNPs and scaffolds. It also provides an alternative means for verification of genetic maps. We exemplified LODE in cattle.

5.

Background  

Genome-wide expression signatures are emerging as potential markers for overall survival and disease recurrence risk, as evidenced by the recent commercialization of gene-expression-based biomarkers in breast cancer. Similar predictions have recently been carried out using genome-wide copy number alterations and microRNAs. Existing software packages for microarray data analysis provide functions to define expression-based survival gene signatures. However, there is no software that can perform survival analysis using SNP array data or draw survival curves interactively for expression-based sample clusters.

6.

Background

Genomic deletions and duplications are important in the pathogenesis of diseases, such as cancer and mental retardation, and have recently been shown to occur frequently in unaffected individuals as polymorphisms. Affymetrix GeneChip whole genome sampling analysis (WGSA) combined with 100 K single nucleotide polymorphism (SNP) genotyping arrays is one of several microarray-based approaches that are now being used to detect such structural genomic changes. The popularity of this technology and its associated open source data format have resulted in the development of an increasing number of software packages for the analysis of copy number changes using these SNP arrays.

Results

We evaluated four publicly available software packages for high-throughput copy number analysis using synthetic and empirical 100 K SNP array data sets, the latter obtained from 107 mental retardation (MR) patients and their unaffected parents and siblings. We evaluated the software with regard to overall suitability for high-throughput 100 K SNP array data analysis, effectiveness of normalization, scaling with various reference sets, feature extraction, and true and false positive rates of genomic copy number variant (CNV) detection.

Conclusion

We observed considerable variation among the numbers and types of candidate CNVs detected by different analysis approaches, and found that multiple programs were needed to find all real aberrations in our test set. The frequency of false positive deletions was substantial, but could be greatly reduced by using the SNP genotype information to confirm loss of heterozygosity.
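The genotype-based filter described in the conclusion of item 6 can be sketched as follows. This is a hedged illustration, not the procedure used in the study: the function name `confirm_deletion`, the genotype labels, and the `max_het_fraction` threshold are assumptions. The idea is that a true hemizygous deletion should show essentially no heterozygous (AB) calls among the SNPs it spans.

```python
def confirm_deletion(genotypes, max_het_fraction=0.05):
    """Return True if the calls inside a candidate deletion look like LOH.

    genotypes: iterable of call strings for SNPs inside the candidate region.
    Anything other than "AA", "AB" or "BB" is treated as a failed call and
    ignored. The 5% tolerance for genotyping error is an assumed value.
    """
    called = [g for g in genotypes if g in ("AA", "AB", "BB")]
    if not called:
        # No usable genotype evidence, so the deletion cannot be confirmed.
        return False
    het_fraction = sum(1 for g in called if g == "AB") / len(called)
    return het_fraction <= max_het_fraction
```

A candidate region containing several AB calls would be rejected as a likely false positive, whereas a region with only AA and BB calls would be kept for follow-up.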

7.

Background

In recent years, the use of genomic information in livestock species for genetic improvement, association studies and many other fields has become routine. In order to accommodate different market requirements in terms of genotyping cost, manufacturers of single nucleotide polymorphism (SNP) arrays, private companies and international consortia have developed a large number of arrays with different content and different SNP density. The number of currently available SNP arrays differs among species, ranging from one for goats to more than ten for cattle, and the number of arrays available is increasing rapidly. However, there has been little or no effort to standardize and integrate array-specific (e.g. SNP IDs, allele coding) and species-specific (i.e. past and current assemblies) SNP information.

Results

Here we present SNPchiMp v.3, a solution to these issues for the six major livestock species (cow, pig, horse, sheep, goat and chicken). Original data was collected directly from SNP array producers and specific international genome consortia, and stored in a MySQL database. The database was then linked to an open-access web tool and to public databases. SNPchiMp v.3 ensures fast access to the database (retrieving within/across SNP array data) and the possibility of annotating SNP array data in a user-friendly fashion.

Conclusions

This platform allows easy integration and standardization, and it is aimed at both industry and research. It also enables users to easily link the information available from the array producer with data in public databases, without the need for additional bioinformatics tools or pipelines. In recognition of the open-access use of Ensembl resources, SNPchiMp v.3 was officially credited as an Ensembl E!mpowered tool. Availability at http://bioinformatics.tecnoparco.org/SNPchimp.

8.

Background

Chromosomal breakage followed by faulty DNA repair leads to gene amplifications and deletions in cancers. However, the mere assessment of the extent of genomic changes, amplifications and deletions may reduce the complexity of genomic data observed by array comparative genomic hybridization (array CGH). We present here a novel approach to array CGH data analysis, which focuses on putative breakpoints responsible for rearrangements within the genome.

Results

We performed array comparative genomic hybridization in 29 primary tumors from high risk patients with breast cancer. The specimens were flow sorted according to ploidy to increase tumor cell purity prior to array CGH. We describe the number of chromosomal breaks as well as the patterns of breaks on individual chromosomes in each tumor. There were differences in chromosomal breakage patterns between the 3 clinical subtypes of breast cancers, although the highest density of breaks occurred at chromosome 17 in all subtypes, suggesting a particular proclivity of this chromosome for breaks. We also observed chromothripsis affecting various chromosomes in 41% of high risk breast cancers.

Conclusions

Our results provide a new insight into the genomic complexity of breast cancer. Genomic instability dependent on chromosomal breakage events is not stochastic, targeting some chromosomes clearly more than others. We report a much higher percentage of chromothripsis than described previously in other cancers and this suggests that massive genomic rearrangements occurring in a single catastrophic event may shape many breast cancer genomes.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-579) contains supplementary material, which is available to authorized users.

9.
Intra-tumor heterogeneity concerns the existence of genetically different subclones within the same tumor. Single sample quantification of heterogeneity relies on precise determination of chromosomal copy numbers throughout the genome, and an assessment of whether identified mutation variant allele fractions match clonal or subclonal copy numbers. We discuss these issues using data from SNP arrays, whole exome sequencing and pathologist purity estimates on several breast cancers characterized by ERBB2 amplification. We show that chromosomal copy numbers can only be estimated from SNP array signals or sequencing depths for subclonal tumor samples with simple subclonal architectures under certain assumptions.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0470-7) contains supplementary material, which is available to authorized users.
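The check described in item 9, whether a mutation's variant allele fraction matches a clonal copy number, can be illustrated with the standard expectation for a clonal mutation in a tumor sample mixed with normal cells. This is a sketch under assumptions, not the authors' method: the function name `expected_vaf` and its parameters are hypothetical, the normal cells are assumed diploid and unmutated, and every tumor cell is assumed to carry the mutation.

```python
def expected_vaf(purity, tumor_copy_number, mutated_copies, normal_copy_number=2):
    """Expected variant allele fraction of a clonal mutation.

    purity: fraction of tumor cells in the sample (0 to 1).
    tumor_copy_number: total copies of the locus in each tumor cell.
    mutated_copies: how many of those copies carry the mutation.
    """
    mutant_alleles = purity * mutated_copies
    total_alleles = purity * tumor_copy_number + (1 - purity) * normal_copy_number
    return mutant_alleles / total_alleles
```

For example, under these assumptions a heterozygous clonal mutation at a diploid locus in a sample of 50% purity is expected at a fraction of 0.25. An observed fraction well below the expectation for every plausible `mutated_copies` value would hint at a subclonal mutation.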

10.

Background  

Array comparative genomic hybridization (CGH) is a technique which detects copy number differences in DNA segments. Complete sequencing of the human genome and the development of an array representing a tiling set of tens of thousands of DNA segments spanning the entire human genome have made high resolution copy number analysis throughout the genome possible. Since array CGH provides a signal ratio for each DNA segment, visualization requires the reassembly of individual data points into chromosome profiles.

11.

Background

Next-generation sequencing techniques, such as genotyping-by-sequencing (GBS), provide alternatives to single nucleotide polymorphism (SNP) arrays. The aim of this work was to evaluate the potential of GBS compared to SNP array genotyping for genomic selection in livestock populations.

Methods

The value of GBS was quantified by simulation analyses in which three parameters were varied: (i) genome-wide sequence read depth (x) per individual from 0.01x to 20x or using SNP array genotyping; (ii) number of genotyped markers from 3000 to 300 000; and (iii) size of training and prediction sets from 500 to 50 000 individuals. The latter was achieved by distributing the total available x of 1000x, 5000x, or 10 000x per genotyped locus among the varying number of individuals. With SNP arrays, genotypes were called from sequence data directly. With GBS, genotypes were called from sequence reads that varied between loci and individuals according to a Poisson distribution with mean equal to x. Simulated data were analyzed with ridge regression and the accuracy and bias of genomic predictions and response to selection were quantified under the different scenarios.

Results

Accuracies of genomic predictions using GBS data or SNP array data were comparable when large numbers of markers were used and x per individual was ~1x or higher. The bias of genomic predictions was very high at a very low x. When the total available x was distributed among the training individuals, the accuracy of prediction was maximized when a large number of individuals was used that had GBS data with low x for a large number of markers. Similarly, response to selection was maximized under the same conditions due to increasing both accuracy and selection intensity.

Conclusions

GBS offers great potential for developing genomic selection in livestock populations because it makes it possible to cover large fractions of the genome and to vary the sequence read depth per individual. Thus, the accuracy of predictions is improved by increasing the size of training populations and the intensity of selection is increased by genotyping a larger number of selection candidates.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0102-z) contains supplementary material, which is available to authorized users.
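The read-depth model in the Methods of item 11, where the number of reads per locus and individual follows a Poisson distribution with mean x, can be sketched as below. This is a minimal illustration and not the simulation code used in the study: the function names, the coding of genotypes as the number of alternative alleles, and the assumption of error-free reads sampled equally from both chromosomes are all hypothetical.

```python
import math
import random

def poisson_sample(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method.

    Adequate for the small means considered here (0.01x to 20x).
    """
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_gbs_reads(genotype, x, rng):
    """Simulate reads at one locus for one individual.

    genotype: number of alternative alleles carried (0, 1 or 2).
    x: mean sequence read depth.
    Returns (reference_reads, alternative_reads).
    """
    depth = poisson_sample(x, rng)
    # Each read comes from one of the two chromosomes with equal probability.
    alt = sum(1 for _ in range(depth) if rng.random() < genotype / 2)
    return depth - alt, alt
```

At low x many loci receive zero reads, and a heterozygote covered by a single read always looks homozygous, which is one way to see why genotype calls become unreliable at very low depth.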

12.

Background  

DNA copy number aberration (CNA) is very important in the pathogenesis of tumors and other diseases. For example, CNAs may result in suppression of anti-oncogenes and activation of oncogenes, which can cause certain types of cancers. High density single nucleotide polymorphism (SNP) array data are widely used for CNA detection. However, it is nontrivial to detect CNAs automatically because the signals obtained from high density SNP arrays often have a low signal-to-noise ratio (SNR), which might be caused by whole genome amplification, mixtures of normal and tumor cells, experimental noise or other technical limitations. As the SNR decreases, many false CNA regions are detected and true CNA regions are missed. Thus, more sophisticated statistical models are needed to make CNA detection from low SNR signals more robust and reliable.

13.
14.

Background

The identification of disease-associated genes using single nucleotide polymorphisms (SNPs) has been increasingly reported. In particular, the Affymetrix Mapping 10 K SNP microarray platform uses one PCR primer to amplify the DNA samples and determine the genotype of more than 10,000 SNPs in the human genome. This provides the opportunity for large scale, rapid and cost-effective genotyping assays for linkage analysis. However, the analysis of such datasets is nontrivial because of the large number of markers, and visualizing the linkage scores in the context of genome maps remains less automated using the current linkage analysis software packages. For example, the haplotyping results are commonly represented in the text format.

Results

Here we report the development of a novel software tool called CompareLinkage for automated formatting of the Affymetrix Mapping 10 K genotype data into the "Linkage" format and the subsequent analysis with multi-point linkage software programs such as Merlin and Allegro. The new software has the ability to visualize the results for all these programs in dChip in the context of genome annotations and cytoband information. In addition we implemented a variant of the Lander-Green algorithm in the dChipLinkage module of dChip software (V1.3) to perform parametric linkage analysis and haplotyping of SNP array data. These functions are integrated with the existing modules of dChip to visualize SNP genotype data together with LOD score curves. We have analyzed three families with recessive and dominant diseases using the new software programs and the comparison results are presented and discussed.

Conclusions

The CompareLinkage and dChipLinkage software packages are freely available. They provide the visualization tools for high-density oligonucleotide SNP array data, as well as the automated functions for formatting SNP array data for the linkage analysis programs Merlin and Allegro and calling these programs for linkage analysis. The results can be visualized in dChip in the context of genes and cytobands. In addition, a variant of the Lander-Green algorithm is provided that allows parametric linkage analysis and haplotyping.

15.

Background  

Some diseases, like tumors, can be related to chromosomal aberrations, leading to changes of DNA copy number. The copy number of an aberrant genome can be represented as a piecewise constant function, since it can exhibit regions of deletions or gains. In a healthy cell, by contrast, the copy number is two because we inherit one copy of each chromosome from each of our parents.
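The piecewise constant representation described in item 15 can be made concrete with a small lookup structure. This is an illustrative sketch only: the function name `make_copy_number_profile`, the half-open `(start, end, copy_number)` segment format, and the example coordinates are assumptions, and segments are assumed not to overlap.

```python
from bisect import bisect_right

def make_copy_number_profile(segments, baseline=2):
    """Build a piecewise constant copy number function for one chromosome.

    segments: non-overlapping (start, end, copy_number) tuples, with the
    interval taken as start <= position < end. Positions outside every
    segment get the healthy diploid baseline.
    """
    ordered = sorted(segments)
    starts = [start for start, _, _ in ordered]

    def copy_number(position):
        # Find the last segment that starts at or before the position.
        index = bisect_right(starts, position) - 1
        if index >= 0:
            start, end, value = ordered[index]
            if start <= position < end:
                return value
        return baseline

    return copy_number

# Hypothetical example: a one-copy deletion and a three-copy gain.
profile = make_copy_number_profile([(100, 200, 1), (500, 800, 3)])
```

Calling `profile(150)` falls inside the deletion, `profile(600)` inside the gain, and any position outside both segments returns the baseline of two.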

16.

Background  

Illumina's Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies. A fundamental step in any analysis is the processing of raw allele A and allele B intensities from each SNP into genotype calls (AA, AB, BB). Various algorithms which make use of different statistical models are available for this task. We compare four methods (GenCall, Illuminus, GenoSNP and CRLMM) on data where the true genotypes are known in advance and data from a recently published genome-wide association study.
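The step described in item 16, turning allele A and allele B intensities into a genotype call, can be illustrated with a deliberately naive fixed-threshold caller. It is not GenCall, Illuminus, GenoSNP or CRLMM, all of which fit statistical models; the function name `naive_genotype_call`, the thresholds, and the minimum-intensity cutoff are assumptions. It relies only on the polar angle theta = (2/pi) * arctan(B/A), which is close to 0 for AA, 0.5 for AB and 1 for BB.

```python
import math

def naive_genotype_call(a, b, low=0.25, high=0.75, min_intensity=0.2):
    """Call a genotype from normalized allele A and allele B intensities.

    Returns "AA", "AB", "BB", or "NC" (no call) when the summed intensity
    is too weak to trust. All thresholds are illustrative assumptions.
    """
    if a + b < min_intensity:
        return "NC"
    theta = (2 / math.pi) * math.atan2(b, a)
    if theta < low:
        return "AA"
    if theta > high:
        return "BB"
    return "AB"
```

Fixed thresholds like these ignore per-SNP cluster shifts, which is one reason model-based callers that estimate cluster positions from the data are used in practice.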

17.

Background  

With the rapid development of high-throughput genotyping technologies, efficient methods for identifying linked regions using high-density SNP genotype data have become more and more important. Recently, a deterministic method that works very well on SNP genotyping data has been developed (Lin et al. Bioinformatics 2008, 24(1): 86–93). However, that program can only work on a limited number of family structures. In particular, the results (if any) will be poor when the genotype data for the whole chromosome of one of the parents in a nuclear family is missing.

18.

Background

Single nucleotide polymorphism (SNP) markers have a wide range of applications in crop genetics and genomics. Due to their polyploid nature, many important crops, such as wheat, cotton and rapeseed, contain a large amount of repeat and homoeologous sequences in their genomes, which poses a major challenge for high-throughput genotyping with sequencing and/or array technologies. Allotetraploid Brassica napus (AACC, 2n = 4x = 38) comprises two highly homoeologous sub-genomes derived from its progenitor species B. rapa (AA, 2n = 2x = 20) and B. oleracea (CC, 2n = 2x = 18), and is an ideal species in which to explore methods for reducing the interference of extensive inter-homoeologue polymorphisms (mHemi-SNPs and Pseudo-simple SNPs) between closely related sub-genomes.

Results

Based on a recent B. napus 6K SNP array, we developed a bi-filtering procedure to identify unauthentic lines in a DH population, and mHemi-SNPs and Pseudo-simple SNPs in an array data matrix. The procedure utilized both monomorphic and polymorphic SNPs in the DH population and could effectively distinguish the mHemi-SNPs and Pseudo-simple SNPs that resulted from superposition of the signals from multiple SNPs. Compared with the conventional procedure for array data processing, the bi-filtering method could minimize the pseudo linkage relationships caused by the mHemi-SNPs and Pseudo-simple SNPs, thus improving the quality of the SNP genetic map. Furthermore, the improved genetic map could increase the accuracy of QTL mapping, as demonstrated by its ability to eliminate non-real QTLs in the mapping population.

Conclusions

The bi-filtering analysis of the SNP array data represents a novel approach to effectively assigning the multi-loci SNP genotypes in polyploid B. napus and may find wide applications to SNP analyses in polyploid crops.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1559-4) contains supplementary material, which is available to authorized users.

19.

Key message

A high-density genetic map constructed with a wheat 55 K SNP array was highly consistent with the physical map of this species and it facilitated the identification of a novel major QTL for productive tiller number.

Abstract

Productive tiller number (PTN) plays a key role in wheat grain yield. In this study, a recombinant inbred line population with 199 lines derived from a cross between ‘20828’ and ‘Chuannong16’ was used to construct a high-density genetic map using the wheat 55 K single nucleotide polymorphism (SNP) array. The constructed genetic map contains 12,109 SNP markers spanning 3021.04 cM across the 21 wheat chromosomes. The orders of the genetic and physical positions of these markers are generally in agreement, and they also match well with those based on the 660 K SNP array from which the one used in this study was derived. The ratios of SNPs located in each of the wheat deletion bins were similar among the wheat 9 K, 55 K, 90 K, 660 K and 820 K SNP arrays. Based on the constructed maps, a novel major quantitative trait locus for PTN, QPtn.sau-4B, was detected across multiple environments in a 0.55 cM interval on 4B, and it explained 17.23–45.46% of the phenotypic variance. Twenty common genes in the physical interval between the flanking markers were identified on chromosome 4B of ‘Chinese Spring’ and wild emmer. These results indicate that the wheat 55 K SNP array could be an ideal tool for primary mapping of target genes, and the identification of QPtn.sau-4B lays a foundation for subsequent fine mapping and cloning work.

20.

Background  

Affymetrix SNP arrays can interrogate thousands of SNPs at the same time. This allows us to look at the genomic content of cancer cells and to investigate the underlying events leading to cancer. Genomic copy-numbers are today routinely derived from SNP array data, but the proposed algorithms for this task most often disregard the genotype information available from germline cells in paired germline-tumour samples. Including this information may deepen our understanding of the "true" biological situation, e.g. by enabling analysis of allele-specific copy-numbers. Here we rely on matched germline-tumour samples and have developed a Hidden Markov Model (HMM) to estimate allelic copy-number changes in tumour cells. Furthermore, with this approach we are able to estimate the proportion of normal cells in the tumour (mixture proportion).
