Similar documents
Found 20 similar documents; search took 15 ms
1.
Methods for multiple-testing correction in local expression quantitative trait locus (cis-eQTL) studies are a trade-off between statistical power and computational efficiency. Bonferroni correction, though computationally trivial, is overly conservative and fails to account for linkage disequilibrium between variants. Permutation-based methods are more powerful, though computationally far more intensive. We present an alternative correction method called eigenMT, which runs over 500 times faster than permutations and has adjusted p values that closely approximate empirical ones. To achieve this speed while also maintaining the accuracy of permutation-based methods, we estimate the effective number of independent variants tested for association with a particular gene, termed Meff, by using the eigenvalue decomposition of the genotype correlation matrix. We employ a regularized estimator of the correlation matrix to ensure Meff is robust and yields adjusted p values that closely approximate p values from permutations. Finally, using a common genotype matrix, we show that eigenMT can be applied with even greater efficiency to studies across tissues or conditions. Our method provides a simpler, more efficient approach to multiple-testing correction than existing methods and fits within existing pipelines for eQTL discovery.
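The idea of an effective number of tests can be written out as a short worked formula. This is only a sketch: the variance-explained threshold c and the Bonferroni-style adjustment below are assumptions made for illustration, since the abstract does not state the exact rule eigenMT uses.

```latex
% Assumed notation: \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_M are the eigenvalues
% of the (regularized) M x M genotype correlation matrix for one gene's variants.
% c is a hypothetical variance-explained threshold, e.g. c = 0.99.
M_{\mathrm{eff}} = \min\Bigl\{\, k : \sum_{i=1}^{k} \lambda_i \;\ge\; c \sum_{i=1}^{M} \lambda_i \Bigr\}

% Assumed Bonferroni-style adjustment of the smallest nominal p value for the gene:
p_{\mathrm{adj}} = \min\bigl(1,\; M_{\mathrm{eff}} \cdot p_{\min}\bigr)
```

When variants are in strong linkage disequilibrium, a few large eigenvalues carry most of the variance, so Meff is much smaller than M and the correction is less conservative than plain Bonferroni.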

2.
3.
4.
This exhaled breath ammonia method uses a fast and highly sensitive spectroscopic method known as quartz enhanced photoacoustic spectroscopy (QEPAS) that uses a quantum cascade based laser. The monitor is coupled to a sampler that measures mouth pressure and carbon dioxide. The system is temperature controlled and specifically designed to address the reactivity of this compound. The sampler provides immediate feedback to the subject and the technician on the quality of the breath effort. Together with the quick response time of the monitor, this system is capable of accurately measuring exhaled breath ammonia representative of deep lung systemic levels. Because the system is easy to use and produces real time results, it has enabled experiments to identify factors that influence measurements. For example, mouth rinse and oral pH reproducibly and significantly affect results and therefore must be controlled. Temperature and mode of breathing are other examples. As our understanding of these factors evolves, error is reduced, and clinical studies become more meaningful. This system is very reliable and individual measurements are inexpensive. The sampler is relatively inexpensive and quite portable, but the monitor is neither. This limits options for some clinical studies and provides a rationale for future innovations.

5.
6.
Linear motifs mediate a wide variety of cellular functions, which makes their characterization in protein sequences crucial to understanding cellular systems. However, the short length and degenerate nature of linear motifs make their discovery a difficult problem. Here, we introduce MotifHound, an algorithm particularly suited for the discovery of small and degenerate linear motifs. MotifHound performs an exact and exhaustive enumeration of all motifs present in proteins of interest, including all of their degenerate forms, and scores the overrepresentation of each motif based on its occurrence in proteins of interest relative to a background (e.g., proteome) using the hypergeometric distribution. To assess MotifHound, we benchmarked it together with state-of-the-art algorithms. The benchmark consists of 11,880 sets of proteins from S. cerevisiae; in each set, we artificially spiked-in one motif varying in terms of three key parameters, (i) number of occurrences, (ii) length and (iii) the number of degenerate or “wildcard” positions. The benchmark enabled the evaluation of the impact of these three properties on the performance of the different algorithms. The results showed that MotifHound and SLiMFinder were the most accurate in detecting degenerate linear motifs. Interestingly, MotifHound was 15 to 20 times faster at comparable accuracy and performed best in the discovery of highly degenerate motifs. We complemented the benchmark by an analysis of proteins experimentally shown to bind the FUS1 SH3 domain from S. cerevisiae. Using the full-length protein partners as sole information, MotifHound recapitulated most experimentally determined motifs binding to the FUS1 SH3 domain. Moreover, these motifs exhibited properties typical of SH3 binding peptides, e.g., high intrinsic disorder and evolutionary conservation, despite the fact that none of these properties were used as prior information. 
MotifHound is available (http://michnick.bcm.umontreal.ca or http://tinyurl.com/motifhound) together with the benchmark that can be used as a reference to assess future developments in motif discovery.
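The overrepresentation scoring described above can be illustrated with a one-sided hypergeometric tail probability. This is a minimal sketch, not MotifHound's code: the function name and parameter names are hypothetical, and counting one occurrence per protein is an assumption.

```python
from math import comb


def hypergeom_enrichment_pvalue(background_size, background_hits, set_size, set_hits):
    """Return P(X >= set_hits) for X ~ Hypergeometric(N, K, n).

    background_size (N): number of proteins in the background, e.g. the proteome
    background_hits (K): background proteins that contain the motif
    set_size (n):        number of proteins of interest
    set_hits:            proteins of interest that contain the motif
    """
    upper = min(background_hits, set_size)
    total = comb(background_size, set_size)
    # Sum the upper tail: k motif-containing proteins among the n drawn.
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, set_size - k)
        for k in range(set_hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: a motif found in 12 of 40 proteins of interest,
    # and in 150 of 6000 background proteins.
    print(hypergeom_enrichment_pvalue(6000, 150, 40, 12))
```

A small tail probability means the motif occurs in the proteins of interest more often than random draws from the background would suggest.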

7.
MOTIVATION: The intensification of DNA sequencing will increasingly unveil uncharacterized species with potential alternative genetic codes. A total of 0.65% of the DNA sequences currently in GenBank encode their proteins with a variant genetic code, and these exceptions occur in many unrelated taxa. RESULTS: We introduce FACIL (Fast and Accurate genetic Code Inference and Logo), a fast and reliable tool to evaluate nucleic acid sequences for their genetic code that detects alternative codes even in species distantly related to known organisms. To illustrate this, we apply FACIL to a set of mitochondrial genomic contigs of Globobulimina pseudospinescens. This foraminifer does not have any sequenced close relative in the databases, yet we infer its alternative genetic code with high confidence values. Results are intuitively visualized in a Genetic Code Logo. Availability and implementation: FACIL is available as a web-based service at http://www.cmbi.ru.nl/FACIL/ and as a stand-alone program.

8.
9.
The belief propagation (BP) algorithm has some limitations, including ambiguous edges and textureless regions, and slow convergence speed. To address these problems, we present a novel algorithm that intrinsically improves both the accuracy and the convergence speed of BP. First, traditional BP generally consumes time due to numerous iterations. To reduce the number of iterations, inspired by the crucial importance of the initial value in nonlinear problems, a novel initial-value belief propagation (IVBP) algorithm is presented, which can greatly improve both convergence speed and accuracy. Second, the majority of the existing research on BP concentrates on the smoothness term or other energy terms, neglecting the significance of the data term. In this study, a self-adapting dissimilarity data term (SDDT) is presented to improve the accuracy of the data term, which incorporates an additional gradient-based measure into the traditional data term, with the weight determined by the robust measure-based control function. Finally, this study explores the effective combination of local methods and global methods. The experimental results have demonstrated that our method performs well compared with the state-of-the-art BP and simultaneously holds better edge-preserving smoothing effects with fast convergence speed in the Middlebury and new 2014 Middlebury datasets.

10.
A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because (i) the two populations determined by a genetic variant may have very different sizes, and (ii) the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations.

11.
A mathematical analysis for fast changes of ethylene concentration in an open flow system (non-steady-state conditions) is presented and experimentally tested. In this way it becomes possible to determine true values of ethylene production in the minute range following physiological and environmental changes which influence ethylene evolution. By this procedure ethylene kinetics can also be compared in absolute values independent of flow rate and plant chamber volume.
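The abstract does not give its equations. As an illustration only, a generic mass balance for a well-mixed chamber with ethylene-free inlet air (both assumptions, with hypothetical symbols) shows why flow rate and chamber volume have to enter a non-steady-state calculation:

```latex
% Assumed symbols: C(t) ethylene concentration in the chamber outlet,
% V chamber volume, F air flow rate, P(t) ethylene production rate.
V \frac{dC}{dt} = P(t) - F\,C(t)
\quad\Longrightarrow\quad
P(t) = F\,C(t) + V \frac{dC}{dt}

% At steady state dC/dt = 0 and the familiar P = F C is recovered.
```

Under these assumptions, the term V dC/dt corrects for ethylene that is accumulating in, or being flushed from, the chamber while the concentration is still changing.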

12.
13.
14.
Multiplexing is of vital importance for utilizing the full potential of next generation sequencing technologies. We here report TagGD (DNA-based Tag Generator and Demultiplexor), a fully-customisable, fast and accurate software package that can generate thousands of barcodes satisfying user-defined constraints and can guarantee full demultiplexing accuracy. The barcodes are designed to minimise their interference with the experiment. Insertion, deletion and substitution events are considered when designing and demultiplexing barcodes. 20,000 barcodes of length 18 were designed in 5 minutes and 2 million barcoded Illumina HiSeq-like reads generated with an error rate of 2% were demultiplexed with full accuracy in 5 minutes. We believe that our software meets a central demand in the current high-throughput biology and can be utilised in any field with ample sample abundance. The software is available on GitHub (https://github.com/pelinakan/UBD.git).
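Demultiplexing that tolerates insertions, deletions and substitutions can be sketched with a plain edit-distance comparison. This is not TagGD's algorithm, which the abstract does not describe; the function names, the `max_distance` parameter and the tie rule are all hypothetical choices made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion from a
                curr[j - 1] + 1,           # insertion into a
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def assign_barcode(read_tag, barcodes, max_distance=2):
    """Return the unique closest barcode within max_distance, or None.

    Reads that are equally close to two barcodes are left unassigned
    (an assumed policy, chosen to avoid ambiguous assignments).
    """
    best, best_dist, tie = None, max_distance + 1, False
    for barcode in barcodes:
        d = edit_distance(read_tag, barcode)
        if d < best_dist:
            best, best_dist, tie = barcode, d, False
        elif d == best_dist and d <= max_distance:
            tie = True
    return None if tie else best


if __name__ == "__main__":
    print(assign_barcode("ACGTAC", ["ACGTAA", "TTTTTT"]))
```

The all-against-all comparison here is quadratic in practice; a design that guarantees a minimum pairwise distance between barcodes is what makes unambiguous assignment possible at a given error rate.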

15.

Background

In the simplest definition, cis-eQTLs, as opposed to trans-eQTLs, are genetic variants that affect expression in an allele-specific manner, with implications for the underlying mechanism. Yet, due to technical limitations of expression microarrays, the vast majority of eQTL studies performed in the last decade used a genomic-distance-based definition as a surrogate for cis, and therefore explored local rather than cis-eQTLs.

Results

In this study we use RNAseq to explore allele specific expression (ASE) in adipose tissue of male and female F1 mice, produced from reciprocal crosses of C57BL/6J and DBA/2J strains. Comparison of the identified cis-eQTLs with local-eQTLs obtained from adipose tissue expression in two previous population-based studies in our laboratory yields poor overlap between the two mapping approaches, while both local-eQTL studies show highly concordant results. Specifically, local-eQTL studies show ~60% overlap between themselves, while only 15-20% of local-eQTLs are identified as cis by ASE, and less than 50% of ASE genes are recovered in local-eQTL studies. Utilizing recently published ENCODE data, we also find that ASE genes show a significant bias in SNP prevalence in DNase I hypersensitive sites that is specific to the direction of ASE.

Conclusions

We suggest a new approach to analysis of allele specific expression that is more sensitive and accurate than the commonly used Fisher or chi-square statistics. Our analysis indicates that technical differences between the cis and local-eQTL approaches, such as differences in genomic background or sex specificity, account for a relatively small fraction of the discrepancy. Therefore, we suggest that the differences between the two eQTL mapping approaches may facilitate sorting of SNP-eQTL interactions into true cis and trans, and that a considerable portion of local-eQTLs may actually represent trans interactions.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-471) contains supplementary material, which is available to authorized users.

16.
17.
The explosion of bioinformatics technologies in the form of next generation sequencing (NGS) has facilitated a massive influx of genomics data in the form of short reads. Short read mapping is therefore a fundamental component of next generation sequencing pipelines which routinely match these short reads against reference genomes for contig assembly. However, such techniques have seldom been applied to microbial marker gene sequencing studies, which have mostly relied on novel heuristic approaches. We propose NINJA Is Not Just Another OTU-Picking Solution (NINJA-OPS, or NINJA for short), a fast and highly accurate novel method enabling reference-based marker gene matching (picking Operational Taxonomic Units, or OTUs). NINJA takes advantage of the Burrows-Wheeler (BW) alignment using an artificial reference chromosome composed of concatenated reference sequences, the “concatesome,” as the BW input. Other features include automatic support for paired-end reads with arbitrary insert sizes. NINJA is also free and open source and implements several pre-filtering methods that elicit substantial speedup when coupled with existing tools. We applied NINJA to several published microbiome studies, obtaining accuracy similar to or better than previous reference-based OTU-picking methods while achieving an order of magnitude or more speedup and using a fraction of the memory footprint. NINJA is a complete pipeline that takes a FASTA-formatted input file and outputs a QIIME-formatted taxonomy-annotated BIOM file for an entire MiSeq run of human gut microbiome 16S genes in under 10 minutes on a dual-core laptop.
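The bookkeeping behind a concatenated reference can be sketched as follows. This is an illustration under stated assumptions, not NINJA's implementation: the function names are hypothetical, and exact substring search with `str.find` stands in for the Burrows-Wheeler aligner.

```python
from bisect import bisect_right


def build_concatesome(references):
    """Concatenate (name, sequence) pairs; return the string, start offsets, names."""
    names, starts, parts = [], [], []
    offset = 0
    for name, seq in references:
        names.append(name)
        starts.append(offset)
        parts.append(seq)
        offset += len(seq)
    return "".join(parts), starts, names


def locate(position, starts, names):
    """Map a position in the concatesome back to (reference name, local offset)."""
    idx = bisect_right(starts, position) - 1
    return names[idx], position - starts[idx]


def spans_boundary(position, length, starts, total_length):
    """True if a hit of the given length runs past the end of its reference."""
    idx = bisect_right(starts, position) - 1
    end = starts[idx + 1] if idx + 1 < len(starts) else total_length
    return position + length > end


if __name__ == "__main__":
    refs = [("otu1", "ACGTACGT"), ("otu2", "TTGACCA")]  # hypothetical references
    concat, starts, names = build_concatesome(refs)
    read = "GACC"
    pos = concat.find(read)  # stand-in for the BW alignment step
    if pos != -1 and not spans_boundary(pos, len(read), starts, len(concat)):
        print(locate(pos, starts, names))
```

Keeping a sorted list of start offsets makes the lookup from alignment position to reference a binary search, and the boundary check discards hits that straddle two references joined only by concatenation.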

18.
Microsatellite instability (MSI) is a key biomarker for cancer therapy and prognosis. Traditional experimental assays are laborious and time-consuming, and next-generation sequencing-based computational methods do not work on leukemia samples, paraffin-embedded samples, or patient-derived xenografts/organoids, due to the requirement of matched normal samples. Herein, we developed MSIsensor-pro, an open-source single sample MSI scoring method for research and clinical applications. MSIsensor-pro introduces a multinomial distribution model to quantify polymerase slippages for each tumor sample and a discriminative site selection method to enable MSI detection without matched normal samples. We demonstrate that MSIsensor-pro is an ultrafast, accurate, and robust MSI calling method. Using samples with various sequencing depths and tumor purities, MSIsensor-pro significantly outperformed the current leading methods in both accuracy and computational cost. MSIsensor-pro is available at https://github.com/xjtu-omics/msisensor-pro and free for non-commercial use, while a commercial license is provided upon request.

19.
《IRBM》2022,43(4):279-289
Glaucoma is an eye disease that causes blindness when it progresses to an advanced stage. Early glaucoma diagnosis is essential to prevent vision loss. However, early detection is often not achieved due to the lack of ophthalmologists and the limited accessibility of retinal image capture devices. In this paper, we present an automated method for glaucoma screening dedicated to Smartphone Captured Fundus Images (SCFIs). Implementing the method on a smartphone coupled with an optical lens for retina capture yields a mobile-aided screening system for glaucoma. The challenge consists in ensuring high detection performance despite the moderate quality of SCFIs, with an execution time short enough for clinical use. The main idea consists in deducing glaucoma from the vessel displacement inside the Optic Disk (OD), where the vessel tree remains sufficiently modeled on SCFIs. Within this objective, our major contribution consists in proposing: (1) a robust processing for locating vessel centroids in order to adequately model the vessel distribution, and (2) a feature vector that reflects two main glaucoma biomarkers in terms of vessel displacement. Furthermore, all processing steps are carefully chosen for low complexity, to be suitable for fast clinical screening. A first evaluation of our method is performed using the two public DRISHTI-DB and DRIONS-DB databases, where 99% and 95% accuracy, 96.77% and 97.5% specificity and 100% and 95% sensitivity are respectively achieved. Thereafter, the method is evaluated using two fundus image databases captured through a smartphone and a retinograph, respectively, for the same persons. We achieve 100% accuracy using both databases, which confirms the robustness of our method. In addition, the detection takes 0.027 and 0.029 seconds when executed on the Samsung-M51 and Samsung-A70 smartphones, respectively.
Our proposed smartphone app provides a cost-effective and widely accessible mobile platform for early screening of glaucoma in remote clinics or areas with limited access to fundus cameras and ophthalmologists.

20.

Motivation

To obtain large-scale sequence alignments in a fast and flexible way is an important step in the analyses of next generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis.

Results

With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) to retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.
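The kind of alignment detail mentioned above (score, gaps, mismatches) comes from the traceback step of Smith-Waterman. The following is a small CPU-only sketch, not PaSWAS code: the function name, the scoring values and the linear gap penalty are assumptions, and only the single best hit is reported.

```python
def smith_waterman(query, target, match=2, mismatch=-1, gap=-2):
    """Local alignment with a linear gap penalty; returns score and alignment details."""
    rows, cols = len(query) + 1, len(target) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix and remember the highest-scoring cell.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero is reached, counting events.
    i, j = best_pos
    gaps = mismatches = 0
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if query[i - 1] == target[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            mismatches += query[i - 1] != target[j - 1]
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            gaps += 1
            i -= 1
        else:
            gaps += 1
            j -= 1

    return {
        "score": best,
        "gaps": gaps,
        "mismatches": mismatches,
        "query_start": i,
        "target_start": j,
    }


if __name__ == "__main__":
    print(smith_waterman("AAAACCCC", "AAAAGCCCC"))
```

The matrix fill is the part that parallelizes well on graphics hardware, because cells on the same anti-diagonal do not depend on each other; the traceback is what recovers the per-alignment details.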
