20 similar articles found.
1.
Hiromasa Takemura Cesar F. Caiafa Brian A. Wandell Franco Pestilli 《PLoS computational biology》2016,12(2)
Tractography uses diffusion MRI to estimate the trajectory and cortical projection zones of white matter fascicles in the living human brain. There are many different tractography algorithms and each requires the user to set several parameters, such as curvature threshold. Choosing a single algorithm with specific parameters poses two challenges. First, different algorithms and parameter values produce different results. Second, the optimal choice of algorithm and parameter value may differ between different white matter regions or different fascicles, subjects, and acquisition parameters. We propose using ensemble methods to reduce algorithm and parameter dependencies. To do so we separate the processes of fascicle generation and evaluation. Specifically, we analyze the value of creating optimized connectomes by systematically combining candidate streamlines from an ensemble of algorithms (deterministic and probabilistic) and systematically varying parameters (curvature and stopping criterion). The ensemble approach leads to optimized connectomes that provide better cross-validated prediction error of the diffusion MRI data than optimized connectomes generated using a single algorithm or parameter set. Furthermore, the ensemble approach produces connectomes that contain both short- and long-range fascicles, whereas single-parameter connectomes are biased towards one or the other. In summary, a systematic ensemble tractography approach can produce connectomes that are superior to standard single-parameter estimates both for predicting the diffusion measurements and estimating white matter fascicles.
2.
Michael Riss 《PloS one》2014,9(4)
The analysis of electrophysiological recordings often involves visual inspection of time series data to locate specific experiment epochs, mask artifacts, and verify the results of signal processing steps, such as filtering or spike detection. Long-term experiments with continuous data acquisition generate large amounts of data. Rapid browsing through these massive datasets poses a challenge to conventional data plotting software because the plotting time increases in proportion to the volume of data. This paper presents FTSPlot, which is a visualization concept for large-scale time series datasets using techniques from the field of high performance computer graphics, such as hierarchic level of detail and out-of-core data handling. In a preprocessing step, time series data, event, and interval annotations are converted into an optimized data format, which then permits fast, interactive visualization. The preprocessing step has a computational complexity of ; the visualization itself can be done with a complexity of O(1) and is therefore independent of the amount of data. A demonstration prototype has been implemented and benchmarks show that the technology is capable of displaying large amounts of time series data, event, and interval annotations lag-free with ms. The current 64-bit implementation theoretically supports datasets with up to 2^64 bytes, on the x86_64 architecture currently up to bytes are supported, and benchmarks have been conducted with 2^40 bytes/1 TiB or double precision samples. The presented software is freely available and can be included as a Qt GUI component in future software projects, providing a standard visualization method for long-term electrophysiological experiments.
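The hierarchic level-of-detail idea mentioned in this abstract can be illustrated with a minimal sketch. It is not FTSPlot's actual data format or code: the function names `build_minmax_pyramid` and `envelope`, the reduction factor of 2, and the in-memory lists are all assumptions made for illustration. Each level stores (min, max) pairs over progressively larger bins, so a viewer can draw any time window with a bounded number of points.

```python
def build_minmax_pyramid(samples, factor=2):
    """Preprocess samples into levels of (min, max) pairs.

    Level 0 has one pair per sample; each higher level merges
    `factor` neighbouring bins of the level below (hypothetical layout).
    """
    level = [(s, s) for s in samples]
    levels = [level]
    while len(level) > 1:
        level = [
            (min(lo for lo, _ in level[i:i + factor]),
             max(hi for _, hi in level[i:i + factor]))
            for i in range(0, len(level), factor)
        ]
        levels.append(level)
    return levels


def envelope(levels, first, last, max_points, factor=2):
    """Return (min, max) bins covering samples [first, last).

    Uses the finest level that needs at most `max_points` bins, so the
    amount drawn does not grow with the length of the recording.
    """
    for k, level in enumerate(levels):
        bin_size = factor ** k
        lo_bin = first // bin_size
        hi_bin = -(-last // bin_size)  # ceiling division
        if hi_bin - lo_bin <= max_points:
            return level[lo_bin:hi_bin]
    return levels[-1]


levels = build_minmax_pyramid([3, 1, 4, 1, 5, 9, 2, 6])
coarse = envelope(levels, 0, 8, max_points=2)
```

A real out-of-core implementation would keep the levels on disk and read only the bins a query touches; the sketch keeps everything in memory for brevity.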
3.
Sean C. Warren Anca Margineanu Dominic Alibhai Douglas J. Kelly Clifford Talbot Yuriy Alexandrov Ian Munro Matilda Katan Chris Dunsby Paul M. W. French 《PloS one》2013,8(8)
Fluorescence lifetime imaging (FLIM) is widely applied to obtain quantitative information from fluorescence signals, particularly using Förster Resonant Energy Transfer (FRET) measurements to map, for example, protein-protein interactions. Extracting FRET efficiencies or population fractions typically entails fitting data to complex fluorescence decay models but such experiments are frequently photon constrained, particularly for live cell or in vivo imaging, and this leads to unacceptable errors when analysing data on a pixel-wise basis. Lifetimes and population fractions may, however, be more robustly extracted using global analysis to simultaneously fit the fluorescence decay data of all pixels in an image or dataset to a multi-exponential model under the assumption that the lifetime components are invariant across the image (dataset). This approach is often considered to be prohibitively slow and/or computationally expensive but we present here a computationally efficient global analysis algorithm for the analysis of time-correlated single photon counting (TCSPC) or time-gated FLIM data based on variable projection. It makes efficient use of both computer processor and memory resources, requiring less than a minute to analyse time series and multiwell plate datasets with hundreds of FLIM images on standard personal computers. This lifetime analysis takes account of repetitive excitation, including fluorescence photons excited by earlier pulses contributing to the fit, and is able to accommodate time-varying backgrounds and instrument response functions. 
We demonstrate that this global approach allows us to readily fit time-resolved fluorescence data to complex models including a four-exponential model of a FRET system, for which the FRET efficiencies of the two species of a bi-exponential donor are linked, and polarisation-resolved lifetime data, where a fluorescence intensity and bi-exponential anisotropy decay model is applied to the analysis of live cell homo-FRET data. A software package implementing this algorithm, FLIMfit, is available under an open source licence through the Open Microscopy Environment.
4.
5.
Marcus Lechner Maribel Hernandez-Rosales Daniel Doerr Nicolas Wieseke Annelyse Thévenin Jens Stoye Roland K. Hartmann Sonja J. Prohaska Peter F. Stadler 《PloS one》2014,9(8)
The elucidation of orthology relationships is an important step both in gene function prediction and in understanding patterns of sequence evolution. For large datasets, orthology assignments are usually derived directly from sequence similarities because more exact approaches are too computationally expensive. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance), was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets.
6.
Linear mixed effects models are widely used to analyze a clustered response variable. Motivated by a recent study to examine and compare the hospital length of stay (LOS) between patients undergoing percutaneous coronary intervention (PCI) and coronary artery bypass graft (CABG) from several international clinical trials, we proposed a bivariate linear mixed effects model for the joint modeling of clustered PCI and CABG LOSs where each clinical trial is considered a cluster. Due to the large number of patients in some trials, commonly used commercial statistical software for fitting (bivariate) linear mixed models failed to run since it could not allocate enough memory to invert large dimensional matrices during the optimization process. We consider ways to circumvent the computational problem in the maximum likelihood (ML) inference and restricted maximum likelihood (REML) inference. In particular, we developed an expectation-maximization (EM) algorithm for the REML inference and presented an ML implementation using existing software. The new REML EM algorithm is easy to implement and computationally stable and efficient. With this REML EM algorithm, we could analyze the LOS data and obtain meaningful results.
7.
Benedikt Kirsch-Gerweck Leonard Bohnenkämper Michel T Henrichs Jarno N Alanko Hideo Bannai Bastien Cazaux Pierre Peterlongo Joachim Burger Jens Stoye Yoan Diekmann 《Molecular biology and evolution》2023,40(3)
Genomic regions under positive selection harbor variation linked for example to adaptation. Most tools for detecting positively selected variants have computational resource requirements rendering them impractical on population genomic datasets with hundreds of thousands of individuals or more. We have developed and implemented an efficient haplotype-based approach able to scan large datasets and accurately detect positive selection. We achieve this by combining a pattern matching approach based on the positional Burrows–Wheeler transform with model-based inference which only requires the evaluation of closed-form expressions. We evaluate our approach with simulations, and find it to be both sensitive and specific. The computational resource requirements quantified using UK Biobank data indicate that our implementation is scalable to population genomic datasets with millions of individuals. Our approach may serve as an algorithmic blueprint for the era of “big data” genomics: a combinatorial core coupled with statistical inference in closed form.
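For readers unfamiliar with the positional Burrows–Wheeler transform named in this abstract, the following is a minimal sketch of its core ordering step only. It is not the authors' implementation and omits the selection inference entirely; the function name `pbwt_orders` and the list-of-lists input of 0/1 alleles are assumptions for illustration. At each site the haplotypes are stably partitioned by allele, which keeps them sorted by their reversed prefixes so that haplotypes sharing a long match ending at that site sit next to each other.

```python
def pbwt_orders(haplotypes):
    """Return the haplotype ordering before site 0 and after each site.

    `haplotypes` is a list of equal-length lists of 0/1 alleles
    (hypothetical input format). After processing site k, the ordering
    sorts haplotypes by their prefix up to and including k, read backwards.
    """
    n_sites = len(haplotypes[0])
    order = list(range(len(haplotypes)))
    orders = [order]
    for k in range(n_sites):
        # Stable partition: zeros first, ones second, keeping relative order.
        zeros = [i for i in order if haplotypes[i][k] == 0]
        ones = [i for i in order if haplotypes[i][k] == 1]
        order = zeros + ones
        orders.append(order)
    return orders


orders = pbwt_orders([[0, 1], [1, 0], [0, 0]])
```

Each site is handled in time linear in the number of haplotypes, which is what makes this ordering attractive for very large panels.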
8.
The ability to evaluate the validity of data is essential to any investigation, and manual “eyes on” assessments of data quality have dominated in the past. Yet, as the size of collected data continues to increase, so does the effort required to assess their quality. This challenge is of particular concern for networks that automate their data collection, and has resulted in the automation of many quality assurance and quality control analyses. Unfortunately, the interpretation of the resulting data quality flags can become quite challenging with large data sets. We have developed a framework to summarize data quality information and facilitate interpretation by the user. Our framework consists of first compiling data quality information and then presenting it through two separate mechanisms: a quality report and a quality summary. The quality report presents the results of specific quality analyses as they relate to individual observations, while the quality summary takes a spatial or temporal aggregate of each quality analysis and provides a summary of the results. Included in the quality summary is a final quality flag, which further condenses data quality information to assess whether a data product is valid or not. This framework has the added flexibility to allow “eyes on” information on data quality to be incorporated for many data types. Furthermore, this framework can aid problem tracking and resolution, should sensor or system malfunctions arise.
9.
10.
Clare M. Lee Manikhandan A. V. Mudaliar D. R. Haggart C. Roland Wolf Gino Miele J. Keith Vass Desmond J. Higham Daniel Crowther 《PloS one》2012,7(12)
Non-negative matrix factorization is a useful tool for reducing the dimension of large datasets. This work considers simultaneous non-negative matrix factorization of multiple sources of data. In particular, we perform the first study that involves more than two datasets. We discuss the algorithmic issues required to convert the approach into a practical computational tool and apply the technique to new gene expression data quantifying the molecular changes in four tissue types due to different dosages of an experimental panPPAR agonist in mouse. This study is of interest in toxicology because, whilst PPARs form potential therapeutic targets for diabetes, it is known that they can induce serious side-effects. Our results show that the practical simultaneous non-negative matrix factorization developed here can add value to the data analysis. In particular, we find that factorizing the data as a single object allows us to distinguish between the four tissue types, but does not correctly reproduce the known dosage level groups. Applying our new approach, which treats the four tissue types as providing distinct, but related, datasets, we find that the dosage level groups are respected. The new algorithm then provides separate gene list orderings that can be studied for each tissue type, and compared with the ordering arising from the single factorization. We find that many of our conclusions can be corroborated with known biological behaviour, and others offer new insights into the toxicological effects. Overall, the algorithm shows promise for early detection of toxicity in the drug discovery process.
11.
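BLAST searches of metagenomic data from 13 marine samples yielded 1,600 16S rRNA gene sequences and 61 18S rRNA gene sequences. Classification showed that bacteria dominated in all three marine environment types (coastal water, open-ocean deep water, and open-ocean surface water), with relative proportions of 98%, 59%, and 91%, respectively. Compared with coastal and open-ocean surface water, open-ocean deep water contained higher relative proportions of archaea and Deltaproteobacteria, at 31% and 27%, respectively. The archaea detected in coastal water were mainly Euryarchaeota, whereas those detected in open-ocean deep water were mainly Crenarchaeota (93% belonging to the MGI class, which is associated with ammonia oxidation). These results suggest that ammonia-oxidizing archaea may play a substantial role in deep-sea ecosystems.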
12.
13.
《PLoS genetics》2013,9(6)
Genome-wide association studies have mainly relied on common HapMap sequence variations. Recently, sequencing approaches have allowed analysis of low frequency and rare variants in conjunction with common variants, thereby improving the search for functional variants and thus the understanding of the underlying biology of human traits and diseases. Here, we used a large Icelandic whole genome sequence dataset combined with Danish exome sequence data to gain insight into the genetic architecture of serum levels of vitamin B12 (B12) and folate. Up to 22.9 million sequence variants were analyzed in combined samples of 45,576 and 37,341 individuals with serum B12 and folate measurements, respectively. We found six novel loci associating with serum B12 (CD320, TCN2, ABCD4, MMAA, MMACHC) or folate levels (FOLR3) and confirmed seven loci for these traits (TCN1, FUT6, FUT2, CUBN, CLYBL, MUT, MTHFR). Conditional analyses established that four loci contain additional independent signals. Interestingly, 13 of the 18 identified variants were coding and 11 of the 13 target genes have known functions related to B12 and folate pathways. Contrary to epidemiological studies, we did not find consistent association of the variants with cardiovascular diseases, cancers or Alzheimer's disease, although some variants demonstrated pleiotropic effects. Although to some degree impeded by low statistical power for some of these conditions, these data suggest that sequence variants that contribute to the population diversity in serum B12 or folate levels do not modify the risk of developing these conditions. Yet, the study demonstrates the value of combining whole genome and exome sequencing approaches to ascertain the genetic and molecular architectures underlying quantitative trait associations.
14.
Next generation sequencing (NGS) of metagenomic samples is becoming a standard approach to detect individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance speed and sensitivity and, as a result, species or strain level identification is often inaccurate and low abundance pathogens can sometimes be missed. We have developed Taxoner, an open source taxon assignment pipeline that includes a fast aligner (e.g. Bowtie2) and a comprehensive DNA sequence database. We tested the program on simulated datasets as well as experimental data from Illumina, IonTorrent, and Roche 454 sequencing platforms. We found that Taxoner performs as well as, and often better than, BLAST, but requires two orders of magnitude less running time, meaning that it can be run on desktop or laptop computers. Taxoner is slower than the approaches that use small marker databases but is more sensitive due to the comprehensive reference database. In addition, it can be easily tuned to specific applications using small tailored databases. When applied to metagenomic datasets, Taxoner can provide a functional summary of the genes mapped and can provide strain level identification. Taxoner is written in C for Linux operating systems. The code and documentation are available for research applications at http://code.google.com/p/taxoner.
15.
16.
In recent years, several new diffusion MRI approaches have been proposed to explore microstructural properties of the white matter, such as Q-ball imaging and spherical deconvolution-based techniques to estimate the orientation distribution function. These methods can describe the estimated diffusion profile with a higher accuracy than the more conventional second-rank diffusion tensor imaging technique. Despite many important advances, there are still inconsistent findings between different models that investigate the “crossing fibers” issue. Due to the high information content and the complex nature of the data, it becomes virtually impossible to interpret and compare results in a consistent manner. In this work, we present novel fiber tractography visualization approaches that provide a more complete picture of the microstructural architecture of fiber pathways: multi-fiber hyperstreamlines and streamribbons. By visualizing, for instance, the estimated fiber orientation distribution along the reconstructed tract in a continuous way, information of the local fiber architecture is combined with the global anatomical information derived from tractography. Facilitating the interpretation of diffusion MRI data, this approach can be useful for comparing different diffusion reconstruction techniques and may improve our understanding of the intricate white matter network.
17.
Tractography algorithms have been developed to reconstruct likely white matter (WM) pathways in the brain from diffusion tensor imaging (DTI) data. In this study, an elegant and simple means for improving existing tractography algorithms is proposed by allowing tracts to propagate through diagonal trajectories between voxels, instead of only rectilinearly to their facewise neighbors. A series of tests (using both real and simulated data sets) is used to show several benefits of this new approach. First, the inclusion of diagonal tract propagation decreases the dependence of an algorithm on the arbitrary orientation of coordinate axes and therefore reduces numerical errors associated with that bias (which are also demonstrated here). Moreover, both quantitatively and qualitatively, including diagonals decreases overall noise sensitivity of results and leads to significantly greater efficiency in scanning protocols; that is, the obtained tracts converge much more quickly (i.e., in a smaller amount of scanning time) to those of data sets with high SNR and spatial resolution. Importantly, the inclusion of diagonal propagation adds essentially no appreciable calculation time or computational cost to standard methods. This study focuses on the widely used streamline tracking method, FACT (fiber assessment by continuous tracking), and the modified method is termed "FACTID" (FACT including diagonals).
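The difference between facewise and diagonal propagation can be pictured with a small sketch. This is not the FACT or FACTID algorithm, which tracks a continuous line through each voxel; it only contrasts the two voxel neighbourhoods. The function names `neighbor_offsets` and `next_voxel` and the rule of stepping to the best-aligned neighbour are assumptions for illustration.

```python
from itertools import product


def neighbor_offsets(include_diagonals):
    """Offsets to the 6 face neighbours, or all 26 neighbours of a voxel."""
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    if include_diagonals:
        return offsets
    # Face neighbours differ from the voxel along exactly one axis.
    return [d for d in offsets if sum(abs(c) for c in d) == 1]


def next_voxel(voxel, direction, include_diagonals):
    """Step to the allowed neighbour best aligned with `direction`.

    Hypothetical stepping rule: pick the offset with the largest
    cosine-like alignment (dot product divided by offset length).
    """
    def alignment(offset):
        norm = sum(c * c for c in offset) ** 0.5
        return sum(o * d for o, d in zip(offset, direction)) / norm

    best = max(neighbor_offsets(include_diagonals), key=alignment)
    return tuple(v + o for v, o in zip(voxel, best))


step = next_voxel((0, 0, 0), (1, 1, 0), include_diagonals=True)
```

With only face neighbours, a fiber direction halfway between two axes must be approximated by an axis-aligned step, which is one way to see the coordinate-axis dependence the abstract describes.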
18.
19.
20.
Choukri Mekkaoui Prashob Porayette Marcel P. Jackowski William J. Kostis Guangping Dai Stephen Sanders David E. Sosnovik 《PloS one》2013,8(8)