Similar articles
20 similar articles found (search time: 31 ms)
1.

Background

The recent explosion of biological data poses a great challenge for traditional clustering algorithms: as data sets grow, cluster identification demands far more memory and much longer runtimes. The affinity propagation algorithm outperforms many classical clustering algorithms and is widely applied in biological research. However, its time and space complexity become a severe bottleneck when handling large-scale data sets. Moreover, because the algorithm clusters data based on pairwise similarities, a similarity matrix must be constructed before it can run, and that construction itself takes a long time.

Methods

Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix constructing procedure and the affinity propagation algorithm. The memory-shared architecture is used to construct the similarity matrix, and the distributed system is taken for the affinity propagation algorithm, because of its large memory size and great computing capacity. An appropriate way of data partition and reduction is designed in our method, in order to minimize the global communication cost among processes.

Results

A speedup of 100 is gained with 128 cores. The runtime is reduced from several hours to a few seconds, which indicates that the parallel algorithm can handle large-scale data sets effectively. The parallel affinity propagation also achieves good performance when clustering large-scale gene (microarray) data and detecting families in large protein superfamilies.
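The message-passing updates that the abstract's parallel architecture accelerates can be sketched in plain NumPy. The following is a minimal serial version of the standard affinity propagation updates (responsibilities and availabilities), not the authors' parallel implementation; the toy data, damping value and preference choice are our own assumptions.

```python
import numpy as np

def affinity_propagation(S, damping=0.9, max_iter=300):
    """Minimal serial affinity propagation via responsibility/availability messages."""
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r(i,k)
    A = np.zeros((n, n))  # availabilities a(i,k)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        top = np.argmax(AS, axis=1)
        first = AS[np.arange(n), top]
        AS[np.arange(n), top] = -np.inf
        second = AS.max(axis=1)
        Rnew = S - first[:, None]
        Rnew[np.arange(n), top] = S[np.arange(n), top] - second
        R = damping * R + (1 - damping) * Rnew
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, np.diag(R))
        Anew = Rp.sum(axis=0)[None, :] - Rp
        diagA = np.diag(Anew).copy()
        Anew = np.minimum(Anew, 0)
        np.fill_diagonal(Anew, diagA)
        A = damping * A + (1 - damping) * Anew
    exemplars = np.where(np.diag(A + R) > 0)[0]
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(len(exemplars))  # exemplars label themselves
    return exemplars, labels

# Toy data: two well-separated 1D groups; similarity = negative squared distance.
x = np.array([0.0, 0.1, 0.2, 10.0, 10.1, 10.2])
S = -(x[:, None] - x[None, :]) ** 2
np.fill_diagonal(S, np.median(S[~np.eye(len(x), dtype=bool)]))  # preference
exemplars, labels = affinity_propagation(S)
```

Each iteration is dominated by dense row/column operations over the n x n matrices, which is exactly what makes the similarity matrix and message updates natural candidates for the shared-memory and distributed parallelization described above.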

2.
DTI is a technique that identifies white matter tracts (WMT) non-invasively, in both healthy subjects and patients, using diffusion measurements. Like the visual pathways (VP), WMT are not visible on classical MRI or intra-operatively under the microscope. DTI helps neurosurgeons avoid destroying the VP while removing lesions adjacent to these WMT. We performed DTI on fifty patients before and after surgery between March 2012 and January 2014. For navigation we used a 3D T1-weighted sequence; additionally, we performed T2-weighted and DTI sequences. The parameters were FOV: 200 x 200 mm, slice thickness: 2 mm, and acquisition matrix: 96 x 96, yielding nearly isotropic voxels of 2 x 2 x 2 mm. Axial MRI was carried out using 32 gradient directions and one b0 image. We used echo-planar imaging (EPI) and ASSET parallel imaging with an acceleration factor of 2 and a b-value of 800 s/mm². The scanning time was less than 9 min. The DTI data obtained were processed using an FDA-approved surgical navigation system program which uses a straightforward fiber-tracking approach known as fiber assignment by continuous tracking (FACT). This is based on the propagation of lines between regions of interest (ROI) defined by a physician. A maximum angle of 50°, an FA start value of 0.10 and an ADC stop value of 0.20 mm²/s were the parameters used for tractography. There are some limitations to this technique. The limited acquisition time frame forces trade-offs in image quality. Another important point not to be neglected is brain shift during surgery; for the latter, intra-operative MRI might be helpful. Furthermore, the risk of false positive or false negative tracts needs to be taken into account, which might compromise the final results.
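The FACT tracking rule described above (follow the principal diffusion direction; stop on a low-FA voxel or a turn sharper than the maximum angle) can be sketched as follows. This is a toy illustration on a synthetic field, not the navigation system's implementation; it models only the FA and angle criteria, and the field functions are hypothetical.

```python
import numpy as np

def fact_track(seed, eigvec_at, fa_at, step=1.0, fa_stop=0.10,
               max_angle_deg=50.0, max_steps=1000):
    """Toy FACT streamline: step along the principal eigenvector until the
    FA threshold or the turning-angle criterion terminates the track."""
    pos = np.asarray(seed, dtype=float)
    path = [pos.copy()]
    prev = None
    cos_max = np.cos(np.radians(max_angle_deg))
    for _ in range(max_steps):
        if fa_at(pos) < fa_stop:
            break
        d = np.asarray(eigvec_at(pos), dtype=float)
        if prev is not None:
            if np.dot(d, prev) < 0:        # eigenvectors have no intrinsic sign
                d = -d
            if np.dot(d, prev) < cos_max:  # turn sharper than the max angle: stop
                break
        pos = pos + step * d
        path.append(pos.copy())
        prev = d
    return np.array(path)

# Synthetic straight fiber: vectors point along +x, FA is high only for 0 <= x <= 20.
eigvec = lambda p: np.array([1.0, 0.0, 0.0])
fa = lambda p: 0.5 if 0.0 <= p[0] <= 20.0 else 0.0
path = fact_track([0.0, 0.0, 0.0], eigvec, fa)
```

With this field, the streamline marches straight along x and terminates one step past the high-FA region, mirroring how the FA stop value bounds tracts in real data.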

3.
Particle tracking in living systems requires low light exposure and short exposure times to avoid phototoxicity and photobleaching and to fully capture particle motion with high-speed imaging. Low-excitation light comes at the expense of tracking accuracy. Image restoration methods based on deep learning dramatically improve the signal-to-noise ratio in low-exposure data sets, qualitatively improving the images. However, it is not clear whether images generated by these methods yield accurate quantitative measurements such as diffusion parameters in (single) particle tracking experiments. Here, we evaluate the performance of two popular deep learning denoising software packages for particle tracking, using synthetic data sets and movies of diffusing chromatin as biological examples. With synthetic data, both supervised and unsupervised deep learning restored particle motions with high accuracy in two-dimensional data sets, whereas artifacts were introduced by the denoisers in three-dimensional data sets. Experimentally, we found that, while both supervised and unsupervised approaches improved tracking results compared with the original noisy images, supervised learning generally outperformed the unsupervised approach. We find that nicer-looking image sequences are not synonymous with more precise tracking results and highlight that deep learning algorithms can produce deceiving artifacts with extremely noisy images. Finally, we address the challenge of selecting parameters to train convolutional neural networks by implementing a frugal Bayesian optimizer that rapidly explores multidimensional parameter spaces, identifying networks yielding optimal particle tracking accuracy. Our study provides quantitative outcome measures of image restoration using deep learning. We anticipate broad application of this approach to critically evaluate artificial intelligence solutions for quantitative microscopy.
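The diffusion parameters mentioned above are typically estimated from the mean squared displacement (MSD) of a track; in 2D, MSD(t) = 4*D*t. A minimal sketch of that estimation on a simulated Brownian track (simulation parameters are our own, purely for illustration):

```python
import numpy as np

def msd_lag1(track):
    """Mean squared displacement at lag 1 for an (N, 2) track."""
    disp = np.diff(track, axis=0)
    return np.mean(np.sum(disp ** 2, axis=1))

# Simulate 2D Brownian motion with known D; per-axis step std is sqrt(2*D*dt).
rng = np.random.default_rng(0)
D_true, dt, n_steps = 0.5, 0.1, 20000
steps = rng.normal(0.0, np.sqrt(2 * D_true * dt), size=(n_steps, 2))
track = np.cumsum(steps, axis=0)

# In 2D, MSD at lag 1 equals 4*D*dt, so invert that relation to recover D.
D_est = msd_lag1(track) / (4 * dt)
```

Running such an estimator on tracks extracted from raw versus denoised movies is one concrete way to test whether a denoiser preserves quantitative motion statistics rather than merely producing nicer-looking images.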

4.
5.
6.
We present a new fast approach for the segmentation of thin branching structures, such as vascular trees, based on Fast-Marching (FM) and Level Set (LS) methods. FM allows segmentation of tubular structures by inflating a "long balloon" from a single user-given point. However, when the tubular shape is rather long, the front propagation may blow up through the boundary of the desired shape close to the starting point. Our contribution is a method that propagates only the useful part of the front while freezing the rest of it. We demonstrate its ability to segment tubular and tree-like structures quickly and accurately. We also develop a useful stopping criterion for the causal front propagation. Finally, we derive an efficient algorithm for extracting an underlying 1D skeleton of the branching objects using minimal path techniques. With each branch represented by its centerline, we automatically detect the bifurcations, leading to the "Minimal Tree" representation. This "Minimal Tree" is very useful for the visualization and quantification of pathologies in our anatomical data sets. We illustrate our algorithms by applying them to several artery data sets.
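The minimal path extraction underlying the centerline step can be illustrated on a discrete grid, where the continuous Fast-Marching machinery reduces to Dijkstra's algorithm on voxel costs. This is a toy stand-in (the cost grid and connectivity are our own assumptions), not the paper's continuous formulation:

```python
import heapq

def minimal_path(cost, start, goal):
    """Dijkstra minimal path on a 2D cost grid (4-connected), a discrete
    analogue of minimal-path centerline extraction."""
    rows, cols = len(cost), len(cost[0])
    dist = {start: cost[start[0]][start[1]]}
    prev = {}
    heap = [(dist[start], start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = node
                    heapq.heappush(heap, (nd, (nr, nc)))
    # Walk the predecessor chain back from the goal to recover the path.
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1], dist[goal]

# Low-cost "vessel" running along the top row and down the right column.
grid = [[1, 1, 1],
        [9, 9, 1],
        [9, 9, 1]]
path, total = minimal_path(grid, (0, 0), (2, 2))
```

In a vessel image the cost would be low inside the vessel and high outside, so the minimal path naturally hugs the centerline between the two endpoints.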

7.
There has been much recent excitement about the use of genetics to elucidate ancestral history and demography. Whole genome data from humans and other species are revealing complex stories of divergence and admixture that were left undiscovered by previous smaller data sets. A central challenge is to estimate the timing of past admixture and divergence events, for example the time at which Neanderthals exchanged genetic material with humans and the time at which modern humans left Africa. Here, we present a method for using sequence data to jointly estimate the timing and magnitude of past admixture events, along with population divergence times and changes in effective population size. We infer demography from a collection of pairwise sequence alignments by summarizing their length distribution of tracts of identity by state (IBS) and maximizing an analytic composite likelihood derived from a Markovian coalescent approximation. Recent gene flow between populations leaves behind long tracts of identity by descent (IBD), and these tracts give our method power by influencing the distribution of shared IBS tracts. In simulated data, we accurately infer the timing and strength of admixture events, population size changes, and divergence times over a variety of ancient and recent time scales. Using the same technique, we analyze deeply sequenced trio parents from the 1000 Genomes project. The data show evidence of extensive gene flow between Africa and Europe after the time of divergence as well as substructure and gene flow among ancestral hominids. In particular, we infer that recent African-European gene flow and ancient ghost admixture into Europe are both necessary to explain the spectrum of IBS sharing in the trios, rejecting simpler models that contain less population structure.
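The summary statistic at the heart of this method, the length distribution of IBS tracts, is simple to compute from a pairwise alignment: an IBS tract is a maximal run of identical sites. A minimal sketch (the example sequences are invented for illustration):

```python
def ibs_tract_lengths(seq_a, seq_b):
    """Lengths of maximal runs of identical sites (IBS tracts) between two
    aligned sequences of equal length."""
    assert len(seq_a) == len(seq_b)
    tracts, run = [], 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            run += 1          # extend the current identical run
        elif run:
            tracts.append(run)  # a mismatch closes the tract
            run = 0
    if run:
        tracts.append(run)      # close a tract running to the end
    return tracts

# Two toy aligned sequences with mismatches at positions 4 and 9.
tracts = ibs_tract_lengths("ACGTACGTAA", "ACGTTCGTAC")
```

The inference itself then fits the histogram of such lengths across the genome, since recent admixture inflates the tail of long shared tracts.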

8.
9.
10.
Automated microscopy is currently the only method to non-invasively and label-free observe complex multi-cellular processes, such as cell migration, cell cycle, and cell differentiation. Extracting biological information from a time series of micrographs requires each cell to be recognized and followed through sequential microscopic snapshots. Although recent attempts to automate this process have yielded ever-improving cell detection rates, manual identification of identical cells is still the most reliable technique. However, its tedious and subjective nature has prevented tracking from becoming a standardized tool for the investigation of cell cultures. Here, we present a novel method to accomplish automated cell tracking with a reliability comparable to manual tracking. Previously, automated cell tracking could not rival the reliability of manual tracking because, in contrast to the human way of solving this task, none of the algorithms had an independent quality-control mechanism; they lacked validation. Thus, instead of trying to improve the cell detection or tracking rates, we proceeded from the idea of automatically inspecting the tracking results and accepting only those of high trustworthiness, while rejecting all other results. This validation algorithm works independently of the quality of cell detection and tracking through a systematic search for tracking errors. It is based only on very general assumptions about the spatiotemporal contiguity of cell paths. While traditional tracking often aims to yield genealogic information about single cells, the natural outcome of a validated cell tracking algorithm turns out to be a set of complete, but often unconnected, cell paths, i.e. records of cells from mitosis to mitosis. This is a consequence of the fact that the validation algorithm takes complete paths as the unit of rejection or acceptance.
The resulting set of complete paths can be used to automatically extract important biological parameters with high reliability and statistical significance. These include the distributions of life/cycle times and cell areas, as well as the symmetry of cell divisions and motion analyses. The new algorithm thus allows for the quantification and parameterization of cell cultures with unprecedented accuracy. To evaluate our validation algorithm, two large reference data sets were created manually. These data sets comprise more than 320,000 unstained adult pancreatic stem cells from rat, including 2592 mitotic events. The reference data sets specify every cell position and shape, and assign each cell to the correct branch of its genealogic tree. We provide these reference data sets for free use by others as a benchmark for the future improvement of automated tracking methods.
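The accept-only-trustworthy-links idea can be illustrated with a deliberately simple frame-to-frame linker: a link is accepted only when the nearest detection is unique within a displacement bound, and ambiguous situations leave the path unextended. This is a toy illustrating the spirit of validation-based rejection, not the authors' algorithm; the detections and threshold are invented.

```python
import math

def link_frames(frames, max_disp=2.0):
    """Greedy linking of detections across frames; a link is accepted only if
    exactly one candidate lies within max_disp of the path end, so ambiguous
    links are rejected rather than guessed."""
    paths = [[p] for p in frames[0]]
    for frame in frames[1:]:
        for path in paths:
            last = path[-1]
            cands = [q for q in frame if math.dist(last, q) <= max_disp]
            if len(cands) == 1:     # unambiguous: accept the link
                path.append(cands[0])
            # zero or several candidates: leave the path unextended

    return paths

# Two cells drifting rightward over three frames, far enough apart to be unambiguous.
frames = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(0.5, 0.2), (10.3, 0.1)],
    [(1.0, 0.4), (10.6, 0.2)],
]
paths = link_frames(frames)
```

Real validation as described above operates on whole paths (mitosis to mitosis) rather than single links, but the unit of rejection/acceptance works the same way: anything that cannot be established unambiguously is discarded rather than reported.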

11.
Løkkeborg, Svein; Fernö, Anders; Jørgensen, Terje. Hydrobiologia (2002) 483(1-3): 259-264
Ultrasonic telemetry using stationary positioning systems allows several fish to be tracked simultaneously, but systems that cannot sample multiple frequencies simultaneously can record data from only one transmitter (individual) at a time. Tracking several individuals simultaneously thus results in longer intervals between successive position fixes for each fish. This deficiency leads to loss of detail in the tracking data collected, and may be expected to reduce the accuracy of estimates of the swimming speeds and movement patterns of the fish tracked. Even systems that track fish on multiple frequencies are not capable of continuous tracking, due to technical issues. We determined the swimming speed, area occupied, activity rhythm and movement pattern of cod (Gadus morhua) using a stationary single-channel positioning system, and analysed how estimates of these behavioural parameters were affected by the interval between successive position fixes. A single fish was tracked at a time, and position fixes were eliminated at regular intervals from the original data to generate new data sets, as if they had been collected while tracking several fish (2–16). Compared with the complete set, these data sets gave 30–70% decreases in estimates of swimming speed, depending on the number of fish supposedly being tracked. These results were similar for two individuals of different size and activity level, indicating that they can be employed as correction factors to partly compensate for underestimates of swimming speed when several fish are tracked simultaneously. Tracking 'several' fish only slightly affected the estimates of area occupied (1–15%). The diurnal activity rhythm was also similar between the data sets, whereas details in search pattern were not seen when several fish were tracked simultaneously.
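Why subsampled fixes underestimate speed is easy to demonstrate: speed is estimated from straight lines between successive fixes, and sparser fixes cut the corners of a tortuous path. A minimal sketch on an invented zigzag track (the track and sampling factor are our own, not the paper's data):

```python
import math

def path_speed(points, dt_per_fix):
    """Mean speed estimated from straight-line distances between position fixes."""
    dist = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return dist / (dt_per_fix * (len(points) - 1))

# Zigzag track sampled once per time unit: the fish weaves while moving along x.
track = [(x, 1.0 if x % 2 else 0.0) for x in range(41)]

speed_full = path_speed(track, 1.0)      # every fix used
speed_sub = path_speed(track[::4], 4.0)  # every 4th fix, as if 4 fish shared the channel
```

Here the subsampled estimate drops to about 71% of the full-resolution one, the same qualitative effect as the 30-70% decreases reported above; the exact decrease depends on how tortuous the true path is.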

12.
13.
14.
There is a need for faster and more sensitive algorithms for sequence similarity searching in view of the rapidly increasing amounts of genomic sequence data available. Parallel processing capabilities in the form of the single instruction, multiple data (SIMD) technology are now available in common microprocessors and enable a single microprocessor to perform many operations in parallel. The ParAlign algorithm has been specifically designed to take advantage of this technology. The new algorithm initially exploits parallelism to perform a very rapid computation of the exact optimal ungapped alignment score for all diagonals in the alignment matrix. Then, a novel heuristic is employed to compute an approximate score of a gapped alignment by combining the scores of several diagonals. This approximate score is used to select the most interesting database sequences for a subsequent Smith-Waterman alignment, which is also parallelised. The resulting method represents a substantial improvement compared to existing heuristics. The sensitivity and specificity of ParAlign was found to be as good as Smith-Waterman implementations when the same method for computing the statistical significance of the matches was used. In terms of speed, only the significantly less sensitive NCBI BLAST 2 program was found to outperform the new approach. Online searches are available at http://dna.uio.no/search/
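The first stage described above, the exact optimal ungapped score per diagonal, is a maximum-sum-run scan (Kadane's algorithm) along each diagonal of the alignment matrix. A scalar sketch of that computation (ParAlign performs it with SIMD across diagonals; the scoring values here are our own assumptions):

```python
def best_ungapped_score(s1, s2, match=2, mismatch=-1):
    """Optimal ungapped local alignment score over all diagonals, computed
    with a Kadane-style scan of per-position scores along each diagonal."""
    best = 0
    for offset in range(-(len(s2) - 1), len(s1)):  # every diagonal i - j = offset
        run = 0
        for j in range(len(s2)):
            i = offset + j
            if 0 <= i < len(s1):
                run = max(0, run + (match if s1[i] == s2[j] else mismatch))
                best = max(best, run)
    return best

# "ATTAC" occurs ungapped inside "GATTACA", so the best diagonal scores 5 matches.
score = best_ungapped_score("GATTACA", "ATTAC")
```

In the full method, these per-diagonal scores feed the heuristic gapped score that filters database sequences before the parallelised Smith-Waterman stage.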

15.
Two male juvenile chimpanzees were trained to reproduce from memory geometric patterns composed of lighted cells in a 3 x 3 matrix. In Experiment I, subjects reproduced 3-cell horizontal, vertical and diagonal patterns with either 0- or 5-second delay between stimulus offset and response. Diagonals were more difficult and were more affected by delay than were nondiagonal patterns. The sequence of response to diagonals was less structured than to nondiagonals. In Experiment II, more complex 4-cell patterns were used and, following training, subjects were tested for transfer to new patterns. Again, diagonals were more difficult to reproduce than nondiagonals. Transfer of training to new patterns requiring different motoric responses was successful. Similar to Experiment I, organization of responding was greater for nondiagonals than for diagonals. These results are discussed with regard to the presence of internal representation of visual information in nonhuman primates.

16.
17.
18.
19.
MOTIVATION: In molecular biology, sequence alignment is a crucial tool in studying the structure and function of molecules, as well as the evolution of species. In the segment-to-segment variation of the multiple alignment problem, the input can be seen as a set of non-gapped segment pairs (diagonals). Given a weight function that assigns a weight score to every possible diagonal, the goal is to choose a consistent set of diagonals of maximum weight. We show that the segment-to-segment multiple alignment problem is equivalent to a novel formulation of the Maximum Trace problem: the Generalized Maximum Trace (GMT) problem. Solving this problem to optimality, therefore, may improve upon the previous greedy strategies that are used for solving the segment-to-segment multiple sequence alignment problem. We show that the GMT can be stated in terms of an integer linear program and then solve the integer linear program using methods from polyhedral combinatorics. This leads to a branch-and-cut algorithm for segment-to-segment multiple sequence alignment. RESULTS: We report on our first computational experiences with this novel method and show that the program is able to find optimal solutions for real-world test examples.
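The underlying selection problem, choosing a maximum-weight consistent set of diagonals, can be shown in miniature with brute-force enumeration over subsets. This toy uses a simplified two-sequence consistency rule (non-crossing, non-overlapping diagonals) of our own devising; the paper solves the general multi-sequence problem at scale with an integer linear program and branch-and-cut.

```python
from itertools import combinations

def compatible(d1, d2):
    """Two diagonals (start1, start2, length) are consistent here if one ends
    before the other starts in BOTH sequences (no crossing or overlap)."""
    a, b = sorted((d1, d2))
    return a[0] + a[2] <= b[0] and a[1] + a[2] <= b[1]

def max_weight_trace(diagonals, weights):
    """Brute-force maximum-weight pairwise-consistent subset of diagonals."""
    best_set, best_w = (), 0
    for r in range(1, len(diagonals) + 1):
        for subset in combinations(range(len(diagonals)), r):
            if all(compatible(diagonals[i], diagonals[j])
                   for i, j in combinations(subset, 2)):
                w = sum(weights[i] for i in subset)
                if w > best_w:
                    best_set, best_w = subset, w
    return best_set, best_w

# Three candidate diagonals: d0 and d1 cross, d0 and d2 are consistent.
diags = [(0, 0, 3), (1, 5, 2), (4, 4, 2)]
weights = [5, 4, 3]
chosen, total = max_weight_trace(diags, weights)
```

The exponential enumeration here is exactly what the ILP formulation avoids: consistency becomes linear constraints over 0/1 selection variables, and branch-and-cut prunes the search to optimality.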

20.