1.

From RNA-seq reads to differential expression results

Oshlack A Robinson MD Young MD 《Genome biology》2010,11(12):220

Many methods and tools are available for preprocessing high-throughput RNA sequencing data and detecting differential expression.

2.

RNASequel: accurate and repeat tolerant realignment of RNA-seq reads

Gavin W. Wilson Lincoln D. Stein 《Nucleic acids research》2015,43(18):e122

3.

Haplotype estimation from fuzzy genotypes using penalized likelihood

Uh HW Eilers PH 《PloS one》2011,6(9):e24219

The Composite Link Model is a generalization of the generalized linear model in which expected values of observed counts are constructed as a sum of generalized linear components. When combined with penalized likelihood, it provides a powerful and elegant way to estimate haplotype probabilities from observed genotypes. Uncertain ("fuzzy") genotypes, like those resulting from AFLP scores, can be handled by adding an extra layer to the model. We describe the model and the estimation algorithm. We apply it to a data set of accurate human single nucleotide polymorphism (SNP) genotypes and to a data set of fuzzy tomato AFLP scores.

4.

CLASS2: accurate and efficient splice variant annotation from RNA-seq reads

Li Song Sarven Sabunciyan Liliana Florea 《Nucleic acids research》2016,44(10):e98

5.

CRAC: an integrated approach to the analysis of RNA-seq reads

Nicolas Philippe Mikaël Salson Thérèse Commes Eric Rivals 《Genome biology》2013,14(3):R30

A large number of RNA-sequencing studies set out to predict mutations, splice junctions or fusion RNAs. We propose a method, CRAC, that integrates genomic locations and local coverage to enable such predictions to be made directly from RNA-seq read analysis. A k-mer profiling approach detects candidate mutations, indels and splice or chimeric junctions in each single read. CRAC increases precision compared with existing tools, reaching 99.5% for splice junctions, without losing sensitivity. Importantly, CRAC predictions improve with read length. In cancer libraries, CRAC recovered 74% of validated fusion RNAs and predicted novel recurrent chimeric junctions. CRAC is available at http://crac.gforge.inria.fr.
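As a rough illustration of what a k-mer profile over a single read can look like, the sketch below counts how many times each k-mer of a read occurs across a collection of reads. This is not CRAC's implementation: the function names `kmers` and `support_profile`, the toy reads, and the choice k = 3 are all assumptions made for illustration. The idea is only that a dip in the profile hints at a position where the read disagrees with the rest of the library.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def support_profile(read, reads, k):
    """For each k-mer of `read`, count its occurrences across all `reads`.

    Hypothetical helper: a low count at some position suggests that the
    read differs from the other reads around that position.
    """
    counts = Counter(km for r in reads for km in kmers(r, k))
    return [counts[km] for km in kmers(read, k)]


# Toy data (assumed): the third read carries a single substitution (T -> A).
reads = ["ACGTAC", "ACGTAC", "ACGAAC"]
profile_ok = support_profile("ACGTAC", reads, 3)       # [3, 2, 2, 2]
profile_variant = support_profile("ACGAAC", reads, 3)  # [3, 1, 1, 1]
```

In this toy case the k-mers overlapping the substituted base are seen only once, while the shared prefix k-mer is seen in all three reads.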

6.

UnSplicer: mapping spliced RNA-seq reads in compact genomes and filtering noisy splicing

Paul D. Burns Yang Li Jian Ma Mark Borodovsky 《Nucleic acids research》2014,42(4):e25

Accurate mapping of spliced RNA-Seq reads to genomic DNA is known to be a challenging problem. Despite significant efforts invested in developing efficient algorithms, with the human genome as a primary focus, the best solution is still not known. A recently introduced tool, TrueSight, has demonstrated better performance compared with earlier developed algorithms such as TopHat and MapSplice. To improve detection of splice junctions, TrueSight uses information on statistical patterns of nucleotide ordering in intronic and exonic DNA. This line of research led to yet another new algorithm, UnSplicer, designed for eukaryotic species with compact genomes where functional alternative splicing is likely to be dominated by splicing noise. Genome-specific parameters of the new algorithm are generated by GeneMark-ES, an ab initio gene prediction algorithm based on unsupervised training. UnSplicer shares several components with TrueSight; the difference lies in the training strategy and the classification algorithm. We tested UnSplicer on RNA-Seq data sets of Arabidopsis thaliana, Caenorhabditis elegans, Cryptococcus neoformans and Drosophila melanogaster. We have shown that splice junctions inferred by UnSplicer are in better agreement with knowledge accumulated on these well-studied genomes than predictions made by earlier developed tools.

7.

Predicting survival times for neuroblastoma patients using RNA-seq expression profiles

Tyler Grimes Alejandro R. Walker Susmita Datta Somnath Datta 《Biology direct》2018,13(1):11

8.

Isoform abundance inference provides a more accurate estimation of gene expression levels in RNA-seq

Wang X Wu Z Zhang X 《Journal of bioinformatics and computational biology》2010,8(Z1):177-192

9.

A model based criterion for gene expression calls using RNA-seq data

Günter P. Wagner Koryu Kin Vincent J. Lynch 《Theorie in den Biowissenschaften》2013,132(3):159-164

10.

Haplotype analysis of the human beta-globin gene complex using multiple locus specific oligonucleotide probes (Total citations: 1; self-citations: 0; citations by others: 1)

G Nozari S Rahbar R B Wallace 《Analytical biochemistry》1988,172(1):180-184

Three oligonucleotide probes complementary to specific DNA sequences of the six human globin genes (epsilon, G gamma, A gamma, psi beta, delta, beta) were synthesized. The oligonucleotides were used either singly or in combination as hybridization probes to determine the haplotype of the human beta-globin gene cluster employing the four conventionally used restriction endonucleases HincII, HindIII, AvaII, and BamHI, in addition to HpaI. Polymorphism in the epsilon- and psi beta-genes (HincII) can be simultaneously determined with a single probe mixture. One of the probes complementary to both the psi beta- and gamma-genes is useful for determining both HindIII and HincII polymorphisms. The advantages of these probes relative to conventional cDNA probes are discussed.

11.

Annotation of metagenome short reads using proxygenes

Dalevi D Ivanova NN Mavromatis K Hooper SD Szeto E Hugenholtz P Kyrpides NC Markowitz VM 《Bioinformatics (Oxford, England)》2008,24(16):i7-13

MOTIVATION: A typical metagenome dataset generated using a 454 pyrosequencing platform consists of short reads sampled from the collective genome of a microbial community. The amount of sequence in such datasets is usually insufficient for assembly, and traditional gene prediction cannot be applied to unassembled short reads. As a result, analysis of such datasets usually involves comparisons in terms of relative abundances of various protein families. The latter requires assignment of individual reads to protein families, which is hindered by the fact that short reads contain only a fragment, usually small, of a protein. RESULTS: We have considered the assignment of pyrosequencing reads to protein families directly using RPS-BLAST against COG and Pfam databases and indirectly via proxygenes that are identified using BLASTx searches against protein sequence databases. Using simulated metagenome datasets as benchmarks, we show that the proxygene method is more accurate than the direct assignment. We introduce a clustering method which significantly reduces the size of a metagenome dataset while maintaining a faithful representation of its functional and taxonomic content.

12.

Characterizing and annotating the genome using RNA-seq data (Total citations: 2; self-citations: 0; citations by others: 2)

Geng Chen Tieliu Shi Leming Shi 《Science China Life Sciences》

13.

Circular RNA expression profiles and features in human tissues: a study using RNA-seq data

Tianyi Xu Jing Wu Ping Han Zhongming Zhao Xiaofeng Song 《BMC genomics》2017,18(6):680

Background

Circular RNA (circRNA) is one type of noncoding RNA that forms a covalently closed continuous loop. Similar to long noncoding RNA (lncRNA), circRNA can act as microRNA (miRNA) ‘sponges’ to regulate gene expression, and its abnormal expression is related to diseases such as atherosclerosis, nervous system disorders and cancer. So far, there have been no systematic studies on circRNA abundance and expression profiles in human adult and fetal tissues.

Results

We explored circRNA expression profiles using RNA-seq data for six adult and fetal normal tissues (colon, heart, kidney, liver, lung, and stomach) and four gland normal tissues (adrenal gland, mammary gland, pancreas, and thyroid gland). A total of 8120, 25,933 and 14,433 circRNAs were detected by at least two supporting junction reads in adult, fetal and gland tissues, respectively. Among them, 3092, 14,241 and 6879 circRNAs were novel when compared to the published results. In each adult tissue type, we found at least 1000 circRNAs, among which 36.97–50.04% were tissue-specific. We reported 33 circRNAs that were ubiquitously expressed in all the adult tissues we examined. To further explore the potential “housekeeping” function of these circRNAs, we constructed a circRNA-miRNA-mRNA regulatory network containing 17 circRNAs, 22 miRNAs and 90 mRNAs. Furthermore, we found that both the abundance and the relative expression level of circRNAs were higher in fetal tissue than adult tissue. The number of circRNAs in gland tissues, especially in mammary gland (9665 circRNA candidates), was higher than that of other adult tissues (1160–3777).

Conclusions

We systematically investigated circRNA expression in a variety of human adult and fetal tissues. Our observation of different circRNA expression levels in adult and fetal tissues suggested that circRNAs might play their role in a tissue-specific and development-specific fashion. Analysis of the circRNA-miRNA-mRNA network provided potential targets of circRNAs. The high expression level of circRNAs in mammary gland might be attributed to the rich innervation.

14.

Genome assembly using Nanopore-guided long and error-free DNA reads

Mohammed-Amin Madoui Stefan Engelen Corinne Cruaud Caroline Belser Laurie Bertrand Adriana Alberti Arnaud Lemainque Patrick Wincker Jean-Marc Aury 《BMC genomics》2015,16(1)

Background

Long-read sequencing technologies were launched a few years ago, and in contrast with short-read sequencing technologies, they offered a promise of solving assembly problems for large and complex genomes. Moreover, by providing long-range information, they could also solve haplotype phasing. However, existing long-read technologies still have several limitations that complicate their use for most research laboratories, as well as in large and/or complex genome projects. In 2014, Oxford Nanopore released the MinION® device, a small and low-cost single-molecule nanopore sequencer, which offers the possibility of sequencing long DNA fragments.

Results

The assembly of long reads generated using the Oxford Nanopore MinION® instrument is challenging, as existing assemblers were not implemented to deal with long reads exhibiting an error rate close to 30%. Here, we presented a hybrid approach developed to take advantage of data generated using the MinION® device. We sequenced a well-known bacterium, Acinetobacter baylyi ADP1, and applied our method to obtain a highly contiguous (one single contig) and accurate genome assembly even in repetitive regions, in contrast to an Illumina-only assembly. Our hybrid strategy was able to generate NaS (Nanopore Synthetic-long) reads up to 60 kb that aligned entirely and with no error to the reference genome and that spanned highly conserved repetitive regions. The average accuracy of NaS reads reached 99.99% without losing the initial size of the input MinION® reads.

Conclusions

We described the NaS tool, a hybrid approach allowing the sequencing of microbial genomes using the MinION® device. Our method, based ideally on 20x and 50x of NaS and Illumina reads respectively, provides an efficient and cost-effective way of sequencing microbial or small eukaryotic genomes in a very short time even in small facilities. Moreover, we demonstrated that although the Oxford Nanopore technology is a relatively new sequencing technology, currently with a high error rate, it is already useful in the generation of high-quality genome assemblies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1519-z) contains supplementary material, which is available to authorized users.

15.

Isoform-level microRNA-155 target prediction using RNA-seq

Deng N Puetter A Zhang K Johnson K Zhao Z Taylor C Flemington EK Zhu D 《Nucleic acids research》2011,39(9):e61

16.

Computational methods for transcriptome annotation and quantification using RNA-seq (Total citations: 2; self-citations: 0; citations by others: 2)

Garber M Grabherr MG Guttman M Trapnell C 《Nature methods》2011,8(6):469-477

17.

Haplotype sharing analysis using mantel statistics

Beckmann L Thomas DC Fischer C Chang-Claude J 《Human heredity》2005,59(2):67-78

OBJECTIVE: The potential value of haplotypes has attracted widespread interest in the mapping of complex traits. Haplotype sharing methods take the linkage disequilibrium information between multiple markers into account, and may have good power to detect predisposing genes. We present a new approach based on Mantel statistics for space-time clustering, which is developed in order to improve the power of haplotype sharing analysis for gene mapping in complex disease. METHODS: The new statistic correlates genetic similarity and phenotypic similarity across pairs of haplotypes for case-only and case-control studies. The genetic similarity is measured as the shared length between haplotypes around a putative disease locus. The phenotypic similarity is measured as the mean-corrected cross-product based on the respective phenotypes. We analyzed two tests for statistical significance with respect to type I error: (1) assuming asymptotic normality, and (2) using a Monte Carlo permutation procedure. The results were compared to the χ² test for association based on 3-marker haplotypes. RESULTS: The results of the type I error rates for the Mantel statistics using the permutational procedure yielded pointwise valid tests. The approach based on the assumption of asymptotic normality was seriously liberal. CONCLUSION: Power comparisons showed that the Mantel statistics were better than or equal to the χ² test for all simulated disease models.
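The statistic described in METHODS can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the genetic similarity matrix is taken as given (the abstract defines it as shared haplotype length, which is not computed here), the function names and toy data are hypothetical, and the p-value uses the common (exceedances + 1) / (permutations + 1) convention, which the abstract does not specify.

```python
import random


def mantel_statistic(genetic_sim, phenotypes):
    """Sum over pairs i < j of genetic similarity times the
    mean-corrected cross-product of the two phenotypes."""
    n = len(phenotypes)
    mean = sum(phenotypes) / n
    centered = [y - mean for y in phenotypes]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += genetic_sim[i][j] * centered[i] * centered[j]
    return total


def mantel_permutation_test(genetic_sim, phenotypes, n_perm=999, seed=0):
    """Monte Carlo permutation test: shuffle phenotype labels and count
    how often the permuted statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = mantel_statistic(genetic_sim, phenotypes)
    shuffled = list(phenotypes)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mantel_statistic(genetic_sim, shuffled) >= observed:
            exceed += 1
    # Assumed convention: add one to numerator and denominator.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


# Toy data (assumed): haplotypes 0-1 and 2-3 share a segment of length 5.
genetic_sim = [
    [0, 5, 0, 0],
    [5, 0, 0, 0],
    [0, 0, 0, 5],
    [0, 0, 5, 0],
]
phenotypes = [1, 1, 0, 0]  # 1 = case, 0 = control
observed, p_value = mantel_permutation_test(genetic_sim, phenotypes)
```

With four haplotypes the permutation distribution is tiny, so the p-value here is only a demonstration of the mechanics; a realistic analysis would use many more haplotypes.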

18.

Bovine tryptases. cDNA cloning, tissue specific expression and characterization of the lung isoform.

Alessandra Gambacurta Laura Fiorucci Paolo Basili Fulvio Erba Angela Amoresano Franca Ascoli 《European journal of biochemistry》2003,270(3):507-517

A complementary DNA encoding a new bovine tryptase isoform (here named BLT) was cloned and sequenced from lung tissue. Sequence analysis indicates the presence of a 26-amino acid prepro-sequence and a 245-amino acid catalytic domain. It contains six different residues when compared with the previously characterized tryptase from bovine liver capsule (BLCT), with the most significant difference residing at the primary specificity S1 pocket. In BLT, the canonical residues Asp-Ser are present at positions 188-189, while in BLCT these positions are occupied by residues Asn-Phe. This finding was confirmed by mass fingerprinting of the peptide mixture obtained upon in-gel tryptic digestion of BLT. Analysis by gel filtration of the purified protein shows that BLT is probably tetrameric, similar to the previously identified tryptases from other species, with the monomer migrating as 35-40 kDa multiple bands in SDS/PAGE. As expected, the catalytic abilities of the two bovine tryptases are different. The specificity constant values (kcat/Km) assayed with model substrates are 10- to 60-fold higher in the case of BLT. The tissue-specific expression of the two tryptases was evaluated at the RNA level by analysis of their different restriction patterns. In lung, only BLT was found to be expressed, while in liver capsule only BLCT is present. Both isoforms are distributed in similar amounts in heart and spleen. Analysis of the two gene sequences reveals the presence of several recognition sequences in the promoter regions and suggests a role for hormones in governing the mechanism of tissue expression of bovine tryptases.

19.

Comparative assessment of methods for the computational inference of transcript isoform abundance from RNA-seq data

Alexander Kanitz Foivos Gypas Andreas J. Gruber Andreas R. Gruber Georges Martin Mihaela Zavolan 《Genome biology》2015,16(1)

20.

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 (Total citations: 1; self-citations: 0; citations by others: 1)

Michael I Love Wolfgang Huber Simon Anders 《Genome biology》2014,15(12)

In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0550-8) contains supplementary material, which is available to authorized users.
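
The idea of shrinking fold changes, mentioned in the DESeq2 abstract above, can be illustrated with a generic normal-prior shrinkage rule. This is a textbook sketch and not DESeq2's actual procedure or API: the function name `shrink_log_fold_change` and its parameters are hypothetical, and it assumes a zero-centered normal prior with known standard deviation. It shows only the qualitative behavior, namely that noisy estimates (large standard error) are pulled strongly toward zero while precise ones are left almost unchanged.

```python
def shrink_log_fold_change(lfc, std_err, prior_sd):
    """Posterior mean of a log fold change under a zero-centered normal prior.

    lfc      : unshrunken log fold change estimate
    std_err  : standard error of that estimate
    prior_sd : assumed standard deviation of the prior on true effects
    """
    prior_var = prior_sd ** 2
    # Weight is near 1 for precise estimates and near 0 for noisy ones.
    weight = prior_var / (prior_var + std_err ** 2)
    return weight * lfc


# Same raw estimate, different precision (values assumed for illustration).
precise = shrink_log_fold_change(2.0, std_err=0.0, prior_sd=1.0)  # 2.0
noisy = shrink_log_fold_change(2.0, std_err=1.0, prior_sd=1.0)    # 1.0
```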