Similar articles
1.
Clustering analysis of SAGE data using a Poisson approach (cited 3 times: 1 self-citation, 2 by others)
Serial analysis of gene expression (SAGE) data have been poorly exploited by clustering analysis owing to the lack of appropriate statistical methods that consider their specific properties. We modeled SAGE data by Poisson statistics and developed two Poisson-based distances. Their application to simulated and experimental mouse retina data shows that the Poisson-based distances are more appropriate and reliable for analyzing SAGE data than other commonly used distances or similarity measures such as Pearson correlation or Euclidean distance.
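The abstract does not give the two distance definitions, so the following is only a minimal sketch of one plausible chi-square-style form, under the assumption that a tag's counts across libraries are Poisson with means proportional to a shared cluster profile. The function names `cluster_profile` and `poisson_chi2_distance` are hypothetical and are not taken from the paper.

```python
import math


def cluster_profile(members):
    """Share of a cluster's total tag counts that falls in each library.

    members: list of count vectors (one per tag), all of equal length.
    """
    totals = [sum(col) for col in zip(*members)]
    grand = sum(totals)
    return [t / grand for t in totals]


def poisson_chi2_distance(counts, profile):
    """Chi-square-style departure of one tag from a cluster profile.

    Assumption (not from the source): the expected count in each library
    is the tag's total count times the cluster's share for that library,
    and the Poisson variance equals that expected count.
    """
    total = sum(counts)
    distance = 0.0
    for observed, share in zip(counts, profile):
        expected = total * share
        if expected == 0:
            if observed > 0:
                return math.inf  # impossible under the assumed model
            continue
        distance += (observed - expected) ** 2 / expected
    return distance
```

Under this form, a tag with counts (2, 4, 6) sits at distance zero from a cluster whose profile is (1/6, 1/3, 1/2), whatever its overall abundance. That scale-free behavior is the kind of property that Euclidean distance on raw counts does not have.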

2.

Background  

Tiling array data is hard to interpret due to noise. The wavelet transformation is a widely used technique in signal processing for elucidating the true signal from noisy data. Consequently, we attempted to denoise representative tiling array datasets for ChIP-chip experiments using wavelets. In doing this, we used specific wavelet basis functions, Coiflets, since their triangular shape closely resembles the expected profiles of true ChIP-chip peaks.

3.
Many exome sequencing studies of Mendelian disorders fail to optimally exploit family information. Classical genetic linkage analysis is an effective method for eliminating a large fraction of the candidate causal variants discovered, even in small families that lack a unique linkage peak. We demonstrate that accurate genetic linkage mapping can be performed using SNP genotypes extracted from exome data, removing the need for separate array-based genotyping. We provide software to facilitate such analyses.

4.
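With the development of high-throughput RNA sequencing (RNA-Seq) technology and the rapid decline in sequencing costs, RNA-Seq has become an important tool in biological research, giving biologists the opportunity to understand and study the transcriptome comprehensively. High-throughput sequencing is characterized by short read lengths, a certain proportion of sequencing errors, and large data volumes, so RNA-Seq data analysis differs from genome analysis and from traditional EST data analysis. This paper introduces the different sequencing platforms and the computational workflows for raw data generation and low-quality data filtering; examines short-read alignment, transcriptome assembly, functional annotation, and differential expression analysis; reviews applications of RNA-Seq in entomological research; and closes with a summary of and outlook for RNA-Seq technology.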

5.
The NARWHAL software pipeline has been developed to automate the primary analysis of Illumina sequencing data. This pipeline combines a new and flexible de-multiplexing tool with open-source aligners and automated quality assessment. The entire pipeline can be run using only one simple sample-sheet for diverse sequencing applications. NARWHAL creates a sample-oriented data structure and outperforms existing tools in speed. AVAILABILITY: https://trac.nbic.nl/narwhal/.

6.
Zhao Y, Yu H, Zhu Y, Ter-Minassian M, Peng Z, Shen H, Diao N, Chen F. PLoS ONE 2012, 7(2): e31134
Family-based association study (FBAS) has the advantages of controlling for population stratification and testing for linkage and association simultaneously. We propose a retrospective multilevel model (rMLM) approach to analyze sibship data by using genotypic information as the dependent variable. Simulated data sets were generated using the simulation of linkage and association (SIMLA) program. We compared rMLM to the sib transmission/disequilibrium test (S-TDT), sibling disequilibrium test (SDT), conditional logistic regression (CLR) and generalized estimation equations (GEE) on the measures of power, type I error, estimation bias and standard error. The results indicated that rMLM was a valid test of association in the presence of linkage using sibship data. The advantages of rMLM became more evident when the data contained concordant sibships. Compared to GEE, rMLM underestimated the odds ratio (OR) less. Our results support the application of rMLM to detect gene-disease associations using sibship data. However, caution is warranted about an increased type I error rate when there is association without linkage between the disease locus and the genotyped marker.

7.
A discrete library of linear and hydantoin-containing dipeptide derivatives, based on the Lys-Trp(Nps) scaffold, was prepared by solid-phase synthesis. SAR studies indicated that potency for TRPV1 blockade and selectivity towards NMDA are mainly dictated by the side-chain length and the basic nature of the α,ω-groups in the N-terminal residue. The 2-Nps moiety at position 2 of the Trp indole ring is preferred over the 2-pyridine one.

8.
A significant advance made in combinatorial approach research was that the emphasis shifted from simple mixing to intelligent screening, so as to improve the efficiency and accuracy of discovering new materials from a larger number of diverse compositions. In this study, the long‐lasting luminescence of SrAl2O4, which is co‐doped with Eu2+, Ce3+, Dy3+, Li+ and H3BO3, was investigated based on a combinatorial approach in conjunction with the Taguchi method. A minimal set of 16 samples to be tested (five dopants, each at four concentration levels) was designed using the Taguchi method. The samples to be screened were synthesized using a parallel combinatorial strategy based on ink‐jetting of precursors into an array of micro‐reactor wells. The relative brightness of luminescence of the different phosphors over a particular period was assessed. Ce3+ was identified as the constituent that detrimentally affected long‐lasting luminescence. Its concentration was optimized to zero. Li+ had a minor effect on long‐lasting luminescence but the main factors that contributed to the objective property (long‐lasting luminescence) were Eu2+, Dy3+ and H3BO3, and the concentrations of these dopants were optimized to 0.020, 0.030 and 0.300, respectively, for co‐doping into SrAl2O4. This study demonstrates that the utility of the combinatorial approach for evaluating the effect of components on an objective property (e.g. phosphorescence) and estimating the expected performance under the optimal conditions can be improved by the Taguchi method. Copyright © 2011 John Wiley & Sons, Ltd.
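Sixteen runs for five factors at four levels matches the size of a standard L16 (4^5) orthogonal array, although the abstract does not say which array the authors used. As an illustration only, the sketch below builds one such array from arithmetic over the four-element field GF(4) and checks its defining property. Levels 0 to 3 are placeholder indices; the actual dopant concentrations assigned to each level are not given in the abstract.

```python
from itertools import combinations, product

# Multiplication table of GF(4), with elements encoded as 0, 1, 2, 3
# (polynomials over GF(2) modulo x^2 + x + 1). Addition in GF(4) is XOR.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def l16_array():
    """Return 16 runs x 5 factors, each entry a level index 0..3.

    Columns are x, y, x + y, x + 2y and x + 3y over GF(4).
    """
    rows = []
    for x, y in product(range(4), repeat=2):
        rows.append((x, y, x ^ y, x ^ GF4_MUL[2][y], x ^ GF4_MUL[3][y]))
    return rows


def is_orthogonal(rows):
    """True if every pair of columns shows each of the 16 level pairs once."""
    n_cols = len(rows[0])
    for i, j in combinations(range(n_cols), 2):
        if len({(r[i], r[j]) for r in rows}) != 16:
            return False
    return True
```

Because every pair of factors sees each level combination exactly once, the main effect of each dopant can be estimated from 16 samples instead of the 4^5 = 1024 needed for a full factorial design.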

9.
MOTIVATION: Sequence annotations, functional and structural data on snake venom neurotoxins (svNTXs) are scattered across multiple databases and literature sources. Sequence annotations and structural data are available in the public molecular databases, while functional data are almost exclusively available in the published articles. There is a need for a specialized svNTXs database that contains NTX entries, which are organized, well annotated and classified in a systematic manner. RESULTS: We have systematically analyzed svNTXs and classified them using structure-function groups based on their structural, functional and phylogenetic properties. Using conserved motifs in each phylogenetic group, we built an intelligent module for the prediction of structural and functional properties of unknown NTXs. We also developed an annotation tool to aid the functional prediction of newly identified NTXs as an additional resource for the venom research community. AVAILABILITY: We created a searchable online database of NTX protein sequences (http://research.i2r.a-star.edu.sg/Templar/DB/snake_neurotoxin). This database can also be found under the Swiss-Prot Toxin Annotation Project website (http://www.expasy.org/sprot/).

10.
Large combinatorial libraries of random peptides have been used for a variety of applications that include analysis of protein-protein interactions, epitope mapping, and drug targeting. The major obstacle in screening these libraries is the loss of specific but low affinity binding peptides during washing steps. Loss of these specific binders often results in isolation of peptides that bind nonspecifically to components used in the selection process. Previously, it has been demonstrated that dimerizing or multimerizing a peptide can remarkably improve its binding kinetics by 10- to 1000-fold due to an avidity effect. To take advantage of this observation, we constructed a random library of 12 amino acid dimeric peptides on polyethylene glycol acrylamide (PEGA) beads by modifying the 'one-bead-one-compound' approach. The chemical synthesis of 100,000 peptides as dimers can be problematic due to steric and aggregation effects and the presence of many peptide sequences that are difficult to synthesize. We have designed a method, described in detail here, to minimize the problems inherent in the synthesis of a dimeric library by modifying the existing 'split and pool' synthetic method. Using this approach the dimeric library was used to isolate a series of peptides that bound selectively to epithelial cancer cells. One peptide with the amino acid sequence QMARIPKRLARH bound as a dimer to prostate cancer cells spiked into the blood but did not bind to circulating hematopoietic cells. The monomeric form of this peptide, however, did not bind well to the same LNCaP cell line. These data demonstrate that "hits" obtained from such a 'one-bead-one-dimer' library can be used directly for the final application or used as leads for construction of second generation libraries.

11.
李文轲, 李丰余, 张思瑶, 蔡斌, 郑娜, 聂宇, 周到, 赵倩. 《遗传》 (Hereditas (Beijing)) 2014, 36(6): 618-624
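The development of next-generation sequencing has placed heavy demands on the processing and analysis of sequencing data. Many software tools for analyzing such data exist, but the vast majority perform only a single analysis function (for example, only sequence alignment, variant calling, or functional annotation), so selecting and integrating these tools correctly and efficiently has become an urgent need. This article presents an automated pipeline, based on the Perl language and SGE resource management, for analyzing genome sequencing data from the Illumina platform. The pipeline takes raw sequence reads as input, calls industry-standard data-processing software (such as BWA, Samtools, GATK, and ANNOVAR), and ultimately produces a list of variant sites with the corresponding functional annotations, ready for further analysis by researchers. Automated parallel scripts keep the pipeline running efficiently and deliver analysis results and reports in a single pass, which reduces the manual work involved in data analysis and greatly improves running efficiency. Users need only fill in a configuration file or enter the settings through a graphical interface to complete all operations. This work offers researchers a convenient route to analyzing next-generation sequencing data.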

12.
The Nogo receptor (NgR) plays a central role in mediating growth-inhibitory activities of myelin-derived proteins, thereby severely limiting axonal regeneration after injury of the adult mammalian central nervous system (CNS). The inhibitory proteins Nogo, myelin-associated glycoprotein (MAG) and oligodendrocyte myelin glycoprotein (OMgp) all bind to the extracellular leucine-rich repeat (LRR) domain of NgR, which provides a large molecular surface for protein-protein interactions. However, epitopes within the LRR domain of NgR for binding Nogo, MAG and OMgp have not yet been revealed. Here, we report an evolutionary approach based on the ribosome display technology for detecting regions involved in ligand binding. By applying this method of "affinity fingerprinting" to the NgR ligand binding domain we were able to detect a distinct region important for binding to Nogo. Several residues defining the structural epitope of NgR involved in interaction with Nogo were subsequently confirmed by alanine scanning mutagenesis.

13.
Many different methods exist for pattern detection in gene expression data. In contrast to classical methods, biclustering has the ability to cluster a group of genes together with a group of conditions (replicates, set of patients or drug compounds). However, since the problem is NP-hard, most algorithms use heuristic search functions and therefore might converge towards local maxima. By using the results of biclustering on discrete data as a starting point for a local search function on continuous data, our algorithm avoids the problem of heuristic initialization. Similar to OPSM, our algorithm aims to detect biclusters whose rows and columns can be ordered such that row values are growing across the bicluster's columns and vice-versa. Results have been generated on the yeast genome (Saccharomyces cerevisiae), a human cancer dataset and random data. Results on the yeast genome showed that 89% of the one hundred biggest non-overlapping biclusters were enriched with Gene Ontology annotations. A comparison with OPSM and ISA demonstrated a better efficiency when using gene and condition orders. We present results on random and real datasets that show the ability of our algorithm to capture statistically significant and biologically relevant biclusters.
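The ordering property described in this abstract can be illustrated with a small check. This is a sketch of the order-preserving condition only, not of the authors' search algorithm, and the helper names `column_order_from` and `is_order_preserving` are hypothetical.

```python
def column_order_from(row):
    """Column indices sorted by increasing value in one seed row."""
    return sorted(range(len(row)), key=row.__getitem__)


def is_order_preserving(matrix, rows, col_order):
    """True if every selected row strictly increases along col_order.

    matrix: list of expression rows (genes) over conditions (columns).
    rows: indices of the genes in the candidate bicluster.
    col_order: condition indices, in the proposed order.
    """
    for r in rows:
        values = [matrix[r][c] for c in col_order]
        if any(a >= b for a, b in zip(values, values[1:])):
            return False
    return True


# Toy expression matrix (made-up values): genes 0 and 1 rise over the
# same condition order, gene 2 does not.
expr = [
    [0.1, 2.0, 0.7, 5.0],
    [1.0, 9.0, 3.0, 9.5],
    [4.0, 1.0, 2.0, 0.5],
]
order = column_order_from(expr[0])
```

Here `is_order_preserving(expr, [0, 1], order)` holds while adding gene 2 breaks it, which is the sense in which a bicluster couples a subset of genes with an ordered subset of conditions.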

14.
Advances in chemical biology have led to selection of synthetic functional nucleic acids for in vivo applications. Discovery of synthetic nucleic acid regulatory elements has been a long-standing goal of chemical biologists. Availability of vast genome level genetic resources has motivated efforts for discovery and understanding of inducible synthetic genetic regulatory elements. Such elements can lead to custom-design of switches and sensors, oscillators, digital logic evaluators and cell–cell communicators. Here, we describe a simple, robust and universally applicable module for discovery of inducible gene regulatory elements. The distinguishing feature is the use of a toxic peptide as a reporter to suppress the background of unwanted bacterial recombinants. Using this strategy, we show that it is possible to isolate genetic elements of non-genomic origin which are specifically activated in the presence of DNA gyrase A inhibitors belonging to the fluoroquinolone (FQ) group of chemicals. Further, using a system level genetic resource, we prove that the genetic regulation is exerted through the histone-like nucleoid structuring (H-NS) repressor protein. To date, there are no reports of in vivo selection of inducible, promoter-like regulatory elements of non-genomic origin. Our strategy opens an uncharted route to discover inducible synthetic regulatory elements from biologically-inspired nucleic acid sequences.

15.

Background

miRNAs play a key role in normal physiology and various diseases. miRNA profiling through next generation sequencing (miRNA-seq) has become the main platform for biological research and biomarker discovery. However, analyzing miRNA sequencing data is challenging as it needs a significant amount of computational resources and bioinformatics expertise. Several web-based analytical tools have been developed, but they are limited to processing one or a pair of samples at a time and are not suitable for a large-scale study. Lack of flexibility and reliability of these web applications are also common issues.

Results

We developed a Comprehensive Analysis Pipeline for microRNA Sequencing data (CAP-miRSeq) that integrates read pre-processing, alignment, mature/precursor/novel miRNA detection and quantification, data visualization, variant detection in miRNA coding region, and more flexible differential expression analysis between experimental conditions. Depending on their computational infrastructure, users can install the package locally or deploy it in Amazon Cloud, and can run samples sequentially or, for speedy analysis of a large number of samples, in parallel. In either case, summary and expression reports for all samples are generated for easier quality assessment and downstream analyses. Using well characterized data, we demonstrated the pipeline’s superior performance, flexibility, and practical use in research and biomarker discovery.

Conclusions

CAP-miRSeq is a powerful and flexible tool for users to process and analyze miRNA-seq data, scalable from a few to hundreds of samples. The results are presented in a convenient way for investigators or analysts to conduct further investigation and discovery.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-423) contains supplementary material, which is available to authorized users.

16.
Cyclic decapeptides were developed based on the previously reported peptide c(LysLeuLysLeuLysPheLysLeuLysGln). These compounds were active against the economically important plant pathogenic bacteria Erwinia amylovora, Pseudomonas syringae and Xanthomonas vesicatoria. A library of 56 cyclic decapeptides was prepared and screened for antibacterial activity and eukaryotic cytotoxicity, and led to the identification of peptides with improved minimum inhibitory concentration (MIC) against P. syringae (3.1-6.2 microM) and X. vesicatoria (1.6-3.1 microM). Notably, peptides active against E. amylovora (MIC of 12.5-25 microM) were found, constituting the first report of cyclic peptides with activity towards this bacterium. A second library based on the structure c(X(1)X(2)X(3)X(4)LysPheLysLysLeuGln) with X being Lys or Leu yielded peptides with optimized activity profiles. The activity against E. amylovora was further improved (MIC of 6.2-12.5 microM) and the best peptides displayed a low eukaryotic cytotoxicity at concentrations 30-120 times higher than the MIC values. A design of experiments made it possible to define rules for high antibacterial activity and low cytotoxicity: the main rule is that X(2) differs from X(3), and the secondary rule is X(4)=Lys. The best analogs can be considered as good candidates for the development of effective antibacterial agents for use in plant protection.

17.

Background  

Multilocus Sequence Typing (MLST) is a frequently used typing method for the analysis of the clonal relationships among strains of several clinically relevant microbial species. MLST is based on the sequence of housekeeping genes that result in each strain having a distinct numerical allelic profile, which is abbreviated to a unique identifier: the sequence type (ST). The relatedness between two strains can then be inferred by the differences between allelic profiles. For a more comprehensive analysis of the possible patterns of evolutionary descent, a set of rules was proposed and implemented in the eBURST algorithm. These rules allow the division of a data set into several clusters of related strains, dubbed clonal complexes, by implementing a simple model of clonal expansion and diversification. Within each clonal complex, the rules identify which links between STs correspond to the most probable pattern of descent. However, the eBURST algorithm is not globally optimized, which can result in links, within the clonal complexes, that violate the rules proposed.
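The idea of inferring relatedness from differences between allelic profiles can be sketched as follows. This is a deliberately simplified illustration, not the eBURST algorithm: it only counts differing loci and groups STs that are chained together by links of at most `max_diff` differing loci, and it makes no attempt to pick the most probable pattern of descent. The ST labels and allele numbers are invented for the example.

```python
from itertools import combinations


def locus_differences(p, q):
    """Number of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(p, q))


def clonal_groups(profiles, max_diff=1):
    """Group STs connected by chains of links with <= max_diff differences.

    profiles: dict mapping an ST label to a tuple of allele numbers.
    Returns a list of sets of ST labels (union-find over pairwise links).
    """
    parent = {st: st for st in profiles}

    def find(st):
        while parent[st] != st:
            parent[st] = parent[parent[st]]  # path halving
            st = parent[st]
        return st

    for a, b in combinations(profiles, 2):
        if locus_differences(profiles[a], profiles[b]) <= max_diff:
            parent[find(a)] = find(b)

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), set()).add(st)
    return list(groups.values())


# Hypothetical seven-locus profiles.
example = {
    "ST1": (1, 1, 1, 1, 1, 1, 1),
    "ST2": (1, 1, 1, 1, 1, 1, 2),  # differs from ST1 at one locus
    "ST3": (1, 1, 1, 1, 1, 3, 2),  # differs from ST2 at one locus
    "ST4": (5, 6, 7, 8, 9, 10, 11),
}
```

In this toy data ST1, ST2 and ST3 fall into one group because each is a single-locus variant of a neighbor, while ST4 stays alone.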

18.
19.

Background  

Whole exome capture sequencing allows researchers to cost-effectively sequence the coding regions of the genome. Although the exome capture sequencing methods have become routine and well established, there is currently a lack of tools specialized for variant calling in this type of data.

20.