Similar articles
 Found 20 similar articles; search took 31 ms
1.
Esophageal squamous-cell carcinoma (ESCC) is one of the most lethal malignancies in the world and occurs at a particularly high frequency in China. While several genome-wide association studies (GWAS) of germline variants and whole-genome or whole-exome sequencing studies of somatic mutations in ESCC have been published, no comprehensive database is publicly available for this cancer. Here, we developed the Chinese Cancer Genomic Database-Esophageal Squamous Cell Carcinoma (CCGD-ESCC) database, which contains the associations of 69,593 single nucleotide polymorphisms (SNPs) with ESCC risk in 2022 cases and 2039 controls, survival time of 1006 ESCC patients (survival GWAS) and gene expression (expression quantitative trait loci, eQTL) in 94 ESCC patients. Moreover, this database also provides the associations between 8833 somatic mutations and survival time in 675 ESCC patients. Our user-friendly database is a useful resource for biologists and oncologists, not only for identifying the associations of genetic variants or somatic mutations with the development and progression of ESCC but also for studying the underlying mechanisms of tumorigenesis. CCGD-ESCC is freely accessible at http://db.cbi.pku.edu.cn/ccgd/ESCCdb.

2.
DRTF: a database of rice transcription factors   Cited by: 7 (self-citations: 0, citations by others: 7)

3.
DPTF: a database of poplar transcription factors   Cited by: 3 (self-citations: 0, citations by others: 3)

4.
5.
He Feifei, Li Yang, Tang Yu-Hang, Ma Jian, Zhu Huaiqiu 《BMC Genomics》2016,17(1):141-151
Background

The identification of inversions of DNA segments shorter than the read length (e.g., 100 bp), defined as micro-inversions (MIs), remains challenging for next-generation sequencing reads. It is acknowledged that MIs are an important type of genomic variation and may play roles in causing genetic disease. However, current alignment methods are generally insensitive to MIs. Here we develop a novel tool, MID (Micro-Inversion Detector), to identify MIs in human genomes using next-generation sequencing reads.

Results

The algorithm of MID is designed based on a dynamic programming path-finding approach. What makes MID different from other variant detection tools is that MID can handle small MIs and multiple breakpoints within an unmapped read. Moreover, MID improves reliability in low coverage data by integrating multiple samples. Our evaluation demonstrated that MID outperforms Gustaf, which can currently detect inversions from 30 bp to 500 bp.

Conclusions

To our knowledge, MID is the first method that can efficiently and reliably identify MIs from unmapped short next-generation sequencing reads. MID is reliable on low coverage data, which makes it suitable for large-scale projects such as the 1000 Genomes Project (1KGP). MID identified previously unknown MIs from the 1KGP that overlap with genes and regulatory elements in the human genome. We also identified MIs in cancer cell lines from the Cancer Cell Line Encyclopedia (CCLE). Our tool is therefore expected to be useful for studying MIs as a type of genetic variant in the human genome. The source code can be downloaded from: http://cqb.pku.edu.cn/ZhuLab/MID.
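
To make the notion of a micro-inversion concrete, the following is a minimal sketch and not MID's algorithm: the abstract describes a dynamic programming path-finding approach, whereas this illustration uses a plain brute-force search. It assumes a read and a reference window of equal length that differ only by one reverse-complemented segment. The function names and the example sequences are hypothetical.

```python
# Illustrative brute-force search, NOT the dynamic programming method used by MID.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_micro_inversion(reference, read, min_len=2):
    """Return (start, end) such that reverse-complementing reference[start:end]
    reproduces the read, or None if no single inversion explains the read.

    Shorter inversions are tried first, so the smallest explanation wins.
    """
    if len(reference) != len(read) or reference == read:
        return None
    n = len(reference)
    for length in range(min_len, n + 1):
        for start in range(0, n - length + 1):
            end = start + length
            candidate = (
                reference[:start]
                + reverse_complement(reference[start:end])
                + reference[end:]
            )
            if candidate == read:
                return start, end
    return None


# Hypothetical example: the segment "AAC" at positions 4..6 is inverted to "GTT".
reference_window = "TTTTAACTTTT"
read_sequence = "TTTTGTTTTTT"
inversion = find_micro_inversion(reference_window, read_sequence)
```

Here `inversion` is `(4, 7)`. The cubic running time is acceptable only for read-length strings; handling unknown read positions, sequencing errors, and multiple breakpoints is what a real detector has to add.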


6.
7.
DATF: a database of Arabidopsis transcription factors   Cited by: 10 (self-citations: 0, citations by others: 10)
Guo A, He K, Liu D, Bai S, Gu X, Wei L, Luo J 《Bioinformatics (Oxford, England)》2005,21(10):2568-2569

8.
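GSDS: a gene structure display system   Cited by: 62 (self-citations: 1, citations by others: 62)
Guo Anyuan, Zhu Qihui, Chen Xin, Luo Jingchu 《遗传》(Hereditas (Beijing)) 2007,29(8):1023-1026
We built a website system (http://gsds.cbi.pku.edu.cn/) for drawing gene structure diagrams. Users can submit nucleotide sequences, NCBI nucleotide accession numbers, or exon position information to obtain a gene structure diagram, and can specify particular regions to be annotated on it. The system allows users to enter multiple genes at once and to specify the output order and the annotated regions. Results can be displayed in two graphic formats, bitmap and vector. Clicking a bitmap result shows the corresponding sequence. The system provides both Chinese and English user interfaces.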

9.
Eating disorder is a group of physiological and psychological disorders affecting approximately 1% of the female population worldwide. Although the genetic epidemiology of eating disorder is becoming increasingly clear with accumulated studies, the underlying molecular mechanisms are still unclear. Recently, integration of various high-throughput data expanded the range of candidate genes and started to generate hypotheses for understanding potential pathogenesis in complex diseases. This article presents EDdb (Eating Disorder database), the first evidence-based gene resource for eating disorder. Fifty-nine experimentally validated genes from the literature in relation to eating disorder were collected as the core dataset. Another four datasets with 2824 candidate genes across 601 genome regions were expanded based on the core dataset using different criteria (e.g., protein-protein interactions, shared cytobands, and related complex diseases). Based on human protein-protein interaction data, we reconstructed a potential molecular sub-network related to eating disorder. Furthermore, with an integrative pathway enrichment analysis of genes in EDdb, we identified an extended adipocytokine signaling pathway in eating disorder. Three genes in EDdb (ADIPO (adiponectin), TNF (tumor necrosis factor) and NR3C1 (nuclear receptor subfamily 3, group C, member 1)) link the KEGG (Kyoto Encyclopedia of Genes and Genomes) “adipocytokine signaling pathway” with the BioCarta “visceral fat deposits and the metabolic syndrome” pathway to form a joint pathway. In total, the joint pathway contains 43 genes, among which 39 genes are related to eating disorder. As the first comprehensive gene resource for eating disorder, EDdb (http://eddb.cbi.pku.edu.cn) enables the exploration of gene-disease relationships and cross-talk mechanisms between related disorders. 
Through statistical pathway analysis, we revealed that abnormal body weight caused by eating disorder and obesity may both be related to dysregulation of the novel joint pathway of adipocytokine signaling. In addition, this joint pathway may be the common pathway for body weight regulation in complex human diseases related to unhealthy lifestyle.

10.
Aspirin-exacerbated respiratory disease (AERD) remains widely underdiagnosed in asthmatics, primarily due to insufficient awareness of the relationship between aspirin ingestion and asthma exacerbation. The identification of aspirin hypersensitivity is therefore essential to avoid serious aspirin complications. The goal of the study was to develop plasma biomarkers to predict AERD. We identified differentially expressed genes in peripheral blood mononuclear cells (PBMC) between subjects with AERD and those with aspirin-tolerant asthma (ATA). The genes were matched with the secreted protein database (http://spd.cbi.pku.edu.cn/) to select candidate proteins in the plasma. Plasma levels of the candidate proteins were then measured in AERD (n = 40) and ATA (n = 40) subjects using an enzyme-linked immunosorbent assay (ELISA). Target genes were validated as AERD biomarkers using an ROC curve analysis. From 175 differentially expressed genes (p-value <0.0001) that were queried against the secreted protein database, 11 secreted proteins were retrieved. The gene expression patterns were predicted as elevated for 7 genes and decreased for 4 genes in AERD as compared with ATA subjects. Among these genes, significantly higher levels of plasma eosinophil-derived neurotoxin (RNASE2) were observed in AERD as compared with ATA subjects (70 (14.62∼311.92) µg/ml vs. 12 (2.55∼272.84) µg/ml, p-value <0.0003). Based on the ROC curve analysis, the AUC was 0.74 (p-value = 0.0001, asymptotic 95% confidence interval [lower bound: 0.62, upper bound: 0.83]) with 95% sensitivity, 60% specificity, and a cut-off value of 27.15 µg/ml. Eosinophil-derived neurotoxin represents a novel biomarker to distinguish AERD from ATA.
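
As a reading aid for the ROC figures quoted above, here is a minimal sketch of how sensitivity, specificity, and AUC are obtained from two groups of measurements. The plasma values below are invented for illustration and are not the study's data; only the cut-off of 27.15 µg/ml comes from the abstract. The function names are hypothetical, and the rule "positive when the level is at or above the cut-off" is an assumption.

```python
def sensitivity_specificity(cases, controls, cutoff):
    """Call a subject positive when the measured level is >= cutoff."""
    true_positives = sum(1 for value in cases if value >= cutoff)
    true_negatives = sum(1 for value in controls if value < cutoff)
    return true_positives / len(cases), true_negatives / len(controls)


def auc(cases, controls):
    """AUC as the probability that a random case scores above a random control.

    Ties contribute 0.5, which matches the area under the empirical ROC curve.
    """
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical plasma levels in µg/ml, invented for illustration only.
aerd_levels = [80.0, 65.0, 40.0, 30.0, 20.0]
ata_levels = [10.0, 15.0, 25.0, 35.0, 12.0]

sens, spec = sensitivity_specificity(aerd_levels, ata_levels, cutoff=27.15)
area = auc(aerd_levels, ata_levels)
```

With these made-up values, `sens` and `spec` are both 0.8 and `area` is 0.88. Lowering the cut-off raises sensitivity at the cost of specificity, which is the trade-off behind the reported 95% sensitivity and 60% specificity.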

11.
The homeobox genes are a large and diverse group of genes, many of which play important roles in the embryonic development of animals. Comparative study of homeobox genes, both within and between species, requires an evolution-based classification. HomeoDB was designed and implemented as a manually curated database to collect and present homeobox genes in an evolutionarily structured way, allowing genes, gene families and gene classes to be compared between species more readily than was previously possible. In its first release, HomeoDB includes all homeobox genes from human, amphioxus (Branchiostoma floridae) and fruitfly (Drosophila melanogaster); additional species can be added. HomeoDB is freely accessible at http://homeodb.cbi.pku.edu.cn.

12.
13.
Drug addiction is a serious worldwide problem with strong genetic and environmental influences. Different technologies have revealed a variety of genes and pathways underlying addiction; however, each individual technology can be biased and incomplete. We integrated 2,343 items of evidence from peer-reviewed publications between 1976 and 2006 linking genes and chromosome regions to addiction by single-gene strategies, microarray, proteomics, or genetic studies. We identified 1,500 human addiction-related genes and developed KARG (http://karg.cbi.pku.edu.cn), the first molecular database for addiction-related genes with extensive annotations and a friendly Web interface. We then performed a meta-analysis of 396 genes that were supported by two or more independent items of evidence to identify 18 molecular pathways that were statistically significantly enriched, covering both upstream signaling events and downstream effects. Five molecular pathways significantly enriched for all four different types of addictive drugs were identified as common pathways which may underlie shared rewarding and addictive actions, including two new ones, GnRH signaling pathway and gap junction. We connected the common pathways into a hypothetical common molecular network for addiction. We observed that fast and slow positive feedback loops were interlinked through CAMKII, which may provide clues to explain some of the irreversible features of addiction.
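
The abstract does not say which statistic was used to call a pathway "significantly enriched". One common choice is the one-sided hypergeometric test, sketched below under that assumption. The function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, pathway_genes, selected_genes, overlap):
    """P(X >= overlap) when `selected_genes` are drawn without replacement from
    `total_genes`, of which `pathway_genes` belong to the pathway.

    A small p-value suggests the pathway is over-represented in the selection.
    """
    upper = min(pathway_genes, selected_genes)
    favourable = sum(
        comb(pathway_genes, i) * comb(total_genes - pathway_genes, selected_genes - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(total_genes, selected_genes)


# Toy numbers for illustration: 10 background genes, 4 in the pathway,
# 3 genes selected, 2 of which fall in the pathway.
p_value = hypergeom_enrichment_p(
    total_genes=10, pathway_genes=4, selected_genes=3, overlap=2
)
```

For the toy numbers the result is 40/120, about 0.333. When many pathways are tested, as with the 18 reported above, the raw p-values would normally be adjusted for multiple testing; that step is omitted here.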

14.
ABCGrid: Application for Bioinformatics Computing Grid   Cited by: 1 (self-citations: 0, citations by others: 1)
We have developed a package named Application for Bioinformatics Computing Grid (ABCGrid). ABCGrid was designed for biology laboratories to use heterogeneous computing resources and access bioinformatics applications from one master node. ABCGrid is easy to install and maintain without sacrificing robustness or performance. We implement a mechanism that installs and updates all applications and databases on worker nodes automatically, reducing the workload of manual maintenance. We use a backup task method and a self-adaptive job dispatch approach to improve performance. Currently, ABCGrid integrates NCBI_BLAST, Hmmpfam and CE, running on a number of computing platforms including UNIX/Linux, Windows and Mac OS X. AVAILABILITY: The source code, executables and documents can be downloaded from http://abcgrid.cbi.pku.edu.cn

15.
Many genes are involved in the mammalian cell apoptosis pathway. These apoptosis genes often contain characteristic functional domains, and can be classified into at least 15 functional groups, according to previous reports. Using an integrated bioinformatics platform for motif or domain search from three public mammalian proteomes (International Protein Index database for human, mouse, and rat), we systematically cataloged all of the proteins involved in the mammalian apoptosis pathway. By localizing those proteins onto the genomes, we obtained a gene locus centric apoptosis gene catalog for human, mouse and rat. Further phylogenetic analysis showed that most of the apoptosis related gene loci are conserved among these three mammals. Interestingly, about one-third of apoptosis gene loci form gene clusters on mammalian chromosomes and exist in all three species, which indicates that mammalian apoptosis gene orders are also conserved. In addition, some tandem duplicated gene loci were revealed by comparing gene loci clusters in the three species. All data produced in this work were stored in a relational database and may be viewed at http://pcas.cbi.pku.edu.cn/database/apd.php.

16.
MOTIVATION: The rapid accumulation of single amino acid polymorphisms (SAPs), also known as non-synonymous single nucleotide polymorphisms (nsSNPs), brings both the opportunity and the need to understand and predict their disease association. Currently published attributes are limited and the detailed mechanisms governing the disease association of a SAP remain unclear; thus, further investigation of new attributes and improvement of the prediction are desired. RESULTS: A SAP dataset was compiled from the Swiss-Prot variant pages. We extracted and demonstrated the effectiveness of several new biologically informative attributes including the structural neighbor profiles that describe the SAP's microenvironment, nearby functional sites that measure the structure-based and sequence-based distances between the SAP site and its nearby functional sites, aggregation properties that measure the likelihood of protein aggregation and disordered regions that consider whether the SAP is located in structurally disordered regions. The new attributes provided insights into the mechanisms of the disease association of SAPs. We built a support vector machine (SVM) classifier employing a carefully selected set of new and previously published attributes. Through a strict protein-level 5-fold cross-validation, we attained an overall accuracy of 82.61% and an MCC of 0.60. Moreover, a web server was developed to provide a user-friendly interface for biologists. AVAILABILITY: The web server is available at http://sapred.cbi.pku.edu.cn/
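
The two evaluation figures above (accuracy and MCC) and the phrase "protein-level" cross-validation can be illustrated with a short sketch. It assumes that "protein-level" means all SAPs from the same protein are kept in the same fold, which the abstract implies but does not spell out. The confusion-matrix counts, protein identifiers, and function names are hypothetical.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when a marginal is empty."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


def protein_level_folds(protein_ids, k=5):
    """Assign each sample a fold index so that samples sharing a protein share a fold."""
    proteins = sorted(set(protein_ids))
    fold_of_protein = {protein: index % k for index, protein in enumerate(proteins)}
    return [fold_of_protein[protein] for protein in protein_ids]


# Hypothetical confusion-matrix counts, not the paper's results.
example_accuracy = accuracy(tp=40, tn=45, fp=5, fn=10)
example_mcc = mcc(tp=40, tn=45, fp=5, fn=10)

# Hypothetical SAP-to-protein assignment: six SAPs from three proteins.
sap_proteins = ["P1", "P1", "P2", "P3", "P3", "P3"]
folds = protein_level_folds(sap_proteins, k=2)
```

With these counts the accuracy is 0.85 and the MCC is roughly 0.70; `folds` is `[0, 0, 1, 0, 0, 0]`. Splitting by protein prevents variants of one protein from appearing in both training and test sets, which would otherwise inflate the measured accuracy.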

17.
TSdb (http://tsdb.cbi.pku.edu.cn) is the first manually curated central repository that stores formatted information on the substrates of transporters. In total, 37608 transporters with 15075 substrates from 884 organisms were curated from UniProt functional annotation. A unique feature of TSdb is that all the substrates are mapped to identifiers from the KEGG Ligand compound database. Thus, TSdb links current metabolic pathway schema with compound transporter systems via the shared compounds in the pathways. Furthermore, all the transporter substrates in TSdb are classified according to their biochemical properties, biological roles and subcellular localizations. In addition to the functional annotation of transporters, extensive compound annotation that includes inhibitor information from the KEGG Ligand and BRENDA databases has been integrated, making TSdb a useful source for the discovery of potential inhibitory mechanisms linking transporter substrates and metabolic enzymes. User-friendly web interfaces are designed for easy access, query and download of the data. Text and BLAST searches against all transporters in the database are provided. We will regularly update the substrate data with evidence from new publications.

18.
19.
MOTIVATION: We explored the feasibility of using unaligned rRNA gene sequences as DNA barcodes, based on correlation analysis of composition vectors (CVs) derived from nucleotide strings. We tested this method with seven rRNA (including 12, 16, 18, 26 and 28S) datasets from a wide variety of organisms (from archaea to tetrapods) at taxonomic levels ranging from class to species. RESULTS: Our results indicate that grouping of taxa based on CV analysis is always in good agreement with the phylogenetic trees generated by traditional approaches, although in some cases the relationships among the higher systematic groups may differ. The effectiveness of our analysis might be related to the length of and divergence among sequences in a dataset. Nevertheless, the correct grouping of sequences and accurate assignment of unknown taxa make our analysis a reliable and convenient approach for analyzing unaligned sequence datasets of various rRNAs for barcoding purposes. AVAILABILITY: The newly designed software (CVTree 1.0) is publicly available at the Composition Vector Tree (CVTree) web server http://cvtree.cbi.pku.edu.cn.
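
The idea of comparing unaligned sequences through composition vectors can be sketched as follows. This is a simplified illustration, not the CVTree implementation: it uses raw k-mer frequencies, whereas published composition vector methods typically also subtract a background expectation estimated from shorter strings, a step omitted here. The function names, the choice of k, and the example sequences are assumptions.

```python
import math
from itertools import product


def composition_vector(sequence, k=3):
    """Relative frequencies of all 4**k nucleotide k-mers, in lexicographic order.

    Windows containing characters other than A, C, G, T are skipped.
    """
    kmers = ["".join(letters) for letters in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    sequence = sequence.upper()
    total = 0
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if word in counts:
            counts[word] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[word] / total for word in kmers]


def correlation(u, v):
    """Cosine of the angle between two composition vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cv_distance(u, v):
    """Map the correlation in [-1, 1] to a distance in [0, 1]."""
    return (1.0 - correlation(u, v)) / 2.0


# Hypothetical toy sequences; real rRNA genes are far longer.
vector_a = composition_vector("AAAAAA", k=2)
vector_c = composition_vector("CCCCCC", k=2)
toy_distance = cv_distance(vector_a, vector_c)
```

The two toy sequences share no 2-mers, so `toy_distance` is 0.5. A matrix of such pairwise distances can then be passed to a standard tree-building method, with no alignment step required.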

20.