Similar literature
1.
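To find novel chitinase-encoding genes, the metagenome of microorganisms in seabed surface sediments from Shantou Bay was extracted, and PCR-DGGE was used to amplify and separate novel chitinase gene sequences. A total of 63 chitinase gene fragments were obtained. The proteins they encode share 41% to 97% similarity with sequences in the NCBI database; most fall below 70%, and only 8 exceed 70%. The most similar chitinase protein sequences in the NCBI database belong to 18 species and genera. Of the 63 fragments, 27 are most similar to the chitinase of Herpetosiphon aurantiacus ATCC 23779, and 22 (34.9%), falling into 9 species and genera, are similar to chitinase gene fragments from uncultured bacteria. These results indicate that the surface sediments of Shantou Bay contain a variety of chitinase-encoding genes, many of them not yet studied, and provide a valuable biological resource for the search for novel chitinase-encoding genes.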

2.
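Molecular cloning and analysis of the bovine prolactin genomic sequence and its full-length cDNA sequence   (Total citations: 17; self-citations: 0; citations by others: 17)
Using Long PCR and related techniques, the full-length 9388 bp bovine prolactin (bPRL) genomic sequence (GenBank accession number AF426315) was cloned for the first time. It comprises all 5 exons and 4 introns of the bPRL gene, an 854 bp upstream regulatory region at the 5′ end, and a 69 bp UTR at the 3′ end. The protein encoded by AF426315 has the GenBank identifier AAL28075 and consists of 229 amino acid residues; residues 1-30 form the signal peptide, and the mature polypeptide contains 199 residues. After a eukaryotic expression vector carrying the bPRL genomic DNA was transfected into COS-7 cells, an 804 bp bPRL cDNA sequence was obtained by RT-PCR. This sequence covers the entire ORF of the bPRL gene, demonstrating that the bPRL genomic DNA obtained in this study is transcriptionally functional. BLAST searches show that GenBank holds multiple mRNA and EST sequences of the bPRL gene and that these sequences differ at multiple SNP sites, located mainly in the downstream coding region and the 3′ UTR; none of these sites changes the properties of the corresponding amino acid residue. In addition, the 5′ region encoding the signal peptide is highly conserved.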

3.
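Objective: To obtain the growth hormone receptor (GHR) gene sequence of the Banna mini-pig inbred line (BMI), predict GHR function by bioinformatic analysis, and profile GHR mRNA expression across multiple tissues. Methods: RNA was extracted from liver tissue of the Banna mini-pig inbred line, and the GHR coding region was amplified by RT-PCR. The amplified sequence was ligated into the pMD18-T vector for cloning, sequencing, and bioinformatic analysis. Semi-quantitative PCR was used to detect differences in GHR mRNA expression among BMI tissues. Results: The BMI GHR coding region was cloned and submitted to GenBank under accession number KC999114. The CDS is 1917 bp long and encodes 638 amino acids. Bioinformatic analysis showed that, compared with the Landrace pig GHR sequence, BMI carries 4 amino acid substitutions, p.E381D, p.A409S, p.L556V, and p.A580G, all located in the intracellular domain. Multi-tissue expression profiling showed that GHR mRNA is expressed in almost all tissues: expression is highest in muscle; relatively high in small intestine, heart, liver, nerve fiber, spleen, and ovary; and relatively low in lung, stomach, brain, pancreas, and kidney. Conclusion: The full-length GHR coding region of the Banna mini-pig inbred line was successfully cloned, and bioinformatic functional analysis and tissue expression profiling were carried out, laying a foundation for further elucidating the mechanism of small stature in this line.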

4.
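Objective: To obtain the growth hormone receptor (GHR) gene sequence of the Banna mini-pig inbred line (BMI), predict GHR function by bioinformatic analysis, and profile GHR mRNA expression across multiple tissues. Methods: RNA was extracted from liver tissue of the Banna mini-pig inbred line, and the GHR coding region was amplified by RT-PCR. The amplified sequence was ligated into the pMD18-T vector for cloning, sequencing, and bioinformatic analysis. Semi-quantitative PCR was used to detect differences in GHR mRNA expression among BMI tissues. Results: The BMI GHR coding region was cloned and submitted to GenBank under accession number KC999114. The CDS is 1917 bp long and encodes 638 amino acids. Bioinformatic analysis showed that, compared with the Landrace pig GHR sequence, BMI carries 4 amino acid substitutions, p.E381D, p.A409S, p.L556V, and p.A580G, all located in the intracellular domain. Multi-tissue expression profiling showed that GHR mRNA is expressed in almost all tissues: expression is highest in muscle; relatively high in small intestine, heart, liver, nerve fiber, spleen, and ovary; and relatively low in lung, stomach, brain, pancreas, and kidney. Conclusion: The full-length GHR coding region of the Banna mini-pig inbred line was successfully cloned, and bioinformatic functional analysis and tissue expression profiling were carried out, laying a foundation for further elucidating the mechanism of small stature in this line.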

5.
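Abstract: [Objective] To clone the chitin synthase gene PstChsII of the wheat stripe rust fungus and analyze its expression level at different developmental stages of the fungus. [Methods] The cDNA and genomic sequences of PstChsII were cloned by RT-PCR and PCR, the sequences were analyzed with several bioinformatics programs, and real-time fluorescent quantitative PCR was used to analyze expression of the gene in spores, in germ tubes, and at different times after infection. [Results] The coding region of the PstChsII gene (GenBank accession number GQ329851) contains 15 introns; the open reading frame is 2727 bp long and encodes 908 amino acids. The C-terminus of the PstChsII protein contains 7 transmembrane helices, and the N-terminus contains multiple conserved domains and “QXR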

6.
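Like mRNA, ncRNA is an important class of functional molecule. Using k-tuple (k-word) content as the feature, mature ncRNA sequences of yeast were classified against and compared with the coding regions, upstream sequences, and downstream sequences of mRNA. The results show that, based on the 3-tuple content of mature ncRNA sequences and mRNA coding regions, the mean leave-one-out cross-validation (LOOCV) accuracy for classifying ncRNA and mRNA reaches 93.93%; based on the 4-tuple and 5-tuple content of upstream sequences, the accuracies are 92.49% and 92.76%, respectively; based on the 4-tuple and 5-tuple content of downstream sequences, the accuracies are 91.58% and 90.60%, respectively; and using the 4-tuple and 5-tuple content of upstream and downstream sequences together, the mean accuracies are 94.68% and 94.83%, respectively. A t-test identified the k-tuples that differ with statistical significance between the upstream and downstream sequences of ncRNA and mRNA. These results indicate that the 3-tuple content of mature ncRNA sequences and mRNA coding regions, and the 4-tuple or 5-tuple content of the upstream and downstream sequences of ncRNA and mRNA, can effectively distinguish ncRNA from mRNA. The findings not only help identify ncRNA and mRNA accurately but also help discover ncRNA-specific transcription factor binding sites.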
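
The k-tuple content feature described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name k_tuple_content, the use of overlapping windows, the restriction to the DNA alphabet ACGT, and the example sequence are all assumptions, and the abstract does not say which classifier was trained on these features.

```python
from collections import Counter
from itertools import product


def k_tuple_content(seq: str, k: int) -> dict[str, float]:
    """Return the relative frequency of every overlapping k-tuple in seq.

    Hypothetical helper: overlapping windows and the ACGT alphabet are
    assumptions made for this sketch.
    """
    seq = seq.upper()
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Skip windows containing ambiguous symbols such as N.
    windows = [w for w in windows if set(w) <= set("ACGT")]
    counts = Counter(windows)
    total = sum(counts.values())
    # Emit all 4**k tuples in a fixed order so every sequence yields
    # a feature vector of the same length.
    return {
        "".join(t): (counts["".join(t)] / total if total else 0.0)
        for t in product("ACGT", repeat=k)
    }


# Made-up example sequence; it contains 7 overlapping 3-tuples.
features = k_tuple_content("ATGATGCCC", 3)
```

Each sequence is mapped to a vector of 4^k frequencies (64 values for 3-tuples), which could then be passed to any standard classifier and evaluated by leave-one-out cross-validation.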

7.
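Like mRNA, ncRNA is an important class of functional molecule. Using κ-tuple (κ-word) content as the feature, mature ncRNA sequences of yeast were classified against and compared with the coding regions, upstream sequences, and downstream sequences of mRNA. The results show that, based on the 3-tuple content of mature ncRNA sequences and mRNA coding regions, the mean leave-one-out cross-validation (LOOCV) accuracy for classifying ncRNA and mRNA reaches 93.93%; based on the 4-tuple and 5-tuple content of upstream sequences, the accuracies are 92.49% and 92.76%, respectively; based on the 4-tuple and 5-tuple content of downstream sequences, the accuracies are 91.58% and 90.60%, respectively; and using the 4-tuple and 5-tuple content of upstream and downstream sequences together, the mean accuracies are 94.68% and 94.83%, respectively. A t-test identified the κ-tuples that differ with statistical significance between the upstream and downstream sequences of ncRNA and mRNA. These results indicate that the 3-tuple content of mature ncRNA sequences and mRNA coding regions, and the 4-tuple or 5-tuple content of the upstream and downstream sequences of ncRNA and mRNA, can effectively distinguish ncRNA from mRNA. The findings not only help identify ncRNA and mRNA accurately but also help discover ncRNA-specific transcription factor binding sites.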

8.
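An introduction to the UniProt protein database   (Total citations: 1; self-citations: 0; citations by others: 1)
Luo Jingchu. 《生物信息学》 (Chinese Journal of Bioinformatics), 2019, 17(3): 131-144
UniProt (https://www.uniprot.org/) is an internationally known protein database with three main parts: the UniProtKB knowledgebase, the UniParc archive, and the UniRef reference clusters. UniProtKB is the core of UniProt; besides protein sequence data, it contains a large amount of annotation. UniProtKB is divided into two sections, Swiss-Prot and TrEMBL. The more than 500,000 sequences in Swiss-Prot are all manually reviewed and annotated, whereas the more than 140 million sequences in TrEMBL are translations of protein-coding sequences from the EMBL nucleotide sequence database and are annotated automatically according to defined rules. UniParc merges entries for the same protein stored in different databases into a single record to avoid redundancy and assigns each sequence a unique identifier. UniRef groups the sequences in UniProtKB and UniParc by similarity into three datasets: UniRef100, UniRef90, and UniRef50. The UniProt website offers an efficient, practical advanced search system and extensive help documentation. UniProt issues a new release every 4 weeks, together with release statistics from which users can learn the database's size and update status, data categories, and species distribution, and can view statistics on general annotation, sequence feature annotation, and database cross-references. UniProt is currently the non-redundant protein sequence database with the most complete sequence data and the richest annotation worldwide, and since its creation at the beginning of this century it has been a valuable resource for the life sciences.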

9.
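A protein folding database that includes nucleic acid sequence information was built. On this basis, basic parameters describing mRNA secondary structure were calculated for the mRNA encoding each protein: stem content, loop content, folding free energy, and mRNA flexibility. The relationship between these mRNA secondary structure parameters and the folding rate of the corresponding protein was then analyzed. The results show that mRNA stem content has a significant negative correlation with protein folding rate, whereas loop content has a significant positive correlation; mRNA flexibility has a highly significant positive correlation with the folding rate of the corresponding protein. Further analysis shows that, when proteins are grouped by secondary structure class and folding type, mRNA flexibility is an important factor for the folding rate of every group, while stem content and loop content mainly affect the folding of two-state proteins. These results confirm that mRNA secondary structure plays an important role in protein folding.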

10.
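Using the diamondback moth-resistant Caixin (Brassica rapa var. parachinensis) cultivar Caixin65, the susceptible cultivar Caixin69, and 6 F2 lines as materials, SCoT polymorphism in insect-resistant Caixin lines was detected by non-denaturing polyacrylamide gel electrophoresis. The detection efficiency of this method for SCoT polymorphism was evaluated, and genetic diversity analysis and cloning of differential bands were carried out. The results show that non-denaturing polyacrylamide gels can detect SCoT polymorphism and changes in genetic diversity between parents and progeny; the bands are numerous and clear, which improves the detection efficiency of SCoT markers. Ten differential fragments from the parents were selected from the gels, cloned, and sequenced, yielding 4 sequences from the susceptible parent and 6 from the resistant parent. GENSCAN predicted that 8 of them contain gene structure elements such as promoters, terminators, and reading frames. Homology searches show that the susceptible-parent sequences are homologous to a ubiquitin carboxyl-terminal hydrolase mRNA sequence, a Chinese cabbage clone sequence, a pak choi mitochondrial pyruvate carrier protein sequence, and a cabbage genomic HDEM sequence encoding an unknown protein, respectively. The resistant-parent sequences are homologous to a pak choi RNA pseudouridine synthase 4 (mitochondrial) mRNA sequence, a mizuna mitochondrial DNA sequence, a pak choi unknown-protein mRNA sequence, a pak choi 4-coumarate:CoA ligase mRNA sequence, a Chinese cabbage clone sequence, and a Trichoderma harzianum mRNA sequence, respectively. This study improves the clarity of SCoT markers, the level of genetic diversity detection, and the precision of differential fragment cloning, making SCoT an efficient tool for cloning differential fragments in batches. It helps mine information on functional SCoT markers, supports preliminary functional genomics studies, improves the efficiency of screening and identifying superior lines, accelerates breeding, and provides a theoretical basis for further study of the mechanism of insect resistance in Caixin.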

11.
Mutations help us to understand the molecular origins of diseases. Researchers, therefore, both publish and seek disease-relevant mutations in public databases and in scientific literature, e.g. Medline. The retrieval tends to be time-consuming and incomplete. Automated screening of the literature is more efficient. We developed extraction methods (called MEMA) that scan Medline abstracts for mutations. MEMA identified 24,351 singleton mutations in conjunction with a HUGO gene name out of 16,728 abstracts. From a sample of 100 abstracts we estimated the recall for the identification of mutation-gene pairs at 35%, with a precision of 93%. Recall for the mutation detection alone was >67% with a precision rate of >96%. This shows that our system produces reliable data. The subset consisting of protein sequence mutations (PSMs) from MEMA was compared to the entries in OMIM (20,503 entries versus 6699, respectively). We found 1826 PSM-gene pairs to be in common to both datasets (cross-validated). This is 27% of all PSM-gene pairs in OMIM and 91% of those pairs from OMIM which co-occur in at least one Medline abstract. We conclude that Medline covers a large portion of the mutations known to OMIM. Another large portion could be artificially produced mutations from mutagenesis experiments. Access to the database of extracted mutation-gene pairs is available through the web pages of the EBI (refer to http://www.ebi.ac.uk/rebholz/index.html).

12.
Mining literature for protein-protein interactions   (Total citations: 7; self-citations: 0; citations by others: 7)
MOTIVATION: A central problem in bioinformatics is how to capture information from the vast current scientific literature in a form suitable for analysis by computer. We address the special case of information on protein-protein interactions, and show that the frequencies of words in Medline abstracts can be used to determine whether or not a given paper discusses protein-protein interactions. For those papers determined to discuss this topic, the relevant information can be captured for the Database of Interacting Proteins. Furthermore, suitable gene annotations can also be captured. RESULTS: Our Bayesian approach scores Medline abstracts for probability of discussing the topic of interest according to the frequencies of discriminating words found in the abstract. More than 80 discriminating words (e.g. complex, interaction, two-hybrid) were determined from a training set of 260 Medline abstracts corresponding to previously validated entries in the Database of Interacting Proteins. Using these words and a log likelihood scoring function, approximately 2000 Medline abstracts were identified as describing interactions between yeast proteins. This approach now forms the basis for the rapid expansion of the Database of Interacting Proteins.
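
The idea of scoring an abstract by the log likelihood of its discriminating words can be sketched as follows. This is an illustration under stated assumptions, not the authors' scoring function: the names train_log_odds and score, the per-document (presence/absence) word model, the add-alpha smoothing, and the toy training texts are all hypothetical.

```python
import math
from collections import Counter


def train_log_odds(pos_docs, neg_docs, alpha=1.0):
    """Per-word log-likelihood ratios from document frequencies.

    Assumption: a word counts once per document, with add-alpha smoothing.
    """
    pos_df = Counter(w for d in pos_docs for w in set(d.lower().split()))
    neg_df = Counter(w for d in neg_docs for w in set(d.lower().split()))
    vocab = set(pos_df) | set(neg_df)
    return {
        w: math.log((pos_df[w] + alpha) / (len(pos_docs) + 2 * alpha))
        - math.log((neg_df[w] + alpha) / (len(neg_docs) + 2 * alpha))
        for w in vocab
    }


def score(abstract, log_odds):
    """Sum the log-odds of the distinct known words in an abstract."""
    return sum(log_odds.get(w, 0.0) for w in set(abstract.lower().split()))


# Toy training texts invented for this sketch (not real Medline abstracts).
pos = ["two-hybrid interaction complex", "protein interaction complex detected"]
neg = ["genome sequence assembly", "sequence alignment method"]
weights = train_log_odds(pos, neg)

on_topic = score("a novel interaction complex", weights)
off_topic = score("sequence assembly pipeline", weights)
```

A positive score means the abstract's words are more typical of the interaction-related training texts; in practice a threshold would be chosen on held-out data.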

13.
14.
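Local updating of an EST database and sequence preprocessing analysis under Unix   (Total citations: 1; self-citations: 1; citations by others: 0)
Using the text-filtering commands of the FreeBSD operating system, EST sequences from NCBI's GenBank database were imported into a local MySQL database that can be kept up to date, which facilitates the analysis of gene expression in different tissues and organs of different species. Taking rice EST data as an example, poly(A/T) tails and vector sequences at both ends of the EST sequences were also identified and removed. The preprocessed EST data provide a reliable basis for subsequent EST clustering and gene expression analysis.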
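
The poly(A/T) removal step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the pipeline from the paper (which relied on FreeBSD text-filtering commands): the function name trim_poly_tails, the minimum run length of 10, and the example reads are assumptions, and vector-sequence screening is not covered.

```python
import re

# Assumed threshold: a terminal run of at least this many A's or T's is a tail.
MIN_RUN = 10


def trim_poly_tails(seq: str, min_run: int = MIN_RUN) -> str:
    """Remove homopolymer A or T runs from both ends of an EST read."""
    seq = seq.upper()
    run = rf"(?:A{{{min_run},}}|T{{{min_run},}})"
    seq = re.sub(rf"^{run}", "", seq)  # tail at the 5' end
    seq = re.sub(rf"{run}$", "", seq)  # tail at the 3' end
    return seq


# Made-up reads for illustration.
trimmed = trim_poly_tails("T" * 12 + "GCATGC" + "A" * 12)
untouched = trim_poly_tails("GCATAAAGC")
```

Short internal A or T stretches are left alone because the pattern is anchored to the sequence ends and requires at least min_run identical bases.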

15.
16.
MicroRNAs (miRNAs) regulate a wide range of cellular and developmental processes through gene expression suppression or mRNA degradation. Experimentally validated miRNA gene targets are often reported in the literature. In this paper, we describe miRTex, a text mining system that extracts miRNA-target relations, as well as miRNA-gene and gene-miRNA regulation relations. The system achieves good precision and recall when evaluated on a literature corpus of 150 abstracts with F-scores close to 0.90 on the three different types of relations. We conducted full-scale text mining using miRTex to process all the Medline abstracts and all the full-length articles in the PubMed Central Open Access Subset. The results for all the Medline abstracts are stored in a database for interactive query and file download via the website at http://proteininformationresource.org/mirtex. Using miRTex, we identified genes potentially regulated by miRNAs in Triple Negative Breast Cancer, as well as miRNA-gene relations that, in conjunction with kinase-substrate relations, regulate the response to abiotic stress in Arabidopsis thaliana. These two use cases demonstrate the usefulness of miRTex text mining in the analysis of miRNA-regulated biological processes.

17.
Background: In the field of bioinformatics interchangeable data formats based on XML are widely used. XML-type data is also at the core of most web services. With the increasing amount of data stored in XML comes the need for storing and accessing the data. In this paper we analyse the suitability of different database systems for storing and querying large datasets in general and Medline in particular. Results: All reviewed database systems perform well when tested with small to medium sized datasets; however, when the full Medline dataset is queried, a large variation in query times is observed. Conclusions: There is not one system that is vastly superior to the others in this comparison and, depending on the database size and the query requirements, different systems are most suitable. The best all-round solution is the Oracle 11g database system using the new binary storage option. Alias-i's Lingpipe is a more lightweight, customizable and sufficiently fast solution. It does however require more initial configuration steps. For data with a changing XML structure Sedna and BaseX as native XML database systems or MySQL with an XML-type column are suitable.

18.
19.
The GoSh database is a collection of 58 990 Capra hircus and Ovis aries expressed sequence tags. A Perl pipeline was prepared to process sequences, and data were collected in a MySQL database. A PHP-based web interface allows browsing and querying the database. Putative single nucleotide polymorphism (SNP) detection, as well as a search for repeats, was performed, and links to external related resources were provided. Sequences were annotated against three different databases and an algorithm was implemented to create statistics of the distribution of retrieved homologous ontologies in the Gene Ontology categories. The GoSh database is a repository of data and links related to goat and sheep expressed genes. AVAILABILITY: The GoSh database is available at http://www.itb.cnr.it/gosh/

20.
GenBank.   (Total citations: 5; self-citations: 2; citations by others: 3)
The GenBank sequence database continues to expand its data coverage, quality control, annotation content and retrieval services. GenBank is comprised of DNA sequences submitted directly by authors as well as sequences from the other major public databases. An integrated retrieval system, known as Entrez, contains data from GenBank and from the major protein sequence and structural databases, as well as related MEDLINE abstracts. Users may access GenBank over the Internet through the World Wide Web and through special client-server programs for text and sequence similarity searching. FTP, CD-ROM and e-mail servers are alternate means of access.
