To maximize protein identification by peptide mass fingerprinting, noise peaks must be removed from spectra, and recalibration is often required. Preprocessing the spectra before database searching is essential but time-consuming. Moreover, the optimal database search parameters often vary across a batch of samples. For high-throughput protein identification, these factors should be set automatically, with little or no human intervention. The present work describes automated batch filtering and recalibration using a statistical filter. The filter is combined with multiple data searches that are performed automatically. Using several hundred protein digests, we show that protein identification rates could be more than doubled compared to standard database searching. Furthermore, automated large-scale in-gel digestion of proteins with endoproteinase LysC and matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) analysis, followed by trypsin digestion and a second MALDI-TOF analysis, were performed. Several proteins could be identified only after digestion with one of the enzymes, and some less significant protein identifications were confirmed after digestion with the other enzyme. The results indicate that identification of small and low-abundance proteins in particular could be significantly improved by sequential digestion with two enzymes.
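
The two preprocessing steps named in this abstract, recalibration and peak filtering, can be illustrated with a minimal sketch. It is not the statistical filter the authors describe, whose details the abstract does not give. It assumes a simple linear mass correction fitted to calibrant peaks and a ppm-window removal of known contaminant masses. All function names and all numeric values in the usage note are hypothetical.

```python
def fit_linear_calibration(observed, reference):
    """Least-squares fit of reference ~= a * observed + b (assumed linear drift)."""
    n = len(observed)
    if n < 2 or n != len(reference):
        raise ValueError("need at least two matched calibrant peaks")
    mean_x = sum(observed) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, reference))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def recalibrate(peaks, a, b):
    """Apply the fitted linear correction to every peak in a spectrum."""
    return [a * mz + b for mz in peaks]


def remove_known_peaks(peaks, contaminants, tol_ppm):
    """Drop peaks lying within tol_ppm of any known contaminant mass."""
    kept = []
    for mz in peaks:
        is_noise = any(abs(mz - c) / c * 1e6 <= tol_ppm for c in contaminants)
        if not is_noise:
            kept.append(mz)
    return kept
```

In use, one would fit `a, b` on calibrant peaks such as `[1000.5, 2000.9]` against reference masses `[1000.0, 2000.0]` (made-up numbers), recalibrate the whole peak list, and only then filter, so that the ppm window is applied to corrected masses.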

Identification of anonymous proteins from two-dimensional (2-D) gels by peptide mass fingerprinting is one area of proteomics that can greatly benefit from a simple, automated workflow to minimize sample contamination and facilitate high-throughput sample processing. In this investigation we outline a workflow employing robotic automation at each step subsequent to 2-D gel electrophoresis. As proof of concept, 96 protein spots from a 2-D gel were analyzed using this approach. Whole protein (1 mg) from mature, dry soybean (Glycine max [L.] Merr.) cv. Jefferson seed was resolved by high-resolution 2-D gel electrophoresis. Approximately 150 proteins were observed after staining with Coomassie Blue. The rather low number of detected proteins was due to the fact that the dynamic range of protein expression was greater than 100-fold. The most abundant proteins were seed storage proteins, which in total represented over 60% of soybean seed protein. Using peptide mass fingerprinting, 44 protein spots were identified. Identification of soybean proteins was greatly aided by the use of annotated, contiguous Expressed Sequence Tag (EST) databases, which are available for public access (UniGene, ftp.ncbi.nih.gov/repository/UniGene/). Searches were orders of magnitude faster than searches of unannotated EST databases and resulted in a higher frequency of valid, high-scoring matches. Some abundant non-seed-storage proteins identified in this investigation include an isoelectric series of sucrose binding proteins, alcohol dehydrogenase, and seed maturation proteins. This survey of anonymous seed proteins will serve as the basis for future comparative analysis of seed filling in soybean as well as comparisons with other soybean varieties.

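EST markers of Chinese cabbage and their transferability to rapeseed
忻雅, 崔海瑞, 张明龙, 林容杓, 崔水莲. 《遗传》, 2005, 27(3): 410-416
Based on expressed sequence tags (ESTs) of Chinese cabbage, 28 primer pairs were designed. After testing parameters such as the concentrations of primers, dNTPs, and MgCl2 and the annealing temperature, a suitable PCR reaction system was established. Under this system, with DNA from Chinese cabbage inbred line A (the line used to construct the ESTs) as template, the designed primers were screened, and 18 primer pairs amplified products from Chinese cabbage DNA. The selected primers were then used for PCR amplification of 17 Chinese cabbage-type cultivars, and the polymorphism of the products was analyzed by agarose gel electrophoresis. Ten primer pairs were polymorphic, accounting for 55.6% of the selected primers. To test the transferability of the Chinese cabbage EST markers, the designed primers were further used for PCR amplification of DNA from different rapeseed cultivars. Of the 28 primer pairs tested, 24 amplified products (85.7% of all primers) and 18 showed polymorphism (64.3% of all primers). All 18 primer pairs that amplified products from Chinese cabbage DNA were usable in rapeseed, and 13 of them produced polymorphism. Of the 10 primer pairs that gave no product in Chinese cabbage, 6 amplified products in rapeseed, 5 of which showed polymorphism. The results demonstrate that developing molecular markers from ESTs is feasible and that such markers are transferable to closely related species.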

Alternative splicing is generally accepted as a mechanism that explains the discrepancy between the number of genes and proteins. We used peptide mass fingerprinting with a theoretical database and scoring method to discover and identify alternative splicing isoforms. Our theoretical database was built using published alternative splicing databases such as ECgene, H-DBAS, and TISA. According to our theoretical database of 190,529 isoforms, 37% of human genes have multiple isoforms. The isoforms produced from a gene partially share common peptide fragments because they have common exons, making it difficult to distinguish isoforms. Therefore, we developed a new method that effectively distinguishes the true isoform among multiple isoforms of a gene. To evaluate our algorithm, we made test sets for 4226 protein isoforms randomly extracted from our theoretical database. Our scoring algorithm identified 94% of the true isoforms.

During cell line development for an IgG1 antibody candidate (mAb1), a C-terminal extension was identified in 2 product candidate clones expressed in a CHO-K1 cell line. The extension was initially observed as anomalous new peaks in these clones after analysis by cation exchange chromatography (CEX-HPLC) and reduced capillary electrophoresis (rCE-SDS). Reduced mass analysis of these CHO-K1 clones revealed that a larger than expected mass was present on a sub-population of the heavy chain species, which could not be explained by any known chemical or post-translational modifications. It was suspected that this additional mass on the heavy chain was due to the presence of an additional amino acid sequence. To identify the suspected additional sequence, de novo sequencing in combination with proteomic searching was performed against translated DNA vectors for the heavy chain and light chain. Peptides unique to the clones containing the extension were identified matching short sequences (corresponding to 9 and 35 amino acids, respectively) from 2 non-coding sections of the light chain vector construct. After investigation, this extension was found to be due to a rearrangement of the DNA construct, with the addition of amino acids derived from the light chain vector non-translated sequence to the C-terminus of the heavy chain. This observation showed the power of proteomic mass spectrometric techniques to identify an unexpected antibody sequence variant using de novo sequencing combined with database searching, and allowed for rapid identification of the root cause of the new peaks in the cation exchange and rCE-SDS assays.

There are many computer programs that can match tandem mass spectra of peptides to database-derived sequences; however, situations can arise where mass spectral data cannot be correlated with any database sequence. In such cases, sequences can be automatically deduced de novo, without recourse to sequence databases, and the resulting peptide sequences can be used to perform homology-based, nonexact searches of sequence databases. This article describes how to implement both a de novo sequencing program called "Lutefisk" and a version of FASTA that has been modified to account for sequence ambiguities inherent in tandem mass spectrometry data.

We describe an integrated workstation for the automated, high-throughput, and conclusive identification of proteins by reverse-phase chromatography electrospray ionization tandem mass spectrometry. The instrumentation consists of a refrigerated autosampler, a submicrobore reverse-phase liquid chromatograph, and an electrospray triple quadrupole mass spectrometer. For protein identification, enzymatic digests of either homogeneous polypeptides or simple protein mixtures were generated and loaded into the autosampler. Samples were sequentially injected every 32 min. Ions of eluting peptides were automatically selected by the mass spectrometer and subjected to collision-induced dissociation. Following each run, the resulting tandem mass spectra were automatically analyzed by SEQUEST, a program that correlates uninterpreted peptide fragmentation patterns with amino acid sequences contained in databases. Protein identification was established by SEQUEST_SUMMARY, a program that combines the SEQUEST scores of peptides originating from the same protein and ranks the cumulative results in a short summary. The workstation's performance was demonstrated by the unattended identification of 90 proteins from the yeast Saccharomyces cerevisiae, which were separated by high-resolution two-dimensional PAGE. The system was found to be very robust, and identification was reliably and conclusively established for proteins if quantities exceeding 1-5 pmol were applied to the gel. The level of automation, the throughput, and the reliability of the results suggest that this system will be useful for the many projects that require the characterization of large numbers of proteins.
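
The summary step described in this abstract, combining the scores of peptides that originate from the same protein and ranking the cumulative results, can be sketched as follows. This is an illustrative reconstruction, not the SEQUEST_SUMMARY program itself. The input format (pairs of protein identifier and peptide score) and the use of a plain sum as the cumulative score are assumptions.

```python
from collections import defaultdict


def rank_proteins(peptide_hits):
    """Sum per-peptide scores by protein and rank proteins by the total.

    peptide_hits: iterable of (protein_id, peptide_score) pairs
                  (hypothetical input format).
    Returns a list of (protein_id, total_score, peptide_count) tuples,
    highest total first; ties are broken alphabetically by protein_id.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for protein, score in peptide_hits:
        totals[protein] += score
        counts[protein] += 1
    rows = [(p, totals[p], counts[p]) for p in totals]
    return sorted(rows, key=lambda row: (-row[1], row[0]))
```

Keeping the peptide count next to the summed score makes it easy to see whether a high rank rests on several independent peptides or on a single strong match.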

This protocol details a method for the identification of proteins that have been separated by gel electrophoresis. In-gel digestion of the protein bands with trypsin, followed by quadrupole ion-trap or triple quadrupole mass spectrometry, is described. The proteins can be identified by database searching of the mass fingerprint of the intact peptides and of the characteristic fragment masses produced by tandem mass spectrometry.
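
The mass-fingerprint search mentioned in this protocol compares observed peptide masses with masses predicted from a candidate protein sequence. The sketch below shows that idea under simplifying assumptions: trypsin cleaves after K or R except before P, no missed cleavages or modifications are considered, and a fixed tolerance in daltons is used. The function names and the tolerance default are hypothetical and not taken from the protocol.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of the charge-carrying proton


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, theoretical, tol_da=0.2):
    """Count observed peaks within tol_da of any theoretical peptide mass."""
    return sum(
        1 for mz in observed
        if any(abs(mz - t) <= tol_da for t in theoretical)
    )
```

A real search engine scores matches statistically across a whole database. Here the raw match count stands in for that score only to keep the example short.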

To increase the number of genes that can be mapped to the genome of the tammar wallaby (Macropus eugenii), we sequenced 100 randomly chosen clones from a mammary gland cDNA library. Provisional identifications were made of seven nuclear genes and one mitochondrial gene encoding two caseins, -galactosidase, acetyl-coenzyme A synthetase, lipoprotein lipase, inorganic pyrophosphatase, an ATP-dependent RNA helicase, and cytochrome c oxidase I. Highly conserved genes, such as that encoding acetyl-coenzyme A synthetase, were easily identified even from cross-kingdom matches. Genes which are highly divergent, however, such as those encoding the mature casein peptides, could not be aligned with homologues in the databases. Even in an organ where there is high mRNA species redundancy, the sequence characterization of expressed sequence tags provides a rapid means of gene identification for mapping purposes.

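Advances in functional genomics of plant stress tolerance
To use genetic engineering more efficiently to improve plant tolerance to stress, the complex mechanisms of stress tolerance need to be studied in an integrated way at the whole-genome level. Functional genomics research on plant stress tolerance can be summarized as follows: stress-specific expressed sequence tags (ESTs) and cDNA microarrays (or gene chips) are used to screen for stress-related candidate genes; the functions of the candidate genes are then studied with techniques such as reverse genetics; and the relationships among genes and gene products are studied with techniques such as yeast two-hybrid assays and forward genetics. These studies can provide a comprehensive understanding of the complex mechanisms and interactions of plant responses to stress (osmotic stress, drought, and extreme temperature) and of the corresponding signal transduction pathways, thereby laying a foundation for using genetic engineering more efficiently to improve plant stress tolerance.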

We searched partial sequences of over 22,706 rice cDNA and 1220 genomic DNA clones to find and characterize simple sequence repeats (SSRs) in the rice genome. The most frequently found repeated SSR motif in both cDNA and genomic DNA sequences was d(CCG/CGG)n. The second most frequently found SSR was d(AG/CT)n. In contrast with mammalian genomes, in which d(AC/GT)n sequences are the most abundant, d(AC/GT)n sequences were not frequently observed in rice. Sequences containing d(CCG/CGG)n, d(AG/CT)n repeats, and other SSRs were chosen for polymorphism detection. It was predicted that 17 of 20 SSRs in cDNA sequences were located in 5'-untranslated regions near initiation codons. Twenty-two loci can be mapped on our RFLP linkage map by these SSRs. Six markers were tested with 16 japonica rice varieties as templates for PCR. Two markers exhibited amplified fragment length polymorphism among these rice varieties, implying that SSRs are polymorphic among rice varieties which have similar genetic backgrounds. Even these polymorphic SSRs are located within or around genes which code ubiquitous proteins.
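
A search for simple sequence repeats of the kind reported in this abstract can be sketched with a back-referencing regular expression. This is an assumption-laden illustration, not the authors' search procedure: the motif lengths (di- and trinucleotides), the minimum of four tandem copies, and the function name are all hypothetical choices.

```python
import re


def find_ssrs(sequence, motif_lengths=(2, 3), min_repeats=4):
    """Find tandem repeats of short motifs in a DNA sequence.

    Returns a sorted list of (start_index, motif, repeat_count) tuples.
    Motifs made of a single base (e.g. "AA") are skipped, because they
    are mononucleotide runs rather than di- or trinucleotide repeats.
    """
    sequence = sequence.upper()
    hits = []
    for k in motif_lengths:
        # Capture a k-base motif, then require min_repeats - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)
```

The sketch reports each motif as it appears on the given strand. Grouping a motif with its complement, as the d(CCG/CGG)n notation in the abstract does, would need an extra normalization step that is omitted here.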

Microsatellite or simple sequence repeat markers derived from expressed sequence tags (ESTs) provide genetic markers within potentially functional genes, which could be very useful for breeding programs. To date, the development of microsatellite markers in the genus Fragaria has focused mainly on Fragaria vesca. However, most of the interests of breeding programs relate to specific characteristics of cultivated strawberry. Here, we describe a set of 10 EST‐derived microsatellites from Fragaria × ananassa. These markers showed high levels of polymorphism within strawberry cultivars and among different Fragaria species, indicating their potential for genetic studies not only on strawberry but also in other species within the genus.

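Mussels secrete characteristic byssal threads from the foot gland and use them to attach to a wide range of underwater substrates. The mussel byssus is rich in adhesive proteins, whose excellent underwater adhesion makes them candidate molecules for the development of novel bioadhesives. The byssus of the thick-shell mussel (厚壳贻贝) adheres strongly. In this study, extraction with urea and guanidine hydrochloride was combined with two-dimensional electrophoresis (2-DE) to separate and stain the proteins of the byssal threads and the byssal plaques of the thick-shell mussel. The separated protein spots were identified by tandem mass spectrometry combined with conventional database searching and expressed sequence tag (EST) database searching, which yielded mfp-3, mfp-6, collagen, and three previously unreported mussel byssal protein components. This work lays a foundation for understanding the molecular diversity of thick-shell mussel byssal proteins, for investigating their adhesion mechanism, and for screening byssal proteins with potential applications.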

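Reversed-phase high-performance liquid chromatography (RP-HPLC) coupled with electrospray ionization tandem mass spectrometry (ESI-MS/MS) was used to directly separate and analyze the tryptic digest of a model protein (bovine serum albumin, BSA). The MS and MS/MS data obtained for a series of BSA tryptic fragments were processed with analysis software and then used to search online protein databases by three different methods, under different processing and parameter conditions. All three search methods identified the protein correctly. Peptide mass fingerprint (PMF) searching with MS data was the quickest and most convenient, but its results were easily affected by data processing and by the parameter settings used for the database search. Searching with uninterpreted MS/MS data (raw MS/MS data) gave unambiguous identification over a relatively wide range of search parameters. Sequence query searching based on de novo sequencing results showed higher specificity: it gave stable, unambiguous identification from data on fewer tryptic fragments, and changes in search parameters had little effect. The effects of digestion conditions, data processing, and search parameter settings on protein identification are discussed in detail, providing reference material for data processing and database searching in proteomics research.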

Patterns of genetic diversity and differentiation among five wild and four hatchery populations of Atlantic salmon in the Baltic Sea were assessed based on eight assumedly neutral microsatellite loci and six gene-associated markers, including four expressed sequence tag (EST) linked and two major histocompatibility complex (MHC) linked tandem repeat markers (micro- and mini-satellites). The coalescent simulations based on the method of Beaumont and Nichols (1996, Proc. R. Soc. Lond. Ser. B – Biol. Sci., 263, 1619–1626) indicated that two loci (MHCIIα and Ssa171, with the lowest and highest overall FST estimates, respectively) exhibited significant departures (P<0.05) from the neutral expectations. Another coalescent-based test for selective neutrality (Vitalis et al. 2001, Genetics, 158, 1811–1823) further supported the outlier status of the Ssa171 microsatellite locus but not of the MHCIIα linked minisatellite. In addition, actin related protein linked microsatellite locus was identified with this test as an outlier in six pairwise population comparisons. All genetic diversity estimates revealed more genetic variation in hatchery stocks than in the small wild salmon populations from the Gulf of Finland. However, the wild populations possessed alleles at gene-associated markers (e.g. MHCI and IGF) not found in the hatchery stocks, which together with moderate genetic differentiation and distinctive environmental conditions justifies the special conservation measures for the last remaining native salmon populations in the Gulf of Finland.

There is a general lack of genomic information available for chlorophyte seaweed genera such as Ulva, and in particular there is no information concerning the genes that contribute to adhesion and cell wall biosynthesis for this organism. Partial sequencing of cDNA libraries to generate expressed sequence tags (ESTs) is an effective means of gene discovery and characterization of expression patterns. In this study, a cDNA library was created from sporulating tissue of Ulva linza L. Initially, 650 ESTs were randomly selected from a cDNA library and sequenced from their 5′ ends to obtain an indication of the level of redundancy of the library (21%). The library was normalized to enrich for rarer sequences, and a further 1920 ESTs were sequenced. These sequences were subjected to contig assembly that resulted in a unigene set of approximately 1104 ESTs. Forty‐eight percent of these sequences exhibited significant similarity to sequences in the databases. Phylogenetic comparisons are made between selected sequences with similarity in the databases to proteins involved in aspects of extracellular matrix/cell wall assembly and adhesion.

Objective: Large-scale analysis of gene expression in adipose tissue provides a basis for the identification of novel candidate genes involved in the pathophysiology of obesity. Our goal was to explore gene expression in human adipose tissue at a partial genome scale using DNA array. Research Methods and Procedures: Labeled cDNA, derived from human adipose tissue poly(A+) RNA, was hybridized to a DNA array containing over 18,000 human expressed sequence‐tagged (EST) clones. The results were analyzed by database searches. Results: Homology searches of the 300 EST clones with highest hybridization signals revealed that 145 contained DNA sequences identical to known genes and 79 could be linked to UniGene clusters. Of the 145 identified genes, 136 were nonredundant and subsequently characterized with respect to function and chromosomal localization by searching MEDLINE, UniGene, GeneMap, OMIM, SWISS‐PROT, the Genome Database, and the Location Data Base. The identified genes were grouped according to their putative functions: cell/organism defense (9.6%), cell division (5.1%), cell signaling/communication (19.8%), cell structure/motility (12.5%), gene/protein expression (16.9%), metabolism (16.2%), and unclassified (19.8%). Less than 50% of these genes have previously been reported to be expressed in adipose tissue. The chromosomal localization of 268 genes strongly expressed in adipose tissue showed that their relative abundance was significantly increased on chromosomes 11, 19, and 22 compared to the expected distribution of the same number of random genes. Discussion: Our study resulted in the identification of numerous genes previously not reported to be expressed in adipose tissue. These results suggest that DNA array is a powerful tool in the search for novel regulatory pathways within adipose tissue on a scale that is not possible using conventional methods.
