Similar literature
Found 20 similar articles (search time: 31 ms)
1.
Phyloproteomics is a novel analytical tool that solves the issue of comparability between proteomic analyses, utilizes a total spectrum-parsing algorithm, and produces a biologically meaningful classification of specimens. Phyloproteomics employs two algorithms: a new parsing algorithm (UNIPAL) and a phylogenetic algorithm (MIX). By outgroup comparison, the parsing algorithm identifies novel or vanished MS peaks and peaks signifying up- or down-regulated proteins and scores them as derived or ancestral. The phylogenetic algorithm uses these scores to produce a biologically meaningful classification of the specimens.

2.
The wealth of interaction information provided in biomedical articles has motivated the implementation of text mining approaches to automatically extract biomedical relations. This paper presents an unsupervised method based on pattern clustering and sentence parsing for biomedical relation extraction. The pattern clustering algorithm is based on the polynomial kernel method, which identifies interaction words from unlabeled data; these interaction words are then used in relation extraction between entity pairs. Dependency parsing and phrase structure parsing are combined for relation extraction. Based on the semi-supervised KNN algorithm, we extend the proposed unsupervised approach to a semi-supervised approach by combining pattern clustering, dependency parsing and phrase structure parsing rules. We evaluated the approaches on two different tasks: (1) protein–protein interaction extraction, and (2) gene–suicide association extraction. The evaluation of task (1) on the benchmark dataset (AImed corpus) showed that our proposed unsupervised approach outperformed three supervised methods, which are rule-based, SVM-based, and kernel-based, respectively. The proposed semi-supervised approach is superior to the existing semi-supervised methods. The evaluation of gene–suicide association extraction on a smaller dataset from the Genetic Association Database and a larger dataset from publicly available PubMed showed that the proposed unsupervised and semi-supervised methods achieved much higher F-scores than the co-occurrence-based method.

3.
Face parsing is an important computer vision task that requires accurate pixel segmentation of facial parts (such as eyes, nose, mouth, etc.), providing a basis for further face analysis, modification, and other applications. Interlinked Convolutional Neural Networks (iCNN) has proved to be an effective two-stage model for face parsing. However, the original iCNN was trained separately in two stages, limiting its performance. To solve this problem, we introduce a simple, end-to-end face parsing framework: STN-aided iCNN (STN-iCNN), which extends the iCNN by adding a Spatial Transformer Network (STN) between the two isolated stages. The STN-iCNN uses the STN to provide a trainable connection to the original two-stage iCNN pipeline, making end-to-end joint training possible. Moreover, as a by-product, the STN also provides more precisely cropped parts than the original cropper. Due to these two advantages, our approach significantly improves the accuracy of the original model. Our model achieved competitive performance on the Helen Dataset, the standard face parsing dataset. It also achieved superior performance on the CelebAMask-HQ dataset, demonstrating good generalization. Our code has been released at https://github.com/aod321/STN-iCNN.

4.
In this paper, we design a heuristic algorithm for computing a constrained multiple sequence alignment (CMSA for short), guaranteeing that the generated alignment satisfies the user-specified constraints that some particular residues should be aligned together. If the number of residues that need to be aligned together is a constant α, then the time complexity of our CMSA algorithm for aligning K sequences is O(αKn^4), where n is the maximum of the lengths of the sequences. In addition, we have built a CMSA software system and run several experiments on the RNase sequences, which mainly function in catalyzing the degradation of RNA molecules. The resulting alignments illustrate the practicability of our method.

5.
A natural language parser implemented entirely in simulated neurons is described. It produces a semantic representation based on frames. It parses solely using simulated fatiguing Leaky Integrate and Fire neurons, which are a relatively accurate biological model that can be simulated efficiently. The model works on discrete cycles that simulate 10 ms of biological time, so the parser has a simple mapping to psychological parsing time. Comparisons to human parsing studies show that the parser closely approximates these data. The parser makes use of Cell Assemblies, and the semantics of lexical items is represented by overlapping hierarchical Cell Assemblies so that semantically related items share neurons. This semantic encoding is used to resolve prepositional phrase attachment ambiguities encountered during parsing. Consequently, the parser provides a neurally-based cognitive model of parsing.
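
To make the neuron model named in this abstract more concrete, here is a minimal sketch of a fatiguing leaky integrate-and-fire unit updated in discrete cycles. It is not the authors' implementation: the class name and every parameter value (decay, threshold, fatigue_step, fatigue_recovery) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FatiguingLIFNeuron:
    """Toy fatiguing leaky integrate-and-fire neuron (all values are assumed)."""
    decay: float = 0.5             # fraction of activation retained per cycle
    threshold: float = 1.0         # base firing threshold
    fatigue_step: float = 0.4      # fatigue added each time the neuron fires
    fatigue_recovery: float = 0.1  # fatigue removed on each silent cycle
    activation: float = 0.0
    fatigue: float = 0.0

    def step(self, input_current: float) -> bool:
        """Advance one discrete cycle; return True if the neuron fires."""
        # Leak: keep only part of the old activation, then integrate the input.
        self.activation = self.activation * self.decay + input_current
        # Fatigue raises the effective threshold.
        fired = self.activation >= self.threshold + self.fatigue
        if fired:
            self.activation = 0.0              # reset after a spike
            self.fatigue += self.fatigue_step  # firing makes the neuron tired
        else:
            self.fatigue = max(0.0, self.fatigue - self.fatigue_recovery)
        return fired


def simulate(neuron, inputs):
    """Feed a sequence of input currents and collect the spike train."""
    return [neuron.step(x) for x in inputs]
```

With a constant input, the fatigued neuron in this sketch spikes less often than an otherwise identical neuron with fatigue_step set to 0, which is the qualitative effect that fatigue is meant to capture.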

6.
MOTIVATION: While database activities in the biological area are increasing rapidly, rather little is done in the area of parsing them in a simple and object-oriented way. RESULTS: We present here an elegant, simple yet powerful way of parsing biological flat-file databases. We have taken EMBL, SWISS-PROT and GENBANK as examples. EMBL and SWISS-PROT do not differ much in their format structure. GENBANK has a very different format structure from EMBL and SWISS-PROT. Extracting the desired fields in an entry (for example a sub-sequence with an associated feature) for later analysis is a constant need in the biological sequence-analysis community: this is illustrated with tools to make new splice-site databases. The interface to the parser is abstract in the sense that access to all the databases is independent of their different formats, since parsing instructions are hidden.
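
The idea of a format-independent interface can be sketched as follows. This is a simplified illustration, not the parser described in the abstract: the function name parse_entry, the FIELD_TAGS table, and the choice of three fields are assumptions. It relies only on the facts that EMBL-style entries tag lines with two-letter codes (ID, AC, DE), GenBank entries use keywords (LOCUS, ACCESSION, DEFINITION) with indented continuation lines, and both end an entry with `//`.

```python
# Map abstract field names to format-specific line tags (illustrative subset).
FIELD_TAGS = {
    "embl": {"id": "ID", "accession": "AC", "description": "DE"},
    "genbank": {"id": "LOCUS", "accession": "ACCESSION", "description": "DEFINITION"},
}


def parse_entry(text, fmt):
    """Return a dict of abstract field names for one flat-file entry."""
    tag_to_field = {tag: field for field, tag in FIELD_TAGS[fmt].items()}
    record = {}
    current = None  # field that the most recent tagged line belonged to
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-entry marker in both formats
            break
        if not line.strip():
            continue
        if line[0].isspace():
            # Indented continuation line (GenBank style): extend current field.
            if current is not None:
                record[current] += " " + line.strip()
            continue
        tag, _, value = line.partition(" ")
        current = tag_to_field.get(tag)
        if current is None:
            continue  # a field this sketch does not extract
        value = value.strip().rstrip(";")
        # A repeated tag (EMBL style) also continues the same field.
        record[current] = f"{record[current]} {value}" if current in record else value
    return record
```

A caller asks for `record["accession"]` in the same way regardless of the source format, which is the sense in which the parsing instructions stay hidden.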

7.
MOTIVATION: The field of 'DNA linguistics' has emerged from pioneering work in computational linguistics and molecular biology. Most formal grammars in this field are expressed using Definite Clause Grammars but these have computational limitations which must be overcome. The present study provides a new DNA parsing system, comprising a logic grammar formalism called Basic Gene Grammars (BGGs) and a bidirectional chart parser, DNA-ChartParser. RESULTS: The use of Basic Gene Grammars is demonstrated in representing many formulations of the knowledge of Escherichia coli promoters, including knowledge acquired from human experts, consensus sequences, statistics (weight matrices), symbolic learning, and neural network learning. The DNA-ChartParser provides bidirectional parsing facilities for BGGs in handling overlapping categories, gap categories, approximate pattern matching, and constraints. Basic Gene Grammars and the DNA-ChartParser allowed different sources of knowledge for recognizing E. coli promoters to be combined to achieve better accuracy as assessed by parsing these DNA sequences in real-world data sets.

8.
Natural language processing is a fast and automatized process. A crucial part of this process is parsing, the online incremental construction of a syntactic structure. The aim of this study was to test whether a wh-filler extracted from an embedded clause is initially attached as the object of the matrix verb with subsequent reanalysis, and if so, whether the plausibility of such an attachment has an effect on reaction time. Finally, we wanted to examine whether subcategorization plays a role. We used a method called G-Maze to measure response time in a self-paced reading design. The experiments confirmed that there is early attachment of fillers to the matrix verb. When this attachment is implausible, the off-line acceptability of the whole sentence is significantly reduced. The on-line results showed that G-Maze was highly suited for this type of experiment. In accordance with our predictions, the results suggest that the parser ignores (or has no access to information about) implausibility and attaches fillers as soon as possible to the matrix verb. However, the results also show that the parser uses the subcategorization frame of the matrix verb. In short, the parser ignores semantic information and allows implausible attachments but adheres to information about which type of object a verb can take, ensuring that the parser does not make impossible attachments. We argue that the evidence supports a syntactic parser informed by syntactic cues, rather than one guided by semantic cues or one that is blind, or completely autonomous.

9.
The proton motive force (pmf) across the thylakoid membrane couples photosynthetic electron transport and ATP synthesis. In recent years, the electrochromic carotenoid and chlorophyll absorption band shift (ECS), peaking ∼515 nm, has become a widely used probe to measure pmf in leaves. However, the use of this technique to calculate the parsing of the pmf between the proton gradient (ΔpH) and electric potential (Δψ) components remains controversial. Interpretation of the ECS signal is complicated by overlapping absorption changes associated with violaxanthin de-epoxidation to zeaxanthin (ΔA505) and energy-dependent nonphotochemical quenching (qE; ΔA535). In this study, we used Arabidopsis (Arabidopsis thaliana) plants with altered xanthophyll cycle activity and photosystem II subunit S (PsbS) content to disentangle these overlapping contributions. In plants where overlap among ΔA505, ΔA535, and ECS is diminished, such as npq4 (lacking ΔA535) and npq1npq4 (also lacking ΔA505), the parsing method implies the Δψ contribution is virtually absent and pmf is solely composed of ΔpH. Conversely, in plants where ΔA535 and ECS overlap is enhanced, such as L17 (a PsbS overexpressor) and npq1 (where ΔA535 is blue-shifted to 525 nm) the parsing method implies a dominant contribution of Δψ to the total pmf. These results demonstrate the vast majority of the pmf attributed by the ECS parsing method to Δψ is caused by ΔA505 and ΔA535 overlap, confirming pmf is dominated by ΔpH following the first 60 s of continuous illumination under both low and high light conditions. Further implications of these findings for the regulation of photosynthesis are discussed.

Electrochromic shift absorption kinetics show the steady-state transthylakoid proton motive force in plants is dominated by the proton concentration gradient under both low and high light conditions.

10.
Noise is a major problem in analyzing tracking data of cargos moved by molecular motors. We use Bayesian statistics to incorporate what is known about the noise in parsing the trajectory of a cargo into a series of constant velocity segments. Tracks with just noise and no underlying motion are fit with constant velocity segments to produce a calibration curve of fit quality versus average segment duration. Fits to tracks of moving cargos are compared to the calibration curves with similar noise. The fit with the optimum number of constant velocity states has the smallest number of segments needed to match the fit quality of the calibration curve. We have tested this approach using tracks with known underlying motion generated by computer simulations and with a specially designed in vitro experiment. We present the results of using this parsing approach to analyze transport of lipid droplets in Drosophila embryos.

11.
12.
A simple, static contact mapping algorithm has been developed as a first step toward identifying potential peptide biomimetics from protein interaction partner structure files. This rapid and simple mapping algorithm, “OpenContact,” provides screened or parsed protein interaction files based on specified criteria for interatomic separation distances and interatomic potential interactions. The algorithm, which uses all‐atom Amber03 force field models, was blindly tested on several unrelated cases from the literature where potential peptide mimetics have been experimentally developed with varying degrees of success. In all cases, the screening algorithm efficiently predicted proposed or potential peptide biomimetics, or close variations thereof, and provided complete atom‐atom interaction data necessary for further detailed analysis and drug development. In addition, we used the static parsing/mapping method to develop a peptide mimetic to the cancer protein target, epidermal growth factor receptor. In this case, secondary, loop structure for the peptide was indicated from the intra‐protein mapping, and the peptide was subsequently synthesized and shown to exhibit successful binding to the target protein. The case studies, which all involved experimental peptide drug advancement, illustrate many of the challenges associated with the development of peptide biomimetics in general. Proteins 2014; 82:2253–2262. © 2014 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
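
The distance criterion mentioned in this abstract can be illustrated with a short sketch. This is not the OpenContact code: the Atom record, the function name contact_map, and the 4.0 Å default cutoff are assumptions, and the interatomic potential (force field) criterion is omitted entirely.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    chain: str    # chain identifier, e.g. "A"
    residue: str  # residue label, e.g. "LYS10"
    name: str     # atom name, e.g. "NZ"
    xyz: tuple    # Cartesian coordinates in angstroms


def contact_map(chain_a, chain_b, cutoff=4.0):
    """List (atom_a, atom_b, distance) for inter-chain pairs within the cutoff.

    Brute-force comparison of every pair; adequate for a sketch, although a
    spatial grid would scale better for large structures.
    """
    contacts = []
    for a in chain_a:
        for b in chain_b:
            distance = math.dist(a.xyz, b.xyz)
            if distance <= cutoff:
                contacts.append((a, b, distance))
    # Closest contacts first, so the tightest interactions are easy to inspect.
    return sorted(contacts, key=lambda c: c[2])
```

The residues that appear in the returned pairs would be the starting point for a candidate peptide; ranking them by interaction energy is the step this sketch leaves out.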

13.
Decomposing a biological sequence into modular domains is a basic prerequisite for identifying functional units in biological molecules. The commonly used segmentation procedures usually have two steps. First, collect and align a set of sequences that are homologous to the target sequence. Then, parse this multiple alignment into several blocks and identify the functionally important ones by using a semi-automatic method, which combines manual analysis and expert knowledge. In this paper, we present a novel exploratory approach to parsing and analyzing such multiple alignments. It is based on a type of analysis-of-variance (ANOVA) decomposition of the sequence information content. Unlike the traditional change-point method, this approach takes into account not only the composition biases but also the overdispersion effects among the blocks. The new approach is tested on the families of ribosomal proteins and shows promising performance. It is shown that the new approach provides a better way of judging some important residues in these proteins. This allows one to find some subsets of residues that are critical to these proteins.

14.
15.
Accurate prediction of pseudoknotted nucleic acid secondary structure is an important computational challenge. Prediction algorithms based on dynamic programming aim to find a structure with minimum free energy according to some thermodynamic ("sum of loop energies") model that is implicit in the recurrences of the algorithm. However, a clear definition of what exactly are the loops in pseudoknotted structures, and their associated energies, has been lacking. In this work, we present a complete classification of loops in pseudoknotted nucleic acid secondary structures, and describe the Rivas and Eddy and other energy models as sum-of-loops energy models. We give a linear time algorithm for parsing a pseudoknotted secondary structure into its component loops. We give two applications of our parsing algorithm. The first is a linear time algorithm to calculate the free energy of a pseudoknotted secondary structure. This is useful for heuristic prediction algorithms, which are widely used since (pseudoknotted) RNA secondary structure prediction is NP-hard. The second application is a linear time algorithm to test the generality of the dynamic programming algorithm of Akutsu for secondary structure prediction. Together with previous work, we use this algorithm to compare the generality of state-of-the-art algorithms on real biological structures.
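
As background for what separates a pseudoknotted structure from a nested one, the sketch below checks whether any two base pairs cross, using a single left-to-right scan with a stack. It is only this preliminary check, not the loop classification or the parsing algorithm of the paper; the function name and the representation of a structure as a list of (i, j) index pairs are assumptions.

```python
def has_pseudoknot(pairs, length):
    """Return True if any two base pairs cross, i.e. i < k < j < l.

    `pairs` is a list of (i, j) tuples with 0 <= i < j < length, and each
    position is assumed to take part in at most one pair.
    """
    partner = {}
    for i, j in pairs:
        partner[i] = j
        partner[j] = i

    open_positions = []  # stack of 5' ends whose partner has not been seen yet
    for pos in range(length):
        if pos not in partner:
            continue  # unpaired base
        mate = partner[pos]
        if mate > pos:
            open_positions.append(pos)
        elif open_positions.pop() != mate:
            # The pair closing here is not the most recently opened one,
            # so it must cross another pair.
            return True
    return False
```

In a nested structure every closing position matches the most recently opened pair, exactly as with balanced parentheses, so one pass over the sequence suffices.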

16.
A secondary structure has been predicted for the C termini of the fibrinogen β and γ chains from an aligned set of homologous protein sequences using a transparent method that extracts conformational information from patterns of variation and conservation, parsing strings, and patterns of amphiphilicity. The structure is modeled to form two domains, the first having a core parallel sheet flanked on one side by at least two helices and on the other by an antiparallel amphiphilic sheet, with an additional helix connecting the two sheets. The second domain is built entirely from β strands. © 1997 Wiley-Liss, Inc.

17.
In this article, we present a de novo method for predicting protein domain boundaries, called OPUS-Dom. The core of the method is a novel coarse-grained folding method, VECFOLD, which constructs low-resolution structural models from a target sequence by folding a chain of vectors representing the predicted secondary-structure elements. OPUS-Dom generates a large ensemble of folded structure decoys by VECFOLD and labels the domain boundaries of each decoy by a domain parsing algorithm. Consensus domain boundaries are then derived from the statistical distribution of the putative boundaries and three empirical sequence-based domain profiles. OPUS-Dom generally outperformed several state-of-the-art domain prediction algorithms over various benchmark protein sets. Even though each VECFOLD-generated structure contains large errors, collectively these structures provide a more robust delineation of domain boundaries. The success of OPUS-Dom suggests that the arrangement of protein domains is more a consequence of limited coordination patterns per domain arising from tertiary packing of secondary-structure segments, rather than sequence-specific constraints.

18.
19.
This paper outlines a neurocognitive approach to human language, focusing on inflectional morphology and grammatical function in English. Taking as a starting point the selective deficits for regular inflectional morphology of a group of non-fluent patients with left hemisphere damage, we argue for a core decompositional network linking left inferior frontal cortex with superior and middle temporal cortex, connected via the arcuate fasciculus. This network handles the processing of regularly inflected words (such as joined or treats), which are argued not to be stored as whole forms and which require morpho-phonological parsing in order to segment complex forms into stems and inflectional affixes. This parsing process operates early and automatically upon all potential inflected forms and is triggered by their surface phonological properties. The predictions of this model were confirmed in a further neuroimaging study, using event-related functional magnetic resonance imaging (fMRI), on unimpaired young adults. The salience of grammatical morphemes for the language system is highlighted by new research showing that similarly early and blind segmentation also operates for derivationally complex forms (such as darkness or rider). These findings are interpreted as evidence for a hidden decompositional substrate to human language processing and related to a functional architecture derived from non-human primate models.

20.
Brain microinjection can aid elucidation of the molecular substrates of complex behaviors, such as motivation. For this purpose rodents can serve as appropriate models, partly because the response to behaviorally relevant stimuli and the circuitry parsing stimulus-action outcomes are astonishingly similar between humans and rodents. In studying molecular substrates of complex behaviors, the microinjection of reagents that modify, augment, or silence specific systems is an invaluable technique. However, it is crucial that the microinjection site is precisely targeted in order to aid interpretation of the results. We present a method for the manufacture of surgical implements and microinjection needles that enables accurate microinjection and unlimited customizability with minimal cost. Importantly, this technique can be successfully completed in awake rodents if conducted in conjunction with other JoVE articles that cover the requisite surgical procedures. Additionally, there are many behavioral paradigms that are well suited for measuring motivation. The progressive ratio is a commonly used method that quantifies the efficacy of a reinforcer to maintain responding despite an (often exponentially) increasing work requirement. This assay is sensitive to reinforcer magnitude and pharmacological manipulations, which allows reinforcing efficacy and/or motivation to be determined. We also present a straightforward approach to program operant software to accommodate a progressive ratio reinforcement schedule.
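
An exponentially increasing work requirement of the kind mentioned in this abstract can be sketched as below. The abstract does not state which schedule the authors program, so the formula round(5·e^(0.2·n) − 5), the function names, and the definition of the breakpoint as the last completed ratio are all assumptions made for illustration.

```python
import math


def progressive_ratio(step, scale=5.0, rate=0.2):
    """Responses required to earn reinforcer number `step` (1-based).

    Assumed exponential form: round(scale * e^(rate * step) - scale).
    """
    return round(scale * math.exp(rate * step) - scale)


def session_breakpoint(total_responses):
    """Return (reinforcers_earned, last_completed_ratio) for one session.

    Assumes every response counts toward the current ratio and that the
    session ends when the remaining responses cannot complete the next one.
    """
    earned = 0
    last_completed = 0
    remaining = total_responses
    step = 1
    while remaining >= progressive_ratio(step):
        required = progressive_ratio(step)
        remaining -= required
        last_completed = required
        earned += 1
        step += 1
    return earned, last_completed
```

Under these assumptions the first eight requirements are 1, 2, 4, 6, 9, 12, 15 and 20 responses, so an animal that emits 10 responses in total earns three reinforcers and stops at a ratio of 4.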
