Similar articles
 Found 20 similar articles
1.
Lauraceae and Fagaceae are two large woody plant families that predominate in the low- and middle-altitude regions of Taiwan. The high interspecific similarity among some species in these families limits their management and utilization. This work proposes an approach for identifying 15 Lauraceae species and 20 Fagaceae species using leaf images and convolutional neural networks (CNNs). Leaf specimens of the 35 species were collected from the northern, central, and southern parts of Taiwan, and images of the leaves were acquired using flat-bed scanners. Three CNN architectures (DenseNet-121, MobileNet V2, and Xception) were trained. Xception achieved the highest mean test accuracy of 99.39%, and MobileNet V2 required the shortest mean test time of 17.1 ms per image using a GPU. The saliency maps revealed that the characteristics learned by the models matched the leaf features used by botanists. A pruning algorithm, gate decorator, was applied to the trained models, reducing the number of parameters and the number of floating-point operations of MobileNet V2 by 55.4% and 69.1%, respectively, while the model accuracy was maintained at 92.03%. Thus, MobileNet V2 has the potential to be used for identifying Lauraceae and Fagaceae species on mobile devices.

2.

Coral reef research and management efforts can be improved when supported by reef maps providing local-scale details across global extents. However, such maps are difficult to generate due to the broad geographic range of coral reefs, the complexities of relating satellite imagery to geomorphic or ecological realities, and other challenges. Even so, reef extent maps are among the most commonly used and most valuable data products from the perspective of reef scientists and managers. Here, we used convolutional neural networks to generate a globally consistent coral reef probability map (a probabilistic estimate of the geospatial extent of reef ecosystems) to facilitate scientific, conservation, and management efforts. We combined a global mosaic of high spatial resolution Planet Dove satellite imagery with regional Millennium Coral Reef Mapping Project reef extents to build training, validation, and application datasets. These datasets trained our reef extent prediction model, a neural network with a dense-unet architecture followed by a random forest classifier, which was used to produce a global coral reef probability map. Based on this probability map, we generated a global coral reef extent map from a 60% threshold of reef probability (reef: probability ≥ 60%, non-reef: probability < 60%). Our findings provide a proof-of-concept method for global reef extent estimates using a consistent and readily updateable methodology that leverages modern deep learning approaches to support downstream users. These maps are openly available through the Allen Coral Atlas.
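
The 60% thresholding rule stated in this abstract can be illustrated with a minimal sketch. The function name `reef_extent_mask` and the small probability tile are hypothetical and are not taken from the study; only the rule itself (reef if probability ≥ 60%) comes from the abstract.

```python
from typing import List


def reef_extent_mask(prob_map: List[List[float]],
                     threshold: float = 0.60) -> List[List[bool]]:
    """Label each pixel as reef (True) when its probability is >= threshold."""
    return [[p >= threshold for p in row] for row in prob_map]


# Hypothetical 2x3 tile of reef probabilities (illustrative values only).
tile = [[0.10, 0.60, 0.95],
        [0.59, 0.75, 0.00]]
mask = reef_extent_mask(tile)
for row in mask:
    print(row)
```

A pixel exactly at the threshold counts as reef, matching the "≥ 60%" wording of the abstract.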


3.
The local environment and land usage have changed a lot during the past one hundred years. Historical documents and materials are crucial in understanding and following these changes. Historical documents are, therefore, an important piece in the understanding of the impact and consequences of land-usage change. This, in turn, is important in the search for restoration projects that can be conducted to reverse or reduce harmful and unsustainable effects originating from changes in land usage. This work extracts information on the historical location and geographical distribution of wetlands from hand-drawn maps. This is achieved by using deep learning (DL), and more specifically a convolutional neural network (CNN). The CNN model is trained on a manually pre-labelled dataset of historical wetlands in the area of Jönköping county in Sweden. These are all extracted from the historical map called “Generalstabskartan”. The presented CNN performs well and achieves an F1-score of 0.886 when evaluated using 10-fold cross-validation over the data. The trained models are additionally used to generate a GIS layer of the presumable historical geographical distribution of wetlands for the area that is depicted in the southern collection in Generalstabskartan, which covers the southern half of Sweden. This GIS layer is released as an open resource and can be freely used. To summarise, the presented results show that CNNs can be a useful tool in the extraction and digitalisation of non-textual information in historical documents, such as historical maps. A modern GIS material that can be used to further understand past land-usage change is produced within this research. Previously, no material of this detail and extent has been available, due to the large effort needed to create it manually. However, with the presented resource, better quantifications and estimations of historical wetlands that have been lost can be made.
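
For readers unfamiliar with the metric, the following sketch shows how an F1-score can be computed from confusion counts and averaged over cross-validation folds. It assumes the reported score is a mean of per-fold scores, which the abstract does not state; the per-fold counts are made up for illustration.

```python
from typing import Iterable, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_f1(fold_counts: Iterable[Tuple[int, int, int]]) -> float:
    """Average the F1-score over folds, each given as (TP, FP, FN)."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in fold_counts]
    return sum(scores) / len(scores)


# Hypothetical counts for two folds (a real 10-fold run would have ten tuples).
folds = [(8, 2, 2), (5, 0, 0)]
average = mean_f1(folds)
print(round(average, 3))
```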

4.
We present a system for multi-class protein classification based on neural networks. The basic issue concerning the construction of neural network systems for protein classification is the sequence encoding scheme that must be used in order to feed the neural network. To deal with this problem, we propose a method that maps a protein sequence into a numerical feature space using the matching scores of the sequence to groups of conserved patterns (called motifs) within protein families. We consider two alternative ways of identifying the motifs to be used for feature generation and provide a comparative evaluation of the two schemes. We also evaluate the impact of incorporating background features (2-grams) on the performance of the neural system. Experimental results on real datasets indicate that the proposed method is highly efficient and is superior to other well-known methods for protein classification.
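
The 2-gram background features mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' encoding: it assumes overlapping 2-grams over the 20 standard amino acids, expressed as relative frequencies, and the function name `two_gram_features` is hypothetical.

```python
from collections import Counter
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAIRS = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]  # 400 ordered 2-grams


def two_gram_features(seq: str) -> List[float]:
    """Return a 400-dimensional vector of relative 2-gram frequencies."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Only 2-grams made of standard residues contribute to the total.
    total = sum(counts[p] for p in PAIRS)
    return [counts[p] / total if total else 0.0 for p in PAIRS]


features = two_gram_features("ACAC")  # 2-grams: AC, CA, AC
print(len(features), features[PAIRS.index("AC")])
```

Such a vector could be concatenated with motif matching scores before being fed to a classifier; how the original system combines them is not described in the abstract.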

5.
Min Xu, Zeng Wanwen, Chen Shengquan, Chen Ning, Chen Ting, Jiang Rui 《BMC bioinformatics》2017,18(13):478-46

Background

With the rapid development of deep sequencing techniques in recent years, enhancers have been systematically identified in projects such as FANTOM and ENCODE, forming genome-wide landscapes in a series of human cell lines. Nevertheless, experimental approaches are still costly and time-consuming for large-scale identification of enhancers across a variety of tissues under different disease statuses, making computational identification of enhancers indispensable.

Results

To facilitate the identification of enhancers, we propose a computational framework, named DeepEnhancer, to distinguish enhancers from background genomic sequences. Our method relies purely on DNA sequences to predict enhancers in an end-to-end manner by using a deep convolutional neural network (CNN). We train our deep learning model on permissive enhancers and then adopt a transfer learning strategy to fine-tune the model on enhancers specific to a cell line. Results demonstrate the effectiveness and efficiency of our method in the classification of enhancers against random sequences, exhibiting advantages of deep learning over traditional sequence-based classifiers. We then construct a variety of neural networks with different architectures and show the usefulness of techniques such as max-pooling and batch normalization in our method. To make our approach interpretable, we further visualize convolutional kernels as sequence logos and successfully identify similar motifs in the JASPAR database.

Conclusions

DeepEnhancer enables the identification of novel enhancers using only DNA sequences via a highly accurate deep learning model. The proposed computational framework can also be applied to similar problems, thereby prompting the use of machine learning methods in life sciences.

6.
Deep learning is a powerful approach for distinguishing classes of images, and there is a growing interest in applying these methods to delimit species, particularly in the identification of mosquito vectors. Visual identification of mosquito species is the foundation of mosquito-borne disease surveillance and management, but can be hindered by cryptic morphological variation in mosquito vector species complexes such as the malaria-transmitting Anopheles gambiae complex. We sought to apply Convolutional Neural Networks (CNNs) to images of mosquitoes as a proof-of-concept to determine the feasibility of automatic classification of mosquito sex, genus, species, and strains using whole-body, 2D images of mosquitoes. We introduce a library of 1,709 images of adult mosquitoes collected from 16 colonies of mosquito vector species and strains originating from five geographic regions, with 4 cryptic species not readily distinguishable morphologically even by trained medical entomologists. We present a methodology for image processing, data augmentation, and training and validation of a CNN. Our best CNN configuration achieved high prediction accuracies of 96.96% for species identification and 98.48% for sex. Our results demonstrate that CNNs can delimit species with cryptic morphological variation, 2 strains of a single species, and specimens from a single colony stored using two different methods. We present visualizations of the CNN feature space and predictions for interpretation of our results, and we further discuss the implications of our findings for future applications in malaria mosquito surveillance.

7.
Face parsing is an important computer vision task that requires accurate pixel segmentation of facial parts (such as eyes, nose, mouth, etc.), providing a basis for further face analysis, modification, and other applications. Interlinked Convolutional Neural Networks (iCNN) has proved to be an effective two-stage model for face parsing. However, the original iCNN was trained separately in two stages, limiting its performance. To solve this problem, we introduce a simple, end-to-end face parsing framework: STN-aided iCNN (STN-iCNN), which extends the iCNN by adding a Spatial Transformer Network (STN) between the two isolated stages. The STN-iCNN uses the STN to provide a trainable connection to the original two-stage iCNN pipeline, making end-to-end joint training possible. Moreover, as a by-product, the STN also provides more precise cropped parts than the original cropper. Due to these two advantages, our approach significantly improves the accuracy of the original model. Our model achieved competitive performance on the Helen Dataset, the standard face parsing dataset. It also achieved superior performance on the CelebAMask-HQ dataset, demonstrating good generalization. Our code has been released at https://github.com/aod321/STN-iCNN.

8.
Even though individual-based models (IBMs) have become very popular in ecology during the last decade, there have been few attempts to implement behavioural aspects in IBMs. This is partly due to a lack of appropriate techniques. Behavioural and life history aspects can be implemented in IBMs through adaptive models based on genetic algorithms and neural networks (individual-based-neural network-genetic algorithm, ING). To investigate the precision of the adaptation process, we present three cases where solutions can be found by optimisation. These cases include a state-dependent patch selection problem, a simple game between predators and prey, and a more complex vertical migration scenario for a planktivorous fish. In all cases, the optimal solution is calculated and compared with the solution achieved using ING. The results show that the ING method finds optimal or close to optimal solutions for the problems presented. In addition, it has a wider range of potential application areas than conventional techniques in behavioural modelling. The method is especially well suited for complex problems where other methods fail to provide answers. This revised version was published online in July 2006 with corrections to the Cover Date.

9.
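Foraminifera are tiny, abundant, geographically widespread, and rapidly evolving. They are important recorders of marine sedimentary environments and play a very important role in the subdivision and correlation of marine biostratigraphy. Because foraminifera comprise many genera and species, traditional identification requires experienced specialists to identify specimens manually, which is time-consuming; manual identification of fossils also suffers from a shortage of qualified personnel and a heavy workload. Applying convolutional neural networks from the field of computer vision can largely resolve these problems. Guided by paleontologists' annotations of Miocene planktonic foraminiferal fossils, and classifying the fossils by viewing direction, we developed an image recognition system for foraminiferal fossils based on a convolutional neural network algorithm. The study found that, by classifying foraminiferal fossils into ventral, edge, and dorsal views and applying a two-stage identification algorithm to recognize Miocene planktonic foraminifera at the genus level, the genus-level identification accuracy reached about 82%.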

10.
The state of the art in computer modelling of neural networks with associative memory is reviewed. The available experimental data on learning and memory in small neural systems, in isolated synapses, and at the molecular level are considered. Computer simulations demonstrate that realistic models of neural ensembles exhibit properties which can be interpreted as image recognition, categorization, learning, prototype forming, etc. A bilayer model of an associative neural network is proposed. One layer corresponds to the short-term memory, the other one to the long-term memory. Patterns are stored in terms of the synaptic strength matrix. We have studied the relaxational dynamics of neuron firing and suppression within the short-term memory layer under the influence of the long-term memory layer. The interaction between the layers was found to create a number of novel stable states which are not the learning patterns. These synthetic patterns may consist of elements belonging to different non-intersecting learning patterns. Within the framework of a hypothesis of selective and definite coding of images in the brain, one can interpret the observed effect as the "idea-generating" process.

11.
Yue Cao, Yang Shen 《Proteins》2020,88(8):1091-1099
Structural information about protein-protein interactions, often missing at the interactome scale, is important for mechanistic understanding of cells and rational discovery of therapeutics. Protein docking provides a computational alternative for such information. However, ranking near-native docked models high among a large number of candidates, often known as the scoring problem, remains a critical challenge. Moreover, estimating model quality, also known as the quality assessment problem, is rarely addressed in protein docking. In this study, the two challenging problems in protein docking are regarded as relative and absolute scoring, respectively, and addressed in one physics-inspired deep learning framework. We represent protein and complex structures as intra- and inter-molecular residue contact graphs with atom-resolution node and edge features. We also propose a novel graph convolutional kernel that aggregates interacting nodes’ features through edges so that generalized interaction energies can be learned directly from 3D data. The resulting energy-based graph convolutional networks (EGCN) with multihead attention are trained to predict intra- and inter-molecular energies, binding affinities, and quality measures (interface RMSD) for encounter complexes. Compared to a state-of-the-art scoring function for model ranking, EGCN significantly improves ranking for a critical assessment of predicted interactions (CAPRI) test set involving homology docking, and is comparable or slightly better for Score_set, a CAPRI benchmark set generated by diverse community-wide docking protocols not known to training data. For Score_set quality assessment, EGCN shows about 27% improvement over our previous efforts. Directly learning from 3D structure data in graph representation, EGCN represents the first successful development of graph convolutional networks for protein docking.

12.
Ground cover and surface vegetation information are key inputs to wildfire propagation models and are important indicators of ecosystem health. Often these variables are approximated using visual estimation by trained professionals, but the results are prone to bias and error. This study analyzed the viability of using nadir or downward photos from smartphones (iPhone 7) to provide quantitative ground cover and biomass loading estimates. Good correlations were found between field-measured values and pixel counts from manually segmented photos delineating a pre-defined set of 10 discrete cover types. Although promising, segmenting photos manually was labor intensive and therefore costly. We explored the viability of using a trained deep convolutional neural network (DCNN) to perform image segmentation automatically. The DCNN was able to segment nadir images with 95% accuracy when compared with manually delineated photos. To validate the flexibility and robustness of the automated image segmentation algorithm, we applied it to an independent dataset of nadir photographs captured at a different study site with surface vegetation characteristics similar to those of the training site, with promising results.

13.
In this paper we describe an improved neural network method to predict T-cell class I epitopes. A novel input representation has been developed consisting of a combination of sparse encoding, Blosum encoding, and input derived from hidden Markov models. We demonstrate that the combination of several neural networks derived using different sequence-encoding schemes has a performance superior to neural networks derived using a single sequence-encoding scheme. The new method is shown to have a performance that is substantially higher than that of other methods. By use of mutual information calculations, we show that peptides that bind to the HLA A*0204 complex display a signal of higher-order sequence correlations. Neural networks are ideally suited to integrate such higher-order correlations when predicting the binding affinity. It is this feature, combined with the use of several neural networks derived from different and novel sequence-encoding schemes and the ability of the neural network to be trained on data consisting of continuous binding affinities, that gives the new method an improved performance. The difference in predictive performance between the neural network methods and that of the matrix-driven methods is found to be most significant for peptides that bind strongly to the HLA molecule, confirming that the signal of higher-order sequence correlation is most strongly present in high-binding peptides. Finally, we use the method to predict T-cell epitopes for the genome of hepatitis C virus and discuss possible applications of the prediction method to guide the process of rational vaccine design.
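
As an illustration of the sparse encoding named in this abstract, the sketch below maps each residue of a peptide to a 20-element indicator vector and concatenates the results. The 0/1 values, the residue ordering, and the function name `sparse_encode` are assumptions made for this example; the abstract does not specify the exact numerical values used by the authors.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def sparse_encode(peptide: str) -> List[int]:
    """Concatenate one 20-element indicator vector per residue."""
    encoded: List[int] = []
    for aa in peptide:
        row = [0] * len(AMINO_ACIDS)
        row[INDEX[aa]] = 1  # raises KeyError for non-standard residues
        encoded.extend(row)
    return encoded


vector = sparse_encode("AC")
print(len(vector), sum(vector))
```

A Blosum encoding would replace each indicator row with the corresponding row of a substitution matrix, so that similar residues receive similar inputs.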

14.
Traditional regression analysis of body weight growth curves encounters problems when the data are extremely variable. While transformations are often employed to meet the criteria of the analysis, some transformations are inadequate for normalizing the data. Regression analysis also requires presuppositions regarding the model to be fit and the techniques to be used in the analysis. An alternative approach using artificial neural networks is presented which may be suitable for developing predictive models of growth. Neural networks are simulators of the processes that occur in the biological brain during the learning process. They are trained on the data, developing the necessary algorithms within their internal architecture, and produce a predictive model based on the learned facts. A dataset of Sprague–Dawley rat (Rattus norvegicus) weights is analyzed by both traditional regression analysis and neural network training. Predictions of body weight are made from both models. While both methods produce models that adequately predict the body weights, the neural network model is superior in that it combines accuracy and precision, being less influenced by longitudinal variability in the data. Thus, the neural network provides another tool for researchers to analyze growth curve data.

15.

One fundamental problem of protein biochemistry is to predict protein structure from amino acid sequence. The inverse problem, predicting either entire sequences or individual mutations that are consistent with a given protein structure, has received much less attention even though it has important applications in both protein engineering and evolutionary biology. Here, we ask whether 3D convolutional neural networks (3D CNNs) can learn the local fitness landscape of protein structure to reliably predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding a site of interest. We find that the network can predict the wild type with good accuracy, and that network confidence is a reliable measure of whether a given prediction is likely to be correct. Predictions of the consensus are less accurate and are primarily driven by whether or not the consensus matches the wild type. Our work suggests that high-confidence mis-predictions of the wild type may identify sites that are primed for mutation and likely targets for protein engineering.


16.
Co-evolutionary models such as direct coupling analysis (DCA) in combination with machine learning (ML) techniques based on deep neural networks are able to predict accurate protein contact or distance maps. Such information can be used as constraints in structure prediction and massively increase prediction accuracy. Unfortunately, the same ML methods cannot readily be applied to RNA as they rely on large structural datasets only available for proteins. Here, we demonstrate how the smaller datasets available for RNA can be used to improve prediction of RNA contact maps. We introduce an algorithm called CoCoNet that is based on a combination of a Coevolutionary model and a shallow Convolutional Neural Network. Despite its simplicity and the small number of trained parameters, the method boosts the positive predictive value (PPV) of predicted contacts by about 70% with respect to DCA, as tested by cross-validation on about eighty RNA structures. However, the direct inclusion of the CoCoNet contacts in 3D modeling tools does not result in a proportional increase of the 3D RNA structure prediction accuracy. Therefore, we suggest that the field develop, in addition to contact PPV, metrics that better estimate the expected impact on 3D structure modeling tools. CoCoNet is freely available and can be found at https://github.com/KIT-MBS/coconet.
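
The positive predictive value used in this abstract is the fraction of predicted contacts that are true contacts. The sketch below is an illustration only: the function name `contact_ppv` and the example residue pairs are hypothetical, and it is not part of the CoCoNet code base.

```python
from typing import Iterable, Tuple

Pair = Tuple[int, int]


def contact_ppv(predicted: Iterable[Pair], true_contacts: Iterable[Pair]) -> float:
    """PPV = TP / (TP + FP) over predicted contacts, ignoring pair order."""
    pred = {(min(p), max(p)) for p in predicted}
    truth = {(min(p), max(p)) for p in true_contacts}
    if not pred:
        return 0.0
    return len(pred & truth) / len(pred)


# Hypothetical predictions and reference contacts (residue index pairs).
predicted_pairs = [(1, 10), (12, 3), (5, 20), (7, 30)]
reference_pairs = [(1, 10), (3, 12), (5, 21)]
ppv = contact_ppv(predicted_pairs, reference_pairs)
print(ppv)
```

In practice the metric is usually evaluated on the top-ranked predictions only; the cutoff used by the authors is not given in the abstract.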

17.
 Chains of coupled oscillators of simple “rotator” type have been used to model the central pattern generator (CPG) for locomotion in lamprey, among numerous applications in biology and elsewhere. In this paper, motivated by experiments on lamprey CPG with brainstem attached, we investigate a simple oscillator model with internal structure which captures both excitable and bursting dynamics. This model, and that for the coupling functions, is inspired by the Hodgkin–Huxley equations and two-variable simplifications thereof. We analyse pairs of coupled oscillators with both excitatory and inhibitory coupling. We also study traveling wave patterns arising from chains of oscillators, including simulations of “body shapes” generated by a double chain of oscillators providing input to a kinematic musculature model of lamprey. Received: 25 November 1996 / Revised version: 9 December 1997

18.
Designing protein sequences that can fold into a given structure is a well‐known inverse protein‐folding problem. One important characteristic for a protein design program is the ability to recover wild‐type sequences given their native backbone structures. The highest average sequence identity achieved by current protein‐design programs in this problem is around 30%, achieved by our previous system, SPIN. SPIN is a program that predicts sequences compatible with a provided structure using a neural network with fragment‐based local and energy‐based nonlocal profiles. Our new model, SPIN2, uses a deep neural network and additional structural features to improve on SPIN. SPIN2 achieves over 34% in sequence recovery in 10‐fold cross‐validation and independent tests, a 4% improvement over the previous version. The sequence profiles generated from SPIN2 are expected to be useful for improving existing fold recognition and protein design techniques. SPIN2 is available at http://sparks-lab.org .
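
Sequence recovery, as used in this abstract, can be read as the fraction of positions at which a designed sequence matches the native one. The sketch below assumes equal-length, already aligned sequences; the function name `sequence_recovery` and the example sequences are hypothetical and do not come from SPIN2.

```python
def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of aligned positions where the designed residue equals the native one."""
    if len(designed) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        return 0.0
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


# Hypothetical 10-residue example: 8 of 10 positions agree.
recovery = sequence_recovery("ACDEFGHIKL", "ACDQFGHIKV")
print(recovery)
```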

19.
Deciphering metabolic networks. Total citations: 14; self-citations: 0; citations by others: 14

20.
With the emergence of high-throughput single-cell techniques, the understanding of the molecular and cellular diversity of mammalian organs has rapidly increased. In order to understand the spatial organization of this diversity, single-cell data is often integrated with spatial data to create probabilistic cell maps. However, targeted cell typing approaches relying on existing single-cell data achieve incomplete and biased maps that could mask the true diversity present in a tissue slide. Here we applied a de novo technique to spatially resolve and characterize cellular diversity of in situ sequencing data during human heart development. We obtained and made accessible well-defined spatial cell-type maps of fetal hearts from 4.5 to 9 post-conception weeks, not biased by probabilistic cell typing approaches. With our analysis, we could characterize previously unreported molecular diversity within cardiomyocytes and epicardial cells and identify their characteristic expression signatures, comparing them with specific subpopulations found in single-cell RNA sequencing datasets. We further characterized the differentiation trajectories of epicardial cells, identifying a clear spatial component in them. All in all, our study provides a novel technique for conducting de novo spatial-temporal analyses in developmental tissue samples and a useful resource for online exploration of cell-type differentiation during heart development at sub-cellular image resolution.
