Similar articles
Found 20 similar articles (search time: 0 ms)
1.
The accurate identification of plant species is crucial for the conservation of biodiversity. However, traditional methods for identifying plant species are often complicated, time-consuming, and prone to errors. Therefore, it is essential to address these challenges and develop automated identification methods to enhance the efficiency and accuracy of plant species identification. In this study, a step-by-step method was utilized to identify and classify plant species. The dataset was first loaded, and then preprocessing was performed to remove noisy data. Following that, data augmentation was carried out to improve model accuracy. The deep convolutional neural network (CNN) and visual geometry group-16 (VGG-16) were then employed to extract only the relevant features, owing to their efficient learning capabilities. Feature-level fusion was accomplished by utilizing dimensionality reduction, and enhanced Spearman's principal component analysis (ESPCA) was employed to address the overfitting problem, eliminate redundant data, and reduce storage space and training time requirements. For classification, the hyperparameter-tuned batch-updated stochastic gradient descent (HP-BSGD) method was utilized. The Flavia and Swedish datasets were utilized in the experiments. The proposed hybrid classifier yielded excellent results due to its high convergence speed, good computational effectiveness, and high flexibility. To validate the experimental results, performance and comparative analyses were carried out using standard metrics. The analytical results demonstrated the superior efficiency and suitability of the proposed method in the classification of plant species over existing methods. The hybrid method achieved approximately 97% and 98.85% accuracy in the Flavia and Swedish datasets, respectively, when considering combined features. 
The performance of the proposed method was further enhanced by considering leaves at different stages, such as seedling, tiny, mature, and dried leaves.
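The fusion and classification steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: fusion is assumed to be plain concatenation of two feature vectors, the ESPCA dimensionality-reduction step is omitted, and the classifier is ordinary mini-batch SGD on binary logistic regression rather than the paper's HP-BSGD. The function names and the toy feature values are hypothetical.

```python
import math

def fuse(features_a, features_b):
    """Feature-level fusion, assumed here to be simple concatenation."""
    return list(features_a) + list(features_b)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_minibatch_sgd(samples, labels, lr=0.5, batch_size=2, epochs=50):
    """Fit binary logistic regression with plain mini-batch SGD (no shuffling)."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for start in range(0, len(samples), batch_size):
            batch_x = samples[start:start + batch_size]
            batch_y = labels[start:start + batch_size]
            grad_w = [0.0] * n_features
            grad_b = 0.0
            for x, y in zip(batch_x, batch_y):
                # Gradient of the log-loss for one sample is (prediction - label) * x.
                error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
                for j, xi in enumerate(x):
                    grad_w[j] += error * xi
                grad_b += error
            m = len(batch_x)
            weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
            bias -= lr * grad_b / m
    return weights, bias

def predict(weights, bias, x):
    """Return class 1 when the linear score is positive, else class 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Hypothetical one-dimensional features from two extractors, for four leaves.
cnn_features = [[-2.0], [-1.0], [1.0], [2.0]]
vgg_features = [[-1.0], [-2.0], [2.0], [1.0]]
labels = [0, 0, 1, 1]

fused = [fuse(a, b) for a, b in zip(cnn_features, vgg_features)]
weights, bias = train_minibatch_sgd(fused, labels)
predictions = [predict(weights, bias, x) for x in fused]
```

The learning rate, batch size, and epoch count are the hyperparameters one would tune in such a setup; the values above are arbitrary.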

2.
Stano M, Klucar L. Genomics, 2011, 98(5): 376-380
phiGENOME is a web-based genome browser that generates dynamic and interactive graphical representations of phage genomes stored in phiSITE, a database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and was optimised for visualisation of phage genomes with an emphasis on gene regulatory elements. phiGENOME consists of three components: (i) a genome map viewer built using Adobe Flash technology, providing a dynamic and interactive graphical display of phage genomes; (ii) a sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features at the sequence level; and (iii) a regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulation. With 542 complete genome sequences accompanied by rich annotations and references, phiGENOME is a unique information resource in the field of phage genomics.

3.
Maturation of bacteriophage SPP1 is imprecise. Although terminally redundant and circularly permuted molecules were always formed, individual molecules varied by more than 200 base pairs from each other.

4.
DNAs isolated from four strains of Brucella bacteriophages were studied by restriction endonuclease mapping and Southern blot analysis. In all strains the genome was composed of a 38 kb (25.1 × 10^6 dalton) double-stranded circular DNA. The physical map was the same for the four genomes, and Southern blot hybridization of restriction endonuclease fragments with the Tbilissi strain DNA as a probe showed complete homology between the four DNAs. Thus, the four phage strains appear to be identical, the specific host range of each originating from minor changes in phage or Brucella receptors or both.

5.
The draft sequence of several complete protozoan genomes is now available and genome projects are ongoing for a number of other species. Different strategies are being implemented to identify and annotate protein coding and RNA genes in these genomes, as well as study their genomic architecture. Since the genomes vary greatly in size, GC-content, nucleotide composition, and degree of repetitiveness, genome structure is often a factor in choosing the methodology utilised for annotation. In addition, the approach taken is dictated, to a greater or lesser extent, by the particular reasons for carrying out genome-wide analyses and the level of funding available for projects. Nevertheless, these projects have provided a plethora of material that will aid in understanding the biology and evolution of these parasites, as well as identifying new targets that can be used to design urgently required drug treatments for the diseases they cause.

6.
We describe a PCR system that distinguishes the A, B and D genomes in wheat DNA extracts. PCRs were directed at the ‘non-transcribed spacer’ regions of the rDNA loci. The spacers within the D genome locus have a 71-bp insertion that is absent from the corresponding A and B loci. PCR product sizes therefore enable D- and D+ genomes to be distinguished. The A and B genomes can be differentiated by PCR with an internal primer which does not anneal to A genome sequences. This work is relevant to the ancient ecology of wheat, as it is often difficult to determine ploidy level from morphological examination of archaeobotanical remains.

7.
8.
Eukaryotic protein secretion generally occurs via the classical secretory pathway that traverses the ER and Golgi apparatus. Secreted proteins usually contain a signal sequence with all the essential information required to target them for secretion. However, some proteins, such as fibroblast growth factors (FGF-1, FGF-2), interleukins (IL-1 alpha, IL-1 beta), galectins and thioredoxin, are exported by an alternative pathway. This is known as leaderless or non-classical secretion and works without a signal sequence. Most computational methods for the identification of secretory proteins use the signal peptide as an indicator and are therefore not able to identify substrates of non-classical secretion. In this work, we report a random forest method, SPRED, to identify secretory proteins from protein sequences irrespective of N-terminal signal peptides, thus also allowing correct classification of non-classical secretory proteins. Training was performed on a dataset containing 600 extracellular proteins and 600 cytoplasmic and/or nuclear proteins. The algorithm was tested on 180 extracellular proteins and 1380 cytoplasmic and/or nuclear proteins. We obtained 85.92% accuracy from training and 82.18% accuracy from testing. Since SPRED does not use N-terminal signals, it can detect non-classical secreted proteins by filtering out those predicted secreted proteins that carry an N-terminal signal, as identified by SignalP. SPRED predicted 15 out of 19 experimentally verified non-classical secretory proteins. By scanning the entire human proteome we identified 566 protein sequences potentially undergoing non-classical secretion. The dataset and standalone version of the SPRED software are available at http://www.inb.uni-luebeck.de/tools-demos/spred/spred.
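The filtering step described above amounts to a set difference, sketched below in Python. The function names and protein identifiers are hypothetical, and the two input sets stand in for the outputs of the secretion predictor and the signal-peptide predictor; neither tool's actual interface is modelled here.

```python
def non_classical_candidates(predicted_secreted, has_signal_peptide):
    """Return predicted secreted proteins that lack an N-terminal signal peptide."""
    return sorted(pid for pid in predicted_secreted if pid not in has_signal_peptide)

def recall(true_positives, total_positives):
    """Fraction of known positives that were recovered."""
    return true_positives / total_positives

# Hypothetical identifiers, for illustration only.
secreted = {"protA", "protB", "protC"}
with_signal = {"protB"}
candidates = non_classical_candidates(secreted, with_signal)

# The abstract reports 15 of 19 verified non-classical proteins recovered.
reported_recall = recall(15, 19)
```

Under this reading, the reported 15 out of 19 corresponds to a recall of roughly 79% on the experimentally verified set.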

9.
Gene identification in novel eukaryotic genomes by self-training algorithm   Total citations: 8, self-citations: 0, citations by others: 8
Finding new protein-coding genes is one of the most important goals of eukaryotic genome sequencing projects. However, the genomic organization of novel eukaryotic genomes is diverse, and ab initio gene finding tools tuned for previously studied species are rarely suitable for efficacious gene hunting in DNA sequences of a new genome. Gene identification methods based on cDNA and expressed sequence tag (EST) mapping to genomic DNA, or those using alignments to closely related genomes, rely on the existence of abundant cDNA and EST data and/or the availability of reference genomes. Conventional statistical ab initio methods require large training sets of validated genes for estimating gene model parameters. In practice, none of these types of data may be available in sufficient amounts until rather late stages of novel genome sequencing. Nevertheless, we have shown that gene finding in eukaryotic genomes can be carried out in parallel with statistical model estimation directly from as yet anonymous genomic DNA. The suggested method of parallelizing gene prediction with model parameter estimation follows the path of iterative Viterbi training. Rounds of genomic sequence labeling into coding and non-coding regions are followed by rounds of model parameter estimation. Several dynamically changing restrictions on the possible range of model parameters are added to filter out fluctuations in the initial steps of the algorithm that could redirect the iteration process away from the biologically relevant point in parameter space. Tests on well-studied eukaryotic genomes have shown that the new method performs comparably to or better than conventional methods in which supervised model training precedes the gene prediction step. Several novel genomes have been analyzed and biologically interesting findings are discussed.
Thus, a self-training algorithm that had been assumed feasible only for prokaryotic genomes has now been developed for ab initio eukaryotic gene identification.
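The alternation between labeling and parameter estimation can be illustrated with a toy version of iterative Viterbi training. The sketch below assumes a two-state hidden Markov model (non-coding and coding) over single nucleotides, which is far simpler than a real gene model; the dynamically changing parameter restrictions mentioned in the abstract are omitted, and pseudocounts are an added assumption to keep probabilities non-zero. All names and starting values are hypothetical.

```python
import math

STATES = (0, 1)      # 0 = non-coding, 1 = coding (illustrative labels)
ALPHABET = "ACGT"

def viterbi(seq, start, trans, emit):
    """Most probable state path for seq under the given HMM, in log space."""
    scores = [math.log(start[s]) + math.log(emit[s][seq[0]]) for s in STATES]
    back = []
    for symbol in seq[1:]:
        pointers, new_scores = [], []
        for s in STATES:
            best_prev = max(STATES, key=lambda p: scores[p] + math.log(trans[p][s]))
            pointers.append(best_prev)
            new_scores.append(scores[best_prev] + math.log(trans[best_prev][s])
                              + math.log(emit[s][symbol]))
        back.append(pointers)
        scores = new_scores
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back):   # trace the best path backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]

def reestimate(seq, path, pseudo=1.0):
    """Re-estimate transition and emission probabilities from a labeled path."""
    emit_counts = [{c: pseudo for c in ALPHABET} for _ in STATES]
    trans_counts = [[pseudo for _ in STATES] for _ in STATES]
    for symbol, state in zip(seq, path):
        emit_counts[state][symbol] += 1
    for prev, cur in zip(path, path[1:]):
        trans_counts[prev][cur] += 1
    emit = [{c: n / sum(row.values()) for c, n in row.items()} for row in emit_counts]
    trans = [[n / sum(row) for n in row] for row in trans_counts]
    return trans, emit

def viterbi_training(seq, start, trans, emit, max_rounds=20):
    """Alternate labeling and re-estimation until the labeling stops changing."""
    path = None
    for _ in range(max_rounds):
        new_path = viterbi(seq, start, trans, emit)
        if new_path == path:
            break
        path = new_path
        trans, emit = reestimate(seq, path)
    return path, trans, emit

# Toy sequence and arbitrary initial parameters.
seq = "ATATATATGCGCGCGCGCATATATAT"
start = [0.5, 0.5]
init_trans = [[0.9, 0.1], [0.1, 0.9]]
init_emit = [{"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
             {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2}]
path, trans, emit = viterbi_training(seq, start, init_trans, init_emit)
```

Because no labeled training genes are supplied, the result depends on the initial parameters, which is presumably why the abstract describes restrictions on the parameter range during the first iterations.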

10.
Cells interact mechanically with their surroundings by exerting and sensing forces. Traction force microscopy (TFM), purported to map cell-generated forces or stresses, represents an important tool that has powered the rapid advances in mechanobiology. However, to solve the ill-posed mathematical problem, conventional TFM involved compromises in accuracy and/or resolution. Here, we applied neural network-based deep learning as an alternative approach for TFM. We modified a neural network designed for image processing to predict the vector field of stress from displacements. Furthermore, we adapted a mathematical model for cell migration to generate large sets of simulated stresses and displacements for training and testing the neural network. We found that deep learning-based TFM yielded results that resemble those obtained using conventional TFM but at a higher accuracy than several conventional implementations tested. In addition, a trained neural network is applicable to a wide range of conditions, including cell size, shape, substrate stiffness, and traction output. The performance of deep learning-based TFM makes it an appealing alternative to conventional methods for characterizing mechanical interactions between adherent cells and the environment.

11.
12.
SUMMARY: PRIMEX (PRImer Match EXtractor) can detect oligonucleotide sequences in whole genomes, allowing for mismatches. Using a word lookup table and server functionality, PRIMEX accepts queries from client software and returns matches rapidly. We find it faster and more sensitive than currently available tools. AVAILABILITY: Running applications and source code have been made available at http://bioinformatics.cribi.unipd.it/primex
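The general idea of a word lookup table with mismatch-tolerant matching can be sketched as below. This is not PRIMEX's code or interface: the function names, the seed-and-verify strategy, and the toy genome are assumptions made for illustration. Every word of the query is used as an exact seed, and each candidate position is then verified by counting mismatches.

```python
from collections import defaultdict

def build_word_table(genome, word_size):
    """Map every word of length word_size to the positions where it occurs."""
    table = defaultdict(list)
    for i in range(len(genome) - word_size + 1):
        table[genome[i:i + word_size]].append(i)
    return table

def find_matches(genome, table, word_size, query, max_mismatches):
    """Return sorted (position, mismatches) pairs for query within max_mismatches."""
    hits = {}
    for offset in range(len(query) - word_size + 1):
        # Exact seed lookup, then verification of the full-length alignment.
        for pos in table.get(query[offset:offset + word_size], ()):
            start = pos - offset
            if start < 0 or start + len(query) > len(genome) or start in hits:
                continue
            window = genome[start:start + len(query)]
            mismatches = sum(a != b for a, b in zip(query, window))
            if mismatches <= max_mismatches:
                hits[start] = mismatches
    return sorted(hits.items())

# Toy genome containing one exact and one single-mismatch copy of the query.
genome = "GGACGTTGCAAACGATGCATT"
table = build_word_table(genome, 4)
matches = find_matches(genome, table, 4, "ACGTTGCA", 1)
# matches == [(2, 0), (11, 1)]
```

With this strategy, all matches are found only when the word size is at most the query length divided by (max_mismatches + 1), rounded down, since at least one such word must then be free of mismatches. In a server setting the table would be built once and reused across queries.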

13.
14.
Imaging sebaceous glands and evaluating morphometric parameters are important for the diagnosis and treatment of sebum problems. In this article, we investigate the feasibility of high-resolution optical coherence tomography (OCT) in combination with deep learning-assisted automatic identification for these purposes. Specifically, with a spatial resolution of 2.3 μm × 6.2 μm (axial × lateral, in air), OCT is capable of clearly differentiating sebaceous glands from other skin structures and resolving the sebocyte layer. To achieve efficient and timely image analysis, a deep learning approach built upon ResNet18 is developed to automatically classify OCT images (with/without sebaceous gland), with a classification accuracy of 97.9%. Based on the result of automatic identification, we further demonstrate the possibility of measuring gland size, sebocyte layer thickness and gland density.

15.
Gao Y, Luo L. Gene, 2012, 492(1): 309-314
Sequence alignment is not directly applicable to whole genome phylogeny since several events such as rearrangements make full length alignments impossible. Here, a novel alignment-free method derived from the standpoint of information theory is proposed and used to construct the whole-genome phylogeny for a population of viruses from 13 viral families comprising 218 dsDNA viruses. The method is based on information correlation (IC) and partial information correlation (PIC). We observe that (i) the IC-PIC tree segregates the population into clades, the membership of each of which is remarkably consistent with biologists' systematics, with only a few exceptions; (ii) the IC-PIC tree reveals potential evolutionary relationships among some viral families; and (iii) the IC-PIC tree predicts the taxonomic positions of certain “unclassified” viruses. Our approach provides a new way of recovering the phylogeny of viruses, and has practical applications in developing alignment-free methods for sequence classification.

16.
Kaleel M, Torrisi M, Mooney C, Pollastri G. Amino Acids, 2019, 51(9): 1289-1296
Predicting the three-dimensional structure of proteins is a long-standing challenge of computational biology, as the structure (or lack of a rigid structure) is well known to determine a protein’s function. Predicting relative solvent accessibility (RSA) of amino acids within a protein is a significant step towards resolving the protein structure prediction challenge, especially in cases in which structural information about a protein is not available by homology transfer. Today, arguably the core of the most powerful methods for predicting RSA and other structural features of proteins is some form of deep learning, and all the state-of-the-art protein structure prediction tools rely on some machine learning algorithm. In this article we present a deep neural network architecture composed of stacks of bidirectional recurrent neural networks and convolutional layers, which is capable of mining information from long-range interactions within a protein sequence and applying it to the prediction of protein RSA using a novel encoding method that we shall call “clipped”. The final system we present, PaleAle 5.0, which is available as a public server, predicts RSA into two, three and four classes at an accuracy exceeding 80% in two classes, surpassing the performance of all the other predictors we have benchmarked.

17.
18.
Synonymous codon usage patterns of bacteriophage and host genomes were compared. Two indexes, G + C base composition of a gene (fgc) and fraction of translationally optimal codons of the gene (fop), were used in the comparison. Synonymous codon usage data of all the coding sequences in a genome are represented as a cloud of points in the plane of fop vs. fgc. The Escherichia coli coding sequences appear to exhibit two phases, "rising" and "flat" phases. Genes that are essential for survival and are thought to be native are located in the flat phase, while foreign-type genes from prophages and transposons are found in the rising phase with a slope of nearly unity in the fop vs. fgc plane. Synonymous codon distribution patterns of genes from temperate phages P4, P2, N15 and lambda are similar to the pattern of E. coli rising phase genes. In contrast, genes from the virulent phage T7 or T4, for which a phage-encoded DNA polymerase is identified, fall on a line with a slope of nearly zero in the fop vs. fgc plane. These results may suggest that the G + C contents for T7, T4 and E. coli flat phase genes are subject to directional mutation pressure and are determined by the DNA polymerase used in replication. There is significant variation in the fop values of the phage genes, suggesting an adjustment to gene expression level. Similar analyses of codon distribution patterns were carried out for Haemophilus influenzae, Bacillus subtilis, Mycobacterium tuberculosis and their phages with complete genomic sequences available.
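The two indexes can be computed per gene along the following lines. This is a minimal sketch with assumptions: fop is taken as the share of counted codons that fall in a supplied set of optimal codons, with an optional set of codons left out of the count; the abstract does not give the exact definition or the optimal-codon table used. The function names, the example sequence, and the codon sets are illustrative placeholders, not a real organism's table.

```python
def gc_fraction(cds):
    """G + C fraction (fgc) of a coding sequence."""
    cds = cds.upper()
    return sum(base in "GC" for base in cds) / len(cds)

def optimal_codon_fraction(cds, optimal_codons, ignored_codons=frozenset()):
    """Fraction of counted codons that belong to optimal_codons (fop)."""
    cds = cds.upper()
    # Split into complete codons, dropping any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counted = [c for c in codons if c not in ignored_codons]
    return sum(c in optimal_codons for c in counted) / len(counted)

# Illustrative inputs only; the optimal-codon set is a placeholder.
cds = "ATGCTGAAAGGTCTGTAA"
optimal = {"CTG", "GGT"}
ignored = {"ATG", "TGG", "TAA", "TAG", "TGA"}

fgc = gc_fraction(cds)                                # 7 of 18 bases are G or C
fop = optimal_codon_fraction(cds, optimal, ignored)   # 3 of 4 counted codons
```

Each gene then contributes one (fgc, fop) point, and plotting all genes of a genome gives the cloud of points described above.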

19.
Driver fatigue is attracting more and more attention, as it is the main cause of traffic accidents, which bring great harm to society and families. This paper proposes using deep convolutional neural networks and deep residual learning to predict the mental states of drivers from electroencephalography (EEG) signals. Accordingly, we have developed two mental state classification models called EEG-Conv and EEG-Conv-R. Tested in both intra- and inter-subject settings, both models outperform the traditional LSTM- and SVM-based classifiers. Our major findings include: (1) both EEG-Conv and EEG-Conv-R yield very good classification performance for mental state prediction; (2) EEG-Conv-R is more suitable for inter-subject mental state prediction; (3) EEG-Conv-R converges more quickly than EEG-Conv. In summary, our proposed classifiers have better predictive power and are promising for application in practical brain-computer interaction.

20.
We thank Gurgel-Goncalves et al. for offering their insights and opinions on our research on the automatic identification of vectors for Chagas disease. Our reply will address the key points raised and provide clarification on the following topics: (1) expert supervision, (2) labeling of the dataset, (3) recognition of the results from benchmark methods, (4) quality of training images, and (5) balancing of data.
