Similar literature
 20 similar documents found (search time: 218 ms)
1.
2.
Handwritten character recognition has long been an active field of study in pattern recognition because of its many real-life applications, such as reading aids for blind people and automated reading of handwritten bank cheques. Accurate conversion of handwriting into structured digital files that computer algorithms can readily recognize and process is therefore required by a variety of applications and systems. This paper proposes an accurate autonomous framework for handwriting recognition that uses a ShuffleNet convolutional neural network (CNN) for multi-class recognition of offline handwritten characters and digits. The system applies transfer learning to ShuffleNet to train, validate, recognize, and categorize a dataset of handwritten character/digit images into 26 classes for the English letters and ten classes for the digits. Experiments show that the proposed system reaches an overall recognition accuracy peaking at 99.50%, outperforming the other state-of-the-art character recognition systems it was compared against. The model also has a low computational cost, averaging 2.7 ms for single-sample inference.

3.
We present an off-line cursive word recognition system based entirely on neural networks: models of reading and models of early visual processing. The first stage (normalization) preprocesses the input image to reduce uncertainty in letter position; the second stage (feature extraction) is based on the feedforward model of orientation selectivity; the third stage (letter pre-recognition) is based on a convolutional neural network; and the last stage (word recognition) is based on the interactive activation model.

4.
Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. Here, however, we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks with the ‘Extreme Learning Machine’ (ELM) approach, which also allows very rapid training (∼10 minutes). Adding distortions, as is common practice for MNIST, reduces error rates even further. Our methods are also shown to achieve error rates below 5.5% on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure that each hidden unit operates only on a randomly sized and positioned patch of each image. This form of random ‘receptive field’ sampling of the input ensures that the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden units required to achieve a given performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should earn it greater consideration, either as a standalone method for simpler problems or as the final classification stage in deep neural networks applied to more difficult problems.
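
The random ‘receptive field’ idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the uniform choice of patch size and position, the minimum patch size, and the uniform weight range are all assumptions made for illustration. Each hidden unit receives random input weights only inside one randomly sized and positioned rectangular patch, and zeros everywhere else.

```python
import random

def receptive_field_mask(height, width, rng, min_size=3):
    """Return a height x width 0/1 mask with one random rectangular patch of ones.

    Patch size and position are drawn uniformly (an assumption of this sketch).
    """
    patch_h = rng.randint(min_size, height)
    patch_w = rng.randint(min_size, width)
    top = rng.randint(0, height - patch_h)
    left = rng.randint(0, width - patch_w)
    return [
        [1 if top <= r < top + patch_h and left <= c < left + patch_w else 0
         for c in range(width)]
        for r in range(height)
    ]

def masked_input_weights(n_hidden, height, width, seed=0):
    """Build one flattened weight row per hidden unit; weights outside the patch are zero."""
    rng = random.Random(seed)
    weights = []
    for _ in range(n_hidden):
        mask = receptive_field_mask(height, width, rng)
        row = [rng.uniform(-1.0, 1.0) * m for mask_row in mask for m in mask_row]
        weights.append(row)
    return weights

def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total

if __name__ == "__main__":
    W = masked_input_weights(n_hidden=100, height=28, width=28, seed=42)
    print(f"fraction of zero input weights: {sparsity(W):.2f}")
```

The sparsity obtained depends entirely on the assumed patch-size distribution, so this sketch should not be expected to reproduce the roughly 90% figure quoted in the abstract.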

5.
6.
A neural network is described that extracts orientation features for use in recognizing hand-drawn characters. The network partitions the input characters into separate line segments (strokes) according to their orientations. It consists of several neural layers, each of which extracts strokes of a particular orientation. Every neural layer has a one-to-one correspondence with the input screen. The network uses an iterative update procedure that includes interactions between neurons within each layer through oriented excitatory connections, and inhibitory interactions between corresponding neurons in different layers. The network was evaluated by computer simulation. Experiments showed that it efficiently classifies all pixels of any hand-drawn character according to the orientations of the strokes that make up the character and, as a result, produces a reasonable segmentation of characters.

7.
This paper describes a method that combines near-infrared spectroscopy with a three-layer back-propagation artificial neural network to identify official and unofficial rhubarbs. Thirty-three samples were used as the training set and 62 samples as the test set. The effects of the number of input nodes, the learning rate, and the momentum on the final error and recognition accuracy for the training set, and on prediction accuracy for the test set, were determined. A neural network with eight input nodes, a learning rate of 0.5, and a momentum of 0.3 achieved a recognition accuracy of 100% on the training set and a prediction accuracy of 96.8% on the test set. The method offers a quick and efficient means of identifying rhubarbs.
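
To make the roles of the learning rate and momentum concrete, here is a minimal sketch of the classical gradient-descent-with-momentum weight update, using the abstract's values (0.5 and 0.3) as defaults. The function name and the toy objective are assumptions for illustration; the abstract does not specify the exact update rule the authors used.

```python
def momentum_step(weights, gradients, velocity, learning_rate=0.5, momentum=0.3):
    """One weight update with momentum.

    velocity_new = momentum * velocity - learning_rate * gradient
    weights_new  = weights + velocity_new
    """
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, gradients)]
    new_weights = [w + dv for w, dv in zip(weights, new_velocity)]
    return new_weights, new_velocity

if __name__ == "__main__":
    # Toy objective (hypothetical): f(w) = (w - 3)^2, with gradient 2 * (w - 3).
    w, v = [0.0], [0.0]
    for step in range(50):
        grad = [2.0 * (w[0] - 3.0)]
        w, v = momentum_step(w, grad, v)
    print(f"w after 50 steps: {w[0]:.6f}")
```

The momentum term reuses a fraction of the previous step, which damps oscillation and can speed convergence; in a real back-propagation network the gradients would come from the error propagated through the layers rather than from a closed-form expression.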

8.
Introduction: To develop real-time image processing for image-guided radiotherapy, we evaluated several neural network models for use with different imaging modalities, including X-ray fluoroscopic image denoising. Methods & materials: Setup images of prostate cancer patients were acquired with two oblique X-ray fluoroscopic units. Two types of residual network were designed: a convolutional autoencoder (rCAE) and a convolutional neural network (rCNN). We varied the convolutional kernel size and the number of convolutional layers for both networks, and the number of pooling and upsampling layers for rCAE. Ground-truth images were produced by applying contrast-limited adaptive histogram equalization (CLAHE). The networks were trained so that, given an unprocessed input image, the quality of the output image stayed close to that of the ground-truth image. For the denoising evaluation, noisy input images were used for training. Results: More than 6 convolutional layers with convolutional kernels >5 × 5 improved image quality, but this did not allow real-time imaging. After a pair of pooling and upsampling layers was added to both networks, rCAEs with >3 convolutions each and rCNNs with >12 convolutions achieved real-time processing at 30 frames per second (fps) with acceptable image quality. Conclusions: Our suggested network achieved real-time image processing for contrast enhancement and image denoising on a conventional modern personal computer.

9.
In this paper, we focus on animal object detection and species classification in camera-trap images collected in highly cluttered natural scenes. Using a deep neural network (DNN) model trained for animal-background image classification, we analyze the input camera-trap images to generate a multi-level visual representation of each image. From this representation we detect semantic regions of interest for animals using k-means clustering and graph cut in the DNN feature domain. These animal regions are then classified into animal species with a multi-class deep neural network model. According to the experimental results, our method achieves 99.75% accuracy for classifying animals versus background and 90.89% accuracy for classifying 26 animal species on the Snapshot Serengeti dataset, outperforming existing image classification methods.

10.
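To explore how well deep neural network models can identify conodont images automatically, this study selected eight Ordovician conodont species as its subjects. A total of 1188 conodont images were captured with a stereomicroscope and 778 more were collected from published literature; the image dataset was divided into a training set and a test set. Rotating, flipping, and filter-enhancing the training images addressed the shortage of training samples. Five residual neural network models (ResNet-18, ResNet-34, ResNet-50, ResNet-101, and ResNet-152) were trained by transfer learning to obtain model parameters. Their test Top-1 accuracies were 85.37%, 85.85%, 83.90%, 81.95%, and 80.00%, respectively, and their Top-2 accuracies were 94.63%, 94.63%, 94.15%, 93.17%, and 93.66%, showing that the models recognize conodont images well. The comparison shows that ResNet-34 had the highest recognition accuracy, indicating that, for conodont genera and species with simple features, increasing network depth does not necessarily improve accuracy, whereas choosing a model of suitable depth both improves recognition accuracy and saves computing resources. Comparing ResNet-34 trained by transfer learning with ResNet-34 trained from scratch shows that transfer learning not only achieves relatively high accuracy but also obtains model parameters faster, so it can serve as an important method for recognizing images of paleontological fossils with small sample sizes. The study also found that recognition accuracy was higher for stereomicroscope images of conodonts than for scanning electron microscope images, and that fossil completeness and similarity, photographic angle, and dataset size are the main factors affecting recognition accuracy.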
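
The Top-1 and Top-2 accuracies reported in this abstract follow the usual top-k definition: a prediction counts as correct when the true class is among the k highest-scoring classes. The following is a minimal sketch of that metric; the function name and the toy scores are hypothetical and are not taken from the study.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-class score lists, one list per sample.
    labels: list of integer class indices, one per sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

if __name__ == "__main__":
    # Toy scores for three samples and three classes (hypothetical values).
    scores = [
        [0.70, 0.20, 0.10],
        [0.35, 0.40, 0.25],
        [0.10, 0.30, 0.60],
    ]
    labels = [0, 0, 1]
    print("Top-1:", top_k_accuracy(scores, labels, k=1))
    print("Top-2:", top_k_accuracy(scores, labels, k=2))
```

Top-2 accuracy can never be lower than Top-1 on the same data, which is consistent with the gap between the two sets of figures above.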

11.
Implementing an accurate face recognition system requires images covering many variations, and a large database brings problems such as storage cost and slow recognition. On the other hand, in some applications only one image per person is available for training the recognition model. In this article, we propose a neural network model, inspired by the brain's bidirectional analysis and synthesis network, that can learn a nonlinear mapping between image space and component space. Using a deep neural network model, we attempt to separate pose components from person components. Once these components are separated, we use them to synthesize virtual images of the test data under different pose and lighting conditions. These virtual images are then used to train a neural network classifier. The results show that training the neural classifier with virtual images gives better performance than training it with frontal-view images.

12.
Deng Y, Guo R, Ding G, Peng D. PLoS ONE, 2012, 7(3): e33337
Both the ventral and dorsal visual streams in the human brain are known to be involved in reading. However, the interaction of these two pathways and their responses to different cognitive demands remain unclear. In this study, activation of neural pathways during Chinese character reading was measured with functional magnetic resonance imaging (fMRI). Visual-spatial analysis (mediated by the dorsal pathway) was dissociated from lexical recognition (mediated by the ventral pathway) via a spatial-based lexical decision task and effective connectivity analysis. Connectivity results revealed that, during spatial processing, the left superior parietal lobule (SPL) positively modulated the left fusiform gyrus (FG), while during lexical processing, the left SPL received positive modulatory input from the left inferior frontal gyrus (IFG) and sent negative modulatory output to the left FG. These findings suggest that the dorsal stream is highly involved in lexical recognition and acts as a top-down modulator for lexical processing.

13.
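Xiao Jincheng, Ou Weixin, Fu Haiyue. Acta Ecologica Sinica (生态学报), 2013, 33(21): 7496-7504
Efficient and accurate remote-sensing classification of wetlands is essential for dynamic monitoring and management of wetland resources over large areas. Using ETM remote-sensing data and the Matlab Neural Network Toolbox, this study built a coastal wetland cover classification model based on a BP (back-propagation) neural network and applied it to classifying natural wetland cover in the core zone of the Jiangsu Yancheng Coastal Wetland Rare Birds National Nature Reserve. Bands 3, 4, 7, and 8 were chosen as input-layer variables, the single hidden layer was given 10 nodes, and the output-layer variables corresponded to the eight cover types to be distinguished, giving a three-layer BP neural network model for coastal wetland cover classification. The overall BP classification accuracy was 85.91% with a Kappa coefficient of 0.8328; compared with the minimum-distance and maximum-likelihood methods, overall accuracy improved by 7.99% and 6.08%, respectively, and the Kappa coefficient also improved. The results indicate that BP neural network classification is a fairly effective technique for classifying wetland remote-sensing imagery and can improve classification accuracy.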
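
Overall accuracy and the Kappa coefficient, the two figures this abstract reports, are both computed from a confusion matrix. The sketch below uses the standard definition of Cohen's kappa; the function name and the 2 x 2 example matrix are hypothetical and are not data from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of true class i assigned to class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Expected chance agreement from the row and column marginals.
    row_sums = [sum(confusion[i]) for i in range(n)]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

if __name__ == "__main__":
    # Hypothetical two-class confusion matrix.
    matrix = [[45, 5],
              [10, 40]]
    accuracy, kappa = overall_accuracy_and_kappa(matrix)
    print(f"overall accuracy = {accuracy:.4f}, kappa = {kappa:.4f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy in both this toy case and the abstract's results.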

14.
Artificial intelligence in pest insect monitoring   Total citations: 1, self-citations: 0, citations by others: 1
Global problems of hunger and malnutrition induced us to introduce a new tool for semi‐automated pest insect identification and monitoring: an artificial neural network system. Multilayer perceptrons, an artificial intelligence method, seem to be efficient for this purpose. We evaluated 101 economically important European thrips (Thysanoptera) species: extrapolation of the verification test data indicated 95% reliability, at least for some of the taxa analysed. Mainly quantitative morphometric characters, such as the length and width of the head, clavus, wing, and ovipositor, formed the set of input variables computed in a Trajan neural network simulator. The technique may be combined with digital image analysis.

15.
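Objective: To evaluate the effectiveness of Chinese character form stimuli in fMRI studies of Chinese character cognition, and to localize and provisionally quantify the cortical regions involved in Chinese character processing. Methods: Ten university students (6 male, 4 female) who were native Chinese speakers, right-handed on a handedness test, and had normal uncorrected visual acuity (1.0 or better) were recruited as subjects. The task used a block design: Chinese characters (non-characters, pseudo-characters, and real characters) were projected onto a screen, and subjects received visual stimulation from the character-form images, presented in the order non-character, pseudo-character, real character, non-character, pseudo-character, real character, for a total of 6 blocks. Data processing and statistical analysis used the internationally adopted AFNI software. Results: The left superior, middle, and inferior frontal gyri (including Broca's area), the left precentral gyrus (BA6), the left superior and inferior parietal lobules (including the supramarginal and angular gyri), and the bilateral occipital lobes and precuneus were significantly activated; the left temporal fusiform gyrus (BA37), the right inferior frontal gyrus, and the bilateral middle and superior temporal gyri and cingulate gyrus were also significantly activated. The activated volume in the left hemisphere was markedly larger than in the right. Conclusion: The Chinese character form stimuli designed in this study, combined with functional magnetic resonance imaging, can localize the cortical regions involved in Chinese character processing, providing an effective non-invasive imaging method for studying the neural mechanisms by which the human brain processes Chinese characters. The fMRI results further confirm that the dominant hemisphere is the left one and that multiple brain regions must work together. This experimental paradigm may become an effective means of examining brain function in patients with language disorders, thereby providing richer information to guide clinical treatment and assess prognosis.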

16.
Lu H, Jiang W, Ghiassi M, Lee S, Nitin M. PLoS ONE, 2012, 7(1): e29704
Leaf characters have been successfully utilized to classify Camellia (Theaceae) species; however, leaf characters combined with supervised pattern recognition techniques have not been previously explored. We present results of using leaf morphological and venation characters of 93 species from five sections of genus Camellia to assess the effectiveness of several supervised pattern recognition techniques for classification and to compare their accuracy. A clustering approach, a Learning Vector Quantization neural network (LVQ-ANN), Dynamic Architecture for Artificial Neural Networks (DAN2), and C-support vector machines (SVM) are used to discriminate 93 species from five sections of genus Camellia (11 in sect. Furfuracea, 16 in sect. Paracamellia, 12 in sect. Tuberculata, 34 in sect. Camellia, and 20 in sect. Theopsis). DAN2 and SVM show excellent classification results for genus Camellia, with DAN2 reaching accuracies of 97.92% and 91.11% on the training and testing data sets, respectively. The RBF-SVM results of 97.92% and 97.78% for training and testing offer the best classification accuracy. A hierarchical dendrogram based on leaf architecture data has confirmed the morphological classification of the five sections as previously proposed. The overall results suggest that leaf architecture-based data analysis using supervised pattern recognition techniques, especially the DAN2 and SVM discrimination methods, is excellent for identification of Camellia species.

17.
We report on our research efforts towards developing efficient equipment for the automatic recognition of insects using only the acoustic modality. Specifically, we deal with three groups of insects: crickets, cicadas, and katydids. Inspired by well-documented tactics of speech processing, the signal processing employed in the present work is elaborated further with respect to the sound production mechanisms of insects. To improve the practical efficacy of our equipment, we adopt a score-level fusion of classifiers with non-parametric (probabilistic neural network) and parametric (Gaussian mixture model) estimation of the probability density function. An efficient hierarchic classification scheme is introduced, in which the identification of unlabelled input takes place at various levels of the hierarchy, such as suborder, family, subfamily, genus, and species. We evaluate the practical significance of our approach on a large and well-documented catalogue of recordings of crickets, cicadas, and katydids. For the hierarchic classification scheme, we report identification accuracy that exceeds 99% at the suborder and family levels. In the straight classification scheme, we report accuracy of 90% for 307 species.
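
Score-level fusion, as mentioned in this abstract, combines the per-class scores of several classifiers before the final decision is taken. The sketch below uses a weighted sum of normalized scores, which is one common fusion rule; the abstract does not say which rule the authors used, and the function names, weights, and example scores here are all hypothetical.

```python
def normalize(scores):
    """Scale a dict of non-negative class scores so that they sum to 1."""
    total = sum(scores.values())
    return {label: value / total for label, value in scores.items()}

def fuse_scores(scores_a, scores_b, weight_a=0.5):
    """Weighted-sum fusion of two classifiers' per-class scores.

    Both dicts must use the same class labels as keys.
    """
    a = normalize(scores_a)
    b = normalize(scores_b)
    return {label: weight_a * a[label] + (1.0 - weight_a) * b[label]
            for label in a}

def predict(fused):
    """Return the class label with the highest fused score."""
    return max(fused, key=fused.get)

if __name__ == "__main__":
    # Hypothetical scores from a non-parametric and a parametric classifier.
    pnn_scores = {"cricket": 0.6, "cicada": 0.3, "katydid": 0.1}
    gmm_scores = {"cricket": 0.2, "cicada": 0.7, "katydid": 0.1}
    fused = fuse_scores(pnn_scores, gmm_scores, weight_a=0.5)
    print(predict(fused), fused)
```

With equal weights the two classifiers disagree on the top class, and the fused score settles the decision in favour of the class on which their combined evidence is strongest.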

18.
Several critical issues associated with the processing of olfactory stimuli in animals (with a focus on insects) are discussed with a view to designing a neural network that can process olfactory stimuli. This leads to the construction of a neural network that can learn and identify the quality (direction cosines) of an input vector, or extract information from a sequence of correlated input vectors, where the latter corresponds to sampling a time-varying olfactory stimulus (or other generically similar pattern recognition problems). The network is constructed around a discrete-time content-addressable memory (CAM) module, which essentially satisfies the Hopfield equations with the addition of a unit time delay feedback. This modification improves the convergence properties of the network and is used to control a switch that activates the learning or template formation process when the input is “unknown”. The network dynamics are embedded within a sniff cycle that includes a larger time delay (i.e. an integer t_s <1), which is also used to control the template formation switch. In addition, this time delay is used to modify the input to the CAM module so that the more dominant of two mingling odors, or an odor increasing against a background of odors, is more readily identified. The performance of the network is evaluated using Monte Carlo simulations, and numerical results are presented.

19.
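Xiang Heyu, Zou Bin, Tang Liang, Chen Weiguo, Rao Kaifeng, Liu Yong, Ma Mei, Yang Yan. Acta Ecologica Sinica (生态学报), 2021, 41(17): 6883-6892
Phytoplankton are among the most important biological components of aquatic ecosystems. Because they are sensitive to the water environment, they have received wide attention in water-quality monitoring. Aquatic environments are complex and varied, however, and identifying phytoplankton accurately and efficiently is a major challenge in monitoring work. Current phytoplankton identification methods fall into three groups: classical morphological classification, molecular markers, and artificial-intelligence image recognition. The first two are widely used but are time-consuming and labor-intensive, which hinders large-scale adoption by monitoring agencies. Likewise, automated image-based classification has struggled to balance high accuracy with high efficiency. Advances in deep learning offer a new approach. This paper proposes a new deep convolutional neural network, RAN-11. Built on the residual attention networks Attention-56 and Attention-92, it fuses bottom-level and top-level features on the trunk through channel alignment, streamlines the structure by adjusting the number of attention modules and residual blocks, and replaces ReLU with the Leaky ReLU activation function. A total of 1036 images of 11 dominant genera from Lake Taihu served as the data for comparative validation. Except for Asterionella, RAN-11 achieved a precision above 90% for every individual dominant genus, and 5 dominant genera reached 100% precision. RAN-11 achieved a recognition accuracy of 95.67% and an inference speed of 41.5 frames/s, making it both more accurate than Attention-92 (95.19% accuracy, 23.6 frames/s) and faster than Attention-56 (94.71% accuracy, 41.2 frames/s), so it balances accuracy and efficiency. The results show that (1) RAN-11 outperforms the original residual attention networks in precision, accuracy, and inference speed, and outperforms even more clearly the traditional image recognition methods represented by the bag-of-words model; and (2) fusing multi-scale features, streamlining the network structure, and optimizing the activation function are effective ways to improve convolutional neural network performance. Building on classical taxonomy, this paper proposes a new residual attention network to improve phytoplankton identification and constructs an automated phytoplankton recognition system that is accurate and easy to deploy, which is of considerable significance for automated monitoring of phytoplankton in water bodies.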
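
One of the changes described in this abstract is the replacement of ReLU with Leaky ReLU. The sketch below contrasts the two activation functions on scalar inputs. The negative slope of 0.01 is a commonly used default and an assumption here; the abstract does not state the value used in RAN-11.

```python
def relu(x):
    """Standard ReLU: negative inputs are clipped to zero."""
    return x if x > 0 else 0.0

def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: negative inputs are scaled by a small slope instead of zeroed.

    The slope of 0.01 is an assumed default, not a value from the paper.
    """
    return x if x > 0 else negative_slope * x

if __name__ == "__main__":
    for value in (-2.0, -0.5, 0.0, 1.5):
        print(f"x={value:5.2f}  relu={relu(value):6.3f}  leaky_relu={leaky_relu(value):6.3f}")
```

Because Leaky ReLU keeps a small non-zero gradient for negative inputs, units are less likely to become permanently inactive during training, which is the usual motivation for the substitution.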

20.
On average our eyes make 3–5 saccadic movements per second when we read, although their neural mechanism is still unclear. It is generally thought that saccades help redirect the retinal fovea to specific characters and words, but that actual discrimination of information occurs only during periods of fixation. Indeed, it has been proposed that information processing is actively and selectively suppressed during saccades to avoid the experience of blurring caused by the high-speed movement. Here, using a paradigm in which a string of either lexical (Chinese) or non-lexical (alphabetic) characters is triggered by saccadic eye movements, we show that subjects can discriminate both while making a saccadic eye movement. Moreover, discrimination accuracy is significantly better for characters scanned during the saccadic movement to a fixation point than for those not scanned beyond it. Our results show that character information can be processed during the saccade. Saccades during reading therefore not only redirect the fovea to fixate the next character or word but also allow pre-processing of information from characters adjacent to the fixation locations, helping to target the next most salient one. In this way saccades can not only promote continuity in reading words but also actively facilitate reading comprehension.
