Similar literature
1.
2.
We present an off-line cursive word recognition system based completely on neural networks: reading models and models of early visual processing. The first stage (normalization) preprocesses the input image in order to reduce letter position uncertainty; the second stage (feature extraction) is based on the feedforward model of orientation selectivity; the third stage (letter pre-recognition) is based on a convolutional neural network, and the last stage (word recognition) is based on the interactive activation model.

3.
4.
Text tokenization is a fundamental pre-processing step for almost all information processing applications. This task is nontrivial for scarce-resourced languages such as Urdu, as there is inconsistent use of space between words. In this paper, a morpheme-matching-based approach is proposed for Urdu text tokenization, along with some other algorithms to solve the additional issues of boundary detection of compound words, affixation, reduplication, names and abbreviations. The study achieved 97.28% precision, 93.71% recall, and a 95.46% F1-measure when tokenizing a corpus of 57,000 words using a morpheme list with 6,400 entries.
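The reported F1-measure can be cross-checked against the reported precision and recall, since F1 is the harmonic mean of the two. The following is a minimal sketch of that arithmetic only; the function name `f1_score` is an illustrative choice and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (both as fractions)."""
    if precision + recall == 0:
        # Avoid division by zero when both scores are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, expressed as fractions.
precision = 0.9728
recall = 0.9371

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")  # prints "F1 = 95.46%"
```

The computed value agrees with the 95.46% stated in the abstract.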

5.
Drawing portraits upside down is a trick that allows novice artists to reproduce lower-level image features, e.g., contours, while reducing interference from higher-level face cognition. Limiting the available processing time to suffice for lower- but not higher-level operations is a more general way of reducing interference. We elucidate this interference in a novel visual-search task to find a target among distractors. The target had a unique lower-level orientation feature but was identical to distractors in its higher-level object shape. Through bottom-up processes, the unique feature attracted gaze to the target. Subsequently, recognizing the attended object as identically shaped as the distractors, viewpoint invariant object recognition interfered. Consequently, gaze often abandoned the target to search elsewhere. If the search stimulus was extinguished at time T after the gaze arrived at the target, reports of target location were more accurate for shorter (T<500 ms) presentations. This object-to-feature interference, though perhaps unexpected, could underlie common phenomena such as the visual-search asymmetry that finding a familiar letter N among its mirror images is more difficult than the converse. Our results should enable additional examination of known phenomena and interactions between different levels of visual processes.

6.
7.
McGinnis EM  Keil A 《PloS one》2011,6(2):e16824
Identifying targets in a stream of items at a given constant spatial location relies on selection of aspects such as color, shape, or texture. Such attended (target) features of a stimulus elicit a negative-going event-related brain potential (ERP), termed Selection Negativity (SN), which has been used as an index of selective feature processing. In two experiments, participants viewed a series of Gabor patches in which targets were defined as a specific combination of color, orientation, and shape. Distracters were composed of different combinations of color, orientation, and shape of the target stimulus. This design allows comparisons of items with and without specific target features. Consistent with previous ERP research, SN deflections extended between 160-300 ms. Data from the subsequent P3 component (300-450 ms post-stimulus) were also examined, and were regarded as an index of target processing. In Experiment A, predominant effects of target color on SN and P3 amplitudes were found, along with smaller ERP differences in response to variations of orientation and shape. Manipulating color to be less salient while enhancing the saliency of the orientation of the Gabor patch (Experiment B) led to delayed color selection and enhanced orientation selection. Topographical analyses suggested that the location of SN on the scalp reliably varies with the nature of the to-be-attended feature. No interference of non-target features on the SN was observed. These results suggest that target feature selection operates by means of electrocortical facilitation of feature-specific sensory processes, and that selective electrocortical facilitation is more effective when stimulus saliency is heightened.

8.
ABSTRACT

Chronotype questionnaires provide a simple and time-effective approach to assessing individual differences in circadian variations. Chronotype questionnaires traditionally focused on one dimension of chronotype, namely its orientation along a continuum of morningness and eveningness. The Caen Chronotype Questionnaire (CCQ) was developed to assess an additional dimension of chronotype that captures the extent to which individual functioning varies during the day (amplitude). The aim of this study was to provide a multilanguage validation of the CCQ in six world regions (Arabic, Dutch, German, Italian, Portuguese and Spanish). At Time 1, a total of 2788 participants agreed to take part in the study (Arabic, n = 731; Dutch, n = 538; German, n = 329; Italian, n = 473; Portuguese, n = 361; Spanish, n = 356). Participants completed an assessment of the CCQ together with the Morningness-Eveningness Questionnaire (MEQ; Horne & Ostberg 1976) as well as questions related to factors theoretically related to chronotype (age, shift work, physical activity, sleep parameters and coffee consumption). One month later, participants again completed the CCQ. Results showed that the two-factor structure (morningness-eveningness and amplitude) of the CCQ could be replicated in all six languages. However, measurement invariance could not be assumed regarding the factor loadings across languages, meaning that items loaded more on their factors in some translations than in others. Test–retest reliability of the CCQ ranged from unacceptable (German version) to excellent (Dutch, Portuguese). Convergent validity was established through small–medium effect size correlations between the morningness-eveningness dimension of the CCQ and the MEQ. Taken together, our findings generally support the use of the translated versions of the CCQ. Further validation work on the CCQ is required including convergent validation against physiological markers of sleep, health and well-being.

9.
The one-sample-per-person problem has become an active research topic for face recognition in recent years because of its challenges and significance for real-world applications. However, achieving relatively higher recognition accuracy is still a difficult problem due to, usually, too few training samples being available and variations of illumination and expression. To alleviate the negative effects caused by these unfavorable factors, in this paper we propose a more accurate spectral feature image-based 2DLDA (two-dimensional linear discriminant analysis) ensemble algorithm for face recognition, with one sample image per person. In our algorithm, multi-resolution spectral feature images are constructed to represent the face images; this can greatly enlarge the training set. The proposed method is inspired by our finding that, among these spectral feature images, features extracted from some orientations and scales using 2DLDA are not sensitive to variations of illumination and expression. In order to maintain the positive characteristics of these filters and to make correct category assignments, the strategy of classifier committee learning (CCL) is designed to combine the results obtained from different spectral feature images. Using the above strategies, the negative effects caused by those unfavorable factors can be alleviated efficiently in face recognition. Experimental results on the standard databases demonstrate the feasibility and efficiency of the proposed method.

10.
DNA binding regions I, II, and III at the origin of replication have different arrangements of A protein (T antigen) recognition pentanucleotides. The A protein also protects each region from DNase in distinctly different patterns. Footprint and fragment assays led to the following conclusions: (i) in some cases a single recognition pentanucleotide is sufficient to direct the binding and accurate alignment of A protein on DNA; (ii) the A protein binds within isolated region I or II in a sequential process leading to multiple overlapping areas of DNase protection within each region; and (iii) the 23-base pair span of recognition sequences in region II allows binding and protection of a longer length of DNA than the 23-base pair span in region I. We propose a model of protein binding that addresses the problem of variations in the arrangement of pentanucleotides in regions I and II and explains the observed DNase protection patterns. The central feature of the model requires each protomer of A protein to bind to a pentanucleotide in a unique direction. The resulting orientation of protein would protect more DNA at the 5' end of the 5'-GAGGC-3' recognition sequence than at the 3' end. The arrangement of multiple protomers at the origin of simian virus 40 replication is discussed.

11.
Computational modelling of visual attention
Five important trends have emerged from recent work on computational models of focal visual attention that emphasize the bottom-up, image-based control of attentional deployment. First, the perceptual saliency of stimuli critically depends on the surrounding context. Second, a unique 'saliency map' that topographically encodes for stimulus conspicuity over the visual scene has proved to be an efficient and plausible bottom-up control strategy. Third, inhibition of return, the process by which the currently attended location is prevented from being attended again, is a crucial element of attentional deployment. Fourth, attention and eye movements tightly interplay, posing computational challenges with respect to the coordinate system used to control attention. And last, scene understanding and object recognition strongly constrain the selection of attended locations. Insights from these five key areas provide a framework for a computational and neurobiological understanding of visual attention.
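The interplay between a saliency map and inhibition of return can be illustrated with a small sketch. It rests on assumptions that the abstract does not state: attention is deployed by a simple winner-take-all rule (always pick the most salient remaining location), and inhibition of return is modelled by permanently suppressing a square neighbourhood around each attended location. The function `attend`, its parameters, and the toy map are all hypothetical.

```python
def attend(saliency, n_fixations, inhibition_radius=1):
    """Return the sequence of attended (row, col) locations.

    saliency: 2D list of conspicuity values (higher = more salient).
    inhibition_radius: half-width of the square region suppressed
        after each fixation (a modelling assumption).
    """
    grid = [row[:] for row in saliency]  # work on a copy of the map
    rows, cols = len(grid), len(grid[0])
    attended = []
    for _ in range(n_fixations):
        # Winner-take-all: select the most salient location that remains.
        r, c = max(
            ((i, j) for i in range(rows) for j in range(cols)),
            key=lambda rc: grid[rc[0]][rc[1]],
        )
        attended.append((r, c))
        # Inhibition of return: suppress the attended neighbourhood so
        # that it cannot win the competition again.
        for i in range(max(0, r - inhibition_radius), min(rows, r + inhibition_radius + 1)):
            for j in range(max(0, c - inhibition_radius), min(cols, c + inhibition_radius + 1)):
                grid[i][j] = float("-inf")
    return attended


# Toy saliency map (values are made up for illustration).
saliency_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.7],
    [0.0, 0.1, 0.6, 0.5],
]

print(attend(saliency_map, 3))  # [(1, 1), (2, 3), (3, 1)]
```

Without the suppression step the same peak at (1, 1) would be selected on every iteration, which is why inhibition of return matters for scanning a scene.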

12.
Motion recognition has received increasing attention in recent years owing to heightened demand for computer vision in many domains, including the surveillance system, multimodal human computer interface, and traffic control system. Most conventional approaches classify the motion recognition task into partial feature extraction and time-domain recognition subtasks. However, the information of motion resides in the space-time domain instead of the time domain or space domain independently, implying that fusing the feature extraction and classification in the space and time domains into a single framework is preferred. Based on this notion, this work presents a novel Space-Time Delay Neural Network (STDNN) capable of handling the space-time dynamic information for motion recognition. The STDNN is a unified structure, in which the low-level spatiotemporal feature extraction and high-level space-time-domain recognition are fused. The proposed network possesses the spatiotemporal shift-invariant recognition ability that is inherited from the time delay neural network (TDNN) and space displacement neural network (SDNN), where TDNN and SDNN are good at temporal and spatial shift-invariant recognition, respectively. In contrast to multilayer perceptron (MLP), TDNN, and SDNN, STDNN is constructed by vector-type nodes and matrix-type links such that the spatiotemporal information can be accurately represented in a neural network. Also evaluated herein is the performance of the proposed STDNN via two experiments. The moving Arabic numerals (MAN) experiment simulates the object's free movement in the space-time domain on image sequences. According to these results, STDNN possesses a good generalization ability with respect to the spatiotemporal shift-invariant recognition. In the lipreading experiment, STDNN recognizes the lip motions based on the inputs of real image sequences. This observation confirms that STDNN yields a better performance than the existing TDNN-based system, particularly in terms of the generalization ability. In addition to the lipreading application, the STDNN can be applied to other problems since no domain-dependent knowledge is used in the experiment.

13.
Face recognition is challenging, especially when the images from different persons are similar to each other due to variations in illumination, expression, and occlusion. If we have sufficient training images of each person which can span the facial variations of that person under testing conditions, sparse representation based classification (SRC) achieves very promising results. However, in many applications, face recognition often encounters the small sample size problem arising from the small number of available training images for each person. In this paper, we present a novel face recognition framework by utilizing low-rank and sparse error matrix decomposition, and sparse coding techniques (LRSE+SC). Firstly, the low-rank matrix recovery technique is applied to decompose the face images per class into a low-rank matrix and a sparse error matrix. The low-rank matrix of each individual is a class-specific dictionary and it captures the discriminative feature of this individual. The sparse error matrix represents the intra-class variations, such as illumination and expression changes. Secondly, we combine the low-rank part (representative basis) of each person into a supervised dictionary and integrate the sparse error matrices of all individuals into a within-individual variant dictionary which can be applied to represent the possible variations between the testing and training images. Then these two dictionaries are used to code the query image. The within-individual variant dictionary can be shared by all the subjects and only contributes to explaining the lighting conditions, expressions, and occlusions of the query image rather than to discrimination. Finally, a reconstruction-based scheme is adopted for face recognition. Since the within-individual dictionary is introduced, LRSE+SC can handle the problem of corrupted training data and the situation in which not all subjects have enough samples for training. Experimental results show that our method achieves state-of-the-art results on the AR, FERET, FRGC and LFW databases.

14.
A scientific framework is described in which scientists are cast as problem-solvers, and problems as solved when data are mapped to models. This endeavor is limited by finite attentional capacity which keeps depth of understanding complementary to breadth of vision; and which distinguishes the process of science from its products, scientists from scholars. All four aspects of explanation described by Aristotle (trigger, function, substrate, and model) are required for comprehension. Various modeling languages are described, ranging from set theory to calculus of variations, along with exemplary applications in behavior analysis.

15.
文少卿  谢小冬  徐丹 《遗传》2013,35(6):761-770
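The Dongxiang are an ethnic minority unique to Gansu Province. Linguistically they belong to the Mongolic branch of the Altaic language family, and their ethnic origin remains unclear. Multidimensional scaling plots, tree-based clustering, principal component analysis plots, and network diagrams built from Y-chromosome haplogroup data of the Dongxiang and other reference populations show that the Dongxiang are genetically closer to Central Asian groups and distant from Mongolian populations. Calculating the genetic contributions of Sino-Tibetan, Mongolian, and Central Asian populations to the Dongxiang further confirmed this gap. On this basis, the article concludes that the paternal genetic component of the Dongxiang in northwestern China derives mainly from Turkic- and Persian-speaking populations of Central Asia rather than from the Mongolians. The mismatch between this paternal genetic origin and the linguistic classification of the Dongxiang can be explained by the elite dominance model: their ancestors were probably Central Asian people who were assimilated by the Mongolians in language and culture.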

16.
One of the major challenges that developing organs face is scaling, that is, the adjustment of physical proportions during the massive increase in size. Although organ scaling is fundamental for development and function, little is known about the mechanisms that regulate it. Bone superstructures are projections that typically serve for tendon and ligament insertion or articulation and, therefore, their position along the bone is crucial for musculoskeletal functionality. As bones are rigid structures that elongate only from their ends, it is unclear how superstructure positions are regulated during growth to end up in the right locations. Here, we document the process of longitudinal scaling in developing mouse long bones and uncover the mechanism that regulates it. To that end, we performed a computational analysis of hundreds of three-dimensional micro-CT images, using a newly developed method for recovering the morphogenetic sequence of developing bones. Strikingly, analysis revealed that the relative position of all superstructures along the bone is highly preserved during more than a 5-fold increase in length, indicating isometric scaling. It has been suggested that during development, bone superstructures are continuously reconstructed and relocated along the shaft, a process known as drift. Surprisingly, our results showed that most superstructures did not drift at all. Instead, we identified a novel mechanism for bone scaling, whereby each bone exhibits a specific and unique balance between proximal and distal growth rates, which accurately maintains the relative position of its superstructures. Moreover, we show mathematically that this mechanism minimizes the cumulative drift of all superstructures, thereby optimizing the scaling process. Our study reveals a general mechanism for the scaling of developing bones. More broadly, these findings suggest an evolutionary mechanism that facilitates variability in bone morphology by controlling the activity of individual epiphyseal plates.
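The idea that a balance between proximal and distal growth can hold a superstructure in place can be illustrated with a one-dimensional toy model. This is a sketch under assumptions that are not taken from the paper: the bone is a segment that lengthens only at its two ends, a superstructure is a landmark fixed in the shaft material, and position is measured as a fraction of bone length from the proximal end. Under these assumptions, a landmark at relative position p keeps that position exactly when the proximal end contributes a fraction p of the total growth. The function and parameter names are hypothetical.

```python
def relative_position_after_growth(p, length, growth, proximal_fraction):
    """Relative position (0 = proximal end, 1 = distal end) of a landmark
    fixed in the shaft after the bone lengthens by `growth`.

    p: relative position of the landmark before growth.
    proximal_fraction: share of the new length added at the proximal end;
        the remainder is added at the distal end.
    """
    # Absolute distance of the landmark from the proximal end before growth.
    x = p * length
    # Bone added at the proximal end moves that end away from the landmark.
    x_new = x + proximal_fraction * growth
    return x_new / (length + growth)


# A landmark at 30% of the length, with the bone growing 5-fold (1.0 -> 5.0).
balanced = relative_position_after_growth(0.3, 1.0, 4.0, proximal_fraction=0.3)
symmetric = relative_position_after_growth(0.3, 1.0, 4.0, proximal_fraction=0.5)

print(f"balanced growth:  {balanced:.2f}")   # stays at 0.30
print(f"symmetric growth: {symmetric:.2f}")  # shifts to 0.46
```

In the symmetric case the landmark would have to be relocated by remodelling (drift) to return to 30% of the length, whereas the balanced growth ratio makes that relocation unnecessary in this toy model.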

17.
We report on evidence for selective long-distance interactions in Cyclopean binocular vision. When presented with a pair of Cyclopean test bars observers could discriminate trial-to-trial uncorrelated variations in the mean orientation, orientation difference, separation and mean location of the test bars while ignoring random variations in the orientation, width and location of a third bar placed between the two test bars. We propose that the human visual system contains Cyclopean long-distance comparators (i) that compare the outputs of two narrow receptive fields some distance apart while being insensitive to stimuli located between those receptive fields, and (ii) the outputs of which carry orthogonally labelled indicators of orientation difference, mean orientation, separation and mean location. In the evolutionary context, one role for the proposed mechanisms might be to encode information about the silhouettes of animals whose camouflage is broken by the binocular vision of predators.

18.

Background

Previous studies have claimed that a precise split at the vertical midline of each fovea causes all words to the left and right of fixation to project to the opposite, contralateral hemisphere, and this division in hemispheric processing has considerable consequences for foveal word recognition. However, research in this area is dominated by the use of stimuli from Latinate languages, which may induce specific effects on performance. Consequently, we report two experiments using stimuli from a fundamentally different, non-Latinate language (Arabic) that offers an alternative way of revealing effects of split-foveal processing, if they exist.

Methods and Findings

Words (and pseudowords) were presented to the left or right of fixation, either close to fixation and entirely within foveal vision, or further from fixation and entirely within extrafoveal vision. Fixation location and stimulus presentations were carefully controlled using an eye-tracker linked to a fixation-contingent display. To assess word recognition, Experiment 1 used the Reicher-Wheeler task and Experiment 2 used the lexical decision task.

Results

Performance in both experiments indicated a functional division in hemispheric processing for words in extrafoveal locations (in recognition accuracy in Experiment 1 and in reaction times and error rates in Experiment 2) but no such division for words in foveal locations.

Conclusions

These findings from a non-Latinate language provide new evidence that although a functional division in hemispheric processing exists for word recognition outside the fovea, this division does not extend up to the point of fixation. Some implications for word recognition and reading are discussed.

19.
ABSTRACT: BACKGROUND: Populations of the Arabian Peninsula have a complex genetic structure that reflects waves of migrations including the earliest human migrations from Africa and eastern Asia, migrations along ancient civilization trading routes and colonization history of recent centuries. RESULTS: Here, we present a study of genome-wide admixture in this region, using 156 genotyped individuals from Qatar, a country located at the crossroads of these migration patterns. Since haplotypes of these individuals could have originated from many different populations across the world, we have developed a machine learning method "SupportMix" to infer loci-specific genomic ancestry when simultaneously analyzing many possible ancestral populations. Simulations show that SupportMix is not only more accurate than other popular admixture discovery tools but is the first admixture inference method that can efficiently scale for simultaneous analysis of 50-100 putative ancestral populations while being independent of prior demographic information. CONCLUSIONS: By simultaneously using the 55 world populations from the Human Genome Diversity Panel, SupportMix was able to extract the fine-scale ancestry of the Qatar population, providing many new observations concerning the ancestry of the region. For example, as well as recapitulating the three major sub-populations in Qatar, composed of mainly Arabic, Persian, and African ancestry, SupportMix additionally identifies the specific ancestry of the Persian group to populations sampled in Greater Persia rather than from China and the ancestry of the African group to sub-Saharan origin and not Southern African Bantu origin as previously thought.

20.
Birds are considered critical indicators of ecosystem condition. Automatic recording devices have emerged as a trending tool to assist field observations, contributing to biodiversity monitoring on large spatio-temporal scales. However, manually processing huge volumes of recordings is challenging. Consequently, there has been a growing interest in automatic bird vocalization recognition in recent years. Automatic bird vocalization recognition technology has advanced from classical pattern recognition to deep learning (DL), with significantly improved recognition performance. This paper reviews related works on DL-based automatic bird vocalization recognition technology in the last decade. In this review, we present the current state of research in the three key areas of pre-processing, feature extraction and recognition methods involved in automatic bird vocalization recognition. The related datasets, evaluation metrics and software are also summarized. Finally, existing challenges along with opportunities for future work are highlighted. We conclude that, while DL-based automatic bird vocalization recognition has made recent advances in specific species, more robust denoising approaches, larger public datasets, and stronger generalization capabilities of feature extraction and recognition are required to achieve reliable and general bird recognition in the wild. We expect that this review will serve as a firm foundation for new researchers working in the field of DL-based automatic bird vocalization recognition technologies, as well as become an insightful guide for computer science and ecology experts.
