Similar articles
20 similar articles found (search time: 15 ms)
1.
In this paper, we focus on animal object detection and species classification in camera-trap images collected in highly cluttered natural scenes. Using a deep neural network (DNN) model trained for animal-background image classification, we analyze the input camera-trap images to generate a multi-level visual representation of each image. We detect semantic regions of interest for animals from this representation using k-means clustering and graph cut in the DNN feature domain. These animal regions are then classified into animal species using a multi-class deep neural network model. According to the experimental results, our method achieves 99.75% accuracy for classifying animals and background and 90.89% accuracy for classifying 26 animal species on the Snapshot Serengeti dataset, outperforming existing image classification methods.

2.
IRBM, 2021, 42(5): 334-344
Active learning is an effective solution to interactively select a limited number of informative examples and use them to train a learning algorithm that can achieve its optimal performance for specific tasks. It is suitable for medical image applications in which unlabeled data are abundant but manual annotation could be very time-consuming and expensive. However, designing an effective active learning strategy for informative example selection is a challenging task, due to the intrinsic presence of noise in medical images, the large number of images, and the variety of imaging modalities. In this study, a novel low-rank modeling-based multi-label active learning (LRMMAL) method is developed to address these challenges and select informative examples for training a classifier to achieve the optimal performance. The proposed method independently quantifies image noise and integrates it with other measures to guide a pool-based sampling process to determine the most informative examples for training a classifier. In addition, an automatic adaptive cross entropy-based parameter determination scheme is proposed for further optimizing the example sampling strategy. Experimental results on varied medical image datasets and comparisons with other state-of-the-art multi-label active learning methods illustrate the superior performance of the proposed method.
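
The pool-based sampling step mentioned above can be illustrated with a minimal sketch. It assumes a plain single-label entropy score, which is a stand-in and not the noise-aware LRMMAL criterion the abstract describes; the function names and the probability values are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_informative(pool_probs, k):
    """Return indices of the k pool examples with the highest entropy."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:k]

# Hypothetical classifier outputs for four unlabeled images.
pool = [[0.98, 0.02], [0.55, 0.45], [0.80, 0.20], [0.50, 0.50]]
chosen = select_informative(pool, 2)
```

Here the two examples closest to a 50/50 prediction are picked for annotation; a method like the one in the abstract would presumably combine such an uncertainty score with its noise measure before ranking.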

3.
MOTIVATION: Protein fold recognition is an important approach to structure discovery without relying on sequence similarity. We study this approach with new multi-class classification methods and examine many issues important for a practical recognition system. RESULTS: Most current discriminative methods for protein fold prediction use the one-against-others method, which has the well-known 'False Positives' problem. We investigated two new methods: the unique one-against-others and the all-against-all methods. Both improve prediction accuracy by 14-110% on a dataset containing 27 SCOP folds. We used the Support Vector Machine (SVM) and the Neural Network (NN) learning methods as base classifiers. SVMs converge fast and lead to high accuracy. When scores of multiple parameter datasets are combined, majority voting reduces noise and increases recognition accuracy. We examined many issues involved with a large number of classes, including dependencies of prediction accuracy on the number of folds and on the number of representatives in a fold. Overall, recognition systems achieve 56% fold prediction accuracy on a protein test dataset, where most of the proteins have below 25% sequence identity with the proteins used in training.
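
As a rough illustration of the all-against-all idea with majority voting, the sketch below trains nothing and simply assumes one binary decision rule per pair of classes. The fold names, the one-dimensional "centers", and the nearest-center rules are hypothetical stand-ins for the SVM or NN base classifiers named in the abstract.

```python
from collections import Counter
from itertools import combinations

def all_against_all_predict(x, classes, pairwise):
    """Majority vote over one binary classifier per pair of classes.

    pairwise[(a, b)] is a callable that returns either a or b for input x.
    """
    votes = Counter(pairwise[(a, b)](x) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]

# Hypothetical 1-D classes; each pairwise rule picks the nearer center.
centers = {"fold_A": 0.0, "fold_B": 5.0, "fold_C": 10.0}
classes = sorted(centers)
pairwise = {
    (a, b): (lambda x, a=a, b=b:
             a if abs(x - centers[a]) <= abs(x - centers[b]) else b)
    for a, b in combinations(classes, 2)
}
prediction = all_against_all_predict(6.0, classes, pairwise)
```

With K classes this scheme needs K(K-1)/2 binary classifiers, which is why the number of folds matters for a practical system. Tie-breaking here simply follows the order in which votes were first seen, which is an arbitrary choice.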

4.
JX Mi, JX Liu, J Wen. PLoS ONE, 2012, 7(8): e42461
Nearest subspace (NS) classification based on the linear regression technique is a very straightforward and efficient method for face recognition. A recently developed NS method, namely linear regression-based classification (LRC), uses downsampled face images as features to perform face recognition. The basic assumption behind this kind of method is that samples from a certain class lie on their own class-specific subspace. Because there are only a few training samples for each individual class, the small sample size (SSS) problem arises, and it causes previous NS methods to misclassify. In this paper, we propose two novel LRC methods using the idea that every class-specific subspace has its unique basis vectors. Thus, we consider that each class-specific subspace is spanned by two kinds of basis vectors: the common basis vectors shared by many classes and the class-specific basis vectors owned by one class only. Based on this concept, two classification methods, namely robust LRC 1 and 2 (RLRC 1 and 2), are given to achieve more robust face recognition. Unlike some previous methods which need to extract class-specific basis vectors, the proposed methods are developed merely based on the existence of the class-specific basis vectors, without actually calculating them. Experiments on three well-known face databases demonstrate very good performance of the new methods compared with other state-of-the-art methods.

5.
Brain-computer interaction (BCI) and physiological computing are terms that refer to using processed neural or physiological signals to influence human interaction with computers, environment, and each other. A major challenge in developing these systems arises from the large individual differences typically seen in the neural/physiological responses. As a result, many researchers use individually-trained recognition algorithms to process this data. In order to minimize time, cost, and barriers to use, there is a need to minimize the amount of individual training data required, or equivalently, to increase the recognition accuracy without increasing the number of user-specific training samples. One promising method for achieving this is collaborative filtering, which combines training data from the individual subject with additional training data from other, similar subjects. This paper describes a successful application of a collaborative filtering approach intended for a BCI system. This approach is based on transfer learning (TL), active class selection (ACS), and a mean squared difference user-similarity heuristic. The resulting BCI system uses neural and physiological signals for automatic task difficulty recognition. TL improves the learning performance by combining a small number of user-specific training samples with a large number of auxiliary training samples from other similar subjects. ACS optimally selects the classes to generate user-specific training samples. Experimental results on 18 subjects, using both nearest neighbors and support vector machine classifiers, demonstrate that the proposed approach can significantly reduce the number of user-specific training data samples. This collaborative filtering approach will also be generalizable to handling individual differences in many other applications that involve human neural or physiological data, such as affective computing.
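
One way to picture a mean squared difference user-similarity heuristic is sketched below. The abstract does not say which quantities are compared, so the per-subject feature vectors, the subject names, and the choice to keep the n closest subjects are all assumptions made for illustration.

```python
def mean_squared_difference(a, b):
    """Mean squared difference between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def most_similar_subjects(target, others, n):
    """Names of the n subjects whose features are closest to the target."""
    ranked = sorted(
        others, key=lambda name: mean_squared_difference(target, others[name])
    )
    return ranked[:n]

# Hypothetical per-subject feature means (not data from the study).
new_user = [0.2, 0.4, 0.1]
auxiliary = {
    "s01": [0.2, 0.5, 0.1],
    "s02": [0.9, 0.9, 0.9],
    "s03": [0.1, 0.4, 0.2],
}
neighbors = most_similar_subjects(new_user, auxiliary, 2)
```

Training samples from the selected neighbors would then be pooled with the new user's few samples, which is the transfer-learning step the abstract refers to.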

6.
Non-intrusive monitoring of animals in the wild is possible using camera trapping networks. The cameras are triggered by sensors in order to disturb the animals as little as possible. This approach produces a high volume of data (on the order of thousands or millions of images) that demands laborious work to analyze both useless images (incorrect detections, which are the majority) and useful ones (images in which animals are present). In this work, we show that once some obstacles are overcome, deep neural networks can cope appropriately with the problem of automated species classification. As a case study, the 26 most common of the 48 species in the Snapshot Serengeti (SSe) dataset were selected, and the potential of the Very Deep Convolutional neural networks framework for the species identification task was analyzed. In the worst-case scenario (unbalanced training dataset containing empty images) the method reached 35.4% Top-1 and 60.4% Top-5 accuracy. In the best scenario (balanced dataset, images containing foreground animals only, and manually segmented) the accuracy reached 88.9% Top-1 and 98.1% Top-5. To the best of our knowledge, this is the first published attempt at solving automatic species recognition on the SSe dataset. In addition, a comparison with other approaches on a different dataset was carried out, showing that the architectures used in this work outperformed previous approaches. The limitations and drawbacks of the method, as well as new challenges in automatic camera-trap species classification, are widely discussed.
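
For readers unfamiliar with the Top-1 and Top-5 figures quoted above, a minimal sketch of top-k accuracy follows. The species names and scores are hypothetical and are not Snapshot Serengeti results.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for class_scores, true_label in zip(scores, labels):
        # Class names ordered from highest to lowest score.
        ranked = sorted(class_scores, key=class_scores.get, reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Hypothetical classifier scores for three images.
scores = [
    {"zebra": 0.6, "wildebeest": 0.3, "impala": 0.1},
    {"zebra": 0.5, "wildebeest": 0.4, "impala": 0.1},
    {"zebra": 0.2, "wildebeest": 0.3, "impala": 0.5},
]
labels = ["zebra", "wildebeest", "zebra"]
top1 = top_k_accuracy(scores, labels, 1)
top2 = top_k_accuracy(scores, labels, 2)
```

Top-k accuracy can never decrease as k grows, which is why the Top-5 values in the abstract are higher than the Top-1 values.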

7.
Under tachistoscopic presentation of visual stimuli, healthy (male and female) right-handed subjects carried out a paired comparison of stimuli presented unilaterally and in the center of the visual field. For recognition of images of words and objects, the number of correct answers and the motor reaction time usually did not differ significantly between the two interstimulus intervals (1 and 10 s). In comparing images of faces, there were also no differences in the number of reactions, and the reaction time was shorter at the 1 s interval. The left hemisphere dominated in the identification of words and female faces, and the right hemisphere in the recognition of male faces. When the right visual field was stimulated, images of various classes were recognized more differentially than when the left visual field was stimulated. The male subjects had more prominent interhemispheric differences than the females. Increasing the interstimulus interval from 1 to 10 s led to a weakening of the functional interhemispheric asymmetry and a decrease in the differences between the male and female subjects. The data obtained show that, in processes connected with short-term memory, functional interhemispheric asymmetry is formed mainly at the initial stages of information processing.

8.
Many ecosystems, particularly wetlands, are significantly degraded or lost as a result of climate change and anthropogenic activities. Simultaneously, developments in machine learning, particularly deep learning methods, have greatly improved wetland mapping, which is a critical step in ecosystem monitoring. Yet, present deep and very deep models necessitate a greater number of training data, which are costly, logistically challenging, and time-consuming to acquire. Thus, we explore and address the potential and possible limitations caused by the availability of limited ground-truth data for large-scale wetland mapping. To overcome this persistent problem for remote sensing data classification using deep learning models, we propose 3D UNet Generative Adversarial Network Swin Transformer (3DUNetGSFormer) to adaptively synthesize wetland training data based on each class's data availability. Both real and synthesized training data are then imported to a novel deep learning architecture consisting of cutting-edge Convolutional Neural Networks and vision transformers for wetland mapping. Results demonstrated that the developed wetland classifier obtained a high level of kappa coefficient, average accuracy, and overall accuracy of 96.99%, 97.13%, and 97.39%, respectively, for the data in three pilot sites in and around Grand Falls-Windsor, Avalon, and Gros Morne National Park located in Canada. The results show that the proposed methodology opens a new window for future high-quality wetland data generation and classification. The developed codes are available at https://github.com/aj1365/3DUNetGSFormer.
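
The three reported metrics can all be derived from a confusion matrix, as in the sketch below. It assumes that "average accuracy" means the mean of per-class accuracies (per-class recall), as is common in remote sensing; the abstract does not define it, and the 2x2 matrix is hypothetical.

```python
def classification_metrics(confusion):
    """Overall accuracy, average (per-class) accuracy, and Cohen's kappa.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    diagonal = sum(confusion[i][i] for i in range(n))
    overall = diagonal / total
    average = sum(confusion[i][i] / sum(confusion[i]) for i in range(n)) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, average, kappa

# Hypothetical two-class confusion matrix (not data from the study).
confusion = [[45, 5], [10, 40]]
overall, average, kappa = classification_metrics(confusion)
```

Kappa discounts the agreement that the class proportions alone would produce, so it is typically lower than overall accuracy, as it is in the abstract's figures.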

9.
Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the ‘Extreme Learning Machine’ (ELM) approach, which also enables a very rapid training time (∼ 10 minutes). Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random ‘receptive field’ sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.

10.
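Progress and challenges in soil seed bank research in China   Total citations: 8 (self-citations: 2, citations by others: 6)
We reviewed the soil seed bank literature published in the VIP Chinese Journal Database (1989-2006) and Web of Science (1985-2006), and classified and summarized data on soil seed bank density, richness, and research methods according to the 29 vegetation types defined in Vegetation of China (《中国植被》), collecting information on 238 sample sites across 14 vegetation types. The results show that the methods used, and the seed bank density and richness values obtained, differ enormously among researchers. Across all studies, sampling was most often carried out in April and October; quadrat area ranged from 78 to 10,000 cm², and the number of quadrats from 2 to 480; 10 cm × 10 cm and 20 cm × 20 cm were the most commonly used quadrat sizes; total sampled area ranged from 600 to 500,000 cm², most often 1,000 to 10,000 cm². Soil seed bank density ranged from 8 seeds·m⁻² (desert) to 65,355 seeds·m⁻² (secondary forest of tropical rainforest), and the number of species ranged from 1 (secondary bare alkali patches in temperate grassland) to 74 (tropical monsoon forest); variation within a single vegetation type was also considerable. Density and species number in tropical monsoon forests and rainforests were significantly higher than in temperate coniferous forests; plantations had higher density and species number than farmland, which in turn exceeded bare land. Grasslands, deserts, and meadows had relatively few species. Future soil seed bank research needs to expand in both breadth and depth, with emphasis on long-term, fixed-site studies of soil seed banks in important ecosystems and on seed bank strategies; in particular, such research should be closely linked to the succession, regeneration, and restoration of plant communities. Exploratory research on seed bank sampling and detection methods for different vegetation types should also be strengthened.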

11.
Experimental evidence suggests a link between perception and the execution of actions. In particular, it has been proposed that motor programs might directly influence visual action perception. According to this hypothesis, the acquisition of novel motor behaviors should improve their visual recognition, even in the absence of visual learning. We tested this prediction by using a new experimental paradigm that dissociates visual and motor learning during the acquisition of novel motor patterns. The visual recognition of gait patterns from point-light stimuli was assessed before and after nonvisual motor training. During this training, subjects were blindfolded and learned a novel coordinated upper-body movement based only on verbal and haptic feedback. The learned movement matched one of the visual test patterns. Despite the absence of visual stimulation during training, we observed a selective improvement of the visual recognition performance for the learned movement. Furthermore, visual recognition performance after training correlated strongly with the accuracy of the execution of the learned motor pattern. These results prove, for the first time, that motor learning has a direct and highly selective influence on visual action recognition that is not mediated by visual learning.

12.
Numerous techniques have been proposed to estimate carnivore abundance and density, but few have been validated against populations of known size. We used a density estimate established by intensive monitoring of a population of radiotagged leopards (Panthera pardus) with a detection probability of 1.0 to evaluate efficacy of track counts and camera-trap surveys as population estimators. We calculated densities from track counts using 2 methods and compared performance of 10 methods for calculating the effectively sampled area for camera-trapping data. Compared to our reference density (7.33 ± 0.44 leopards/100 km²), camera-trapping generally produced more accurate but less precise estimates than did track counts. The most accurate result (6.97 ± 1.88 leopards/100 km²) came from camera-trap data with a sampled area buffered by a boundary strip representing the mean maximum distance moved by leopards outside the survey area (MMDMOSA) established by telemetry. However, contrary to recent suggestions, the traditional method of using half the mean maximum distance moved from photographic recaptures did not result in gross overestimates of population density (6.56 ± 1.92 leopards/100 km²) but rather displayed the next best performance after MMDMOSA. The only track-count method comparable to reference density employed a capture-recapture framework applied to data when individuals were identified from their tracks (6.45 ± 1.43 leopards/100 km²) but the underlying assumptions of this technique limit more widespread application. Our results demonstrate that if applied correctly, camera-trap surveys represent the best balance of rigor and cost-effectiveness for estimating abundance and density of cryptic carnivore species that can be identified individually.
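
The role of the boundary strip in the density estimate can be sketched as follows. This assumes a rectangular trap array and an abundance estimate that is already available; the array dimensions, abundance, and movement distance are illustrative values and do not come from the study.

```python
import math

def buffered_area_km2(length_km, width_km, strip_km):
    """Area of a rectangle expanded outward by a boundary strip.

    The corners of the buffered shape are quarter circles of radius strip_km.
    """
    return (length_km * width_km
            + 2 * strip_km * (length_km + width_km)
            + math.pi * strip_km ** 2)

def density_per_100_km2(abundance, length_km, width_km, strip_km):
    """Animals per 100 km^2 over the effectively sampled area."""
    return 100 * abundance / buffered_area_km2(length_km, width_km, strip_km)

# Hypothetical survey values, chosen only for illustration.
mmdm_km = 6.0            # mean maximum distance moved between recaptures
half_mmdm = mmdm_km / 2  # the traditional boundary-strip width
density = density_per_100_km2(
    abundance=14, length_km=12, width_km=10, strip_km=half_mmdm
)
```

Because the strip width enters the area quadratically as well as linearly, a wider strip lowers the density estimate noticeably, which is why the choice among the ten buffering methods matters.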

13.
The primate visual system achieves remarkable visual object recognition performance even in brief presentations, and under changes to object exemplar, geometric transformations, and background variation (a.k.a. core visual object recognition). This remarkable performance is mediated by the representation formed in inferior temporal (IT) cortex. In parallel, recent advances in machine learning have led to ever higher performing models of object recognition using artificial deep neural networks (DNNs). It remains unclear, however, whether the representational performance of DNNs rivals that of the brain. A major difficulty in producing an accurate comparison has been finding a unifying metric that accounts for experimental limitations, such as the amount of noise, the number of neural recording sites, and the number of trials, and computational limitations, such as the complexity of the decoding classifier and the number of classifier training examples. In this work, we perform a direct comparison that corrects for these experimental limitations and computational considerations. As part of our methodology, we propose an extension of “kernel analysis” that measures the generalization accuracy as a function of representational complexity. Our evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task. Furthermore, we show that models that perform well on measures of representational performance also perform well on measures of representational similarity to IT, and on measures of predicting individual IT multi-unit responses. Whether these DNNs rely on computational mechanisms similar to the primate visual system is yet to be determined, but, unlike all previous bio-inspired models, that possibility cannot be ruled out merely on representational performance grounds.

14.
MOTIVATION: Two important questions for the analysis of gene expression measurements from different sample classes are (1) how to classify samples and (2) how to identify meaningful gene signatures (ranked gene lists) exhibiting the differences between classes and sample subsets. Solutions to both questions have immediate biological and biomedical applications. To achieve optimal classification performance, a suitable combination of classifier and gene selection method needs to be specifically selected for a given dataset. The selected gene signatures can be unstable and the resulting classification accuracy unreliable, particularly when considering different subsets of samples. Both unstable gene signatures and overestimated classification accuracy can impair biological conclusions. METHODS: We address these two issues by repeatedly evaluating the classification performance of all models, i.e. pairwise combinations of various gene selection and classification methods, for random subsets of arrays (sampling). A model score is used to select the most appropriate model for the given dataset. Consensus gene signatures are constructed by extracting those genes frequently selected over many samplings. Sampling additionally permits measurement of the stability of the classification performance for each model, which serves as a measure of model reliability. RESULTS: We analyzed a large gene expression dataset with 78 measurements of four different cartilage sample classes. Classifiers trained on subsets of measurements frequently produce models with highly variable performance. Our approach provides reliable classification performance estimates via sampling. In addition to reliable classification performance, we determined stable consensus signatures (i.e. gene lists) for sample classes. Manual literature screening showed that these genes are highly relevant to our gene expression experiment with osteoarthritic cartilage. We compared our approach to others based on a publicly available dataset on breast cancer. AVAILABILITY: R package at http://www.bio.ifi.lmu.de/~davis/edaprakt
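
The consensus-signature step described under METHODS can be sketched as repeated subsampling plus frequency counting. This is an illustration in Python and not the authors' R package; the toy gene-selection rule, the 0.8 threshold, and all sample values are assumptions.

```python
import random
from collections import Counter

def consensus_signature(samples, select_genes, n_rounds, subset_size,
                        min_fraction, seed=0):
    """Genes selected in at least min_fraction of random sample subsets."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_rounds):
        subset = rng.sample(samples, subset_size)
        counts.update(set(select_genes(subset)))
    return sorted(g for g, c in counts.items() if c / n_rounds >= min_fraction)

def top_gene_by_mean_difference(subset):
    """Toy selection rule: the one gene with the largest class-mean gap."""
    def mean(values):
        return sum(values) / len(values)
    gaps = {}
    for gene in subset[0][1]:
        a = [expr[gene] for label, expr in subset if label == "A"]
        b = [expr[gene] for label, expr in subset if label == "B"]
        gaps[gene] = abs(mean(a) - mean(b))
    return [max(gaps, key=gaps.get)]

# Hypothetical data: g1 separates the classes, g2 is noise.
samples = (
    [("A", {"g1": 10.0, "g2": v}) for v in (1.0, 2.0, 1.0, 2.0)]
    + [("B", {"g1": 0.0, "g2": v}) for v in (2.0, 1.0, 2.0, 1.0)]
)
# Subsets of 6 out of 8 always contain at least two samples of each class.
signature = consensus_signature(
    samples, top_gene_by_mean_difference,
    n_rounds=50, subset_size=6, min_fraction=0.8,
)
```

The fraction of rounds in which a gene is selected also gives a direct stability measure, in the same spirit as the sampling-based reliability estimate the abstract describes for classification performance.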

15.
Despite not knowing the exact age of individuals, humans can estimate their rough age using age-related physical features. Nonhuman primates show some age-related physical features; however, the cognitive traits underlying their recognition of age class have not been revealed. Here, we tested the ability of two species of Old World monkey, Japanese macaques (JM) and Campbell's monkeys (CM), to spontaneously discriminate age classes using visual paired comparison (VPC) tasks based on the two distinct categories of infant and adult images. First, VPCs were conducted in JM subjects using conspecific JM stimuli. When analyzing the side of the first look, JM subjects significantly looked more often at novel images. Based on analyses of total looking durations, JM subjects looked at a novel infant image longer than they looked at a familiar adult image, suggesting the ability to spontaneously discriminate between the two age classes and a preference for infant over adult images. Next, VPCs were tested in CM subjects using heterospecific JM stimuli. CM subjects showed no difference in the side of their first look, but looked at infant JM images longer than they looked at adult images; the fact that CMs were totally naïve to JMs suggested that the attractiveness of infant images transcends species differences. This is the first report of visual age class recognition and a preference for infant over adult images in nonhuman primates. Our results suggest not only species-specific processing for age class recognition but also the evolutionary origins of the instinctive human perception of baby cuteness schema, proposed by the ethologist Konrad Lorenz.

16.
The subjects learned to recognize three figures presented in the left visual hemifield and three figures presented in the right visual hemifield. During presentation of a stimulus, the contralateral hemifield was covered by a mask. After the training, recognition of all six figures, presented in both the right and left visual hemifields, was compared. Each hemisphere recognized the figures that had been learned in the corresponding visual hemifield, but recognition of figures learned in the opposite visual hemifield was poor. Thus, the ability of the hemispheres to act separately in recognizing different sets of visual images was established.

17.
Adding noise to a visual image makes object recognition more effortful and has a widespread effect on human electrophysiological responses. However, visual cortical processes directly involved in handling the stimulus noise have yet to be identified and dissociated from the modulation of the neural responses due to the deteriorated structural information and increased stimulus uncertainty in the case of noisy images. Here we show that the impairment of face gender categorization performance in the case of noisy images in amblyopic patients correlates with amblyopic deficits measured in the noise-induced modulation of the P1/P2 components of single-trial event-related potentials (ERP). On the other hand, the N170 ERP component is similarly affected by the presence of noise in the two eyes and its modulation does not predict the behavioral deficit. These results have revealed that the efficient processing of noisy images depends on the engagement of additional processing resources both at the early, feature-specific as well as later, object-level stages of visual cortical processing reflected in the P1 and P2 ERP components, respectively. Our findings also suggest that noise-induced modulation of the N170 component might reflect diminished face-selective neuronal responses to face images with deteriorated structural information.

18.
The study of nocturnal mammals relies on indirect evidence or invasive methods involving capture and tagging of individuals. Indirect methods are prone to error, while capturing and tagging mammals raises logistical and ethical considerations. Off-the-shelf camera traps are perceived as an accessible, non-intrusive method for direct data gathering, having many benefits but also potential biases. Here, using a 6-year camera-trap study of a Eurasian otter holt (den), we evaluate key parameters of study design. First, we analyse patterns of holt use in relation to researcher visits to maintain the camera traps. Then, using a dual camera-trap deployment, we compare the success of data capture from each camera-trap position in relation to the dual setup. Finally, we provide analyses to optimise minimum survey effort and camera-trap programming. Our findings indicate that otter presence and resting patterns were unaffected by the researcher visits. Results were significantly better using a close camera-trap emplacement than a distant one. There was a higher frequency of otter activity at the holt during the natal and early rearing period, which has implications for determining the minimum survey duration. Reducing video clip duration from 30 to 19 s would have included 95% of instances where sex could be identified, and saved 35–40% of memory storage. Peaks of otter activity were related to sunrise and sunset; exclusion of diurnal hours would have missed 11% of registrations. Camera-trap studies would benefit by adopting a similar framework of analyses in the preliminary stages or during a trial period to inform subsequent methodological refinements.
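
The reported storage saving is consistent with a simple back-of-the-envelope check, assuming (this is not stated in the abstract) that storage scales roughly linearly with clip duration:

```latex
\[
\text{fraction of storage saved} \approx \frac{30\,\mathrm{s} - 19\,\mathrm{s}}{30\,\mathrm{s}} = \frac{11}{30} \approx 0.37
\]
```

A saving of about 37% falls inside the 35–40% range quoted above; the spread in the reported range presumably reflects variable-bitrate encoding or clips that were already shorter than 30 s.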

19.
This paper presents a computational model that addresses one prominent aspect of how human beings recognize images. The basic premise of our method is that differences among multiple images help visual recognition. Specifically, we propose a statistical framework to distinguish which image features capture sufficient category information and which are common features shared by multiple classes. Mathematically, the whole formulation is a generative probabilistic model. A discriminative component is also incorporated into the model to interpret the differences among all kinds of images. The whole Bayesian formulation is solved in an Expectation-Maximization paradigm. After finding the discriminative patterns among different images, we design an image categorization algorithm to show how these differences help visual recognition within the bag-of-features framework. The proposed method is verified on a variety of image categorization tasks, including outdoor scene images, indoor scene images, and airborne SAR images taken from different perspectives.

20.
Color-to-Grayscale: Does the Method Matter in Image Recognition?   Total citations: 2 (self-citations: 0, citations by others: 2)
Kanan C, Cottrell GW. PLoS ONE, 2012, 7(1): e29740
