Similar Documents (20 results)
1.
Current object detection algorithms suffer from low accuracy and poor robustness when detecting marine benthos, owing to the complex environment and low light levels on the seabed. To address these problems, this paper proposes YOLOT (You Only Look Once with Transformer), a quantitative detection algorithm for marine benthos based on an improved YOLOv4. To strengthen feature extraction, the transformer mechanism is introduced into both the backbone feature extraction network and the feature fusion part of YOLOv4, improving the algorithm's adaptability to targets in complex undersea environments. On the one hand, a self-attention unit is embedded into CSPDarknet-53, which improves the feature extraction capability of the network. On the other hand, transformer-based feature fusion rules are introduced to enhance the extraction of contextual semantic information in the feature pyramid network. In addition, probabilistic anchor assignment based on a Gaussian distribution is introduced into network training. Experimental validation shows that, compared with the original YOLOv4, YOLOT improves recognition precision from 75.35% to 84.44% on the marine benthic dataset, indicating that YOLOT is well suited to the quantitative detection of marine benthos.
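Below is a minimal PyTorch sketch (not the authors' code) of the idea of embedding a self-attention unit into a convolutional backbone stage, as the YOLOT abstract describes for CSPDarknet-53; the channel count, head count, and placement are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SelfAttentionBlock(nn.Module):
    """Multi-head self-attention applied over the spatial positions of a CNN feature map."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=channels, num_heads=num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)             # (B, H*W, C): one token per spatial position
        attended, _ = self.attn(tokens, tokens, tokens)   # self-attention across positions
        tokens = self.norm(tokens + attended)             # residual connection + layer norm
        return tokens.transpose(1, 2).reshape(b, c, h, w)

# Usage: drop the block after a backbone stage (sizes here are hypothetical).
feat = torch.randn(1, 256, 20, 20)
print(SelfAttentionBlock(256)(feat).shape)               # torch.Size([1, 256, 20, 20])
```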

2.
Marine biological resources are abundant, and their rational development, study, and protection are of great significance to marine ecological health and economic development. Quantitative underwater object detection plays an important role in marine biological research, species richness surveys, and rare species conservation. However, heavy noise in the underwater environment, small object scales, dense biological distributions, and occlusion all increase the difficulty of detection. This paper proposes MAD-YOLO (Multiscale Feature Extraction and Attention Feature Fusion Reinforced YOLO for Marine Benthos Detection), a detection algorithm based on an improved YOLOv5, to address these problems. To improve the network's adaptability to the underwater environment, VOVDarkNet is designed as the feature extraction backbone; it uses intermediate features with different receptive fields to reinforce feature extraction. AFC-PAN is proposed as the feature fusion network so that the network can learn correct feature and location information for objects at various scales, improving its ability to perceive small objects. The SimOTA label assignment strategy and a decoupled head are introduced to help the model better handle occlusion and dense distributions. Experiments show that MAD-YOLO increases mAP0.5:0.95 on the URPC2020 dataset from 49.8% to 53.4% compared with the original YOLOv5. The advantages of the model are further visualized and analyzed through controlled-variable experiments, which show that MAD-YOLO is well suited to detecting blurred, dense, and small-scale objects. The model performs well in marine benthos detection tasks and can effectively support marine life science research and marine engineering. The source code is publicly available at https://github.com/JoeNan1/MAD-YOLO.
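As an illustration of one of the components named above, the following sketch shows a decoupled detection head of the kind MAD-YOLO adds to YOLOv5, with separate classification and box regression branches; the channel sizes and activation are assumptions, not the released MAD-YOLO code.

```python
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    """Predicts class scores and box regression through separate branches instead of one shared conv."""
    def __init__(self, in_channels: int, num_classes: int, num_anchors: int = 1):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(in_channels, in_channels, 1), nn.SiLU())
        self.cls_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_channels, num_anchors * num_classes, 1))   # class scores
        self.reg_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_channels, num_anchors * 5, 1))             # box (x, y, w, h) + objectness

    def forward(self, x):
        x = self.stem(x)
        return self.cls_branch(x), self.reg_branch(x)

cls_out, reg_out = DecoupledHead(256, num_classes=4)(torch.randn(1, 256, 40, 40))
print(cls_out.shape, reg_out.shape)   # (1, 4, 40, 40) and (1, 5, 40, 40)
```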

3.
Mastitis is one of the most common diseases in dairy cows; it harms their welfare and shortens their lives, causing significant economic losses to the dairy industry. Many attempts have been made to detect mastitis using thermal infrared thermography, but assessing udder health with this technique is susceptible to external factors, leading to inaccurate detection. This study therefore explored a new, comprehensive method for detecting dairy cow mastitis from infrared thermal images. The method combines an improved version of the left-right udder skin surface temperature (USST) difference test with a test based on the difference between ocular surface temperature and USST, effectively reducing the influence of external factors on USST. After comparing different target localisation algorithms, the You Only Look Once v5 (YOLOv5) deep learning model was used to obtain the temperature of the eyes and udder, and mastitis detection was then performed. A total of 105 dairy cows passing through a passage were randomly selected from thermal infrared video and assessed with the proposed method, and the results were compared against somatic cell counts. The accuracy, specificity, and sensitivity of mastitis detection were 87.62%, 84.62%, and 96.30%, respectively. Using YOLOv5 to locate the key body parts worked well, with an average accuracy of 96.1% at an average frame rate of 116.3 frames/s. The detection accuracy of dairy cow mastitis using deep learning combined with the proposed detection method reached 85.71%. These results show that the new, comprehensive method based on infrared thermal images can detect dairy cow mastitis with high accuracy, reduces the influence of external factors, and can be integrated into a YOLOv5-based automatic identification system to enable on-site monitoring of mastitis.
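A minimal sketch of the left/right USST difference rule described above is given below; the ROI format, the use of the ROI maximum as the temperature reading, and the 0.5 °C threshold are illustrative assumptions (the paper additionally combines this with an ocular-temperature test and somatic cell counts).

```python
import numpy as np

def usst_difference_flag(thermal_image: np.ndarray,
                         left_roi: tuple, right_roi: tuple,
                         threshold_c: float = 0.5) -> bool:
    """Return True if the left/right udder temperature difference suggests mastitis."""
    y0, y1, x0, x1 = left_roi
    left_temp = thermal_image[y0:y1, x0:x1].max()     # hottest point in the left-udder ROI
    y0, y1, x0, x1 = right_roi
    right_temp = thermal_image[y0:y1, x0:x1].max()    # hottest point in the right-udder ROI
    return abs(left_temp - right_temp) > threshold_c

# The ROIs would come from YOLOv5 detections of the udder halves in the thermal frame.
frame = 35.0 + np.random.rand(240, 320)               # fake thermal frame in °C
print(usst_difference_flag(frame, (100, 160, 40, 150), (100, 160, 170, 280)))
```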

4.
The early symptoms of lung tumors often appear as nodules on CT scans, of which 30% to 40% are malignant according to statistical studies. Early detection and classification of lung nodules is therefore crucial to the treatment of lung cancer. With the increasing prevalence of lung cancer, the large volume of CT images awaiting diagnosis places a heavy burden on doctors, who may miss or falsely detect abnormalities due to fatigue. Methods: In this study, we propose a lung nodule detection method based on the YOLOv3 deep learning algorithm that requires only one preprocessing step. To overcome the shortage of training data when starting a new computer-aided diagnosis (CAD) study, we first selected a small number of diseased regions to simulate training on a limited dataset: 5 nodule patterns were selected and deformed into 110 nodules by random geometric transformation before being fused into 10 normal lung CT images using Poisson image editing. The Poisson fusion method achieved a detection rate of about 65.24% when testing 100 new images. Second, 419 slices from the public RIDER database were used to train and test our YOLOv3 network. The detection time of YOLOv3 was 2-3 times shorter than that of the mainstream algorithm, with a detection accuracy of 95.17%. Finally, the configuration of YOLOv3 was optimized using the training datasets. The results show that YOLOv3 offers both high speed and high accuracy in lung nodule detection and can process a large amount of CT image data in a short time, meeting the heavy demands of clinical practice. In addition, using Poisson image editing to generate datasets reduces the need for raw training data and improves training efficiency.
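The Poisson fusion step can be sketched with OpenCV's seamlessClone, which implements Poisson blending; the synthetic patch, the transformation ranges, and the insertion point below are placeholders rather than the paper's actual data.

```python
import cv2
import numpy as np

lung = np.full((512, 512, 3), 60, dtype=np.uint8)          # stand-in for a normal CT slice
nodule = np.full((48, 48, 3), 160, dtype=np.uint8)         # stand-in for a nodule patch
cv2.circle(nodule, (24, 24), 16, (220, 220, 220), -1)      # crude bright nodule

# Random geometric transformation (rotation + scaling) of the patch, as in the abstract.
angle, scale = np.random.uniform(0, 360), np.random.uniform(0.8, 1.2)
M = cv2.getRotationMatrix2D((24, 24), angle, scale)
nodule = cv2.warpAffine(nodule, M, (48, 48))

mask = np.zeros(nodule.shape[:2], dtype=np.uint8)          # blend almost the whole patch
mask[2:-2, 2:-2] = 255
center = (256, 300)                                        # placeholder insertion point
fused = cv2.seamlessClone(nodule, lung, mask, center, cv2.NORMAL_CLONE)
print(fused.shape)                                         # (512, 512, 3)
```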

5.
Seagrasses provide a wide range of ecosystem services in coastal marine environments. Despite their ecological and economic importance, these species are declining because of human impact. This decline has driven the need for monitoring and mapping to estimate the overall health and dynamics of seagrasses in coastal environments, often based on underwater images. However, seagrass detection from underwater digital images is not a trivial task; it requires taxonomic expertise and is time-consuming and expensive. Recently, automatic approaches based on deep learning have revolutionised object detection performance in many computer vision applications, and there has been interest in applying them to automated seagrass detection from imagery. Deep learning-based techniques reduce the need for the handcrafted feature extraction by domain experts that classical machine learning techniques require. This study presents a YOLOv5-based one-stage detector and an EfficientDet-D7-based two-stage detector for detecting seagrass, in this case Halophila ovalis, one of the most widely distributed seagrass species. The EfficientDet-D7-based seagrass detector achieves the highest mAP of 0.484 on the ECUHO-2 dataset and 0.354 on the ECUHO-1 dataset, which are about 7% and 5% better, respectively, than the state-of-the-art Halophila ovalis detection performance on those datasets. The proposed YOLOv5-based detector achieves average inference times of 0.077 s and 0.043 s, respectively, much lower than the state-of-the-art approach on the same datasets.

6.
In the ex-situ conservation and captive management of giant pandas (Ailuropoda melanoleuca), timely and rapid individual identification and behavioural monitoring play a crucial role in health management. The health of captive giant pandas is usually assessed visually by dedicated keepers, which is labour-intensive, inefficient, and lacks timeliness. Image-based individual identification and behaviour analysis are efficient and low-cost and have become a new trend in monitoring. Previous studies have shown that individual identification and behaviour classification can be achieved by detecting and analysing giant panda facial images, but insufficient detection precision still limits recognition accuracy. This paper proposes a dual-model fusion method based on YOLOv3 and Mask R-CNN for segmenting and accurately detecting giant panda head images. The method has three parts: YOLOv3 performs head detection, Mask R-CNN segments the panda contour, and the outputs of the two models are then fused by intersection over union (IoU). The results show a head detection accuracy of 82.6%, a panda contour segmentation accuracy of 95.2%, and an overall head contour segmentation accuracy of 87.1%. The method achieves high recognition and segmentation accuracy for giant panda head images, supporting individual identification and sex classification and providing a technical reference for behaviour analysis.
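One plausible reading of the IoU-based fusion of the two model outputs is sketched below (this is an assumption, not the paper's code): the Mask R-CNN contour is cropped by the YOLOv3 head box to obtain a head-contour mask, and the overlap ratio serves as an agreement check with an assumed threshold.

```python
import numpy as np

def fuse_head_box_and_contour(head_box, contour_mask, min_overlap: float = 0.3):
    """head_box: (x0, y0, x1, y1) from YOLOv3; contour_mask: binary HxW array from Mask R-CNN."""
    x0, y0, x1, y1 = head_box
    box_mask = np.zeros_like(contour_mask, dtype=bool)
    box_mask[y0:y1, x0:x1] = True
    head_contour = np.logical_and(box_mask, contour_mask)     # contour pixels inside the head box
    overlap = head_contour.sum() / max(box_mask.sum(), 1)     # fraction of the box covered by panda
    return head_contour if overlap >= min_overlap else None   # reject if the two models disagree

contour = np.zeros((480, 640), dtype=bool)
contour[100:400, 150:500] = True                              # toy panda contour
print(fuse_head_box_and_contour((150, 100, 350, 300), contour) is not None)   # True
```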

7.
The accurate detection and classification of diseased pine trees with different levels of severity is important for monitoring the growth of these trees and for preventing and controlling disease within pine forests. Our method combines DDYOLOv5 with a ResNet50 network to detect and classify levels of pine tree disease from remote sensing UAV images. In this approach, images are preprocessed to increase the background diversity of the training samples, and efficient channel attention (ECA) and hybrid dilated convolution (HDC) modules are introduced into DDYOLOv5 to improve detection accuracy. The ECA modules enable the network to focus on the characteristics of diseased pine trees and address the low detection accuracy caused by the similarity in color and texture between diseased trees and complex backgrounds. The HDC modules capture contextual information about targets at different scales; they enlarge the receptive field to cover targets of different sizes and address the detection difficulty caused by large variations in the shapes and sizes of diseased pine trees. In addition, a low confidence threshold is adopted to reduce missed detections, and a ResNet50 classification network classifies the detection results into severity levels, reducing false detections and improving classification accuracy. Our experimental results show that the proposed method improves precision by 13.55%, recall by 5.06%, and F1-score by 9.71% on 8 test images compared with YOLOv5. Moreover, the detection and classification results show that our approach outperforms classical deep learning object detection methods such as Faster R-CNN and RetinaNet.
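The ECA module mentioned above follows a well-known design (global average pooling followed by a 1D convolution across channels); the sketch below is an illustrative PyTorch version with an assumed kernel size of 3, not the DDYOLOv5 implementation.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient channel attention: reweight channels using local cross-channel interaction."""
    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = x.mean(dim=(2, 3))                    # global average pooling: one descriptor per channel
        y = self.conv(y.unsqueeze(1)).squeeze(1)  # 1D conv over the channel descriptors
        w = self.sigmoid(y).unsqueeze(-1).unsqueeze(-1)
        return x * w                              # channel-wise reweighting

print(ECA()(torch.randn(2, 64, 32, 32)).shape)    # torch.Size([2, 64, 32, 32])
```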

8.
Spectral fusion of Raman spectroscopy and Fourier-transform infrared spectroscopy, combined with pattern recognition algorithms, is used to diagnose thyroid dysfunction from serum and to find the spectral segment with the highest sensitivity so as to further increase diagnosis speed. Compared with infrared spectroscopy or Raman spectroscopy alone, the proposed approach improves detection accuracy and captures more spectral features, revealing greater differences between thyroid dysfunction and normal serum samples. To discriminate between samples, principal component analysis (PCA) was first used for feature extraction, reducing the dimensionality of the high-dimensional spectral data and of the fused spectra. Then, support vector machine (SVM), back-propagation neural network, extreme learning machine, and learning vector quantization algorithms were employed to establish discriminant diagnostic models. The accuracy of spectral fusion with the best model, PCA-SVM, was 83.48%, compared with 78.26% for Raman spectra alone and 80% for infrared spectra alone; across the five classifiers, spectral fusion was more accurate than either single spectrum. The diagnostic accuracy of spectral fusion in the 2000 to 2500 cm−1 range was 81.74%, which greatly improves measurement and data analysis speed compared with analysing the full spectra. These results demonstrate that serum spectral fusion combined with multivariate statistical methods has great potential for screening thyroid dysfunction.
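A minimal scikit-learn sketch of the PCA-SVM pipeline on fused spectra is shown below; the array shapes, component count, and kernel settings are placeholders, and random arrays stand in for the real Raman and FTIR measurements.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

n_samples = 115
raman = np.random.rand(n_samples, 1200)       # placeholder Raman spectra
ftir = np.random.rand(n_samples, 900)         # placeholder FTIR spectra
labels = np.random.randint(0, 2, n_samples)   # thyroid dysfunction vs. normal (placeholder)

fused = np.hstack([raman, ftir])              # spectral-level fusion by concatenation
model = make_pipeline(StandardScaler(), PCA(n_components=20), SVC(kernel="rbf"))
print(cross_val_score(model, fused, labels, cv=5).mean())   # cross-validated accuracy
```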

9.
Solving the problem of fish image classification is important for conserving fish diversity. This problem can be addressed by developing a new deep learning-based fish image classification method trained on data with complex backgrounds. To this end, this paper proposes a fusion model referred to as Tripmix-Net. The backbone of the proposed model primarily consists of multiscale parallel and improved residual networks connected in an alternating manner, and network fusion is used to integrate information extracted from shallow and deep layers. Experiments conducted on the 15-category WildFish image dataset verified the efficacy of Tripmix-Net for classifying same-genus fish images with complex backgrounds. The model achieved an accuracy of 95.31%, a significant improvement over traditional methods, and offers a new approach to fine-grained fish image classification against complex backgrounds.

10.
[Objective] To reduce the workload of grassroots pest forecasting staff, improve the accuracy and timeliness of sex-pheromone-trap-based monitoring of the rice leaf folder Cnaphalocrocis medinalis, and make monitoring data traceable, an intelligent machine-vision-based sex-pheromone trap monitoring system for C. medinalis was established. [Methods] The system comprises an intelligent machine-vision sex-pheromone trap, a deep-learning detection model for C. medinalis, a Web front end, and a server. The trap's machine vision system was built from an industrial camera, a light source, and an Android tablet. The detection model was built on an improved YOLOv3 with a DBTNet-101 two-layer network. The Web front end, built with HTML, CSS, JavaScript, and Vue, displays the detection and counting results; the server, built with the Django framework, receives images uploaded from the trap over a 4G network and returns the results; and a MySQL database stores the images and detection results. [Results] The intelligent trap periodically uploads images of C. medinalis to the server, where the deployed object detection model automatically detects adults in real time, achieving a precision of 97.6% and a recall of 98.6%; users can view the detection results through the Web front end. [Conclusion] The machine-vision-based intelligent sex-pheromone trap monitoring system for C. medinalis...

11.
The reproductive performance of sows is an important indicator of the economic efficiency and production level of pig farming. In this paper, we design a lightweight sow oestrus detection method based on acoustic data and deep convolutional neural network (CNN) algorithms by collecting and analysing short- and long-frequency sow oestrus sounds. We use visual log-mel spectrograms, which capture three-dimensional information, as inputs to the network model to improve overall recognition accuracy. The improved lightweight MobileNetV3_esnet model is used to distinguish oestrus from non-oestrus sounds and is compared with existing algorithms. The model outperforms the alternatives, with 97.12% precision, 97.34% recall, 97.59% F1-score, and 97.52% accuracy, at a model size of 5.94 MB. Compared with traditional oestrus monitoring methods, the proposed method more accurately captures the vocal characteristics exhibited by sows in latent oestrus, providing an efficient and accurate approach for practical oestrus monitoring and early warning systems on pig farms.
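The log-mel spectrogram input can be sketched with librosa as below; the sample rate, window settings, and the synthetic signal standing in for a real sow recording are assumptions.

```python
import librosa
import numpy as np

sr = 16000
audio = np.random.randn(sr * 2).astype(np.float32)            # stand-in for a 2 s sow call;
# on real data this would be: audio, sr = librosa.load("sow_call.wav", sr=sr)

mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_fft=1024,
                                     hop_length=256, n_mels=64)
log_mel = librosa.power_to_db(mel, ref=np.max)                # log scaling, in dB
# Normalise to [0, 1] so the spectrogram can be treated as a single-channel image input.
image = (log_mel - log_mel.min()) / (log_mel.max() - log_mel.min() + 1e-8)
print(image.shape)                                            # (64, n_frames)
```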

12.
As computer vision (CV) is a rapidly developing research direction, related algorithms such as image classification and object detection have made substantial progress. Improving the accuracy and efficiency of algorithms for the fine-grained identification of plant diseases and birds in agriculture is essential to the dynamic monitoring of agricultural environments. In this study, building on computer vision detection and classification algorithms and combining the architecture and ideas of CNN models, the mainstream Transformer model is optimized and the CA-Transformer (Transformer Combined with Channel Attention) model is proposed to improve the identification and classification of critical regions. The main contributions are as follows: (1) a C-Attention mechanism is proposed to strengthen feature extraction within each patch and the communication between features, so that the entire network can attend fully while reducing computational overhead; (2) a weight-sharing method is proposed to transfer parameters between different layers, improving the reusability of model parameters, and a knowledge distillation step is added to mitigate problems such as excessive parameters and overfitting; (3) Token Labeling is proposed to generate score labels according to the position of each token, and the total loss function is defined according to the CA-Transformer model structure. The performance of the CA-Transformer is compared with current mainstream models on datasets of different scales, and ablation experiments are performed. The results show that the CA-Transformer reaches an accuracy of 82.89% and an mIoU of 53.17%, and exhibits good transfer learning ability, indicating strong performance on fine-grained visual categorization tasks. In the context of increasingly diverse ecological information, this study can provide a reference and inspiration for practical applications.

13.
14.
Object categorization using single-trial electroencephalography (EEG) data measured while participants view images has been studied intensively. In previous studies, multiple event-related potential (ERP) components (e.g., P1, N1, P2, and P3) were used to improve the performance of object categorization of visual stimuli. In this study, we introduce a novel method that uses a multiple-kernel support vector machine to fuse multiple ERP component features. We investigate whether fusing the potentially complementary information of different ERP components (e.g., P1, N1, P2a, and P2b) can improve the performance of four-category visual object classification in single-trial EEG. We also compare the classification accuracy of different ERP component fusion methods. Our experimental results indicate that classification accuracy increases through multiple ERP fusion. Additional comparative analyses indicate that the multiple-kernel fusion method achieves a mean classification accuracy above 72%, substantially better than that achieved with any single ERP component feature (55.07% for the best single component, N1). Compared with other fusion methods, the accuracy of the multiple-kernel fusion method is 5.47%, 4.06%, and 16.90% higher than that of feature concatenation, feature extraction, and decision fusion, respectively. Our study shows that the multiple-kernel fusion method outperforms other fusion methods and thus provides a means to improve the classification performance of single-trial ERPs in brain-computer interface research.
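The multiple-kernel fusion idea can be sketched with scikit-learn's precomputed-kernel SVM as below; the per-component RBF kernels, their weights, and the random data are illustrative assumptions rather than the study's actual setup.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

n_trials = 200
components = {name: np.random.rand(n_trials, 32) for name in ["P1", "N1", "P2a", "P2b"]}
y = np.random.randint(0, 4, n_trials)                       # four object categories (placeholder)

# One RBF kernel per ERP component, combined as a weighted sum (weights would be tuned in practice).
weights = {"P1": 0.2, "N1": 0.4, "P2a": 0.2, "P2b": 0.2}
K = sum(w * rbf_kernel(components[name], gamma=0.1) for name, w in weights.items())

clf = SVC(kernel="precomputed").fit(K, y)
print(clf.score(K, y))                                      # accuracy on the fused training kernel
```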

15.
This paper proposes a fault diagnosis methodology for a gear pump based on the ensemble empirical mode decomposition (EEMD) method and a Bayesian network. Essentially, the presented scheme is a multi-source information fusion methodology. Compared with conventional fault diagnosis using EEMD alone, the proposed method can take advantage of useful information beyond the sensor signals. The diagnostic Bayesian network consists of a fault layer, a fault feature layer, and a multi-source information layer. Vibration signals from sensor measurements are decomposed by the EEMD method, and the energies of the intrinsic mode functions (IMFs) are calculated as fault features and added to the fault feature layer of the Bayesian network. Other sources of useful information are added to the information layer. A generalized three-layer Bayesian network can then be developed that fully incorporates faults and fault symptoms as well as other useful information such as naked-eye inspection and maintenance records, improving diagnostic accuracy and capacity. The proposed methodology is applied to the fault diagnosis of a gear pump, and the structure and parameters of the Bayesian network are established. Compared with artificial neural network and support vector machine classifiers, the proposed model has the best diagnostic performance when only sensor data is used. A case study demonstrates that information from human observation or system repair records is very helpful for fault diagnosis, and the method is effective and efficient in diagnosing faults from uncertain, incomplete information.
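The EEMD feature step can be sketched with the PyEMD package (installed as EMD-signal) as below; the synthetic vibration signal, trial count, and the use of normalised IMF energies as features are assumptions consistent with the description above.

```python
import numpy as np
from PyEMD import EEMD   # pip install EMD-signal

t = np.linspace(0, 1, 2048)
# Synthetic vibration signal standing in for a gear pump sensor measurement.
signal = (np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 400 * t)
          + 0.1 * np.random.randn(t.size))

imfs = EEMD(trials=20).eemd(signal)                      # ensemble empirical mode decomposition
energies = np.array([np.sum(imf ** 2) for imf in imfs])  # energy of each IMF
features = energies / energies.sum()                     # normalised energy distribution -> fault features
print(features)
```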

16.
Electroencephalography (EEG) signals collected from the human brain are generally used to diagnose diseases, but they can also be used in areas such as emotion recognition and driving fatigue detection. This work presents a new emotion recognition model using EEG signals. Its primary aim is a highly accurate emotion recognition framework that combines hand-crafted feature generation with a deep classifier. The presented framework uses a multilevel fused feature generation network with three primary phases: tunable Q-factor wavelet transform (TQWT), statistical feature generation, and nonlinear textural feature generation. TQWT is applied to the EEG data to decompose the signals into different sub-bands and create a multilevel feature generation network. In the nonlinear feature generation phase, an S-box of the LED block cipher is used to create a pattern, named the LED-Pattern. Statistical features are extracted using the widely used statistical moments. The proposed LED pattern and statistical feature extraction functions are applied to 18 TQWT sub-bands and the original EEG signal; the resulting hand-crafted learning model is therefore named LEDPatNet19. To select the most informative features, a ReliefF and iterative Chi2 (RFIChi2) feature selector is deployed. The model was evaluated on two EEG emotion datasets, GAMEEMO and DREAMER. The proposed hand-crafted learning network achieved classification accuracies of 94.58%, 92.86%, and 94.44% for the arousal, dominance, and valence cases of the DREAMER dataset, and its best classification accuracy on the GAMEEMO dataset is 99.29%. These results clearly illustrate the success of the proposed LEDPatNet19.

17.
漆愚, 苏菡, 侯蓉, 刘鹏, 陈鹏, 臧航行, 张志和. 兽类学报 (Acta Theriologica Sinica), 2022, 42(4): 451-460
Long-term behavioural monitoring of captive giant pandas (Ailuropoda melanoleuca) makes it possible to track their reproductive cycle and health status in a timely manner, helping breeding institutions take appropriate conservation and husbandry measures and improve management; however, 24-hour monitoring with timely behavioural feedback is not currently feasible. Accurate animal pose estimation is key to behavioural research and underpins many downstream applications, and knowledge of giant panda posture supports both behavioural studies and conservation management. To improve the accuracy of giant panda pose estimation in complex environments, this paper proposes a pose estimation method built on the High-Resolution Net (HRNet) architecture. To handle the large scale differences between different body parts, an atrous spatial pyramid pooling (ASPP) module is introduced into HRNet-32 to enlarge the receptive field while capturing multiscale information; in addition, the panda's body keypoints are grouped, and a part-based multi-branch structure is introduced to learn representations specific to each keypoint group. Comparative experiments show that the model achieves high detection accuracy, reaching 81.51% at PCK@0.05. The proposed method can provide technical support for behaviour analysis and health assessment of giant pandas.
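The PCK@0.05 metric reported above can be sketched as below; the choice of the bounding-box diagonal as the normalising reference is an assumption, since papers differ in the normaliser they use.

```python
import numpy as np

def pck(pred: np.ndarray, gt: np.ndarray, bbox_diag: float, alpha: float = 0.05) -> float:
    """pred, gt: (K, 2) keypoint arrays; returns the fraction of keypoints within alpha * diag."""
    dists = np.linalg.norm(pred - gt, axis=1)
    return float(np.mean(dists < alpha * bbox_diag))

gt = np.array([[120.0, 80.0], [200.0, 90.0], [160.0, 150.0]])   # toy ground-truth keypoints
pred = gt + np.random.randn(*gt.shape) * 3.0                    # noisy predicted keypoints
print(pck(pred, gt, bbox_diag=np.hypot(300, 400)))
```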

18.
This paper addresses the question of maximizing classifier accuracy for classifying task-related mental activity from magnetoencephalography (MEG) data. We propose the use of different sources of information and introduce an automatic channel selection procedure. To determine an informative set of channels, our approach combines a variety of machine learning algorithms: feature subset selection methods, classifiers based on regularized logistic regression, information fusion, and multiobjective optimization based on probabilistic modeling of the search space. The experimental results show that our proposal improves classification accuracy compared with approaches whose classifiers use only one type of MEG information or in which the set of channels is fixed a priori.
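One ingredient of such a channel-selection pipeline, an L1-regularised logistic regression whose zeroed coefficients discard uninformative channels, can be sketched as below; the data shapes and regularisation strength are placeholders, and the paper combines this with other selectors and multiobjective search.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

n_trials, n_channels = 300, 102
X = np.random.randn(n_trials, n_channels)          # one feature per channel (e.g. mean power), placeholder
y = np.random.randint(0, 2, n_trials)              # task-related mental activity labels (placeholder)

# L1 penalty drives the coefficients of uninformative channels to exactly zero.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(np.abs(clf.coef_).ravel() > 1e-6)
print(f"{selected.size} informative channels kept:", selected[:10])
```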

19.
Recently, with most mobile phones shipping with dual cameras, stereo image super-resolution has become increasingly popular in phones and other modern acquisition devices, leading stereo super-resolution images to spread widely on the Internet. However, current image forensics methods are designed for monocular images and exhibit high false positive rates when used to detect stereo super-resolution images. It is therefore important to develop a detection method for stereo super-resolution images. In this paper, a convolutional neural network with multi-scale feature extraction and hierarchical feature fusion is proposed to detect stereo super-resolution images. Multi-atrous convolutions are employed to extract multi-scale features and adapt to varying stereo super-resolution images, and hierarchical feature fusion further improves the performance and robustness of the model. Experimental results demonstrate that the proposed network detects stereo super-resolution images effectively and achieves strong generalization and robustness. To the best of our knowledge, this is the first attempt to investigate the performance of current forensics methods when tested on stereo super-resolution images, and it represents the first study of stereo super-resolution image detection. We believe it can raise awareness about the security of stereo super-resolution images.

20.
IRBM, 2020, 41(1): 31-38
In this paper, a brain-computer interface (BCI) system for character recognition is proposed based on the P300 signal. A P300 speller is used to spell a word or character without any muscle movement. P300 detection is the first step in recognizing a character from the electroencephalogram (EEG) signal; the character is then recognized from the detected P300 signal. In this paper, sparse autoencoder (SAE) and stacked sparse autoencoder (SSAE) based feature extraction methods are proposed for P300 detection, together with a fusion of deep features and temporal features. An SSAE extracts high-level information about the input data, and combining SSAE features with temporal features provides both abstract and temporal information about the signal. An ensemble of weighted artificial neural networks (EWANN) is proposed for P300 detection to minimize the variation among different classifiers. To give more influence to the better classifiers in the final classification, a higher weight is assigned to the better-performing classifier; these weights are calculated from a cross-validation test. The model is tested on two publicly available datasets, and the proposed method provides character recognition performance better than or comparable to state-of-the-art methods.
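The weighted-ensemble idea can be sketched as below: each network's vote is weighted by its cross-validation accuracy, so better-performing classifiers count more in the final P300 decision. The member architectures, feature dimensions, and data are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X = np.random.randn(400, 60)                       # fused SSAE + temporal features (placeholder)
y = np.random.randint(0, 2, 400)                   # P300 present / absent (placeholder)

members = [MLPClassifier(hidden_layer_sizes=(h,), max_iter=500, random_state=i)
           for i, h in enumerate([32, 64, 128])]
weights = np.array([cross_val_score(m, X, y, cv=5).mean() for m in members])
weights /= weights.sum()                           # higher CV accuracy -> larger vote

for m in members:
    m.fit(X, y)
probs = sum(w * m.predict_proba(X)[:, 1] for w, m in zip(weights, members))
print((probs > 0.5).astype(int)[:10])              # fused P300 predictions
```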
