

Unsupervised classification to improve the quality of a bird song recording dataset
Institution:1. Department of Geography, University of Florida, Turlington Hall, 3141, 330 Newell Dr, Gainesville, FL 32611, United States of America;2. Harvard Forest, Harvard University, 324 North Main Street, Petersham, MA 01366-9504, United States of America;1. Laboratory of Ecology and Environmental Management, Science and Technology Advanced Institute, Van Lang University, Ho Chi Minh City, Viet Nam;2. Faculty of Applied Technology, School of Engineering and Technology, Van Lang University, Ho Chi Minh City, Viet Nam;3. Hanoi University of Natural Resources and Environment, Phu Dien, Bac Tu Liem, Ha Noi, Viet Nam;4. Hanoi National University of Education, 136 Xuan Thuy, Cau Giay, Hanoi, Viet Nam;5. Vietnam National University, Hanoi University of Science, 334 Nguyen Trai, Thanh Xuan, Hanoi, Viet Nam;1. Nature Coast Biological Station, Institute of Food and Agricultural Sciences, University of Florida, 552 1st St, Cedar Key, FL 32625, USA;2. Fisheries and Aquatic Sciences, Institute of Food and Agricultural Sciences, University of Florida, 136 Newins-Ziegler Hall, Gainesville, FL 32611-0410, USA;3. MERIDIAN, Halifax, Nova Scotia, Canada;4. Dalhousie University, 6299 South St, Halifax, NS B3H 4R2, Canada;5. Department of Biological Sciences, Simon Fraser University, 8888 University Dr W, Burnaby, BC V5A 1S6, Canada;6. Department of Biology, University of Victoria, 3800 Finnerty Road, Victoria, BC V8P 5C2, Canada;7. Instituto Oceanográfico, Universidade de São Paulo, Praça do Oceanográfico, 191 - CEP: 05508-120, Cidade Universitária, São Paulo (SP), Brazil;8. The Fish Listener, Waquoit, MA, USA;9. Soil, Water, and Ecosystem Sciences Department, Institute of Food and Agricultural Sciences, University of Florida, 1692 McCarty Dr, Gainesville, FL 32603, USA;10. Faculty of Computer Science, Dalhousie University, 6050 University Ave, Halifax, NS B3H 1W5, Canada;11. Institute of Computer Science, Polish Academy of Sciences, Jana Kazimierza 5, 01-248 Warszawa, Poland;1. ICEDS, Australian National University, Australia;2. James Cook University, Australia;1. School of Land Science and Technology, China University of Geosciences, Beijing 100083, China;2. Key Laboratory of Land Consolidation and Rehabilitation, Ministry of Land and Resources, Beijing 100035, China
Abstract:Open audio databases such as Xeno-Canto are widely used to build datasets for exploring bird song repertoires or for training deep learning models that classify bird sounds automatically. However, the bird sounds in such databases are weakly labelled: each audio recording is assigned a species name, but no timestamps locate the bird song of interest within the recording. Manual annotation can solve this issue, but it is time-consuming, expert-dependent, and does not scale to large datasets. Another solution is to use a labelling function that automatically segments audio recordings and then assigns a label to each segmented audio sample. Although labelling functions were introduced to expedite strong label assignment, their classification performance remains mostly unknown. To address this issue and reduce label noise (wrong label assignment) in large bird song datasets, we introduce a novel data-centric labelling function composed of three successive steps: 1) time-frequency sound unit segmentation, 2) feature computation for each sound unit, and 3) classification of each sound unit as bird song or noise with either an unsupervised DBSCAN algorithm or the supervised BirdNET neural network. The labelling function was optimized, validated, and tested on the songs of 44 common West-Palearctic bird species. We first showed that segmentation of bird songs alone resulted in 10% to 83% label noise, depending on the species. We also demonstrated that our labelling function significantly reduced the initial label noise in the dataset, by up to a factor of three. Finally, we discuss different opportunities to design suitable labelling functions for building high-quality animal vocalization datasets with minimum expert annotation effort.
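As an illustration of step 3 in its unsupervised variant, the sketch below runs a minimal pure-Python DBSCAN over per-unit feature vectors. It is not the authors' implementation: the two features (duration in seconds, peak frequency in kHz), the sample values, the `eps` and `min_samples` settings, and the rule that the largest cluster is treated as bird song while everything else is treated as noise are all assumptions made for the example.

```python
from math import dist

NOISE = -1  # cluster id given to points that belong to no dense region


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: one cluster id per point, -1 for outliers."""
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # core point: keep growing the cluster
    return labels


def label_units(features, eps=0.5, min_samples=3):
    """Assumed decision rule: largest cluster is 'song', the rest is 'noise'."""
    clusters = dbscan(features, eps, min_samples)
    clustered = [c for c in clusters if c != NOISE]
    if not clustered:
        return ["noise"] * len(features)
    main = max(set(clustered), key=clustered.count)
    return ["song" if c == main else "noise" for c in clusters]


# Hypothetical sound units: (duration in s, peak frequency in kHz).
units = [
    (0.30, 4.0), (0.32, 4.1), (0.28, 3.9), (0.31, 4.2), (0.29, 4.0),
    (1.50, 0.5),   # long low-frequency event, e.g. wind
    (0.05, 9.0),   # short high-frequency click
]
unit_labels = label_units(units)
```

In practice the features would be scaled before clustering, since DBSCAN's `eps` is a single distance threshold applied across all dimensions.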
This article is indexed in ScienceDirect and other databases.