New article online: Improving biodiversity assessment via unsupervised separation of biological sounds from long-duration recordings

Our new article has been published in Scientific Reports! In this article, we introduce a novel machine learning tool, the periodicity coded nonnegative matrix factorization (PC-NMF). The PC-NMF can separate biological sounds from a noisy long-term spectrogram in an unsupervised manner. It is therefore a useful tool for evaluating soundscape dynamics and facilitating soundscape-based biodiversity assessment.

You can download the MATLAB code of PC-NMF and the test data from the supplementary dataset of our article.

Improving biodiversity assessment via unsupervised separation of biological sounds from long-duration recordings

Scientific Reports 7, 4547 (2017) doi:10.1038/s41598-017-04790-7

Tzu-Hao Lin, Yu Tsao
Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan (R.O.C.)

Shih-Hua Fang
Department of Electrical Engineering, Yuan Ze University, Taoyuan, Taiwan (R.O.C.)

Investigating the dynamics of biodiversity via passive acoustic monitoring is a challenging task, owing to the difficulty of identifying different animal vocalizations. Several indices have been proposed to measure acoustic complexity and to predict biodiversity. Although these indices perform well under low-noise conditions, they may be biased when environmental and anthropogenic noises are involved. In this paper, we propose a periodicity coded non-negative matrix factorization (PC-NMF) for separating different sound sources from a spectrogram of long-term recordings. The PC-NMF first decomposes a spectrogram into two matrices: spectral basis matrix and encoding matrix. Next, on the basis of the periodicity of the encoding information, the spectral bases belonging to the same source are grouped together. Finally, distinct sources are reconstructed on the basis of the cluster of the basis matrix and the corresponding encoding information, and the noise components are then removed to facilitate more accurate monitoring of biological sounds. Our results show that the PC-NMF precisely enhances biological choruses, effectively suppressing environmental and anthropogenic noises in marine and terrestrial recordings without a need for training data. The results may improve behaviour assessment of calling animals and facilitate the investigation of the interactions between different sound sources within an ecosystem.
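The three steps in the abstract (decompose, group by periodicity, reconstruct) can be sketched roughly as follows. This is a simplified illustration in Python with NumPy, not the published MATLAB code: the multiplicative-update NMF, the two-cluster k-means that stands in for the periodicity-based grouping, the synthetic spectrogram, and all function names and parameter values are assumptions made for this example.

```python
import numpy as np

def nmf(V, k, n_iter=300, seed=0):
    """Factorise V (freq x time) into a spectral basis matrix W (freq x k)
    and an encoding matrix H (k x time) using multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + 1e-3
    H = rng.random((k, V.shape[1])) + 1e-3
    eps = 1e-12
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

def periodicity(H):
    """Normalised magnitude spectrum of each encoding row (DC removed)."""
    H0 = H - H.mean(axis=1, keepdims=True)
    P = np.abs(np.fft.rfft(H0, axis=1))[:, 1:]
    return P / (P.sum(axis=1, keepdims=True) + 1e-12)

def two_means(P, n_iter=20):
    """Assumed stand-in for the grouping step: split the bases into two
    clusters by the similarity of their periodicity spectra."""
    d = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    centres = P[[i, j]].copy()  # start from the two most dissimilar rows
    for _ in range(n_iter):
        dist = np.linalg.norm(P[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for c in range(2):
            if np.any(labels == c):
                centres[c] = P[labels == c].mean(axis=0)
    return labels

# Synthetic long-term spectrogram (hypothetical data): a narrow-band source
# with a periodic encoding plus a broadband source with a random encoding.
rng = np.random.default_rng(1)
f = np.arange(32)
t = np.arange(240)
chorus_spectrum = np.exp(-0.5 * ((f - 8) / 2.0) ** 2)
chorus_activity = 1.0 + np.sin(2 * np.pi * t / 24)
noise_spectrum = (f >= 20).astype(float)
noise_activity = rng.random(t.size)
V = np.outer(chorus_spectrum, chorus_activity) + np.outer(noise_spectrum, noise_activity)

W, H = nmf(V, k=4)                    # step 1: decomposition
labels = two_means(periodicity(H))    # step 2: grouping by periodicity
# Step 3: rebuild each source from its own bases and encodings.
sources = [W[:, labels == c] @ H[labels == c, :] for c in range(2)]
```

Each entry of `sources` is a spectrogram of the same size as `V`. Deciding which cluster is the biological chorus and which is the noise is left to inspection in this sketch.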


Ecoacoustics 2016

2016/6/5-8 @ University of Michigan

Investigation on the dynamics of soundscape by using unsupervised detection and classification algorithms

Tzu-Hao Lin, Lien-Siang Chou
Institute of Ecology and Evolutionary Biology, National Taiwan University, Republic of China (Taiwan)

Yu-Huang Wang
Biodiversity Research Center, Academia Sinica, Republic of China (Taiwan)

Soundscape has been proposed as a potential information source for studying the variability of biodiversity. However, analyzing a soundscape is challenging when there is no sufficient database for recognizing the various sounds collected in long-duration recordings. Previous studies have measured several acoustic diversity indices to quantify variation in biodiversity, but these indices remain difficult to interpret without ground truth. In this study, we propose to analyze the composition of soundscape scenes and visualize soundscape dynamics by using unsupervised detection and classification algorithms. Different soundscape scenes were classified according to the tonal sounds, pulsed sounds, and acoustic features obtained from a long-term spectrogram. The number of soundscape scenes is determined automatically by adjusting the proportion of variation explained by the classification results. The unsupervised classifier has been employed to analyze the soundscape dynamics in several forests and shallow marine environments in Taiwan. Our results demonstrate that the seasonal and diurnal patterns of geophony, biophony, and anthrophony can be effectively investigated. In addition, spatial changes in the soundscape can be discriminated according to the composition of soundscape scenes. After the biophony scenes have been identified, we can apply the same classifier again to measure the complexity of biological sounds and examine the variability of biodiversity. The current approach provides researchers and managers with a visualization platform to monitor soundscape dynamics and to study the interactions among the acoustic environment, biodiversity, and human activities in the future.
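One possible reading of "the number of soundscape scenes will be automatically determined" is to increase the number of clusters until the classification explains a chosen share of the variation. The abstract does not name the classifier, so the sketch below swaps in a plain k-means in Python with NumPy; the two-dimensional feature vectors, the 90% target, and all function names are assumptions for illustration only.

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Plain k-means with a deterministic farthest-first initialisation."""
    centres = [X[np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1))]]
    while len(centres) < k:
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(d)])
    centres = np.array(centres, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centres[c] = X[labels == c].mean(axis=0)
    return labels, centres

def explained_variance(X, labels, centres):
    """Share of the total variation captured by the cluster centres."""
    total = ((X - X.mean(axis=0)) ** 2).sum()
    within = ((X - centres[labels]) ** 2).sum()
    return 1.0 - within / total

def choose_k(X, target=0.9, k_max=10):
    """Return the smallest k whose clustering reaches the target share."""
    for k in range(1, k_max + 1):
        labels, centres = kmeans(X, k)
        ev = explained_variance(X, labels, centres)
        if ev >= target:
            break
    return k, labels, ev

# Hypothetical feature vectors: three well-separated "scenes" in a
# two-dimensional feature space, 50 time frames each.
rng = np.random.default_rng(0)
scene_means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([m + 0.5 * rng.standard_normal((50, 2)) for m in scene_means])

k, labels, explained = choose_k(X, target=0.9)
```

Raising `target` yields more, finer scenes, and lowering it merges them, which is the sense in which the variation explained controls the number of scenes in this sketch.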

New article online: Automatic classification of delphinids based on the representative frequencies of whistles

Our new article, which introduces a method that uses the distribution of representative frequencies to classify delphinid species, has been published in the Journal of the Acoustical Society of America. Please contact me if you are interested in a PDF copy or the algorithm.

Automatic classification of delphinids based on the representative frequencies of whistles

J. Acoust. Soc. Am. 138, 1003 (2015)

Tzu-Hao Lin, Lien-Siang Chou
Institute of Ecology and Evolutionary Biology, National Taiwan University, Number 1, Section 4, Roosevelt Road, Taipei 10617, Taiwan

Classification of odontocete species remains a challenging task for passive acoustic monitoring. Classifiers that have been developed use spectral features extracted from echolocation clicks and whistle contours. Most of these contour-based classifiers require complete contours to reduce measurement errors. Therefore, overlapping contours and partially detected contours in an automatic detection algorithm may increase the bias for contour-based classifiers. In this study, classification was conducted on each recording section without extracting individual contours. The local-max detector was used to extract representative frequencies of delphinid whistles, and each section was divided into multiple non-overlapping fragments. Three acoustical parameters were measured from the distribution of representative frequencies in each fragment. By using the statistical features of the acoustical parameters and the percentage of overlapping whistles, a correct classification rate of 70.3% was reached for the recordings of seven species (Tursiops truncatus, Delphinus delphis, Delphinus capensis, Peponocephala electra, Grampus griseus, Stenella longirostris longirostris, and Stenella attenuata) archived in [...]. In addition, the correct classification rate was not dramatically reduced in various simulated noise conditions. This algorithm can be employed in acoustic observatories to classify different delphinid species and facilitate future studies on the community ecology of odontocetes.

Using the local-max detector to detect tonal sounds

Many animals produce tonal calls. The acoustic characteristics of tonal sounds can help us identify the species and behavior of calling animals. However, some animals, such as cetaceans, have a highly complex repertoire of tonal sounds, which makes automatic detection in passive acoustic monitoring more difficult.

For this reason, I developed this program to help people study animals’ tonal sounds with passive acoustic monitoring. Its purpose is to reduce the manual labor of detecting animal vocalizations in passive acoustic recordings. The current program can detect multiple types of tonal sound without training data or sound templates. Detection is based on the prominence of tonal features in the spectrogram.

This program primarily targets “tonal sounds”, but it can also detect “burst-pulses” and other “tonal noise” with a strong tonal appearance. It is meant to be usable by everyone. It not only helps users detect the occurrence of calls, but also provides information on their acoustic features, which users can use for further analysis.

The following figure is a demonstration of my algorithm.

[Figure: whistle detector]

The detection process includes three main steps: 1. removing ambient noise, 2. extracting tonal spectral peaks, and 3. noise filtering.

  1. Spectrograms of sound recordings are produced using fast Fourier transform (FFT) with the Hamming window. Ambient noise is removed by pre-whitening spectrograms. Spectrograms are further smoothed using a Gaussian kernel.
  2. Tonal sounds are detected by applying two thresholds (SNR and tonality). If the instantaneous frequency bandwidth of a tonal sound is suitable, the peak frequency is extracted by finding the local maximum in the power spectrum.
  3. A noise filter is employed to exclude broadband noise and isolated narrowband noise. The tonal spectral peaks that remain after noise filtering are taken as the adopted frequencies of the animals’ tonal sounds.
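
The three steps above can be sketched roughly in Python with NumPy. This is a simplified illustration, not the program itself: median subtraction as the pre-whitening, one peak per time frame, a bin count as the bandwidth check (the tonality threshold is omitted), the isolated-detection filter, and all function names and threshold values are assumptions made for this example.

```python
import numpy as np

def spectrogram_db(x, n_fft=256, hop=128):
    """Power spectrogram in dB (freq x time) using FFT with a Hamming window."""
    win = np.hamming(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return 10 * np.log10(power.T + 1e-12)

def prewhiten(S_db):
    """Assumed pre-whitening: subtract each frequency bin's median level."""
    return S_db - np.median(S_db, axis=1, keepdims=True)

def gaussian_smooth(S, sigma=1.0):
    """Separable Gaussian smoothing along frequency, then along time."""
    r = int(3 * sigma)
    k = np.exp(-0.5 * (np.arange(-r, r + 1) / sigma) ** 2)
    k /= k.sum()
    S = np.apply_along_axis(np.convolve, 0, S, k, mode="same")
    return np.apply_along_axis(np.convolve, 1, S, k, mode="same")

def detect_tonal_peaks(S, freqs, snr_db=10.0, max_bandwidth_bins=8):
    """Keep the local maximum of a frame if it exceeds the SNR threshold
    and only a few bins are above it (a crude bandwidth check that also
    rejects broadband noise)."""
    peaks = np.full(S.shape[1], np.nan)
    for t in range(S.shape[1]):
        col = S[:, t]
        b = int(np.argmax(col))
        if col[b] > snr_db and np.count_nonzero(col > snr_db) <= max_bandwidth_bins:
            peaks[t] = freqs[b]
    return peaks

def drop_isolated(peaks):
    """Discard detections with no detection in either neighbouring frame."""
    valid = ~np.isnan(peaks)
    has_neighbour = np.zeros_like(valid)
    has_neighbour[1:] |= valid[:-1]
    has_neighbour[:-1] |= valid[1:]
    out = peaks.copy()
    out[valid & ~has_neighbour] = np.nan
    return out

# Hypothetical test signal: 2 s of white noise with a 1500 Hz tone
# between 0.5 s and 1.0 s.
fs = 8000
n_fft, hop = 256, 128
rng = np.random.default_rng(0)
time = np.arange(2 * fs) / fs
x = 0.1 * rng.standard_normal(time.size)
tone_on = (time >= 0.5) & (time < 1.0)
x[tone_on] += np.sin(2 * np.pi * 1500 * time[tone_on])

freqs = np.fft.rfftfreq(n_fft, 1 / fs)
S = gaussian_smooth(prewhiten(spectrogram_db(x, n_fft, hop)))  # step 1
peaks = drop_isolated(detect_tonal_peaks(S, freqs))            # steps 2 and 3
detected = ~np.isnan(peaks)
frame_times = (np.arange(peaks.size) * hop + n_fft / 2) / fs
```

`peaks` holds one frequency per time frame, or NaN where nothing was detected. Note that median subtraction would also remove a tone that is present for most of the recording, so this sketch only suits intermittent calls.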

Details of the local-max detector for tonal sounds are available in the following publication:
Lin, Tzu-Hao, Chou, Lien-Siang, Akamatsu, Tomonari, Chan, Hsiang-Chih, Chen, Chi-Fang. (2013) An automatic detection algorithm for extracting the representative frequency of cetacean tonal sounds. Journal of the Acoustical Society of America, 134: 2477-2485.

If you are interested, please feel free to contact me. This program is free of charge, and I would be glad to collaborate! Please also take a look at the operation manual of this program to understand its system requirements and possible applications.

Operation manual of Local-max detector for tonal sounds


Tzu-Hao (Harry) Lin