New article online: Improving biodiversity assessment via unsupervised separation of biological sounds from long-duration recordings

Our new article has been published in Scientific Reports! In this article, we introduce a novel machine learning tool, periodicity coded non-negative matrix factorization (PC-NMF). PC-NMF separates biological sounds from a noisy long-term spectrogram in an unsupervised manner, making it a useful tool for evaluating soundscape dynamics and facilitating soundscape-based biodiversity assessment.

The MATLAB code for PC-NMF and the test data can be downloaded from the supplementary dataset of our article.

Improving biodiversity assessment via unsupervised separation of biological sounds from long-duration recordings

Scientific Reports 7, 4547 (2017) doi:10.1038/s41598-017-04790-7

Tzu-Hao Lin, Yu Tsao
Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan (R.O.C.)

Shih-Hua Fang
Department of Electrical Engineering, Yuan Ze University, Taoyuan, Taiwan (R.O.C.)

Investigating the dynamics of biodiversity via passive acoustic monitoring is a challenging task, owing to the difficulty of identifying different animal vocalizations. Several indices have been proposed to measure acoustic complexity and to predict biodiversity. Although these indices perform well under low-noise conditions, they may be biased when environmental and anthropogenic noises are involved. In this paper, we propose a periodicity coded non-negative matrix factorization (PC-NMF) for separating different sound sources from a spectrogram of long-term recordings. The PC-NMF first decomposes a spectrogram into two matrices: spectral basis matrix and encoding matrix. Next, on the basis of the periodicity of the encoding information, the spectral bases belonging to the same source are grouped together. Finally, distinct sources are reconstructed on the basis of the cluster of the basis matrix and the corresponding encoding information, and the noise components are then removed to facilitate more accurate monitoring of biological sounds. Our results show that the PC-NMF precisely enhances biological choruses, effectively suppressing environmental and anthropogenic noises in marine and terrestrial recordings without a need for training data. The results may improve behaviour assessment of calling animals and facilitate the investigation of the interactions between different sound sources within an ecosystem.
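The three PC-NMF steps above can be sketched in Python. This is an illustrative reimplementation of the idea using scikit-learn, not the authors' released MATLAB code; the FFT-based periodicity feature and the fixed two-cluster split are simplifying assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import NMF

def pc_nmf(spectrogram, n_bases=8, random_state=0):
    """Separate a non-negative spectrogram (freq x time) into two sources."""
    # Step 1: decompose into spectral bases W and encodings H.
    model = NMF(n_components=n_bases, init="nndsvda", max_iter=500,
                random_state=random_state)
    W = model.fit_transform(spectrogram)   # (freq, n_bases)
    H = model.components_                  # (n_bases, time)

    # Step 2: describe each basis by the periodicity of its encoding,
    # here the normalized magnitude spectrum of each row of H.
    periodicity = np.abs(np.fft.rfft(H, axis=1))
    periodicity /= periodicity.sum(axis=1, keepdims=True) + 1e-12

    # Step 3: group bases with similar periodicity into two clusters
    # (e.g. biological chorus vs. noise).
    labels = KMeans(n_clusters=2, n_init=10,
                    random_state=random_state).fit_predict(periodicity)

    # Step 4: reconstruct one spectrogram per cluster of bases.
    return [W[:, labels == k] @ H[labels == k, :] for k in (0, 1)]
```

Computing acoustic indices on the separated biological spectrogram, rather than the raw mixture, is what reduces the noise bias discussed above.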


2017 Marine Science Annual Meeting (2017年海洋科學年會)

2017/5/4-5 @ National Sun Yat-sen University

Listening to the messages of the ocean: applying deep learning to analyze the dynamics of marine soundscapes

Tzu-Hao Lin, Yu Tsao
Research Center for Information Technology Innovation, Academia Sinica

Passive acoustic monitoring has been widely applied in marine environmental and ecological research. The various environmental and animal sounds in long-duration recordings have deepened our understanding of marine ecosystems, and many studies have examined the impact of anthropogenic noise on marine life. However, previous analyses of marine soundscapes have mostly focused on the time-frequency characteristics of noise and relied on rule-based detectors to search for the sounds of marine animals. Because marine soundscapes are strongly shaped by topography, weather, biological communities, and human activities, spectrogram analysis may fail to describe multiple concurrent sound sources, and detector performance varies with changing noise conditions. To effectively separate the sound sources of a marine soundscape, this study applies non-negative matrix factorization (NMF) and its variants to long-term spectrograms, decomposing the input data into a feature (basis) matrix and an encoding matrix. Although a single-layer NMF can, after many iterations, roughly learn the spectral features of each source in the basis matrix and their temporal intensities in the encoding matrix, it still struggles to separate multiple overlapping sources. In this study, multiple learners were pretrained layer by layer and stacked into a deep architecture, with the number of bases gradually reduced between layers. Using the error between the input data and the data reconstructed from the final layer, the model parameters of each layer were iteratively adjusted over many iterations to achieve the best source separation. We analyzed marine soundscapes with different ambient-noise characteristics from various locations. The results show that, without recognition templates or data labels, deep learning can effectively separate the major sound sources in the ocean: fish choruses, snapping shrimp pulses, vessel noise, and ambient sound. The learned basis matrices can also serve as recognition templates for semi-supervised analysis of large volumes of online data. By separating sound sources through deep learning, we will be able to better characterize the complex structure of marine soundscapes and thereby investigate changes in the marine environment and ecology, as well as the impacts of human development.
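Assuming the layer-wise scheme reads as successive factorizations of the previous layer's encoding matrix with shrinking basis counts, a minimal sketch (omitting the joint fine-tuning step) might look like the following; the layer sizes are illustrative placeholders.

```python
import numpy as np
from sklearn.decomposition import NMF

def stacked_nmf(spectrogram, layer_sizes=(16, 8, 4), random_state=0):
    """Factorize layer by layer, shrinking the number of bases each time."""
    Ws, current = [], spectrogram
    for n in layer_sizes:
        model = NMF(n_components=n, init="nndsvda", max_iter=400,
                    random_state=random_state)
        Ws.append(model.fit_transform(current))  # this layer's bases
        current = model.components_              # encodings feed the next layer
    return Ws, current

def reconstruct(Ws, H):
    """Chain the basis matrices to map the deepest encoding back to a spectrogram."""
    V = H
    for W in reversed(Ws):
        V = W @ V
    return V
```

The reconstruction error between `reconstruct(Ws, H)` and the input spectrogram is the quantity the fine-tuning step would minimize.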

International Symposium on Grids & Clouds 2017

2017/3/5-10 @ Academia Sinica, Taipei, Taiwan

Listening to the ecosystem: the integration of machine learning and a long-term soundscape monitoring network

Tzu-Hao Lin, Yu Tsao
Research Center for Information Technology Innovation, Academia Sinica

Yu-Huang Wang
Taiwan Biodiversity Information Facility, Biodiversity Research Center, Academia Sinica

Han-Wei Yen
Academia Sinica Grid Computing

Information on the variability of the environment and biodiversity is essential for conservation management. In recent years, soundscape monitoring has been proposed as a new approach for assessing the dynamics of biodiversity. A soundscape is the collection of biological sounds, environmental sounds, and anthropogenic noise, which provides essential information regarding the natural environment, the behavior of calling animals, and human activities. Recent developments in recording networks facilitate field surveys in remote forests and deep marine environments. However, analyzing big acoustic data remains a challenging task due to the lack of a sufficient database for recognizing various animal vocalizations. Therefore, we have developed three tools for analyzing and visualizing soundscape data: (1) a long-term spectrogram viewer, (2) a biological chorus detector, and (3) a soundscape event classifier. The long-term spectrogram viewer helps users visualize weeks or months of recordings and evaluate the dynamics of the soundscape. The biological chorus detector automatically recognizes biological choruses without any sound template. The soundscape event classifier separates the biological chorus and non-biological noise from a long-term spectrogram and identifies various biological events in an unsupervised manner. We have applied these tools to terrestrial and marine recordings collected in Taiwan to investigate the variability of the environment and biodiversity. In the future, we will integrate these tools with the Asian Soundscape monitoring network. Through open soundscape data, we hope to provide ecological researchers and citizens with an interactive platform for studying the dynamics of ecosystems and the interactions among the acoustic environment, biodiversity, and human activities.

2017 Animal Behavior and Ecology Symposium (2017年動物行為生態研討會)

2017/1/23-24 @ National Sun Yat-sen University, Kaohsiung

Applying machine learning to investigate the relationship between marine soundscape dynamics and the vocal activity of Indo-Pacific humpback dolphins

Tzu-Hao Lin, Yu Tsao
Research Center for Information Technology Innovation, Academia Sinica

Shih-Hua Fang
Department of Electrical Engineering, Yuan Ze University

The vocal behavior of cetaceans is highly variable: different populations may change their whistle characteristics under different ambient sound conditions and alter the structure of their sounds when encountering anthropogenic noise. The marine soundscape, composed of environmental sounds, animal sounds, and anthropogenic noise, is itself highly variable. Although many studies have examined cetacean vocalizations and single sound sources, how cetaceans change their behavior within a variable marine soundscape where multiple sources overlap remains unclear. In this study, long-term underwater recordings were collected off Miaoli throughout 2014 using autonomous recorders. We first applied an automatic detector to search for the underwater sounds of Indo-Pacific humpback dolphins, and then used non-negative matrix factorization to learn the features of the major sound sources in the soundscape. The unsupervised learner effectively decomposed the long-term spectrogram and visualized the relative changes of the major sources, such as croaker choruses, snapping shrimp sounds, and environmental and anthropogenic noise. Analyzing the soundscape and dolphin sounds with generalized additive models, we found that the detection rate and complexity of humpback dolphin sounds correlated differently with each sound source. These results show that separating the sources of a soundscape through machine learning can clarify the interactions between animals and the various sound sources. In the future, the information contained in soundscapes may also serve as ecological remote-sensing data for predicting animal activity.

2016 Taiwan Geosciences Assembly (2016年臺灣地球科學聯合學術研討會)

2016/5/20

Session on nearshore and coastal environments: Land-Ocean Interactions in the Changing Coastal Zones of Taiwan:
Scientific Basis and Societal Engagements

Applying unsupervised classification to investigate the spatio-temporal variation of marine soundscapes

Tzu-Hao Lin
Institute of Ecology and Evolutionary Biology, National Taiwan University

A marine soundscape, composed of environmental sounds, animal sounds, and anthropogenic noise, is the acoustic environment constructed from the various underwater sounds. Environmental sounds may originate from natural events such as wind, waves, currents, and earthquakes; because seafloor topography and hydrology alter the propagation paths of sound, each location develops a unique acoustic environment. Animal sounds come mainly from vocalizing marine animals, but may also accompany their movements or surface activities. They are highly variable: in cetaceans and fishes, for example, the acoustic features differ between species, yet the sounds of a single species can also vary in structure with behavior. Anthropogenic noise at sea comes mainly from vessel traffic and marine construction; depending on the received level, noise can cause physiological damage, disturb behavior, and mask communication, and long-term exposure may also increase immune stress. Surveying the marine soundscape therefore helps us understand not only the characteristics of the marine environment and the composition and activity of marine animals, but also the impact of anthropogenic noise on marine ecosystems. With recent advances in underwater technology, autonomous recorders have been widely used internationally to collect long-duration recordings for investigating the spatio-temporal variation of marine soundscapes. However, the development of marine soundscape ecology has been hindered by the lack of complete databases for recognizing various sounds and the difficulty of analyzing massive recordings manually. This study applies computational techniques to resolve the event composition of a marine soundscape and thereby examine the dynamics of the marine environment and ecology. After retrieving the underwater recordings from the field, the mean power spectrum of each five-minute segment was computed to compress the large volume of recordings, and the series of mean power spectra was assembled into a long-term spectrogram as the basis for visual analysis of the soundscape. In addition, the five-minute mean power spectra served as analysis parameters: after reducing the dimensionality of the feature vectors through multivariate analysis, the clusters of data in the multidimensional space were distinguished to form the framework for unsupervised classification of soundscape events. Applying this custom algorithm to recordings collected near the Zhonggang River estuary off Miaoli, the results show clear structural differences in soundscape-event composition between the sandy, muddy estuarine waters and the rocky surroundings of an artificial reef. In the estuarine waters, the soundscape was dominated by relatively quiet ambient sound and nocturnal croaker choruses, whereas near the artificial reef it was dominated by noisy snapping shrimp sounds and low-frequency fish choruses appearing after dusk. Visualizing the temporal variation of soundscape-event composition will help marine ecologists better understand the community composition of marine animals and the dynamics of ecosystems at each site, and provide essential baseline data for marine conservation management.
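The pipeline described above (window-averaged power spectra, dimensionality reduction, then unsupervised clustering) can be sketched as follows; the window length, number of clusters, and the specific choice of PCA and k-means are illustrative placeholders, not the exact algorithm used in the study.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def cluster_soundscape(spectrogram, frames_per_window=60, n_clusters=4):
    """Assign a cluster label to each fixed-length window of a (freq x time) spectrogram."""
    freq, time = spectrogram.shape
    n_win = time // frames_per_window

    # One feature vector per window: the mean power spectrum.
    trimmed = spectrogram[:, :n_win * frames_per_window]
    features = trimmed.reshape(freq, n_win, frames_per_window).mean(axis=2).T

    # Reduce the dimensionality of the feature vectors.
    reduced = PCA(n_components=min(5, n_win, freq)).fit_transform(features)

    # Group the windows into soundscape-event clusters without labels.
    return KMeans(n_clusters=n_clusters, n_init=10,
                  random_state=0).fit_predict(reduced)
```

Plotting the label sequence against time of day is one way to visualize the kind of site differences (nocturnal choruses vs. continuous snapping shrimp sound) reported above.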

Conference presentation: 21st Biennial Conference on the Biology of Marine Mammals

13-18 Dec 2015

A noisy dinner? Passive acoustic monitoring on the predator-prey interactions between Indo-Pacific humpback dolphins and croakers

Tzu-Hao Lin, Wen-Ching Lien, Chih-Kai Yang, and Lien-Siang Chou
Institute of Ecology and Evolutionary Biology, National Taiwan University

Shane Guan
Office of Protected Resources, National Marine Fisheries Service, Silver Spring, MD, USA

The spatio-temporal dynamics of prey resources are considered important factors shaping the distribution and behavior of odontocetes. The Indo-Pacific humpback dolphin (Sousa chinensis) is a coastal species that feeds primarily on benthic croakers. It has been hypothesized that the distribution pattern and periodic occurrence of humpback dolphins result from the movements of their prey. However, the interactions between humpback dolphins and croakers remain unclear. Between May 2013 and November 2014, underwater sound recordings were collected in western Taiwan waters. Croaker choruses and humpback dolphin echolocation clicks were automatically detected using custom-developed algorithms. Both croaker choruses and dolphin clicks were frequently detected in shallow estuarine waters during spring and summer. In addition, shorter inter-click intervals were detected more frequently in these areas, indicating likely foraging behavior. Current results suggest that the core habitats of humpback dolphins agree with the areas of prominent croaker choruses. Diurnal cycle analysis showed that croaker choruses were most evident from sunset until approximately 4 A.M. In estuarine waters, humpback dolphin clicks were most frequently detected during the nighttime, with reduced detection rates after 8 A.M. This suggests that the diurnal behavior of humpback dolphins could be associated with the calling behavior of croakers. Although it remains unknown whether a dolphin can passively localize a calling croaker, our results indicate that the foraging probability of humpback dolphins may be elevated during the nighttime chorus of croakers. Information on the spatio-temporal dynamics of croaker choruses can be important for the conservation management of humpback dolphins. Further details of the predator-prey interactions between humpback dolphins and croakers can be investigated using hydrophone arrays.

Poster (pdf)

New article online: Automatic classification of delphinids based on the representative frequencies of whistles

Our new article, which introduces a method of classifying delphinid species from representative frequency distributions, has been published in the Journal of the Acoustical Society of America. Please contact me if you are interested in a PDF copy or the algorithm.

Automatic classification of delphinids based on the representative frequencies of whistles

J. Acoust. Soc. Am. 138, 1003 (2015); http://dx.doi.org/10.1121/1.4927695

Tzu-Hao Lin, Lien-Siang Chou
Institute of Ecology and Evolutionary Biology, National Taiwan University, Number 1, Section 4, Roosevelt Road, Taipei 10617, Taiwan

Classification of odontocete species remains a challenging task for passive acoustic monitoring. Existing classifiers use spectral features extracted from echolocation clicks and whistle contours. Most contour-based classifiers require complete contours to reduce measurement errors; therefore, overlapping and partially detected contours from an automatic detection algorithm may increase their bias. In this study, classification was conducted on each recording section without extracting individual contours. A local-max detector was used to extract the representative frequencies of delphinid whistles, and each section was divided into multiple non-overlapping fragments. Three acoustical parameters were measured from the distribution of representative frequencies in each fragment. By using the statistical features of the acoustical parameters and the percentage of overlapping whistles, a correct classification rate of 70.3% was reached for the recordings of seven species (Tursiops truncatus, Delphinus delphis, Delphinus capensis, Peponocephala electra, Grampus griseus, Stenella longirostris longirostris, and Stenella attenuata) archived in MobySound.org. In addition, the correct classification rate was not dramatically reduced under various simulated noise conditions. This algorithm can be employed in acoustic observatories to classify delphinid species and facilitate future studies on the community ecology of odontocetes.
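A hedged sketch of the fragment-based feature idea: pick one representative frequency per time frame with a simple local-max rule, then summarize non-overlapping fragments with three statistics. The SNR threshold, fragment length, and the specific three parameters chosen here are illustrative assumptions, not the published measurements.

```python
import numpy as np

def representative_frequencies(spectrogram, freqs, threshold_db=6.0):
    """Per-frame peak frequency, kept only when the peak stands out from the frame's noise."""
    # Per-frame noise floor: the median level across frequency bins.
    noise_floor = np.median(spectrogram, axis=0, keepdims=True)
    snr = 10 * np.log10(spectrogram / (noise_floor + 1e-12) + 1e-12)
    peak_bins = snr.argmax(axis=0)
    keep = snr.max(axis=0) >= threshold_db
    # One representative frequency (or NaN) per time frame.
    return np.where(keep, freqs[peak_bins], np.nan)

def fragment_features(rep, frames_per_fragment=50):
    """Three summary parameters per non-overlapping fragment."""
    n_frag = len(rep) // frames_per_fragment
    feats = []
    for i in range(n_frag):
        seg = rep[i * frames_per_fragment:(i + 1) * frames_per_fragment]
        seg = seg[~np.isnan(seg)]
        if len(seg) == 0:
            continue  # no whistle energy in this fragment
        feats.append([seg.mean(),   # central frequency
                      seg.std(),    # frequency spread
                      np.abs(np.diff(seg)).mean() if len(seg) > 1 else 0.0])
    return np.array(feats)
```

Because each fragment yields a fixed-length feature vector regardless of how many whistles overlap, a standard classifier can be trained on these statistics without tracing individual contours.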