
Sensors (Basel, Switzerland)

Towards End-to-End Acoustic Localization Using Deep Learning: From Audio Signals to Source Position Coordinates



Abstract

This paper presents a novel approach for indoor acoustic source localization using microphone arrays, based on a Convolutional Neural Network (CNN). In the proposed solution, the CNN is designed to directly estimate the three-dimensional position of a single acoustic source using the raw audio signal as the input information and avoiding the use of hand-crafted audio features. Given the limited amount of available localization data, we propose, in this paper, a training strategy based on two steps. We first train our network using semi-synthetic data generated from close talk speech recordings. We simulate the time delays and distortion suffered in the signal that propagate from the source to the array of microphones. We then fine tune this network using a small amount of real data. Our experimental results, evaluated on a publicly available dataset recorded in a real room, show that this approach is able to produce networks that significantly improve existing localization methods based on SRP-PHAT strategies and also those presented in very recent proposals based on Convolutional Recurrent Neural Networks (CRNN). In addition, our experiments show that the performance of our CNN method does not show a relevant dependency on the speaker’s gender, nor on the size of the signal window being used.
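The semi-synthetic data generation described above rests on a simple physical model: each microphone receives the close-talk signal delayed by the source-to-microphone propagation time (distance divided by the speed of sound). The sketch below illustrates that delay model only, using an integer-sample delay and a 1/r spreading loss; the function name, array geometry, and omission of the paper's distortion modelling are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C

def simulate_array_signals(clean, fs, source_pos, mic_positions):
    """Generate semi-synthetic multichannel data from a close-talk
    recording: delay the clean signal to each microphone by its
    propagation time from the source (rounded to whole samples) and
    apply a simple 1/r attenuation. Returns an array of shape
    (num_mics, num_samples)."""
    signals = []
    for mic in mic_positions:
        dist = np.linalg.norm(np.asarray(source_pos) - np.asarray(mic))
        delay_samples = int(round(dist / SPEED_OF_SOUND * fs))
        attenuation = 1.0 / max(dist, 1e-6)  # spherical spreading loss
        delayed = np.concatenate([np.zeros(delay_samples), clean]) * attenuation
        signals.append(delayed)
    # Zero-pad every channel to the longest one so they stack cleanly.
    n = max(len(s) for s in signals)
    return np.stack([np.pad(s, (0, n - len(s))) for s in signals])
```

Feeding an impulse through this model makes the geometry visible: the peak in each channel lands at that microphone's propagation delay in samples, which is exactly the inter-channel time-difference structure the CNN must learn to map back to a 3-D position.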
