
Sensors (Basel, Switzerland)

Towards End-to-End Acoustic Localization Using Deep Learning: From Audio Signals to Source Position Coordinates



Abstract

This paper presents a novel approach for indoor acoustic source localization using microphone arrays, based on a Convolutional Neural Network (CNN). In the proposed solution, the CNN is designed to directly estimate the three-dimensional position of a single acoustic source using the raw audio signal as the input information and avoiding the use of hand-crafted audio features. Given the limited amount of available localization data, we propose, in this paper, a training strategy based on two steps. We first train our network using semi-synthetic data generated from close talk speech recordings. We simulate the time delays and distortion suffered in the signal that propagate from the source to the array of microphones. We then fine tune this network using a small amount of real data. Our experimental results, evaluated on a publicly available dataset recorded in a real room, show that this approach is able to produce networks that significantly improve existing localization methods based on SRP-PHAT strategies and also those presented in very recent proposals based on Convolutional Recurrent Neural Networks (CRNN). In addition, our experiments show that the performance of our CNN method does not show a relevant dependency on the speaker’s gender, nor on the size of the signal window being used.
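The semi-synthetic data generation described above rests on a simple physical model: each microphone receives the close-talk signal delayed by the source-to-microphone propagation time (distance divided by the speed of sound). The sketch below illustrates that delay model only, using an integer-sample delay and a 1/r spreading loss; the function name, array geometry, and omission of the paper's distortion modelling are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C

def simulate_array_signals(clean, fs, source_pos, mic_positions):
    """Generate semi-synthetic multichannel data from a close-talk
    recording: delay the clean signal to each microphone by its
    propagation time from the source (rounded to whole samples) and
    apply a simple 1/r attenuation. Returns an array of shape
    (num_mics, num_samples)."""
    signals = []
    for mic in mic_positions:
        dist = np.linalg.norm(np.asarray(source_pos) - np.asarray(mic))
        delay_samples = int(round(dist / SPEED_OF_SOUND * fs))
        attenuation = 1.0 / max(dist, 1e-6)  # spherical spreading loss
        delayed = np.concatenate([np.zeros(delay_samples), clean]) * attenuation
        signals.append(delayed)
    # Zero-pad every channel to the longest one so they stack cleanly.
    n = max(len(s) for s in signals)
    return np.stack([np.pad(s, (0, n - len(s))) for s in signals])
```

Feeding an impulse through this model makes the geometry visible: the peak in each channel lands at that microphone's propagation delay in samples, which is exactly the inter-channel time-difference structure the CNN must learn to map back to a 3-D position.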
