→State of the art: Add: pages, date, title, chapter, s2cid, authors 1-3. Removed URL that duplicated identifier. | Use this tool. Report bugs. | #UCB_Gadget
Line 29:
=== State of the art ===
In early evaluations, TDNN-based phoneme recognizers compared favourably with HMM-based phone models.<ref name="phoneme detection" /><ref name=":3" /> Modern deep TDNN architectures include many more hidden layers, and at higher layers they sub-sample or pool connections over broader contexts. They achieve up to 50% word error reduction over [[Mixture model|GMM]]-based acoustic models.<ref name=":4">{{cite book |doi=10.21437/Interspeech.2015-647 |doi-access=free |s2cid=8536162 |chapter=A time delay neural network architecture for efficient modeling of long temporal contexts |title=Interspeech 2015 |date=2015 |last1=Peddinti |first1=Vijayaditya |last2=Povey |first2=Daniel |last3=Khudanpur |first3=Sanjeev |pages=3214–3218 }}</ref><ref name=":5">David Snyder, Daniel Garcia-Romero, Daniel Povey, ''[http://danielpovey.com/files/2015_asru_tdnn_ubm.pdf A Time-Delay Deep Neural Network-Based Universal Background Models for Speaker Recognition]'', Proceedings of ASRU 2015.</ref> Although successive TDNN layers are intended to learn features of increasing context width, they still model local contexts. When longer-distance relationships and pattern sequences have to be processed, learning states and state sequences becomes important, and TDNNs can be combined with other modelling techniques.<ref name=":6">{{Cite journal |last1=Haffner |first1=Patrick |last2=Waibel |first2=Alex |date=1991 |title=Multi-State Time Delay Networks for Continuous Speech Recognition |url=https://proceedings.neurips.cc/paper_files/paper/1991/hash/069d3bb002acd8d7dd095917f9efe4cb-Abstract.html |website=proceedings.neurips.cc |volume=4 |publisher=NIPS |pages=135–142}}</ref><ref name=":1" /><ref name=":2" />
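The way sub-sampled layers widen the temporal context can be illustrated with a minimal Python sketch. It is not taken from the cited papers: the function names <code>context_span</code> and <code>tdnn_layer</code> are hypothetical, the splice offsets are chosen only for illustration, and the layer omits biases and uses a plain ReLU. Each layer combines the outputs of the layer below at a small set of time offsets, so the overall context of the network is the sum of the per-layer extremes.

```python
def context_span(layer_offsets):
    """Return (left, right) input context, in frames, seen by one top-layer output.

    layer_offsets: one tuple of time offsets per layer, bottom layer first.
    """
    left = sum(min(offsets) for offsets in layer_offsets)
    right = sum(max(offsets) for offsets in layer_offsets)
    return left, right


def tdnn_layer(frames, offsets, weights):
    """Apply one time-delay layer (no bias, ReLU) to a sequence of frames.

    frames:  list of feature vectors (lists of floats), one per time step.
    offsets: time offsets spliced together at each step, e.g. (-1, 2).
    weights: one row per output unit, of length len(offsets) * feature_dim.
    Only time steps whose full context lies inside the sequence are computed.
    """
    start = max(0, -min(offsets))
    stop = len(frames) - max(0, max(offsets))
    outputs = []
    for t in range(start, stop):
        # Concatenate the frames at t + offset into one spliced input vector.
        spliced = [x for o in offsets for x in frames[t + o]]
        outputs.append(
            [max(0.0, sum(w * x for w, x in zip(row, spliced))) for row in weights]
        )
    return outputs


# Illustrative (assumed) offsets: a dense bottom layer, sub-sampled layers above.
example_layers = [(-2, -1, 0, 1, 2), (-1, 2), (-3, 3), (-7, 2), (0,)]
span = context_span(example_layers)

# A one-dimensional toy input and a single unit that sums its two spliced frames.
toy_frames = [[float(i)] for i in range(10)]
toy_output = tdnn_layer(toy_frames, (-1, 2), [[1.0, 1.0]])
```

In this sketch the higher layers read only two time steps each, yet the stacked offsets give the top layer a context far wider than that of any single layer, which is the effect sub-sampling is meant to achieve at lower cost than dense splicing.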
== Applications ==