=== State of the art ===
TDNN-based phoneme recognizers performed favourably in early comparisons with HMM-based phone models.<ref name="phoneme detection" /><ref name=":3" /> Modern deep TDNN architectures include many more hidden layers and sub-sample or pool connections over broader contexts at higher layers. They achieve up to 50% word error reduction over [[Mixture model|GMM]]-based acoustic models.
==Applications==
=== Lip-reading – audio-visual speech ===
TDNNs were also used successfully in early demonstrations of audio-visual speech recognition, where the acoustic speech signal is complemented by visually reading lip movement.<ref name=":7" />
=== Handwriting recognition ===
TDNNs have been used effectively in compact, high-performance handwriting recognition systems. Shift-invariance was also adapted to spatial patterns (x/y-axes) for offline handwriting recognition from images.<ref name=":2" />
=== Video analysis ===
Video has a temporal dimension that makes a TDNN well suited to analysing motion patterns. One example of such analysis is the combined detection of vehicles and recognition of pedestrians.<ref>Christian Woehler and Joachim K. Anlauf, "Real-time object recognition on image sequences with the adaptable time delay neural network algorithm—applications for autonomous vehicles." Image and Vision Computing 19.9 (2001): 593–618.</ref> When examining videos, successive frames of the video are fed into the TDNN as input. The strength of the TDNN lies in its ability to recognize objects regardless of their shift in time, so that an object detected in one frame remains detectable as it moves across subsequent frames. Once an object is recognized in this manner, an application can predict where the object will appear in the future and plan an appropriate action.
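The frame-by-frame input described above can be illustrated with a minimal NumPy sketch (an illustrative construction, not code from any cited system): each frame is flattened into a feature vector, and a window of consecutive frames is slid over time to form the time-shifted input patterns a TDNN is trained on.

```python
import numpy as np

# hypothetical toy "video": 8 frames of 4x4 grayscale pixels
frames = np.arange(8 * 16, dtype=float).reshape(8, 4, 4)

# flatten each frame to a feature vector
features = frames.reshape(len(frames), -1)  # shape (8, 16)

# slide a window of 3 consecutive frames over time:
# each window is one time-shifted input pattern for the network
window = 3
inputs = np.stack(
    [features[t:t + window] for t in range(len(features) - window + 1)]
)
print(inputs.shape)  # (6, 3, 16): 6 shifted patterns of 3 frames each
```

Because the same weights score every window position, an object producing a given pattern of frames is recognized no matter where in the sequence it occurs.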
Two-dimensional TDNNs were later applied to other image-recognition tasks under the name of “[[Convolutional neural network|Convolutional Neural Networks]]”, where shift-invariant training is applied to the x/y axes of an image.
=== Common libraries ===
*TDNNs can be implemented in virtually all machine-learning frameworks using one-dimensional [[convolutional neural network]]s, due to the equivalence of the methods.
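The equivalence noted above can be made concrete with a small NumPy sketch (an illustrative implementation under assumed shapes, not taken from any particular framework): a TDNN layer is a one-dimensional convolution over time, applying the same weights to every window of delayed input frames.

```python
import numpy as np

def tdnn_layer(x, w, b):
    """One TDNN layer, written as a 1-D convolution over time.

    x: input sequence, shape (T, d_in)  -- T frames of d_in features
    w: weights, shape (k, d_in, d_out)  -- k = context window (delays)
    b: bias, shape (d_out,)

    Returns activations of shape (T - k + 1, d_out). The same weights
    are applied at every time shift -- this is the shift-invariance
    shared by TDNNs and 1-D convolutional networks.
    """
    T, d_in = x.shape
    k, _, d_out = w.shape
    out = np.empty((T - k + 1, d_out))
    for t in range(T - k + 1):
        # weighted sum over the k delayed frames, then a nonlinearity
        out[t] = np.tanh(
            np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1])) + b
        )
    return out

# toy input: 10 frames of 3 features; window of 4 delays, 5 hidden units
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 3))
w = rng.normal(size=(4, 3, 5))
b = np.zeros(5)
h = tdnn_layer(x, w, b)
print(h.shape)  # (7, 5)
```

Shifting the input by one frame shifts the output by one frame with identical values, which is exactly the weight-sharing a framework's 1-D convolution layer provides.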