Windows Speech Recognition: Difference between revisions

m Micorosoft->Microsoft - Fix a typo in one click
No edit summary
Tag: references removed
Line 12:
WSR is a locally processed speech recognition platform; it does not rely on [[cloud computing]] for accuracy, dictation, or recognition, but adapts based on contexts, grammars, speech samples, training sessions, and vocabularies. It provides a personal dictionary that allows users to include or exclude words or expressions from dictation and to optionally record pronunciations to increase recognition accuracy. With [[Windows Search]],<ref name="ThurrottAllchin">{{cite web |url=http://www.itprotoday.com/jim-allchin-talks-windows-vista |title=Jim Allchin Talks Windows Vista |last=Thurrott |first=Paul |authorlink=Paul Thurrott |date=October 6, 2010 |publisher=[[Penton (company)|Penton]] |work=[[Windows IT Pro]] |accessdate=March 29, 2018}}</ref> it can optionally analyze and collect text in documents and email, as well as [[handwritten]] [[tablet PC]] input, to contextualize and disambiguate terms to further adapt the recognizer.<ref name="Privacy">{{cite web |url=http://download.microsoft.com/download/7/9/4/7945a146-fc32-48c2-8c14-83b1b36696e5/Windows%20Vista%20Privacy%20Statement.rtf |title=Windows Vista Privacy Statement |author=[[Microsoft]] |date=2006 |format=RTF |accessdate=July 1, 2015}}</ref> Custom language models that adapt the recognizer to the specific contexts, phonetics, and terminologies of users in particular occupational fields such as legal or medical are also supported.<ref name="CustomizedVocabularies">{{cite web |url=https://blogs.msdn.microsoft.com/robch/2005/09/20/customized-speech-vocabularies-in-windows-vista/ |title=Customized speech vocabularies in Windows Vista |last=Chambers |first=Rob |date=September 20, 2005 |publisher=[[Microsoft]] |work=[[Microsoft Developer Network|MSDN]] |accessdate=March 29, 2018}}</ref>
 
WSR was developed to be integrated into Windows Vista, as Windows previously only supported speech recognition features exclusive to applications such as [[Windows Media Player]]. [[Microsoft Office XP]] introduced speech recognition, but it was mainly limited to [[Internet Explorer]] and [[Microsoft Office|Office]]. With the release of Windows Vista, [[Microsoft Office 2007|Office 2007]] and later versions of Office rely on WSR, replacing the separate Office speech recognition.<ref name="Office2007SR">{{cite web |url=https://support.office.com/en-us/article/What-happened-to-speech-recognition-c6541b32-82df-4c18-bfa5-c411f45337d3 |title=What happened to speech recognition? |publisher=[[Microsoft]] |work=Office Support |accessdate=November 9, 2016}}</ref> The majority of integrated applications in Windows Vista can be controlled through speech.<ref name="Guide">{{cite web |url=https://msdn.microsoft.com/en-us/library/bb530325.aspx |title=Windows Vista Speech Recognition Step-by-Step Guide |last=Phillips |first=Todd |date=2007 |publisher=[[Microsoft]] |work=[[MSDN]] |accessdate=June 30, 2015}}</ref> WSR is present in [[Windows 7]],<ref name="SpeechRecognitionWindows7">{{cite web |url=http://windows.microsoft.com/en-us/windows/what-can-do-speech-recognition#1TC=windows-7 |title=What can I do with Speech Recognition? 
|author=[[Microsoft]] |work=Windows How-to |accessdate=June 26, 2015}}</ref> [[Windows 8]],<ref name="Windows8SR">{{cite web |url=http://windows.microsoft.com//en-US//windows-8//using-speech-recognition |title=How to use Speech Recognition |publisher=[[Microsoft]] |work=Support |archiveurl=https://web.archive.org/web/20121025193813/http://windows.microsoft.com//en-US//windows-8//using-speech-recognition |archivedate=October 25, 2012 |accessdate=December 24, 2018}}</ref> [[Windows 8.1]],<ref name="UpdatedGuidelines">{{cite web |url=https://support.microsoft.com/en-us/help/14213/windows-how-to-use-speech-recognition |title=How to use Speech Recognition in Windows |date=August 31, 2016 |publisher=[[Microsoft]] |work=Support |accessdate=December 24, 2018}}</ref> [[Windows RT]],<ref name="UpdatedGuidelines"/> and [[Windows 10]].<ref name="Windows10">{{cite web |url=http://windows.microsoft.com/en-us/windows-10/use-voice-recognition-in-windows-10 |title=Use Voice Recognition in Windows 10 |author=[[Microsoft]] |work=Support |accessdate=August 24, 2015}}</ref>
 
==History==
 
===Precursors===
Microsoft was involved in speech recognition and [[speech synthesis]] research for many years before WSR. In 1993, Microsoft hired [[Xuedong Huang]] from [[Carnegie Mellon University]] to lead its speech development efforts; the company's research led to the development of the [[Speech Application Programming Interface|Speech API]] introduced in 1994.<ref name="TalkingWindowsVista">{{cite web |url=http://msdn2.microsoft.com/en-us/magazine/cc163663.aspx |title=Exploring New Speech Recognition And Synthesis APIs In Windows Vista |last=Brown |first=Robert |publisher=[[Microsoft]] |work=MSDN Magazine |archiveurl=https://web.archive.org/web/20080307054756/http://msdn2.microsoft.com/en-us/magazine/cc163663.aspx |archivedate=March 7, 2008 |accessdate=June 26, 2015}}</ref> Speech recognition had also been used in previous Microsoft products. Office XP and [[Microsoft Office 2003|Office 2003]] provided speech recognition capabilities among Internet Explorer and Office applications;<ref name="SpeechXP">{{cite web |url=https://support.microsoft.com/en-us/kb/306901 |title=How To Use Speech Recognition in Windows XP |author=[[Microsoft]] |work=Support |accessdate=June 26, 2015}}</ref> it also enabled limited speech functionality in [[Windows 98]], [[Windows ME]], [[Windows NT 4.0]], and [[Windows 2000]].<ref name="Description">{{cite web |url=https://support.microsoft.com/en-us/kb/278927 |title=Description of the speech recognition and handwriting recognition methods in Word 2002 |author=[[Microsoft]] |work=Support |archiveurl=https://web.archive.org/web/20150703125056/https://support.microsoft.com/en-us/kb/278927 |archivedate=July 3, 2015 |accessdate=March 26, 2018}}</ref> [[Windows XP]] [[Windows XP editions#Tablet PC Edition|Tablet PC Edition]] 2002 included speech recognition capabilities with the Tablet PC Input Panel,<ref name="WindowsXPTabletPCEdition">{{cite web |url=http://winsupersite.com/article/windows-xp2/windows-xp-tablet-pc-edition-reviewed-127413 |title=Windows 
XP Tablet PC Edition Review |last=Thurrott |first=Paul |authorlink=Paul Thurrott |date=June 25, 2002 |publisher=[[Penton (company)|Penton]] |work=[[Windows IT Pro]] |accessdate=June 26, 2015}}</ref><ref name="Natural">{{cite web |url=http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWDT05006_WinHEC05.ppt |title=Natural Input On Mobile PC Systems |last=Dresevic |first=Bodin |date=2005 |publisher=[[Microsoft]] |format=PPT |accessdate=June 26, 2015}}</ref> and the [[Microsoft Plus!#Microsoft Plus! for Windows XP|Microsoft Plus! for Windows XP]] expansion package enabled voice commands to be used in [[Windows Media Player]].<ref name="VoiceCommand">{{cite web |url=http://winsupersite.com/article/product-review/plus-for-windows-xp-review |title=Plus! for Windows XP Review |last=Thurrott |first=Paul |authorlink=Paul Thurrott |date=October 6, 2010 |publisher=[[Penton (company)|Penton]] |work=[[Windows IT Pro]] |accessdate=June 30, 2015}}</ref> However, this required installation of speech recognition as an additional component (with support primarily limited to individual applications); before Windows Vista, Windows did not include extensive or integrated speech recognition capabilities.<ref name="Natural"/>
 
===Development===
Line 23:
 
====Windows Vista====
At the 2002 [[Windows Hardware Engineering Conference]] (WinHEC 2002), Microsoft announced that Windows Vista (then codenamed "Longhorn") would include advances in speech recognition and in features such as [[microphone array]] support;<ref name="WinHEC2002">{{cite web |url=https://www.pcmag.com/article2/0,2817,1183143,00.asp |title=WinHEC: The Pregame Show |last=Stam |first=Nick |date=April 16, 2002 |publisher=[[Ziff Davis Media]] |work=[[PC Magazine]] |accessdate=June 26, 2015}}</ref> these features were part of the company's goal to "provide a consistent quality audio infrastructure for natural (continuous) speech recognition and (discrete) command and control."<ref name="AudioConsiderations">{{cite web |url=http://download.microsoft.com/download/whistler/WHP/1.0/WXP/EN-US/WH02_AV01.exe |title=Audio Considerations for Voice-Enabled Applications |last=Flandern Van |first=Mike |date=2002 |publisher=[[Microsoft]] |work=[[Windows Hardware Engineering Conference]] |format=EXE |archiveurl=https://web.archive.org/web/20020506020208/http://download.microsoft.com/download/whistler/WHP/1.0/WXP/EN-US/WH02_AV01.exe |archivedate=May 6, 2002 |accessdate=March 30, 2018}}</ref> [[Bill Gates]] stated during the 2003 [[Professional Developers Conference]] (PDC 2003) that Microsoft would "build speech capabilities into the system -- a big advance for that in 'Longhorn,' in both recognition and synthesis, real-time";<ref name="SpeechCapabilities">{{cite web |url=http://www.microsoft.com/billgates/speeches/2003/10-27PDC2003.asp |title=Bill Gates' Web Site - Speech Transcript, Microsoft Professional Developers Conference 2003 |author=[[Microsoft]] |date=October 27, 2003 |archiveurl=https://web.archive.org/web/20040203152133/http://www.microsoft.com/billgates/speeches/2003/10-27PDC2003.asp |archivedate=February 3, 2004 |accessdate=June 26, 2015}}</ref><ref name="SpeechPDC2003">{{cite web |url=http://windowsitpro.com/windows-server-2008/live-pdc-2003-day-1-monday 
|title=Live from PDC 2003: Day 1, Monday |last2=Furman |first2=Keith |last=Thurrott |first=Paul |date=October 26, 2003 |publisher=[[Penton (company)|Penton]] |work=[[Windows IT Pro]] |accessdate=June 26, 2015}}</ref> and pre-release builds throughout the [[development of Windows Vista]] included a speech engine with training features.<ref name="Windows2006">{{cite web |url=http://www.techhive.com/article/113631/article.html |title=Your Next OS: Windows 2006? |last=Spanbauer |first=Scott |date=December 4, 2003 |publisher=[[International Data Group|IDG]] |work=TechHive |accessdate=June 25, 2015}}</ref> A PDC 2003 developer presentation stated that Windows Vista would also include a user interface for microphone feedback and control, and user configuration and training features.<ref name="UserInputPDC2003">{{cite web |url=http://download.microsoft.com/download/6/6/9/669C56E3-12AF-48C5-AB2A-E7705F1BE37F/CLI351.ppt |title=Keyboard, Speech, and Pen Input in Your Controls |last2=Chambers |first2=Rob |last1=Gjerstad |first=Kevin |date=2003 |publisher=[[Microsoft]] |work=[[Professional Developers Conference]] |format=PPT |archiveurl=https://web.archive.org/web/20121219161523/http://download.microsoft.com/download/6/6/9/669C56E3-12AF-48C5-AB2A-E7705F1BE37F/CLI351.ppt |archivedate=December 19, 2012 |accessdate=March 30, 2018}}</ref> Microsoft later clarified the extent to which speech recognition would be integrated when it stated in a pre-release [[software development kit]] that "the common speech scenarios, like speech-enabling menus and buttons, will be enabled system-wide."<ref name="SpeechRecognitionLonghorn">{{cite web |url=http://longhorn.msdn.microsoft.com/lhsdk/speech/speechconcepts.aspx |title=Interacting with the Computer using Speech Input and Speech Output |author=[[Microsoft]] |date=2003 |work=[[MSDN]] |archiveurl=https://web.archive.org/web/20040104193115/http://longhorn.msdn.microsoft.com/lhsdk/speech/speechconcepts.aspx |archivedate=January 4, 2004 
|accessdate=June 28, 2015}}</ref>
 
During WinHEC 2004, Microsoft included WSR as part of a strategy to improve productivity on mobile PCs.<ref name="MobilePCs">{{cite web |url=http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/SW04023_WINHEC2004.ppt |title=Windows For Mobile PCs And Tablet PCs - CY05 And Beyond |last=Suokko |first=Matti |date=2004 |publisher=[[Microsoft]] |archiveurl=https://web.archive.org/web/20051214170817/http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/SW04023_WINHEC2004.ppt |archivedate=December 14, 2005 |format=PPT |accessdate=July 15, 2015}}</ref><ref name="MobilePCs04">{{cite web |url=http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/SW04022_WINHEC2004.ppt |title=Windows For Mobile PCs and Tablet PCs - CY04 |last=Fish |first=Darrin |date=2004 |publisher=[[Microsoft]] |archiveurl=https://web.archive.org/web/20051214170759/http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/SW04022_WINHEC2004.ppt |archivedate=December 14, 2005 |format=PPT |accessdate=July 15, 2015}}</ref> Microsoft later emphasized [[accessibility]], new mobility scenarios, support for additional languages, and improvements to the speech user experience at WinHEC 2005. 
Unlike the speech support included in Windows XP, which was integrated with the Tablet PC Input Panel and required switching between separate Commanding and Dictation modes, Windows Vista would introduce a dedicated interface for speech input on the desktop and would unify the separate speech modes;<ref name="NaturalInput">{{cite web |url=http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWDT05006_WinHEC05.ppt |title=Natural Input on Mobile PC Systems |last=Dresevic |first=Bodin |date=2005 |publisher=[[Microsoft]] |format=PPT |archiveurl=https://web.archive.org/web/20051214132222/http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWDT05006_WinHEC05.ppt |archivedate=December 14, 2005 |accessdate=March 29, 2018}}</ref> users previously could not speak a command after dictating or vice versa without first switching between these two modes.<ref name="CommandingandDictation">{{cite web |url=http://blogs.msdn.com/b/robch/archive/2005/08/01/446131.aspx |title=Commanding and Dictation - One mode or two in Windows Vista? 
|last=Chambers |first=Rob |date=August 1, 2005 |publisher=[[Microsoft]] |work=[[Microsoft Developer Network|MSDN]] |accessdate=June 30, 2015}}</ref> Microsoft also stated that Windows Vista would improve dictation accuracy and support additional languages;<ref name="NaturalInput"/> a demonstration emphasized email dictation,<ref name="NaturalInput"/> and a presentation about microphone arrays was also shown.<ref name="MicrophoneArray">{{cite web |url=http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWEN05009_WinHEC05.ppt |title=Microphone Array Support in Windows Longhorn |last2=Strande |first2=Hakon |last1=Tashev |first1=Ivan |publisher=[[Microsoft]] |format=PPT |archiveurl=https://web.archive.org/web/20051221102019/http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWEN05009_WinHEC05.ppt |archivedate=December 21, 2005 |accessdate=March 29, 2018}}</ref> Windows Vista Beta 1 included an integrated speech recognition application.<ref name="WindowsVistaBeta1">{{cite web |url=http://winsupersite.com/product-review/windows-vista-beta-1-review-part-3 |title=Windows Vista Beta 1 Review (Part 3) |last=Thurrott |first=Paul |authorlink=Paul Thurrott |date=October 6, 2010 |publisher=[[Penton (company)|Penton]] |work=[[Windows IT Pro]] |accessdate=June 26, 2015}}</ref> To incentivize company employees to analyze WSR for software [[software bug|glitch]]es and to provide feedback during its development, Microsoft offered an opportunity for its testers to win a Premium model of the [[Xbox 360]].<ref name="MicrosoftWSRPoster">{{cite web |url=http://www.brian.levy3.net/proj_msft_poster1.html |title=Microsoft Speech Recognition poster |last=Levy |first=Brian |date=2006 |archiveurl=https://web.archive.org/web/20061011080004/http://brian.levy3.net/proj_msft_poster1.html |archivedate=October 11, 2006 |accessdate=March 17, 2016}}</ref>
 
During a demonstration by Microsoft on July 27, 2006, before Windows Vista's [[release to manufacturing]] (RTM), a notable incident involving WSR occurred that resulted in an unintended output of "Dear aunt, let's set so double the killer delete select all" when several attempts to dictate led to consecutive output errors;<ref name="GoodDemos">{{cite web |url=http://blogs.reuters.com/blog/archives/1991 |title=UPDATED-When good demos go (very, very) bad |last=Auchard |first=Eric |date=July 28, 2006 |publisher=[[Thomson Reuters]] |archiveurl=https://web.archive.org/web/20110521230956/http://blogs.reuters.com/blog/archives/1991 |archivedate=May 21, 2011 |accessdate=March 29, 2018}}</ref><ref name="MSNBC">{{cite web|url=http://www.nbcnews.com/id/14158843 |title=Software glitch foils Microsoft demo |author=[[NBC News]] |date=August 2, 2006 |publisher=[[Associated Press]] |accessdate=June 30, 2015 }}</ref> the incident was a subject of significant derision among analysts and journalists in the audience.<ref name="NeedsWork">{{cite web |url=http://www.infoworld.com/article/06/07/31/HNvoicevista_1.html |title=Vista voice-recognition feature needs work |last=Montalbano |first=Elizabeth |date=July 31, 2006 |publisher=[[International Data Group|IDG]] |work=[[InfoWorld]] |archiveurl=https://web.archive.org/web/20060805091528/http://www.infoworld.com/article/06/07/31/HNvoicevista_1.html |archivedate=August 5, 2006 |accessdate=June 26, 2015}}</ref><ref name="Stammers">{{cite web |url=http://www.techhive.com/article/126613/article.html |title=Vista's Voice Recognition Stammers |last=Montalbano |first=Elizabeth |date=July 31, 2006 |publisher=[[International Data Group|IDG]] |work=TechHive |accessdate=July 1, 2015}}</ref> Microsoft later revealed that these issues were due to an audio [[Gain (electronics)|gain]] glitch that caused the speech recognizer to distort the dictated words;<ref name="FAM">{{cite web 
|url=http://blogs.msdn.com/b/robch/archive/2006/07/29/682479.aspx |title=FAM: Vista SR Demo failure -- And now you know the rest of the story ... |last=Chambers |first=Rob |date=July 29, 2006 |publisher=[[Microsoft]] |work=[[Microsoft Developer Network|MSDN]] |accessdate=June 26, 2015}}</ref> the glitch was fixed before Windows Vista's release.<ref name="FAM"/>
 
=====Security report=====
Reports surfaced in early 2007 that WSR might be vulnerable to an attack that could allow attackers to play audio through a computer's speakers, thereby using speech recognition to perform undesired user operations on a target computer;<ref name="SpeechRecognitionHole">{{cite web |url=http://news.bbc.co.uk/2/hi/technology/6320865.stm |title=Vista has speech recognition hole |date=February 1, 2007 |publisher=[[British Broadcasting Corporation|BBC]] |work=[[BBC News]] |accessdate=March 29, 2018}}</ref><ref name="RemoteExploit">{{cite web |url=https://www.engadget.com/2007/02/01/remote-exploit-of-vista-speech-reveals-fatal-flaw/ |title=Remote 'exploit' of Vista Speech reveals fatal flaw |last=Miller |first=Paul |date=February 1, 2007 |publisher=[[AOL]] |work=[[Engadget]] |accessdate=June 28, 2015}}</ref> it was the first vulnerability discovered after Windows Vista's [[Software release life cycle#General availability|general availability]].<ref name="PCWorld">{{cite web |url=http://www.pcworld.com/article/id,128737-c,vistalonghorn/article.html |title=Honeymoon's Over: First Windows Vista Flaw |last=Roberts |first=Paul |date=February 1, 2007 |publisher=[[International Data Group|IDG]] |work=[[PCWorld]] |archiveurl=https://web.archive.org/web/20070204030144/http://www.pcworld.com/article/id,128737-c,vistalonghorn/article.html |archivedate=February 4, 2007 |accessdate=June 28, 2015}}</ref> While Microsoft stated that such an attack is theoretically possible, it would have to meet a number of prerequisites to be successful: the target system would have to have the speech recognition feature properly configured and activated; speakers and microphone(s) connected to the targeted system would need to be turned on; and the exploit would require the software to interpret commands without a user noticing, an unlikely scenario as the affected system would perform visible interface operations and produce audible feedback. 
Mitigating factors include dictation clarity and microphone feedback and placement. Because of [[User Account Control]], an exploit of this nature also would not be able to perform privileged operations for users or protected administrators without explicit consent.<ref name="SpeechIssue">{{cite web |url=https://blogs.technet.microsoft.com/msrc/2007/01/31/issue-regarding-windows-vista-speech-recognition/ |title=Issue regarding Windows Vista Speech Recognition |date=January 31, 2007 |publisher=[[Microsoft]] |work=[[Microsoft TechNet|TechNet]] |archive-url=https://web.archive.org/web/20160520045703/https://blogs.technet.microsoft.com/msrc/2007/01/31/issue-regarding-windows-vista-speech-recognition/ |url-status=dead |archivedate=May 20, 2016 |accessdate=March 31, 2018}}</ref>
 
====Windows 7====
Line 84:
 
====Speech dictionary====
WSR includes a personal dictionary that allows users to include or exclude certain words or expressions from dictation.<ref name="CustomizedVocabularies"/> When a user adds a word beginning with a capital letter to the dictionary, the user can specify whether it should always be capitalized or whether capitalization depends on the context in which the word is spoken. Users can also record pronunciations for words added to the dictionary to increase recognition accuracy; words written via a [[stylus]] on a [[tablet PC]] for the Windows [[handwriting recognition]] feature are also stored. Information stored within a dictionary is included as part of a user's speech profile.<ref name="Privacy"/>
 
===Macros===