{{good article}}
{{Infobox software
| logo
| logo size = 64px
| name = Windows Speech Recognition
| screenshot_size
| caption
| developer
| released
| operating system = [[Windows Vista]] and later
| genre
}}
'''Windows Speech Recognition''' ('''WSR''') is [[speech recognition]] developed by [[Microsoft]] for [[Windows Vista]] that enables [[hands-free computing|voice commands]] to control the [[desktop metaphor|desktop]] [[user interface]].
WSR is a locally processed speech recognition platform; it does not rely on [[cloud computing]] for accuracy, dictation, or recognition, but adapts based on contexts, grammars, speech samples, training sessions, and vocabularies. It provides a personal dictionary that allows users to include or exclude words or expressions from dictation and to record pronunciations to increase recognition accuracy. Custom language models are also supported.
==History==
Microsoft was involved in speech recognition and [[speech synthesis]] research for many years before WSR. In 1993, Microsoft hired [[Xuedong Huang]] from [[Carnegie Mellon University]] to lead its speech development efforts; the company's research led to the development of the [[Speech Application Programming Interface|Speech API]] (SAPI) introduced in 1994.<ref name="TalkingWindowsVista">{{cite web |url=http://msdn2.microsoft.com/en-us/magazine/cc163663.aspx |title=Exploring New Speech Recognition And Synthesis APIs In Windows Vista |last=Brown |first=Robert |publisher=[[Microsoft]] |work=MSDN Magazine |archive-url=https://web.archive.org/web/20080307054756/http://msdn2.microsoft.com/en-us/magazine/cc163663.aspx |archive-date=March 7, 2008 |access-date=June 26, 2015}}</ref> Speech recognition had also been used in previous Microsoft products. [[Office XP]] and [[Microsoft Office 2003|Office 2003]] provided speech recognition capabilities among [[Internet Explorer]] and [[Microsoft Office]] applications;<ref name="SpeechXP">{{cite web |url=https://support.microsoft.com/en-us/kb/306901 |title=How To Use Speech Recognition in Windows XP |publisher=[[Microsoft]] |work=Windows Support |archive-url=https://web.archive.org/web/20150314222444/https://support.microsoft.com/en-us/kb/306901 |archive-date=March 14, 2015 |access-date=May 15, 2020}}</ref> it also enabled limited speech functionality in [[Windows 98]], [[Windows
===Windows Vista===
[[File:WindowsVistaPreliminaryWSR.PNG|thumb|right|A prototype speech recognition [[Windows Aero#Aero Wizards|Aero Wizard]] in [[Windows Vista]] (then known as "Longhorn") [[Development of Windows Vista#Milestone 7|build 4093]].]]
At [[Windows Hardware Engineering Conference|WinHEC 2002]] Microsoft announced that Windows Vista (codenamed "Longhorn") would include advances in speech recognition and in features such as [[microphone array]] support<ref name="WinHEC2002">{{cite web |url=https://www.pcmag.com/article2/0,2817,1183143,00.asp |title=WinHEC: The Pregame Show |last=Stam |first=Nick |date=April 16, 2002 |publisher=[[Ziff Davis Media]] |work=[[PC Magazine]] |archive-url=https://web.archive.org/web/20150703193044/https://www.pcmag.com/article2/0,2817,1183143,00.asp |archive-date=July 3, 2015 |access-date=May 15, 2020}}</ref> as part of an effort to "provide a consistent quality audio infrastructure for natural (continuous) speech recognition and (discrete) command and control."<ref name="AudioConsiderations">{{cite web |url=http://download.microsoft.com/download/whistler/WHP/1.0/WXP/EN-US/WH02_AV01.exe |title=Audio Considerations for Voice-Enabled Applications |last=Flandern Van |first=Mike |date=2002 |publisher=[[Microsoft]] |work=[[Windows Hardware Engineering Conference]] |format=EXE |archive-url=https://web.archive.org/web/20020506020208/http://download.microsoft.com/download/whistler/WHP/1.0/WXP/EN-US/WH02_AV01.exe |archive-date=May 6, 2002 |access-date=March 30, 2018}}</ref> [[Bill Gates]] stated during [[Professional Developers Conference|PDC 2003]] that Microsoft would "build speech capabilities into the system — a big advance for that in 'Longhorn,' in both recognition and synthesis, real-time";<ref name="SpeechCapabilities">{{cite web |url=http://www.microsoft.com/billgates/speeches/2003/10-27PDC2003.asp |title=Bill Gates' Web Site — Speech Transcript, Microsoft Professional Developers Conference 2003 |publisher=[[Microsoft]] |date=October 27, 2003 |archive-url=https://web.archive.org/web/20040203152133/http://www.microsoft.com/billgates/speeches/2003/10-27PDC2003.asp |archive-date=February 3, 2004 |access-date=May 15, 2020}}</ref><ref name="SpeechPDC2003">{{cite web 
|url=http://windowsitpro.com/windows-server-2008/live-pdc-2003-day-1-monday |title=Live from PDC 2003: Day 1, Monday |last2=Furman |first2=Keith |
During WinHEC 2004 Microsoft included WSR as part of a strategy to improve productivity on mobile PCs.<ref name="MobilePCs">{{cite web |url=http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/SW04023_WINHEC2004.ppt |title=Windows For Mobile PCs And Tablet PCs — CY05 And Beyond |last=Suokko |first=Matti |date=2004 |publisher=[[Microsoft]] |format=PPT |archive-url=https://web.archive.org/web/20051214170817/http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/SW04023_WINHEC2004.ppt |archive-date=December 14, 2005 |access-date=May 15, 2020}}</ref><ref name="MobilePCs04">{{cite web |url=http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/SW04022_WINHEC2004.ppt |title=Windows For Mobile PCs and Tablet PCs — CY04 |last=Fish |first=Darrin |date=2004 |publisher=[[Microsoft]] |format=PPT |archive-url=https://web.archive.org/web/20051214170759/http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/SW04022_WINHEC2004.ppt |archive-date=December 14, 2005 |access-date=May 15, 2020}}</ref> Microsoft later emphasized [[accessibility]], new mobility scenarios, support for additional languages, and improvements to the speech user experience at WinHEC 2005. 
Unlike the speech support included in Windows XP, which was integrated with the Tablet PC Input Panel and required switching between separate Commanding and Dictation modes, Windows Vista would introduce a dedicated interface for speech input on the desktop and would unify the separate speech modes;<ref name="NaturalInput">{{cite web |url=http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWDT05006_WinHEC05.ppt |title=Natural Input on Mobile PC Systems |last=Dresevic |first=Bodin |date=2005 |publisher=[[Microsoft]] |format=PPT |archive-url=https://web.archive.org/web/20051214132222/http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWDT05006_WinHEC05.ppt |archive-date=December 14, 2005 |access-date=May 15, 2020}}</ref> users previously could not speak a command after dictating or vice versa without first switching between these two modes.<ref name="CommandingandDictation">{{cite web |url=http://blogs.msdn.com/b/robch/archive/2005/08/01/446131.aspx |title=Commanding and Dictation — One mode or two in Windows Vista? |last=Chambers |first=Rob |date=August 1, 2005 |publisher=[[Microsoft]] |work=[[Microsoft Developer Network|MSDN]] |access-date=June 30, 2015}}</ref> Windows Vista Beta 1 included integrated speech recognition.<ref name="WindowsVistaBeta1">{{cite web |url=http://winsupersite.com/product-review/windows-vista-beta-1-review-part-3 |title=Windows Vista Beta 1 Review (Part 3) |last=Thurrott |first=Paul
During a demonstration by Microsoft on July 27, 2006—before Windows Vista's [[release to manufacturing]] (RTM)—a notable incident involving WSR occurred that resulted in an unintended output of "Dear aunt, let's set so double the killer delete select all" when several attempts to dictate led to consecutive output errors;<ref name="GoodDemos">{{cite web |url=http://blogs.reuters.com/blog/archives/1991 |title=Updated – When good demos go (very, very) bad |last=Auchard |first=Eric |date=July 28, 2006 |publisher=[[Thomson Reuters]] |archive-url=https://web.archive.org/web/20110521230956/http://blogs.reuters.com/blog/archives/1991 |archive-date=May 21, 2011 |url-status=dead |access-date=March 29, 2018}}</ref><ref name="MSNBC">{{cite web|url=
Reports from early 2007 indicated that WSR is vulnerable to attackers using speech recognition for malicious operations by playing certain audio commands through a target's speakers;<ref name="SpeechRecognitionHole">{{cite web |url=http://news.bbc.co.uk/2/hi/technology/6320865.stm |title=Vista has speech recognition hole |date=February 1, 2007 |publisher=[[British Broadcasting Corporation|BBC]] |work=[[BBC News]] |archive-url=https://web.archive.org/web/20070203051551/http://news.bbc.co.uk/2/hi/technology/6320865.stm |archive-date=February 3, 2007 |access-date=May 15, 2020}}</ref><ref name="RemoteExploit">{{cite web |url=https://www.engadget.com/2007/02/01/remote-exploit-of-vista-speech-reveals-fatal-flaw/ |title=Remote 'exploit' of Vista Speech reveals fatal flaw |last=Miller |first=Paul |date=February 1, 2007 |publisher=[[AOL]] |work=[[Engadget]] |access-date=June 28, 2015}}</ref> it was the first vulnerability discovered after Windows Vista's [[Software release life cycle#General availability|general availability]].<ref name="PCWorld">{{cite web |url=http://www.pcworld.com/article/id,128737-c,vistalonghorn/article.html |title=Honeymoon's Over: First Windows Vista Flaw |last=Roberts |first=Paul |date=February 1, 2007 |publisher=[[International Data Group|IDG]] |work=[[PCWorld]] |archive-url=https://web.archive.org/web/20070204030144/http://www.pcworld.com/article/id,128737-c,vistalonghorn/article.html |archive-date=February 4, 2007 |access-date=June 28, 2015}}</ref> Microsoft stated that although such an attack is theoretically possible, a number of mitigating factors and prerequisites would limit its effectiveness or prevent it altogether: a target would need the recognizer to be active and configured to properly interpret such commands; microphones and speakers would both need to be enabled and at sufficient volume levels; and an attack would require the computer to perform visible operations and produce audible feedback without users noticing. 
[[User Account Control]] would also prohibit the occurrence of privileged operations.<ref name="SpeechIssue">{{cite web |url=https://blogs.technet.microsoft.com/msrc/2007/01/31/issue-regarding-windows-vista-speech-recognition/ |title=Issue regarding Windows Vista Speech Recognition |date=January 31, 2007 |publisher=[[Microsoft]] |work=[[Microsoft TechNet|TechNet]] |archive-url=https://web.archive.org/web/20160520045703/https://blogs.technet.microsoft.com/msrc/2007/01/31/issue-regarding-windows-vista-speech-recognition/ |url-status=dead |archive-date=May 20, 2016 |access-date=March 31, 2018}}</ref>
===Windows 10===
WSR is featured in the [[Settings (Windows)|Settings]] application starting with the Windows 10 April 2018 Update ([[Windows 10 version history|Version 1803]]); the change first appeared in [[Windows Insider|Insider]] Preview Build 17083.<ref name="WSRInsider">{{cite web |url=https://blogs.windows.com/windowsexperience/2018/01/24/announcing-windows-10-insider-preview-build-17083-for-pc/ |title=Announcing Windows 10 Insider Preview Build 17083 for PC |last=Sarkar |first=Dona |date=January 24, 2018 |publisher=[[Microsoft]] |work=Windows Blogs |archive-url=https://web.archive.org/web/20180124224723/https://blogs.windows.com/windowsexperience/2018/01/24/announcing-windows-10-insider-preview-build-17083-for-pc/ |archive-date=January 24, 2018 |access-date=May 15, 2020}}</ref> The April 2018 Update also introduces a new {{keypress|Win}}+{{keypress|Ctrl}}+{{keypress|S}} keyboard shortcut to activate WSR.<ref name="KeyboardShortcutsAccessibility">{{cite web |url=https://support.microsoft.com/en-us/help/13810/windows-keyboard-shortcuts-accessibility |title=Windows keyboard shortcuts for accessibility |publisher=[[Microsoft]] |work=Windows Support |archive-url=https://web.archive.org/web/20181012161947/https://support.microsoft.com/en-us/help/13810/windows-keyboard-shortcuts-accessibility |archive-date=October 12, 2018 |access-date=January 8, 2019}}</ref>
===Windows 11===
In Windows 11 version 22H2, Microsoft introduced a second speech application, Voice Access, alongside WSR.<ref>{{Cite web |title=Set up voice access - Microsoft Support |url=https://support.microsoft.com/en-us/topic/set-up-voice-access-9fc44e29-12bf-4d86-bc4e-e9bb69df9a0e |access-date=2022-12-10 |website=support.microsoft.com}}</ref><ref>{{Cite web |last=Hachman |first=Mark |title=New Windows 11 build tests Voice Access, Spotlight backgrounds |url=https://www.pcworld.com/article/558293/new-windows-11-build-tests-voice-access-spotlight-backgrounds-feature.html |access-date=2022-12-10 |website=PCWorld |language=en}}</ref> In December 2023 Microsoft announced that WSR is deprecated in favor of Voice Access and may be removed in a future build or release of Windows.<ref name="DeprecatedFeatures">{{cite web |url=https://learn.microsoft.com/en-us/windows/whats-new/deprecated-features |title=Deprecated features in the Windows client - What's new in Windows |author=[[Microsoft]] |access-date=December 7, 2023}}</ref>
==Overview and features==
WSR allows a user to control applications and the Windows [[desktop metaphor|desktop]] [[user interface]] through voice commands.<ref name="Guide"/> Users can dictate text within documents, email, and forms; control the operating system user interface; perform [[keyboard shortcut]]s; and move the [[cursor (computing)|mouse cursor]].<ref name="CommonCommands">{{cite web |url=http://windows.microsoft.com/en-us/windows/common-speech-recognition-commands#1TC=windows-vista |title=Windows Speech Recognition commands |publisher=[[Microsoft]] |work=Windows Support |access-date=May 15, 2020}}</ref> The majority of integrated applications in Windows Vista can be controlled;<ref name="Guide">{{cite web |url=https://msdn.microsoft.com/en-us/library/bb530325.aspx |title=Windows Vista Speech Recognition Step-by-Step Guide |last=Phillips |first=Todd |date=2007 |publisher=[[Microsoft]] |work=[[MSDN]] |access-date=June 30, 2015}}</ref> third-party applications must support the Text Services Framework for dictation.<ref name="TalkingWindowsVista"/> [[American English|English (U.S.)]], [[British English|English (U.K.)]], [[French language|French]], [[German language|German]], [[Japanese language|Japanese]], [[Mandarin Chinese]], and [[Spanish language|Spanish]] are supported languages.<ref name="SpeechRecognition">{{cite web |url=https://www.microsoft.com/enable/products/windowsvista/speech.aspx |title=Windows Speech Recognition |publisher=[[Microsoft]] |work=Microsoft Accessibility |archive-url=https://web.archive.org/web/20070204044614/https://www.microsoft.com/enable/products/windowsvista/speech.aspx |archive-date=February 4, 2007 |access-date=May 15, 2020}}</ref>
When started for the first time, WSR presents a microphone setup wizard and an optional interactive step-by-step tutorial that teaches basic commands while adapting the recognizer to the user's voice characteristics;<ref name="Guide"/> the tutorial is estimated to take about 10 minutes to complete.<ref name="MSR8">{{cite web |url=http://www.pcworld.com/article/3124761/windows/the-windows-weakness-no-one-mentions-speech-recognition.html |title=The Windows weakness no one mentions: Speech recognition |last=Hachman |first=Mark |date=May 10, 2017 |publisher=[[International Data Group|IDG]] |work=[[PC World]] |access-date=March 28, 2018}}</ref> The accuracy of the recognizer increases through regular use, which adapts it to contexts, grammars, patterns, and vocabularies.<ref name="SpeechRecognition"/><ref name="Privacy"/> Custom language models for the specific contexts, phonetics, and terminologies of users in particular occupational fields such as legal or medical are also supported.<ref name="CustomizedVocabularies">{{cite web |url=https://blogs.msdn.microsoft.com/robch/2005/09/20/customized-speech-vocabularies-in-windows-vista/ |title=Customized speech vocabularies in Windows Vista |last=Chambers |first=Rob |date=September 20, 2005 |publisher=[[Microsoft]] |work=[[Microsoft Developer Network|MSDN]] |access-date=March 29, 2018}}</ref> With [[Windows Search]],<ref name="ThurrottAllchin">{{cite web |url=http://www.itprotoday.com/jim-allchin-talks-windows-vista |title=Jim Allchin Talks Windows Vista |last=Thurrott |first=Paul
WSR is a locally processed speech recognition platform; it does not rely on cloud computing for accuracy, dictation, or recognition.<ref name="MicrosoftPrivacyStatement">{{cite web |url=https://privacy.microsoft.com/en-us/privacystatement |title=Microsoft Privacy Statement |publisher=[[Microsoft]] |access-date=May 12, 2020}}</ref> Speech profiles that store information about users are retained locally.<ref name="Privacy"/> Backups and transfers of profiles can be performed via [[Windows Easy Transfer]].<ref name="Transfer">{{cite web |url=http://blogs.msdn.com/b/robch/archive/2007/02/15/transferring-windows-speech-recognition-profiles-from-one-machine-to-another.aspx |title=Transferring Windows Speech Recognition profiles from one machine to another |last=Chambers |first=Rob |date=February 15, 2007 |publisher=[[Microsoft]] |work=[[Microsoft Developer Network|MSDN]] |access-date=June 28, 2015}}</ref>
====''Show Numbers''====
Applications and interface elements that do not present identifiable commands can still be controlled by asking the system to overlay numbers on top of them through a ''Show Numbers'' command. Once active, speaking the overlaid number selects that item so a user can open it or perform other operations.<ref name="CommonCommands"/> ''Show Numbers'' was designed so that users could interact with items that are not readily identifiable.<ref
[[File:Show numbers.png|thumb|160px|left|The Show Numbers command overlaying numbers in the [[Games for Windows#Games Explorer|Games Explorer]].]]
===Dictation===
==Performance==
{{As of|2017}} WSR uses Microsoft Speech Recognizer 8.0, the version introduced in Windows Vista. Mark Hachman, a senior editor of ''[[PC World]]'', found it to be 93.6% accurate for dictation without training, a lower rate than that of competing software. According to Microsoft, the accuracy rate when trained is 99%. Hachman opined that Microsoft does not publicly discuss the feature because of the 2006 incident during the development of Windows Vista, with the result that few users knew that documents could be dictated within Windows before the introduction of [[Cortana (virtual assistant)|Cortana]].<ref name="MSR8"/>
==See also==
* [[Braina]]
* [[List of speech recognition software]]
* [[Microsoft Cordless Phone System]]