Revision as of 16:08, 3 May 2020 by 66.116.49.78 (no edit summary; tag: references removed), compared with revision as of 00:36, 6 May 2020 by 66.116.49.78 (no edit summary).
Line 8: \| genre = [[Speech recognition]] }} '''Windows Speech Recognition''' ('''WSR''') is a [[speech recognition]] ~~component~~ developed by [[Microsoft]] for [[Windows Vista]] that enables [[hands-free computing\|voice commands]] to control the [[desktop metaphor\|desktop]] [[user interface]]; [[transcription (linguistics)\|dictate]] text in [[electronic document]]s and [[email]]; navigate [[website]]s; perform [[keyboard shortcut]]s; and to operate the [[cursor (computing)\|mouse cursor]]. It ~~also~~ supports ~~the creation of~~ custom [[macro (computer science)\|macro]]s to perform additional or supplementary tasks. WSR is a locally processed speech recognition platform; it does not rely on [[cloud computing]] for accuracy, dictation, or recognition, but adapts based on contexts, grammars, speech samples, training sessions, and vocabularies. It provides a personal dictionary that allows users to include or exclude words or expressions from dictation and to ~~optionally~~ record pronunciations to increase recognition accuracy. 
With [[Windows Search]],<ref name="ThurrottAllchin">{{cite web \|url=http://www.itprotoday.com/jim-allchin-talks-windows-vista \|title=Jim Allchin Talks Windows Vista \|last=Thurrott \|first=Paul \|authorlink=Paul Thurrott \|date=October 6, 2010 \|publisher=[[Penton (company)\|Penton]] \|work=[[Windows IT Pro]] \|accessdate=March 29, 2018}}</ref> it can ~~optionally~~ analyze and collect text in documents and email, as well as [[handwritten]] [[tablet PC]] input, to contextualize and disambiguate terms ~~to further adapt the recognizer~~.<ref name="Privacy">{{cite web \|url=http://download.microsoft.com/download/7/9/4/7945a146-fc32-48c2-8c14-83b1b36696e5/Windows%20Vista%20Privacy%20Statement.rtf \|title=Windows Vista Privacy Statement \|author=[[Microsoft]] \|date=2006 \|format=RTF \|accessdate=July 1, 2015}}</ref> Custom language models that adapt the recognizer to the specific contexts, phonetics, and terminologies of users in particular occupational fields such as legal or medical are also supported.<ref name="CustomizedVocabularies">{{cite web \|url=https://blogs.msdn.microsoft.com/robch/2005/09/20/customized-speech-vocabularies-in-windows-vista/ \|title=Customized speech vocabularies in Windows Vista \|last=Chambers \|first=Rob \|date=September 20, 2005 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=March 29, 2018}}</ref> WSR was developed to be integrated into Windows Vista, as Windows previously supported speech recognition only within individual applications such as [[Windows Media Player]]. [[Microsoft Office XP]] introduced speech recognition ~~limited to~~ for [[Internet Explorer]] and [[Microsoft Office\|Office]]. 
With the release of Windows Vista, [[Microsoft Office 2007\|Office 2007]] and later versions of Office rely on WSR, replacing the separate Office speech recognition.<ref name="Office2007SR">{{cite web \|url=https://support.office.com/en-us/article/What-happened-to-speech-recognition-c6541b32-82df-4c18-bfa5-c411f45337d3 \|title=What happened to speech recognition? \|publisher=[[Microsoft]] \|work=Office Support \|accessdate=November 9, 2016}}</ref> The majority of integrated applications in Windows Vista can be controlled through speech.<ref name="Guide">{{cite web \|url=https://msdn.microsoft.com/en-us/library/bb530325.aspx \|title=Windows Vista Speech Recognition Step-by-Step Guide \|last=Phillips \|first=Todd \|date=2007 \|publisher=[[Microsoft]] \|work=[[MSDN]] \|accessdate=June 30, 2015}}</ref> WSR is present in [[Windows 7]],<ref name="SpeechRecognitionWindows7">{{cite web \|url=http://windows.microsoft.com/en-us/windows/what-can-do-speech-recognition#1TC=windows-7 \|title=What can I do with Speech Recognition? 
\|author=[[Microsoft]] \|work=Windows How-to \|accessdate=June 26, 2015}}</ref> [[Windows 8]],<ref name="Windows8SR">{{cite web \|url=http://windows.microsoft.com//en-US//windows-8//using-speech-recognition \|title=How to use Speech Recognition \|publisher=[[Microsoft]] \|work=Support \|archiveurl=https://web.archive.org/web/20121025193813/http://windows.microsoft.com//en-US//windows-8//using-speech-recognition \|archivedate=October 25, 2012 \|accessdate=December 24, 2018}}</ref> [[Windows 8.1]],<ref name="UpdatedGuidelines">{{cite web \|url=https://support.microsoft.com/en-us/help/14213/windows-how-to-use-speech-recognition \|title=How to use Speech Recognition in Windows \|date=August 31, 2016 \|publisher=[[Microsoft]] \|work=Support \|accessdate=December 24, 2018}}</ref> [[Windows RT]],<ref name="UpdatedGuidelines"/> and [[Windows 10]].<ref name="Windows10">{{cite web \|url=http://windows.microsoft.com/en-us/windows-10/use-voice-recognition-in-windows-10 \|title=Use Voice Recognition in Windows 10 \|author=[[Microsoft]] \|work=Support \|accessdate=August 24, 2015}}</ref>

==History==

Line 34:

====Windows 7====
[[File:DictationScratchpad.png\|thumb\|200px\|The dictation scratchpad in Windows 7 replaces the "enable dictation everywhere" option of Windows Vista.]]
With Windows 7, ~~Microsoft introduced several changes to improve the user experience. The~~ the speech recognizer was updated to use [[Microsoft UI Automation]], substantially enhancing its performance, and the recognition engine now uses the [[Technical features new to Windows Vista#Audio stack architecture\|WASAPI]] audio stack, which enables support for [[echo suppression and cancellation\|echo cancellation]]. 
The document harvester, which ~~optionally analyzes and collects~~ can analyze and collect text in email and documents to contextualize ~~and disambiguate~~ user terms, has improved performance and ~~has been updated to run~~ now runs periodically in the background instead of only after recognizer startup. Sleep mode has also seen performance improvements and, to address security issues, ~~Windows 7 introduces a new "voice activation" option, enabled by default, that turns the recognizer off~~ the recognizer is turned off by default after users speak "stop listening" instead of ~~putting the recognizer to sleep~~ being suspended. Windows 7 also introduces an option to submit speech training data to Microsoft to improve future recognizer versions.<ref name="SRWindows7">{{cite web \|url=http://blogs.msdn.com/b/tsfaware/archive/2009/01/29/what-s-new-in-windows-speech-recognition.aspx \|title=What's new in Windows Speech Recognition? \|last=Brown \|first=Eric \|date=January 29, 2009 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=March 28, 2018}}</ref> Windows 7 introduced an optional dictation scratchpad interface that functions as a temporary document into which users can dictate or type text for insertion into applications that are not compatible with the Text Services Framework.<ref name="SRWindows7"/> WSR previously provided an "enable dictation everywhere" option in Windows Vista.<ref name="DictationWSR">{{cite web \|url=https://blogs.msdn.microsoft.com/speech/2007/10/24/where-does-dictation-work-in-windows-speech-recognition/ \|title=Where does dictation work in Windows Speech Recognition? 
\|last=Brown \|first=Eric \|date=October 24, 2007 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=March 28, 2018}}</ref>

Line 72:

: '''Speech recognition commands:''' "Start listening"; "Stop listening"; "Show speech options"; "Open speech dictionary"; "Move speech recognition"; "Minimize speech recognition."<ref name="CommonCommands"/>

In the English language, applicable commands can be shown by speaking "What can I say?"<ref name="SpeechRecognition"/> Users can also query the recognizer about tasks in Windows by speaking "How can I ''task name''," which opens related help documentation.<ref name="General Commands">{{cite web \|url=https://blogs.msdn.microsoft.com/robch/2007/03/12/windows-speech-recognition-general-commands/ \|title=Windows Speech Recognition: General commands \|last=Chambers \|first=Rob \|date=March 12, 2007 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=May 1, 2017}}</ref>

====''~~Mousegrid~~ MouseGrid''====
[[File:Mousegrid.png\|thumb\|160px\|right\|The mousegrid on the Windows Vista desktop.]]
A ''~~mousegrid~~ MouseGrid'' ~~command~~ enables users to control the mouse cursor by overlaying numbers across nine regions on the screen; these regions gradually narrow as a user speaks the number(s) of the region on which to focus until the desired interface element is reached. ~~The regions with which a user can interact are based on~~ Users can then issue commands including "Click ''number of region''," which moves the mouse cursor to the desired region and then clicks it; and "Mark ''number of region''," which allows an item (such as a [[icon (computing)\|computer icon]]) in a region to be selected, which can then be clicked with the previous ''click'' command. 
~~A user can also simultaneously~~ Users can also interact with multiple regions of the mousegrid at once.<ref name="CommonCommands"/>

====''Show ~~numbers~~ Numbers''====
[[File:Show numbers.png\|thumb\|160px\|left\|The show numbers command overlaying numbers in the [[Games for Windows#Games Explorer\|Games Explorer]].]]
Applications and interface elements that do not present identifiable commands can still be controlled by asking the system to overlay numbers on top of them through a ''show numbers'' command. Once active, speaking the overlaid number selects that item so a user can open it or perform other operations.<ref name="CommonCommands"/> ''Show numbers'' was designed so that users could interact with items that are not readily identifiable.<ref name="US7742923">{{Cite patent\|US\|7742923\| title=Graphic user interface schemes for supporting speech recognition input systems \|status=patent \|assign1=Microsoft Corporation \|invent5=Scholz, Oliver \|invent4=Chambers, Robert \|invent3=Mowatt, David \|invent2=Murillo, Oscar \|invent1=Bickel, Ryan}}</ref>

Line 84:

====Speech dictionary====
A personal dictionary ~~that~~ allows users to include or exclude certain words or expressions from dictation ~~is available~~.<ref name="CustomizedVocabularies"/> When a user adds a word beginning with a capital letter to the dictionary, the user can specify whether it should always be capitalized or whether capitalization depends on the context in which the word is spoken. Users can also record pronunciations for words added to the dictionary to increase recognition accuracy; words written via a [[stylus]] on a [[tablet PC]] for the Windows [[handwriting recognition]] feature are also stored. 
Information stored within a dictionary is included as part of a user's speech profile.<ref name="Privacy"/>

===Macros===
[[File:WSRMacroOptions.png\|thumb\|160px\|left\|An Aero Wizard interface displaying options to create speech recognition macros.]]
WSR supports custom macros through a supplementary application by Microsoft that enables additional [[natural language processing\|natural language]] commands.<ref name="WSRM">{{cite web \|url=http://www.microsoft.com/en-us/download/details.aspx?id=13045 \|title=Windows Speech Recognition Macros \|author=[[Microsoft]] \|work=Download Center \|accessdate=June 29, 2015}}</ref><ref name="Ars">{{cite web \|url=https://arstechnica.com/information-technology/2008/04/wsr-macros-extend-windows-vistas-speech-recognition-feature/ \|title=WSR Macros extend Windows Vista's speech recognition feature \|last=Protalinski \|first=Emil \|date=April 30, 2008 \|publisher=[[Condé Nast]] \|work=[[ArsTechnica]] \|accessdate=June 29, 2015}}</ref> As an example of this functionality, an email macro released by Microsoft enables a natural language command where a user can ~~state~~ speak "send email to ''contact'' about ''subject''," which opens [[Microsoft Outlook]] to compose a new message with the designated contact and subject automatically inserted.<ref name="MicrosoftOutlook">{{cite web \|url=http://blogs.msdn.com/b/robch/archive/2008/06/09/macro-of-the-day-send-email-to-outlookcontact.aspx \|title=Macro of the Day: Send Email to [OutlookContact] \|last=Chambers \|first=Rob \|date=June 9, 2008 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=June 26, 2015}}</ref> Microsoft has also released sample macros for the speech dictionary,<ref name="SpeechDictionaryMacro">{{cite web \|url=http://blogs.msdn.com/b/robch/archive/2008/08/02/speech-macro-of-the-day-speech-dictionary.aspx \|title=Speech Macro of the Day: Speech Dictionary \|last=Chambers \|first=Rob \|date=August 2, 2008 \|publisher=[[Microsoft]] 
\|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=September 3, 2015}}</ref> for Windows Media Player,<ref name="MediaPlayer">{{cite web \|url=http://blogs.msdn.com/b/robch/archive/2008/07/01/macro-of-the-day-windows-media-player.aspx \|title=Macro of the Day: Windows Media Player \|last=Chambers \|first=Rob \|date=July 1, 2008 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=June 26, 2015}}</ref> for [[Microsoft PowerPoint]],<ref name="NextSlide">{{cite web \|url=http://blogs.msdn.com/b/robch/archive/2008/06/03/macro-of-the-day-next-slide.aspx \|title=Macro of the day: Next Slide \|last=Chambers \|first=Rob \|date=June 3, 2008 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=September 3, 2015}}</ref> for [[speech synthesis]],<ref name="ReadThat">{{cite web \|url=http://blogs.msdn.com/b/robch/archive/2008/05/28/macro-of-the-day-read-that.aspx \|title=Macro of the Day: Read that \|last=Chambers \|first=Rob \|date=May 28, 2008 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=June 26, 2015}}</ref> to switch between multiple microphones,<ref name="Microphone">{{cite web \|url=http://blogs.msdn.com/b/robch/archive/2008/11/07/macro-of-the-day-microphone-control.aspx \|title=Macro of the Day: Microphone Control \|last=Chambers \|first=Rob \|date=November 7, 2008 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=June 30, 2015}}</ref> to customize various aspects of audio device configuration such as volume levels,<ref name="SpeakersMacro">{{cite web \|url=http://blogs.msdn.com/b/robch/archive/2008/08/18/macro-of-the-day-mute-the-speakers.aspx \|title=Macro of the Day: Mute the speakers! 
\|last=Chambers \|first=Rob \|date=August 18, 2008 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=September 3, 2015}}</ref> and for general natural language queries such as "What is the weather forecast?"<ref name="WeatherForecast">{{cite web \|url=http://blogs.msdn.com/b/robch/archive/2008/06/02/macro-of-the-day-tell-me-the-weather-forecast-for-redmond.aspx \|title=Macro of the Day: Tell me the weather forecast for Redmond \|last=Chambers \|first=Rob \|date=June 2, 2008 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=June 26, 2015}}</ref> "What time is it?"<ref name="ReadThat"/> and "What's the date?"<ref name="ReadThat"/> Answers to these queries are spoken via a [[Microsoft text-to-speech voices\|speech synthesizer]]. Users and developers can create their own macros ~~that can be~~ based on text transcription and substitution; application execution (with support for [[command-line interface#arguments\|command-line arguments]]); keyboard shortcuts; emulation of existing voice commands; or a combination of these items. [[extensible markup language\|XML]], [[JScript]] and [[VBScript]] are supported.<ref name="Modes"/> Macros can be limited to individual applications if desired<ref name="Application">{{cite web \|url=http://blogs.msdn.com/b/robch/archive/2008/06/30/making-a-speech-macro-application-specific.aspx \|title=Making a Speech macro Application Specific \|last=Chambers \|first=Rob \|date=June 30, 2008 \|publisher=[[Microsoft]] \|work=[[Microsoft Developer Network\|MSDN]] \|accessdate=September 3, 2015}}</ref> and rules for macros can be defined programmatically.<ref name="MicrosoftOutlook"/> For a macro to load, it must be stored in a ''Speech Macros'' folder within the ~~current~~ active user's ''[[My Documents\|Documents]]'' directory. 
All macros are [[digital signature\|digitally signed]] by default if a [[public key certificate\|user certificate]] is available, to ensure that commands are not ~~corrupted~~ altered or loaded by third parties; if ~~one~~ a certificate is not available, an administrator can create ~~a certificate for use~~ one.<ref name="WSRMacros">{{cite web \|url=http://download.microsoft.com/download/F/6/B/F6B71555-D73F-4273-9217-7D872D59BE31/Windows%20Speech%20Recognition%20Macros%20Release%20Notes.docx \|title=Windows Speech Recognition Macros Release Notes \|author=[[Microsoft]] \|date=2009 \|format=DOCX \|accessdate=June 28, 2015}}</ref> ~~The macros utility also includes~~ Configurable security levels ~~to~~ can prohibit unsigned macros from being loaded, prompt users to sign macros, or load unsigned macros.<ref name="Application"/>

==Performance==
{{As of\|2017}}, WSR uses Microsoft Speech Recognizer 8.0, which has not been changed since Windows Vista. For dictation, Mark Hachman, a senior editor of ''[[PC World]]'', found it to be 93.6% accurate without training, a rate lower than that of competing software. According to Microsoft, the rate of accuracy when trained is 99%. 
Hachman ~~commented~~ opined that Microsoft does not publicly discuss ~~WSR, attributing this to~~ the feature because of the 2006 incident during development of Windows Vista, with the result that few users ~~knowing~~ knew that documents could be dictated within Windows before the introduction of [[Cortana]].<ref name="MSR8">{{cite web \|url=http://www.pcworld.com/article/3124761/windows/the-windows-weakness-no-one-mentions-speech-recognition.html \|title=The Windows weakness no one mentions: Speech recognition \|last=Hachman \|first=Mark \|date=May 10, 2017 \|publisher=[[International Data Group\|IDG]] \|work=[[PC World]] \|accessdate=March 28, 2018}}</ref>

==See also==
* [[List of speech recognition software]]
* [[Microsoft Cordless Phone System]]
* [[Microsoft Narrator]]
* [[Microsoft Voice Command]]

Windows Speech Recognition: Difference between revisions