Modality (human–computer interaction): Difference between revisions

Revision as of 09:54, 26 December 2019, by Neils51 (extended confirmed user, 119,322 edits); minor edit; summary: →Using multiple modalities: spelling - complimentary->complementary

Latest revision as of 20:16, 29 March 2025, by Alenoach (extended confirmed user, 5,805 edits); minor edit; summary: ce; tag: Visual edit
(16 intermediate revisions by 11 users not shown)
Line 1:

{{Short description\|Type of data}}

{{distinguish\|Mode (user interface)}}

In the context of [[human–computer interaction]], a '''modality''' is the classification of a single independent channel of ~~sensory~~ [[input/output]] between a computer and a human. Such channels may differ based on sensory nature (e.g., visual vs. auditory),<ref name="HCI Overview2">{{cite journal\|last1 = Karray\|first1 = Fakhreddine\|last2 = Alemzadeh\|first2 = Milad\|last3 = Saleh\|first3 = Jamil Abou\|last4 = Arab\|first4 = Mo Nours\|title = Human-Computer Interaction: Overview on State of the Art\|journal = International Journal on Smart Sensing and Intelligent Systems\|date = March 2008\|volume = 1\|issue = 1\| pages=137–159 \| doi=10.21307/ijssis-2017-283 \|url = http://www.s2is.org/issues/v1/n1/papers/paper9.pdf\|accessdate = April 21, 2015\|archive-url = https://web.archive.org/web/20150430205510/http://s2is.org/Issues/v1/n1/papers/paper9.pdf\|archive-date = April 30, 2015\|url-status = dead}}</ref> or other significant differences in processing (e.g., text vs. image).<ref>{{cite arXiv \| eprint=2301.13823 \| author1=Jing Yu Koh \| last2=Salakhutdinov \| first2=Ruslan \| last3=Fried \| first3=Daniel \| title=Grounding Language Models to Images for Multimodal Inputs and Outputs \| date=2023 \| class=cs.CL }}</ref> A system is designated unimodal if it has only one modality implemented, and [[multimodal interaction\|multimodal]] if it has more than one.<ref name="HCI Overview2" /> When multiple modalities are available for some tasks or aspects of a task, the system is said to have overlapping modalities. If multiple modalities are available for a task, the system is said to have redundant modalities. Multiple modalities can be used in combination to provide complementary methods that may be redundant but convey information more effectively.<ref>{{Cite book\|title = Interactive Systems. Design, Specification, and Verification\|~~last~~last1 = Palanque\|~~first~~first1 = Philippe\|publisher = Springer Science & Business Media\|year = 2001\|isbn = 9783540416630~~\|location =~~ \|pages = [https://archive.org/details/springer_10.1007-3-540-44675-3/page/n50 43]\|last2 = Paterno\|first2 = Fabio\|url = https://~~books~~archive.~~google.com~~org/~~books?id=RddIwyhAvDAC&dq=~~details/springer_10.1007-3-540-44675-3}}</ref> Modalities can be generally defined in two forms: ~~human-~~computer-human and ~~computer-~~human-computer modalities.

==~~Computer–Human~~Computer–human modalities==

Computers utilize a wide range of technologies to communicate and send information to humans:

Line 17 ⟶ 18:

** [[Equilibrioception]] (balance)

Any human sense can be used as a computer-to-human modality. However, the modalities of [[visual perception\|seeing]] and [[hearing (sense)\|hearing]] are the most commonly employed since they are capable of transmitting information at a higher speed than other modalities, 250 to 300<ref name=Ziefle98>{{cite journal\|last1=Ziefle\|first1=M\|title=Effects of display resolution on visual performance.\|journal=Human ~~factors~~Factors\|date=December 1998\|volume=40\|issue=4\|pages=554–68\|pmid=9974229\|doi=10.1518/001872098779649355}}</ref> and 150 to 160<ref>Williams, J. R. (1998). Guidelines for the use of multimedia in instruction, Proceedings of the Human Factors and Ergonomics Society 42nd Annual Meeting, 1447–1451</ref> [[words per minute]], respectively. Though not commonly implemented as a computer-human modality, tactition can achieve an average of 125 wpm<ref>{{cite web\|title=Braille\|url=http://www.acb.org/node/67\|website=ACB\|publisher=American Council of the Blind\|accessdate=21 April 2015}}</ref> through the use of a [[refreshable Braille display]]. Other more common forms of tactition are smartphone and game controller vibrations.
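To make the rate comparison above concrete, the following is a minimal Python sketch that converts the quoted words-per-minute figures into the time needed to convey a message of a given length. The rates are the ones cited in the text; the dictionary keys, the function names `minutes_to_convey` and `compare`, and the 1,500-word message length are illustrative assumptions, not part of any cited source.

```python
# Approximate transmission rates quoted in the text, in words per minute,
# stored as (low, high). The Braille figure is a single average, so both
# bounds are the same. The labels are illustrative, not standard terms.
RATES_WPM = {
    "seeing (reading)": (250, 300),
    "hearing (listening)": (150, 160),
    "tactition (refreshable Braille)": (125, 125),
}


def minutes_to_convey(words, rate_range):
    """Return (shortest, longest) time in minutes to convey `words` words."""
    low, high = rate_range
    # The highest rate gives the shortest time; the lowest gives the longest.
    return words / high, words / low


def compare(words):
    """Map each modality label to its (shortest, longest) time in minutes."""
    return {
        name: minutes_to_convey(words, rate_range)
        for name, rate_range in RATES_WPM.items()
    }


if __name__ == "__main__":
    # 1,500 words is an arbitrary example length.
    for name, (shortest, longest) in compare(1500).items():
        print(f"{name}: {shortest:.1f} to {longest:.1f} minutes")
```

Under these figures, the same message takes roughly twice as long to convey by touch as by sight, which is consistent with the text's explanation of why seeing and hearing are the most commonly employed modalities.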
==Human–computer modalities==

Computers can be equipped with various types of [[input devices]] and sensors to allow them to receive information from humans. Common input devices are often interchangeable if they have a standardized method of communication with the computer and [[Affordance\|afford]] practical adjustments to the user. Certain modalities can provide a richer interaction depending on the context, and having options for implementation allows for more robust systems.<ref>{{Cite book\|title = Berkshire Encyclopedia of Human-computer Interaction\|last = Bainbridge\|first = William\|publisher = Berkshire Publishing Group LLC\|year = 2004\|isbn = 9780974309125~~\|location =~~ \|pages = 483\|url = https://books.google.com/books?id=568u_k1R4lUC~~&dq~~}}</ref>

* Simple modalities

Line 31 ⟶ 32:

[[Accelerometer\|Motion]]

[[Orientation (geometry)\|Orientation]]

With the increasing popularity of [[smartphones]], the general public are becoming more comfortable with the more complex modalities. Motion and orientation are commonly used in smartphone mapping applications. Speech recognition is widely used with virtual assistant applications. Computer vision is now common in camera applications that are used to scan documents and QR codes.

With the increasing popularity of [[smartphones]], the general public are becoming more comfortable with the more complex modalities. Speech recognition was a major selling point of the [[iPhone 4S]] and following [[Apple Inc.\|Apple]] products, with the introduction of [[Siri]].<ref>{{Cite news\|url = http://bgr.com/2011/11/02/siri-said-to-be-driving-force-behind-huge-iphone-4s-sales/\|title = Siri said to be driving force behind huge iPhone 4S sales\|last = Epstein\|first = Zach\|date = Nov 2, 2011\|work = \|access-date = April 21, 2015\|via = }}</ref> This technology gives users an alternative way to communicate with computers when typing is less desirable.
However, in a loud environment, the audition modality is less effective. This exemplifies how certain modalities have varying strengths depending on the situation.<ref>{{Cite book\|title = Multimodality in Mobile Computing and Mobile Devices: Methods for Adaptable Usability\|last = Kurkovsky\|first = Stan\|publisher = IGI Global \|year = 2009\|isbn = 9781605669793\|location = \|pages = 210–211\|url = https://books.google.com/books?id=kqxpqs32muQC&dq}}</ref> Other complex modalities such as computer vision in the form of [[Microsoft]]'s [[Kinect]] or other similar technologies can make sophisticated tasks easier to communicate to a computer, especially in the form of three-dimensional movement.<ref>{{Cite book\|title = Human-Computer Interaction: Interaction Modalities and Techniques\|last = Kurosu\|first = Masaaki\|publisher = Springer\|year = 2013\|isbn = 9783642393303\|location = \|pages = 366\|url = https://books.google.com/books?id=p5W6BQAAQBAJ&dq}}</ref>

==Using multiple modalities==

{{main\|Multimodal interaction}}

Having multiple modalities in a system gives more [[affordance]] to users and can contribute to a more robust system. Having more also allows for greater [[accessibility]] for users who work more effectively with certain modalities. Multiple modalities can be used as backup when certain forms of communication are not possible. This is especially true in the case of redundant modalities in which two or more modalities are used to communicate the same information. Certain combinations of modalities can add to the expression of a computer-human or human-computer interaction because the modalities each may be more effective at expressing one form or aspect of information than others.
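The backup role described above can be sketched in code. The following minimal Python example picks an output modality for a notification and falls back to another when the context rules one out, such as audition in a loud environment. Everything in it is a hypothetical illustration: the `Context` fields, the preference order, and the function names are assumptions made for this sketch and do not come from any cited source or real library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical description of the user's current situation."""
    noisy: bool = False      # loud environment: audition is unreliable
    eyes_busy: bool = False  # e.g. the user cannot look at a screen


# Assumed preference order for output modalities in this sketch.
PREFERENCE = ("visual", "auditory", "tactile")


def usable(modality, ctx):
    """Return True if the modality is expected to work in this context."""
    if modality == "auditory":
        return not ctx.noisy
    if modality == "visual":
        return not ctx.eyes_busy
    # Tactile output (e.g. vibration) is assumed to work in every context here.
    return True


def choose_modalities(available, ctx, redundant=False):
    """Pick output modalities from `available` that suit the context.

    With redundant=False, only the most preferred usable modality is returned.
    With redundant=True, every usable modality is returned, so the same
    information is conveyed over several channels at once.
    """
    candidates = [m for m in PREFERENCE if m in available and usable(m, ctx)]
    return candidates if redundant else candidates[:1]


if __name__ == "__main__":
    device = {"visual", "auditory", "tactile"}
    print(choose_modalities(device, Context(noisy=True, eyes_busy=True)))
    print(choose_modalities(device, Context(), redundant=True))
```

A unimodal system in this sketch has nothing to fall back on: if its only modality is unusable, `choose_modalities` returns an empty list.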
There are six types of cooperation between modalities, and they help define how a combination or fusion of modalities works together to convey information more effectively.<ref name=":0">{{Cite book\|title = Multimodal Human Computer Interaction and Pervasive Services\|last = Grifoni\|first = Patrizia\|publisher = IGI Global\|year = 2009\|isbn = 9781605663876~~\|location =~~ \|pages = 37\|url = https://books.google.com/books?id=O8CqMtIKSWwC~~&source=gbs_navlinks_s~~}}</ref>

* '''Equivalence:''' information is presented in multiple ways and can be interpreted as the same information

Line 48 ⟶ 50:

==See also==

* ~~[[~~{{Annotated link\|Multimodal ~~interaction]]~~learning}}
* ~~[[~~{{Annotated link\|Multisensory integration~~]]~~}}
* ~~[[~~{{Annotated link\|User interface~~]]~~}}

==References==