Multimodal sentiment analysis: Difference between revisions

Content deleted Content added
Bender the Bot (talk | contribs)
m Audio features: HTTP → HTTPS for IEEE Xplore, replaced: http://ieeexplore.ieee.org/document/ → https://ieeexplore.ieee.org/document/
Citation bot (talk | contribs)
m Add: url. Removed URL that duplicated unique identifier. Removed parameters. | You can use this bot yourself. Report bugs here.| Activated by User:Nemo bis | via #UCB_webform
Line 1:
'''Multimodal sentiment analysis''' is a new dimension{{peacock term|date=June 2018}} of the traditional text-based [[sentiment analysis]], which goes beyond the analysis of texts, and includes other [[Modality (human–computer interaction)|modalities]] such as audio and visual data.<ref>{{cite journal |last1=Soleymani |first1=Mohammad |last2=Garcia |first2=David |last3=Jou |first3=Brendan |last4=Schuller |first4=Björn |last5=Chang |first5=Shih-Fu |last6=Pantic |first6=Maja |title=A survey of multimodal sentiment analysis |journal=Image and Vision Computing |date=September 2017 |volume=65 |pages=3–14 |doi=10.1016/j.imavis.2017.08.003|url=https://zenodo.org/record/3449163 }}</ref> It can be bimodal, which includes different combinations of two modalities, or trimodal, which incorporates three modalities.<ref>{{cite journal |last1=Karray |first1=Fakhreddine |last2=Milad |first2=Alemzadeh |last3=Saleh |first3=Jamil Abou |last4=Mo Nours |first4=Arab |title=Human-Computer Interaction: Overview on State of the Art |journal=International Journal on Smart Sensing and Intelligent Systems |date=2008 |url=http://s2is.org/Issues/v1/n1/papers/paper9.pdf}}</ref> With the extensive amount of [[social media]] data available online in different forms such as videos and images, the conventional text-based [[sentiment analysis]] has evolved into more complex models of multimodal sentiment analysis,<ref name="s1">{{cite journal |last1=Poria |first1=Soujanya |last2=Cambria |first2=Erik |last3=Bajpai |first3=Rajiv |last4=Hussain |first4=Amir |title=A review of affective computing: From unimodal analysis to multimodal fusion |journal=Information Fusion |date=September 2017 |volume=37 |pages=98–125 |doi=10.1016/j.inffus.2017.02.003|hdl=1893/25490 }}</ref> which can be applied in the development of [[virtual assistant]]s,<ref name ="s5">{{cite web |title=Google AI to make phone calls for you |url=https://www.bbc.com/news/technology-44045424 |website=BBC News |accessdate=12 June 2018 |date=8 May 2018}}</ref> [[Social media analytics|analysis]] of YouTube movie reviews,<ref name="s4">{{cite journal |last1=Wollmer |first1=Martin |last2=Weninger |first2=Felix |last3=Knaup |first3=Tobias |last4=Schuller |first4=Bjorn |last5=Sun |first5=Congkai |last6=Sagae |first6=Kenji |last7=Morency |first7=Louis-Philippe |title=YouTube Movie Reviews: Sentiment Analysis in an Audio-Visual Context |journal=IEEE Intelligent Systems |date=May 2013 |volume=28 |issue=3 |pages=46–53 |doi=10.1109/MIS.2013.34}}</ref> [[Social media analytics|analysis]] of news videos,<ref>{{cite arxiv|last1=Pereira |first1=Moisés H. R. |last2=Pádua |first2=Flávio L. C. |last3=Pereira |first3=Adriano C. M. |last4=Benevenuto |first4=Fabrício |last5=Dalip |first5=Daniel H. |title=Fusing Audio, Textual and Visual Features for Sentiment Analysis of News Videos|date=9 April 2016 |eprint=1604.02612|class=cs.CL }}</ref> and [[emotion recognition]] (sometimes known as [[emotion]] detection) such as [[depression (mood)|depression]] monitoring,<ref name = "s6">{{cite book |last1=Zucco |first1=Chiara |last2=Calabrese |first2=Barbara |last3=Cannataro |first3=Mario |title=Sentiment analysis and affective computing for depression monitoring |journal=2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) |date=November 2017 |pages=1988–1995 |doi=10.1109/bibm.2017.8217966 |publisher=IEEE |language=English|isbn=978-1-5090-3050-7 }}</ref> among others.
 
Similar to the traditional [[sentiment analysis]], one of the most basic task in multimodal sentiment analysis is [[Feeling|sentiment]] classification, which classifies different sentiments into categories such as positive, negative, or neutral.<ref>{{cite book |last1=Pang |first1=Bo |last2=Lee |first2=Lillian |title=Opinion mining and sentiment analysis |date=2008 |publisher=Now Publishers |___location=Hanover, MA |isbn=978-1601981509}}</ref> The complexity of [[Social media analytics|analyzing]] text, audio, and visual features to perform such a task requires the application of different fusion techniques, such as feature-level, decision-level, and hybrid fusion.<ref name="s1" /> The performance of these fusion techniques and the [[classification]] [[algorithm]]s applied, are influenced by the type of textual, audio, and visual features employed in the analysis.<ref name = "s7" />
Line 10:
 
=== Audio features ===
[[Feeling|Sentiment]] and [[emotion]] characteristics are prominent in different [[phonetic]] and [[prosodic]] properties contained in audio features.<ref>{{cite journal |last1=Chung-Hsien Wu |last2=Wei-Bin Liang |title=Emotion Recognition of Affective Speech Based on Multiple Classifiers Using Acoustic-Prosodic Information and Semantic Labels |journal=IEEE Transactions on Affective Computing |date=January 2011 |volume=2 |issue=1 |pages=10–21 |doi=10.1109/T-AFFC.2010.16}}</ref> Some of the most important audio features employed in multimodal sentiment analysis are [[mel-frequency cepstrum| mel-frequency cepstrum (MFCC)]], [[spectral centroid]], [[spectral flux]], beat histogram, beat sum, strongest beat, pause duration, and [[pitch accent|pitch]].<ref name="s1" /> [[OpenSMILE]]<ref>{{cite book |last1=Eyben |first1=Florian |last2=Wöllmer |first2=Martin |last3=Schuller |first3=Björn |title=OpenEAR — Introducing the munich open-source emotion and affect recognition toolkit - IEEE Conference Publication |journal= |pages=1 |date=2009 |doi=10.1109/ACII.2009.5349350 |chapter-url=https://ieeexplore.ieee.org/document/5349350|isbn=978-1-4244-4800-5 |chapter=OpenEAR — Introducing the munich open-source emotion and affect recognition toolkit }}</ref> and [[Praat]] are popular open-source toolkits for extracting such audio features.<ref>{{cite book|last1=Morency |first1=Louis-Philippe |last2=Mihalcea |first2=Rada |last3=Doshi |first3=Payal |title=Towards multimodal sentiment analysis: harvesting opinions from the web |date=14 November 2011 |pages=169–176 |doi=10.1145/2070481.2070509 |chapter-url=https://dl.acm.org/citation.cfm?id=2070509 |publisher=ACM|chapter=Towards multimodal sentiment analysis |isbn=9781450306416 }}</ref>
 
=== Visual features ===
Line 28:
 
== Applications ==
Similar to text-based sentiment analysis, multimodal sentiment analysis can be applied in the development of different forms of [[recommender system]]s such as in the analysis of user-generated videos of movie reviews<ref name="s4" /> and general product reviews,<ref>{{cite journal |last1=Pérez-Rosas |first1=Verónica |last2=Mihalcea |first2=Rada |last3=Morency |first3=Louis Philippe |title=Utterance-level multimodal sentiment analysis |journal=Long Papers |date=1 January 2013 |url=https://experts.umich.edu/en/publications/utterance-level-multimodal-sentiment-analysis |publisher=Association for Computational Linguistics (ACL)}}</ref> to predict the sentiments of customers, and subsequently create product or service recommendations.<ref>{{cite web |last1=Chui |first1=Michael |last2=Manyika |first2=James |last3=Miremadi |first3=Mehdi |last4=Henke |first4=Nicolaus |last5=Chung |first5=Rita |last6=Nel |first6=Pieter |last7=Malhotra |first7=Sankalp |title=Notes from the AI frontier. Insights from hundreds of use cases |url=https://www.mckinsey.com/mgi/ |website=McKinsey & Company |publisher=McKinsey & Company |accessdate=13 June 2018 |language=en}}</ref> Multimodal sentiment analysis also plays an important role in the advancement of [[virtual assistant]]s through the application of [[natural language processing]] (NLP) and [[machine learning]] techniques.<ref name ="s5" /> In the healthcare ___domain, multimodal sentiment analysis]] can be utilized to detect certain medical conditions such as [[Psychological stress|stress]], [[anxiety]], or [[Depression (mood)|depression]].<ref name = "s6" /> Multimodal sentiment analysis can also be applied in understanding the sentiments contained in video news programs, which is considered as a complicated and challenging ___domain, as sentiments expressed by reporters tend to be less obvious or neutral.<ref>{{cite book|last1=Ellis |first1=Joseph G. |last2=Jou |first2=Brendan |last3=Chang |first3=Shih-Fu |title=Why We Watch the News: A Dataset for Exploring Sentiment in Broadcast Video News |date=12 November 2014 |pages=104–111 |doi=10.1145/2663204.2663237 |chapter-url=https://dl.acm.org/citation.cfm?doid=2663204.2663237 |publisher=ACM|chapter=Why We Watch the News |isbn=9781450328852 }}</ref>
 
==References==