Doctoral Dissertation

An Analytical Approach for Affect Sensing from Text

By Mostafa Al Masum Shaikh (48-57405)

A THESIS SUBMITTED TO THE GRADUATE SCHOOL OF INFORMATION SCIENCE AND TECHNOLOGY IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy

At the University of Tokyo

Thesis Supervisor: Professor Mitsuru Ishizuka (石塚 満)

Abstract

Studying the relationship between natural language and affective information, as well as assessing the underlying affective meaning of natural language, is becoming crucial for improving human-computer interaction. Such interactive applications are numerous and varied, ranging from categorizing newsgroup flames and augmenting search engine responses to analyzing public opinion trends towards a particular fact or entity and customer feedback. Text is an important medium not only for describing facts and events, but also for effectively communicating the writer's positive or negative sentiment underlying an opinion, or for expressing an affective or emotional state, such as happy, fearful, or surprised.

We consider sentiment assessment and emotion sensing from text as two different problems. Classifying the tone of a communication as generally positive or negative is the task of sentiment assessment, whereas recognizing the particular emotion(s) being expressed is the task of emotion sensing. Therefore, the thesis first presents an analytical approach to sentiment assessment, i.e., the recognition of the negative or positive valence of a sentence, and then explains how a well-founded emotion model has been implemented for the recognition of emotions.

For the purpose of sentiment assessment from text, we perform semantic dependency analysis on the semantic verb frame(s) of each sentence, and
then apply a set of rules to each dependency relation to calculate the contextual valence of the words used in the sentence. By employing a domain-independent, rule-based approach, our system is able to identify sentence-level sentiment automatically. A linguistic tool called SenseNet has been developed to recognize sentiments in text and to visualize the detected sentiments. We conducted several experiments with a variety of datasets containing data from different domains. The obtained results indicate significant performance gains over existing state-of-the-art approaches.

Emotions expressed
in natural language are very often conveyed in subtle and complex ways, presenting challenges that may not be easily addressed by simple text categorization approaches such as n-gram or keyword identification. Numerous approaches have already been employed to “sense” affective information from text, but none of them has employed the OCC emotion model, an influential theory of the cognitive and appraisal structure of emotion. The OCC model derives twenty-two emotion types and two cognitive states as consequences of several cognitive variables. This thesis therefore describes how to relate the cognitive variables of the emotion model to linguistic components in text, in order to achieve emotion recognition for a much larger set of emotions than is handled by comparable approaches. In particular, we provide tailored rules for textual emotion recognition, which are inspired by the rules of the OCC emotion model. We thereby clarify how text components can be mapped to specific values of the cognitive variables of the emotion model. The resulting linguistics-based rule set for the OCC emotion types and cognitive states allows us to determine a broad class of emotions conveyed
by text.

Acknowledgements

I would like to express my sincere gratitude to Professor Mitsuru Ishizuka and Dr. Helmut Prendinger, who helped me the most through their guidance and prudent advice while I completed this thesis. Prof. Ishizuka, as my supervisor, took care of me and guided me during all the years of my doctoral studies. Dr. Prendinger provided me with the necessary information and advice, contributing to my success.

The laboratory of Prof. Ishizuka is a very intellectually stimulating environment, with many people coming from different countries and cultures and having different experiences and opinions. During my stay in the laboratory, I had occasions for collaboration on both the scientific and the cultural level. I wish to thank Hiroshi Dohi, Meiko Fujita, Adam Jatowt, Naoaki Okazaki, Arturo Nakasone, Jie Yang, Eiko Kin, Tadanobu Furukawa, Danushka Bollegala, Werner Breitfuss, Tomofumi Takayama, Yulan Yan, Alena Neviarouskaya, Manuel M. Martinez, A. S. M. Mahbub Morshed, Francisco Tacoa, Bahlul Haider, Fahim Ferdous Khan, Fahim Kawser, Ashequl Qadir, Md. Tawhidul Islam, Russell Rehman, Shahidul Islam, Zeenatul Bashar, Sarkar Abul Bashar, Nayeem Mahmud, Shakil Ahmed and many others for their help and assistance during my research.

Finally, I would like to thank the Japanese Ministry of Education, Culture, Sports, Science and Technology for enabling me to come and study in Japan by providing financial assistance in the form of the “Monbukagakusho Scholarship”.
To the persons whom I love most: my parents and my wife

Table of Contents

Abstract
Acknowledgements
List of Tables
List of Figures and Illustrations
CHAPTER ONE: INTRODUCTION
    1.1 Structure of Thesis
    1.2 Sentiment and Emotion in Text
    1.3 Research Theme
    1.4 Domain Knowledge
    1.5 Core Feature of this Research
CHAPTER TWO: SENTIMENT ANALYSIS OF TEXT
    2.1 Reviews of Existing Approaches
    2.2 Topics Ignored and Our Focus
    2.3 Summary of Our Approach
CHAPTER THREE: LINGUISTIC RESOURCES AND SENSENET
    3.1 SenseNet Architecture
    3.2 Semantic Parser
    3.3 Developing Affective Lexica
        3.3.1 The Knowledgebase
        3.3.2 Scoring a list of Verbs, Adjectives and Adverbs
        3.3.3 Scoring of Nouns
        3.3.4 Scored-list of Named Entity
    3.4 Contextual Valence Assessment
    3.5 Sentiment Assessment
    3.6 SenseNet GUI
CHAPTER FOUR: SENSENET EVALUATION
    4.1 The Datasets
    4.2 Sentence Level Comparisons
    4.3 Paragraph Level Comparisons
    4.4 Evaluating Individual Components of SenseNet
    4.5 Comparison to a State-of-the-Art System
    4.6 Discussion
    4.7 Conclusion
CHAPTER FIVE: EMOTION ANALYSIS OF TEXT
    5.1 Necessity of a New Approach
    5.2 The OCC Emotion Model
        5.2.1 Why OCC Model?
        5.2.2 Characterization of the OCC Emotions
    5.3 Implementation of the OCC Model in Linguistic Realm
        5.3.1 Linguistic Resources
        5.3.2 Assigning Values to the Variables
        5.3.3 The Rules of the OCC Emotion Types
    5.4 Walk-Through Examples for Emotion Recognition
    5.5 Evaluation and Discussion
    5.6 Summary
CHAPTER SIX: DEVELOPED APPLICATIONS
    6.1 SenseNet
    6.2 ASNA: Affect Sensitive News Agent
    6.3 ESNA: Emotion Sensitive News Agent
    6.4 Online System for Textual Affect Sensing
CHAPTER SEVEN: SUMMARY AND CONCLUSION
REFERENCES
APPENDIX A: PSEUDO CODE
APPENDIX B: EXPERIMENTAL RESULT FOR DATASET A
PUBLICATION LIST OF THE AUTHOR

List of Tables

Table 3.1 Triplet output of Semantic Parsing for the sentence given above
Table 3.2 Sample list of verbs with associated Prior Valence
Table 3.3 Sample list of scored named entities
Table 3.4 Symbols used in SenseNet browser
Table 4.1 Input datasets
Table 4.2 Accuracy results obtained for Dataset B using different approaches
Table 4.3 Summary of several systems that experimented with Dataset C
Table 4.4 Experimental result using the Dataset C
Table 4.5 Experimenting with different models of the system using all the datasets
Table 4.6 Accuracy Comparison Metrics between EmpathyBuddy and SenseNet
Table 5.1 The variables (i.e., cognitive variables) of the OCC Emotion Model
Table 5.2 The definitions of the rules for the OCC emotion types
Table 5.3 Semantic verb-frames output by the semantic parser for “My mother presented me a nice wrist watch on my birthday and made delicious pancakes.”
Table 5.4 Recognizing the OCC emotions from the sentence “I didn't see John for the last few hours; I thought he might miss the flight but I suddenly found him on the plane.”
Table 5.5 Preliminary experimental result of the two systems
Table 6.1 Excerpts from database of ranked words
Table 7.1 The summary of experimental results using different datasets

List of Figures and Illustrations

Figure 3.1 Architecture of SenseNet
Figure 3.2 ConceptNet output for the concept rocket
Figure 3.3 Interface and Sample Output of SenseNet Browser
Figure 4.1 Relationship between the Neutral Range of the system to signal neutrality of a sentence and other system performance measures, namely Accuracy, Average Precision, Recall, and F-Score for three classes
Figure 5.1 The OCC Emotion Model
Figure 6.1 SenseNet GUI
Figure 6.2 Architecture of ASNA
Figure 6.3 ASNA News Browser
Figure 6.4 Architecture of Emotion Sensitive News Agent (ESNA)
Figure 6.5 User Interface to Customize Sentiment towards Named-Entities
Figure 6.6 The summary of user study
Figure 6.7 User-interface of the online system
Figure 6.8 The output interface of the online system
Figure 6.9 Survey form

Chapter One: Introduction

There is now plenty of evidence in neuroscience and psychology for the importance of emotional intelligence to overall human performance in tasks such as rational decision-making, communicating, negotiating, and adapting to unpredictable environments. As a result, people can no longer be modeled as pure goal-driven, task-solving agents: they also have emotive reasons for their choices and behavior, which (more often than not) drive rational decision-making (Mandler, 1975). Viewed holistically, this research aims at giving computer programs the skills of emotional intelligence, including the ability to recognize, model, and understand human emotion, to communicate emotion appropriately, and to respond to it effectively. The new discipline termed “Affective Computing” (Picard, 1997) investigates the basics of human emotion and emphasizes both the physiological and cognitive aspects of emotion. The Affective Computing community
developed several mechanisms for emotion sensing, including the processing of various physiological signals obtained from wearable sensors. Early work in Affective Computing emphasized the physiological and behavioral aspects of emotion, for instance, by analyzing biometric sensor data, prosody, posture, and so on. More recently, the sensing of emotion from text has gained popularity, since textual information provides a rich source of expressions of human affective states. The words we use reflect who we are, and hence the word choice in one's writing serves as a key to one's personality, social situation, and the affective or attitudinal information conveyed through text. Furthermore, people most naturally interact with their computers in a social and affectively meaningful way, just as they do with other people (Reeves and Nass, 1998). These observations have created an expectation that the
future of human-computer interaction (HCI) lies in themes such as emotions, entertainment, attention, motivation, and e-learning. Studying the relationship between natural language and affective information, as well as assessing the underlying affective qualities of natural language, is therefore becoming crucial for improving interaction with users. Specifically, this research is devoted to exploring different techniques to recognize positive and negative opinion, or favourable and unfavourable sentiments towards specific subjects occurring in natural language texts.

1.1 Structure of Thesis

This thesis is composed of seven chapters and two appendices, which provide the background to this research, describe the core methodologies, demonstrate the results of this work, describe the developed applications, and list the pseudo code of the approach discussed. The contents of each chapter are outlined below.

Chapter one: This part is a general introduction to the topic. Since the research topic is multidisciplinary, the contributions and background knowledge drawn from different knowledge domains are discussed first. Then the core features of this research are pointed out.

Chapter two: This chapter discusses the current state-of-the-art approaches to sentiment analysis of text and points out their limitations. Our approach is then explained in terms of the topics that previous work has ignored in the task of sentiment analysis of text.

Chapter three: This
chapter explains the core approach of this research. It explains, with examples, how different lexical resources have been developed and how, by employing several rules, an input text can be treated as an analytical model. Our developed application, SenseNet, assesses an input text numerically in order to determine whether it carries a negative or positive sense. The implementation details of SenseNet are discussed in this chapter.

Chapter four: This chapter contains experimental results on different standard datasets for the task of sentiment analysis. Different types of system evaluation are carried out, and the chapter concludes with a discussion of the obtained results and a failure analysis.

Chapter five: Though all emotions can be seen as positive or negative, this chapter extends the idea to recognizing more fine-grained, named emotions (e.g., happy, sad, anger). To this end, it discusses how a well-founded emotion model (i.e., the OCC emotion model, taken from cognitive psychology) can be implemented in the linguistic realm. This is a completely new contribution of this research.

Chapter six: Several applications have been developed on the basis of the developed theories and methodologies. This chapter discusses them in terms of their architectures, functional steps, and graphical user interfaces.

Chapter seven: This chapter contains the summary and conclusions of the studies in sentiment and affect sensing from text.

Appendix A: It contains the pseudo code of the algorithm for sentiment sensing from text.

Appendix B: It contains the detailed experimental results for one of the datasets.

1.2 Sentiment and Emotion in Text

We consider sentiment assessment and affect sensing from text as two different problems. We use phrases such as sentiment analysis, opinion mining, mood analysis, and trend analysis to refer to the problem of sentiment assessment, which aims to determine whether a speaker or writer has communicated a positive, negative, or neutral attitude with respect to some topic. By phrases such as emotion recognition, affect sensing, and emotion analysis, in contrast, we refer to the problem of affect sensing, which aims to detect the underlying emotions being communicated, grounded in the theory of emotions from cognitive psychology. Hence, the research o