Wu's Notepad: June 2013

Thursday, June 20, 2013

Aspect Extraction

Mukherjee and Liu (2012). Aspect Extraction through Semi-Supervised Modeling. ACL.

A key task of the framework is to extract the aspects of entities that have been commented on in opinion documents.
Two main types:

The first type only extracts aspect terms without grouping them;

The second type uses statistical topic models to extract aspects and group them.

This paper addresses the setting in which the user provides some seed words for the categories of interest.
The models are related to the DF-LDA model (Andrzejewski et al., 2009), but DF-LDA handles only topics/aspects.
There are many existing works on aspect extraction. Common approaches are:

finding frequent noun terms, possibly with the help of dependency relations;

using supervised sequence labeling.
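
The frequent-noun approach can be illustrated with a minimal sketch. It assumes the reviews are already POS-tagged (Penn Treebank style tags); the function name `frequent_noun_aspects`, the `min_support` threshold, and the toy reviews are hypothetical and not taken from any of the cited papers.

```python
from collections import Counter


def frequent_noun_aspects(tagged_reviews, min_support=2):
    """Return nouns that occur in at least `min_support` reviews."""
    doc_freq = Counter()
    for review in tagged_reviews:
        # Count each noun once per review (document frequency).
        nouns = {word.lower() for word, tag in review if tag.startswith("NN")}
        doc_freq.update(nouns)
    return sorted(w for w, c in doc_freq.items() if c >= min_support)


# Hand-tagged toy reviews, purely illustrative.
reviews = [
    [("The", "DT"), ("screen", "NN"), ("is", "VBZ"), ("great", "JJ")],
    [("Battery", "NN"), ("drains", "VBZ"), ("fast", "RB")],
    [("Nice", "JJ"), ("screen", "NN"), ("and", "CC"), ("battery", "NN")],
]
candidates = frequent_noun_aspects(reviews)
print(candidates)
```

This only produces ungrouped candidate terms; grouping them into aspect categories is a separate step.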

Aspect and sentiment extraction using topic modeling comes in two flavors:

discovering aspect words sentiment-wise (aspects and sentiments are represented together);

separately discovering both aspects and sentiments (e.g., using Maximum Entropy; Mei et al., 2007; Zhao et al., 2010).

To do: think about the pros and cons of the two approaches above, and where there is room for improvement.

One problem with these existing models is that many discovered aspects are not understandable / meaningful to users.
Standard LDA and the existing aspect and sentiment models operate at the document level, so many “non-specific” terms are pulled in and clustered.
Aspect terms tend to be nouns or noun phrases, and sentiment terms tend to be adjectives or adverbs.
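
The part-of-speech tendency noted above suggests a simple heuristic, sketched below. It assumes tokens are already POS-tagged with Penn Treebank style tags; the function name `split_by_pos` and the sample sentence are hypothetical, and this is a rough filter, not the switch mechanism used in any of the cited models.

```python
def split_by_pos(tagged_tokens):
    """Split tokens into aspect candidates (nouns) and sentiment candidates."""
    aspects, sentiments = [], []
    for word, tag in tagged_tokens:
        if tag.startswith("NN"):
            # Nouns are treated as aspect candidates.
            aspects.append(word)
        elif tag.startswith("JJ") or tag.startswith("RB"):
            # Adjectives and adverbs are treated as sentiment candidates.
            sentiments.append(word)
    return aspects, sentiments


tokens = [("The", "DT"), ("lens", "NN"), ("is", "VBZ"),
          ("really", "RB"), ("sharp", "JJ")]
aspect_terms, sentiment_terms = split_by_pos(tokens)
print(aspect_terms, sentiment_terms)
```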

Zhao et al. (2010). Jointly Modeling Aspects and Opinions with a MaxEnt-LDA Hybrid. EMNLP.

Separating aspect words from opinion words can be very useful.

The separated opinion words can be used to construct a domain-dependent sentiment lexicon and applied to tasks such as sentiment classification.

Global topic models may not be suitable for detecting ratable aspects.

Bagheri et al. (2013). An Unsupervised Aspect Detection Model for Sentiment Analysis of Reviews. NLDB.

Aspects are important because without knowing them, the opinions expressed in a sentence or a review are of limited use.

Wednesday, June 19, 2013

Multi-aspect sentence

Sentences in real reviews often involve two or more aspects.
For instance, the first example sentence contains three single-aspect segments: an environment segment (环境不错 / the environment is nice), a food segment (菜品一般 / the food quality is so-so), and a charge segment (很贵 / the food is very expensive).
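
A minimal sketch of this kind of segmentation follows. The full sentence is an assumption assembled from the three segments quoted above, and the keyword table `ASPECT_KEYWORDS` and the function `segment_by_aspect` are hypothetical; real systems would not rely on punctuation and keyword matching alone.

```python
import re

# Hypothetical seed keywords per aspect (an assumption for illustration).
ASPECT_KEYWORDS = {
    "environment": ["环境"],
    "food": ["菜品"],
    "charge": ["贵"],
}


def segment_by_aspect(sentence):
    """Split a sentence at punctuation and label each segment by keyword."""
    parts = re.split(r"[，,。；;！!？?]", sentence)
    labeled = []
    for part in parts:
        segment = part.strip()
        if not segment:
            continue
        # First aspect whose keyword appears in the segment, else None.
        label = next(
            (aspect for aspect, keywords in ASPECT_KEYWORDS.items()
             if any(k in segment for k in keywords)),
            None,
        )
        labeled.append((segment, label))
    return labeled


result = segment_by_aspect("环境不错，菜品一般，很贵。")
print(result)
```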

Monday, June 17, 2013

Terminology

topic: a multinomial distribution over words that represents a coherent concept in text.
aspect: a multinomial distribution over words that represents a more specific topic in reviews, for example, "lens" in camera reviews.
senti-aspect: a multinomial distribution over words that represents a pair of aspect and sentiment, for example, "screen, positive" in a laptop review.
affective word: a word that expresses a feeling, for example, "satisfied", "disappointed".
evaluative word: a word that expresses sentiment by evaluating an aspect, for example, "excellent", "nice".
general evaluative word: an evaluative word that expresses a consistent sentiment every time it is used, for example, "good", "bad".
aspect-specific evaluative word: an evaluative word that may express different sentiments depending on the aspect, for example, a "small" font size on a monitor that is hard to read vs. a "small" vacuum that is portable.
sentiment word: a word that conveys sentiment. It is either an affective word, a general evaluative word, or an aspect-specific evaluative word.

source: Jo and Oh, WSDM'11.
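
To make the "multinomial distribution over words" definitions concrete, here is a small sketch. The words and probabilities are invented for illustration, and the helper names `is_distribution` and `top_words` are hypothetical; a real model would learn these distributions from a review corpus.

```python
import math

# An aspect as a distribution over words (made-up probabilities).
aspect_lens = {"lens": 0.4, "zoom": 0.3, "focus": 0.2, "aperture": 0.1}

# Senti-aspects: one distribution per (aspect, sentiment) pair.
senti_aspects = {
    ("screen", "positive"): {"bright": 0.5, "crisp": 0.3, "vivid": 0.2},
    ("screen", "negative"): {"dim": 0.6, "glare": 0.4},
}


def is_distribution(dist):
    """Check that probabilities are non-negative and sum to 1."""
    return all(p >= 0 for p in dist.values()) and math.isclose(
        sum(dist.values()), 1.0
    )


def top_words(dist, k=3):
    """Return the k most probable words, which is how topics are usually read."""
    ranked = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:k]]


print(top_words(aspect_lens, 2))
print(top_words(senti_aspects[("screen", "negative")], 1))
```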

Gold-standard lexicon

The gold-standard lexicon mentioned in the former case is obtained in one of the following ways:
a) by manually tagging words from a domain corpus;
b) by one or more domain experts choosing aspects and keywords without the use of a corpus; or
c) by using review sets that have already been annotated with aspects and keywords by the original reviewers.
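
One plausible use of such a lexicon is to score automatically extracted aspect keywords against it. The sketch below assumes the lexicon is stored as a mapping from aspect name to a set of keywords; the data and the function `precision_recall` are hypothetical and not taken from the source.

```python
# Hypothetical gold-standard lexicon: aspect -> set of keywords.
gold_lexicon = {
    "food": {"dish", "taste", "menu"},
    "service": {"waiter", "staff"},
}


def precision_recall(extracted, gold):
    """Compute (precision, recall) per aspect against the gold lexicon."""
    scores = {}
    for aspect, gold_terms in gold.items():
        found = extracted.get(aspect, set())
        hits = len(found & gold_terms)
        precision = hits / len(found) if found else 0.0
        recall = hits / len(gold_terms) if gold_terms else 0.0
        scores[aspect] = (precision, recall)
    return scores


# Made-up output of some extraction method.
extracted = {"food": {"dish", "taste", "table"}, "service": {"staff"}}
scores = precision_recall(extracted, gold_lexicon)
print(scores)
```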
