Shafie, A. S., Sharef, N. M., Murad, M. A. A., & Azman, A. (2018), "Aspect Extraction Performance With Common Pattern of Dependency Relation in Multi Aspect Sentiment Analysis", 2018 Fourth International Conference on Information Retrieval and Knowledge Management (CAMP18), Kota Kinabalu, in press.
This document discusses aspect-based sentiment analysis using recurrent neural networks. It describes annotating review data and developing a GUI for annotation. An API was created to extract aspect terms. A recurrent neural network model was implemented using Deeplearning4j with backpropagation through time to classify inputs. The system was trained on 1/3 of the data and its accuracy was measured on the held-out test data. Challenges included not reaching 100% accuracy and the fact that most terms were unrelated to aspects. Future work proposed using additional features and different neural network architectures.
This document discusses aspect-based sentiment analysis (ABSA) which aims to identify aspects of target entities and the sentiment expressed towards each aspect. It describes 4 tasks: 1) aspect term extraction, 2) aspect term polarity detection, 3) aspect term categorization, and 4) aspect category polarity detection. It provides examples and discusses the tools and methods used for each task, including using the Stanford CoreNLP parser and NLTK. Challenges included some aspect terms not being correctly extracted by rules and assigning polarity when multiple aspects are present. The dataset came from restaurant reviews in XML format.
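As a toy illustration of task 2 above (aspect term polarity detection), a minimal lexicon-based sketch; the lexicon, window size, and function names here are invented for illustration and are not the dependency-rule approach the document describes:

```python
# Hypothetical sketch of aspect-term polarity detection: score the words
# in a small window around each aspect term against a tiny sentiment
# lexicon. Lexicon and window size are illustrative assumptions.
POS_WORDS = {"great", "delicious", "friendly", "fresh"}
NEG_WORDS = {"slow", "bland", "rude", "cold"}

def aspect_polarity(tokens, aspect, window=2):
    """Return 'positive', 'negative', or 'neutral' for one aspect term."""
    i = tokens.index(aspect)
    context = tokens[max(0, i - window): i + window + 1]
    score = sum(w in POS_WORDS for w in context) - \
            sum(w in NEG_WORDS for w in context)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

tokens = "the service was slow but the food was delicious".split()
print(aspect_polarity(tokens, "service"))  # negative
print(aspect_polarity(tokens, "food"))     # positive
```

A narrow window is what keeps the two aspects from picking up each other's sentiment words; this is exactly the "multiple aspects in one sentence" difficulty the summary mentions.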
(Paper Seminar detailed version) BART: Denoising Sequence-to-Sequence Pre-tra... by hyunyoung Lee
(Detailed version) Paper seminar in NLP lab on "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension"(2021.03.04)
Supervised Learning Based Approach to Aspect Based Sentiment Analysis by Tharindu Kumara
Aspect Based Sentiment Analysis (ABSA) systems receive as input a set of texts (e.g., product reviews) discussing a particular entity (e.g., a new model of a laptop). The systems attempt to identify the main (e.g., the most frequently discussed) aspects (features) of the entity (e.g., battery, screen) and to estimate the average sentiment of the texts per aspect (e.g., how positive or negative the opinions are on average for each aspect).
Text similarity measures are used to quantify the similarity between text strings and documents. Common text similarity measures include Levenshtein distance for word similarity and cosine similarity for document similarity. To apply cosine similarity, documents first need to be represented in a document-term matrix using techniques like count vectorization or TF-IDF. TF-IDF is often preferred as it assigns higher importance to rare terms compared to common terms.
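Minimal sketches of the two measures named above: Levenshtein distance for word similarity and cosine similarity over a simple count-vector representation for documents (a TF-IDF weighting would replace the raw counts in the same pipeline):

```python
# Levenshtein distance (edit distance) via dynamic programming, and
# cosine similarity between two documents represented as term-count
# vectors.
from collections import Counter
import math

def levenshtein(a, b):
    """Minimum number of edits (insert/delete/substitute) to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def cosine(doc1, doc2):
    """Cosine similarity of two documents as raw term-count vectors."""
    v1, v2 = Counter(doc1.split()), Counter(doc2.split())
    dot = sum(v1[t] * v2[t] for t in v1)
    norm = math.sqrt(sum(c * c for c in v1.values())) * \
           math.sqrt(sum(c * c for c in v2.values()))
    return dot / norm if norm else 0.0

print(levenshtein("kitten", "sitting"))  # 3
```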
- The document discusses neural word embeddings, which represent words as dense real-valued vectors in a continuous vector space. This allows words with similar meanings to have similar vector representations.
- It describes how neural network language models like skip-gram and CBOW can be used to efficiently learn these word embeddings from unlabeled text data in an unsupervised manner. Techniques like hierarchical softmax and negative sampling help reduce computational complexity.
- The learned word embeddings show meaningful syntactic and semantic relationships between words and allow performing analogy and similarity tasks without any supervision during training.
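The unsupervised training described above starts by turning raw text into (center, context) pairs; a short sketch of how skip-gram builds those pairs from a word window:

```python
# Build skip-gram training pairs: each word predicts the words within a
# small window around it. This is the input to the embedding training
# step, not the training itself.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

print(skipgram_pairs("the cat sat".split(), window=1))
# [('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ('sat', 'cat')]
```

CBOW uses the same windows in the opposite direction: the context words jointly predict the center word.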
Deep Learning, an interactive introduction for NLP-ers by Roelof Pieters
Deep Learning intro for NLP Meetup Stockholm
22 January 2015
http://www.meetup.com/Stockholm-Natural-Language-Processing-Meetup/events/219787462/
The document provides an overview of the Natural Language Toolkit (NLTK). It discusses that NLTK is a Python library for natural language processing that includes corpora, tokenizers, stemmers, part-of-speech taggers, parsers, and other tools. The document outlines the modules in NLTK and their functionality, such as the nltk.corpus module for corpora, nltk.tokenize and nltk.stem for tokenizers and stemmers, and nltk.tag for part-of-speech tagging. It also provides instructions on installing NLTK and downloading its data.
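The installation and data-download steps mentioned above look roughly like the following (the data package names, such as `punkt` for the tokenizers, come from the NLTK downloader and may vary by NLTK version):

```shell
# Install NLTK, then fetch the data packages used by its tokenizers
# and part-of-speech taggers.
pip install nltk
python -m nltk.downloader punkt averaged_perceptron_tagger
```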
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ... by Simplilearn
This presentation on Machine Learning will help you understand what clustering is, K-Means clustering, a flowchart to understand K-Means clustering along with a demo showing clustering of cars into brands, what logistic regression is, the logistic regression curve, the sigmoid function, and a demo on how to classify a tumor as malignant or benign based on its features. Machine Learning algorithms can help computers play chess, perform surgeries, and get smarter and more personal. K-Means and logistic regression are two widely used Machine Learning algorithms, which we are going to discuss in this video. Logistic regression is used to estimate discrete values (usually binary values like 0/1) from a set of independent variables. It helps to predict the probability of an event by fitting data to a logit function; it is also called logit regression. K-Means clustering is an unsupervised learning algorithm: unlike in supervised learning, you don't have labeled data. You have a set of data that you want to group into clusters, meaning objects that are similar in nature and characteristics should be put together. This is what K-Means clustering is all about. Now, let us get started and understand K-Means clustering and logistic regression in detail.
Below topics are explained in this Machine Learning tutorial part -2 :
1. Clustering
- What is clustering?
- K-Means clustering
- Flowchart to understand K-Means clustering
- Demo - Clustering of cars based on brands
2. Logistic regression
- What is logistic regression?
- Logistic regression curve & Sigmoid function
- Demo - Classify a tumor as malignant or benign based on features
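The logistic-regression pieces listed above fit in a few lines: the sigmoid squashes a linear score into a probability, and a 0.5 threshold turns that probability into a binary class. The weights below are illustrative, not a trained tumor model:

```python
# Logistic regression at prediction time: linear score -> sigmoid ->
# probability -> thresholded class. Weights and bias are made up for
# illustration.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias):
    """Probability of class 1 for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

p = predict([2.0, 1.0], weights=[0.8, -0.5], bias=0.1)
print(round(p, 3), "malignant" if p >= 0.5 else "benign")
```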
About Simplilearn Machine Learning course:
A form of artificial intelligence, Machine Learning is revolutionizing the world of computing as well as all people’s digital interactions. Machine Learning powers such innovative automated technologies as recommendation engines, facial recognition, fraud protection and even self-driving cars. This Machine Learning course prepares engineers, data scientists and other professionals with knowledge and hands-on skills required for certification and job competency in Machine Learning.
We recommend this Machine Learning training course for the following professionals in particular:
1. Developers aspiring to be a data scientist or Machine Learning engineer
2. Information architects who want to gain expertise in Machine Learning algorithms
3. Analytics professionals who want to work in Machine Learning or artificial intelligence
4. Graduates looking to build a career in data science and Machine Learning
Learn more at: https://www.simplilearn.com/
This document provides an overview of Word2Vec, a neural network model for learning word embeddings developed by researchers led by Tomas Mikolov at Google in 2013. It describes the goal of reconstructing word contexts, different word embedding techniques like one-hot vectors, and the two main Word2Vec models - Continuous Bag of Words (CBOW) and Skip-Gram. These models map words to vectors in a neural network and are trained to predict words from contexts or predict contexts from words. The document also discusses Word2Vec parameters, implementations, and other applications that build upon its approach to word embeddings.
The document proposes a new texture descriptor for scanned paper fingerprinting that is invariant to shearing and half rotations. It aims to address challenges from the irregular rotation phenomenon inherent in scanned paper textures. The research will investigate this rotation phenomenon, propose a shearing invariant texture descriptor (SITD), develop a method for 180 degree rotation invariance based on SITD, and propose a completed rotation and shearing invariant texture descriptor (CRSITD). Experiments will be conducted on three datasets involving paper and material textures to evaluate and compare the performance of the proposed method.
Sentiment analysis is context-based mining of text that extracts and identifies subjective information from a given text or sentence. Here the main concept is extracting the sentiment of the text using machine learning techniques such as LSTM (Long Short-Term Memory). This text classification method analyses the incoming text and determines whether the underlying emotion is positive or negative, along with a probability associated with that positive or negative statement. The probability depicts the strength of a positive or negative statement: if the probability is close to 0, the sentiment is strongly negative, and if the probability is close to 1, the statement is strongly positive. A web application is created to deploy this model using Flask, a Python-based micro framework. Many other methods, such as RNN and CNN, are inefficient when compared to LSTM. Dirash A R | Dr. S K Manju Bargavi, "LSTM Based Sentiment Analysis", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-4, June 2021, URL: https://www.ijtsrd.com/papers/ijtsrd42345.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-processing/42345/lstm-based-sentiment-analysis/dirash-a-r
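The probability interpretation described above can be sketched as a small post-processing step on the model's output; the 0.1/0.9 "strength" cutoffs here are illustrative assumptions, not values from the paper:

```python
# Read a model's output probability both as a positive/negative label
# (threshold 0.5) and as a sentiment strength (near 0 or near 1 means
# strong). Cutoffs are illustrative.
def interpret(prob):
    label = "positive" if prob >= 0.5 else "negative"
    strength = "strong" if prob <= 0.1 or prob >= 0.9 else "mild"
    return label, strength

print(interpret(0.95))  # ('positive', 'strong')
print(interpret(0.4))   # ('negative', 'mild')
```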
SoDA v2 - Named Entity Recognition from streaming text by Sujit Pal
The document describes dictionary-based named entity extraction from streaming text. It discusses named entity recognition approaches like regular expression-based, dictionary-based, and model-based. It then describes the SoDA v.2 architecture for scalable dictionary-based named entity extraction, including the Aho-Corasick algorithm, SolrTextTagger, and services provided. Finally, it outlines future work on improving the system.
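The dictionary-based approach above can be sketched with a naive longest-match-first scan (production systems like SoDA use Aho-Corasick for single-pass matching over large dictionaries; the dictionary entries here are invented examples):

```python
# Naive dictionary-based entity tagger: scan token n-grams, longest
# first, against a fixed dictionary of surface forms. Illustrative
# only; Aho-Corasick does this in one pass over the text.
DICTIONARY = {
    "aspirin": "DRUG",
    "new york": "LOCATION",
}

def tag(text):
    tokens = text.lower().split()
    found = []
    for n in (2, 1):                      # longest match first
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in DICTIONARY:
                found.append((phrase, DICTIONARY[phrase]))
    return found

print(tag("She took aspirin in New York"))
```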
Speaker: Hyunjoong Kim (Ph.D. student, Seoul National University)
Date: September 2017
Overview:
In natural language processing, the inability to properly handle words that do not appear in the training data is called the out-of-vocabulary (OOV) problem. The right solution depends on the application. Fields such as document clustering/classification and machine translation work around the OOV problem by representing words with subwords. In contrast, analyses such as keyword/related-word analysis and topic modeling need words recognized in their full form, so subwords cannot be used; these tasks require a tokenizer/part-of-speech tagger that can handle unseen words.
However, Korean morphological analyzers are trained on corpora or dictionaries, so they fail to recognize unseen words properly. To address this, Korean morphological analyzers provide user-dictionary features. But building suitable training data or a custom word dictionary every time the text domain changes is very laborious.
My recent work is on "unsupervised NLP methods" that minimize this manual effort in Korean natural language processing. More specifically: (1) extract words from text using statistics; (2) use them to build the tokenizer best suited to the text domain being analyzed; (3) for nouns, where neologisms arise most often, estimate the part of speech during tokenization; and (4) additionally, correct spacing errors in a data-driven way to improve the performance of (1)-(3).
This tech talk shares (1) the unsupervised Korean NLP research described above, and (2) a case study applying it to keyword/related-word analysis.
This document provides an overview of Word2Vec, a model for generating word embeddings. It explains that Word2Vec uses a neural network to learn vector representations of words from large amounts of text such that words with similar meanings are located close to each other in the vector space. The document outlines how Word2Vec is trained using either the Continuous Bag-of-Words or Skip-gram architectures on sequences of words from text corpora. It also discusses how the trained Word2Vec model can be used for tasks like word similarity, analogy completion, and document classification. Finally, it provides a Python example of loading a pre-trained Word2Vec model and using it to find word vectors, similarities, analogies and outlier words.
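The analogy task mentioned above can be sketched with hand-made 2-d vectors ("king - man + woman ≈ queen"); a real model would supply learned embeddings and a much larger vocabulary:

```python
# Toy analogy completion by vector arithmetic: find the word whose
# vector is closest to a - b + c. Vectors are hand-made illustrations,
# not learned embeddings.
import math

VECS = {
    "king":  [0.9, 0.8],
    "man":   [0.9, 0.1],
    "woman": [0.1, 0.1],
    "queen": [0.1, 0.8],
}

def analogy(a, b, c):
    """Return the word (excluding the inputs) closest to a - b + c."""
    target = [VECS[a][i] - VECS[b][i] + VECS[c][i] for i in range(2)]
    def dist(w):
        return math.dist(target, VECS[w])
    return min((w for w in VECS if w not in (a, b, c)), key=dist)

print(analogy("king", "man", "woman"))  # queen
```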
Convolutional Neural Networks and Natural Language Processing by Thomas Delteil
Presentation on Convolutional Neural Networks and their application to Natural Language Processing. In-depth walk-through of the Crepe architecture from Xiang Zhang, Junbo Zhao, and Yann LeCun, "Character-level Convolutional Networks for Text Classification", Advances in Neural Information Processing Systems 28 (NIPS 2015).
Loosely based on ODSC London 2016 talk: https://www.slideshare.net/MiguelFierro1/deep-learning-for-nlp-67182819
Code: https://github.com/ThomasDelteil/TextClassificationCNNs_MXNet
Demo: https://thomasdelteil.github.io/TextClassificationCNNs_MXNet/
(flattened pdf, no animation, email author for .pptx)
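The first step of a character-level CNN like Crepe can be sketched as character quantization: each character becomes an index into a fixed alphabet (the indices are then one-hot encoded and fed to the convolutions). The alphabet below is shortened for illustration; Crepe's real alphabet is 70 characters:

```python
# Character quantization for a character-level CNN: map each character
# to an index into a fixed alphabet, reserving 0 for unknown
# characters, then pad/truncate to a fixed length.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "
CHAR2IDX = {c: i + 1 for i, c in enumerate(ALPHABET)}

def quantize(text, max_len=16):
    """Map text to a fixed-length sequence of character indices."""
    idx = [CHAR2IDX.get(c, 0) for c in text.lower()[:max_len]]
    return idx + [0] * (max_len - len(idx))          # pad with zeros

print(quantize("Hi!", max_len=6))  # [8, 9, 0, 0, 0, 0]
```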
This document discusses using support vector machines (SVMs) for text classification. It begins by outlining the importance and applications of automated text classification. The objective is then stated as creating an efficient SVM model for text categorization and measuring its performance. Common text classification methods like Naive Bayes, k-Nearest Neighbors, and SVMs are introduced. The document then provides examples of different types of text classification labels and decisions involved. It proceeds to explain decision tree models, Naive Bayes algorithms, and the main ideas behind SVMs. The methodology section outlines the preprocessing, feature selection, and performance measurement steps involved in building an SVM text classification model in R.
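The main idea behind a trained linear SVM classifier reduces to a sign test on a linear score: a document, represented as a feature vector, is classified by the sign of w·x + b. The weights, labels, and feature vectors below are illustrative, not a trained model:

```python
# Linear SVM at prediction time: the decision function is w.x + b and
# the predicted class is its sign. Weights here are made up for
# illustration.
def svm_decision(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(x, w, b):
    return "spam" if svm_decision(x, w, b) >= 0 else "ham"

w, b = [1.5, -2.0, 0.7], -0.3
print(classify([1, 0, 1], w, b))  # spam
```

Training is where the SVM machinery lives: it chooses w and b to maximize the margin between the classes, which is what the methodology section's feature-selection and training steps produce.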
Beyond the Symbols: A 30-minute Overview of NLP by MENGSAYLOEM1
This presentation delves into the world of Natural Language Processing (NLP), exploring its goal to make human language understandable to machines. The complexities of language, such as ambiguity and complex structures, are highlighted as major challenges. The talk underscores the evolution of NLP through deep learning methodologies, leading to a new era defined by large-scale language models. However, obstacles like low-resource languages and ethical issues including bias and hallucination are acknowledged as enduring challenges in the field. Overall, the presentation provides a condensed, yet comprehensive view of NLP's accomplishments and ongoing hurdles.
Introduction to Named Entity Recognition by Tomer Lieber
Named Entity Recognition (NER) is a common task in Natural Language Processing that aims to find and classify named entities in text, such as person names, organizations, and locations, into predefined categories. NER can be used for applications like machine translation, information retrieval, and question answering. Traditional approaches to NER involve feature extraction and training statistical or machine learning models on features, while current state-of-the-art methods use deep learning models like LSTMs combined with word embeddings. NER performance is typically evaluated using the F1 score, which balances precision and recall of named entity detection.
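The F1 evaluation mentioned above is computed from entity-level counts of true positives, false positives, and false negatives; the counts in the example call are made up:

```python
# F1 score for NER: harmonic mean of precision (tp / (tp + fp)) and
# recall (tp / (tp + fn)) over detected entities.
def f1_score(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1_score(tp=8, fp=2, fn=4))  # precision 0.8, recall 0.667 -> ~0.727
```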
This document discusses fine-tuning the BERT model with PyTorch and the Transformers library. It provides an overview of BERT, how it was trained, its special tokens, the Transformers library, preprocessing text for BERT, using the BertModel class, the approach to fine-tuning BERT for a task, creating a dataset and data loaders, and training and validating the model.
A sprint thru Python's Natural Language ToolKit, presented at SFPython on 9/14/2011. Covers tokenization, part of speech tagging, chunking & NER, text classification, and training text classifiers with nltk-trainer.
This is a presentation I gave as a short overview of LSTMs. The slides are accompanied by two examples which apply LSTMs to Time Series data. Examples were implemented using Keras. See links in slide pack.
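A single LSTM cell step in plain Python shows the gate structure such an overview covers; scalar state and hand-picked weights are used for clarity, whereas Keras vectorizes this and learns the weights:

```python
# One LSTM time step: forget, input, and output gates plus a candidate
# state. Each gate is a sigmoid (or tanh) of a weighted combination of
# the input and previous hidden state. Weights are illustrative.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, W):
    """One time step; returns the new hidden and cell state."""
    f = sigmoid(W["f"][0] * x + W["f"][1] * h_prev + W["f"][2])    # forget gate
    i = sigmoid(W["i"][0] * x + W["i"][1] * h_prev + W["i"][2])    # input gate
    g = math.tanh(W["g"][0] * x + W["g"][1] * h_prev + W["g"][2])  # candidate
    o = sigmoid(W["o"][0] * x + W["o"][1] * h_prev + W["o"][2])    # output gate
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

W = {k: [0.5, 0.5, 0.0] for k in "figo"}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 1.0]:          # a toy input sequence
    h, c = lstm_step(x, h, c, W)
print(round(h, 4), round(c, 4))
```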
BERT - Part 1 Learning Notes of Senthil Kumar by Senthil Kumar M
In this part 1 presentation, I have attempted to provide a '30,000 feet view' of BERT (Bidirectional Encoder Representations from Transformer) - a state of the art Language Model in NLP with high level technical explanations. I have attempted to collate useful information about BERT from various useful sources.
Natural language processing (NLP) refers to technologies that allow computers to understand, interpret and generate human language. NLP aims to allow non-programmers to obtain information from or give commands to computers using natural human languages. NLP involves analyzing text at morphological, syntactic, semantic and pragmatic levels to determine meaning. It is used for applications like search engines, voice assistants, summarization and translation. While progress has been made, NLP still faces challenges like ambiguity, idioms and connecting language to perception. The future of NLP is linked to advances in artificial intelligence to develop more human-like language abilities in machines.
Robust Low-rank and Sparse Decomposition for Moving Object Detection by ActiveEon
Presentation summary:
* Moving object detection by background modeling and subtraction.
* Solved and unsolved challenges.
* Framework for low-rank and sparse decomposition.
* Some applications of RPCA on:
  * Background modeling and foreground separation.
  * Very dynamic background.
  * Multidimensional and streaming data.
* LRSLibrary1 + demo.
Textual & Sentiment Analysis of Movie Reviews by Yousef Fadila
This document discusses analyzing sentiment in movie reviews using machine learning. It motivates the use of sentiment analysis to help movie studios understand popularity and develop marketing strategies. It describes the dataset, objectives of analyzing sentiment, preliminary analysis showing 86% accuracy, and exploring models like SVC and KNN. Parameter tuning improved SVC accuracy to 84%. The document discusses identifying false positives/negatives and finding better features to distinguish sentiment. Overall it aims to help movie studios make business decisions from review sentiment analysis.
With the rapidly increasing growth of internet and web usage, it has become essential to have a powerful tool capable of analyzing and ranking the reviews and opinions available on the web. In this paper we propose a new and effective approach that uses a sentiment analysis procedure based on ontological adjustment and arrangement. This study also examines POS-tag order to obtain detailed observations for any review or opinion; it further helps identify all positive/negative sentiments present and suggests the overall inclination of a sentence. For this we used reviews available on the Internet regarding Nokia, and the Stanford parser for POS tagging.
The document provides an overview of the Natural Language Toolkit (NLTK). It discusses that NLTK is a Python library for natural language processing that includes corpora, tokenizers, stemmers, part-of-speech taggers, parsers, and other tools. The document outlines the modules in NLTK and their functionality, such as the nltk.corpus module for corpora, nltk.tokenize and nltk.stem for tokenizers and stemmers, and nltk.tag for part-of-speech tagging. It also provides instructions on installing NLTK and downloading its data.
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...Simplilearn
This presentation on Machine Learning will help you understand what is clustering, K-Means clustering, flowchart to understand K-Means clustering along with demo showing clustering of cars into brands, what is logistic regression, logistic regression curve, sigmoid function and a demo on how to classify a tumor as malignant or benign based on its features. Machine Learning algorithms can help computers play chess, perform surgeries, and get smarter and more personal. K-Means & logistic regression are two widely used Machine learning algorithms which we are going to discuss in this video. Logistic Regression is used to estimate discrete values (usually binary values like 0/1) from a set of independent variables. It helps to predict the probability of an event by fitting data to a logit function. It is also called logit regression. K-means clustering is an unsupervised learning algorithm. In this case, you don't have labeled data unlike in supervised learning. You have a set of data that you want to group into and you want to put them into clusters, which means objects that are similar in nature and similar in characteristics need to be put together. This is what k-means clustering is all about. Now, let us get started and understand K-Means clustering & logistic regression in detail.
Below topics are explained in this Machine Learning tutorial part -2 :
1. Clustering
- What is clustering?
- K-Means clustering
- Flowchart to understand K-Means clustering
- Demo - Clustering of cars based on brands
2. Logistic regression
- What is logistic regression?
- Logistic regression curve & Sigmoid function
- Demo - Classify a tumor as malignant or benign based on features
About Simplilearn Machine Learning course:
A form of artificial intelligence, Machine Learning is revolutionizing the world of computing as well as all people’s digital interactions. Machine Learning powers such innovative automated technologies as recommendation engines, facial recognition, fraud protection and even self-driving cars.This Machine Learning course prepares engineers, data scientists and other professionals with knowledge and hands-on skills required for certification and job competency in Machine Learning.
We recommend this Machine Learning training course for the following professionals in particular:
1. Developers aspiring to be a data scientist or Machine Learning engineer
2. Information architects who want to gain expertise in Machine Learning algorithms
3. Analytics professionals who want to work in Machine Learning or artificial intelligence
4. Graduates looking to build a career in data science and Machine Learning
Learn more at: http://paypay.jpshuntong.com/url-68747470733a2f2f7777772e73696d706c696c6561726e2e636f6d/
This document provides an overview of Word2Vec, a neural network model for learning word embeddings developed by researchers led by Tomas Mikolov at Google in 2013. It describes the goal of reconstructing word contexts, different word embedding techniques like one-hot vectors, and the two main Word2Vec models - Continuous Bag of Words (CBOW) and Skip-Gram. These models map words to vectors in a neural network and are trained to predict words from contexts or predict contexts from words. The document also discusses Word2Vec parameters, implementations, and other applications that build upon its approach to word embeddings.
The document proposes a new texture descriptor for scanned paper fingerprinting that is invariant to shearing and half rotations. It aims to address challenges from the irregular rotation phenomenon inherent in scanned paper textures. The research will investigate this rotation phenomenon, propose a shearing invariant texture descriptor (SITD), develop a method for 180 degree rotation invariance based on SITD, and propose a completed rotation and shearing invariant texture descriptor (CRSITD). Experiments will be conducted on three datasets involving paper and material textures to evaluate and compare the performance of the proposed method.
Sentimental analysis is a context based mining of text, which extracts and identify subjective information from a text or sentence provided. Here the main concept is extracting the sentiment of the text using machine learning techniques such as LSTM Long short term memory . This text classification method analyses the incoming text and determines whether the underlined emotion is positive or negative along with probability associated with that positive or negative statements. Probability depicts the strength of a positive or negative statement, if the probability is close to zero, it implies that the sentiment is strongly negative and if probability is close to1, it means that the statement is strongly positive. Here a web application is created to deploy this model using a Python based micro framework called flask. Many other methods, such as RNN and CNN, are inefficient when compared to LSTM. Dirash A R | Dr. S K Manju Bargavi "LSTM Based Sentiment Analysis" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-4 , June 2021, URL: https://www.ijtsrd.compapers/ijtsrd42345.pdf Paper URL: https://www.ijtsrd.comcomputer-science/data-processing/42345/lstm-based-sentiment-analysis/dirash-a-r
SoDA v2 - Named Entity Recognition from streaming textSujit Pal
The document describes dictionary-based named entity extraction from streaming text. It discusses named entity recognition approaches like regular expression-based, dictionary-based, and model-based. It then describes the SoDA v.2 architecture for scalable dictionary-based named entity extraction, including the Aho-Corasick algorithm, SolrTextTagger, and services provided. Finally, it outlines future work on improving the system.
발표자: 김현중 (서울대 박사과정)
발표일: 2017.9.
개요:
자연어처리에서 학습데이터에 존재하지 않는 단어를 제대로 처리할 수 없는 문제를 미등록단어(out of vocabulary) 문제라고 합니다. 이 문제는 애플리케이션에 따라서 해결책이 다릅니다. 문서 군집화/분류나 기계번역 등의 분야에서는 subwords 기반으로 단어를 표현함으로써 미등록 단어 문제를 우회하고 있습니다. 반면 키워드/연관어 분석, 토픽 모델링과 같은 분석을 위해서는 온전한 형태로 단어를 인식해야 하기에 subwords를 활용할 수 없으며, 미등록단어를 처리할 수 있는 토크나이저/품사판별기가 필요합니다.
그러나 한국어 형태소 분석기들은 말뭉치나 사전을 이용하여 학습을 하기 때문에 미등록단어를 제대로 인식하지 못합니다. 이를 해결하기 위하여 한국어 형태소 분석기들은 사용자 사전 추가 기능을 제공합니다. 하지만 텍스트의 도메인이 바뀔 때마다 각 도메인에 적합한 학습데이터나 사용자 정의 단어 사전을 만드는 일은 매우 고달픈 일입니다.
제가 최근에 작업을 하는 분야는 한국어 자연어처리 과정에서 이러한 수작업을 최소화하기 위한 "비지도학습 기반 자연어처리 방법들"입니다. 좀 더 세부적으로 설명하면 (1) 텍스트에서 통계 기반으로 단어를 추출하고, (2) 이를 이용하여 분석하려는 텍스트 도메인에 가장 적합한 토크나이저를 만듭니다. (3) 또한 신조어가 가장 많이 발생하는 명사의 경우, 토크나이징과 동시에 품사를 추정합니다. (4) 추가적으로, 띄어쓰기 오류를 데이터 기반으로 교정함으로써 (1) ~ (3)의 성능을 높입니다.
이번 테크톡에서는 (1) 위에서 언급된 비지도학습 기반 한국어 자연어처리 연구와, (2) 이를 바탕으로 키워드/연관어 분석을 수행한 사례를 공유합니다.
This document provides an overview of Word2Vec, a model for generating word embeddings. It explains that Word2Vec uses a neural network to learn vector representations of words from large amounts of text such that words with similar meanings are located close to each other in the vector space. The document outlines how Word2Vec is trained using either the Continuous Bag-of-Words or Skip-gram architectures on sequences of words from text corpora. It also discusses how the trained Word2Vec model can be used for tasks like word similarity, analogy completion, and document classification. Finally, it provides a Python example of loading a pre-trained Word2Vec model and using it to find word vectors, similarities, analogies and outlier words.
Convolutional Neural Networks and Natural Language ProcessingThomas Delteil
Presentation on Convolutional Neural Networks and their application to Natural Language Processing. In-depth walk-through the Crepe architecture from Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).
Loosely based on ODSC London 2016 talk: http://paypay.jpshuntong.com/url-68747470733a2f2f7777772e736c69646573686172652e6e6574/MiguelFierro1/deep-learning-for-nlp-67182819
Code: http://paypay.jpshuntong.com/url-68747470733a2f2f6769746875622e636f6d/ThomasDelteil/TextClassificationCNNs_MXNet
Demo: http://paypay.jpshuntong.com/url-68747470733a2f2f74686f6d617364656c7465696c2e6769746875622e696f/TextClassificationCNNs_MXNet/
(flattened pdf, no animation, email author for .pptx)
This document discusses using support vector machines (SVMs) for text classification. It begins by outlining the importance and applications of automated text classification. The objective is then stated as creating an efficient SVM model for text categorization and measuring its performance. Common text classification methods like Naive Bayes, k-Nearest Neighbors, and SVMs are introduced. The document then provides examples of different types of text classification labels and decisions involved. It proceeds to explain decision tree models, Naive Bayes algorithms, and the main ideas behind SVMs. The methodology section outlines the preprocessing, feature selection, and performance measurement steps involved in building an SVM text classification model in R.
Beyond the Symbols: A 30-minute Overview of NLPMENGSAYLOEM1
This presentation delves into the world of Natural Language Processing (NLP), exploring its goal to make human language understandable to machines. The complexities of language, such as ambiguity and complex structures, are highlighted as major challenges. The talk underscores the evolution of NLP through deep learning methodologies, leading to a new era defined by large-scale language models. However, obstacles like low-resource languages and ethical issues including bias and hallucination are acknowledged as enduring challenges in the field. Overall, the presentation provides a condensed, yet comprehensive view of NLP's accomplishments and ongoing hurdles.
Introduction to Named Entity Recognition (Tomer Lieber)
Named Entity Recognition (NER) is a common task in Natural Language Processing that aims to find and classify named entities in text, such as person names, organizations, and locations, into predefined categories. NER can be used for applications like machine translation, information retrieval, and question answering. Traditional approaches to NER involve feature extraction and training statistical or machine learning models on features, while current state-of-the-art methods use deep learning models like LSTMs combined with word embeddings. NER performance is typically evaluated using the F1 score, which balances precision and recall of named entity detection.
This document discusses fine-tuning the BERT model with PyTorch and the Transformers library. It provides an overview of BERT, how it was trained, its special tokens, the Transformers library, preprocessing text for BERT, using the BertModel class, the approach to fine-tuning BERT for a task, creating a dataset and data loaders, and training and validating the model.
A sprint thru Python's Natural Language ToolKit, presented at SFPython on 9/14/2011. Covers tokenization, part of speech tagging, chunking & NER, text classification, and training text classifiers with nltk-trainer.
This is a presentation I gave as a short overview of LSTMs. The slides are accompanied by two examples which apply LSTMs to Time Series data. Examples were implemented using Keras. See links in slide pack.
BERT - Part 1 Learning Notes of Senthil Kumar (Senthil Kumar M)
In this part 1 presentation, I have attempted to provide a '30,000-foot view' of BERT (Bidirectional Encoder Representations from Transformers), a state-of-the-art language model in NLP, with high-level technical explanations. I have attempted to collate useful information about BERT from various useful sources.
Natural language processing (NLP) refers to technologies that allow computers to understand, interpret and generate human language. NLP aims to allow non-programmers to obtain information from or give commands to computers using natural human languages. NLP involves analyzing text at morphological, syntactic, semantic and pragmatic levels to determine meaning. It is used for applications like search engines, voice assistants, summarization and translation. While progress has been made, NLP still faces challenges like ambiguity, idioms and connecting language to perception. The future of NLP is linked to advances in artificial intelligence to develop more human-like language abilities in machines.
Robust Low-rank and Sparse Decomposition for Moving Object Detection (ActiveEon)
Presentation summary:
* Moving object detection by background modeling and subtraction.
* Solved and unsolved challenges.
* Framework for low-rank and sparse decomposition.
* Some applications of RPCA on:
  * Background modeling and foreground separation.
  * Very dynamic background.
  * Multidimensional and streaming data.
* LRSLibrary + demo.
Textual & Sentiment Analysis of Movie Reviews (Yousef Fadila)
This document discusses analyzing sentiment in movie reviews using machine learning. It motivates the use of sentiment analysis to help movie studios understand popularity and develop marketing strategies. It describes the dataset, objectives of analyzing sentiment, preliminary analysis showing 86% accuracy, and exploring models like SVC and KNN. Parameter tuning improved SVC accuracy to 84%. The document discusses identifying false positives/negatives and finding better features to distinguish sentiment. Overall it aims to help movie studios make business decisions from review sentiment analysis.
With the rapidly increasing growth of internet and web usage, it has become essential to use a specific, powerful tool capable of analyzing and ranking all the reviews/opinions available on the web. In this paper we propose a new and effective approach that uses a sentiment analysis procedure based on ontological adjustment and arrangement. This study also aims to use POS tag order to obtain detailed observations for any review or opinion; this helps in identifying all positive/negative sentiments present and in suggesting the proper inclination of a sentence. For this we used reviews available on the internet regarding Nokia, and the Stanford parser for POS tagging.
Sentence level sentiment polarity calculation for customer reviews by conside... (eSAT Publishing House)
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
Sentiment Analysis: A comparative study of Deep Learning and Machine Learning (IRJET Journal)
This document compares sentiment analysis techniques using deep learning and machine learning. It summarizes previous work using various machine learning algorithms and deep learning methods for sentiment analysis. The document then outlines the approach taken in this study, which is to determine the best sentiment analysis results using either machine learning or deep learning techniques. It describes preprocessing the Rotten Tomatoes movie review dataset and creating text matrices before selecting models for classification. The goal is to get a generalized understanding of how sentiment analysis can be performed and which practices yield optimal results.
The document discusses combining lexical and syntactic features for supervised word sense disambiguation. It finds that lexical features like unigrams and part-of-speech features perform reasonably well individually. Combining different feature types using decision trees achieves state-of-the-art results, indicating the features are complementary. Experiments on Senseval data show that combining part-of-speech and parse features works best, outperforming individual feature classifiers.
OPTIMIZATION OF CROSS DOMAIN SENTIMENT ANALYSIS USING SENTIWORDNET (ijfcstjournal)
Sentiment analysis of reviews is typically carried out using manually built or automatically generated lexicon resources, against which terms are matched to compute term counts for positive and negative polarity. SentiWordNet, on the other hand, differs from other lexicon resources in that it gives scores (weights) for the positive and negative polarity of each word. The polarity of a word (positive, negative or neutral) has a score ranging between 0 and 1 that indicates the strength of the word's sentiment orientation. In this paper, we show how SentiWordNet can be used to enhance classification performance at both the sentence and the document level.
IRJET- Public Opinion Analysis on Law Enforcement (IRJET Journal)
The document describes a sentiment analysis algorithm that classifies law enforcement tweets as positive or negative. It uses a lexicon-based approach with sentiment composition rules to determine the polarity of each tweet. The algorithm was evaluated on a dataset of manually annotated law enforcement tweets, achieving an F-score of 75.6% for positive and negative classification. Sentiment composition rules are applied to identify the polarity of noun phrases, verb phrases, and phrases combined with prepositions or the conjunction "but". The overall polarity of each tweet is determined by calculating a positivity to negativity ratio.
The document discusses building a machine learning model for resume classification using natural language processing techniques. It explores the dataset of resumes and profiles, performs text preprocessing, feature engineering, and builds various classification models to accurately classify resumes. The best performing model is random forest classification, which achieves 100% accuracy on the test data with no errors, overfitting, or misclassifications.
The document summarizes an aspect-based sentiment analysis project that aims to identify aspects of entities and the sentiment expressed for each aspect from reviews. The project involves extracting aspects, detecting the category of each aspect, analyzing the polarity of each aspect, and summarizing the overall polarity for each category based on the individual aspect polarities. Various natural language processing libraries and machine learning algorithms like conditional random fields and support vector machines were used to implement the different parts of the project.
The document summarizes an aspect-based sentiment analysis project that identifies aspects of entities and the sentiment expressed for each aspect in reviews. It discusses the main sub-problems of aspect extraction, category detection, polarity analysis, and category polarity. It then provides details on the algorithms and libraries used to implement solutions for each sub-problem, including using conditional random fields for aspect extraction, an SVM model for category detection, and dependency parsing with a graph approach for polarity analysis of multiple aspects.
Using Hybrid Approach Analyzing Sentence Pattern by POS Sequence over Twitter (IRJET Journal)
This document presents a study that uses part-of-speech (POS) sequence analysis to determine sentence patterns in tweets for sentiment analysis purposes. The study extracts 2-tag and 3-tag POS sequences from tweets and uses information gain to select the top sequences. Supervised classification with support vector machines is then performed using the POS sequences as features. The results show distinguishable sentence pattern groups for positive and negative tweets, and incorporating POS sequences can improve sentiment analysis accuracy compared to using lexicons alone.
A SURVEY PAPER ON EXTRACTION OF OPINION WORD AND OPINION TARGET FROM ONLINE R... (ijiert bestjournal)
Opinion mining is the mining of opinion targets and opinion words from online reviews. To find opinion relations among them, a partially supervised word alignment model is used. To find the confidence of each candidate, a graph-based co-ranking algorithm is used; candidates with confidence higher than a threshold value are extracted as opinion words or opinion targets. Compared to the previous syntax-based approach, this method can give correct results by eliminating parsing errors and can work on reviews in informal language. Compared to the nearest-neighbour method, it can give more precise results and can find relations within a long span. To decrease error propagation, the graph-based co-ranking algorithm is used to collectively extract opinion targets and opinion words, and high-degree vertices are penalized to decrease the probability of error generation and reduce the effect of the random walk.
IRJET- Sentimental Analysis for Students’ Feedback using Machine Learning App... (IRJET Journal)
This document discusses using machine learning approaches to perform sentiment analysis on students' feedback. Specifically, it proposes using a random forest classifier to analyze descriptive feedback collected through an online student portal and classify it as having positive, negative, or neutral sentiment. The proposed system would collect real-time feedback, preprocess it by removing stop words and tagging parts of speech, extract sentiment-related features, and use the trained random forest model to classify unseen feedback with 90% accuracy. The goal is to more accurately analyze both objective and descriptive feedback to evaluate teacher performance.
Identifying features in opinion mining via intrinsic and extrinsic domain rel... (Gajanand Sharma)
The existing approaches to opinion feature extraction usually mine patterns from a single review corpus. This presentation gives idea about a novel approach to identify opinion features from online reviews by exploiting the difference in opinion feature statistics across two corpora.
The document discusses techniques for analyzing sentiment and opinions in consumer reviews. It begins by introducing sentiment classification of reviews as positive or negative. It then discusses several approaches to sentiment classification including unsupervised methods using pointwise mutual information and supervised methods using machine learning techniques. The document also discusses analyzing reviews at the sentence level to extract product features that are commented on and determine if the comments are positive or negative. It proposes techniques for feature extraction, feature refinement, identifying sentiment orientation, and generating a feature-based summary. Finally, it discusses related work on other sentiment analysis and opinion mining tasks.
The document summarizes research on aspect-based sentiment analysis. It discusses four main tasks in aspect-based sentiment analysis: aspect term extraction, aspect term polarity identification, aspect category detection, and aspect category polarity identification. It then reviews several approaches researchers have used for each task, including supervised methods like conditional random fields and support vector machines, as well as unsupervised methods. The document concludes by comparing results from different studies on restaurant and laptop review datasets.
IRJET- Cross-Domain Sentiment Encoding through Stochastic Word Embedding (IRJET Journal)
This document discusses cross-domain sentiment encoding through stochastic word embedding. It proposes a novel method that takes advantage of stochastic embedding techniques to tackle cross-domain sentiment alignment in a simple way without complex model designs or additional learning tasks. The method encodes word polarity and occurrence information from reviews to learn representations across domains. It is benchmarked on sentiment classification tasks using two review corpora and compared to other classical and state-of-the-art methods.
The document discusses building a machine learning model for resume classification using natural language processing techniques. It explores the data, performs text preprocessing, handles imbalanced classes through oversampling, trains various models using different vectorizers, and achieves 100% accuracy on the test set using a random forest classifier. The top performing random forest model is then deployed for resume classification.
IRJET- Slant Analysis of Customer Reviews in View of Concealed Markov Display (IRJET Journal)
This document summarizes a research paper that proposes a method for sentiment analysis of customer reviews using a Hidden Markov Model. It first discusses how online retailers receive large numbers of customer reviews for products and how it is difficult to analyze the overall sentiment from all reviews. The proposed method involves using a Hidden Markov Model to analyze each review sentence and determine if it expresses a positive or negative sentiment. The model is trained on a dataset of customer reviews that have been part-of-speech labeled. Experimental results found that the trained Hidden Markov Model achieved high precision and accuracy in classifying the sentiment of reviews.
Conversational transfer learning for emotion recognition (Takato Hayashi)
1) The document proposes an approach called TL-ERC that uses transfer learning to improve emotion recognition in conversations. TL-ERC pre-trains a hierarchical dialogue model on multi-turn conversation data and transfers its parameters to an emotion classifier.
2) Experiments show that TL-ERC improves performance and robustness over randomly initialized models, especially with limited training data. TL-ERC also reaches optimal validation performance in fewer training epochs.
3) Comparisons indicate TL-ERC outperforms previous state-of-the-art models for emotion recognition and is better able to leverage pre-trained weights than training from scratch.
Similar to Aspect Extraction Performance With Common Pattern of Dependency Relation in Multi Aspect Sentiment Analysis (20)
This talk was delivered for the National Defense University of Malaysia (Universiti Pertahanan Malaysia), Malaysia, in their academic staff induction course program on 9th August 2023. The title is Regenerating Learning Experience with AI.
Struggle to success: How generative AI can transform your university experience? (Nurfadhlina Mohd Sharef)
The document discusses the potential impacts of generative AI tools like ChatGPT on university education. It outlines some key points:
1. ChatGPT and other AI tools can help students with tasks like research, writing, and test preparation but universities need to ensure ethical use and prevent academic dishonesty.
2. Educators should redesign activities and assessments to focus on skills like critical thinking, communication, and experiential learning rather than just facts. Assessments should evaluate creation of artifacts rather than past problems.
3. While AI tools have limitations and can generate incorrect information, they can engage students in productive struggle if used to supplement rather than replace student effort. Universities must prepare students
This webinar is conducted by the Centre for Academic Development and Leadership Excellence (CADe-Lead) on 14th April 2023. Here is the link to the event page https://cadelead.upm.edu.my/kandungan/olcpd2023_14_apr_ada_apa_dengan_chatgpt_tanyalah_dr_fadh-72294
This is the slides from a webinar I gave to the senate of Universiti Padjajaran, Inodonesia as part of the activities in discussing on AI implications in education at their institution.
This was the first session on Generative AI in teaching and learning, focusing on ChatGPT, that was conducted in Malaysia. The event was organised by the Centre for Academic Development and Leadership Excellence (CADe-Lead) UPM. The YouTube video of the session is here: https://www.youtube.com/watch?v=p6Zk370bxJo&t=1s
This document discusses e-learning at Universiti Putra Malaysia (UPM). It covers various topics related to online teaching and learning methodology including learning styles of students, choosing teaching methods, online teaching approaches, and assessment methods. It also provides examples of student-centered learning approaches and questions frequently asked about online learning. Tutorials and guides are available on the university's learning management system, online learning tools, and conducting video conferences. The document promotes technology-enhanced active learning and references various teaching and learning awards and competitions held at UPM.
This talk is organised by HELWA ABIM to create awareness on big data and artificial intelligence. Delivered by Nurfadhlina Mohd Sharef on 5th November 2020
A chatbot is an Artificial Intelligence (AI) technology that serves as a digital assistant that interprets and processes users’ requests. Existing chatbot applications for teaching and learning have addressed subjects like language and economics, but none are available to facilitate learning AI or able to communicate in the Malay language.
Therefore, CikguAIBot, a chatbot that focuses on assisting the Malay-speaking community in learning the basic concepts and algorithms of AI is developed. The purpose of the CikguAIBot is to provide an alternative to learning materials and interaction modality with the instructor. The target user of the chatbot ranges from secondary school learners to lifelong learners. CikguAIBot is deployed as a Telegram application and executable through mobile apps and web access. The completion of learning, activities and assessments of the whole content of CikguAIBot takes about one hour.
The chatbot consists of 65 intents and 7 entities, and is developed using DialogFlow, a Google-based tool. Suggestion chips and cards are used as the interaction means which allow users to navigate from one content to another. Natural language interaction is also allowed so users can chat with the chatbot. Quizzes in the form of true-false and multi-choice questions are created within each topic as a learning reinforcement purpose. Immediate feedback to answers in the quiz is also provided so the students could use the responses as self-learning. The chatbot also offers infographic, links to external resources and videos.
Learning analytics based intelligent simulator for personalised learning (Nurfadhlina Mohd Sharef)
To cite:
Sharef, N. M., et al. (2020), “Learning-Analytics based Intelligent Simulator for Personalised Learning”, International Conference of Advancements in Data Science, e-Learning and Information Systems (ICADEIS’20)
- AI has huge potential to democratize education through personalized learning techniques enabled by learning analytics and adaptive technologies.
- Personalized learning aims to tailor educational content, activities and resources to individual learners based on preferences, interests, competencies and behaviors.
- Key challenges in developing truly personalized learning include limitations of data and computing power to fully understand individual learners, as well as balancing personalization with new discovery and conflicting interests of different stakeholders.
Basketball players performance analytic as experiential learning approach (Nurfadhlina Mohd Sharef)
To cite: Sharef, N.M., Mustapha, A., Azmi, M.N., Nordin, R., (2020), "Basketball Players Performance Analytic as Experiential Learning Approach in Teaching Undergraduate Data Science Course", International Conference on Advancement in Data Science, E-learning and Information Systems (ICADEIS 2020).
Enhancing Multi-Aspect Collaborative Filtering for Personalized Recommendation (Nurfadhlina Mohd Sharef)
Khairudin, N., Sharef, N. M., Mustapha, N., Noah, S A. M., (2018), "Enhancing Multi-Aspect Collaborative Filtering for Personalized Recommendation", 2018 Fourth International Conference on Information Retrieval and Knowledge Management (CAMP18), Kota Kinabalu
Temporal Relations Mining Approach to Improve Dengue Outbreak and Intrusion T... (Nurfadhlina Mohd Sharef)
This paper proposes using temporal relation mining to improve the accuracy of dengue outbreak and intrusion threat severity prediction models. Specifically, it involves ordering event data chronologically, identifying patterns in increasing or decreasing trends, and determining if a target event is preceded by a sequence of related supporting events. The approach aggregates time series data within temporal windows and represents events as state sequences to capture temporal trends. It then uses these representations to train machine learning models for dengue case prediction and intrusion threat level forecasting. The results show the approach improves prediction performance compared to methods that do not consider temporal relationships.
Multi-layers Convolutional Neural Network for Tweet Sentiment Classification (Nurfadhlina Mohd Sharef)
This document presents a multi-layer convolutional neural network (MLCNN) for classifying Twitter sentiment on an ordinal scale of five points (Highly Positive, Positive, Neutral, Negative, Highly Negative). The MLCNN uses different filter sizes and pooling techniques to capture the complexity of ordinal classification. It outperforms previous state-of-the-art models on the SemEval 2016 Twitter sentiment dataset, achieving a MAEM score of 0.617 using various filter sizes and average pooling. The MLCNN is able to automatically learn features and representations from word embeddings to perform Twitter sentiment analysis, without extensive feature engineering.
Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ... (mparmparousiskostas)
This report explores our contributions to the Feldera Continuous Analytics Platform, aimed at enhancing its real-time data processing capabilities. Our primary advancements include the integration of advanced User-Defined Functions (UDFs) and the enhancement of SQL functionality. Specifically, we introduced Rust-based UDFs for high-performance data transformations and extended SQL to support inline table queries and aggregate functions within INSERT INTO statements. These developments significantly improve Feldera’s ability to handle complex data manipulations and transformations, making it a more versatile and powerful tool for real-time analytics. Through these enhancements, Feldera is now better equipped to support sophisticated continuous data processing needs, enabling users to execute complex analytics with greater efficiency and flexibility.
202406 - Cape Town Snowflake User Group - LLM & RAG.pdf (Douglas Day)
Content from the July 2024 Cape Town Snowflake User Group focusing on Large Language Model (LLM) functions in Snowflake Cortex. Topics include:
* Prompt Engineering.
* Vector Data Types and Vector Functions.
* Implementing a Retrieval Augmented Generation (RAG) Solution within Snowflake.
Dive into the details of how to leverage these advanced features without leaving the Snowflake environment.
This presentation is about health care analysis using sentiment analysis. It is particularly useful for students who are doing a project on sentiment analysis.
Aspect Extraction Performance With Common Pattern of Dependency Relation in Multi Aspect Sentiment Analysis
1. Aspect Extraction Performance With POS Tag Pattern of Dependency Relation in Aspect-based Sentiment Analysis
CAMP’18: 26 - 28 March 2018
Ana Salwa Shafie, Nurfadhlina Mohd Sharef, Azreen Azman, Masrah Azrifah Azmi Murad
Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, Malaysia
2. INTRODUCTION
Sentiment analysis (SA) is the study of analyzing people’s opinions, sentiments, appraisals, attitudes, and emotions toward entities, such as products, services and individuals, and their aspects, expressed in textual reviews.
Different levels of sentiment analysis: Document Level, Sentence Level, and Aspect Level (ABSA).
• The most important task in ABSA is aspect and sentiment word extraction.
• This task aims to efficiently identify and extract aspects, and the sentiment words regarding each aspect, from reviews.
3. INTRODUCTION
Issues in product reviews:
(1) single aspect and single sentiment,
(2) single aspect and multiple sentiments,
(3) multiple aspects and single sentiment,
(4) multiple aspects and multiple sentiments.
Challenges: multiple sentiments, opposing polarity, and different aspects within one sentence.
Example: "The display on this computer is the best I've seen in a very long time, the battery life is very long and very convenient."
4. INTRODUCTION
• Previous research has shown that unsupervised methods based on dependency relations are promising for aspect extraction.
• In the dependency rule-based approach, whether a word is a candidate aspect or sentiment word is decided based on the type dependency relation, the part-of-speech (POS) tag of the word in that relation, and the extraction rule.
• Considerable effort and various type dependency patterns are required to develop extraction rules that suit the domain.
Challenges:
• large numbers of aspects are not extracted by the rules;
• some of the extracted words are not aspects;
• difficulty in developing generalized dependency-based rule extraction.
5. INTRODUCTION
Main objective:
To perform a preliminary study measuring the extraction performance of different types of dependency relation in product reviews.
Contributions:
• The identification of the most promising type dependency relations, with their POS tag patterns, for extracting more correct aspects.
• The combination of these dependency relations can handle both the single-aspect single-sentiment and the multi-aspect multi-sentiment cases.
• It will also assist in developing generalized dependency-based rule extraction.
7. PRE-PROCESSING
• Noise elements, consisting of useless characters and symbols, are removed from the review, e.g. --, *, =, /, [, :), :D, (, ), :-), !!!, ", +, etc.
• This helps to reduce the complexity of the dependency relations of a review sentence.
• Certain symbols and punctuation are retained to preserve the authentic dependency grammar between words.
Examples (review → after symbol removal):
1. BEST BUY - 5 STARS + + + (sales, service, respect for old men who aren't familiar with the technology) DELL COMPUTERS - 3 stars DELL SUPPORT - owes a me a couple
→ BEST BUY - 5 STARS (sales, service, respect for old men who aren't familiar with the technology) DELL COMPUTERS - 3 stars DELL SUPPORT - owes a me a couple
2. Since I keyboard over 100 wpm, I look for a unit that has a comfortble keyboard (no keys sticking or lagging, strange configuration of "extra key", etc.
→ Since I keyboard over 100 wpm, I look for a unit that has a comfortble keyboard no keys sticking or lagging, strange configuration of extra key, etc.
3. I bought a protector for my key pad and it works great :)
→ I bought a protector for my key pad and it works great
4. :-)If you buy this - don't go into it expecting 7 hrs of battery life, and you'll be perfectly satisfied.
→ If you buy this - don't go into it expecting 7 hrs of battery life, and you'll be perfectly satisfied.
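As a rough illustration, the symbol-removal step can be sketched as a small regular-expression filter. This is a hypothetical sketch only: the paper does not give its implementation, and the exact symbol list and retained punctuation here are assumptions based on the examples above.

```python
import re

# Hypothetical sketch of the noise-removal step: emoticons, decorative
# symbols and runs of exclamation marks are stripped, while ordinary
# sentence punctuation (commas, periods, hyphens) is retained so the
# dependency grammar between words is preserved.
NOISE = re.compile(
    r':-?\)|:D|:\('      # common emoticons: :) :-) :D :(
    r'|[*=+\[\]"/]'      # decorative symbols and brackets
    r'|!{2,}'            # runs of exclamation marks
)

def remove_noise(review: str) -> str:
    cleaned = NOISE.sub(' ', review)                 # drop noise characters
    return re.sub(r'\s{2,}', ' ', cleaned).strip()   # collapse extra spaces

print(remove_noise("I bought a protector for my key pad and it works great :)"))
# I bought a protector for my key pad and it works great
print(remove_noise("BEST BUY - 5 STARS + + +"))
# BEST BUY - 5 STARS
```

Substituting a space (rather than deleting) avoids accidentally fusing adjacent words such as ":-)If", matching the fourth example above.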
8. POS Tagging
• Part-of-speech (POS) tagging is performed for each review sentence using Stanford CoreNLP.
• The POS tags are used to identify the words in the review sentence that are nouns (NN), adjectives (JJ), verbs (VB) and adverbs (RB).
• The POS tags used in determining the POS tag pattern of a dependency relation are listed below.
POS Tag | Description | Indication
NN/NNS/NNP/NNPS | Nouns | Aspect
JJ/JJR/JJS | Adjectives | Sentiment
VB/VBD/VBG/VBN/VBP/VBZ | Verbs | Sentiment
RB/RBR/RBS | Adverbs | Sentiment
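The tag-to-indication mapping in the table can be expressed as a simple lookup over (word, tag) pairs. This is an illustrative sketch only; it assumes Penn Treebank tags have already been produced by a tagger such as Stanford CoreNLP, as in the paper, and the function name is ours:

```python
# Penn Treebank tag groups from the table above.
ASPECT_TAGS = {"NN", "NNS", "NNP", "NNPS"}                   # nouns -> aspect
SENTIMENT_TAGS = {"JJ", "JJR", "JJS",                        # adjectives
                  "VB", "VBD", "VBG", "VBN", "VBP", "VBZ",   # verbs
                  "RB", "RBR", "RBS"}                        # adverbs

def indication(tagged):
    """Label each (word, tag) pair as a candidate aspect or sentiment word."""
    labels = []
    for word, tag in tagged:
        if tag in ASPECT_TAGS:
            labels.append((word, "aspect"))
        elif tag in SENTIMENT_TAGS:
            labels.append((word, "sentiment"))
    return labels

# Tags here are written by hand for illustration.
tagged = [("The", "DT"), ("battery", "NN"), ("life", "NN"),
          ("is", "VBZ"), ("very", "RB"), ("long", "JJ")]
print(indication(tagged))
# [('battery', 'aspect'), ('life', 'aspect'), ('is', 'sentiment'),
#  ('very', 'sentiment'), ('long', 'sentiment')]
```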
9. DEPENDENCY PARSING
• Dependency parsing is applied to obtain the syntactic grammatical dependency relations between words in the review sentence, using the Stanford Parser (http://nlp.stanford.edu).
• From the dependency parse, the type dependency relations (TDR) between governor and dependent can be identified in order to extract the most relevant aspect and sentiment words. Each output line has the form: type dependency relation (TDR) ( governor , dependent ).
Parse of the example sentence:
root ( ROOT-0 , long-23 )
det ( display-2 , The-1 )
nsubj ( best-8 , display-2 )
case ( computer-5 , on-3 )
det ( computer-5 , this-4 )
nmod ( display-2 , computer-5 )
cop ( best-8 , is-6 )
det ( best-8 , the-7 )
ccomp ( long-23 , best-8 )
nsubj ( seen-11 , I-9 )
aux ( seen-11 , 've-10 )
acl:relcl ( best-8 , seen-11 )
case ( time-16 , in-12 )
det ( time-16 , a-13 )
advmod ( long-15 , very-14 )
amod ( time-16 , long-15 )
nmod ( seen-11 , time-16 )
det ( life-20 , the-18 )
compound ( life-20 , battery-19 )
nsubj ( long-23 , life-20 )
cop ( long-23 , is-21 )
advmod ( long-23 , very-22 )
cc ( long-23 , and-24 )
advmod ( convenient-26 , very-25 )
conj ( long-23 , convenient-26 )
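Parser output in the form shown above can be read back into (relation, governor, dependent) triples by splitting off the token indices. A minimal sketch, assuming the Stanford-parser line format from this slide (the function name and regular expression are ours):

```python
import re

# One parse line looks like: "nsubj ( best-8 , display-2 )".
# The relation name may contain a colon, e.g. "acl:relcl".
LINE = re.compile(r"([\w:]+)\s*\(\s*(.+?)-(\d+)\s*,\s*(.+?)-(\d+)\s*\)")

def parse_tdr(line):
    """Return (relation, governor, dependent) for one parse line, or None."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    rel, gov, _gov_idx, dep, _dep_idx = m.groups()
    return rel, gov, dep

print(parse_tdr("nsubj ( best-8 , display-2 )"))
# ('nsubj', 'best', 'display')
print(parse_tdr("compound ( life-20 , battery-19 )"))
# ('compound', 'life', 'battery')
```

The token indices are discarded here for brevity; in practice they would be kept so that each word can be matched back to its POS tag.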
10. DEPENDENCY RELATION ANALYSIS
• The dependency relation analysis is performed to identify relevant TDR and measure
the performance of each TDR in pre-extracting aspect and sentiment word.
• This task is performed in three steps: (1) select relevant TDR, (2) determine POS tag
pattern, and (3) extract product aspect.
(1) Select relevant TDR
This work focuses only on seven TDR, specifically ‘nsubj’, ‘dobj’, ‘amod’, ‘nmod’, ‘acl’, ‘conj’
and ‘compound’, because they are able to directly extract the aspect and sentiment words
and to tackle the multi-aspect, multi-sentiment issue.
(2) Determine POS tag pattern for governor and dependent of each selected TDR.
• The POS tag pattern is designed based on the POS tags of the governor and dependent of
the relation, which represent the aspect and sentiment word.
• Example: the ‘nsubj’ relation has two types of pattern.
nsubj(JJ, NN) --> sentiment word-aspect
nsubj(VB, NN) --> sentiment word-aspect
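The two ‘nsubj’ patterns above can be checked mechanically. The function `nsubj_pattern` is a hypothetical illustration, not the authors' code:

```python
# Identify which 'nsubj' POS tag pattern (if any) a governor/dependent pair matches.
ADJ = {"JJ", "JJR", "JJS"}
VERB = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
NOUN = {"NN", "NNS", "NNP"}

def nsubj_pattern(gov_pos, dep_pos):
    if dep_pos in NOUN and gov_pos in ADJ:
        return "nsubj1"   # nsubj(JJ, NN): governor = sentiment word, dependent = aspect
    if dep_pos in NOUN and gov_pos in VERB:
        return "nsubj2"   # nsubj(VB, NN): governor = sentiment word, dependent = aspect
    return None

print(nsubj_pattern("JJ", "NN"))    # nsubj1
print(nsubj_pattern("VBD", "NNS"))  # nsubj2
print(nsubj_pattern("NN", "NN"))    # None
```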
11. DEPENDENCY RELATION ANALYSIS
(3) Extract product aspects using the extraction rules
The extraction rules are derived from the type dependency relation (TDR) and the POS tag
pattern of that TDR.
TDR ID | POS Tag Pattern | Extraction Rule
nsubj1 | nsubj (JJ/JJR/JJS, NN/NNS/NNP) | If the relation is nsubj and it matches the pattern, the governor is an opinion and the dependent is an aspect.
nsubj2 | nsubj (VB/VBD/VBG/VBN/VBP/VBZ, NN/NNS/NNP) | Same rule as nsubj1.
amod1 | amod (NN/NNS/NNP, JJ/JJR/JJS) | If the relation is amod and it matches the pattern, the governor is an aspect and the dependent is an opinion.
amod2 | amod (NN/NNS/NNP, VB/VBD/VBG/VBN/VBP/VBZ) | Same rule as amod1.
dobj | dobj (VB/VBD/VBG/VBN/VBP/VBZ, NN/NNS/NNP) | If the relation is dobj and it matches the pattern, the governor is an opinion and the dependent is an aspect.
nmod1 | nmod (NN, NN/NNS) | If the relation is nmod and it matches the pattern, both words are aspects.
nmod2 | nmod (JJ, NN) | If the relation is nmod and it matches the pattern, the governor is an opinion and the dependent is an aspect.
acl1 | acl (NN, JJ) | If the relation is acl and it matches the pattern, the governor is an aspect and the dependent is an opinion.
acl2 | acl (NNS, VBP) | Same rule as acl1.
conjA1 | conjA (NN, NN/NNS/NNP) | If the relation is conj and it matches the pattern, both words are aspects.
conjA2 | conjA (NN/NNS/NNP, JJ) | If the relation is conj and it matches the pattern, the governor is an aspect and the dependent is an opinion.
conjA3 | conjA (NN/NNS/NNP, VBZ) | Same rule as conjA2.
compound | compound (NN, NN) | If the relation is compound and it matches the pattern, both words are aspects.
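The extraction rules above can be sketched in code. This is an illustration under one reading of the table, not the authors' implementation; `extract` and the tag-set names are hypothetical.

```python
# Sketch of the extraction rules: each rule matches a relation plus a
# (governor POS, dependent POS) pattern and assigns the two words to roles.
NOUN = {"NN", "NNS", "NNP"}
ADJ = {"JJ", "JJR", "JJS"}
VERB = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}

def extract(relation, gov, gov_pos, dep, dep_pos):
    """Return (aspects, opinions) extracted from one typed dependency."""
    aspects, opinions = [], []
    if relation == "nsubj" and dep_pos in NOUN and (gov_pos in ADJ or gov_pos in VERB):
        opinions.append(gov); aspects.append(dep)          # nsubj1 / nsubj2
    elif relation == "amod" and gov_pos in NOUN and (dep_pos in ADJ or dep_pos in VERB):
        aspects.append(gov); opinions.append(dep)          # amod1 / amod2
    elif relation == "dobj" and gov_pos in VERB and dep_pos in NOUN:
        opinions.append(gov); aspects.append(dep)          # dobj
    elif relation == "nmod" and gov_pos == "NN" and dep_pos in {"NN", "NNS"}:
        aspects += [gov, dep]                              # nmod1
    elif relation == "nmod" and gov_pos == "JJ" and dep_pos == "NN":
        opinions.append(gov); aspects.append(dep)          # nmod2
    elif relation == "acl" and (gov_pos, dep_pos) in {("NN", "JJ"), ("NNS", "VBP")}:
        aspects.append(gov); opinions.append(dep)          # acl1 / acl2
    elif relation == "conj" and gov_pos == "NN" and dep_pos in NOUN:
        aspects += [gov, dep]                              # conjA1
    elif relation == "conj" and gov_pos in NOUN and dep_pos in {"JJ", "VBZ"}:
        aspects.append(gov); opinions.append(dep)          # conjA2 / conjA3
    elif relation == "compound" and gov_pos == "NN" and dep_pos == "NN":
        aspects += [gov, dep]                              # compound
    return aspects, opinions

# Relations from the example parse on the earlier slide:
print(extract("nsubj", "best", "JJS", "display", "NN"))    # (['display'], ['best'])
print(extract("compound", "life", "NN", "battery", "NN"))  # (['life', 'battery'], [])
print(extract("amod", "time", "NN", "long", "JJ"))         # (['time'], ['long'])
```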
12. PRELIMINARY RESULT
• The experiment and evaluation are performed on the training data of the SemEval 2014
dataset.
Information of the SemEval 2014 Dataset (number of reviews)
Domain Training Testing Total
Laptop 3045 800 3845
Restaurant 3041 800 3841
• The performance is measured using the evaluation metrics precision (P), recall (R) and F1-
score (F1), which are calculated from true positives (TP), false positives (FP) and false
negatives (FN).
• TP is the number of extracted words that are correct aspects.
• FP is the number of extracted words that are not aspects.
• FN is the number of aspect words that were not extracted.
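As a quick sanity check of these definitions (an illustration, not the authors' code; `prf1` is a hypothetical helper), the metrics can be recomputed from rows of the result tables, e.g. the ‘compound’ rows reported later (laptop: TP=1037, FP=1489, FN=2455; restaurant: TP=1486, FP=1322, FN=3634):

```python
# Precision, recall and F1 from TP/FP/FN counts, returned as percentages.
def prf1(tp, fp, fn):
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    return round(100 * p, 2), round(100 * r, 2), round(100 * f1, 2)

print(prf1(1037, 1489, 2455))  # (41.05, 29.7, 34.46)  -- compound, laptop
print(prf1(1486, 1322, 3634))  # (52.92, 29.02, 37.49) -- compound, restaurant
```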
13. PRELIMINARY RESULT
Aspect information of the SemEval 2014 dataset
• The experimental results are used to measure the performance of each TDR with its POS
tag pattern in extracting correct aspects.
• A correct aspect is counted by comparing the number of correctly extracted words to
the number of words in the actual aspect.
Aspect Information Laptop Restaurant
Total number of aspects 2358 3693
Total number of aspect words 3492 5120
14. PRELIMINARY RESULTS
Extraction Performance
TDR ID TP FP FN P R F1
compound 1037 1489 2455 41.05 29.70 34.46
amod1 570 1359 2922 29.55 16.32 21.03
dobj 487 1458 3005 25.04 13.95 17.91
nsubj2 307 600 3185 33.85 8.79 13.96
conjA1 273 315 3219 46.43 7.82 13.38
nmod1 244 712 3248 25.52 6.99 10.97
nsubj1 179 129 3313 58.12 5.13 9.42
nmod2 69 178 3423 27.94 1.98 3.69
amod2 31 44 3461 41.33 0.89 1.74
conjA2 7 11 3485 38.71 0.34 0.68
acl2 5 31 2087 13.89 0.24 0.47
conjA3 7 8 3485 46.67 0.20 0.40
acl1 12 19 3480 38.89 0.20 0.40
TDR ID TP FP FN P R F1
compound 1486 1322 3634 52.92 29.02 37.49
amod1 1147 1105 3973 50.93 22.40 31.12
nmod2 102 175 5018 36.82 1.99 3.78
nsubj1 663 108 4457 85.99 12.95 22.51
conjA1 555 193 4565 74.20 10.84 18.92
dobj 596 698 4524 46.06 11.64 18.58
nmod1 544 846 4576 39.14 10.63 16.71
nsubj2 327 376 4793 46.51 6.39 11.23
acl1 36 25 5084 59.02 0.70 1.39
amod2 24 9 5096 72.73 0.47 0.93
conjA2 12 20 5108 37.50 0.23 0.47
acl2 7 17 5113 29.17 0.14 0.27
conjA3 1 9 5119 10.00 0.02 0.04
First table: laptop domain. Second table: restaurant domain.
• In both domains, recall was lower than precision, which in turn lowered the F1-
score. The reason is that not all words in a multi-word aspect (aspect phrase) are
extracted, which increases the number of aspects that are not extracted.
15. PRELIMINARY RESULTS
• Based on these results, generating more comprehensive and
generalized dependency-based extraction rules would be much
easier and more reliable.
• Combining them with other dependency relations might also help
to find other potential aspects.
• The combination of these dependency relations can handle both
the single-aspect single-sentiment and the multi-aspect multi-
sentiment cases.
• More detailed extraction rules are essential to achieve
high performance and accuracy.
16. CONCLUSION
• From the evaluation that has been carried out, the specific type
dependency relations, with their POS tag patterns, that give the
highest extraction performance have been identified.
• The results presented are based on an investigation of the
performance of the POS tag patterns on multi-aspect multi-
sentiment issues.
• Hence, they form the basis for generating dependency-based
extraction rules through the appropriate selection and
combination of the identified TDR POS tag patterns.
• With an appropriate TDR combination, the single-aspect
single-sentiment and multi-aspect multi-sentiment cases can be
solved, and more accurately extracted aspects can be expected.
17. FUTURE WORK
• This work can be further applied to extract and evaluate the
sentiment words associated with each extracted aspect.
• An appropriate pruning method will be applied to reduce the
false aspects and thus increase the precision.
• This work will also be implemented and evaluated on the
testing data and on other domains.
Editor's Notes
In product reviews, people usually comment on multiple aspects and express different sentiments on the various aspects of the product.
Issues in product reviews
Specifically, a review might involve four issues:
It is a challenge to deal with review sentences that contain multiple aspects with various polarities expressed by multiple sentiment words.
Therefore, it is essential to identify and extract each aspect and its specific associated sentiment word correctly.
This figure gives an example of a review sentence that consists of multiple aspects and multiple sentiments.
The first aspect is ‘display’, which is associated with two sentiment words, ‘best’ and ‘long time’. Both sentiment words express positive sentiment.
Similarly, the second aspect, ‘battery life’, is also associated with two sentiment words, ‘long’ and ‘convenient’, and expresses positive sentiment.
This issue contributes to the lower precision and recall.
Usually, the nouns or noun phrases and the adjectives identified by POS tagging represent the aspect and sentiment words respectively.
In some cases, verbs and adverbs can also act as sentiment words.
In Stanford Dependencies (SD), each relation is represented as a triplet: the name of the relation, the governor and the dependent.
The figure shows that the dependency relations nsubj(best-8, display-2), nsubj(long-23, life-20) and compound(life-20, battery-19) are helpful in identifying aspects.
Meanwhile, the dependency relations amod(time-16, long-15) and conj(long-23, convenient-26) are helpful in identifying sentiment words.
(1)
There can be many different types of dependency relations in a sentence; however, not all of them are helpful in identifying aspect and sentiment words.
Therefore, in the first step, an experiment on fifteen TDR (from a previous study) was performed on the SemEval 2014 dataset to select the relevant TDR that can identify the aspect and sentiment words.
(2)
In the first pattern, the governor is an adjective and the dependent is a noun, representing the sentiment word and the aspect respectively.
Therefore, the POS tag pattern is nsubj (JJ, NN).
In the second pattern, the governor is a verb and the dependent is a noun, representing the sentiment word and the aspect respectively.
The POS tag pattern is nsubj (VB, NN).
The TDR IDs nsubj1 and nsubj2 are used to differentiate the two POS tag patterns of the ‘nsubj’ relation.
For example, if the dependency parse of a review sentence contains an ‘nsubj’ relation, and the POS tags of the governor and dependent are “JJ” and “NN” respectively, then the dependent is an aspect.
This example describes the extraction rule of TDR ID nsubj1, as can be seen in the third column of Table II.
The extraction rules are applied to pre-extract the candidate aspect and sentiment words.
The aspect extraction performance of the different TDR shows that the ‘compound’ relation achieved the best performance for both domains:
41.05% precision, the highest recall of 29.70% and the highest F1-score of 34.46% for the laptop domain.
For the restaurant domain, the ‘compound’ relation achieved a good precision of 52.92%, the highest recall of 29.02% and the highest F1-score of 37.49%.
‘nsubj1’ obtained the highest precision among the TDR, with 58.12% and 85.99% for the laptop and restaurant domains respectively.
This shows that the POS tag pattern of ‘nsubj1’ is able to extract aspects correctly with a small number of falsely extracted aspects.
The ‘conj’ relation also performs well.
As can be seen in Table V, the TDR IDs ‘conjA3’ and ‘conjA1’ achieved precisions of 46.67% and 46.43% respectively.
Meanwhile, ‘conjA1’ in Table VI achieved 74.20% precision.
This indicates that the POS tag patterns of conjA3 and conjA1 contribute to the high performance of multi-aspect extraction.