Article Open Access

Polarity Analysis of Customer Reviews Based on Part-of-Speech Subcategory

  • Ayman S. Ghabayen and Basem H. Ahmed
Published/Copyright: August 15, 2019

Abstract

Nowadays, sentiment analysis is a method used to analyze the sentiment of the feedback given by a user in an online document, such as a blog, comment, or review, and to classify it as negative, positive, or neutral. The classification process relies upon the analysis of the polarity features of the natural language text given by users. Polarity analysis is an important subtask in sentiment analysis; however, detecting the correct polarity remains a major issue. Different researchers have utilized different polarity features, such as the standard part-of-speech (POS) tags: adjectives, adverbs, verbs, and nouns. However, there seems to be a lack of research focusing on the subcategories of these tags. The aim of this research was to propose a method that better recognizes the polarity of natural language text by utilizing different polarity features based on the standard POS categories and their subcategory combinations in order to explore the specific polarity of text. Several experiments were conducted to examine and compare the efficacy of the proposed method in terms of F-measure, recall, and precision using an Amazon dataset. The results showed that JJ + NN + VB + RB + VBP + RP, a POS subcategory combination, obtained better accuracy than the baseline approaches, improving on them by 4.4% in terms of F-measure.

1 Introduction

Recent years have brought significant growth in social media websites across the Internet. Users generate huge amounts of data, not only through discussions and personal notes but also, on a mass scale, by sharing what they think and feel about products, services, issues, events, and policies on e-commerce websites. Such a contribution is called a user opinion or review, and it reflects the mood of the writer or the attitude of the speaker. User-generated data have therefore become a very important source for business intelligence decision-making processes, helping organizations improve their products. In addition, reading the reviews of other consumers can help prospective buyers make up their minds about particular products before deciding to purchase them. The positive or negative feeling expressed by people is known as sentiment.

Sentiment analysis (SA), also known as opinion mining, is a natural language processing approach that analyzes people’s attitudes toward a specific product or topic [15]. SA aims at the automatic analysis of an online document, such as a blog, comment, review, or news item, to extract its overall sentiment and summarize it as positive, negative, or neutral [15]. SA can be used in different areas, such as predicting election results [25], providing companies and organizations with information about their products [8], automating product summaries or outlines from reviews, or even predicting the stock market from online purchasing sites [8]. SA can be viewed as a classification process, as demonstrated in Figure 1.

Figure 1: Sentiment Analysis Process [13].

According to Refs. [10], [13], there are three levels on which SA can be conducted:

  1. Document level: This approach considers the entire document (e.g. a comment or review) as the basic information unit and then classifies it as positive, negative, or neutral. However, in some cases the results given by this approach are imprecise; for example, a document classified as positive toward a particular item does not indicate that the author has only positive opinions about all features of that item. Similarly, a document classified as negative toward an item does not indicate that the author is completely negative about all its features. Typically, authors convey both positive and negative sentiments about a particular item and its features.

  2. Sentence level: This approach attempts to establish the opinion expressed in each sentence by breaking the entire document into sentences, with each sentence handled as a separate information unit. It is first recognized whether a sentence is subjective or objective, and then it is decided whether the sentence conveys a positive or a negative opinion.

  3. Aspect level: This approach performs a fine-grained analysis to identify the relevant aspects and entities of a particular item and the sentiment/polarity expressed toward each aspect. Here, an aspect refers to a feature of a particular item, e.g. the battery life of a mobile phone.

The fundamental problem of SA is sentiment polarity categorization [10], [11], [13], [17]. Much research in SA has been conducted to classify a given textual content with respect to the opinions expressed in it, utilizing different polarity features such as part-of-speech (POS) tags [4], [6], [18], [21], [24]. The aim of this study was to present lexicon-based methods for sentiment polarity categorization based on polarity features, such as adjectives, adverbs, verbs, and nouns, together with combinations among these features and all their subfeatures. The rest of this paper is structured as follows. Section 2 presents the related work and the existing methods and techniques used in SA. Section 3 explains the proposed methods, the different processing steps performed, the proposed framework, the dataset, and the evaluation metrics used. Section 4 presents and discusses the results of the proposed methods. Lastly, Section 5 gives the conclusion and our recommendations for future work in this field.

2 Related Works

Generally, one of the fundamental problems in recent SA research is the categorization of sentiments. This process relies on assigning opinion polarity to the words and phrases that express sentiments in order to decide the subjectivity/objectivity orientation of a document, such as positive or negative (or neutral) [4]. Sentiment classification techniques described in the literature can be roughly divided into two approaches: supervised and unsupervised SA [16]. This division relates to the methodology whereby the application makes the sentiment classification. Supervised learning applies machine learning methods, such as support vector machine, maximum entropy, k-nearest neighbor, naïve Bayes, decision tree, and artificial neural network [2], [5], [16], [18]. The supervised learning approach uses manually annotated, labeled training data and testing data to resolve the SA task. In addition, it uses a set of linguistic and/or syntactic feature vectors extracted directly from the original feedback sentences in order to make a classification decision [5], [11].

In contrast, the unsupervised approach does not need labeled training data; the methodology in this research follows the unsupervised approach. The main approaches in this methodology are linguistics based and lexicon based [7]. The lexicon-based approach involves the statistical calculation of sentiments from the semantic orientation of the words or phrases that occur in the text [22]. The basic assumption underlying lexicon-based approaches is that the most essential indications of sentiment in natural language text are the words that express sentiments, also called opinion words. This approach requires a pre-compiled dictionary of positive and negative terms. SentiWordNet is a well-known lexical resource that was explicitly developed for supporting sentiment classification and analysis applications [24]. The linguistics-based approach to SA does not simply divide natural language text into its constituent words and sentences; it also identifies their syntactic constructions in order to locate the syntactic POS categories, such as adjective or verb phrases, that are most likely to express an opinion. The sentiment classification of a textual content considers the polarity score of each word in the text. For instance, if a word is matched with a positive sentiment score in the lexicon, the positive polarity score of the content is increased. Subsequently, if the positive polarity score is larger than the negative one, the content is considered positive; otherwise, the content is considered negative.
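As a rough illustration of this lexicon-based scoring scheme, the following Python sketch accumulates positive and negative word scores and compares their sums; the `lexicon` dictionary and its example entries are hypothetical stand-ins for a real resource such as SentiWordNet.

```python
# Minimal sketch of lexicon-based polarity scoring as described above.
# `lexicon` is a hypothetical pre-compiled dictionary mapping a word to a
# (positive_score, negative_score) pair; it stands in for a resource such
# as SentiWordNet.
def classify_text(tokens, lexicon):
    pos_total, neg_total = 0.0, 0.0
    for word in tokens:
        if word in lexicon:
            pos_score, neg_score = lexicon[word]
            pos_total += pos_score
            neg_total += neg_score
    # The text is positive when its accumulated positive score exceeds
    # the negative one; otherwise it is treated as negative.
    return "positive" if pos_total > neg_total else "negative"

print(classify_text(["great", "battery", "terrible", "screen"],
                    {"great": (0.75, 0.0), "terrible": (0.0, 0.625)}))
```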

A large volume of research has focused on finding commonly used terms that express sentiment in online reviews using the lexicon-based approach and natural language processing [1], [4], [21]. Thet et al. [23] suggested a linguistic approach for SA of discussion posts on conversation forums, working at the clause level of opinion analysis. They employed SentiWordNet to obtain prior word sentiment scores, with a domain-specific lexicon (movie review domain) built for this purpose. They then identified the sentiment score of each clause by inspecting the syntactic dependencies of words, relying on syntactic dependency trees and pattern rules. Sarkar et al. [18] proposed an SA approach deploying linguistic features, such as adverb-adjective-noun-verb combinations, at the document level, where a set of well-identified axioms was employed to compute the SA function value. Bethard et al. [1] extracted opinions at the sentence level by utilizing the combination of adverbs and adjectives. Chesley et al. [4] utilized linguistic features, such as adjectives and verbs, in order to classify blog sentiment; the Wikipedia dictionary was utilized for determining the polarity scores of the blog content.

Although plenty of work has been done in SA using natural language POS features, such as adjectives, adverbs, verbs, and nouns, little research so far has considered all the subcategories of POS tags. For example, the adjective category is syntactically subdivided into comparative and superlative forms. Table 1 lists a summary of related methods.

Table 1: Summary of the Reported Methods.

Method Description Used by
Linguistics based This method identifies the syntactic constructions of a natural language text in order to locate the syntactic POS phrases that are most likely to express an opinion Sarkar et al. [18], Bethard et al. [1], Chesley et al. [4], Thet et al. [23]
Lexicon based This method involves statistical calculation of sentiments from the semantic orientation of words or phrases that occur in a text Tomar and Sharma [24], Taboada et al. [22]

This study focuses on the feature evaluation of the standard POS categories (adjectives, adverbs, verbs, and nouns) and all their subcategories, which indicate the property and informativeness of a word in a natural language text. Then, a comprehensive study of combinations between these subfeatures is presented to show the strength of such features in sentiment polarity detection.

3 Materials and Methods

This section comprises two main subsections. The first subsection presents the proposed approach to better recognize the polarity of natural language text, with details of the processing, which comprises seven steps. The second subsection presents the dataset and the evaluation metrics used to evaluate the proposed approach.

3.1 Proposed Approaches

The aim of this research is to propose methods to improve the classification efficiency of identifying the sentiment expressed in a text by using the SentiWordNet lexicon and natural language POS. Figure 2 presents the proposed framework for sentiment classification, which consists of several steps, such as data pre-processing, sentiment tokenization, POS tagging, sentiment negation, term score, feature selection, sentiment classification, accuracy metric, and result comparison.

Figure 2: Sentiment Classification Framework.

3.1.1 Data Pre-processing

The first step is collecting the reviews from the Amazon dataset. Before classifying a text, it is important to process it. The second step therefore focuses on pre-processing and cleaning redundant data in the dataset. First, punctuation standardization is performed so that writing rules are respected. Thereafter, all non-alphabetical characters, such as numbers and emoticons (e.g. smileys), as well as periods, hyphens, and apostrophes, are removed from each review. Then, all words in the review are converted into lowercase letters. The resulting words and phrases are used for further processing.
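The following Python sketch illustrates this pre-processing step (removal of non-alphabetical characters and lowercasing); the regular expressions are illustrative assumptions, not the authors' exact implementation.

```python
import re

# Minimal pre-processing sketch following the steps described above:
# strip non-alphabetical characters (numbers, emoticons, periods, hyphens,
# apostrophes) and lowercase the text.
def preprocess(review_text):
    cleaned = re.sub(r"[^A-Za-z\s]", " ", review_text)  # drop non-alphabetical characters
    cleaned = re.sub(r"\s+", " ", cleaned).strip()      # collapse repeated whitespace
    return cleaned.lower()                              # lowercase every word

print(preprocess("Great camera!!! Battery lasts 10 hrs :-)"))
# -> "great camera battery lasts hrs"
```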

3.1.2 Sentiment Tokenization

In this step, the review text is tokenized by breaking it up into sentences (based on the use of periods) and each sentence into words. After the text is separated into tokens, the next step is typically a morphosyntactic analysis that recognizes characteristics of each token, for example its grammatical or lexical category. This analysis is well known as POS tagging.
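A minimal tokenization sketch, using NLTK's sentence and word tokenizers as an illustrative choice (the paper does not prescribe a specific tokenizer):

```python
from nltk.tokenize import sent_tokenize, word_tokenize  # requires nltk.download('punkt')

# Split a review into sentences (period-based) and each sentence into word tokens.
review = "the camera is great. the battery is not good."
sentences = sent_tokenize(review)
tokens = [word_tokenize(sentence) for sentence in sentences]
print(tokens)
```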

The steps of the sentiment classification framework are presented in further detail in the following subsections.

3.1.3 POS Tagger

A POS tagger is a linguistic tool in natural language processing that reads textual content in a given language as input and assigns a POS tag to every single word in a sentence, such as noun, verb, adjective, etc. Generally, there are eight parts of speech in the English language: adjective, noun, verb, pronoun, adverb, preposition, conjunction, and interjection. The POS tagger employed in this research is the Stanford POS tagger [12]. The tagger provides 36 different tags and can therefore recognize more detailed syntactic categories than the basic eight. Table 2 presents the list of POS tags considered in this work. POS tags indicate the informativeness of a word; thus, they can be used to calculate the term scores in the classification process.

Table 2: POS Tags and Abbreviations.

POS tags Definition SentiWordNet abbr.
NN Noun, singular or mass N
NNP Proper noun, singular N
NNPS Proper noun, plural N
NNS Noun, plural N
VB Verb, base form V
VBD Verb, past tense V
VBG Verb, gerund/present participle V
VBN Verb, past participle V
VBP Verb, non-third person singular present V
VBZ Verb, third person singular present V
RB Adverb R
RBR Adverb, comparative R
RBS Adverb, superlative R
JJ Adjective A
JJR Adjective, comparative A
JJS Adjective, superlative A
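The following sketch illustrates the POS tagging step of Section 3.1.3. The paper uses the Stanford POS tagger [12]; NLTK's default tagger is used here only as a stand-in, since it also emits the Penn Treebank tags listed in Table 2.

```python
import nltk  # requires nltk.download('averaged_perceptron_tagger')

# Assign a Penn Treebank POS tag (NN, JJ, RB, VB, ...) to every token.
tokens = ["the", "battery", "lasts", "surprisingly", "long"]
print(nltk.pos_tag(tokens))
# e.g. [('the', 'DT'), ('battery', 'NN'), ('lasts', 'VBZ'),
#       ('surprisingly', 'RB'), ('long', 'RB')]
```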

3.1.4 Negation

Negation inverts the current polarity of the words paired with it in a sentence. Negation words, such as “not,” “can’t,” and “don’t,” act as a polarity inverter of the word that is paired with them. For example, whenever a negation word is paired with a positive word, the polarity turns negative; conversely, when a negation word is paired with a negative word, the polarity turns positive. Accordingly, negation words must be appropriately treated in sentiment classification. This process adapts the work performed by Pang et al. [16]. For the negation handling process, a list of negation words is compiled and each sentence in the review is checked for the occurrence of a negation word. If a negation word is found, the negation process is applied. The set of negation words is defined as follows:

(1) $\mathrm{NegW} = \{\text{set of negation words}\}.$

If a negation word is found in the sentence, the positive score and the negative score of the opinion terms in that sentence are swapped. For each term in the sentence, we apply the following equation:

(2) $\mathrm{Score}(\mathrm{term}_x) = \begin{cases} \mathrm{Pos.Score}(\mathrm{term}_x) = \mathrm{Neg.Score}(\mathrm{term}_x) \\ \mathrm{Neg.Score}(\mathrm{term}_x) = \mathrm{Pos.Score}(\mathrm{term}_x) \end{cases} \quad \mathrm{term}_y \in \mathrm{NegW},$

where $\mathrm{term}_x$ is any term in the tokenized sentence and $\mathrm{term}_y$ is a negation word occurring in that sentence.
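A minimal sketch of this negation handling, assuming an illustrative negation word list and a per-sentence dictionary of (positive, negative) term scores:

```python
# Sketch of the negation handling in Eqs. (1)-(2): if a negation word occurs
# in a sentence, the positive and negative scores of every opinion term in
# that sentence are swapped. The negation list and score container are
# illustrative assumptions.
NEGATION_WORDS = {"not", "no", "never", "can't", "don't", "won't"}

def apply_negation(sentence_tokens, scores):
    """scores: dict mapping term -> (pos_score, neg_score)."""
    if NEGATION_WORDS.isdisjoint(sentence_tokens):
        return scores  # no negation word: leave scores unchanged
    # swap positive and negative scores for every term in the sentence
    return {term: (neg, pos) for term, (pos, neg) in scores.items()}

print(apply_negation(["the", "camera", "is", "not", "good"],
                     {"good": (0.75, 0.0)}))
# -> {'good': (0.0, 0.75)}
```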

3.1.5 Term Score

This process takes the review’s corresponding feature vector, which represents the POS tags, and produces a set of sentiment scores. Each sentiment-carrying word within the input review is assigned a term score from the sentiment lexicon, which is the most essential resource for most SA methods. In this research, SentiWordNet is used; it annotates the synsets of WordNet with “positive,” “negative,” and “neutral” notations. Each set of terms sharing the same meaning (a synset) is linked with three numerical scores, Pos.Score(term_x), Neg.Score(term_x), and Obj.Score(term_x), for positive, negative, and neutral, respectively. Each score lies between 0 and 1, indicating the negative or positive bias of the term, and the three scores sum to 1 according to the following formula:

(3) $\mathrm{Pos.Score}(\mathrm{term}_x) + \mathrm{Neg.Score}(\mathrm{term}_x) + \mathrm{Obj.Score}(\mathrm{term}_x) = 1.$

If a word is not found in the SentiWordNet lexicon, the term score is calculated using the WordNet lexicon by collecting the corresponding synonym sets (synsets) of the target word based on its POS tag. For example, if the target word is an adjective, all the synonym sets tagged with an adjective POS in WordNet are collected. We believe that this procedure might improve accuracy, as it could possibly overcome the ambiguity and variety of the vocabulary. In this case, the term score is determined by the maximum absolute value of the difference between the positive and negative scores over the collected synsets. The term score for such a word is calculated as follows: let $S_x$ be the set of synsets (synonyms) of the term; then

(4) $\mathrm{Score}(\mathrm{term}_x) = \max_{\mathrm{synset} \in S_x} \left| \mathrm{Pos.Score}(\mathrm{synset}) - \mathrm{Neg.Score}(\mathrm{synset}) \right|,$

where Pos.Score and Neg.Score are the positive and negative scores of the synonym synsets collected for the term.
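A simplified sketch in the spirit of Eqs. (3) and (4), using NLTK's interface to SentiWordNet (an implementation assumption; the paper does not name a specific programming interface):

```python
from nltk.corpus import sentiwordnet as swn  # requires nltk.download('sentiwordnet') and 'wordnet'

# Take the maximum absolute difference between positive and negative scores
# over the synsets of a word for a given SentiWordNet POS letter from
# Table 2 ('a' adjective, 'n' noun, 'v' verb, 'r' adverb).
def term_score(word, pos_letter):
    synsets = list(swn.senti_synsets(word, pos_letter))
    if not synsets:
        return 0.0  # word not covered by the lexicon
    return max(abs(s.pos_score() - s.neg_score()) for s in synsets)

print(term_score("good", "a"))
```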

3.1.6 Feature Selection

This study focuses on feature extraction restricted to different types of POS (adjectives, adverbs, verbs, and nouns). The feature selection process receives the tokenized, tagged terms, each with a term score corresponding to one of the POS tags shown in Table 2. Then, to compare the effect of each POS tag on sentiment polarity, each tag feature is selected separately, and combinations of the best-performing POS tags are generated as new feature sets.
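A minimal sketch of this feature selection step, assuming tagged and scored tokens and taking the JJ + NN + VB + RB + VBP + RP combination of Table 5 as an example feature set:

```python
# Keep only the (token, tag, score) triples whose POS tag belongs to the
# feature set under evaluation; the example set below is one of the
# combinations studied in the paper (Table 5, set no. 3).
FEATURE_SET = {"JJ", "NN", "VB", "RB", "VBP", "RP"}

def select_features(tagged_scored_tokens, feature_set=FEATURE_SET):
    """tagged_scored_tokens: iterable of (token, penn_tag, term_score)."""
    return [(tok, tag, score)
            for tok, tag, score in tagged_scored_tokens
            if tag in feature_set]
```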

3.1.7 Sentiment Classification

Sentiment classification relies on the selected feature set that shows a user’s opinion. Each review comprises a variety of opinion words of variable sequence. The selected feature set consists of different grammatical words, as mentioned in Table 2. In this study, the sentiment classification approach is the unsupervised lexicon-based approach. The sentiment classification of review R is calculated by the difference between the summations of its positive and negative term scores, as follows:

(5) $\mathrm{SentiScore}(R) = \dfrac{\sum_{pos=1}^{p} \mathrm{Score}(\mathrm{term}_{pos}) - \sum_{neg=1}^{n} \mathrm{Score}(\mathrm{term}_{neg})}{n + p},$

where p denotes the total number of positive terms and n denotes the total number of negative terms. A longer review R may contain more terms that are regarded as positive or negative. Therefore, in order to compare the sentiment polarity of reviews of different lengths, the SentiScore is normalized by dividing it by the number of sentiment terms in R, with the intention of dampening the impact of the review size on its score. The normalized SentiScore values lie within the interval [−1, 1]. Thus, a review is classified as positive if its normalized SentiScore is positive; alternatively, it is classified as negative if its normalized SentiScore is negative.
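A minimal sketch of Eq. (5) and the resulting classification rule, assuming the positive and negative term scores of a review have already been collected:

```python
# Normalized SentiScore of Eq. (5): sum of positive term scores minus sum of
# negative term scores, divided by the total number of sentiment terms.
def senti_score(positive_scores, negative_scores):
    p, n = len(positive_scores), len(negative_scores)
    if p + n == 0:
        return 0.0  # no sentiment-bearing terms found
    return (sum(positive_scores) - sum(negative_scores)) / (n + p)

def classify_review(positive_scores, negative_scores):
    return "positive" if senti_score(positive_scores, negative_scores) > 0 else "negative"

print(classify_review([0.75, 0.5], [0.25]))  # -> "positive"
```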

3.2 Dataset and Evaluation Matrices

The proposed methods were examined on the Amazon reviews dataset released by the Association for Computational Linguistics [3]. Each review consists of the user comments (opinion text), a reviewer name and location, a product name, a review title and date, and a numerical rating on a scale from 1 to 5 stars. A low rating (1 star) indicates an extremely negative opinion, and a very high rating (5 stars) reflects an extremely positive opinion of the product. To prepare the dataset, all reviews with a user rating greater than 3 were labeled as positive and those with a rating lower than 3 were labeled as negative. Reviews with a rating of 3, which are considered neutral, were ignored because they lie near the boundary of a binary classifier, under the assumption that there is less to learn from neutral texts than from texts with a clear positive or negative sentiment [20].
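The labeling rule can be summarized by the following sketch:

```python
# Labeling rule used to prepare the Amazon dataset: ratings above 3 stars are
# positive, below 3 stars negative, and 3-star (neutral) reviews are dropped.
def label_review(star_rating):
    if star_rating > 3:
        return "positive"
    if star_rating < 3:
        return "negative"
    return None  # neutral reviews are ignored

print([label_review(r) for r in (5, 3, 1)])  # -> ['positive', None, 'negative']
```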

The dataset was pre-processed and cleaned from its raw form. For instance, HTML tags were eliminated using an HTML parser, and all punctuation signs and numbers were removed. Experiments were conducted on a dataset of 21,972 positive and 16,576 negative reviews, a total of 38,548 reviews. The experimental dataset is a benchmark dataset that consists of 26 different product types from different domains. Table 3 presents a summary of the dataset used.

Table 3: Amazon Product Dataset Summary.

Domain Positive Negative
Apparel 1000 1000
Automotive 584 152
Baby 1000 900
Beauty 1000 493
Books 1000 1000
Camera & photo 1000 999
Cell phones & service 639 384
Computer & video games 1000 458
DVD 1000 1000
Electronics 1000 1000
Gourmet food 1000 208
Grocery 1000 352
Health & personal care 1000 1000
Jewelry & watches 1000 292
Kitchen & housewares 1000 1000
Magazines 1000 970
Music 1000 1000
Musical instruments 284 48
Office products 367 64
Outdoor living 1000 327
Software 1000 915
Sports & outdoors 1000 1000
Tools & hardware 98 14
Toys & games 1000 1000
Video 1000 1000
Total no. 21,972 16,576

Most SA algorithms categorize data into positive, neutral, and negative. Hence, the rule of thumb is to measure performance by examining whether the system categorizes the data in accordance with the intuition of the user. In order to examine the effectiveness of the proposed methodology, the well-known evaluation metrics precision, recall, and F-measure are used. These metrics assess correct classification in terms of true positives and true negatives, considering positive and negative reviews, respectively.
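A minimal sketch of these metrics, including the weighted F-measure (WF) reported later in Tables 4 and 6; weighting PF and NF by class support is an assumption here, as the paper only states that WF is a weighted average of the two:

```python
# Per-class precision, recall and F-measure, computed from true positives,
# false positives and false negatives with respect to that class.
def precision_recall_f(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# Weighted F-measure: average of the positive and negative F-measures,
# weighted by the number of reviews in each class (assumed weighting).
def weighted_f(f_pos, n_pos, f_neg, n_neg):
    return (f_pos * n_pos + f_neg * n_neg) / (n_pos + n_neg)
```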

4 Results and Discussion

In order to assess the performance of our proposed methods, we compared our approach with two lexicon-based classifiers. The first baseline (baseline1) is based on the idea that the polarity of a text can be given by the sum of the individual polarity values of each word or phrase present in the text, where each term is associated with numerical scores indicating positive and negative sentiment information. The total polarity of the text is considered positive if the sum of the positive word polarities is greater than the sum of the negative word polarities; otherwise, the total polarity of the text is considered negative [9], [19]. The second baseline (baseline2) counts the number of positive and negative word scores to determine the sentiment polarity of a text. In this approach, the total polarity of the text is considered positive if the count of positive word polarities is greater than the count of negative word polarities; otherwise, the total polarity of the text is considered negative [9], [14].
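A minimal sketch of the two baselines, assuming each sentiment-bearing word has already been assigned a (positive, negative) score pair:

```python
# baseline1: compare the summed positive and negative word scores.
def baseline1(scored_terms):
    pos_sum = sum(p for p, _ in scored_terms)
    neg_sum = sum(n for _, n in scored_terms)
    return "positive" if pos_sum > neg_sum else "negative"

# baseline2: compare the counts of positive and negative words.
def baseline2(scored_terms):
    pos_count = sum(1 for p, n in scored_terms if p > n)
    neg_count = sum(1 for p, n in scored_terms if n > p)
    return "positive" if pos_count > neg_count else "negative"
```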

Table 4 reports the results of each individual method. Performance was evaluated by the recall, precision, and F-measure metrics per method, where PR is the positive recall, PP the positive precision, PF the positive F-measure, NR the negative recall, NP the negative precision, and NF the negative F-measure. WF is the weighted F-measure, i.e. the weighted average of the PF and NF scores.

Table 4: Experimental Results for POS Subcategories.

Exp. PR PP PF NR NP NF WF
Baseline1 72.6 71.8 72.2 53.3 52.8 53.1 64
Baseline2 76.9 72.1 74.5 35.5 32.6 34 57.1
Adjective 73.7 69.9 71.8 57 53 55 64.6
JJ 70.4 66.5 68.4 57.7 53.5 55.6 62.9
JJS 97 12.6 22.4 38.8 3.4 6.3 15.5
JJR 0 0 0 0 0 0 0
Adverb 60.1 53.2 56.5 60.3 55.7 58 57.2
RB 58.3 50.8 54.3 62 56.4 59.1 56.4
RP 68 18.5 29.1 39.3 13.1 19.7 25.1
RBR 100 5.3 10.1 0.2 0.1 0.2 5.9
RBS 100 0.7 1.4 0 0 0 0.8
Noun 63.2 62 62.6 41.9 41.2 41.6 53.6
NN 66.6 63.1 64.9 38.8 36.9 37.9 53.3
NNS 49.5 9.2 15.6 46.3 8.9 15 15.4
NNP 31.4 2.7 5 65 5.4 10 7.2
NNPS 12.5 0.1 0.2 60 0.1 0.2 0.2
Verb 68.2 62.3 65.2 37.1 34.7 35.9 52.7
VB 65.9 53.2 58.9 33.3 29.2 31.2 47
VBP 60.3 44.6 51.3 48.7 35.7 41.2 47
VBD 71.5 9.4 16.7 33.9 5.2 9.1 13.5
VBN 73.2 6.7 12.3 28.1 2.6 4.8 9.1
VBZ 39.9 0.4 0.8 52.4 0.6 1.2 1
VBG 38.5 0.2 0.4 55 0.3 0.6 0.5

The results illustrate the accuracy of the different POS tag methods. First, we observed the classification accuracy of the standard POS tags, using adjectives, adverbs, verbs, and nouns only as the scoring source. The classification method using adjectives achieved the highest accuracy of 64.6%. Second, we compared the results of the subcategories. The results showed that JJ, RB, VBP, and NN achieved the best results among the subcategories of the standard adjective, adverb, verb, and noun POS tags, respectively.

Table 5 presents POS subcategory combinations, where the subcategories that obtained the highest accuracy within each standard POS tag are selected for the combinations. For example, in the adjective category, JJ obtains the highest accuracy compared to the other adjective subcategories; similarly NN for nouns, VB for verbs, and RB for adverbs. Moreover, the subcategories that obtained the next highest accuracy within each standard POS tag are added to the combinations in order to explore their effect on classification accuracy. For example, the verb subcategory VBP obtains 47% in terms of WF score; therefore, it is selected for the subcategory combinations. The same procedure is followed for RP, JJS, and VBD in the combination process.

Table 5: Subcategory Combinations.

Set no. Feature set combinations
1. JJ, NN, VB, RB
2. JJ, NN, VB, RB, VBP
3. JJ, NN, VB, RB, VBP, RP
4. JJ, NN, VB, RB, VBP, RP, JJS
5. JJ, NN, VB, RB, VBP, RP, JJS, VBD
6. JJ, NN, VB, RB, VBP, RP, JJS, VBD, NNS
7. JJ, NN, VB, RB, VBP, RP, JJS, VBD, NNS, VBN

Table 6 depicts the WF measure for the POS subcategory combinations. The results show that the JJ + NN + VB + RB combination obtained a better accuracy of 67.5% compared to the baseline1 approach, which achieved 64.1%. Next, we compared the results of the other POS combinations in order to examine the effectiveness of each POS category combination. Adding the VBP feature to the previous combination improves accuracy by 1%. However, the other POS feature combinations do not show a significant improvement in classification accuracy; for example, the RP tag does not yield any accuracy improvement, while the JJS tag shows a small improvement of 0.1%, which cannot be considered significant. Finally, the results illustrate that adding the VBD, NNS, and VBN POS features to the combination has a negative effect on classification accuracy.

Table 6: Experimental Results for POS Subcategory Combinations.

Exp./Metrics PR PP PF NR NP NF WF
Baseline1 72.6 71.9 72.3 53.3 52.8 53.1 64.1
Baseline2 76.9 72.1 74.5 35.5 32.6 34 57.1
1 75.5 74.7 75.1 57.6 56.9 57.3 67.5
2 75.5 75.3 75.4 59.4 59.2 59.3 68.5
3 77.6 76.9 77.3 57.1 56.5 56.8 68.5
4 77.6 76.9 77.3 57.2 56.6 56.9 68.6
5 77.6 76.9 77.3 56.7 56.1 56.4 68.4
6 77.6 76.9 77.3 56.7 56.1 56.4 68.4
7 78 77.3 77.7 56.1 55.5 55.8 68.3

5 Conclusion

In this paper, we presented lexicon-based methods for SA that better recognize the polarity of natural language text by utilizing different polarity features with the standard POS tags, such as adjectives, adverbs, verbs, and nouns, and by examining combinations of their subcategories. An experimental study was conducted on the Amazon dataset to explore the specific polarity of text. We examined different polarity features using the standard POS tags and combinations of their subcategories. The experimental results indicated that the JJ + NN + VB + RB + VBP + RP combination achieved a 4.4% enhancement compared with baseline1. This result is very promising compared to the other feature combinations and the baseline approaches. In the future, we plan to improve the presented work by considering the semantic definition of each word. We could also employ different general inquirer dictionaries for further categorization.

Bibliography

[1] S. Bethard, H. Yu, A. Thornton, V. Hatzivassiloglou and D. Jurafsky, Automatic extraction of opinion propositions and their holders, in: 2004 AAAI Spring Symposium on Exploring Attitude and Affect in Text, vol. 2224, 2004.

[2] R. Bhargava, S. Arora and Y. Sharma, Neural network-based architecture for sentiment analysis in Indian languages, J. Intell. Syst. 28 (2018), 361–375. doi:10.1515/jisys-2017-0398.

[3] J. Blitzer, M. Dredze and F. Pereira, Biographies, Bollywood, boom-boxes and blenders: domain adaptation for sentiment classification, ACL 7 (2007), 440–447.

[4] P. Chesley, B. Vincent, L. Xu and R. K. Srihari, Using verbs and adjectives to automatically classify blog sentiment, in: Computational Approaches to Analyzing Weblogs: Papers from the 2006 Spring Symposium, N. Nicolov, F. Salvetti, M. Liberman, and J. H. Martin, eds., AAAI Press, Menlo Park, CA, 27–29, Technical Report SS-06-03, vol. 580, no. 263, p. 233, 2006.

[5] M. D. Devika, C. Sunitha and A. Ganesh, Sentiment analysis: a comparative study on different approaches, Proc. Comput. Sci. 87 (2016), 44–49. doi:10.1016/j.procs.2016.05.124.

[6] X. Fang and J. Zhan, Sentiment analysis using product review data, J. Big Data 2 (2015), 5. doi:10.1186/s40537-015-0015-2.

[7] R. Feldman, Techniques and applications for sentiment analysis, Commun. ACM 56 (2013), 82–89. doi:10.1145/2436256.2436274.

[8] M. Ghiassi, J. Skinner and D. Zimbra, Twitter brand sentiment analysis: a hybrid system using n-gram analysis and dynamic artificial neural network, Expert Syst. Appl. 40 (2013), 6266–6282. doi:10.1016/j.eswa.2013.05.057.

[9] C. S. Khoo and S. B. Johnkhan, Lexicon-based sentiment analysis: comparative evaluation of six sentiment lexicons, J. Inform. Sci. 44 (2017), 491–511. doi:10.1177/0165551517703514.

[10] B. Liu, Sentiment analysis and opinion mining, Synthesis Lect. Hum. Lang. Technol. 5 (2012), 1–167. doi:10.1007/978-3-642-19460-3_11.

[11] Y. Liu, J.-W. Bi and Z.-P. Fan, Multi-class sentiment classification: the experimental comparisons of feature selection and machine learning algorithms, Expert Syst. Appl. 80 (2017), 323–339. doi:10.1016/j.eswa.2017.03.042.

[12] C. D. Manning, M. Surdeanu, J. Bauer, J. R. Finkel, S. Bethard and D. McClosky, The Stanford CoreNLP natural language processing toolkit, in: ACL (System Demonstrations), Stanford, pp. 55–60, 2014. doi:10.3115/v1/P14-5010.

[13] W. Medhat, A. Hassan and H. Korashy, Sentiment analysis algorithms and applications: a survey, Ain Shams Eng. J. 5 (2014), 1093–1113. doi:10.1016/j.asej.2014.04.011.

[14] B. Ohana and B. Tierney, Sentiment classification of reviews using SentiWordNet, in: IT&T Conference, Dublin Institute of Technology, Dublin, Ireland, 22nd–23rd October, 2009.

[15] B. Pang and L. Lee, Opinion mining and sentiment analysis, Found. Trends Inform. Retriev. 2 (2008), 1–135. doi:10.1561/9781601981516.

[16] B. Pang, L. Lee and S. Vaithyanathan, Thumbs up?: sentiment classification using machine learning techniques, in: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, vol. 10, 2002. doi:10.3115/1118693.1118704.

[17] K. Ravi and V. Ravi, A survey on opinion mining and sentiment analysis: tasks, approaches and applications, Knowl.-Based Syst. 89 (2015), 14–46. doi:10.1016/j.knosys.2015.06.015.

[18] S. Sarkar, P. Mallick and T. K. Mitra, A novel machine learning approach for sentiment analysis based on Adverb-Adjective-Noun-Verb (AANV) combinations, Int. J. Recent Trends Eng. Technol. 7 (2012).

[19] A. Sharma, A. Sharma, R. K. Singh and M. D. Upadhayay, Hybrid classifier for sentiment analysis using effective pipelining, Int. Res. J. Eng. Technol. (IRJET) 4 (2017), 2276–2281.

[20] F. Smarandache, M. Teodorescu and D. Gîfu, Neutrosophy, a sentiment analysis model, in: The 3rd Workshop on Social Media and the Web of Linked Data, Toronto, Ontario, Canada, ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL), pp. 38–41, 2017.

[21] V. S. Subrahmanian and D. Reforgiato, AVA: adjective-verb-adverb combinations for sentiment analysis, IEEE Intell. Syst. 23 (2008), 43–50. doi:10.1109/MIS.2008.57.

[22] M. Taboada, J. Brooke, M. Tofiloski, K. Voll and M. Stede, Lexicon-based methods for sentiment analysis, Comput. Linguist. 37 (2011), 267–307. doi:10.1162/COLI_a_00049.

[23] T. T. Thet, J.-C. Na, C. S. Khoo and S. Shakthikumar, Sentiment analysis of movie reviews on discussion boards using a linguistic approach, in: Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, pp. 81–84, ACM, 2009. doi:10.1145/1651461.1651476.

[24] D. S. Tomar and P. Sharma, A text polarity analysis using SentiWordNet based an algorithm, Int. J. Comput. Sci. Inform. Technol. 7 (2016), 190–193.

[25] A. Tumasjan, T. O. Sprenger, P. G. Sandner and I. M. Welpe, Predicting elections with Twitter: what 140 characters reveal about political sentiment, in: Fourth International AAAI Conference on Weblogs and Social Media, vol. 10, no. 1, pp. 178–185, 2010. doi:10.1609/icwsm.v4i1.14009.

Received: 2018-08-31
Published Online: 2019-08-15

©2020 Walter de Gruyter GmbH, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 Public License.
