An effective system for Sentiment Analysis and classification of Twitter Data based on Artificial Intelligence (AI) Techniques

Authors

  • Ankit Bansal, Data Scientist, 200 Braid Street, Stamford, CT 06901, USA

Keywords:

Sentiment analysis, Machine Learning, Twitter Dataset, PCA, Noise Removal, Stochastic Gradient Descent

Abstract

The microblogging service Twitter has become one of the most widely used platforms for online discussion and opinion sharing, and tweets, when aggregated, can reveal how the public feels about particular events. Given the vast volume of unstructured data generated on the platform, this research aims to improve the accuracy and efficacy of sentiment categorisation by building a sentiment analysis system that identifies and categorises the sentiments expressed in tweets, from negative to positive. Using a dataset of 1,600,000 sentiment-labelled tweets, the study follows a methodology comprising data collection, preprocessing, class balancing, and the implementation of several classification models: recurrent neural network (RNN), multilayer perceptron (MLP), naive Bayes (NB), support vector machine (SVM), and stochastic gradient descent (SGD). Preprocessing techniques such as stemming, tokenisation, and noise removal improve data quality, and principal component analysis (PCA) reduces dimensionality. The RNN achieved 93% accuracy, the highest among the models tested. The findings indicate that the proposed method categorises sentiment effectively while addressing class imbalance. The study contributes to the growing field of sentiment analysis, and the resulting insights into public opinion patterns can support decision-making in organisations and enterprises. Ultimately, the findings highlight the potential of AI techniques for understanding consumer opinions and trends within the dynamic landscape of social media.
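The preprocessing and class-balancing stages named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the regular expressions, the function names, the toy suffix stripper (a stand-in for a real stemmer such as Porter's), and the choice of random undersampling as the balancing strategy are all assumptions made for illustration. PCA and the classifiers are deliberately left out.

```python
import random
import re

# Hypothetical noise patterns; the abstract does not list the exact rules used.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def remove_noise(text):
    """Lowercase the tweet and strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return NON_ALPHA_RE.sub(" ", text)


def tokenise(text):
    """Split cleaned text on whitespace."""
    return text.split()


def crude_stem(token):
    """Toy suffix stripper; a real pipeline would use a proper stemmer."""
    for suffix in ("ing", "ed", "ly", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Noise removal, then tokenisation, then stemming."""
    return [crude_stem(tok) for tok in tokenise(remove_noise(text))]


def balance_classes(samples, seed=0):
    """Randomly undersample every class to the size of the smallest one.

    `samples` is a list of (text, label) pairs. Undersampling is an assumed
    strategy; the abstract only says that classes were balanced.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append((text, label))
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced
```

Calling `preprocess` on a raw tweet returns a list of stemmed tokens, and `balance_classes` returns a subset of the labelled tweets with an equal number of examples per class. In a fuller pipeline the tokens would be vectorised before dimensionality reduction and classification.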

Published

18-03-2020

How to Cite

Ankit Bansal. (2020). An effective system for Sentiment Analysis and classification of Twitter Data based on Artificial Intelligence (AI) Techniques. International Journal of Computer Science and Information Technology Research, 1(1), 32-47. https://ijcsitr.com/index.php/home/article/view/IJCSITR_2020_01_01_003