TY - GEN
T1 - Campus Sentiment Analysis with GAN-based Data Augmentation
AU - Shang, Yu
AU - Su, Xiaohui
AU - Xiao, Zhifeng
AU - Chen, Zidong
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - The rapid development of online media has given people many channels for expressing opinions. Fast-growing social media platforms generate vast amounts of data that modern machine learning algorithms can turn into business and social value. One crucial learning task is sentiment analysis, which identifies the tendency of subjective information in an expression. Prior work on sentiment classification has explored a broad spectrum of methods and achieved impressive performance gains from the predictive modeling perspective. However, the potential of data augmentation has not been sufficiently explored for this task, and we aim to fill this gap. This study proposes a novel sentiment analysis framework powered by two performance boosters: a data augmentation method based on a transformer-based generative adversarial network (GAN), and RoBERTa, a robustly optimized Bidirectional Encoder Representations from Transformers (BERT) pretraining model. Together they improve prediction accuracy from both the data side and the model side. We conduct extensive experiments on a campus sentiment classification dataset and show that the GAN-based data augmentation method can generate high-quality synthetic samples that increase the size and diversity of the training set. Compared with other model options and augmentation methods, the RoBERTa model enhanced by the transformer-based GAN (TransGAN) achieves superior prediction accuracy, validating the efficacy of the proposed framework.
UR - http://www.scopus.com/inward/record.url?scp=85127111761&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127111761&partnerID=8YFLogxK
U2 - 10.1109/ICAIT52638.2021.9702068
DO - 10.1109/ICAIT52638.2021.9702068
M3 - Conference contribution
AN - SCOPUS:85127111761
T3 - 2021 13th International Conference on Advanced Infocomm Technology, ICAIT 2021
SP - 209
EP - 214
BT - 2021 13th International Conference on Advanced Infocomm Technology, ICAIT 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 13th International Conference on Advanced Infocomm Technology, ICAIT 2021
Y2 - 15 October 2021 through 18 October 2021
ER -