Opportunities and Challenges of Deep Learning Models for Short Text Data Analysis

Mini Review

Ashis Kumar Chanda*

Adjunct Faculty, Rowan University, Glassboro, New Jersey, USA

Corresponding Author

Received Date: November 08, 2021; Published Date: November 19, 2021

Abstract

The rapid growth of internet usage on mobile devices has increased the opportunities to communicate with other people and publicly share news and opinions. As a result, a huge amount of data is generated every day, and it has become a great resource for investigating different research problems such as sentiment analysis, entity finding, and prediction problems. Recent advanced machine learning methods use deep neural network models to solve these problems, and such models require huge datasets to train and converge. However, people mostly use short text for such online communication. Analyzing the sentiment of short text is challenging because of its inherent characteristics, such as sparseness, immediacy, and misspelling. Therefore, this study focuses on the opportunities and challenges of deep learning models for short text data analysis. The paper discusses research works that applied deep learning (deep neural network) models to analyze short text data and explains unsolved problems in this area.

Keywords: Short text data; natural language processing; NLP; deep neural network models; social media data; Twitter data

Introduction

People tend to share their experiences, emotions, and news with others on online social media platforms (e.g., Twitter, Facebook), where they mostly use short texts. Short text is also used in many other settings, such as mobile short messages, news titles, and blog comments. Although there is no fixed length limit that defines a text as short, the text length usually remains below 200 characters. For comparison, a tweet on Twitter can contain up to 280 characters.

Since online social media data is publicly available, researchers have become interested in using it to solve several research problems such as knowledge extraction [1], text generation [2], and natural disaster prediction [3]. However, it is challenging to analyze short text data because of its shortness, sparsity (i.e., diverse word content) [4], velocity (the rapid growth of short texts such as SMS messages and tweets), and misspelling [5]. For example, a short text containing only a few words does not provide enough shared context for a good similarity measure. Moreover, short texts are sent immediately in real time, and their descriptions are concise. People often use abbreviations, local or non-standard terms, and noise or symbols in their real-time short text communication. Therefore, analyzing short text data has become a daunting research topic in the field of natural language processing (NLP).
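The sparsity problem described above can be made concrete with a bag-of-words similarity measure. The following is a minimal sketch, assuming whitespace tokenization and cosine similarity over word counts; the example messages and the function names are hypothetical and are not taken from the cited works.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation from token edges."""
    tokens = (tok.strip(".,!?") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the bag-of-words count vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two hypothetical short messages about the same event with no words in common.
tweet_a = "Huge quake just hit downtown"
tweet_b = "Earthquake shaking the city center!"
# A hypothetical near-duplicate that shares most of its words with tweet_a.
tweet_c = "Huge quake hit downtown today"

print(cosine_similarity(tweet_a, tweet_b))  # 0.0: related meaning, no shared words
print(cosine_similarity(tweet_a, tweet_c))  # close to 0.8: four of five words shared
```

With only five words per message, one change of vocabulary is enough to drive the similarity to zero even though both messages describe the same event.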

Traditional methods used rule-based models and learning-based models to analyze short text for different purposes but faced difficulty because of the characteristics of short text. For example, the authors of [6] discussed the difficulty of traditional methods on short text classification tasks. Another research work [2] focused on the automatic generation of short-text conversation using a retrieval-based automatic response model. Other research focused on linking short text data such as tweets to news [7], using a graph-based latent variable model to find correlations between short texts. The authors also showed that tweet-specific features (hashtags), news-specific features (named entities), and temporal constraints could be used to extract text-to-text correlations, thus completing the semantic picture of a short text. Moreover, a recent survey paper [8] investigates how traditional methods work on topic modeling problems when applied to short textual social data. These methods are latent semantic analysis, latent Dirichlet allocation, non-negative matrix factorization, random projection, and principal component analysis.

Recent advances in machine learning show that deep learning models can solve several complex problems on large datasets. However, it is also interesting to discover how deep learning models work on short text data. For example, Jiaming et al. [9] applied convolutional neural networks (CNNs) to cluster short text from social media data and found that the deep learning method outperformed traditional methods. Another research work on hate speech classification [10] also used deep learning methods on short text data. However, deep learning models require large amounts of labeled data for training. Thus, labeling short text data is another potential challenge. For this reason, Linmei et al. [11] proposed a semi-supervised model, Heterogeneous Graph ATtention networks (HGAT), which takes full advantage of a small amount of labeled data and a large amount of unlabeled data through information propagation along the graph. The authors showed that the attention mechanism could learn the importance of different neighboring nodes (words) and the importance of different node (information) types to a current node.
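The idea that attention assigns different importance to neighboring nodes can be illustrated with a generic softmax attention step. This is a minimal sketch and not the HGAT formulation of [11]: the dot-product scoring, the embedding values, and the function names are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    max_score = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(node_vec, neighbor_vecs):
    """Return attention weights over neighbors and the weighted neighbor average.

    Scores are plain dot products between the current node and each neighbor
    (an assumption for illustration; trained models learn the scoring function).
    """
    scores = [sum(a * b for a, b in zip(node_vec, vec)) for vec in neighbor_vecs]
    weights = softmax(scores)
    aggregated = [
        sum(w * vec[i] for w, vec in zip(weights, neighbor_vecs))
        for i in range(len(node_vec))
    ]
    return weights, aggregated


# Hypothetical 2-dimensional embeddings for a short-text node and three neighbors.
node = [1.0, 0.0]
neighbors = [[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]]

weights, aggregated = attend(node, neighbors)
print(weights)     # the first neighbor, most aligned with the node, gets the largest weight
print(aggregated)  # neighbor information combined according to those weights
```

In a trained model the weights come from learned parameters, so neighbors that are informative for the task receive more weight than noisy ones.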

A new language representation model, BERT (Bidirectional Encoder Representations from Transformers) [12], was proposed recently; it constructs different vectors for the same word in different contexts. In the past, a single low-dimensional vector representation, or embedding, was used to represent a word from a given document. BERT embeddings have been successfully used in several natural language processing tasks, including short text analysis. The research work of [3] showed that deep learning models (e.g., BiLSTM) with BERT embeddings achieved the best accuracy in predicting disaster-type tweets from Twitter data in which the mean tweet length is 12.5 words. Another work [13] showed that an advanced deep neural network (i.e., BiLSTM-CNN with attention) could improve disaster-type tweet prediction.
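The contrast between a single static vector per word and a context-dependent vector can be shown with a toy example. The sketch below is not BERT and does not use any BERT library: the embedding values are made up, and the context mixing rule (averaging a word with its neighbors) is a deliberately simple assumption that only demonstrates the property described above.

```python
# Hypothetical 2-dimensional static embeddings; the values are made up.
STATIC = {
    "river": [0.0, 1.0],
    "bank": [0.5, 0.5],
    "loan": [1.0, 0.0],
}


def static_vector(tokens, index):
    """A static embedding ignores the sentence: one vector per word type."""
    return STATIC[tokens[index]]


def toy_contextual_vector(tokens, index):
    """Mix the word's vector with the mean of the other words in the text.

    This is a toy stand-in for context sensitivity, NOT how BERT computes
    its representations.
    """
    own = STATIC[tokens[index]]
    others = [STATIC[t] for i, t in enumerate(tokens) if i != index]
    if not others:
        return list(own)
    mean = [sum(v[d] for v in others) / len(others) for d in range(len(own))]
    return [0.5 * o + 0.5 * m for o, m in zip(own, mean)]


text_a = ["river", "bank"]
text_b = ["bank", "loan"]

# Same word, same static vector in both texts.
print(static_vector(text_a, 1) == static_vector(text_b, 0))                  # True
# Same word, different vectors once the surrounding words are taken into account.
print(toy_contextual_vector(text_a, 1) == toy_contextual_vector(text_b, 0))  # False
```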

Although deep learning methods have been successfully used for short text analysis tasks, many challenges remain open. For example, previous research works removed symbols and emojis from short text during data preprocessing. However, it is essential to explore how symbols and emojis affect the sentiment analysis task. Marco et al. [14] provided an extensive study of the data preprocessing steps, which play a vital role in short text data analysis. Moreover, qualitative studies are also needed to analyze the performance of deep learning models on short text data and to observe when and how the models work successfully [15].
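What emoji removal discards during preprocessing can be seen in a small example. This is a minimal sketch, assuming a rough regular expression over a few Unicode blocks rather than a complete emoji definition; the tweets and the function name are hypothetical.

```python
import re

# Rough emoji pattern covering a few common Unicode blocks (an assumption for
# illustration; it is not a complete emoji definition).
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, supplemental symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "]+"
)


def strip_emojis(text):
    """Remove emojis matched by EMOJI_PATTERN and collapse leftover whitespace."""
    return " ".join(EMOJI_PATTERN.sub(" ", text).split())


# Two hypothetical tweets with identical wording but different emojis.
tweet_positive = "Power is back on \U0001F600"  # grinning face
tweet_negative = "Power is back on \U0001F621"  # pouting (angry) face

print(tweet_positive == tweet_negative)                              # False
print(strip_emojis(tweet_positive) == strip_emojis(tweet_negative))  # True
```

After stripping, the two tweets become indistinguishable, so any sentiment signal carried only by the emoji is lost before the model sees the text.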

Conclusion

The number of works on deep learning for short text data analysis has grown explosively in recent years. Such works have achieved accuracy comparable to that of traditional feature-based approaches. Specifically, we found that deep learning models with contextual embeddings (BERT) yield the best results. However, some new challenges and problems related to interpretability, scalability, and efficiency must be addressed. Furthermore, it is also worth investigating new applications from the perspectives of datasets and methods.

Acknowledgement

None.

Conflict of Interest

None.

References

Citation

Ashis Kumar Chanda. Opportunities and Challenges of Deep Learning Models for Short Text Data Analysis. Glob J Eng Sci. 8(5): 2021. GJES.MS.ID.000698. DOI: 10.33552/GJES.2021.08.000698.
