Natural Language Processing and Deep Learning Approaches for Multiclass Document Classifier

Authors

  • Shruti A. Gadewar, ME Aspirant, Department of Computer Science and Engineering, Babasaheb Naik College of Engineering, Pusad, Maharashtra, India
  • Prof. P. H. Pawar, Associate Professor, Department of Computer Science and Engineering, Babasaheb Naik College of Engineering, Pusad, Maharashtra, India

DOI:

https://doi.org/10.32628/IJSRSET2411143

Keywords:

Classification, Natural Language Processing, Deep Learning

Abstract

With the recent growth of the internet, the volume of data has also increased. A large portion of the web consists of documents containing formatted and unformatted, structured and unstructured data. The growing amount of unstructured data makes it increasingly difficult to manage, and classifying this volume of data manually for various purposes is impractical, so automated classification is required. This paper overviews different Natural Language Processing and Deep Learning approaches to content-based document classification.
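The paper surveys approaches rather than prescribing one; purely as illustration, the sketch below shows a common content-based multiclass document classification pipeline (TF-IDF features feeding a linear classifier via scikit-learn). The documents, category labels, and parameters are hypothetical placeholders, not material from the paper.

```python
# Minimal sketch (illustrative, not the paper's method): content-based
# multiclass document classification with TF-IDF features and logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy training corpus: each unstructured document is paired with one class label.
docs = [
    "The striker scored twice in the final match",
    "The new GPU accelerates deep learning training",
    "Parliament passed the budget bill after a long debate",
    "The team won the championship on penalties",
    "Transformers improve natural language processing benchmarks",
    "The senate voted on the new healthcare policy",
]
labels = ["sports", "technology", "politics", "sports", "technology", "politics"]

# Pipeline: raw text -> TF-IDF vectors -> multiclass linear classifier.
model = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(docs, labels)

# Assign an unseen document to one of the learned categories.
print(model.predict(["A late goal decided the derby"]))  # expected: ['sports']
```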

References

  1. Yelmen, I.; Gunes, A.; Zontul, M. Multi-Class Document Classification Using Lexical Ontology-Based Deep Learning. Appl. Sci. 2023, 13, 6139. https://doi.org/10.3390/app13106139
  2. Kadhim, A.I. Survey on supervised machine learning techniques for automatic text classification. Artif. Intell. Rev. 2019, 52, 273–292.
  3. Kumbhar, P.; Mali, M.A. Survey on Feature Selection Techniques and Classification Algorithms for Efficient Text Classification. Int. J. Sci. Res. 2016, 5, 1267–1275.
  4. Mwadulo, M.W. A Review on Feature Selection Methods for Classification Tasks. Int. J. Comput. Appl. Technol. Res. 2016, 5, 395–402.
  5. Zhang, T.; Yang, B. Big data dimension reduction using PCA. In Proceedings of the 2016 IEEE International Conference on Smart Cloud (SmartCloud), New York, NY, USA, 18–20 November 2016; pp. 152–157.
  6. Lu, Z.; Du, P.; Nie, J.Y. VGCN-BERT: Augmenting BERT with graph embedding for text classification. In Advances in Information Retrieval, Proceedings of the 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, 14–17 April 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 369–382.
  7. Barbouch, M.; Verberne, S.; Verhoef, T. WN-BERT: Integrating WordNet and BERT for Lexical Semantics in Natural Language Understanding. Comput. Linguist. Neth. J. 2021, 11, 105–124.
  8. Sarkar, K.; Law, R. A Novel Approach to Document Classification using WordNet. arXiv 2015, arXiv:1510.02755. https://doi.org/10.48550/arXiv.1510.02755
  9. Wang, K.; Han, S.C.; Poon, J. InducT-GCN: Inductive Graph Convolutional Networks for Text Classification. arXiv 2022, arXiv:2206.00265. https://doi.org/10.48550/arXiv.2206.00265
  10. Ren, Y.; Wang, R.; Ji, D. A topic-enhanced word embedding for Twitter sentiment classification. Inf. Sci. 2016, 369, 188–198.
  11. Le, Q.; Mikolov, T. Distributed representations of sentences and documents. In Proceedings of the International Conference on Machine Learning, Beijing, China, 21–26 June 2014; pp. 1188–1196.
  12. Nozza, D.; Bianchi, F.; Hovy, D. What the [MASK]? Making sense of language-specific BERT models. arXiv 2020, arXiv:2003.02912.

Published

2024-02-29

Issue

Volume 11, Issue 1 (January-February 2024)

Section

Research Articles

How to Cite

[1]
Shruti A. Gadewar, Prof. P. H. Pawar, "Natural Language Processing and Deep Learning Approaches for Multiclass Document Classifier", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 11, Issue 1, pp. 278-283, January-February 2024. Available at DOI: https://doi.org/10.32628/IJSRSET2411143