Natural Language Processing and Deep Learning Approaches for Multiclass Document Classifier

Authors

  • Shruti A. Gadewar, ME Aspirant, Department of Computer Science and Engineering, Babasaheb Naik College of Engineering, Pusad, Maharashtra, India
  • Prof. P. H. Pawar, Associate Professor, Department of Computer Science and Engineering, Babasaheb Naik College of Engineering, Pusad, Maharashtra, India

DOI:

https://doi.org/10.32628/IJSRSET2411143

Keywords:

Classification, Natural Language Processing, Deep Learning

Abstract

With the recent growth of the internet, the volume of data has also increased. A large portion of the web consists of documents containing formatted and unformatted, structured and unstructured data. The growing amount of unstructured data makes it increasingly difficult to manage, and classifying this volume of data manually for various purposes is impractical, so automated classification is required. This paper overviews different Natural Language Processing and Deep Learning approaches to content-based document classification.
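The paper surveys approaches rather than prescribing one; purely as illustration, the sketch below shows a common content-based multiclass document classification pipeline (TF-IDF features feeding a linear classifier via scikit-learn). The documents, category labels, and parameters are hypothetical placeholders, not material from the paper.

```python
# Minimal sketch (illustrative, not the paper's method): content-based
# multiclass document classification with TF-IDF features and logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy training corpus: each unstructured document is paired with one class label.
docs = [
    "The striker scored twice in the final match",
    "The new GPU accelerates deep learning training",
    "Parliament passed the budget bill after a long debate",
    "The team won the championship on penalties",
    "Transformers improve natural language processing benchmarks",
    "The senate voted on the new healthcare policy",
]
labels = ["sports", "technology", "politics", "sports", "technology", "politics"]

# Pipeline: raw text -> TF-IDF vectors -> multiclass linear classifier.
model = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(docs, labels)

# Assign an unseen document to one of the learned categories.
print(model.predict(["A late goal decided the derby"]))  # expected: ['sports']
```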

References

  1. Yelmen, I.; Gunes, A.; Zontul, M. Multi-Class Document Classification Using Lexical Ontology-Based Deep Learning. Appl. Sci. 2023, 13, 6139. https://doi.org/10.3390/app13106139
  2. Kadhim, A.I. Survey on supervised machine learning techniques for automatic text classification. Artif. Intell. Rev. 2019, 52, 273–292.
  3. Kumbhar, P.; Mali, M.A. Survey on Feature Selection Techniques and Classification Algorithms for Efficient Text Classification. Int. J. Sci. Res. 2016, 5, 1267–1275.
  4. Mwadulo, M.W. A Review on Feature Selection Methods for Classification Tasks. Int. J. Comput. Appl. Technol. Res. 2016, 5, 395–402.
  5. Zhang, T.; Yang, B. Big data dimension reduction using PCA. In Proceedings of the 2016 IEEE International Conference on Smart Cloud (SmartCloud), New York, NY, USA, 18–20 November 2016; pp. 152–157.
  6. Lu, Z.; Du, P.; Nie, J.Y. VGCN-BERT: Augmenting BERT with graph embedding for text classification. In Advances in Information Retrieval, Proceedings of the 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, 14–17 April 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 369–382.
  7. Barbouch, M.; Verberne, S.; Verhoef, T. WN-BERT: Integrating WordNet and BERT for Lexical Semantics in Natural Language Understanding. Comput. Linguist. Neth. J. 2021, 11, 105–124.
  8. Sarkar, K.; Law, R. A Novel Approach to Document Classification using WordNet. arXiv 2015, arXiv:1510.02755. https://doi.org/10.48550/arXiv.1510.02755
  9. Wang, K.; Han, S.C.; Poon, J. InducT-GCN: Inductive Graph Convolutional Networks for Text Classification. arXiv 2022, arXiv:2206.00265. https://doi.org/10.48550/arXiv.2206.00265
  10. Ren, Y.; Wang, R.; Ji, D. A topic-enhanced word embedding for Twitter sentiment classification. Inf. Sci. 2016, 369, 188–198.
  11. Le, Q.; Mikolov, T. Distributed representations of sentences and documents. In Proceedings of the International Conference on Machine Learning, Beijing, China, 21–26 June 2014; pp. 1188–1196.
  12. Nozza, D.; Bianchi, F.; Hovy, D. What the [MASK]? Making sense of language-specific BERT models. arXiv 2020, arXiv:2003.02912.

Published

2024-02-29

Issue

Volume 11, Issue 1 (January-February 2024)

Section

Research Articles

How to Cite

[1]
Shruti A. Gadewar, Prof. P. H. Pawar, "Natural Language Processing and Deep Learning Approaches for Multiclass Document Classifier", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 11, Issue 1, pp. 278-283, January-February 2024. Available at DOI: https://doi.org/10.32628/IJSRSET2411143