Inproceedings

Hierarchical Multi Label Classification of News Articles Using RNN, CNN and HAN

ICT with Intelligent Applications: Proceedings of ICTIS 2021, volume 248 of Smart Innovation, Systems and Technologies (SIST), pages 499-506. Springer Singapore, Singapore (2022)
DOI: https://doi.org/10.1007/978-981-16-4177-0_50

Abstract

Labeling or tagging online news articles is important for quick information retrieval. In the real world, a news article may belong to multiple categories, and those categories can be organized hierarchically. Such classification is termed hierarchical multi-label classification, and it has rarely been explored in the news domain. The base article for this paper created a pre-defined taxonomy of Bio-NLP labels and categorized articles into multiple labels according to that hierarchy. This work uses a similar approach: the labels are placed in a pre-defined hierarchy, and articles are then classified in a multi-label manner with that hierarchy in place. HAN (Hierarchical Attention Network) has rarely been used for hierarchical multi-label classification. In this work, a combination of CNN (Convolutional Neural Network) and GRU (Gated Recurrent Unit) is implemented first, and then HAN alone is implemented for the task. When CNN alone is used for classification, as in the base paper, the average F1-score across the different sets of experiments is approximately 0.86. When CNN is combined with GRU, the average F1-score improves by 1%. When HAN is used, the average F1-score improves by 3.35%. The models were trained and tested on news articles from The Guardian newspaper published in 2014.
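
To make the task setting concrete, the following is a minimal sketch of how labels in a pre-defined hierarchy might be expanded to include their ancestors, and how an F1-score over multi-label predictions could be computed. The taxonomy, the function names, and the choice of micro-averaging are all assumptions for illustration; the abstract does not state the paper's label set or its averaging scheme.

```python
# Hypothetical two-level taxonomy (child -> parent); NOT the paper's label set.
PARENT = {
    "football": "sport",
    "tennis": "sport",
    "elections": "politics",
    "sport": None,
    "politics": None,
}


def expand_with_ancestors(labels, parent=PARENT):
    """Return the given labels plus every ancestor in the hierarchy."""
    expanded = set()
    for label in labels:
        node = label
        while node is not None:
            expanded.add(node)
            node = parent[node]
    return expanded


def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1 over per-article label sets (assumed averaging)."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Two toy articles: the second prediction misses the "tennis" branch.
gold = [
    expand_with_ancestors({"football"}),
    expand_with_ancestors({"elections", "tennis"}),
]
pred = [
    expand_with_ancestors({"football"}),
    expand_with_ancestors({"elections"}),
]
score = micro_f1(gold, pred)
print(round(score, 2))
```

In this toy example, missing the "tennis" leaf also costs its ancestor "sport", which is one reason a hierarchy-aware evaluation can differ from a flat multi-label one.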
