Development of hierarchical attention network based architecture for cloze-style question answering

Fahad Alsahli*, Andri Mirzal

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

Recently, researchers have been addressing Question Answering (QA) with deep learning architectures, including Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and attention mechanisms. QA has several variants, for example, document-based QA and cloze-style QA. In general, these variants can be addressed with similar approaches, because every QA task requires analyzing a context and a question in order to retrieve an answer. We tackle cloze-style QA. In this task, a context and a query are given. The query is a sentence that is missing a piece of information (e.g., a word), and the missing information must be inferred from the given context. We develop a Hierarchical Attention Network (HAN) model for cloze-style QA. HAN models are suitable for this task because they employ hierarchical attention. We use two publicly available cloze-style datasets, both instances of the Children’s Book Test (CBT): Named Entity (CBT-NE) and Common Nouns (CBT-CN). CBT-NE includes 108,719 training, 2,000 validation, and 2,500 test samples. CBT-CN includes 120,769 training, 2,000 validation, and 2,500 test samples. We conduct experiments to compare our model against a baseline, the HAN pointer sum attention model. The comparison is based on inference time (i.e., the time needed to process a single sample) and accuracy. Results show that our model outperforms the baseline on both criteria. On the CBT-NE test data, our model achieves an average inference time of 0.0476 s and an average accuracy of 70.47%; on the CBT-CN test data, it achieves an average inference time of 0.049 s and an average accuracy of 67.5%. The baseline model, by comparison, achieves an average inference time of 0.115 s and an average accuracy of 68.99% on the CBT-NE test data, and an average inference time of 0.105 s and an average accuracy of 67.12% on the CBT-CN test data.
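
The abstract names a pointer sum attention baseline without describing it. As a rough illustration only, the general pointer-sum idea for cloze-style QA can be sketched as follows: attention weights over the context tokens are summed per candidate word, and the candidate with the largest total is chosen. The function names, the toy context, and the hand-picked attention scores below are assumptions made for this sketch; they are not the authors' model, and a real system would compute the scores with a trained network.

```python
import math
from collections import defaultdict


def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_sum_answer(context_tokens, attention_scores, candidates):
    """Sum attention weight over every occurrence of each candidate.

    Hypothetical helper: returns the best candidate and the per-candidate mass.
    """
    weights = softmax(attention_scores)
    mass = defaultdict(float)
    for token, weight in zip(context_tokens, weights):
        if token in candidates:
            mass[token] += weight
    best = max(candidates, key=lambda c: mass.get(c, 0.0))
    return best, dict(mass)


# Toy cloze sample (invented): the query "XXXXX waved to Tom" is missing a name.
context = "Tom saw Anna . Anna waved to Tom and Anna smiled".split()
# Hand-picked scores standing in for a trained model's attention output.
scores = [0.2, 0.0, 1.0, 0.0, 1.5, 0.0, 0.0, 0.3, 0.0, 1.2, 0.0]
candidates = ["Tom", "Anna"]

answer, mass = pointer_sum_answer(context, scores, candidates)
print(answer, mass)
```

Summing over occurrences means a candidate that appears several times in the context can win even if no single occurrence receives the highest weight.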

Original language: English
Title of host publication: Emerging Technologies in Computing - 3rd EAI International Conference, iCETiC 2020, Proceedings
Editors: Mahdi H. Miraz, Peter S. Excell, Andrew Ware, Safeeullah Soomro, Maaruf Ali
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 196-213
Number of pages: 18
ISBN (Print): 9783030600358
State: Published - 2020

Publication series

Name: Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
Volume: 332 LNICST
ISSN (Print): 1867-8211
ISSN (Electronic): 1867-822X

Bibliographical note

Publisher Copyright:
© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2020.

ASJC Scopus subject areas

  • Computer Networks and Communications
