Pashto poetry generation: deep learning with pre-trained transformers for low-resource languages

  • Imran Ullah
  • Khalil Ullah
  • Hamad Khan
  • Khursheed Aurangzeb
  • Muhammad Shahid Anwar*
  • Ikram Syed*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Generating poetry with machine learning and deep learning techniques has been a challenging and active research topic in recent years, with significance for natural language processing and computational linguistics. This study introduces an approach to generating high-quality Pashto poetry that leverages two pretrained transformer models, LaMini-Cerebras-590M and bloomz-560m. The models were trained on a new, extensive, high-quality Pashto poetry dataset to learn its underlying complex patterns and structures. The trained models are then given a seed text or prompt and used to generate new Pashto poetry. To assess the quality of the generated poetry, we conducted both subjective and objective evaluations, including human evaluation. The experimental results demonstrate that the proposed approach can generate Pashto poetry comparable in quality to human-written poetry. The study contributes to Pashto language processing and poetry generation and has potential applications in natural language processing and computational linguistics.

Original language: English
Pages (from-to): 1-23
Number of pages: 23
Journal: PeerJ Computer Science
Volume: 10
State: Published - 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2024 Ullah et al.

Keywords

  • Bloomz-560m
  • Deep learning
  • LaMini-Cerebras-590M
  • Machine learning
  • Natural language processing
  • Poetry generation

ASJC Scopus subject areas

  • General Computer Science
