Proposal of a Haiku Evaluation Method Using Large Language Model and Prompt Engineering
DOI: https://doi.org/10.52731/lbds.v005.346

Keywords: haiku evaluation, human evaluation, Large Language Model, prompt engineering

Abstract
In this paper, we describe the development of a haiku evaluation system based on a Large Language Model (LLM). We propose several prompting methods for haiku evaluation and selection, and verify their performance on an automatically evaluable haiku dataset. We also apply the proposed methods to a large haiku database containing over 100 million verses and validate their effectiveness through a questionnaire survey of haiku poets. The main contributions of this paper are twofold. First, we establish procedures for demonstrating the validity of haiku evaluation systems, including the construction of evaluation datasets and subjective assessment through questionnaire surveys. Second, we investigate methods for haiku evaluation using an LLM and prompt engineering.
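To make the setting concrete, the following is a minimal sketch of one way prompt-based haiku evaluation with an LLM could look, assuming the OpenAI Python client (v1.x); the prompt wording, evaluation criteria, scoring scale, and model name are illustrative assumptions, not the prompts proposed in the paper.

from openai import OpenAI  # assumes the openai Python package (v1.x) is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def score_haiku(haiku: str) -> str:
    """Ask the model to reason about a haiku step by step, then emit a score."""
    # The criteria below (kigo, kireji, imagery, originality) and the 1-10
    # scale are illustrative assumptions, not the paper's actual prompt.
    prompt = (
        "You are an experienced haiku critic. Evaluate the following haiku.\n"
        "First reason step by step about its seasonal word (kigo), cutting "
        "word (kireji), imagery, and originality. Then output a final score "
        "from 1 (poor) to 10 (excellent) on its own line as 'Score: N'.\n\n"
        f"Haiku: {haiku}"
    )
    response = client.chat.completions.create(
        model="gpt-4",  # any chat-capable model would work here
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for repeatable evaluation
    )
    return response.choices[0].message.content

print(score_haiku("old pond / a frog leaps in / sound of water"))

Under this kind of setup, selection over a large database would amount to scoring candidate verses with such a prompt and keeping the highest-ranked ones.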