Qualitative Evaluations of Ideas created by Generative AI

Authors

  • Hiroaki FURUKAWA, The University of Kitakyushu

DOI:

https://doi.org/10.52731/liir.v004.168

Keywords:

Creativity, Generative AI, GPT-3, Idea evaluation, Qualitative evaluation

Abstract

This paper describes the difference between GPT-3 and humans in the qualitative evaluation of ideas. GPT-3 is expected to serve as an ‘Artificial General Intelligence’ that, unlike conventional language models, requires no pre-training for specific applications or purposes. This study aims to validate the usefulness of ideas created by GPT-3. GPT-3 was evaluated under three in-context learning settings: zero-shot, one-shot, and few-shot. The qualitative evaluation of ideas was conducted on three criteria: fluency, feasibility, and originality. Comparative experiments were conducted between the evaluation results of GPT-3-created ideas and human-created ideas, as well as among the in-context learning settings for tasks in GPT-3. The results suggested that human-created ideas were superior to GPT-3-created ideas in originality; moreover, among the three settings, few-shot achieved the highest originality.
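The three in-context learning settings named in the abstract differ only in how many worked examples precede the task in the prompt. A minimal sketch of that distinction is below; the instruction wording and the example idea-generation tasks are hypothetical placeholders, not the paper's actual prompts.

```python
# Hypothetical worked examples (question, idea) used as in-context shots.
EXAMPLES = [
    ("How else could a paperclip be used?", "As a makeshift zipper pull."),
    ("How else could a brick be used?", "As a bookend."),
]

def build_prompt(task: str, n_shots: int) -> str:
    """Build a zero-shot (n_shots=0), one-shot (1), or few-shot (>1) prompt.

    Each shot is a completed task/idea pair; the target task is appended
    last with an empty 'Idea:' slot for the model to complete.
    """
    blocks = [f"Task: {q}\nIdea: {a}" for q, a in EXAMPLES[:n_shots]]
    blocks.append(f"Task: {task}\nIdea:")
    return "\n\n".join(blocks)

zero_shot = build_prompt("How else could a newspaper be used?", 0)
one_shot = build_prompt("How else could a newspaper be used?", 1)
few_shot = build_prompt("How else could a newspaper be used?", 2)
```

The resulting string would be sent as the prompt to GPT-3; only the number of demonstration pairs changes between the three settings, which is what the paper's comparison varies.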


Published

2023-12-20