Evaluating and Enhancing RAG Systems through Test and Source Analysis

Zelan Shi; Oranus Kotsuwan; Kazunori Matsumoto

doi:10.52731/liir.v006.454

Authors

Zelan Shi Kanagawa Institute of Technology
Oranus Kotsuwan Kanagawa Institute of Technology
Kazunori Matsumoto Kanagawa Institute of Technology

DOI:

https://doi.org/10.52731/liir.v006.454

Keywords:

Generative AI, RAG, white box test, black box test

Abstract

This paper presents a prototype Retrieval-Augmented Generation (RAG) system developed for university curriculum guides and evaluates its performance through experiments. RAG, which combines large language models (LLMs) with independent information sources, is emerging as a way to address generative AI challenges such as hallucinations and the lack of domain-specific knowledge. By prioritizing information from dedicated databases, RAG can enhance factual accuracy and reduce hallucinations. In experimental trials, the system performed reliably in some cases, although issues related to the quality of the information sources and to data extraction were identified. These findings underscore the importance of robust testing and systematic revision of information sources. This paper outlines the system implementation, guidelines for improvement, and the experimental results. We find that an iterative improvement process is crucial for enhancing the overall quality of RAG. This process involves not only optimizing the retrieval and generation mechanisms but also continuously reviewing and refining the information sources themselves, so that the system can adapt systematically and sustain relevance and response accuracy over time.
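The test-and-revise loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the passages in `SOURCES`, the questions in `TEST_CASES`, and all function names are hypothetical, and a simple bag-of-words cosine similarity stands in for whatever retriever the prototype actually uses. The check inspects the intermediate retrieval result rather than the final generated answer, so it is closer in spirit to a white box test.

```python
import math
import re
from collections import Counter

# Hypothetical curriculum-guide passages standing in for the information source.
SOURCES = {
    "credits": "Students must earn 124 credits to graduate.",
    "registration": "Course registration opens in April and September.",
    "thesis": "The graduation thesis is required in the fourth year.",
}

# Hypothetical test cases: (question, id of the passage that should be retrieved).
TEST_CASES = [
    ("How many credits are needed to graduate?", "credits"),
    ("When does course registration open?", "registration"),
    ("Is a thesis required?", "thesis"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, sources, k=1):
    """Return the ids of the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        sources,
        key=lambda key: cosine(q, Counter(tokenize(sources[key]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, sources, k=1):
    """Assemble the prompt that would be passed to the LLM."""
    context = "\n".join(sources[key] for key in retrieve(question, sources, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def run_retrieval_tests(sources, cases):
    """Return the questions whose expected passage was not retrieved."""
    return [q for q, expected in cases if expected not in retrieve(q, sources)]


if __name__ == "__main__":
    print("failing questions:", run_retrieval_tests(SOURCES, TEST_CASES))
    # Simulate a defect in the information source: a missing passage.
    broken = {k: v for k, v in SOURCES.items() if k != "credits"}
    print("after removing a passage:", run_retrieval_tests(broken, TEST_CASES))
```

Each failing question points either at the retriever or at a gap in the sources. In this sketch, deleting the `credits` passage makes the first test fail, which is the kind of signal that would prompt a revision of the information source rather than of the retrieval mechanism.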

