Journal
Communication & Cognition 2025, Vol. 58, issue 1-2
ISSN
0378-0880
e-ISSN
2953-1454
Title
IMPLEMENTING AND ASSESSING RETRIEVAL AUGMENTED GENERATION (RAG) FOR LLM-BASED DOCUMENTS QUERIES
Author
Peter Kaczmarski and Fernand Vandamme 
Pages
pp. 3 - 22
Keywords
AI, Artificial Intelligence, chunking, Chroma, document management, embedding, image analysis, James Joyce, Large Language Model, LLM, OpenAI, Python, q&a, RAG, Retrieval Augmented Generation, RAG enhancements, RAPTOR, scenario analysis, speech to text, Theory of Mind (ToM), Ulysses, vector database.
Abstract
In recent years, the AI technology referred to as RAG (Retrieval Augmented Generation) (Lewis, 2020) has gained a lot of attention. In the RAG approach, custom sources of information are used to supplement the knowledge encoded in an LLM (Large Language Model), thus solving the problem of adapting an LLM to cope with custom external information. Using the RAG scenario, various information processing use cases can be implemented, such as AI-based document management, AI-enhanced web search, and AI-based online service support. This paper outlines the main components of the RAG workflow: chunking and embedding the input documents, similarity search, and LLM-based user query processing. The RAG approach is illustrated via a Python implementation, which is used to validate the procedure on a simple example of processing a multi-topic document. Experimental results are discussed, showing the feasibility of this approach as well as the need for further research and enhancements, for example through the use of the RAPTOR concept (Sarthi, 2024).
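The workflow components named in the abstract (chunking, embedding, similarity search, and prompt construction for the LLM) can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the bag-of-words `embed` function and cosine similarity stand in for a neural embedding model and a vector database such as Chroma, and the sample document, query, and function names are assumptions for the sake of the example.

```python
import math
import re
from collections import Counter

def chunk_text(text, chunk_size=10):
    """Split a document into fixed-size word chunks (a simple chunking strategy)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def embed(text):
    """Toy embedding: bag-of-words term counts, standing in for a neural embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    """Similarity search: return the k chunks closest to the query embedding."""
    qv = embed(query)
    return sorted(chunks, key=lambda c: cosine(qv, embed(c)), reverse=True)[:k]

# Example multi-topic document (assumed for illustration only).
doc = ("James Joyce wrote Ulysses, a landmark modernist novel. "
       "Vector databases such as Chroma store embeddings for similarity search. "
       "Retrieval Augmented Generation combines retrieved context with an LLM prompt.")

chunks = chunk_text(doc, chunk_size=10)
context = retrieve("Which vector database stores embeddings for similarity search?", chunks, k=1)

# The retrieved chunk would then be injected into the LLM prompt.
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: ..."
```

In a full RAG pipeline, the retrieved chunks are prepended to the user query and sent to the LLM, so the model answers from the custom external sources rather than from its training data alone.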
Access