2.1. First study: Structured vs. unstructured data

The results of the first study illustrate the importance of data structuring for the training of a retrieval-augmented generation (RAG) system. Structured data outperformed unstructured formats such as PDF, which shows that the quality and structure of the data have a significant impact on the performance of such systems. Notably, a specially adapted Markdown format, which offers optimal structuring, delivered the best results. This underlines the need to prepare data specifically for RAG systems in order to maximise their effectiveness, and it provides a clear basis for further research into data preparation.
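To make concrete what preparing Markdown for a RAG system can involve, the following is a minimal sketch of heading-aware chunking, in which each chunk keeps the chain of headings above it so that a retriever sees the text together with its context. It is not the adapted format used in the study, whose details are not given here. The names `Chunk` and `split_markdown` are hypothetical, and the sketch assumes ATX-style headings (`#` to `######`) and no fenced code blocks containing lines that start with `#`.

```python
import re
from dataclasses import dataclass

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Chunk:
    heading_path: list[str]  # titles of the enclosing headings, outermost first
    text: str                # body text found directly under the last heading


def split_markdown(markdown: str) -> list[Chunk]:
    """Split a Markdown string into chunks, one per non-empty heading body.

    Illustrative only: a hypothetical helper, not the study's method.
    """
    chunks: list[Chunk] = []
    path: list[tuple[int, str]] = []  # stack of (level, title)
    body: list[str] = []

    def flush() -> None:
        # Emit the accumulated body under the current heading path.
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk([title for _, title in path], text))
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level than the new one.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks


example = """# Products
Intro text.

## Pricing
Plan A costs 10 EUR.

## Support
Email us.
"""

for chunk in split_markdown(example):
    print(" > ".join(chunk.heading_path), "|", chunk.text)
```

A PDF extracted to plain text has no comparable markers, so a splitter would have to fall back on fixed-size windows. That contrast is one plausible reason why structured input helps retrieval, though the sketch itself does not measure anything.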