Enhanced AI RAG System with PHP

Stefan Pöltl
8 min read · Jan 27, 2025


Imagine chatting with any guide to find a solution to a problem you are currently facing. In this post, I'll show you how easy it is to parse a bunch of PDF files, index them, and use AI to chat with the extracted know-how. We will use technologies that run locally, so you don't have to expose your data to a cloud platform. I'll also explain some steps to improve the results as much as possible.

Repository: https://github.com/stefpe/rag_system

Architecture

(Figure: software architecture of the RAG system)

Setup

Ollama

To be able to chat with our data, we need a large language model. Let's use Meta's open-source model, Llama. There is a handy tool called Ollama to run it locally. Install Ollama by running this command in your terminal (or see https://ollama.com/download):

curl -fsSL https://ollama.com/install.sh | sh

There is also a Docker version available, but due to performance reasons on my MacBook, the local installation works much better.
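With Ollama installed, the chat model itself still needs to be pulled. A minimal sketch, assuming the `llama3.1` tag (the article just says "Llama", so any Llama variant that fits your hardware works):

```shell
# Download a Llama model (the exact tag is an assumption, not from the article)
ollama pull llama3.1

# Quick smoke test: ask the model a question directly from the terminal
ollama run llama3.1 "Summarize what a RAG system does in one sentence."
```

The `ollama run` check is a good way to confirm the model fits in memory before wiring it into the PHP application.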

We are going to use a model that works pretty well for generating embeddings. You can download it from Ollama's model library or install it via the terminal:

ollama pull bge-m3

BGE-M3 is a high-quality embedding model that delivers really good results in finding the most…
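The section breaks off here, but the embedding step it describes can already be exercised against a running Ollama instance. A minimal sketch using Ollama's HTTP embeddings endpoint (default port 11434; the endpoint and field names follow Ollama's current REST API and may differ across versions):

```shell
# Request an embedding vector for a text chunk (requires `ollama serve` running)
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "bge-m3", "prompt": "PHP makes it easy to build a RAG pipeline."}'
# The response is JSON containing an "embedding" array of floats,
# which the indexer can store alongside the original chunk text.
```

The same endpoint is what a PHP client would call for each extracted PDF chunk when building the index.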
