
Enhanced AI RAG System with PHP

Stefan Pöltl
8 min read · Jan 27, 2025

Imagine chatting with any guide to find a solution to a problem you are currently facing. In this post, I'll show you how easy it is to parse a bunch of PDF files, index them, and use AI to chat with the extracted know-how. We will use technologies that run locally, so you don't have to expose your data to a cloud platform. In addition, I'll explain some steps to improve the results as much as possible.

Repository: https://github.com/stefpe/rag_system

Architecture

[Figure: software architecture of the RAG system]

Setup

Ollama

To be able to chat with our data, we need a large language model. Let's use Meta's open-source Llama model. There is a handy tool called Ollama to run it locally. Install Ollama by running this command in your terminal, or check https://ollama.com/download:

curl -fsSL https://ollama.com/install.sh | sh

There is also a Docker version available, but for performance reasons the local installation works much better on my MacBook.
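
Once Ollama is installed, you can quickly check that it works by listing the locally available models (right after installation the list will simply be empty):

ollama list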

We are going to use a model that works pretty well for generating embeddings. You can download it from the Ollama model library or install it via the terminal:

ollama pull bge-m3

BGE-M3 is a high-quality, multilingual embedding model that delivers very good results at finding the most semantically similar documents.
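
To give you an idea of how the embedding step can look from PHP, here is a minimal sketch that calls Ollama's REST API directly (Ollama listens on port 11434 by default; the embed() helper is just my illustration, not code from the repository):

<?php

// Minimal sketch: request an embedding vector for a text chunk
// from a locally running Ollama instance (default port 11434).
function embed(string $text): array
{
    $ch = curl_init('http://localhost:11434/api/embeddings');
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST => true,
        CURLOPT_HTTPHEADER => ['Content-Type: application/json'],
        CURLOPT_POSTFIELDS => json_encode([
            'model' => 'bge-m3',
            'prompt' => $text,
        ]),
    ]);
    $response = curl_exec($ch);
    curl_close($ch);

    // Ollama returns a JSON object with an "embedding" array.
    return json_decode($response, true)['embedding'];
}

$vector = embed('How do I reset the device to factory settings?');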

To answer questions we can use any LLM, e.g. the small Llama 3.2 model:

ollama pull llama3.2
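
As a sketch of the answering step, assuming Ollama's /api/chat endpoint and a set of already retrieved text chunks (the answerQuestion() helper and the prompt wording are my own illustration, not code from the repository):

<?php

// Sketch: answer a question with retrieved chunks as grounding context.
// Assumes llama3.2 is served by a local Ollama instance.
function answerQuestion(string $question, array $chunks): string
{
    $context = implode("\n---\n", $chunks);

    $ch = curl_init('http://localhost:11434/api/chat');
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST => true,
        CURLOPT_HTTPHEADER => ['Content-Type: application/json'],
        CURLOPT_POSTFIELDS => json_encode([
            'model' => 'llama3.2',
            'stream' => false,
            'messages' => [
                ['role' => 'system', 'content' => "Answer using only the following context:\n" . $context],
                ['role' => 'user', 'content' => $question],
            ],
        ]),
    ]);
    $response = curl_exec($ch);
    curl_close($ch);

    // With "stream" disabled, the full answer arrives in one JSON object.
    return json_decode($response, true)['message']['content'];
}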

The new DeepSeek-R1 model works pretty well too (reasoning included!):

ollama pull deepseek-r1:8b

Something you need to consider is that the model's reasoning takes some extra time. Depending on the model quality, the reasoning might mix up the context with its own thinking and ruin the final result.
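
One practical mitigation: DeepSeek-R1 emits its chain of thought between <think> and </think> tags, so you can strip that block before presenting the answer. A small sketch (the stripReasoning() helper is my own, not from the repository):

<?php

// Sketch: drop DeepSeek-R1's <think>...</think> reasoning block
// so only the final answer reaches the user.
function stripReasoning(string $answer): string
{
    return trim(preg_replace('/<think>.*?<\/think>/s', '', $answer));
}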

PHP Setup

For local PHP development, there is no alternative for me: I always use DDEV (https://ddev.com/).

DDEV is simple to install, easy to configure, and includes an amazing add-on system.

Once it’s installed you can run:

ddev config
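
For a plain PHP project, the configuration could look like this (the flags below are illustrative; adjust them to your project layout):

ddev config --project-type=php --php-version=8.3 --docroot=public
ddev start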
