Dwain Barnes

Chat Privately with Your Documents: Local RAG with Flowise




In our last blog, How to Build Your Own Local Low-Code Chatbot Using Ollama and Flowise, we explored the basics of installing Flowise and Ollama, and walked through creating a simple chatbot using local resources. If you haven’t checked that out yet, it’s a great starting point!

This time, we’re diving deeper into a practical application: building a privacy-focused chatbot for interacting with your sensitive documents like PDFs. By leveraging Retrieval-Augmented Generation (RAG), we’ll ensure your data stays securely within your infrastructure while delivering accurate, context-aware responses.


 

Why Focus on Privacy with Local RAG?

When working with sensitive information—contracts, medical records, or proprietary documents—sending data to external servers isn’t an option. That’s where Local RAG shines. It combines the strengths of Retrieval-Augmented Generation with local processing to ensure:

  • Complete Data Security: All processing happens within your infrastructure.

  • Custom Control: You own the workflow, from document parsing to response generation.

  • Enhanced Accuracy: Tailor the chatbot to handle specific documents and contexts.

By the end of this guide, you’ll have a working chatbot that privately and securely interacts with your documents while leveraging the powerful tools in Flowise and Ollama.



 

Step-by-Step: Creating a Privacy-Focused Chatbot

Here’s how to build it, step by step.

1. Setting Up Flowise

If you followed our previous blog, you already know how to install and start Flowise AI. But as a quick refresher:

  1. Install Node.js (Download Node.js)

  2. Install Flowise:

npm install -g flowise
npx flowise start
  3. Access Flowise at http://localhost:3000.

Flowise gives you a visual, low-code interface to orchestrate powerful workflows for LLMs.

For more installation options, see the Flowise documentation.


 

2. Creating a New Project in Flowise

After launching Flowise:

  • Click Add New to create a project.

  • You’ll see a blank canvas—this is where we’ll build the chatbot workflow.


 

3. Loading Your Document

We’ll start by loading the document you want the chatbot to interact with.

  1. Drag and drop a PDF Loader onto the canvas.

  2. Select your document (e.g., a report, resume, or legal contract).

This step lets the chatbot “read” the document for later processing.
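Behind the scenes, the loader simply extracts the raw text from each page. As an illustrative sketch only, here is roughly what that looks like in Python using the pypdf library (an assumption for illustration; Flowise uses its own JavaScript-based parser, and the file name here is hypothetical):

from pypdf import PdfReader

# Extract plain text from every page of the PDF, roughly what the
# PDF Loader node does internally. pypdf is an illustrative stand-in.
reader = PdfReader("financial_report.pdf")  # hypothetical file name
text = "\n".join(page.extract_text() or "" for page in reader.pages)
print(f"Extracted {len(text)} characters from {len(reader.pages)} pages")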



 

4. Splitting the Document into Chunks

To process the document efficiently, we need to break it into smaller, manageable pieces.

  • Use a Recursive Character Text Splitter for this.

  • Use the default settings:

    • Chunk Size: 1000 characters.

    • Overlap: 200 characters (to maintain context across chunks).

Connect the splitter to the PDF Loader.
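Flowise’s splitter node wraps LangChain’s implementation, so if you want to see those two settings in action, here is a rough Python equivalent (assuming the langchain-text-splitters package; Flowise itself runs the JavaScript version):

from langchain_text_splitters import RecursiveCharacterTextSplitter

# Same defaults as the Flowise node: 1000-character chunks with a
# 200-character overlap so neighbouring chunks share context.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(text)  # `text` from the loading sketch above
print(f"Split into {len(chunks)} chunks")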



 

5. Creating Embeddings with Ollama

Embeddings are like “map coordinates” for text, enabling the chatbot to understand semantic relationships between chunks.

Drag an Ollama Embeddings node onto the canvas and connect it to the Text Splitter. Because Ollama runs on your machine, the embeddings are created locally, safeguarding your privacy.


I will be using the Nomic embedding model (nomic-embed-text). To download it, type the following in your command prompt:

ollama pull nomic-embed-text

Now enter nomic-embed-text in the Model Name field of the Ollama Embeddings node.
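To see what the node is doing under the hood, here is a minimal sketch of the same call against Ollama’s local REST API (assuming Python with the requests library):

import requests

def embed(text: str) -> list[float]:
    # One request per chunk; Ollama listens on port 11434 by default
    # and returns a vector of floats for the given text.
    resp = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": text},
    )
    resp.raise_for_status()
    return resp.json()["embedding"]

vectors = [embed(chunk) for chunk in chunks]  # `chunks` from the splitting step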


 

6. Storing Embeddings in a Vector Database

To enable quick retrieval of relevant document chunks, we’ll store the embeddings in a Vector Store.

  • Use an In-Memory Vector Store for simplicity.

  • Connect the Vector Store to the Embedding node.
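An in-memory store is conceptually very simple: it keeps every vector in a list and ranks them by similarity at query time. A toy sketch, using numpy and cosine similarity (a common default for vector stores, assumed here for illustration):

import numpy as np

class InMemoryVectorStore:
    """Toy vector store: (vector, text) pairs searched by cosine similarity."""

    def __init__(self):
        self.vectors, self.texts = [], []

    def add(self, vector, text):
        self.vectors.append(np.asarray(vector))
        self.texts.append(text)

    def search(self, query_vector, k=4):
        # Rank every stored chunk by cosine similarity to the query vector.
        q = np.asarray(query_vector)
        scores = [v @ q / (np.linalg.norm(v) * np.linalg.norm(q))
                  for v in self.vectors]
        return [self.texts[i] for i in np.argsort(scores)[::-1][:k]]

store = InMemoryVectorStore()
for vec, chunk in zip(vectors, chunks):
    store.add(vec, chunk)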


 

7. Retrieving Relevant Data

When a user submits a query, the chatbot will search the Vector Store for relevant chunks.

  • Drag a Conversational Retrieval Chain node onto the canvas.

  • Connect it to the Vector Store.

This chain ensures the chatbot retrieves contextually relevant data and maintains conversational history for follow-ups.

This is different from the Conversational Chain we used last time, because here the chatbot needs to retrieve information from the document.
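In essence, the chain embeds each incoming question, pulls the closest chunks from the store, and folds them into the prompt along with the conversation so far. A simplified sketch, reusing the helpers above (the real chain also rephrases follow-up questions using the history before retrieving):

def build_prompt(question: str, history: list[str]) -> str:
    # Retrieve the chunks most similar to the question and ground the
    # prompt in them, together with the running conversation.
    context = "\n\n".join(store.search(embed(question), k=4))
    past = "\n".join(history)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Conversation so far:\n{past}\n\n"
        f"Question: {question}\nAnswer:"
    )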




 

8. Generating Responses with ChatOllama

Finally, let’s generate responses based on the retrieved document chunks.

  • Drag a ChatOllama node onto the canvas.

  • Configure:

    • Model Name: llama3.2 (the 3B variant).

    • Temperature: 0.3 (for accurate, focused responses).

Connect the chain to ChatOllama, completing the workflow.

We have changed the model since the last blog: this one handles RAG better and runs on most machines. We are now using Llama 3.2 3B. To get this model, type the following in your command prompt:

ollama run llama3.2
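For reference, the generation step boils down to one more call to Ollama’s local API, with the grounded prompt and the low temperature set above (again a sketch in Python, reusing the helpers from the earlier steps):

def answer(question: str, history: list[str]) -> str:
    # Send the retrieval-grounded prompt to the local Llama 3.2 model.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.2",
            "prompt": build_prompt(question, history),
            "stream": False,
            "options": {"temperature": 0.3},  # low temperature for focused answers
        },
    )
    resp.raise_for_status()
    return resp.json()["response"]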

 
9. Putting It All Together







Link up all the nodes at their corresponding connection points and save the chatflow.
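Once saved, the chatflow is also exposed through Flowise’s REST API, so you can query your private chatbot from scripts as well as from the browser. A minimal example in Python (the chatflow ID below is a placeholder; copy the real one from the Flowise UI):

import requests

CHATFLOW_ID = "your-chatflow-id"  # placeholder: copy the real ID from Flowise
resp = requests.post(
    f"http://localhost:3000/api/v1/prediction/{CHATFLOW_ID}",
    json={"question": "What were the main highlights from Q2?"},
)
print(resp.json())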


 

10. Uploading the PDF Document



Click the Upload File button and select the PDF you would like to chat with.



Once you have done this, click the upsert icon to upsert the document into the vector store.



 

Example Use Case

Imagine you’ve uploaded a financial report PDF.

  • User Question: “What were the main highlights from Q2?”

  • Processing:

    • The chatbot splits the document into chunks, creates embeddings, and stores them in the Vector Store.

    • It retrieves the most relevant sections for the query.

    • ChatOllama uses these sections to generate a context-aware response.

  • Output: A concise summary of Q2 highlights is returned—all processed securely on your machine.


In my example, I will use the new OWASP Top 10 for LLM Applications 2025 PDF and ask about "LLM01:2025 Prompt Injection" to get more information on prompt injection.


As you can see, the answer is drawn from the PDF we uploaded.


 

Why This Approach Stands Out

Flowise for Simplified Orchestration

Flowise’s low-code interface makes it easy to design workflows without deep programming expertise.

Local Hosting with Ollama and ChatOllama

By hosting models locally with Ollama, you maintain full control over your data—essential for compliance in industries like healthcare, finance, and legal.

Seamless Integration

Components like text splitters, vector stores, and retrieval chains work together effortlessly, delivering fast and accurate results.


 

Ready to Build Your Own Private RAG Chatbot?

With this guide, you’re now equipped to create a chatbot that securely interacts with your sensitive documents. Whether it’s a legal contract, a client report, or a personal project, you can confidently keep your data private while harnessing the power of modern AI.

For those looking to take the next step, stay tuned! In our upcoming blog, we’ll show you how to deploy this chatbot on your website for seamless user interaction.

Let us know what you think—and if you try building your own, we’d love to hear about your experience!




If you would like more help with Flowise, I have created a GPT called Flowise Ally. Chat with it here.


If you would like to import the JSON file I have created, please visit my GitHub.
