May 10, 2024
HowTo

Create RAG ChatBot in minutes

In this example we're going to create a RAG chatbot from a source website, using both the Simple and Edge flow types.

Before we delve into the implementation, let's first explore RAG. (Note: this example was built on Nouro Flow Alpha; a future version of this post will link to an updated example with a better API and more features.)

What is RAG?

RAG stands for Retrieval-Augmented Generation, a technique used to enhance the knowledge of Large Language Models (LLMs) by incorporating additional data.

While LLMs possess the ability to reason across diverse subjects, their knowledge is confined to public data available up to a specific training point in time. If one aims to develop AI applications capable of analyzing private or post-cutoff date data, augmenting the model's knowledge with specific information becomes necessary. The process of integrating relevant information into the model prompt is known as Retrieval Augmented Generation (RAG).
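The core idea can be sketched in a few lines: retrieve the most relevant documents for a query, then inject them into the prompt sent to the model. The word-overlap scoring and prompt template below are illustrative stand-ins, not Nouro Flow or LangChain APIs.

```python
# Toy sketch of RAG's central move: augmenting the model prompt
# with retrieved context before generation.

def retrieve(query, documents, k=1):
    """Rank documents by naive word overlap with the query (stand-in
    for a real embedding-based similarity search)."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, context_docs):
    """Inject the retrieved context into the prompt sent to the LLM."""
    context = "\n".join(context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Nouro Flow supports LangChain components.",
    "RAG augments an LLM prompt with retrieved data.",
]
query = "What does RAG do to the prompt?"
prompt = build_prompt(query, retrieve(query, docs))
```

In a production system the retriever is backed by a vector store and the prompt goes to a real chat model; only the augmentation pattern carries over.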

RAG Architecture

A typical RAG application comprises two primary components:

  1. Indexing: This involves a pipeline for ingesting data from a source and indexing it, typically conducted offline.

  2. Retrieval and Generation: The core RAG chain operates by receiving user queries at runtime, retrieving relevant data from the index, and then passing it to the model.

The common sequence from raw data to answer usually involves: [Image: RAG indexing pipeline]
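The indexing component above can be sketched as a load → split → embed → store pipeline. The character-based chunker and word-count "embedding" below are simplified stand-ins for a real document loader, token-aware splitter, and vector database.

```python
# Minimal sketch of the indexing stage: split a loaded document into
# overlapping chunks, "embed" each chunk, and keep (chunk, vector) pairs.
from collections import Counter

def split_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping character chunks."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

def embed(chunk):
    """Stand-in embedding: a word-count vector instead of a dense vector."""
    return Counter(chunk.lower().split())

# "Loading" is reduced to a literal string here.
source = ("RAG stands for Retrieval Augmented Generation. "
          "It enhances LLMs with external data at query time.")
index = [(chunk, embed(chunk)) for chunk in split_text(source)]
```

Overlapping chunks keep sentences that straddle a chunk boundary retrievable from at least one chunk, which is why most real splitters expose an overlap parameter too.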

Let's Implement RAG on Nouro Flow

Nouro Flow offers built-in support for LangChain components, which simplifies our implementation.

  1. Loading and splitting documents from the source: [Screenshot: loading and splitting nodes]

  2. Creating a vector store: this enables storing the documents for easy retrieval. [Screenshot: vector store node] Note: the Vector Store node should return a retriever, which can be set in the node options (click on the node, and its options will appear on the right side).

  3. Creating a runnable sequence: compose a sequence of llm-chat, prompt, and the vector store. [Screenshot: runnable sequence] For ChatOpenAI, set your credentials and the model to gpt-3.5-turbo; for LangHub, use rlm/rag-prompt.

  4. Executing the runnable sequence: execute it with the Executor node's stream function. Before outputting the stream to an answer, parse the runnable sequence stream into a string stream using the runnable sequence stream formatter node. [Screenshot: executor and stream formatter nodes]
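The stream-formatting step in item 4 is easy to picture in code: a streamed LLM response usually yields message-chunk objects, which must be converted to plain strings before they can be displayed. The `Chunk` class and `fake_llm_stream` generator below are invented stand-ins for whatever the Executor node receives, not Nouro Flow APIs.

```python
# Rough sketch of step 4: run the chain's stream and format it.

class Chunk:
    """Stand-in for a streamed message chunk with a .content field."""
    def __init__(self, content):
        self.content = content

def fake_llm_stream(prompt):
    """Pretend LLM that streams its answer word by word."""
    for word in "RAG retrieves context first.".split():
        yield Chunk(word + " ")

def format_stream(chunk_stream):
    """Stream formatter: turn message chunks into plain strings,
    analogous to the runnable sequence stream formatter node."""
    for chunk in chunk_stream:
        yield chunk.content

# Consume the formatted stream; a UI would render each piece as it arrives.
answer = "".join(format_stream(fake_llm_stream("What is RAG?"))).strip()
```

The formatter is a pass-through generator, so streaming latency is preserved: each string piece is available as soon as its chunk arrives.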

In Conclusion

In this example, we easily created a RAG chatbot in Nouro Flow. I recommend the 'Edge' flow type over 'Simple', as it completes roughly twice as fast. For a demonstration, see the following videos: