May 10, 2024
HowTo

Create RAG ChatBot in minutes

In this example we're going to create a RAG chatbot from a source website, using both the Simple and Edge flow types.

Before we delve into the implementation, let's first explore RAG. (Note: this example was built on Nouro Flow Alpha; a future version of this post will link to an updated example with a better API and more features.)

What is RAG?

RAG stands for Retrieval-Augmented Generation, a technique used to enhance the knowledge of Large Language Models (LLMs) by incorporating additional data.

While LLMs possess the ability to reason across diverse subjects, their knowledge is confined to public data available up to a specific training point in time. If one aims to develop AI applications capable of analyzing private or post-cutoff date data, augmenting the model's knowledge with specific information becomes necessary. The process of integrating relevant information into the model prompt is known as Retrieval Augmented Generation (RAG).
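The core idea can be sketched in a few lines: retrieve the most relevant documents for a query, then inject them into the prompt sent to the model. The word-overlap scoring and prompt template below are illustrative stand-ins, not Nouro Flow or LangChain APIs.

```python
# Toy sketch of RAG's central move: augmenting the model prompt
# with retrieved context before generation.

def retrieve(query, documents, k=1):
    """Rank documents by naive word overlap with the query (stand-in
    for a real embedding-based similarity search)."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, context_docs):
    """Inject the retrieved context into the prompt sent to the LLM."""
    context = "\n".join(context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Nouro Flow supports LangChain components.",
    "RAG augments an LLM prompt with retrieved data.",
]
query = "What does RAG do to the prompt?"
prompt = build_prompt(query, retrieve(query, docs))
```

In a production system the retriever is backed by a vector store and the prompt goes to a real chat model; only the augmentation pattern carries over.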

RAG Architecture

A typical RAG application comprises two primary components:

  1. Indexing: This involves a pipeline for ingesting data from a source and indexing it, typically conducted offline.

  2. Retrieval and Generation: The core RAG chain operates by receiving user queries at runtime, retrieving relevant data from the index, and then passing it to the model.

The common sequence from raw data to answer usually involves: [Image: RAG indexing pipeline]
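The indexing component above can be sketched as a load → split → embed → store pipeline. The character-based chunker and word-count "embedding" below are simplified stand-ins for a real document loader, token-aware splitter, and vector database.

```python
# Minimal sketch of the indexing stage: split a loaded document into
# overlapping chunks, "embed" each chunk, and keep (chunk, vector) pairs.
from collections import Counter

def split_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping character chunks."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

def embed(chunk):
    """Stand-in embedding: a word-count vector instead of a dense vector."""
    return Counter(chunk.lower().split())

# "Loading" is reduced to a literal string here.
source = ("RAG stands for Retrieval Augmented Generation. "
          "It enhances LLMs with external data at query time.")
index = [(chunk, embed(chunk)) for chunk in split_text(source)]
```

Overlapping chunks keep sentences that straddle a chunk boundary retrievable from at least one chunk, which is why most real splitters expose an overlap parameter too.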

Let's Implement RAG on Nouro Flow

Nouro Flow offers built-in support for LangChain components, which simplifies our implementation.

  1. Loading and splitting documents from the source: [Screenshot: loading and splitting nodes]

  2. Creating a vector store: this enables storing the documents for easy retrieval. [Screenshot: vector store node] Note: the Vector Store node should return a retriever, which can be set in the node options (click on the node, and its options will appear on the right side).

  3. Creating a runnable sequence: compose a sequence of llm-chat, prompt, and the vector store. [Screenshot: runnable sequence] For ChatOpenAI, set your credentials and the model to gpt-3.5-turbo; for LangHub, use rlm/rag-prompt.

  4. Executing the runnable sequence: execute it with the Executor node's stream function. Before outputting the stream to an answer, parse the runnable sequence stream into a string stream using the runnable sequence stream formatter node. [Screenshot: executor and stream formatter nodes]
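The stream-formatting step in item 4 is easy to picture in code: a streamed LLM response usually yields message-chunk objects, which must be converted to plain strings before they can be displayed. The `Chunk` class and `fake_llm_stream` generator below are invented stand-ins for whatever the Executor node receives, not Nouro Flow APIs.

```python
# Rough sketch of step 4: run the chain's stream and format it.

class Chunk:
    """Stand-in for a streamed message chunk with a .content field."""
    def __init__(self, content):
        self.content = content

def fake_llm_stream(prompt):
    """Pretend LLM that streams its answer word by word."""
    for word in "RAG retrieves context first.".split():
        yield Chunk(word + " ")

def format_stream(chunk_stream):
    """Stream formatter: turn message chunks into plain strings,
    analogous to the runnable sequence stream formatter node."""
    for chunk in chunk_stream:
        yield chunk.content

# Consume the formatted stream; a UI would render each piece as it arrives.
answer = "".join(format_stream(fake_llm_stream("What is RAG?"))).strip()
```

The formatter is a pass-through generator, so streaming latency is preserved: each string piece is available as soon as its chunk arrives.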

In Conclusion

In this example, we easily created a RAG chatbot in Nouro Flow. I recommend the 'Edge' flow type over 'Simple', as it completes roughly twice as fast. For a demonstration, see the following videos: