Summary
This post will cover the usage of guardrails in the context of an RAG application using Redis Stack as the vector store.
- Nvidia's guardrail package is used for the railed implementation.
- Langchain LCEL is used for the non-railed implementation.
- Content from the online Redis vector search documentation is used for the RAG content
- GUI is implemented with Chainlit
Application Architecture
This bot is operating within a Chainlit app. It has two modes of operation:
- 'chain' - no guardrails
- 'rails' - NeMo guardrails in place for both user inputs and LLM outputs
Screenshots
Bot without rails
This first screenshot shows the bot operating with no guardrails. It does just fine until an off-topic question is posed - then it cheerfully deviates from its purpose.
Bot with rails
Same series of questions here with guardrails enabled. Note that it keeps the user on topic now.
Code Snippets
Non-railed chain (LCEL)
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
retriever: BaseRetriever = redis.Redis.from_existing_index( | |
OpenAIEmbeddings(model='text-embedding-3-small', dimensions=512), | |
index_name=os.getenv('INDEX_NAME'), | |
redis_url=os.getenv('REDIS_URL'), | |
schema=os.getenv('SCHEMA') | |
).as_retriever(search_type='similarity_score_threshold', search_kwargs={'score_threshold':0.5}) | |
chain: Runnable = ( | |
{ 'chat_history': RunnablePassthrough(), 'input': RunnablePassthrough() } | |
| hub.pull("joeywhelan/rephrase") | |
| ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0) | |
| StrOutputParser() | |
| RunnableParallel({ 'question': RunnablePassthrough() }) | |
| { 'context': itemgetter('question') | retriever, 'question': itemgetter('question') } | |
| hub.pull('rlm/rag-prompt') | |
| ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0) | |
| StrOutputParser() | |
) |
Railed with NeMO Guardrails
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
config: RailsConfig = RailsConfig.from_path('./guardrails') | |
rails: LLMRails = LLMRails(config, verbose=False) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
define user asks question specifically about redis vector search | |
"how does redis index vectors?" | |
"what is the purpose of the EF_RUNTIME parameter?" | |
"what redis data structures can be used for vector search?" | |
"what vector distance metrics does redis support?" | |
define flow vector_question | |
user asks question specifically about redis vector search | |
# Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question. | |
$rephrased = ... | |
$answer = execute rag(question=$rephrased) | |
bot $answer |
Source
Copyright ©2024 Joey E Whelan, All rights reserved.