Thursday, February 22, 2024

Redis RAG with Nvidia NeMoGuardrails

Summary


This post will cover the usage of guardrails in the context of an RAG application using Redis Stack as the vector store.  
  • Nvidia's guardrail package is used for the railed implementation.
  • Langchain LCEL is used for the non-railed implementation.
  • Content from the online Redis vector search documentation is used for the RAG content
  • GUI is implemented with Chainlit


Application Architecture


This bot is operating within a Chainlit app.  It has two modes of operation:  
  • 'chain' - no guardrails
  • 'rails' - NeMo guardrails in place for both user inputs and LLM outputs


Screenshots


Bot without rails

This first screenshot shows the bot operating with no guardrails.  It does just fine until an off-topic question is posed - then it cheerfully deviates from its purpose.



Bot with rails

Same series of questions here with guardrails enabled.  Note that it keeps the user on topic now.



Code Snippets

Non-railed chain (LCEL)



Railed with NeMO Guardrails



Source


Copyright  ©2024 Joey E Whelan, All rights reserved.

Monday, January 15, 2024

Change Data Capture w/Redis Enterprise

Summary

Redis Enterprise has the capability for continuous data integration with 3rd party data sources.  This capability is enabled via the Redis Data Integration (RDI) product.  With RDI, change data capture (CDC) can be achieved with all the major SQL databases for ingress.  Similarly, in the other direction, updates to Redis can be continuously written to 3rd party targets via the write-behind functionality of RDI.  

This post covers a demo-grade environment of Redis Enterprise + RDI with ingress and write-behind integrations with the following SQL databases:  Oracle, MS SQL, Postgres, and MySQL.  All components are containerized and run from a Docker environment.

Architecture


Ingress



Write-behind



Code Snippets

Docker Compose - Redis Enterprise Node



Docker Compose - Oracle Enterprise



Docker Compose - Debezium



RDI Ingress w/Prometheus Integration


Source


Copyright ©1993-2024 Joey E Whelan, All rights reserved.

Monday, January 8, 2024

Document AI with Apache Airflow

Summary

In this post, I cover an approach to a document AI problem using a task flow implemented in Apache Airflow.  The particular problem is around the de-duplication of invoices.  This comes up in payment provider space.  I use Azure AI Document Intelligence for OCR, Azure OpenAI for vector embeddings, and Redis Enterprise for vector search.

Architecture



Code Snippets


File Sensor DAG


OCR DAG


OCR Client (Azure AI Doc Intelligence)


Embedding DAG


Embedding Client (Azure OpenAI)


Vector Search DAG


Vector Search Client (Redis Enterprise)


Source


Copyright ©1993-2024 Joey E Whelan, All rights reserved.

Sunday, December 31, 2023

HAProxy with Redis Enterprise

Summary

This is Part 2 of a two-part series on the implementation of a contact center ACD using Redis data structures.  This part is focused on the network configuration.  In particular, I explain the configuration of HAProxy load balancing with VRRP redundancy in a Redis Enterprise environment.  To boot, I explain some of the complexities of doing this inside a Docker container environment.

Network Architecture


Load Balancing Configuration


HAProxy w/Keepalived

Docker Container

Dockerfile and associated Docker compose script below for two instances of HAProxy w/keepalived.  Note the default start-up for the HAProxy container is overridden with a CMD to start keepalived and haproxy.

Keepalived Config

VRRP redundancy of the two HAProxy instances is implemented with keepalived.  Below is the config for the Master instance.  The Backup instance is identical except for the priority.

Web Servers

I'll start with the simplest load-balancing scenario - web farm.


Docker Container

Below is the Dockerfile and associated Docker compose scripting for a 2-server deployment of Python FastAPI.  Note that no IP addresses are assigned and multiple instances are deployed via Docker compose 'replicas'.


HAProxy Config

Below are the front and backend configurations.  Note the use of Docker's DNS server to enable dynamic mapping of the web servers via a HAProxy server template.

Redis Enterprise Components

Redis Enterprise can provide its own load balancing via internal DNS servers.  For those that do not want to use DNS, external load balancing is also supported.  Official Redis documentation on the general configuration of external load balancing is here.  I'm going to go into detail on the specifics of setting this up with the HAProxy load balancer in a Docker environment.

Docker Containers

A three-node cluster is provisioned below.  Note the ports that are opened:
  • 8443 - Redis Enterprise Admin Console
  • 9443 - Redis Enterprise REST API
  • 12000 - The client port configured for the database.

RE Database Configuration

Below is a JSON config that can be used via the RE REST API to create a Redis database.  Note the proxy policy.  "all-nodes" enables a database client connection point on all the Redis nodes.

RE Cluster Configuration

In the start.sh script, this command below is added to configure redirects in the Cluster (per the Redis documentation).

HAProxy Config - RE Admin Console

Redis Enterprise has a web interface for configuration and monitoring (TLS, port 8443).  I configure back-to-back TLS sessions below with a local SSL cert for the front end.  Additionally, I configure 'sticky' sessions via cookies.

HAProxy Config - RE REST API

Redis Enterprise provides a REST API for programmatic configuration and provisioning (TLS, port 9443).  For this scenario, I simply pass the TLS sessions through HAProxy via TCP.

HAProxy Config - RE Database

A Redis Enterprise database can have a configurable client connection port.  In this case, I've configured it to 12000 (TCP).  Note in the backend configuration I've set up a Layer 7 health check that will attempt to create an authenticated Redis client connection, send a Redis PING, and then close that connection.

Source


https://github.com/Redislabs-Solution-Architects/basic-acd

Copyright ©1993-2024 Joey E Whelan, All rights reserved.

Basic ACD with Redis Enterprise

Summary

This post covers a contact ACD implementation I've done utilizing Redis data structures.  The applications are written in Python.  The client interface is implemented as REST API via FastAPI.  An internal Python app (Dispatcher) is used to monitor and administer the ACD data structures in Redis.  Docker containers are used for architectural components.


Application Architecture



Data Structures


Contact, Queue


Contacts are implemented as Redis JSON objects.  Each contact has an associated array of skills necessary to service that contact.  Example:  English language proficiency.

A single queue for all contacts is implemented as a Redis Sorted Set.  The members of the set are the Redis key names of the contacts.  The associated scores are millisecond timestamps of the time the contact entered the queue.  This allows for FIFO queue management  


Agent


Agents are implemented as Redis JSON objects.  Agent meta-data is stored as simple properties.  Agent skills are maintained as arrays.  The redis-py implementation of Redlock is used to ensure mutual exclusion to agent objects.


Agent Availability


Redis Sorted Sets are also used to track Agent availability.  A sorted set is created per skill.  The members of that set are the Redis keys for the agents that are available with the associated skill.  The associated scores are millisecond timestamps of the time the agent became available.  This use of sorted sets allows for multi-skill routing to the longest available agent (LAA).


Operations


Agent Targeting 


Routing of contacts to agents is performed by multiple Dispatcher processes.  Each Dispatcher is running an infinite loop that does the following:
  • Pop the oldest contact from the queue
  • Perform an intersection of the availability sets for the skills necessary for that contact
  • If there are agent(s) available, assign that agent to this contact and set the agent to unavailable.
  • If there are no agents available with the necessary skills, put the contact back in the queue

Source


Sunday, November 19, 2023

DICOM Image Caching with Redis

 Summary

This post covers a demonstration of the usage of Redis for caching DICOM imagery.  I use a Jupyter Notebook to step through loading and searching DICOM images in a Redis Enterprise environment.

Architecture




Redis Enterprise Environment

Screen-shot below of the resulting environment in Docker.



Sample DICOM Image

I use a portion of sample images included with the Pydicom lib.  Below is an example:


Code Snippets

Data Load

The code below loops through the Pydicom-included DICOM files.  Those that contain the meta-data that is going to be subsequently used for some search scenarios are broken up into 5 KB chunks and stored as Redis Strings.  Those chunks and the meta-data are then saved to a Redis JSON object.  The chunks' Redis key names are stored as an array in that JSON object.

Search Scenario 1

This code retrieves all the byte chunks for a DICOM image where the Redis key is known.  Strictly, speaking this isn't a 'search'.  I'm simply performing a JSON GET for a key name.

Search Scenario 2

The code below demonstrates how to put together a Redis Search on the image meta-data.  In this case, we're looking for a DICOM image with a protocolName of 194 and studyDate in 2019.

Source


Copyright ©1993-2024 Joey E Whelan, All rights reserved.

Sunday, November 5, 2023

Redis Vector Database Sizing Tool

Summary

In this post, I cover a utility I wrote for observing Redis vector data and index sizes with varying data types and index parameters.  The tool creates a single-node, single-shard Redis Enterprise database with the Search and JSON modules enabled.

Code Snippets

Constants and Enums



Redis Index Build and Data Load


Sample Results


Source


Copyright ©1993-2024 Joey E Whelan, All rights reserved.