Monday, January 15, 2024

Change Data Capture w/Redis Enterprise

Summary

Redis Enterprise supports continuous data integration with third-party data sources through the Redis Data Integration (RDI) product.  For ingress, RDI provides change data capture (CDC) from all the major SQL databases.  In the other direction, RDI's write-behind functionality continuously writes updates made in Redis to third-party targets.

This post covers a demo-grade environment of Redis Enterprise + RDI with ingress and write-behind integrations with the following SQL databases:  Oracle, MS SQL, Postgres, and MySQL.  All components are containerized and run from a Docker environment.
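
To make the ingress idea concrete, here is a minimal sketch of applying change events to a key/value store.  It is not RDI's implementation.  It assumes events shaped like the Debezium envelope ("op", "before", "after", "source"), and the helper name apply_change, the "table:pk" key format, and the sample rows are hypothetical.

```python
from typing import Any, Dict


def apply_change(store: Dict[str, Dict[str, Any]], event: Dict[str, Any], pk: str) -> None:
    """Apply one Debezium-style change event to an in-memory key/value store.

    Assumption: keys are built as "<table>:<primary key>" (hypothetical format).
    """
    table = event["source"]["table"]
    op = event["op"]
    if op in ("c", "r", "u"):  # create, snapshot read, update: upsert the new row
        row = event["after"]
        store[f"{table}:{row[pk]}"] = dict(row)
    elif op == "d":  # delete: the old row image carries the key
        row = event["before"]
        store.pop(f"{table}:{row[pk]}", None)
    else:
        raise ValueError(f"unsupported op: {op!r}")


# Hypothetical sample events for an "invoice" table.
events = [
    {"op": "c", "source": {"table": "invoice"}, "before": None,
     "after": {"id": 1, "total": 10.0}},
    {"op": "u", "source": {"table": "invoice"}, "before": {"id": 1, "total": 10.0},
     "after": {"id": 1, "total": 12.5}},
    {"op": "c", "source": {"table": "invoice"}, "before": None,
     "after": {"id": 2, "total": 3.0}},
    {"op": "d", "source": {"table": "invoice"}, "before": {"id": 2, "total": 3.0},
     "after": None},
]

store: Dict[str, Dict[str, Any]] = {}
for event in events:
    apply_change(store, event, pk="id")

print(store)
```

Write-behind is the mirror image: changes observed in Redis are turned into inserts, updates, and deletes against the SQL target.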

Architecture


Ingress



Write-behind



Code Snippets

Docker Compose - Redis Enterprise Node



Docker Compose - Oracle Enterprise



Docker Compose - Debezium



RDI Ingress w/Prometheus Integration


Source


Copyright ©1993-2024 Joey E Whelan, All rights reserved.

Monday, January 8, 2024

Document AI with Apache Airflow

Summary

In this post, I cover an approach to a document AI problem using a task flow implemented in Apache Airflow.  The particular problem is the de-duplication of invoices, which comes up in the payment provider space.  I use Azure AI Document Intelligence for OCR, Azure OpenAI for vector embeddings, and Redis Enterprise for vector search.
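
The de-duplication step boils down to comparing the embedding of a new invoice against the embeddings already indexed.  Below is a minimal sketch of that comparison in plain Python, assuming cosine similarity as the metric.  The function names, the 0.95 threshold, and the toy 3-dimensional vectors are all hypothetical; in the actual flow the vectors come from Azure OpenAI and the search runs in Redis Enterprise.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicate(
    candidate: Sequence[float],
    indexed: List[Tuple[str, Sequence[float]]],
    threshold: float = 0.95,  # assumption: tune against real invoice data
) -> Optional[str]:
    """Return the id of the most similar indexed invoice at or above threshold."""
    best_id, best_score = None, threshold
    for invoice_id, vector in indexed:
        score = cosine_similarity(candidate, vector)
        if score >= best_score:
            best_id, best_score = invoice_id, score
    return best_id


# Toy embeddings standing in for real ones.
indexed = [("inv-001", [1.0, 0.0, 0.0]), ("inv-002", [0.0, 1.0, 0.0])]

print(find_duplicate([0.99, 0.05, 0.0], indexed))  # near-copy of inv-001
print(find_duplicate([0.5, 0.5, 0.7], indexed))    # not close to either
```

A linear scan like this only illustrates the decision rule; a vector index avoids comparing against every stored invoice.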

Architecture



Code Snippets


File Sensor DAG


OCR DAG


OCR Client (Azure AI Doc Intelligence)


Embedding DAG


Embedding Client (Azure OpenAI)


Vector Search DAG


Vector Search Client (Redis Enterprise)


Source



Monday, January 1, 2024

Redis Search and SQL Command Comparison

Summary

This post compares equivalent SQL and Redis Search commands across several data search scenarios.  The Chinook data set is deployed in Oracle Enterprise.  The Oracle Enterprise data is then continuously replicated into Redis JSON objects using Redis Data Integration (RDI).


Architecture

This entire architecture is deployed in Docker containers.


Sample Scenarios


Scenario - Which countries have the most invoices?

SQL
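
As an illustration of the SQL side, here is a minimal sketch that runs a GROUP BY count against an in-memory SQLite database.  SQLite stands in for Oracle purely so the example is self-contained.  The Invoice table and BillingCountry column follow the public Chinook schema, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Invoice ("
    "InvoiceId INTEGER PRIMARY KEY, BillingCountry TEXT, Total REAL)"
)
# Made-up sample rows, not real Chinook data.
conn.executemany(
    "INSERT INTO Invoice (BillingCountry, Total) VALUES (?, ?)",
    [("USA", 1.98), ("USA", 3.96), ("Canada", 5.94),
     ("USA", 8.91), ("Canada", 1.98), ("France", 0.99)],
)

# Count invoices per country, most invoices first.
rows = conn.execute(
    """
    SELECT BillingCountry, COUNT(*) AS invoice_count
    FROM Invoice
    GROUP BY BillingCountry
    ORDER BY invoice_count DESC, BillingCountry
    """
).fetchall()

for country, invoice_count in rows:
    print(country, invoice_count)
```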


Redis Search
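
On the Redis side, a grouped count maps to FT.AGGREGATE with a GROUPBY clause and a COUNT reducer.  The sketch below only assembles the command arguments so that it runs without a server.  The helper name, the index name "invoiceIdx", and the field alias "BillingCountry" are assumptions, not names taken from this environment.

```python
from typing import List


def build_invoice_count_query(index: str, field: str, limit: int = 10) -> List[str]:
    """Build FT.AGGREGATE arguments that count documents per value of `field`."""
    return [
        "FT.AGGREGATE", index, "*",           # match every document in the index
        "GROUPBY", "1", f"@{field}",          # group on a single property
        "REDUCE", "COUNT", "0", "AS", "invoice_count",
        "SORTBY", "2", "@invoice_count", "DESC",
        "LIMIT", "0", str(limit),
    ]


# Hypothetical index and field names.
args = build_invoice_count_query("invoiceIdx", "BillingCountry", limit=5)
print(" ".join(args))
```

With a redis-py client, a list like this could be sent with execute_command(*args), provided the index exists and the field is indexed under that alias.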


Scenario - Which artists have written the most Rock music?

This particular query touches 4 different tables in the relational database.  There are 2 approaches to this in Oracle:
  • Ad hoc query that implements the necessary joins.
  • Materialized View that implements the same join query but is updated continuously.

SQL - Query
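
To show the shape of a 4-table join for this scenario, here is a minimal sketch against an in-memory SQLite database, used in place of Oracle so the example is self-contained.  The Artist, Album, Track, and Genre tables follow the public Chinook schema.  The sample rows are made up, and counting tracks per artist is an assumed reading of "written the most Rock music".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Album  (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
    CREATE TABLE Genre  (GenreId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Track  (TrackId INTEGER PRIMARY KEY, Name TEXT,
                         AlbumId INTEGER, GenreId INTEGER);

    -- Made-up sample rows, not real Chinook data.
    INSERT INTO Artist VALUES (1, 'Artist A'), (2, 'Artist B'), (3, 'Artist C');
    INSERT INTO Album  VALUES (1, 'Album A', 1), (2, 'Album B', 2), (3, 'Album C', 3);
    INSERT INTO Genre  VALUES (1, 'Rock'), (2, 'Jazz');
    INSERT INTO Track  VALUES
        (1, 'T1', 1, 1), (2, 'T2', 1, 1), (3, 'T3', 1, 1),
        (4, 'T4', 2, 1), (5, 'T5', 2, 1),
        (6, 'T6', 3, 2);
    """
)

# Join Track -> Genre, Track -> Album -> Artist, keep Rock only, count per artist.
rows = conn.execute(
    """
    SELECT ar.Name, COUNT(*) AS rock_tracks
    FROM Track t
    JOIN Genre  g  ON g.GenreId   = t.GenreId
    JOIN Album  al ON al.AlbumId  = t.AlbumId
    JOIN Artist ar ON ar.ArtistId = al.ArtistId
    WHERE g.Name = 'Rock'
    GROUP BY ar.Name
    ORDER BY rock_tracks DESC, ar.Name
    """
).fetchall()

for name, rock_tracks in rows:
    print(name, rock_tracks)
```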


SQL - Materialized View

Below is the equivalent Materialized View.  The ALTER commands set this view up to be tracked by Debezium and ultimately replicated in Redis Enterprise.

Redis Search - Materialized View

The Oracle Materialized View is treated the same as any other table by Debezium and is thus populated into Redis via RDI.

Redis Search - Triggers & functions

An alternative to the materialized view is to perform the necessary joins in a Redis Function.  This is a Gears 2.0 feature where a function can be written in JavaScript and executed server-side in Redis.  A snippet of that function is below.


Source

