Monday, January 15, 2024

Change Data Capture w/Redis Enterprise

Summary

Redis Enterprise supports continuous data integration with third-party data sources through the Redis Data Integration (RDI) product. For ingress, RDI provides change data capture (CDC) from all the major SQL databases. In the other direction, updates to Redis can be continuously written to third-party targets via RDI's write-behind functionality.

This post covers a demo-grade environment of Redis Enterprise + RDI with ingress and write-behind integrations for the following SQL databases: Oracle, MS SQL, Postgres, and MySQL. All components are containerized and run in Docker.
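
To make the ingress direction concrete, here is a minimal conceptual sketch of what CDC does: a stream of row-level change events is replayed against a cache. It is not RDI's implementation. The event shape borrows the `op`/`before`/`after` fields of the Debezium change-event envelope, while the `dict` standing in for Redis, the `apply_change_event` function, and the `table:id:<n>` key pattern are assumptions made purely for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical in-memory stand-in for a Redis database: key -> hash fields.
Cache = Dict[str, Dict[str, Any]]


def apply_change_event(cache: Cache, table: str, event: Dict[str, Any]) -> Optional[str]:
    """Apply one Debezium-style change event to the cache; return the key touched."""
    op = event["op"]  # "c" = create, "u" = update, "d" = delete, "r" = snapshot read
    # Deletes carry the row in "before"; every other op carries it in "after".
    row = event["before"] if op == "d" else event["after"]
    if row is None:
        return None  # nothing to apply
    key = f"{table}:id:{row['id']}"  # assumed key pattern, for illustration only
    if op == "d":
        cache.pop(key, None)
    else:
        cache[key] = dict(row)
    return key


cache: Cache = {}
events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "Ada", "city": "London"}},
    {"op": "u", "before": {"id": 1, "name": "Ada", "city": "London"},
     "after": {"id": 1, "name": "Ada", "city": "Paris"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "Bob", "city": "Austin"}},
    {"op": "d", "before": {"id": 2, "name": "Bob", "city": "Austin"}, "after": None},
]
for event in events:
    apply_change_event(cache, "customers", event)

print(cache)  # only customer 1 remains, with the updated city
```

In the actual environment, Debezium captures these events from the source database's log and RDI applies them to Redis; the sketch only shows the replay logic.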

Architecture


Ingress



Write-behind
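
As a rough illustration of the write-behind pattern in general (not of RDI's internals), the sketch below accepts writes into a cache immediately, queues them, and later flushes them in order to a downstream target. The `WriteBehindCache` class and the `dict` standing in for a SQL table are hypothetical names introduced for this example.

```python
from collections import deque
from typing import Any, Deque, Dict, Tuple


class WriteBehindCache:
    """Toy write-behind cache: writes land locally first, then flush to a target."""

    def __init__(self, target: Dict[str, Any]) -> None:
        self.data: Dict[str, Any] = {}
        self.pending: Deque[Tuple[str, Any]] = deque()
        self.target = target  # stand-in for the downstream SQL database

    def set(self, key: str, value: Any) -> None:
        self.data[key] = value              # the caller sees the write immediately
        self.pending.append((key, value))   # the target is updated later

    def flush(self) -> int:
        """Apply queued writes to the target in arrival order; return how many."""
        count = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.target[key] = value
            count += 1
        return count


sql_table: Dict[str, Any] = {}
wb = WriteBehindCache(sql_table)
wb.set("customers:1", {"name": "Ada"})
wb.set("customers:1", {"name": "Ada L."})
target_before_flush = dict(sql_table)  # still empty: nothing has been flushed
flushed = wb.flush()
print(target_before_flush, flushed, sql_table)
```

A production write-behind pipeline would run the flush asynchronously and handle retries and failures; those concerns are omitted here.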



Code Snippets

Docker Compose - Redis Enterprise Node



Docker Compose - Oracle Enterprise



Docker Compose - Debezium



RDI Ingress w/Prometheus Integration


Source


Copyright ©1993-2024 Joey E Whelan, All rights reserved.

Monday, January 8, 2024

Document AI with Apache Airflow

Summary

In this post, I cover an approach to a document AI problem using a task flow implemented in Apache Airflow. The particular problem is the de-duplication of invoices, which comes up in the payment provider space. I use Azure AI Document Intelligence for OCR, Azure OpenAI for vector embeddings, and Redis Enterprise for vector search.
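
The de-duplication step boils down to comparing the embedding of a new invoice against the embeddings of invoices already seen. The sketch below shows that idea with a brute-force cosine-similarity scan in plain Python. It is a simplification under stated assumptions: the three-dimensional vectors are made up (real embeddings would come from Azure OpenAI), the scan replaces the Redis Enterprise vector search, and the `find_duplicate` function and its 0.95 threshold are hypothetical choices for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicate(
    index: Dict[str, List[float]],
    candidate: List[float],
    threshold: float = 0.95,  # assumed cutoff; tune against real invoices
) -> Optional[Tuple[str, float]]:
    """Return (invoice_id, score) of the closest stored invoice if it meets the threshold."""
    best: Optional[Tuple[str, float]] = None
    for invoice_id, vector in index.items():
        score = cosine_similarity(vector, candidate)
        if best is None or score > best[1]:
            best = (invoice_id, score)
    if best is not None and best[1] >= threshold:
        return best
    return None


# Made-up embeddings for two invoices that were already processed.
index = {
    "inv-001": [0.9, 0.1, 0.0],
    "inv-002": [0.0, 0.2, 0.9],
}
near_copy = [0.88, 0.12, 0.01]   # almost the same direction as inv-001
new_invoice = [0.5, 0.5, 0.5]    # not close to either stored invoice

print(find_duplicate(index, near_copy))
print(find_duplicate(index, new_invoice))
```

A vector database performs the same nearest-neighbor comparison with an index instead of a linear scan, which is what makes it practical at scale.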

Architecture



Code Snippets


File Sensor DAG


OCR DAG


OCR Client (Azure AI Doc Intelligence)


Embedding DAG


Embedding Client (Azure OpenAI)


Vector Search DAG


Vector Search Client (Redis Enterprise)


Source

