Monday, January 15, 2024

Change Data Capture w/Redis Enterprise

Summary

Redis Enterprise supports continuous data integration with 3rd-party data sources via the Redis Data Integration (RDI) product. With RDI, change data capture (CDC) can be achieved for ingress from all the major SQL databases. In the other direction, updates to Redis can be continuously written to 3rd-party targets via RDI's write-behind functionality.

This post covers a demo-grade environment of Redis Enterprise + RDI with ingress and write-behind integrations with the following SQL databases:  Oracle, MS SQL, Postgres, and MySQL.  All components are containerized and run from a Docker environment.

Architecture


Ingress



Write-behind



Code Snippets

Docker Compose - Redis Enterprise Node


#RE Cluster - Node 1
re1:
  image: redislabs/redis:latest
  container_name: re1
  restart: unless-stopped
  tty: true
  cap_add:
    - sys_resource
  ports:
    - 12000
    - 8443
    - 9443
    - 8070
  profiles: ["mysql", "postgres", "sqlserver", "oracle_lm", "oracle_xs"]
  networks:
    re_cluster:
      ipv4_address: 192.168.20.2

Docker Compose - Oracle Enterprise


oracle_lm:
  image: container-registry.oracle.com/database/enterprise:latest
  container_name: oracle_lm
  ports:
    - 1521
  depends_on:
    - re1
    - re2
    - re3
  environment:
    - ORACLE_SID=ORCLCDB
    - ORACLE_EDITION=enterprise
    - ORACLE_PWD=Password1
    - INIT_SGA_SIZE=1024
    - INIT_PGA_SIZE=1024
  volumes:
    - $PWD/conf/oracle_lm/scripts:/opt/oracle/scripts/startup
  profiles: ["oracle_lm"]
  networks:
    - re_cluster

Docker Compose - Debezium


debezium:
  build:
    context: $PWD/conf/debezium
    args:
      INSTANT_CLIENT: $INSTANT_CLIENT
  container_name: debezium
  volumes:
    - $PWD/conf/$SOURCE_DB/application.properties:/debezium/conf/application.properties
  profiles: ["debezium"]
  networks:
    - re_cluster
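The `application.properties` file mounted above configures Debezium Server per source database. The post doesn't show its contents; the following is a minimal sketch for a Postgres source streaming into the Redis sink. The property names come from Debezium Server's documentation, but the hostnames, credentials, and prefix here are demo assumptions, not values from the post.

```properties
# Sink: stream change events into Redis (hypothetical address/credentials)
debezium.sink.type=redis
debezium.sink.redis.address=192.168.20.2:12000

# Source: Postgres connector (hostname/user/password are demo assumptions)
debezium.source.connector.class=io.debezium.connector.postgresql.PostgresConnector
debezium.source.database.hostname=postgres
debezium.source.database.port=5432
debezium.source.database.user=postgres
debezium.source.database.password=postgres
debezium.source.database.dbname=postgres
debezium.source.topic.prefix=postgres
```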

RDI Ingress w/Prometheus Integration


echo "*** Build Redis DI DB for Ingress ***"
./redis-di create --silent --cluster-host 192.168.20.2 --cluster-api-port 9443 --cluster-user redis@redis.com \
--cluster-password redis --rdi-port 13000 --rdi-password redis
echo "*** Deploy Redis DI for Ingress ***"
./redis-di deploy --dir ./conf/$SOURCE_DB/ingest --rdi-host 192.168.20.3 --rdi-port 13000 --rdi-password redis
echo "*** Start Debezium ***"
SOURCE_DB=$SOURCE_DB INSTANT_CLIENT=$INSTANT_CLIENT docker compose --profile debezium up -d
echo "*** Start Redis DI Monitor ***"
./redis-di monitor --rdi-host 192.168.20.3 --rdi-port 13000 --rdi-password redis &
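Once Debezium is streaming, the ingested data can be spot-checked with redis-py. This is a hedged sketch, not part of the original scripts: the host and port match the demo values above, while the `emp:` key prefix is a placeholder for whatever keys your RDI job writes.

```python
def count_matching(keys, prefix: str) -> int:
    """Count keys that start with the given prefix."""
    return sum(1 for k in keys if k.startswith(prefix))

if __name__ == "__main__":
    import redis  # pip install redis
    client = redis.Redis(host="192.168.20.2", port=12000, decode_responses=True)
    keys = list(client.scan_iter(match="*", count=500))  # non-blocking SCAN
    print(count_matching(keys, "emp:"))  # 'emp:' is a hypothetical prefix
```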

Source


Copyright ©1993-2024 Joey E Whelan, All rights reserved.

Monday, January 8, 2024

Document AI with Apache Airflow

Summary

In this post, I cover an approach to a document AI problem using a task flow implemented in Apache Airflow. The particular problem is de-duplication of invoices, which comes up in the payment provider space. I use Azure AI Document Intelligence for OCR, Azure OpenAI for vector embeddings, and Redis Enterprise for vector search.

Architecture



Code Snippets


File Sensor DAG


@task.sensor(task_id="check_inbox", mode="reschedule", timeout=10, executor_config=executor_config_volume_mount)
def check_inbox() -> PokeReturnValue:
    """ File sensor for the invoices inbox. If files are detected in the inbox, a cascade of processing tasks
    is triggered: OCR, Embed, Dedup.
    """
    storage_var = Variable.get("storage", deserialize_json=True, default_var=None)
    if not isinstance(storage_var, dict):  # hack for an apparent bug in airflow
        storage_var = json.loads(storage_var)
    inbox_path = storage_var['inbox']
    inbox_files = list(map(lambda file: os.path.join(inbox_path, file), os.listdir(inbox_path)))
    logging.info(f'Number of files to be processed: {len(inbox_files)}')
    if len(inbox_files) > 0:
        return PokeReturnValue(is_done=True, xcom_value=inbox_files)
    else:
        return PokeReturnValue(is_done=False)

OCR DAG


@task(task_id='parse_invoice', executor_config=executor_config_volume_mount)
def parse_invoice(inbox_file: str) -> dict:
    """ OCR is performed on each of the invoices in the inbox. The result of OCR is a space-delimited string of a
    configurable number of invoice fields.
    """
    from invoice.lib.ocr import ocr
    invoice = ocr(inbox_file)
    invoice['file'] = inbox_file
    logging.info(f'Invoice: {pprint.pformat(invoice)}')
    return invoice

OCR Client (Azure AI Doc Intelligence)


@retry(wait=wait_random_exponential(min=10, max=60), stop=stop_after_attempt(3))
def ocr(filepath: str) -> dict:
    """ Executes Azure Form Recognizer OCR and returns a Python dict that includes a text string
    of space-separated values from the input invoice.
    """
    formrec_var = Variable.get("formrec", deserialize_json=True, default_var=None)
    if not isinstance(formrec_var, dict):  # hack for an apparent bug in airflow
        formrec_var = json.loads(formrec_var)
    key = formrec_var["key"]
    endpoint = formrec_var["endpoint"]
    vector_fields = formrec_var["fields"]
    client = DocumentAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    with open(filepath, "rb") as f:
        poller = client.begin_analyze_document("prebuilt-invoice", document=f, locale="en-US")
    invoice = (poller.result()).documents[0]
    return stringify(invoice, vector_fields)
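The `stringify` helper isn't shown in the post. A plausible sketch given the docstring - join the configured invoice fields into one space-separated string for embedding. The real implementation reads fields off the Form Recognizer document object; this illustration works on a plain dict, and the field names are assumptions:

```python
def stringify(invoice_fields: dict, vector_fields: list[str]) -> dict:
    """Join the configured invoice fields into a single space-separated
    string under 'ocr', ready for embedding; keep the customer name for
    the downstream Redis search filter."""
    values = [str(invoice_fields[f]) for f in vector_fields if f in invoice_fields]
    return {
        'ocr': ' '.join(values),
        'customer_name': str(invoice_fields.get('CustomerName', ''))
    }
```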

Embedding DAG


@task(task_id='embed_invoice')
def embed_invoice(invoice: dict) -> dict:
    """ Accepts an invoice dict that includes a text field of the OCR output
    and adds an OpenAI embedding (array of floats) to that dict.
    """
    from invoice.lib.embed import get_embedding
    vector = get_embedding(invoice['ocr'])
    invoice['vector'] = vector
    logging.info(f'Invoice: {invoice["file"]}, Vector len: {len(invoice["vector"])}')
    return invoice

Embedding Client (Azure OpenAI)


@retry(wait=wait_random_exponential(min=3, max=100), stop=stop_after_attempt(10))
def get_embedding(text: str) -> list[float]:
    response = openai.Embedding.create(
        input=text,
        engine="EmbeddingModel"
    )
    return response['data'][0]['embedding']
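Downstream, the dedup function packs this embedding into a float32 byte string before writing it to Redis. The round trip is a sketch worth seeing in isolation (numpy is already a dependency of the VSS client; the 3-element vector stands in for a full-size OpenAI embedding):

```python
import numpy as np

embedding = [0.1, 0.2, 0.3]  # stand-in for a full-size OpenAI embedding
vec_bytes = np.array(embedding, dtype=np.float32).tobytes()  # 4 bytes per float
restored = np.frombuffer(vec_bytes, dtype=np.float32)        # lossless round trip
```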

Vector Search DAG


@task(task_id='dedup_invoice', executor_config=executor_config_volume_mount)
def dedup_invoice(invoice: dict) -> None:
    """ Sends the invoice dict into a Redis VSS lookup to determine disposition - process it or call it a duplicate.
    """
    from invoice.lib.vss import dedup
    result = dedup(invoice)
    logging.info(f'Invoice: {invoice["file"]}, Result: {result}')

Vector Search Client (Redis Enterprise)


def dedup(invoice: dict) -> str:
    """ Accepts a Python dict that includes a vector of a given invoice file. That vector is then sent into
    Redis VSS to determine disposition. If there's another invoice in Redis within a given vector distance of the
    input invoice, this invoice is treated as a duplicate and moved to the 'dups' directory. Otherwise, it is
    treated as a net new invoice and moved to the 'processed' directory.
    """
    re_var = Variable.get("re", deserialize_json=True, default_var=None)
    if not isinstance(re_var, dict):  # hack for an apparent bug in airflow
        re_var = json.loads(re_var)
    storage_var = Variable.get("storage", deserialize_json=True, default_var=None)
    if not isinstance(storage_var, dict):  # hack for an apparent bug in airflow
        storage_var = json.loads(storage_var)
    creds = redis.UsernamePasswordCredentialProvider(re_var['user'], re_var['pwd'])
    client = redis.Redis(host=re_var['host'], port=re_var['port'], credential_provider=creds)
    try:
        client.ft(re_var['vector_index']).info()
    except redis.ResponseError:  # index doesn't exist yet - create it
        idx_def = IndexDefinition(index_type=IndexType.HASH, prefix=[re_var['vector_prefix']])
        schema = [
            TextField('customer_name'),
            VectorField('vector',
                'HNSW',
                { 'TYPE': re_var['vector_type'], 'DIM': re_var['vector_dim'], 'DISTANCE_METRIC': re_var['vector_metric'] }
            )
        ]
        client.ft(re_var['vector_index']).create_index(schema, definition=idx_def)
    vec = np.array(invoice['vector'], dtype=np.float32).tobytes()
    q = Query(f'@customer_name:({invoice["customer_name"]}) => [KNN 1 @vector $query_vec AS score]')\
        .return_fields('score')\
        .dialect(2)
    results = client.ft(re_var['vector_index']).search(q, query_params={'query_vec': vec})
    docs = results.docs
    if len(docs) > 0 and 1 - float(docs[0].score) > re_var['vector_similarity_bound']:
        shutil.move(invoice['file'], storage_var['dups'])
        logging.info(f'Duplicate invoice:{os.path.basename(invoice["file"])}, Similarity:{round(1 - float(docs[0].score), 2)}')
        return 'duplicate'
    else:
        if len(docs) > 0:
            similarity = round(1 - float(docs[0].score), 2)
        else:
            similarity = 'N/A'
        client.hset(f'invoice:{uuid.uuid4()}',
            mapping={'customer_name': invoice['customer_name'], 'file': os.path.basename(invoice['file']), 'vector': vec})
        shutil.move(invoice['file'], storage_var['processed'])
        logging.info(f'Processed invoice:{os.path.basename(invoice["file"])}, Similarity:{similarity}')
        return 'processed'
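The duplicate decision hinges on converting the cosine distance that Redis returns as `score` back into a similarity. A small sketch of that rule, factored out for clarity (the function name and sample values are mine):

```python
def is_duplicate(score: float, similarity_bound: float) -> bool:
    """Redis VSS returns cosine *distance*; similarity = 1 - distance.
    The nearest neighbor marks a duplicate when its similarity exceeds the bound."""
    return 1.0 - score > similarity_bound

# e.g. with a bound of 0.9: a near-identical invoice (distance 0.02) is flagged
# as a duplicate, while a merely similar one (distance 0.25) is processed as new.
```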

Source

