Sunday, December 31, 2023

HAProxy with Redis Enterprise

Summary

This is Part 2 of a two-part series on implementing a contact center ACD with Redis data structures.  This part focuses on the network configuration.  In particular, I explain how to configure HAProxy load balancing with VRRP redundancy in a Redis Enterprise environment.  I also cover some of the complexities of doing this inside a Docker container environment.

Network Architecture


Load Balancing Configuration


HAProxy w/Keepalived

Docker Container

Below are the Dockerfile and associated Docker Compose script for two instances of HAProxy w/keepalived.  Note that the default start-up for the HAProxy container is overridden with a CMD that starts both keepalived and haproxy.

FROM haproxytech/haproxy-ubuntu:latest
USER root
RUN apt-get update
RUN apt-get install keepalived -y
RUN apt-get install psmisc -y
CMD service keepalived start; haproxy -f /usr/local/etc/haproxy/haproxy.cfg

lb1:
  build:
    context: .
    dockerfile: $PWD/haproxy/Dockerfile
  container_name: lb1
  cap_add:
    - NET_ADMIN
  ports:
    - 8000
    - 8443
    - 9443
    - 12000
  profiles: ["loadbalancer"]
  volumes:
    - $PWD/haproxy/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    - $PWD/haproxy/server.pem:/usr/local/etc/haproxy/server.pem
    - $PWD/haproxy/keepalived1.conf:/etc/keepalived/keepalived.conf
  networks:
    - re_cluster
lb2:
  build:
    context: .
    dockerfile: $PWD/haproxy/Dockerfile
  container_name: lb2
  cap_add:
    - NET_ADMIN
  ports:
    - 8000
    - 8443
    - 9443
    - 12000
  profiles: ["loadbalancer"]
  volumes:
    - $PWD/haproxy/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    - $PWD/haproxy/server.pem:/usr/local/etc/haproxy/server.pem
    - $PWD/haproxy/keepalived2.conf:/etc/keepalived/keepalived.conf
  networks:
    - re_cluster

Keepalived Config

VRRP redundancy of the two HAProxy instances is implemented with keepalived.  Below is the config for the Master instance.  The Backup instance is identical except for the priority.

global_defs {
    script_user nobody
    enable_script_security
}
vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"
    interval 2
    weight 2
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass passwd
    }
    virtual_ipaddress {
        192.168.20.100
    }
    track_script {
        chk_haproxy
    }
}
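
To make the role of the track_script weight concrete, here is a minimal Python sketch of the priority arithmetic.  It assumes the Backup instance uses priority 100 (the post only says the priorities differ) and relies on keepalived adding a positive weight to the base priority while the tracked script succeeds.  The function names are illustrative and not part of keepalived.

```python
def effective_priority(base: int, weight: int, script_ok: bool) -> int:
    """Base priority plus the script weight while the health check passes."""
    return base + weight if script_ok else base


def elect_master(nodes: dict, weight: int = 2) -> str:
    """Return the node name with the highest effective priority."""
    return max(
        nodes,
        key=lambda name: effective_priority(nodes[name][0], weight, nodes[name][1]),
    )


# name -> (base priority, is haproxy running?); the backup value of 100 is an assumption
healthy = {"lb1": (101, True), "lb2": (100, True)}
lb1_down = {"lb1": (101, False), "lb2": (100, True)}

print(elect_master(healthy))   # lb1: 103 beats 102
print(elect_master(lb1_down))  # lb2: 102 beats 101
```

With these numbers, a failed haproxy process on lb1 drops it to 101 while lb2 sits at 102, so the virtual IP would move to lb2.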

Web Servers

I'll start with the simplest load-balancing scenario: a web farm.


Docker Container

Below is the Dockerfile and associated Docker compose scripting for a 2-server deployment of Python FastAPI.  Note that no IP addresses are assigned and multiple instances are deployed via Docker compose 'replicas'.

FROM python:3.10-slim
WORKDIR /app
COPY ./requirements.txt ./
RUN pip install --no-cache-dir --upgrade -r ./requirements.txt
COPY ./restapi/log_conf.yaml ./src/main.py ./src/operations.py ./src/response.py ./src/states.py ./
COPY ./.env ./
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--log-config=log_conf.yaml"]

rest:
  build:
    context: .
    dockerfile: $PWD/restapi/Dockerfile
  deploy:
    replicas: 2
  ports:
    - 8000
  profiles: ["rest"]
  networks:
    - re_cluster


HAProxy Config

Below are the front and backend configurations.  Note the use of Docker's DNS server to enable dynamic mapping of the web servers via a HAProxy server template.

resolvers docker
nameserver dns1: 127.0.0.11:53
frontend rest_fe
mode http
bind :8000
default_backend rest_be
backend rest_be
mode http
balance roundrobin
server-template restapi- 2 rest:8000 check resolvers docker init-addr none
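
As a rough mental model of the server template, the Python sketch below shows two named slots being filled from whatever addresses DNS returns for the 'rest' service.  The IP addresses are made up for illustration, and the function is a simplification of HAProxy's behavior, not its actual implementation.

```python
from typing import Optional


def fill_template(prefix: str, count: int, resolved: list, port: int) -> dict:
    """Map template slots (prefix + index) to resolved addresses; unfilled slots stay None."""
    slots: dict = {}
    for i in range(1, count + 1):
        addr: Optional[str] = resolved[i - 1] if i <= len(resolved) else None
        slots[f"{prefix}{i}"] = f"{addr}:{port}" if addr else None
    return slots


# Hypothetical DNS answers for the 'rest' service name
one_replica = fill_template("restapi-", 2, ["192.168.20.5"], 8000)
two_replicas = fill_template("restapi-", 2, ["192.168.20.5", "192.168.20.6"], 8000)

print(one_replica)
print(two_replicas)
```

A slot with no address simply carries no traffic until a later DNS answer fills it, which is why no static IPs are needed for the replicas.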

Redis Enterprise Components

Redis Enterprise can provide its own load balancing via internal DNS servers.  For those who do not want to use DNS, external load balancing is also supported.  The official Redis documentation covers the general configuration of external load balancing.  Here, I go into detail on the specifics of setting this up with HAProxy in a Docker environment.

Docker Containers

A three-node cluster is provisioned below.  Note the ports that are opened:
  • 8443 - Redis Enterprise Admin Console
  • 9443 - Redis Enterprise REST API
  • 12000 - The client port configured for the database.

re1:
  image: redislabs/redis:latest
  container_name: re1
  restart: unless-stopped
  tty: true
  cap_add:
    - sys_resource
  ports:
    - 8443
    - 9443
    - 12000
  profiles: ["redis"]
  networks:
    re_cluster:
      ipv4_address: 192.168.20.2
re2:
  image: redislabs/redis:latest
  container_name: re2
  restart: unless-stopped
  tty: true
  cap_add:
    - sys_resource
  ports:
    - 8443
    - 9443
    - 12000
  profiles: ["redis"]
  networks:
    re_cluster:
      ipv4_address: 192.168.20.3
re3:
  image: redislabs/redis:latest
  container_name: re3
  restart: unless-stopped
  tty: true
  cap_add:
    - sys_resource
  ports:
    - 8443
    - 9443
    - 12000
  profiles: ["redis"]
  networks:
    re_cluster:
      ipv4_address: 192.168.20.4

RE Database Configuration

Below is a JSON config that can be used via the RE REST API to create a Redis database.  Note the proxy policy.  "all-nodes" enables a database client connection point on all the Redis nodes.

{
"name": "redb",
"type": "redis",
"memory_size": 10000000,
"port": 12000,
"authentication_redis_pass": "redis",
"proxy_policy": "all-nodes",
"sharding": true,
"shards_count": 2,
"shards_placement": "sparse",
"shard_key_regex": [{"regex": ".*\\{(?<tag>.*)\\}.*"}, {"regex": "(?<tag>.*)"}],
"replication": false,
"module_list": [{
"module_name":"ReJSON",
"module_args": ""
}]
}
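
The shard_key_regex entries in the config decide which part of a key name is hashed to pick a shard.  The Python sketch below mimics that with the same two patterns, assuming they are tried in order and the first match wins.  Python spells named groups as (?P<tag>...), so the patterns are adapted accordingly, and the helper name is hypothetical.

```python
import re

# Same two rules as the JSON config, rewritten in Python's named-group syntax
SHARD_KEY_PATTERNS = [
    re.compile(r".*\{(?P<tag>.*)\}.*"),  # use the text inside {...} when present
    re.compile(r"(?P<tag>.*)"),          # otherwise use the whole key
]


def shard_tag(key: str) -> str:
    """Return the portion of the key that would be hashed (first matching rule)."""
    for pattern in SHARD_KEY_PATTERNS:
        match = pattern.fullmatch(key)
        if match:
            return match.group("tag")
    return key


print(shard_tag("{availAgentsSkill}:english"))
print(shard_tag("{availAgentsSkill}:spanish"))
print(shard_tag("queue"))
```

Keys that share a tag resolve to the same hash input, which is what allows multi-key operations across them in a sharded database.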

RE Cluster Configuration

In the start.sh script, the command below is added to enable redirect handling in the cluster (per the Redis documentation).

docker exec -it re1 /opt/redislabs/bin/rladmin cluster config handle_redirects enabled

HAProxy Config - RE Admin Console

Redis Enterprise has a web interface for configuration and monitoring (TLS, port 8443).  I configure back-to-back TLS sessions below with a local SSL cert for the front end.  Additionally, I configure 'sticky' sessions via cookies.

frontend redisadmin_fe
mode http
bind :8443 ssl crt /usr/local/etc/haproxy/server.pem
default_backend redisadmin_be
backend redisadmin_be
mode http
balance leastconn
cookie SERVER_USED insert indirect nocache
server re1 re1:8443 check cookie re1 ssl verify none
server re2 re2:8443 check cookie re2 ssl verify none
server re3 re3:8443 check cookie re3 ssl verify none

HAProxy Config - RE REST API

Redis Enterprise provides a REST API for programmatic configuration and provisioning (TLS, port 9443).  For this scenario, I simply pass the TLS sessions through HAProxy via TCP.

frontend redisrest_fe
mode tcp
bind :9443
default_backend redisrest_be
backend redisrest_be
mode tcp
balance roundrobin
server re1 re1:9443 check
server re2 re2:9443 check
server re3 re3:9443 check

HAProxy Config - RE Database

A Redis Enterprise database can have a configurable client connection port.  In this case, I've configured it to 12000 (TCP).  Note in the backend configuration I've set up a Layer 7 health check that will attempt to create an authenticated Redis client connection, send a Redis PING, and then close that connection.

frontend redb_fe
mode tcp
bind :12000
default_backend redb_be
backend redb_be
mode tcp
balance roundrobin
option tcp-check
tcp-check send AUTH\ redis\r\n
tcp-check expect string +OK
tcp-check send PING\r\n
tcp-check expect string +PONG
tcp-check send QUIT\r\n
tcp-check expect string +OK
server re1 re1:12000 check
server re2 re2:12000 check
server re3 re3:12000 check
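
For readers who want to see the health-check exchange outside of HAProxy, here is a minimal Python sketch of the same send/expect sequence.  It runs against a tiny stand-in server over a local socket pair rather than a real Redis node, so the fake_redis function and its canned replies are assumptions made purely for the demo.

```python
import socket
import threading

# (command to send, substring expected in the reply), mirroring the tcp-check lines
CHECK_STEPS = [
    (b"AUTH redis\r\n", b"+OK"),
    (b"PING\r\n", b"+PONG"),
    (b"QUIT\r\n", b"+OK"),
]


def read_line(conn: socket.socket) -> bytes:
    """Read one CRLF-terminated line (empty bytes if the peer closed)."""
    data = b""
    while not data.endswith(b"\r\n"):
        chunk = conn.recv(1)
        if not chunk:
            break
        data += chunk
    return data


def run_check(conn: socket.socket) -> bool:
    """Send each command and verify the expected reply, stopping at the first mismatch."""
    for command, expected in CHECK_STEPS:
        conn.sendall(command)
        if expected not in read_line(conn):
            return False
    return True


def fake_redis(conn: socket.socket) -> None:
    """Stand-in server for the demo: canned replies, exits after QUIT."""
    replies = {b"AUTH redis\r\n": b"+OK\r\n", b"PING\r\n": b"+PONG\r\n", b"QUIT\r\n": b"+OK\r\n"}
    while True:
        line = read_line(conn)
        if not line:
            break
        conn.sendall(replies.get(line, b"-ERR unknown command\r\n"))
        if line == b"QUIT\r\n":
            break
    conn.close()


client_side, server_side = socket.socketpair()
server = threading.Thread(target=fake_redis, args=(server_side,))
server.start()
healthy = run_check(client_side)
client_side.close()
server.join()
print(healthy)
```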

Source


https://github.com/redis-developer/basic-acd

Copyright ©1993-2024 Joey E Whelan, All rights reserved.

Basic ACD with Redis Enterprise

Summary

This post covers a contact center ACD implementation I've done using Redis data structures.  The applications are written in Python.  The client interface is implemented as a REST API via FastAPI.  An internal Python app (Dispatcher) is used to monitor and administer the ACD data structures in Redis.  Docker containers are used for the architectural components.


Application Architecture



Data Structures


Contact, Queue


Contacts are implemented as Redis JSON objects.  Each contact has an associated array of skills necessary to service that contact.  Example:  English language proficiency.

A single queue for all contacts is implemented as a Redis Sorted Set.  The members of the set are the Redis key names of the contacts.  The associated scores are millisecond timestamps of the time the contact entered the queue.  This allows for FIFO queue management.

resp_type: RESPONSE_TYPE = None
result: str = None
contact_key: str = f'contact:{str(uuid4())}'
try:
    await client.json().set(contact_key, '$', {'skills': skills, 'state': CONTACT_STATE.QUEUED.value, 'agent': None})
    await client.zadd('queue', mapping={ contact_key: round(time.time()*1000) }) #time in ms
    resp_type = RESPONSE_TYPE.OK
    result = contact_key
except Exception as err:
    result = f'create_contact - {err}'
    resp_type = RESPONSE_TYPE.ERR
finally:
    return Response(resp_type, result)
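
To illustrate why timestamp scores give FIFO behavior, here is a small in-memory Python sketch.  MiniSortedSet is a hypothetical stand-in for a Redis sorted set, not a real client, and the timestamps are made-up values.  It also shows the effect of re-adding a contact with its score bumped by one second, as the Dispatcher in this post does when no agent is available.

```python
class MiniSortedSet:
    """Tiny in-memory stand-in for a Redis sorted set (member -> score)."""

    def __init__(self) -> None:
        self.scores: dict = {}

    def zadd(self, member: str, score: float) -> None:
        self.scores[member] = score

    def zpopmin(self) -> tuple:
        # Lowest score first; ties broken by member name
        member = min(self.scores, key=lambda m: (self.scores[m], m))
        return member, self.scores.pop(member)


queue = MiniSortedSet()
queue.zadd("contact:a", 1_000)  # entered the queue first
queue.zadd("contact:b", 1_500)
queue.zadd("contact:c", 1_900)

first, ts = queue.zpopmin()    # oldest contact comes out first
queue.zadd(first, ts + 1_000)  # no agent found: requeue one second "later"
order = [queue.zpopmin()[0] for _ in range(3)]
print(first, order)
```

After the requeue, contact:a falls behind the other two contacts, which lets them get a routing attempt before it is retried.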


Agent


Agents are implemented as Redis JSON objects.  Agent meta-data is stored as simple properties.  Agent skills are maintained as arrays.  The redis-py Lock class (a single-instance Redis lock, rather than the multi-node Redlock algorithm) is used to ensure mutually exclusive access to agent objects.

resp_type: RESPONSE_TYPE = None
result: str = None
try:
    lock: Lock = Lock(redis=client, name=f'{agent_key}:lock', timeout=LOCK_TIMEOUT, blocking_timeout=BLOCK_TIME)
    lock_acquired: bool = await lock.acquire()
    if lock_acquired:
        exists: int = await client.exists(agent_key)
        if exists:
            result = f'create_agent - agent {agent_key} already exists'
            resp_type = RESPONSE_TYPE.ERR
        else:
            agent_obj: dict = { 'id': agent_key, 'fname': fname, 'lname': lname, 'skills': skills, 'state': AGENT_STATE.UNAVAILABLE.value }
            await client.json().set(agent_key, '$', agent_obj)
            result = agent_key
            resp_type = RESPONSE_TYPE.OK
    else:
        resp_type = RESPONSE_TYPE.LOCKED
except Exception as err:
    result = f'create_agent - {err}'
    resp_type = RESPONSE_TYPE.ERR
finally:
    # release only a lock this caller holds; locked() is also true when another caller holds it
    if await lock.owned():
        await lock.release()
    return Response(resp_type, result)
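
The locking pattern can be sketched in memory to show the difference between a lock that is merely held by someone and a lock that this caller owns.  MiniLock below is a simplified, hypothetical model (no timeouts, no blocking) of a token-based lock.  The redis-py lock exposes locked() and owned() methods with similar meanings.

```python
import uuid
from typing import Optional


class MiniLock:
    """Simplified token-based lock over a shared dict (stand-in for Redis keys)."""

    def __init__(self, store: dict, name: str) -> None:
        self.store = store
        self.name = name
        self.token: Optional[str] = None

    def acquire(self) -> bool:
        if self.name in self.store:  # comparable to SET NX failing
            return False
        self.token = uuid.uuid4().hex
        self.store[self.name] = self.token
        return True

    def locked(self) -> bool:
        """True if anyone holds the lock."""
        return self.name in self.store

    def owned(self) -> bool:
        """True only if this instance holds the lock."""
        return self.token is not None and self.store.get(self.name) == self.token

    def release(self) -> None:
        if not self.owned():
            raise RuntimeError("cannot release a lock that is not owned")
        del self.store[self.name]
        self.token = None


store: dict = {}
first = MiniLock(store, "agent:1:lock")
second = MiniLock(store, "agent:1:lock")
got_first = first.acquire()
got_second = second.acquire()
print(got_first, got_second, second.locked(), second.owned())
```

The second caller sees the lock as held but does not own it, so it should report a locked condition rather than attempt a release.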


Agent Availability


Redis Sorted Sets are also used to track Agent availability.  A sorted set is created per skill.  The members of that set are the Redis keys for the agents that are available with the associated skill.  The associated scores are millisecond timestamps of the time the agent became available.  This use of sorted sets allows for multi-skill routing to the longest available agent (LAA).


try:
    lock: Lock = Lock(redis=client, name=f'{agent_key}:lock', timeout=LOCK_TIMEOUT, blocking_timeout=BLOCK_TIME)
    lock_acquired: bool = await lock.acquire()
    if lock_acquired:
        exists: int = await client.exists(agent_key)
        if not exists:
            result = f'set_agent_state - {agent_key} does not exist'
            resp_type = RESPONSE_TYPE.ERR
        else:
            current_state = (await client.json().get(agent_key, '$.state'))[0]
            if AGENT_STATE(current_state) != state:
                skills: list[list[str]] = await client.json().get(agent_key, '$.skills')
                for skill in skills[0]:
                    match state:
                        case AGENT_STATE.AVAILABLE:
                            await client.zadd(f'{{availAgentsSkill}}:{skill}', mapping={ agent_key: round(time.time()*1000) })
                            await client.json().set(agent_key, '$.state', AGENT_STATE.AVAILABLE.value)
                        case AGENT_STATE.UNAVAILABLE:
                            await client.zrem(f'{{availAgentsSkill}}:{skill}', agent_key)
                            await client.json().set(agent_key, '$.state', AGENT_STATE.UNAVAILABLE.value)
                        case _:
                            raise Exception(f'invalid agent state parameter: {state}')
                result = agent_key
                resp_type = RESPONSE_TYPE.OK
            else:
                result = f'set_agent_state - {agent_key} already in {AGENT_STATE(current_state)}'
                resp_type = RESPONSE_TYPE.ERR
    else:
        resp_type = RESPONSE_TYPE.LOCKED
except Exception as err:
    result = f'set_agent_state - {err}'
    resp_type = RESPONSE_TYPE.ERR
finally:
    # release only a lock this caller holds; locked() is also true when another caller holds it
    if await lock.owned():
        await lock.release()
    return Response(resp_type, result)
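
The multi-skill, longest-available-agent selection can be illustrated without Redis.  The Python sketch below intersects per-skill availability maps and orders the common agents by summed score, which is the default aggregation for a sorted-set intersection.  The agent names, skills, and timestamps are invented for the example.

```python
# skill -> {agent key: ms timestamp when the agent became available}
avail = {
    "english": {"agent:1": 1_000, "agent:2": 2_000, "agent:3": 3_000},
    "billing": {"agent:2": 2_000, "agent:3": 3_000},
}


def longest_available(avail_sets: dict, skills: list) -> list:
    """Agents holding every requested skill, longest-available first."""
    sets = [avail_sets.get(skill, {}) for skill in skills]
    if not sets:
        return []
    common = set(sets[0]).intersection(*sets[1:])
    # Sum the scores across the skill sets, as a sorted-set intersection does by default
    summed = {agent: sum(s[agent] for s in sets) for agent in common}
    return sorted(summed, key=lambda a: (summed[a], a))


print(longest_available(avail, ["english", "billing"]))
print(longest_available(avail, ["english"]))
print(longest_available(avail, ["spanish", "english"]))
```

Because an agent gets roughly the same timestamp in each of its skill sets, summing the scores keeps the longest-available agent at the front of the result.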

Operations


Agent Targeting 


Routing of contacts to agents is performed by multiple Dispatcher processes.  Each Dispatcher is running an infinite loop that does the following:
  • Pop the oldest contact from the queue
  • Perform an intersection of the availability sets for the skills necessary for that contact
  • If there are agent(s) available, assign that agent to this contact and set the agent to unavailable.
  • If there are no agents available with the necessary skills, put the contact back in the queue

while True:
    try:
        response: list[tuple] = await client.bzpopmin('queue') # using a sorted set as a fifo queue
        contact_key: str = response[1].decode('utf-8')
        timestamp: int = int(response[2])
        skills: list[list[str]] = await client.json().get(contact_key, '$.skills')
        avail_keys: list[str] = [f'{{availAgentsSkill}}:{skill}' for skill in skills[0]]
        agents: list[str] = await client.zinter(avail_keys)
        agents = [agent.decode('utf-8') for agent in agents]
        found: bool = False
        for agent in agents:
            response: Response = await ops.set_agent_state(client, agent, AGENT_STATE.UNAVAILABLE)
            if response.resp_type == RESPONSE_TYPE.OK:
                found = True
                await client.json().mset([(contact_key, '$.agent', agent),
                                          (contact_key, '$.state', CONTACT_STATE.ASSIGNED.value)])
                logger.info(f'{contact_key} assigned to {agent}')
                break
        if not found:
            # check if the contact has been abandoned
            state: list[int] = (await client.json().get(contact_key, '$.state'))[0]
            if CONTACT_STATE(state) != CONTACT_STATE.COMPLETE:
                # no agent avail. put contact back on queue with a 1 sec decelerator to allow other contacts to bubble up
                await client.zadd('queue', mapping={ contact_key: timestamp+1000 })
                logger.info(f'{contact_key} queued')
                await asyncio.sleep(uniform(0, 2))
    except Exception as err:
        if str(err) != "Connection closed by server.":
            logger.error(err)
            raise err

Source


Sunday, November 19, 2023

DICOM Image Caching with Redis

 Summary

This post covers a demonstration of the usage of Redis for caching DICOM imagery.  I use a Jupyter Notebook to step through loading and searching DICOM images in a Redis Enterprise environment.

Architecture




Redis Enterprise Environment

Screen-shot below of the resulting environment in Docker.



Sample DICOM Image

I use a portion of sample images included with the Pydicom lib.  Below is an example:


Code Snippets

Data Load

The code below loops through the DICOM files included with Pydicom.  Files that contain the meta-data used later in the search scenarios are broken up into 5 KB chunks, and each chunk is stored as a Redis String.  The meta-data and an array of the chunks' Redis key names are then saved to a Redis JSON object.

def load_chunks(key, file, chunk_size):
    i = 0
    chunk_keys = []
    with open(file, 'rb') as infile:
        while chunk := infile.read(chunk_size):
            chunk_key = f'chunk:{key}:{i}'
            client.set(chunk_key, chunk)
            chunk_keys.append(chunk_key)
            i += 1
    return chunk_keys

count = 0
pydicom.config.settings.reading_validation_mode = pydicom.config.RAISE
for file in pydicom.data.get_testdata_files():
    try:
        ds = pydicom.dcmread(file)
        key = f'file:{os.path.basename(file)}'
        image_name = os.path.basename(file)
        protocol_name = re.sub(r'\s+', ' ', ds.ProtocolName)
        patient_sex = ds.PatientSex
        study_date = ds.StudyDate
        manufacturer = ds.Manufacturer.upper()
        chunk_keys = load_chunks(key, file, CHUNK_SIZE)
        client.json().set(key, '$', {
            'imageName': image_name,
            'protocolName': protocol_name,
            'patientSex': patient_sex,
            'studyDate': study_date,
            'manufacturer': manufacturer,
            'chunks': chunk_keys
        })
        count += 1
    except Exception:
        # skip files that fail validation or lack the meta-data used by the searches
        pass
print(f'Files loaded: {count}')
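
The chunking scheme, and the reassembly that the search scenarios rely on, can be sketched with a plain dict standing in for Redis.  The post calls a get_bytes helper that is not shown, so join_chunks below is only a guess at that behavior.  The 5 KB chunk size is assumed to mean 5120 bytes, and the payload is synthetic stand-in data.

```python
CHUNK_SIZE = 5 * 1024  # assumed meaning of "5 KB"


def store_chunks(store: dict, key: str, payload: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split payload into fixed-size chunks and store each under its own key."""
    chunk_keys = []
    for i, offset in enumerate(range(0, len(payload), chunk_size)):
        chunk_key = f"chunk:{key}:{i}"
        store[chunk_key] = payload[offset:offset + chunk_size]
        chunk_keys.append(chunk_key)
    return chunk_keys


def join_chunks(store: dict, chunk_keys: list) -> bytes:
    """Reassemble the original bytes by reading the chunks in key-list order."""
    return b"".join(store[k] for k in chunk_keys)


store: dict = {}
payload = bytes(range(256)) * 50  # 12,800 bytes of stand-in image data
keys = store_chunks(store, "file:example.dcm", payload)
restored = join_chunks(store, keys)
print(len(keys), len(restored))
```

Keeping the ordered key list in the JSON document is what makes reassembly a simple in-order read of the chunk strings.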

Search Scenario 1

This code retrieves all the byte chunks for a DICOM image where the Redis key is known.  Strictly speaking, this isn't a 'search'.  I'm simply performing a JSON GET for a key name.

file_name = 'JPGExtended.dcm'
t1 = perf_counter()
results = client.json().get(f'file:{file_name}', '$.chunks')
total_bytes = get_bytes(results[0])
t2 = perf_counter()
print(f'Exec time: {round((t2-t1)*1000,2)} ms')
print(f'Bytes Retrieved: {len(total_bytes)}')

Search Scenario 2

The code below demonstrates how to put together a Redis Search on the image meta-data.  In this case, we're looking for a DICOM image with a protocolName of 194 and studyDate in 2019.

query = Query('@protocolName:194 @studyDate:{2019*}')\
    .return_field('$.chunks', as_field='chunks')\
    .return_field('$.imageName', as_field='imageName')
t1 = perf_counter()
result = client.ft('dicom_idx').search(query)
total_bytes = bytearray()
if len(result.docs) > 0:
    total_bytes = get_bytes(json.loads(result.docs[0].chunks))
t2 = perf_counter()
print(f'Exec time: {round((t2-t1)*1000,2)} ms')
if len(result.docs) > 0:
    # guard against an empty result set before reading the first document
    print(f'Image name: {result.docs[0].imageName}')
print(f'Bytes Retrieved: {len(total_bytes)}')

Source



Sunday, November 5, 2023

Redis Vector Database Sizing Tool

Summary

In this post, I cover a utility I wrote for observing Redis vector data and index sizes with varying data types and index parameters.  The tool creates a single-node, single-shard Redis Enterprise database with the Search and JSON modules enabled.

Code Snippets

Constants and Enums



REDIS_URL: str = 'redis://default:redis@localhost:12000'
NUM_KEYS: int = 100000
connection: Connection = None
class OBJECT_TYPE(Enum):
    HASH = 'hash'
    JSON = 'json'

class INDEX_TYPE(Enum):
    FLAT = 'flat'
    HNSW = 'hnsw'

class METRIC_TYPE(Enum):
    L2 = 'l2'
    IP = 'ip'
    COSINE = 'cosine'

class FLOAT_TYPE(Enum):
    F32 = 'float32'
    F64 = 'float64'

Redis Index Build and Data Load


global connection
num_workers: int = max(cpu_count() - 1, 1) #number of worker processes for data loading (at least one)
keys: list[int] = [self.num_keys // num_workers for i in range(num_workers)] #number of keys each worker will generate
keys[0] += self.num_keys % num_workers
connection.flushdb()
sleep(5) #wait for counters to reset
base_ram: float = round(connection.info('memory')['used_memory']/1048576, 2) # 'empty' Redis db memory usage
vec_params: dict = {
    "TYPE": self.float_type.value,
    "DIM": self.vec_dim,
    "DISTANCE_METRIC": self.metric_type.value,
}
if self.index_type is INDEX_TYPE.HNSW:
    vec_params['M'] = self.vec_m
match self.object_type:
    case OBJECT_TYPE.JSON:
        schema = [ VectorField('$.vector', self.index_type.value, vec_params, as_name='vector')]
        idx_def: IndexDefinition = IndexDefinition(index_type=IndexType.JSON, prefix=['key:'])
    case OBJECT_TYPE.HASH:
        schema = [ VectorField('vector', self.index_type.value, vec_params)]
        idx_def: IndexDefinition = IndexDefinition(index_type=IndexType.HASH, prefix=['key:'])
connection.ft('idx').create_index(schema, definition=idx_def)
pool_params = zip(keys, repeat(self.object_type), repeat(self.vec_dim), repeat(self.float_type))
t1_start: float = perf_counter()
with Pool(num_workers) as pool:
    pool.starmap(load_db, pool_params) # load a Redis instance via a pool of worker processes
t1_stop: float = perf_counter()

Sample Results


python3 vss-sizer.py --nkeys 100000 --objecttype hash --indextype flat --metrictype cosine --floattype f32 --vecdim 1536
Vector Index Test
*** Parameters ***
nkeys: 100000
objecttype: hash
indextype: flat
metrictype: cosine
floattype: float32
vecdim: 1536
*** Results ***
index ram used: 606.75 MB
data ram used: 808.52 MB
index to data ratio: 75.04%
document size: 7376 B
execution time: 2.3 sec
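
As a sanity check on these numbers, the raw vector payload can be estimated with simple arithmetic.  The Python sketch below computes the uncompressed size of the vectors alone and reproduces the reported index-to-data ratio.  The helper names are illustrative, and the estimate ignores any per-key and index overhead.

```python
BYTES_PER_FLOAT = {"float32": 4, "float64": 8}
MIB = 1048576  # same divisor the tool applies to used_memory


def raw_vector_mib(nkeys: int, vecdim: int, floattype: str) -> float:
    """Size of the bare vector data in MiB, with no Redis overhead."""
    return round(nkeys * vecdim * BYTES_PER_FLOAT[floattype] / MIB, 2)


def index_to_data_pct(index_mb: float, data_mb: float) -> float:
    """Index RAM as a percentage of data RAM."""
    return round(index_mb / data_mb * 100, 2)


print(raw_vector_mib(100000, 1536, "float32"))  # 585.94
print(index_to_data_pct(606.75, 808.52))        # 75.04
```

Each 1536-dimension float32 vector is 6144 bytes, so about 586 MiB of the reported 606.75 MB of index RAM in the sample run corresponds to the vectors themselves.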

Source



Redis Search - Rental Availability

Summary

This post covers a very specific use case of Redis in the short-term rental domain.  Specifically, Redis is used to find property availability in a given geographic area and date/time slot.

Architecture


Code Snippets

Data Load

The code below loads rental properties as Redis JSON objects and US Postal ZIP codes with their associated lat/longs as Redis strings.

async #insertProperties() {
    const csvStream = fs.createReadStream("./data/co.csv").pipe(parse({ delimiter: ",", from_line: 2}));
    let id = 1;
    for await (const row of csvStream) {
        const doc = {
            "id": id,
            "address": {
                "coords": `${row[0]} ${row[1]}`,
                "number": row[2],
                "street": row[3],
                "unit": row[4],
                "city": row[5],
                "state": "CO",
                "postcode": row[8]
            },
            "owner": {
                "fname": uniqueNamesGenerator({dictionaries: [names], style: 'capital', length: 1, separator: ' '}),
                "lname": uniqueNamesGenerator({dictionaries: [names], style: 'capital', length: 1, separator: ' '}),
            },
            "type": `${TYPES[Math.floor(Math.random() * TYPES.length)]}`,
            "availability": this.#getAvailability(),
            "rate": Math.round((Math.random() * 250 + 125) * 100) / 100
        }
        await this.client.json.set(`property:${id}`, '.', doc);
        id++;
        if (id > MAX_PROPERTIES) {
            break;
        }
    }
}

async #insertZips() {
    const csvStream = fs.createReadStream("./data/zip_lat_long.csv").pipe(parse({ delimiter: ",", from_line: 2}));
    for await (const row of csvStream) {
        const zip = row[0];
        const lat = row[1];
        const lon = row[2];
        await this.client.set(`zip:${zip}`, `${lon} ${lat}`);
    }
}

Property Search

The code below is an Express.js route for performing searches on the properties stored in Redis.  The search filters on rental property type and geographic distance from a given location, then keeps only properties with an availability slot that covers the requested begin and end times.

app.post('/property/search', async (req, res) => {
const { type, zip, radius, begin, end } = req.body;
console.log(`app - POST /property/search ${JSON.stringify(req.body)}`);
try {
const loc = await client.get(`zip:${zip}`);
if (!loc) {
throw new Error('Zip code not found');
}
const query = `@type:{${type}} @coords:[${loc} ${radius} mi]`;
const docs = await client.ft.aggregate('propIdx', query,
{
DIALECT: 3,
LOAD: [
'@__key',
{ identifier: `$.availability[?(@.begin<=${begin} && @.end>=${end})]`,
AS: 'match'
}
],
STEPS: [
{ type: AggregateSteps.FILTER,
expression: 'exists(@match)'
},
{
type: AggregateSteps.SORTBY,
BY: {
BY: '@rate',
DIRECTION: 'ASC'
}
},
{
type: AggregateSteps.LIMIT,
from: 0,
size: 3
}
]
});
if (docs && docs.results) {
let properties = [];
for (const result of docs.results) {
const rental_date = JSON.parse(result.match);
const property = {
"key": result.__key,
"rate": result.rate,
"begin": rental_date[0].begin,
"end": rental_date[0].end
};
properties.push(property);
}
console.log(`app - POST /property/search - properties found: ${properties.length}`);
res.status(200).json(properties);
}
else {
console.log('app - POST /property/search - no properties found');
res.status(404).send('No properties found');
}
}
catch (err) {
console.log(`app - POST /property/search - error: ${err.message}`);
res.status(400).json({ 'error': err.message });
}
});
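
The availability filter, rate sort, and limit in the aggregation above can be expressed in a few lines of Python for clarity.  This is a sketch of the logic only, with made-up property data and integer timestamps.  It does not use Redis, and the field names simply follow the JSON documents loaded earlier.

```python
def search(properties: list, begin: int, end: int, limit: int = 3) -> list:
    """Keep properties with a slot covering [begin, end], cheapest first, capped at limit."""
    matches = []
    for prop in properties:
        slot = next(
            (s for s in prop["availability"] if s["begin"] <= begin and s["end"] >= end),
            None,
        )
        if slot is not None:
            matches.append({"key": prop["key"], "rate": prop["rate"],
                            "begin": slot["begin"], "end": slot["end"]})
    return sorted(matches, key=lambda m: m["rate"])[:limit]


# Hypothetical sample data
properties = [
    {"key": "property:1", "rate": 300.0, "availability": [{"begin": 100, "end": 200}]},
    {"key": "property:2", "rate": 150.0, "availability": [{"begin": 120, "end": 180}]},
    {"key": "property:3", "rate": 200.0, "availability": [{"begin": 90, "end": 130}, {"begin": 140, "end": 400}]},
]

results = search(properties, begin=150, end=190)
print([r["key"] for r in results])
```

A slot only matches when it starts at or before the requested begin and ends at or after the requested end, so property:2 is excluded in this example.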

Source



Monday, May 29, 2023

OpenAI Q&A using Redis VSS for Context

Summary

I'll be covering the use case of providing supplemental context to OpenAI in a question/answer scenario (ChatGPT).  Various news articles will be vectorized and stored in Redis.  For a given question that lies outside of ChatGPT's knowledge, additional context will be fetched from Redis via Vector Similarity Search (VSS).   That context will aid ChatGPT in providing a more accurate answer.

Architecture


Code Snippets

OpenAI Prompt/Collect Helper Function


The code below is a simple function for sending a prompt into ChatGPT and then extracting the resulting response.
import openai
import os
from dotenv import load_dotenv
load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message["content"]

OpenAI QnA Example 1


The prompt below is on a topic (the FTX meltdown) that occurred after ChatGPT's training cut-off date. As a result, the response is of poor quality (wrong).
prompt = "Is Sam Bankman-Fried's company, FTX, considered a well-managed company?"
response = get_completion(prompt)
print(response)
As an AI language model, I cannot provide a personal opinion. However, FTX has been recognized as one of the fastest-growing cryptocurrency exchanges and has received positive reviews for its user-friendly interface, low fees, and innovative products. Additionally, Sam Bankman-Fried has been praised for his leadership and strategic decision-making, including FTX's recent acquisition of Blockfolio. Overall, FTX appears to be a well-managed company.

Redis Context Index Build


The code below uses the redis-py client library to build an index for business article content in Redis. The index has two fields in its schema: the text content itself and a vector representing the embedding of that text content.
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
schema = [ VectorField('$.vector',
"FLAT",
{ "TYPE": 'FLOAT32',
"DIM": 1536,
"DISTANCE_METRIC": "COSINE"
}, as_name='vector' ),
TextField('$.content', as_name='content')
]
idx_def = IndexDefinition(index_type=IndexType.JSON, prefix=['doc:'])
try:
    client.ft('idx').dropindex()
except Exception:
    pass  # the index may not exist yet
client.ft('idx').create_index(schema, definition=idx_def)

Context Storage as Redis JSON


The code below loads up a dozen different business articles into Redis as JSON objects.
import os
import openai
directory = './assets/'
model='text-embedding-ada-002'
i = 1
for file in os.listdir(directory):
    with open(os.path.join(directory, file)) as f:
        content = f.read()
        vector = openai.Embedding.create(input = [content], model = model)['data'][0]['embedding']
        client.json().set(f'doc:{i}', '$', {'content': content, 'vector': vector})
        i += 1

RedisInsight



Redis Vector Search (KNN)


A vector search in Redis is depicted below. This particular query picks the #1 article as far as vector distance to a given question (prompt).
from redis.commands.search.query import Query
import numpy as np
vec = np.array(openai.Embedding.create(input = [prompt], model = model)['data'][0]['embedding'], dtype=np.float32).tobytes()
q = Query('*=>[KNN 1 @vector $query_vec AS vector_score]')\
.sort_by('vector_score')\
.return_fields('content')\
.dialect(2)
params = {"query_vec": vec}
context = client.ft('idx').search(q, query_params=params).docs[0].content
print(context)
True
b'OK'
Embattled Crypto Exchange FTX Files for Bankruptcy
Nov. 11, 2022
On Monday, Sam Bankman-Fried, the chief executive of the cryptocurrency exchange FTX, took to Twitter to reassure his customers: “FTX is fine,” he wrote. “Assets are fine.”
On Friday, FTX announced that it was filing for bankruptcy, capping an extraordinary week of corporate drama that has upended crypto markets, sent shock waves through an industry struggling to gain mainstream credibility and sparked government investigations that could lead to more damaging revelations or even criminal charges.
In a statement on Twitter, the company said that Mr. Bankman-Fried had resigned, with John J. Ray III, a corporate turnaround specialist, taking over as chief executive.
The speed of FTX’s downfall has left crypto insiders stunned. Just days ago, Mr. Bankman-Fried was considered one of the smartest leaders in the crypto industry, an influential figure in Washington who was lobbying to shape regulations. And FTX was widely viewed as one of the most stable and responsible companies in the freewheeling, loosely regulated crypto industry.
“Here we are, with one of the richest people in the world, his net worth dropping to zero, his business dropping to zero,” said Jared Ellias, a bankruptcy professor at Harvard Law School. “The velocity of this failure is just unbelievable.”
Now, the bankruptcy has set up a rush among investors and customers to salvage funds from what remains of FTX. A surge of customers tried to withdraw funds from the platform this week, and the company couldn’t meet the demand. The exchange owes as much as $8 billion, according to people familiar with its finances.
FTX’s collapse has destabilized the crypto industry, which was already reeling from a crash in the spring that drained $1 trillion from the market. The prices of the leading cryptocurrencies, Bitcoin and Ether, have plummeted. The crypto lender BlockFi, which was closely entangled with FTX, announced on Thursday that it was suspending operations as a result of FTX’s collapse.
Mr. Bankman-Fried was backed by some of the highest-profile venture capital investors in Silicon Valley, including Sequoia Capital and Lightspeed Venture Partners. Some of those investors, facing questions about how closely they scrutinized FTX before they put money into it, have said that their nine-figure investments in the crypto exchange are now essentially worthless.
The company’s demise has also set off a reckoning over risky practices that have become pervasive in crypto, an industry that was founded partly as a corrective to the type of dangerous financial engineering that caused the 2008 economic crisis.
“I’m really sorry, again, that we ended up here,” Mr. Bankman-Fried said on Twitter on Friday. “Hopefully this can bring some amount of transparency, trust, and governance.”
The bankruptcy filing marks the start of what will probably be months or even years of legal fallout, as lawyers try to work out whether the exchange can ever continue to operate in some form and customers demand compensation. FTX is already the target of investigations by the Securities and Exchange Commission and the Justice Department, with investigators focused on whether the company improperly used customer funds to prop up Alameda Research, a trading firm that Mr. Bankman-Fried also founded.
...
Not long ago, Mr. Bankman-Fried was performing a comedy routine onstage at a conference with Anthony Scaramucci, the former White House communications director and a business partner of FTX.
“I’m disappointed,” Mr. Scaramucci said in an interview on CNBC on Friday. “Duped, I guess, is the right word.”
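
To show what a KNN query with the COSINE metric computes, here is a dependency-free Python sketch that ranks a few toy vectors by cosine distance (1 minus cosine similarity) to a query vector.  The three-dimensional vectors are invented for illustration.  Real embeddings from the model used in this post have 1536 dimensions.

```python
import math


def cosine_distance(a: list, b: list) -> float:
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def knn(query: list, docs: dict, k: int = 1) -> list:
    """Return the k document names closest to the query vector."""
    scored = sorted(docs.items(), key=lambda item: cosine_distance(query, item[1]))
    return [name for name, _ in scored[:k]]


# Toy vectors standing in for document embeddings
docs = {
    "doc:1": [1.0, 0.0, 0.0],
    "doc:2": [0.0, 1.0, 0.0],
    "doc:3": [0.7, 0.7, 0.0],
}
query_vec = [0.9, 0.1, 0.0]

nearest = knn(query_vec, docs, k=1)
top_two = knn(query_vec, docs, k=2)
print(nearest, top_two)
```

This brute-force scan is conceptually what a FLAT index does; it compares the query against every stored vector and returns the closest k.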

Reprompt ChatGPT with Redis-fetched Context


The context fetched in the previous step is now added as supplemental info to ChatGPT for the same FTX-related question. The response is now in line with expectations.
prompt = f"""
Using the information delimited by triple backticks, answer this question: Is Sam Bankman-Fried's company, FTX, considered a well-managed company?
Context: ```{context}```
"""
response = get_completion(prompt)
print(response)
No, Sam Bankman-Fried's company FTX is not considered a well-managed company as it has filed for bankruptcy and owes as much as $8 billion to its creditors. The collapse of FTX has destabilized the crypto industry, and the company is already the target of investigations by the Securities and Exchange Commission and the Justice Department. FTX was widely viewed as one of the most stable and responsible companies in the freewheeling, loosely regulated crypto industry, but its risky practices have become pervasive in crypto, leading to a reckoning.

Source



OpenAI + Redis VSS w/JSON

Summary

This post covers an example of how to use Redis Vector Similarity Search (VSS) capabilities with OpenAI as the embedding engine.  Documents are stored as JSON objects within Redis and then searched with VSS using KNN and hybrid queries.

Architecture

Code Snippets

OpenAI Embedding


def get_vector(text, model="text-embedding-ada-002"):
    text = text.replace("\n", " ")
    return openai.Embedding.create(input = [text], model = model)['data'][0]['embedding']

text_1 = """Japan narrowly escapes recession
Japan's economy teetered on the brink of a technical recession in the three months to September, figures show.
Revised figures indicated growth of just 0.1% - and a similar-sized contraction in the previous quarter. On an annual basis, the data suggests annual growth of just 0.2%, suggesting a much more hesitant recovery than had previously been thought. A common technical definition of a recession is two successive quarters of negative growth.
The government was keen to play down the worrying implications of the data. "I maintain the view that Japan's economy remains in a minor adjustment phase in an upward climb, and we will monitor developments carefully," said economy minister Heizo Takenaka. But in the face of the strengthening yen making exports less competitive and indications of weakening economic conditions ahead, observers were less sanguine. "It's painting a picture of a recovery... much patchier than previously thought," said Paul Sheard, economist at Lehman Brothers in Tokyo. Improvements in the job market apparently have yet to feed through to domestic demand, with private consumption up just 0.2% in the third quarter.
"""
doc_1 = {"content": text_1, "vector": get_vector(text_1)}

Redis Index Creation


from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

schema = [
    VectorField('$.vector',
                "FLAT",
                {"TYPE": 'FLOAT32',
                 "DIM": len(doc_1['vector']),
                 "DISTANCE_METRIC": "COSINE"},
                as_name='vector'),
    TextField('$.content', as_name='content')
]
idx_def = IndexDefinition(index_type=IndexType.JSON, prefix=['doc:'])
# Drop any existing index of the same name before re-creating it
try:
    client.ft('idx').dropindex()
except Exception:
    pass
client.ft('idx').create_index(schema, definition=idx_def)
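
The index above is configured with the COSINE distance metric. As a rough illustration of what the resulting vector_score represents, the sketch below computes cosine distance (1 minus cosine similarity) in plain Python. The function name cosine_distance is hypothetical and is not part of Redis or redis-py; Redis performs this calculation internally.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Vectors pointing the same way have distance 0; orthogonal vectors have distance 1
print(round(cosine_distance([1.0, 0.0], [1.0, 0.0]), 3))  # 0.0
print(round(cosine_distance([1.0, 0.0], [0.0, 1.0]), 3))  # 1.0
```

Smaller values mean closer matches, which is why the queries in this post sort ascending on the score.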

Redis JSON Document Insertion


client.json().set('doc:1', '$', doc_1)
client.json().set('doc:2', '$', doc_2)
client.json().set('doc:3', '$', doc_3)

RedisInsight



Redis Semantic Search (KNN)


text_4 = """Radcliffe yet to answer GB call
Paula Radcliffe has been granted extra time to decide whether to compete in the World Cross-Country Championships.
The 31-year-old is concerned the event, which starts on 19 March in France, could upset her preparations for the London Marathon on 17 April. "There is no question that Paula would be a huge asset to the GB team," said Zara Hyde Peters of UK Athletics. "But she is working out whether she can accommodate the worlds without too much compromise in her marathon training." Radcliffe must make a decision by Tuesday - the deadline for team nominations. British team member Hayley Yelling said the team would understand if Radcliffe opted out of the event. "It would be fantastic to have Paula in the team," said the European cross-country champion. "But you have to remember that athletics is basically an individual sport and anything achieved for the team is a bonus. "She is not messing us around. We all understand the problem." Radcliffe was world cross-country champion in 2001 and 2002 but missed last year's event because of injury. In her absence, the GB team won bronze in Brussels.
"""
import numpy as np
from redis.commands.search.query import Query

# The query vector is sent to Redis as raw FLOAT32 bytes
vec = np.array(get_vector(text_4), dtype=np.float32).tobytes()
q = Query('*=>[KNN 3 @vector $query_vec AS vector_score]')\
    .sort_by('vector_score')\
    .return_fields('vector_score', 'content')\
    .dialect(2)
params = {"query_vec": vec}
results = client.ft('idx').search(q, query_params=params)
for doc in results.docs:
    print(f"distance:{round(float(doc['vector_score']),3)} content:{doc['content']}\n")
distance:0.188 content:Dibaba breaks 5,000m world record
Ethiopia's Tirunesh Dibaba set a new world record in winning the women's 5,000m at the Boston Indoor Games.
Dibaba won in 14 minutes 32.93 seconds to erase the previous world indoor mark of 14:39.29 set by another Ethiopian, Berhane Adera, in Stuttgart last year. But compatriot Kenenisa Bekele's record hopes were dashed when he miscounted his laps in the men's 3,000m and staged his sprint finish a lap too soon. Ireland's Alistair Cragg won in 7:39.89 as Bekele battled to second in 7:41.42. "I didn't want to sit back and get out-kicked," said Cragg. "So I kept on the pace. The plan was to go with 500m to go no matter what, but when Bekele made the mistake that was it. The race was mine." Sweden's Carolina Kluft, the Olympic heptathlon champion, and Slovenia's Jolanda Ceplak had winning performances, too. Kluft took the long jump at 6.63m, while Ceplak easily won the women's 800m in 2:01.52.
distance:0.268 content:Japan narrowly escapes recession
Japan's economy teetered on the brink of a technical recession in the three months to September, figures show.
Revised figures indicated growth of just 0.1% - and a similar-sized contraction in the previous quarter. On an annual basis, the data suggests annual growth of just 0.2%, suggesting a much more hesitant recovery than had previously been thought. A common technical definition of a recession is two successive quarters of negative growth.
The government was keen to play down the worrying implications of the data. "I maintain the view that Japan's economy remains in a minor adjustment phase in an upward climb, and we will monitor developments carefully," said economy minister Heizo Takenaka. But in the face of the strengthening yen making exports less competitive and indications of weakening economic conditions ahead, observers were less sanguine. "It's painting a picture of a recovery... much patchier than previously thought," said Paul Sheard, economist at Lehman Brothers in Tokyo. Improvements in the job market apparently have yet to feed through to domestic demand, with private consumption up just 0.2% in the third quarter.
distance:0.287 content:Google's toolbar sparks concern
Search engine firm Google has released a trial tool which is concerning some net users because it directs people to pre-selected commercial websites.
The AutoLink feature comes with Google's latest toolbar and provides links in a webpage to Amazon.com if it finds a book's ISBN number on the site. It also links to Google's map service, if there is an address, or to car firm Carfax, if there is a licence plate. Google said the feature, available only in the US, "adds useful links". But some users are concerned that Google's dominant position in the search engine market place could mean it would be giving a competitive edge to firms like Amazon.
AutoLink works by creating a link to a website based on information contained in a webpage - even if there is no link specified and whether or not the publisher of the page has given permission.
If a user clicks the AutoLink feature in the Google toolbar then a webpage with a book's unique ISBN number would link directly to Amazon's website. It could mean online libraries that list ISBN book numbers find they are directing users to Amazon.com whether they like it or not. Websites which have paid for advertising on their pages may also be directing people to rival services. Dan Gillmor, founder of Grassroots Media, which supports citizen-based media, said the tool was a "bad idea, and an unfortunate move by a company that is looking to continue its hypergrowth". In a statement Google said the feature was still only in beta, ie trial, stage and that the company welcomed feedback from users. It said: "The user can choose never to click on the AutoLink button, and web pages she views will never be modified. "In addition, the user can choose to disable the AutoLink feature entirely at any time."
The new tool has been compared to the Smart Tags feature from Microsoft by some users. It was widely criticised by net users and later dropped by Microsoft after concerns over trademark use were raised. Smart Tags allowed Microsoft to link any word on a web page to another site chosen by the company. Google said none of the companies which received AutoLinks had paid for the service. Some users said AutoLink would only be fair if websites had to sign up to allow the feature to work on their pages or if they received revenue for any "click through" to a commercial site. Cory Doctorow, European outreach coordinator for digital civil liberties group Electronic Fronter Foundation, said that Google should not be penalised for its market dominance. "Of course Google should be allowed to direct people to whatever proxies it chooses. "But as an end user I would want to know - 'Can I choose to use this service?, 'How much is Google being paid?', 'Can I substitute my own companies for the ones chosen by Google?'." Mr Doctorow said the only objection would be if users were forced into using AutoLink or "tricked into using the service".
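
The KNN query above passes the query vector to Redis as a raw byte string rather than as a list of numbers. A minimal sketch of that packing, using only the standard library, is below. It assumes little-endian FLOAT32 values to match the index definition; to_float32_blob and from_float32_blob are hypothetical helper names, not redis-py functions.

```python
import struct

def to_float32_blob(vector):
    """Pack a list of numbers into little-endian 32-bit floats (hypothetical helper)."""
    return struct.pack(f"<{len(vector)}f", *vector)

def from_float32_blob(blob):
    """Unpack a little-endian float32 blob back into a list of floats."""
    count = len(blob) // 4  # each FLOAT32 value occupies 4 bytes
    return list(struct.unpack(f"<{count}f", blob))

blob = to_float32_blob([2.0, 2.0, 3.0, 3.0])
print(len(blob))                # 16
print(from_float32_blob(blob))  # [2.0, 2.0, 3.0, 3.0]
```

The blob length must equal 4 times the DIM value declared on the vector field.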

Redis Hybrid Search (Full-text + KNN)


text_5 = """Ethiopia's crop production up 24%
Ethiopia produced 14.27 million tonnes of crops in 2004, 24% higher than in 2003 and 21% more than the average of the past five years, a report says.
In 2003, crop production totalled 11.49 million tonnes, the joint report from the Food and Agriculture Organisation and the World Food Programme said. Good rains, increased use of fertilizers and improved seeds contributed to the rise in production. Nevertheless, 2.2 million Ethiopians will still need emergency assistance.
The report calculated emergency food requirements for 2005 to be 387,500 tonnes. On top of that, 89,000 tonnes of fortified blended food and vegetable oil for "targeted supplementary food distributions for a survival programme for children under five and pregnant and lactating women" will be needed.
In eastern and southern Ethiopia, a prolonged drought has killed crops and drained wells. Last year, a total of 965,000 tonnes of food assistance was needed to help seven million Ethiopians. The Food and Agriculture Organisation (FAO) recommend that the food assistance is bought locally. "Local purchase of cereals for food assistance programmes is recommended as far as possible, so as to assist domestic markets and farmers," said Henri Josserand, chief of FAO's Global Information and Early Warning System. Agriculture is the main economic activity in Ethiopia, representing 45% of gross domestic product. About 80% of Ethiopians depend directly or indirectly on agriculture.
"""
vec = np.array(get_vector(text_5), dtype=np.float32).tobytes()
q = Query('@content:recession => [KNN 3 @vector $query_vec AS vector_score]')\
    .sort_by('vector_score')\
    .return_fields('vector_score', 'content')\
    .dialect(2)
params = {"query_vec": vec}
results = client.ft('idx').search(q, query_params=params)
for doc in results.docs:
    print(f"distance:{round(float(doc['vector_score']),3)} content:{doc['content']}\n")
distance:0.241 content:Japan narrowly escapes recession
Japan's economy teetered on the brink of a technical recession in the three months to September, figures show.
Revised figures indicated growth of just 0.1% - and a similar-sized contraction in the previous quarter. On an annual basis, the data suggests annual growth of just 0.2%, suggesting a much more hesitant recovery than had previously been thought. A common technical definition of a recession is two successive quarters of negative growth.
The government was keen to play down the worrying implications of the data. "I maintain the view that Japan's economy remains in a minor adjustment phase in an upward climb, and we will monitor developments carefully," said economy minister Heizo Takenaka. But in the face of the strengthening yen making exports less competitive and indications of weakening economic conditions ahead, observers were less sanguine. "It's painting a picture of a recovery... much patchier than previously thought," said Paul Sheard, economist at Lehman Brothers in Tokyo. Improvements in the job market apparently have yet to feed through to domestic demand, with private consumption up just 0.2% in the third quarter.
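
The hybrid query above first narrows the candidates with a full-text filter (@content:recession) and then ranks the survivors by vector distance, which is why only one document is returned even though KNN 3 was requested. The in-memory sketch below mimics that two-stage behavior on toy data. It is an illustration only: the tiny 2-D vectors, the hybrid_search function, and the simple word match are assumptions, and Redis full-text matching (tokenization, stemming) is more involved than this.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_search(docs, term, query_vec, k=3):
    """Filter on a keyword first, then return the k nearest survivors (hypothetical helper)."""
    candidates = [d for d in docs if term.lower() in d["content"].lower().split()]
    scored = [(cosine_distance(d["vector"], query_vec), d["content"]) for d in candidates]
    return sorted(scored)[:k]

# Toy documents with made-up 2-D vectors standing in for real embeddings
docs = [
    {"content": "Japan narrowly escapes recession", "vector": [0.9, 0.1]},
    {"content": "Dibaba breaks 5,000m world record", "vector": [0.1, 0.9]},
    {"content": "Google's toolbar sparks concern", "vector": [0.5, 0.5]},
]
results = hybrid_search(docs, "recession", [0.8, 0.2])
for distance, content in results:
    print(f"distance:{round(distance, 3)} content:{content}")
```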

Source



Saturday, May 27, 2023

Redis Polygon Search

Summary

This post demonstrates a new search feature in Redis: geospatial search with polygons.  The feature is part of the 7.2.0-M01 Redis Stack release.  This initial release supports only the WITHIN and CONTAINS query types for polygons.  Additional geospatial search types are expected in future releases.

Architecture


Code Snippets

Point Generation

I use the Shapely module to generate the geometries for this demo.  The code snippet below will generate a random point, optionally within a bounding box.

def _get_point(self, box: Polygon = None) -> Point:
    """ Private function to generate a random point, potentially within a bounding box
        Parameters
        ----------
        box - Optional bounding box

        Returns
        -------
        Shapely Point object
    """
    point: Point
    if box:
        minx, miny, maxx, maxy = box.bounds
        # Draw from the bounding rectangle until the point lands inside the polygon
        while True:
            point = Point(random.uniform(minx, maxx), random.uniform(miny, maxy))
            if box.contains(point):
                break
    else:
        point = Point(random.uniform(MIN_X, MAX_X), random.uniform(MIN_Y, MAX_Y))
    return point
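
The function above is a form of rejection sampling: candidate points are drawn from the polygon's bounding rectangle and discarded until one falls inside the polygon. The sketch below shows the same idea without Shapely, using a ray-casting test in place of box.contains(). The names point_in_polygon and random_point_in_polygon are hypothetical, and the triangle is made-up test data.

```python
import random

def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the polygon given as (x, y) tuples."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def random_point_in_polygon(vertices, rng=random):
    """Rejection sampling: draw from the bounding rectangle until a point is inside."""
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    while True:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, vertices):
            return (x, y)

triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
point = random_point_in_polygon(triangle)
print(point_in_polygon(point[0], point[1], triangle))  # True
```

The loop is efficient when the polygon fills a good share of its bounding rectangle; for thin or sparse shapes, more candidates are rejected before one is accepted.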

Polygon Generation

Random polygons can be generated using the random point function above.  By passing a polygon as an input parameter, the generated polygon can be placed inside that input polygon.

def _get_polygon(self, box: Polygon = None) -> Polygon:
    """ Private function to generate a random polygon, potentially within a bounding box
        Parameters
        ----------
        box - Optional bounding box

        Returns
        -------
        Shapely Polygon object
    """
    points: List[Point] = []
    for _ in range(random.randint(3,10)):
        points.append(self._get_point(box))
    # The convex hull of the random points becomes the polygon boundary
    ob: MultiPoint = MultiPoint(points)
    return Polygon(ob.convex_hull)
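
The polygon above is the convex hull of a handful of random points: the smallest convex shape that encloses all of them, so interior points drop out of the boundary. As an illustration of what that step produces, here is a minimal pure-Python sketch using Andrew's monotone chain algorithm. This is a swapped-in technique for demonstration, not how Shapely implements convex_hull, and the sample points are made up.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b makes a counter-clockwise turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]

points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
print(convex_hull(points))  # [(0, 0), (2, 0), (2, 2), (0, 2)]
```

The interior point (1, 1) is excluded, which is why a polygon generated from 3 to 10 random points may end up with fewer vertices than points.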

Redis Polygon Search Index

The command below creates an index on the JSON documents, using the new 'GEOSHAPE' field type for their WKT-formatted geometries.  Note this code sends a raw CLI command to Redis; the redis-py lib does not support the new geospatial command sets at the time of this writing.

self.client.execute_command('FT.CREATE', 'idx', 'ON', 'JSON', 'PREFIX', '1', 'key:',
'SCHEMA', '$.name', 'AS', 'name', 'TEXT', '$.geom', 'AS', 'geom', 'GEOSHAPE', 'FLAT')

Redis Polygon Load as JSON

The code below inserts 4 polygons and 4 points into Redis as JSON objects.  Those objects are indexed within Redis by the index created above.
  
self.client.json().set('key:1', '$', { "name": "Red Polygon", "geom": poly_red.wkt })
self.client.json().set('key:2', '$', { "name": "Green Polygon", "geom": poly_green.wkt })
self.client.json().set('key:3', '$', { "name": "Blue Polygon", "geom": poly_blue.wkt })
self.client.json().set('key:4', '$', { "name": "Cyan Polygon", "geom": poly_cyan.wkt })
self.client.json().set('key:5', '$', { "name": "Purple Point", "geom": point_purple.wkt })
self.client.json().set('key:6', '$', { "name": "Brown Point", "geom": point_brown.wkt })
self.client.json().set('key:7', '$', { "name": "Orange Point", "geom": point_orange.wkt })
self.client.json().set('key:8', '$', { "name": "Olive Point", "geom": point_olive.wkt })
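
Each geom value stored above is a WKT (Well-Known Text) string produced by Shapely's .wkt property. To show roughly what those strings look like, the sketch below builds them by hand. The helpers point_wkt and polygon_wkt are hypothetical, the coordinates are made up, and the :g number formatting is a readability choice that may differ from Shapely's exact output.

```python
def point_wkt(x, y):
    """Format a point as WKT text (hypothetical helper)."""
    return f"POINT ({x:g} {y:g})"

def polygon_wkt(vertices):
    """Format a single-ring polygon as WKT text (hypothetical helper)."""
    ring = list(vertices)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # a WKT ring repeats its first vertex to close the shape
    coords = ", ".join(f"{x:g} {y:g}" for x, y in ring)
    return f"POLYGON (({coords}))"

doc = {"name": "Example Polygon", "geom": polygon_wkt([(0, 0), (4, 0), (4, 4), (0, 4)])}
print(doc["geom"])        # POLYGON ((0 0, 4 0, 4 4, 0 4, 0 0))
print(point_wkt(1.5, 2))  # POINT (1.5 2)
```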

Redis Polygon Search

The code for the polygon search (CONTAINS or WITHIN) is below.  Again, this sends the raw CLI command.
def _poly_search(self, qt: QUERY, color: COLOR, shape: Polygon, filter: SHAPE) -> None:
    """ Private function for POLYGON search in Redis.
        Parameters
        ----------
        qt - Redis Geometry search type (contains or within)
        color - color attribute of polygon
        shape - Shapely point or polygon object
        filter - query filter on shape types (polygon or point) to be returned

        Returns
        -------
        None
    """
    results: list = self.client.execute_command('FT.SEARCH', 'idx', f'(-@name:{color.value} @name:{filter.value} @geom:[{qt.value} $qshape])', 'PARAMS', '2', 'qshape', shape.wkt, 'RETURN', '1', 'name', 'DIALECT', '3')
    # The first element of the reply is the match count
    if results[0] > 0:
        for res in results:
            if isinstance(res, list):
                print(res[1].decode('utf-8').strip('[]"'))
    else:
        print('None')
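
The search function above assembles a query string and then walks the raw reply. The sketch below separates those two steps so they can be tried without a Redis server. It is an assumption-laden illustration: build_search_args and parse_names are hypothetical helpers, and mock_reply is hand-written to match the shape the original code expects (a count, then alternating keys and [field, value] lists, with each value a JSON array string under DIALECT 3).

```python
import json

def build_search_args(index, query_type, wkt, exclude_name, shape_filter):
    """Assemble the FT.SEARCH argument list used for a polygon query (hypothetical helper)."""
    query = f"(-@name:{exclude_name} @name:{shape_filter} @geom:[{query_type} $qshape])"
    return ["FT.SEARCH", index, query, "PARAMS", "2", "qshape", wkt,
            "RETURN", "1", "name", "DIALECT", "3"]

def parse_names(reply):
    """Pull the name values out of a raw reply shaped like mock_reply below."""
    names = []
    for item in reply[1:]:          # reply[0] is the match count
        if isinstance(item, list):  # [field, value] pairs; keys are plain bytes
            value = item[1]
            if isinstance(value, bytes):
                value = value.decode("utf-8")
            names.extend(json.loads(value))
    return names

args = build_search_args("idx", "WITHIN", "POLYGON ((0 0, 4 0, 4 4, 0 4, 0 0))", "Red", "Polygon")
print(args[2])

# Hand-written stand-in for a server reply; not captured from a real Redis instance
mock_reply = [2, b"key:2", [b"name", b'["Green Polygon"]'],
              b"key:3", [b"name", b'["Blue Polygon"]']]
print(parse_names(mock_reply))  # ['Green Polygon', 'Blue Polygon']
```

Decoding the value with json.loads is an alternative to stripping the bracket and quote characters by hand, and it also copes with names that contain those characters.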

Results

Plot






Results


*** Polygons within the Red Polygon ***
Green Polygon
Blue Polygon
Cyan Polygon
*** Polygons within the Green Polygon ***
Blue Polygon
Cyan Polygon
*** Polygons within the Blue Polygon ***
Cyan Polygon
*** Polygons within the Cyan Polygon ***
None
*** Points within the Red Polygon ***
Purple Point
Brown Point
Orange Point
Olive Point
*** Points within the Green Polygon ***
Purple Point
Brown Point
Orange Point
Olive Point
*** Points within the Blue Polygon ***
Purple Point
Brown Point
*** Points within the Cyan Polygon ***
Purple Point
Brown Point
*** Polygons containing the Red Polygon ***
None
*** Polygons containing the Green Polygon ***
Red Polygon
*** Polygons containing the Blue Polygon ***
Red Polygon
Green Polygon
*** Polygons containing the Cyan Polygon ***
Red Polygon
Green Polygon
Blue Polygon
*** Polygons containing the Purple Point ***
Red Polygon
Green Polygon
Blue Polygon
Cyan Polygon
*** Polygons containing the Brown Point ***
Red Polygon
Green Polygon
Blue Polygon
Cyan Polygon
*** Polygons containing the Orange Point ***
Red Polygon
Green Polygon
*** Polygons containing the Olive Point ***
Red Polygon
Green Polygon

Source



Sunday, April 23, 2023

Redis Document/Search - Java Examples

Summary

This post provides some snippets from the Redis Query Workshop, available on GitHub.  That workshop covers parallel examples in CLI, Python, Nodejs, Java, and C#.  This post focuses on the Java examples.



Basic JSON


no = new JSONObject();
no.put("num1", 1);
no.put("arr2", new JSONArray(Arrays.asList("val1", "val2", "val3")));
jo = new JSONObject();
jo.put("str1", "val1");
jo.put("str2", "val2");
jo.put("arr1", new JSONArray(Arrays.asList(1, 2, 3, 4)));
jo.put("obj1", no);
client.jsonSet("ex2:5", jo);
res = client.jsonGetAsPlainString("ex2:5", new Path("$.obj1.arr2"));
System.out.println(res);
res = client.jsonGetAsPlainString("ex2:5", new Path("$.arr1[1]"));
System.out.println(res);
res = client.jsonGetAsPlainString("ex2:5", new Path("$.obj1.arr2[0:2]"));
System.out.println(res);
res = client.jsonGetAsPlainString("ex2:5", new Path("$.arr1[-2:]"));
System.out.println(res);
[["val1","val2","val3"]]
[2]
["val1","val2"]
[3,4]

Basic Search


Schema schema = new Schema().addNumericField("$.id")
.addTagField("$.gender").as("gender")
.addTagField("$.season.*").as("season")
.addTextField("$.description", 1.0).as("description")
.addNumericField("$.price").as("price")
.addTextField("$.city",1.0).as("city")
.addGeoField("$.coords").as("coords");
IndexDefinition rule = new IndexDefinition(IndexDefinition.Type.JSON)
.setPrefixes(new String[]{"product:"});
client.ftCreate("idx1", IndexOptions.defaultOptions().setDefinition(rule), schema);
private class Product {
private int id;
private String gender;
private String[] season;
private String description;
private double price;
private String city;
private String coords;
public Product(int id, String gender, String[] season,
String description, double price, String city, String coords) {
this.id = id;
this.gender = gender;
this.season = season;
this.description = description;
this.price = price;
this.city = city;
this.coords = coords;
}
}
Product prod15970 = new Product(15970, "Men", new String[]{"Fall", "Winter"}, "Turtle Check Men Navy Blue Shirt",
34.95, "Boston", "-71.057083, 42.361145");
Product prod59263 = new Product(59263, "Women", new String[]{"Fall", "Winter", "Spring", "Summer"}, "Titan Women Silver Watch",
129.99, "Dallas", "-96.808891, 32.779167");
Product prod46885 = new Product(46885, "Boys", new String[]{"Fall"}, "Ben 10 Boys Navy Blue Slippers",
45.99, "Denver", "-104.991531, 39.742043");
Gson gson = new Gson();
client.jsonSet("product:15970", gson.toJson(prod15970));
client.jsonSet("product:59263", gson.toJson(prod59263));
client.jsonSet("product:46885", gson.toJson(prod46885));
q = new Query("@price:[40, 100] @description:Blue");
res = client.ftSearch("idx1", q);
docs = res.getDocuments();
for (Document doc : docs) {
System.out.println(doc);
}
id:product:46885, score: 1.0, payload:null, properties:[$={"id":46885,"gender":"Boys","season":["Fall"],"description":"Ben 10 Boys Navy Blue Slippers","price":45.99,"city":"Denver","coords":"-104.991531, 39.742043"}]
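
The query above combines two clauses separated by a space, which Redis treats as an intersection: a document must fall within the numeric range and also contain the text term. To make that logic concrete, here is a small in-memory sketch, written in Python for brevity. The search function is hypothetical and uses a plain word match, which is far simpler than Redis full-text indexing.

```python
products = [
    {"id": 15970, "description": "Turtle Check Men Navy Blue Shirt", "price": 34.95},
    {"id": 59263, "description": "Titan Women Silver Watch", "price": 129.99},
    {"id": 46885, "description": "Ben 10 Boys Navy Blue Slippers", "price": 45.99},
]

def search(items, low, high, term):
    """Keep items whose price is in [low, high] AND whose description has the term."""
    term = term.lower()
    return [
        p for p in items
        if low <= p["price"] <= high and term in p["description"].lower().split()
    ]

matches = search(products, 40, 100, "Blue")
print([p["id"] for p in matches])  # [46885]
```

The shirt also contains "Blue" but is priced below 40, so only the slippers satisfy both clauses.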

Advanced JSON


private class Product {
private int id;
private String gender;
private String[] season;
private String description;
private double price;
public Product(int id, String gender, String[] season, String description, double price) {
this.id = id;
this.gender = gender;
this.season = season;
this.description = description;
this.price = price;
}
}
private class Warehouse {
private String city;
private String location;
private Product[] inventory;
public Warehouse(String city, String location, Product[] inventory) {
this.city = city;
this.location = location;
this.inventory = inventory;
}
}
Product prod15970 = new Product(15970, "Men", new String[]{"Fall", "Winter"}, "Turtle Check Men Navy Blue Shirt",
34.95);
Product prod59263 = new Product(59263, "Women", new String[]{"Fall", "Winter", "Spring", "Summer"}, "Titan Women Silver Watch",
129.99);
Product prod46885 = new Product(46885, "Boys", new String[]{"Fall"}, "Ben 10 Boys Navy Blue Slippers",
45.99);
Warehouse wh1 = new Warehouse("Boston", "42.361145, -71.057083",
new Product[]{prod15970, prod59263, prod46885});
Gson gson = new Gson();
client.jsonSet("warehouse:1", gson.toJson(wh1));
res = client.jsonGetAsPlainString("warehouse:1",
new Path("$.inventory[?(@.description==\"Turtle Check Men Navy Blue Shirt\")]"));
System.out.println(res);
[{"id":15970,"gender":"Men","season":["Fall","Winter"],"description":"Turtle Check Men Navy Blue Shirt","price":34.95}]

Advanced Search


HashMap<String, Object> attr = new HashMap<String, Object>();
attr.put("TYPE", "FLOAT32");
attr.put("DIM", "4");
attr.put("DISTANCE_METRIC", "L2");
Schema schema = new Schema().addVectorField("$.vector", Schema.VectorField.VectorAlgo.FLAT, attr).as("vector");
IndexDefinition rule = new IndexDefinition(IndexDefinition.Type.JSON)
.setPrefixes(new String[]{"vec:"});
client.ftCreate("vss_idx", IndexOptions.defaultOptions().setDefinition(rule), schema);
client.jsonSet("vec:1", new JSONObject("{\"vector\": [1,1,1,1]}"));
client.jsonSet("vec:2", new JSONObject("{\"vector\": [2,2,2,2]}"));
client.jsonSet("vec:3", new JSONObject("{\"vector\": [3,3,3,3]}"));
client.jsonSet("vec:4", new JSONObject("{\"vector\": [4,4,4,4]}"));
float[] vec = new float[]{2,2,3,3};
ByteBuffer buffer = ByteBuffer.allocate(vec.length * Float.BYTES);
buffer.order(ByteOrder.LITTLE_ENDIAN);
buffer.asFloatBuffer().put(vec);
Query q = new Query("*=>[KNN 3 @vector $query_vec]")
.addParam("query_vec", buffer.array())
.setSortBy("__vector_score", true)
.dialect(2);
SearchResult res = client.ftSearch("vss_idx", q);
List<Document> docs = res.getDocuments();
for (Document doc : docs) {
System.out.println(doc);
}
id:vec:2, score: 1.0, payload:null, properties:[$={"vector":[2,2,2,2]}, __vector_score=2]
id:vec:3, score: 1.0, payload:null, properties:[$={"vector":[3,3,3,3]}, __vector_score=2]
id:vec:1, score: 1.0, payload:null, properties:[$={"vector":[1,1,1,1]}, __vector_score=10]
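
The scores in the output above are consistent with squared Euclidean distance between the query vector [2,2,3,3] and each stored vector. The short Python sketch below reproduces those numbers by hand. The function squared_l2 is a hypothetical helper, and the tie between vec:1 and vec:4 (both at 10) is broken here by key name, which is an assumption rather than documented Redis behavior.

```python
def squared_l2(a, b):
    """Sum of squared component differences (hypothetical helper)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

vectors = {
    "vec:1": [1, 1, 1, 1],
    "vec:2": [2, 2, 2, 2],
    "vec:3": [3, 3, 3, 3],
    "vec:4": [4, 4, 4, 4],
}
query = [2, 2, 3, 3]

# Rank all vectors by distance and keep the 3 nearest, as KNN 3 does
ranked = sorted((squared_l2(v, query), key) for key, v in vectors.items())[:3]
for score, key in ranked:
    print(key, score)
```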

Source



Redis Document/Search - C# Examples

Summary

This post provides some snippets from the Redis Query Workshop, available on GitHub.  That workshop covers parallel examples in CLI, Python, Nodejs, Java, and C#.  This post focuses on the C# examples.



Basic JSON


json.Set("ex3:1", "$", new {field1 = "val1"});
json.Set("ex3:1", "$", new {foo = "bar"});
Console.WriteLine(json.Get(key: "ex3:1",
indent: "\t",
newLine: "\n"
));
{
"foo":"bar"
}

Basic Search


ISearchCommands ft = db.FT();
try {ft.DropIndex("idx1");} catch {};
ft.Create("idx1", new FTCreateParams().On(IndexDataType.JSON)
.Prefix("product:"),
new Schema().AddNumericField(new FieldName("$.id", "id"))
.AddTagField(new FieldName("$.gender", "gender"))
.AddTagField(new FieldName("$.season.*", "season"))
.AddTextField(new FieldName("$.description", "description"))
.AddNumericField(new FieldName("$.price", "price"))
.AddTextField(new FieldName("$.city", "city"))
.AddGeoField(new FieldName("$.coords", "coords")));
IJsonCommands json = db.JSON();
json.Set("product:15970", "$", new {
id = 15970,
gender = "Men",
season = new[] {"Fall", "Winter"},
description = "Turtle Check Men Navy Blue Shirt",
price = 34.95,
city = "Boston",
coords = "-71.057083, 42.361145"
});
json.Set("product:59263", "$", new {
id = 59263,
gender = "Women",
season = new[] {"Fall", "Winter", "Spring", "Summer"},
description = "Titan Women Silver Watch",
price = 129.99,
city = "Dallas",
coords = "-96.808891, 32.779167"
});
json.Set("product:46885", "$", new {
id = 46885,
gender = "Boys",
season = new[] {"Fall"},
description = "Ben 10 Boys Navy Blue Slippers",
price = 45.99,
city = "Denver",
coords = "-104.991531, 39.742043"
});
foreach (var doc in ft.Search("idx1", new Query("@season:{Spring}"))
.Documents.Select(x => x["json"]))
{
Console.WriteLine(doc);
}
{"id":59263,"gender":"Women","season":["Fall","Winter","Spring","Summer"],"description":"Titan Women Silver Watch","price":129.99,"city":"Dallas","coords":"-96.808891, 32.779167"}

Advanced JSON


IJsonCommands json = db.JSON();
json.Set("warehouse:1", "$", new {
city = "Boston",
location = "42.361145, -71.057083",
inventory = new[] {
new {
id = 15970,
gender = "Men",
season = new[] {"Fall", "Winter"},
description = "Turtle Check Men Navy Blue Shirt",
price = 34.95
},
new {
id = 59263,
gender = "Women",
season = new[] {"Fall", "Winter", "Spring", "Summer"},
description = "Titan Women Silver Watch",
price = 129.99
},
new {
id = 46885,
gender = "Boys",
season = new[] {"Fall"},
description = "Ben 10 Boys Navy Blue Slippers",
price = 45.99
}
}
});
Console.WriteLine(json.Get(key: "warehouse:1",
path: "$.inventory[?(@.price<100)]",
indent: "\t",
newLine: "\n"
));
[
{
"id":15970,
"gender":"Men",
"season":[
"Fall",
"Winter"
],
"description":"Turtle Check Men Navy Blue Shirt",
"price":34.95
},
{
"id":46885,
"gender":"Boys",
"season":[
"Fall"
],
"description":"Ben 10 Boys Navy Blue Slippers",
"price":45.99
}
]

Advanced Search


ISearchCommands ft = db.FT();
try {ft.DropIndex("wh_idx");} catch {};
ft.Create("wh_idx", new FTCreateParams()
.On(IndexDataType.JSON)
.Prefix("warehouse:"),
new Schema().AddTextField(new FieldName("$.city", "city")));
IJsonCommands json = db.JSON();
json.Set("warehouse:1", "$", new {
city = "Boston",
location = "-71.057083, 42.361145",
inventory = new[] {
new {
id = 15970,
gender = "Men",
season = new[] {"Fall", "Winter"},
description = "Turtle Check Men Navy Blue Shirt",
price = 34.95
},
new {
id = 59263,
gender = "Women",
season = new[] {"Fall", "Winter", "Spring", "Summer"},
description = "Titan Women Silver Watch",
price = 129.99
},
new {
id = 46885,
gender = "Boys",
season = new[] {"Fall"},
description = "Ben 10 Boys Navy Blue Slippers",
price = 45.99
}
}
});
json.Set("warehouse:2", "$", new {
city = "Dallas",
location = "-96.808891, 32.779167",
inventory = new[] {
new {
id = 51919,
gender = "Women",
season = new[] {"Summer"},
description = "Nyk Black Horado Handbag",
price = 52.49
},
new {
id = 4602,
gender = "Unisex",
season = new[] {"Fall", "Winter"},
description = "Wildcraft Red Trailblazer Backpack",
price = 50.99
},
new {
id = 37561,
gender = "Girls",
season = new[] {"Spring", "Summer"},
description = "Madagascar3 Infant Pink Snapsuit Romper",
price = 23.95
}
}
});
foreach (var doc in ft.Search("wh_idx",
new Query("@city:(Dallas)")
.ReturnFields(new FieldName("$.inventory[?(@.gender==\"Women\" || @.gender==\"Girls\")]", "result"))
.Dialect(3))
.Documents.Select(x => x["result"]))
{
Console.WriteLine(doc);
}
[{"id":51919,"gender":"Women","season":["Summer"],"description":"Nyk Black Horado Handbag","price":52.49},{"id":37561,"gender":"Girls","season":["Spring","Summer"],"description":"Madagascar3 Infant Pink Snapsuit Romper","price":23.95}]

Source



Redis Document/Search - Nodejs Examples

Summary

This post provides some snippets from the Redis Query Workshop, available on GitHub.  That workshop covers parallel examples in CLI, Python, Nodejs, Java, and C#.  This post focuses on the Nodejs examples.



Basic JSON


await client.json.set('ex3:3', '$', {"obj1": {"str1": "val1", "num2": 2}});
await client.json.set('ex3:3', '$.obj1.num2', 3);
result = await client.json.get('ex3:3');
console.log(result);
{ obj1: { str1: 'val1', num2: 3 } }

Basic Search


await client.ft.create('idx1', {
'$.id': {
type: SchemaFieldTypes.NUMERIC,
AS: 'id'
},
'$.gender': {
type: SchemaFieldTypes.TAG,
AS: 'gender'
},
'$.season.*': {
type: SchemaFieldTypes.TAG,
AS: 'season'
},
'$.description': {
type: SchemaFieldTypes.TEXT,
AS: 'description'
},
'$.price': {
type: SchemaFieldTypes.NUMERIC,
AS: 'price'
},
'$.city': {
type: SchemaFieldTypes.TEXT,
AS: 'city'
},
'$.coords': {
type: SchemaFieldTypes.GEO,
AS: 'coords'
}
}, { ON: 'JSON', PREFIX: 'product:'});
await client.json.set('product:15970', '$', {"id": 15970, "gender": "Men", "season":["Fall", "Winter"], "description": "Turtle Check Men Navy Blue Shirt", "price": 34.95, "city": "Boston", "coords": "-71.057083, 42.361145"});
await client.json.set('product:59263', '$', {"id": 59263, "gender": "Women", "season":["Fall", "Winter", "Spring", "Summer"],"description": "Titan Women Silver Watch", "price": 129.99, "city": "Dallas", "coords": "-96.808891, 32.779167"});
await client.json.set('product:46885', '$', {"id": 46885, "gender": "Boys", "season":["Fall"], "description": "Ben 10 Boys Navy Blue Slippers", "price": 45.99, "city": "Denver", "coords": "-104.991531, 39.742043"});
result = await client.ft.search('idx1', '@price:[40,130]');
console.log(JSON.stringify(result, null, 4));
{
"total": 2,
"documents": [
{
"id": "product:46885",
"value": {
"id": 46885,
"gender": "Boys",
"season": [
"Fall"
],
"description": "Ben 10 Boys Navy Blue Slippers",
"price": 45.99,
"city": "Denver",
"coords": "-104.991531, 39.742043"
}
},
{
"id": "product:59263",
"value": {
"id": 59263,
"gender": "Women",
"season": [
"Fall",
"Winter",
"Spring",
"Summer"
],
"description": "Titan Women Silver Watch",
"price": 129.99,
"city": "Dallas",
"coords": "-96.808891, 32.779167"
}
}
]
}

Advanced JSON


await client.json.set('warehouse:1', '$', {
"city": "Boston",
"location": "42.361145, -71.057083",
"inventory":[{
"id": 15970,
"gender": "Men",
"season":["Fall", "Winter"],
"description": "Turtle Check Men Navy Blue Shirt",
"price": 34.95
},{
"id": 59263,
"gender": "Women",
"season": ["Fall", "Winter", "Spring", "Summer"],
"description": "Titan Women Silver Watch",
"price": 129.99
},{
"id": 46885,
"gender": "Boys",
"season": ["Fall"],
"description":
"Ben 10 Boys Navy Blue Slippers",
"price": 45.99
}]});
result = await client.json.get('warehouse:1', { path: '$.inventory[?(@.description=="Turtle Check Men Navy Blue Shirt")]' });
console.log(JSON.stringify(result, null, 4));
[
{
"id": 15970,
"gender": "Men",
"season": [
"Fall",
"Winter"
],
"description": "Turtle Check Men Navy Blue Shirt",
"price": 34.95
}
]

Advanced Search


await client.ft.create('wh_idx', {
'$.city': {
type: SchemaFieldTypes.TEXT,
AS: 'city'
}
}, { ON: 'JSON', PREFIX: 'warehouse:'});
await client.json.set('warehouse:1', '$', {
"city": "Boston",
"location": "-71.057083, 42.361145",
"inventory":[
{
"id": 15970,
"gender": "Men",
"season":["Fall", "Winter"],
"description": "Turtle Check Men Navy Blue Shirt",
"price": 34.95
},
{
"id": 59263,
"gender": "Women",
"season": ["Fall", "Winter", "Spring", "Summer"],
"description": "Titan Women Silver Watch",
"price": 129.99
},
{
"id": 46885,
"gender": "Boys",
"season": ["Fall"],
"description": "Ben 10 Boys Navy Blue Slippers",
"price": 45.99
}
]});
await client.json.set('warehouse:2', '$', {
"city": "Dallas",
"location": "-96.808891, 32.779167",
"inventory": [
{
"id": 51919,
"gender": "Women",
"season":["Summer"],
"description": "Nyk Black Horado Handbag",
"price": 52.49
},
{
"id": 4602,
"gender": "Unisex",
"season": ["Fall", "Winter"],
"description": "Wildcraft Red Trailblazer Backpack",
"price": 50.99
},
{
"id": 37561,
"gender": "Girls",
"season": ["Spring", "Summer"],
"description": "Madagascar3 Infant Pink Snapsuit Romper",
"price": 23.95
}
]});
result = await client.ft.search('wh_idx', '@city:(Dallas)', {
RETURN: ['$.inventory[?(@.gender=="Women" || @.gender=="Girls")]'],
DIALECT: 3
});
console.log(JSON.stringify(result, null, 4));
{
"total": 1,
"documents": [
{
"id": "warehouse:2",
"value": {
"$.inventory[?(@.gender==\"Women\" || @.gender==\"Girls\")]": "[{\"id\":51919,\"gender\":\"Women\",\"season\":[\"Summer\"],\"description\":\"Nyk Black Horado Handbag\",\"price\":52.49},{\"id\":37561,\"gender\":\"Girls\",\"season\":[\"Spring\",\"Summer\"],\"description\":\"Madagascar3 Infant Pink Snapsuit Romper\",\"price\":23.95}]"
}
}
]
}

Source


Copyright ©1993-2024 Joey E Whelan, All rights reserved.

Redis Document/Search - Python Examples

Summary

This post provides some snippets from the Redis Query Workshop, available on GitHub.  That workshop covers parallel examples in CLI, Python, Node.js, Java, and C#.  This post focuses on the Python examples.



Basic JSON


client.json().set('ex2:5', '$', {"str1": "val1", "str2": "val2", "arr1":[1,2,3,4], "obj1": {"num1": 1,"arr2":["val1","val2", "val3"]}})
result = client.json().get('ex2:5', Path('$.obj1.arr2'))
print(result)
result = client.json().get('ex2:5', Path('$.arr1[1]' ))
print(result)
result = client.json().get('ex2:5', Path('$.obj1.arr2[0:2]'))
print(result)
result = client.json().get('ex2:5', Path('$.arr1[-2:]'))
print(result)
[['val1', 'val2', 'val3']]
[2]
['val1', 'val2']
[3, 4]
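
As a cross-check on the output above, the four JSONPath expressions can be mirrored with ordinary Python indexing and slicing on the same document.  This is only an illustrative sketch with no Redis involved; the variable names are mine, and it assumes JSONPath slices behave like Python slices for these particular cases.

```python
# Plain-Python mirror of the four JSON.GET paths above (no Redis needed).
doc = {"str1": "val1", "str2": "val2", "arr1": [1, 2, 3, 4],
       "obj1": {"num1": 1, "arr2": ["val1", "val2", "val3"]}}

# $.obj1.arr2 matches one value (the whole array), so the result wraps it in a list.
whole_array = [doc["obj1"]["arr2"]]
# $.arr1[1] matches the single element at index 1.
second_item = [doc["arr1"][1]]
# $.obj1.arr2[0:2] matches each element of the slice; the end index is exclusive.
first_two = doc["obj1"]["arr2"][0:2]
# $.arr1[-2:] matches the last two elements.
last_two = doc["arr1"][-2:]

print(whole_array)
print(second_item)
print(first_two)
print(last_two)
```

The first result is nested because the path matches a single value that happens to be an array, while the slice paths match the individual elements.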

Basic Search


idx_def = IndexDefinition(index_type=IndexType.JSON, prefix=['product:'])
schema = [
NumericField('$.id', as_name='id'),
TagField('$.gender', as_name='gender'),
TagField('$.season.*', as_name='season'),
TextField('$.description', as_name='description'),
NumericField('$.price', as_name='price'),
TextField('$.city', as_name='city'),
GeoField('$.coords', as_name='coords')
]
result = client.ft('idx1').create_index(schema, definition=idx_def)
client.json().set('product:15970', '$', {"id": 15970, "gender": "Men", "season":["Fall", "Winter"], "description": "Turtle Check Men Navy Blue Shirt", "price": 34.95, "city": "Boston", "coords": "-71.057083, 42.361145"})
client.json().set('product:59263', '$', {"id": 59263, "gender": "Women", "season":["Fall", "Winter", "Spring", "Summer"],"description": "Titan Women Silver Watch", "price": 129.99, "city": "Dallas", "coords": "-96.808891, 32.779167"})
client.json().set('product:46885', '$', {"id": 46885, "gender": "Boys", "season":["Fall"], "description": "Ben 10 Boys Navy Blue Slippers", "price": 45.99, "city": "Denver", "coords": "-104.991531, 39.742043"})
query = Query('@description:("Blue Shirt")')
result = client.ft('idx1').search(query)
print(result)
Result{1 total, docs: [Document {'id': 'product:15970', 'payload': None, 'json': '{"id":15970,"gender":"Men","season":["Fall","Winter"],"description":"Turtle Check Men Navy Blue Shirt","price":34.95,"city":"Boston","coords":"-71.057083, 42.361145"}'}]}

Advanced JSON


client.json().set('warehouse:1', '$', {
"city": "Boston",
"location": "42.361145, -71.057083",
"inventory":[{
"id": 15970,
"gender": "Men",
"season":["Fall", "Winter"],
"description": "Turtle Check Men Navy Blue Shirt",
"price": 34.95
},{
"id": 59263,
"gender": "Women",
"season": ["Fall", "Winter", "Spring", "Summer"],
"description": "Titan Women Silver Watch",
"price": 129.99
},{
"id": 46885,
"gender": "Boys",
"season": ["Fall"],
"description": "Ben 10 Boys Navy Blue Slippers",
"price": 45.99
}]})
result = client.json().get('warehouse:1', Path('$.inventory[?(@.description=="Turtle Check Men Navy Blue Shirt")]'))
print(json.dumps(result, indent=4))
[
{
"id": 15970,
"gender": "Men",
"season": [
"Fall",
"Winter"
],
"description": "Turtle Check Men Navy Blue Shirt",
"price": 34.95
}
]

Advanced Search


schema = [VectorField('$.vector', 'FLAT', { "TYPE": 'FLOAT32', "DIM": 4, "DISTANCE_METRIC": 'L2'}, as_name='vector')]
idx_def: IndexDefinition = IndexDefinition(index_type=IndexType.JSON, prefix=['vec:'])
result = client.ft('vss_idx').create_index(schema, definition=idx_def)
client.json().set('vec:1', '$', {'vector': [1,1,1,1]})
client.json().set('vec:2', '$', {'vector': [2,2,2,2]})
client.json().set('vec:3', '$', {'vector': [3,3,3,3]})
client.json().set('vec:4', '$', {'vector': [4,4,4,4]})
vec = [2,2,3,3]
query_vector = np.array(vec, dtype=np.float32).tobytes()
q_str = '*=>[KNN 3 @vector $query_vec]'
q = Query(q_str)\
.sort_by('__vector_score')\
.dialect(2)
params_dict = {"query_vec": query_vector}
results = client.ft('vss_idx').search(q, query_params=params_dict)
print(results)
Result{3 total, docs: [Document {'id': 'vec:2', 'payload': None, '__vector_score': '2', 'json': '{"vector":[2,2,2,2]}'}, Document {'id': 'vec:3', 'payload': None, '__vector_score': '2', 'json': '{"vector":[3,3,3,3]}'}, Document {'id': 'vec:1', 'payload': None, '__vector_score': '10', 'json': '{"vector":[1,1,1,1]}'}]}
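
The `__vector_score` values in the result (2, 2, and 10) are consistent with a squared Euclidean distance, that is, the L2 metric without the final square root.  The sketch below recomputes them by hand in plain Python; the helper name `squared_l2` and the tie-breaking rule are my own assumptions for illustration, not part of the workshop code.

```python
# Recompute the scores by hand; illustrative only, no Redis or NumPy required.
stored = {
    "vec:1": [1, 1, 1, 1],
    "vec:2": [2, 2, 2, 2],
    "vec:3": [3, 3, 3, 3],
    "vec:4": [4, 4, 4, 4],
}
query = [2, 2, 3, 3]

def squared_l2(a, b):
    """Sum of squared component differences (no square root taken)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

scores = {key: squared_l2(vec, query) for key, vec in stored.items()}

# Keep the 3 nearest.  Ties are broken here by insertion order, which is an
# assumption of this sketch and not a statement about how Redis orders ties.
top3 = sorted(scores.items(), key=lambda item: item[1])[:3]
print(top3)
```

Note that `vec:1` and `vec:4` are equally far from the query vector, so which of the two fills the third KNN slot depends on tie-breaking.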

Source



Redis Document/Search - CLI Examples

Summary

This post provides some snippets from the Redis Query Workshop, available on GitHub.  That workshop covers parallel examples in CLI, Python, Node.js, Java, and C#.  This post focuses on the CLI examples.



Basic JSON


JSON.SET ex2:5 $ '{"str1": "val1", "str2": "val2", "arr1":[1,2,3,4], "obj1": {"num1": 1,"arr2":["val1","val2", "val3"]}}'
JSON.GET ex2:5 $.obj1.arr2
JSON.GET ex2:5 $.arr1[1]
JSON.GET ex2:5 $.obj1.arr2[0:2]
JSON.GET ex2:5 $.arr1[-2:]
"[[\"val1\",\"val2\",\"val3\"]]"
"[2]"
"[\"val1\",\"val2\"]"
"[3,4]"

Basic Search


FT.CREATE idx1 ON JSON PREFIX 1 product: SCHEMA $.id as id NUMERIC $.gender as gender TAG $.season.* AS season TAG $.description AS description TEXT $.price AS price NUMERIC $.city AS city TEXT $.coords AS coords GEO
JSON.SET product:15970 $ '{"id": 15970, "gender": "Men", "season":["Fall", "Winter"], "description": "Turtle Check Men Navy Blue Shirt", "price": 34.95, "city": "Boston", "coords": "-71.057083, 42.361145"}'
JSON.SET product:59263 $ '{"id": 59263, "gender": "Women", "season":["Fall", "Winter", "Spring", "Summer"],"description": "Titan Women Silver Watch", "price": 129.99, "city": "Dallas", "coords": "-96.808891, 32.779167"}'
JSON.SET product:46885 $ '{"id": 46885, "gender": "Boys", "season":["Fall"], "description": "Ben 10 Boys Navy Blue Slippers", "price": 45.99, "city": "Denver", "coords": "-104.991531, 39.742043"}'
FT.SEARCH idx1 '@description:Slippers'
1) "1"
2) "product:46885"
3) 1) "$"
2) "{\"id\":46885,\"gender\":\"Boys\",\"season\":[\"Fall\"],\"description\":\"Ben 10 Boys Navy Blue Slippers\",\"price\":45.99,\"city\":\"Denver\",\"coords\":\"-104.991531, 39.742043\"}"

Advanced JSON


JSON.SET warehouse:1 $ '{"city": "Boston","location": "42.361145, -71.057083","inventory":[{"id": 15970,"gender": "Men","season":["Fall", "Winter"],"description": "Turtle Check Men Navy Blue Shirt","price": 34.95},{"id": 59263,"gender": "Women","season": ["Fall", "Winter", "Spring", "Summer"],"description": "Titan Women Silver Watch","price": 129.99},{"id": 46885,"gender": "Boys","season": ["Fall"],"description": "Ben 10 Boys Navy Blue Slippers","price": 45.99}]}'
JSON.GET warehouse:1 '$.inventory[*].price'
"[34.95,129.99,45.99]"

Advanced Search


FT.CREATE wh_idx ON JSON PREFIX 1 warehouse: SCHEMA $.city as city TEXT
JSON.SET warehouse:1 $ '{"city": "Boston","location": "-71.057083, 42.361145","inventory":[{"id": 15970,"gender": "Men","season":["Fall", "Winter"],"description": "Turtle Check Men Navy Blue Shirt","price": 34.95},{"id": 59263,"gender": "Women","season": ["Fall", "Winter", "Spring", "Summer"],"description": "Titan Women Silver Watch","price": 129.99},{"id": 46885,"gender": "Boys","season": ["Fall"],"description": "Ben 10 Boys Navy Blue Slippers","price": 45.99}]}'
JSON.SET warehouse:2 $ '{"city": "Dallas","location": "-96.808891, 32.779167","inventory": [{"id": 51919,"gender": "Women","season":["Summer"],"description": "Nyk Black Horado Handbag","price": 52.49},{"id": 4602,"gender": "Unisex","season": ["Fall", "Winter"],"description": "Wildcraft Red Trailblazer Backpack","price": 50.99},{"id": 37561,"gender": "Girls","season": ["Spring", "Summer"],"description": "Madagascar3 Infant Pink Snapsuit Romper","price": 23.95}]}'
FT.SEARCH wh_idx '@city:(Boston)' RETURN 1 '$.inventory[?(@.price>50)].id' DIALECT 3
1) "1"
2) "warehouse:1"
3) 1) "$.inventory[?(@.price>50)].id"
2) "[59263]"
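
The RETURN clause above applies a JSONPath filter to the matched document: keep the inventory items priced above 50, then project their `id`.  A rough plain-Python equivalent is sketched below; the inventory is trimmed to the two fields the filter touches, and the variable names are hypothetical.

```python
# Plain-Python equivalent of $.inventory[?(@.price>50)].id for warehouse:1.
warehouse_1 = {
    "city": "Boston",
    "inventory": [
        {"id": 15970, "price": 34.95},
        {"id": 59263, "price": 129.99},
        {"id": 46885, "price": 45.99},
    ],
}

# Filter step (?(@.price>50)) followed by the projection step (.id).
expensive_ids = [item["id"] for item in warehouse_1["inventory"] if item["price"] > 50]
print(expensive_ids)
```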

Source

Redis Search - Card Transactions Example

Summary

I'll be demonstrating Redis Search capabilities in a credit card transaction domain.  All the data will be generated synthetically with the Faker module and stored as hashes in Redis.  Redis Search will then be used to generate analytics on the data.

Architecture


Code Snippets

Data Generation


import random
import re
import time

from faker import Faker
from faker.providers import DynamicProvider

# PREFIX (the key prefix for the transaction hashes) is assumed to be defined elsewhere in the module.

merchants_provider = DynamicProvider(
    provider_name='merchants',
    elements=['Walmart', 'Nordstrom', 'Amazon', 'Exxon', 'Kroger', 'Safeway', 'United Airlines', 'Office Depot', 'Ford', 'Taco Bell']
)
categories_provider = DynamicProvider(
    provider_name='categories',
    elements=['AUTO', 'FOOD', 'GASS', 'GIFT', 'TRAV', 'GROC', 'HOME', 'PERS', 'HEAL', 'MISC']
)

def generate_data(client, count):
    # Seed both generators so repeated runs produce the same data set.
    Faker.seed(0)
    random.seed(0)
    fake = Faker()
    fake.add_provider(merchants_provider)
    fake.add_provider(categories_provider)
    for i in range(count):
        tdate = fake.date_time_between(start_date='-3y', end_date='now')
        txn_record = {
            'acct_id': int(fake.ean(length=13)),
            'txn_id': int(fake.ean(length=13)),
            # Escape punctuation in the text fields before storing them.
            'txn_date': re.escape(tdate.isoformat()),
            'txn_timestamp': time.mktime(tdate.timetuple()),
            'card_last_4': fake.credit_card_number()[-4:],
            'txn_amt': round(random.uniform(1, 1000), 2),
            'txn_currency': 'USD',
            'expense_category': fake.categories(),
            'merchant_name': fake.merchants(),
            'merchant_address': re.escape(fake.address())
        }
        # One hash per transaction, keyed by the transaction id.
        client.hset(f'{PREFIX}{txn_record["txn_id"]}', mapping=txn_record)
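
The effect of `re.escape` on the stored date can be seen in isolation.  The sketch below uses a fixed, made-up timestamp; my reading is that the escaping keeps the hyphens from being treated as separators when the field is indexed as text, which would explain why the sample query later in this post searches for `2021\-12*`.

```python
import re
from datetime import datetime

# Fixed timestamp so the example is repeatable (the real code uses Faker).
tdate = datetime(2021, 12, 5, 10, 30, 0)

plain = tdate.isoformat()
escaped = re.escape(plain)

print(plain)    # ISO 8601 string as produced by datetime
print(escaped)  # same string with a backslash before each hyphen
```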

Index Creation


try:
    client.ft(IDX_NAME).dropindex()
except Exception:
    # Ignore the error raised when the index does not exist yet.
    pass
idx_def = IndexDefinition(index_type=IndexType.HASH, prefix=[PREFIX])
schema = [
    TagField('txn_id', sortable=True),
    TextField('txn_date'),
    NumericField('txn_timestamp', sortable=True),
    NumericField('txn_amt'),
    TagField('txn_currency'),
    TagField('expense_category'),
    TextField('merchant_name'),
    TextField('merchant_address')
]
client.ft(IDX_NAME).create_index(schema, definition=idx_def)

Sample Query

The query below aggregates total spend by category for transactions with a value greater than $500 in December 2021.

# Raw string so the backslash in "2021\-12" reaches Redis unchanged.
request = AggregateRequest(r'(@txn_date:2021\-12* @txn_currency:{USD} @txn_amt:[(500, inf])')\
    .group_by('@expense_category', reducers.sum('@txn_amt').alias('total_spend'))\
    .sort_by(Desc('@total_spend'))
result = client.ft(IDX_NAME).aggregate(request)
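
To make the intent of the aggregation concrete, here is a plain-Python sketch of the same filter, group, sum, and sort steps over a handful of made-up records.  The sample data and the function name `total_spend_by_category` are hypothetical; the real query runs inside Redis against the Faker-generated hashes.

```python
from collections import defaultdict

# Hypothetical sample records; the real data set is generated with Faker.
txns = [
    {"txn_date": "2021-12-03T09:15:00", "txn_currency": "USD", "txn_amt": 750.00, "expense_category": "TRAV"},
    {"txn_date": "2021-12-10T14:02:00", "txn_currency": "USD", "txn_amt": 620.50, "expense_category": "FOOD"},
    {"txn_date": "2021-12-21T18:40:00", "txn_currency": "USD", "txn_amt": 510.25, "expense_category": "TRAV"},
    {"txn_date": "2021-12-22T08:00:00", "txn_currency": "USD", "txn_amt": 499.99, "expense_category": "FOOD"},  # amount too small
    {"txn_date": "2021-11-30T23:59:00", "txn_currency": "USD", "txn_amt": 900.00, "expense_category": "AUTO"},  # wrong month
]

def total_spend_by_category(records):
    """Filter to Dec 2021 USD transactions over 500, then sum per category, largest first."""
    totals = defaultdict(float)
    for rec in records:
        if (rec["txn_date"].startswith("2021-12")
                and rec["txn_currency"] == "USD"
                and rec["txn_amt"] > 500):
            totals[rec["expense_category"]] += rec["txn_amt"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

result = total_spend_by_category(txns)
print(result)
```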

Source



Saturday, February 25, 2023

Autocomplete with Redis Search

Summary

In this post, I'm going to demonstrate a real-world usage scenario for one of the features of Redis Search: Suggestion (aka autocomplete).  This particular autocomplete scenario is built around addresses, similar to what you see in Google Maps.  I pull real address data from a Canadian government statistics site and populate Redis suggestion dictionaries for autocomplete of either a full address (with a street number) or just a street name.  The address the user then chooses is put into a full Redis search for an exact match.

Architecture


Application

I wrote this app entirely in JavaScript, front end and back end.

Front End

The front end is a static web page with a single text input.  The expected input is either a street name or a house number plus street name.  The page leverages this JavaScript autocomplete input module.  The module generates REST calls to the back end.  Screenshot below:



Back End

The back end consists of two Node.js files:  dataLoader.js and app.js.  The dataLoader module handles fetching data from the Canadian government site and loading it into Redis as JSON objects.  Additionally, it sets up two suggestion dictionaries: one that includes the street number with the address and another that does not.  A snippet of the Redis client actions is below.
async #insert(client, doc) {
    // Store the full document as JSON.
    await client.json.set(`account:${doc.id}`, '.', doc);

    const addr = doc.address;
    if (addr) {
        // fAdd holds full addresses; pAdd holds the address minus the leading street number.
        await client.ft.sugAdd(`fAdd`, addr, 1);
        await client.ft.sugAdd(`pAdd`, addr.substring(addr.indexOf(' ') + 1), 1);
    }
}
app.js is an ExpressJS-based REST API server.  It exposes a couple of GET endpoints: one for address suggestions and the other for a full-text search of an address.  A snippet of the address suggestion endpoint is below.

app.get('/address/suggest', async (req, res) => {
    const address = decodeURI(req.query.address);
    console.log(`app - GET /address/suggest ${address}`);
    try {
        let addrs;
        if (address.match(/^\d/)) {
            addrs = await client.ft.sugGet(`fAdd`, address);
        }
        else {
            addrs = await client.ft.sugGet(`pAdd`, address);
        }
        let suggestions = []
        for (const addr of addrs) {
            suggestions.push({address: addr})
        }
        res.status(200).json(suggestions);
    }
    catch (err) {
        console.error(`app - GET /address/suggest ${req.query.address} - ${err.message}`)
        res.status(400).json({ 'error': err.message });
    }
});
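
The two snippets above share one simple rule: each address is stored in two forms, and the first character of the user's input decides which dictionary is consulted.  The Python sketch below restates that rule on its own, with no Redis involved; the function names and the sample address are made up for illustration.

```python
import re

def suggestion_entries(address):
    """Return (full, partial): the address as-is and with the leading street number dropped."""
    # find() returns -1 when there is no space, so the slice then keeps the whole string.
    partial = address[address.find(' ') + 1:]
    return address, partial

def pick_dictionary(user_input):
    """Mirror the endpoint's rule: input starting with a digit uses fAdd, otherwise pAdd."""
    return 'fAdd' if re.match(r'\d', user_input) else 'pAdd'

full, partial = suggestion_entries('123 Main St')  # made-up address
print(full, '|', partial)
print(pick_dictionary('123 Ma'), pick_dictionary('Main'))
```

Keeping two dictionaries means a user can start typing either the house number or the street name and still get prefix matches.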

Source

