Vector Storage¶
The cbintel.vectl module provides text embedding generation and semantic vector search capabilities.
Overview¶
graph TB
    subgraph "Input"
        TEXT[Text Documents]
        CHUNKS[Text Chunks]
    end
    subgraph "Embedding"
        EMBED[EmbeddingService]
        OLLAMA[Ollama nomic-embed-text]
    end
    subgraph "Storage"
        STORE[VectorStore]
        KMEANS[K-means Clustering]
    end
    subgraph "Search"
        SEMANTIC[SemanticSearch]
        INDEX[DocumentIndex]
    end
    TEXT --> CHUNKS --> EMBED --> OLLAMA
    EMBED --> STORE --> KMEANS
    STORE --> SEMANTIC & INDEX
Module Structure¶
src/cbintel/vectl/
├── __init__.py # Public exports
├── embedding_service.py # Generate embeddings
├── vector_store.py # Vector storage
├── semantic_search.py # Search interface
├── chunking.py # Text chunking
└── document_index.py # Document indexing
EmbeddingService¶
Generate text embeddings using Ollama.
Basic Usage¶
from cbintel.vectl import EmbeddingService
service = EmbeddingService(
    model="nomic-embed-text",
    base_url="http://127.0.0.1:11434",
)
# Single embedding
vector = await service.embed("Hello, world!")
print(f"Dimensions: {len(vector)}") # 768
# Batch embedding
vectors = await service.embed_batch([
    "First document",
    "Second document",
    "Third document",
])
Configuration¶
service = EmbeddingService(
    model="nomic-embed-text",            # Embedding model
    base_url="http://127.0.0.1:11434",   # Ollama URL
    batch_size=32,                       # Batch size for embed_batch
    timeout=60.0,                        # Request timeout
)
Supported Models¶
| Model | Dimensions | Speed | Quality |
|---|---|---|---|
| nomic-embed-text | 768 | Fast | Good |
| all-MiniLM-L6-v2 | 384 | Very fast | Moderate |
| mxbai-embed-large | 1024 | Slower | High |
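Switching models only requires changing the model parameter. Because each model produces vectors of a different dimension, embeddings from different models should not be mixed in the same store. A short sketch of selecting the higher-quality model from the table above:
from cbintel.vectl import EmbeddingService

# Higher-quality embeddings at the cost of speed
service = EmbeddingService(model="mxbai-embed-large")
vector = await service.embed("Machine learning basics")
print(len(vector))  # 1024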
VectorStore¶
Store and search vectors with K-means clustering.
Basic Usage¶
from cbintel.vectl import VectorStore
store = VectorStore(path="./my-index")
# Add vectors
await store.add("doc1", vector1, metadata={"source": "file1.txt"})
await store.add("doc2", vector2, metadata={"source": "file2.txt"})
# Search
results = await store.search(query_vector, top_k=10)
for result in results:
    print(f"{result.id}: {result.score:.3f}")
# Persist
await store.save()
Batch Operations¶
# Add multiple vectors
await store.add_batch([
    ("doc1", vector1, {"source": "file1.txt"}),
    ("doc2", vector2, {"source": "file2.txt"}),
    ("doc3", vector3, {"source": "file3.txt"}),
])
# Load existing store
store = VectorStore.load("./my-index")
Search Results¶
results = await store.search(query_vector, top_k=10)
for result in results:
    print(f"ID: {result.id}")
    print(f"Score: {result.score:.3f}")  # Cosine similarity
    print(f"Metadata: {result.metadata}")
Filtering¶
# Search with metadata filter
results = await store.search(
    query_vector,
    top_k=10,
    filter={"source": "file1.txt"},
)
ChunkingService¶
Split text into semantic chunks for embedding.
Basic Usage¶
from cbintel.vectl import ChunkingService
chunker = ChunkingService(
    chunk_size=500,  # Words per chunk
    overlap=50,      # Overlapping words
)
# Chunk text
chunks = chunker.chunk(long_text)
for chunk in chunks:
    print(f"Chunk {chunk.index}: {chunk.word_count} words")
    print(chunk.text[:100] + "...")
Chunk Object¶
@dataclass
class Chunk:
    text: str        # Chunk content
    index: int       # Chunk position
    word_count: int  # Word count
    start_char: int  # Start character offset
    end_char: int    # End character offset
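ChunkingService pairs naturally with EmbeddingService and VectorStore when you want full control over indexing. The sketch below wires the three together using only the APIs shown on this page; the file path and store path are placeholders, and SemanticSearch (next section) wraps the same flow for you.
from cbintel.vectl import ChunkingService, EmbeddingService, VectorStore

async def index_document(path: str) -> None:
    # Read, chunk, embed, and store a single document by hand
    text = open(path, encoding="utf-8").read()

    chunker = ChunkingService(chunk_size=500, overlap=50)
    service = EmbeddingService(model="nomic-embed-text")
    store = VectorStore(path="./my-index")

    chunks = chunker.chunk(text)
    vectors = await service.embed_batch([chunk.text for chunk in chunks])

    await store.add_batch([
        (f"{path}:{chunk.index}", vector, {"source": path})
        for chunk, vector in zip(chunks, vectors)
    ])
    await store.save()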
SemanticSearch¶
End-to-end semantic search combining embedding and storage.
Basic Usage¶
from cbintel.vectl import SemanticSearch
search = SemanticSearch("./my-index")
# Index documents
await search.index_file("document.txt")
await search.index_directory("./docs")
# Search with text query
results = await search.search("machine learning algorithms", top_k=10)
for result in results:
    print(f"{result.text[:100]}... (score: {result.score:.3f})")
Indexing Options¶
# Index with custom chunking
await search.index_file(
    "document.txt",
    chunk_size=500,
    overlap=50,
)
# Index with metadata
await search.index_file(
    "document.txt",
    metadata={"category": "research", "date": "2024-01-15"},
)
# Index raw text
await search.index_text(
    text_content,
    document_id="custom-id",
    metadata={"source": "api"},
)
Search Results¶
results = await search.search(query, top_k=10)
for result in results:
    print(f"Text: {result.text}")
    print(f"Score: {result.score}")
    print(f"Document: {result.document_id}")
    print(f"Chunk: {result.chunk_index}")
    print(f"Metadata: {result.metadata}")
DocumentIndex¶
Simple document indexing interface.
Basic Usage¶
from cbintel.vectl import DocumentIndex
index = DocumentIndex("./my-corpus")
# Index with automatic chunking
await index.add_document("doc1", "Long document text...", chunk_size=512)
# Search
matches = await index.search("query text", top_k=5)
for match in matches:
    print(f"{match.document_id}: {match.score:.3f}")
Graph Operations¶
embed Operation¶
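A minimal sketch of an embed step, following the shape of the pipeline example further down; the input and output names are placeholders rather than required values:
- op: embed
  input: text
  output: vector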
embed_batch Operation¶
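The batch form is the one used in the example pipeline below; it takes a list of chunks and emits one vector per chunk:
- op: embed_batch
  input: chunks
  output: vectors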
store_vector Operation¶
- op: store_vector
  input: vector
  params:
    store: my-index
    id: "{{ document_id }}"
    metadata:
      source: "{{ source_url }}"
  output: ref
search_vectors Operation¶
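A sketch of a search step, mirroring the store_vector parameters above; the top_k parameter name is an assumption and may differ in the actual operation:
- op: search_vectors
  input: query_vector
  params:
    store: my-index
    top_k: 10
  output: matches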
Example Pipeline¶
name: document_indexing
stages:
  - name: load
    sequential:
      - op: fetch
        params:
          url: "https://example.com/document"
        output: html
  - name: process
    sequential:
      - op: to_text
        input: html
        output: text
      - op: chunk
        input: text
        params:
          size: 500
          overlap: 50
        output: chunks
  - name: embed
    parallel:
      - op: embed_batch
        input: chunks
        output: vectors
  - name: store
    sequential:
      - op: store_vectors
        input: [chunks, vectors]
        params:
          store: research-index
        output: refs
CLI Commands¶
# Generate embeddings
cbintel-vectl embed document.txt --output embeddings.json
# Index documents
cbintel-vectl index ./docs --store my-index
# Search
cbintel-vectl search "machine learning" --store my-index --top 10
# Store statistics
cbintel-vectl stats my-index
Configuration¶
Environment Variables¶
# Ollama
OLLAMA_BASE_URL=http://127.0.0.1:11434
OLLAMA_EMBED_MODEL=nomic-embed-text
# Vector storage
VECTL_INDEX_PATH=./data/vectors
VECTL_BATCH_SIZE=32
Error Handling¶
from cbintel.vectl import (
    VectlError,
    EmbeddingError,
    StorageError,
    IndexNotFoundError,
)
try:
    vector = await service.embed(text)
except EmbeddingError as e:
    print(f"Embedding failed: {e}")
except VectlError as e:
    print(f"Vectl error: {e}")
Best Practices¶
Chunking Strategy¶
# For general text
chunker = ChunkingService(chunk_size=500, overlap=50)
# For technical documents (larger chunks)
chunker = ChunkingService(chunk_size=1000, overlap=100)
# For Q&A (smaller chunks)
chunker = ChunkingService(chunk_size=200, overlap=20)
Batch Processing¶
# Process in batches to avoid memory issues
texts = [...] # Large list
batch_size = 100
for i in range(0, len(texts), batch_size):
    batch = texts[i:i + batch_size]
    vectors = await service.embed_batch(batch)
    await store.add_batch([
        (f"doc_{i+j}", v, {})
        for j, v in enumerate(vectors)
    ])
Index Persistence¶
# Always save after modifications
await store.add("doc1", vector1, metadata)
await store.save() # Persist to disk
# Load for search
store = VectorStore.load("./my-index")
results = await store.search(query_vector)
Performance¶
K-means Clustering¶
VectorStore uses K-means clustering for efficient search:
# Clustering happens automatically
store = VectorStore(path="./index", n_clusters=100)
# More clusters = faster search, more memory
store = VectorStore(path="./index", n_clusters=500)
Backend Options¶
| Backend | Speed | Memory | Accuracy |
|---|---|---|---|
| NumPy | Fast | Low | Exact |
| vectl C++ | Very fast | Low | Exact |
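The comparison above does not show how a backend is chosen. If the constructor exposes it, selection might look like the sketch below; the backend keyword and its values are hypothetical, not documented parameters:
# Hypothetical: the backend keyword and its values are assumptions
store = VectorStore(path="./index", backend="numpy")  # pure-NumPy exact search
store = VectorStore(path="./index", backend="cpp")    # vectl C++ backend, if available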