Vector Storage

The cbintel.vectl module provides text embedding generation and semantic vector search capabilities.

Overview

graph TB
    subgraph "Input"
        TEXT[Text Documents]
        CHUNKS[Text Chunks]
    end

    subgraph "Embedding"
        EMBED[EmbeddingService]
        OLLAMA[Ollama nomic-embed-text]
    end

    subgraph "Storage"
        STORE[VectorStore]
        KMEANS[K-means Clustering]
    end

    subgraph "Search"
        SEMANTIC[SemanticSearch]
        INDEX[DocumentIndex]
    end

    TEXT --> CHUNKS --> EMBED --> OLLAMA
    EMBED --> STORE --> KMEANS
    STORE --> SEMANTIC & INDEX
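
The flow in the diagram maps onto the classes documented below. A minimal end-to-end sketch, using only the APIs shown on this page and assuming a local Ollama instance with nomic-embed-text pulled and the default connection settings:

from cbintel.vectl import ChunkingService, EmbeddingService, VectorStore

async def index_and_search(long_text: str, query: str):
    chunker = ChunkingService(chunk_size=500, overlap=50)
    service = EmbeddingService(model="nomic-embed-text")
    store = VectorStore(path="./quickstart-index")

    # Text -> chunks -> embeddings -> vector store
    chunks = chunker.chunk(long_text)
    vectors = await service.embed_batch([c.text for c in chunks])
    await store.add_batch([
        (f"chunk_{c.index}", v, {"start": c.start_char, "end": c.end_char})
        for c, v in zip(chunks, vectors)
    ])
    await store.save()

    # Embed the query with the same model, then search
    query_vector = await service.embed(query)
    return await store.search(query_vector, top_k=5)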

Module Structure

src/cbintel/vectl/
├── __init__.py          # Public exports
├── embedding_service.py # Generate embeddings
├── vector_store.py      # Vector storage
├── semantic_search.py   # Search interface
├── chunking.py          # Text chunking
└── document_index.py    # Document indexing

EmbeddingService

Generate text embeddings using Ollama.

Basic Usage

from cbintel.vectl import EmbeddingService

service = EmbeddingService(
    model="nomic-embed-text",
    base_url="http://127.0.0.1:11434"
)

# Single embedding
vector = await service.embed("Hello, world!")
print(f"Dimensions: {len(vector)}")  # 768

# Batch embedding
vectors = await service.embed_batch([
    "First document",
    "Second document",
    "Third document",
])

Configuration

service = EmbeddingService(
    model="nomic-embed-text",     # Embedding model
    base_url="http://127.0.0.1:11434",  # Ollama URL
    batch_size=32,                 # Batch size for embed_batch
    timeout=60.0,                  # Request timeout
)

Supported Models

Model               Dimensions   Speed       Quality
nomic-embed-text    768          Fast        Good
all-MiniLM-L6-v2    384          Very fast   Moderate
mxbai-embed-large   1024         Slower      High
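
Switching models is just a constructor argument. For example, a sketch using mxbai-embed-large (assuming the model has been pulled into Ollama) and checking that the vector width matches the table:

service = EmbeddingService(
    model="mxbai-embed-large",
    base_url="http://127.0.0.1:11434",
)

vector = await service.embed("Hello, world!")
assert len(vector) == 1024  # dimensions per the table above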

VectorStore

Store and search vectors with K-means clustering.

Basic Usage

from cbintel.vectl import VectorStore

store = VectorStore(path="./my-index")

# Add vectors
await store.add("doc1", vector1, metadata={"source": "file1.txt"})
await store.add("doc2", vector2, metadata={"source": "file2.txt"})

# Search
results = await store.search(query_vector, top_k=10)
for result in results:
    print(f"{result.id}: {result.score:.3f}")

# Persist
await store.save()

Batch Operations

# Add multiple vectors
await store.add_batch([
    ("doc1", vector1, {"source": "file1.txt"}),
    ("doc2", vector2, {"source": "file2.txt"}),
    ("doc3", vector3, {"source": "file3.txt"}),
])

# Load existing store
store = VectorStore.load("./my-index")

Search Results

results = await store.search(query_vector, top_k=10)

for result in results:
    print(f"ID: {result.id}")
    print(f"Score: {result.score:.3f}")  # Cosine similarity
    print(f"Metadata: {result.metadata}")

Filtering

# Search with metadata filter
results = await store.search(
    query_vector,
    top_k=10,
    filter={"source": "file1.txt"}
)

ChunkingService

Split text into semantic chunks for embedding.

Basic Usage

from cbintel.vectl import ChunkingService

chunker = ChunkingService(
    chunk_size=500,      # Words per chunk
    overlap=50,          # Overlapping words
)

# Chunk text
chunks = chunker.chunk(long_text)

for chunk in chunks:
    print(f"Chunk {chunk.index}: {chunk.word_count} words")
    print(chunk.text[:100] + "...")

Chunk Object

@dataclass
class Chunk:
    text: str           # Chunk content
    index: int          # Chunk position
    word_count: int     # Word count
    start_char: int     # Start character offset
    end_char: int       # End character offset
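
The character offsets make it easy to map a chunk back to its position in the source document. A small sketch using only the fields above (the windowing logic itself is illustrative):

def excerpt(original_text: str, chunk: Chunk, context: int = 40) -> str:
    # Use the chunk's character offsets to pull a window from the source text
    start = max(0, chunk.start_char - context)
    end = min(len(original_text), chunk.end_char + context)
    return original_text[start:end]

chunks = chunker.chunk(long_text)
print(excerpt(long_text, chunks[0]))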

SemanticSearch

End-to-end semantic search combining embedding and storage.

Basic Usage

from cbintel.vectl import SemanticSearch

search = SemanticSearch("./my-index")

# Index documents
await search.index_file("document.txt")
await search.index_directory("./docs")

# Search with text query
results = await search.search("machine learning algorithms", top_k=10)

for result in results:
    print(f"{result.text[:100]}... (score: {result.score:.3f})")

Indexing Options

# Index with custom chunking
await search.index_file(
    "document.txt",
    chunk_size=500,
    overlap=50,
)

# Index with metadata
await search.index_file(
    "document.txt",
    metadata={"category": "research", "date": "2024-01-15"}
)

# Index raw text
await search.index_text(
    text_content,
    document_id="custom-id",
    metadata={"source": "api"}
)

Search Results

results = await search.search(query, top_k=10)

for result in results:
    print(f"Text: {result.text}")
    print(f"Score: {result.score}")
    print(f"Document: {result.document_id}")
    print(f"Chunk: {result.chunk_index}")
    print(f"Metadata: {result.metadata}")

DocumentIndex

Simple document indexing interface.

Basic Usage

from cbintel.vectl import DocumentIndex

index = DocumentIndex("./my-corpus")

# Index with automatic chunking
await index.add_document("doc1", "Long document text...", chunk_size=512)

# Search
matches = await index.search("query text", top_k=5)

for match in matches:
    print(f"{match.document_id}: {match.score:.3f}")

Graph Operations

embed Operation

- op: embed
  input: text
  params:
    model: nomic-embed-text
  output: vector

embed_batch Operation

- op: embed_batch
  input: chunks
  params:
    model: nomic-embed-text
    batch_size: 32
  output: vectors

store_vector Operation

- op: store_vector
  input: vector
  params:
    store: my-index
    id: "{{ document_id }}"
    metadata:
      source: "{{ source_url }}"
  output: ref

search_vectors Operation

- op: search_vectors
  params:
    query: "machine learning"
    store: my-index
    top_k: 10
  output: matches

Example Pipeline

name: document_indexing
stages:
  - name: load
    sequential:
      - op: fetch
        params:
          url: "https://example.com/document"
        output: html

  - name: process
    sequential:
      - op: to_text
        input: html
        output: text
      - op: chunk
        input: text
        params:
          size: 500
          overlap: 50
        output: chunks

  - name: embed
    parallel:
      - op: embed_batch
        input: chunks
        output: vectors

  - name: store
    sequential:
      - op: store_vectors
        input: [chunks, vectors]
        params:
          store: research-index
        output: refs

CLI Commands

# Generate embeddings
cbintel-vectl embed document.txt --output embeddings.json

# Index documents
cbintel-vectl index ./docs --store my-index

# Search
cbintel-vectl search "machine learning" --store my-index --top 10

# Store statistics
cbintel-vectl stats my-index

Configuration

Environment Variables

# Ollama
OLLAMA_BASE_URL=http://127.0.0.1:11434
OLLAMA_EMBED_MODEL=nomic-embed-text

# Vector storage
VECTL_INDEX_PATH=./data/vectors
VECTL_BATCH_SIZE=32
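
Whether these variables are picked up automatically is not shown in the snippets above; one way to wire them in explicitly is via os.environ (variable names per the list above, fallback defaults assumed):

import os

from cbintel.vectl import EmbeddingService, VectorStore

service = EmbeddingService(
    model=os.environ.get("OLLAMA_EMBED_MODEL", "nomic-embed-text"),
    base_url=os.environ.get("OLLAMA_BASE_URL", "http://127.0.0.1:11434"),
    batch_size=int(os.environ.get("VECTL_BATCH_SIZE", "32")),
)
store = VectorStore(path=os.environ.get("VECTL_INDEX_PATH", "./data/vectors"))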

Error Handling

from cbintel.vectl import (
    VectlError,
    EmbeddingError,
    StorageError,
    IndexNotFoundError,
)

try:
    vector = await service.embed(text)
except EmbeddingError as e:
    print(f"Embedding failed: {e}")
except VectlError as e:
    print(f"Vectl error: {e}")

Best Practices

Chunking Strategy

# For general text
chunker = ChunkingService(chunk_size=500, overlap=50)

# For technical documents (larger chunks)
chunker = ChunkingService(chunk_size=1000, overlap=100)

# For Q&A (smaller chunks)
chunker = ChunkingService(chunk_size=200, overlap=20)

Batch Processing

# Process in batches to avoid memory issues
texts = [...]  # Large list

batch_size = 100
for i in range(0, len(texts), batch_size):
    batch = texts[i:i + batch_size]
    vectors = await service.embed_batch(batch)
    await store.add_batch([
        (f"doc_{i+j}", v, {})
        for j, v in enumerate(vectors)
    ])

Index Persistence

# Always save after modifications
await store.add("doc1", vector1, metadata)
await store.save()  # Persist to disk

# Load for search
store = VectorStore.load("./my-index")
results = await store.search(query_vector)

Performance

K-means Clustering

VectorStore uses K-means clustering for efficient search:

# Clustering happens automatically
store = VectorStore(path="./index", n_clusters=100)

# More clusters = faster search (smaller clusters to scan), more memory
store = VectorStore(path="./index", n_clusters=500)

Backend Options

Backend     Speed       Memory   Accuracy
NumPy       Fast        Low      Exact
vectl C++   Very fast   Low      Exact

# Force NumPy backend
store = VectorStore(path="./index", backend="numpy")