AI Clients

cbintel provides three AI client wrappers for interacting with language models: AnthropicClient for the Claude API, OllamaClient for local models served by Ollama, and CBAIClient as a unified interface over both.

Module Structure

src/cbintel/ai/
├── __init__.py          # Public exports
├── anthropic_client.py  # Claude API wrapper
├── ollama_client.py     # Local LLM wrapper
├── cbai_client.py       # Unified client
├── embeddings.py        # Embedding generation
├── diff.py              # Content diff utilities
└── sentiment.py         # Sentiment analysis
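
The examples on this page import these names from the package root; the one exception is EmbeddingService, which lives in cbintel.vectl (see Embeddings below):

from cbintel.ai import (
    AIError,
    AnthropicClient,
    CBAIClient,
    OllamaClient,
    RateLimitError,
    diff,
    diff_summary,
    has_significant_changes,
    sentiment,
)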

CBAIClient (Unified)

CBAIClient is the recommended client for most use cases. It automatically routes each request to the appropriate backend (Anthropic or Ollama) based on the requested model.

Basic Usage

from cbintel.ai import CBAIClient

async with CBAIClient() as ai:
    # Simple completion
    response = await ai.complete("What is machine learning?")
    print(response)

    # With context
    response = await ai.complete(
        "Summarize this document",
        context=document_text
    )

    # Specify model
    response = await ai.complete(
        "Analyze this code",
        model="claude-3-5-sonnet",  # or "llama3.2" for Ollama
    )

Model Selection

# Claude models (via Anthropic)
await ai.complete(prompt, model="claude-3-5-sonnet")
await ai.complete(prompt, model="claude-3-opus")
await ai.complete(prompt, model="claude-3-haiku")

# Local models (via Ollama)
await ai.complete(prompt, model="llama3.2")
await ai.complete(prompt, model="mistral")
await ai.complete(prompt, model="codellama")
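
How CBAIClient maps a model name to a backend is not documented here; a minimal sketch of one plausible rule, assuming routing on the model-name prefix (pick_backend is a hypothetical helper, not cbintel's actual implementation):

def pick_backend(model: str) -> str:
    # Assumed convention: Claude model IDs start with "claude".
    return "anthropic" if model.startswith("claude") else "ollama"

assert pick_backend("claude-3-5-sonnet") == "anthropic"
assert pick_backend("llama3.2") == "ollama"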

Configuration

from cbintel.ai import CBAIClient

client = CBAIClient(
    default_model="claude-3-5-sonnet",
    temperature=0.7,
    max_tokens=4096,
)

AnthropicClient

Direct wrapper for the Anthropic Claude API.

Basic Usage

from cbintel.ai import AnthropicClient

client = AnthropicClient(model="claude-3-5-sonnet-20241022")

# Simple completion
response = await client.complete("Explain quantum computing")

# With system prompt
response = await client.complete(
    "Analyze this data",
    system="You are a data analyst. Be concise and precise.",
    context=data
)

# With conversation history
response = await client.chat([
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi! How can I help?"},
    {"role": "user", "content": "Explain AI"},
])
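
The chat example implies that chat() accepts the full message history and returns the assistant's reply as text. Under that assumption, a minimal multi-turn loop looks like this:

history = []
for user_input in ["Hello", "Explain AI in one sentence"]:
    history.append({"role": "user", "content": user_input})
    reply = await client.chat(history)  # assumed to return the reply text
    history.append({"role": "assistant", "content": reply})
    print(reply)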

Streaming

async for chunk in client.stream("Write a story about robots"):
    print(chunk, end="", flush=True)
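
Assuming each chunk is a plain text fragment, as the print loop suggests, the full response can be reassembled by collecting the chunks:

chunks = []
async for chunk in client.stream("Write a story about robots"):
    chunks.append(chunk)  # each chunk is assumed to be a str fragment
story = "".join(chunks)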

Configuration

client = AnthropicClient(
    model="claude-3-5-sonnet-20241022",
    api_key="sk-ant-...",  # Or set ANTHROPIC_API_KEY
    max_tokens=4096,
    temperature=0.7,
)

Available Models

Model             | ID                         | Best For
Claude 3.5 Sonnet | claude-3-5-sonnet-20241022 | General use
Claude 3 Opus     | claude-3-opus-20240229     | Complex reasoning
Claude 3 Haiku    | claude-3-haiku-20240307    | Fast, simple tasks

OllamaClient

Wrapper for local LLM inference via Ollama.

Basic Usage

from cbintel.ai import OllamaClient

client = OllamaClient(
    model="llama3.2",
    base_url="http://127.0.0.1:11434"
)

# Simple completion
response = await client.complete("What is Python?")

# With context
response = await client.complete(
    "Summarize this",
    context=text
)

Streaming

async for chunk in client.stream("Write a haiku"):
    print(chunk, end="", flush=True)

Configuration

client = OllamaClient(
    model="llama3.2",
    base_url="http://127.0.0.1:11434",  # Or set OLLAMA_BASE_URL
    timeout=120.0,
)

Available Models

Common models for Ollama:

Model            | Size       | Best For
llama3.2         | 3B/8B      | General use
mistral          | 7B         | Fast, good quality
codellama        | 7B/13B     | Code tasks
nomic-embed-text | 137M       | Embeddings
translategemma   | 4B/12B/27B | Translation

Embeddings

Generate text embeddings for semantic search.

Using EmbeddingService

from cbintel.vectl import EmbeddingService

service = EmbeddingService(
    model="nomic-embed-text",
    base_url="http://127.0.0.1:11434"
)

# Single embedding
vector = await service.embed("Hello, world!")
print(f"Dimensions: {len(vector)}")  # 768

# Batch embeddings
vectors = await service.embed_batch([
    "First document",
    "Second document",
    "Third document",
])
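
The vectors behave as plain float sequences (len() works on them above), so a simple semantic search can rank the batch by cosine similarity against a query. A self-contained sketch:

import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length float vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = await service.embed("second document")
scores = [cosine_similarity(query, v) for v in vectors]
best = scores.index(max(scores))  # index of the most similar document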

Common Embedding Models

Model             | Dimensions | Speed
nomic-embed-text  | 768        | Fast
all-MiniLM-L6-v2  | 384        | Very fast
mxbai-embed-large | 1024       | Slower, higher quality

Sentiment Analysis

from cbintel.ai import sentiment

result = await sentiment("This product is amazing!")
print(f"Sentiment: {result.label}")  # positive
print(f"Score: {result.score}")       # 0.95

Content Diff

Compare content versions with AI analysis:

from cbintel.ai import diff, diff_summary, has_significant_changes

# Get structured diff
result = await diff(old_text, new_text)
print(f"Additions: {result.additions}")
print(f"Deletions: {result.deletions}")

# Get human-readable summary
summary = await diff_summary(old_text, new_text)
print(summary)

# Check for significant changes
if await has_significant_changes(old_text, new_text):
    print("Content changed significantly")

Graph Operations

The AI clients power several graph operations, which are configured declaratively in pipeline YAML:

stages:
  - name: analyze
    sequential:
      - op: summarize
        input: text
        params:
          max_length: 500
          model: claude-3-5-sonnet
        output: summary

      - op: entities
        input: text
        params:
          types: [person, organization, location]
        output: entities

      - op: sentiment
        input: text
        output: sentiment_result

Error Handling

from cbintel.ai import CBAIClient, AIError, RateLimitError

async with CBAIClient() as ai:
    try:
        response = await ai.complete(prompt)
    except RateLimitError:
        print("Rate limited - wait and retry")
    except AIError as e:
        print(f"AI error: {e}")

Configuration

Environment Variables

# Anthropic
ANTHROPIC_API_KEY=sk-ant-...

# Ollama
OLLAMA_BASE_URL=http://127.0.0.1:11434
OLLAMA_EMBED_MODEL=nomic-embed-text

# CBAI (unified)
CBAI_DEFAULT_MODEL=claude-3-5-sonnet
CBAI_TEMPERATURE=0.7
CBAI_MAX_TOKENS=4096
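
The inline comments above ("Or set ANTHROPIC_API_KEY", "Or set OLLAMA_BASE_URL") suggest that explicit constructor arguments take precedence over the environment. Assuming the CBAI_* variables behave the same way:

import os

from cbintel.ai import CBAIClient

os.environ.setdefault("CBAI_DEFAULT_MODEL", "claude-3-5-sonnet")

client = CBAIClient()                 # assumed to fall back to CBAI_* defaults
client = CBAIClient(temperature=0.2)  # explicit argument overrides the env value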

Best Practices

Model Selection

  1. Claude 3.5 Sonnet - Default for most tasks
  2. Claude 3 Opus - Complex reasoning, long context
  3. Claude 3 Haiku - Fast, simple tasks, cost-effective
  4. Ollama models - Local processing, privacy, cost-free

Context Management

# Keep context concise
response = await ai.complete(
    "Summarize the key points",
    context=text[:10000]  # Limit context size
)

# Use system prompts for consistent behavior
response = await ai.complete(
    prompt,
    system="You are a research analyst. Be thorough and cite sources."
)
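
When a document won't fit even a generous truncation, a common workaround (not a documented cbintel feature) is map-reduce summarization: summarize fixed-size chunks, then summarize the partial summaries. A sketch using only the complete() API shown above:

async def summarize_long(ai, text: str, chunk_size: int = 10000) -> str:
    # Summarize each chunk independently, then merge the partial summaries.
    partials = []
    for i in range(0, len(text), chunk_size):
        partials.append(await ai.complete(
            "Summarize the key points",
            context=text[i:i + chunk_size],
        ))
    return await ai.complete(
        "Combine these partial summaries into a single summary",
        context="\n\n".join(partials),
    )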

Error Handling

import asyncio

from cbintel.ai import RateLimitError

async def complete_with_retry(ai, prompt, max_retries=3):
    for attempt in range(max_retries):
        try:
            return await ai.complete(prompt)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the rate-limit error
            await asyncio.sleep(2 ** attempt)  # exponential backoff