# Chat Pipeline

The chat pipeline transforms natural language queries into executable research graphs.

## Overview

```mermaid
sequenceDiagram
    participant User
    participant Agent as ResearchAgent
    participant Intent as IntentClassifier
    participant Builder as PipelineBuilder
    participant Executor as GraphExecutor
    participant Workers

    User->>Agent: "Research John Smith's voting record"
    Agent->>Intent: classify(query)
    Intent-->>Agent: intent=person_research, conf=0.85
    Agent->>Builder: build(intent, params)
    Builder-->>Agent: GraphDef YAML
    Agent->>Executor: execute(graph, params)
    Executor->>Workers: dispatch operations
    Workers-->>Executor: results
    Executor-->>Agent: GraphResult
    Agent-->>User: Research Report
```
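
In code, the diagram collapses to three calls. The sketch below is illustrative only: it borrows the method names shown in the diagram (`classify`, `build`, `execute`) and takes the components as arguments rather than guessing their import paths.

```python
# Illustrative sketch of the chat pipeline flow; method names are taken from
# the sequence diagram above, not from the library's documented API.

async def run_research(query: str, classifier, builder, executor):
    intent = await classifier.classify(query)               # query -> intent + params
    graph = builder.build(intent)                            # intent -> GraphDef
    result = await executor.execute(graph, intent.params)    # dispatch ops to workers
    return result                                            # formatted into a report
```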

## Quick Reference

| Document | Description |
|----------|-------------|
| Intent Classification | Query understanding |
| Graph Building | Interactive pipeline construction |
| Auto-Complete | Type-based suggestions |

## Components

| Component | Purpose |
|-----------|---------|
| ResearchAgent | Conversational interface |
| IntentClassifier | Query-to-intent mapping |
| PipelineBuilder | Intent-to-graph construction |
| GraphExecutor | Graph execution |

## Flow

### 1. User Query

User submits a natural language query:

"Research John Smith's voting record on healthcare"

### 2. Intent Classification

Query is classified into an intent:

```json
{
    "intent": "person_research",
    "confidence": 0.85,
    "entities": {
        "person": "John Smith",
        "topic": "healthcare"
    },
    "params": {
        "subject_name": "John Smith",
        "focus_topic": "healthcare"
    }
}
```
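
A small typed container makes this result easy to pass through the rest of the pipeline. The dataclass below simply mirrors the JSON fields above; the class name and field types are assumptions, not the library's actual model.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Intent:
    """Illustrative container mirroring the classifier output shown above."""
    intent: str                                               # e.g. "person_research"
    confidence: float                                         # e.g. 0.85
    entities: dict[str, str] = field(default_factory=dict)    # extracted entities
    params: dict[str, Any] = field(default_factory=dict)      # graph parameters

# Built from the JSON example above:
intent = Intent(
    intent="person_research",
    confidence=0.85,
    entities={"person": "John Smith", "topic": "healthcare"},
    params={"subject_name": "John Smith", "focus_topic": "healthcare"},
)
```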

### 3. Pipeline Building

Intent maps to a graph template:

```yaml
name: person_research
params:
  subject_name: "John Smith"
  focus_topic: "healthcare"

stages:
  - name: discover
    sequential:
      - op: search
        params:
          query: "John Smith healthcare voting record"
        output: urls
  # ... more stages
```
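
Building is essentially template selection plus parameter substitution. A minimal sketch, assuming templates live as YAML files keyed by intent name and use `$`-style placeholders (the `TEMPLATE_DIR` path and `build_graph` helper are hypothetical):

```python
from pathlib import Path
from string import Template

import yaml  # PyYAML; assumed to be available

TEMPLATE_DIR = Path("graph_templates")  # hypothetical location of the templates

def build_graph(intent_name: str, params: dict) -> dict:
    """Load the YAML template for an intent and substitute its parameters."""
    text = (TEMPLATE_DIR / f"{intent_name}.yaml").read_text()
    rendered = Template(text).safe_substitute(params)   # $subject_name-style placeholders
    graph = yaml.safe_load(rendered)
    graph["params"] = params                            # keep resolved params on the graph
    return graph
```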

### 4. Graph Execution

Graph executes through the jobs system.
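
Conceptually, the executor walks the stages in order, dispatches each operation to a worker, and threads named outputs into later operations. The loop below is a simplification under those assumptions; `run_op` stands in for the jobs system's dispatch, which also handles queuing, retries, and parallelism.

```python
async def execute_graph(graph: dict, params: dict, run_op) -> dict:
    """Simplified executor loop: run each stage's ops and collect named outputs.

    run_op is a hypothetical coroutine that dispatches a single operation to a
    worker and returns its result.
    """
    context = dict(params)                         # named outputs accumulate here
    for stage in graph["stages"]:
        for op in stage.get("sequential", []):
            result = await run_op(op["op"], {**op.get("params", {}), **context})
            if "output" in op:
                context[op["output"]] = result     # e.g. context["urls"] = [...]
    return context
```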

### 5. Response

Results are formatted into a research report and returned to the user.
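
As an illustration, the formatting step can be as simple as rendering the synthesized outputs to markdown. The field names used here (`summary`, `positions`, `sources`) are hypothetical and depend on what the synthesis stage actually produces.

```python
def format_report(outputs: dict) -> str:
    """Render hypothetical synthesis outputs as a markdown research report."""
    lines = ["## Summary", outputs.get("summary", ""), "", "## Key Positions"]
    lines += [f"- {p}" for p in outputs.get("positions", [])]
    lines += ["", "## Sources", f"- {len(outputs.get('sources', []))} sources analyzed"]
    return "\n".join(lines)
```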

## Intent Types

| Intent | Description | Parameters |
|--------|-------------|------------|
| research_person | Research a person | person_name, topic |
| research_company | Research a company | company_name |
| compare_sources | Compare sources | urls, topic |
| track_position | Track positions over time | person, topic |
| news_aggregation | Aggregate news | topic, regions |
| historical_analysis | Historical content analysis | url, date_range |
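
The table above also doubles as a validation schema: before building a pipeline, the agent can check that the classifier extracted every parameter an intent requires. A minimal sketch using only the parameters listed above (the `REQUIRED_PARAMS` dict and `missing_params` helper are illustrative, not part of the library):

```python
# Required parameters per intent, taken from the table above.
REQUIRED_PARAMS = {
    "research_person": ["person_name", "topic"],
    "research_company": ["company_name"],
    "compare_sources": ["urls", "topic"],
    "track_position": ["person", "topic"],
    "news_aggregation": ["topic", "regions"],
    "historical_analysis": ["url", "date_range"],
}

def missing_params(intent: str, params: dict) -> list[str]:
    """Return any required parameters the classifier failed to extract."""
    return [p for p in REQUIRED_PARAMS.get(intent, []) if p not in params]
```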

## Example Session

```text
User: Research Senator Jane Smith's position on climate change

Agent: I'll research Senator Jane Smith's position on climate change.

[Classifying intent...]
Intent: person_research (confidence: 0.92)
Parameters:
  - person_name: "Senator Jane Smith"
  - topic: "climate change"

[Building research pipeline...]
Using template: opposition_research

[Executing graph...]
Stage 1/5: Discovering sources... ✓ (42 URLs found)
Stage 2/5: Fetching content... ✓ (38 pages retrieved)
Stage 3/5: Extracting entities... ✓ (127 entities)
Stage 4/5: Analyzing positions... ✓
Stage 5/5: Synthesizing report... ✓

Research complete!

## Summary
Senator Jane Smith has consistently supported climate legislation,
including voting for the Clean Energy Act (2023) and co-sponsoring
the Carbon Reduction Initiative...

## Key Positions
- Supports carbon pricing
- Advocates for renewable energy subsidies
- Voted against oil drilling expansion

## Sources
- 42 sources analyzed
- Time range: 2019-2024
- Key sources: congress.gov, senate.gov, news outlets
```

## Configuration

### Environment Variables

```bash
# Intent classification
INTENT_MODEL=claude-3-5-sonnet
INTENT_CONFIDENCE_THRESHOLD=0.7

# Pipeline building
DEFAULT_MAX_URLS=50
DEFAULT_GEO=us
```
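
These can be read once at startup. The snippet below uses only the variable names listed above and falls back to the example values; it is a sketch, not the library's actual settings loader.

```python
import os

# Read the documented environment variables, using the values above as fallbacks.
INTENT_MODEL = os.getenv("INTENT_MODEL", "claude-3-5-sonnet")
INTENT_CONFIDENCE_THRESHOLD = float(os.getenv("INTENT_CONFIDENCE_THRESHOLD", "0.7"))
DEFAULT_MAX_URLS = int(os.getenv("DEFAULT_MAX_URLS", "50"))
DEFAULT_GEO = os.getenv("DEFAULT_GEO", "us")
```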

## Usage

### Via Chat Interface

```python
from cbintel.chat import ResearchAgent

agent = ResearchAgent()

# Interactive session
async for message in agent.chat("Research AI safety trends"):
    print(message)
```

### Via API

```python
from cbintel.chat import IntentClassifier, PipelineBuilder

# Classify intent
classifier = IntentClassifier()
intent = await classifier.classify("Research John Smith")

# Build graph
builder = PipelineBuilder()
graph = builder.build(intent)

# Execute (executor is a GraphExecutor instance; see the Components table above)
result = await executor.run(graph, intent.params)
```

## Best Practices

  1. Be specific - More specific queries yield better results
  2. Include context - Topics, time ranges, locations help
  3. Iterate - Refine queries based on initial results
  4. Use templates - Leverage built-in research patterns
  5. Review outputs - Validate AI-generated insights