# Chat Pipeline
The chat pipeline enables natural language queries to be transformed into executable research graphs.
## Overview

```mermaid
sequenceDiagram
    participant User
    participant Agent as ResearchAgent
    participant Intent as IntentClassifier
    participant Builder as PipelineBuilder
    participant Executor as GraphExecutor
    participant Workers
    User->>Agent: "Research John Smith's voting record"
    Agent->>Intent: classify(query)
    Intent-->>Agent: intent=person_research, conf=0.85
    Agent->>Builder: build(intent, params)
    Builder-->>Agent: GraphDef YAML
    Agent->>Executor: execute(graph, params)
    Executor->>Workers: dispatch operations
    Workers-->>Executor: results
    Executor-->>Agent: GraphResult
    Agent-->>User: Research Report
```
## Quick Reference
| Document | Description |
|---|---|
| Intent Classification | Query understanding |
| Graph Building | Interactive pipeline construction |
| Auto-Complete | Type-based suggestions |
## Components
| Component | Purpose |
|---|---|
| ResearchAgent | Conversational interface |
| IntentClassifier | Query to intent mapping |
| PipelineBuilder | Intent to graph construction |
| GraphExecutor | Graph execution |
## Flow

### 1. User Query

User submits a natural language query.
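For instance, the query from the overview diagram above:

```text
Research John Smith's voting record
```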
### 2. Intent Classification

Query is classified into an intent:

```json
{
  "intent": "person_research",
  "confidence": 0.85,
  "entities": {
    "person": "John Smith",
    "topic": "healthcare"
  },
  "params": {
    "subject_name": "John Smith",
    "focus_topic": "healthcare"
  }
}
```
### 3. Pipeline Building

Intent maps to a graph template:

```yaml
name: person_research
params:
  subject_name: "John Smith"
  focus_topic: "healthcare"
stages:
  - name: discover
    sequential:
      - op: search
        params:
          query: "John Smith healthcare voting record"
        output: urls
  # ... more stages
```
### 4. Graph Execution

Graph executes through the jobs system.
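As a sketch of what executing the template above involves: each stage's sequential operations run in order, and named outputs are threaded into a shared context for later stages. The real `GraphExecutor` dispatches operations to workers through the jobs system; `run_graph` and the stub `search` op below are purely illustrative:

```python
# Illustrative sequential execution of a graph definition. The real
# GraphExecutor dispatches each op to workers; this only shows the shape.
def run_graph(graph: dict, ops: dict) -> dict:
    """Run each stage's sequential ops, storing named outputs in a context."""
    ctx = dict(graph.get("params", {}))
    for stage in graph["stages"]:
        for step in stage.get("sequential", []):
            result = ops[step["op"]](**step.get("params", {}))
            if "output" in step:
                ctx[step["output"]] = result
    return ctx


# The "discover" stage from the template above, with a stub search op.
graph = {
    "params": {"subject_name": "John Smith"},
    "stages": [
        {"name": "discover", "sequential": [
            {"op": "search",
             "params": {"query": "John Smith healthcare voting record"},
             "output": "urls"},
        ]},
    ],
}
ctx = run_graph(graph, ops={"search": lambda query: [query]})
# ctx now contains both the graph params and the "urls" output
```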
### 5. Response
Results are formatted for the user.
## Intent Types

| Intent | Description | Parameters |
|---|---|---|
| research_person | Research a person | person_name, topic |
| research_company | Research a company | company_name |
| compare_sources | Compare sources | urls, topic |
| track_position | Track positions over time | person, topic |
| news_aggregation | Aggregate news | topic, regions |
| historical_analysis | Historical content analysis | url, date_range |
## Example Session

```text
User: Research Senator Jane Smith's position on climate change

Agent: I'll research Senator Jane Smith's position on climate change.

[Classifying intent...]
Intent: person_research (confidence: 0.92)
Parameters:
  - person_name: "Senator Jane Smith"
  - topic: "climate change"

[Building research pipeline...]
Using template: opposition_research

[Executing graph...]
Stage 1/5: Discovering sources... ✓ (42 URLs found)
Stage 2/5: Fetching content... ✓ (38 pages retrieved)
Stage 3/5: Extracting entities... ✓ (127 entities)
Stage 4/5: Analyzing positions... ✓
Stage 5/5: Synthesizing report... ✓

Research complete!

## Summary
Senator Jane Smith has consistently supported climate legislation,
including voting for the Clean Energy Act (2023) and co-sponsoring
the Carbon Reduction Initiative...

## Key Positions
- Supports carbon pricing
- Advocates for renewable energy subsidies
- Voted against oil drilling expansion

## Sources
- 42 sources analyzed
- Time range: 2019-2024
- Key sources: congress.gov, senate.gov, news outlets
```
## Configuration

### Environment Variables

```bash
# Intent classification
INTENT_MODEL=claude-3-5-sonnet
INTENT_CONFIDENCE_THRESHOLD=0.7

# Pipeline building
DEFAULT_MAX_URLS=50
DEFAULT_GEO=us
```
## Usage

### Via Chat Interface

```python
from cbintel.chat import ResearchAgent

agent = ResearchAgent()

# Interactive session
async for message in agent.chat("Research AI safety trends"):
    print(message)
```
### Via API

```python
from cbintel.chat import IntentClassifier, PipelineBuilder

# Classify intent
classifier = IntentClassifier()
intent = await classifier.classify("Research John Smith")

# Build graph
builder = PipelineBuilder()
graph = builder.build(intent)

# Execute (GraphExecutor is the executor component from the jobs
# system; import path omitted here)
executor = GraphExecutor()
result = await executor.run(graph, intent.params)
```
## Best Practices

- **Be specific**: more specific queries yield better results
- **Include context**: topics, time ranges, and locations help
- **Iterate**: refine queries based on initial results
- **Use templates**: leverage built-in research patterns
- **Review outputs**: validate AI-generated insights