Intent Classification¶
The Intent Classifier analyzes natural language queries to determine research intent and extract parameters.
Overview¶
flowchart LR
    QUERY[User Query] --> CLASSIFY[IntentClassifier]
    CLASSIFY --> INTENT[Intent Type]
    CLASSIFY --> PARAMS[Parameters]
    CLASSIFY --> CONF[Confidence]
    INTENT --> TEMPLATE[Graph Template]
    PARAMS --> TEMPLATE
IntentClassifier¶
import asyncio

from cbintel.chat import IntentClassifier

async def main():
    classifier = IntentClassifier()
    # classify() is a coroutine, so it must be awaited inside an async function
    result = await classifier.classify(
        "Research John Smith's voting record on healthcare"
    )
    print(f"Intent: {result.intent}")
    print(f"Confidence: {result.confidence}")
    print(f"Entities: {result.entities}")
    print(f"Params: {result.params}")

asyncio.run(main())
Classification Result¶
from dataclasses import dataclass

@dataclass
class IntentResult:
    intent: str        # Intent type
    confidence: float  # 0.0 to 1.0
    entities: dict     # Extracted entities
    params: dict       # Parameters for graph
    suggestions: list  # Alternative intents
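For unit tests or offline experiments it can help to build a result by hand. The sketch below defines a stand-in `IntentResult` with the same fields as above, so it runs without `cbintel` installed, and converts it to a plain dict with `dataclasses.asdict`. The empty-container defaults are assumptions made for convenience, not documented behavior.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class IntentResult:
    """Stand-in mirroring the documented fields; not imported from cbintel."""
    intent: str
    confidence: float
    entities: dict = field(default_factory=dict)     # assumed default
    params: dict = field(default_factory=dict)       # assumed default
    suggestions: list = field(default_factory=list)  # assumed default


sample = IntentResult(
    intent="person_research",
    confidence=0.92,
    entities={"person": "Senator Jane Smith"},
    params={"subject_name": "Senator Jane Smith", "max_urls": 100},
)

# A plain dict is convenient for logging or JSON serialization.
payload = asdict(sample)
print(payload["intent"], payload["confidence"])
```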
Intent Types¶
person_research¶
Research an individual.
# Query
"Research Senator Jane Smith"
"Deep dive on CEO John Doe"
"Background check on candidate Alice"
# Result
{
    "intent": "person_research",
    "entities": {"person": "Senator Jane Smith"},
    "params": {
        "subject_name": "Senator Jane Smith",
        "max_urls": 100
    }
}
company_research¶
Research a company or organization.
# Query
"Research Acme Corporation"
"Company profile for TechCorp Inc"
# Result
{
    "intent": "company_research",
    "entities": {"company": "Acme Corporation"},
    "params": {
        "company_name": "Acme Corporation",
        "include_news": True
    }
}
compare_sources¶
Compare how different sources cover a topic.
# Query
"Compare coverage of the election between these news sites"
"How do different outlets report on climate change?"
# Result
{
    "intent": "compare_sources",
    "entities": {"topic": "election"},
    "params": {
        "topic": "election coverage",
        "sources": ["url1", "url2", "url3"]
    }
}
track_position¶
Track positions or statements over time.
# Query
"How has the Senator's position on healthcare changed?"
"Track company statements about privacy over the years"
# Result
{
    "intent": "track_position",
    "entities": {
        "person": "Senator",
        "topic": "healthcare"
    },
    "params": {
        "subject": "Senator",
        "topic": "healthcare",
        "include_archives": True
    }
}
news_aggregation¶
Aggregate news on a topic.
# Query
"Latest news on AI regulation"
"News about the merger in the past week"
# Result
{
    "intent": "news_aggregation",
    "entities": {"topic": "AI regulation"},
    "params": {
        "topic": "AI regulation",
        "days": 7
    }
}
historical_analysis¶
Analyze historical content.
# Query
"How has this website changed since 2020?"
"Archive analysis of company about page"
# Result
{
    "intent": "historical_analysis",
    "entities": {"url": "https://example.com"},
    "params": {
        "url": "https://example.com",
        "start_date": "2020-01-01"
    }
}
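In the result above, the phrase "since 2020" ends up as `"start_date": "2020-01-01"`. How the classifier performs that normalization is not documented here; the sketch below shows one way a bare year could be turned into an ISO date. `year_to_start_date` is a hypothetical helper, not part of `cbintel`.

```python
from datetime import date


def year_to_start_date(year_text: str) -> str:
    """Turn a bare year such as "2020" into the ISO date of its January 1st."""
    year = int(year_text.strip())  # raises ValueError for non-numeric input
    return date(year, 1, 1).isoformat()


params = {
    "url": "https://example.com",
    "start_date": year_to_start_date("2020"),
}
print(params["start_date"])  # 2020-01-01
```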
video_research¶
Research from video content.
# Query
"Analyze this politician's YouTube interviews"
"Research public statements from video"
# Result
{
"intent": "video_research",
"entities": {"topic": "interviews"},
"params": {
"query": "politician interviews",
"max_videos": 10
}
}
Entity Extraction¶
The classifier extracts entities from queries:
| Entity Type | Examples |
|---|---|
| person | "John Smith", "Senator Jane" |
| company | "Acme Corp", "TechCo Inc" |
| location | "California", "New York" |
| topic | "healthcare", "climate change" |
| date | "2020", "last month" |
| url | "https://example.com" |
Entity Examples¶
# Query: "Research John Smith from Acme Corp in California"
{
    "entities": {
        "person": "John Smith",
        "company": "Acme Corp",
        "location": "California"
    }
}
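When consuming extracted entities downstream, it can be useful to flag keys that fall outside the entity types listed in the table. The following is a minimal sketch under that assumption; `KNOWN_ENTITY_TYPES` and `unknown_entity_types` are illustrative names, and custom intents may legitimately introduce additional entity types.

```python
# Entity types taken from the table above.
KNOWN_ENTITY_TYPES = {"person", "company", "location", "topic", "date", "url"}


def unknown_entity_types(entities: dict) -> list:
    """Return the entity keys that are not in the documented type table."""
    return sorted(key for key in entities if key not in KNOWN_ENTITY_TYPES)


entities = {
    "person": "John Smith",
    "company": "Acme Corp",
    "location": "California",
}
print(unknown_entity_types(entities))               # []
print(unknown_entity_types({"product": "Widget"}))  # ['product']
```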
Confidence Scoring¶
Each classification result includes a confidence score between 0.0 and 1.0:
| Confidence | Meaning |
|---|---|
| 0.9 - 1.0 | High confidence, proceed |
| 0.7 - 0.9 | Good confidence |
| 0.5 - 0.7 | Medium confidence, may clarify |
| < 0.5 | Low confidence, suggest alternatives |
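The bands in the table can be turned into a small dispatch function. This is a sketch, not the library's behavior: the action labels are hypothetical, and because the table does not say which band the boundary values 0.7 and 0.9 belong to, the code assumes they fall into the higher band.

```python
def confidence_action(confidence: float) -> str:
    """Map a confidence score to one of four hypothetical action labels."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0.0, 1.0], got {confidence}")
    if confidence >= 0.9:
        return "proceed"               # high confidence
    if confidence >= 0.7:
        return "accept"                # good confidence
    if confidence >= 0.5:
        return "clarify"               # medium confidence, may clarify
    return "suggest_alternatives"      # low confidence


for score in (0.95, 0.75, 0.6, 0.3):
    print(score, confidence_action(score))
```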
Handling Low Confidence¶
result = await classifier.classify(query)
if result.confidence < 0.7:
    # Suggest alternatives
    print("Did you mean:")
    for suggestion in result.suggestions:
        print(f" - {suggestion.intent} ({suggestion.description})")
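In a chat interface the alternatives are usually returned as a message rather than printed. The sketch below builds such a message; the `Suggestion` dataclass is a hypothetical stand-in that assumes each suggestion exposes `intent` and `description`, as the snippet above implies.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """Hypothetical shape of one entry in result.suggestions."""
    intent: str
    description: str


def clarification_message(suggestions: list) -> str:
    """Build a 'Did you mean' prompt, or a generic fallback if there are none."""
    if not suggestions:
        return "Could you rephrase your request?"
    lines = ["Did you mean:"]
    lines += [f"  - {s.intent} ({s.description})" for s in suggestions]
    return "\n".join(lines)


options = [
    Suggestion("person_research", "Research an individual"),
    Suggestion("track_position", "Track positions or statements over time"),
]
print(clarification_message(options))
```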
Configuration¶
Classifier Options¶
classifier = IntentClassifier(
    model="claude-3-5-sonnet",  # AI model
    threshold=0.7,              # Min confidence
    include_suggestions=True,   # Include alternatives
)
Environment Variables¶
Custom Intents¶
Register custom intent types:
from cbintel.chat import register_intent

@register_intent("competitor_analysis")
class CompetitorAnalysisIntent:
    description = "Analyze competitors in a market"
    required_entities = ["company"]
    optional_entities = ["market", "region"]

    def to_params(self, entities):
        # Map extracted entities to graph parameters
        return {
            "company_name": entities["company"],
            "market": entities.get("market"),
            "include_news": True
        }
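To make the registration pattern concrete, here is a self-contained sketch of how a decorator-based registry of this kind might work. It is not `cbintel`'s implementation: `INTENT_REGISTRY`, the local `register_intent`, and `build_params` are hypothetical stand-ins, and the check against `required_entities` is an assumed behavior.

```python
INTENT_REGISTRY = {}  # hypothetical: intent name -> handler class


def register_intent(name):
    """Stand-in decorator that stores the class under the given intent name."""
    def decorator(cls):
        INTENT_REGISTRY[name] = cls
        return cls
    return decorator


@register_intent("competitor_analysis")
class CompetitorAnalysisIntent:
    description = "Analyze competitors in a market"
    required_entities = ["company"]
    optional_entities = ["market", "region"]

    def to_params(self, entities):
        return {
            "company_name": entities["company"],
            "market": entities.get("market"),
            "include_news": True,
        }


def build_params(intent_name, entities):
    """Check required entities, then delegate to the handler's to_params."""
    handler = INTENT_REGISTRY[intent_name]()
    missing = [e for e in handler.required_entities if e not in entities]
    if missing:
        raise ValueError(f"missing required entities: {missing}")
    return handler.to_params(entities)


print(build_params("competitor_analysis", {"company": "Acme Corp"}))
```

Validating required entities before calling `to_params` turns a bare `KeyError` into a message that names what the query was missing.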
Intent to Template Mapping¶
INTENT_TEMPLATES = {
    "person_research": "opposition_research",
    "company_research": "company_profile",
    "compare_sources": "source_comparison",
    "track_position": "temporal_analysis",
    "news_aggregation": "news_aggregation",
    "historical_analysis": "temporal_analysis",
    "video_research": "video_analysis",
}
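A lookup against this mapping needs to decide what happens for an intent with no template, for example a newly registered custom intent. The sketch below copies the mapping so it runs on its own; `resolve_template` and its `default` argument are hypothetical helpers, not part of the library.

```python
INTENT_TEMPLATES = {
    "person_research": "opposition_research",
    "company_research": "company_profile",
    "compare_sources": "source_comparison",
    "track_position": "temporal_analysis",
    "news_aggregation": "news_aggregation",
    "historical_analysis": "temporal_analysis",
    "video_research": "video_analysis",
}


def resolve_template(intent, default=None):
    """Return the graph template for an intent, or the default if unmapped."""
    template = INTENT_TEMPLATES.get(intent, default)
    if template is None:
        raise ValueError(f"no graph template for intent {intent!r}")
    return template


print(resolve_template("track_position"))       # temporal_analysis
print(resolve_template("historical_analysis"))  # temporal_analysis
```

Note that `track_position` and `historical_analysis` share the `temporal_analysis` template, so the mapping cannot be inverted to recover the intent.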
Error Handling¶
from cbintel.chat import ClassificationError

try:
    result = await classifier.classify(query)
except ClassificationError as e:
    print(f"Classification failed: {e}")
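If classification failures are sometimes transient (an assumption; the documentation does not say what causes `ClassificationError`), a small retry wrapper is one option. The sketch uses a stand-in exception and a fake classifier so it runs without `cbintel`; `classify_with_retry`, `attempts`, and `delay` are illustrative names.

```python
import asyncio


class ClassificationError(Exception):
    """Stand-in for cbintel.chat.ClassificationError."""


class FlakyClassifier:
    """Fake classifier that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    async def classify(self, query):
        self.calls += 1
        if self.calls <= self.failures:
            raise ClassificationError("temporary failure")
        return {"intent": "person_research", "query": query}


async def classify_with_retry(classifier, query, attempts=3, delay=0.0):
    """Retry classify() up to `attempts` times with a linearly growing pause."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await classifier.classify(query)
        except ClassificationError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(delay * attempt)


flaky = FlakyClassifier(failures=2)
result = asyncio.run(classify_with_retry(flaky, "Research Senator Jane Smith"))
print(result["intent"], "after", flaky.calls, "calls")
```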
Best Practices¶
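- Be specific - "Research John Smith CEO" gives the classifier more to work with than "Research John Smith"
- Include context - Mention topics, time ranges, and locations in the query
- Handle ambiguity - Offer clarification options when confidence is low
- Log classifications - Record queries and results so accuracy can be tracked and improved
- Set thresholds - Choose a confidence threshold appropriate to the application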
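For the logging recommendation, a minimal sketch using the standard `logging` module is shown here. The logger name, the record fields, and the choice to log below-threshold results at WARNING level are all assumptions for illustration.

```python
import json
import logging

logger = logging.getLogger("intent_classification")  # hypothetical logger name


def classification_record(query, intent, confidence, threshold=0.7):
    """Build a JSON-serializable record describing one classification."""
    return {
        "query": query,
        "intent": intent,
        "confidence": round(confidence, 3),
        "below_threshold": confidence < threshold,
    }


def log_classification(query, intent, confidence, threshold=0.7):
    """Log the record as JSON; low-confidence results are logged as warnings."""
    record = classification_record(query, intent, confidence, threshold)
    level = logging.WARNING if record["below_threshold"] else logging.INFO
    logger.log(level, json.dumps(record, sort_keys=True))
    return record


logging.basicConfig(level=logging.INFO)
log_classification("Latest news on AI regulation", "news_aggregation", 0.93)
log_classification("Tell me about Smith", "person_research", 0.55)
```

Emitting one JSON object per line keeps the log easy to filter later, for instance to review every query that fell below the threshold.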