Auto-Complete¶

The type system drives auto-complete suggestions during interactive graph building: the output type of the most recent stage determines which operations can be suggested next.

Overview¶

flowchart LR
    STATE[Current State] --> ANALYZE[Analyze Output Type]
    ANALYZE --> SUGGEST[Generate Suggestions]
    SUGGEST --> RANK[Rank by Relevance]
    RANK --> DISPLAY[Show to User]

Context-Based Suggestions¶

Based on the last output type, the system suggests compatible operations:

from cbintel.chat import AutoComplete

complete = AutoComplete()

# Get suggestions based on current output type
suggestions = complete.suggest(last_output_type="Url[]")

for s in suggestions:
    print(f"{s.operation}: {s.reason} (score: {s.score})")

Suggestion Mappings¶

After Url[]¶

- op: fetch_batch
  score: 0.95
  reason: "Fetch content from URLs"

- op: filter_urls
  score: 0.70
  reason: "Filter URLs by pattern"

- op: screenshot
  score: 0.65
  reason: "Capture screenshots"

After Html¶

- op: to_text
  score: 0.90
  reason: "Extract plain text"

- op: to_markdown
  score: 0.85
  reason: "Convert to markdown"

- op: extract_links
  score: 0.70
  reason: "Extract links from page"

After Html[]¶

- op: to_text_batch
  score: 0.90
  reason: "Extract text from all pages"

- op: to_markdown_batch
  score: 0.85
  reason: "Convert all to markdown"

After Text¶

- op: chunk
  score: 0.90
  reason: "Split into chunks for processing"

- op: entities
  score: 0.85
  reason: "Extract named entities"

- op: summarize
  score: 0.80
  reason: "Generate summary"

- op: embed
  score: 0.75
  reason: "Generate embedding"

After Chunk[]¶

- op: semantic_filter
  score: 0.90
  reason: "Filter relevant chunks"

- op: quality_filter
  score: 0.85
  reason: "Filter by quality"

- op: embed_batch
  score: 0.80
  reason: "Generate embeddings"

- op: integrate
  score: 0.75
  reason: "Synthesize into report"

After Entity[]¶

- op: filter_entities
  score: 0.90
  reason: "Filter by type/confidence"

- op: store_entities
  score: 0.70
  reason: "Store in knowledge base"

- op: to_report
  score: 0.65
  reason: "Format as report"

After Vector[]¶

- op: store_vectors
  score: 0.90
  reason: "Store in vector database"

- op: search_vectors
  score: 0.80
  reason: "Search similar content"
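
The mappings above amount to a lookup table keyed by output type. A minimal sketch, assuming a plain dictionary; the `SUGGESTION_MAP` name and the `suggest_for` helper are hypothetical and not part of `cbintel`, and only two of the types are filled in:

```python
# Hypothetical lookup table. Only Url[] and Vector[] are filled in,
# with values copied from the mappings on this page.
SUGGESTION_MAP = {
    "Url[]": [
        {"op": "fetch_batch", "score": 0.95, "reason": "Fetch content from URLs"},
        {"op": "filter_urls", "score": 0.70, "reason": "Filter URLs by pattern"},
        {"op": "screenshot", "score": 0.65, "reason": "Capture screenshots"},
    ],
    "Vector[]": [
        {"op": "store_vectors", "score": 0.90, "reason": "Store in vector database"},
        {"op": "search_vectors", "score": 0.80, "reason": "Search similar content"},
    ],
}


def suggest_for(output_type: str) -> list[dict]:
    """Return the suggestions for an output type, highest score first."""
    entries = SUGGESTION_MAP.get(output_type, [])
    return sorted(entries, key=lambda e: e["score"], reverse=True)


for entry in suggest_for("Url[]"):
    print(f"{entry['op']}: {entry['reason']} (score: {entry['score']})")
```

An unknown output type yields an empty list rather than an error, which keeps the interactive loop simple.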

Suggestion Ranking¶

Suggestions are ranked by three factors, each a multiplier applied to the suggestion's base score:

Type Compatibility¶

Type compatibility is the primary factor: an operation must accept the current output type, either directly or after coercion.

def type_score(output_type: str, operation: str) -> float:
    expected_input = get_operation_input_type(operation)
    if output_type == expected_input:
        return 1.0
    elif can_coerce(output_type, expected_input):
        return 0.8
    else:
        return 0.0
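
The helpers `get_operation_input_type` and `can_coerce` are not defined on this page. The sketch below fills them in under stated assumptions: the input types in `OPERATION_INPUTS` are inferred from the "After ..." mappings, and the coercion rule (a single value `T` may be wrapped into a list `T[]`) is purely illustrative.

```python
# Assumed input types, inferred from the "After ..." mappings on this page.
OPERATION_INPUTS = {
    "fetch_batch": "Url[]",
    "to_text": "Html",
    "to_text_batch": "Html[]",
    "chunk": "Text",
}


def get_operation_input_type(operation: str) -> str:
    return OPERATION_INPUTS[operation]


def can_coerce(output_type: str, expected_input: str) -> bool:
    # Illustrative rule only: a scalar T can be wrapped into a list T[].
    return expected_input == output_type + "[]"


def type_score(output_type: str, operation: str) -> float:
    expected_input = get_operation_input_type(operation)
    if output_type == expected_input:
        return 1.0  # exact match
    if can_coerce(output_type, expected_input):
        return 0.8  # usable after coercion
    return 0.0  # incompatible


print(type_score("Html", "to_text"))        # exact match
print(type_score("Html", "to_text_batch"))  # Html wrapped into Html[]
print(type_score("Text", "fetch_batch"))    # incompatible
```

Because the final rank is a product of factors, a type score of 0.0 removes an operation from consideration regardless of the other factors.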

Context Relevance¶

Adjusts the score based on the operations already in the graph.

def context_score(graph: GraphDef, operation: str) -> float:
    # If entities already extracted, don't suggest again
    if has_operation(graph, "entities") and operation == "entities":
        return 0.1

    # If no synthesis yet, boost synthesis ops
    if not has_synthesis(graph) and operation in ["integrate", "to_report"]:
        return 1.2

    return 1.0
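
`GraphDef`, `has_operation`, and `has_synthesis` are not shown on this page. A minimal sketch, assuming a graph can be reduced to an ordered list of operation names; the `GraphDef` stand-in and `SYNTHESIS_OPS` are hypothetical:

```python
from dataclasses import dataclass, field

# Assumed set of synthesis operations, taken from the boost rule above.
SYNTHESIS_OPS = {"integrate", "to_report"}


@dataclass
class GraphDef:
    """Hypothetical stand-in: a graph reduced to its ordered operation names."""
    operations: list[str] = field(default_factory=list)


def has_operation(graph: GraphDef, operation: str) -> bool:
    return operation in graph.operations


def has_synthesis(graph: GraphDef) -> bool:
    return any(op in SYNTHESIS_OPS for op in graph.operations)


def context_score(graph: GraphDef, operation: str) -> float:
    # Entities already extracted: strongly demote a repeat.
    if has_operation(graph, "entities") and operation == "entities":
        return 0.1
    # No synthesis stage yet: boost synthesis operations.
    if not has_synthesis(graph) and operation in SYNTHESIS_OPS:
        return 1.2
    return 1.0


graph = GraphDef(operations=["fetch_batch", "to_text_batch", "entities"])
print(context_score(graph, "entities"))   # demoted repeat
print(context_score(graph, "integrate"))  # boosted synthesis
print(context_score(graph, "chunk"))      # neutral
```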

Intent Alignment¶

Adjusts the score based on the original intent behind the graph.

def intent_score(intent: str, operation: str) -> float:
    # Person research should prioritize entity extraction
    if intent == "person_research" and operation == "entities":
        return 1.3

    # Historical analysis should include archives
    if intent == "historical_analysis" and operation == "fetch_archive":
        return 1.3

    return 1.0
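
As more intents are added, the chain of `if` statements can be expressed as a table. A sketch containing only the two rules above; `INTENT_BOOSTS` is a hypothetical name:

```python
# (intent, operation) -> multiplier. Contains only the two rules shown above.
INTENT_BOOSTS = {
    ("person_research", "entities"): 1.3,
    ("historical_analysis", "fetch_archive"): 1.3,
}


def intent_score(intent: str, operation: str) -> float:
    """Return the boost for this intent/operation pair, or 1.0 if none applies."""
    return INTENT_BOOSTS.get((intent, operation), 1.0)


print(intent_score("person_research", "entities"))   # boosted pair
print(intent_score("person_research", "summarize"))  # no rule, neutral
```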

Final Score¶

def rank_suggestion(suggestion: Suggestion, context: Context) -> float:
    # Suggestion.score holds the base score from the suggestion mappings;
    # the three factors below scale it up or down.
    return (
        suggestion.score *
        type_score(context.output_type, suggestion.operation) *
        context_score(context.graph, suggestion.operation) *
        intent_score(context.intent, suggestion.operation)
    )
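
A worked example of the product, using base scores from the "After Text" mapping and the factor values defined above. The scenario (intent `person_research`, no entities stage yet, exact type match) is chosen for illustration, and `combine` is a hypothetical helper:

```python
def combine(base: float, type_s: float, context_s: float, intent_s: float) -> float:
    """Multiply the base score by the three factors, as in rank_suggestion."""
    return base * type_s * context_s * intent_s


# After Text, with intent "person_research" and no entities stage yet.
candidates = {
    "chunk": combine(0.90, 1.0, 1.0, 1.0),
    "entities": combine(0.85, 1.0, 1.0, 1.3),  # intent boost applies
    "summarize": combine(0.80, 1.0, 1.0, 1.0),
}

ranked = sorted(candidates, key=candidates.get, reverse=True)
for op in ranked:
    print(f"{op}: {candidates[op]:.3f}")
```

Here the intent boost lifts `entities` above `chunk` even though its base score is lower. The combined value can exceed 1.0, so it is best read as a ranking key rather than a probability.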

Interactive Usage¶

In Chat Interface¶

Agent: Found 42 URLs. What would you like to do next?

Suggestions:
1. [Recommended] Fetch content from URLs
2. Filter URLs by domain
3. Take screenshots
4. Other...

User: 1

Agent: Fetching content from 42 URLs...
[Stage added: fetch_batch]

Agent: Retrieved 38 pages. Next?

Suggestions:
1. [Recommended] Extract text from all pages
2. Convert all to markdown
3. Other...

In Code¶

from cbintel.chat import InteractivePipelineBuilder

builder = InteractivePipelineBuilder()

# After each operation
suggestions = builder.get_suggestions()

# Display to user
for i, s in enumerate(suggestions):
    print(f"{i+1}. {s.label}")
    print(f"   {s.description}")

# User selects
choice = int(input("Select: ")) - 1
builder.apply_suggestion(suggestions[choice])
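
The selection step above raises `ValueError` on non-numeric input, and entering `0` yields index `-1`, which silently selects the last suggestion. A small helper can validate the choice first; `parse_choice` is hypothetical and independent of `cbintel`:

```python
from typing import Optional


def parse_choice(raw: str, count: int) -> Optional[int]:
    """Convert 1-based user input to a 0-based index, or None if invalid."""
    try:
        number = int(raw.strip())
    except ValueError:
        return None
    if 1 <= number <= count:
        return number - 1
    return None


print(parse_choice("2", 3))    # valid choice
print(parse_choice("0", 3))    # out of range
print(parse_choice("abc", 3))  # not a number
```

A caller would re-prompt whenever the helper returns `None` and only then pass `suggestions[index]` to `builder.apply_suggestion`.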

Custom Suggestions¶

Register custom suggestion rules:

# Context and Suggestion are assumed to be importable from the same module.
from cbintel.chat import Context, Suggestion, register_suggestion_rule

@register_suggestion_rule(priority=10)
def suggest_geo_for_news(context: Context) -> list[Suggestion]:
    """Suggest geo routing for news research."""
    if context.intent == "news_aggregation":
        if not context.has_param("geo"):
            return [Suggestion(
                operation="set_param",
                params={"geo": "us"},
                score=0.8,
                reason="Add geographic routing for regional news"
            )]
    return []

Filtering Suggestions¶

By Category¶

suggestions = complete.suggest(
    last_output_type="Chunk[]",
    categories=["filter", "synthesize"]  # Only these categories
)

By Score Threshold¶

suggestions = complete.suggest(
    last_output_type="Chunk[]",
    min_score=0.7  # Only high-confidence suggestions
)

By Count¶

suggestions = complete.suggest(
    last_output_type="Chunk[]",
    limit=3  # Top 3 only
)
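
This page does not state whether `suggest` accepts these arguments together. The sketch below shows how the three filters could compose over a plain list; `Item` is a reduced, hypothetical stand-in for `Suggestion`, and the category labels are assumed:

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical, reduced stand-in for a Suggestion."""
    operation: str
    score: float
    category: str


def filter_suggestions(items, categories=None, min_score=0.0, limit=None):
    """Apply the category and score filters, sort by score, then truncate."""
    kept = [
        s for s in items
        if (categories is None or s.category in categories) and s.score >= min_score
    ]
    kept.sort(key=lambda s: s.score, reverse=True)
    return kept if limit is None else kept[:limit]


# Scores from the "After Chunk[]" mapping; the category labels are assumed.
after_chunks = [
    Item("semantic_filter", 0.90, "filter"),
    Item("quality_filter", 0.85, "filter"),
    Item("embed_batch", 0.80, "embed"),
    Item("integrate", 0.75, "synthesize"),
]

top = filter_suggestions(
    after_chunks, categories=["filter", "synthesize"], min_score=0.8, limit=3
)
print([s.operation for s in top])
```

Truncating last means `limit` always returns the highest-scoring survivors of the other two filters.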

Suggestion Object¶

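from dataclasses import dataclass, field

# Defaults on the last four fields are an assumption, so that custom
# rules can construct a Suggestion without the display fields.
@dataclass
class Suggestion:
    operation: str                              # Operation name
    score: float                                # Base relevance score (0-1), before ranking multipliers
    reason: str                                 # Human-readable reason
    category: str = ""                          # Operation category
    params: dict = field(default_factory=dict)  # Suggested parameters
    label: str = ""                             # Display label
    description: str = ""                       # Detailed description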

Configuration¶

Environment Variables¶

# Suggestion settings
AUTOCOMPLETE_MIN_SCORE=0.5
AUTOCOMPLETE_MAX_SUGGESTIONS=5
AUTOCOMPLETE_INCLUDE_DESCRIPTIONS=true
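
A sketch of how these variables might be read in Python. It assumes the values shown above are also the defaults, which this page does not confirm; `load_autocomplete_settings` is a hypothetical helper:

```python
import os


def load_autocomplete_settings(env=os.environ) -> dict:
    """Read the three settings, falling back to the values shown above."""
    raw_flag = env.get("AUTOCOMPLETE_INCLUDE_DESCRIPTIONS", "true")
    return {
        "min_score": float(env.get("AUTOCOMPLETE_MIN_SCORE", "0.5")),
        "max_suggestions": int(env.get("AUTOCOMPLETE_MAX_SUGGESTIONS", "5")),
        # Treat common truthy spellings as True; anything else is False.
        "include_descriptions": raw_flag.strip().lower() in {"1", "true", "yes"},
    }


# Pass a plain dict to try it without touching the real environment.
print(load_autocomplete_settings({"AUTOCOMPLETE_MIN_SCORE": "0.6"}))
```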

Customize Rankings¶

complete = AutoComplete(
    min_score=0.6,
    max_suggestions=5,
    boost_synthesis=True,  # Boost synthesis ops
    penalize_redundant=True  # Lower score for repeated ops
)

Best Practices¶

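Follow top suggestions: the highest-ranked option is usually the most appropriate next step.
Consider context: a lower-ranked suggestion may fit a specific case better.
Check reasons: read each suggestion's reason to understand why it was offered.
Complete the pipeline: finish with a synthesis step such as integrate or to_report.
Review before execution: validate the full graph before running it.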