
Roadmap

Future development plans and vision for cbintel.

Current State (v1.14.0)

Completed Features

Core Infrastructure:

- Core primitives (AI, HTTP, browser, archive, vectors)
- Graph execution engine with comprehensive type system
- Jobs API with 7 worker types
- VPN cluster management (16 OpenWRT workers)
- Tor gateway integration
- Workspace management with artifact tracking
- Index API integration

Graph Operations (67 total):

- Discovery: search, archive_discover, youtube_search, tor_search
- Acquisition: fetch, fetch_archive, screenshot, tor_fetch
- Transform: to_markdown, to_text, translate, ocr
- Processing: chunk, embed, embed_batch
- Filtering: semantic_filter, quality_filter, filter_urls
- Extraction: entities, topics, summarize, geocode, reverse_geocode
- Storage: store_vectors, store_entities
- Synthesis: integrate, chat, to_report
- Document: analyze_document, process_document, detect_doctype
- Geo/News: correlate_news_geo, aggregate_district_news, generate_map

Scheduler System:

- Cron-based job scheduling
- Redis-backed schedule storage
- Full CRUD API (/api/v1/scheduler/schedules)
- Tick processor with job spawning
- Manual trigger support
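To illustrate how a tick processor can decide which cron schedules are due, here is a minimal sketch. It is not cbintel's scheduler: the function names and the in-memory `schedules` mapping are assumptions for illustration, storage (Redis in cbintel) is left out, and the matcher is simplified. It handles only `*`, `*/n`, and comma-separated integers, and it requires day-of-month and day-of-week to both match.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field: '*', '*/n', or comma-separated integers."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            # Simplified step handling: fires when value is a multiple of n.
            if value % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


def is_due(cron: str, now: datetime) -> bool:
    """Return True if the five-field cron expression matches `now`."""
    minute, hour, dom, month, dow = cron.split()
    cron_dow = (now.weekday() + 1) % 7  # cron counts Sunday as 0
    return (
        field_matches(minute, now.minute)
        and field_matches(hour, now.hour)
        and field_matches(dom, now.day)
        and field_matches(month, now.month)
        and field_matches(dow, cron_dow)
    )


def tick(schedules: dict[str, str], now: datetime) -> list[str]:
    """Return IDs of schedules that should spawn a job at `now`."""
    return [sid for sid, cron in schedules.items() if is_due(cron, now)]


# Hypothetical schedule IDs mapped to cron expressions.
SCHEDULES = {
    "hourly": "0 * * * *",
    "every15": "*/15 * * * *",
    "monday9": "0 9 * * 1",
}
```

Calling `tick(SCHEDULES, datetime.now())` once per minute would yield the schedules whose jobs should be spawned for that minute.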

Chat→Graph Pipeline:

- Intent classification (AI + regex fallback)
- Parameter extraction from natural language
- Query expansion for better coverage
- Graph template selection
- ResearchAgent with plan + execute
- Type-based auto-complete suggestions
- Natural language type resolution
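The "AI + regex fallback" pattern for intent classification can be sketched as follows. The intent names, regex patterns, and the `ai_classifier` callable are all hypothetical; the sketch shows only the fallback structure, not cbintel's actual classifier.

```python
import re
from typing import Callable, Optional

# Hypothetical intents and patterns, checked in order.
FALLBACK_PATTERNS = [
    ("summarize", re.compile(r"\b(summari[sz]e|tl;?dr)\b", re.IGNORECASE)),
    ("search", re.compile(r"\b(search|find|look up)\b", re.IGNORECASE)),
    ("fetch", re.compile(r"https?://\S+", re.IGNORECASE)),
]


def classify_with_regex(message: str) -> str:
    """Return the first intent whose pattern matches, else 'unknown'."""
    for intent, pattern in FALLBACK_PATTERNS:
        if pattern.search(message):
            return intent
    return "unknown"


def classify_intent(
    message: str,
    ai_classifier: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Try the AI classifier first; fall back to regex on failure or no answer."""
    if ai_classifier is not None:
        try:
            intent = ai_classifier(message)
            if intent:
                return intent
        except Exception:
            # Any model or network failure falls through to the regex path.
            pass
    return classify_with_regex(message)
```

The fallback keeps the pipeline usable when the model is unavailable, at the cost of coarser classification.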

Type System:

- Complete type hierarchy (primitives, domain, collections)
- Compatibility matrix with coercion rules
- Filter expression grammar (EBNF)
- Runtime type validation
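A compatibility matrix with coercion rules can be represented as a lookup from (source type, target type) pairs to coercion functions. The sketch below takes that approach. Apart from `Text`, which appears elsewhere on this page, the type names and rules are assumptions and do not reflect cbintel's actual matrix.

```python
from typing import Any, Callable

Coercion = Callable[[Any], Any]

# Hypothetical matrix: each key is an allowed (source, target) connection.
COMPATIBILITY: dict[tuple[str, str], Coercion] = {
    ("Url", "Text"): str,
    ("Text", "TextList"): lambda value: [value],
    ("Url", "UrlList"): lambda value: [value],
}


def is_compatible(source: str, target: str) -> bool:
    """A connection is valid if types are equal or a coercion rule exists."""
    return source == target or (source, target) in COMPATIBILITY


def coerce(value: Any, source: str, target: str) -> Any:
    """Convert `value` from `source` to `target`, or raise TypeError."""
    if source == target:
        return value
    rule = COMPATIBILITY.get((source, target))
    if rule is None:
        raise TypeError(f"cannot connect {source} output to {target} input")
    return rule(value)
```

Checking `is_compatible` when an edge is created, and calling `coerce` when data flows across it, keeps validation and conversion driven by the same table.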


Open Issues

Current GitHub issues requiring attention:

| Issue | Title | Priority |
|-------|-------|----------|
| #18 | youtube_search slice indices error | Bug - High |
| #21 | Fix failing domain fetch tests | Test reliability |
| #22 | Fix failing edge case tests | Test reliability |
| #20 | cbintel-doc CLI for document processing | Feature |
| #19 | Document Intelligence Pipeline (Agent 3) | Feature |
| #17 | Production Agent Deployment guide | Docs |
| #16 | Agent Composition (atoms/molecules) | Feature |

Near-Term (v1.15.0 - v1.17.0)

v1.15.0: Polish & Stability

Bug Fixes:

- Fix youtube_search slice indices error (#18)
- Stabilize integration tests (#21, #22)

CLI Tooling:

- cbintel-doc CLI for document processing (#20)
- analyze, ocr, entities, summarize, topics, chunk
- Consistent with cbintel-workspace patterns

Documentation:

- Production deployment guide (#17)
- Agent composition patterns (#16)

v1.16.0: Document Intelligence

Document Pipeline (Agent 3):

- Multi-format ingestion (PDF, images, Office docs)
- AI-powered format detection and routing
- OCR with layout preservation
- Table extraction
- Workspace integration for processed documents

flowchart LR
    UPLOAD[Upload] --> DETECT[Detect Type]
    DETECT --> PDF[PDF Parser]
    DETECT --> IMG[OCR Pipeline]
    DETECT --> OFFICE[Office Parser]
    PDF & IMG & OFFICE --> EXTRACT[Entity Extraction]
    EXTRACT --> INDEX[Vector Index]
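The detect-and-route step in the flowchart can be sketched as a lookup from file type to parser. The roadmap calls for AI-powered detection; this sketch substitutes a plain file-extension lookup, and the route names and extension list are assumptions for illustration.

```python
from pathlib import Path

# Hypothetical route names mirroring the three branches of the flowchart.
ROUTES = {
    ".pdf": "pdf_parser",
    ".png": "ocr_pipeline",
    ".jpg": "ocr_pipeline",
    ".jpeg": "ocr_pipeline",
    ".tiff": "ocr_pipeline",
    ".docx": "office_parser",
    ".xlsx": "office_parser",
    ".pptx": "office_parser",
}


def detect_route(filename: str) -> str:
    """Pick a parser for an uploaded file based on its extension."""
    suffix = Path(filename).suffix.lower()
    route = ROUTES.get(suffix)
    if route is None:
        raise ValueError(f"unsupported document type: {suffix or filename!r}")
    return route
```

An extension lookup is cheap but easy to fool, which is one reason content-based detection is attractive for a production pipeline.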

v1.17.0: Agent API

Features:

- REST API endpoints for ResearchAgent
- Streaming progress updates
- Session management for multi-turn conversations

Endpoints:

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /api/v1/agent/plan | POST | Plan research without executing |
| /api/v1/agent/research | POST | Execute research query |
| /api/v1/agent/sessions | GET/POST | Manage conversation sessions |
| /api/v1/agent/sessions/{id}/message | POST | Send message to session |
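As a rough idea of how a client might call these planned endpoints, the sketch below builds POST requests with the standard library without sending them. The base URL and the JSON payload fields (`query`, `message`) are assumptions; the request schema has not been published here.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumed host and port


def build_agent_request(path: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST request for an agent endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        f"{BASE_URL}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payloads: the field names are placeholders.
plan_request = build_agent_request(
    "/api/v1/agent/plan", {"query": "AI safety trends"}
)
message_request = build_agent_request(
    "/api/v1/agent/sessions/abc123/message", {"message": "Narrow to policy"}
)
```

Passing one of these objects to `urllib.request.urlopen` would send it once the endpoints exist.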


Mid-Term (v2.0.0 - v2.5.0)

Graph Builder UI

Visual graph construction interface (#13).

graph LR
    CANVAS[Visual Canvas] --> NODES[Drag-Drop Operations]
    NODES --> CONNECT[Connect Nodes]
    CONNECT --> VALIDATE[Real-time Validation]
    VALIDATE --> EXPORT[Export to YAML]

Features:

- Drag-and-drop operation nodes
- Visual connection of inputs/outputs
- Real-time type validation
- YAML import/export
- Template library

Custom Operation Plugins

User-defined operations (#14).

from cbintel.graph import register_operation

@register_operation("my_custom_op", output_type="Text")
async def my_custom_op(ctx, inputs, params):
    # Custom logic goes here; `process` stands in for user-supplied code
    result = await process(inputs)
    return result

Features:

- Operation registration API
- Plugin package format
- Marketplace for community operations
- Versioning and dependencies

Connector System

External service integrations (#15).

Planned Connectors:

- Slack/Discord notifications
- Email delivery
- S3/GCS storage backends
- Webhook triggers

Enhanced Monitoring

Prometheus/Grafana integration.

Metrics:

cbintel_jobs_processed_total{type="crawl"} 500
cbintel_jobs_failed_total{type="crawl"} 12
cbintel_job_duration_seconds{type="crawl"} 245.3
cbintel_queue_depth{type="crawl"} 5
cbintel_vpn_workers_active 12
cbintel_tor_circuits_total 45

Dashboards:

- Job throughput and latency
- Worker health and utilization
- Queue depth and processing rates
- Error rates and types
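The sample metrics above follow the Prometheus text exposition format. A minimal sketch of producing such lines by hand is shown below. The `render_metric` helper is hypothetical, and a real integration would more likely use a Prometheus client library than format strings directly.

```python
from typing import Optional, Union


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline in a label value."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_metric(
    name: str,
    value: Union[int, float],
    labels: Optional[dict[str, str]] = None,
) -> str:
    """Render one sample line, e.g. name{label="x"} 5."""
    if labels:
        label_text = ",".join(
            f'{key}="{escape_label_value(val)}"'
            for key, val in sorted(labels.items())
        )
        return f"{name}{{{label_text}}} {value}"
    return f"{name} {value}"


def render_all(samples: list[tuple[str, Union[int, float], Optional[dict]]]) -> str:
    """Join several samples into a newline-terminated exposition body."""
    return "".join(render_metric(*sample) + "\n" for sample in samples)
```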

Multi-Tenant Workspaces

Team collaboration features.

Features:

- Workspace sharing
- Role-based access control
- Team activity logs
- Collaborative annotations

Graph Versioning

Version control for graphs.

Features:

- Git-like versioning
- Diff between versions
- Rollback support
- Branch and merge
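Since graphs are exported as YAML, a text diff is the simplest form a "diff between versions" feature could take. The sketch below uses Python's `difflib`. The YAML layout in the example is invented for illustration, and a structural, node-aware diff may well be what eventually ships.

```python
import difflib


def diff_graph_versions(
    old_yaml: str,
    new_yaml: str,
    old_label: str = "v1",
    new_label: str = "v2",
) -> str:
    """Return a unified diff between two YAML graph definitions."""
    lines = difflib.unified_diff(
        old_yaml.splitlines(keepends=True),
        new_yaml.splitlines(keepends=True),
        fromfile=old_label,
        tofile=new_label,
    )
    return "".join(lines)


# Hypothetical graph definitions; the schema is an assumption.
OLD_GRAPH = "stages:\n  - op: search\n  - op: fetch\n"
NEW_GRAPH = "stages:\n  - op: search\n  - op: fetch_archive\n"
```

Printing `diff_graph_versions(OLD_GRAPH, NEW_GRAPH)` shows the `fetch` stage removed and `fetch_archive` added. Identical inputs produce an empty string.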

Streaming Results

Real-time partial results.

async for partial in executor.run_streaming(graph, params):
    print(f"Stage {partial.stage}: {partial.status}")
    if partial.output:
        display_partial(partial.output)
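The producer side of the consumer loop above could be an async generator that yields a status update before and after each stage. This sketch is an assumption about the shape of such an API: `PartialResult`, the stage tuples, and the demo stages are hypothetical, and stages simply run in sequence.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator, Awaitable, Callable, Optional

Stage = tuple[str, Callable[[Any], Awaitable[Any]]]


@dataclass
class PartialResult:
    stage: str
    status: str
    output: Optional[Any] = None


async def run_streaming(stages: list[Stage], initial: Any) -> AsyncIterator[PartialResult]:
    """Run stages in order, yielding progress as each one starts and ends."""
    value = initial
    for name, stage in stages:
        yield PartialResult(name, "running")
        value = await stage(value)
        yield PartialResult(name, "completed", value)


# Demo stages standing in for real graph operations.
async def search(query: str) -> list[str]:
    return [f"{query} hit"]


async def summarize(hits: list[str]) -> str:
    return f"{len(hits)} hit(s)"


DEMO_STAGES: list[Stage] = [("search", search), ("summarize", summarize)]


async def collect_partials(stages: list[Stage], initial: Any) -> list[PartialResult]:
    """Drain the stream into a list (useful for tests and batch callers)."""
    return [partial async for partial in run_streaming(stages, initial)]
```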

Long-Term Vision (v3.0.0+)

Distributed Execution

Multi-node graph processing.

graph TB
    COORD[Coordinator] --> W1[Worker Node 1]
    COORD --> W2[Worker Node 2]
    COORD --> W3[Worker Node 3]

    W1 --> SHARE[(Shared State)]
    W2 --> SHARE
    W3 --> SHARE

Features:

- Horizontal scaling
- Distributed state management
- Auto-scaling based on queue depth
- Geographic distribution
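Auto-scaling from queue depth usually reduces to a clamped division. The sketch below shows one such policy. The function name and the defaults (5 jobs per worker, between 1 and 16 workers) are illustrative assumptions rather than planned values.

```python
import math


def desired_workers(
    queue_depth: int,
    jobs_per_worker: int = 5,
    min_workers: int = 1,
    max_workers: int = 16,
) -> int:
    """Return how many workers to run for the current queue depth."""
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    needed = math.ceil(queue_depth / jobs_per_worker)
    # Clamp so the pool never drops below the floor or exceeds the ceiling.
    return max(min_workers, min(max_workers, needed))
```

A real policy would typically add a cooldown or hysteresis so the pool does not oscillate when the queue hovers around a threshold.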

Knowledge Graphs

Entity relationship mapping.

graph LR
    P1[Person A] -->|works_for| O1[Org 1]
    P1 -->|knows| P2[Person B]
    P2 -->|works_for| O2[Org 2]
    O1 -->|partner_of| O2

Features:

- Entity relationship extraction
- Graph database storage (Neo4j)
- Relationship querying
- Visualization
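The relationships in the diagram above can be expressed as (subject, relation, object) triples. The roadmap names Neo4j for storage; the sketch below substitutes an in-memory index to show what relationship querying means, and all function names are hypothetical.

```python
from collections import defaultdict

Triple = tuple[str, str, str]

# The edges from the diagram above.
TRIPLES: list[Triple] = [
    ("Person A", "works_for", "Org 1"),
    ("Person A", "knows", "Person B"),
    ("Person B", "works_for", "Org 2"),
    ("Org 1", "partner_of", "Org 2"),
]


def build_index(triples: list[Triple]) -> dict[tuple[str, str], list[str]]:
    """Index objects by (subject, relation) for direct lookups."""
    index: dict[tuple[str, str], list[str]] = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject, relation)].append(obj)
    return dict(index)


def related(index: dict, subject: str, relation: str) -> list[str]:
    """Return every object linked to `subject` by `relation`."""
    return index.get((subject, relation), [])


def employers_of_contacts(index: dict, person: str) -> list[str]:
    """Two-hop query: organizations employing people that `person` knows."""
    return [
        org
        for contact in related(index, person, "knows")
        for org in related(index, contact, "works_for")
    ]
```

Multi-hop queries like `employers_of_contacts` are where a graph database earns its keep, since it can traverse edges without hand-written joins.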

Adversarial Detection

Disinformation and manipulation detection.

Features:

- Content authenticity scoring
- Narrative tracking
- Source credibility assessment
- Bot network detection

Federation

Cross-instance sharing.

Features:

- Federated workspace sharing
- Cross-organization collaboration
- Privacy-preserving search
- Selective data sharing


Research Areas

Autonomous Agents

Goal-directed research agents.

agent = AutonomousAgent(
    goal="Research and report on AI safety trends",
    constraints={
        "max_time": "1 hour",
        "max_cost": "$10",
        "focus_areas": ["technical", "policy"]
    }
)

result = await agent.run()

Features:

- Goal decomposition
- Strategy selection
- Progress monitoring
- Self-correction

Citation Graphs

Source attribution and verification.

Features:

- Citation extraction
- Source chain tracking
- Claim verification
- Bibliography generation


Contributing

How to Contribute

  1. Check open issues on GitHub
  2. Discuss approach in issue comments
  3. Submit PR with tests
  4. Follow code style guidelines (ruff, pre-commit)

Priority Areas

| Area | Priority | Skills Needed |
|------|----------|---------------|
| Bug fixes (#18, #21, #22) | High | Python, pytest |
| Document CLI (#20) | High | Python, typer |
| Graph UI (#13) | Medium | React, TypeScript |
| Monitoring | Medium | Prometheus, Grafana |
| Plugins (#14) | Medium | Python, packaging |

Contact