Roadmap¶
Future development plans and vision for cbintel.
Current State (v1.14.0)¶
Completed Features¶
Core Infrastructure:
- Core primitives (AI, HTTP, browser, archive, vectors)
- Graph execution engine with comprehensive type system
- Jobs API with 7 worker types
- VPN cluster management (16 OpenWRT workers)
- Tor gateway integration
- Workspace management with artifact tracking
- Index API integration
Graph Operations (67 total):
- Discovery: search, archive_discover, youtube_search, tor_search
- Acquisition: fetch, fetch_archive, screenshot, tor_fetch
- Transform: to_markdown, to_text, translate, ocr
- Processing: chunk, embed, embed_batch
- Filtering: semantic_filter, quality_filter, filter_urls
- Extraction: entities, topics, summarize, geocode, reverse_geocode
- Storage: store_vectors, store_entities
- Synthesis: integrate, chat, to_report
- Document: analyze_document, process_document, detect_doctype
- Geo/News: correlate_news_geo, aggregate_district_news, generate_map
Scheduler System:
- Cron-based job scheduling
- Redis-backed schedule storage
- Full CRUD API (/api/v1/scheduler/schedules)
- Tick processor with job spawning
- Manual trigger support
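The tick processor listed above can be pictured with a small sketch. This is an illustration under stated assumptions, not cbintel's implementation: the `Schedule` class, `is_due`, and `due_schedules` are hypothetical names, an in-memory list stands in for Redis, and the cron matcher is deliberately simplified (it supports only `*`, `*/n`, and comma-separated numbers, and requires all five fields to match).

```python
from dataclasses import dataclass
from datetime import datetime


def field_matches(expr: str, value: int, low: int = 0) -> bool:
    """Match one cron field. Supports '*', '*/n', and comma-separated integers."""
    for part in expr.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            # Steps count from the field's lowest legal value (0 or 1).
            if (value - low) % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


@dataclass
class Schedule:  # hypothetical shape of a stored schedule
    name: str
    cron: str  # "minute hour day-of-month month day-of-week"


def is_due(schedule: Schedule, now: datetime) -> bool:
    minute, hour, dom, month, dow = schedule.cron.split()
    cron_dow = now.isoweekday() % 7  # cron: 0 = Sunday; isoweekday(): 7 = Sunday
    return (
        field_matches(minute, now.minute)
        and field_matches(hour, now.hour)
        and field_matches(dom, now.day, low=1)
        and field_matches(month, now.month, low=1)
        and field_matches(dow, cron_dow)
    )


def due_schedules(schedules, now: datetime):
    """One tick: return the names of schedules whose jobs should be spawned."""
    return [s.name for s in schedules if is_due(s, now)]
```

A real tick loop would call something like `due_schedules` once per minute and hand each match to the Jobs API.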
Chat→Graph Pipeline:
- Intent classification (AI + regex fallback)
- Parameter extraction from natural language
- Query expansion for better coverage
- Graph template selection
- ResearchAgent with plan + execute
- Type-based auto-complete suggestions
- Natural language type resolution
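The "AI + regex fallback" pattern for intent classification can be sketched as below. The intent labels, the patterns, and the `classify_intent` signature are all assumptions made for illustration; the actual classifier in cbintel may look quite different.

```python
import re
from typing import Callable, Optional

# Hypothetical intents and patterns, for illustration only.
FALLBACK_PATTERNS = [
    ("summarize", re.compile(r"\b(summari[sz]e|tl;?dr)\b", re.IGNORECASE)),
    ("search", re.compile(r"\b(find|search|look up)\b", re.IGNORECASE)),
]


def classify_intent(
    message: str,
    ai_classifier: Optional[Callable[[str], str]] = None,
) -> str:
    """Try the AI classifier first; fall back to regex rules if it is absent or fails."""
    if ai_classifier is not None:
        try:
            return ai_classifier(message)
        except Exception:
            pass  # any classifier failure falls through to the regex rules
    for intent, pattern in FALLBACK_PATTERNS:
        if pattern.search(message):
            return intent
    return "unknown"
```

The fallback keeps the chat pipeline usable when the model call times out or is unavailable, at the cost of coarser intent detection.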
Type System:
- Complete type hierarchy (primitives, domain, collections)
- Compatibility matrix with coercion rules
- Filter expression grammar (EBNF)
- Runtime type validation
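A compatibility matrix with coercion rules could be represented roughly as follows. The type names (`Html`, `Markdown`, `Text`, `Url`) and the mapping itself are assumptions for illustration; only the operation names `to_text` and `to_markdown` come from the operation list above.

```python
from typing import Optional, Tuple

# Hypothetical matrix: (output type, input type) -> operation that coerces between them.
COERCIONS = {
    ("Html", "Text"): "to_text",
    ("Html", "Markdown"): "to_markdown",
    ("Markdown", "Text"): "to_text",
}


def check_connection(output_type: str, input_type: str) -> Tuple[bool, Optional[str]]:
    """Return (compatible, coercion). Coercion is None when the types match exactly."""
    if output_type == input_type:
        return True, None
    coercion = COERCIONS.get((output_type, input_type))
    return coercion is not None, coercion
```

For example, `check_connection("Html", "Text")` reports a compatible edge that needs a `to_text` step, while `check_connection("Text", "Url")` is rejected.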
Open Issues¶
Current GitHub issues requiring attention:
| Issue | Title | Type / Priority |
|---|---|---|
| #18 | youtube_search slice indices error | Bug - High |
| #21 | Fix failing domain fetch tests | Test reliability |
| #22 | Fix failing edge case tests | Test reliability |
| #20 | cbintel-doc CLI for document processing | Feature |
| #19 | Document Intelligence Pipeline (Agent 3) | Feature |
| #17 | Production Agent Deployment guide | Docs |
| #16 | Agent Composition (atoms/molecules) | Feature |
Near-Term (v1.15.0 - v1.17.0)¶
v1.15.0: Polish & Stability¶
Bug Fixes:
- Fix youtube_search slice indices error (#18)
- Stabilize integration tests (#21, #22)
CLI Tooling:
- cbintel-doc CLI for document processing (#20)
  - Subcommands: analyze, ocr, entities, summarize, topics, chunk
  - Consistent with cbintel-workspace patterns
Documentation:
- Production deployment guide (#17)
- Agent composition patterns (#16)
v1.16.0: Document Intelligence¶
Document Pipeline (Agent 3):
- Multi-format ingestion (PDF, images, Office docs)
- AI-powered format detection and routing
- OCR with layout preservation
- Table extraction
- Workspace integration for processed documents
flowchart LR
UPLOAD[Upload] --> DETECT[Detect Type]
DETECT --> PDF[PDF Parser]
DETECT --> IMG[OCR Pipeline]
DETECT --> OFFICE[Office Parser]
PDF & IMG & OFFICE --> EXTRACT[Entity Extraction]
EXTRACT --> INDEX[Vector Index]
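The detect-and-route step in the flowchart can be sketched as follows. Note that this sketch swaps the planned AI-powered detection for plain file-signature sniffing, and the `sniff_doctype` and `route` helpers are hypothetical names, not part of cbintel.

```python
def sniff_doctype(data: bytes) -> str:
    """Guess a document family from leading magic bytes."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"\x89PNG\r\n\x1a\n") or data.startswith(b"\xff\xd8\xff"):
        return "image"  # PNG or JPEG
    if data.startswith(b"PK\x03\x04"):
        # Modern Office files are ZIP containers; any other ZIP matches too.
        return "office"
    return "unknown"


# Branch names mirror the flowchart above.
ROUTES = {
    "pdf": "PDF Parser",
    "image": "OCR Pipeline",
    "office": "Office Parser",
}


def route(data: bytes) -> str:
    """Pick the pipeline branch for an uploaded file."""
    return ROUTES.get(sniff_doctype(data), "unsupported")
```

Signature sniffing is cheap and deterministic, so it could serve as a first pass before a model handles ambiguous uploads.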
v1.17.0: Agent API¶
Features:
- REST API endpoints for ResearchAgent
- Streaming progress updates
- Session management for multi-turn conversations
Endpoints:
| Endpoint | Method | Purpose |
|----------|--------|---------|
| /api/v1/agent/plan | POST | Plan research without executing |
| /api/v1/agent/research | POST | Execute research query |
| /api/v1/agent/sessions | GET/POST | Manage conversation sessions |
| /api/v1/agent/sessions/{id}/message | POST | Send message to session |
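A client call to the planned `/api/v1/agent/plan` endpoint might be built like this. The endpoints are not implemented yet, so the request body is a guess: the `query` field and the `BASE_URL` value are assumptions. The sketch only constructs the request and does not send it.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: local development server


def build_plan_request(query: str) -> Request:
    """Build a POST request that asks the agent to plan without executing."""
    body = json.dumps({"query": query}).encode("utf-8")  # "query" is a hypothetical field
    return Request(
        f"{BASE_URL}/api/v1/agent/plan",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it once the endpoint exists.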
Mid-Term (v2.0.0 - v2.5.0)¶
Graph Builder UI¶
Visual graph construction interface (#13).
graph LR
CANVAS[Visual Canvas] --> NODES[Drag-Drop Operations]
NODES --> CONNECT[Connect Nodes]
CONNECT --> VALIDATE[Real-time Validation]
VALIDATE --> EXPORT[Export to YAML]
Features:
- Drag-and-drop operation nodes
- Visual connection of inputs/outputs
- Real-time type validation
- YAML import/export
- Template library
Custom Operation Plugins¶
User-defined operations (#14).
from cbintel.graph import register_operation

@register_operation("my_custom_op", output_type="Text")
async def my_custom_op(ctx, inputs, params):
    # Custom logic goes here; process() is a placeholder for user code
    result = await process(inputs)
    return result
Features:
- Operation registration API
- Plugin package format
- Marketplace for community operations
- Versioning and dependencies
Connector System¶
External service integrations (#15).
Planned Connectors:
- Slack/Discord notifications
- Email delivery
- S3/GCS storage backends
- Webhook triggers
Enhanced Monitoring¶
Prometheus/Grafana integration.
Metrics:
cbintel_jobs_processed_total{type="crawl"} 500
cbintel_jobs_failed_total{type="crawl"} 12
cbintel_job_duration_seconds{type="crawl"} 245.3
cbintel_queue_depth{type="crawl"} 5
cbintel_vpn_workers_active 12
cbintel_tor_circuits_total 45
Dashboards:
- Job throughput and latency
- Worker health and utilization
- Queue depth and processing rates
- Error rates and types
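The sample lines above follow the Prometheus text exposition format. A minimal sketch of rendering such lines and deriving an error rate from the two counters is shown below; `render_metric` and `error_rate` are hypothetical helpers, and label values are not escaped here, which a real exporter would need to handle.

```python
def render_metric(name: str, value, labels: dict = None) -> str:
    """Render one sample line in Prometheus text exposition format."""
    if labels:
        pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        return f"{name}{{{pairs}}} {value}"
    return f"{name} {value}"


def error_rate(failed: int, processed: int) -> float:
    """Fraction of processed jobs that failed; 0.0 when nothing was processed."""
    return failed / processed if processed else 0.0
```

With the sample values above, 12 failures out of 500 processed crawl jobs give an error rate of 2.4%.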
Multi-Tenant Workspaces¶
Team collaboration features.
Features:
- Workspace sharing
- Role-based access control
- Team activity logs
- Collaborative annotations
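Role-based access control for shared workspaces could start from a simple role-to-permission mapping. The roles, actions, and the `is_allowed` helper below are assumptions for illustration; the roadmap does not specify a permission model.

```python
# Hypothetical roles and the actions each one grants.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "editor": {"read", "annotate", "write"},
    "owner": {"read", "annotate", "write", "share"},
}


def is_allowed(members: dict, user: str, action: str) -> bool:
    """Check whether a workspace member's role grants the requested action."""
    role = members.get(user)
    return role is not None and action in ROLE_PERMISSIONS.get(role, set())
```

Here `members` maps user names to roles for a single workspace, so non-members are denied by default.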
Graph Versioning¶
Version control for graphs.
Features:
- Git-like versioning
- Diff between versions
- Rollback support
- Branch and merge
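Since graphs can be exported as YAML, a diff between two versions could be as simple as a line-based unified diff. The sketch below uses Python's standard `difflib`; the `diff_versions` helper, the version labels, and the YAML layout in the usage note are assumptions.

```python
import difflib


def diff_versions(old: str, new: str, name: str = "graph.yaml"):
    """Return unified-diff lines between two versions of a graph definition."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"{name}@old",
            tofile=f"{name}@new",
            lineterm="",
        )
    )
```

Comparing a graph whose second stage changes from `fetch` to `tor_fetch` would yield one removed line and one added line, which is enough to review a change or to drive a rollback.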
Streaming Results¶
Real-time partial results.
async for partial in executor.run_streaming(graph, params):
    print(f"Stage {partial.stage}: {partial.status}")
    if partial.output:
        display_partial(partial.output)
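On the producer side, streaming results map naturally onto an async generator. The sketch below is an assumption about how such an executor could be structured, not the planned implementation: `PartialResult`, the standalone `run_streaming` function, and the stage list format are hypothetical.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator, Optional


@dataclass
class PartialResult:  # hypothetical shape, mirroring the attributes used above
    stage: str
    status: str
    output: Optional[Any] = None


async def run_streaming(stages) -> AsyncIterator[PartialResult]:
    """Run (name, coroutine function) stages in order, yielding progress as it happens."""
    for name, func in stages:
        yield PartialResult(name, "running")
        output = await func()
        yield PartialResult(name, "done", output)


async def collect():
    """Drive the generator with two toy stages and record each event."""
    async def search():
        return ["https://example.com"]

    async def fetch():
        return "<html>...</html>"

    stages = [("search", search), ("fetch", fetch)]
    return [(p.stage, p.status) async for p in run_streaming(stages)]
```

Running `asyncio.run(collect())` produces a "running" event followed by a "done" event for each stage, in order.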
Long-Term Vision (v3.0.0+)¶
Distributed Execution¶
Multi-node graph processing.
graph TB
COORD[Coordinator] --> W1[Worker Node 1]
COORD --> W2[Worker Node 2]
COORD --> W3[Worker Node 3]
W1 --> SHARE[(Shared State)]
W2 --> SHARE
W3 --> SHARE
Features:
- Horizontal scaling
- Distributed state management
- Auto-scaling based on queue depth
- Geographic distribution
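Auto-scaling based on queue depth usually reduces to a small sizing rule. The sketch below is one possible rule under assumed parameters: `desired_workers`, the target of 5 queued jobs per worker, and the bounds of 1 and 16 workers are all illustrative values, not project settings.

```python
import math


def desired_workers(
    queue_depth: int,
    jobs_per_worker: int = 5,
    min_workers: int = 1,
    max_workers: int = 16,
) -> int:
    """Size the pool so each worker has at most jobs_per_worker queued jobs, within bounds."""
    needed = math.ceil(queue_depth / jobs_per_worker)
    return max(min_workers, min(max_workers, needed))
```

A production policy would typically add a cooldown or smoothing window so the pool does not oscillate when the queue depth fluctuates.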
Knowledge Graphs¶
Entity relationship mapping.
graph LR
P1[Person A] -->|works_for| O1[Org 1]
P1 -->|knows| P2[Person B]
P2 -->|works_for| O2[Org 2]
O1 -->|partner_of| O2
Features:
- Entity relationship extraction
- Graph database storage (Neo4j)
- Relationship querying
- Visualization
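Relationship querying can be illustrated with the entities from the diagram above stored as subject-predicate-object triples. This in-memory list is only a stand-in for a graph database such as Neo4j, and the `query` helper is hypothetical.

```python
# The relationships from the diagram, as (subject, predicate, object) triples.
TRIPLES = [
    ("Person A", "works_for", "Org 1"),
    ("Person A", "knows", "Person B"),
    ("Person B", "works_for", "Org 2"),
    ("Org 1", "partner_of", "Org 2"),
]


def query(subject=None, predicate=None, obj=None):
    """Return triples matching every field that is not None (None acts as a wildcard)."""
    return [
        triple
        for triple in TRIPLES
        if (subject is None or triple[0] == subject)
        and (predicate is None or triple[1] == predicate)
        and (obj is None or triple[2] == obj)
    ]
```

For instance, `query(predicate="works_for")` lists both employment relationships, and `query(obj="Org 2")` finds everything that points at Org 2.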
Adversarial Detection¶
Disinformation and manipulation detection.
Features:
- Content authenticity scoring
- Narrative tracking
- Source credibility assessment
- Bot network detection
Federation¶
Cross-instance sharing.
Features:
- Federated workspace sharing
- Cross-organization collaboration
- Privacy-preserving search
- Selective data sharing
Research Areas¶
Autonomous Agents¶
Goal-directed research agents.
agent = AutonomousAgent(
    goal="Research and report on AI safety trends",
    constraints={
        "max_time": "1 hour",
        "max_cost": "$10",
        "focus_areas": ["technical", "policy"],
    },
)
result = await agent.run()
Features:
- Goal decomposition
- Strategy selection
- Progress monitoring
- Self-correction
Citation Graphs¶
Source attribution and verification.
Features:
- Citation extraction
- Source chain tracking
- Claim verification
- Bibliography generation
Contributing¶
How to Contribute¶
- Check open issues on GitHub
- Discuss approach in issue comments
- Submit PR with tests
- Follow code style guidelines (ruff, pre-commit)
Priority Areas¶
| Area | Priority | Skills Needed |
|---|---|---|
| Bug fixes (#18, #21, #22) | High | Python, pytest |
| Document CLI (#20) | High | Python, typer |
| Graph UI (#13) | Medium | React, TypeScript |
| Monitoring | Medium | Prometheus, Grafana |
| Plugins (#14) | Medium | Python, packaging |
Contact¶
- GitHub Issues: github.com/nominate/cbintel/issues
- Documentation PRs welcome
- Feature requests via issues