Directory Structure
This document describes the package layout of cbintel's source code.
Top-Level Structure
src/cbintel/
├── __init__.py # Package exports
├── logging.py # Logging configuration
├── auth.py # Authentication utilities
├── service.py # Service management
│
├── ai/ # AI client wrappers
├── net/ # Network operations
├── io/ # File/process I/O
├── cluster/ # VPN cluster management
├── geo/ # Geographic routing
├── tor/ # Tor gateway client
├── ferret/ # Browser automation (SWRM)
├── crawl/ # AI-powered crawling
├── lazarus/ # Historical archives
├── vectl/ # Vector storage
├── screenshots/ # Browser screenshots
├── transcript/ # YouTube transcripts
├── jobs/ # Async job system
├── workspace/ # Workspace management
├── graph/ # Graph execution
├── knowledge/ # Entity storage
├── index/ # Content indexing
├── geocode/ # Geocoding
├── email/ # Email sending
├── sms/ # SMS sending
├── scheduler/ # Job scheduling
├── districts/ # Political districts
├── maps/ # Map tile generation
├── tiles/ # Tile serving
├── data/ # Data models
├── usage/ # Usage tracking
├── web/ # Web interface
├── client/ # API clients
└── stress/ # Stress testing
Core Modules
cbintel.ai - AI Client Wrappers
ai/
├── __init__.py # Public exports
├── anthropic_client.py # Anthropic Claude API
├── ollama_client.py # Ollama local LLM
├── cbai_client.py # Unified AI client
├── embeddings.py # Embedding generation
├── diff.py # Content diff
└── sentiment.py # Sentiment analysis
Key Classes:
- AnthropicClient - Claude API wrapper
- OllamaClient - Local LLM wrapper
- CBAIClient - Unified interface
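One way to picture the "unified interface" role is a thin facade that tries several backends in order and returns the first successful answer. The sketch below is purely illustrative: `TextBackend`, `EchoBackend`, `UnifiedClient`, and the `complete()` method are hypothetical names invented for this example and are not cbintel's actual API.

```python
from typing import List, Protocol, Sequence


class TextBackend(Protocol):
    """Hypothetical backend protocol; not part of cbintel."""

    name: str

    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in backend that returns a tagged copy of the prompt."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail

    def complete(self, prompt: str) -> str:
        if self.fail:
            raise ConnectionError(f"{self.name} unavailable")
        return f"[{self.name}] {prompt}"


class UnifiedClient:
    """Tries each backend in order and returns the first successful result."""

    def __init__(self, backends: Sequence[TextBackend]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)

    def complete(self, prompt: str) -> str:
        errors: List[str] = []
        for backend in self.backends:
            try:
                return backend.complete(prompt)
            except ConnectionError as exc:
                # Remember the failure and fall through to the next backend
                errors.append(f"{backend.name}: {exc}")
        raise RuntimeError("all backends failed: " + "; ".join(errors))
```

With a failing first backend, `UnifiedClient([EchoBackend("remote", fail=True), EchoBackend("local")]).complete("hi")` falls back to the second backend.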
cbintel.net - Network Operations
net/
├── __init__.py # Public exports
├── http_client.py # Async HTTP client
├── search_client.py # Multi-engine search
├── url_cleaner.py # URL normalization
└── webhook.py # Webhook sending
Key Classes:
- HTTPClient - Async HTTP with proxy support
- SearchClient - Web search (10+ engines)
- URLCleaner - URL normalization and cleaning
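As a rough illustration of what URL normalization and cleaning can involve, the sketch below uses only the standard library. It is not the URLCleaner implementation; the function name `normalize_url` and the list of tracking parameters are assumptions made for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; a real cleaner would likely use a longer list.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop default ports, tracking params, and fragments.

    Limitations of this sketch: userinfo is discarded and IPv6 literals
    are not handled.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()

    # Keep the port only when it differs from the scheme's default
    port = parts.port
    if port is not None and DEFAULT_PORTS.get(scheme) != port:
        host = f"{host}:{port}"

    # Remove tracking parameters and sort the rest for a stable result
    pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_KEYS and not key.startswith(TRACKING_PREFIXES)
    ]
    query = urlencode(sorted(pairs))

    path = parts.path or "/"
    return urlunsplit((scheme, host, path, query, ""))
```

Sorting the query parameters is a design choice that makes two URLs differing only in parameter order compare equal, at the cost of changing the original ordering.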
cbintel.io - File/Process I/O
io/
├── __init__.py # Public exports
├── html_processor.py # HTML parsing
├── markdown.py # Markdown conversion
├── storage.py # File storage
└── session.py # Session management
Key Classes:
- HTMLProcessor - HTML content extraction
- MarkdownConverter - HTML to Markdown
- SessionManager - Research session management
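To give a feel for HTML content extraction, here is a minimal standard-library sketch that collects visible text and skips script and style content. It is not how HTMLProcessor works internally; `TextExtractor` and `extract_text` are hypothetical names used only for this example.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside skipped elements
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self) -> str:
        return " ".join(self._parts)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A production extractor would typically also handle boilerplate removal and encoding detection, which this sketch leaves out.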
Service Modules
cbintel.cluster - VPN Cluster Management
cluster/
├── __init__.py
├── main.py # FastAPI application
├── config.py # Settings
├── routers/
│ ├── banks.py # Bank endpoints
│ ├── workers.py # Worker endpoints
│ ├── cluster.py # Cluster endpoints
│ └── devices.py # Device endpoints
├── services/
│ ├── bank_service.py # Bank management
│ ├── worker_service.py # Worker operations
│ ├── device_service.py # Device utilities
│ └── state_manager.py # State persistence
├── clients/
│ └── luci_rpc.py # OpenWRT RPC
└── models/
├── bank.py
├── worker.py
└── device.py
cbintel.tor - Tor Gateway Client
tor/
├── __init__.py # Public exports
├── models.py # Request/response models
├── client.py # TorClient
└── router.py # TorRouter (sessions)
cbintel.ferret - Browser Automation
ferret/
├── __init__.py # Public exports
├── config.py # Settings
├── exceptions.py # Error hierarchy
├── models.py # Pydantic models
├── actions.py # AST action nodes
├── client.py # SWRMClient
├── executor.py # Script execution
├── learning.py # Adaptive learning
└── cli.py # CLI interface
cbintel.jobs - Async Job System
jobs/
├── __init__.py
├── main.py # FastAPI application
├── config.py # Settings
├── routers/
│ ├── jobs.py # Job endpoints
│ └── schedules.py # Schedule endpoints
├── services/
│ └── job_service.py # Job management
├── workers/
│ ├── base.py # BaseWorker
│ ├── crawl_worker.py
│ ├── graph_worker.py
│ ├── browser_worker.py
│ ├── lazarus_worker.py
│ ├── vectl_worker.py
│ ├── transcript_worker.py
│ └── screenshot_worker.py
├── queue.py # Redis job queue
└── models.py # Job models
cbintel.graph - Graph Execution
graph/
├── __init__.py # Public exports
├── executor.py # GraphExecutor
├── parser.py # YAML parsing
├── validator.py # Type validation
├── operations/
│ ├── __init__.py # Operation registry
│ ├── discover.py # Discovery ops
│ ├── acquire.py # Acquisition ops
│ ├── transform.py # Transform ops
│ ├── extract.py # Extraction ops
│ ├── filter.py # Filter ops
│ ├── store.py # Storage ops
│ └── synthesize.py # Synthesis ops
├── templates/ # Built-in templates
└── types/
├── __init__.py
├── primitives.py # Primitive types
├── domain.py # Domain types
└── collections.py # Collection types
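The layout above lists an operation registry in `operations/__init__.py` alongside one module per operation family. A common way to build such a registry is a decorator that maps names to callables. The sketch below shows that general pattern under stated assumptions: `OPERATIONS`, `operation`, `run_pipeline`, and the operation names are hypothetical and do not reflect the real cbintel.graph API.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping an operation name to its callable
OPERATIONS: Dict[str, Callable[[List[str]], List[str]]] = {}


def operation(name: str):
    """Decorator that registers a function under the given operation name."""

    def decorator(func):
        if name in OPERATIONS:
            raise ValueError(f"operation {name!r} is already registered")
        OPERATIONS[name] = func
        return func

    return decorator


@operation("filter.nonempty")
def filter_nonempty(items: List[str]) -> List[str]:
    """Drop strings that are empty or whitespace-only."""
    return [item for item in items if item.strip()]


@operation("transform.upper")
def transform_upper(items: List[str]) -> List[str]:
    """Uppercase every string."""
    return [item.upper() for item in items]


def run_pipeline(names: List[str], data: List[str]) -> List[str]:
    """Apply registered operations in order, feeding each output to the next."""
    for name in names:
        if name not in OPERATIONS:
            raise KeyError(f"unknown operation: {name}")
        data = OPERATIONS[name](data)
    return data
```

Registering by name keeps the executor decoupled from the operation modules, so a parsed graph definition only needs to carry operation names.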
Data Processing Modules
cbintel.crawl - AI-Powered Crawling
crawl/
├── __init__.py
├── pipeline.py # CrawlPipeline
├── config.py # CrawlConfig
├── batch.py # Batch processing
├── evaluate.py # Content evaluation
└── synthesize.py # Report synthesis
cbintel.lazarus - Historical Archives
lazarus/
├── __init__.py
├── cdx_client.py # CDX API client
├── url_discovery.py # gau wrapper
├── archive_client.py # High-level client
└── temporal.py # Temporal analysis
cbintel.vectl - Vector Storage
vectl/
├── __init__.py
├── embedding_service.py # Generate embeddings
├── vector_store.py # Vector storage
├── semantic_search.py # Search interface
├── chunking.py # Text chunking
└── document_index.py # Document indexing
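Text chunking usually means splitting a document into fixed-size windows that overlap slightly, so that content cut at a boundary still appears whole in one chunk before embedding. The following is a minimal character-based sketch of that idea; `chunk_text` and its default sizes are assumptions for illustration, not the contents of `chunking.py`.

```python
from typing import List


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window reaches the end, to avoid trailing fragments
        # that are fully contained in the previous chunk
        if start + size >= len(text):
            break
    return chunks
```

Real chunkers often split on sentence or token boundaries instead of raw characters; the windowing logic stays the same.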
cbintel.screenshots - Browser Screenshots
screenshots/
├── __init__.py
├── service.py # ScreenshotService
├── pdf_service.py # PDF generation
├── dom_service.py # DOM extraction
└── config.py # Settings
Support Modules
cbintel.workspace - Workspace Management
workspace/
├── __init__.py
├── manager.py # WorkspaceManager
├── manifest.py # WorkspaceManifest
└── artifacts.py # Artifact tracking
cbintel.knowledge - Entity Storage
cbintel.index - Content Indexing
cbintel.geocode - Geocoding
Entry Points
CLI tools are defined in pyproject.toml:
[project.scripts]
cbintel-crawl = "cbintel.crawl.cli:main"
cbintel-lazarus = "cbintel.lazarus.cli:main"
cbintel-vectl = "cbintel.vectl.cli:main"
cbintel-screenshots = "cbintel.screenshots.cli:main"
cbintel-cluster = "cbintel.cluster.main:run"
cbintel-jobs = "cbintel.jobs.main:run"
cbintel-ferret = "cbintel.ferret.cli:main"
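Each value in `[project.scripts]` uses the standard `module:attribute` form: the part before the colon is an importable module and the part after it names a callable inside that module. The sketch below shows roughly how such a string can be resolved to a callable. The helper name `resolve_entry_point` is hypothetical, and the example resolves a standard-library function so that it runs without cbintel installed.

```python
import importlib
from typing import Any


def resolve_entry_point(spec: str) -> Any:
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, sep, attr_path = spec.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")

    obj: Any = importlib.import_module(module_name)
    # The attribute part may be dotted, for example 'Class.method'
    for attr in attr_path.split("."):
        obj = getattr(obj, attr)
    return obj


if __name__ == "__main__":
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"ok": True}))
```

In an environment where cbintel is installed, the same helper could be pointed at a string such as `"cbintel.jobs.main:run"` to check that the target imports cleanly.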
Import Patterns
Public API
# AI clients
from cbintel.ai import CBAIClient, AnthropicClient, OllamaClient
# Network
from cbintel.net import HTTPClient, SearchClient
# Vector storage
from cbintel.vectl import EmbeddingService, VectorStore, SemanticSearch
# Browser
from cbintel.screenshots import ScreenshotService
from cbintel.ferret import SWRMClient
# Archives
from cbintel.lazarus import ArchiveClient, CDXClient
# Crawling
from cbintel.crawl import CrawlPipeline, CrawlConfig
# Tor
from cbintel.tor import TorClient, TorRouter
# Graph
from cbintel.graph import GraphExecutor
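Because these imports rely on each package's `__init__.py` re-exporting its public names, a small smoke check can confirm that the expected names are actually exposed. The helper below is a hypothetical sketch (`missing_exports` is not part of cbintel); it is demonstrated against a standard-library module so it runs anywhere.

```python
import importlib
from typing import Iterable, List


def missing_exports(module_name: str, names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found on the imported module."""
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]


if __name__ == "__main__":
    # Demonstration against the standard library
    print(missing_exports("json", ["dumps", "loads", "not_a_real_name"]))
```

With cbintel installed, a call such as `missing_exports("cbintel.ai", ["CBAIClient", "AnthropicClient", "OllamaClient"])` would be expected to return an empty list if the public API matches the imports listed above.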