Directory Structure

This document describes the package layout of cbintel's source code.

Top-Level Structure

src/cbintel/
├── __init__.py          # Package exports
├── logging.py           # Logging configuration
├── auth.py              # Authentication utilities
├── service.py           # Service management
├── ai/                  # AI client wrappers
├── net/                 # Network operations
├── io/                  # File/process I/O
├── cluster/             # VPN cluster management
├── geo/                 # Geographic routing
├── tor/                 # Tor gateway client
├── ferret/              # Browser automation (SWRM)
├── crawl/               # AI-powered crawling
├── lazarus/             # Historical archives
├── vectl/               # Vector storage
├── screenshots/         # Browser screenshots
├── transcript/          # YouTube transcripts
├── jobs/                # Async job system
├── workspace/           # Workspace management
├── graph/               # Graph execution
├── knowledge/           # Entity storage
├── index/               # Content indexing
├── geocode/             # Geocoding
├── email/               # Email sending
├── sms/                 # SMS sending
├── scheduler/           # Job scheduling
├── districts/           # Political districts
├── maps/                # Map tile generation
├── tiles/               # Tile serving
├── data/                # Data models
├── usage/               # Usage tracking
├── web/                 # Web interface
├── client/              # API clients
└── stress/              # Stress testing
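To compare this listing against an installed copy, the direct children of any importable package can be enumerated with the standard library. The following is a minimal sketch: `list_submodules` is a hypothetical helper (not part of cbintel), and it is demonstrated on the stdlib `email` package so that it runs without cbintel installed.

```python
import importlib
import pkgutil


def list_submodules(package_name: str) -> dict[str, bool]:
    """Map each direct child of a package to True if it is itself a package."""
    package = importlib.import_module(package_name)
    # Only packages have __path__; plain modules have no children to list.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return {}
    return {info.name: info.ispkg for info in pkgutil.iter_modules(search_path)}


if __name__ == "__main__":
    # Stand-in package; with cbintel installed, pass "cbintel" instead.
    for name, is_pkg in sorted(list_submodules("email").items()):
        kind = "package" if is_pkg else "module"
        print(f"{name:<20} {kind}")
```

Entries reported as packages would correspond to the directories in the tree above, and entries reported as modules to the top-level `.py` files.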

Core Modules

cbintel.ai - AI Client Wrappers

ai/
├── __init__.py          # Public exports
├── anthropic_client.py  # Anthropic Claude API
├── ollama_client.py     # Ollama local LLM
├── cbai_client.py       # Unified AI client
├── embeddings.py        # Embedding generation
├── diff.py              # Content diff
└── sentiment.py         # Sentiment analysis

Key Classes:
- AnthropicClient - Claude API wrapper
- OllamaClient - Local LLM wrapper
- CBAIClient - Unified interface
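The idea of a unified interface over several backends can be illustrated with a small dispatcher. This is a sketch only and does not reflect the real `CBAIClient` API, which this page does not describe. `TextBackend`, `FakeCloudBackend`, `FakeLocalBackend`, and `UnifiedClientSketch` are all hypothetical names, and the fake backends make no network calls.

```python
from typing import Dict, Optional, Protocol


class TextBackend(Protocol):
    """Anything with a complete(prompt) -> str method counts as a backend."""

    def complete(self, prompt: str) -> str: ...


class FakeCloudBackend:
    """Stand-in for a hosted-model wrapper (hypothetical)."""

    def complete(self, prompt: str) -> str:
        return f"[cloud] {prompt}"


class FakeLocalBackend:
    """Stand-in for a local-model wrapper (hypothetical)."""

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


class UnifiedClientSketch:
    """Route each call to a named backend, falling back to a default."""

    def __init__(self, backends: Dict[str, TextBackend], default: str) -> None:
        if default not in backends:
            raise ValueError(f"default backend {default!r} is not configured")
        self._backends = dict(backends)
        self._default = default

    def complete(self, prompt: str, backend: Optional[str] = None) -> str:
        name = backend if backend is not None else self._default
        if name not in self._backends:
            raise KeyError(f"unknown backend: {name}")
        return self._backends[name].complete(prompt)
```

With both fakes registered and `default="cloud"`, `complete("hi")` goes to the cloud stand-in and `complete("hi", backend="local")` goes to the local one. Callers depend only on the shared method, not on which wrapper serves the request.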

cbintel.net - Network Operations

net/
├── __init__.py          # Public exports
├── http_client.py       # Async HTTP client
├── search_client.py     # Multi-engine search
├── url_cleaner.py       # URL normalization
└── webhook.py           # Webhook sending

Key Classes:
- HTTPClient - Async HTTP with proxy support
- SearchClient - Web search (10+ engines)
- URLCleaner - URL normalization and cleaning

cbintel.io - File/Process I/O

io/
├── __init__.py          # Public exports
├── html_processor.py    # HTML parsing
├── markdown.py          # Markdown conversion
├── storage.py           # File storage
└── session.py           # Session management

Key Classes:
- HTMLProcessor - HTML content extraction
- MarkdownConverter - HTML to Markdown
- SessionManager - Research session management

Service Modules

cbintel.cluster - VPN Cluster Management

cluster/
├── __init__.py
├── main.py              # FastAPI application
├── config.py            # Settings
├── routers/
│   ├── banks.py         # Bank endpoints
│   ├── workers.py       # Worker endpoints
│   ├── cluster.py       # Cluster endpoints
│   └── devices.py       # Device endpoints
├── services/
│   ├── bank_service.py  # Bank management
│   ├── worker_service.py # Worker operations
│   ├── device_service.py # Device utilities
│   └── state_manager.py  # State persistence
├── clients/
│   └── luci_rpc.py      # OpenWRT RPC
└── models/
    ├── bank.py
    ├── worker.py
    └── device.py

cbintel.tor - Tor Gateway Client

tor/
├── __init__.py          # Public exports
├── models.py            # Request/response models
├── client.py            # TorClient
└── router.py            # TorRouter (sessions)

cbintel.ferret - Browser Automation

ferret/
├── __init__.py          # Public exports
├── config.py            # Settings
├── exceptions.py        # Error hierarchy
├── models.py            # Pydantic models
├── actions.py           # AST action nodes
├── client.py            # SWRMClient
├── executor.py          # Script execution
├── learning.py          # Adaptive learning
└── cli.py               # CLI interface

cbintel.jobs - Async Job System

jobs/
├── __init__.py
├── main.py              # FastAPI application
├── config.py            # Settings
├── routers/
│   ├── jobs.py          # Job endpoints
│   └── schedules.py     # Schedule endpoints
├── services/
│   └── job_service.py   # Job management
├── workers/
│   ├── base.py          # BaseWorker
│   ├── crawl_worker.py
│   ├── graph_worker.py
│   ├── browser_worker.py
│   ├── lazarus_worker.py
│   ├── vectl_worker.py
│   ├── transcript_worker.py
│   └── screenshot_worker.py
├── queue.py             # Redis job queue
└── models.py            # Job models
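The workers/ directory suggests one worker class per job type built on a shared base. The sketch below shows one common way to structure that pattern with `asyncio`. It is an assumption, not the actual `BaseWorker` interface: `BaseWorkerSketch`, `EchoWorker`, the `job_type` attribute, and the result dict shape are all hypothetical.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseWorkerSketch(ABC):
    """Hypothetical base: subclasses name a job type and implement run()."""

    job_type: str = ""

    @abstractmethod
    async def run(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        """Process one job payload and return a result dict."""

    async def execute(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        """Wrap run() so a failure becomes a status record, not an exception."""
        try:
            result = await self.run(payload)
        except Exception as exc:
            return {"status": "failed", "error": str(exc)}
        return {"status": "completed", "result": result}


class EchoWorker(BaseWorkerSketch):
    """Illustrative worker that returns the message it was given."""

    job_type = "echo"

    async def run(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        if "message" not in payload:
            raise ValueError("payload requires 'message'")
        return {"echo": payload["message"]}
```

Calling `asyncio.run(EchoWorker().execute({"message": "hi"}))` yields a completed record, while an empty payload yields a failed one. Keeping error handling in the base class means each concrete worker only has to implement `run()`.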

cbintel.graph - Graph Execution

graph/
├── __init__.py          # Public exports
├── executor.py          # GraphExecutor
├── parser.py            # YAML parsing
├── validator.py         # Type validation
├── operations/
│   ├── __init__.py      # Operation registry
│   ├── discover.py      # Discovery ops
│   ├── acquire.py       # Acquisition ops
│   ├── transform.py     # Transform ops
│   ├── extract.py       # Extraction ops
│   ├── filter.py        # Filter ops
│   ├── store.py         # Storage ops
│   └── synthesize.py    # Synthesis ops
├── templates/           # Built-in templates
└── types/
    ├── __init__.py
    ├── primitives.py    # Primitive types
    ├── domain.py        # Domain types
    └── collections.py   # Collection types
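The comment on operations/__init__.py mentions an operation registry. A decorator-based registry is one common way to build such a thing. The sketch below is a hedged illustration rather than cbintel's implementation: `OPERATIONS`, `operation`, `run_operation`, and the `filter.nonempty` operation name are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps an operation name to the callable implementing it.
OPERATIONS: Dict[str, Callable[..., object]] = {}


def operation(name: str):
    """Decorator that registers a function under the given operation name."""

    def decorator(func):
        if name in OPERATIONS:
            raise ValueError(f"operation {name!r} is already registered")
        OPERATIONS[name] = func
        return func

    return decorator


@operation("filter.nonempty")
def filter_nonempty(items: List[str]) -> List[str]:
    """Illustrative filter op: drop empty or whitespace-only strings."""
    return [item for item in items if item.strip()]


def run_operation(name: str, *args, **kwargs):
    """Look up a registered operation by name and call it."""
    try:
        func = OPERATIONS[name]
    except KeyError:
        raise KeyError(f"unknown operation: {name}") from None
    return func(*args, **kwargs)
```

With this shape, an executor can resolve operation names from a parsed graph definition through a single lookup, and each operations file adds entries simply by being imported.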

Data Processing Modules

cbintel.crawl - AI-Powered Crawling

crawl/
├── __init__.py
├── pipeline.py          # CrawlPipeline
├── config.py            # CrawlConfig
├── batch.py             # Batch processing
├── evaluate.py          # Content evaluation
└── synthesize.py        # Report synthesis

cbintel.lazarus - Historical Archives

lazarus/
├── __init__.py
├── cdx_client.py        # CDX API client
├── url_discovery.py     # gau wrapper
├── archive_client.py    # High-level client
└── temporal.py          # Temporal analysis

cbintel.vectl - Vector Storage

vectl/
├── __init__.py
├── embedding_service.py # Generate embeddings
├── vector_store.py      # Vector storage
├── semantic_search.py   # Search interface
├── chunking.py          # Text chunking
└── document_index.py    # Document indexing

cbintel.screenshots - Browser Screenshots

screenshots/
├── __init__.py
├── service.py           # ScreenshotService
├── pdf_service.py       # PDF generation
├── dom_service.py       # DOM extraction
└── config.py            # Settings

Support Modules

cbintel.workspace - Workspace Management

workspace/
├── __init__.py
├── manager.py           # WorkspaceManager
├── manifest.py          # WorkspaceManifest
└── artifacts.py         # Artifact tracking

cbintel.knowledge - Entity Storage

knowledge/
├── __init__.py
├── entity_store.py      # Entity persistence
└── models.py            # Entity models

cbintel.index - Content Indexing

index/
├── __init__.py
└── client.py            # IndexClient

cbintel.geocode - Geocoding

geocode/
├── __init__.py
└── client.py            # GeocodeClient

Entry Points

CLI tools are defined in pyproject.toml:

[project.scripts]
cbintel-crawl = "cbintel.crawl.cli:main"
cbintel-lazarus = "cbintel.lazarus.cli:main"
cbintel-vectl = "cbintel.vectl.cli:main"
cbintel-screenshots = "cbintel.screenshots.cli:main"
cbintel-cluster = "cbintel.cluster.main:run"
cbintel-jobs = "cbintel.jobs.main:run"
cbintel-ferret = "cbintel.ferret.cli:main"
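Each `[project.scripts]` entry has the form `command = "module.path:function"`. On installation, a wrapper executable is generated that imports the module, calls the function with no arguments, and uses its return value as the exit status. The sketch below shows the shape such a function typically takes. It is a hypothetical example, not one of the cbintel CLIs: the `cbintel-example` program name, the `url` argument, and the `--depth` option are all illustrative.

```python
import argparse
from typing import Optional, Sequence


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Hypothetical CLI entry point; returns a process exit code."""
    parser = argparse.ArgumentParser(prog="cbintel-example")
    parser.add_argument("url", help="target URL (illustrative argument)")
    parser.add_argument("--depth", type=int, default=1, help="illustrative option")
    # With argv=None, argparse reads sys.argv[1:], which is what happens
    # when the installed console script calls main() with no arguments.
    args = parser.parse_args(argv)
    print(f"would process {args.url} at depth {args.depth}")
    return 0
```

Accepting an optional `argv` keeps the function testable: a test can call `main(["https://example.com", "--depth", "2"])` directly instead of spawning a subprocess.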

Import Patterns

Public API

# AI clients
from cbintel.ai import CBAIClient, AnthropicClient, OllamaClient

# Network
from cbintel.net import HTTPClient, SearchClient

# Vector storage
from cbintel.vectl import EmbeddingService, VectorStore, SemanticSearch

# Browser
from cbintel.screenshots import ScreenshotService
from cbintel.ferret import SWRMClient

# Archives
from cbintel.lazarus import ArchiveClient, CDXClient

# Crawling
from cbintel.crawl import CrawlPipeline, CrawlConfig

# Tor
from cbintel.tor import TorClient, TorRouter

# Graph
from cbintel.graph import GraphExecutor

Internal Imports

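# Within a module, use relative imports
from .models import JobStatus   # sibling file in the same package
from ..ai import CBAIClient     # parent package's ai subpackage (cbintel.ai from cbintel/<module>/*.py)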

# Between modules, use absolute imports
from cbintel.net import HTTPClient
from cbintel.vectl import EmbeddingService
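The conventions above (re-exporting public names from `__init__.py`, relative imports inside a module, absolute imports between modules) can be tried out on a throwaway package. The sketch below writes a tiny package to a directory and imports it. `demo_pkg`, `DemoClient`, and `build_and_import` are hypothetical names chosen for illustration and do not exist in cbintel.

```python
import importlib
import sys
from pathlib import Path

# Hypothetical package "demo_pkg" that mirrors the conventions above.
FILES = {
    "demo_pkg/__init__.py": "",
    "demo_pkg/ai/__init__.py": (
        "from .client import DemoClient\n"  # relative import within the module
        "__all__ = ['DemoClient']\n"        # public exports
    ),
    "demo_pkg/ai/client.py": "class DemoClient:\n    name = 'demo'\n",
    "demo_pkg/jobs/__init__.py": "",
    "demo_pkg/jobs/service.py": (
        "from demo_pkg.ai import DemoClient\n"  # absolute import between modules
        "def client_name():\n"
        "    return DemoClient.name\n"
    ),
}


def build_and_import(root: Path):
    """Write the demo package under root and import demo_pkg.jobs.service."""
    for rel_path, source in FILES.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        return importlib.import_module("demo_pkg.jobs.service")
    finally:
        # The imported modules stay cached in sys.modules after this.
        sys.path.remove(str(root))
```

Calling `build_and_import(Path(tmp))` inside a `tempfile.TemporaryDirectory()` block returns the service module, whose `client_name()` reaches `DemoClient` through the name re-exported by `demo_pkg/ai/__init__.py`.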