Geographic Routing

The GeoRouter provides geographic routing for HTTP requests: based on the requested geo filter, it selects a VPN bank, the Tor gateway, or a direct connection.

Overview

graph TB
    subgraph "Application"
        HTTP[HTTP Request]
        GEO_PARAM[geo parameter]
    end

    subgraph "GeoRouter"
        PARSE[Parse Geo Filter]
        SELECT[Select Route]
        CACHE[Route Cache]
    end

    subgraph "Routes"
        VPN[VPN Banks<br/>Geographic Proxies]
        TOR[Tor Gateway<br/>Anonymous Access]
        DIRECT[Direct<br/>No Proxy]
    end

    HTTP --> GEO_PARAM --> PARSE
    PARSE --> SELECT
    SELECT --> CACHE
    CACHE --> VPN & TOR & DIRECT

GeoRouter

Basic Usage

from cbintel.geo import GeoRouter

router = GeoRouter()

# Get proxy for geographic region
proxy = await router.get_proxy("us:ca")
print(f"Using proxy: {proxy}")  # http://17.0.0.1:8890

# Use in HTTP request
from cbintel.net import HTTPClient

async with HTTPClient() as client:
    response = await client.get("https://example.com", proxy=proxy)

Route Types

# VPN route (geographic)
proxy = await router.get_proxy("us:ca")

# Tor route (anonymous)
proxy = await router.get_proxy("tor")

# Direct route (no proxy)
proxy = await router.get_proxy(None)  # Returns None

Geographic Filters

Filter Format

{country}[:{state}][:{type}]
| Component | Description                | Examples       |
|-----------|----------------------------|----------------|
| country   | ISO country code           | us, de, uk, fr |
| state     | US state code (optional)   | ca, ny, tx     |
| type      | Special routing (optional) | tor            |

Common Filters

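| Filter    | Description        | Route Type |
|-----------|--------------------|------------|
| us        | United States      | VPN        |
| us:ca     | California         | VPN        |
| us:ny     | New York           | VPN        |
| de        | Germany            | VPN        |
| uk        | United Kingdom     | VPN        |
| tor       | Tor network        | Tor        |
| us:ca:tor | California via Tor | VPN + Tor  |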
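
To make the filter grammar concrete, here is a minimal parsing sketch. It is not GeoRouter's actual parser: the `GeoFilter` dataclass and `parse_geo_filter` function are hypothetical names. The sketch assumes that a trailing `tor` component is the only supported type, and that a bare `tor` filter carries no country.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GeoFilter:
    """Hypothetical container for a parsed geo filter."""
    country: Optional[str] = None  # e.g. "us", "de"
    state: Optional[str] = None    # e.g. "ca" (US state code)
    via_tor: bool = False          # True when the filter ends in "tor"


def parse_geo_filter(raw: Optional[str]) -> GeoFilter:
    """Parse '{country}[:{state}][:{type}]' into its components."""
    if raw is None or not raw.strip():
        return GeoFilter()  # no filter at all: direct route
    parts = [p.strip().lower() for p in raw.split(":")]
    if any(not p for p in parts):
        raise ValueError(f"empty component in geo filter: {raw!r}")
    # Assumption: "tor" is the only type, and it is always the last component.
    via_tor = parts[-1] == "tor"
    if via_tor:
        parts = parts[:-1]
    if len(parts) > 2:
        raise ValueError(f"too many components in geo filter: {raw!r}")
    country = parts[0] if parts else None
    state = parts[1] if len(parts) == 2 else None
    return GeoFilter(country=country, state=state, via_tor=via_tor)
```

With this sketch, `parse_geo_filter("us:ca:tor")` yields a filter with country `us`, state `ca`, and `via_tor` set, while `parse_geo_filter("tor")` yields a Tor-only filter with no country.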

Integration Patterns

In HTTP Client

from cbintel.net import HTTPClient
from cbintel.geo import GeoRouter

router = GeoRouter()

async with HTTPClient() as client:
    # Route through California
    proxy = await router.get_proxy("us:ca")
    response = await client.get("https://example.com", proxy=proxy)

In Graph Operations

stages:
  - name: fetch_multi_geo
    parallel:
      - op: fetch
        params:
          url: "https://example.com"
          geo: "us:ca"
        output: ca_content

      - op: fetch
        params:
          url: "https://example.com"
          geo: "us:ny"
        output: ny_content

      - op: fetch
        params:
          url: "https://example.com"
          geo: "de"
        output: de_content

In Job Submission

curl -X POST https://intel.nominate.ai/api/v1/jobs/crawl \
  -H "Content-Type: application/json" \
  -d '{
    "query": "local news trends",
    "geo": "us:ca"
  }'

Bank Selection

GeoRouter automatically selects VPN banks based on the geo filter.

Selection Logic

flowchart TD
    START[Geo Filter] --> CHECK_TOR{Is 'tor'?}
    CHECK_TOR -->|Yes| TOR[Return Tor Proxy]
    CHECK_TOR -->|No| CHECK_BANK{Bank exists?}

    CHECK_BANK -->|Yes| CHECK_HEALTH{Bank healthy?}
    CHECK_HEALTH -->|Yes| RETURN[Return Bank Proxy]
    CHECK_HEALTH -->|No| CREATE[Create Bank]

    CHECK_BANK -->|No| CREATE
    CREATE --> ASSIGN[Assign Workers]
    ASSIGN --> START_VPN[Start VPNs]
    START_VPN --> RETURN
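
The flowchart can be read as a small decision function. The sketch below mirrors its branches over an in-memory dictionary of banks. Every name in it (`Bank`, `create_bank`, `select_route`, `TOR_PROXY`) and the placeholder endpoints are assumptions made for illustration, not GeoRouter internals. The `None` branch comes from the Route Types section rather than from the flowchart.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Placeholder address; the real Tor gateway endpoint is not given in this document.
TOR_PROXY = "socks5://tor-gateway.invalid:9050"


@dataclass
class Bank:
    geo: str
    endpoint: str
    healthy: bool = True


def create_bank(geo: str, port: int) -> Bank:
    """Stand-in for the 'Create Bank -> Assign Workers -> Start VPNs' steps."""
    return Bank(geo=geo, endpoint=f"http://bank.invalid:{port}")


def select_route(geo: Optional[str], banks: Dict[str, Bank]) -> Optional[str]:
    """Return a proxy URL for the filter, or None for a direct connection."""
    if geo is None:
        return None  # direct route, no proxy
    if geo == "tor":
        return TOR_PROXY
    bank = banks.get(geo)
    if bank is None or not bank.healthy:
        # Missing and unhealthy banks both lead to "Create Bank" in the flowchart.
        bank = create_bank(geo, port=8890 + len(banks))
        banks[geo] = bank
    return bank.endpoint
```

Calling `select_route("us:ca", banks)` twice with the same dictionary returns the same endpoint, because the second call finds the healthy bank created by the first.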

Bank Creation

When a geo filter is requested but no matching bank exists, GeoRouter creates one on demand:

# Request California proxy
proxy = await router.get_proxy("us:ca")

# GeoRouter checks for existing "us:ca" bank
# If none exists, creates one with available workers
# Returns bank endpoint: http://17.0.0.1:8890

Bank Caching

Banks are cached and reused:

# First request - may create bank
proxy1 = await router.get_proxy("us:ca")

# Second request - uses cached bank
proxy2 = await router.get_proxy("us:ca")

# Same endpoint
assert proxy1 == proxy2
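
One way to picture the caching behavior is a time-limited map from geo filter to bank endpoint. The class below is a sketch under that assumption. `RouteCache` is a hypothetical name, and the 300-second default mirrors the `cache_ttl=300` option shown under Router Options. How GeoRouter really stores or expires entries is not described here.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class RouteCache:
    """Hypothetical TTL cache mapping a geo filter to a proxy endpoint."""

    def __init__(self, ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, geo: str) -> Optional[str]:
        """Return the cached endpoint, or None if missing or expired."""
        entry = self._entries.get(geo)
        if entry is None:
            return None
        expires_at, endpoint = entry
        if self._clock() >= expires_at:
            del self._entries[geo]  # drop the stale entry
            return None
        return endpoint

    def put(self, geo: str, endpoint: str) -> None:
        """Store an endpoint that stays valid for ttl seconds from now."""
        self._entries[geo] = (self._clock() + self._ttl, endpoint)
```

A router built on this sketch would call `get()` first and only create or look up a bank on a miss, then `put()` the resulting endpoint.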

Fallback Strategies

GeoRouter deals with two failure cases: an opt-in fallback chain when a bank is unhealthy, and an exception when no workers are free.

Bank Unhealthy

# If preferred bank is unhealthy
proxy = await router.get_proxy("us:ca", fallback=True)

# Router will:
# 1. Check if bank is healthy (workers up)
# 2. If unhealthy, try to recover
# 3. If recovery fails, fall back to:
#    - Another bank with similar geo
#    - Direct connection (if allowed)
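
The document does not define what "similar geo" means. The sketch below assumes one plausible reading: widen the filter by dropping its most specific component (`us:ca` becomes `us`), and append a direct connection only when that is explicitly allowed. Both function names are hypothetical, and Tor filters are deliberately rejected so that a fallback never drops anonymity.

```python
from typing import Callable, List, Optional


def fallback_candidates(geo: str, allow_direct: bool = False) -> List[Optional[str]]:
    """Return routes to try, most specific first; None means a direct connection."""
    parts = geo.split(":")
    if "tor" in parts:
        # Widening a Tor filter could silently drop anonymity, so refuse.
        raise ValueError("Tor filters are not widened in this sketch")
    candidates: List[Optional[str]] = [
        ":".join(parts[:i]) for i in range(len(parts), 0, -1)
    ]
    if allow_direct:
        candidates.append(None)
    return candidates


def pick_route(geo: str, is_healthy: Callable[[str], bool],
               allow_direct: bool = False) -> Optional[str]:
    """Return the first healthy candidate, or None for a direct connection."""
    for candidate in fallback_candidates(geo, allow_direct):
        if candidate is None or is_healthy(candidate):
            return candidate
    raise LookupError(f"no healthy route for {geo!r}")
```

Here `is_healthy` is any callable that reports whether a bank for the given filter is usable, so the same chain works with a real health check or a test double.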

No Workers Available

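# If all workers are assigned to other banks
# (NoWorkersAvailable must be imported from the cbintel module that
#  defines it; the import path is not shown in this document)
try:
    proxy = await router.get_proxy("us:ca")
except NoWorkersAvailable:
    # No workers free for new bank
    # Options:
    # 1. Wait for workers to free up
    # 2. Use existing bank with different geo
    # 3. Use direct connection
    proxy = None  # option 3: direct connection (no proxy)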
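
Option 1 (waiting for workers to free up) could be implemented as a retry loop with exponential backoff. The sketch below is self-contained, so it defines its own stand-in `NoWorkersAvailable` class. The helper name `get_proxy_with_retry` and its parameters are assumptions, not part of the cbintel API.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class NoWorkersAvailable(Exception):
    """Local stand-in for the exception named in the text."""


async def get_proxy_with_retry(
    get_proxy: Callable[[str], Awaitable[Optional[str]]],
    geo: str,
    attempts: int = 3,
    base_delay: float = 1.0,
) -> Optional[str]:
    """Call get_proxy(geo), retrying with exponential backoff when no workers are free."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await get_proxy(geo)
        except NoWorkersAvailable:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller choose another option
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the real router, `router.get_proxy` would be passed as `get_proxy`, and the `except` clause would need to catch the library's own exception class rather than this stand-in.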

Health Checking

GeoRouter monitors bank health:

from cbintel.geo import GeoRouter

router = GeoRouter()

# Check if geo route is available
if await router.is_available("us:ca"):
    proxy = await router.get_proxy("us:ca")
else:
    print("California route unavailable")

# Get route status
status = await router.get_status("us:ca")
print(f"Workers up: {status['workers_up']}/{status['workers_total']}")
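
The status dictionary above exposes `workers_up` and `workers_total`. A small helper can turn those two counts into a coarse label for monitoring. The function name, the three labels, and the threshold are assumptions for illustration; the document does not say how GeoRouter itself defines a healthy bank.

```python
from typing import Mapping


def classify_bank(status: Mapping[str, int], healthy_ratio: float = 1.0) -> str:
    """Label a bank 'healthy', 'degraded', or 'down' from its worker counts."""
    total = status["workers_total"]
    up = status["workers_up"]
    if total <= 0 or up <= 0:
        return "down"  # no workers assigned, or none running
    # At or above the ratio counts as healthy; anything lower is degraded.
    return "healthy" if up / total >= healthy_ratio else "degraded"
```

Lowering `healthy_ratio` (for example to 0.5) makes the check tolerate a bank with some workers down.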

Configuration

Environment Variables

# Default geo filter (if none specified)
GEOROUTER_DEFAULT_GEO=us

# Fallback behavior
GEOROUTER_FALLBACK_ENABLED=true
GEOROUTER_FALLBACK_TO_DIRECT=false

# Health check interval
GEOROUTER_HEALTH_CHECK_INTERVAL=30

Router Options

router = GeoRouter(
    default_geo="us",           # Default if none specified
    fallback_enabled=True,      # Try fallbacks on failure
    fallback_to_direct=False,   # Don't allow direct as fallback
    cache_ttl=300,              # Cache routes for 5 minutes
)
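
The document does not say whether `GeoRouter` reads the environment variables itself. If it does not, a thin helper could translate them into constructor options. The sketch below assumes the three name pairs that match by naming (`GEOROUTER_DEFAULT_GEO` to `default_geo`, and so on). The helper name and the accepted boolean spellings are assumptions.

```python
import os
from typing import Any, Dict, Mapping, Optional

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def _parse_bool(name: str, value: str) -> bool:
    """Parse a boolean environment value, rejecting anything ambiguous."""
    normalized = value.strip().lower()
    if normalized in _TRUE:
        return True
    if normalized in _FALSE:
        return False
    raise ValueError(f"{name} must be a boolean, got {value!r}")


def router_options_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build keyword arguments for the router from GEOROUTER_* variables."""
    env = os.environ if env is None else env
    options: Dict[str, Any] = {}
    if "GEOROUTER_DEFAULT_GEO" in env:
        options["default_geo"] = env["GEOROUTER_DEFAULT_GEO"]
    for var, key in (
        ("GEOROUTER_FALLBACK_ENABLED", "fallback_enabled"),
        ("GEOROUTER_FALLBACK_TO_DIRECT", "fallback_to_direct"),
    ):
        if var in env:
            options[key] = _parse_bool(var, env[var])
    return options
```

The result could then be passed as `GeoRouter(**router_options_from_env())`. Variables that are unset are simply omitted, so the constructor's own defaults apply.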

Use Cases

Geographic Content Research

name: geographic_news_comparison
stages:
  - name: fetch_regional
    parallel:
      # Fetch same URL from different regions
      - op: fetch
        params:
          url: "https://news.example.com"
          geo: "us:ca"
        output: west_coast

      - op: fetch
        params:
          url: "https://news.example.com"
          geo: "us:ny"
        output: east_coast

      - op: fetch
        params:
          url: "https://news.example.com"
          geo: "uk"
        output: uk

  - name: compare
    sequential:
      - op: compare
        input: [west_coast, east_coast, uk]
        params:
          aspects: ["headlines", "coverage", "tone"]
        output: comparison

Privacy-Focused Research

name: anonymous_research
stages:
  - name: fetch_anonymous
    parallel:
      - op: tor_fetch
        params:
          url: "https://target.com"
        output: tor_content

      - op: fetch
        params:
          url: "https://target.com"
          geo: "ch"  # Switzerland
        output: vpn_content

Multi-Region Verification

from cbintel.geo import GeoRouter
from cbintel.net import HTTPClient

router = GeoRouter()
regions = ["us:ca", "us:ny", "uk", "de", "jp"]

async with HTTPClient() as client:
    results = {}
    for region in regions:
        proxy = await router.get_proxy(region)
        response = await client.get("https://api.example.com/data", proxy=proxy)
        results[region] = response.json()

    # Compare results across regions
    for region, data in results.items():
        print(f"{region}: {data['version']}")
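
The loop above fetches one region at a time. A concurrent variant with a cap on in-flight requests might look like the sketch below, which keeps the fetch logic pluggable so that it runs without cbintel. `fetch_regions`, `fetch_one`, and `max_concurrency` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, TypeVar

T = TypeVar("T")


async def fetch_regions(
    regions: Iterable[str],
    fetch_one: Callable[[str], Awaitable[T]],
    max_concurrency: int = 2,
) -> Dict[str, T]:
    """Run fetch_one for every region, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(region: str) -> T:
        async with semaphore:  # blocks while max_concurrency fetches are running
            return await fetch_one(region)

    region_list = list(regions)
    # gather() preserves input order, so results line up with region_list.
    results = await asyncio.gather(*(guarded(r) for r in region_list))
    return dict(zip(region_list, results))
```

In the setting of this document, `fetch_one` would be a coroutine that calls `router.get_proxy(region)` and then `client.get(...)` with that proxy. The cap keeps a burst of requests from overwhelming any single bank.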

Best Practices

  1. Reuse routes: Let GeoRouter cache and reuse banks
  2. Handle failures: Always have fallback strategies
  3. Check availability: Verify route before critical operations
  4. Monitor health: Watch for degraded banks
  5. Limit concurrency: Don't overwhelm a single bank

Troubleshooting

Route Not Available

  1. Check if workers are available: GET /api/v1/workers/
  2. Check existing banks: GET /api/v1/banks/
  3. Verify that a profile exists for the filter

Slow Performance

  1. Check bank health
  2. Consider worker load
  3. Try a different geographic region

Inconsistent Results

  1. Verify that all workers use the same VPN provider
  2. Check for circuit rotation
  3. Consider IP caching at destination