Output Shapes

Each job type returns its own `result` structure, wrapped in a set of fields common to every job.

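Common Fields

Every job response uses the same envelope. For a completed job, `result` holds the type-specific output described in the sections below: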

{
  "job_id": "job_abc123",
  "status": "COMPLETED",
  "job_type": "crawl",
  "progress": 100,
  "created_at": "2024-01-15T10:30:00Z",
  "started_at": "2024-01-15T10:30:05Z",
  "completed_at": "2024-01-15T10:35:00Z",
  "duration_seconds": 295,
  "result": {...}
}
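
As a minimal sketch of reading this envelope in Python, the snippet below parses the sample and recomputes the run time from the timestamps. The helper names `parse_ts` and `elapsed_seconds` are illustrative and not part of the API. The idea that `duration_seconds` equals `completed_at` minus `started_at` is an assumption inferred from the sample values.

```python
import json
from datetime import datetime

# Sample envelope from above; "result" is left empty for brevity.
RAW = """
{
  "job_id": "job_abc123",
  "status": "COMPLETED",
  "job_type": "crawl",
  "progress": 100,
  "created_at": "2024-01-15T10:30:00Z",
  "started_at": "2024-01-15T10:30:05Z",
  "completed_at": "2024-01-15T10:35:00Z",
  "duration_seconds": 295,
  "result": {}
}
"""


def parse_ts(value: str) -> datetime:
    """Parse a UTC timestamp such as 2024-01-15T10:30:00Z."""
    # Swap the trailing Z for an explicit offset; older Python versions
    # do not accept "Z" in datetime.fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def elapsed_seconds(job: dict) -> int:
    """Whole seconds between started_at and completed_at."""
    delta = parse_ts(job["completed_at"]) - parse_ts(job["started_at"])
    return int(delta.total_seconds())


job = json.loads(RAW)
# Time the job spent queued before it started running.
queue_wait = parse_ts(job["started_at"]) - parse_ts(job["created_at"])
print(job["job_type"], job["status"], elapsed_seconds(job), queue_wait.total_seconds())
```

For the sample, the recomputed run time matches the reported `duration_seconds`.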

crawl Output

{
  "result": {
    "total_urls": 42,
    "urls_processed": 42,
    "urls_failed": 0,
    "chunks_generated": 156,
    "embeddings_stored": true,
    "batches": [
      {
        "batch_id": 1,
        "depth": 0,
        "urls_count": 10,
        "avg_score": 7.2
      },
      {
        "batch_id": 2,
        "depth": 1,
        "urls_count": 32,
        "avg_score": 6.8
      }
    ],
    "top_sources": [
      {
        "url": "https://example.com/article",
        "score": 9.2,
        "title": "AI Regulation Overview"
      }
    ],
    "synthesis": "AI regulation is evolving rapidly across jurisdictions...",
    "report_url": "https://files.nominate.ai/cbintel-jobs/job_abc123/report.md",
    "data_url": "https://files.nominate.ai/cbintel-jobs/job_abc123/data.json"
  }
}

Key Fields

Field	Type	Description
`total_urls`	int	Total URLs discovered
`urls_processed`	int	URLs successfully processed
`chunks_generated`	int	Text chunks created
`embeddings_stored`	bool	Whether vectors were stored
`synthesis`	string	AI-generated summary
`report_url`	string	URL to markdown report
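
The counters above lend themselves to a few derived ratios. The following Python sketch computes a success rate, a URL-weighted average of the per-batch `avg_score`, and chunks per processed URL. `summarize_crawl` is a hypothetical helper, and weighting by `urls_count` is one reasonable choice rather than something the API prescribes.

```python
def summarize_crawl(result: dict) -> dict:
    """Derive a few ratios from a crawl result (illustrative helper)."""
    total = result["total_urls"]
    processed = result["urls_processed"]
    batches = result.get("batches", [])
    batch_urls = sum(b["urls_count"] for b in batches)
    # Weight each batch's avg_score by the number of URLs it covers.
    weighted = (
        sum(b["urls_count"] * b["avg_score"] for b in batches) / batch_urls
        if batch_urls
        else None
    )
    return {
        "success_rate": processed / total if total else 0.0,
        "weighted_avg_score": weighted,
        "chunks_per_url": result["chunks_generated"] / processed if processed else 0.0,
    }


# Subset of the sample crawl result shown above.
crawl_result = {
    "total_urls": 42,
    "urls_processed": 42,
    "urls_failed": 0,
    "chunks_generated": 156,
    "batches": [
        {"batch_id": 1, "depth": 0, "urls_count": 10, "avg_score": 7.2},
        {"batch_id": 2, "depth": 1, "urls_count": 32, "avg_score": 6.8},
    ],
}
summary = summarize_crawl(crawl_result)
print(summary)
```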

graph Output

{
  "result": {
    "graph_name": "research_pipeline",
    "stages_completed": 4,
    "stages_total": 4,
    "duration_seconds": 245,
    "stages": [
      {
        "name": "discover",
        "status": "completed",
        "duration_seconds": 12,
        "outputs": ["urls"]
      },
      {
        "name": "fetch",
        "status": "completed",
        "duration_seconds": 85,
        "outputs": ["pages"]
      },
      {
        "name": "process",
        "status": "completed",
        "duration_seconds": 120,
        "outputs": ["chunks", "entities"]
      },
      {
        "name": "synthesize",
        "status": "completed",
        "duration_seconds": 28,
        "outputs": ["summary"]
      }
    ],
    "outputs": {
      "urls": ["https://example.com/1", "https://example.com/2"],
      "summary": "Research findings indicate...",
      "entity_count": 47
    },
    "artifacts_url": "https://files.nominate.ai/cbintel-jobs/job_abc123/"
  }
}

Key Fields

Field	Type	Description
`graph_name`	string	Name of executed graph
`stages_completed`	int	Number of stages that finished
`stages`	array	Per-stage details
`outputs`	object	Named outputs from graph
`artifacts_url`	string	URL to all artifacts
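
A short Python sketch of inspecting the `stages` array: it lists stages that did not finish, finds the slowest stage, and totals the per-stage durations. `stage_report` is a hypothetical helper. The lowercase `"completed"` status string is taken from the sample; other status values are not documented here and are simply treated as unfinished.

```python
def stage_report(result: dict) -> dict:
    """Summarize per-stage details of a graph result (illustrative helper)."""
    stages = result.get("stages", [])
    slowest = max(stages, key=lambda s: s["duration_seconds"], default=None)
    return {
        # Any status other than "completed" is treated as unfinished.
        "unfinished": [s["name"] for s in stages if s["status"] != "completed"],
        "slowest": slowest["name"] if slowest else None,
        "stage_seconds": sum(s["duration_seconds"] for s in stages),
        # Names of every output declared by any stage.
        "produced": sorted({name for s in stages for name in s["outputs"]}),
    }


# Stage list from the sample graph result above.
graph_result = {
    "graph_name": "research_pipeline",
    "stages": [
        {"name": "discover", "status": "completed", "duration_seconds": 12, "outputs": ["urls"]},
        {"name": "fetch", "status": "completed", "duration_seconds": 85, "outputs": ["pages"]},
        {"name": "process", "status": "completed", "duration_seconds": 120, "outputs": ["chunks", "entities"]},
        {"name": "synthesize", "status": "completed", "duration_seconds": 28, "outputs": ["summary"]},
    ],
}
report = stage_report(graph_result)
print(report)
```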

screenshot Output

{
  "result": {
    "urls_processed": 5,
    "urls_failed": 0,
    "format": "png",
    "screenshots": [
      {
        "url": "https://example.com",
        "file_url": "https://files.nominate.ai/.../example_com.png",
        "width": 1920,
        "height": 3500,
        "size_bytes": 1250000,
        "captured_at": "2024-01-15T10:32:15Z"
      },
      {
        "url": "https://example.org",
        "file_url": "https://files.nominate.ai/.../example_org.png",
        "width": 1920,
        "height": 2800,
        "size_bytes": 980000,
        "captured_at": "2024-01-15T10:32:45Z"
      }
    ]
  }
}

Key Fields

Field	Type	Description
`screenshots`	array	Screenshot details
`file_url`	string	URL to download image
`width`	int	Image width in pixels
`height`	int	Image height in pixels
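
As a rough illustration of working with the `screenshots` array, the Python sketch below totals the image sizes and picks out the tallest capture. `screenshot_stats` is a hypothetical helper, and `file_url` is omitted from the sample data because its path is elided in the example above.

```python
def screenshot_stats(result: dict) -> dict:
    """Aggregate size information across screenshots (illustrative helper)."""
    shots = result.get("screenshots", [])
    tallest = max(shots, key=lambda s: s["height"], default=None)
    return {
        "count": len(shots),
        "total_bytes": sum(s["size_bytes"] for s in shots),
        "tallest_url": tallest["url"] if tallest else None,
    }


# The two entries from the sample, without the elided file_url values.
screenshot_result = {
    "format": "png",
    "screenshots": [
        {"url": "https://example.com", "width": 1920, "height": 3500, "size_bytes": 1250000},
        {"url": "https://example.org", "width": 1920, "height": 2800, "size_bytes": 980000},
    ],
}
stats = screenshot_stats(screenshot_result)
print(stats)
```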

lazarus Output

{
  "result": {
    "mode": "domain",
    "domain": "example.com",
    "urls_discovered": 1523,
    "snapshots_retrieved": 87,
    "date_range": {
      "earliest": "2015-03-21T00:00:00Z",
      "latest": "2024-01-10T00:00:00Z"
    },
    "yearly_distribution": {
      "2015": 12,
      "2016": 45,
      "2017": 30
    },
    "snapshots": [
      {
        "url": "https://example.com/about",
        "timestamp": "2020-06-15T12:00:00Z",
        "status": 200,
        "content_type": "text/html",
        "size_bytes": 45000
      }
    ],
    "output_url": "https://files.nominate.ai/cbintel-jobs/job_abc123/snapshots.json"
  }
}

Key Fields

Field	Type	Description
`urls_discovered`	int	Historical URLs found
`snapshots_retrieved`	int	Snapshots downloaded
`date_range`	object	Earliest and latest snapshot dates
`yearly_distribution`	object	Snapshot count per year
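
Because JSON object keys are strings, the years in `yearly_distribution` need converting before they can be sorted or compared numerically. A minimal Python sketch, with `snapshot_timeline` as a hypothetical helper name:

```python
def snapshot_timeline(result: dict) -> dict:
    """Summarize yearly_distribution from a lazarus result (illustrative helper)."""
    # JSON keys arrive as strings ("2015"); convert them to ints.
    by_year = {int(year): count for year, count in result["yearly_distribution"].items()}
    peak_year = max(by_year, key=by_year.get) if by_year else None
    return {
        "years": sorted(by_year),
        "peak_year": peak_year,
        "total": sum(by_year.values()),
    }


# Values from the sample lazarus result above.
lazarus_result = {
    "snapshots_retrieved": 87,
    "yearly_distribution": {"2015": 12, "2016": 45, "2017": 30},
}
timeline = snapshot_timeline(lazarus_result)
print(timeline)
```

In the sample, the yearly counts add up to `snapshots_retrieved`.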

vectl Output

Embed Operation

{
  "result": {
    "operation": "embed",
    "texts_processed": 50,
    "model": "nomic-embed-text",
    "dimensions": 768,
    "store": "my-index",
    "vectors_stored": true
  }
}

Search Operation

{
  "result": {
    "operation": "search",
    "query": "machine learning algorithms",
    "store": "my-index",
    "matches": [
      {
        "id": "doc_123",
        "score": 0.92,
        "text": "Machine learning algorithms can be categorized...",
        "metadata": {
          "source": "ml-textbook.txt",
          "chunk_index": 5
        }
      },
      {
        "id": "doc_456",
        "score": 0.87,
        "text": "The most common ML algorithms include...",
        "metadata": {
          "source": "intro-to-ai.txt",
          "chunk_index": 12
        }
      }
    ]
  }
}
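
A small Python sketch of post-processing search `matches`: it keeps matches at or above a score threshold and groups their ids by `metadata.source`. `matches_by_source` and its `min_score` parameter are hypothetical, and the sketch assumes that a higher `score` means a closer match, which the sample suggests but does not state.

```python
from collections import defaultdict


def matches_by_source(result: dict, min_score: float = 0.0) -> dict:
    """Group match ids by source file, keeping scores >= min_score."""
    grouped = defaultdict(list)
    for match in result.get("matches", []):
        if match["score"] >= min_score:
            grouped[match["metadata"]["source"]].append(match["id"])
    return dict(grouped)


# Matches from the sample search result above (text shortened).
search_result = {
    "operation": "search",
    "matches": [
        {"id": "doc_123", "score": 0.92, "text": "...",
         "metadata": {"source": "ml-textbook.txt", "chunk_index": 5}},
        {"id": "doc_456", "score": 0.87, "text": "...",
         "metadata": {"source": "intro-to-ai.txt", "chunk_index": 12}},
    ],
}
strong = matches_by_source(search_result, min_score=0.9)
everything = matches_by_source(search_result)
print(strong, everything)
```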

transcript Output

{
  "result": {
    "video_id": "dQw4w9WgXcQ",
    "title": "Video Title Here",
    "channel": "Channel Name",
    "duration_seconds": 212,
    "language": "en",
    "transcript": [
      {
        "start": 0.0,
        "duration": 3.5,
        "text": "Never gonna give you up"
      },
      {
        "start": 3.5,
        "duration": 4.0,
        "text": "Never gonna let you down"
      }
    ],
    "full_text": "Never gonna give you up. Never gonna let you down...",
    "word_count": 487
  }
}

Key Fields

Field	Type	Description
`transcript`	array	Timestamped segments
`full_text`	string	Complete transcript text
`duration_seconds`	int	Video length in seconds
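
To show how the timestamped segments can be used, the Python sketch below looks up the segment being spoken at a given time and formats offsets as `MM:SS`. Both helpers are hypothetical, and treating each segment as the half-open interval from `start` to `start + duration` is an assumption about how the segments should be read.

```python
from typing import Optional


def segment_at(transcript: list, seconds: float) -> Optional[dict]:
    """Return the segment covering the given time, or None if none does."""
    for seg in transcript:
        # Half-open interval: a segment owns its start but not its end.
        if seg["start"] <= seconds < seg["start"] + seg["duration"]:
            return seg
    return None


def format_timestamp(seconds: float) -> str:
    """Format a second offset as MM:SS, dropping any fractional part."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


# Segments from the sample transcript result above.
segments = [
    {"start": 0.0, "duration": 3.5, "text": "Never gonna give you up"},
    {"start": 3.5, "duration": 4.0, "text": "Never gonna let you down"},
]
for seg in segments:
    print(format_timestamp(seg["start"]), seg["text"])
```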

browser Output

{
  "result": {
    "success": true,
    "actions_executed": 4,
    "actions": [
      {
        "type": "fill",
        "selector": "input[name='q']",
        "status": "success"
      },
      {
        "type": "click",
        "selector": "button[type='submit']",
        "status": "success"
      },
      {
        "type": "wait_for_element",
        "selector": ".results",
        "status": "success",
        "wait_time_ms": 1250
      },
      {
        "type": "extract_text",
        "selector": ".results",
        "status": "success"
      }
    ],
    "extracted_data": {
      "results": "Search results text here..."
    },
    "initial_url": "https://example.com",
    "final_url": "https://example.com/search?q=test",
    "screenshots": [
      {
        "action_index": 3,
        "file_url": "https://files.nominate.ai/.../screenshot.png"
      }
    ]
  }
}

Key Fields

Field	Type	Description
`success`	bool	Overall success
`actions_executed`	int	Actions completed
`extracted_data`	object	Data from extractions
`final_url`	string	URL after all actions
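
When `success` is false, the per-action `status` values are the natural place to look for the step that broke. A minimal Python sketch, assuming any status other than `"success"` marks a problem (the sample only shows `"success"`, so other values are not documented here). `first_failure` and `total_wait_ms` are hypothetical helpers.

```python
from typing import Optional, Tuple


def first_failure(result: dict) -> Optional[Tuple[int, dict]]:
    """Return (index, action) for the first non-success action, else None."""
    for index, action in enumerate(result.get("actions", [])):
        if action["status"] != "success":
            return index, action
    return None


def total_wait_ms(result: dict) -> int:
    """Sum wait_time_ms over the actions that report it."""
    return sum(a.get("wait_time_ms", 0) for a in result.get("actions", []))


# Actions from the sample browser result above.
browser_result = {
    "success": True,
    "actions_executed": 4,
    "actions": [
        {"type": "fill", "selector": "input[name='q']", "status": "success"},
        {"type": "click", "selector": "button[type='submit']", "status": "success"},
        {"type": "wait_for_element", "selector": ".results", "status": "success",
         "wait_time_ms": 1250},
        {"type": "extract_text", "selector": ".results", "status": "success"},
    ],
}
print(first_failure(browser_result), total_wait_ms(browser_result))
```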

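Error Output

When a job fails, `status` is `FAILED` and the response carries an `error` object, retry counters, and any `partial_result` that was salvaged: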

{
  "job_id": "job_abc123",
  "status": "FAILED",
  "job_type": "crawl",
  "progress": 45,
  "error": {
    "type": "NetworkError",
    "message": "Connection timeout after 30 seconds",
    "details": {
      "url": "https://example.com/page",
      "attempt": 3
    }
  },
  "attempts": 3,
  "max_retries": 3,
  "partial_result": {
    "urls_processed": 15,
    "urls_remaining": 35
  }
}

Error Fields

Field	Type	Description
`error.type`	string	Error class name
`error.message`	string	Human-readable message
`error.details`	object	Additional context
`partial_result`	object	Partial results if available
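
A hedged Python sketch of handling a failed job: it builds a one-line summary from `error.type` and `error.message`, checks whether the retry budget looks spent, and reports how much work survived in `partial_result`. `describe_failure` is a hypothetical helper, and reading `attempts >= max_retries` as "no retries left" is an assumption based on the sample values.

```python
from typing import Optional


def describe_failure(job: dict) -> Optional[dict]:
    """Summarize a failed job response; return None if the job did not fail."""
    if job.get("status") != "FAILED":
        return None
    error = job["error"]
    # partial_result may be absent when nothing was salvaged.
    partial = job.get("partial_result") or {}
    return {
        "summary": f'{error["type"]}: {error["message"]}',
        # Assumption: no retries remain once attempts reaches max_retries.
        "retries_exhausted": job.get("attempts", 0) >= job.get("max_retries", 0),
        "urls_salvaged": partial.get("urls_processed", 0),
    }


# Sample failed response from above.
failed_job = {
    "job_id": "job_abc123",
    "status": "FAILED",
    "job_type": "crawl",
    "progress": 45,
    "error": {
        "type": "NetworkError",
        "message": "Connection timeout after 30 seconds",
        "details": {"url": "https://example.com/page", "attempt": 3},
    },
    "attempts": 3,
    "max_retries": 3,
    "partial_result": {"urls_processed": 15, "urls_remaining": 35},
}
failure = describe_failure(failed_job)
print(failure)
```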