# Operations Reference
Complete reference for all 40 graph operations.
## Discovery Operations

### search

Web search to discover URLs.

| Param | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| max_results | int | 50 | Max results |
| provider | string | duckduckgo | Search engine |

Input: None
Output: Url[]
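A minimal example using the parameters documented above; the query value and the `urls` output name are illustrative:

```yaml
- op: search
  params:
    query: "AI safety"
    max_results: 20
  output: urls
```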
### archive_discover

Discover historical URLs from archives.

```yaml
- op: archive_discover
  params:
    domain: "example.com"
    sources: ["wayback", "commoncrawl"]
    limit: 1000
  output: urls
```

| Param | Type | Default | Description |
|---|---|---|---|
| domain | string | required | Domain to discover |
| sources | string[] | ["wayback"] | Archive sources |
| limit | int | 100 | Max URLs |

Output: Url[]
### youtube_search

Search YouTube videos.

Output: Url[]

## Acquisition Operations
### fetch

Fetch single URL content.

| Param | Type | Default | Description |
|---|---|---|---|
| url | string | required | URL to fetch |
| geo | string | null | Geographic routing |
| timeout | int | 30 | Timeout in seconds |

Output: Html
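A minimal example using the documented parameters; the `page` output name is illustrative:

```yaml
- op: fetch
  params:
    url: "https://example.com"
    timeout: 30
  output: page
```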
### fetch_batch

Fetch multiple URLs.

Input: Url[]
Output: Html[]
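A minimal sketch of chaining `fetch_batch` from an earlier step; the `urls` and `pages` binding names are illustrative:

```yaml
- op: fetch_batch
  input: urls
  output: pages
```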
### fetch_archive

Fetch historical content.

Input: Url
Output: Html

### tor_fetch

Fetch through Tor network.

```yaml
- op: tor_fetch
  params:
    url: "http://example.onion"
    mode: "sticky"
    sticky_key: "example.onion"
  output: content
```

Output: Html
### screenshot

Capture browser screenshot.

```yaml
- op: screenshot
  params:
    url: "https://example.com"
    full_page: true
    viewport_width: 1920
  output: image
```

Output: Image

### tor_screenshot

Screenshot through Tor.

Output: Image

### download

Download binary file.

Output: bytes
## Transform Operations

### to_markdown

Convert HTML to Markdown.

Input: Html
Output: Markdown
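A minimal sketch of a fetch-and-convert chain; the `page` and `doc` binding names are illustrative:

```yaml
- op: fetch
  params:
    url: "https://example.com"
  output: page

- op: to_markdown
  input: page
  output: doc
```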
### to_text

Convert HTML to plain text.

Input: Html
Output: Text

### to_text_batch

Batch text conversion.

Input: Html[]
Output: Text[]
### chunk

Split text into chunks.

| Param | Type | Default | Description |
|---|---|---|---|
| size | int | 500 | Words per chunk |
| overlap | int | 50 | Overlapping words |

Input: Text
Output: Chunk[]
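A minimal example using the documented parameters; the `text` input and `chunks` output names are illustrative:

```yaml
- op: chunk
  input: text
  params:
    size: 500
    overlap: 50
  output: chunks
```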
### extract_links

Extract URLs from HTML.

Input: Html
Output: Url[]

### merge

Merge multiple texts.

Input: Text[]
Output: Text
## Process Operations

### embed

Generate single embedding.

Input: Text
Output: Vector

### embed_batch

Batch embedding generation.

Input: Chunk[]
Output: Vector[]
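A minimal sketch of embedding chunks produced by an earlier step; the `chunks` and `vectors` binding names are illustrative:

```yaml
- op: embed_batch
  input: chunks
  output: vectors
```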
### ocr

Extract text from image.

Input: Image
Output: Text

## Filter Operations

### semantic_filter

Filter by semantic similarity.

```yaml
- op: semantic_filter
  input: chunks
  params:
    query: "AI safety"
    threshold: 0.5
  output: relevant_chunks
```

| Param | Type | Default | Description |
|---|---|---|---|
| query | string | required | Filter query |
| threshold | float | 0.5 | Min similarity |

Input: Chunk[]
Output: Chunk[]
### quality_filter

Filter by quality score.

Input: Chunk[]
Output: Chunk[]

### filter_urls

Filter URLs by pattern.

```yaml
- op: filter_urls
  input: urls
  params:
    pattern: "*.gov"
    exclude: ["tracking", "analytics"]
  output: filtered_urls
```

Input: Url[]
Output: Url[]
### filter_entities

Filter entities by type/confidence.

```yaml
- op: filter_entities
  input: entities
  params:
    types: ["person", "organization"]
    min_confidence: 0.7
  output: filtered_entities
```

Input: Entity[]
Output: Entity[]
## Extract Operations

### entities

Extract named entities.

| Param | Type | Default | Description |
|---|---|---|---|
| types | string[] | all | Entity types |

Input: Text
Output: Entity[]
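A minimal example using the documented `types` parameter; the `text` input and `entities` output names are illustrative:

```yaml
- op: entities
  input: text
  params:
    types: ["person", "organization"]
  output: entities
```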
### topics

Extract topics.

Input: Text
Output: string[]

### summarize

Generate summary.

Input: Text
Output: Text

## Store Operations
### store_vector

Store embedding.

```yaml
- op: store_vector
  input: vector
  params:
    store: "my-index"
    id: "doc_001"
    metadata:
      source: "{{ url }}"
  output: ref
```

### store_vectors

Batch store embeddings.

### store_entity

Store entity.

### store_entities

Batch store entities.

### search_vectors

Semantic vector search.

Output: Chunk[]
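A hedged sketch of a `search_vectors` step; its parameters are not documented above, so the `store` and `query` names below are assumptions modeled on the `store_vector` and `semantic_filter` examples:

```yaml
- op: search_vectors
  params:
    store: "my-index"   # assumed parameter, mirroring store_vector
    query: "AI safety"  # assumed parameter, mirroring semantic_filter
  output: results
```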
## Synthesize Operations

### integrate

Synthesize chunks into summary.

```yaml
- op: integrate
  input: chunks
  params:
    query: "{{ query }}"
    model: "claude-3-5-sonnet"
  output: synthesis
```

Input: Chunk[]
Output: Text
### chat

AI conversation.

```yaml
- op: chat
  params:
    messages:
      - role: user
        content: "Analyze this data: {{ data }}"
    model: "claude-3-5-sonnet"
  output: response
```

Output: Text
### compare

Compare multiple texts.

```yaml
- op: compare
  input: [text1, text2]
  params:
    aspects: ["coverage", "tone", "facts"]
  output: comparison
```

Input: Text[]
Output: Text
### to_report

Generate structured report.

```yaml
- op: to_report
  input:
    synthesis: synthesis
    entities: entities
    sources: urls
  params:
    template: research_report
    include_diagrams: true
  output: report
```

Output: Markdown
## Geo Operations

### geocode

Forward geocoding.

Output: GeoPoint

### reverse_geocode

Reverse geocoding.

Input: GeoPoint
Output: string
## Utility Operations

### diff

Compare content versions.

### sentiment

Sentiment analysis.

### translate

Text translation.