Artifacts¶
Artifacts are files stored in a workspace. Most are produced by graph executions; others can be uploaded directly.
Artifact Types¶
| Type | Extension | Description |
|---|---|---|
| result | .json | Graph execution result |
| report | .md | Generated markdown report |
| screenshot | .png | Browser screenshots |
| pdf | .pdf | Generated PDF documents |
| data | .json | Structured data output |
| transcript | .txt | Video transcripts |
| embeddings | .npy | Vector embeddings |
Artifact Model¶
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Artifact:
    artifact_id: str       # Unique identifier
    workspace_id: str      # Parent workspace
    run_id: Optional[str]  # Parent run (optional; None if not tied to a run)
    path: str              # Storage path
    artifact_type: str     # Type from table above
    size_bytes: int        # File size in bytes
    content_type: str      # MIME type
    created_at: datetime   # Creation timestamp
    metadata: dict         # Additional metadata
    file_url: str          # Download URL
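As a rough illustration of how these fields relate to a concrete file, the sketch below fills in an Artifact-like record from a path and its raw bytes. The `ArtifactRecord` stand-in, the `describe_file` helper, the ID format, and the use of `mimetypes` to derive `content_type` are all assumptions made for this example, not the library's actual behavior.

```python
import mimetypes
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ArtifactRecord:
    """Local stand-in mirroring the fields of the Artifact model."""
    artifact_id: str
    workspace_id: str
    run_id: Optional[str]
    path: str
    artifact_type: str
    size_bytes: int
    content_type: str
    created_at: datetime
    metadata: dict = field(default_factory=dict)
    file_url: str = ""  # left empty here; a real service would assign it


def describe_file(workspace_id: str, path: str, content: bytes,
                  artifact_type: str,
                  run_id: Optional[str] = None) -> ArtifactRecord:
    """Hypothetical helper: derive size and MIME type from path and content."""
    guessed, _ = mimetypes.guess_type(path)
    return ArtifactRecord(
        artifact_id=f"art_{uuid.uuid4().hex[:6]}",  # assumed ID format
        workspace_id=workspace_id,
        run_id=run_id,
        path=path,
        artifact_type=artifact_type,
        size_bytes=len(content),
        content_type=guessed or "application/octet-stream",
        created_at=datetime.now(timezone.utc),
    )


record = describe_file("ws_abc123", "screenshots/homepage.png",
                       b"\x89PNG", "screenshot")
print(record.content_type, record.size_bytes)
```

Falling back to `application/octet-stream` keeps `content_type` populated even for extensions that `mimetypes` does not recognize.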
Storage Layout¶
workspaces/{workspace_id}/
├── runs/
│   └── {run_id}/
│       ├── result.json          # Graph result
│       ├── report.md            # Generated report
│       └── artifacts/
│           ├── screenshots/
│           │   ├── 001_example_com.png
│           │   └── 002_example_org.png
│           ├── data/
│           │   ├── entities.json
│           │   └── chunks.json
│           └── pdfs/
│               └── document.pdf
└── outputs/
    ├── weekly_report.md
    └── entity_network.json
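The layout above can be expressed as a pair of small path builders. This is a minimal sketch: `run_artifact_path`, `output_path`, and the `TYPE_DIRS` mapping are hypothetical names introduced here, and only the subdirectories that appear in the tree are covered.

```python
from pathlib import PurePosixPath

# Assumed mapping from artifact type to its subdirectory under artifacts/,
# read off the layout tree; types not shown there are deliberately omitted.
TYPE_DIRS = {"screenshot": "screenshots", "data": "data", "pdf": "pdfs"}


def run_artifact_path(workspace_id: str, run_id: str,
                      artifact_type: str, filename: str) -> str:
    """Hypothetical helper: storage path for a run-scoped artifact."""
    subdir = TYPE_DIRS[artifact_type]  # KeyError for types not in the layout
    return str(PurePosixPath("workspaces", workspace_id, "runs", run_id,
                             "artifacts", subdir, filename))


def output_path(workspace_id: str, filename: str) -> str:
    """Hypothetical helper: storage path for a workspace-level output."""
    return str(PurePosixPath("workspaces", workspace_id, "outputs", filename))


print(run_artifact_path("ws_abc123", "run_xyz789",
                        "screenshot", "001_example_com.png"))
print(output_path("ws_abc123", "weekly_report.md"))
```

`PurePosixPath` is used so the separators stay as forward slashes regardless of the operating system running the code.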
Working with Artifacts¶
List Artifacts¶
workspace = await manager.get("ws_abc123")
artifacts = await workspace.list_artifacts()
for artifact in artifacts:
    print(f"{artifact.path}")
    print(f"  Type: {artifact.artifact_type}")
    print(f"  Size: {artifact.size_bytes} bytes")
    print(f"  URL: {artifact.file_url}")
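Once a listing is in hand, the same fields can be aggregated locally. The sketch below totals `size_bytes` per `artifact_type`; the `ArtifactInfo` stand-in, the `size_by_type` helper, and the sample values are made up for illustration and do not come from a real workspace.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ArtifactInfo:
    """Minimal stand-in carrying only the fields this example needs."""
    path: str
    artifact_type: str
    size_bytes: int


def size_by_type(artifacts) -> dict:
    """Sum size_bytes for each artifact_type in a listing."""
    totals = defaultdict(int)
    for artifact in artifacts:
        totals[artifact.artifact_type] += artifact.size_bytes
    return dict(totals)


# Illustrative sample data (not real output)
sample = [
    ArtifactInfo("runs/run_xyz789/result.json", "result", 2_048),
    ArtifactInfo("runs/run_xyz789/artifacts/screenshots/001_example_com.png",
                 "screenshot", 125_000),
    ArtifactInfo("runs/run_xyz789/artifacts/screenshots/002_example_org.png",
                 "screenshot", 98_000),
]

print(size_by_type(sample))  # {'result': 2048, 'screenshot': 223000}
```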
Filter by Type¶
# Get all screenshots
screenshots = await workspace.list_artifacts(
    artifact_type="screenshot"
)

# Get all artifacts from a specific run
run_artifacts = await workspace.list_artifacts(
    run_id="run_xyz789"
)
Download Artifact¶
artifact = await workspace.get_artifact("art_abc123")
# Get URL
print(f"Download: {artifact.file_url}")
# Download content
content = await workspace.download_artifact("art_abc123")
Upload Artifact¶
# Upload file
artifact = await workspace.upload_artifact(
    path="reports/custom_report.md",
    content=report_content.encode(),
    artifact_type="report",
    metadata={"author": "analyst"}
)
Artifact Creation During Graph Execution¶
Automatic Artifacts¶
When graphs execute, these artifacts are created automatically:
# result.json - always created
{
  "graph_name": "research_pipeline",
  "stages_completed": 5,
  "outputs": {...}
}
# report.md - created if the to_report operation is used
# Research Report
## Summary
...
Custom Artifacts¶
Use the store_artifact operation to write a custom artifact from a graph:
- op: store_artifact
  input: data
  params:
    path: "analysis/entities.json"
    type: "data"
  output: artifact_ref
Artifact Metadata¶
Standard Metadata¶
{
  "artifact_id": "art_abc123",
  "created_at": "2024-01-15T10:30:00Z",
  "content_type": "image/png",
  "size_bytes": 125000,
  "checksum": "sha256:abc123..."
}
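The `checksum` field pairs an algorithm name with a hex digest. A minimal sketch of producing and verifying a value in that `sha256:<hex>` shape with the standard library follows; the helper names are hypothetical, and the assumption that the digest covers the raw file bytes is not confirmed by the text above.

```python
import hashlib
import hmac


def sha256_checksum(content: bytes) -> str:
    """Return a checksum string in the 'sha256:<hex digest>' form."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def verify_checksum(content: bytes, expected: str) -> bool:
    """Compare downloaded bytes against an expected 'sha256:...' checksum."""
    algorithm, _, _ = expected.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported checksum algorithm: {algorithm!r}")
    # compare_digest avoids leaking where the two strings first differ
    return hmac.compare_digest(sha256_checksum(content), expected)


payload = b"example artifact content"
checksum = sha256_checksum(payload)
print(verify_checksum(payload, checksum))         # True
print(verify_checksum(payload + b"!", checksum))  # False
```

In practice the `content` argument would be the bytes returned by a download, and `expected` the `checksum` value from the artifact's metadata.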
Custom Metadata¶
await workspace.upload_artifact(
    path="screenshots/homepage.png",
    content=image_data,
    artifact_type="screenshot",
    metadata={
        "url": "https://example.com",
        "viewport": "1920x1080",
        "full_page": True
    }
)
Artifact Lifecycle¶
stateDiagram-v2
    [*] --> Created: upload/auto-create
    Created --> Available: processing complete
    Available --> Indexed: index enabled
    Available --> Archived: archive workspace
    Indexed --> Archived: archive workspace
    Archived --> Available: unarchive
    Available --> [*]: delete
    Archived --> [*]: delete with workspace
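The diagram can also be read as a transition table. The sketch below encodes exactly the edges drawn above; the `State` enum, the `DELETED` member standing in for the terminal `[*]` node, and the `next_state` helper are illustrative names, not part of any documented API.

```python
from enum import Enum


class State(Enum):
    CREATED = "Created"
    AVAILABLE = "Available"
    INDEXED = "Indexed"
    ARCHIVED = "Archived"
    DELETED = "Deleted"  # stands in for the terminal [*] node


# (current state, event label) -> next state, one entry per diagram edge
TRANSITIONS = {
    (State.CREATED, "processing complete"): State.AVAILABLE,
    (State.AVAILABLE, "index enabled"): State.INDEXED,
    (State.AVAILABLE, "archive workspace"): State.ARCHIVED,
    (State.INDEXED, "archive workspace"): State.ARCHIVED,
    (State.ARCHIVED, "unarchive"): State.AVAILABLE,
    (State.AVAILABLE, "delete"): State.DELETED,
    (State.ARCHIVED, "delete with workspace"): State.DELETED,
}


def next_state(state: State, event: str) -> State:
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.value} on {event!r}") from None


state = next_state(State.CREATED, "processing complete")
state = next_state(state, "index enabled")
state = next_state(state, "archive workspace")
print(state.value)  # Archived
```

Note that, as drawn, the diagram has no direct edge from Indexed to the terminal node, so this table rejects a `delete` event on an Indexed artifact.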
Querying Artifacts¶
By Path Pattern¶
# Get all markdown reports
reports = await workspace.list_artifacts(
    path_pattern="**/*.md"
)

# Get screenshots from a specific run
screenshots = await workspace.list_artifacts(
    path_pattern="runs/run_xyz789/artifacts/screenshots/*"
)
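The exact matching rules for `path_pattern` are not spelled out here. The sketch below shows one plausible reading of the two patterns above, assuming conventional glob semantics: `*` and `?` stay within a single path segment, and `**/` spans zero or more directories. The `glob_to_regex` and `matches` helpers are hypothetical and may differ from what the service actually does.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a path glob into a regex under the assumed semantics."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")        # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]*")     # anything within one segment
            i += 1
        elif pattern[i] == "?":
            parts.append(r"[^/]")      # one character within a segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def matches(path: str, pattern: str) -> bool:
    """True if the whole path matches the glob pattern."""
    return glob_to_regex(pattern).fullmatch(path) is not None


print(matches("runs/run_xyz789/report.md", "**/*.md"))      # True
print(matches("outputs/entity_network.json", "**/*.md"))    # False
print(matches("runs/run_xyz789/artifacts/screenshots/001_example_com.png",
              "runs/run_xyz789/artifacts/screenshots/*"))   # True
```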
By Date Range¶
from datetime import datetime, timedelta, timezone

# Last 7 days, as a timezone-aware UTC timestamp
recent = await workspace.list_artifacts(
    created_after=datetime.now(timezone.utc) - timedelta(days=7)
)
By Metadata¶
# Screenshots of specific URL
screenshots = await workspace.list_artifacts(
    artifact_type="screenshot",
    metadata_filter={"url": "https://example.com"}
)
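How `metadata_filter` compares values is not stated above. A reasonable guess is subset matching: every key in the filter must be present in the artifact's metadata with an equal value, and extra metadata keys are ignored. The sketch below implements that assumed behavior locally; `metadata_matches` is a hypothetical helper.

```python
def metadata_matches(metadata: dict, metadata_filter: dict) -> bool:
    """Assumed semantics: every filter key is present with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in metadata_filter.items()
    )


homepage = {
    "url": "https://example.com",
    "viewport": "1920x1080",
    "full_page": True,
}

print(metadata_matches(homepage, {"url": "https://example.com"}))  # True
print(metadata_matches(homepage, {"url": "https://example.org"}))  # False
print(metadata_matches(homepage, {"author": "analyst"}))           # False
```

Under this reading an empty filter matches every artifact, which is consistent with omitting `metadata_filter` altogether.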
Artifact Size Limits¶
| Artifact Type | Max Size |
|---|---|
| screenshot | 10 MB |
| pdf | 50 MB |
| result | 10 MB |
| report | 5 MB |
| data | 100 MB |
| embeddings | 500 MB |
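A client can check these limits before uploading rather than waiting for the server to reject the file. The following is a minimal sketch: `check_size` is a hypothetical helper, the table does not say whether MB means 10^6 or 2^20 bytes (binary megabytes are assumed here), and types without a listed limit, such as transcript, are simply not checked.

```python
MB = 1024 * 1024  # assumption: binary megabytes; the table does not specify

# Limits per artifact type, taken from the table above
MAX_SIZE_BYTES = {
    "screenshot": 10 * MB,
    "pdf": 50 * MB,
    "result": 10 * MB,
    "report": 5 * MB,
    "data": 100 * MB,
    "embeddings": 500 * MB,
}


def check_size(artifact_type: str, size_bytes: int) -> None:
    """Raise ValueError if the content exceeds the limit for its type."""
    limit = MAX_SIZE_BYTES.get(artifact_type)
    if limit is not None and size_bytes > limit:
        raise ValueError(
            f"{artifact_type} artifact is {size_bytes} bytes; "
            f"limit is {limit} bytes"
        )


check_size("report", 4 * MB)  # within the limit, returns None

try:
    check_size("report", 6 * MB)
except ValueError as exc:
    print(exc)
```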
Best Practices¶
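- Use consistent paths - Follow the standard storage layout so path patterns keep working
- Add metadata - Custom metadata makes artifacts easier to query with metadata_filter
- Clean up - Delete artifacts you no longer need
- Monitor size - Watch the total workspace size as well as the per-type limits
- Use checksums - Compare downloaded content against the checksum in the artifact metadata to verify integrity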