Case Study · 01 — Prevalent AI
A zero-config data ingestion pipeline where a coordinated fleet of AI agents discovers, maps, validates, and commits third-party connector data directly into the Prevalent Knowledge Graph — replacing weeks of engineering work with an intent-driven, fully observable pipeline.
Studio replaces the engineering-heavy connector integration process with a coordinated fleet of AI agents — each named, observable, and scoped — that negotiate how external data maps into the Knowledge Graph schema in real time. Humans remain in the loop at defined ambiguity thresholds, not at every step.
Before Studio, adding a single connector required an engineer to manually analyze output schemas, author field-mapping scripts, and run multi-pass validations — a process taking 2–4 weeks per integration with high breakage risk.
The design challenge was not just to automate this, but to make the automation legible — so engineers could trust, audit, and override agent decisions without needing to re-do the work themselves.
Five specialized agents, each with a scoped role, named identity, and confidence scoring — designed to be observed, not trusted blindly.
Scans raw connector output to identify entity types, relationship candidates, and schema structure. First agent in every pipeline — generates the discovery manifest.
Proposes how discovered entities map to the existing Knowledge Graph ontology — matching types, properties, and relationship labels with a confidence score per mapping.
Builds field-level transformation rules — type coercions, normalizations, ID resolutions, and format conversions. Generates executable transform specs, not code.
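To make "executable transform specs, not code" concrete, here is a hedged sketch of what such a declarative spec and a minimal interpreter for it could look like. The field names, rule types, and spec shape are illustrative assumptions, not Studio's actual format.

```python
# Illustrative sketch of a declarative transform spec plus a tiny
# interpreter. Field names and rule types are assumptions for
# illustration, not Studio's real spec format.
from datetime import datetime

TRANSFORM_SPEC = [
    {"field": "vendor_name", "op": "normalize_whitespace"},
    {"field": "risk_score",  "op": "coerce", "to": "float"},
    {"field": "assessed_at", "op": "parse_date", "format": "%Y-%m-%d"},
]

# Each op is data-driven: the spec names the rule, the interpreter applies it.
OPS = {
    "normalize_whitespace": lambda v, r: " ".join(str(v).split()),
    "coerce":               lambda v, r: {"float": float, "int": int}[r["to"]](v),
    "parse_date":           lambda v, r: datetime.strptime(v, r["format"]).date().isoformat(),
}

def apply_spec(record, spec):
    """Apply each rule in the spec to its target field; leave other fields intact."""
    out = dict(record)
    for rule in spec:
        if rule["field"] in out:
            out[rule["field"]] = OPS[rule["op"]](out[rule["field"]], rule)
    return out

raw = {"vendor_name": "  Acme   Corp ", "risk_score": "72.5", "assessed_at": "2024-03-01"}
clean = apply_spec(raw, TRANSFORM_SPEC)
```

Because the spec is data rather than code, it can be versioned, diffed, and reviewed by a human before it ever runs, which is the point of generating specs instead of scripts.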
Surfaces mapping disagreements between Architect and Weaver as Resolution Cards — showing both proposals, confidence levels, and reasoned recommendations for human review.
Runs integrity checks before any graph write — schema compliance, duplicate detection, relationship consistency, and graph impact analysis. Nothing commits without Steward's sign-off.
Each agent runs its own inference loop. Showing every intermediate thought would overwhelm users; hiding everything would destroy trust in the output.
Design Response: Agents surface a single key signal per run — their top decision with confidence — behind a progressive disclosure layer. The full action log is always one click away, but never forced.

When Arbiter fires, it means two agents disagree. In most systems this is treated as a failure. In Studio, it means the data has genuine ambiguity that a human should weigh in on.
Design Response: Conflict Resolution Cards present both proposals side-by-side with confidence scores, agent reasoning, and downstream graph impact. The human chooses, rather than accepting a "best guess".
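A Resolution Card of this kind might be shaped like the following sketch. The class and field names (Proposal, ResolutionCard, graph_impact) are hypothetical, chosen only to show the side-by-side structure described above.

```python
# Hedged sketch of a Conflict Resolution Card's shape. Names and fields
# are illustrative assumptions, not Studio's actual data model.
from dataclasses import dataclass

@dataclass(frozen=True)
class Proposal:
    agent: str            # which agent made this proposal
    mapping: str          # e.g. "vendor.company -> Organization.name"
    confidence: float
    reasoning: str

@dataclass(frozen=True)
class ResolutionCard:
    proposals: tuple      # both sides, shown side-by-side
    graph_impact: str     # downstream entities/relationships affected

    def recommend(self) -> Proposal:
        """Arbiter's reasoned recommendation: highest confidence wins,
        but the human still makes the final choice."""
        return max(self.proposals, key=lambda p: p.confidence)

card = ResolutionCard(
    proposals=(
        Proposal("Architect", "vendor.company -> Organization.name", 0.74, "matches ontology label"),
        Proposal("Weaver", "vendor.company -> Vendor.legal_name", 0.81, "field-level type agreement"),
    ),
    graph_impact="Organization entities would be re-linked downstream",
)
```

The recommendation is advisory by design: the card carries enough context (reasoning, confidence, impact) that the human can overrule it with full information.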
Interrupting too often defeats the purpose of agentic automation. Interrupting too rarely creates a false sense of safety and hidden errors in the graph.
Design Response: Admins configure a per-pipeline confidence threshold (default: 80%). Agents below this threshold pause and create a Review Item rather than proceeding. High-confidence runs commit automatically with a full audit trail.
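The gating logic described above can be sketched in a few lines. This is a minimal illustration assuming a simple outcome record; the names (RunOutcome, gate) are invented for the example, and only the 80% default comes from the case study.

```python
# Minimal sketch of the per-pipeline confidence gate. Only the 80%
# default threshold comes from the case study; names are assumptions.
from dataclasses import dataclass, field

DEFAULT_THRESHOLD = 0.80  # the configurable per-pipeline default

@dataclass
class RunOutcome:
    action: str                 # "commit" or "review"
    confidence: float
    audit_trail: list = field(default_factory=list)

def gate(agent_name: str, confidence: float,
         threshold: float = DEFAULT_THRESHOLD) -> RunOutcome:
    """Commit automatically at or above threshold; otherwise pause
    and create a Review Item for human sign-off."""
    trail = [f"{agent_name} reported confidence {confidence:.2f} "
             f"(threshold {threshold:.2f})"]
    if confidence >= threshold:
        trail.append("auto-commit with full audit trail")
        return RunOutcome("commit", confidence, trail)
    trail.append("paused: Review Item created for human sign-off")
    return RunOutcome("review", confidence, trail)
```

Note that both branches append to the audit trail: the trail exists whether or not a human was interrupted, which is what makes high-confidence auto-commits auditable after the fact.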
Graph mutations are persistent. An agent making a wrong high-confidence mapping could silently corrupt relationships across thousands of connected entities.
Design Response: Every ingestion run is a versioned transaction. The graph maintains a full history of agent-authored commits. Rollback is a single button — not a database operation — and shows a diff of what will be undone before confirming.
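A toy model of versioned commits with a previewable rollback diff might look like this. The in-memory "graph" and the commit-record shape are assumptions made for illustration; a production graph store would handle this transactionally.

```python
# Hedged sketch: versioned agent commits with a rollback preview.
# The in-memory graph and record shapes are illustrative assumptions.

class VersionedGraph:
    def __init__(self):
        self.entities = {}   # entity id -> properties
        self.history = []    # (commit_id, snapshot-before, ids-created)

    def commit(self, commit_id, changes):
        """Apply agent-authored changes as one versioned transaction,
        snapshotting the prior state of every touched entity."""
        before = {eid: dict(self.entities[eid])
                  for eid in changes if eid in self.entities}
        created = set(changes) - set(before)
        self.history.append((commit_id, before, created))
        for eid, props in changes.items():
            self.entities.setdefault(eid, {}).update(props)

    def rollback_diff(self, commit_id):
        """Show what a rollback would undo, before it is confirmed."""
        for cid, before, created in self.history:
            if cid == commit_id:
                return {"revert": before, "delete": sorted(created)}
        raise KeyError(commit_id)

    def rollback(self, commit_id):
        """Undo a commit: delete entities it created, restore ones it modified."""
        diff = self.rollback_diff(commit_id)
        for eid in diff["delete"]:
            self.entities.pop(eid, None)
        self.entities.update(diff["revert"])
```

The key design point survives even in this toy version: the diff is computed from the same history the rollback uses, so what the user previews is exactly what gets undone.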
Every entity in the Knowledge Graph carries a trust record — which connector sourced it, which agent mapped it, and what confidence tier it was committed at. This isn't metadata. It's a design primitive that shapes how analysts use the data.
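The three components of the trust record named above (sourcing connector, mapping agent, confidence at commit) can be sketched as a small value object. The field names and tier cutoffs here are assumptions, not Prevalent's actual schema.

```python
# Illustrative sketch of a per-entity trust record. Field names and
# tier cutoffs are assumptions, not Prevalent's schema.
from dataclasses import dataclass

@dataclass(frozen=True)
class TrustRecord:
    source_connector: str   # which connector sourced the entity
    mapped_by: str          # which agent proposed the mapping
    confidence: float       # confidence at commit time

    @property
    def tier(self) -> str:
        """Derive a display tier from the commit-time confidence.
        Cutoffs are illustrative assumptions."""
        if self.confidence >= 0.90:
            return "high"
        if self.confidence >= 0.80:
            return "auto-commit"
        return "human-reviewed"
```

Making the record immutable (frozen) reflects the design intent: the trust record describes how the entity entered the graph, so it should never change after commit.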
Name every agent. Never call them "the system" or "AI".
When something is wrong, users need to know which agent was responsible. Named agents can be individually paused, re-run, or replaced — anonymous systems cannot. Identity creates accountability.
Confidence is always visible. No binary pass/fail states.
A mapping that passes at 65% confidence is not the same as one at 97%. Showing the number changes user behavior — they scrutinize the right things. Binary states create false security, and overrides happen blindly.
Preview graph changes before any commit is allowed.
The Knowledge Graph is a shared resource. A single ingestion run might create, modify, or re-link thousands of entities. Users need to see the blast radius before they approve — not after.
Drift detection is ambient, not on-demand.
Connectors change schemas without warning. If we only check at ingestion time, silent drift corrupts the graph between runs. A background Drift Sentinel agent monitors continuously and surfaces changes before they cause damage.
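The core of a drift check like the one the Drift Sentinel performs can be illustrated by comparing a connector's current schema against the one recorded at the last ingestion run. The flat field-to-type schema shape here is an assumption for the sketch.

```python
# Minimal sketch of schema drift detection: compare the connector's
# current schema against the one seen at the last run. The flat
# {field: type} schema shape is an illustrative assumption.

def detect_drift(last_seen: dict, current: dict) -> dict:
    """Return added, removed, and retyped fields so drift can be
    surfaced before it corrupts an ingestion run."""
    added   = {f for f in current if f not in last_seen}
    removed = {f for f in last_seen if f not in current}
    retyped = {f for f in current
               if f in last_seen and current[f] != last_seen[f]}
    return {
        "added":   sorted(added),
        "removed": sorted(removed),
        "retyped": sorted(retyped),
    }
```

Run continuously in the background, a check like this turns drift from a failure discovered mid-ingestion into an ambient signal surfaced ahead of time.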
Features we prototyped and chose to hold — not because they were impossible, but because shipping them without the right trust foundation would have undermined the system's core value.
Agents writing directly to the graph without human review. The speed gain wasn't worth the trust risk in early adoption — reversibility needs to be demonstrated before it can be removed.
Using mapping learnings from one customer's ingestion to bootstrap another's. Technically straightforward; legally and ethically complex. Tabled until data isolation guarantees were airtight.
Letting users write natural language instructions that modify agent behavior mid-run. Powerful but unpredictable. Held in favor of a structured intent-field with validated options.