The semantic layer
your AI agents reason on.
A multi-campus university system publishes its data into one governed Apache Iceberg lake on Amazon S3, registered in the AWS Glue Data Catalog. The data team authors student records in an Amazon Redshift warehouse and publishes them into the lake; everything else (facilities, financial aid, research grants, housing, library) lives there natively. One business glossary governs the whole lake, so a term like "Earned Credit Hours" is defined once and means the same thing for every reader. Amazon Quick and AI agents read the same governed definitions. Define once, govern everywhere. The layer runs on managed AWS services and is designed so that adopting AWS Context is a tool re-target, not a re-platform.
The problem, and the story this demo tells
A guided walkthrough. Follow the tabs in order.
A data lake records where the bytes are. It does not record
what they mean. Ask three people "how many students are active?" and you get
three numbers, because each person holds a different definition of "active." An AI agent has
less to work with: it sees only a column named enrollment_status_cd. This demo
shows how one governed glossary in the AWS Glue Data Catalog makes meaning consistent for every
human and every agent over one governed Iceberg lake, and how a single tool answers any question
without being rebuilt as the data grows.
- 1. Knowledge Graph: See the whole system: campuses, the one governed lake, the glossary that governs it, and the agent tools.
- 2. Meet the Guide: Ask the live agent a question. One tool finds the data in the governed Iceberg lake, applies the glossary, and answers with the SQL (Structured Query Language) it ran. Amazon Quick returns the same governed number.
- 3. Glossary: The source of truth. Edit a term here and every agent's answer changes, with no redeploy.
- 4. Onboard: Add a new data source by manifest. Because the tool is data-agnostic, the new source is answerable with no code change.
- 5. Technology & Roadmap: Every layer is a managed AWS service, designed so that adopting AWS Context is a tool re-target, not a re-platform.
Executive summary
What this reference implementation provides.
The catalog is the semantic layer
Business meaning lives in the AWS Glue Data Catalog: a governed glossary whose terms carry the exact metric SQL, plus skill assets that ground agents in trusted definitions instead of inference. Define a term once, and every consumer reads the updated definition. There is no separate ontology store to operate.
One glossary over one governed lake
The data team authors student records in an Amazon Redshift warehouse and publishes them into one governed Apache Iceberg lake on S3, where the wider estate already lives. One glossary governs the whole lake, so "Earned Credit Hours" means the same thing for Amazon Quick and every agent. Unstructured questions go to a Bedrock Knowledge Base and return answers with citations.
One tool, agnostic of the data
A single MCP (Model Context Protocol) tool, kl_ask, answers any
question. It discovers the tables at query time, grounds the SQL on the live glossary,
and queries the governed lake through Athena, so onboarding a new table set
requires no code change. The agents' tools stay the same as the data grows.
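The discover, ground, and query flow described above can be sketched in a few lines of Python. This is an illustration only, not the actual kl_ask implementation: the glossary entry, the table names, the `registration_active` column, and the helpers `discover_tables` and `ground_question` are hypothetical stand-ins for the Glue Data Catalog lookup, and the Athena call is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Hypothetical stand-in for the governed glossary. In the real layer the
# terms would be read from the catalog at query time, not hard-coded.
GLOSSARY = {
    "enrolled headcount": {
        "table": "enrollments",
        "sql": (
            "SELECT COUNT(DISTINCT student_pidm) FROM enrollments "
            "WHERE registration_active"  # assumed column name
        ),
    },
}

# Hypothetical stand-in for the set of tables the catalog exposes right now.
CATALOG_TABLES = {"enrollments", "housing_assignments"}


@dataclass
class GroundedQuery:
    term: str
    sql: str


def discover_tables() -> Set[str]:
    """Return the tables visible at query time; new tables show up here
    without any change to the tool's code."""
    return set(CATALOG_TABLES)


def ground_question(question: str) -> Optional[GroundedQuery]:
    """Match a question to a governed term and return that term's SQL as is."""
    tables = discover_tables()
    lowered = question.lower()
    for term, entry in GLOSSARY.items():
        if term in lowered and entry["table"] in tables:
            return GroundedQuery(term=term, sql=entry["sql"])
    return None  # no governed definition found: do not guess


grounded = ground_question("What is the enrolled headcount this fall?")
```

The point of the sketch is that the SQL comes from the glossary entry rather than from the tool, so a changed definition changes the query without touching the code.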
Agents as first-class consumers
Every agent in the fleet reaches the layer as MCP tools through the AgentCore Gateway, both to consume governed knowledge and to contribute terms back under review. The design makes adopting AWS Context a tool re-target, not a re-platform. One layer serves the whole fleet.
One layer over a distributed system
The layer maps sources instead of migrating them. Meaning is the product.
Many campuses, one governed lake, one definition
The whole university system runs on one governed Apache Iceberg lake on S3 in the Glue Data Catalog: student records published from an Amazon Redshift warehouse, alongside facilities, financial aid, research grants, housing, and library. A single Glue Data Catalog business glossary governs all of it, so "how many students are enrolled this fall?" or "what is our housing occupancy rate?" gets the same trustworthy answer whether it is asked of one campus or the whole system, by a person, by Amazon Quick, or by an AI agent.
Live knowledge graph
A real-time projection of the layer: campuses, students, courses, enrollments, the governed glossary, agent skill assets, document corpora, and the tools the fleet uses. Drag to explore. Choose any node for detail.
Onboard a data source
A new source joins the knowledge layer by declaring a small manifest, with no code. Choose a type, describe the source, and run a dry run to see what would be enriched and indexed and how the source joins the graph. The dry run is read-only and never writes.
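As a rough illustration of a manifest and a read-only dry run, consider the sketch below. The manifest shape is an assumption made for this example: the field names (`name`, `type`, `location`, `tables`), the source types, and the `dry_run` function are hypothetical and are not taken from the actual onboarding format.

```python
import copy
from typing import Any, Dict, List

# Hypothetical source types and required fields, for illustration only.
ALLOWED_TYPES = {"iceberg_table_set", "document_corpus"}
REQUIRED_FIELDS = ("name", "type", "location")


def dry_run(manifest: Dict[str, Any]) -> Dict[str, Any]:
    """Validate a manifest and report what would happen.

    Read-only: the function builds a plan and returns it. It never writes
    anywhere and does not modify the manifest it was given.
    """
    errors: List[str] = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not manifest.get(name)
    ]
    source_type = manifest.get("type")
    if source_type and source_type not in ALLOWED_TYPES:
        errors.append(f"unknown type: {source_type}")

    ok = not errors
    return {
        "ok": ok,
        "errors": errors,
        # Tables that would be enriched and indexed if the run were real.
        "would_enrich": list(manifest.get("tables", [])) if ok else [],
        # The node that would join the knowledge graph.
        "graph_node": manifest.get("name") if ok else None,
    }


example_manifest = {
    "name": "parking_permits",
    "type": "iceberg_table_set",
    "location": "s3://example-bucket/parking/",  # placeholder path
    "tables": ["permits", "lots"],
}
snapshot = copy.deepcopy(example_manifest)
plan = dry_run(example_manifest)
```

Returning a plan instead of performing the work is what keeps the dry run safe to repeat: the same manifest can be checked as many times as needed before anything is onboarded.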
Business glossary
The controlled vocabulary, governed in the AWS Glue Data Catalog — the single source of truth every agent reads. Browse what's defined, propose a new term in plain language, and (as a reviewer) promote it live — where every tool and agent immediately consumes it. This is the steward workflow, running against the real catalog.
Live terms
Pending drafts
Propose a new term
A steward writes plain business language. It's saved as a reviewed draft — nothing goes live until promoted.
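The draft-then-promote rule can be modeled as a small state machine. The sketch below is a hypothetical in-memory stand-in, not the catalog's API: the `GlossaryStore` class, its method names, and the sample definition are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GlossaryStore:
    """Hypothetical in-memory model of live terms and pending drafts."""

    live: Dict[str, str] = field(default_factory=dict)
    drafts: Dict[str, str] = field(default_factory=dict)

    def propose(self, term: str, definition: str) -> None:
        # A proposal only ever lands in drafts; readers never see it.
        self.drafts[term] = definition

    def promote(self, term: str, reviewer_approved: bool) -> bool:
        # Promotion requires an explicit reviewer decision and an existing draft.
        if not reviewer_approved or term not in self.drafts:
            return False
        self.live[term] = self.drafts.pop(term)
        return True

    def lookup(self, term: str) -> Optional[str]:
        # Agents and tools read live terms only.
        return self.live.get(term)


store = GlossaryStore()
store.propose("Housing Occupancy Rate", "occupied beds divided by available beds")
visible_before = store.lookup("Housing Occupancy Rate")   # still a draft
promoted = store.promote("Housing Occupancy Rate", reviewer_approved=True)
visible_after = store.lookup("Housing Occupancy Rate")    # now live
```

Keeping drafts and live terms in separate collections means a reader cannot pick up an unreviewed definition by accident; only `promote` moves a term across.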
Skills
Skills are native Glue Data Catalog assets
(amazon::Skill) whose markdown bodies live in S3 — the agent loads these
bodies at query time. Table links are amazon::RelatedTo attachments.
Edit skill
Meet the Knowledge Layer Guide
The same agent your fleet runs, here in the browser. It explores the layer end to end: it asks governed questions over one governed lake with one tool, shows that meaning is governed in one place, runs live code for analysis, and browses the web, narrating which tool it calls at each step. The chat uses the native AG-UI protocol: your browser streams typed events directly from an Amazon Bedrock AgentCore runtime, carrying your identity to the agent with no relay in between.
Ask me anything about the Knowledge Layer. I discover the data in the governed lake, apply the governed glossary, query it through Athena, and show my work: the tool I called, the SQL, the citations, the code. Start with a chip above, or type your own question.
kl_ask over one governed lake · read-only over governed data
This agent's capabilities
A single consolidated agent, wired with the tools it needs to explore and interact with the layer end to end.
One definition governs every answer
Ask the Guide about student records (enrollments, credits, GPA,
DFW rate) or the wider estate (housing, financial aid, research grants). It is all
one governed Apache Iceberg lake on S3. The kl_ask tool discovers the
right tables at query time, pulls the governed definition from the glossary, writes the
SQL, and answers with the SQL and provenance. The agent holds no baked
definitions, so a steward who edits a term in the catalog changes every answer
immediately, with no redeploy. Amazon Quick reads the same governed terms, so the
numbers match.
| Governed term | What the glossary defines |
|---|---|
| Enrolled Headcount | COUNT(DISTINCT student_pidm) where registration is active |
| Earned Credit Hours | sum of credits_earned (not attempted; W/I/AU excluded) |
| DFW Rate | 1.0 * SUM(CASE WHEN grade IN ('D','F','W') THEN 1 ELSE 0 END) / COUNT(*) |
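
To make the three definitions concrete, here is a small worked example that runs them as complete SQL over four sample rows. It uses Python's built-in SQLite in place of Athena, and the table layout, the `registration_active` flag, and the rows themselves are assumptions made for illustration. It also assumes `credits_earned` is already zero for W/I/AU rows, so a plain sum excludes them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollments ("
    "student_pidm INTEGER, grade TEXT, credits_earned REAL, "
    "registration_active INTEGER)"
)
# Invented sample data: student 1 has two courses, student 3 withdrew.
rows = [
    (1, "A", 3.0, 1),
    (1, "D", 3.0, 1),
    (2, "F", 0.0, 1),
    (3, "W", 0.0, 0),
]
conn.executemany("INSERT INTO enrollments VALUES (?, ?, ?, ?)", rows)

# Enrolled Headcount: distinct students with an active registration.
headcount = conn.execute(
    "SELECT COUNT(DISTINCT student_pidm) FROM enrollments "
    "WHERE registration_active = 1"
).fetchone()[0]

# Earned Credit Hours: sum of credits earned, not attempted.
earned_credit_hours = conn.execute(
    "SELECT SUM(credits_earned) FROM enrollments"
).fetchone()[0]

# DFW Rate: share of course records graded D, F, or W.
dfw_rate = conn.execute(
    "SELECT 1.0 * SUM(CASE WHEN grade IN ('D', 'F', 'W') THEN 1 ELSE 0 END) "
    "/ COUNT(*) FROM enrollments"
).fetchone()[0]

print(headcount, earned_credit_hours, dfw_rate)  # prints: 2 6.0 0.75
```

Students 1 and 2 are active, so the headcount is 2; the earned credits sum to 6.0; and three of the four records are D, F, or W, giving a DFW rate of 0.75. The `1.0 *` factor forces floating-point division so the rate is not truncated to an integer.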
How this conversation reaches the agent
The browser speaks the native AG-UI protocol directly to the runtime, with no relay and no WebSocket gateway. Typed events stream the answer, the tool calls, and the results token by token, and your identity travels all the way to the agent.
The technology
Every layer is a managed AWS service. This section describes what each service does, how this implementation uses it, and how it scales, including capabilities announced at AWS Summit New York 2026.
How the platform evolves
The platform adopted managed services as they reached general availability, designed the data model around preview features, and aligned to the two published contracts of AWS Context, so adopting AWS Context later is a tool re-target, not a re-platform.
The AWS Context horizon
AWS Context is AWS's own managed knowledge graph and agentic search service. It maps relationships across your data, serves governed relationships, business rules, and domain knowledge to agents at runtime, learns from how agents use it, and stores its metadata as Iceberg in S3 Tables — Apache Iceberg on S3, the same open format this layer writes.
Because the lake is Apache Iceberg on S3, the semantics live in the Glue Data Catalog, and all agent access is MCP behind the gateway, adopting AWS Context is a migration of one tool target. The institution's accumulated meaning carries over.