The semantic layer your AI agents reason on.
The Bedrock University System holds its truth in scattered systems — campuses on different data engines (a lakehouse for most, Amazon Redshift for the flagship), course catalogs, policies, transcripts, and the working memory of an entire fleet of AI agents. The Knowledge Layer turns that sprawl into one governed meaning layer — a single master glossary where "enrolled headcount" means the same thing on every engine and to every consumer. Built entirely on managed AWS services, designed to ride AWS Context the day it lands.
Executive summary
What we built, in one minute.
The catalog is the semantic layer
Business meaning lives natively in the AWS Glue Data Catalog — table and column descriptions, a governed glossary, and skill assets that ground agents in trusted definitions instead of guesswork. No bespoke ontology store to run.
Two retrieval spines, one truth
Structured questions become governed SQL over Apache Iceberg via Athena. Unstructured questions hit a Bedrock Knowledge Base on S3 Vectors and return answers with citations. Same catalog, same governed definitions.
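
As a rough illustration of the two spines, the sketch below routes a question either to governed SQL or to document retrieval. The page does not say how routing is decided; matching on glossary terms is an assumption, and the `GLOSSARY` dict, the `enrollments` table name, and the `route` function are hypothetical stand-ins, not the platform's API.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-in for the governed glossary.
# The table name `enrollments` is an assumption for illustration.
GLOSSARY = {
    "enrolled headcount": (
        "SELECT COUNT(DISTINCT student_id) FROM enrollments "
        "WHERE status = 'enrolled'"
    ),
}


@dataclass
class Route:
    spine: str    # "sql" for the structured spine, "documents" for retrieval
    payload: str  # governed SQL text, or the original question


def route(question: str) -> Route:
    """Send questions that name a governed term to SQL, the rest to documents."""
    q = question.lower()
    for term, sql in GLOSSARY.items():
        if term in q:
            return Route("sql", sql)
    return Route("documents", question)


structured = route("What is the enrolled headcount this fall?")
unstructured = route("What is the late-withdrawal policy?")
```

In a real deployment the `sql` payload would be handed to Athena and the `documents` payload to the knowledge base; both calls are left out here so the sketch runs without AWS credentials.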
Agents as first-class citizens
Every agent in the fleet reaches the layer as MCP tools through the AgentCore Gateway — to consume knowledge and to contribute it back under review. One layer, the whole fleet.
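
For orientation, an MCP tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only builds such messages; the tool names `glossary_lookup` and `glossary_propose` and their arguments are hypothetical, since the page does not list the tools the gateway exposes.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def tool_call(name: str, arguments: dict) -> str:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Consuming knowledge (hypothetical tool name).
consume = tool_call("glossary_lookup", {"term": "enrolled headcount"})

# Contributing knowledge back, to be held for review (hypothetical tool name).
contribute = tool_call(
    "glossary_propose",
    {"term": "transfer student", "definition": "Plain-language draft text."},
)
```

Transport, authentication, and the gateway endpoint are deliberately omitted; they depend on how the gateway is configured.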
Onboard anything
A manifest onboards a new source in minutes — structured data, an S3 document store, or an external SaaS like a personal notes graph (federated, never copied). The layer grows with the institution.
One layer over a distributed system
Mapping, not migration — meaning is the product.
Many campuses. Many engines. One definition.
Each campus keeps its own systems — three on the Iceberg lakehouse, the Bedrock University flagship on Amazon Redshift. The layer maps their meaning into one governed graph with a single master glossary, so "how many students are enrolled this fall?" gets the same trustworthy answer on every engine, asked of one campus or the whole system, by a person, an AI agent, or Amazon Quick.
Live knowledge graph
A real-time projection of the layer — campuses, students, courses, enrollments, the governed glossary, agent skill assets, document corpora, and the tools the fleet uses. Drag to explore. Click any node for detail.
Onboard a data source
A new source joins the knowledge layer by declaring a small manifest — no code. Pick a type, describe it, and run a dry run to see exactly what would be enriched, indexed, and how it joins the graph. The dry run is read-only — it never writes.
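
A minimal sketch of what a manifest and its dry run might look like, assuming a manifest carries a name, a type, a description, and a location. The field names, the three type strings, and the shape of the returned plan are all hypothetical; only the three source kinds and the read-only behavior come from the text above.

```python
from dataclasses import dataclass

# Hypothetical type names for the three kinds of source the page mentions.
SOURCE_TYPES = {"structured", "s3_documents", "external_saas"}


@dataclass(frozen=True)
class Manifest:
    name: str
    source_type: str
    description: str
    location: str


def dry_run(manifest: Manifest) -> dict:
    """Return a plan of what onboarding would do. Performs no writes."""
    if manifest.source_type not in SOURCE_TYPES:
        raise ValueError(f"unknown source type: {manifest.source_type}")

    plan = {
        "source": manifest.name,
        "writes": False,  # a dry run only reports what would happen
        "enrich": [],
        "index": [],
        "graph_edges": [("layer", manifest.name)],
    }
    if manifest.source_type == "structured":
        plan["enrich"] = ["table descriptions", "column descriptions"]
    elif manifest.source_type == "s3_documents":
        plan["index"] = [manifest.location]
    else:
        # External SaaS sources are federated, never copied.
        plan["federated"] = True
    return plan


plan = dry_run(Manifest(
    name="campus-policies",
    source_type="s3_documents",
    description="Campus policy documents",
    location="s3://example-bucket/policies/",  # placeholder location
))
```

Because `dry_run` is a pure function over a frozen manifest, it can be called repeatedly while a steward adjusts the declaration, with no risk of side effects.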
Business glossary
The controlled vocabulary, governed in the AWS Glue Data Catalog. Browse what's defined, propose a new term in plain language, and (as a reviewer) promote it live — where every tool and agent immediately consumes it. This is §5.6 of the guide, working.
Live terms
Pending drafts
Propose a new term
A steward writes plain business language. It's saved as a reviewed draft — nothing goes live until promoted.
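
The draft-then-promote flow can be sketched as a small state change, assuming drafts and live terms are kept apart and only a designated reviewer may move a term between them. The `Glossary` class, its method names, and the reviewer check are hypothetical; in the platform the terms are governed in the AWS Glue Data Catalog, not in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Glossary:
    reviewers: set[str]
    live: dict[str, str] = field(default_factory=dict)
    drafts: dict[str, str] = field(default_factory=dict)

    def propose(self, term: str, definition: str) -> None:
        """Save a plain-language definition as a draft. Nothing goes live."""
        self.drafts[term] = definition

    def promote(self, term: str, reviewer: str) -> None:
        """Move a draft to the live vocabulary; reviewers only."""
        if reviewer not in self.reviewers:
            raise PermissionError(f"{reviewer} is not a reviewer")
        self.live[term] = self.drafts.pop(term)


glossary = Glossary(reviewers={"registrar"})
glossary.propose("transfer student", "A student admitted with prior credit.")
pending_before = "transfer student" in glossary.drafts
live_before = "transfer student" in glossary.live

glossary.promote("transfer student", reviewer="registrar")
```

Keeping drafts in a separate mapping means a consumer that reads only `live` can never see an unreviewed definition.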
Redshift, inside the lakehouse. One governed definition.
Bedrock University is the flagship campus, and the one whose data lives in Amazon Redshift. Through SageMaker Lakehouse, its curriculum is brought into the same AWS Glue Data Catalog as the Iceberg campuses, so Redshift isn't a separate stack beside the lakehouse; it's a first-class citizen inside it, governed by the same master glossary and Lake Formation. That one governed vocabulary is consumed three independent ways: by an AI agent doing live NL→SQL on Redshift, by Athena, and by a live Amazon Quick dashboard. All three agree on the same governed number because they read one catalog. Define once, consume everywhere. Ask below; it queries the live warehouse.
Ask the governed warehouse
bedrock-university · read-only · governed by the Glue glossary
The one master glossary
These governed terms live ONCE in the Glue Data Catalog master glossary — the same vocabulary the lakehouse campuses, this Redshift campus, and Amazon Quick all bind to. So "enrolled headcount" means the same thing across the entire system.
One definition, three consumers, one answer
The headline proof, verified live: the same governed metric, computed independently by three different consumers — each reading the one Glue Data Catalog — agrees to the number.
| Fall 2025 | Agent · NL→SQL | Athena · Glue catalog | Amazon Quick · DirectQuery |
|---|---|---|---|
| Enrolled Headcount | 452 | 452 | 452 |
| Student Credit Hours | 1,778 | 1,778 | 1,778 |
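
The shape of the parity check in the table can be sketched locally, assuming the governed definition is stored once and every consumer builds its query from it. Everything below is a stand-in: the rows are toy data (not the Fall 2025 figures), the `enrollments` table and `term` column are assumed names, and all three "consumers" hit the same in-memory SQLite table, so this shows the check, not the live engines.

```python
import sqlite3

# Governed definition, stored once. The expression and filter come from the
# page; the table name and the `term` column are assumptions.
HEADCOUNT_EXPR = "COUNT(DISTINCT student_id)"
HEADCOUNT_FILTER = "status = 'enrolled'"


def headcount(conn: sqlite3.Connection, term: str) -> int:
    """Run the governed headcount definition for one academic term."""
    sql = (
        f"SELECT {HEADCOUNT_EXPR} FROM enrollments "
        f"WHERE {HEADCOUNT_FILTER} AND term = ?"
    )
    return conn.execute(sql, (term,)).fetchone()[0]


def parity(results: dict[str, int]) -> bool:
    """True when every consumer reports the same number."""
    return len(set(results.values())) == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollments "
    "(student_id INTEGER, course_id TEXT, term TEXT, status TEXT)"
)
conn.executemany("INSERT INTO enrollments VALUES (?, ?, ?, ?)", [
    (1, "MATH101", "Fall 2025", "enrolled"),
    (1, "HIST200", "Fall 2025", "enrolled"),    # same student, counted once
    (2, "MATH101", "Fall 2025", "enrolled"),
    (3, "MATH101", "Fall 2025", "dropped"),     # excluded by the status filter
    (4, "MATH101", "Spring 2025", "enrolled"),  # excluded by the term filter
])

# Stand-ins for the three consumers; each reuses the one definition.
results = {name: headcount(conn, "Fall 2025") for name in ("agent", "athena", "quick")}
```

The point of the design is that `HEADCOUNT_EXPR` and `HEADCOUNT_FILTER` appear exactly once; any consumer that rebuilt the metric by hand could drift from the others without the parity check noticing why.
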
COUNT(DISTINCT student_id) WHERE status='enrolled' on Redshift; Athena and the Amazon Quick dashboard (bu-parity, DirectQuery) run the same over the Redshift-origin curriculum in the Glue lakehouse catalog, governed by the identical glossary definition the agent uses. One definition, three consumers, one number.
The technology
Every layer is a managed AWS service. Here's what each does, how we use it, and how it scales — newest capabilities from the AWS Summit New York 2026 included.
How the platform evolves
We adopted the managed pieces as they reached GA, designed the data model to ride preview features, and aligned to the two published contracts of AWS Context — so adopting it later is re-pointing a tool, not a re-platform.
The AWS Context horizon
AWS Context is AWS's own managed knowledge-graph + agentic-search service. It maps relationships across your data, serves governed relationships, business rules, and domain knowledge to agents at runtime, learns from how agents use it, and — crucially — stores its metadata as Iceberg in S3 Tables: exactly the substrate this layer already writes.
Because our data is Iceberg-in-S3-Tables, our semantics live in the Glue Data Catalog, and all agent access is MCP behind the gateway, adopting AWS Context becomes a migration of one tool target — the institution's accumulated meaning carries straight over.