FULLY MANAGED · CATALOG-FIRST · AWS-NATIVE

The semantic layer
your AI agents reason on.

The Bedrock University System holds its truth in scattered systems — campuses on different data engines (a lakehouse for most, Amazon Redshift for the flagship), course catalogs, policies, transcripts, and the working memory of an entire fleet of AI agents. The Knowledge Layer turns that sprawl into one governed meaning layer — a single master glossary where "enrolled headcount" means the same thing on every engine and to every consumer. Built entirely on managed AWS services, designed to ride AWS Context the day it lands.

Executive summary

What we built, in one minute.

🗂️

The catalog is the semantic layer

Business meaning lives natively in the AWS Glue Data Catalog — table and column descriptions, a governed glossary, and skill assets that ground agents in trusted definitions instead of guesswork. No bespoke ontology store to run.

🧠

Two retrieval spines, one truth

Structured questions become governed SQL over Apache Iceberg via Athena. Unstructured questions hit a Bedrock Knowledge Base on S3 Vectors and return answers with citations. Same catalog, same governed definitions.

🤝

Agents as first-class citizens

Every agent in the fleet reaches the layer as MCP tools through the AgentCore Gateway — to consume knowledge and to contribute it back under review. One layer, the whole fleet.

🌐

Onboard anything

A manifest onboards a new source in minutes — structured data, an S3 document store, or an external SaaS like a personal notes graph (federated, never copied). The layer grows with the institution.

One layer over a distributed system

Mapping, not migration — meaning is the product.

CONSUMERS

Open WebUI

AgentCore fleet

Campus staff & advisors

▼ AgentCore Gateway · MCP · per-user identity ▼

KNOWLEDGE LAYER · MCP TOOLS (one gateway)

kl_ask · kl_metrics: governed SQL · all campuses

bu_ask · bu_glossary: direct Redshift surface

kl_retrieve · kl_recall

kl_search · kl_describe · kl_sources

kl_propose

▼ ONE master glossary in the Glue Data Catalog governs every engine ▼

SAGEMAKER LAKEHOUSE · ONE CATALOG OVER MANY ENGINES

Glue Data Catalog: master glossary · governs all engines

S3 Tables / Iceberg: 3 campuses · Athena

Amazon Redshift: flagship · FEDERATED into the catalog

Amazon Quick: DirectQuery BI · governed 452

Bedrock Knowledge Base: S3 Vectors · cited RAG

Lake Formation: one governance plane

▼ Iceberg-in-S3 + federated catalogs — the substrate AWS Context reads ▼

ON THE HORIZON

AWS Context: managed knowledge graph + agentic search (coming soon)

Many campuses. Many engines. One definition.

Each campus keeps its own systems — three on the Iceberg lakehouse, the Bedrock University flagship on Amazon Redshift. The layer maps their meaning into one governed graph with a single master glossary, so "how many students are enrolled this fall?" gets the same trustworthy answer on every engine, asked of one campus or the whole system, by a person, an AI agent, or Amazon Quick.

Live knowledge graph

A real-time projection of the layer — campuses, students, courses, enrollments, the governed glossary, agent skill assets, document corpora, and the tools the fleet uses. Drag to explore. Click any node for detail.

Onboard a data source

A new source joins the knowledge layer by declaring a small manifest — no code. Pick a type, describe it, and run a dry run to see exactly what would be enriched, indexed, and how it joins the graph. The dry run is read-only — it never writes.
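As a rough illustration of the manifest-plus-dry-run idea described above, here is a minimal sketch. The field names (`name`, `source_type`, `location`), the three type labels, and the wording of the plan steps are assumptions made for this example, not the layer's actual manifest schema. The point it shows is that the dry run only builds a plan and never performs a write.

```python
from dataclasses import dataclass, field

# Hypothetical type labels for the three kinds of source the page mentions.
SOURCE_TYPES = {"structured", "s3_documents", "external_saas"}


@dataclass(frozen=True)
class Manifest:
    """A small declarative description of a new source (hypothetical schema)."""
    name: str
    source_type: str
    location: str
    description: str = ""


@dataclass
class DryRunPlan:
    """What onboarding would do. writes_performed stays 0 in a dry run."""
    steps: list = field(default_factory=list)
    writes_performed: int = 0


def dry_run(manifest: Manifest) -> DryRunPlan:
    """Build a read-only preview of the onboarding plan."""
    if manifest.source_type not in SOURCE_TYPES:
        raise ValueError(f"unknown source type: {manifest.source_type!r}")

    plan = DryRunPlan()
    if manifest.source_type == "structured":
        plan.steps.append(f"enrich: describe tables and columns of {manifest.name}")
        plan.steps.append("join: link columns to governed glossary terms")
    elif manifest.source_type == "s3_documents":
        plan.steps.append(f"index: embed documents under {manifest.location}")
        plan.steps.append("join: attach the corpus to the graph as a document node")
    else:  # external_saas
        plan.steps.append(f"federate: query {manifest.name} in place, no copy")
        plan.steps.append("join: expose the source as a federated node")
    return plan


notes = Manifest(name="notes_graph", source_type="external_saas",
                 location="https://example.invalid/notes")
plan = dry_run(notes)
for step in plan.steps:
    print(step)
```

Keeping the plan as plain data makes it easy to show to a reviewer before anything is actually onboarded.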

Business glossary

The controlled vocabulary, governed in the AWS Glue Data Catalog. Browse what's defined, propose a new term in plain language, and (as a reviewer) promote it live — where every tool and agent immediately consumes it. This is §5.6 of the guide, working.

Live terms

Propose a new term

A steward writes plain business language. It's saved as a reviewed draft — nothing goes live until promoted.

Fields: Term name · Short definition · Usage / long definition · Applies to (optional)
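The propose-then-promote flow can be pictured as a small state machine. The sketch below is illustrative only: the `Glossary` class, its method names, and the `reviewer_is_authorized` flag are hypothetical, and a real deployment would persist terms in the AWS Glue Data Catalog rather than in memory. The four fields mirror the form above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Term:
    name: str
    short_definition: str
    usage: str
    applies_to: Optional[str] = None
    status: str = "draft"  # every proposal starts as a draft


class Glossary:
    """In-memory stand-in for the governed glossary (hypothetical API)."""

    def __init__(self):
        self._terms = {}

    def propose(self, name, short_definition, usage, applies_to=None):
        """A steward saves a draft. Nothing is visible to consumers yet."""
        if name in self._terms and self._terms[name].status == "live":
            raise ValueError(f"{name!r} is already live")
        term = Term(name, short_definition, usage, applies_to)
        self._terms[name] = term
        return term

    def promote(self, name, reviewer_is_authorized):
        """A reviewer promotes a draft to live."""
        if not reviewer_is_authorized:
            raise PermissionError("only a reviewer may promote a term")
        term = self._terms[name]
        term.status = "live"
        return term

    def live_terms(self):
        """The terms that tools and agents would consume."""
        return sorted(t.name for t in self._terms.values() if t.status == "live")


glossary = Glossary()
glossary.propose(
    "enrolled headcount",
    "Distinct students with an enrolled status in a term.",
    "Count each student once, regardless of how many courses they take.",
)
print(glossary.live_terms())   # the draft is not live yet
glossary.promote("enrolled headcount", reviewer_is_authorized=True)
print(glossary.live_terms())
```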

LIVE · BEDROCK UNIVERSITY — THE REDSHIFT CAMPUS, FEDERATED INTO THE LAKEHOUSE

Redshift, inside the lakehouse. One governed definition.

Bedrock University is the flagship campus, and the one whose data lives in Amazon Redshift. Through SageMaker Lakehouse its curriculum is brought into the same AWS Glue Data Catalog as the Iceberg campuses, so Redshift isn't a separate stack beside the lakehouse: it's a first-class citizen inside it, governed by the same master glossary and Lake Formation. That one governed vocabulary is consumed three independent ways (by an AI agent doing live NL→SQL on Redshift, by Athena, and by a live Amazon Quick dashboard), and all three agree on the same governed number because they read one catalog. Define once, consume everywhere. Ask below; it queries the live warehouse.

🛢️ Amazon Redshift: curriculum warehouse (RMS)

↔

🏛️ SageMaker Lakehouse: federates Redshift into the catalog

↔

📖 Glue Data Catalog: ONE master glossary · governs every engine

→

🤖 Agents: bu_ask · NL→SQL · MCP → 452

🔎 Athena: over the Glue catalog → 452

📊 Amazon Quick: DirectQuery dashboard → 452

Ask the governed warehouse

Ask a question about Bedrock University's curriculum — enrollments, programs, courses, completions. The agent writes governed Redshift SQL (grounded on the glossary), runs it, and answers with the SQL + provenance.

Live · Redshift Serverless bedrock-university · read-only · governed by the Glue glossary

The one master glossary

These governed terms live ONCE in the Glue Data Catalog master glossary — the same vocabulary the lakehouse campuses, this Redshift campus, and Amazon Quick all bind to. So "enrolled headcount" means the same thing across the entire system.

One definition, three consumers, one answer

The headline proof, verified live: the same governed metric, computed independently by three different consumers — each reading the one Glue Data Catalog — agrees to the number.

Fall 2025	Agent · NL→SQL	Athena · Glue catalog	Amazon Quick · DirectQuery
Enrolled Headcount	452	452	452
Student Credit Hours	1,778	1,778	1,778

The agent emits COUNT(DISTINCT student_id) WHERE status='enrolled' on Redshift; Athena and the Amazon Quick dashboard (bu-parity, DirectQuery) run the same over the Redshift-origin curriculum in the Glue lakehouse catalog — governed by the identical glossary definition the agent uses. One definition, three consumers, one number.
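To make the governed definition concrete, the sketch below runs the same `COUNT(DISTINCT student_id) WHERE status='enrolled'` shape against a tiny in-memory SQLite table. SQLite stands in for Redshift and Athena here, and the table name `enrollments`, the `term` and `course_id` columns, and the five toy rows are assumptions for illustration. They are not the live data and do not reproduce the 452 figure.

```python
import sqlite3

# The governed definition, written once. Table and `term` column are hypothetical.
ENROLLED_HEADCOUNT_SQL = """
    SELECT COUNT(DISTINCT student_id)
    FROM enrollments
    WHERE status = 'enrolled' AND term = ?
"""


def build_toy_warehouse():
    """Create an in-memory table with a handful of illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE enrollments "
        "(student_id TEXT, course_id TEXT, term TEXT, status TEXT)"
    )
    rows = [
        ("s1", "MATH101", "Fall 2025", "enrolled"),
        ("s1", "HIST200", "Fall 2025", "enrolled"),   # same student, second course
        ("s2", "MATH101", "Fall 2025", "enrolled"),
        ("s3", "MATH101", "Fall 2025", "withdrawn"),  # excluded by status
        ("s4", "MATH101", "Spring 2025", "enrolled"), # excluded by term
    ]
    conn.executemany("INSERT INTO enrollments VALUES (?, ?, ?, ?)", rows)
    return conn


def enrolled_headcount(conn, term):
    """Apply the governed definition: distinct enrolled students in a term."""
    return conn.execute(ENROLLED_HEADCOUNT_SQL, (term,)).fetchone()[0]


conn = build_toy_warehouse()
headcount = enrolled_headcount(conn, "Fall 2025")

# An ungoverned variant that counts rows instead of distinct students.
row_count = conn.execute(
    "SELECT COUNT(*) FROM enrollments WHERE status = 'enrolled' AND term = ?",
    ("Fall 2025",),
).fetchone()[0]

print(headcount, row_count)
```

The two numbers differ on the toy data because one student takes two courses. A shared definition is meant to stop exactly that kind of drift between consumers.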

The technology

Every layer is a managed AWS service. Here's what each does, how we use it, and how it scales — newest capabilities from the AWS Summit New York 2026 included.

How the platform evolves

We adopted the managed pieces as they reached GA, designed the data model to ride preview features, and aligned to the two published contracts of AWS Context — so adopting it later is re-pointing a tool, not a re-platform.

The AWS Context horizon (coming soon)

AWS Context is AWS's own managed knowledge-graph + agentic-search service. It maps relationships across your data, serves governed relationships, business rules, and domain knowledge to agents at runtime, learns from how agents use it, and — crucially — stores its metadata as Iceberg in S3 Tables: exactly the substrate this layer already writes.

Because our data is Iceberg-in-S3-Tables, our semantics live in the Glue Data Catalog, and all agent access is MCP behind the gateway, adopting AWS Context becomes a migration of one tool target — the institution's accumulated meaning carries straight over.
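One way to picture "a migration of one tool target" is a registry that maps a stable tool name to a swappable backend. Everything in this sketch is hypothetical: the `ToolRegistry` class and both backend functions are placeholders that return labeled strings. Nothing here calls a real AWS Glue or AWS Context API. Only the tool name `kl_describe` comes from the tool list above.

```python
from typing import Callable, Dict


def glue_catalog_describe(entity: str) -> str:
    """Placeholder for today's backend (no real AWS call is made)."""
    return f"[glue-catalog] description of {entity}"


def aws_context_describe(entity: str) -> str:
    """Placeholder for a future backend (purely illustrative)."""
    return f"[aws-context] description of {entity}"


class ToolRegistry:
    """Maps a stable tool name to whichever backend currently serves it."""

    def __init__(self) -> None:
        self._targets: Dict[str, Callable[[str], str]] = {}

    def register(self, tool_name: str, target: Callable[[str], str]) -> None:
        # Re-registering a name replaces its target. Callers are unaffected.
        self._targets[tool_name] = target

    def call(self, tool_name: str, entity: str) -> str:
        return self._targets[tool_name](entity)


registry = ToolRegistry()
registry.register("kl_describe", glue_catalog_describe)
before = registry.call("kl_describe", "enrolled headcount")

# Re-point the same tool name at a different backend.
registry.register("kl_describe", aws_context_describe)
after = registry.call("kl_describe", "enrolled headcount")

print(before)
print(after)
```

Agents keep calling the same tool name throughout, so only the registration line changes.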
