The Semantic Layer in Modern Data Architecture

Why data lakes, lakehouses, and meshes still fail without a governed semantic layer

ENTERPRISE

At Volvo Cars, data is generated at enormous scale across manufacturing lines, supply chain systems, dealer networks, connected vehicles, and procurement platforms. The challenge is producing meaning from that volume. Across 50+ teams, the same metric was calculated differently. “Duty savings” alone had 23 conflicting definitions. Finance calculated it one way for the earnings call. Operations calculated it differently for performance reviews. Both were “correct” in their local context, yet both produced different numbers in the same executive meeting.

Better pipelines, newer warehouses, or trendier architectures cannot solve this problem on their own. Over the past fifteen years, the industry has produced a rich landscape of data architectures: data lakes, modern warehouses, lakehouses, data fabrics, and data meshes. Each addresses real problems. None of them, by themselves, solve the problem of meaning.

This post traces the evolution of modern data architectures and explains where and why they fall short. It also shows how to move from fragmented reporting to a governed, reusable abstraction that enables consistent decisions across the organization, regardless of which architecture sits underneath.

Interactive demo: Metric Trust Explorer shows how three source systems compute the same procurement metrics differently, and how a governed semantic layer resolves the inconsistency. Source on GitHub.


How We Got Here: The Evolution of Enterprise Data

The infrastructure underlying data has transformed dramatically. Fifteen years ago, a 1.5 PB data warehouse was an anomaly outside of Big Tech circles. Today, managing tens or even hundreds of petabytes is routine for large enterprises, and companies like Snowflake and Databricks handle exabytes of customer data across their platforms.

Three shifts made this possible. Storage moved from slow, expensive SAN-attached appliances to commodity SSDs and expandable on-demand object storage at a fraction of the historical cost. Networking moved from 1G Ethernet in most back-office environments to 40G and 100G cloud infrastructure, drastically improving data throughput and latency. Compute moved from expensive SMP servers and MPP appliances like Netezza and Greenplum to distributed computing frameworks that democratized access to scalable, high-performance processing.

Despite these advances, the core challenges have only grown in scale. Delivering trustworthy data reliably. Integrating data from disparate sources, especially those acquired through mergers and acquisitions. And managing data efficiently, where traditional governance models often become bottlenecks rather than enablers. Data architecture was once a straightforward task involving ETL tools and a data warehouse, with the choice between Inmon or Kimball models. Today the landscape is far more complex.


The Architecture Landscape: What Each Pattern Solves

Modern data architectures have proliferated because the problems are real and multifaceted. Each pattern addresses a genuine need. Understanding what they solve, and what they leave unresolved, is essential for making sound architectural decisions.

Data Lakes: Flexible Storage, Deferred Meaning

The data lake, coined by James Dixon in 2011, provides a unified repository for all data from various sources on affordable, scalable storage. The key innovation was decoupling storage from compute, allowing organizations to store raw data in whatever form the source provides and process it later using tools like Spark, Hive, or Presto.

Data lakes introduced the concept of logical zoning, recently popularized as the “medallion” or “multi-hop” architecture. Data progresses through layers: a landing zone for raw ingestion, a conformed zone for trusted transformations, and a serving zone for business-ready products. This layered approach provides schema flexibility, cost efficiency, and the ability to reprocess data as requirements change.

The limitation is structural. A data lake stores data without encoding business meaning. Every consumer must independently interpret what a number represents. A centralized data team managing a monolithic lake can also limit organizational agility, because delivering new data sources or addressing new use cases requires complex coordination across ingestion, processing, and serving pipelines.

Data Virtualization: Real-Time Access Without a Shared Definition

Data virtualization creates a logical layer that enables applications to retrieve and manipulate data without detailed knowledge of the underlying systems. It abstracts source complexity, connects to multiple backends in real time, and provides centralized security. At its best, it eliminates data movement and gives users on-demand access to the freshest data available.

The drawbacks are significant for analytical workloads. Cross-system joins are notoriously difficult to optimize, because the virtualization layer must predict execution costs across systems it does not control. Complex analytical queries involving large joins, table scans, and data shuffling are often incompatible with virtualization’s access patterns. Most importantly, virtualization does not impose a shared business model. It federates access without federating meaning. Two consumers querying the same virtualized endpoint can still interpret the results differently.

Modern Data Warehouses: Performance With Inherited Limitations

The modern data warehouse combines the business value of relational analytics with the flexibility of a data lake. In this architecture, the lake replaces the traditional staging area and handles large-scale transformations, while business-oriented datasets are published into a dimensional model for reporting. Platforms like Snowflake, Databricks, Azure Synapse, BigQuery, and Amazon Redshift decouple storage from compute and deliver strong query performance.

This architecture handles structured, unstructured, and streaming data. It supports real-time analytics, elastic scalability, and flexible data modeling including schema-on-read and schema evolution. However, the hybrid nature introduces complexity: orchestrating multiple technologies requires specialized skills, data duplication across the lake and warehouse complicates governance, and compliance certification (PCI DSS, HIPAA) becomes harder when the attack surface spans multiple components.

Data Fabric: Metadata-Driven Integration

Data fabric is a design concept that uses continuous analytics over metadata to facilitate the design, deployment, and utilization of integrated data across hybrid and multi-cloud environments. Its five pillars include an augmented data catalog, knowledge graphs enriched with semantics, a metadata activation and recommendation engine, AI-powered data preparation and delivery, and automated orchestration.

The vision is compelling: a platform-agnostic layer that abstracts the complexity of diverse data environments and makes data a strategic asset regardless of origin or platform. In practice, much of the technology needed to fully realize this vision is still in its infancy. The effectiveness depends heavily on metadata quality, and integrating metadata from incompatible vendor systems is inherently challenging. In regulated environments, ML-based reasoning may not be acceptable due to human oversight requirements. Building and maintaining a useful knowledge graph remains complex and rarely automatable.

Data Lakehouse: Pragmatic Convergence

The data lakehouse, popularized by Databricks and later championed by Bill Inmon, combines the scalability and cost-effectiveness of a data lake with the analytical infrastructure of a data warehouse. It stores data in open formats (Parquet, ORC) on low-cost object storage, uses technologies like Delta Lake or Apache Iceberg for ACID transactions, supports schema enforcement and evolution, and decouples storage from compute for independent scaling.

The lakehouse is arguably the most pragmatic of the modern patterns, and its low barrier to adoption means nearly all vendors claim to implement it. However, it focuses primarily on technology, often overlooking people and processes. It pays insufficient attention to data silos, business alignment, SLAs, and SLOs. Centralized governance and schema enforcement can hinder organizational agility, and while the lakehouse provides the technical capabilities for large-scale transformation, it does not fully address the challenges of data integration: managing complexity, metadata, and context mapping rules.

Data Mesh: Organizational Design for Data

Data Mesh, introduced by Zhamak Dehghani, is a sociotechnical approach that shifts data ownership from centralized teams to the business domains where data originates. It treats data as a product with published quality guarantees, promotes self-serve infrastructure, and federates governance computationally rather than bureaucratically.

The principles are sound. Domain teams understand their data best and should own its quality. A product mindset improves discoverability, trust, and reuse. But implementation is demanding. Domain teams may lack the expertise for data modeling, lifecycle management, and API creation. Defining appropriate domain boundaries often requires organizational restructuring. Building a self-service data infrastructure at the domain level requires sophisticated platform capabilities, including federated governance, data lineage, and interoperability, that are still emerging. Gartner has predicted that Data Mesh may become obsolete before reaching the plateau of productivity, largely because the organizational change it demands outpaces most enterprises’ capacity for transformation.

The Common Gap: Meaning

Each of these architectures addresses real challenges in storage, compute, integration, governance, or organizational structure. Yet none of them, by design, solves the problem of shared business meaning. A data lake stores data without defining what it represents. A lakehouse adds transactional guarantees without adding business definitions. A data mesh distributes ownership but can fragment meaning across domains. A data fabric attempts metadata-driven integration but often cannot reconcile semantic differences between systems.

The result is that organizations can build technically sophisticated data platforms and still end up in executive meetings where the first twenty minutes are spent debating whose number is right.


The Semantic Layer: The Missing Piece

A semantic layer is:

A governed, business-aligned abstraction over raw data that defines metrics once, correctly, and makes them reusable across dashboards, tools, and AI systems.

It separates:

  • Raw tables (fact_sales, store_dim, transaction_log)
  • From business meaning (Supply Growth %, Comp Store Sales, Service Time, Waste Rate)

Without this separation, every new dashboard becomes an interpretation exercise. Analysts reverse-engineer SQL to understand what a number means. Stakeholders lose trust when they cannot verify a number is right, even when the underlying data is sound. The semantic layer is what turns data infrastructure into organizational intelligence, regardless of whether that infrastructure is a lakehouse, a mesh, or a hybrid of both.

Ownership of this layer is ownership of trust. Without it, every metric is a negotiation.


Why Global Scale Makes This Non-Negotiable

At a company like Volvo Cars, the absence of a semantic layer shows up as a daily operational failure with measurable cost.

Fragmented Systems Tell Conflicting Stories

Volvo Cars procurement runs across three legacy systems: VGS for supplier governance, VPC for price and cost management, and SI+ for implementation tracking. A buyer searching for a contract clause might check VGS for the agreement, open a PDF attachment, cross-reference VPC for price history, verify SI+ for implementation records, and then confirm the result with a colleague. That single lookup can take 15 minutes and still produce ambiguous answers.

When I shadowed procurement buyers, the pattern was unmistakable: they spent more time finding information than using it. The root cause was the absence of a shared definition layer that reconciled what “contract value,” “negotiated savings,” or “supplier lead time” actually meant across these three systems. Data existed in each system; the definitions did not. Each system encoded its own version, and the buyer became the human semantic layer, manually stitching meaning together under time pressure.

A governed semantic layer eliminates this reconciliation tax. It defines “negotiated savings” once, computes it from the authoritative source, and exposes it consistently regardless of whether the consumer is a Power BI dashboard, a procurement AI assistant, or a buyer running an ad-hoc query.

Customs and Trade Compliance Demands Precision

Volvo Cars operates import and export flows across more than 100 markets. Customs compliance decisions like HS classification, origin determination, and free trade agreement eligibility are high-risk, high-volume, and governed by natural-language policy that varies by jurisdiction. A wrong classification may only surface years later during an audit, but the financial and legal impact is immediate once discovered.

A metric like “duty cost per unit” sounds simple but is deceptively complex. It depends on classification accuracy, origin qualification under specific FTAs, valuation methodology, and whether special procedures like inward processing or returned goods relief apply. If the trade compliance team, the finance team, and the logistics team each compute it from their own data sources with their own assumptions, the result is three numbers that do not agree and no clear basis for choosing among them.

The semantic layer resolves this by encoding the full calculation logic, including jurisdiction-specific rules, exclusion criteria, and time boundaries, in a single governed definition. When a customs analyst asks “what is our average duty rate for battery components imported under EUKOR?”, the answer comes from the same logic that feeds the CFO’s landed cost dashboard and the trade compliance team’s audit trail.

Cross-Functional Decisions Require Common Ground

The most expensive consequence of missing semantic governance is the time leadership spends in meetings arguing about whose numbers are right. When manufacturing defines “first-pass yield” differently from quality engineering, the weekly operations review becomes a debate about methodology rather than a conversation about improvement. When procurement reports “cost savings” using a different baseline than finance, the quarterly business review surfaces distrust rather than decisions.

At Volvo Cars, data flows through dozens of domain teams, each with legitimate reasons to model reality differently for their own purposes. The semantic layer does not force everyone into a single model. Instead, it establishes a shared set of certified metrics that serve as the common ground for cross-functional decisions, while allowing domain teams to maintain their own specialized views for local analysis. This aligns with the data mesh principle of domain ownership while solving the fragmentation problem that pure decentralization creates.

Scale Compounds the Problem

A small team can survive without a semantic layer. People know each other, can ask clarifying questions, and share institutional knowledge informally. At Volvo Cars scale, with 1,169 suppliers, 7,000+ contracts, operations across 100+ markets, and data flowing through manufacturing, logistics, procurement, sales, and after-market systems, informal coordination breaks down completely. Every new dashboard, every new AI feature, and every new analyst who joins the organization multiplies the inconsistency surface area.

The semantic layer is the infrastructure that makes organizational scaling possible without proportional growth in confusion. This is true whether the underlying platform is a lakehouse, a modern warehouse, or a federated mesh. The architecture may change. The need for shared meaning does not.


What the Semantic Layer Unlocks

Once a governed semantic layer exists, it becomes the foundation for capabilities that are impossible to build reliably without it.

Trustworthy AI Assistants Over Enterprise Data

When I built a RAG-based procurement assistant at Volvo Cars, the hardest problem was ensuring that the AI’s answers matched the numbers on the official dashboard. That problem was harder than retrieval accuracy or prompt engineering alone. If a buyer asks the assistant “what is the current contract value for supplier X?” and the assistant returns a different number than the one in VGS, the buyer blames the AI.

A semantic layer gives the AI system the same governed metric definitions that power dashboards, reports, and analyst queries. The LLM does not need to interpret raw tables or infer business logic. It queries the semantic layer, which returns the certified answer: the same figure stakeholders already trust in official reporting.

Automated Compliance Monitoring at Scale

In customs and trade compliance, the semantic layer enables continuous monitoring that would be impossible with ad-hoc queries. Once “FTA utilization rate,” “average clearance time by corridor,” and “classification override frequency” are defined as governed metrics, they can be tracked automatically against thresholds. When the FTA utilization rate for a specific corridor drops below the expected range, the system can flag it for review before it becomes an audit finding rather than after.

This shifts compliance from a reactive, sample-based audit practice to a proactive, continuous assurance model. The semantic layer is what makes the monitoring trustworthy, because every alert traces back to a metric with a clear definition, a known data source, and an accountable owner.

Self-Service Analytics That Do Not Create Chaos

One of the most common objections to self-service BI is that it produces inconsistent numbers. A regional manager builds a report with one filter logic, another manager builds a similar report with different filters, and the two reports disagree. Data virtualization can federate access to the underlying sources, but it cannot prevent consumers from interpreting the results differently. The semantic layer eliminates this class of problem by constraining self-service exploration to certified metric definitions. Users keep freedom to slice, filter, and drill; the certified metric definitions keep the calculation logic fixed.

At Volvo Cars, this means a procurement analyst in Gothenburg and a logistics manager in Ghent can both explore supplier performance data independently, ask different questions, and arrive at answers that are internally consistent because they share the same metric foundation.

Faster Onboarding for New Teams and Tools

Every time Volvo Cars onboards a new analytics tool, integrates with a partner system, or stands up a new data team, the same question arises: where is the data, and what does it mean? Without a semantic layer, the answer lives in tribal knowledge, Confluence pages that may or may not be current, and SQL queries buried in someone’s personal folder. This is the data platform challenge at its core: technical capabilities exist for ingestion, transformation, and serving, while the catalog of what these outputs mean is missing.

With a semantic layer, the answer is codified, versioned, and queryable. A new data engineer can discover available metrics, understand their definitions, trace their lineage to source tables, and start building on top of them within days rather than weeks. This reduction in onboarding friction compounds over time as the organization grows. Semantic Layer


Where the Semantic Layer Fits in a Modern Stack

The architecture follows a layered pattern that complements, rather than replaces, whatever underlying platform the organization has chosen:

Raw > Curated > Semantic > BI / AI

Each layer has a distinct responsibility:

  • Fact tables: transactions, labor hours, inventory movements, logistics events
  • Dimension tables: plant, region, supplier, vehicle model, component
  • Aggregated views: daily_plant_performance, weekly_supplier_scorecard
  • Metric layer: calculated KPIs with business logic embedded

This maps cleanly onto the medallion architecture common in data lakes and lakehouses. The raw and conformed zones handle ingestion and transformation. The semantic layer sits above them, encoding business logic that the underlying storage and compute layers are agnostic to.

Modern tools that support this architecture:

  • Azure Synapse for warehouse and compute
  • dbt for transformation and metric definitions as code
  • Power BI semantic model as the consumption layer with governed datasets
  • Databricks / Fabric for unified analytics across both BI and ML workloads

These tools embody the guiding principles that will outlast any single architecture: separation of storage from compute, computational data governance as a first-class citizen, and treating data as a product with published quality guarantees and clear ownership.


Building It: Five Principles in Practice

Start With Business Decisions, Not Data

The most common failure mode in BI is starting from what data exists rather than what decisions need to be made. Start from the decisions; treat data availability as a constraint on how you answer them.

At Volvo Cars, the questions that mattered were:

  • A plant manager asks: “Why did our first-pass yield drop on the evening shift?”
  • A regional supply chain lead asks: “Which suppliers are consistently missing delivery windows?”
  • A procurement director asks: “Are we achieving negotiated cost savings targets across categories?”

The semantic layer is built around those decisions. Every metric that does not serve a decision is overhead.

Define Core Metrics With Precision

Each metric must have a clear definition, a specified data grain, inclusion and exclusion rules, time logic, edge case handling, and an accountable owner. For example, Supplier On-Time Delivery Rate requires specifying the numerator (deliveries received within the agreed window), the denominator (total scheduled deliveries in the period), exclusions (force majeure? rescheduled by Volvo?), time logic (rolling 30 days vs. calendar month), and edge cases (partial deliveries, split shipments).

These definitions belong in SQL-backed logic, not PowerPoint. A metric definition that cannot be executed remains an aspiration until it ships as code.

Build Reusable Data Models

As product owner, I defined what had to exist in the semantic layer, prioritized the build sequence by decision impact, pushed for metric consistency across consumption tools, and guarded against metric sprawl, because every new metric needs justification.

At Volvo Cars, where data flows through systems like VGS, SI+, and VPC, the semantic layer becomes the single point of reconciliation. Without it, each system tells its own version of the truth.

Balance Governance With Democratization

Self-service without governance produces chaos. Governance without self-service produces bottlenecks. The semantic layer resolves this tension by giving users freedom to explore while constraining them to trusted definitions.

Guardrails include certified datasets that are clearly marked as production-quality, role-based access controls, metric documentation with definitions, owners, refresh schedules, and known limitations, data freshness SLAs, and change management protocols that notify downstream consumers whenever a metric definition changes.

If a metric changes silently, trust collapses. In an organization like Volvo Cars, where engineering precision is a cultural value, silent data inconsistencies are especially corrosive.

Drive Adoption Through Behavior Change

A BI product without behavior change is a reporting artifact. Measure success by decision quality: track active usage by role, repeat usage patterns, decision impact (did supplier response time improve after scorecard adoption?), reduction in shadow reporting, and KPI movement correlation.

Adoption follows when you solve real workflow problems; polished charts alone rarely suffice. The semantic layer succeeds when people stop asking “where did this number come from?” and start asking “what should we do about it?”


Key Decisions and What I Learned

The Biggest Challenge

Behavior change across 50+ teams, without authority over any of them.

The 23 conflicting definitions were not caused by bad SQL or technical incompetence. They were caused by rational teams solving local optimization problems independently. Finance needed duty savings calculated for the earnings call format. Operations needed it for performance reviews. Procurement needed it for vendor negotiations. Each team had adapted the definition to their context, rationally, independently, and incompatibly.

No amount of data engineering solves a coordination problem. This is the lesson that data mesh, data fabric, and lakehouse advocates often understate: the hardest part of data architecture is getting humans who have built their workflows around a local definition to agree on one canonical version and then actually use it.

Start with five metrics before tackling all twenty-three

My first instinct was to standardize everything at once. Wrong. I started the parallel negotiation for all 23 definitions in month 1, and by month 2 progress had stalled across the board. Too many parallel conversations, too many stakeholders, no visible wins.

Course correction: I narrowed to 5 metrics that appeared in 80% of executive reports. Delivering canonical definitions for those 5 created immediate, visible credibility. Momentum from those wins carried the remaining 18 with far less resistance.

Bottom-up adoption after a top-down mandate failed

I tried a mandate first: “Everyone must use the metric registry.” Compliance was slow and grudging. Teams registered metrics on paper but did not change their actual reports. The mandate produced formal compliance without behavior change, which is the same as nothing.

What actually worked, the UNVALIDATED watermark: I added a visibility mechanism where any report or dashboard using a non-registered metric definition got an “UNVALIDATED” watermark. I did not block the report. I did not force anyone to change. I just made the discrepancy visible. Nobody wanted to present a watermarked report to their VP.

Social pressure accomplished what mandates could not. Within 6 weeks, 80% of teams had migrated to registered definitions; nobody wanted to explain an unregistered watermark in a VP meeting.

The adoption pattern, week by week:

  • Week 1-2: Executive dashboards migrated (4 dashboards, 12 metrics), top-down and non-negotiable.
  • Week 3-6: Champions from 15 teams completed training, started migrating their team reports.
  • Week 7-12: Watermark went live. 80% of reports migrated voluntarily within 2 weeks.
  • Week 12+: Remaining 20% migrated after their VPs asked “why does your report say UNVALIDATED?”

The hybrid leveraged executive attention: the SVP said “I love that the numbers finally match” in a meeting, directors asked their teams to adopt. Top-down created the demand; bottom-up created the supply.

Translation documents for resistant teams

Several teams pushed back because they did not understand why their definition was changing. Their version worked for their context. I built a “metric translation” for each team: “Your old definition counted X. The canonical definition excludes X for this reason: Y. Here is what changes for your specific reports and how to adjust.” This cut resistance dramatically. Resistance dropped once people understood what was changing and why.

Data contracts that prevent issues instead of only detecting them

Triggered by an 8% KPI discrepancy incident. I built data contracts: each source system documents schema, refresh cadence, quality SLAs, and breaking change notification rules. Any schema change requires sign-off before downstream pipelines refresh. Monitoring catches problems after they have polluted downstream reports. Contracts prevent them. The 8% incident was caused by a schema change that broke a JOIN condition silently. After implementing contracts: zero similar incidents in 12 consecutive months.

This aligns with the broader principle of computational data governance: governance practices, including lineage, quality, retention, and security policies, must be automated and integrated as first-class citizens in any mature data platform. Treating governance as an afterthought is what makes data systems fragile, regardless of how sophisticated the underlying architecture is.

Governance council as shared ownership instead of a pure approval gate

Created a metric governance council with representatives from analytics, engineering, legal, and operations. Quarterly reviews. Each team had voting rights on metric changes. The council functioned as an influence mechanism: co-owners advocate for the system internally, while a pure approval gate would have created bottlenecks. The council turned stakeholders from “people who must be consulted” into “people who own the standard.”

What I’d say no to

  • Self-service metric creation without governance. Teams wanted to create and publish new metrics freely. I allowed creation in sandboxed environments but required governance council review before any metric was published to shared dashboards. Self-service without governance leads to guaranteed metric proliferation and back to 23 conflicting definitions within 6 months.
  • Automated metric reconciliation. Engineering proposed auto-detecting and auto-correcting discrepancies. I rejected it because auto-correction without human review could mask legitimate data differences (e.g., regional tax treatment variations). The system flags discrepancies; humans investigate and resolve.

Business Impact

Metric Baseline After How Measured Timeline
Conflicting metric definitions 23 for core KPIs 0 for governed metrics Metric registry audit 6 months
Analytics QA time ~20% of analyst workweek on reconciliation ~6% Self-reported time tracking survey across 50+ teams 6 months
Executive meeting time on data debates ~20 min/meeting (estimated) ~0 min Qualitative feedback from 3 VPs 4 months
Data incidents (wrong numbers in reports) 2-3 per quarter 0 for 12 consecutive months Incident tracking system 12 months
Reports with UNVALIDATED watermark 80% at launch <5% Automated report status tracking 6 months
Metric registry coverage 0 metrics registered 40+ with canonical definitions, SQL, owner, version history Registry count 12 months
Framework org adoption One team pilot Adopted as org-wide standard Executive mandate after demonstrated results 9 months

How “zero data incidents” was measured: Before the governance framework, data quality issues were tracked reactively: incidents were logged when someone found wrong numbers in a report. After implementing data contracts and validation gates, we continued the same tracking mechanism. Zero incidents in 12 months meant: no schema changes propagated without sign-off, no validation gate failures reached production dashboards, and zero “my number vs. your number” escalations in executive reviews.


Where I’d Deliberately Constrained Automation

  1. Watermark instead of blocking. I never blocked a report from being published. I made non-compliance visible instead of punished. Blocking would have caused rebellion and workarounds. Visibility caused voluntary compliance. Preserving team autonomy while creating accountability was the key insight.
  2. No auto-correction of metric discrepancies. The system flags discrepancies; humans investigate. Auto-correction could mask legitimate data differences and remove the human conversation that is often where the real governance happens.
  3. No self-service metric publishing to shared dashboards. Sandboxed creation is free; publishing requires review. Freedom without governance creates metric chaos.
  4. No automation of governance council decisions. Every metric definition change requires human review and cross-functional sign-off. The hardest part of governance is the conversation between humans with different priorities, not automating rule enforcement. That conversation cannot be automated, and should not be.

The Hardest Lessons

Governance is a people problem wearing a data hat. The 23 conflicting definitions came from rational teams solving local problems independently. Fixing that takes shared ownership, visible consequences, and translation documents, not only a new lakehouse, data mesh, or fabric rollout on its own.

Mandates produce compliance; visibility produces behavior change. The UNVALIDATED watermark accomplished in 6 weeks what 3 months of mandates could not.

Start with five metrics before tackling all twenty-three. I learned this by failing at twenty-three first. Narrowing to 5 high-impact metrics that covered 80% of executive reports created visible wins and momentum for the remaining 18. This is the sequencing lesson I would apply on day one at any enterprise, and it mirrors the incremental development principle that applies to data platforms broadly: build incrementally, validate continuously, and anchor decisions in demonstrated ROI.

What I would do differently from the start: Build the metric translation documents before announcing the canonical definitions. Several teams pushed back because they did not understand what was changing for them specifically, even when they agreed with the goal. A proactive translation like “your old definition, the new definition, and here is what changes for your team” would have halved the resistance.


Why This Matters for AI

The semantic layer is the foundation for trustworthy AI systems, extending well beyond classic BI. When an LLM-powered assistant answers “What is our current supplier on-time rate?”, it must pull from the same governed metric that appears on the executive dashboard. If the AI layer and the BI layer define metrics independently, the organization ends up with two sources of truth, which effectively means zero.

At Volvo Cars, this connection between the semantic layer and AI is already concrete. The procurement AI assistant I built uses RAG to retrieve contract information from VGS, VPC, and SI+. But retrieval alone does not produce trustworthy answers. The assistant must also compute derived metrics like “total contract value including amendments” or “remaining commitment against annual volume targets.” If those computations are hardcoded in the AI pipeline separately from the BI layer, they will inevitably drift. The semantic layer is the shared contract that keeps both systems honest.

The same principle applies in customs and trade. An AI system that helps brokers classify goods under the Harmonized System needs access to governed definitions of product categories, tariff logic, and FTA eligibility rules. If the AI classifies a battery module as one HS code while the customs reporting dashboard uses another, the organization faces audit risk from its own internal inconsistency. The semantic layer ensures that classification logic, duty calculations, and compliance thresholds are defined once and consumed everywhere.

As AI-based services mature, they hold great potential to simplify and automate complex tasks like data classification, integration, and advanced analytics. But that potential depends entirely on the quality and consistency of the definitions those AI systems consume. Building the semantic layer correctly turns the jump from descriptive reporting to AI-assisted decision-making into an engineering reality: consistent definitions first, then automation on top. For a global operation like Volvo Cars, where decisions span procurement, manufacturing, customs, and logistics across 100+ markets, the semantic layer is the infrastructure that turns data into organizational intelligence.


Try the Demo

The Metric Trust Explorer brings this post to life with synthetic procurement data modeled after the VGS, VPC, and SI+ systems described above. It demonstrates three metrics (Supplier On-Time Delivery Rate, Negotiated Savings, and Active Contract Value), shows how each source system computes them differently, and visualizes how the governed semantic layer produces a single certified answer with full lineage.

The source code is available on GitHub.

Share: X (Twitter) Facebook LinkedIn