The Top 3 Ways to Implement a Semantic Layer

Over the last decade, we have seen some of the most exciting innovations emerge within the enterprise knowledge and data management spaces. The innovations with real staying power have proven to drive business outcomes and prioritize intuitive user engagement. Among them are the semantic layer (for breaking down the silos between knowledge and data) and, of course, generative AI (a topic that is often top of mind on today’s strategic roadmaps). Both have one thing in common: they show promise in addressing the age-old challenge of unlocking business insights from organizational knowledge and data, without the complexities of expensive data, system, and content migrations.

In 2019, Gartner published research emphasizing the end of “a single version of the truth” for data and knowledge management and predicting that by 2026, “active metadata” will power over 50% of BI and analytics tools and solutions, providing a structured and consistent approach to connecting, rather than consolidating, data.

By employing semantic components and standards (through metadata, business glossaries, taxonomy/ontology, and graph solutions), a semantic layer arms organizations with a framework to aggregate and connect siloed data/content, explicitly provide business context for data, and serve as the layer for explainable AI. Once connected, independent business units can use the organization’s semantic layers to locate and work with not only enterprise data, but their own, unit-specific data as well. 

Incorporating a semantic layer into enterprise architecture is not just a theoretical concept; it’s a practical enhancement that transforms how organizations harness their data. Over the last ten years, we’ve worked with a diverse set of organizations to design and implement the components of a semantic layer. Many organizations we work with support a data architecture that is based on relational databases, data warehouses, and/or a wide range of content management, cloud, or hybrid cloud applications and systems that drive data analysis and analytics capabilities. These architectures do not mean that organizations need to start from scratch or overhaul their working enterprise architecture in order to adopt a semantic layer. On the contrary, it is more effective to shift the focus of metadata and data modeling or design efforts toward adding models and standards that capture business meaning and context, in a manner that provides the least disruptive starting point.

Though we’ve been implementing the individual components for over a decade, it is only in the last couple of years that we’ve been integrating them all to form a semantic layer. The maturity of approaches, technologies, and awareness has combined with the growing needs of organizations and the AI revolution to create this opportunity now.

In this article, I will explore the top three common approaches we are seeing at play in order to weave this data and knowledge layer into the fabric of enterprise architecture, highlighting the applications and organizational considerations for each.

1. A Metadata-First Logical Architecture: Using Enterprise Semantic Layer Solutions

This is the most common and scalable model we see across various industries and use cases for enterprise-wide applications. 

Architecture 

Implementing a semantic layer through a metadata-first logical architecture involves creating a logical layer that abstracts the underlying data sources by focusing on metadata. This approach establishes an organizational logical layer through standardized definitions and governance at the enterprise level while allowing for additional, decentralized components and solutions to be “pushed,” “published,” or “pulled from” specific business units, use cases, and systems/applications at a set cadence. 

Pros

Using middleware solutions like a data catalog or an ontology/graph store, organizations are able to create a metadata layer that abstracts the underlying complexities, offering a unified, real-time view of data based on metadata only. This allows organizations to abstract access, move away from application-centric approaches, and analyze data without the need for physical consolidation. This model leverages the capabilities of standalone systems or applications to manage semantic layer components (such as metadata, taxonomies, and glossaries) while providing centralized storage for those components to create a shared, enterprise semantic layer. Core or shared data definitions are managed consistently at the enterprise level, while individual teams keep the flexibility to manage their own secondary and group-level semantic requirements.
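To make the split between enterprise-level and group-level definitions concrete, here is a minimal Python sketch of a metadata-only layer. Everything in it is an illustrative assumption rather than a specific product's API: the class names (`TermDefinition`, `SemanticLayer`), the methods, and the source identifiers such as `crm.accounts` are hypothetical. The layer stores only definitions and pointers to where data lives, never the data itself.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class TermDefinition:
    """Metadata describing a business term; no data values are stored here."""
    name: str
    definition: str
    sources: tuple[str, ...]  # hypothetical pointers to where the data lives


@dataclass
class SemanticLayer:
    """Enterprise-level core terms plus optional business-unit extensions."""
    core: dict[str, TermDefinition] = field(default_factory=dict)
    unit_terms: dict[str, dict[str, TermDefinition]] = field(default_factory=dict)

    def publish_core(self, term: TermDefinition) -> None:
        self.core[term.name] = term

    def publish_unit(self, unit: str, term: TermDefinition) -> None:
        # Core definitions are governed centrally, so a unit may add its own
        # terms but may not redefine a core one.
        if term.name in self.core:
            raise ValueError(f"'{term.name}' is a core term and is managed centrally")
        self.unit_terms.setdefault(unit, {})[term.name] = term

    def resolve(self, name: str, unit: Optional[str] = None) -> TermDefinition:
        # Look in the shared core first, then in the caller's unit-level terms.
        if name in self.core:
            return self.core[name]
        if unit is not None and name in self.unit_terms.get(unit, {}):
            return self.unit_terms[unit][name]
        raise KeyError(name)


# Hypothetical example data.
layer = SemanticLayer()
layer.publish_core(TermDefinition(
    "customer",
    "A party with at least one signed contract",
    ("crm.accounts", "warehouse.dim_customer"),
))
layer.publish_unit("marketing", TermDefinition(
    "lead_score",
    "Marketing's propensity-to-buy score for a prospect",
    ("marketing_db.lead_scores",),
))

print(layer.resolve("customer").sources)
print(layer.resolve("lead_score", unit="marketing").definition)
```

In this sketch, a consumer asks the layer where "customer" lives and what it means, then queries the listed sources in place, which is the sense in which the layer connects rather than consolidates.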

Cons

Implementing a semantic layer as a metadata architecture or logical layer across enterprise systems requires planning in phases and incremental development to maintain cohesion and prevent fragmentation of shared metadata and semantic components across business groups and systems. Additionally, depending on the selected synchronization approach of the layer with downstream/upstream applications (push vs. pull), data orchestration and ETL pipelines will need to plan for a centralized vs. decentralized orchestration that ensures ongoing alignment. 
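The push vs. pull distinction can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a reference design: `CentralGlossary`, `DownstreamApp`, and the integer version counter are hypothetical names, and a real deployment would sit behind an orchestration tool or API rather than in-process callbacks.

```python
from typing import Callable


class CentralGlossary:
    """Hypothetical central store of shared term definitions."""

    def __init__(self) -> None:
        self.version = 0
        self.terms: dict[str, str] = {}
        self._subscribers: list[Callable[[int, dict[str, str]], None]] = []

    def subscribe(self, callback: Callable[[int, dict[str, str]], None]) -> None:
        self._subscribers.append(callback)

    def update(self, name: str, definition: str) -> None:
        self.terms[name] = definition
        self.version += 1
        # Push: the central layer orchestrates delivery to every subscriber.
        for callback in self._subscribers:
            callback(self.version, dict(self.terms))


class DownstreamApp:
    """Hypothetical consuming system that keeps a local copy of the terms."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.version = 0
        self.terms: dict[str, str] = {}

    def receive(self, version: int, terms: dict[str, str]) -> None:
        self.version, self.terms = version, terms

    def pull(self, glossary: CentralGlossary) -> bool:
        # Pull: the application decides when to check, e.g. on a set cadence.
        if glossary.version == self.version:
            return False  # already aligned, nothing to do
        self.receive(glossary.version, dict(glossary.terms))
        return True


glossary = CentralGlossary()
pushed_app = DownstreamApp("bi_dashboard")
pulled_app = DownstreamApp("cms")
glossary.subscribe(pushed_app.receive)

glossary.update("customer", "A party with at least one signed contract")

# The subscriber is current immediately; the pulling app is stale until
# its next scheduled pull.
stale_before_pull = pulled_app.version != glossary.version
pulled_app.pull(glossary)
```

The trade-off the sketch surfaces is where alignment is enforced: with push, the central layer owns the schedule; with pull, each application does, and the window of staleness depends on its cadence.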

Best Suited For

This is our most frequently deployed approach, and it is well suited for organizations that want to balance standardization with the need for business unit or application-level agility in data processing and operations in different parts of the business.

2. Built-for-Purpose Architecture: Individual Tools with Semantic Capabilities

This model allows for greater flexibility and autonomy at the business unit or functional level. 

Architecture 

This architecture is a distributed model that leverages the capabilities of each standalone system or application to own semantic layer components, without a connected technical framework or governance structure at the enterprise level for shared semantics. With this approach, organizations typically identify establishing semantic standards as a strategic initiative, but each individual team or department (marketing, sales, product, data teams, etc.) is responsible for creating, executing, and managing its own semantic components (metadata, taxonomies, glossaries, graph, etc.), tailored to its specific needs and requirements.

Most knowledge and data solutions, such as content or document management systems (CMS/DMS), digital asset management (DAM) systems, customer relationship management (CRM) systems, and data analytics/BI dashboards (such as Tableau and Power BI), have inherent capabilities to manage simple semantic components (although with varied maturity and feature flexibility levels). This decentralized architecture results in the implementation of multiple system-level semantic layers.

Let’s take SharePoint, an enterprise document and content collaboration platform, as an example. For organizations that are in the early stages of growing their semantic capabilities, we leverage the Term Store for structuring metadata and taxonomy management within SharePoint, which allows teams to create a unified language, fostering consistency across documents, lists, and libraries. This helps with information retrieval and also enhances collaboration by ensuring a shared understanding of key terms. Similarly, Salesforce, a renowned CRM platform, offers semantic capabilities that enable teams across sales, marketing, and customer service to define and interpret customer data consistently across various modules.

Pros

This decentralized model promotes agility and empowers business units to leverage their existing platforms (that are built-for-purpose) as not just data/content repositories but as dynamic sources of context and alignment—driving consistent understanding of shared data and knowledge assets for specific business functions.

Cons

However, this decentralized approach typically forces users who need a cohesive view of organizational content and data to assemble it through separate interfaces. Data governance teams or content stewards are also likely to manage each system independently. This leads to data silos, “semantic drift,” and inconsistency in data definitions and governance (where duplication and data quality issues arise). This ultimately results in misalignment between business units, as they may interpret data elements differently, leading to confusion and potential inaccuracies.
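One way to make semantic drift visible is to compare the glossaries exported from each system and flag terms whose definitions disagree. The Python sketch below assumes each system can export its terms as a simple name-to-definition mapping; the function name `find_semantic_drift`, the system labels, and the sample definitions are all hypothetical.

```python
from collections import defaultdict


def find_semantic_drift(
    glossaries: dict[str, dict[str, str]],
) -> dict[str, dict[str, str]]:
    """Return terms that are defined differently in at least two systems.

    `glossaries` maps a system name to its {term: definition} export.
    Terms and definitions are compared case-insensitively, ignoring
    differences in whitespace.
    """
    by_term: dict[str, dict[str, str]] = defaultdict(dict)
    for system, terms in glossaries.items():
        for term, definition in terms.items():
            normalized = " ".join(definition.lower().split())
            by_term[term.strip().lower()][system] = normalized
    return {
        term: definitions
        for term, definitions in by_term.items()
        if len(set(definitions.values())) > 1
    }


# Hypothetical exports from two independently managed systems.
exports = {
    "document_platform": {
        "Active Customer": "A customer with a purchase in the last 12 months",
        "Region": "Sales territory",
    },
    "crm": {
        "active customer": "A customer with an open opportunity",
        "Region": "sales  territory",
    },
}

drift = find_semantic_drift(exports)
print(sorted(drift))  # ['active customer']
```

Exact string comparison is a deliberately crude test. It catches outright conflicts like the one above, but definitions that are worded differently and mean the same thing would still need review by a data steward.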

Best Suited For

This approach is particularly advantageous for organizations with diverse business units or teams that operate independently. It empowers business users to have more control over their data definitions and modeling and allows for quicker adaptation to evolving business needs, enabling business units to respond swiftly to changing requirements without relying on a centralized team. 

3. A Centralized Architecture: Within an Enterprise Data Warehouse (EDW) or Data Lake (DL)

This structured environment simplifies data engineering and ensures a consistent and centralized semantic layer specifically for analytics and BI use cases.

Architecture

Organizations that are looking to create a single, unified representation of their core organizational domains develop a semantic layer architecture that serves as the authoritative source for shared data definitions and business logic within a centralized architecture, particularly within an Enterprise Data Warehouse or Data Lake. This model makes it easier to build the semantic layer since data is already in one place, and analytics solutions that use cloud-based data warehousing and storage platforms (e.g., Amazon Redshift, Google BigQuery, Snowflake, Azure Blob Storage, Databricks, etc.) can serve as a “centralized” location for semantic layer components.

Building a semantic layer within an EDW/DL involves identifying the key data sources to be ingested, consolidating and ingesting their data into a centralized repository, defining business terms, establishing relationships between different datasets, and mapping the semantic layer to the underlying data structures to create a unified and standardized interface for data access.
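The mapping step can be illustrated with a small Python sketch in which business-friendly terms are translated into SQL against a warehouse table. This is a toy model under stated assumptions, not how any particular BI or warehouse tool implements its semantic layer: the table `analytics.fct_orders`, its columns, and the `build_query` helper are hypothetical.

```python
# Hypothetical semantic model: business terms mapped to physical columns
# and expressions in a single warehouse table.
SEMANTIC_MODEL = {
    "table": "analytics.fct_orders",
    "dimensions": {
        "region": "ship_region",
        "order month": "DATE_TRUNC('month', order_ts)",
    },
    "measures": {
        "revenue": "SUM(net_amount)",
        "order count": "COUNT(DISTINCT order_id)",
    },
}


def build_query(measures, dimensions, model=SEMANTIC_MODEL):
    """Translate business terms into a SQL string using the semantic model.

    Raises KeyError if a requested term is not defined in the model.
    """
    select_parts = [f'{model["dimensions"][d]} AS "{d}"' for d in dimensions]
    select_parts += [f'{model["measures"][m]} AS "{m}"' for m in measures]
    sql = f'SELECT {", ".join(select_parts)} FROM {model["table"]}'
    if dimensions:
        group_by = ", ".join(model["dimensions"][d] for d in dimensions)
        sql += f" GROUP BY {group_by}"
    return sql


print(build_query(["revenue"], ["region"]))
# SELECT ship_region AS "region", SUM(net_amount) AS "revenue" FROM analytics.fct_orders GROUP BY ship_region
```

Because every consumer goes through the same mapping, "revenue" is computed one way regardless of which dashboard or analyst asks for it, which is the consistency a centralized semantic layer is meant to provide.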

Pros

This architecture is a common implementation approach that we support, specifically for dedicated data management, data analytics, and BI teams that are consistently ingesting data, setting the implementation processes for changes to data structures, and enforcing business rules through dedicated pipelines (ETL/APIs) for governance across enterprise data.

Cons

The core consideration here, and the one that usually suffers, is collaboration between business and data teams. That collaboration is pivotal during the implementation process: it guides investment in the right tools and solutions that have semantic modeling capabilities, and it supports the creation of a semantic layer within this centralized landscape.

It is important to ensure that the semantic layer reflects the actual needs and perspectives of end users. Regular feedback loops and iterative refinements are essential for creating a model that evolves with the dynamic nature of business requirements. Adopting these solutions within this environment will enable the effective definition of business concepts, hierarchies, and relationships, allowing for translation of technical data into business-friendly terms.

Another important aspect of this type of centralized model is that it depends on data that is consolidated or co-located, and it requires an upfront investment of resources and time to design and implement the layer comprehensively. As such, it’s important to start small by focusing on specific business use cases, a relevant scope of knowledge/data sources, and foundational models that are highly visible and focused on business outcomes. This will allow the organization to create a foundational model that can expand incrementally across the rest of the organization’s data and knowledge assets.

Best Suited For

We have seen this approach be particularly beneficial for large enterprises that have complex but shared data requirements and a need for stringent knowledge and data governance and compliance rules. Specifically, these are organizations that produce data products and need to control the data and knowledge assets that are shared internally or externally on a regular basis. This includes, but is not limited to, financial institutions, healthcare organizations, bioengineering firms, and retail companies.

Closing

A well-implemented semantic layer is not merely a technical necessity but a strategic asset for organizations aiming to harness the full potential of their knowledge and data assets, as well as have the right foundations in place to make AI efforts successful. The choice of how to architect and implement a semantic layer depends on the specific needs, size, and structure of the organization. When considering this solution, the core decision really comes down to striking the right balance between standardization and flexibility, in order to ensure that your semantic layer serves as an effective enabler for knowledge-driven decision making across the organization. 

Organizations that invest in an enterprise architecture through the metadata layer, and that rely on experts whose modeling experience is anchored in semantic web standards, find it to be the most flexible and scalable approach. As such, they are better positioned to insulate their data from vendor lock-in and ensure interoperability as they navigate the complexities of today’s technologies and future evolutions.

When embarking on a semantic layer initiative, not understanding or planning for a solid technical architecture and phased implementation approach leads to unplanned investments or failure for many organizations. If you are looking to get started and learn more about how other organizations are approaching scale, read more from our case studies or contact us if you have specific questions.

Lulit Tesfaye is a Partner and the VP for Knowledge & Data Services and Engineering at Enterprise Knowledge, LLC, the largest global consultancy dedicated to knowledge and information management. Lulit brings over 15 years of experience leading diverse information and data management initiatives, specializing in technologies and integrations. Lulit is most recently focused on employing advanced Enterprise AI and semantic capabilities for optimizing enterprise data and information assets.