Taxonomy and Information Architecture for the Semantic Layer

June 12, 2024

EK Team

Interest is growing in implementing a semantic layer as part of a knowledge management strategy, as organizations seek to contextualize and connect their business data and information in meaningful ways. A semantic layer is more than a collection of components; its components are most effective when logically connected and implemented in a shared user interface or application layer that serves the information-seeking needs of its users. That’s where the principles of good information architecture (IA) design come into play.

The semantic layer’s connectivity of content, data, and their sources requires consistent naming and labeling of concepts and entities and of the relationships between them. This consistent naming and alignment of resources is what taxonomies and other controlled vocabularies provide.

Both taxonomy and information architecture also help provide context for other components of a semantic layer. This article describes the relationship between IA and taxonomy and their importance and varied roles in the semantic layer.

The Semantic Layer

A semantic layer is a standardized framework that organizes and abstracts organizational data (structured, unstructured, semi-structured) and serves as a connector for data and knowledge. It supports improved data management, content management, information management, and ultimately knowledge management enterprise-wide. It’s a method to bridge content and data silos through a structured and consistent approach to connecting, rather than consolidating, data. It connects all organizational knowledge assets, including content items, files, videos, media, etc., via a well-defined and standardized semantic framework. The semantic layer comprises a combination of solutions, knowledge organization systems, and software systems, rather than any single one.

A semantic layer framework provides various benefits, including consistent metadata, support for open Semantic Web (W3C) standards, intuitive user interactions, high quality data and governance, efficient data analysis processes, and a semantic context for reliable AI responses. More specifically, a semantic layer allows users to:

Connect sources of structured and unstructured information to search for data and content at the same time;
Conduct a single search for information from multiple disparate solutions and locations;
Access and analyze data without extensive technical knowledge;
Access relevant insights faster, abstracting the complexity of underlying data;
Understand the business meaning of data;
Scale analytics capabilities to adapt to changing business needs and data sources; and
Incorporate new data sources easily, especially for faster development and deployment of AI models and technologies.

It is called a “layer” because in the larger framework, it’s a middle layer between the content/data repositories and one or more front-end applications for users to search, browse, analyze, or receive recommendations of information. In a sense, it cuts across systems and repositories horizontally. It’s called “semantic” because it provides standardized meaning and labels to entities and business objects and their relationships, often based on W3C standards.

Taxonomies and Information Architecture Defined

Taxonomies and information architecture overlap and are often somewhat intertwined, even though they may be discussed separately due to different perspectives.

A taxonomy is a knowledge organization system that supports information retrieval through its controlled and structured set of terms. More specifically, a taxonomy is a controlled vocabulary, based on unambiguous concepts, not just words, which are structured into a hierarchy or hierarchies of broader-narrower concepts, and are used primarily to tag content to support its findability and retrieval through searching or browsing by users.
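To make the definition above concrete, here is a minimal Python sketch of a taxonomy as concepts with labels and broader-narrower links, plus content tagging. The class name, concept IDs, labels, and content items are all hypothetical and chosen only for illustration; real taxonomy management systems model far more than this.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare concepts by identity, not by walking parent/child links
class Concept:
    """A taxonomy concept: a unique ID, one preferred label, optional alternative labels."""
    concept_id: str
    pref_label: str
    alt_labels: list[str] = field(default_factory=list)
    broader: Optional["Concept"] = None            # parent concept; None at the top level
    narrower: list["Concept"] = field(default_factory=list)

    def add_narrower(self, child: "Concept") -> "Concept":
        """Attach a child concept and keep both directions of the link in sync."""
        child.broader = self
        self.narrower.append(child)
        return child

    def ancestors(self) -> list["Concept"]:
        """Return broader concepts, from the immediate parent up to the top."""
        chain, node = [], self.broader
        while node is not None:
            chain.append(node)
            node = node.broader
        return chain

    def descendants(self) -> list["Concept"]:
        """Return all narrower concepts at any depth."""
        found = []
        for child in self.narrower:
            found.append(child)
            found.extend(child.descendants())
        return found


def find_content(tags: dict[str, set[str]], concept: Concept) -> list[str]:
    """Return IDs of content tagged with the concept or any of its narrower concepts."""
    wanted = {concept.concept_id} | {c.concept_id for c in concept.descendants()}
    return sorted(doc for doc, ids in tags.items() if ids & wanted)


# Hypothetical industry taxonomy and content tags
industries = Concept("ind", "Industries")
finserv = industries.add_narrower(
    Concept("ind-fin", "Financial services", ["Finance industry"])
)
banking = finserv.add_narrower(Concept("ind-fin-bank", "Banking", ["Banks"]))
insurance = finserv.add_narrower(Concept("ind-fin-ins", "Insurance"))

content_tags = {
    "report-17": {"ind-fin-bank"},
    "memo-04": {"ind-fin-ins"},
    "faq-02": {"ind"},
}

print([c.pref_label for c in banking.ancestors()])
print(find_content(content_tags, finserv))
```

Because content is tagged with concept IDs rather than words, retrieving everything under "Financial services" also returns items tagged only with its narrower concepts.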

The definition of information architecture (IA) in a broad sense is the structural design of shared information environments, which includes organization, labeling, search, and navigation systems. Specifically, this includes the organizing and labeling of web sites, intranets, online communities, software user interfaces, and information products (Rosenfeld, Morville, and Arango, Information Architecture, 4th ed., O’Reilly, 2015, p. 24). Thus, when IA is identified as a need or asset, taxonomy development is often a part of it.

When we look at how taxonomy and IA are connected, we can then see more broadly how other knowledge organization systems can also be connected with them. Navigation menu labels can match search refinements and top-level content categories. Navigation structures may align with other hierarchical taxonomies. The taxonomy tagging concepts may integrate with various metadata schema to provide the controlled values for different metadata properties. Glossaries may align to taxonomies, by bringing in full definitions for concepts. Ontologies can connect various taxonomies and metadata properties together with semantic relationships, and they also connect to data extracted from databases. Data field types may be converted to ontological relationships and attribute types. Knowledge graphs combine taxonomies, ontologies, and instance data.

Information Architecture as a Part of a Semantic Layer

Although IA is not a system or resource, as a taxonomy or ontology is, and thus not a “component” of a semantic layer in the same way, it is an important design element of a semantic layer, just as IA is also a core element of knowledge management.

IA organizes how taxonomies and ontologies are applied to the user experience / presentation layer. A taxonomy has both a front end, accessed by its users, and a back end, where it is tagged to content and linked to data. While all concepts in a taxonomy are tagged or linked, not all concepts (or all labels of concepts with multiple labels) are directly displayed to the users. Selected high-level taxonomy concepts may be displayed in hierarchies, frequently tagged concepts may be displayed in search refinement filters, concepts that match users’ search strings may be displayed as search suggestions listed under the search box, or a user’s search may directly serve up content without indicating what the matching concepts were. How best to display the taxonomy or its features to users, and how to support user interaction with the taxonomy, are decisions guided by IA best practices. IA also supports the use of ontologies in the user interface of applications, such as by determining the best way to display related data or suggest related content.
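Two of the display options described above, search suggestions and search refinement filters, can be sketched in a few lines of Python. This is an illustrative sketch only: the concept records, the substring-matching rule, and the function names are assumptions, not a description of any particular search product.

```python
from collections import Counter

# Hypothetical concept records: concept ID -> (preferred label, alternative labels)
LABELS = {
    "c1": ("Financial services", ["Finance industry"]),
    "c2": ("Banking", ["Banks"]),
    "c3": ("Insurance", []),
}


def suggest(query: str, labels=LABELS, limit: int = 5) -> list[str]:
    """Suggest preferred labels for concepts where any label contains the query.

    Alternative labels are matched but never displayed, so one concept
    always appears under one consistent name.
    """
    q = query.strip().lower()
    if not q:
        return []
    hits = []
    for pref, alts in labels.values():
        if any(q in label.lower() for label in [pref, *alts]):
            hits.append(pref)
    return sorted(hits)[:limit]


def refinement_filters(result_tags, labels=LABELS, top_n: int = 3):
    """Return (preferred label, count) pairs for the most frequent tags in a result set."""
    counts = Counter(cid for tags in result_tags for cid in tags)
    return [(labels[cid][0], n) for cid, n in counts.most_common(top_n)]


# A user types "finance": only the alternative label matches, the preferred label is shown.
print(suggest("finance"))

# Concept IDs tagged on three hypothetical search results
results = [["c2", "c1"], ["c2", "c1"], ["c2", "c3"]]
print(refinement_filters(results, top_n=2))
```

Which of these features to expose, and in what order, remains a design decision; the sketch only shows that the same back-end concepts can drive different front-end displays.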

IA, however, is not limited to the front-end presentation layer. In fact, IA transcends all layers of information and data management. Thus, IA comprises multiple layers itself, as the following diagram illustrates.

IA comes into play at different levels within solutions pursued by an organization. IA provides structural context by illustrating the overarching model/design of the semantic layer. It links content and data through shared and linked metadata and knowledge organization systems, which in turn are implemented in one or more front-end applications.

Taxonomy as a Part of a Semantic Layer

Key components of a semantic layer are knowledge organization systems, which include business glossaries, controlled metadata, a data catalog, taxonomies, thesauri, ontologies, and knowledge graphs. A semantic layer does not need to include all of these, but it should include a combination that defines both terms and semantic relationships to the extent needed. Some semantic layer components, such as glossaries, focus on defining terms, while others, such as ontologies, focus on modeling classes, semantic relationships, and attributes. Taxonomies, which are very flexible and scalable in their design, include at least some of both features, with concepts that may have notes and definitions and “semi-semantic” relationships that may be hierarchical or non-hierarchical (associative).

Taxonomies provide consistent naming and alignment of concepts. For example, a standard naming and hierarchy of industries is needed for the industries of customers in the account management system, industries associated with customer and lead companies and individuals in the customer relationship management (CRM) system, and industries in the knowledge base content management system (CMS) that includes employee areas of expertise. An industry taxonomy and a product taxonomy provide consistency, alignment, and the ability to search/query across multiple systems and content/data repositories. In addition to contextualizing data and content through consistent naming, a taxonomy also provides subject domain context for concepts. Finally, a taxonomy is also a critical building block for preparing data for AI use cases, by providing disambiguation and context.
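The industry example above can be illustrated with a small Python sketch in which each system keeps its own local industry value, and a shared taxonomy concept ID makes a single query possible across all of them. The system names, local values, record IDs, and concept IDs are hypothetical.

```python
from typing import Optional

# Hypothetical shared industry taxonomy: concept ID -> preferred label
INDUSTRIES = {"ind-fin": "Financial services", "ind-health": "Healthcare"}

# Hypothetical mapping of each system's local value to a shared concept ID
LOCAL_VALUES = {
    ("accounts", "FinServ"): "ind-fin",
    ("crm", "Financial Svcs."): "ind-fin",
    ("cms", "finance"): "ind-fin",
    ("crm", "Health Care"): "ind-health",
}


def to_concept(system: str, local_value: str) -> Optional[str]:
    """Resolve a system-specific industry value to a shared concept ID, if mapped."""
    return LOCAL_VALUES.get((system, local_value))


def query_by_industry(records: list[dict], concept_id: str) -> list[str]:
    """Return IDs of records, from any system, whose industry resolves to the concept."""
    return [
        r["id"]
        for r in records
        if to_concept(r["system"], r["industry"]) == concept_id
    ]


records = [
    {"system": "accounts", "id": "acct-9", "industry": "FinServ"},
    {"system": "crm", "id": "lead-3", "industry": "Financial Svcs."},
    {"system": "crm", "id": "lead-8", "industry": "Health Care"},
    {"system": "cms", "id": "expert-5", "industry": "finance"},
    {"system": "cms", "id": "expert-6", "industry": "retail"},  # not yet mapped
]

matches = query_by_industry(records, "ind-fin")
print(INDUSTRIES["ind-fin"], "->", matches)
```

The unmapped "retail" value is simply not returned, which also shows why governance of the mapping matters: values outside the taxonomy stay invisible to cross-system queries.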

Since taxonomies are one of several components of a semantic layer, a dedicated taxonomy (or taxonomy/ontology) management system is one of several system components of a semantic layer. It’s essential that the taxonomy be managed and governed outside any individual system (CMS, DAM, CRM, etc.) that has a taxonomy management feature within it, and that the taxonomy be based on open W3C standards, namely SKOS (Simple Knowledge Organization System).

Learn more about EK’s taxonomy and ontology consulting services and how a taxonomy strategy can align with a larger semantic layer strategy. Contact us for more information and how we can be of service to you.
