Ontologies can capture highly complex ideas and business logic, provide more intuitive ways to structure information, and can ultimately power new use cases, such as semantic search, recommendation engines, and AI. While many organizations aim to leverage an ontology, they lack the strategic expertise and the in-house technical skills required to design or implement it.
In order to get you started, here are some tips to ensure your efforts result in a quality ontology design. Even though they sound simple, these practical design considerations will have a huge impact on the reusability and scalability of your ontology.
1. Identify a Clear Use Case.
At the beginning of any ontology design effort, identify the 1-2 critical questions that the ontology needs to answer. Modeling for these specific use cases will help you show immediate value by getting a working model implemented quickly. If you attempt to model a full domain, you may be modeling indefinitely with no clear return on investment for the time spent. Ontologies are never ‘complete’ and can always be expanded for additional use cases and domain coverage, so it is important to understand the first few use cases that will show immediate return on investment.
Some high-return example use cases we’ve worked on with success are:
- Related Content Recommendations: Using relationships between similar content and shared attributes, like the topic or author of a document, can support a recommendation engine that surfaces content to users.
- Natural Language Processing & Semantic Search: By using RDF and storing the ontology in triples, we can instantiate the model and traverse our data via relationships by asking natural language style questions that follow the same pattern using SPARQL. For example, if our model contains a relationship between two concepts:
    PersonName isAuthorOf BookTitle
  We can instantiate examples based on our domain and data like these:
    Jane Austen isAuthorOf Pride and Prejudice
    Jane Austen isAuthorOf Emma
    Jane Austen isAuthorOf Northanger Abbey
  Then we can ask questions in a search such as, “What books has Jane Austen written?” and return results from our dataset of Pride and Prejudice, Emma, and Northanger Abbey.
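To make the triple pattern concrete, here is a minimal sketch in plain Python rather than a real RDF store or SPARQL engine. The `match` and `books_by` helpers are hypothetical names introduced for illustration; `None` plays the role that a variable such as `?book` would play in a SPARQL triple pattern.

```python
# Each fact is a (subject, predicate, object) triple, mirroring RDF's structure.
TRIPLES = [
    ("Jane Austen", "isAuthorOf", "Pride and Prejudice"),
    ("Jane Austen", "isAuthorOf", "Emma"),
    ("Jane Austen", "isAuthorOf", "Northanger Abbey"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def books_by(author):
    """Answer "What books has <author> written?" by following isAuthorOf."""
    return [o for (_, _, o) in match(TRIPLES, subject=author, predicate="isAuthorOf")]


print(books_by("Jane Austen"))
```

In a production setting the same question would typically be expressed as a SPARQL query against a triple store; the sketch only shows why a question that follows the model's pattern is straightforward to answer.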
Showing immediate value with a concrete use case can help to ensure participation and support from stakeholders and end users on additional use cases and ontology design efforts.
2. Reuse Standards & Existing Vocabularies.
Look first for models and standards that already exist and may inform your design. One of the most important benefits of using standards for ontology development is the interoperability that comes with open linked data. Depending on the industry, well-developed models may already exist. For example, the Veterinary Extension of SNOMED CT from the Veterinary Terminology Services Laboratory Browser (VTSL Terminology Browser) is packed with defined vocabulary and classes for Procedures, Clinical Findings, Events, and more for the veterinary industry. Another industry-specific model is the Unified Medical Language System (UMLS), which has gathered many biomedical and clinical vocabularies in one web browser interface. Finding and leveraging an existing vocabulary or model for your industry can jumpstart your design and ensure that you are in line with the industry, even if your model is tailored or customized in some areas. One resource for finding industry- or domain-specific models is Linked Open Vocabularies, a collection of open-source vocabularies.
Non-industry specific standards are also important and can be key for saving time, such as pulling in descriptions and alternative labels from DBpedia or Wikidata, classes and relationships from Friend of a Friend (FOAF) or Schema.org, and modeling standards like W3C’s Web Ontology Language (OWL) and Resource Description Framework Schema (RDFS) for consistency. These standards will also ensure interoperability with any applications or datasets that are also using semantic web standards.
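As a rough illustration of what reuse looks like in practice, the sketch below ties two local classes to existing FOAF and Schema.org terms instead of defining everything from scratch. The `ex:` namespace, the `Employee` and `Proposal` classes, and the `expand` and `to_ntriples` helpers are assumptions made up for this example; the other namespace IRIs are the ones published by each vocabulary.

```python
# Namespace IRIs for the shared vocabularies, plus one local namespace.
NAMESPACES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "schema": "https://schema.org/",
    "ex": "https://example.org/ontology/",  # hypothetical local namespace
}


def expand(curie):
    """Expand a prefixed name such as 'foaf:Person' into a full IRI."""
    prefix, sep, local_name = curie.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise ValueError(f"Unknown prefix in {curie!r}")
    return NAMESPACES[prefix] + local_name


# Local classes (hypothetical) declared as subclasses of existing, shared terms.
MODEL = [
    ("ex:Employee", "rdfs:subClassOf", "foaf:Person"),
    ("ex:Proposal", "rdfs:subClassOf", "schema:CreativeWork"),
]


def to_ntriples(model):
    """Render each statement as an N-Triples style line with full IRIs."""
    return [f"<{expand(s)}> <{expand(p)}> <{expand(o)}> ." for s, p, o in model]


for line in to_ntriples(MODEL):
    print(line)
```

Because the local classes point at shared IRIs, any application that already understands `foaf:Person` or `schema:CreativeWork` has a starting point for interpreting the custom model.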
3. Leverage Consistent Naming.
Follow a naming convention to ensure that your resources are easily understood and referenceable by others. If your naming conventions are inconsistent, it can make it much harder to integrate with organizational tools or reference parts of the model. On the other hand, if the naming conventions clearly differentiate between classes, properties, and instances, it will be immediately obvious which type of resource someone is looking at or trying to return in a query or API call. Luckily, the World Wide Web Consortium (W3C) has already defined some simple conventions:
| Resource Type | Naming Convention | Examples |
|---|---|---|
| Classes | Sentence case starting with capital letters. | Place |
| Properties | Start with lowercase, then continue with title case. | inverseOf |
| Instances | For proper names, capitalize the first letter of each word. | United States of America |
These naming conventions will improve the clarity and quality of your model, and, as always, support interoperability.
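A convention is easier to keep when it is checked automatically. The sketch below is one possible check, assuming that class and property identifiers are written without spaces (for example `QuarterlyReport` and `isAuthorOf`); that assumption, the two patterns, and the `find_violations` helper are illustrative choices, not part of the W3C conventions themselves.

```python
import re

# Assumed identifier patterns: classes start uppercase, properties start lowercase.
PATTERNS = {
    "class": re.compile(r"^[A-Z][A-Za-z0-9]*$"),     # e.g. Place
    "property": re.compile(r"^[a-z][A-Za-z0-9]*$"),  # e.g. inverseOf
}


def follows_convention(name, kind):
    """Return True if the name fits the assumed pattern for its resource type."""
    return bool(PATTERNS[kind].match(name))


def find_violations(resources):
    """Return the (name, kind) pairs that break the naming convention."""
    return [(name, kind) for name, kind in resources if not follows_convention(name, kind)]


resources = [
    ("Place", "class"),
    ("inverseOf", "property"),
    ("place", "class"),         # should start with a capital letter
    ("InverseOf", "property"),  # should start with a lowercase letter
]
print(find_violations(resources))
```

A check like this could run whenever the model changes, so naming drift is caught before other tools start depending on inconsistent identifiers.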
4. Define Classes and Instances.
Two important components of any ontology design are classes and instances. These components allow us to model both the broad types of things and the specific examples of those things. The OWL standard defines these as:
“Classes provide an abstraction mechanism for grouping resources with similar characteristics. Like RDF classes, every OWL class is associated with a set of individuals, called the class extension. The individuals in the class extension are called the instances of the class.”
The differences between classes and instances can be tricky to define when designing a new ontology, especially if you are building off of an existing taxonomy or thesaurus. My colleague Ben describes how the top level of a well constructed taxonomy can often be repurposed as the classes of your ontology in his blog, From Taxonomy to Ontology. Taxonomies that include metadata fields like Content Type, Person, and Company can transition to ontological classes with the narrower terms, like Proposal, Jenni Doughty, and Enterprise Knowledge as instances of those classes, respectively.
It’s important to understand which of your taxonomy terms are candidates for classes, subclasses, or instances. A good rule of thumb is to recognize which terms are types of things, versus examples of individual things. For example, a Quarterly Report can be a type of report (a subclass of the Report class), while the 2020 Q3 Quarterly Report is an instance, or a specific example, of a Quarterly Report. The distinction is important for ensuring your ontology model is complete and can be implemented successfully.
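The type-versus-example distinction can be sketched with two small lookup tables. This is a simplified illustration that assumes each class has at most one parent; the identifiers and the `superclasses` and `classes_of` helpers are hypothetical names chosen for the example.

```python
# Types of things: child class -> parent class.
SUBCLASS_OF = {"QuarterlyReport": "Report"}

# Individual things: instance -> the class it directly belongs to.
INSTANCE_OF = {
    "2020 Q3 Quarterly Report": "QuarterlyReport",
    "Enterprise Knowledge": "Company",
}


def superclasses(cls):
    """Walk up the hierarchy and collect every ancestor of a class."""
    found = []
    while cls in SUBCLASS_OF and SUBCLASS_OF[cls] not in found:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


def classes_of(instance):
    """Return the direct class of an instance followed by its inherited classes."""
    direct = INSTANCE_OF[instance]
    return [direct] + superclasses(direct)


print(classes_of("2020 Q3 Quarterly Report"))
```

Here the 2020 Q3 Quarterly Report is only stated to be a Quarterly Report, yet it can also be treated as a Report because of the subclass relationship.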
In OWL and RDFS, there are many useful axioms that help ontologists express different types of relationships between classes, further define a class, and allow applications using the model to draw inferences. Some of these include:
- rdfs:subClassOf – Can define a class as a narrower or child class of another, allowing the inheritance of properties and additional inferences based on this relationship.
- owl:equivalentClass – Can indicate that a class is equivalent to another, meaning that both classes contain exactly the same instances. This is not the same as declaring two classes owl:sameAs one another, which would state that they have the same intensional meaning.
- owl:disjointWith – Can declare that two classes do not overlap, so that no instance can belong to both, reducing ambiguity when tagging or recommending content. For example, if we have an ontology with Animal and Car classes, we can declare these classes disjoint, so that a reasoner will flag the same instance of Jaguar appearing in both classes as an inconsistency.
Understanding which axioms to use will ensure that the ontology not only models the classes, but also contains information about those classes that further characterizes the instances within them and how they relate to others.
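As a small illustration of how a disjointness axiom can be used, the sketch below checks instance data against a declared disjoint pair. It is a simplified stand-in for what an OWL reasoner would do, and the data structures, the `Fido` instance, and the `disjoint_violations` helper are assumptions made for this example.

```python
# Pairs of classes declared disjoint (the role owl:disjointWith plays in OWL).
DISJOINT_PAIRS = [frozenset({"Animal", "Car"})]

# Classes asserted for each instance; "Jaguar" is deliberately ambiguous.
INSTANCE_TYPES = {
    "Jaguar": {"Animal", "Car"},
    "Fido": {"Animal"},  # hypothetical instance with no conflict
}


def disjoint_violations(instance_types, disjoint_pairs):
    """Return (instance, classes) for every instance that sits in two disjoint classes."""
    violations = []
    for instance, classes in sorted(instance_types.items()):
        for pair in disjoint_pairs:
            if pair <= classes:  # both disjoint classes are asserted
                violations.append((instance, tuple(sorted(pair))))
    return violations


print(disjoint_violations(INSTANCE_TYPES, DISJOINT_PAIRS))
```

The flagged instance is a prompt to split the data into two distinct resources, one for the animal and one for the car, each with its own identifier.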
5. Implement Iteratively.
Once you’ve designed the ontology model for your use case, it is time to begin mapping data sources to the ontology, or instantiating and implementing it with graph technology. As with all our design and implementation projects, EK recommends implementing an ontology iteratively, starting with 1-2 data sources or a high-level set of data from each intended source that is relevant to your priority use cases. When mapping or ingesting data, there are multiple best practices for Enterprise Knowledge Graph Design to assist in completing this task, including deciding what data to actually move and store in the knowledge graph, and which to map through a virtual graph.
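A first iteration can be as small as mapping one data source onto one relationship in the model. The sketch below assumes a hypothetical CSV export with `author` and `title` columns and maps each row onto the isAuthorOf relationship; the sample data, the `MAPPING` structure, and the `rows_to_triples` helper are illustrative only.

```python
import csv
import io

# Hypothetical export from a single source system.
SOURCE_CSV = """author,title
Jane Austen,Pride and Prejudice
Jane Austen,Emma
"""

# Mapping from source columns to the ontology: (subject column, property, object column).
MAPPING = [("author", "isAuthorOf", "title")]


def rows_to_triples(csv_text, mapping):
    """Turn each CSV row into triples according to the column mapping."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for subject_col, predicate, object_col in mapping:
            subject = (row.get(subject_col) or "").strip()
            obj = (row.get(object_col) or "").strip()
            if subject and obj:  # skip rows with missing values
                triples.append((subject, predicate, obj))
    return triples


for triple in rows_to_triples(SOURCE_CSV, MAPPING):
    print(triple)
```

Adding a second source or relationship in a later iteration then means extending the mapping rather than redesigning the pipeline.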
The best way to ensure sustainable and scalable implementation is to start small, move quickly, and seek continuous feedback from end users and stakeholders. These stakeholders will be instrumental in ensuring that the resulting ontology and implementation meet business needs and end user expectations. A good rule of thumb is to engage a wide variety of stakeholders from all business areas that will benefit from or be involved in any part of the design, implementation, maintenance, or end user processes. Finally, through facilitated conversations or working sessions, these stakeholders will not only assist in the development of the ontology, but will also feel that they have a stake in what has been designed. They can become your champions for future work and capabilities.
These five keys can help to ensure a strong, standards-based foundation for your ontology design that will result in an intuitive and interoperable model. For more information on how to begin designing an ontology, consider EK’s Two-Day Design Workshop or contact us at firstname.lastname@example.org.