You may come across different opinions across industries about what a Knowledge Graph is, since every organization is trying to come up with its own definition and its own ways to maximize the value from it.
The term has certainly gained popularity since Google introduced its Knowledge Graph in 2012, and there have been ongoing efforts since then to create a standard definition. The Alan Turing Institute says, “Knowledge Graphs organize data from multiple sources, capture information about entities of interest in a given domain or task (like people, places or events), and forge connections between them.” Stardog defines a Knowledge Graph as “a flexible, reusable data layer used for answering complex queries across data silos. They create supreme connectedness with contextualized data, represented and organized in the form of graphs.” It adds that a Knowledge Graph of enterprise data is called an Enterprise Knowledge Graph. According to IBM’s definition, Knowledge Graphs are “networks of semantic metadata which represent a collection of related entities.”
This article does not intend to explain the terminology in depth. Instead, it aims to draw attention to the term and highlight some of the benefits organizations gain from creating and using Knowledge Graphs. One simplified explanation of a Knowledge Graph is that it is a network of entities and the relationships between them. The entities may be concrete, like people, but they can also be abstract, like a profession. The relationships between the entities are visualized as a graph structure, while the information is stored in a graph database. The three main elements of a Knowledge Graph are:
- Nodes,
- Edges, and
- Labels.
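The simplified explanation above, a network of entities and the relationships between them, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the people, the company and the relationship names (`WORKS_AS`, `WORKS_AT`) are hypothetical, and a real Knowledge Graph would live in a graph database rather than in a Python dictionary.

```python
from collections import defaultdict

# Hypothetical facts, written as (subject, relationship, object) triples.
triples = [
    ("Ada", "WORKS_AS", "Engineer"),    # concrete entity -> abstract entity
    ("Ada", "WORKS_AT", "Acme"),
    ("Grace", "WORKS_AS", "Engineer"),
]


def build_adjacency(facts):
    """Index outgoing edges by node: node -> list of (relationship, neighbor)."""
    graph = defaultdict(list)
    for subject, relationship, obj in facts:
        graph[subject].append((relationship, obj))
    return graph


def neighbors(graph, node, relationship=None):
    """Return nodes reachable from `node`, optionally filtered by relationship."""
    return [
        target
        for rel, target in graph.get(node, [])
        if relationship in (None, rel)
    ]


graph = build_adjacency(triples)
ada_links = neighbors(graph, "Ada")
ada_profession = neighbors(graph, "Ada", "WORKS_AS")
```

Here `ada_links` holds everything connected to the node "Ada", while `ada_profession` follows only one kind of edge.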
The core characteristic of a Knowledge Graph is its knowledge model, which is a group of interlinked descriptions of concepts, entities, relationships and events. It is essential to know that some Knowledge Graphs are used primarily within the organization that created them, like Google’s or Amazon’s Knowledge Graph, while others are openly available: DBpedia, Wikidata, WordNet, Geonames, etc. In this article, we will showcase how NASA developed its knowledge management graph model.
NASA Use Case for Developing Graph Data Model
David Meza, AIML R&D Lead, Sr. Data Scientist People Analytics at NASA, says that “Knowledge Graph is the interconnection of domains to common relationships”. According to him, creating a knowledge management graph model does not require upfront knowledge, but relevant tools and methods. He also thinks that to get started with the Knowledge Graph, organizations must decide between the common ways to build a Knowledge Graph within the graph database: RDF – Resource Description Framework and LPG – Labeled Property Graphs.
The significant difference between the two is that RDF applies a strict labelling type to a particular node, whereas in an LPG a label is associated with a node and a node can have multiple labels. This benefits organizations when they start searching and traversing the graph, or applying properties to nodes, labels or relationships and analyzing the information there.
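To make the contrast more concrete, here is a minimal sketch of how the same facts might look in each style. It uses plain Python structures rather than a real RDF store or graph database, and every identifier in it (the `ex:` prefix, `Node`, `Relationship`, `WRITTEN_AT`, the sample values) is a hypothetical name chosen for illustration.

```python
from dataclasses import dataclass, field

# RDF style: every statement is a (subject, predicate, object) triple,
# so a node's type and each of its attributes are separate triples.
rdf_triples = [
    ("ex:lesson1", "rdf:type", "ex:Lesson"),
    ("ex:lesson1", "ex:title", "Valve inspection"),
    ("ex:lesson1", "ex:writtenAt", "ex:centerA"),
]


# LPG style: a node carries a set of labels and a dict of properties,
# and a relationship can carry properties of its own.
@dataclass
class Node:
    node_id: str
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str
    end: str
    rel_type: str
    properties: dict = field(default_factory=dict)


lesson = Node("lesson1", {"Lesson", "Document"}, {"title": "Valve inspection"})
center = Node("centerA", {"Center"}, {"name": "Center A"})
written_at = Relationship("lesson1", "centerA", "WRITTEN_AT", {"year": 2020})


def nodes_with_label(nodes, label):
    """Return the ids of all nodes that carry the given label."""
    return [n.node_id for n in nodes if label in n.labels]


documents = nodes_with_label([lesson, center], "Document")
```

In the LPG half, the `lesson` node carries two labels at once, and the `written_at` relationship holds a property directly, which is the flexibility described above.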
Organizations should understand the tools and the skills that they have so that they can decide what’s better for them, RDF or LPG.
“I have a Lesson as a node (purple circle). And that Lesson has different types of information associated with it, and somebody wrote that. So, Submitter (red circle) is another node, and the connection between that is the edge of the relationship talks about how those nodes interact. The relationship from the Lesson to the Center (green circle) where the Lesson was written is a particular category. With different algorithms applied, such as a Topic Modeling Algorithm, I can now group those Lessons into Topics (orange circles) which becomes another node in my Knowledge Graph. Those Topics have information about each other, so I can see how closely they relate and correlate based on the likelihood that similar documents are in those Topics and Terms (blue circle). All of that becomes information in the Knowledge Graph and allows you to enrich the Knowledge Graph to start using that information to get things out of your Knowledge Graph,” explains Meza.
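The node types Meza lists (Lesson, Submitter, Center, Topic, Term) can be sketched as a small graph. This is not NASA’s actual data model: the node ids and the relationship names (`WROTE`, `WRITTEN_AT`, `IN_TOPIC`, `HAS_TERM`) are assumptions made up for this example.

```python
# Nodes keyed by id; each label is one of the node types Meza describes.
nodes = {
    "lesson-1": "Lesson",
    "submitter-1": "Submitter",
    "center-1": "Center",
    "topic-1": "Topic",
    "term-1": "Term",
}

# Edges as (start, relationship, end); the relationship names are hypothetical.
edges = [
    ("submitter-1", "WROTE", "lesson-1"),
    ("lesson-1", "WRITTEN_AT", "center-1"),
    ("lesson-1", "IN_TOPIC", "topic-1"),
    ("topic-1", "HAS_TERM", "term-1"),
]


def connected(node_id, label):
    """Ids of nodes with `label` directly linked to `node_id`, in either direction."""
    found = []
    for start, _, end in edges:
        if start == node_id and nodes[end] == label:
            found.append(end)
        elif end == node_id and nodes[start] == label:
            found.append(start)
    return found


lesson_submitters = connected("lesson-1", "Submitter")
lesson_topics = connected("lesson-1", "Topic")
```

Adding the output of an algorithm, such as a new Topic, then amounts to adding one more node and the edges that link it to existing Lessons.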
Once an organization has the Knowledge Graph, it can do different things with that information.
“The Lessons have textual information about what happened in a particular project: events that occurred, best practices, and issues we want to remember for future projects. All that information is stored in a document database. To put it in a graph database for a knowledge graph, I had to extract this information, develop my data model, take the textual information about that Lesson and apply the Topic Model algorithm on top of it. What that algorithm is doing is looking at the Term in the document and finding out what’s the probability that in this document those Terms are related, as well as what is the highest frequency of Terms in that document. It gives a probabilistic value and decides based on what we see in this document that, in this case, is highly likely to be about genes, DNA or genetics. And it does that to give all the documents and put them in various Topics so you can use that information to store back into your knowledge graph,” describes Meza.
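The step of turning term counts into a probabilistic Topic assignment can be illustrated with a deliberately simplified sketch. It is not a topic model and not the algorithm NASA used: a real topic model learns the topic-term probabilities from the corpus, whereas here the `TOPIC_TERMS` table, the `SMOOTHING` value and the equal weighting of topics are all hypothetical assumptions.

```python
import math
from collections import Counter

# Hypothetical topic-term probabilities; a real topic model would learn these.
TOPIC_TERMS = {
    "genetics": {"gene": 0.4, "dna": 0.4, "cell": 0.2},
    "propulsion": {"valve": 0.5, "tank": 0.3, "fuel": 0.2},
}
SMOOTHING = 1e-3  # assumed probability for terms a topic does not list


def topic_probabilities(text):
    """Score each topic by term frequency, then normalize the scores to sum to 1."""
    counts = Counter(text.lower().split())
    log_scores = {
        topic: sum(
            n * math.log(term_probs.get(term, SMOOTHING))
            for term, n in counts.items()
        )
        for topic, term_probs in TOPIC_TERMS.items()
    }
    # Subtract the best score before exponentiating to avoid underflow.
    best = max(log_scores.values())
    weights = {t: math.exp(s - best) for t, s in log_scores.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def most_likely_topic(text):
    """Pick the topic with the highest probability for this document."""
    probs = topic_probabilities(text)
    return max(probs, key=probs.get)


assigned = most_likely_topic("gene dna dna sequence")
```

The assigned Topic can then be written back to the graph as a node linked to the Lesson it was derived from.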
NASA also compared which Topics in the Lessons rose or fell in prevalence over time.
“You can see that the topics on the left were mentioned less and less over time, as opposed to the topics on the right that went higher over the same period. This gives you an idea to look at this and see maybe there are issues you need to look at, and Knowledge Graph allows you to pick that quickly”, adds Meza.
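A comparison of this kind can be sketched as a count of Topics per year followed by a simple difference in share. The years, topic names and the first-versus-last-year measure below are hypothetical choices for illustration, not NASA’s data or method.

```python
from collections import Counter, defaultdict

# Hypothetical (year, topic) pairs, one per Lesson.
lesson_topics = [
    (2018, "valves"), (2018, "valves"), (2018, "batteries"),
    (2019, "valves"), (2019, "batteries"), (2019, "batteries"),
    (2020, "batteries"), (2020, "batteries"), (2020, "batteries"),
]


def prevalence_by_year(pairs):
    """Share of each year's Lessons that fall into each topic."""
    per_year = defaultdict(Counter)
    for year, topic in pairs:
        per_year[year][topic] += 1
    return {
        year: {topic: n / sum(counts.values()) for topic, n in counts.items()}
        for year, counts in per_year.items()
    }


def trend(pairs, topic):
    """Change in a topic's share between the first and last year observed."""
    shares = prevalence_by_year(pairs)
    first, last = min(shares), max(shares)
    return shares[last].get(topic, 0.0) - shares[first].get(topic, 0.0)


valves_trend = trend(lesson_topics, "valves")
batteries_trend = trend(lesson_topics, "batteries")
```

A negative value marks a Topic that is mentioned less over time, and a positive value one that is mentioned more.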
Importance of the Knowledge Graphs for Organizations
Organizations are becoming more data-driven over time, but being data-driven is not enough. What matters most today is being a knowledge-driven organization when operating and making decisions. A Knowledge Graph creates a web of knowledge for an organization, unique to its domain, that is used to:
- Break down data silos
- Find information faster
- Make better-informed decisions
- Give a new approach to insights
- Combine structured and unstructured data
This applies to organizations in information technology, people analytics, finance, security, fraud detection, etc. All of them should focus on visualisation if they want to utilise Knowledge Graphs. For example, organizations can save cost and time by visualising documents in a Knowledge Graph. The only time they may need to invest is in understanding the data model and the graph database, both of which can be easily changed depending on the information organizations need. How is that done? As an example, here is one document from the Kennedy Space Center, explained by Meza, that shows how to identify issues in storage tanks and valves.
The image above shows Topics on the left, like valve contamination or tank contamination. By connecting them with a Topic on the right, such as air plums, organizations can understand what issues happened (damage, fire hazard, battery hazard) and how to prevent them in the future with the help of a Knowledge Graph. Going into the Lessons in the Topics, you can get more information, like a battery leak, the placement of the battery, the heat, etc., that can damage the valve and eventually contaminate the tank. This is helpful for understanding how things are developed, created and produced.
“That is the beauty of the Knowledge Graph. It gives you information quickly that you can’t otherwise get from a traditional search capability, where you have to return to the document and list the pages to see which are connected,” adds Meza.
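The kind of lookup described here, moving from one Topic through its Lessons to the other Topics those Lessons touch, can be sketched as a two-step traversal. The Lesson ids and their Topic assignments are hypothetical; only the Topic names are taken from the example above.

```python
from collections import Counter

# Hypothetical Lesson -> Topics assignments.
lesson_topics = {
    "lesson-1": {"valve contamination", "battery hazard"},
    "lesson-2": {"valve contamination", "tank contamination"},
    "lesson-3": {"battery hazard", "tank contamination"},
    "lesson-4": {"valve contamination", "battery hazard"},
}


def related_topics(topic):
    """Count how many Lessons link `topic` to each other topic (Topic -> Lesson -> Topic)."""
    shared = Counter()
    for topics in lesson_topics.values():
        if topic in topics:
            shared.update(topics - {topic})
    return shared.most_common()


valve_related = related_topics("valve contamination")
```

The result ranks the other Topics by the number of Lessons they share with the starting Topic, so the strongest connections surface first without reading each document.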
Tools for Creating Knowledge Graph
The tools for creating the Knowledge Graph depend on what information organizations have, and what type of information they are trying to extract. This time, we’ve listed several tools that you can explore further:
- Azure Cosmos DB
- Datastax Enterprise Graph
In the coming years, Knowledge Graphs are expected to gain importance as more organizations create and implement them based on their own infrastructure. We also suggest watching the presentation by Panos Alexopoulos, where you can hear more about the technical and business/organizational dimensions and challenges of Knowledge Graph initiatives, as well as relevant best practices, lessons and methods for building Knowledge Graphs. In the presentation by Stefan Wendin, you can find out more about digital transformation with Graphs and how to use them for data management and analytics and to improve data fabrics, contextual AI and digital twins. Listen to David Meza’s presentation on Knowledge Graph delivered at the Data Innovation Summit 2022. He also spoke at this year’s Nordic People Analytics Summit on “Knowledge Graph in People Analytics”. Find that presentation here.