}

SQL Server's Graph Database for Contacts and Connections

What are Graph Tables?

Once when teaching SQL Training, a student, Amy J, asked, "I see SQL Server 2017 has a "graph database" feature. How does a graph differ from the old HierarchyID datatype?". Good question, Amy. While the old hierarchyID datatype can represent an acyclic graph in a very simplistic way, the new graph database capabilities couldn't be more different. In the mathematical sense, a graph is a set of nodes and their connecting lines or "edges." Many graphs of interest contain cycles. Consider your Facebook friends. You might be friends with Sarah, and Sarah is friends with Abdul. Abdul might, of course, be friends with you, completing a cycle.

At least as important as the representation of cycles is that the nodes in a graph database need not represent the same object type. As is always the case, Microsoft has provided a sample database for us to study; the graph database sample includes nodes representing people, cities, and restaurants. The edges connecting these nodes can be of different types. For example, people can be friends with each other, and they can be located in cities. Restaurants can also be located in cities, and if they are good enough, people can like them. A graph database aims to represent such relationships in an SQL Server database efficiently.

Creating Graph Tables

Unlike regular SQL Server tables, graph tables have a type; either NODE or EDGE. Data Insertion into a NODE table is like any other table. Still, insertion into an EDGE table must be different since the two nodes connected by an edge must be specified.

CREATE TABLE City (
ID INTEGER PRIMARY KEY,
name VARCHAR(100),
stateName VARCHAR(100)
) AS NODE;
CREATE TABLE livesIn AS EDGE;
Note that the EDGE table does not need to explicitly define any columna. If we wish, we can certainly add additional columns as required.
CREATE TABLE likes (rating INTEGER) AS EDGE;
Graph Tables as Listed in the Object Explorer.
Graph Tables as Listed in the Object Explorer.

We see in the Object Explorer that graph tables have their distinct folder. But, curiously, when we diagram the database, we see no established relationships defined among graph tables.

Graph Tables screenshot

Inserting an edge requires the node identifiers of the two connected nodes, which are queried from the corresponding NODE tables. The connecting edges are always directional; e.g., the "likes" edge goes from person to restaurant, not the other way around.

INSERT INTO likes VALUES ((SELECT $node_id FROM Person WHERE id = 1),
(SELECT $node_id FROM Restaurant WHERE id = 1),9);

Querying Graph Data

Not surprisingly, the unique features of graph data require particular SQL statements for their use. Fortunately, the MATCH statement is very natural in its syntax and easy to write and read.

Here we get a list of people and the restaurants they like:

SELECT Person.name, Restaurant.name
FROM Person, likes, Restaurant
WHERE MATCH (Person-(likes)->Restaurant)

As you might expect, we can get a list of the people who like a specific restaurant:

SELECT DISTINCT p1.name
FROM Person p1, Person p2, likes, Restaurant, livesIn, City, locatedIn, friendOf
WHERE MATCH (p1-(friendOf)->p2-(likes)->Restaurant) AND Restaurant.name = 'Noodle Land'

By expanding the graph path within the MATCH clause, we can, for example, get a list of the people who are friends with someone who likes the Noodle Land restaurant.

SELECT Person.name, Restaurant.name
FROM Person, likes, Restaurant
WHERE MATCH (Person-(likes)->Restaurant) AND Restaurant.name = 'Noodle Land'

Some Limitations

There are, however, some significant limitations. For example, we cannot combine two MATCH expressions using OR or NOT or use the same edge in two MATCH clauses. Therefore, we cannot get a list of people who are friends with someone who likes Noodle Land but who does not like Noodle Land. (Or, more precisely, they do not have a "like" edge themselves. A like edge could be missing because they do not like the restaurant, or it could simply be missing because they have never been there.)

Conclusions

Graph databases in SQL Server satisfy a critical but particular need. Many organizations do not concern themselves with the analysis of interconnections. However, for those that do, the new SQL Server graph technology is an essential first step in the right direction.

 

Acquire essential skills in database management, query writing, and data integration with SQL Training Courses

 

This piece was originally posted on April 17, 2018, and has been refreshed with updated styling and links.