NoSQL – How to model real-world relationships in a graph database (such as Neo4j)?

I have a general question about modeling in a graph database that I can’t seem to wrap my head around.

How do you model this type of relationship: “Newton invented calculus”?

In a simple graph, you could model it like this:

Newton (node) -> invented (relationship) -> Calculus (node)

…so you would end up with a bunch of “invented” graph relationships as you added more people and inventions.

The problem is that you start needing to add a bunch of properties to the relationship:

invention_date
influential_concepts
influential_people
books_inventor_wrote

…and then you will want to start creating relationships between those properties and other nodes, for example:

influential_people: relationship to person nodes
books_inventor_wrote: relationship to book nodes

So now it seems that the “real-world relationship” (“invented”) should actually be a node in the graph, and the graph should look like this:

Newton (node) -> (relationship) -> Invention of Calculus (node) -> (relationship) -> Calculus (node)

To complicate things further, other people also participated in the invention of calculus, so the graph now becomes something like:

Newton (node) -> (relationship) -> Newton's Calculus Invention (node) -> (relationship) -> Invention of Calculus (node) -> (relationship) -> Calculus (node)
Leibniz (node) -> (relationship) -> Leibniz's Calculus Invention (node) -> (relationship) -> Invention of Calculus (node) -> (relationship) -> Calculus (node)

I ask because it seems you would not want to set properties on the actual graph database “relationship” object, since at some point you may want to treat those properties as nodes in the graph.

Is that correct?

I have been studying the Freebase Metaweb Architecture, and they seem to treat everything as a node. For example, Freebase has the idea of a Mediator/CVT, where you can create a “Performance” node that links an “Actor” node to a “Film” node, as shown here: http://www.freebase.com/edit/topic/en/the_last_samurai. I'm not sure whether this is the same issue.

What guiding principles are used to determine whether the “real-world relationship” should be a graph node instead of a graph relationship?

If there is a good book on this topic, I would love to know. Thanks!

Some of these things, such as the invention date, can be stored as properties on the edge, because in most graph databases edges can have properties in the same way that vertices can. For example, you could do something like this (the code follows TinkerPop’s Blueprints):

// Open (or create) a Neo4j-backed graph through Blueprints
Graph graph = new Neo4jGraph("/tmp/my_graph");
Vertex newton = graph.addVertex(null);
newton.setProperty("given_name", "Isaac");
newton.setProperty("surname", "Newton");
newton.setProperty("birth_year", 1643); // use Gregorian dates...
newton.setProperty("type", "PERSON");

Vertex calculus = graph.addVertex(null);
calculus.setProperty("type", "KNOWLEDGE");

// The discovery is an edge, and the year is a property on that edge
Edge newton_calculus = graph.addEdge(null, newton, calculus, "DISCOVERED");
newton_calculus.setProperty("year", 1666);

Now, let’s expand it a little and add Leibniz:

Vertex leibniz = graph.addVertex(null);
leibniz.setProperty("given_name", "Gottfried");
leibniz.setProperty("surname", "Leibniz");
leibniz.setProperty("birth_year", 1646); // stored as a number, like Newton's
leibniz.setProperty("type", "PERSON");

Edge leibniz_calculus = graph.addEdge(null, leibniz, calculus, "DISCOVERED");
leibniz_calculus.setProperty("year", 1674);

Add a book:

Vertex principia = graph.addVertex(null);
principia.setProperty("title", "Philosophiæ Naturalis Principia Mathematica");
principia.setProperty("year_first_published", 1687);
Edge newton_principia = graph.addEdge(null, newton, principia, "AUTHOR");
Edge principia_calculus = graph.addEdge(null, principia, calculus, "SUBJECT");

To find all the books Newton wrote about things he discovered, we can construct a graph traversal. We start at Newton and follow his outgoing DISCOVERED links to the things he discovered. From there we follow the SUBJECT links in reverse to get the books on each subject, and then the AUTHOR links in reverse to get each book's author. If the author is Newton, we go back to the book and return its title. The query is written in Gremlin, a Groovy-based domain-specific language for graph traversals:

newton.out("DISCOVERED").in("SUBJECT").as("book").in("AUTHOR").filter{it == newton}.back("book").title.unique()
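
For anyone without Gremlin at hand, here is a rough sketch of the same traversal logic in plain Java. The Vertex and Edge classes below are tiny in-memory stand-ins written for this example, not the Blueprints API, and the second book and its author are hypothetical data added only so the author filter has something to reject.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class TraversalSketch {
    // Minimal in-memory stand-ins for a property graph (NOT Blueprints classes).
    static final class Vertex {
        final String name;
        final List<Edge> outEdges = new ArrayList<>();
        final List<Edge> inEdges = new ArrayList<>();
        Vertex(String name) { this.name = name; }
    }

    static final class Edge {
        final Vertex from;
        final Vertex to;
        final String label;
        Edge(Vertex from, Vertex to, String label) {
            this.from = from;
            this.to = to;
            this.label = label;
        }
    }

    static void connect(Vertex from, Vertex to, String label) {
        Edge e = new Edge(from, to, label);
        from.outEdges.add(e);
        to.inEdges.add(e);
    }

    // Vertices reached by following outgoing edges with the given label.
    static List<Vertex> outNeighbors(Vertex v, String label) {
        List<Vertex> result = new ArrayList<>();
        for (Edge e : v.outEdges) {
            if (e.label.equals(label)) result.add(e.to);
        }
        return result;
    }

    // Vertices reached by following incoming edges with the given label, in reverse.
    static List<Vertex> inNeighbors(Vertex v, String label) {
        List<Vertex> result = new ArrayList<>();
        for (Edge e : v.inEdges) {
            if (e.label.equals(label)) result.add(e.from);
        }
        return result;
    }

    // Same steps as the Gremlin query: out DISCOVERED, in SUBJECT, keep books authored by the person.
    static Set<String> booksOnOwnDiscoveries(Vertex person) {
        Set<String> titles = new LinkedHashSet<>(); // a set plays the role of unique()
        for (Vertex topic : outNeighbors(person, "DISCOVERED")) {
            for (Vertex book : inNeighbors(topic, "SUBJECT")) {
                if (inNeighbors(book, "AUTHOR").contains(person)) {
                    titles.add(book.name);
                }
            }
        }
        return titles;
    }

    static Set<String> demo() {
        Vertex newton = new Vertex("Newton");
        Vertex calculus = new Vertex("Calculus");
        Vertex principia = new Vertex("Principia Mathematica");
        Vertex someoneElse = new Vertex("Some other author");      // hypothetical
        Vertex otherBook = new Vertex("Some other calculus book"); // hypothetical

        connect(newton, calculus, "DISCOVERED");
        connect(newton, principia, "AUTHOR");
        connect(principia, calculus, "SUBJECT");
        connect(someoneElse, otherBook, "AUTHOR");
        connect(otherBook, calculus, "SUBJECT");

        return booksOnOwnDiscoveries(newton);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [Principia Mathematica]
    }
}
```

The other book is also about calculus, but it is dropped because its author is not Newton. That is the same job the filter{it == newton} step does in the Gremlin version.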

So, I hope I've shown a little of how a clever traversal can avoid the need to create intermediate nodes to represent edges. In a small database it won't matter much, but in a large database those extra nodes will cost you a lot in performance.

Yes, it is sad that you can't associate an edge with other edges in the graph, but that is a limitation of the data structures these databases use. Sometimes it does make sense to make everything a node. For example, in the Mediator/CVT case, a Performance is something more concrete: someone may want to review only Tom Cruise’s performance in “The Last Samurai”. For most graph databases, though, I've found that applying some graph traversal gets me what I want out of the database.
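
To make the Mediator/CVT case a bit more concrete, here is a minimal sketch in plain Java of a Performance node sitting between an actor and a film, with a review attached to the performance itself. The Node and Link classes are in-memory stand-ins written for this example, and the labels PERFORMED, IN_FILM and ABOUT, the second film, and both reviews are assumptions made for illustration, not Freebase's actual schema.

```java
import java.util.ArrayList;
import java.util.List;

class MediatorSketch {
    // Minimal in-memory stand-ins for graph nodes and relationships.
    static final class Node {
        final String name;
        final List<Link> out = new ArrayList<>();
        final List<Link> in = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    static final class Link {
        final Node from;
        final Node to;
        final String label;
        Link(Node from, Node to, String label) {
            this.from = from;
            this.to = to;
            this.label = label;
        }
    }

    static void connect(Node from, Node to, String label) {
        Link link = new Link(from, to, label);
        from.out.add(link);
        to.in.add(link);
    }

    // True if the performance node has an IN_FILM link to the given film.
    static boolean isInFilm(Node performance, Node film) {
        for (Link l : performance.out) {
            if (l.label.equals("IN_FILM") && l.to == film) return true;
        }
        return false;
    }

    // Reviews that point at the performance linking this actor to this film.
    static List<String> reviewsOf(Node actor, Node film) {
        List<String> result = new ArrayList<>();
        for (Link performed : actor.out) {
            if (!performed.label.equals("PERFORMED")) continue;
            Node performance = performed.to;
            if (!isInFilm(performance, film)) continue;
            for (Link about : performance.in) {
                if (about.label.equals("ABOUT")) result.add(about.from.name);
            }
        }
        return result;
    }

    static List<String> demo() {
        Node cruise = new Node("Tom Cruise");
        Node samurai = new Node("The Last Samurai");
        Node otherFilm = new Node("Some other film");              // hypothetical
        Node samuraiPerf = new Node("Cruise in The Last Samurai"); // the mediator node
        Node otherPerf = new Node("Cruise in some other film");    // hypothetical
        Node reviewA = new Node("Review A");                       // hypothetical
        Node reviewB = new Node("Review B");                       // hypothetical

        connect(cruise, samuraiPerf, "PERFORMED");
        connect(samuraiPerf, samurai, "IN_FILM");
        connect(cruise, otherPerf, "PERFORMED");
        connect(otherPerf, otherFilm, "IN_FILM");

        // A review can target one specific performance because it is a node.
        connect(reviewA, samuraiPerf, "ABOUT");
        connect(reviewB, otherPerf, "ABOUT");

        return reviewsOf(cruise, samurai);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [Review A]
    }
}
```

If the performance were just an edge from the actor to the film, the review would have nothing to attach to. That is roughly the point at which promoting a relationship to a node starts to pay for itself.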
