GraphNesting/03_datamodel.tex at clean · gyankos/GraphNesting · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
% !TeX root = 00_paper_entrypoint.tex

\section{Nested Graphs}\label{sec:model}
\begin{table}
\begin{adjustbox}{max width=.48\textwidth}
\begin{tabular}{c|l|c}
\toprule
\parbox[t]{2mm}{\multirow{6}{*}{\rotatebox[origin=c]{90}{$G$, NGDB}}} & $\mathcal{V},\mathcal{E}$ & vertex/edge indices in $\mathbb{N}^2$\\
                                 & $\lambda$ & edge to source-target vertices function\\
                                 & $\ell$ & vertex/edge multilabelling in $\Sigma^*$\\
                                 & $\omega$ & vertex/edge to tuple function\\
                                 & $\nu,\varepsilon$ & vertex/edge containment functions for $\mathcal{V}\cup\mathcal{E}$\\
                                 & $G_o$ & nested graph induced by vertex/edge $o\in\mathcal{V}\cup\mathcal{E}$\\
\midrule
\parbox[t]{2mm}{\multirow{10}{*}{\rotatebox[origin=c]{90}{$\eta$ operator dependencies}}} & $dt$ & index dovetailing function \\
                                 & $f\colon D\xrightarrow{f}C$ & function $f$ with domain $D$ and codomain $C$\\
                                 & $a\mapsto b$ & finite lambda function with domain $\{a\}$\\
                                 & $\oplus$ & finite function extension\\
                                 & $\kappa$ & (graph) pattern, i.e. multilabelled graph classifier\\
                                 & $g_\kappa$ & nested graph classifier  over $\kappa$\\
                                 & $f_C$ & morphism from pattern to subgraph $G_C$\\
                                 & $\iota_{G_o}$ & indexing function for subgraphs $G_C\subseteq G_o$\\
                                 & $\mu_\Omega$ & object user defined function\\
                                 & $\mu_E$ & edge user defined function\\
\bottomrule
\end{tabular}
\end{adjustbox}
\caption{Table Of Notations}
\vspace{-2em}
\end{table}
The term \textit{property graph}  \cite{angles12} usually refers to a directed, labelled and attributed multigraph.
% In other words, if there is a schema, each vertex and edge will be represented by a relational tuple and without, by a document of key-value pairs.
In a property graph a collection of \textit{labels} \cite{bergamimm17} is associated to every vertex and edge (e.g., \texttt{[}\mstr{Author}\texttt{]}) or \texttt{[}\mstr{coAuthorsip}\texttt{]}). Further on, vertices and edges may have arbitrary named attributes (\textit{properties}) in the form of key-value pairs (e.g., \texttt{\textbf{name}:Baldwin} or \texttt{\textbf{surname}:Oliver}). Property-value associations of vertices and edges can be represented by relational tuples; this is a common approach in literature used even when graphs have no fixed schema \cite{angles12}. We define the\textit{ nested (property) graph database} as the following extension of the property graph data model for nested information:

\begin{definition}[Nested Graph DataBase]
Given a set $\Sigma^*$ of strings,
	a \textbf{nested (property) graph database} $G$ is a tuple $G=\Braket{\VS, \ES, \lambda,\ell,\omega,\nestF,\prov}$, where $\VS$ and $\ES$ are disjoint sets, respectively referring to vertex and edge identifiers $o\equiv(c,i)\in\mathbb{N}^2$; $c$ is an incremental unique number associated to each graph. In particular, input data graphs have $c=0$ while  nested graphs created at nesting steps exhibit $c>0$.

	A function $\lambda\colon \ES\to \VS^2$ maps each edge to its source and target vertex. Each vertex and edge is assigned to multiple possible labels through the labelling function $\ell:\VS\cup \ES\to \wp(\Sigma^*)$.  $\omega$ is a function mapping each vertex and edge into a relational tuple.

	In addition to the previous components defining a property graph, we also introduce functions representing \textit{vertex members} $\nestF\colon (\VS\cup \ES)\to\wp(\VS)$ and \textit{edge members} $\prov\colon(\VS\cup \ES)\to \wp(\ES)$. These functions induce the nesting by associating a set of vertices or edges to each vertex and edge. Each vertex or edge $o\in V\cup E$ induces a \textbf{nested (property) graph} as the following pair:
	\[G_o=\Braket{\nu(o),\Set{e\in\epsilon(o)|\lambda(e)\in (\cup_{n\geq 0}\;{\nu\epsilon}^{(n)}(\{o\}))^2}}\]
	where ${\nu\epsilon}$ returns the vertices contained in both vertices and edges ($\nu\epsilon(x)=\nu(x)\cup \nu(\epsilon(x))$). We denote $f(X){:=}\bigcup_{x\in X} f(x)$ when $X\subseteq \textup{dom}(f)$
\end{definition}


Since the member functions $\nu$ and $\epsilon$ induce the expansion of each single vertex or edge to a graph, we must avoid recursive nesting to support expanding operations.
% This condition is fundamental in order to define different levels as different abstractions over the data.
Therefore, we additionally introduce the following constraints to be set at a nested property graph database level:

\begin{axiom}[Recursion Constraints]
	For each correctly nested property graph, each vertex $v\in \VS$ must not contain $v$ at any level of containment of $\nu$ and, any of its descendants $m$ must not contain $v$:
	\[\forall v\in \VS. \forall m\in \nu^+(v).\;\; m\neq v\wedge v\notin \cup_{n\geq 1}\;{\nu\epsilon}^{(n)}(m)\]
	Similarly to vertices, any edge shall not contain itself at any nesting level:
	\[\forall e\in \ES. \forall m\in \epsilon^+(e). m\neq e\wedge e\notin \cup_{n\geq 1}\;{\epsilon\nu}^{(n)}(m)\]
	where ${\epsilon\nu}$ returns the edges contained in both vertices and edges ($\epsilon\nu(x)=\epsilon(x)\cup \epsilon(\nu(x))$)
\end{axiom}

A vertex $v$ having a non-empty vertex or edge members is called \textbf{nested vertex}, while vertices with no members are simply referred to \textbf{simple vertices}. For edges, we respectively use the terms \textbf{nested edges} and \textbf{simple edges}.

\begin{ex}[label=exImpl]
The property graph in Figure \ref{fig:inputbibex2} can be represented by the graph $G_{(0,11)}$, which is a nested vertex contained in the following nested graph database:
\[G=\Braket{\{(0,0),(0,1),\dots,(0,5),(0,11)\}, \{(0,6),\dots,(0,10)\},\lambda,\ell,\omega,\nu,\epsilon}\]

The nested vertex $(0,11)$  represents a \mstr{Bibliography} graph ($\ell(0,11)=[\mstr{Bibliography}]$), to which an empty tuple is associated ($\omega(0,11)=\{\}$). Its vertex ($\nu$) and edge ($\epsilon$) members are defined as follows:
\[\nu(0,11)=\{(0,0),\dots,(0,5)\}\quad\epsilon(0,11)=\{(0,6),\dots,(0,10)\}\]

The simple edge $6$ within the property graph in Figure \ref{fig:inputbibex2} ($\nu(0,6)=\epsilon(0,6)=\emptyset$) has now id $(0,6)$; it has one label, $\ell(0,6)=[\mstr{AuthorOf}]$, and it is associated to an empty tuple ($\omega(0,6)=\{\}$).
The source and target vertices are
$\lambda(0,6)=\Braket{(0,0),\;(0,3)}$. Similar considerations can be carried out for each  remaining edge.

The simple vertex $0$ in the same Figure has id $(0,0)$ in the present example; such vertex refers to the \mstr{Author} \texttt{Abigail Conner}. This information is represented as follows:
\[\ell(0,0)=[\mstr{Author}]\quad\nu(0,0)=\epsilon(0,0)=\emptyset\] \[\omega(0,0)=\{\texttt{\textbf{name}}\colon\texttt{Abigail},\texttt{\textbf{surname}}\colon\texttt{Conner}\}\]
Similar considerations can be carried out for each remaining vertex.

\end{ex}