docs/manual/patterns.md
Lines changed: 36 additions & 78 deletions
@@ -60,118 +60,96 @@ but this pattern does:
 (plays/P * * ...)
 ```

-## Non-strict search
+## Type and subtype matching

-Non-strict search allows for patterns to match atoms in the most general way, meaning that, if a subtype or other roles are not specified in the pattern, then any subtypes or argroles will match, as can be seen in this example:
+If only a type but not a subtype is specified in the pattern, then any subtypes of the given type will match, as can be seen in this example:
-Even though the full type and roles of `plays/Pd.so` and `alice/Cp.s` are not specified in the pattern, they still match the more general corresponding atoms `plays/P` and `alice/C`.
-
-Non-strict search is semantically more powerful, at the expense of performance. Strict search can take advantage of the structure of the hypergraph database to perform fast queries, while non-strict search iterates through all edges in the hypergraph looking for matches.
+Even though the full types of `plays/Pd.so`, `alice/Cp` and `chess/Cc` are not specified in the pattern, they still match the more general corresponding atoms `plays/P.so` and `*/C`.
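
The subtype rule described above can be illustrated with a small stand-alone sketch. This is not the library's implementation: `split_atom` and `atom_matches` are hypothetical helper names, atoms are assumed to have only the form `root/type.argroles`, and argroles are compared literally because the text does not say how omitted argroles are treated.

```python
def split_atom(atom):
    """Split 'plays/Pd.so' into ('plays', 'Pd', 'so')."""
    root, _, role = atom.partition('/')
    atom_type, _, argroles = role.partition('.')
    return root, atom_type, argroles


def atom_matches(pattern, atom):
    """Toy check: does a pattern atom match a concrete atom?"""
    p_root, p_type, p_roles = split_atom(pattern)
    a_root, a_type, a_roles = split_atom(atom)
    # A '*' root is a wildcard; otherwise roots must be identical.
    root_ok = p_root == '*' or p_root == a_root
    # A pattern type such as 'P' matches any subtype such as 'Pd'.
    type_ok = a_type.startswith(p_type)
    # Assumption: argroles are compared exactly as written.
    roles_ok = p_roles == a_roles
    return root_ok and type_ok and roles_ok


print(atom_matches('plays/P.so', 'plays/Pd.so'))  # True
print(atom_matches('*/C', 'alice/Cp'))            # True
print(atom_matches('*/C', 'chess/Cc'))            # True
print(atom_matches('*/C', 'plays/Pd.so'))         # False
```
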

 ## Matching argroles

-Argroles can be specified in patterns. So:
+As we have seen, argroles can be specified in patterns. So:
It is often desirable to match for the presence of a given set of argroles, independently of their respective positions, or of the presence of further argroles outside the set. This is indicated by surrounding with curly brackets the set of argroles that is to be matched in this way. For example:

-```
+```clojure
 (is/P.{sc} * */C)
 ```

 The above pattern would match both (independently of position):

-```
+```clojure
 (is/P.sc (the/M sky/C) blue/C)
 (is/P.cs blue/C (the/M sky/C))
 ```

 and also (independently of the presence of further argroles outside the set):
In fact, when specifying argroles, more often than not this is the behavior that is the most useful, because it allows for the matching of the participants of a relationship purely according to the role they play in it (subject, object, etc.).
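
As a rough illustration of the curly-bracket rule, the sketch below compares only the argrole strings. It is a simplified stand-in, not the library's matcher: `argroles_match` is a hypothetical name, and the extra role letter `x` in the usage lines is just a placeholder for "some further argrole".

```python
def argroles_match(pattern_roles, edge_roles):
    """Toy check of pattern argroles against an edge's argroles."""
    if pattern_roles.startswith('{') and pattern_roles.endswith('}'):
        # Set-style matching: every role inside the brackets must be
        # present, in any position; further roles are allowed.
        remaining = list(edge_roles)
        for role in pattern_roles[1:-1]:
            if role not in remaining:
                return False
            remaining.remove(role)
        return True
    # Without brackets the sequence must match exactly.
    return pattern_roles == edge_roles


print(argroles_match('{sc}', 'sc'))   # True
print(argroles_match('{sc}', 'cs'))   # True  (position does not matter)
print(argroles_match('{sc}', 'scx'))  # True  (extra role is allowed)
print(argroles_match('{sc}', 's'))    # False (role 'c' is missing)
print(argroles_match('sc', 'cs'))     # False (no brackets: order matters)
```
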

 Sometimes it is also desirable to explicitly forbid certain argument roles. This is achieved by indicating them after '-' in the argrole sequence. For example:
-When using `Hypergraph.search()`, order-independent (curly-braces) and argrole exclusions (-) only work in non-strict mode.
-
 ## Patterns with variables for information extraction

 Let us introduce the concept of *variable*. Like a wildcard, a variable indicates a placeholder that can match a hyperedge, but can then be used to refer to that matched hyperedge. In SH representation, an atom label that starts with upper case represents a variable. For example: `PLAYER/C`. One can define perfectly valid hyperedges that include variables, as well as wildcards, so for example:

-```
+```clojure
 (plays/P.{so} PLAYER/C *)
 ```

-Then the `match_pattern(edge, pattern)` function can be used to apply patterns to edges. It works like this:
+Then `edge.match(pattern)` can be used to apply patterns to edges. It works like this:

-```pycon
->>> from hyperbase import hedge
->>> from hyperbase.patterns import match_pattern
->>> pattern = hedge('(plays/P.{so} PLAYER/C *)')
->>> edge = hedge('(plays/P.so mary/C *)')
->>> match_pattern(edge, pattern)
-[{'PLAYER': mary/C}]
-```
-
-So, `match_pattern` gives a list of dictionaries (one pattern can match the same edge in several ways). Each dictionary represents a match, and assigns a value to a variable.
-
-The `Hypergraph` object provides the `match()` method, which is similar to `search()` but returns dictionaries with the matched variables. Like search, it offers a non-strict mode with the same trade-offs:
-The output is a list of tuples, where the first item is the matched hyperedge and the second is a dictionary with variables and their values.
+So, `edge.match(pattern)` gives a list of dictionaries (one pattern can match the same edge in several ways). Each dictionary represents a match, and assigns values to the pattern variable(s).
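
To make the "list of dictionaries" result shape concrete, here is a minimal stand-alone sketch that does not use the library. It assumes edges are written as nested Python tuples of atom strings, handles only the bare `*` wildcard, matches strictly by position, and ignores the type written on a variable; `is_variable` and `match` are hypothetical names chosen for this illustration.

```python
def is_variable(atom):
    # Per the text: an atom label starting with upper case is a variable.
    return atom[0].isupper()


def match(edge, pattern, bindings=None):
    """Return a list of variable bindings; an empty list means no match."""
    bindings = dict(bindings or {})
    if isinstance(pattern, str):
        if pattern == '*':
            return [bindings]
        if is_variable(pattern):
            name = pattern.split('/')[0]
            # A variable seen twice must bind to the same sub-edge.
            if name in bindings and bindings[name] != edge:
                return []
            bindings[name] = edge
            return [bindings]
        return [bindings] if pattern == edge else []
    if isinstance(edge, str) or len(edge) != len(pattern):
        return []
    results = [bindings]
    for sub_edge, sub_pattern in zip(edge, pattern):
        results = [new for old in results
                   for new in match(sub_edge, sub_pattern, old)]
    return results


pattern = ('plays/P.so', 'PLAYER/C', '*')
edge = ('plays/P.so', 'mary/C', 'chess/C')
print(match(edge, pattern))  # [{'PLAYER': 'mary/C'}]
```

Because this toy matcher is purely positional, it returns at most one dictionary per edge; the list return type only mirrors the shape described above, where a single pattern may match the same edge in several ways.
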

 ## Functional patterns

-Even more sophisticated patterns can be represented with the help of functional pattern expressions. These expressions are akin to function application in LISP-like languages, and take the general form:
+More sophisticated patterns can be represented with the help of functional pattern expressions. These expressions are akin to function application in LISP-like languages, and take the general form:

-```
+```clojure
 (functional-pattern-name argument_1 ...)
 ```

@@ -182,76 +160,56 @@ Even more sophisticated patterns can be represented with the help of functional

 The `atoms` functional pattern matches any edge that contains all the atoms provided as arguments, at any depth:

-```
+```clojure
 (atoms atom_1 ...)
 ```

 For example this pattern:

-```
+```clojure
 (atoms going/P)
 ```

 would match the edge:

-```
+```clojure
 (is/M (not/M going/P))
 ```

 In the same vein, this pattern:

-```
+```clojure
 (atoms not/M going/P)
 ```

 would equally match the edge:

-```
+```clojure
 (is/M (not/M going/P))
 ```

 but not:

-```
+```clojure
 (is/M going/P)
 ```

 Furthermore, the atoms can define wildcards and all the pattern syntax specified above, for example:

-```
+```clojure
 (atoms not/M */P)
 ```
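
The behaviour of `atoms` in the examples above can be approximated with a short stand-alone sketch. It is an illustration only, not the library's code: edges are assumed to be nested Python tuples of atom strings, `iter_atoms`, `atom_ok` and `atoms_match` are hypothetical names, and the only pattern syntax supported is a `*` root and type prefixes.

```python
def iter_atoms(edge):
    """Yield every atom of a nested-tuple edge, at any depth."""
    if isinstance(edge, str):
        yield edge
    else:
        for sub_edge in edge:
            yield from iter_atoms(sub_edge)


def atom_ok(pattern_atom, atom):
    """Compare root and type; a '*' root matches any root."""
    p_root, _, p_type = pattern_atom.partition('/')
    a_root, _, a_type = atom.partition('/')
    return (p_root == '*' or p_root == a_root) and a_type.startswith(p_type)


def atoms_match(edge, *required):
    """True if every required atom pattern is found somewhere in the edge."""
    found = list(iter_atoms(edge))
    return all(any(atom_ok(req, atom) for atom in found) for req in required)


edge = ('is/M', ('not/M', 'going/P'))
print(atoms_match(edge, 'going/P'))                          # True
print(atoms_match(edge, 'not/M', 'going/P'))                 # True
print(atoms_match(('is/M', 'going/P'), 'not/M', 'going/P'))  # False
print(atoms_match(edge, 'not/M', '*/P'))                     # True
```
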

-### Lemma
-
-The `lemma` functional pattern matches any atom which has the same lemma as the one specified. This functional pattern only works in the context of a hypergraph database that contains lemma information, that can be generated by the parsers provided with hyperbase.
-
-For example, this pattern:
-
-```
-(lemma be/P)
-```
-
-could be used to match:
-
-```
-is/P
-was/P
-```
-
-!!! note
-When using `lemma`, it is necessary to specify some semantic hypergraph object when calling the pattern matching function: `match_pattern(edge, pattern, hg=hg)`.
-
 ### Var

 The `var` functional pattern is used to specify a part of a pattern while also capturing it as a variable. It has the general form:

-```
+```clojure
 (var pattern-edge variable-name)
 ```

 This way, a complex expression such as the following can be captured in a variable: