|
| 1 | +## Programs with Loops in SemGuS |
| 2 | + |
| 3 | +One differentiator of SemGuS is the ability to encode programs with loops. In this tutorial, we will walk |
| 4 | +through defining the syntax and semantics of a simple looping SemGuS problem. In particular, we want a |
| 5 | +function: |
| 6 | +```c |
| 7 | +multiply(int x, int y) { ... } |
| 8 | +``` |
| 9 | +that takes two integers and multiplies them together---but by using repeated addition. |
| 10 | +
|
| 11 | +> Note: we are _not_ synthesizing an SMT expression or function; rather, in this tutorial, we are synthesizing something more like a C-style function that is called functionally but is implemented with imperative statements, including loops. |
| 12 | +
|
| 13 | +### Defining a universe of terms |
| 14 | +
|
| 15 | +Our first step is to define a universe of terms; that is, an unrestricted grammar of all terms that our program |
| 16 | +will be defined with. We will further restrict this grammar later. |
| 17 | +
|
| 18 | +``` |
| 19 | +F ::= fn(x,y) { S; return E } |
| 20 | +S ::= x ← E | y ← E | r ← E | while B do { S } | S; S | noop |
| 21 | +E ::= x | y | r | E + E | E - E | 0 | 1 |
| 22 | +B ::= E < E |
| 23 | +``` |
| 24 | +
|
| 25 | +This is a simple imperative grammar that defines a function `fn(int x, int y) -> int`. Inside the |
| 26 | +function body, the `S` non-terminal defines assignment statements for `x`, `y`, and a register `r`, |
| 27 | +a composition operator `;`, an empty statement `noop`, and a `while` loop statement. Assignments, |
| 28 | +the loop condition, and the function return statement all take an integer expression `E`. |
| 29 | +
|
| 30 | +**Example**: Summing `x` and `y`. A program that returns the sum of `x` |
| 31 | +and `y` does not need any body statements, just the return: |
| 32 | +```c |
| 33 | +fn(x,y) { noop; return x + y } |
| 34 | +``` |
| 35 | + |
| 36 | +**Example**: `x`-sign is `y`-parity. A program that returns `x` if `y` is even and `-x` if `y` is odd. |
| 37 | +```c |
| 38 | +fn(x,y) { |
| 39 | + r ← x; |
| 40 | + while 0 < y do { |
| 41 | + y ← y - 1; |
| 42 | + r ← (x - r) - x |
| 43 | + // r oscillates between +/- x |
| 44 | + }; |
| 45 | + return r |
| 46 | +} |
| 47 | +``` |
| 48 | +
|
| 49 | +To encode this universe of terms in a SemGuS file, we use the `declare-term-types` command, which is similar to the `declare-datatypes` command native to SMT-LIB2. Each non-terminal |
| 50 | +maps to an algebraic datatype, and each production maps to a constructor for the given datatype. The grammar above, when converted to the SemGuS format, looks like: |
| 51 | +```smt |
| 52 | +(declare-term-types |
| 53 | + ;; Nonterminals |
| 54 | + ((F 0) (S 0) (E 0) (B 0)) |
| 55 | + |
| 56 | +;; Productions |
| 57 | + ((($function S E)) ; F |
| 58 | + (($x<- E) ; S |
| 59 | + ($y<- E) |
| 60 | + ($r<- E) |
| 61 | + ($while B S) |
| 62 | + ($seq S S) |
| 63 | + ($noop)) |
| 64 | + (($r) ; E |
| 65 | + ($0) |
| 66 | + ($1) |
| 67 | + ($x) |
| 68 | + ($y) |
| 69 | + ($+ E E) |
| 70 | + ($- E E)) |
| 71 | + (($< E E)))) ; B |
| 72 | +``` |
| 73 | +By convention, term type constructors are prefixed with `$`, simply to |
| 74 | +reduce namespace collisions; however, this is not required. The two |
| 75 | +example programs can thus be expressed as SMT expressions constructing a |
| 76 | +tree of term types: |
| 77 | +```smt |
| 78 | +; Sum x and y |
| 79 | +($function $noop ($+ $x $y)) |
| 80 | +
|
| 81 | +; x-sign is y-parity |
| 82 | +($function |
| 83 | + ($seq ($r<- $x) |
| 84 | + ($while ($< $0 $y) |
| 85 | + ($seq ($y<- ($- $y $1)) |
| 86 | + ($r<- ($- ($- $x $r) $x))))) |
| 87 | + $r) |
| 88 | +``` |
| 89 | + |
| 90 | +### Semantics as CHCs |
| 91 | +So far, we have specified the *syntax* of a program we could synthesize, but not its *semantics*. In SyGuS, for example, the syntax and semantics are equivalent; however, in SemGuS, we must imbue our |
| 92 | +productions with semantic meaning ourselves. |
| 93 | + |
| 94 | +In the SemGuS format, CHCs are encoded as `match` expressions over term types, with each CHC head getting a separate function. Each CHC head corresponds to a unique non-terminal, but a given non-terminal may have many CHC heads, depending on the desired problem semantics. The CHC head's signature contains the term being matched, the input tuple, and the output tuple. The outline of our example grammar's semantics looks like: |
| 95 | +```smt |
| 96 | +(define-funs-rec |
| 97 | + ;; CHC heads |
| 98 | + ((F.Sem ((t F) (x Int) (y Int) (ret Int)) Bool) |
| 99 | + (S.Sem ((t S) (xi Int) (yi Int) (ri Int) (xo Int) (yo Int) (ro Int)) Bool) |
| 100 | + (E.Sem ((t E) (xi Int) (yi Int) (ri Int) (out Int)) Bool) |
| 101 | + (B.Sem ((t B) (xi Int) (yi Int) (ri Int) (out Bool)) Bool)) |
| 102 | + |
| 103 | + ;; Bodies |
| 104 | + ((! (match t (...F productions...)) |
| 105 | + :input (x y) :output (ret)) |
| 106 | + |
| 107 | + ; Statements |
| 108 | + (! (match t (...S productions...)) |
| 109 | + :input (xi yi ri) :output (xo yo ro)) |
| 110 | + |
| 111 | + ; Int expressions |
| 112 | + (! (match t (...E productions...)) |
| 113 | + :input (xi yi ri) :output (out)) |
| 114 | +
|
| 115 | + ; Boolean expressions |
| 116 | + (! (match t (...B productions...)) |
| 117 | + :input (xi yi ri) :output (out)))) |
| 118 | +``` |
| 119 | + |
| 120 | +Our example grammar is interpreted with standard semantics as expected; however, some notes are warranted. Since we are using imperative semantics, each production takes an input state tuple of `(x, y, r)`. Productions from `S` return output state tuples `(x', y', r')`, while |
| 121 | +the remaining productions return a single expression value. |
| 122 | + |
| 123 | +**The Expression Productions.** Each expression production follows the same form of calling the semantics for child productions (if any) and then returning the output of the expression function. For example, `$r` is simply: |
| 124 | +```smt |
| 125 | +($r (= out ri)) ; <-- Set the output variable to r from the input tuple |
| 126 | +``` |
| 127 | +and `$<` is: |
| 128 | +```smt |
| 129 | +(($< t1 t2) |
| 130 | + (exists ((o1 Int) (o2 Int)) |
| 131 | + (and |
| 132 | + (E.Sem t1 xi yi ri o1) ; <-- Call first child |
| 133 | + (E.Sem t2 xi yi ri o2) ; <-- Call second child |
| 134 | + (= out (< o1 o2))))) ; <-- Set output to o1 < o2 |
| 135 | +``` |
| 136 | + |
| 137 | +**The Function Production.** The `$function` production handles converting between the desired functional semantics of the program and the internal imperative nature of the program. It creates an initial input state tuple `(x, y, 0)` and passes it to the `S` child, and then passes the output `(xo, yo, ro)` of `S` to the return expression `E` for a final output value. Encoded in SemGuS, this is a match: |
| 138 | +```smt |
| 139 | +(($function tbody treturn) |
| 140 | + (exists ((xo Int) (yo Int) (ro Int)) |
| 141 | + (and |
| 142 | + (S.Sem tbody x y 0 xo yo ro) ; <-- Call the statement body |
| 143 | + (E.Sem treturn xo yo ro ret)))) ; <-- Return the final expression |
| 144 | +``` |
| 145 | + |
| 146 | +**The Assignment Productions.** The assignment productions (e.g., `$r<-`) simply update the output state with the new value of the given variable and pass the rest through. For assigning, e.g., `r`, this looks like: |
| 147 | +```smt |
| 148 | +(($r<- te) |
| 149 | +(exists ((out Int)) |
| 150 | + (and |
| 151 | + (E.Sem te xi yi ri out) ; <-- Call the expression child |
| 152 | + (= xo xi) ; <-- Pass x through |
| 153 | + (= yo yi) ; <-- Pass y through |
| 154 | + (= ro out)))) ; <-- Update r with the new value |
| 155 | +``` |
| 156 | + |
| 157 | +**The Loop Production.** The `$while` loop production has two separate CHCs: one for the base case (`B` evaluates to false), and one for the recursive case (`B` evaluates to true and the body runs again). Both cases are encoded as two alternatives for the `$while` production: |
| 158 | +```smt |
| 159 | +(($while tb ts) |
| 160 | + ; Recursive case |
| 161 | + (exists ((b Bool) (x1 Int) (y1 Int) (r1 Int)) |
| 162 | + (and |
| 163 | + (B.Sem tb xi yi ri b) ; <-- Evaluates the condition |
| 164 | + (= b true) ; <-- Guard: checks the condition |
| 165 | + (S.Sem ts xi yi ri x1 y1 r1) ; <-- Calls the statement body |
| 166 | + (S.Sem t x1 y1 r1 xo yo ro))) ; <-- Calls the $while term again |
| 167 | + ; Base case |
| 168 | + (exists ((b Bool)) |
| 169 | + (and |
| 170 | + (B.Sem tb xi yi ri b) ; <-- Evaluates the condition |
| 171 | + (= b false) ; <-- Guard: checks the condition |
| 172 | + (= xo xi) ; <-- Noop: copy x |
| 173 | + (= yo yi) ; <-- Noop: copy y |
| 174 | + (= ro ri)))) ; <-- Noop: copy r |
| 175 | +``` |
| 176 | + |
| 177 | +### The Synthesis Task |
| 178 | +Similar to SyGuS, we use the `synth-fun` task to declare what we are attempting to synthesize. However, the SemGuS `synth-fun` specifies the *term* to be synthesized (as a datatype tree). Here, the basic form is: |
| 179 | +```smt |
| 180 | +(synth-fun mul () F) |
| 181 | +``` |
| 182 | +We want to synthesize a function named `mul` that takes no arguments and returns a value of type `F`; that is, a tree of terms rooted at the non-terminal `F`. This declaration can optionally be augmented with a further-restricted grammar, in the same format as in SyGuS. In this case, we massage the grammar a little to make synthesis more tractable by: |
| 183 | +* removing `$noop` |
| 184 | +* restricting `B` to use literals |
| 185 | +* making `$seq` only right-recursive |
| 186 | +* not allowing `$+` and `$-` to recurse |
| 187 | +* forcing a loop at the top-level of the function body |
| 188 | + |
| 189 | +```smt |
| 190 | +(synth-fun mul () F |
| 191 | + ((F F) (Loop S) (Seq S) (Inner S) (Op E) (Val E) (B B)) |
| 192 | + ((F F (($function Loop Val))) |
| 193 | + (Loop S (($while B Seq))) |
| 194 | + (Seq S (($seq Inner Seq) Inner)) |
| 195 | + (Inner S (($x<- Op) ($y<- Op) ($r<- Op))) |
| 196 | + (Op E (($+ Val Val) ($- Val Val) Val)) |
| 197 | + (Val E ($0 $1 $x $y $r)) |
| 198 | + (B B (($< Val Val))))) |
| 199 | +``` |
| 200 | +As SemGuS solvers mature, we hope to not need these restrictions in the future! |
| 201 | + |
| 202 | +### Constraints |
| 203 | +We will use programming-by-example (PBE) constraints, specified as a relation over the initial CHC head with `mul` as the term. |
| 204 | +```smt |
| 205 | +(constraint (F.Sem mul 0 0 0)) ; 0 * 0 = 0 |
| 206 | +(constraint (F.Sem mul 1 1 1)) ; 1 * 1 = 1 |
| 207 | +(constraint (F.Sem mul 2 2 4)) ; 2 * 2 = 4 |
| 208 | +(constraint (F.Sem mul 3 3 9)) ; 3 * 3 = 9 |
| 209 | +(constraint (F.Sem mul 5 3 15)) ; 5 * 3 = 15 |
| 210 | +(constraint (F.Sem mul 3 4 12)) ; 3 * 4 = 12 |
| 211 | +``` |
| 212 | + These examples show that we are looking for a program that will multiply `x` and `y`. More specifically, these constraints state that we are looking for a value of the term `mul` such that the relation `(F.Sem mul x y o)` is true for all `x`, `y`, and `o` such that `o = x*y`, with specific examples of `x`, `y`, and `o` given as constraints. |
| 213 | + |
| 214 | +### Synthesizing |
| 215 | +The final command in a SemGuS file is the `check-synth` command, which tells the solver to find the solution! |
| 216 | +```smt |
| 217 | +(check-synth) |
| 218 | +``` |
| 219 | + |
| 220 | +Given the above constraints, the solution to this synthesis problem can be found by the bottom-up enumerator in `ks2` [[link]](https://github.com/kjcjohnson/ks2-mono) in fewer than 10 seconds. Run the following and see: |
| 221 | +```bash |
| 222 | +$ ks2 solve loops.sem -s 'bue' |
| 223 | +loops.sem: |
| 224 | +; <...trimmed...> |
| 225 | +( |
| 226 | + (define-fun mul () F |
| 227 | + ($function |
| 228 | + ($while ($< $0 $y) |
| 229 | + ($seq ($y<- ($- $y $1)) |
| 230 | + ($r<- ($+ $r $x)))) |
| 231 | + $r)) |
| 232 | +) |
| 233 | +``` |
| 234 | +This is the program we were looking for. Nice! |
| 235 | + |
| 236 | +### Full Example SemGuS File |
| 237 | +[Download the full example here.](loops.sem) |
| 238 | + |
| 239 | +```smt |
| 240 | +;;;; |
| 241 | +;;;; loops.sem - multiply via a loop |
| 242 | +;;;; |
| 243 | +
|
| 244 | +;;; |
| 245 | +;;; Term types |
| 246 | +;;; |
| 247 | +(declare-term-types |
| 248 | + ;; Nonterminals |
| 249 | + ((F 0) (S 0) (E 0) (B 0)) |
| 250 | + |
| 251 | +;; Productions |
| 252 | + ((($function S E)) ; F |
| 253 | + (($x<- E) ; S |
| 254 | + ($y<- E) |
| 255 | + ($r<- E) |
| 256 | + ($while B S) |
| 257 | + ($seq S S) |
| 258 | + ($noop)) |
| 259 | + (($r) ; E |
| 260 | + ($0) |
| 261 | + ($1) |
| 262 | + ($x) |
| 263 | + ($y) |
| 264 | + ($+ E E) |
| 265 | + ($- E E)) |
| 266 | + (($< E E)))) ; B |
| 267 | +
|
| 268 | +;;; |
| 269 | +;;; Semantics |
| 270 | +;;; |
| 271 | +(define-funs-rec |
| 272 | + ;; CHC heads |
| 273 | + ((F.Sem ((t F) (x Int) (y Int) (ret Int)) Bool) |
| 274 | + (S.Sem ((t S) (xi Int) (yi Int) (ri Int) (xo Int) (yo Int) (ro Int)) Bool) |
| 275 | + (E.Sem ((t E) (xi Int) (yi Int) (ri Int) (out Int)) Bool) |
| 276 | + (B.Sem ((t B) (xi Int) (yi Int) (ri Int) (out Bool)) Bool)) |
| 277 | + |
| 278 | + ;; Bodies |
| 279 | + ((! (match t |
| 280 | + ((($function tbody treturn) |
| 281 | + (exists ((xo Int) (yo Int) (ro Int)) |
| 282 | + (and |
| 283 | + (S.Sem tbody x y 0 xo yo ro) |
| 284 | + (E.Sem treturn xo yo ro ret)))))) |
| 285 | + :input (x y) :output (ret)) |
| 286 | + |
| 287 | + ; Statements |
| 288 | + (! (match t |
| 289 | + ((($x<- te) |
| 290 | + (exists ((out Int)) |
| 291 | + (and |
| 292 | + (E.Sem te xi yi ri out) |
| 293 | + (= xo out) |
| 294 | + (= yo yi) |
| 295 | + (= ro ri)))) |
| 296 | + (($y<- te) |
| 297 | + (exists ((out Int)) |
| 298 | + (and |
| 299 | + (E.Sem te xi yi ri out) |
| 300 | + (= xo xi) |
| 301 | + (= yo out) |
| 302 | + (= ro ri)))) |
| 303 | + (($r<- te) |
| 304 | + (exists ((out Int)) |
| 305 | + (and |
| 306 | + (E.Sem te xi yi ri out) |
| 307 | + (= xo xi) |
| 308 | + (= yo yi) |
| 309 | + (= ro out)))) |
| 310 | + |
| 311 | + (($seq t1 t2) |
| 312 | + (exists ((x1 Int) (y1 Int) (r1 Int)) |
| 313 | + (and |
| 314 | + (S.Sem t1 xi yi ri x1 y1 r1) |
| 315 | + (S.Sem t2 x1 y1 r1 xo yo ro)))) |
| 316 | + (($noop) |
| 317 | + (and |
| 318 | + (= xi xo) |
| 319 | + (= yi yo) |
| 320 | + (= ri ro))) |
| 321 | + (($while tb ts) |
| 322 | + (exists ((b Bool) (x1 Int) (y1 Int) (r1 Int)) |
| 323 | + (and |
| 324 | + (B.Sem tb xi yi ri b) |
| 325 | + (= b true) |
| 326 | + (S.Sem ts xi yi ri x1 y1 r1) |
| 327 | + (S.Sem t x1 y1 r1 xo yo ro))) |
| 328 | + (exists ((b Bool)) |
| 329 | + (and |
| 330 | + (B.Sem tb xi yi ri b) |
| 331 | + (= b false) |
| 332 | + (= xo xi) (= yo yi) (= ro ri)))))) |
| 333 | + :input (xi yi ri) :output (xo yo ro)) |
| 334 | + |
| 335 | + ; Int expressions |
| 336 | + (! (match t |
| 337 | + (($0 (= out 0)) |
| 338 | + ($1 (= out 1)) |
| 339 | + ($x (= out xi)) |
| 340 | + ($y (= out yi)) |
| 341 | + ($r (= out ri)) |
| 342 | + (($+ t1 t2) |
| 343 | + (exists ((o1 Int) (o2 Int)) |
| 344 | + (and |
| 345 | + (E.Sem t1 xi yi ri o1) |
| 346 | + (E.Sem t2 xi yi ri o2) |
| 347 | + (= out (+ o1 o2))))) |
| 348 | + (($- t1 t2) |
| 349 | + (exists ((o1 Int) (o2 Int)) |
| 350 | + (and |
| 351 | + (E.Sem t1 xi yi ri o1) |
| 352 | + (E.Sem t2 xi yi ri o2) |
| 353 | + (= out (- o1 o2))))))) |
| 354 | + :input (xi yi ri) :output (out)) |
| 355 | +
|
| 356 | + ; Bool expressions |
| 357 | + (! (match t |
| 358 | + ((($< t1 t2) |
| 359 | + (exists ((o1 Int) (o2 Int)) |
| 360 | + (and |
| 361 | + (E.Sem t1 xi yi ri o1) |
| 362 | + (E.Sem t2 xi yi ri o2) |
| 363 | + (= out (< o1 o2))))))) |
| 364 | + :input (xi yi ri) :output (out)))) |
| 365 | +
|
| 366 | +;;; |
| 367 | +;;; Function to synthesize - a term rooted at F |
| 368 | +;;; |
| 369 | +
|
| 370 | +(synth-fun mul () F |
| 371 | + ((F F) (Loop S) (Seq S) (Inner S) (Op E) (Val E) (B B)) |
| 372 | + ((F F (($function Loop Val))) |
| 373 | + (Loop S (($while B Seq))) |
| 374 | + (Seq S (($seq Inner Seq) Inner)) |
| 375 | + (Inner S (($x<- Op) ($y<- Op) ($r<- Op))) |
| 376 | + (Op E (($+ Val Val) ($- Val Val) Val)) |
| 377 | + (Val E ($0 $1 $x $y $r)) |
| 378 | + (B B (($< Val Val))))) |
| 379 | +
|
| 380 | +;;; |
| 381 | +;;; Constraints - examples |
| 382 | +;;; |
| 383 | +(constraint (F.Sem mul 0 0 0)) |
| 384 | +(constraint (F.Sem mul 1 1 1)) |
| 385 | +(constraint (F.Sem mul 2 2 4)) |
| 386 | +(constraint (F.Sem mul 3 3 9)) |
| 387 | +(constraint (F.Sem mul 5 3 15)) |
| 388 | +(constraint (F.Sem mul 3 4 12)) |
| 389 | +
|
| 390 | +;;; |
| 391 | +;;; Instruct the solver to find swap |
| 392 | +;;; |
| 393 | +(check-synth) |
| 394 | +``` |
0 commit comments