docs/reference/python-integration.md
Lines changed: 31 additions & 19 deletions
@@ -96,43 +96,55 @@ We define a custom "primitive sort" (i.e. a builtin type) for `PyObject`s. This
 
 ### Saving Python Objects
 
-To create an expression of type `PyObject`, we call the call the constructor with any Python object. It will
-save a reference to the object:
+To create an expression of type `PyObject`, call the constructor with any Python object. The value is immediately
+serialized with `cloudpickle.dumps`, and the serialized bytes (base64 encoded when printed) are what get stored
+inside the e-graph. This means the e-graph keeps a snapshot of the object rather than a live reference.
 
 ```{code-cell} python
-PyObject(1)
-```
-
-We see that this as saved internally as a pointer to the Python object. For hashable objects like `int` we store two integers, a hash of the type and a has of the value.
+from dataclasses import dataclass
 
-We can also store unhashable objects in the e-graph like lists.
+@dataclass
+class MyObject:
+    a: int = 10
 
-```{code-cell} python
-lst = PyObject([1, 2, 3])
-lst
+PyObject(MyObject())
 ```
 
-We see that this is stored with one number, simply the `id` of the object.
+The new serialization approach works for both hashable and unhashable Python values, and no longer depends on
+their `id()`. Subsequent inserts of equal values round-trip through `cloudpickle` so the e-graph can identify and
+merge them by value.
 
-```{admonition}Mutable Objects
-:class: warning
+```{admonition}Serialization requirements
+:class: note
 
-While it is possible to store unhashable objects in the e-graph, you have to be careful defining any rules which create new unhashable objects. If each time a rule is run, it creates a new object, then the e-graph will never saturate.
-
-Creating hashable objects is safer, since while the rule might create new Python objects each time it executes, they should have the same hash, i.e. be equal, so that the e-graph can saturate.
+`PyObject` relies on `cloudpickle`. Any object you store must be serializable by `cloudpickle.dumps`; objects such
+as open file handles, generators, or extension types that `cloudpickle` cannot handle will raise an error when you
+try to construct a `PyObject`.
 ```
 
 ### Retrieving Python Objects
 
-Like other primitives, we can retrieve the Python object from the e-graph by using the `.value` property:
+Like other primitives, we can retrieve a Python object by using the `.value` property. Deserialization happens on
+every access, so you receive a fresh copy each time rather than the original object.
 
 ```{code-cell} python
-assert lst.value == [1, 2, 3]
+original = {"count": 1}
+expr = PyObject(original)
+
+restored = expr.value
+assert restored == original
+assert restored is not original
+
+# Mutating the copy does not affect the stored value.
+restored["count"] = 2
+assert expr.value == {"count": 1}
 ```
 
 ### Builtin methods
 
-Currently, we only support a few methods on `PyObject`s, but we plan to add more in the future.
+Currently, we only support a few methods on `PyObject`s, but we plan to add more in the future. Each builtin
+deserializes its inputs, performs the operation in Python, and then serializes the result back into a new
+`PyObject`, so previously stored values remain unchanged.
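
For readers of this change, the deserialize, operate, serialize pattern described for builtins can be sketched in plain Python. This is only an illustration under stated assumptions: the standard library `pickle` module stands in for `cloudpickle`, and the helpers `store`, `load`, and `apply_builtin` are hypothetical names that do not come from the library.

```python
import pickle

# Assumption: stdlib pickle stands in for cloudpickle, and these helper
# names are hypothetical; they are not part of the library's API.


def store(value) -> bytes:
    """Serialize a value into the bytes a snapshot would hold."""
    return pickle.dumps(value)


def load(blob: bytes):
    """Deserialize a fresh copy of the stored value."""
    return pickle.loads(blob)


def apply_builtin(op, *blobs: bytes) -> bytes:
    """Deserialize every input, run op in Python, serialize the result."""
    args = [load(blob) for blob in blobs]
    return store(op(*args))


left = store([1, 2])
right = store([3])

# The operation only ever sees copies, so the stored inputs are untouched.
combined = apply_builtin(lambda a, b: a + b, left, right)
```

Because each step works on copies, `load(left)` still yields the original list after the call, which matches the snapshot behaviour the updated docs describe.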