Commit db23b7c

Some notes on Region tracking
There are two strategies for tracking the regions: strict or lazy. Guido proposed lazy, where the region is only checked on certain `send`-like operations. This has worse overall complexity, but can be less invasive on the runtime. Strict requires all changes in topology to be tracked. This is an invasive change, but could be more efficient.
1 parent efd0d30 commit db23b7c

1 file changed

Lines changed: 48 additions & 0 deletions

Doc/regions.md

@@ -454,6 +454,54 @@ The local reference count for the `region` is modified in the following way.
Thus, the resulting change is to increase the LRC by 5, and then reduce it by 3.


#### Strict region calculation

This approach intercepts changes to the object graph and maintains the LRC correctly at the point of each change.
We might be able to realise this in Python by modifying every call to `Py_INCREF` and `Py_DECREF` to take an additional parameter identifying the source of the reference.

* `Py_INCREF2(target, source)` is a function that increments the reference count of `target` by 1.

  - If `source` is a local object, and `target` is in a region, then it also increases the reference count of `target`'s region.
  - If `source` is in a region, and `target` is a local object, then it promotes the local object into that region.
  - If `source` is in a region, and `target` is in a different region, then it raises an error.
  - [TODO deal with nested regions]

* `Py_DECREF2(target, source)` is a function that decrements the reference count of `target` by 1.

  - If `source` is a local object, and `target` is in a region, then it also decreases the reference count of `target`'s region.
  - [TODO deal with nested regions]

This approach could be the most efficient, since `send` then needs no traversal of the region, but it requires modifying the runtime to intercept all writes to the object graph.
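
The rules above can be sketched as a toy Python model. Everything here is hypothetical and for illustration only: `Region`, `Obj`, `RegionError`, `incref2` and `decref2` are stand-ins, not part of CPython, and `region is None` is assumed to mean "local object". Nested regions are left out, as in the TODOs, and promotion only retags the single object rather than anything it already references.

```python
class RegionError(Exception):
    """Raised when a reference would link two different regions."""


class Region:
    """Hypothetical region record holding its local reference count (LRC)."""

    def __init__(self, name):
        self.name = name
        self.lrc = 0


class Obj:
    """Toy object; `region is None` means the object is local (assumption)."""

    def __init__(self, region=None):
        self.rc = 0
        self.region = region


def incref2(target, source):
    """Model of Py_INCREF2(target, source); nested regions are not handled."""
    if source.region is None:
        if target.region is not None:
            target.region.lrc += 1      # local -> region reference
    elif target.region is None:
        # Simplification: only this object is retagged on promotion.
        target.region = source.region
    elif target.region is not source.region:
        raise RegionError("reference between different regions")
    target.rc += 1


def decref2(target, source):
    """Model of Py_DECREF2(target, source); nested regions are not handled."""
    target.rc -= 1
    if source.region is None and target.region is not None:
        target.region.lrc -= 1          # a local -> region reference went away
```

In this model the region checks run before the ordinary count is touched, so a rejected cross-region reference leaves `target.rc` unchanged.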

#### Lazy region calculation

An alternative is not to maintain the LRC or the region property, but to check it on calls to `send`.
By traversing a region starting at the entry point, we can determine whether it has any external references.
We calculate the total RC of the objects in the region by a DFS traversal,
and then subtract the number of references followed by the DFS that are not to immutable objects.

If this results in a non-zero count, then there are references into the region from outside.
This is an error, and we should report it.
Otherwise, it is safe to send the region.
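
A minimal sketch of that check, again as a toy Python model rather than runtime code. `Obj`, `add_ref`, `external_refs` and `send` are hypothetical names; `refs` stands in for whatever `tp_traverse` would visit, and the reference that the caller of `send` holds to the entry point is assumed to have been handed over, so it is not included in `rc`.

```python
class Obj:
    """Toy object with an explicit reference count and outgoing references."""

    def __init__(self, immutable=False):
        self.rc = 0
        self.refs = []          # stand-in for what tp_traverse would visit
        self.immutable = immutable


def add_ref(source, target):
    """Record a reference from `source` to `target` with no region tracking."""
    source.refs.append(target)
    target.rc += 1


def external_refs(entry):
    """Return the summed RC of the region minus the references found inside it."""
    seen = {entry}
    stack = [entry]
    total_rc = 0
    internal = 0
    while stack:                # DFS from the entry point, explicit stack
        obj = stack.pop()
        total_rc += obj.rc
        for target in obj.refs:
            if target.immutable:
                continue        # references to immutable objects are not counted
            internal += 1       # every followed reference counts, even to seen objects
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return total_rc - internal


def send(entry):
    """Raise if anything outside the region still points into it."""
    if external_refs(entry) != 0:
        raise RuntimeError("region has external references")
    return entry
```

Each object is visited once and each outgoing reference is examined once, so the cost is linear in the size of the region reachable from the entry point.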

The lazy approach is much better for compatibility with existing code, as it only requires the types involved to support the `tp_traverse` operation.

It may perform worse if a region is sent multiple times.
Each `send` requires an O(region size) operation, but the part of the region actually accessed may be considerably smaller.


#### Hybrid approach

We can combine the two approaches.
Effectively, if the code run on an interpreter is using the new `Py_INCREF2` and `Py_DECREF2` API, then we do not need to perform the lazy calculation on `send`.
However, if any code uses the legacy API, then we need to perform the lazy calculation on `send`.

In effect, `Py_INCREF` and `Py_DECREF` will mark the interpreter as "dirty", and then `send` will check if the interpreter is dirty, and perform the lazy calculation if it is.
This allows for a gradual migration of the codebase to the new API.

An interpreter can be unmarked once it begins a new concurrent behaviour.
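
The dirty-flag protocol might look roughly like the following toy Python model. The `Interpreter` class, its method names and the `lazy_check` callback are all hypothetical; the sketch only shows when the O(region size) check would run, not how the reference counts themselves are updated.

```python
class Interpreter:
    """Toy interpreter state carrying the hypothetical "dirty" flag."""

    def __init__(self):
        self.dirty = False
        self.lazy_checks = 0    # how many times the slow path has run

    def legacy_refcount_op(self):
        """Stand-in for Py_INCREF/Py_DECREF: the source is unknown."""
        self.dirty = True

    def tracked_refcount_op(self):
        """Stand-in for Py_INCREF2/Py_DECREF2: tracking stays accurate."""
        # Nothing to mark; the strict bookkeeping is assumed to happen here.

    def begin_behaviour(self):
        """Starting a new concurrent behaviour unmarks the interpreter."""
        self.dirty = False

    def send(self, region, lazy_check):
        """Run the lazy calculation only if legacy calls were seen."""
        if self.dirty:
            self.lazy_checks += 1
            lazy_check(region)  # expected to raise on external references
        return region
```

Code that only ever goes through the tracked operations never pays for the traversal, while a single legacy call is enough to force the lazy check on the next `send`.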

### Immutable objects

To share objects between behaviours, we need to ensure that the object is not mutated.
