gh-143008: Fix Null pointer dereferences in TextIOWrapper underlying stream access#145957
gh-143008: Fix Null pointer dereferences in TextIOWrapper underlying stream access#145957cmaloney wants to merge 12 commits into
Conversation
TextIOWrapper keeps its underlying stream in a member called `self->buffer`. That stream can be detached by user code, such as custom `.flush` implementations resulting in `self->buffer` being set to NULL. The implementation often checked at the start of functions if `self->buffer` is in a good state, but did not always recheck after other Python code was called which could modify `self->buffer`. The cases which need to be re-checked are hard to spot so rather than rely on reviewer effort create better safety by making all self->buffer access go through helper functions. Thank you yihong0618 for the test, NEWS and initial implementation in pythongh-143041. Co-authored-by: yihong0618 <zouzou0208@gmail.com>
|
wow that is cool thanks |
|
@colesbury could you take a look as you've done some recent work on TextIO safety around threading. These cases are generally single threaded but generally the pattern "Implementation usually checked state; then did an operation which could modify it and didn't recheck". |
|
I'm a bit confused about the relationship here between |
|
Less important, but please match the CPython style for C code, e.g.:
|
The more complicated case is during deletion where there is an interaction with the base class. Adding to the complexity here all the objects involved and the multiple operations (
Sorry for missing those bits, doing a review of that and will update for it shortly. |
|
There are a couple cases I'm not sure how to PEP 7, in particular: /* hits 81 cols */
PyObject *bytes = _textiowrapper_buffer_callmethod_noargs(self,
&_Py_ID(read));
/* hits 81 cols */
res = _textiowrapper_buffer_callmethod_onearg(self,
&_Py_ID(_dealloc_warn),
(PyObject *)self);
PyObject *input_chunk = _textiowrapper_buffer_callmethod_onearg(self,
&_Py_ID(read), bytes_to_feed);A couple directions I looked at but don't know how to decide between:
|
|
I think it's fine to make the function names shorter, i.e., In general the preference is for wrapping function calls like the following (from PEP 7 example): PyErr_Format(PyExc_TypeError,
"cannot create '%.100s' instances",
type->tp_name);But if that's not practical due to overly long lines, than just try to fit the style of surround code. For example, from elsewhere in PyObject *incrementalDecoder = PyObject_CallFunctionObjArgs(
(PyObject *)state->PyIncrementalNewlineDecoder_Type,
self->decoder, self->readtranslate ? Py_True : Py_False, NULL); |
|
This should also resolve gh-143007 |
|
I think thish is ready for re-review @colesbury |
TextIOWrapper keeps its underlying stream in a member called
self->buffer. That stream can be detached by user code, such as custom.flush()implementations, resulting inself->bufferbeing set to NULL. The implementation often checked at the start of functions ifself->bufferis in a good state but did not always recheck after other Python code which could modifyself->buffer.The cases which need to be re-checked are hard to spot so rather than rely on reviewer effort make a better safety net by changing all
self->bufferaccess to go through helper functions.Thank you @yihong0618 for the test, NEWS and an initial implementation in gh-143041.
TextIOWrapper.truncatevia re-entrantflush#143008