Commit 51d2349

Merge pull request #244 from labthings/refactor-blob
Make use of `url_for_middleware` to tidy up `Blob` and `Invocation` URLs
2 parents 794fd23 + 9b116fc commit 51d2349

13 files changed

Lines changed: 796 additions & 567 deletions

dev-requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -321,6 +321,7 @@ tabulate==0.9.0
     # via sphinx-toolbox
 tomli==2.2.1
     # via
+    #   labthings-fastapi (pyproject.toml)
     #   coverage
     #   flake8-pyproject
     #   mypy

docs/source/blobs.rst

Lines changed: 5 additions & 5 deletions
@@ -7,14 +7,14 @@ Blob input/output

 If interactions require only simple data types that can easily be represented in JSON, very little thought needs to be given to data types - strings and numbers will be converted to and from JSON automatically, and your Python code should only ever see native Python datatypes whether it's running on the server or a remote client. However, if you want to transfer larger data objects such as images, large arrays or other binary data, you will need to use a `.Blob` object.

-`.Blob` objects are not part of the Web of Things specification, which doesn't give much consideration to returning large or complicated datatypes. In LabThings-FastAPI, the `.Blob` mechanism is intended to provide an efficient way to work with arbitrary binary data. If it's used to transfer data between two Things on the same server, the data should not be copied or otherwise iterated over - and when it must be transferred over the network it can be done using a binary transfer, rather than embedding in JSON with base64 encoding.
+`.Blob` objects are not part of the Web of Things specification, which doesn't give much consideration to returning large or complicated datatypes. In LabThings-FastAPI, the `.Blob` mechanism is intended to provide an efficient way to work with arbitrary binary data. If a `.Blob` is passed between two Things on the same server, the data will not be copied - and when it must be transferred over the network it can be done using a binary transfer, rather than embedding in JSON with base64 encoding.

-A `.Blob` consists of some data and a MIME type, which sets how the data should be interpreted. It is best to create a subclass of `.Blob` with the content type set: this makes it clear what kind of data is in the `.Blob`. In the future, it might be possible to add functionality to `.Blob` subclasses, for example to make it simple to obtain an image object from a `.Blob` containing JPEG data. However, this will not currently work across both client and server code.
+A `.Blob` consists of some data and a MIME type, which sets how the data should be interpreted. It is best to create a subclass of `.Blob` with the ``media_type`` set: this makes it clear what kind of data is in the `.Blob`. In the future, it might be possible to add functionality to `.Blob` subclasses, for example to make it simple to obtain an image object from a `.Blob` containing JPEG data. However, this will not currently work across both client and server code.

 Creating and using `.Blob` objects
 ------------------------------------------------

-Blobs can be created from binary data that is in memory (a `bytes` object) with `.Blob.from_bytes`, on disk (with `.Blob.from_temporary_directory` or `.Blob.from_file`), or using a URL as a placeholder. The intention is that the code that uses a `.Blob` should not need to know which of these is the case, and should be able to use the same code regardless of how the data is stored.
+Blobs can be created from binary data that is in memory (a `bytes` object) with `.Blob.from_bytes`, on disk (with `.Blob.from_temporary_directory` or `.Blob.from_file`). A `.Blob` may also point to remote data (see `.Blob.from_url`). Code that uses a `.Blob` should not need to know how the data is stored, as the interface is the same in each case.

 Blobs offer three ways to access their data:

@@ -122,7 +122,7 @@ On the client, we can use the `capture_image` action directly (as before), or we
 HTTP interface and serialization
 --------------------------------

-`.Blob` objects are subclasses of `pydantic.BaseModel`, which means they can be serialized to JSON and deserialized from JSON. When this happens, the `.Blob` is represented as a JSON object with `.Blob.url` and `.Blob.content_type` fields. The `.Blob.url` field is a link to the data. The `.Blob.content_type` field is a string representing the MIME type of the data. It is worth noting that models may be nested: this means an action may return many `.Blob` objects in its output, either as a list or as fields in a `pydantic.BaseModel` subclass. Each `.Blob` in the output will be serialized to JSON with its URL and content type, and the client can then download the data from the URL, one download per `.Blob` object.
+`.Blob` objects can be serialized to JSON and deserialized from JSON. When this happens, the `.Blob` is represented as a JSON object with ``href`` and ``content_type`` fields. The ``href`` field is a link to the data. The ``content_type`` field is a string representing the MIME type of the data. It is worth noting that models may be nested: this means an action may return many `.Blob` objects in its output, either as a list or as fields in a `pydantic.BaseModel` subclass. Each `.Blob` in the output will be serialized to JSON with its URL and content type, and the client can then download the data from the URL, one download per `.Blob` object.

 When a `.Blob` is serialized, the URL is generated with a unique ID to allow it to be downloaded. The URL is not guaranteed to be permanent, and should not be used as a long-term reference to the data. For `.Blob` objects that are part of the output of an action, the URL will expire after 5 minutes (or the retention time set for the action), and the data will no longer be available for download after that time.

@@ -136,7 +136,7 @@ It may be possible to have actions return binary data directly in the future, bu

 .. note::

-    Serialising or deserialising `.Blob` objects requires access to the `.BlobDataManager`\ . As there is no way to pass this in to the relevant methods at serialisation/deserialisation time, we use context variables to access them. This means that a `.blob_serialisation_context_manager` should be used to set (and then clear) those context variables. This is done by the `.BlobIOContextDep` dependency on the relevant endpoints (currently any endpoint that may return the output of an action).
+    Serialising or deserialising `.Blob` objects generates URLs, which are specific to the HTTP request. This means that `.Blob` objects cannot be serialised or deserialised outside the context of an HTTP request handler, so if code in an Action or Property attempts to turn a `.Blob` into JSON, it is likely to raise exceptions. For more detail on this mechanism, see `.middleware.url_for`\ .


 Memory management and retention
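
The note added in the hunk above says that serialising a `.Blob` only works inside an HTTP request handler, because the URL depends on the request. The sketch below illustrates one way such a restriction can arise, using a context variable that is set for the duration of a request. It is not the LabThings-FastAPI implementation: `request_context`, `serialise_blob`, and the `/blob/{id}` path are hypothetical names chosen for illustration, and the `media_type` key follows the client code changed in this commit.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Callable, Iterator

# Hypothetical: holds a function that turns a path into an absolute URL.
# It has no default, so reading it outside a request raises LookupError.
_url_resolver: ContextVar[Callable[[str], str]] = ContextVar("url_resolver")


@contextmanager
def request_context(base_url: str) -> Iterator[None]:
    """Make URL generation available for the duration of one request."""
    token = _url_resolver.set(lambda path: base_url.rstrip("/") + path)
    try:
        yield
    finally:
        # Always clear the variable so it cannot leak into other code.
        _url_resolver.reset(token)


def serialise_blob(blob_id: str, media_type: str) -> dict[str, str]:
    """Build a JSON-ready reference to a blob, or fail outside a request."""
    try:
        resolve = _url_resolver.get()
    except LookupError as e:
        raise RuntimeError("No request context: cannot build a URL") from e
    return {"href": resolve(f"/blob/{blob_id}"), "media_type": media_type}
```

Inside `with request_context("http://localhost:5000/"):` the call `serialise_blob("abc", "image/jpeg")` returns a dictionary with an absolute `href`; outside it, the same call raises `RuntimeError`, which mirrors the behaviour the note warns about.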

pyproject.toml

Lines changed: 4 additions & 0 deletions
@@ -41,6 +41,7 @@ dev = [
     "sphinx>=7.2",
     "sphinx-autoapi",
     "sphinx-toolbox",
+    "tomli; python_version < '3.11'",
     "codespell",
 ]

@@ -171,5 +172,8 @@ check-return-types = false
 check-class-attributes = false # prefer docstrings on the attributes
 check-yield-types = false # use type annotations instead

+[tool.codespell]
+ignore-words-list = ["ser"]
+
 [project.scripts]
 labthings-server = "labthings_fastapi.server.cli:serve_from_cli"
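
The `tomli; python_version < '3.11'` marker in the hunks above installs the `tomli` backport only on interpreters that lack the standard-library `tomllib` module, which was added in Python 3.11. The commit does not say why it is needed; presumably a tool such as codespell uses it to read its new `[tool.codespell]` table from `pyproject.toml`. A minimal sketch of the usual import fallback, parsing the table added here:

```python
import sys

if sys.version_info >= (3, 11):
    import tomllib  # standard library from Python 3.11 onwards
else:
    import tomli as tomllib  # backport with the same API (pip install tomli)

# The table added to pyproject.toml in this commit, inlined as a string.
config = tomllib.loads('[tool.codespell]\nignore-words-list = ["ser"]\n')
ignored = config["tool"]["codespell"]["ignore-words-list"]
print(ignored)
```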

src/labthings_fastapi/actions.py

Lines changed: 16 additions & 49 deletions
@@ -39,6 +39,8 @@
 from fastapi import FastAPI, HTTPException, Request, Body, BackgroundTasks
 from pydantic import BaseModel, create_model

+from labthings_fastapi.middleware.url_for import URLFor
+
 from .base_descriptor import BaseDescriptor
 from .logs import add_thing_log_destination
 from .utilities import model_to_dict, wrap_plain_types_in_rootmodel
@@ -47,10 +49,8 @@
 from .exceptions import (
     InvocationCancelledError,
     InvocationError,
-    NoBlobManagerError,
     NotConnectedToServerError,
 )
-from .outputs.blob import BlobIOContextDep, blobdata_to_url_ctx
 from . import invocation_contexts
 from .utilities.introspection import (
     EmptyInput,
@@ -149,23 +149,7 @@ def id(self) -> uuid.UUID:

     @property
     def output(self) -> Any:
-        """Return value of the Action. If the Action is still running, returns None.
-
-        :raise NoBlobManagerError: If this is called in a context where the blob
-            manager context variables are not available. This stops errors being raised
-            later once the blob is returned and tries to serialise. If the errors
-            happen during serialisation the stack-trace will not clearly identify
-            the route with the missing dependency.
-        """
-        try:
-            blobdata_to_url_ctx.get()
-        except LookupError as e:
-            raise NoBlobManagerError(
-                "An invocation output has been requested from a api route that "
-                "doesn't have a BlobIOContextDep dependency. This dependency is needed "
-                " for blobs to identify their url."
-            ) from e
-
+        """Return value of the Action. If the Action is still running, returns None."""
         with self._status_lock:
             return self._return_value

@@ -225,33 +209,28 @@ def cancel(self) -> None:
         """
         self.cancel_hook.set()

-    def response(self, request: Optional[Request] = None) -> InvocationModel:
+    def response(self) -> InvocationModel:
         """Generate a representation of the invocation suitable for HTTP.

         When an invocation is polled, we return a JSON object that includes
         its status, any log entries, a return value (if completed), and a link
         to poll for updates.

-        :param request: is used to generate the ``href`` in the response, which
-            should retrieve an updated version of this response.
-
         :return: an `.InvocationModel` representing this `.Invocation`.
         """
-        if request:
-            href = str(request.url_for("action_invocation", id=self.id))
-        else:
-            href = f"{ACTION_INVOCATIONS_PATH}/{self.id}"
         links = [
-            LinkElement(rel="self", href=href),
-            LinkElement(rel="output", href=href + "/output"),
+            LinkElement(rel="self", href=URLFor("action_invocation", id=self.id)),
+            LinkElement(
+                rel="output", href=URLFor("action_invocation_output", id=self.id)
+            ),
         ]
         # The line below confuses MyPy because self.action **evaluates to** a Descriptor
         # object (i.e. we don't call __get__ on the descriptor).
         return self.action.invocation_model( # type: ignore[attr-defined]
             status=self.status,
             id=self.id,
             action=self.thing.path + self.action.name, # type: ignore[attr-defined]
-            href=href,
+            href=URLFor("action_invocation", id=self.id),
             timeStarted=self._start_time,
             timeCompleted=self._end_time,
             timeRequested=self._request_time,
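
In the hunk above, `response()` no longer needs a `Request`: it stores `URLFor("action_invocation", id=self.id)` instead of a finished string, so the URL can be worked out later. The sketch below shows the general idea of such a deferred URL. It does not reproduce the `URLFor` class from `labthings_fastapi.middleware.url_for`: `LazyURL`, `make_url_for`, and the route paths in `ROUTES` are hypothetical and only illustrate resolving a route name and parameters at serialisation time.

```python
from dataclasses import dataclass, field
from typing import Callable, Mapping


@dataclass(frozen=True)
class LazyURL:
    """Hypothetical deferred URL: a route name plus its path parameters."""

    route: str
    params: Mapping[str, str] = field(default_factory=dict)

    def resolve(self, url_for: Callable[..., str]) -> str:
        """Produce the final URL once a resolver for this request exists."""
        return url_for(self.route, **self.params)


def make_url_for(base: str, routes: Mapping[str, str]) -> Callable[..., str]:
    """Build a simple resolver from a base URL and a route table."""

    def url_for(route: str, **params: str) -> str:
        return base.rstrip("/") + routes[route].format(**params)

    return url_for


# Assumed route table; the real paths come from the FastAPI app.
ROUTES = {
    "action_invocation": "/action_invocations/{id}",
    "action_invocation_output": "/action_invocations/{id}/output",
}

# Links are created without any knowledge of the request...
links = {
    "self": LazyURL("action_invocation", {"id": "1234"}),
    "output": LazyURL("action_invocation_output", {"id": "1234"}),
}

# ...and resolved only when a request-specific resolver is available.
url_for = make_url_for("http://example.com", ROUTES)
resolved = {rel: url.resolve(url_for) for rel, url in links.items()}
```

One consequence of this design is that the `output` link is looked up by its own route name rather than built by appending `"/output"` to the `self` link, as the removed lines did.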
@@ -442,7 +421,7 @@ def list_invocations(
         :return: A list of invocations, optionally filtered by Thing and/or Action.
         """
         return [
-            i.response(request=request)
+            i.response()
             for i in self.invocations
             if thing is None or i.thing == thing
             if action is None or i.action == action
@@ -467,25 +446,19 @@ def attach_to_app(self, app: FastAPI) -> None:
         """

         @app.get(ACTION_INVOCATIONS_PATH, response_model=list[InvocationModel])
-        def list_all_invocations(
-            request: Request, _blob_manager: BlobIOContextDep
-        ) -> list[InvocationModel]:
+        def list_all_invocations(request: Request) -> list[InvocationModel]:
             return self.list_invocations(request=request)

         @app.get(
             ACTION_INVOCATIONS_PATH + "/{id}",
             responses={404: {"description": "Invocation ID not found"}},
         )
-        def action_invocation(
-            id: uuid.UUID, request: Request, _blob_manager: BlobIOContextDep
-        ) -> InvocationModel:
+        def action_invocation(id: uuid.UUID, request: Request) -> InvocationModel:
             """Return a description of a specific action.

             :param id: The action's ID (from the path).
             :param request: FastAPI dependency for the request object, used to
                 find URLs via ``url_for``.
-            :param _blob_manager: FastAPI dependency that enables `.Blob` objects
-                to be serialised.

             :return: Details of the invocation.

@@ -494,7 +467,7 @@ def action_invocation(
             """
             try:
                 with self._invocations_lock:
-                    return self._invocations[id].response(request=request)
+                    return self._invocations[id].response()
             except KeyError as e:
                 raise HTTPException(
                     status_code=404,
@@ -515,17 +488,13 @@ def action_invocation(
                 503: {"description": "No result is available for this invocation"},
             },
         )
-        def action_invocation_output(
-            id: uuid.UUID, _blob_manager: BlobIOContextDep
-        ) -> Any:
+        def action_invocation_output(id: uuid.UUID) -> Any:
             """Get the output of an action invocation.

             This returns just the "output" component of the action invocation. If the
             output is a file, it will return the file.

             :param id: The action's ID (from the path).
-            :param _blob_manager: FastAPI dependency that enables `.Blob` objects
-                to be serialised.

             :return: The output of the invocation, as a `pydantic.BaseModel`
                 instance. If this is a `.Blob`, it may be returned directly.
@@ -800,8 +769,6 @@ def add_to_fastapi(self, app: FastAPI, thing: Thing) -> None:
         # The solution below is to manually add the annotation, before passing
         # the function to the decorator.
         def start_action(
-            _blob_manager: BlobIOContextDep,
-            request: Request,
             body: Any, # This annotation will be overwritten below.
             id: NonWarningInvocationID,
             background_tasks: BackgroundTasks,
@@ -816,7 +783,7 @@ def start_action(
                 id=id,
             )
             background_tasks.add_task(action_manager.expire_invocations)
-            return action.response(request=request)
+            return action.response()

         if issubclass(self.input_model, EmptyInput):
             annotation = Body(default_factory=StrictEmptyInput)
@@ -878,7 +845,7 @@ def start_action(
             ),
             summary=f"All invocations of {self.name}.",
         )
-        def list_invocations(_blob_manager: BlobIOContextDep) -> list[InvocationModel]:
+        def list_invocations() -> list[InvocationModel]:
             action_manager = thing._thing_server_interface._action_manager
             return action_manager.list_invocations(self, thing)

src/labthings_fastapi/client/__init__.py

Lines changed: 13 additions & 13 deletions
@@ -13,9 +13,9 @@
 import httpx
 from urllib.parse import urlparse, urljoin

-from pydantic import BaseModel
+from pydantic import BaseModel, TypeAdapter

-from .outputs import ClientBlobOutput
+from ..outputs.blob import Blob, RemoteBlobData
 from ..exceptions import (
     FailedToInvokeActionError,
     ServerActionError,
@@ -206,16 +206,14 @@ def invoke_action(self, path: str, **kwargs: Any) -> Any:
         """
         for k in kwargs.keys():
             value = kwargs[k]
-            if isinstance(value, ClientBlobOutput):
-                # ClientBlobOutput objects may be used as input to a subsequent
-                # action. When this is done, they should be serialised to a dict
-                # with `href` and `media_type` keys, as done below.
-                # Ideally this should be replaced with `Blob` and the use of
-                # `pydantic` models to serialise action inputs.
+            if isinstance(value, Blob):
+                # Blob objects may be used as input to a subsequent
+                # action. When this is done, they should be serialised by
+                # pydantic, to a dictionary that includes href and media_type.
                 #
                 # Note that the blob will not be uploaded: we rely on the blob
                 # still existing on the server.
-                kwargs[k] = {"href": value.href, "media_type": value.media_type}
+                kwargs[k] = TypeAdapter(Blob).dump_python(value)
         response = self.client.post(urljoin(self.path, path), json=kwargs)
         if response.is_error:
             message = _construct_failed_to_invoke_message(path, response)
@@ -228,10 +226,12 @@ def invoke_action(self, path: str, **kwargs: Any) -> Any:
                 and "href" in invocation["output"]
                 and "media_type" in invocation["output"]
             ):
-                return ClientBlobOutput(
-                    media_type=invocation["output"]["media_type"],
-                    href=invocation["output"]["href"],
-                    client=self.client,
+                return Blob(
+                    RemoteBlobData(
+                        media_type=invocation["output"]["media_type"],
+                        href=invocation["output"]["href"],
+                        client=self.client,
+                    )
                 )
             return invocation["output"]
         message = _construct_invocation_error_message(invocation)
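
The client hunk above recognises a blob only when the whole action output is a dictionary with `href` and `media_type` keys, whereas the documentation changed in this commit notes that blobs may be nested in lists or model fields. As an illustration of how nested references could be located in decoded JSON, here is a small standalone sketch. `is_blob_ref` and `iter_blob_refs` are hypothetical helpers, not part of the LabThings-FastAPI client, and they assume the same two keys that the client checks for.

```python
from typing import Any, Iterator


def is_blob_ref(value: Any) -> bool:
    """True if value looks like a serialised blob (same test as the client)."""
    return isinstance(value, dict) and "href" in value and "media_type" in value


def iter_blob_refs(output: Any) -> Iterator[dict]:
    """Yield every blob-like dictionary found anywhere in decoded JSON."""
    if is_blob_ref(output):
        yield output
    elif isinstance(output, dict):
        for item in output.values():
            yield from iter_blob_refs(item)
    elif isinstance(output, list):
        for item in output:
            yield from iter_blob_refs(item)


# Example output with one blob in a field and one inside a list.
example = {
    "image": {"href": "http://h/blob/1", "media_type": "image/jpeg"},
    "frames": [{"href": "http://h/blob/2", "media_type": "image/png"}, 3],
    "count": 2,
}
hrefs = [ref["href"] for ref in iter_blob_refs(example)]
```

Each reference found this way would still need one download per blob, as the documentation describes.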

src/labthings_fastapi/client/outputs.py

Lines changed: 0 additions & 77 deletions
This file was deleted.

src/labthings_fastapi/invocations.py

Lines changed: 3 additions & 1 deletion
@@ -13,6 +13,8 @@

 from pydantic import BaseModel, ConfigDict, model_validator

+from labthings_fastapi.middleware.url_for import URLFor
+
 from .thing_description._model import Links


@@ -91,7 +93,7 @@ class GenericInvocationModel(BaseModel, Generic[InputT, OutputT]):
     status: InvocationStatus
     id: uuid.UUID
     action: str
-    href: str
+    href: URLFor
     timeStarted: Optional[datetime]
     timeRequested: Optional[datetime]
     timeCompleted: Optional[datetime]
