Add real-time barcode scanning API and iOS backend implementation #4458
phildini wants to merge 11 commits
Add Camera.start_scanning(), stop_scanning(), is_scanning() and on_detection callback to toga-core, with BarcodeFormat enum supporting 7 barcode types. Full implementation in the Dummy backend for testing. iOS backend uses AVCaptureSession with AVCaptureMetadataOutput for native barcode detection. Cocoa and Android backends raise NotImplementedError stubs. Includes 14 new core tests (100% coverage), updated documentation, and towncrier change fragment.
There's some indication that the API wanted for this on Android is https://developers.google.com/ml-kit/vision/barcode-scanning/code-scanner
Add test_barcode_format_str and test_barcode_format_values to exercise all BarcodeFormat members and their __str__ methods.
Replace static code type tests with parametrized test exercising each BarcodeFormat individually through the full scan API, plus a combined test for all types. The str() and member tests are now integrated into a single enumeration check.
I really think the test errors here are transient and not related to this PR, but please correct me if I'm wrong.
Argh... I think the iOS failure is due to beeware/briefcase-iOS-Xcode-template#75 being merged (part of the whole macOS-26 logging issue).
Ok - I've resolved the issue that was causing problems; the test suite is now passing, but has some coverage gaps.
freakboy3742
left a comment
This is exciting stuff! Thanks for the PR.
A few comments inline; most of which are fairly minor. The biggest source of concern is the removal of explanatory comments; it's not clear why these have been stripped. There's also the test failure due to some coverage misses.
I haven't run this actual code yet - for the purposes of testing, it would be helpful if the example/hardware app had an example of barcode scanning - maybe a new tab that scans barcodes on demand and presents a multiline text widget that is populated with found barcodes, cleared each time a session is started?
I have two other architectural questions that don't need to be implemented as part of this PR, but I'd like to have at least an indicative answer to make sure we're not boxing ourselves into a corner:
- Is this API compatible with Android's scanner? I can't think of any reason it wouldn't be, but I'd prefer to have a light confirmation before we lock this API in.
- This API displays a view controller to display a camera view, and that definitely makes sense as a simple API for getting a barcode. However, I can also imagine that a user might want to have a custom view where they are in control of the visualisation of the camera.
  I'm thinking in particular of the "multi-scan" case - I might want a GUI where the top half of the screen is a camera preview, and the bottom half is the list of most recently scanned codes. In that context, I can imagine we might need something like a "CameraPreview" widget, with the ability to connect that widget to scanning processes.
  I imagine the setup for that would be to pass a CameraPreview widget to start_scanning(); the UI constructed by the hardware service is then "the UI that is displayed if you don't give me an explicit camera preview". That would presumably also have an analog with take_photo() so that users could have their own custom inline camera preview for taking photos. If you've got any other suggestions on how this might be structured, or why this might not be possible, I'd be interested in hearing your thoughts.
| def start_scanning( | ||
| self, | ||
| device: CameraDevice | None = None, | ||
| code_types: list[BarcodeFormat] | None = None, |
As a convenience, should we also accept BarcodeFormat here? I would have expected the most common use case is "scan for QR Code", rather than "Scan for QR code or aztec code or Code128 barcode or ...".
Would you prefer that over code_types?
I wasn't suggesting changing the argument name - I was only suggesting adding the typing and argument handling shim so that start_scanning(QR) is a legal usage, equivalent to start_scanning([QR]).
start_scanning(code_types=QR) is a little weird because of inconsistent pluralization... but if code_types is the first argument, then you won't need to ever write code_types, and maybe that isn't as obvious as a weirdness?
| capture_metadata_output = _scan_symbols()["capture_metadata_output"] | ||
| for output in session.outputs(): | ||
| if output.isKindOfClass_(capture_metadata_output): | ||
| output.setMetadataObjectsDelegate_queue_(self._scan_delegate, None) |
Rubicon provides a cleaner syntax for multi-argument ObjC methods:
| output.setMetadataObjectsDelegate_queue_(self._scan_delegate, None) | |
| output.setMetadataObjectsDelegate(self._scan_delegate, queue=None) |
Co-authored-by: Russell Keith-Magee <russell@keith-magee.com>
As a convenience for the common use case of scanning for a single barcode type (e.g. QR), start_scanning() now accepts a bare BarcodeFormat value in addition to a list. A single value is normalized to a one-element list before being passed to the backend.
Replace NotImplementedError stubs on macOS with real AVCaptureMetadataOutput implementation via TogaCameraScannerDelegate and TogaCameraScannerWindow. Add AVCaptureMetadataOutput, AVMetadataMachineReadableCodeObject, and 7 AVMetadataObjectType* constants to cocoa AVFoundation bindings. Restore # for classes that need to be monkeypatched for testing comment and commented-out native_video_quality() function in iOS backend. The code_types parameter now accepts a single BarcodeFormat value as a convenience shorthand. Update docs and changes fragment to reflect macOS support and single-value acceptance.
Follow the standard codebase pattern by declaring ObjCClass and objc_const symbols directly in libs/av_foundation.py instead of using a lazy dict lookup via _scan_symbols(). The BARCODE_FORMAT_MAP is now a static module-level dict using the directly imported AVMetadataObjectType constants.
Restore NSCameraUsageDescription, request_permission thread, take_photo configuration, delegate, and presentation comments that were lost during the rewrite of the iOS camera backend.
My knowledge of this is way out of date, and there may be a better option now, but last time I worked on an Android QR code scanner, the best option I found was the external This would require the app to specifically add a dependency in order to use the feature, as some of Toga's other features already do.
freakboy3742
left a comment
Nice work getting the Cocoa implementation working. A few review comments inline.
There's a lot of coverage gaps being reported by CI; there's also a lot of code marked as ignored. Some of that will be unavoidable because of the nature of simulating cameras - but we need to structure the code to minimize what we can't cover to just the content that is actually tied to camera hardware.
There are also some outstanding items from my previous review, including some inline comments about mapping ObjC APIs, and a requested extension to the Hardware example app for demonstration purposes.
|
|
||
| ## Scanning for Barcodes | ||
|
|
||
| The camera can be used to scan QR codes and other barcode types in real-time. Scanning is supported on iOS, macOS, and in the Dummy (test) backend. |
We don't call out individual platform quirks (or availability) in the docs - they should describe the feature presuming it exists:
| The camera can be used to scan QR codes and other barcode types in real-time. Scanning is supported on iOS, macOS, and in the Dummy (test) backend. | |
| The camera can be used to scan QR codes and other barcode types in real-time. |
| - Android: The `android.permission.CAMERA` permission must be declared. | ||
| - The iOS simulator implements the iOS Camera APIs, but is not able to take photographs. To test your app's Camera usage, you must use a physical iOS device. | ||
| - The iOS simulator implements the iOS Camera APIs, but is not able to take photographs or scan barcodes. To test your app's Camera usage, you must use a physical iOS device. | ||
| - Barcode scanning is currently available on iOS, macOS, and in the Dummy (test) backend. Other backends will raise `NotImplementedError`. |
We don't reference the Dummy backend - it's purely a testing artefact.
| - Barcode scanning is currently available on iOS, macOS, and in the Dummy (test) backend. Other backends will raise `NotImplementedError`. | |
| - Availability of barcode scanning is currently limited to iOS and macOS. |
| # granted from a different (inaccessible) thread, so it isn't picked up by | ||
| # coverage. | ||
| def permission_complete(result) -> None: | ||
| def permission_complete(result) -> None: # pragma: no cover |
Why has this gained a no-cover?
| """A handler to invoke when a barcode is detected during scanning. | ||
|
|
||
| The callback receives the camera as the first argument, and the detected content | ||
| as a keyword argument: ``on_detection(camera, content=content)``. |
This is in ReST format, not Markdown
| as a keyword argument: ``on_detection(camera, content=content)``. | |
| as a keyword argument: `on_detection(camera, content=content)`. |
| If scanning was started with ``continuous=True``, the callback will be invoked | ||
| each time a barcode is detected. If ``continuous=False`` (the default), the |
More ReST:
| If scanning was started with ``continuous=True``, the callback will be invoked | |
| each time a barcode is detected. If ``continuous=False`` (the default), the | |
| If scanning was started with `continuous=True`, the callback will be invoked | |
| each time a barcode is detected. If `continuous=False` (the default), the |
This looks to be a recurring theme for the rest of this file.
|
|
||
| result = ScanResult(None) | ||
| self._impl.start_scanning( | ||
| result, device=device, code_types=code_types, continuous=continuous |
As a code style thing, once we get to a point where Ruff is breaking code onto a new line to get space for the arguments, we tend to go directly to "1 argument per line", rather than the intermediate "all args on one standalone line" format:
| result, device=device, code_types=code_types, continuous=continuous | |
| result, | |
| device=device, | |
| code_types=code_types, | |
| continuous=continuous, |
One argument per line might be more readable when the arguments are complex expressions, but forcing it for simple situations like this doesn't seem to accomplish anything except reducing the amount of code you can see on screen at once.
| all_types = list(BarcodeFormat) | ||
| assert len(all_types) == 7 | ||
| names = {str(t) for t in all_types} | ||
| assert names == {"Qr", "Code128", "Ean13", "Ean8", "Pdf417", "Aztec", "Data_Matrix"} |
I can see what this is testing, but it's not a very strong test as there's no validation of constant to mapping. This might be better handled as a test parameterised over each of the enumerated values, looking for a specific str representation of each.
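As a rough sketch of that suggestion: pair each enum member with its expected string, then check each pair individually. The enum below is a stand-in. Its member names, values and `__str__` are assumptions chosen to reproduce the strings in the snippet above, and `EXPECTED_STR` and `check_format_str` are hypothetical names. The table is run with a plain loop so the sketch has no dependencies; under pytest it would feed `pytest.mark.parametrize`.

```python
from enum import Enum


class BarcodeFormat(Enum):
    # Stand-in enum; member names and values are assumptions for this sketch.
    QR = "qr"
    CODE128 = "code128"
    EAN13 = "ean13"
    EAN8 = "ean8"
    PDF417 = "pdf417"
    AZTEC = "aztec"
    DATA_MATRIX = "data_matrix"

    def __str__(self):
        # Assumed formatting rule that yields the strings in the test above.
        return self.name.title()


# One explicit (member, expected string) pair per format, so a wrong
# mapping for any single member is caught, not just the overall set.
EXPECTED_STR = [
    (BarcodeFormat.QR, "Qr"),
    (BarcodeFormat.CODE128, "Code128"),
    (BarcodeFormat.EAN13, "Ean13"),
    (BarcodeFormat.EAN8, "Ean8"),
    (BarcodeFormat.PDF417, "Pdf417"),
    (BarcodeFormat.AZTEC, "Aztec"),
    (BarcodeFormat.DATA_MATRIX, "Data_Matrix"),
]


def check_format_str(fmt, expected):
    """Body of the per-member test case."""
    assert str(fmt) == expected


def run_all_checks():
    # Guard against a new member being added without a table entry.
    assert {fmt for fmt, _ in EXPECTED_STR} == set(BarcodeFormat)
    for fmt, expected in EXPECTED_STR:
        check_format_str(fmt, expected)
```

The set comparison keeps the "every member is covered" property of the original test, while the per-pair check validates each constant against its own representation.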
| self.camera._handle_scan(content) | ||
|
|
||
|
|
||
| class TogaCameraScannerWindow(toga.Window): # pragma: no cover |
This is a big block to have marked no cover. If TogaCameraWindow is an indicator, only creating the session should be marked no-cover; everything else should be reachable in tests.
To that end ... it seems like there's a lot of overlapping content in the TogaCamera window. Is there any way to merge those two classes so there's a single UI for camera activity?
| self.camera.preview_windows.remove(self) | ||
|
|
||
|
|
||
| class TogaCameraScannerDelegate(NSObject): # pragma: no cover |
Does this need to be a standalone class? Could we use the same AVSession subclass as the photo object uses, set it to be its own delegate, migrate the body of the delegate method to a utility method on the Window, and use that to provide a point where we can just mock the Session object and everything else is tested explicitly?
| device: CameraDevice | None = None, | ||
| code_types: BarcodeFormat | list[BarcodeFormat] | None = None, | ||
| on_detection: Callable | None = None, | ||
| continuous: bool = False, |
We haven't been consistent about this elsewhere in Toga, but when a method has optional arguments with no obvious order, making them keyword-only tends to make the calling code more readable, especially when they're simple types like numbers or booleans. In this case, I'd say that applies to on_detection and continuous.
| If ``continuous`` is ``False`` (the default), scanning stops automatically after | ||
| the first detection, and the returned ``ScanResult`` resolves with the detected | ||
| content string. If ``continuous`` is ``True``, scanning continues until | ||
| :meth:`stop_scanning` is called, and the ``ScanResult`` resolves with ``None``. |
Part of this duplicates the continuous docstring, and the rest should be moved to the :returns: docstring.
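For readers following the thread, the semantics that docstring describes can be modelled with a toy class like the one below. This is not Toga's implementation: `FakeScanner` is a hypothetical name, an `asyncio` future stands in for `ScanResult`, and two behaviours are assumptions because the docstring doesn't specify them. Those are that the callback also fires in single-shot mode, and that stopping before any detection resolves the result with `None`.

```python
import asyncio


class FakeScanner:
    """Toy model of single-shot vs continuous scanning; not Toga code."""

    def __init__(self, on_detection=None, continuous=False):
        self.on_detection = on_detection
        self.continuous = continuous
        # An asyncio future stands in for the PR's ScanResult.
        self.result = asyncio.get_running_loop().create_future()
        self.scanning = True

    def handle_scan(self, content):
        if not self.scanning:
            return  # detections after scanning has stopped are ignored
        if self.on_detection:
            # Assumption: the callback fires in both modes.
            self.on_detection(self, content=content)
        if not self.continuous:
            # Single-shot: stop and resolve with the detected content.
            self.scanning = False
            self.result.set_result(content)

    def stop_scanning(self):
        if self.scanning:
            # Continuous mode resolves with None when scanning is stopped.
            self.scanning = False
            self.result.set_result(None)


async def demo():
    single = FakeScanner()
    single.handle_scan("token-1")
    single.handle_scan("token-2")  # ignored: scanning already stopped

    seen = []
    multi = FakeScanner(
        on_detection=lambda scanner, content: seen.append(content),
        continuous=True,
    )
    multi.handle_scan("a")
    multi.handle_scan("b")
    multi.stop_scanning()
    return await single.result, seen, await multi.result
```

Running `asyncio.run(demo())` exercises both modes: the single-shot result carries the first detected string, while the continuous result carries nothing and the detections arrive through the callback.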
Add the API surface area and the iOS backend for real-time barcode processing off a camera stream. The iOS libraries support this natively, and Toga already has a camera widget; this exposes more of its features in Toga. I believe Android and macOS also expose this, but I need to do more research.
So, there's a lot of functionality that gets unlocked by exposing these APIs, but my immediate need is being able to do token auth / token exchange for a BeeWare app by having the hosting site generate a QR code that contains a token the app can use as part of an auth flow.
Refs #3675.
Assisted-by: OpenCode with DeepSeek 4 Flash