
Fix: Improve HTTP API structure and async handler usage (#569)#2063

Open
thchann wants to merge 1 commit into roboflow:main from thchann:fix-569-async-handlers

Conversation

@thchann
Contributor

@thchann thchann commented Mar 2, 2026

What does this PR do?

  • Refactors HttpInterface into modular FastAPI routers under inference/core/interfaces/http/routes/ (inference, models, workflows, stream, core_models, legacy, info, health).
  • Fixes incorrect async usage: blocking handlers are now sync + with_route_exceptions, while only truly async code (e.g. stream manager, WebRTC worker) remains async + with_route_exceptions_async.
  • Keeps the external HTTP API surface (paths, methods, response models, flags) unchanged; this is a structural/maintenance refactor plus async/sync correctness for the original HTTP API redesign ticket Improve API structure + put non-async handlers properly #569.
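
The sync/async split described above can be sketched as follows. The decorator names `with_route_exceptions` and `with_route_exceptions_async` come from this PR, but their bodies and the handlers below are illustrative stand-ins, not the project's actual implementation:

```python
import asyncio
import functools

# Stand-in for with_route_exceptions: wraps a plain (sync) handler.
# FastAPI runs `def` endpoints in a threadpool, so blocking work here
# does not stall the event loop.
def with_route_exceptions(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ValueError as error:
            return {"status": "error", "message": str(error)}
    return wrapper

# Stand-in for with_route_exceptions_async: wraps a coroutine handler.
# Only genuinely awaitable work (stream manager, WebRTC worker) should
# keep this form.
def with_route_exceptions_async(func):
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        try:
            return await func(*args, **kwargs)
        except ValueError as error:
            return {"status": "error", "message": str(error)}
    return wrapper

@with_route_exceptions
def blocking_handler():
    # e.g. a model forward pass: CPU/GPU-bound, no awaits inside.
    return {"status": "ok"}

@with_route_exceptions_async
async def streaming_handler():
    # e.g. an RPC to the stream manager: actually awaitable.
    await asyncio.sleep(0)
    return {"status": "ok"}

print(blocking_handler())
print(asyncio.run(streaming_handler()))
```

The key point is that an `async def` endpoint whose body only does blocking work runs on the event loop and stalls every other request, while the same body as a plain `def` is dispatched to a threadpool.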

Type of Change

  • Refactoring (no functional changes)

Testing

  • I have tested this change locally

  • I have added/updated tests for this change

  • Ran: pytest -m "not slow" tests/

Test details:

  • Verified that:
    • tests/inference/unit_tests/core/interfaces/http/test_remote_processing_time_middleware.py imports the refactored http_api and passes.
    • No HTTP route tests fail due to the router extraction or async/sync changes (any remaining issues are unrelated env/dependency problems).

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code where necessary, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have updated the documentation accordingly (if applicable)

Additional Context

@PawelPeczek-Roboflow
Collaborator

Thanks for the contribution - since we are really busy this week, we are postponing the review

@PawelPeczek-Roboflow
Collaborator

Review is ongoing - we've identified a couple of issues; it seems like @dkosowski87 has some feedback here

Contributor

@dkosowski87 dkosowski87 left a comment


Nice clean up, thanks 🚀 I left some comments; after fixing those we will be able to merge this.


from fastapi import APIRouter, HTTPException, Query

rom inference.core.version import __version__
Contributor


Suggested change
rom inference.core.version import __version__
from inference.core.version import __version__

from inference.core.interfaces.http.error_handlers import with_route_exceptions
from inference.core.interfaces.http.orjson_utils import orjson_response
from inference.core.managers.base import ModelManager
from inference.core.utils.model_alias import resolve_roboflow_model_alias
Contributor


This was moved to inference/models/aliases.py


@router.post(
"/infer/semantic_segmentation",
response_model=Union[InstanceSegmentationInferenceResponse, StubResponse],
Contributor


Suggested change
response_model=Union[InstanceSegmentationInferenceResponse, StubResponse],
response_model=Union[SemanticSegmentationInferenceResponse, StubResponse],

if DEPTH_ESTIMATION_ENABLED:

@router.post(
"/infer/depth-estimation",
Contributor


Probably we could set up a different route, as this one will be shadowed by the one in the inference module.
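
FastAPI resolves routes first-match-wins across included routers, so when two routers register the same path, the one included first handles every request. A self-contained sketch of that behavior (the dispatch table below is a stand-in for illustration, not FastAPI internals):

```python
# Minimal sketch of first-match-wins route resolution.
routes = []

def register(path, handler):
    routes.append((path, handler))

def dispatch(path):
    # Scan registrations in order; the first matching path wins.
    for registered_path, handler in routes:
        if registered_path == path:
            return handler()
    raise LookupError(path)

# The inference router is included first, so its depth-estimation
# route shadows the duplicate registered later.
register("/infer/depth-estimation", lambda: "inference router")
register("/infer/depth-estimation", lambda: "core_models router")

print(dispatch("/infer/depth-estimation"))
```

This is why the duplicated `/infer/depth-estimation` registration is dead code: renaming one of the paths (or removing the duplicate) avoids the silent shadowing.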

router = APIRouter()

def process_workflow_inference_request(
workflow_request,
Contributor


Missing type hint
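
A hedged sketch of adding the hint; `WorkflowInferenceRequest` is a placeholder class defined locally here, not necessarily the project's real request entity:

```python
from dataclasses import dataclass
from typing import get_type_hints

# Placeholder request model: the real project would use its own request
# entity here; this class name is an assumption for illustration only.
@dataclass
class WorkflowInferenceRequest:
    workflow_id: str

def process_workflow_inference_request(
    workflow_request: WorkflowInferenceRequest,  # hint added per review
) -> dict:
    return {"workflow_id": workflow_request.workflow_id}

# The annotation is now introspectable by readers and tooling alike.
print(get_type_hints(process_workflow_inference_request)["workflow_request"].__name__)
```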

countinference=countinference,
service_secret=service_secret,
)
app.include_router(create_inference_router(model_manager=self.model_manager))

if LMM_ENABLED or MOONDREAM2_ENABLED:
Contributor


This part should also go to the inference module. It is nested under if not LAMBDA and not GCP_SERVERLESS:

max_concurrent_steps=WORKFLOWS_MAX_CONCURRENT_STEPS,
prevent_local_images_loading=True,
)
return WorkflowValidationStatus(status="ok")

if WEBRTC_WORKER_ENABLED:
Contributor


We could probably extract this one also.

"""Health endpoint for Kubernetes liveness probe."""
return {"status": "healthy"}

return router No newline at end of file
Contributor


Always add a new line at the end of the file

),
countinference: Optional[bool] = None,
service_secret: Optional[str] = None,
):
Contributor


Maybe this is a result of working on a previous version of the repo, but this is missing:

                    if not SAM3_FINE_TUNED_MODELS_ENABLED:
                        if not inference_request.model_id.startswith("sam3/"):
                            raise HTTPException(
                                status_code=501,
                                detail="Fine-tuned SAM3 models are not supported on this deployment. Please use a workflow or self-host the server.",
                            )

Let's update the branch and see if it will be brought back.

):
"""
Runs the YOLO-World zero-shot object detection model.
@app.get(
Contributor


The notebook routes could also go to a separate module


3 participants