Skip to content

spp: fix resource leaks and add SDP discover timeout#565

Open
chejinxian wants to merge 878 commits into
open-vela:devfrom
chejinxian:bugfix/spp-fixes-gh
Open

spp: fix resource leaks and add SDP discover timeout#565
chejinxian wants to merge 878 commits into
open-vela:devfrom
chejinxian:bugfix/spp-fixes-gh

Conversation

@chejinxian
Copy link
Copy Markdown
Contributor

@chejinxian chejinxian commented Apr 1, 2026

Fix SPP UUID-based connection resource management issues:

  1. Fix conn_id leak when malloc fails in alloc_new_device (bug: v/88807)
  2. Change UNKNOWN_SERVER_CHANNEL_NUM to 0 per RFCOMM spec (bug: v/88808)
  3. Fix resource leak when RFCOMM connect fails after SDP discover (bug: v/88809)
  4. Add SDP discover timeout and fix resource cleanup (bug: v/88810)

When SPP connects via 128-bit UUID, the SDP discovery phase had multiple
resource management defects that could cause connection resource leaks,
subsequent connections failing with BT_STATUS_BUSY, or connections hanging
indefinitely when the remote device does not respond to SDP.

depends-on: [open-vela/external_zblue/pull/211]

expliyh and others added 30 commits December 23, 2025 23:48
bug: v/81606

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81606

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81606

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80279

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80279

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81606

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80270

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80268

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80268

Rootcause: unref conn before use
Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81589

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80268

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80258

Rootcause: ARRAY_SIZE redefined
Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80268

Rootcause: profile direct connect api has changed
Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80268

Rootcause: should not call unref at this time
Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80269

Rootcause: add logs to print call number when sync
Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81504

return -EINPROGRESS dial callback and send OK after dial response

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81567

remove unused functions

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81589

remove global var g_sal_ag_sync_conn and sync  call by addr

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81504
Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/80268

Rootcause: Not unregister callbacks when cleanup.
Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81522

When event_id == BT_AVRCP_EVT_VOLUME_CHANGED, flag = true is set only if both CONFIG_BLUETOOTH_AVRCP_ABSOLUTE_VOLUME and CONFIG_BLUETOOTH_AVRCP_CONTROL are enabled.
Otherwise in the else branch, flag = true is set only if CONFIG_BLUETOOTH_AVRCP_TARGET is enabled. All configurations are enabled, treating flag as always true, making the if (!flag) condition unreachable.

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81702

Fixed memory leak caused by premature return.

Signed-off-by: liuxiang18 <liuxiang18@xiaomi.com>
bug: v/80258

Rootcause: attributes in bt_sdp_discover_params may be modified by ZBlue SDP. Using const could cause a crash in some cases
Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81752

Rootcause: audio_connect should not be called in bluetoothd task
Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
bug: v/81968

Rootcause: disconnected_callback not called caused connect info not cleared in connection manager module

Signed-off-by: YuhengLi <liyuheng@xiaomi.com>
… an asynchronous API.

bug: v/81682

When priv dynamically allocates memory successfully but fails later due to other reasons before reaching the assignment ins->priv = priv;, the memory allocated to priv cannot be freed in bt_socket_async_client_deinit, leading to a resource leak.

Signed-off-by: jialu <jialu@xiaomi.com>
…se functions

bug: v/80811

Rootcause: In certain scenarios, users of `euv_pipe` must ensure all UV requests have completed execution before releasing resources. Consequently, it is necessary to notify users that `euv_pipe` has been fully released after its close operation is completed, thereby permitting subsequent operational procedures to proceed. Support for the close callback has therefore been added.

Signed-off-by: chejinxian1 <chejinxian1@xiaomi.com>
…e callback

bug: v/80808

Rootcause: In high-throughput reception scenarios, situations may arise where the `write_cb` for SPP data transmission to the application has not yet completed, yet the SPP device is released due to an abrupt disconnection, thereby preventing notification to the protocol stack that data reception has concluded.

To circumvent this issue, it is imperative to ensure all write operations are finalised before releasing the SPP device. Consequently, an `euv_pipe` close callback implementation has been introduced to guarantee that all `write_cb` operations execute successfully prior to severing the data pathway.

Signed-off-by: chejinxian1 <chejinxian1@xiaomi.com>
bug: v/74709

only open CONFIG_BLUETOOTH_AVRCP_CONTROL or CONFIG_BLUETOOTH_AVRCP_ABSOLUTE_VOLUME can build in bt_avrcp_control_notification_cb.

error: 'bt_avrcp_info_find_by_ct' undeclared (first use in this function); did you mean 'bt_avrcp_info_find_by_tg'?
 1501 |     avrcp_info = bt_list_find(bt_avrcp_conn, bt_avrcp_info_find_by_ct, ct);
      |                                              ^~~~~~~~~~~~~~~~~~~~~~~~
      |                                              bt_avrcp_info_find_by_tg

Signed-off-by: zhongzhijie1 <zhongzhijie1@xiaomi.com>
bug: v/82095

The spp_connect_handler was attempting to look up the SPP connection
by rfcomm_dlc before it was added to the connection list, causing
"SPP connection not found for rfcomm_dlc" error.

Root Cause:
The connection object wasn't in the global connection list at the
time of lookup, making spp_find_connection_by_dlc() always fail.

Fix:
Pass the spp_conn pointer directly as user_data to avoid the lookup,
and add it to the connection list after successful initialization.

Signed-off-by: zhongzhijie1 <zhongzhijie1@xiaomi.com>
expliyh and others added 16 commits March 24, 2026 17:25
bug: v/82380

The previous call operation path could block receiving and eventually lead to a
crash.
Signed-off-by: liyuheng <liyuheng@xiaomi.com>
bug: v/88151

In zblue_on_connected, when the device acts as ACP (acceptor), it only creates the a2dp_info and waits passively. It never initiates bt_a2dp_discover, so if the remote side also does not initiate the discover/set_config flow, the A2DP connection stalls -- signaling channel is up but no stream is ever configured.

Additionally, in bt_a2dp_discover_endpoint_cb, the a2dp_info->role remains SEP_INVALID for ACP connections because the role was only assigned during the connect initiation path. The SNK set_config branch also had an unnecessary int_acp == A2DP_INT guard, which blocked ACP-initiated discover from completing the configuration.

Fix:

Added a 2-second service_loop_timer in the ACP path of zblue_on_connected. If the remote does not send set_config within 2s, the local side proactively initiates bt_a2dp_discover.
The timer is cancelled in zblue_on_config_req (remote drove the flow first) and in a2dp_info_destroy (cleanup on disconnect).
In bt_a2dp_discover_endpoint_cb, the local role is now derived from the remote's sep_info->tsep: BT_AVDTP_SOURCE -> local SEP_SNK, BT_AVDTP_SINK -> local SEP_SRC.
Removed the int_acp == A2DP_INT restriction on the SEP_SNK set_config branch so ACP-triggered discover can also complete codec negotiation.

Signed-off-by: jialu <jialu@xiaomi.com>
…lf sent).

bug: v/87852

This commit performs an architecture-level refactoring of cs_ras.c and cs_ras.h. The core change replaces the shared global buffer and static arrays used for Real-time and On-demand mode data storage with independent dynamically-allocated linked list queues, resolving memory safety issues and concurrent processing defects in the original architecture.

Signed-off-by: jialu <jialu@xiaomi.com>
bug: v/88558

Add conditional compilation for Bluetooth LE Channel Sounding (CS)
feature across framework API, socket IPC, Zephyr SAL, CS profiles,
and le_cs tool.

Signed-off-by: jialu <jialu@xiaomi.com>
bug: v/87941

bt_sal_a2dp_source_send_data calls net_buf_add_mem(media_packet_buf, &buf[AVDTP_RTP_HEADER_LEN], nbytes) without validating whether nbytes exceeds the available space in media_packet_buf. The buffer is allocated from bt_a2dp_tx_pool with a data size of CONFIG_ZBLUE_A2DP_SOURCE_BUF_SIZE (default 660 bytes). After bt_a2dp_stream_create_pdu reserves protocol headers (STREAM_DATA_RESERVED, i.e. AVDTP_RTP_HEADER_LEN = 12 bytes), the actual usable payload space is CONFIG_ZBLUE_A2DP_SOURCE_BUF_SIZE - STREAM_DATA_RESERVED (648 bytes). Zephyr's net_buf_add_mem only has an __ASSERT_NO_MSG check which is stripped in release builds. If nbytes exceeds the tailroom, a buffer overflow occurs, corrupting adjacent memory and potentially causing hard faults or data corruption.

Fix: Add a length check before buffer allocation using CONFIG_ZBLUE_A2DP_SOURCE_BUF_SIZE - STREAM_DATA_RESERVED as the maximum payload limit. When nbytes exceeds this limit, log an error and return BT_STATUS_PARM_INVALID, avoiding unnecessary buffer allocation and out-of-bounds writes.

Signed-off-by: jialu <jialu@xiaomi.com>
bug: v/85832

Fix defects causing bttool resource leak during BT enable/disable
stress test:

1. Store bttool_t pointer in g_bttool_loop->data so TURNING_OFF callback
   can access the async queue (previously always 0, cleanup was skipped)
2. Replace do_in_thread_loop with bttool_uninit() in TURNING_OFF callback
   to send _uninit command via uv_async_queue_send, ensuring bt_tool_uninit
   runs on g_bttool_loop thread (mirrors bttool_quit pattern). Guard with
   CONFIG_LIBUV_EXTENSION only.
3. Add re-entry guard in bt_tool_uninit to prevent double cleanup on
   repeated BT disable cycles

Signed-off-by: chejinxian1 <chejinxian1@xiaomi.com>
bug: v/87902

Rootcause: When ACL disconnects during SDP discovery, the SDP client
only invokes the disconnected callback, not the func callback. HFP HF
SAL did not register a disconnected callback, causing sal_conn to be
orphaned in g_sal_hf_conn_list and the upper layer state machine to
get stuck. Add the disconnected callback to clean up sal_conn and
notify the upper layer of the disconnection.

Signed-off-by: liyuheng <liyuheng@xiaomi.com>
bug: v/87902

Rootcause: When ACL disconnects during SDP discovery, the SDP client
only invokes the disconnected callback, not the func callback. HFP AG
SAL did not register a disconnected callback, so the upper layer never
received a PROFILE_STATE_DISCONNECTED notification and could get stuck.
Add the disconnected callback to notify the upper layer of the
disconnection.

Signed-off-by: liyuheng <liyuheng@xiaomi.com>
…nnected

bug: v/87902

Use sal_conn->addr directly instead of an intermediate bt_address_t*
pointer variable. The sal_conn object remains valid until bt_list_remove
at the end of the function, so direct access is safe and cleaner.

Signed-off-by: liyuheng <liyuheng@xiaomi.com>
bug: v/87902

Refactor the HFP AG SAL connection establishment to align with the
HFP HF implementation:

- Remove global g_conn_params that serialized all connections through
  a single slot, preventing parallel outgoing connections.
- Split do_ag_connect into do_ag_sdp_discover (SDP phase) and
  do_ag_slc_connect (SLC phase), matching HF's do_hf_sdp_discover
  and do_hf_slc_connect.
- Create sal_conn early at SDP discover time so the connection is
  tracked from the start, enabling proper cleanup on SDP failure.
- Use service_loop_work to dispatch do_ag_slc_connect from
  zblue_on_sdp_done, instead of direct synchronous call.
- Add find_connection_by_context to look up sal_conn by bt_conn*.
- Fix zblue_on_ag_disconnected to use sal_conn->addr directly and
  call bt_list_remove after callbacks.
- Fix zblue_on_ag_connected to handle incoming connections and
  simultaneous connection collision consistently with HF.

Signed-off-by: liyuheng <liyuheng@xiaomi.com>
bug: v/5823

CM_RECONNECT_INTERVAL was changed from 8s to 12s for power optimization,
but CM_RECONNECT_TIMES was not updated accordingly, resulting in a 45-minute
reconnect window instead of the designed 30 minutes.

Fix by deriving CM_RECONNECT_TIMES from CM_RECONNECT_INTERVAL directly,
so future interval changes are automatically reflected.

Signed-off-by: liuxiang18 <liuxiang18@xiaomi.com>
bug: v/87639

Keep the ACL link in active mode during SCO connection to reduce
control latency for HFP commands such as call termination.

Also fix audio_on_exit to call bt_pm_idle instead of bt_pm_busy,
ensuring consistent idle state when SCO is not established.

Signed-off-by: Zihao Gao <gaozihao@xiaomi.com>
bug: v/88807

When malloc fails after index_alloc succeeds, the allocated
conn_id is not freed, causing index_allocator resource leak.
Fix by calling index_free() before returning NULL.

Signed-off-by: chejinxian1 <chejinxian1@xiaomi.com>
bug: v/88808

Per RFCOMM 1.2 spec, SCN valid range is 1-30, SCN 0 is
invalid/reserved. Change from -1 to 0 to fix UUID-based
connection state callback mismatch.

Signed-off-by: chejinxian1 <chejinxian1@xiaomi.com>
bug: v/88809

When bt_rfcomm_dlc_connect() fails in sdp_discovered_cb(),
spp_conn remains in connections list causing subsequent
connections to fail with BT_STATUS_BUSY. Fix by directly
cleaning up resources and notifying connection manager.

Signed-off-by: chejinxian1 <chejinxian1@xiaomi.com>
bug: v/88810

Add 3-second SDP discover timeout to prevent connection
hanging when remote device does not respond. Cancel timer
in sdp_discovered_cb, sdp_disconnected_cb and connection
free path. Also fix missing cm_profile_disconnected callback
when RFCOMM connect fails.

Signed-off-by: chejinxian1 <chejinxian1@xiaomi.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes multiple resource-management issues in SPP UUID-based connection flow (Zephyr SAL), including preventing connection ID leaks, aligning the “unknown SCN” sentinel with RFCOMM expectations, and adding an SDP discovery timeout to avoid indefinite hangs when the peer doesn’t respond.

Changes:

  • Add SDP discovery timeout + timer cleanup hooks in Zephyr SPP SAL, and adjust cleanup on SDP/RFCOMM failure paths.
  • Fix conn_id leak on allocation failure in SPP profile service.
  • Update UNKNOWN_SERVER_CHANNEL_NUM to 0 and adjust related header docs; minor GATT dump refactor.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
service/stacks/zephyr/sal_spp_interface.c Adds SDP discovery timeout timer and new cleanup paths for SDP/RFCOMM failures.
service/profiles/spp/spp_service.c Frees allocated conn_id when malloc(sizeof(spp_device_t)) fails.
service/profiles/gatt/gattc_service.c Minor refactor to inline a temporary variable in dump loop.
framework/include/bt_spp.h Updates unknown SCN sentinel to 0 and improves its documentation.
Comments suppressed due to low confidence (1)

service/stacks/zephyr/sal_spp_interface.c:792

  • sdp_discovered_cb() dereferences result (result->resp_buf) without checking whether result is NULL. Other SDP discovery callbacks in this repo (e.g., HFP) treat result == NULL as a valid “no matching service / discovery complete” signal. Add a !result check before accessing fields and handle the failure case by cleaning up and disconnecting/removing the pending SPP connection.
static uint8_t sdp_discovered_cb(struct bt_conn* conn, struct bt_sdp_client_result* result,
    const struct bt_sdp_discover_params* param)
{
    int err;
    uint8_t ret = BT_SDP_DISCOVER_UUID_STOP;
    uint16_t scn;
    sal_spp_connection_t* spp_conn;

    spp_conn = spp_find_connection_by_sdp_param(conn, param);
    if (!spp_conn) {
        BT_LOGE("SPP connection not found for conn");
        return BT_SDP_DISCOVER_UUID_STOP;
    }

    spp_sdp_cancel_timer(spp_conn->spp_client);

    if (!result->resp_buf) {
        BT_LOGE("SPP SDP discover response buffer is null");
        ret = BT_SDP_DISCOVER_UUID_CONTINUE;
        goto fail;

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return;
}

spp_conn->spp_client->sdp_timer = NULL;
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spp_sdp_timeout_handler() sets sdp_timer to NULL but never cancels/closes the libuv timer handle. Since service_loop_cancel_timer() is what actually uv_close()s the timer and frees it (via handle_close_cb), this will leak the timer after it fires, and spp_connection_free() won't be able to cancel it later because the pointer was cleared. Cancel/close the timer in the timeout handler (e.g., call service_loop_cancel_timer(timer) or spp_sdp_cancel_timer() before NULLing) and avoid clearing the pointer until after cancel.

Suggested change
spp_conn->spp_client->sdp_timer = NULL;
if (spp_conn->spp_client->sdp_timer) {
service_loop_cancel_timer(spp_conn->spp_client->sdp_timer);
spp_conn->spp_client->sdp_timer = NULL;
}

Copilot uses AI. Check for mistakes.
Comment on lines 787 to +815
@@ -768,8 +804,15 @@ static uint8_t sdp_discovered_cb(struct bt_conn* conn, struct bt_sdp_client_resu
err = spp_connect_with_channel(spp_conn, scn);
if (err < 0) {
BT_LOGE("SPP connect RFCOMM fail, err:%d", err);
ret = BT_SDP_DISCOVER_UUID_STOP;
goto fail;
/* RFCOMM connect failed, clean up resources directly */
spp_on_connection_state_changed(&spp_conn->addr, spp_conn->conn_port,
PROFILE_STATE_DISCONNECTED);
bt_sal_cm_profile_disconnected_callback(&spp_conn->addr, PROFILE_SPP,
spp_conn->conn_port);
spp_conn_lock();
bt_list_remove(g_spp_manager.connections, spp_conn);
spp_conn_unlock();
return BT_SDP_DISCOVER_UUID_STOP;
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDP timeout is canceled immediately on entry to sdp_discovered_cb(). If the callback returns BT_SDP_DISCOVER_UUID_CONTINUE (e.g., when resp_buf is NULL or bt_sdp_get_proto_param() fails), discovery can continue with no timeout running, which undermines the added hang protection. Consider canceling the timer only once discovery is definitively finishing (STOP/success) or restarting/rescheduling the timer when continuing.

Copilot uses AI. Check for mistakes.
* @brief Unknow server channel number
*
* @brief Unknown server channel number
* @note SCN valid range is 1-30 per RFCOMM spec, 0 means unspecified
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new UNKNOWN_SERVER_CHANNEL_NUM doc note says “SCN valid range is 1-30”, but other SPP API docs in this header still describe SCN as “range in 1-28”. Please reconcile these statements (e.g., mention that RFCOMM allows 1-30 but this API/service reserves 29–30, so callers should use 1–28).

Suggested change
* @note SCN valid range is 1-30 per RFCOMM spec, 0 means unspecified
* @note RFCOMM defines SCN values in the range 1-30; this API reserves 29-30 for
* internal use, so callers must use SCN values in the range 1-28. 0 means
* unspecified.

Copilot uses AI. Check for mistakes.
zhongzhijie1
zhongzhijie1 previously approved these changes Apr 9, 2026
Copy link
Copy Markdown
Contributor

@zhongzhijie1 zhongzhijie1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. conn_id leak fix、UNKNOWN_SERVER_CHANNEL_NUM 改 0 符合 RFCOMM spec、SDP discover timeout + cancel + 资源清理完整。

@liujinye-sys liujinye-sys dismissed zhongzhijie1’s stale review May 9, 2026 06:44

The merge-base changed after approval.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.