[Deepin-Kernel-SIG] [linux 6.6.y] [Upstream] tcp: remove 64 KByte limit for initial tp->rcv_wnd value #1713

Open
opsiff wants to merge 2 commits into deepin-community:linux-6.6.y from opsiff:linux-6.6.y-2026-05-14-2

@opsiff opsiff commented May 14, 2026

Summary by Sourcery

Relax TCP initial receive window limits while constraining SYNACK window advertisements for security and RFC compliance.

Bug Fixes:

  • Cap the advertised SYNACK receive window to 64KB to harden TCP connections in NEW_SYN_RECV state against oversized window abuse.

Enhancements:

  • Remove the 64KB cap on the initial TCP receive window so it can fully utilize available space as computed by tcp_select_initial_window.
  • Centralize SYNACK window calculation in a helper that enforces the 64KB limit and reuse it for both IPv4 and IPv6 ACK handling.

JasonXing and others added 2 commits May 14, 2026 16:01
mainline inclusion
from mainline-v6.10-rc1
category: bugfix

Recently, we had some servers upgraded to the latest kernel and noticed
the indicator from the user side showed worse results than before. It is
caused by the limitation of tp->rcv_wnd.

In 2018, commit a337531 ("tcp: up initial rmem to 128KB and SYN rwin
to around 64KB") limited the initial value of tp->rcv_wnd to 65535. Most
CDN teams do not benefit from that change, because they cannot use a
large window to receive a big packet, and transfers are slowed down,
especially over long-RTT paths. A small rcv_wnd means a slow transfer
speed, to some extent. This is a side effect for latency-sensitive and
time-sensitive users.

To avoid future confusion: this change does not affect the initial
receive window on the wire in a SYN or SYN+ACK packet, which stays within
65535 bytes as RFC 7323 requires, because of the limit in
__tcp_transmit_skb():

    th->window      = htons(min(tp->rcv_wnd, 65535U));

In short, __tcp_transmit_skb() already ensures that this constraint is
respected, no matter how large tp->rcv_wnd is, so the change does not
violate the RFC.

Here is an example without and with the patch:
Before:
client   --- SYN: rwindow=65535 ---> server
client   <--- SYN+ACK: rwindow=65535 ----  server
client   --- ACK: rwindow=65536 ---> server
Note: for the last ACK, the calculation is 512 << 7.

After:
client   --- SYN: rwindow=65535 ---> server
client   <--- SYN+ACK: rwindow=65535 ----  server
client   --- ACK: rwindow=175232 ---> server
Note: I use the following command to make it work:
ip route change default via [ip] dev eth0 metric 100 initrwnd 120
For the last ACK, the calculation is 1369 << 7.

With this patch applied, a user who tweaks this knob gets a larger
rcv_wnd, which helps transfer data faster and saves some RTTs.
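
The two ACK figures in the example can be reproduced with a small userspace sketch. It is not kernel code: it assumes an MSS of 1460 bytes (the commit message does not state the MSS) and a window scale of 7, and it models the advertised window as rcv_wnd rounded up to the window-scale granularity. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_MAX_WSCALE 14U /* largest shift count allowed by RFC 7323 */

/* Round win up to a multiple of 2^wscale (assumed rounding model). */
static uint32_t align_to_wscale(uint32_t win, unsigned int wscale)
{
	assert(wscale <= TCP_MAX_WSCALE);
	uint32_t unit = 1U << wscale;

	return (win + unit - 1) & ~(unit - 1);
}

/* Value carried in the 16-bit window field of a post-handshake ACK. */
static uint32_t ack_window_field(uint32_t rcv_wnd, unsigned int wscale)
{
	return align_to_wscale(rcv_wnd, wscale) >> wscale;
}

/* Window in bytes that the peer reconstructs from that field. */
static uint32_t ack_window_bytes(uint32_t rcv_wnd, unsigned int wscale)
{
	return ack_window_field(rcv_wnd, wscale) << wscale;
}
```

Under these assumptions, a capped rcv_wnd of 65535 gives a field of 512 and a window of 65536 bytes, while initrwnd 120 with the assumed MSS gives 120 * 1460 = 175200, a field of 1369 and a window of 175232 bytes.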

Fixes: a337531 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20240521134220.12510-1-kerneljasonxing@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
(cherry picked from commit 378979e)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>

mainline inclusion
from mainline-v6.10-rc2
category: bugfix

Jason's commit made checks against the ACK sequence less strict,
which attackers can exploit to establish spoofed flows
with fewer probes.

Innocent users might use tcp_rmem[1] == 1,000,000,000,
or something more reasonable.

An attacker can use a regular TCP connection to learn the server
initial tp->rcv_wnd, and use it to optimize the attack.

If we make sure that only the announced window (no larger than 65535)
is used for ACK validation, we force an attacker to use
65537 packets to complete the 3WHS (assuming the server ISN is unknown).
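
The 65537 figure can be approximated with a rough model, sketched below. It assumes each spoofed probe covers one accepted window of the 32-bit sequence space and uses integer division, so it is an estimate rather than a description of the kernel's exact validation logic. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rough number of probes needed to sweep the 2^32 sequence space when
 * each probe is accepted anywhere inside accepted_window bytes.
 * Assumed model for illustration only.
 */
static uint64_t probes_to_cover_seq_space(uint32_t accepted_window)
{
	assert(accepted_window > 0);
	return (UINT64_C(1) << 32) / accepted_window;
}
```

With a 65535-byte window the estimate is 65537 probes. With a window of 1,000,000,000 bytes, as in the tcp_rmem[1] example above, it drops to 4, which is why the accepted window is reduced in NEW_SYN_RECV state.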

Fixes: 378979e ("tcp: remove 64 KByte limit for initial tp->rcv_wnd value")
Link: https://datatracker.ietf.org/meeting/119/materials/slides-119-tcpm-ghost-acks-00
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/r/20240523130528.60376-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Conflicts:
	net/ipv4/tcp_ipv4.c
	net/ipv6/tcp_ipv6.c
(cherry picked from commit f4dca95)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>

sourcery-ai Bot commented May 14, 2026

Reviewer's Guide

Adjusts TCP initial receive window handling to remove the 64KB cap while ensuring SYNACK-exposed window and SYN-RECV state validation remain limited to 64KB for hardening, reusing a new helper across IPv4/IPv6 paths.

File-Level Changes

Change: Introduce tcp_synack_window() helper to cap SYNACK-visible receive window at 64KB and reuse it in IPv4/IPv6 ACK generation and SYN-RECV window checks.
Details:
  • Add tcp_synack_window(request_sock *) inline helper that returns rsk_rcv_wnd limited to 65535U with explanatory RFC 7323 comment
  • Use tcp_synack_window(req) >> inet_rsk(req)->rcv_wscale for the advertised window in tcp_v4_reqsk_send_ack instead of directly shifting rsk_rcv_wnd
  • Use tcp_synack_window(req) >> inet_rsk(req)->rcv_wscale for the advertised window in tcp_v6_reqsk_send_ack to mirror IPv4 behavior
  • Replace direct use of req->rsk_rcv_wnd in tcp_check_req TCP sequence window validation with tcp_synack_window(req) to constrain the acceptable window during NEW_SYN_RECV
Files:
  • include/net/request_sock.h
  • net/ipv4/tcp_ipv4.c
  • net/ipv6/tcp_ipv6.c
  • net/ipv4/tcp_minisocks.c

Change: Remove the generic 64KB cap on initial receive window selection, allowing larger initial windows when not using the signed-windows workaround.
Details:
  • Change tcp_select_initial_window to set *rcv_wnd = space when tcp_workaround_signed_windows is disabled, instead of min_t(u32, space, U16_MAX)
  • Retain existing path that limits *rcv_wnd to MAX_TCP_WINDOW when tcp_workaround_signed_windows is enabled
  • Keep interaction with init_rcv_wnd and MSS unchanged so the final receive window remains bounded by init_rcv_wnd * mss
Files:
  • net/ipv4/tcp_output.c
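
The tcp_select_initial_window change described above can be modeled in userspace roughly as follows. This is a simplified sketch, not the kernel function: it keeps only the three steps listed in the details, omits the window-scale and buffer-space handling, and the function name, the patched flag and the value used for MAX_TCP_WINDOW (32767) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_TCP_WINDOW 32767U /* assumed value of the kernel constant */

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * space:          available receive space in bytes
 * mss:            maximum segment size in bytes
 * init_rcv_wnd:   route initrwnd in segments, 0 when unset
 * signed_windows: models tcp_workaround_signed_windows
 * patched:        false = old U16_MAX cap, true = cap removed
 */
static uint32_t initial_rcv_wnd(uint32_t space, uint32_t mss,
				uint32_t init_rcv_wnd,
				bool signed_windows, bool patched)
{
	uint32_t rcv_wnd;

	assert(mss > 0);
	if (signed_windows)
		rcv_wnd = min_u32(space, MAX_TCP_WINDOW);
	else if (patched)
		rcv_wnd = space;
	else
		rcv_wnd = min_u32(space, UINT16_MAX);

	/* The route's initrwnd still bounds the result. */
	if (init_rcv_wnd)
		rcv_wnd = min_u32(rcv_wnd, init_rcv_wnd * mss);

	return rcv_wnd;
}
```

With 1,000,000 bytes of space, an MSS of 1460 and initrwnd 120, this model returns 65535 before the patch and 175200 after it; the signed-windows path is unchanged in both cases.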

@deepin-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from opsiff. For more information see the Code Review Process.

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 1 issue, and left some high level feedback:

  • Consider using the existing U16_MAX (or a TCP-specific constant) instead of the hard-coded 65535U in tcp_synack_window() to make the RFC limit clearer and avoid magic numbers.
  • tcp_synack_window() is TCP-specific but currently lives in request_sock.h, which is fairly generic; consider moving it to a TCP header (or at least prefixing the comment more explicitly as TCP-only) to avoid leaking protocol-specific helpers into generic request_sock code.
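
For illustration, the U16_MAX variant suggested above might look like the following userspace sketch. It takes plain integers instead of a struct request_sock, uses UINT16_MAX from <stdint.h> in place of the kernel's U16_MAX, and the function names are hypothetical stand-ins for the kernel helper and its caller.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp the request socket's receive window to what a SYNACK can carry. */
static uint32_t synack_window(uint32_t rsk_rcv_wnd)
{
	return rsk_rcv_wnd < UINT16_MAX ? rsk_rcv_wnd : UINT16_MAX;
}

/* Window field for an ACK sent on behalf of a request socket. */
static uint32_t reqsk_ack_window_field(uint32_t rsk_rcv_wnd,
				       unsigned int rcv_wscale)
{
	assert(rcv_wscale <= 14); /* RFC 7323 maximum shift count */
	return synack_window(rsk_rcv_wnd) >> rcv_wscale;
}
```

The behavior is identical to the 65535U form; the only difference is that the named constant ties the limit to the 16-bit window field.
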
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider using the existing U16_MAX (or a TCP-specific constant) instead of the hard-coded 65535U in tcp_synack_window() to make the RFC limit clearer and avoid magic numbers.
- tcp_synack_window() is TCP-specific but currently lives in request_sock.h, which is fairly generic; consider moving it to a TCP header (or at least prefixing the comment more explicitly as TCP-only) to avoid leaking protocol-specific helpers into generic request_sock code.

## Individual Comments

### Comment 1
<location path="include/net/request_sock.h" line_range="249-251" />
<code_context>
+ * This means the SEG.WND carried in SYNACK can not exceed 65535.
+ * We use this property to harden TCP stack while in NEW_SYN_RECV state.
+ */
+static inline u32 tcp_synack_window(const struct request_sock *req)
+{
+	return min(req->rsk_rcv_wnd, 65535U);
+}
 #endif /* _REQUEST_SOCK_H */
</code_context>
<issue_to_address>
**nitpick:** Use U16_MAX instead of hard-coded 65535U for clarity and consistency

Since this helper is specifically constraining the window to 16 bits, using U16_MAX here makes that intent explicit, avoids a magic number, and stays consistent with other TCP code that uses the same limit.
</issue_to_address>

Copilot AI left a comment

Pull request overview

This PR backports an upstream Linux TCP change to the deepin 6.6.y kernel that removes the 64KB cap on the initial tp->rcv_wnd while still ensuring SYNACK segments advertise a window that fits within the 16-bit SEG.WND field. A new tcp_synack_window() helper centralizes the 64KB clamp and is reused for IPv4/IPv6 ACK paths and the tcp_check_req() window check.

Changes:

  • Remove U16_MAX cap from initial receive window selection in tcp_select_initial_window().
  • Introduce tcp_synack_window() helper in include/net/request_sock.h that clamps rsk_rcv_wnd to 65535.
  • Use the helper in IPv4/IPv6 reqsk_send_ack paths and in tcp_check_req() window validation.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Summary per file:

  • include/net/request_sock.h: Adds the tcp_synack_window() helper enforcing the 64KB cap, with RFC 7323 explanatory comment.
  • net/ipv4/tcp_output.c: Removes the U16_MAX clamp on initial rcv_wnd in tcp_select_initial_window().
  • net/ipv4/tcp_minisocks.c: Uses tcp_synack_window() in the in-window check of tcp_check_req().
  • net/ipv4/tcp_ipv4.c: Replaces direct req->rsk_rcv_wnd with tcp_synack_window(req) in tcp_v4_reqsk_send_ack(); drops moved RFC comment.
  • net/ipv6/tcp_ipv6.c: Same substitution for tcp_v6_reqsk_send_ack(); drops moved RFC comment.

5 participants