HPCC-36063 feat(jlib) Add support for Unix domain sockets #21146

Open
ghalliday wants to merge 1 commit into hpcc-systems:master from ghalliday:issue36063

@ghalliday (Member) commented Mar 25, 2026

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>

@ghalliday (Member Author)

This did not seem to have any noticeable impact on Roxie performance. I will retest on a v2 at a later date. I think it is worth merging though.

Copilot AI (Contributor) left a comment

Pull request overview

Adds Unix Domain Socket (UDS) support as an optional transport path, wiring it through jlib socket creation/connect paths and Roxie TCP transport helpers.

Changes:

  • Extended ISocket and secure socket utilities to optionally use AF_UNIX sockets.
  • Updated Roxie TCP sender/listener plumbing to switch to UDS when protocol=uds.
  • Updated unit tests call sites to match the new async-connect API signature.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| testing/unittests/socketasynctests.cpp | Updates async-connect socket creation calls to pass a UDS flag. |
| testing/unittests/multishotaccepttests.cpp | Updates listener constructor call to include new UDS parameter. |
| system/security/securesocket/socketutils.hpp | Extends listener/sender APIs with UDS options. |
| system/security/securesocket/socketutils.cpp | Implements UDS-aware listen/connect and async connect setup. |
| system/jlib/jsocket.hpp | Extends ISocket API for UDS and async connect. |
| system/jlib/jsocket.cpp | Adds AF_UNIX socket-mode support and UDS create/connect helpers. |
| roxie/udplib/udpsha.cpp | Adds global useUdsTransport configuration flag. |
| roxie/udplib/udplib.hpp | Exposes useUdsTransport flag. |
| roxie/udplib/tcptrs.cpp | Passes UDS flag into CTcpSender. |
| roxie/udplib/tcptrr.cpp | Updates listener start to pass UDS flag. |
| roxie/ccd/ccdqueue.cpp | Updates listener/sender startup to pass UDS flag. |
| roxie/ccd/ccdmain.cpp | Parses protocol=uds and enables the UDS transport flag. |

Comment thread system/jlib/jsocket.cpp
getSockAddrEndpoint(peerSockAddr, peerSockAddrLen, peerEp);
}

CSocket *ret = new CSocket(newsock,sm_tcp,true,&peerEp);

Copilot AI commented Mar 25, 2026

When accepting an AF_UNIX connection, the accepted socket is still constructed with sm_tcp. This will misclassify the socket (e.g., mode-dependent behavior like connectionless() and any future mode checks). Consider selecting sm_unix when peerSockAddr.sa.sa_family == AF_UNIX (or derive from the listening socket’s sockmode) when constructing the accepted CSocket.

Suggested change
- CSocket *ret = new CSocket(newsock,sm_tcp,true,&peerEp);
+ auto mode = sm_tcp;
+ #ifndef _WIN32
+ if (peerSockAddr.sa.sa_family == AF_UNIX)
+     mode = sm_unix;
+ #endif
+ CSocket *ret = new CSocket(newsock,mode,true,&peerEp);

@ghalliday (Member Author)

That looks like a potential issue - I will revisit later.
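
One possible shape for the fix, as a minimal standalone sketch rather than the jlib code: the accepted socket's mode is chosen from the peer address family reported by accept(). The `SockMode` enum and `modeForFamily` helper are hypothetical stand-ins for jlib's socket modes; deriving the mode from the listening socket instead would avoid relying on the peer address at all.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <sys/un.h>

// Hypothetical stand-in for jlib's socket-mode enum; only the two values
// discussed in this thread are modelled.
enum class SockMode { tcp, unix_stream };

// Pick the mode for an accepted socket from the address family that
// accept() reported for the peer, instead of hardcoding the TCP mode.
inline SockMode modeForFamily(sa_family_t family)
{
    return (family == AF_UNIX) ? SockMode::unix_stream : SockMode::tcp;
}
```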

Comment thread system/jlib/jsocket.cpp
Comment on lines +971 to +979
inline socklen_t setUnixSockAddr(J_SOCKADDR &u, const IpAddress &ip, unsigned short port) {
    StringBuffer ipStr;
    ip.getIpText(ipStr);
    ipStr.replace((char) '.', (char) '_');
    ipStr.replace((char) ':', (char) '_');
    u.sun.sun_family = AF_UNIX;
    snprintf(u.sun.sun_path, sizeof(u.sun.sun_path), "/tmp/hpcc_uds_%s_%u.sock", ipStr.str(), port);
    return sizeof(u.sun);
}

Copilot AI commented Mar 25, 2026

The UDS path is predictable and created under /tmp, which is world-writable. This can enable denial-of-service and/or symlink-style attacks around bind/unlink, and also risks path collisions. Consider using a dedicated runtime directory with restrictive permissions (e.g., under /var/run or a configured directory), and/or validate ownership/permissions before unlinking, and set explicit socket file permissions after bind.

@ghalliday (Member Author)

I think this is a valid point. @mckellyln any suggestions? /var/lib/HPCC-Systems/uds?

@mckellyln (Contributor) commented Mar 30, 2026

Interesting if it was a shared filesystem that could be seen across containers on the same host.
But otherwise yes, /var/run/HPCCSystems/uds seems like a good dir.

Contributor

I sort of agree with a comment below: we can have different IPs for the same host depending on the default interface, so perhaps it's better to use a hostname or something else with more chance of being the same on both sides.
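
To make the directory suggestion concrete, here is a minimal POSIX sketch, assuming a dedicated runtime directory is used. The helper names `ensurePrivateSocketDir` and `removeStaleSocket` are hypothetical and not part of jlib. It creates or validates a 0700 directory owned by the current user, and refuses to unlink anything that is not a socket owned by that user. A check-then-act race remains between lstat() and unlink(), so this narrows the exposure rather than eliminating it.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Create (or validate) a private directory for socket files.
// Returns true only if the path is a real directory (not a symlink),
// owned by the current user, with no group/other permission bits left set.
inline bool ensurePrivateSocketDir(const std::string &dir)
{
    if (::mkdir(dir.c_str(), 0700) != 0 && errno != EEXIST)
        return false;
    struct stat st;
    if (::lstat(dir.c_str(), &st) != 0)   // lstat: do not follow symlinks
        return false;
    if (!S_ISDIR(st.st_mode) || st.st_uid != ::geteuid())
        return false;
    if (st.st_mode & (S_IRWXG | S_IRWXO)) // pre-existing directory was too open
        return ::chmod(dir.c_str(), 0700) == 0;
    return true;
}

// Remove a stale socket file only if it is a socket owned by the current user.
// Returns true if the path is absent afterwards.
inline bool removeStaleSocket(const std::string &path)
{
    struct stat st;
    if (::lstat(path.c_str(), &st) != 0)
        return errno == ENOENT;           // nothing to remove
    if (!S_ISSOCK(st.st_mode) || st.st_uid != ::geteuid())
        return false;                     // refuse to unlink foreign files
    return ::unlink(path.c_str()) == 0;
}
```

A listener would call `ensurePrivateSocketDir` once at startup and `removeStaleSocket` immediately before bind().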

Comment thread system/jlib/jsocket.cpp
Comment on lines +1410 to +1412
IpAddress bindIp;
if (targetip.isNull()) GetHostIp(bindIp); else bindIp = targetip;
ul = setUnixSockAddr(u, bindIp, hostport);

Copilot AI commented Mar 25, 2026

UDS bind path derivation depends on bindIp (host IP if targetip is null). On the client side, the path derivation uses the endpoint IP passed for connect. If the endpoint IP differs from GetHostIp() (e.g., 127.0.0.1 vs interface IP, hostname vs IP, etc.), client and server will compute different filesystem paths and connects will fail. Consider deriving the UDS path from a stable identifier independent of endpoint IP (e.g., port only, or an explicit configured socket path), or ensuring both sides use the same canonical IP when useUDS is enabled.

Suggested change
- IpAddress bindIp;
- if (targetip.isNull()) GetHostIp(bindIp); else bindIp = targetip;
- ul = setUnixSockAddr(u, bindIp, hostport);
+ // Derive UDS path from a stable identifier (port) rather than host IP,
+ // to avoid mismatches between server and client UDS paths.
+ ul = setUnixSockAddr(u, hostport);

Contributor

I agree with this point. I'm not sure how common it is, but perhaps use the hostname?
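
As an illustration of the port-only option, the sketch below builds the AF_UNIX address from a directory and a port, so both sides compute the same path whatever IP or hostname they were given. It also rejects paths that would be truncated in sun_path, which the snprintf in the PR does not check. The function name `makeUnixAddrForPort` and the directory argument are assumptions for illustration, not the PR's API. A port-only name would still collide if several containers shared the socket directory, which is the case raised earlier in the thread.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <sys/socket.h>
#include <sys/un.h>

// Build an AF_UNIX address from a directory and a port number only.
// Returns the address length, or 0 if the path would not fit in sun_path.
inline socklen_t makeUnixAddrForPort(sockaddr_un &addr, const char *dir, unsigned short port)
{
    std::memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    int n = std::snprintf(addr.sun_path, sizeof(addr.sun_path),
                          "%s/hpcc_uds_%u.sock", dir, static_cast<unsigned>(port));
    if (n < 0 || static_cast<size_t>(n) >= sizeof(addr.sun_path))
        return 0;   // truncated: refuse rather than bind to a clipped path
    return static_cast<socklen_t>(sizeof(addr));
}
```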

Comment thread system/jlib/jsocket.hpp
// _useIOUring: true to enable io_uring multishot accept (if available), false to use traditional thread-based accept
// Defaults to true, but can be overridden by expert/@useIOUring configuration setting
- CSocketConnectionListener(unsigned port, bool _useTLS, unsigned _inactiveCloseTimeoutMs, unsigned _maxListenHandlerSockets, bool _useIOUring = true);
+ CSocketConnectionListener(unsigned port, bool _useTLS, unsigned _inactiveCloseTimeoutMs, unsigned _maxListenHandlerSockets, bool _useIOUring, bool _useUDS);

Copilot AI commented Mar 25, 2026

The comment states _useIOUring defaults to true, but the constructor signature no longer provides a default and now requires both _useIOUring and _useUDS. Update the comment to match the API, or reintroduce default arguments (e.g., _useIOUring=true, _useUDS=false) if that’s the intended behavior.

Suggested change
- CSocketConnectionListener(unsigned port, bool _useTLS, unsigned _inactiveCloseTimeoutMs, unsigned _maxListenHandlerSockets, bool _useIOUring, bool _useUDS);
+ // _useUDS: true to use Unix domain sockets, false to use TCP/IP (defaults to false)
+ CSocketConnectionListener(unsigned port, bool _useTLS, unsigned _inactiveCloseTimeoutMs, unsigned _maxListenHandlerSockets, bool _useIOUring=true, bool _useUDS=false);

@ghalliday (Member Author)

will fix
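
For reference, a simplified sketch of the defaulted-argument option. `ListenerSketch` is a hypothetical stand-in with fewer parameters than the real listener class. Trailing defaults let existing call sites compile unchanged while new callers opt in to UDS explicitly.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the listener discussed above.
struct ListenerSketch
{
    unsigned port;
    bool useTLS;
    bool useIOUring;
    bool useUDS;

    // _useIOUring: defaults to true, matching the existing header comment.
    // _useUDS: defaults to false, so TCP/IP remains the default transport.
    ListenerSketch(unsigned _port, bool _useTLS, bool _useIOUring = true, bool _useUDS = false)
        : port(_port), useTLS(_useTLS), useIOUring(_useIOUring), useUDS(_useUDS)
    {
    }
};
```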

return config.getClear();
}

static constexpr bool useUDS = false;

Copilot AI commented Mar 25, 2026

UDS support is introduced in the async-connect path, but the unit tests hardcode useUDS = false, so the new UDS branch is not exercised. Consider adding at least one test variant that sets useUDS = true and validates async connect (and failure/timeout behaviors) against a UDS listener to cover the new transport mode.

Suggested change
- static constexpr bool useUDS = false;
+ static constexpr bool useUDS = true;

@ghalliday (Member Author)

Definitely for future work - currently this doesn't show a significant boost, so unlikely to be used in production.
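
For when that future work is picked up, here is a rough OS-level sketch of what a UDS round-trip check could look like. It uses plain POSIX calls with blocking connect, so it does not exercise jlib's `createForAsyncConnect` path. The helper names are hypothetical, and the unlink before bind is only acceptable because the path is a throwaway test path.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

// Fill a sockaddr_un from a filesystem path; false if the path is too long.
inline bool fillUnixAddr(sockaddr_un &addr, const std::string &path)
{
    if (path.size() >= sizeof(addr.sun_path))
        return false;
    std::memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    std::memcpy(addr.sun_path, path.c_str(), path.size() + 1);
    return true;
}

// Create a listening AF_UNIX stream socket bound to 'path'. Returns fd or -1.
inline int listenUnix(const std::string &path)
{
    sockaddr_un addr;
    if (!fillUnixAddr(addr, path))
        return -1;
    int fd = ::socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    ::unlink(path.c_str());   // test-only: clear a leftover file from an earlier run
    if (::bind(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) != 0 || ::listen(fd, 8) != 0)
    {
        ::close(fd);
        return -1;
    }
    return fd;
}

// Connect to 'path'. Returns fd, or -1 with errno as set by connect().
inline int connectUnix(const std::string &path)
{
    sockaddr_un addr;
    if (!fillUnixAddr(addr, path))
    {
        errno = ENAMETOOLONG;
        return -1;
    }
    int fd = ::socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (::connect(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) != 0)
    {
        int err = errno;      // preserve connect()'s error across close()
        ::close(fd);
        errno = err;
        return -1;
    }
    return fd;
}
```

A test would cover at least the success path (listen, connect, accept, exchange a byte) and the failure path where no socket file exists.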

size32_t addrlen = 0;

- Owned<ISocket> socket = ISocket::createForAsyncConnect(ep, addrHolder.getRef(), addrlen);
+ Owned<ISocket> socket = ISocket::createForAsyncConnect(ep, addrHolder.getRef(), addrlen, useUDS);

Copilot AI commented Mar 25, 2026

UDS support is introduced in the async-connect path, but the unit tests hardcode useUDS = false, so the new UDS branch is not exercised. Consider adding at least one test variant that sets useUDS = true and validates async connect (and failure/timeout behaviors) against a UDS listener to cover the new transport mode.

@github-actions

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-36063

Jirabot Action Result:
Assigning user: gavin.halliday@lexisnexisrisk.com
Workflow Transition To: Merge Pending
Updated PR

@jakesmith (Member) left a comment

@ghalliday - looks ok afaics, 1 minor comment and some bracing formatting/inconsistencies.

Comment thread system/jlib/jsocket.cpp
}


inline socklen_t setUnixSockAddr(J_SOCKADDR &u, const IpAddress &ip, unsigned short port) {

Member

formatting: new code uses non-Allman style bracing

Comment thread system/jlib/jsocket.cpp
sockaddrlen = setSockAddr(sockaddr, targetip, hostport);
T_SOCKET sock = ::socket(sockaddr.sa.sa_family, SOCK_STREAM, targetip.isIp4() ? 0 : PF_INET6);
#ifndef _WIN32
if (sockmode == sm_unix_server || sockmode == sm_unix) {

Member

formatting: mixed bracing style (non-Allman), here and in various other places.

Comment thread roxie/ccd/ccdmain.cpp

bool isBatchRoxie = strisame(roxieMode, "batch");
- useTcpTransport = strisame(protocol, "tcp");
+ useTcpTransport = strisame(protocol, "tcp") || strisame(protocol, "uds");

Member

Intentional I assume, but it could do with a comment to clarify.

@ghalliday (Member Author)

agreed
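
A sketch of how the clarifying comment might read, written as a standalone function rather than the actual ccdmain.cpp code. `TransportFlags` and `parseProtocol` are hypothetical names, the case-insensitive comparison stands in for `strisame` on the assumption that it ignores case, and setting the UDS flag from `protocol=uds` is inferred from the file summary above rather than copied from the diff.

```cpp
#include <cassert>
#include <strings.h>   // strcasecmp (POSIX)

// Hypothetical flags mirroring the two settings discussed in this thread.
struct TransportFlags
{
    bool useTcpTransport = false;   // stream-based sender/listener code path
    bool useUdsTransport = false;   // same code path, but over AF_UNIX sockets
};

inline TransportFlags parseProtocol(const char *protocol)
{
    TransportFlags flags;
    const bool isTcp = protocol && strcasecmp(protocol, "tcp") == 0;
    const bool isUds = protocol && strcasecmp(protocol, "uds") == 0;
    // "uds" deliberately enables the TCP transport path as well: UDS reuses
    // the stream sender/listener plumbing and only swaps the socket family.
    flags.useTcpTransport = isTcp || isUds;
    flags.useUdsTransport = isUds;
    return flags;
}
```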

{
socket.setown(ISocket::connect_timeout(ep, 5000));
if (sender.useUDS)
socket.setown(ISocket::unix_connect(ep));

Contributor

Why not use the timeout value here as well?
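
To illustrate what a timed UDS connect could involve, here is a Linux-oriented sketch. `unixConnectTimeout` is a hypothetical helper, not the jlib `unix_connect` API. On Linux, a non-blocking connect() on an AF_UNIX stream socket fails with EAGAIN when the connection cannot be completed immediately (for example when the listener's backlog is full), so the sketch retries until a deadline instead of waiting on EINPROGRESS as a TCP connect would.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstring>
#include <fcntl.h>
#include <poll.h>
#include <string>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

// Connect to the AF_UNIX stream socket at 'path', giving up after 'timeoutMs'.
// Returns a connected fd (left in non-blocking mode), or -1 with errno set.
inline int unixConnectTimeout(const std::string &path, int timeoutMs)
{
    sockaddr_un addr;
    if (path.size() >= sizeof(addr.sun_path))
    {
        errno = ENAMETOOLONG;
        return -1;
    }
    std::memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    std::memcpy(addr.sun_path, path.c_str(), path.size() + 1);

    int fd = ::socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    int flags = ::fcntl(fd, F_GETFL, 0);
    if (flags < 0 || ::fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
    {
        int err = errno;
        ::close(fd);
        errno = err;
        return -1;
    }

    using Clock = std::chrono::steady_clock;
    const auto deadline = Clock::now() + std::chrono::milliseconds(timeoutMs);
    for (;;)
    {
        if (::connect(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) == 0)
            return fd;
        const int err = errno;
        const bool retryable = (err == EAGAIN || err == EINTR);
        if (retryable && Clock::now() < deadline)
        {
            ::poll(nullptr, 0, 1);   // sleep roughly 1 ms before retrying
            continue;
        }
        ::close(fd);
        errno = retryable ? ETIMEDOUT : err;   // hard errors are reported as-is
        return -1;
    }
}
```

Hard failures such as a missing socket file (ENOENT) or no listener (ECONNREFUSED) return immediately; only the "try again" cases consume the timeout.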

@mckellyln (Contributor) left a comment

Looks good, but I do have a few comments inline.
