Merged
Enable clients connected to different dmsg servers to communicate by
having servers peer with each other. This removes the scaling limitation
where clients must be on the same server to reach each other.
Design:
- Servers peer as clients to each other using existing session mechanism
(TCP + noise XK handshake + yamux), requiring no new transport code
- Peers configured via static config (no discovery dependency)
- When a server can't find destination client locally, it tries
forwarding through peer server sessions
- 1-hop maximum: peer servers only check local sessions, no further
forwarding (prevents loops without TTL)
- Original SignedObject forwarded as-is (client signature preserved)
- Backward compatible: no wire protocol changes, existing clients
work unchanged
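The lookup-then-forward rule and the 1-hop limit in the design above can be sketched as follows. This is a simplified model, not the actual dmsg code: `PK`, `Server`, `Route`, and `ErrNotFound` are hypothetical names, and real sessions are replaced by direct method calls.

```go
package main

import "errors"

// PK is a hypothetical stand-in for a dmsg public key.
type PK string

// ErrNotFound is returned when no reachable server holds the destination.
var ErrNotFound = errors.New("destination client not found")

// Server is a simplified model: local client sessions plus peer servers.
type Server struct {
	Name    string
	Clients map[PK]bool // clients with a session on this server
	Peers   []*Server   // peer servers this server can forward to
}

// Route returns the name of the server that can deliver to dst.
// fromPeer marks a request that arrived over a peer session. Such a
// request is only checked against local sessions and never forwarded
// again, which is what bounds forwarding to one hop without a TTL.
func (s *Server) Route(dst PK, fromPeer bool) (string, error) {
	if s.Clients[dst] {
		return s.Name, nil
	}
	if fromPeer {
		return "", ErrNotFound
	}
	for _, p := range s.Peers {
		if name, err := p.Route(dst, true); err == nil {
			return name, nil
		}
	}
	return "", ErrNotFound
}
```

In this model, a chain A -> B -> C lets a client on A reach a client on B, but not a client on C, because B refuses to forward a request that already came from a peer.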
Key changes:
- ServerConfig.Peers: static peer server list (PK + address)
- Server.peerSessions: outbound connections to peer servers
- Server.peerPKs: identifies incoming sessions as peer servers
- SessionCommon.isPeer: relaxes SrcAddr.PK check for forwarded requests
- ServerSession.forwardViaPeer: iterates peers on local lookup failure
- maintainPeerConnection: persistent connection with reconnect backoff
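The description only says that maintainPeerConnection reconnects with backoff. A minimal sketch of one common choice, capped exponential backoff, is shown here. The doubling policy, the function name `NextBackoff`, and its parameters are assumptions, not taken from the change itself.

```go
package main

import "time"

// NextBackoff returns the delay to wait before the next reconnect attempt.
// Assumed policy (hypothetical): start at initial, double after every
// failed attempt, and never exceed max.
func NextBackoff(cur, initial, max time.Duration) time.Duration {
	if cur <= 0 {
		return initial
	}
	next := cur * 2
	// next < cur guards against integer overflow on very large delays.
	if next > max || next < cur {
		return max
	}
	return next
}
```

A reconnect loop would reset the delay to zero after a successful connection, so that a later failure starts again from the initial delay.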
Config example:
"peers": [{"public_key": "02abc...", "address": "1.2.3.4:8081"}]
Servers now automatically discover and peer with all other servers registered in dmsg discovery, in addition to statically configured peers. A background loop queries AllServers periodically and establishes peer connections to any new servers found. Static config peers take priority and are always connected. Discovery-based peers are additive: they are discovered and connected without requiring any config changes. In the current deployment, all dmsg servers will therefore mesh with each other automatically, as long as they share the same dmsg discovery.
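One pass of the discovery loop has to decide which discovered servers still need a peer connection. A sketch of that selection step, assuming the loop skips the server's own key and anything already connected (including static peers); `PK` and `PeersToDial` are hypothetical names:

```go
package main

// PK is a hypothetical stand-in for a dmsg public key.
type PK string

// PeersToDial filters one discovery result down to the servers that still
// need a peer connection. connected holds every peer that already has a
// session, whether it came from static config or an earlier discovery pass.
func PeersToDial(self PK, connected map[PK]bool, discovered []PK) []PK {
	var out []PK
	seen := make(map[PK]bool)
	for _, pk := range discovered {
		// Skip ourselves, existing peers, and duplicate entries.
		if pk == self || connected[pk] || seen[pk] {
			continue
		}
		seen[pk] = true
		out = append(out, pk)
	}
	return out
}
```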
DialStream now falls back to trying all existing sessions when the target's delegated servers are unreachable. If the client's server is meshed with the target's server, the request is forwarded through the peer connection transparently.

The e2e test verifies: two servers peered via static config, each with one isolated client (separate filtered discovery), cross-server dial succeeds with bidirectional 1KB data transfer through the mesh.
Reorder DialStream to try mesh forwarding through existing sessions before attempting to establish new server connections. The new order:
1. Existing sessions matching target's delegated servers (direct, free)
2. All other existing sessions via mesh (free, already connected)
3. New sessions to delegated servers (expensive, last resort)
This avoids unnecessary TCP+noise+yamux handshakes when the client is already connected to meshed servers that can forward the request.
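The three-tier order can be expressed as a pure function that turns the client's existing sessions and the target's delegated servers into an ordered list of attempts. This is an illustrative sketch only: `PK`, `Attempt`, and `DialOrder` are hypothetical names, and the real DialStream works on session objects rather than keys.

```go
package main

// PK is a hypothetical stand-in for a dmsg public key.
type PK string

// Attempt is one dial attempt against a server, tagged with its tier.
type Attempt struct {
	Server PK
	Kind   string // "direct", "mesh", or "new"
}

// DialOrder lists attempts in the order described above: existing sessions
// to delegated servers, then all other existing sessions (mesh forwarding),
// then new sessions to delegated servers that are not yet connected.
func DialOrder(existing, delegated []PK) []Attempt {
	isDelegated := make(map[PK]bool, len(delegated))
	for _, pk := range delegated {
		isDelegated[pk] = true
	}
	hasSession := make(map[PK]bool, len(existing))
	var direct, mesh, fresh []Attempt
	for _, pk := range existing {
		hasSession[pk] = true
		if isDelegated[pk] {
			direct = append(direct, Attempt{Server: pk, Kind: "direct"})
		} else {
			mesh = append(mesh, Attempt{Server: pk, Kind: "mesh"})
		}
	}
	for _, pk := range delegated {
		if !hasSession[pk] {
			fresh = append(fresh, Attempt{Server: pk, Kind: "new"})
		}
	}
	return append(append(direct, mesh...), fresh...)
}
```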
- Replace hardcoded 5s timeout in initClient/initServer with the HandshakeTimeout constant (20s). The 5s was too aggressive and inconsistent with the exported constant used elsewhere.
- Change DefaultMaxSessions from 100 to 2048 to match the actual production default in dmsgserver config.
- Use dmsg.DefaultMaxSessions in dmsgserver GenerateDefaultConfig instead of a hardcoded 2048, ensuring a single source of truth.
bytedance/sonic/loader v0.5.0 -> v0.5.1
gin-contrib/sse v1.1.0 -> v1.1.1
Add targets ported from skywire's Makefile:
- update-dep: go get -u, tidy, vendor, auto-commit
- update-skywire: update skywire dep to latest develop
- update-skycoin: update skycoin dep to latest develop
- push-deps: commit and push vendor changes
- sync-upstream-develop: sync fork's develop with upstream
- tidy: standalone go mod tidy
- format now depends on tidy (like skywire)
- dep now depends on tidy
- Implement SOCKS5 whitelist enforcement: connections from PKs not in the --wl list are now rejected (was a no-op despite accepting the flag)
- Add waitgroup to Client for clean goroutine shutdown on Close()
- Remove kill.go force-exit workaround: all commands now use cmdutil.SignalContext for proper signal handling
- Document why timestamp tracking passes 0: concurrent streams from the same client can arrive out of order, and noise nonce tracking already prevents replay at the session level
- Remove resolved TODO on pty_client.go error choice
- TestControl_Ping: use require.NoError for fail-fast, close controls in correct order (responder first) to avoid EOF race on pipe cleanup
- TestHTTPTransport_RoundTrip: use graceful srv.Shutdown() instead of raw lis.Close() to let in-flight HTTP requests finish before closing, preventing race between handler goroutines and listener teardown
peerPKs was read in isPeerPK (from handleSession goroutines) and written in discoverAndConnectPeers without synchronization. Protect both accesses with peerSessionsMx.
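The shape of this fix can be sketched as below. The field and method names peerPKs, peerSessionsMx, and isPeerPK come from the description; the map type, the choice of sync.RWMutex, and the helpers `NewServer` and `addPeerPK` are assumptions made for illustration.

```go
package main

import "sync"

// PK is a hypothetical stand-in for a dmsg public key.
type PK string

// Server holds only the fields relevant to the race. The concrete types
// are assumed; the actual mutex may be a plain sync.Mutex.
type Server struct {
	peerSessionsMx sync.RWMutex
	peerPKs        map[PK]struct{}
}

// NewServer returns a Server with an initialized peer set.
func NewServer() *Server {
	return &Server{peerPKs: make(map[PK]struct{})}
}

// addPeerPK models the write done while discovering and connecting peers.
func (s *Server) addPeerPK(pk PK) {
	s.peerSessionsMx.Lock()
	defer s.peerSessionsMx.Unlock()
	s.peerPKs[pk] = struct{}{}
}

// isPeerPK models the read done from session-handling goroutines.
func (s *Server) isPeerPK(pk PK) bool {
	s.peerSessionsMx.RLock()
	defer s.peerSessionsMx.RUnlock()
	_, ok := s.peerPKs[pk]
	return ok
}
```

Without the lock, a concurrent map read and write like this is a data race in Go and can crash the process at runtime; running the tests with the race detector enabled is the usual way to catch it.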