
Feature/server server mesh #356

Merged
0pcom merged 11 commits into skycoin:develop from 0pcom:feature/server-server-mesh
Mar 30, 2026


0pcom commented Mar 30, 2026

No description provided.

0pcom added 11 commits March 30, 2026 08:02
Enable clients connected to different dmsg servers to communicate by
having servers peer with each other. This removes the scaling limitation
where clients must be on the same server to reach each other.

Design:
- Servers peer as clients to each other using existing session mechanism
  (TCP + noise XK handshake + yamux), requiring no new transport code
- Peers configured via static config (no discovery dependency)
- When a server can't find destination client locally, it tries
  forwarding through peer server sessions
- 1-hop maximum: peer servers only check local sessions, no further
  forwarding (prevents loops without TTL)
- Original SignedObject forwarded as-is (client signature preserved)
- Backward compatible: no wire protocol changes, existing clients
  work unchanged
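The forwarding rule in the design list can be sketched as a toy model. This is not the dmsg implementation: `Server`, `Request`, `Handle`, and the `fromPeer` flag are hypothetical names chosen for illustration, and sessions are reduced to in-memory maps. It only shows why "peers check local sessions only" prevents loops without a TTL.

```go
package main

import (
	"errors"
	"fmt"
)

// PK stands in for a dmsg public key (hypothetical simplification).
type PK string

// Request is a hypothetical stand-in for the signed dial request.
type Request struct {
	Src, Dst PK
}

// ErrNotFound is returned when no server within one hop knows the destination.
var ErrNotFound = errors.New("destination client not found")

// Server is a toy model: local client sessions plus outbound peer sessions.
type Server struct {
	name  string
	local map[PK]bool
	peers []*Server
}

// Handle resolves a request. fromPeer marks requests that arrived over a peer
// session; those are resolved locally only, which caps forwarding at one hop.
func (s *Server) Handle(req Request, fromPeer bool) (string, error) {
	if s.local[req.Dst] {
		return s.name, nil
	}
	if fromPeer {
		return "", ErrNotFound // never forward a forwarded request
	}
	for _, p := range s.peers {
		if via, err := p.Handle(req, true); err == nil {
			return via, nil
		}
	}
	return "", ErrNotFound
}

func main() {
	a := &Server{name: "A", local: map[PK]bool{"alice": true}}
	b := &Server{name: "B", local: map[PK]bool{"bob": true}}
	c := &Server{name: "C", local: map[PK]bool{"carol": true}}
	a.peers = []*Server{b}
	b.peers = []*Server{a, c} // A and B peer with each other; no loop results

	via, err := a.Handle(Request{Src: "alice", Dst: "bob"}, false)
	fmt.Println(via, err)

	_, err = a.Handle(Request{Src: "alice", Dst: "carol"}, false)
	fmt.Println(err)
}
```

In this model a client on A reaches a client on B, but not a client on C, because C is two hops away from A.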

Key changes:
- ServerConfig.Peers: static peer server list (PK + address)
- Server.peerSessions: outbound connections to peer servers
- Server.peerPKs: identifies incoming sessions as peer servers
- SessionCommon.isPeer: relaxes SrcAddr.PK check for forwarded requests
- ServerSession.forwardViaPeer: iterates peers on local lookup failure
- maintainPeerConnection: persistent connection with reconnect backoff

Config example:
  "peers": [{"public_key": "02abc...", "address": "1.2.3.4:8081"}]

Servers now automatically discover and peer with all other servers
registered in dmsg discovery, in addition to statically configured
peers. A background loop queries AllServers periodically and
establishes peer connections to any new servers found.

Static config peers take priority and are always connected.
Discovery-based peers are additive: they are found and connected
without requiring any config changes.

This means in the current deployment, all dmsg servers will
automatically mesh with each other as long as they share the same
dmsg discovery.
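One way to combine the two peer sources is sketched below. The `Peer` type and `mergePeers` function are hypothetical, and skipping the server's own key is an assumption (the commit does not say how self-entries from discovery are handled). Static entries are processed first, so their address wins when the same key also appears in discovery.

```go
package main

import "fmt"

// Peer is a hypothetical (public key, address) pair.
type Peer struct {
	PK   string
	Addr string
}

// mergePeers returns static peers followed by any discovered peers that are
// not already present. The server's own key is skipped (assumption).
func mergePeers(self string, static, discovered []Peer) []Peer {
	seen := map[string]bool{self: true}
	var out []Peer
	for _, list := range [][]Peer{static, discovered} {
		for _, p := range list {
			if seen[p.PK] {
				continue
			}
			seen[p.PK] = true
			out = append(out, p)
		}
	}
	return out
}

func main() {
	static := []Peer{{PK: "pk-b", Addr: "10.0.0.2:8081"}}
	discovered := []Peer{
		{PK: "pk-a", Addr: "10.0.0.1:8081"}, // this server itself
		{PK: "pk-b", Addr: "192.0.2.2:8081"}, // duplicate of a static peer
		{PK: "pk-c", Addr: "10.0.0.3:8081"}, // new peer found via discovery
	}
	for _, p := range mergePeers("pk-a", static, discovered) {
		fmt.Println(p.PK, p.Addr)
	}
}
```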

DialStream now falls back to trying all existing sessions when the
target's delegated servers are unreachable. If the client's server is
meshed with the target's server, the request is forwarded through the
peer connection transparently.

The e2e test verifies: two servers peered via static config, each with
one isolated client (separate filtered discovery), cross-server dial
succeeds with bidirectional 1KB data transfer through the mesh.

Reorder DialStream to try mesh forwarding through existing sessions
before attempting to establish new server connections. The new order:

1. Existing sessions matching target's delegated servers (direct, free)
2. All other existing sessions via mesh (free, already connected)
3. New sessions to delegated servers (expensive, last resort)

This avoids unnecessary TCP+noise+yamux handshakes when the client
is already connected to meshed servers that can forward the request.
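The three-step order above can be expressed as a pure function over server identifiers. This is a sketch only: `Attempt` and `dialOrder` are hypothetical names, and the real DialStream works with sessions and discovery entries rather than strings.

```go
package main

import "fmt"

// Attempt names a server to try and how it would be reached.
type Attempt struct {
	Server string
	Kind   string // "direct", "mesh", or "new"
}

// dialOrder lists the attempts in the order described above: existing sessions
// to delegated servers, then all other existing sessions (mesh forwarding),
// then delegated servers that would need a new session.
func dialOrder(existing, delegated []string) []Attempt {
	isDelegated := make(map[string]bool, len(delegated))
	for _, d := range delegated {
		isDelegated[d] = true
	}
	hasSession := make(map[string]bool, len(existing))
	var direct, mesh, fresh []Attempt
	for _, s := range existing {
		hasSession[s] = true
		if isDelegated[s] {
			direct = append(direct, Attempt{s, "direct"})
		} else {
			mesh = append(mesh, Attempt{s, "mesh"})
		}
	}
	for _, d := range delegated {
		if !hasSession[d] {
			fresh = append(fresh, Attempt{d, "new"})
		}
	}
	return append(append(direct, mesh...), fresh...)
}

func main() {
	existing := []string{"s1", "s2", "s3"}
	delegated := []string{"s2", "s4"}
	for _, a := range dialOrder(existing, delegated) {
		fmt.Println(a.Kind, a.Server)
	}
}
```

With sessions to s1, s2, and s3 and a target delegated to s2 and s4, the order is s2 (direct), s1 and s3 (mesh), then s4 (new session).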

- Replace hardcoded 5s timeout in initClient/initServer with the
  HandshakeTimeout constant (20s). The 5s was too aggressive and
  inconsistent with the exported constant used elsewhere.
- Change DefaultMaxSessions from 100 to 2048 to match the actual
  production default in dmsgserver config.
- Use dmsg.DefaultMaxSessions in dmsgserver GenerateDefaultConfig
  instead of a hardcoded 2048, ensuring a single source of truth.

bytedance/sonic/loader v0.5.0 -> v0.5.1
gin-contrib/sse v1.1.0 -> v1.1.1

Add targets ported from skywire's Makefile:
- update-dep: go get -u, tidy, vendor, auto-commit
- update-skywire: update skywire dep to latest develop
- update-skycoin: update skycoin dep to latest develop
- push-deps: commit and push vendor changes
- sync-upstream-develop: sync fork's develop with upstream
- tidy: standalone go mod tidy
- format now depends on tidy (like skywire)
- dep now depends on tidy

- Implement SOCKS5 whitelist enforcement: connections from PKs not in
  the --wl list are now rejected (was a no-op despite accepting the flag)
- Add waitgroup to Client for clean goroutine shutdown on Close()
- Remove kill.go force-exit workaround: all commands now use
  cmdutil.SignalContext for proper signal handling
- Document why timestamp tracking passes 0: concurrent streams from the
  same client can arrive out of order, and noise nonce tracking already
  prevents replay at the session level
- Remove resolved TODO on pty_client.go error choice
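The whitelist check from the first item can be sketched as a set lookup. `Whitelist`, `NewWhitelist`, and `Allows` are hypothetical names, and treating an empty whitelist as "allow everyone" is an assumption; the commit only states that PKs outside the `--wl` list are rejected.

```go
package main

import "fmt"

// Whitelist is a set of allowed public keys (hypothetical type).
type Whitelist map[string]struct{}

// NewWhitelist builds a whitelist from the given keys.
func NewWhitelist(pks ...string) Whitelist {
	w := make(Whitelist, len(pks))
	for _, pk := range pks {
		w[pk] = struct{}{}
	}
	return w
}

// Allows reports whether a connection from pk should be accepted.
// Assumption: an empty whitelist disables filtering.
func (w Whitelist) Allows(pk string) bool {
	if len(w) == 0 {
		return true
	}
	_, ok := w[pk]
	return ok
}

func main() {
	wl := NewWhitelist("pk-alice", "pk-bob")
	for _, pk := range []string{"pk-alice", "pk-mallory"} {
		if wl.Allows(pk) {
			fmt.Println("accept", pk)
		} else {
			fmt.Println("reject", pk)
		}
	}
}
```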

- TestControl_Ping: use require.NoError for fail-fast, close controls
  in correct order (responder first) to avoid EOF race on pipe cleanup
- TestHTTPTransport_RoundTrip: use graceful srv.Shutdown() instead of
  raw lis.Close() to let in-flight HTTP requests finish before closing,
  preventing race between handler goroutines and listener teardown
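The difference the second item relies on is that `http.Server.Shutdown` stops accepting new connections and then waits for active ones to go idle, whereas closing the listener directly does not wait for handlers. A minimal standalone sketch using only the standard library follows; the `roundTrip` helper and the plain `net/http` setup are illustrative and are not the dmsg test code.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// roundTrip starts a server on a loopback port, performs one request, and
// shuts the server down gracefully.
func roundTrip() (string, error) {
	lis, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		io.WriteString(w, "pong")
	})}
	defer srv.Close() // safety net for early returns; harmless after Shutdown

	done := make(chan error, 1)
	go func() { done <- srv.Serve(lis) }()

	resp, err := http.Get("http://" + lis.Addr().String())
	if err != nil {
		return "", err
	}
	body, err := io.ReadAll(resp.Body)
	resp.Body.Close()
	if err != nil {
		return "", err
	}

	// Shutdown closes the listener and waits for in-flight requests to finish.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return "", err
	}
	// Serve returns http.ErrServerClosed after a graceful shutdown.
	if err := <-done; err != http.ErrServerClosed {
		return "", err
	}
	return string(body), nil
}

func main() {
	body, err := roundTrip()
	fmt.Println(body, err)
}
```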

peerPKs was read in isPeerPK (from handleSession goroutines) and
written in discoverAndConnectPeers without synchronization. Protect
both accesses with peerSessionsMx.

0pcom merged commit 2efdaf6 into skycoin:develop Mar 30, 2026
3 checks passed
0pcom deleted the feature/server-server-mesh branch March 30, 2026 16:04