Reverse Tunnel Architecture
This is a continuation of our deep dive into reverse tunnel architecture. In Part 2, we explored how downstream Envoy instances initiate reverse tunnels using custom listeners, socket interfaces, and IOHandles to establish outbound connections. Now, we shift our focus to the upstream side.
Part 3: Accepting Reverse Connections
Let's examine how the upstream (responder) Envoy accepts these tunnels, validates them, and manages them for future use. The components we'll discuss here play a critical role not just in accepting tunnels, but also in the entire lifecycle of data requests that flow through them (which we'll explore in detail in the next blog).
Intercepting the Handshake: The Reverse Tunnel Network Filter
When a downstream Envoy initiates a reverse tunnel, it makes a TCP connection to the upstream Envoy and sends an HTTP handshake request. But how does the upstream Envoy know this is a reverse tunnel handshake rather than a normal HTTP request? This is where the reverse tunnel network filter comes in.
The reverse tunnel network filter (`ReverseTunnelFilter`) is configured on specific listeners on the upstream Envoy—typically listeners dedicated to accepting reverse tunnel connections. When a new TCP connection arrives, this filter intercepts it before any application-level processing happens. It operates at the network filter level, sitting below the HTTP connection manager in Envoy's filter chain, which gives it first access to the raw connection.
The filter examines the incoming HTTP/1.1 request and performs several critical validation steps:
Path and Method Validation: First, it checks whether the request matches the expected reverse tunnel handshake pattern. By default, it expects `GET /reverse_connections/request`, though both the path and the method are configurable. If the request doesn't match, the filter immediately responds with `404 Not Found` and closes the connection. This ensures that only properly formatted handshake requests are processed as reverse tunnels.
Identity Extraction: If the path validates, the filter extracts three identity headers from the request:
- `x-envoy-reverse-tunnel-node-id`: the unique node identifier
- `x-envoy-reverse-tunnel-cluster-id`: the cluster this node belongs to
- `x-envoy-reverse-tunnel-tenant-id`: the tenant identifier for multi-tenancy isolation
If any of these headers are missing, the filter rejects the handshake with `400 Bad Request`. These identifiers become the keys for routing data requests back through this tunnel later.
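The validation sequence above can be sketched as follows. This is an illustrative Python model, not Envoy's actual C++ implementation: the header and default path names come from the text, but the function name and return shape are invented for readability.

```python
# Sketch of the reverse tunnel handshake validation described above.
# The header names and the default method/path follow the text; everything
# else (function name, return convention) is hypothetical.

REQUIRED_HEADERS = (
    "x-envoy-reverse-tunnel-node-id",
    "x-envoy-reverse-tunnel-cluster-id",
    "x-envoy-reverse-tunnel-tenant-id",
)

def validate_handshake(method, path, headers,
                       expected_method="GET",
                       expected_path="/reverse_connections/request"):
    """Return (http_status, identity); identity is None on rejection."""
    # Wrong path or method: not a reverse tunnel handshake at all.
    if method != expected_method or path != expected_path:
        return 404, None
    # All three identity headers must be present.
    if any(h not in headers for h in REQUIRED_HEADERS):
        return 400, None
    identity = {h: headers[h] for h in REQUIRED_HEADERS}
    return 200, identity
```

The returned identity dictionary corresponds to the keys the socket manager later uses to index the tunnel.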
Flexible Authorization: The filter supports pluggable authorization through Envoy's substitution formatter system. Think of it like a bouncer checking IDs at a club—the reverse tunnel filter checks if the identity claimed in the handshake matches what we expect.
Here's how it works: when a handshake arrives, it contains headers claiming "I am node_123 from cluster_west." But should we trust this? The validator lets us verify these claims against trusted sources.
Example Use Case: Allowlist Validation
Imagine you run a cloud service that only wants to accept tunnels from pre-approved on-premises clusters. You maintain an allowlist in a database or configuration system. An earlier filter (like ext_authz or ext_proc) queries your allowlist and stores the expected node ID in dynamic metadata. The reverse tunnel filter then validates:
When a connection claiming to be "node_123" arrives, the filter checks: "Does the expected node ID from my allowlist match what this connection claims?" If yes, accept with `200 OK`. If no, reject with `403 Forbidden`.
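That check can be sketched in a few lines, assuming an earlier filter has already written the expected node ID into dynamic metadata. Note that the metadata namespace and key names below are hypothetical stand-ins, not the actual configuration keys:

```python
# Hypothetical sketch of the allowlist check. An earlier filter (ext_authz,
# ext_proc, ...) is assumed to have stored the expected node ID under a
# dynamic metadata key; the names "reverse_tunnel" and "expected_node_id"
# are invented for illustration.

def authorize(claimed_node_id, dynamic_metadata):
    """Return the HTTP status the filter would respond with."""
    expected = dynamic_metadata.get("reverse_tunnel", {}).get("expected_node_id")
    if expected is None:
        return 403  # no allowlist entry was recorded for this peer
    return 200 if claimed_node_id == expected else 403
```

The geography-based variant in the next example works the same way, comparing the claimed cluster ID against an expected value instead of the node ID.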
Another Use Case: Geography-Based Routing
Suppose you route connections based on geographic regions. An earlier filter determines "this IP address should only connect as nodes from the us-west region" and stores the expected cluster ID in dynamic metadata, which the reverse tunnel filter then checks against the claimed cluster ID.
This prevents a compromised node in one region from impersonating a node in another region.
Socket Handoff: Once validation passes, the filter duplicates the connection's underlying socket file descriptor and hands it off to the thread-local socket manager along with the identity metadata. Why duplicate? Because when the filter chain completes and the original connection goes out of scope, it will close its socket. The duplicated FD allows the socket manager to maintain an independent connection on the same underlying socket for future data requests. The filter then responds with 200 OK to complete the handshake, and both sides are ready to use the tunnel for data traffic.
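The reason duplication works can be demonstrated in a few lines: after a `dup()`, closing the original file descriptor does not close the underlying kernel object, so the duplicate remains fully usable. A pipe stands in for the accepted socket here; Envoy performs the equivalent duplication in C++ at the socket layer.

```python
import os

# Minimal demonstration of why the filter duplicates the FD: the duplicate
# keeps the underlying object alive after the original descriptor closes.
r, w = os.pipe()          # stand-in for the accepted connection's socket
dup_r = os.dup(r)         # the copy handed to the socket manager
os.close(r)               # original connection goes out of scope and closes

os.write(w, b"ping")
data = os.read(dup_r, 4)  # still readable through the duplicate

os.close(dup_r)
os.close(w)
```

Without the `dup()`, the read would fail with a bad file descriptor once the filter chain tears down the original connection.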
The Upstream Socket Manager
The socket manager is the heart of reverse tunnel connection lifecycle management on the upstream side. It's a sophisticated socket caching entity that maintains a registry of all established reverse tunnel connections, indexed by their identity metadata. When a data request needs to be sent to a downstream node, the socket manager is the component that finds an available tunnel connection and provides it.
Per-Worker Thread Architecture: Each Envoy worker thread has its own instance of the socket manager, registered through thread-local storage. This design is crucial for performance: it avoids cross-thread synchronization for the common case of retrieving a connection socket. When a request comes in on worker thread 3, it can immediately access that thread's socket manager without any locks or coordination.
But this raises an interesting challenge: when a new reverse tunnel connection arrives, which worker thread should handle it? The connection might be accepted on worker thread 1, but if all existing tunnels for that node are on worker thread 3, we're creating an imbalance. The socket manager solves this with intelligent load balancing.
Load Balancing Across Workers: When the reverse tunnel filter passes a socket to the socket manager, the socket manager checks whether the current worker is the least-loaded worker for that particular node ID. It maintains a global map (protected by a mutex) tracking how many connections each worker holds for each node. If another worker has fewer connections for this node, the socket manager hands the socket off to that worker.
This creates an even distribution of reverse tunnels across worker threads, ensuring that no single thread becomes a bottleneck for connections to a specific downstream node.
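The least-loaded-worker decision can be modeled as below. This is a simplified sketch: the class and method names are invented, the tie-breaking rule is arbitrary, and the real implementation additionally has to post the socket to the chosen worker's event dispatcher.

```python
from collections import defaultdict
from threading import Lock

# Illustrative model of the per-node load-balancing decision: a global,
# mutex-protected map of connection counts per (node, worker), as the
# text describes. Names are hypothetical.
class TunnelBalancer:
    def __init__(self, num_workers):
        self.lock = Lock()
        self.num_workers = num_workers
        # conn_counts[node_id][worker_index] -> tunnels held for that node
        self.conn_counts = defaultdict(lambda: defaultdict(int))

    def pick_worker(self, node_id):
        """Return the index of the worker with the fewest tunnels for node_id."""
        with self.lock:
            counts = self.conn_counts[node_id]
            worker = min(range(self.num_workers), key=lambda w: counts[w])
            counts[worker] += 1
            return worker
```

With two workers, four successive tunnels from the same node would land on workers 0, 1, 0, 1, keeping the distribution even.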
Building the Cluster Topology Map: The upstream Envoy has no prior knowledge of the downstream cluster topology. It doesn't know which nodes belong to which clusters, or how many nodes exist in any given cluster. The socket manager discovers this information organically as reverse tunnels are established.
When each handshake completes, the socket manager receives three pieces of information: the node ID, the cluster ID, and the socket itself. From this, it incrementally builds a map of the downstream topology. The first time it sees node "node-123" claiming to be part of cluster "on-prem-cluster", it creates an entry mapping that cluster to that node. When node "node-456" also connects with cluster "on-prem-cluster", the socket manager adds it to the same cluster's node list. Over time, as all nodes in a cluster establish their tunnels, the socket manager builds a complete picture of which nodes belong to which clusters.
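Structurally, this incremental discovery amounts to a cluster-to-nodes multimap maintained from handshake and disconnect events. A rough sketch (class and method names invented for illustration):

```python
from collections import defaultdict

# Sketch of the discovery-based topology map: nothing is preconfigured;
# entries appear as tunnels are established and vanish as they drop.
class Topology:
    def __init__(self):
        self.cluster_to_nodes = defaultdict(set)

    def on_handshake(self, node_id, cluster_id):
        # First node for a cluster creates the entry; later nodes join it.
        self.cluster_to_nodes[cluster_id].add(node_id)

    def on_disconnect(self, node_id, cluster_id):
        self.cluster_to_nodes[cluster_id].discard(node_id)
        if not self.cluster_to_nodes[cluster_id]:
            del self.cluster_to_nodes[cluster_id]  # cluster has no live tunnels
```

Because entries are derived purely from live tunnels, the map converges to the real topology without any coordination with the downstream side.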
This discovery-based approach is powerful because it requires no configuration synchronization between upstream and downstream. Downstream clusters can scale up (new nodes appear), scale down (nodes disappear), or reorganize (nodes move between clusters) without any coordination with the upstream. The upstream's view of the topology is always eventually consistent with reality, derived purely from which tunnels are currently connected.
Idle Connections and Keepalive Monitoring: Once a reverse tunnel socket lands in the socket manager, it enters the "idle" state—ready to serve requests but not currently handling any traffic. While idle, these connections are fully owned by the reverse tunnel implementation, and the socket manager takes responsibility for ensuring they stay healthy.
The socket manager implements a continuous keepalive protocol on all idle connections. Every few seconds, it sends a small ping message down each idle socket. The downstream Envoy, which is monitoring these connections, immediately responds with a pong message. When the socket manager receives the pong, it knows the tunnel is still alive and the network path is functional.
But what if a pong doesn't arrive? Perhaps network packets were dropped, or the downstream Envoy crashed. The socket manager tracks missed pongs for each connection. If several consecutive pings go unanswered, the socket manager marks that connection as dead, removes it from the available pool, and updates all its internal mappings. This automatic failure detection ensures that data requests are never routed to broken tunnels—the socket manager's view of available connections is always accurate.
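The missed-pong bookkeeping reduces to a per-connection counter: increment on every ping sent, reset on every pong received, declare the connection dead once the counter reaches a threshold. The threshold of three consecutive misses below is illustrative, not the actual value:

```python
# Sketch of the keepalive liveness tracking described above. The threshold
# is an assumed value for illustration.
MAX_MISSED_PONGS = 3

class KeepaliveTracker:
    def __init__(self):
        self.missed = {}  # connection id -> consecutive unanswered pings

    def on_ping_sent(self, conn_id):
        """Record a ping; return False once the connection should be marked dead."""
        self.missed[conn_id] = self.missed.get(conn_id, 0) + 1
        return self.missed[conn_id] < MAX_MISSED_PONGS

    def on_pong_received(self, conn_id):
        # A pong proves the round trip works; reset the counter.
        self.missed[conn_id] = 0
```

When `on_ping_sent` returns False, the socket manager would remove the connection from the available pool and update its node and cluster mappings.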
The Transition to Used State: When the upstream Envoy needs to send a request to a downstream node (we'll explore this flow in detail in Part 4), it asks the socket manager for an available socket to that node. The socket manager hands over one of the idle connections from its pool, and at that moment, ownership transfers.
The socket is now in the "used" state—it's owned by Envoy's standard filter chain processing the data request, not by the reverse tunnel implementation. The socket manager stops sending keepalive pings on it. The filter chain takes over, handling the HTTP/2 stream for the data request, managing flow control, processing the response, and eventually closing the stream when the request completes.
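The idle-to-used transition is essentially a pool hand-off: removing a socket from the idle pool takes it out of keepalive monitoring and transfers ownership to the caller. A minimal sketch, with invented names:

```python
# Sketch of the ownership hand-off from the socket manager's idle pool to
# the filter chain. Once popped, the socket is "used": the pool no longer
# tracks or keepalive-pings it.
class IdlePool:
    def __init__(self):
        self.idle = {}  # node_id -> list of idle sockets

    def add(self, node_id, sock):
        self.idle.setdefault(node_id, []).append(sock)

    def take(self, node_id):
        socks = self.idle.get(node_id)
        if not socks:
            return None    # no tunnel currently available for this node
        return socks.pop() # ownership transfers to the caller
```

Used sockets never return to this pool in the same state; re-establishment is the downstream initiator's job, as covered in Part 2.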
To be continued in Part 4: Life of a Data Request, where we'll trace how actual data requests flow through these established reverse tunnels.
©2026 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and all Nutanix product and service names mentioned are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. All other brand names mentioned are for identification purposes only and may be the trademarks of their respective holder(s). Code samples and snippets that appear in this content are unofficial, are unsupported, and are provided AS IS. Nutanix makes no representations or warranties of any kind, express or implied, as to the operation or content of the code samples, snippets and/or methods. Nutanix expressly disclaims all other guarantees, warranties, conditions and representations of any kind, either express or implied, and whether arising under any statute, law, commercial use or otherwise, including implied warranties of merchantability, fitness for a particular purpose, title and non-infringement therein.