Reverse Tunnel Architecture

In Part 1, we explored why Nutanix needed a custom solution for cross-cluster communication and why we chose to build reverse connections into Envoy. Now, let's dive into how it actually works.

The Big Picture

At its core, the reverse tunnel feature inverts the traditional client-server connection model to solve a fundamental networking challenge: how do you reach services that are behind NATs, firewalls, or in private networks without exposing them to the internet?

The solution works in three major phases. First, downstream Envoy instances (typically on-premises) proactively establish persistent TCP connections to upstream Envoy instances (typically in the cloud). These aren't ordinary connections—they carry metadata identifying the source node, cluster, and tenant, and they're established through a special handshake protocol. Once the handshake succeeds, these connections are cached on the upstream side, creating a pool of ready-to-use tunnels mapped to specific downstream identities.

Second, when a service behind the upstream Envoy needs to communicate with a downstream service, it sends a request with headers identifying the target node or cluster. The upstream Envoy's reverse connection cluster looks up a cached tunnel connection for that identity and reuses it—no new connection establishment needed, no handshake overhead, just pure data transfer over an already-established HTTP connection.

Third, these tunnels aren't fire-and-forget. They're monitored continuously with keepalive pings and automatically reconnected if they fail. The downstream Envoy maintains the target connection count for each upstream cluster, while the upstream Envoy tracks which downstream nodes are reachable and handles connection lifecycle events.

The beauty of this design is that it's completely transparent to the services on either end. A service in the cloud making a request to an on-premises API doesn't know or care that it's using a reverse tunnel—it just sees a successful HTTP response. Similarly, the on-premises service handling the request doesn't know it came through a tunnel—it looks like any other request. All the complexity is handled by Envoy's reverse tunnel architecture.

Acknowledgements

While Nutanix has been running a version of reverse tunnels in production for over two years, the open source implementation described in this series represents a significant architectural evolution. The design was developed and implemented through close collaboration between Nutanix and Databricks, with guidance from Envoy open source maintainers. Our thanks to all of the many talented people who were involved in creating a production-ready solution that serves the broader community.

Part 2: Reverse Tunnel Initiation

Let's dive deeper into how these tunnels actually get established. This is where the design becomes interesting, because we needed to fit a completely novel connection pattern into Envoy's existing architecture without requiring invasive core changes.

Using Listeners to Initiate Reverse Tunnels

We use Envoy listeners to trigger reverse tunnel initiation. This might seem counterintuitive—listeners are traditionally passive entities that bind to ports and wait for incoming connections. Why use them for outbound reverse tunnels?

Listeners in Envoy are configuration objects that can be added, modified, and removed dynamically through the Listener Discovery Service (LDS). This dynamic configurability is exactly what we needed for reverse tunnels. In a production environment, you don't want to statically configure every downstream cluster you might need to connect to—you want to add and remove reverse tunnels on demand as clusters come online or are decommissioned.

By representing each reverse tunnel setup as a listener, we get all of Envoy's existing operational benefits. Want to add tunnels to a new upstream cluster? Just push an LDS update with a new listener configuration. Need to gracefully tear down tunnels during maintenance? Use Envoy's standard listener draining mechanism. This approach integrates reverse tunnels seamlessly into existing Envoy operational patterns.

Additionally, listeners provide a natural place to specify configuration: which upstream clusters to connect to, how many connections per cluster, and identity metadata like node ID, cluster ID, and tenant ID. The listener's address field becomes the carrier for this metadata.


- name: reverse_conn_listener
  listener_filters_timeout: 0s
  listener_filters: []
  address:
    socket_address:
      # Encodes src_node_id, src_cluster_id, src_tenant_id
      # and remote cluster cloud_cluster with 1 connection
      address: "rc://on-prem-node:on-prem-cluster:on-prem-tenant@cloud_cluster:1"
      port_value: 0
      resolver_name: "envoy.resolvers.reverse_connection"
  filter_chains:
  - filters: ....

The Custom Socket Interface: DownstreamReverseSocketInterface

Traditional listeners bind to a socket, listen for incoming connections, and accept them. But our reverse tunnel listeners need to do the opposite—create outbound connections and manage them.

Envoy's architecture provides a solution through its socket interface abstraction. Every socket creation in Envoy goes through a socket interface, which by default is the standard `SocketInterfaceImpl` that wraps normal Berkeley socket APIs. But socket interfaces in Envoy are pluggable—you can register custom implementations that provide alternative socket semantics.

We created the `DownstreamReverseSocketInterface`, a custom socket interface implementation that intercepts socket creation for reverse tunnel listeners. When the listener manager tries to create a "listening" socket for our reverse tunnel listener, it gets redirected to our custom interface, which returns a completely different kind of socket—one that initiates outbound connections instead of listening for inbound ones.

The listener manager knows to use our custom socket interface through Envoy's resolver system. We created a custom address resolver that recognizes a special `rc://` URL format in the listener's address configuration. When the resolver sees `rc://node_id:cluster_id:tenant_id@cloud_cluster:2`, it creates a `ReverseConnectionAddress` instance. This address instance includes a method called `socketInterface()` that returns our custom `DownstreamReverseSocketInterface`. During listener initialization, Envoy checks if the address specifies a custom socket interface, and if so, uses it for socket creation.
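To make the address format concrete, here's a minimal Python sketch of how an `rc://` address could be decomposed into its parts. The field names and the `parse_rc_address` helper are illustrative only; Envoy's actual resolver is implemented in C++ and registered through its extension system:

```python
from dataclasses import dataclass

@dataclass
class ReverseConnectionAddress:
    """Parsed form of rc://node_id:cluster_id:tenant_id@cluster:count."""
    node_id: str
    cluster_id: str
    tenant_id: str
    remote_cluster: str
    connection_count: int

def parse_rc_address(url: str) -> ReverseConnectionAddress:
    if not url.startswith("rc://"):
        raise ValueError("not an rc:// address")
    # Split identity (before @) from the target cluster and count (after @).
    identity, _, target = url[len("rc://"):].partition("@")
    node_id, cluster_id, tenant_id = identity.split(":")
    remote_cluster, _, count = target.rpartition(":")
    return ReverseConnectionAddress(node_id, cluster_id, tenant_id,
                                    remote_cluster, int(count))

addr = parse_rc_address("rc://node_id:cluster_id:tenant_id@cloud_cluster:2")
print(addr.remote_cluster, addr.connection_count)  # cloud_cluster 2
```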

Bootstrap Extension Registration

The `DownstreamReverseSocketInterface` is registered as a bootstrap extension rather than a regular extension, and this distinction is important. Bootstrap extensions are initialized during Envoy's startup sequence, before any listeners or clusters are created. This early initialization is critical for reverse tunnels because the socket interface needs to be available before any reverse tunnel listeners are processed.

Bootstrap extensions also have access to Envoy's core infrastructure—the dispatcher, cluster manager, stats scope, and thread-local storage. The reverse tunnel implementation needs all of these to function. It needs the cluster manager to resolve upstream cluster names to actual endpoints. It needs the dispatcher to schedule reconnection attempts and keepalive timers. It needs the stats scope to publish connection metrics. And it needs thread-local storage to maintain per-worker-thread state for connection management.

Additionally, bootstrap extensions can maintain global state across the entire Envoy instance. The `DownstreamReverseSocketInterface` maintains statistics for all active reverse tunnel connections, indexed by worker thread. This global view is essential for operations like connection draining during shutdown or reporting aggregate statistics across all tunnels.
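As an illustration of that global bookkeeping, here is a small Python sketch of a registry tracking tunnels per worker thread. The class and method names are hypothetical; the real implementation lives in C++ and uses Envoy's thread-local storage:

```python
from collections import defaultdict
from threading import Lock

class ReverseTunnelRegistry:
    """Tracks active tunnel connections, indexed by worker thread."""
    def __init__(self):
        self._lock = Lock()
        self._by_worker = defaultdict(set)  # worker_id -> set of connection ids

    def add(self, worker_id, conn_id):
        with self._lock:
            self._by_worker[worker_id].add(conn_id)

    def remove(self, worker_id, conn_id):
        with self._lock:
            self._by_worker[worker_id].discard(conn_id)

    def total_connections(self):
        # The aggregate view used for stats reporting and draining decisions.
        with self._lock:
            return sum(len(conns) for conns in self._by_worker.values())

registry = ReverseTunnelRegistry()
registry.add("worker_0", "conn_a")
registry.add("worker_1", "conn_b")
print(registry.total_connections())  # 2
```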

Custom IOHandle: The Heart of Reverse Tunnel Creation

When our custom socket interface creates a "socket," it doesn't create a traditional socket at all—it creates a `ReverseConnectionIOHandle`, a custom I/O handle that kicks off the reverse tunnel workflow.

Traditional sockets in Envoy wrap a file descriptor and implement standard operations like `read()`, `write()`, `close()`, and `listen()`. The `ReverseConnectionIOHandle` implements the same interface, but with completely different semantics.

Initiating Reverse Tunnels

Figure 1: Reverse Tunnel Handshake

When the listener starts up and Envoy calls `initializeFileEvent()` on the IOHandle (which normally signals that a socket should start accepting connections), our custom handle takes a completely different path. Instead of setting up a listening socket, it kicks off the reverse tunnel establishment process.

The handle examines its configuration to determine which upstream clusters it needs to connect to. Remember that special `rc://` address we parsed earlier? It contained all the parameters we need:


rc://on-prem-node:on-prem-cluster:on-prem-tenant@cloud_cluster:3

From this, the handle knows:

  • Identity metadata: It represents on-prem-node in on-prem-cluster for on-prem-tenant
  • Target cluster: It needs to connect to cloud_cluster
  • Connection count: It must maintain 3 TCP connections

The handle starts by resolving the cluster name through Envoy's cluster manager to get actual endpoint addresses. For each endpoint in the upstream cluster, it initiates the configured number of TCP connections. This is where the three-way handshake protocol comes into play.

For each connection, the handle:

  1. Obtains a TCP connection to the upstream host through Envoy's standard connection pooling mechanism. This ensures we get all the benefits of Envoy's load balancing, health checking, and TLS termination.
  2. Sends an HTTP handshake request over the connection containing the identity metadata:

  GET /reverse_connections/request HTTP/1.1
  Host: upstream_envoy.example.com
  x-envoy-reverse-tunnel-node-id: on-prem-node
  x-envoy-reverse-tunnel-cluster-id: on-prem-cluster
  x-envoy-reverse-tunnel-tenant-id: on-prem-tenant

This handshake tells the upstream Envoy: "I am on-prem-node, part of on-prem-cluster, belonging to on-prem-tenant, and I'm offering you this connection to use for sending requests back to me."

  3. Waits for the upstream response. The upstream Envoy's reverse tunnel filter validates the handshake and responds with:
    • HTTP 200 OK if it accepts the tunnel
    • HTTP 403 Forbidden if validation fails (wrong node ID, cluster ID, etc.)
    • HTTP 404 Not Found if the path doesn't match the expected handshake endpoint
  4. Processes the result. If accepted, the connection is now a valid reverse tunnel. If rejected, the handle closes the connection and may retry with exponential backoff, depending on the error.

This entire handshake establishes trust and identity. The upstream Envoy now knows exactly who this tunnel connects to and can route data requests accordingly. If TLS is configured (which it should be in production), the handshake also benefits from mutual authentication through client certificates.
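Putting the handshake steps above together, here is a hedged Python sketch of the request a downstream Envoy sends and how each response code maps to an outcome. The header names and path come from the example above; the outcome labels and retry policy shown are a simplification of the real state machine:

```python
HANDSHAKE_PATH = "/reverse_connections/request"

def build_handshake_request(host, node_id, cluster_id, tenant_id):
    """Render the handshake request the downstream Envoy sends."""
    return (
        f"GET {HANDSHAKE_PATH} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"x-envoy-reverse-tunnel-node-id: {node_id}\r\n"
        f"x-envoy-reverse-tunnel-cluster-id: {cluster_id}\r\n"
        f"x-envoy-reverse-tunnel-tenant-id: {tenant_id}\r\n"
        "\r\n"
    )

def handshake_result(status_code):
    """Map the upstream's response code to a tunnel outcome."""
    if status_code == 200:
        return "accepted"        # connection becomes a live tunnel
    if status_code == 403:
        return "rejected"        # identity validation failed
    if status_code == 404:
        return "wrong-endpoint"  # path mismatch; likely a config error
    return "retry"               # transient failure; retry with backoff

req = build_handshake_request("upstream_envoy.example.com",
                              "on-prem-node", "on-prem-cluster", "on-prem-tenant")
print(handshake_result(200))  # accepted
```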

The Trigger Pipe Mechanism

Once a handshake succeeds, the connection is queued and a Unix pipe signals Envoy's event system to call `accept()` on the IOHandle. The handle then duplicates the connection's file descriptor, wraps it in a `DownstreamReverseConnectionIOHandle`, and returns it to the listener. This mechanism allows the reverse tunnel connection to pass through Envoy's listener filter chain exactly like any other accepted connection, maintaining transparency and reusing all existing filter logic.
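The trigger-pipe pattern can be sketched in a few lines of Python: a one-byte pipe write wakes an event loop, which then hands the ready connection to the listener. This is a simulation of the pattern, not Envoy's actual C++ code:

```python
import os
import select

# A pipe stands in for the trigger mechanism: the handshake code writes one
# byte per ready tunnel; the event loop wakes on readability and "accepts" it.
read_fd, write_fd = os.pipe()
ready_tunnels = []

def on_handshake_complete(conn_id):
    ready_tunnels.append(conn_id)
    os.write(write_fd, b"\x01")  # wake the event loop

def event_loop_once():
    readable, _, _ = select.select([read_fd], [], [], 1.0)
    if read_fd in readable:
        os.read(read_fd, 1)          # consume the trigger byte
        return ready_tunnels.pop(0)  # hand the tunnel to the listener
    return None                      # timed out; nothing ready

on_handshake_complete("tunnel-42")
print(event_loop_once())  # tunnel-42
```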

Automatic Reconnection

The `DownstreamReverseConnectionIOHandle` that wraps each individual tunnel connection overrides the `close()` method to detect connection failures and trigger remediation.

When a tunnel connection closes—whether due to network failure, upstream restart, or graceful shutdown—the `DownstreamReverseConnectionIOHandle`'s `close()` method notifies the parent `ReverseConnectionIOHandle` of the closure.

The parent `ReverseConnectionIOHandle` runs a periodic maintenance timer that checks all configured clusters and ensures each has the required number of connections. When it discovers that a host is missing connections, it automatically initiates new handshakes to restore the target count. This creates a self-healing system where network failures are automatically remediated within seconds without manual intervention.
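One tick of that maintenance loop can be sketched as follows; the host names and the `initiate_handshake` callback are illustrative stand-ins for the real timer callback:

```python
def reconcile(target_count, active_connections, initiate_handshake):
    """One tick of the maintenance timer: top up missing tunnels per host."""
    started = []
    for host, conns in active_connections.items():
        deficit = target_count - len(conns)
        for _ in range(deficit):
            # In Envoy this would kick off a new handshake over the
            # connection pool; here we just record the attempt.
            started.append(initiate_handshake(host))
    return started

active = {"cloud-host-1": ["c1", "c2", "c3"],  # at target count
          "cloud-host-2": ["c4"]}              # two connections lost
new = reconcile(3, active, lambda host: f"handshake->{host}")
print(new)  # ['handshake->cloud-host-2', 'handshake->cloud-host-2']
```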

To be continued in Part 3: Reverse Tunnel Acceptance, where we'll explore how the upstream Envoy validates, accepts, and manages these established tunnels.


©2026 Nutanix, Inc. All rights reserved. Nutanix, the Nutanix logo and all Nutanix product and service names mentioned are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. All other brand names mentioned are for identification purposes only and may be the trademarks of their respective holder(s). Code samples and snippets that appear in this content are unofficial, are unsupported, and are provided AS IS. Nutanix makes no representations or warranties of any kind, express or implied, as to the operation or content of the code samples, snippets and/or methods. Nutanix expressly disclaims all other guarantees, warranties, conditions and representations of any kind, either express or implied, and whether arising under any statute, law, commercial use or otherwise, including implied warranties of merchantability, fitness for a particular purpose, title and non-infringement therein.