deep dive · the system stack

Where does TCP actually run?

The system stack is often called "lighter" than a userspace stack like gVisor or lwIP. That deserves scrutiny — you still read packets, NAT them, and open a socket per flow. So where's the win? Follow one packet.

"You NAT and write back to the TUN to a localhost service, open a socket per connection, and hit userspace for every read and write. You're still parsing TCP in userspace — does this actually save anything?"

— the fair question this page answers
the path

One packet, two boundary crossings.

A TCP connection through the system stack crosses from userspace into the kernel and back. Userspace touches headers; the kernel runs the connection.

userspace · your process
1. tun.Read()
A raw IP packet leaves the TUN and enters your process. There's no connection yet, just bytes.
2. processIPv4TCP → tcpNat.Lookup
Rewrite headers: the destination becomes sing-tun's local listener; the source port becomes a synthetic port that the NAT table maps back to the original flow. Recompute checksums. This is the only "TCP" userspace does: header NAT, not state.
3. tun.Write()
Write the rewritten packet straight back into the TUN.
↓ crosses into the kernel: the packet now looks like ordinary local traffic
kernel · the host TCP/IP stack
4. Host TCP/IP stack
The kernel runs the entire TCP state machine: handshake, sequence/ack, retransmission, reassembly, flow & congestion control (CUBIC/BBR), timers. None of this allocates in your process.
5. delivered to s.tcpListener
The kernel terminates the connection at sing-tun's local listener as a normal, fully-formed TCP stream.
↑ back to userspace via listener.Accept()
userspace · your process
6. acceptLoop → tcpNat.LookupBack()
Accept() returns a real net.Conn. Its remote port is the synthetic key; LookupBack recovers the original source and the true original destination.
7. handler.NewConnectionEx(conn, src, dst)
Your handler receives a clean, kernel-terminated socket carrying the real destination. Dial out and relay. Done.

The accept side, in the source

Condensed from stack_system.go — the listener, the NAT lookup, and the handoff to your handler:

stack_system.go (condensed)
func (s *System) acceptLoop(listener net.Listener) {
    for {
        conn, err := listener.Accept()        // kernel-terminated TCP
        if err != nil { return }

        connPort := M.SocksaddrFromNet(conn.RemoteAddr()).Port
        session := s.tcpNat.LookupBack(connPort)   // recover the original source and destination
        if session == nil {                        // no NAT entry for this port: drop it
            conn.Close()
            continue
        }

        // hand the real destination + ready socket to your code:
        go s.handler.NewConnectionEx(s.ctx, conn,
            M.SocksaddrFromNetIP(session.Source), M.SocksaddrFromNetIP(session.Destination), nil)
    }
}

// stack_system_nat.go — the NAT table is just two maps:
type TCPNat struct {
    addrMap map[netip.AddrPort]uint16   // original source addr:port → synthetic port
    portMap map[uint16]*TCPSession      // synthetic port → original source + destination
}
the ledger

Where the work happens.

The honest comparison. The system stack doesn't do less work — it moves the expensive work out of your process and into the kernel.

Concern | system | gVisor / lwIP
TUN read / write | userspace | userspace
IP/TCP header NAT | userspace (cheap) | none
TCP state machine (handshake · retransmit · reassembly · cwnd · buffers · timers) | kernel | userspace, per-conn
Sockets per connection | ~2 (accepted + outbound) | ~1 (outbound)
Packet ↔ kernel crossings | more (write-back + re-ingest) | fewer
Per-connection memory | kernel | your process heap

So the critique is half right: you do still parse packets and pay TUN I/O in userspace. But the system stack never runs the TCP state machine in userspace — it only rewrites headers. gVisor/lwIP run the whole stack in-process, per connection. That's the asymmetry the "lighter" claim is really about.

the catch

Why it can't run everywhere.

The write-back trick depends on the kernel routing that re-injected packet to sing-tun's local listener. Two environments break that assumption.

iOS includeAllNetworks

Under Apple's full-tunnel mode, the extension's own sockets — including the local listener — get pulled into the tunnel, so the re-injected packet loops instead of being delivered. sing-tun refuses system and mixed in this mode (sing-tun#25) and forces gVisor. With includeAllNetworks off, the system stack runs on iOS exactly like macOS — it isn't an iOS limitation, it's an includeAllNetworks one.

Android kernels below 5.10

Older Android kernels don't reliably support the NAT/redirect path the system stack needs. sing-box-derived clients read the kernel version at startup and fall back to gVisor below 5.10.
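
The two guards described above can be sketched as a single selection function. This is an illustration under stated assumptions, not the actual sing-tun or client code. `pickStack` and `kernelAtLeast` are hypothetical names, the release string is assumed to look like `uname -r` output, and an unparseable version falls back to gVisor as the conservative choice.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// kernelAtLeast reports whether a release string such as
// "5.4.210-android12-9" is at least major.minor.
func kernelAtLeast(release string, major, minor int) bool {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return false
	}
	maj, err := strconv.Atoi(parts[0])
	if err != nil {
		return false
	}
	// Keep only the leading digits of the minor field ("10-rc1" -> "10").
	end := 0
	for end < len(parts[1]) && parts[1][end] >= '0' && parts[1][end] <= '9' {
		end++
	}
	mnr, err := strconv.Atoi(parts[1][:end])
	if err != nil {
		return false
	}
	return maj > major || (maj == major && mnr >= minor)
}

// pickStack returns the stack to run given the requested one
// ("system", "mixed" or "gvisor") and the environment.
func pickStack(requested string, includeAllNetworks, android bool, release string) string {
	if requested == "gvisor" {
		return requested
	}
	if includeAllNetworks {
		return "gvisor" // the re-injected packet would loop into the tunnel
	}
	if android && !kernelAtLeast(release, 5, 10) {
		return "gvisor" // older kernels: write-back path not reliable
	}
	return requested
}

func main() {
	fmt.Println(pickStack("system", true, false, ""))
	fmt.Println(pickStack("system", false, true, "5.4.210-android12-9"))
	fmt.Println(pickStack("mixed", false, true, "5.15.41-android13-8"))
}
```

Both checks are environmental and have nothing to do with the stack's design. That is why the same binary can run the system stack on one device and fall back to gVisor on another.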

The verdict. The system stack doesn't save resources by doing less — you still pay TUN I/O, header NAT, and an extra socket per flow. It wins by doing different work: the costly part of TCP runs in the mature kernel stack, not your process. You trade file descriptors and packet-boundary crossings for kernel-grade TCP and a smaller heap. A throughput win on a server; a footprint win in a memory-capped mobile extension — which, ironically, is often exactly where includeAllNetworks won't let you use it.