MITM standalone worker protocol
This document defines the worker-facing protocol that sing-box uses for external MITM workers. It is intentionally command-agnostic: sing-box only requires a connected Unix domain socket stream and never assumes a specific embedded runtime.
Lifecycle
- sing-box establishes one Unix domain socket stream to the worker master process.
- Stream 0 is reserved for startup negotiation.
- sing-box sends HELLO.
- The worker replies with HELLO.
- sing-box sends CAPS with the selected protocol version and negotiated capability set.
- The worker replies with matching CAPS.
- Only after both CAPS frames match may live traffic start.
- Each intercepted request or HTTP/2 stream uses one logical StreamID on the same socket.
If negotiation fails, sing-box rejects the worker before serving live traffic.
Frame format
Every frame uses the same 9-byte header:
- StreamID: unsigned big-endian 32-bit integer.
- Type: unsigned 8-bit frame type.
- Length: unsigned big-endian 32-bit payload length.
StreamID = 0 is reserved for session negotiation only.
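The header layout can be sketched in Python as follows. This is a minimal illustration that assumes the three fields appear on the wire in the order listed above with no padding; the names HEADER, encode_frame, and decode_header are hypothetical and not part of the protocol.

```python
import struct

# StreamID (u32, big-endian), Type (u8), Length (u32, big-endian).
# Assumption: fields are packed in the listed order with no padding.
HEADER = struct.Struct(">IBI")


def encode_frame(stream_id: int, frame_type: int, payload: bytes) -> bytes:
    """Prefix a payload with the 9-byte frame header."""
    return HEADER.pack(stream_id, frame_type, len(payload)) + payload


def decode_header(header: bytes) -> tuple[int, int, int]:
    """Return (stream_id, frame_type, length) from exactly 9 header bytes."""
    if len(header) != HEADER.size:
        raise ValueError(f"expected {HEADER.size} header bytes, got {len(header)}")
    return HEADER.unpack(header)
```

The ">" prefix selects big-endian byte order and disables alignment padding, which is what keeps the packed size at 9 bytes.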
Frame types
| Value | Name |
|---|---|
| 0x01 | control |
| 0x02 | request_headers |
| 0x03 | request_body |
| 0x04 | request_trailers |
| 0x05 | response_headers |
| 0x06 | response_body |
| 0x07 | response_trailers |
request_body and response_body payloads are raw bytes. All other payloads are UTF-8 JSON objects with strict unknown-field rejection.
Control frames
control payloads start with one extra subtype byte, followed by a JSON object.
| Value | Subtype | Scope |
|---|---|---|
| 0x01 | HELLO | Session startup on stream 0 |
| 0x02 | OPEN | Per-stream open / worker decision |
| 0x03 | END | Half-close or full-close a logical stream |
| 0x04 | ABORT | Abort/reset a logical stream |
| 0x05 | ERROR | Structured protocol/runtime error |
| 0x06 | WINDOW | Optional flow-control credit update |
| 0x07 | CAPS | Session startup capability confirmation |
Unknown critical control subtypes are fatal protocol errors.
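A worker-side codec for control payloads might look like the sketch below. It assumes the subtype byte is immediately followed by the UTF-8 JSON object, as described above. The source does not say how a subtype is marked critical, so this sketch conservatively treats every unknown subtype as fatal. ControlSubtype, encode_control, and decode_control are illustrative names.

```python
import json
from enum import IntEnum


class ControlSubtype(IntEnum):
    HELLO = 0x01
    OPEN = 0x02
    END = 0x03
    ABORT = 0x04
    ERROR = 0x05
    WINDOW = 0x06
    CAPS = 0x07


def encode_control(subtype: ControlSubtype, body: dict) -> bytes:
    """Build a control payload: one subtype byte, then a JSON object."""
    return bytes([subtype]) + json.dumps(body, separators=(",", ":")).encode("utf-8")


def decode_control(payload: bytes) -> tuple[ControlSubtype, dict]:
    """Split a control payload into its subtype and JSON body."""
    if not payload:
        raise ValueError("control payload is missing its subtype byte")
    try:
        subtype = ControlSubtype(payload[0])
    except ValueError:
        # Assumption: any unknown subtype is handled as a fatal protocol error.
        raise ValueError(f"unknown control subtype 0x{payload[0]:02x}") from None
    body = json.loads(payload[1:].decode("utf-8"))
    if not isinstance(body, dict):
        raise ValueError("control body must be a JSON object")
    return subtype, body
```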
Version and capability negotiation
HELLO
- min_version/max_version: inclusive supported range.
- required_capabilities: bits that must be present in the final negotiated set.
- supported_capabilities: full bitset the peer understands.
sing-box rejects:
- non-overlapping version ranges,
- required bits that are not a subset of supported bits,
- any unknown capability bit.
CAPS
Both sides must send the same selected version and negotiated capability set. Any mismatch is a fatal startup failure.
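The rejection rules can be sketched as a single function. The source does not state how the selected version or the negotiated set is computed, so this sketch assumes the highest common version and the intersection of both supported bitsets. KNOWN_CAPABILITIES covers bits 0 through 8 from the capability table; Hello and negotiate are hypothetical names.

```python
from dataclasses import dataclass

KNOWN_CAPABILITIES = (1 << 9) - 1  # bits 0..8 defined by the capability table


@dataclass(frozen=True)
class Hello:
    min_version: int
    max_version: int
    required_capabilities: int
    supported_capabilities: int


def negotiate(local: Hello, peer: Hello) -> tuple[int, int]:
    """Return (version, capabilities) or raise ValueError on a fatal mismatch."""
    for hello in (local, peer):
        if hello.supported_capabilities & ~KNOWN_CAPABILITIES:
            raise ValueError("unknown capability bit")
        if hello.required_capabilities & ~hello.supported_capabilities:
            raise ValueError("required bits are not a subset of supported bits")
    low = max(local.min_version, peer.min_version)
    high = min(local.max_version, peer.max_version)
    if low > high:
        raise ValueError("version ranges do not overlap")
    # Assumption: intersect the supported sets and pick the highest common version.
    capabilities = local.supported_capabilities & peer.supported_capabilities
    required = local.required_capabilities | peer.required_capabilities
    if required & ~capabilities:
        raise ValueError("a required capability is missing from the negotiated set")
    return high, capabilities
```

Both peers would then place the returned pair in their CAPS frames, and any difference between the two frames ends the session.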
Capability bits
| Bit | Name | Meaning |
|---|---|---|
| 1 << 0 | request_headers | request-header surface is available |
| 1 << 1 | request_body | request body frames are allowed |
| 1 << 2 | request_trailers | request trailer frames are allowed |
| 1 << 3 | response_headers | response-header surface is available |
| 1 << 4 | response_body | response body frames are allowed |
| 1 << 5 | response_trailers | response trailer frames are allowed |
| 1 << 6 | direct_response | worker may short-circuit with its own response |
| 1 << 7 | pass_through | worker may explicitly decline interception |
| 1 << 8 | window_updates | WINDOW frames are allowed |
Workers must not send a frame that depends on a capability which was not negotiated.
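One way a worker could enforce this rule before writing a frame is sketched below. The mapping from each frame type to the capability of the same name is an assumption drawn from the matching names in the two tables; Capability, FRAME_CAPABILITY, and frame_allowed are illustrative names.

```python
from enum import IntFlag


class Capability(IntFlag):
    REQUEST_HEADERS = 1 << 0
    REQUEST_BODY = 1 << 1
    REQUEST_TRAILERS = 1 << 2
    RESPONSE_HEADERS = 1 << 3
    RESPONSE_BODY = 1 << 4
    RESPONSE_TRAILERS = 1 << 5
    DIRECT_RESPONSE = 1 << 6
    PASS_THROUGH = 1 << 7
    WINDOW_UPDATES = 1 << 8


# Assumption: each data frame type depends on the capability with the same name.
FRAME_CAPABILITY = {
    0x02: Capability.REQUEST_HEADERS,
    0x03: Capability.REQUEST_BODY,
    0x04: Capability.REQUEST_TRAILERS,
    0x05: Capability.RESPONSE_HEADERS,
    0x06: Capability.RESPONSE_BODY,
    0x07: Capability.RESPONSE_TRAILERS,
}


def frame_allowed(frame_type: int, negotiated: Capability) -> bool:
    """Return True if sending this frame type is covered by the negotiated set."""
    needed = FRAME_CAPABILITY.get(frame_type)
    return needed is None or (negotiated & needed) == needed
```

A capabilities value of 255 sets bits 0 through 7 and leaves window_updates unset, so a session negotiated with that value must not carry WINDOW frames.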
Stream opening
sing-box opens each logical stream with OPEN on a non-zero StreamID:
{
"protocol": "http/1.1",
"module": "capture",
"request_body": true,
"request_trailers": true,
"response_body": true,
"response_trailers": true,
"is_connect": false,
"is_extended_connect": false
}
The worker must answer with its own OPEN decision:
Allowed values for the decision field:
- intercept
- direct_response
- pass_through
direct_response requires negotiated direct_response capability.
pass_through requires negotiated pass_through capability.
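The capability requirements for each decision can be checked with a small lookup, sketched here. The decision field name comes from the example transcripts; the constants and check_decision are hypothetical names, and the bit values follow the capability table.

```python
DIRECT_RESPONSE = 1 << 6
PASS_THROUGH = 1 << 7

# Capability bits each decision depends on; intercept needs none of its own.
REQUIRED_CAPABILITY = {
    "intercept": 0,
    "direct_response": DIRECT_RESPONSE,
    "pass_through": PASS_THROUGH,
}


def check_decision(body: dict, negotiated: int) -> str:
    """Validate a worker OPEN body against the negotiated capability bitset."""
    decision = body.get("decision")
    if decision not in REQUIRED_CAPABILITY:
        raise ValueError(f"invalid decision: {decision!r}")
    needed = REQUIRED_CAPABILITY[decision]
    if negotiated & needed != needed:
        raise ValueError(f"{decision} was not negotiated")
    return decision
```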
Redaction requirements
Workers should treat all decrypted HTTP content as sensitive.
- Normal sing-box observability does not emit decrypted body bytes.
- Normal sing-box observability does not emit secret-style headers such as Authorization, Proxy-Authorization, Cookie, Set-Cookie, token headers, or API keys.
- Workers should follow the same rule in their own logs unless an operator has explicitly built a separate secure audit path.
Request metadata that appears in structured MITM events is limited to redaction-safe fields such as module tag, protocol, decision, authority/path/query, safe headers, and request body presence or size.
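A worker that logs headers could apply a filter along these lines. This is only a sketch: the four exact header names come from the list above, while the substring hints for token and API-key headers are a heuristic assumption, not a rule defined by sing-box. redact_headers is a hypothetical helper.

```python
SECRET_HEADERS = {"authorization", "proxy-authorization", "cookie", "set-cookie"}
# Assumption: a substring heuristic for "token headers" and "API keys".
SECRET_HINTS = ("token", "api-key", "apikey")


def redact_headers(headers: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return a copy of the header map that is safe to write to a log."""
    safe = {}
    for name, values in headers.items():
        lowered = name.lower()
        if lowered in SECRET_HEADERS or any(hint in lowered for hint in SECRET_HINTS):
            # Keep the header name and value count, drop the secret values.
            safe[name] = ["[redacted]"] * len(values)
        else:
            safe[name] = list(values)
    return safe
```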
Header payloads
sing-box → worker request headers
{
"method": "GET",
"scheme": "https",
"authority": "example.com",
"path": "/v1/data?ok=1",
"headers": {
"accept": ["application/json"]
}
}
worker → sing-box request mutation
{
"authority": "rewritten.example",
"path": "/internal/data?ok=1",
"headers": {
"x-policy": ["capture"]
}
}
The worker mutation frame does not include method. Any attempt to send method, outbound, tls, fsm, or other unknown fields is rejected.
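A worker can apply the same strictness to its own output before sending it. The sketch below assumes the mutation frame accepts only the three fields shown in the example (authority, path, headers); the full schema is not given here, so treat that set as an assumption. It also applies the pseudo-header and Host rules from the prohibited outputs list. validate_request_mutation is a hypothetical helper.

```python
# Assumption: only the fields shown in the mutation example are accepted.
ALLOWED_MUTATION_FIELDS = {"authority", "path", "headers"}


def validate_request_mutation(body: dict) -> dict:
    """Raise ValueError if a request mutation would be rejected."""
    unknown = set(body) - ALLOWED_MUTATION_FIELDS
    if unknown:
        raise ValueError(f"forbidden or unknown fields: {sorted(unknown)}")
    for name in body.get("headers", {}):
        lowered = name.lower()
        if lowered.startswith(":"):
            raise ValueError(f"pseudo-header {name!r} is not allowed in the header map")
        if lowered == "host":
            raise ValueError("rewrite the host through authority, not the header map")
    return body
```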
Response headers
For direct_response, the first worker response_headers frame must include status_code.
Trailers
Allowed worker outputs
The worker may only do the following:
- mutate request headers, body, and trailers,
- mutate response headers, body, and trailers,
- send a direct response,
- abort/reset a logical stream,
- send an explicit pass-through decision,
- send WINDOW updates only when window_updates was negotiated.
Prohibited worker outputs
The worker must not attempt to control sing-box outside the explicit contract.
Rejected examples:
- request method mutation,
- outbound selection or outbound tag changes,
- TLS engine or certificate changes,
- generic FSM / scheduler / transport control,
- HTTP pseudo-header injection in normal header maps,
- Host mutation through the header map instead of authority,
- unknown capability bits,
- unknown fields inside JSON payloads,
- malformed stream transitions such as response_body before response_headers.
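The stream-transition rule can be guarded with a small per-stream state table. The sketch below covers only the response direction and assumes the order headers, then zero or more body frames, then optional trailers, modeled on HTTP message structure; the source confirms only that response_body before response_headers is malformed. ResponseOrder is a hypothetical name.

```python
RESPONSE_HEADERS, RESPONSE_BODY, RESPONSE_TRAILERS = 0x05, 0x06, 0x07

# state -> {accepted frame type: next state}
# Assumption: headers first, then any number of body frames, then optional trailers.
TRANSITIONS = {
    "start": {RESPONSE_HEADERS: "headers"},
    "headers": {RESPONSE_BODY: "body", RESPONSE_TRAILERS: "trailers"},
    "body": {RESPONSE_BODY: "body", RESPONSE_TRAILERS: "trailers"},
    "trailers": {},
}


class ResponseOrder:
    """Tracks the response-side frame order for one logical stream."""

    def __init__(self) -> None:
        self.state = "start"

    def accept(self, frame_type: int) -> None:
        """Advance the state, or raise ValueError on an out-of-order frame."""
        next_state = TRANSITIONS[self.state].get(frame_type)
        if next_state is None:
            raise ValueError(f"frame 0x{frame_type:02x} is invalid in state {self.state}")
        self.state = next_state
```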
Stream closure and errors
END
Valid target values:
- request
- response
- stream
ABORT
ERROR
WINDOW
WINDOW is invalid unless window_updates was negotiated.
Failure semantics
- Negotiation failure is a session-level fatal error. sing-box closes the worker connection and does not accept traffic.
- Parse failure, unknown critical subtype, forbidden mutation, or impossible state transition is a deterministic protocol failure.
- For supported MITM flows, sing-box treats worker protocol failures as fail-closed behavior rather than silently bypassing policy.
- pass_through is explicit. Absence of a valid worker decision is not an implicit pass-through.
Example transcript: intercepted request with request rewrite
S→W stream=0 control/HELLO {"min_version":1,"max_version":1,...}
W→S stream=0 control/HELLO {"min_version":1,"max_version":1,...}
S→W stream=0 control/CAPS {"version":1,"capabilities":255}
W→S stream=0 control/CAPS {"version":1,"capabilities":255}
S→W stream=1 control/OPEN {"protocol":"http/1.1","module":"capture","request_body":true}
S→W stream=1 request_headers {"method":"GET","scheme":"https","authority":"example.com","path":"/v1/data"}
W→S stream=1 control/OPEN {"decision":"intercept"}
W→S stream=1 request_headers {"authority":"rewritten.example","path":"/internal/data"}
S→W stream=1 request_body <bytes>
S→W stream=1 control/END {"target":"request"}
W→S stream=1 response_headers {"status_code":200,"headers":{"x-worker":["1"]}}
W→S stream=1 response_body <bytes>
W→S stream=1 control/END {"target":"response"}
Example transcript: direct response
S→W stream=7 control/OPEN {"protocol":"h2","module":"capture"}
S→W stream=7 request_headers {"method":"GET","scheme":"https","authority":"api.example","path":"/healthz"}
W→S stream=7 control/OPEN {"decision":"direct_response"}
W→S stream=7 response_headers {"status_code":503,"headers":{"content-type":["text/plain"]}}
W→S stream=7 response_body <"worker unavailable">
W→S stream=7 control/END {"target":"response"}
Example transcript: explicit pass-through
S→W stream=11 control/OPEN {"protocol":"http/1.1","module":"capture"}
S→W stream=11 request_headers {"method":"CONNECT","authority":"db.example:443"}
W→S stream=11 control/OPEN {"decision":"pass_through"}
W→S stream=11 control/END {"target":"stream"}
Implementation notes for standalone workers
- Treat the UDS stream as an ordered byte stream; do not rely on message boundaries from the socket layer.
- Use big-endian parsing for both StreamID and Length.
- Reserve stream 0 for negotiation only.
- Reject or surface unknown fields in your own parser too; do not silently discard contract changes.
- Do not reverse-engineer sing-box internals for unsupported control. If the contract does not expose it, the worker may not control it.
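The first two notes can be sketched as a read loop that tolerates short reads. It takes any callable with the shape of socket.recv, so it does not depend on message boundaries from the socket layer. The names read_exact and read_frame are illustrative, and the header field order is assumed to match the frame format section.

```python
import struct
from typing import Callable

HEADER = struct.Struct(">IBI")  # StreamID, Type, Length (big-endian, 9 bytes)


def read_exact(recv: Callable[[int], bytes], size: int) -> bytes:
    """Call recv until exactly size bytes have arrived."""
    chunks = []
    remaining = size
    while remaining:
        chunk = recv(remaining)
        if not chunk:
            raise EOFError("socket closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(recv: Callable[[int], bytes]) -> tuple[int, int, bytes]:
    """Read one frame and return (stream_id, frame_type, payload)."""
    stream_id, frame_type, length = HEADER.unpack(read_exact(recv, HEADER.size))
    # A real worker would likely also cap length; the source states no limit.
    return stream_id, frame_type, read_exact(recv, length)
```

With a connected socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) named sock, a worker would call read_frame(sock.recv) in a loop.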