Routing Comparison

Overview

Plan v2 Track B measures whether in-process routing (monolith runtime) is faster than external routing (controller + remote tool/model endpoints) on the same hardware.

Date: 2026-02-17 to 2026-02-19 (UTC)
Hardware: AWS g5.xlarge (NVIDIA A10G)

Headline

Metric	Internal	External	External/Internal
Mean latency	94.849 ms	97.927 ms	1.032x

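Internal routing is faster on mean latency by 3.078 ms (about 3.2%). The last column is external divided by internal, so a ratio above 1 means the internal path wins.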
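
The ratio column can be reproduced directly from the two means. A minimal Python sketch follows; the function name `latency_ratio` is an illustrative choice and is not taken from the benchmark code.

```python
def latency_ratio(internal_ms: float, external_ms: float) -> float:
    """Return external/internal; values above 1 mean the internal path is faster."""
    if internal_ms <= 0:
        raise ValueError("internal latency must be positive")
    return external_ms / internal_ms


# Headline means copied from the table above.
internal_mean_ms = 94.849
external_mean_ms = 97.927

ratio = latency_ratio(internal_mean_ms, external_mean_ms)
overhead_ms = external_mean_ms - internal_mean_ms
overhead_pct = (ratio - 1.0) * 100.0

print(f"ratio: {ratio:.3f}x")
print(f"external overhead: {overhead_ms:.3f} ms ({overhead_pct:.1f}%)")
```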

Stage Timing

Stage	Value
Internal route mean	23.380 ms
Internal infer mean	68.286 ms
Internal TTFT mean	53.425 ms
External controller route mean	0.003 ms
External tool hop mean	2.206 ms
External model hop mean	94.859 ms
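
As a rough cross-check, the stage means can be compared against the headline end-to-end means. The sketch below assumes that route and infer (internal) and the three external stages are additive parts of one request, which the report does not state. TTFT is left out on the assumption that it overlaps with inference rather than adding to it.

```python
# Stage means in milliseconds, copied from the Stage Timing table.
internal_stages_ms = {"route": 23.380, "infer": 68.286}
external_stages_ms = {
    "controller_route": 0.003,
    "tool_hop": 2.206,
    "model_hop": 94.859,
}

# Headline end-to-end means, for comparison.
internal_total_ms = 94.849
external_total_ms = 97.927


def unattributed_ms(total_ms: float, stages_ms: dict) -> float:
    """Time in the end-to-end mean that the listed stages do not account for."""
    return total_ms - sum(stages_ms.values())


internal_gap = unattributed_ms(internal_total_ms, internal_stages_ms)
external_gap = unattributed_ms(external_total_ms, external_stages_ms)
model_hop_share = external_stages_ms["model_hop"] / external_total_ms

print(f"internal unattributed: {internal_gap:.3f} ms")
print(f"external unattributed: {external_gap:.3f} ms")
print(f"model hop share of external mean: {model_hop_share:.1%}")
```

Under that additive assumption, the external controller's own routing time is negligible and the model hop accounts for nearly all of the external mean.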

Per-Task Breakdown

Task	Internal	External
general_short	150.767 ms	152.274 ms
receipt_extract	80.732 ms	81.270 ms
search_grounded	46.945 ms	57.237 ms
summarize_short	100.950 ms	100.928 ms
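
The per-task rows can be turned into ratios and deltas to show where the gap comes from. In the sketch below, the variable names are illustrative. The unweighted mean of the four task means matches the headline means to within rounding, which suggests, but does not prove, that the headline was computed that way.

```python
from statistics import mean

# (internal_ms, external_ms) per task, copied from the table above.
per_task_ms = {
    "general_short": (150.767, 152.274),
    "receipt_extract": (80.732, 81.270),
    "search_grounded": (46.945, 57.237),
    "summarize_short": (100.950, 100.928),
}

for task, (internal, external) in per_task_ms.items():
    print(
        f"{task:16s} ext/int = {external / internal:.3f}x  "
        f"delta = {external - internal:+.3f} ms"
    )

# Unweighted mean across tasks, compared against the headline means.
internal_mean = mean(i for i, _ in per_task_ms.values())
external_mean = mean(e for _, e in per_task_ms.values())

# Tasks where the external path was not slower.
external_wins = [t for t, (i, e) in per_task_ms.items() if e <= i]
largest_gap_task = max(per_task_ms, key=lambda t: per_task_ms[t][1] - per_task_ms[t][0])
```

search_grounded contributes most of the difference (10.292 ms), while summarize_short is effectively a tie, with the external path ahead by 0.022 ms.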

Integrity

Errors: top-level 0, warmup 0, internal 0, external 0
The warmup ordering bias observed in an earlier run was corrected in this final comparison.
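
The report does not say how the ordering bias was corrected. One common approach, shown here purely as an illustration and not as the method used in Track B, is to warm up every path before timing and then rotate which path runs first in each round. The function `interleaved_benchmark` and the stand-in callables are hypothetical.

```python
import time
from typing import Callable, Dict, List


def interleaved_benchmark(
    paths: Dict[str, Callable[[], None]],
    rounds: int,
    warmup_rounds: int = 2,
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, List[float]]:
    """Time each path once per round, rotating which path goes first."""
    if not paths:
        raise ValueError("at least one path is required")
    names = list(paths)
    samples: Dict[str, List[float]] = {name: [] for name in names}

    # Warm every path before timing so no path pays cold-start cost alone.
    for _ in range(warmup_rounds):
        for name in names:
            paths[name]()

    for round_index in range(rounds):
        # Rotate the starting path so no path is always measured first.
        offset = round_index % len(names)
        order = names[offset:] + names[:offset]
        for name in order:
            start = clock()
            paths[name]()
            samples[name].append((clock() - start) * 1000.0)  # milliseconds
    return samples


# Usage with hypothetical stand-ins for the two routing paths.
call_log: List[str] = []
samples = interleaved_benchmark(
    {
        "internal": lambda: call_log.append("internal"),
        "external": lambda: call_log.append("external"),
    },
    rounds=4,
)
```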

Matrix Expansion (2026-02-19)

Track B was expanded to a 6-profile matrix (baseline + escalating timeout/failure stress):

Profile	Ext/Int Latency Ratio	Ext Error Rate
p00 baseline	1.042x	0.0000
p01 fail mild	1.048x	0.0000
p02 timeout mild	1.142x	0.0000
p03 mixed moderate	1.164x	0.0417
p04 mixed aggressive	1.436x	0.0833
p05 mixed aggressive + retry2	1.416x	0.0833

Matrix interpretation:

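The internal path remains faster in every profile (ratio > 1).
The external path degrades under timeout/failure pressure: the ratio rises from 1.042x at baseline to 1.436x under p04, and the retry variant p05 (1.416x) shows the same external error rate as p04 (0.0833).
The internal error rate stayed at zero across all matrix profiles.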
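
These claims can be checked mechanically against the matrix table. The sketch below copies the rows as tuples; the variable names are illustrative.

```python
# (profile, ext/int latency ratio, external error rate) from the matrix table.
profiles = [
    ("p00 baseline", 1.042, 0.0000),
    ("p01 fail mild", 1.048, 0.0000),
    ("p02 timeout mild", 1.142, 0.0000),
    ("p03 mixed moderate", 1.164, 0.0417),
    ("p04 mixed aggressive", 1.436, 0.0833),
    ("p05 mixed aggressive + retry2", 1.416, 0.0833),
]

# Internal wins a profile when the ratio is above 1.
all_internal_faster = all(ratio > 1.0 for _, ratio, _ in profiles)

# Profile with the largest external slowdown.
worst_profile = max(profiles, key=lambda p: p[1])[0]

# Consecutive steps where the ratio dropped instead of rising.
ratio_drops = [
    (prev[0], cur[0])
    for prev, cur in zip(profiles, profiles[1:])
    if cur[1] < prev[1]
]

for name, ratio, ext_error in profiles:
    print(f"{name:32s} +{(ratio - 1.0) * 100:5.1f}% latency  ext error {ext_error:.4f}")
```

The only step where the ratio does not rise is p04 to p05, so the degradation trend holds through p04 but is not strictly monotonic across the whole matrix.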