Routing Comparison
Internal vs external routing benchmark results from Plan v2 Track B.
Overview
Plan v2 Track B measures whether in-process routing (monolith runtime) is faster than external routing (controller + remote tool/model endpoints) on the same hardware.
- Date: 2026-02-17 to 2026-02-19 (UTC)
- Hardware: AWS g5.xlarge (NVIDIA A10G)
Headline
| Metric | Internal | External | External/Internal |
|---|---|---|---|
| Mean latency | 94.849 ms | 97.927 ms | 1.032x |
Internal routing is faster: external mean latency is about 3.2% higher (97.927 ms vs 94.849 ms). An External/Internal ratio above 1 means internal wins.
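As a quick check on the ratio convention, the headline figure can be recomputed from the two means in the table. This is a minimal sketch; the variable names are illustrative and not taken from the benchmark harness.

```python
# Headline means copied from the table above (milliseconds).
internal_ms = 94.849
external_ms = 97.927

# Ratio convention used in this document: External / Internal.
# A value above 1 means the internal path is faster.
ratio = external_ms / internal_ms
delta_ms = external_ms - internal_ms

print(f"External/Internal: {ratio:.3f}x")
print(f"Absolute gap: {delta_ms:.3f} ms")
print(f"External overhead: {(ratio - 1) * 100:.1f}%")
```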
Stage Timing
| Stage | Value |
|---|---|
| Internal route mean | 23.380 ms |
| Internal infer mean | 68.286 ms |
| Internal TTFT mean | 53.425 ms |
| External controller route mean | 0.003 ms |
| External tool hop mean | 2.206 ms |
| External model hop mean | 94.859 ms |
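To see how much of each headline mean the stage timings account for, the sketch below sums the per-path stages and reports the remainder. It assumes the listed stages run sequentially and do not overlap, and it leaves out TTFT on the assumption that it is a time-to-first-token mark rather than a stage duration. Neither assumption is stated in the results, so treat the residuals as indicative only.

```python
# Headline means and stage means copied from the tables above (milliseconds).
headline_ms = {"internal": 94.849, "external": 97.927}
stages_ms = {
    # Assumption: route and infer are sequential, non-overlapping stages.
    "internal": {"route": 23.380, "infer": 68.286},
    # Assumption: the three external stages are likewise sequential.
    "external": {"controller_route": 0.003, "tool_hop": 2.206, "model_hop": 94.859},
}


def stage_summary(path):
    """Return (sum of stage means, headline mean minus that sum) for a path."""
    total = sum(stages_ms[path].values())
    return total, headline_ms[path] - total


for path in stages_ms:
    total, residual = stage_summary(path)
    print(f"{path}: stages={total:.3f} ms, unattributed={residual:.3f} ms")
    for stage, value in stages_ms[path].items():
        share = value / headline_ms[path] * 100
        print(f"  {stage}: {value:.3f} ms ({share:.1f}% of headline mean)")
```

Under these assumptions the external path is dominated by the model hop, while the controller routing step itself is negligible.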
Per-Task Breakdown
| Task | Internal | External |
|---|---|---|
| general_short | 150.767 ms | 152.274 ms |
| receipt_extract | 80.732 ms | 81.270 ms |
| search_grounded | 46.945 ms | 57.237 ms |
| summarize_short | 100.950 ms | 100.928 ms |
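The per-task rows can be turned into the same External/Internal ratio used in the headline. The sketch below does that and also checks that the unweighted mean of the four task means reproduces the headline means to within rounding. Names are illustrative; whether the harness weights tasks equally is an assumption that the arithmetic is consistent with but does not prove.

```python
# Per-task mean latencies copied from the table above: (internal, external) in ms.
tasks_ms = {
    "general_short": (150.767, 152.274),
    "receipt_extract": (80.732, 81.270),
    "search_grounded": (46.945, 57.237),
    "summarize_short": (100.950, 100.928),
}

# External/Internal ratio per task; above 1 means internal is faster.
ratios = {name: ext / internal for name, (internal, ext) in tasks_ms.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:16s} {ratio:.3f}x")

# Unweighted mean across tasks, for comparison with the headline means.
internal_mean = sum(i for i, _ in tasks_ms.values()) / len(tasks_ms)
external_mean = sum(e for _, e in tasks_ms.values()) / len(tasks_ms)
print(f"internal mean: {internal_mean:.4f} ms, external mean: {external_mean:.4f} ms")
```

Read this way, most of the headline gap comes from search_grounded, while summarize_short is effectively a tie (external is 0.022 ms lower).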
Integrity
- Errors: top-level 0, warmup 0, internal 0, external 0
- Warmup ordering bias from an earlier run was corrected in this final comparison.
Matrix Expansion (2026-02-19)
Track B was expanded to a 6-profile matrix (baseline + escalating timeout/failure stress):
| Profile | Ext/Int Latency Ratio | Int Error | Ext Error |
|---|---|---|---|
| p00 baseline | 1.042x | 0.0000 | 0.0000 |
| p01 fail mild | 1.048x | 0.0000 | 0.0000 |
| p02 timeout mild | 1.142x | 0.0000 | 0.0000 |
| p03 mixed moderate | 1.164x | 0.0000 | 0.0417 |
| p04 mixed aggressive | 1.436x | 0.0000 | 0.0833 |
| p05 mixed aggressive + retry2 | 1.416x | 0.0000 | 0.0833 |
Matrix interpretation:
- Internal path remains faster in all profiles (ratio > 1).
- External path degrades progressively under timeout/failure pressure.
- Internal error rate stayed zero across matrix profiles.
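The interpretation points above can be checked mechanically against the matrix rows. This is a minimal sketch with the table values hard-coded; the tuple layout and variable names are illustrative, not part of the benchmark tooling.

```python
# Matrix rows copied from the table above:
# (profile, Ext/Int latency ratio, internal error rate, external error rate).
profiles = [
    ("p00 baseline", 1.042, 0.0000, 0.0000),
    ("p01 fail mild", 1.048, 0.0000, 0.0000),
    ("p02 timeout mild", 1.142, 0.0000, 0.0000),
    ("p03 mixed moderate", 1.164, 0.0000, 0.0417),
    ("p04 mixed aggressive", 1.436, 0.0000, 0.0833),
    ("p05 mixed aggressive + retry2", 1.416, 0.0000, 0.0833),
]

# Claim 1: internal is faster in every profile (ratio above 1).
internal_always_faster = all(ratio > 1.0 for _, ratio, _, _ in profiles)

# Claim 3: the internal error rate is zero in every profile.
internal_error_free = all(int_err == 0.0 for _, _, int_err, _ in profiles)

# Claim 2: show how far each profile's ratio sits above the baseline ratio.
baseline_ratio = profiles[0][1]
for name, ratio, _, ext_err in profiles:
    print(f"{name:30s} ratio={ratio:.3f}x "
          f"(+{ratio - baseline_ratio:.3f} vs baseline) ext_err={ext_err:.4f}")

worst_profile = max(profiles, key=lambda row: row[1])[0]
print(f"internal always faster: {internal_always_faster}")
print(f"internal error free: {internal_error_free}")
print(f"largest ratio: {worst_profile}")
```

The ratio rises from p00 through p04. p05 comes in slightly below p04 (1.416x vs 1.436x) with the same external error rate.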