Forked Ethereum mainnet at block 19,000,000, 35 simulation steps (7.0 minutes of mainnet block time). Seed 42.
Aggregated metrics by cohort. A disproportionately worse effective price for one cohort typically indicates that its members’ cliff overlapped with that of another large cohort, compounding pool impact.
| Cohort | Members | Swaps | WETH sold | USDC realized | Effective price |
|---|---|---|---|---|---|
| advisor | 8 | 96 | 4.0000 | 10,248.37 | 2,562.09 |
| seed | 20 | 300 | 20.0000 | 51,248.66 | 2,562.43 |
| team | 5 | 40 | 25.0000 | 64,069.81 | 2,562.79 |
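The Effective price column is USDC realized divided by WETH sold. As a minimal sketch (the variable names and the basis-point comparison are illustrative, not part of the report tooling), the cohort totals above can be turned into prices and a shortfall relative to the best-priced cohort:

```python
# Cohort totals copied from the table above: (WETH sold, USDC realized).
cohorts = {
    "advisor": (4.0, 10_248.37),
    "seed": (20.0, 51_248.66),
    "team": (25.0, 64_069.81),
}


def effective_price(weth_sold, usdc_realized):
    """USDC received per WETH sold."""
    return usdc_realized / weth_sold


prices = {name: effective_price(w, u) for name, (w, u) in cohorts.items()}
best = max(prices.values())

# Shortfall of each cohort relative to the best-priced cohort, in basis points.
shortfall_bps = {name: (best - p) / best * 1e4 for name, p in prices.items()}

for name in cohorts:
    print(f"{name:8s} price={prices[name]:.2f} shortfall={shortfall_bps[name]:.2f} bps")
```

On these totals the gap between the best and worst cohort is under 3 basis points, so no cohort in this run stands out as disproportionately worse.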
Cumulative inventory liquidated, in WETH, by simulation step. The slope reflects the aggregate selling intensity across all recipients.
Per-step revenue summed across all recipients. Gaps indicate steps where no recipient sold (small rounding amounts below the dust threshold).
USDC received per WETH sold, per step. A declining line indicates accumulating slippage from sustained selling.
Per-step Uniswap V3 tick for the tracked pool, captured by the
MarketSnapshotHook at the start of every scheduler step.
A monotone trend signals sustained price pressure; abrupt jumps
correspond to large fills landing in a single step.
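To read a tick as a price, Uniswap V3 defines the raw price of token0 in units of token1 as 1.0001 raised to the tick. The sketch below assumes the tracked pool is a USDC/WETH pool with USDC as token0 (6 decimals) and WETH as token1 (18 decimals); the report does not name the pool, so treat the function names and decimals as assumptions.

```python
import math

TICK_BASE = 1.0001  # Uniswap V3: each tick is a 0.01% price step


def tick_to_usdc_per_weth(tick, usdc_decimals=6, weth_decimals=18):
    """Convert a pool tick to a human-readable USDC-per-WETH price.

    Assumes token0 = USDC and token1 = WETH (an assumption about the pool).
    """
    raw = TICK_BASE ** tick  # WETH base units per USDC base unit
    weth_per_usdc = raw * 10.0 ** (usdc_decimals - weth_decimals)
    return 1.0 / weth_per_usdc


def usdc_per_weth_to_tick(price, usdc_decimals=6, weth_decimals=18):
    """Inverse of the above, floored to an integer tick."""
    raw = 10.0 ** (weth_decimals - usdc_decimals) / price
    return math.floor(math.log(raw) / math.log(TICK_BASE))


tick = usdc_per_weth_to_tick(2562.0)
print(tick, round(tick_to_usdc_per_weth(tick), 2))
```

Under this token ordering, sustained WETH selling lowers the USDC-per-WETH price and therefore raises the tick, so the monotone trend described above would appear as a rising line.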
| Agent | WETH sold | USDC realized | Effective price | Swaps |
|---|---|---|---|---|
| advisor-0 (Hand-coded heuristic) | 0.5000 | 1,281.04 | 2,562.08 | 12 |
| advisor-1 (Hand-coded heuristic) | 0.5000 | 1,281.05 | 2,562.10 | 12 |
| advisor-2 (Hand-coded heuristic) | 0.5000 | 1,281.05 | 2,562.10 | 12 |
| advisor-3 (Hand-coded heuristic) | 0.5000 | 1,281.06 | 2,562.11 | 12 |
| advisor-4 (Hand-coded heuristic) | 0.5000 | 1,281.05 | 2,562.10 | 12 |
| advisor-5 (Hand-coded heuristic) | 0.5000 | 1,281.04 | 2,562.08 | 12 |
| advisor-6 (Hand-coded heuristic) | 0.5000 | 1,281.05 | 2,562.09 | 12 |
| advisor-7 (Hand-coded heuristic) | 0.5000 | 1,281.04 | 2,562.07 | 12 |
| seed-0 (Hand-coded heuristic) | 1.0000 | 2,562.39 | 2,562.39 | 15 |
| seed-1 (Hand-coded heuristic) | 1.0000 | 2,562.37 | 2,562.37 | 15 |
| seed-10 (Hand-coded heuristic) | 1.0000 | 2,562.47 | 2,562.47 | 15 |
| seed-11 (Hand-coded heuristic) | 1.0000 | 2,562.49 | 2,562.49 | 15 |
| seed-12 (Hand-coded heuristic) | 1.0000 | 2,562.37 | 2,562.37 | 15 |
| seed-13 (Hand-coded heuristic) | 1.0000 | 2,562.45 | 2,562.45 | 15 |
| seed-14 (Hand-coded heuristic) | 1.0000 | 2,562.50 | 2,562.50 | 15 |
| seed-15 (Hand-coded heuristic) | 1.0000 | 2,562.39 | 2,562.39 | 15 |
| seed-16 (Hand-coded heuristic) | 1.0000 | 2,562.40 | 2,562.40 | 15 |
| seed-17 (Hand-coded heuristic) | 1.0000 | 2,562.44 | 2,562.44 | 15 |
| seed-18 (Hand-coded heuristic) | 1.0000 | 2,562.44 | 2,562.44 | 15 |
| seed-19 (Hand-coded heuristic) | 1.0000 | 2,562.34 | 2,562.34 | 15 |
| seed-2 (Hand-coded heuristic) | 1.0000 | 2,562.46 | 2,562.46 | 15 |
| seed-3 (Hand-coded heuristic) | 1.0000 | 2,562.52 | 2,562.52 | 15 |
| seed-4 (Hand-coded heuristic) | 1.0000 | 2,562.45 | 2,562.45 | 15 |
| seed-5 (Hand-coded heuristic) | 1.0000 | 2,562.46 | 2,562.46 | 15 |
| seed-6 (Hand-coded heuristic) | 1.0000 | 2,562.37 | 2,562.37 | 15 |
| seed-7 (Hand-coded heuristic) | 1.0000 | 2,562.49 | 2,562.49 | 15 |
| seed-8 (Hand-coded heuristic) | 1.0000 | 2,562.42 | 2,562.42 | 15 |
| seed-9 (Hand-coded heuristic) | 1.0000 | 2,562.44 | 2,562.44 | 15 |
| team-0 (Hand-coded heuristic) | 5.0000 | 12,814.13 | 2,562.83 | 8 |
| team-1 (Hand-coded heuristic) | 5.0000 | 12,814.08 | 2,562.82 | 8 |
| team-2 (Hand-coded heuristic) | 5.0000 | 12,814.10 | 2,562.82 | 8 |
| team-3 (Hand-coded heuristic) | 5.0000 | 12,813.87 | 2,562.77 | 8 |
| team-4 (Hand-coded heuristic) | 5.0000 | 12,813.63 | 2,562.73 | 8 |
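The per-agent rows can be rolled up to cohort level as a consistency check against the aggregated table. The sketch below is illustrative only (the grouping by name prefix is an assumption about how agents map to cohorts) and uses the five team rows; the cohort price is the volume-weighted ratio of summed USDC to summed WETH, not the mean of the per-agent prices.

```python
from collections import defaultdict

# (agent, WETH sold, USDC realized) for the team cohort, copied from the rows above.
rows = [
    ("team-0", 5.0, 12_814.13),
    ("team-1", 5.0, 12_814.08),
    ("team-2", 5.0, 12_814.10),
    ("team-3", 5.0, 12_813.87),
    ("team-4", 5.0, 12_813.63),
]

# Accumulate [WETH sold, USDC realized] per cohort, keyed by the name prefix.
totals = defaultdict(lambda: [0.0, 0.0])
for agent, weth, usdc in rows:
    cohort = agent.rsplit("-", 1)[0]  # "team-3" -> "team"
    totals[cohort][0] += weth
    totals[cohort][1] += usdc

# Volume-weighted effective price per cohort.
cohort_price = {c: usdc / weth for c, (weth, usdc) in totals.items()}

for c, (weth, usdc) in totals.items():
    print(f"{c}: WETH={weth:.4f} USDC={usdc:,.2f} price={cohort_price[c]:,.2f}")
```

For the team cohort this reproduces the aggregated row (25.0000 WETH, 64,069.81 USDC, 2,562.79). Because each row is rounded to the cent, a rolled-up total for other cohorts can differ from the aggregated table by a cent.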
mayavi validate.
The agents in this run use the hand-coded / scripted baselines (see the Agent column above). Mayavi’s agents are RL-trainable on the same forked-mainnet stack: mayavi train --env aave|vesting|liquidator produces a PPO policy, and VestingRecipient(policy_path=…) loads one into a scenario.
Trained-policy-vs-baseline evaluation results (each on a real forked mainnet, $0 marginal cost):
docs/artifacts/aave_ppo_v2_local_2026-05-07.json
docs/artifacts/vesting_ppo_v1_local_2026-05-13.json
docs/artifacts/aave_liquidator_ppo_v1_local_2026-05-13.json