Active vision is learnable, not engineered — what TAVIS tells us about VLAs
TAVIS benchmarks whether robot policies benefit from controlling where they look. The answer: conditionally — and one behavior emerges without being trained for it at all.
Reiner Pope spent two hours deriving AI lab operations from public API prices and a roofline model. Batch size, FLOPs/byte ratios, pipeline bubbles, RL over-training — all of it from first principles. Here's the structured breakdown.
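The core roofline move fits in a few lines: compare a workload's arithmetic intensity (FLOPs per byte moved) against the hardware's ridge point. A minimal sketch, assuming H100-class numbers and ignoring KV-cache traffic (all figures illustrative, not from the breakdown):

```python
# Roofline sketch: at what batch size does transformer decoding stop being
# memory-bound? Hardware numbers are illustrative H100-class values.

PEAK_FLOPS = 989e12      # dense BF16 peak, FLOP/s (assumed)
MEM_BW = 3.35e12         # HBM bandwidth, bytes/s (assumed)
BYTES_PER_PARAM = 2      # BF16 weights

# Ridge point: intensity above this is compute-bound, below is bandwidth-bound.
ridge = PEAK_FLOPS / MEM_BW

def arithmetic_intensity(batch: int) -> float:
    """Decoding one token per sequence: ~2 FLOPs per weight per sequence,
    while each weight is read from HBM once, amortized over the batch."""
    flops_per_param = 2 * batch
    bytes_per_param = BYTES_PER_PARAM
    return flops_per_param / bytes_per_param

# Smallest batch whose intensity clears the ridge point.
min_batch = next(b for b in range(1, 4096) if arithmetic_intensity(b) >= ridge)
print(f"ridge ~{ridge:.0f} FLOPs/byte -> compute-bound at batch >= {min_batch}")
```

With these numbers the crossover lands near batch 300, which is the kind of inference Pope runs from API prices alone.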
Event cameras fire per-pixel only when log-intensity changes. No frame buffer, no redundant readout, sub-millisecond latency. Three recent arXiv papers show where this matters — spacecraft pose estimation, visual place recognition, and the benchmarking gap blocking progress.
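To make the trigger rule concrete, here's a toy frame-based emulation of it. Real event cameras are asynchronous per-pixel circuits, not frame diffs; the threshold and function name below are illustrative, not from any of the papers:

```python
import numpy as np

def events_from_frames(prev: np.ndarray, curr: np.ndarray, theta: float = 0.2):
    """Emit an event wherever per-pixel log-intensity change exceeds theta.
    Returns (ys, xs, polarities); +1 for brightening, -1 for dimming."""
    eps = 1e-6  # avoid log(0) on dark pixels
    dlog = np.log(curr + eps) - np.log(prev + eps)
    ys, xs = np.nonzero(np.abs(dlog) >= theta)
    pol = np.sign(dlog[ys, xs]).astype(np.int8)
    return ys, xs, pol

# Toy usage: a bright patch appears between two frames; only those
# pixels fire, everything static produces no readout at all.
prev = np.full((4, 4), 0.1)
curr = prev.copy()
curr[1:3, 1:3] = 0.5
ys, xs, pol = events_from_frames(prev, curr)
print(list(zip(ys.tolist(), xs.tolist(), pol.tolist())))
```

The sparse output is the whole point: static scenes cost nothing, which is exactly what the pose-estimation and place-recognition papers exploit.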
Ask three robotics executives what constrains their deployments and two will say data before they say models. The third will say models, then pause, and add: "actually, it's still data." Capgemini, Bessemer, and BCG all land on the same bottleneck — and it isn't models.