The trick that powers RoPE (rotary position embedding) is that you can encode a token's position by rotating its query and key vectors. Because rotations compose, the rotation applied at position i and the rotation applied at position j collapse, inside the dot product, into a single rotation by j − i. The attention score therefore depends on the offset, not on i and j separately. Transformers get relative-position attention without storing a relative-position table.
A relative displacement can be represented as an angular displacement. That is the move.
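The cancellation can be checked numerically. What follows is a minimal sketch in plain Python, not any library's implementation: the names `rope_rotate` and `score`, the 8-dimensional random vectors, and the base of 10000 are all illustrative assumptions, and pairing consecutive dimensions is just one common convention. Each pair of coordinates is treated as a complex number so that a rotation is a single multiplication.

```python
import cmath
import random


def rope_rotate(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles.

    Pair k, (vec[2k], vec[2k+1]), is read as one complex number and
    multiplied by exp(i * position * theta_k), theta_k = base ** (-2k / d).
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector dimension must be even")
    rotated = []
    for k in range(d // 2):
        theta_k = base ** (-2.0 * k / d)
        z = complex(vec[2 * k], vec[2 * k + 1])
        rotated.append(z * cmath.exp(1j * position * theta_k))
    return rotated


def score(q, k, i, j, base=10000.0):
    """Unnormalised score between a query at position i and a key at position j."""
    q_rot = rope_rotate(q, i, base)
    k_rot = rope_rotate(k, j, base)
    # Re(a * conj(b)) summed over pairs is the ordinary real dot product.
    return sum((a * b.conjugate()).real for a, b in zip(q_rot, k_rot))


# Random content vectors (assumed values, seeded for repeatability).
rng = random.Random(0)
q = [rng.gauss(0.0, 1.0) for _ in range(8)]
k = [rng.gauss(0.0, 1.0) for _ in range(8)]

# Same offset (j - i = 4) at two very different absolute positions.
s_near = score(q, k, 3, 7)
s_far = score(q, k, 103, 107)
print(abs(s_near - s_far) < 1e-9)
```

The two scores should agree up to floating-point error, because each pair contributes q·conj(k)·exp(i(i − j)θ), in which the absolute positions appear only through their difference.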
In the hippocampus, place cells fire in a particular region of space, but firing rate is not the only way they encode position. As the animal moves through a place field, the cell fires at progressively earlier phases of the theta rhythm. This is theta phase precession (O'Keefe and Recce, 1993). Position becomes phase. Relative position becomes phase difference. The substrate is biophysical, the mechanism has nothing to do with matrix multiplication, but the geometric move is the same as RoPE's.
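The "position becomes phase" step can be written down as a toy model. This sketch assumes an idealised, perfectly linear precession across a place field normalised to [0, 1]; real recordings are noisy and not exactly linear. The function names, the entry phase of 360°, and the default slope of 360° across the field are assumptions made for illustration.

```python
def firing_phase(x, entry_phase_deg=360.0, slope_deg=360.0):
    """Theta phase in degrees at normalised position x inside a place field.

    Idealised linear model (an assumption): the phase starts at
    entry_phase_deg on entering the field and falls by slope_deg in
    total by the time the animal leaves it.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie inside the place field, in [0, 1]")
    return entry_phase_deg - slope_deg * x


def phase_difference(x_a, x_b, slope_deg=360.0):
    """Phase at x_a minus phase at x_b; positive when x_b is further along."""
    return firing_phase(x_a, slope_deg=slope_deg) - firing_phase(x_b, slope_deg=slope_deg)


# Two pairs of positions with the same displacement (0.3 of the field).
early = phase_difference(0.2, 0.5)
late = phase_difference(0.6, 0.9)
print(abs(early - late) < 1e-9)
```

In this linear model the entry phase drops out of the difference, so the phase gap depends only on the displacement between the two positions, which is the same structure as the offset-only score in RoPE.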
Multiple oscillations at different scales can interfere into a stable spatial pattern: a grid-like lattice. This is the oscillatory-interference account, one of the classical models of entorhinal grid cells, and it is the third shape of the same idea: a small number of phases can carry a lot of spatial structure if you compose them right. RoPE's frequency ladder is the engineered version of the same principle.
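The frequency ladder itself is short enough to compute by hand. A minimal sketch, assuming the usual geometric schedule theta_k = base^(−2k/d) with d equal to twice the number of rotated pairs; the function names and the example values (4 pairs, base 10000) are chosen for illustration only.

```python
import math


def rope_frequencies(num_pairs, base=10000.0):
    """Angular frequency, in radians per token, for each rotated pair."""
    d = 2 * num_pairs
    return [base ** (-2.0 * k / d) for k in range(num_pairs)]


def rope_wavelengths(num_pairs, base=10000.0):
    """Number of tokens each pair needs to complete one full turn."""
    return [2.0 * math.pi / f for f in rope_frequencies(num_pairs, base)]


# Example values (assumed): 4 pairs, base 10000.
wavelengths = rope_wavelengths(num_pairs=4, base=10000.0)
for k, w in enumerate(wavelengths):
    print(f"pair {k}: wavelength ~ {w:.1f} tokens")
```

The fastest pair turns once every 2π tokens and each later pair turns more slowly by a constant factor, so a handful of phases spans offsets from a few tokens to thousands. Raising the base or adding pairs stretches the slow end of the ladder; the longest wavelength marks the scale beyond which even the slowest pair has wrapped around.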
Five presets traverse the design space. A LLaMA-style baseline. A long-range head with a high base and many pairs. A short-context head with a low base and few pairs, sharply localised. An extrapolation regime where the sequence outruns the longest wavelength. A neural-phase preset that pushes the phase slope to 360°, matching the O'Keefe-Recce range. The calibration table checks whether the toy attention concentration matches the canonical regime for each preset.
This is a sketch, not a transformer simulator. There is no softmax, no value projection, no multi-head averaging, no training. Content vectors are random rather than meaningful. The point is to make the position-rotation step legible: to show that RoPE is a clean engineering move that has a messy biological cousin, and that what they share is geometry, not implementation.