Pettini tensor networks · research · playgrounds

Introduction: three ideas, one playground

The Pettini tensor networks playground blends two well-established toolkits and one actively debated hypothesis into a single interactive model:

Facilitated diffusion (biophysics): proteins can locate target DNA sites faster than naive 3D diffusion by combining 3D diffusion with 1D motion (sliding/hopping) along DNA. Foundational theory and classic experiments formalized this in the early 1980s and it remains central to modern thinking, with many refinements (crowding, obstacles, multi-state search/recognition, chromatin geometry).
Tensor-network compression (computational physics/applied math): high-dimensional objects with local or limited-range correlations can often be compressed efficiently using tensor networks---notably matrix product states (MPS) and the closely related tensor-train (TT) format used in scientific computing. This is a mature mathematical technology, increasingly applied to classical probability distributions (e.g., chemical master equations).
Long-range electrodynamic recruitment (open research): under certain activated conditions, long-range resonant electrodynamic (ED) interactions between biomolecules could bias encounter rates beyond diffusion-only expectations. A 2022 study reported experimental evidence for long-range ED forces in vitro using two complementary readouts (FCS-based diffusion signatures and THz spectroscopy), and follow-on work (2025) extends the experimental program to other protein systems; a 2025 theoretical paper explicitly discusses DNA--protein ED recruitment. Even if these effects are real in vitro, their relevance to intracellular protein--DNA target search remains highly uncertain due to ionic screening, crowding, and competing mechanisms.

This document treats the playground as a teaching model: it is judged primarily by conceptual fidelity, honest signaling of uncertainty, and meaningful parameter-to-literature mapping---not by whether the current six-parameter formula can predict real search times.

Concepts and metrics in the code

The code implements a toy summary model rather than a mechanistic simulation. Six sliders (all 0--100) are converted to normalized fractions and combined into five metrics:

resonanceGain = r x a x g x (1 - 0.55 n). This encodes a gated resonance mechanism: recruitment only happens when frequency matching (r), activation (a), and coupling (g) are all present, then is damped by ionic noise (n).
baselineMobility = 0.55 d3 + 0.45 d1, a fixed linear blend of 3D diffusion and 1D sliding contributions.
targetBias = max(0, 0.15 + 0.95 resonanceGain), enforcing a nonspecific background floor.
compressibility = max(0.08, 0.82 - 0.45(r g) + 0.22 d1), which is meant to proxy how structured (and therefore compressible) the high-dimensional state might be under stronger long-range correlations vs local sliding correlations.
searchTime = max(0.12, 1.25 - 0.55 baselineMobility - 0.5 targetBias + 0.18 n), a normalized inverse-efficiency surrogate.

Two additional pedagogical layers are built on top:

A distribution over 80 DNA sites, comparing "with resonance" vs "without resonance," explicitly designed to show how a long-range attractor would concentrate probability near the target site.
A coupling sweep (coupling 0 to 100) to visualize tradeoffs among search time, target bias, and compressibility.

The design is defensible as a conceptual map, but the numerical coefficients are mostly author-chosen weights rather than derived constants; the companion should therefore avoid implying physical parameter identification unless additional mapping is added.

The model's narrative arc

The playground's presets ("classical search," "Pettini coupling," "cellular crowding") encode a teaching arc that mirrors real debates:

Start with the mainstream baseline. After decades of work, facilitated diffusion is a standard explanation for fast target location---though the exact mixture of sliding, hopping, intersegment transfer, and 3D excursions is protein- and context-dependent.
Introduce compression as a modern computational lens. Representing the full joint probability distribution over protein state + many DNA sites is exponentially large. MPS/TT factorization is the natural "physicist's compression" story: local tensors encode local variability; a bond dimension encodes correlation capacity.
Add a speculative long-range term. If a long-range, frequency-selective interaction existed and survived noise, it would act like an effective drift or bias field toward the target---conceptually similar to adding a position-dependent potential in a stochastic search problem. That is a legitimate "what would it look like?" question, even if the mechanism remains controversial.

Historical context and intellectual lineage

The playground's concepts come from three partially independent intellectual lineages that only recently began to "talk to each other" explicitly.

Facilitated diffusion and the "search problem"

Modern formulations trace to early observations that some DNA-binding proteins find targets at rates that appeared too fast for simple diffusion-limited expectations, sparking theoretical and experimental work on how nonspecific binding and 1D motion can accelerate target location. A key theoretical/experimental series is the 1981 work by Otto G. Berg, Peter H. von Hippel, and collaborators (Biochemistry) that formalized diffusion-driven translocation and multiple modes (sliding, hopping, intersegment transfer) as components of the search.

A widely cited synthesis is the 1989 minireview "Facilitated target location in biological systems," where von Hippel and Berg explicitly frame the conceptual puzzle and the idea that nonspecific binding can reduce effective dimensionality and speed up target finding.

By the early 2000s, work increasingly emphasized that facilitated diffusion is not "just sliding": it is a stochastic switching strategy, sensitive to salt, energy landscapes, DNA packaging, obstacles, and multi-state protein conformations. A canonical accessible review is by Steven E. Halford and John F. Marko, which helped codify the modern baseline and the experimental puzzles (e.g., when solution diffusion seems insufficient).

A complementary theoretical synthesis for physicists is the 2009 review by Leonid A. Mirny, which emphasizes mean search time calculations, constraints from sequence/recognition, crowding, and a key point the playground currently compresses into one slider: there is typically an optimum tradeoff between time spent sliding and time spent relocating in 3D.

In the 2000s--2010s, single-molecule experiments became decisive for giving this "search" picture real teeth. For example, p53 sliding was characterized in detail by Anahita Tafvizi and coworkers, illustrating how real proteins can combine domains and modes and how "sliding" itself can be multi-modal.

Tensor networks and compression of structured high-dimensional objects

Tensor networks emerged in condensed matter physics as a route to circumvent exponential state spaces by exploiting correlation structure. The density-matrix renormalization group (DMRG) formulated by Steven R. White became the flagship algorithm for 1D quantum systems.

The connection to matrix product states was clarified and formalized, including influential work by Frank Verstraete and J. Ignacio Cirac on why MPS approximations are effective in 1D under entanglement constraints (area-law-like behavior).

Modern reviews (including the 2021 RMP) place MPS/PEPS in a broader "entanglement as correlation language" framework that is directly relevant to the playground's compressibility story: as correlations become longer-range and stronger, one generally needs larger bond dimension to represent the object faithfully.

For learners, Roman Orus remains one of the most accessible on-ramps that explains the intuition without requiring full many-body machinery.

A critical bridge to classical biological modeling is the tensor-train (TT) format developed in numerical analysis. Ivan V. Oseledets introduced the TT decomposition as a structured low-rank factorization for high-dimensional tensors in scientific computing.

This line (often outside physics departments) strongly supports the playground's "biology version" claim: TT methods can compress high-dimensional probability mass functions and operators, including solutions of chemical master equations, enabling parameter inference tasks at scales where naive grids are impossible.

Similarly, CME time/state tensorization by Sergey Dolgov and Boris Khoromskij shows concrete ways to treat time as another tensor dimension---an idea closely parallel to the playground's conceptual generator equation in the outro.

Long-range electrodynamic recruitment and resonance selectivity

The third lineage is the most controversial. In 2022, a team including Marco Pettini reported experimental evidence for long-range resonant electrodynamic intermolecular forces between proteins in vitro, using two distinct measurement modalities (FCS-derived diffusion behavior and THz spectroscopy of collective modes), and argued that the forces required out-of-equilibrium activation and could act over distances up to ~1000 A.

The broader institutional context (and why this topic is now visibly "alive") is the EU-funded research program around long-range electrodynamic interactions, including Project LINkS (H2020 FET-Open) and its dissemination through EU catalogs and project pages.

In 2025, follow-on work reported long-range interactions in a light-harvesting protein context and claimed that thermal energy alone could activate attractive interactions up to hundreds of nanometers in that system.

Also in 2025, a theoretical paper on "electrodynamic forces driving DNA-protein interactions" explicitly frames how resonance and activation could yield selective recruitment in a DNA--protein enzyme system.

From the playground's standpoint, the key teaching point is not "this is true," but "if such a mechanism existed, here is the signature it would have in a search process": a distance-dependent attraction gated by activation and spectral overlap (as the outro writes). That is a legitimate scientific "forward model" question, but it should be visually and textually labeled as hypothesis-driven rather than consensus.

Empirical grounding and open questions

This section maps the playground's explicit assumptions to the current evidence base and highlights where the present confidence labels are fair, too strong, or missing key qualifiers.

Facilitated diffusion is well-supported, but its simplest form is not universal

The playground's "facilitated diffusion" assumption being established is reasonable in the sense that alternating 1D/3D motion is a mainstream framework, supported by decades of theory and many single-molecule experiments.

However, the code's linear mobility blend (0.55 d3 + 0.45 d1) compresses a richer literature where search efficiency typically depends on switching rates, nonspecific binding energy landscapes, obstacles, DNA packing, and---in vivo---cellular confinement and crowding.

A key conceptual correction is that facilitated diffusion usually has an optimal regime: too much nonspecific binding or too much time in the wrong mode can slow search.

Short-range recognition is established, but "binary recognition" is pedagogical, not empirical

The assumption that final recognition is driven by contact-scale chemistry is consistent with structural and biophysical understandings of base readout and shape readout emphasized by Remo Rohs and colleagues.

But the assumption labeled "binary recognition" being established is too strong: multi-step recognition, conformational changes, and proofreading-like tradeoffs are central themes in the recognition literature. The conformational proofreading framework of Yonatan Savir and Tsvi Tlusty is a canonical example of why recognition cannot always be reduced to an instantaneous yes/no event without risking incorrect intuitions about speed vs specificity.

Tensor-network compression is mathematically mature; its biological "always works" reading is not

The "tensor compression" assumption is easiest to justify if phrased conditionally: if correlations are sufficiently local/structured, low-rank tensor formats will efficiently represent the joint object.

In classical scientific computing, TT methods demonstrably compress high-dimensional probability distributions arising from biological reaction networks (CME), and even support Bayesian inference pipelines---this strongly supports the playground's claim that "biological state spaces are natural targets," at least in some regimes.

What remains contested (and correctly labeled as such in the assumptions) is whether a protein--DNA search distribution with long-range couplings, variable DNA geometry, and heterogeneous environments admits a low-bond-dimension MPS approximation at biologically relevant accuracy and scales. That is testable in simulation by measuring rank/bond-dimension growth with system size at fixed approximation error.

Long-range ED recruitment remains an open research question; cellular relevance is the hard part

The 2022 in-vitro evidence claim is real and peer-reviewed, and the authors explicitly argue for resonant selectivity and significant ranges.

The follow-on 2025 work strengthens the case that the experimental group is building a coherent program across systems.

But whether these effects persist in vivo is plausibly limited by ionic screening and competing forces. A basic physical anchor for "ionic noise" is the Debye screening length in physiological ionic strength solutions, commonly < 1 nm, meaning fixed-charge electrostatic effects are strongly screened at nanometer scales in bulk electrolyte.

The playground's "cellular crowding" preset uses high ionicNoise to demonstrate masking, which is directionally consistent with screening intuition---but the code currently treats noise as a simple linear penalty on resonance rather than modeling how salt and crowding also change nonspecific binding and sliding itself.

Calibration cases: the cited papers support regimes, not the numeric "expected" scalars

The calibration table assigns "expected" searchTime values (0--1 scale) for lac repressor, p53 scanning, diffusion-only, high salt screening, and a hypothetical resonance-assisted case. The cited sources strongly support the qualitative regimes (e.g., lac repressor as a facilitated diffusion exemplar; p53 sliding; salt effects on protein--DNA association), but do not uniquely justify these particular normalized "expected" numbers.

Moreover, when the current code is applied to those parameter sets, the predicted searchTime values differ substantially from the "expected" targets (e.g., lac repressor case in this toy formula is closer to ~0.86 than 0.55). This is not automatically a flaw---if "expected" is meant as a pedagogical anchor---but it should be labeled as illustrative rather than "supported by" the references unless a mapping to measured rates is added.

Guided experiments in the playground

These are concrete explorations a user can run that connect directly to the code's structure (and to the literature's conceptual questions). They are written as lab-style tasks so the playground produces explicit "aha" moments.

Explore the facilitated-diffusion baseline and the idea of intermittent search

Start from "classical search" and set resonanceMatch, activation, coupling to zero. You should see the narrative shift to "pure facilitated diffusion," and the distribution should become relatively broad (diffusion base + a local bump near the start from sliding). This aligns with the idea that alternating modes can accelerate search by effectively increasing target cross-section.

Test the "optimality" misconception: does more sliding always help?

Increase sliding1D while decreasing diffusion3D. In the current toy formula, mobility improves linearly with both, and sliding also increases "compressibility," so you may see monotonic improvements that reinforce a common misconception. The literature suggests the real picture is typically non-monotone: too much nonspecific binding/slow sliding can trap the protein and slow total search, motivating multi-state models. Use this mismatch as a prompt: the toy results show why adding a "two-state" or "frustration" parameter would deepen fidelity.

Turn on the "Pettini coupling" knobs and interpret the signature correctly

Switch to "Pettini coupling." Compare "with resonance" vs "without resonance" distributions: the key signature the playground wants to teach is probability mass concentrating near the target site when long-range recruitment exists. Treat this as a qualitative "what it would look like" indicator, not as proof it happens in cells.

Stress-test the masking story with ionic noise and screening intuition

In the "cellular crowding" preset, increase ionicNoise further (or start from "Pettini coupling" and crank ionicNoise up). The code applies a linear penalty to the resonanceGain and adds an explicit noise term to searchTime; you should watch resonance-driven target bias collapse toward baseline. Compare this to the basic screening argument: at physiological ionic strength, the Debye length is sub-nanometer, so any long-range effect would need a mechanism not trivially reduced to static electrostatics.

Use the "coupling sweep" to learn what "compressibility" is trying to say

Run a sweep and look for the intended story: stronger coupling lowers searchTime (benefit) but reduces compressibility (cost), illustrating that adding global correlations can make the state less locally factorable. Connect this to the tensor-network principle that stronger long-range correlations generally demand larger bond dimension.

Connect "compressibility" to real TT/MPS practice

After observing how "compressibility" moves with parameters, reflect on what would make it operational: in TT/MPS workflows, compressibility is not a free scalar; it is usually measured by (a) the required TT ranks/bond dimensions to achieve a given approximation error, or (b) entanglement/mutual information growth proxies. This motivates a natural "next version" feature: show bond dimension vs error for the current distribution.

Suggested improvements

Below are 16 actionable improvements, grouped by category.

Parameter corrections

Re-map diffusion3D to a physically interpretable diffusion coefficient (um^2/s), with presets anchored to in vivo ranges. Typical bacterial cytoplasm values are on the order of ~1--20 um^2/s for small proteins (context-dependent).
Re-map sliding1D to a 1D diffusion constant or sliding length per binding event, and make the range consistent with single-molecule studies. 1D sliding is often characterized via D1 or a mean sliding length; values vary by protein and conditions.
Replace the fixed 0.55/0.45 mobility blend with a regime-dependent tradeoff that can produce an optimum search time as a function of binding/switching. The qualitative optimum is a core result of facilitated diffusion theory.
Make ionicNoise map to ionic strength (I) or Debye length, and apply it to both resonance and nonspecific binding/sliding. At physiological ionic strength the Debye length is typically < 1 nm.
Re-label "activation" as "out-of-equilibrium pumping" and constrain it in the "cellular crowding" preset unless an explicit pumping mechanism is chosen. Possible mechanisms: light-driven, thermal, chemical gradients, ATP-driven; default to "none" in cellular conditions unless specified.

Missing features

Add first-passage-time (FPT) outputs (distribution, not just a steady-state-like proxy). Simulate toy trajectories under the current generator assumptions and display MFPT and variance; clarify that target finding is a first-passage problem.
Add a two-state (search vs recognition) mechanism to address the speed--stability paradox. Show that faster sliding can conflict with strong recognition without state switching.
Add obstacles/crowding on DNA (roadblocks, nucleosomes) and show their effect on sliding and effective search. Implement "roadblock density" and allow hopping/intersegment transfer to bypass them.
Add DNA geometry beyond a linear lattice: coiling/looping or "contact shortcuts." Implement a simple graph where DNA sites have occasional 3D contacts; show how contacts can dominate kinetics in some regimes.
Make compressibility operational: compute TT/MPS ranks for the displayed distribution and show approximation error vs bond dimension. For the current 80-site distribution, compute TT ranks under a chosen tolerance and display the bond dimension as the "compressibility."

Conceptual corrections

Downgrade "steady-state distribution" from "established" to "simplifying," and explicitly teach quasi-stationary vs first-passage viewpoints.
Downgrade "binary recognition" from "established" to "simplifying," and link to multi-step recognition/proofreading ideas.
Reframe calibration "expected" values as qualitative bands (e.g., near diffusion-limited / moderately accelerated / strongly accelerated) unless a unit mapping is added.
Clearly separate "electrostatics/screening" from "resonant electrodynamic recruitment" in the UI language. Debye screening limits static charge effects; the ED hypothesis is claimed to be activation- and resonance-dependent.

Interesting additions

Add a "model-to-literature" tab that turns each assumption into an explicit experiment sketch and measurement modality. For long-range ED: specify a minimal replication design (FCS diffusion signatures + THz modes) and list confounders (convection, electrokinetic artifacts).
Add a cross-disciplinary "search strategies" module linking facilitated diffusion to intermittent search theory. Provide a toggle between (a) "biophysics language" (3D/1D modes) and (b) "intermittent search language" (detecting vs relocating phases), showing they are the same mathematical archetype.

Annotated further reading

Berg, Winter & von Hippel (1981) --- Foundational theory of diffusion-driven translocation on nucleic acids; introduces multiple physical pathways (including reduced-dimensional search) in a rigorous kinetic framework.
Halford & Marko (2004) --- A highly accessible review of how site-specific DNA-binding proteins find targets; strong entry point for undergraduates moving into the research literature.
Mirny (2009) --- Physicist-friendly analysis of facilitated diffusion, emphasizing mean search time calculations, crowding/packing considerations, and the necessity of coupling search to recognition.
Tafvizi et al. (2011) --- Single-molecule characterization of p53 search on DNA; a concrete example of how domain structure and multiple modes complicate "just sliding" narratives.
Rohs et al. (2010) --- Review of protein--DNA recognition emphasizing base and shape readout; essential context for why recognition is rarely truly "binary."
White (1992) --- The DMRG paper; historically important for showing how structured truncations can tame exponential state spaces in 1D systems.
Cirac et al. (2021) --- Reviews MPS/PEPS theory and theorems; best single "anchor reference" for the tensor-network side of the playground's story.
Oseledets (2011) --- Introduces tensor-train decomposition in numerical analysis; crucial for understanding how "tensor networks for biology" often operate in practice (classical tensors, not wavefunctions).
Lechelon et al. (2022) and Perez-Martin et al. (2025) --- Core experimental references behind the "Pettini coupling" hypothesis; valuable for understanding what is claimed, what is measured (FCS + THz), and what remains unproven about intracellular relevance.