At HedgeWestSF in San Francisco, I presented alongside Oscar Stiffelman of Nand Capital, whose paper 'Investing is Compression' reframes portfolio management through information theory. I'd already built his framework into Sourcetable as a live tool. Here's what the paper actually says, how I implemented it, and where the real-world version has to go further.
Andrew Grosser
June 3, 2026 • 10 min read
Last night I was at The House of AI in San Francisco's SoMa neighborhood for the HedgeWestSF Quant, Data, and AI meetup — a room full of quants, fund managers, and technologists who take the mechanics of markets seriously. Three presentations: fractal trading from Ramki Ramakrishnan of Wavetimes, a compressed look at information theory and investing from Oscar Stiffelman of Nand Capital, and my own talk on creating trading strategies with AI. What made the evening unusual was that Oscar and I were, unknowingly, presenting two ends of the same idea — him from the theoretical side, me from a live implementation.
Oscar's paper, 'Investing is Compression' (arXiv:2604.10758, April 2026), is one of the more original takes on portfolio theory I've read. His central argument isn't about risk in the conventional sense. It's about information. Specifically: investing, at its core, is a compression problem. I'd already read the paper and built a version of it into Sourcetable. So I was watching his talk knowing exactly where his math leads — and where a practitioner has to make choices the paper deliberately leaves open.
The thesis starts with Kelly (1956) and a formulation most quants know: if you're in a multiplicative world — one where returns compound — the rational objective isn't maximizing expected return or minimizing volatility. It's maximizing expected log-wealth. Kelly showed this at Bell Labs in the context of information theory, and the lineage runs straight from Bernoulli (1738) through Shannon to Cover's universal portfolios.
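To see why the objective changes in a compounding world, here is a minimal sketch with a made-up coin-flip bet (the +100% / -60% payoffs are illustrative, not from the paper). The arithmetic mean return is positive, yet the expected log growth is negative, so repeatedly staking everything on it shrinks wealth over time.

```python
import math

def arithmetic_mean_return(outcomes):
    """Expected simple return of a list of (probability, return) pairs."""
    return sum(p * r for p, r in outcomes)

def expected_log_growth(outcomes):
    """Expected log-wealth growth per period when the whole stake is bet."""
    return sum(p * math.log(1.0 + r) for p, r in outcomes)

# Hypothetical bet: 50% chance of +100%, 50% chance of -60%.
coin_flip = [(0.5, 1.0), (0.5, -0.6)]

mean_return = arithmetic_mean_return(coin_flip)  # positive, about +20%
log_growth = expected_log_growth(coin_flip)      # negative: 0.5 * ln(2 * 0.4)

print(f"arithmetic mean return: {mean_return:+.3f}")
print(f"expected log growth:    {log_growth:+.3f}")
```

The two numbers disagree in sign, which is the whole argument for judging a compounding strategy by expected log-wealth.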
Oscar's contribution is a three-term decomposition of the Kelly growth rate (his equation 60):
g(W) = log(R̄) − H(W*) − D_KL(W* ∥ W)

where
log(R̄) is the money term (payoff/rules),
H(W*) is the entropy term (irreducible uncertainty),
D_KL(W* ∥ W) is the divergence term (the drag, and the only thing W controls).

The money term and entropy term don't depend on your portfolio allocation W. You can't change the payoff structure of the market, and you can't eliminate the irreducible uncertainty in the true distribution W*. The only lever you have is the divergence term: the KL divergence between your allocation W and the true unknown distribution W*. Investing, therefore, is a compression problem: describe the world (construct your portfolio) in as few wasted bits as possible, where 'wasted bits' means distance from the true distribution.
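The same three-term shape can be checked numerically in the textbook one-winner horse race from Kelly's setting. This is a sketch under assumptions: the probabilities, allocation, and odds are invented, and the money term is written here as Σ p·log(o), which may not match the paper's log(R̄) notation exactly.

```python
import math

def growth_rate(p, b, o):
    """Expected log growth when betting fractions b at odds o, true probs p."""
    return sum(pi * math.log(bi * oi) for pi, bi, oi in zip(p, b, o))

def money_term(p, o):
    """Payoff term: depends on the odds, not on the allocation."""
    return sum(pi * math.log(oi) for pi, oi in zip(p, o))

def entropy(p):
    """Shannon entropy of the true distribution, in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """D_KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]  # true win probabilities (plays the role of W*)
b = [0.4, 0.4, 0.2]  # bettor's allocation (plays the role of W)
o = [2.0, 3.0, 6.0]  # payoff per unit staked on each horse

direct = growth_rate(p, b, o)
decomposed = money_term(p, o) - entropy(p) - kl_divergence(p, b)
print(direct, decomposed)  # the two agree up to floating-point error
```

Only `kl_divergence(p, b)` changes when `b` changes, and it is zero exactly when `b` equals `p`.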
Two things follow immediately from this framing that clash with standard finance. First: risk is NOT volatility. The Sharpe ratio is an additive-world metric; in a compounding world it's irrelevant. The correct measure of risk is the probability of ruin — and log(0) is negative infinity, so you must have strictly zero risk of ruin. Second: asymmetry (bounded downside, unbounded upside) is the lever. Volatility is symmetric. The information-theoretic framework cares about the shape of the distribution, not its spread.
The mathematical machinery that makes this tractable is what Oscar calls Cover's 'sum of products' trick. Multi-period wealth is a product of per-period sums — and expanding that product turns it into a sum over exponentially many sequences, each representing which asset 'carries' each period. Group those sequences into type classes by frequency, and the class whose symbol frequencies match your allocation W dominates. Off-class mass decays exponentially in the KL divergence between the true distribution and W.
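The product-to-sum expansion is easy to verify by brute force on a toy case. The sketch below uses two assets, three periods, and invented price relatives; it expands the product into all 2³ sequences and groups them into type classes by how many periods each asset "carries".

```python
import itertools
import math

weights = [0.6, 0.4]  # illustrative allocation W
# Price relatives (1 + return) per period, one column per asset.
relatives = [[1.10, 0.95], [0.90, 1.05], [1.20, 1.00]]

# Multi-period wealth as a product of per-period sums.
product_of_sums = math.prod(
    sum(w * x for w, x in zip(weights, period)) for period in relatives
)

# The same wealth as a sum over every sequence of "carrying" assets.
sum_of_products = 0.0
mass_by_type = {}
for seq in itertools.product(range(len(weights)), repeat=len(relatives)):
    term = math.prod(weights[i] * relatives[t][i] for t, i in enumerate(seq))
    sum_of_products += term
    # A type class is identified by how often each asset appears.
    counts = tuple(seq.count(i) for i in range(len(weights)))
    mass_by_type[counts] = mass_by_type.get(counts, 0.0) + term

print(product_of_sums, sum_of_products)
print(mass_by_type)
```

With only three periods the dominance of one type class is not yet visible; the point here is just that the two expressions are the same number.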
The punchline: once you've collapsed the problem this way, the non-binary multi-asset case has the same shape as a one-winner horse race. The odds and payoffs drop out. The only thing that matters is who wins each period. This is why Oscar's central heuristic focuses on P(dominates) rather than modeling the full joint return distribution.
Oscar's 'Winner Fraction' is the practical upshot. When you can't model the full joint return distribution — which is always, in practice — allocate each asset in proportion to its probability of dominating the candidate set. His equation 69 gives a clean bound on the growth shortfall relative to the true optimal portfolio:
g(W*) − g(W') ≤ H(W')

where W' is the Winner Fraction (your allocation) and H(W') is the entropy of that distribution in bits. Lower entropy → more concentrated allocation → tighter growth bound.

The elegance is real: you don't need to know the true distribution to bound your worst-case shortfall. You just need to know how concentrated your winner-probability distribution is. A winner fraction that puts 80% on one asset has low entropy and a tight bound. A uniform distribution over ten assets has high entropy and a loose bound, which is exactly the no-edge baseline that the paper says you should never pretend to have escaped.
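A minimal sketch of the two quantities involved, assuming P(dominates) is estimated by simply counting how often each asset had the best return in a history. That counting rule and the helper names are assumptions for illustration; the paper does not prescribe how the probabilities are obtained.

```python
import numpy as np

def winner_fraction(returns):
    """Share of periods in which each asset had the highest return.

    returns: 2-D array-like, rows are periods, columns are assets.
    """
    returns = np.asarray(returns, dtype=float)
    winners = returns.argmax(axis=1)
    counts = np.bincount(winners, minlength=returns.shape[1])
    return counts / counts.sum()

def entropy_bits(w):
    """Shannon entropy of an allocation, in bits."""
    w = np.asarray(w, dtype=float)
    nonzero = w[w > 0]
    return float(-(nonzero * np.log2(nonzero)).sum())

concentrated = [0.8, 0.1, 0.1]  # 80% on one asset (remainder is illustrative)
uniform_ten = [0.1] * 10        # no-edge allocation over ten assets

print(entropy_bits(concentrated))  # under 1 bit
print(entropy_bits(uniform_ten))   # log2(10), a bit over 3.3 bits
```

The entropy value is what the bound above caps the shortfall at, so the concentrated allocation carries a much tighter guarantee than the uniform one.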
Here's where Oscar's paper and my talk diverged — productively. The paper is rigorous about what it can and cannot claim. It's explicit that the individual D_KL terms are not computable because the true distribution W* is unknown. Only differences between strategies are measurable — you can compare two allocations, but you can't compute the absolute divergence from the truth. Oscar's framework is correct about its own limits.
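Because the money and entropy terms are the same for every allocation, they cancel when two strategies are compared on the same data, leaving only the difference in divergence. A sketch of that comparison, with hypothetical function names and an invented three-period history:

```python
import numpy as np

def realized_log_growth(returns, weights):
    """Average per-period log growth of a fixed-weight portfolio."""
    returns = np.asarray(returns, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.mean(np.log1p(returns @ weights)))

def growth_gap(returns, w_a, w_b):
    """How much faster allocation A compounded than allocation B."""
    return realized_log_growth(returns, w_a) - realized_log_growth(returns, w_b)

history = [[0.10, -0.05], [0.02, -0.01], [0.05, 0.00]]  # made-up returns
gap = growth_gap(history, [1.0, 0.0], [0.0, 1.0])
print(gap)  # positive: the first asset compounded faster on this history
```

The gap is a sample estimate, so it inherits the noise of the history it is computed on.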
A live tool has to make choices the paper deliberately leaves open. Specifically: where does the edge come from? The Winner Fraction requires P(asset dominates) — but the paper doesn't tell you how to estimate that probability. It just says you must 'disagree with the market while being closer to the true distribution.' How you get closer is the entire problem of active management, left as an exercise to the practitioner.
In Sourcetable's CompressionAnalyst, I built a tiered edge-source system that's explicit about what each input is:
Edge source priority in CompressionAnalyst:
The distinction between levels 1 and 4 matters enormously. Level 4 — historical dominance frequency — is the honest null hypothesis. It's what you get if you have no real edge. Level 1 is where the tool becomes useful: you've done some analysis (sentiment, wave patterns, earnings estimates, whatever), converted it to a probability of domination, and plugged it in. The Winner Fraction then allocates proportionally and tells you your entropy bound. You know exactly how much growth you're giving up for the uncertainty in your signal.
The paper treats the edge as given. The tool has to produce it. In Sourcetable, CompressionAnalyst connects to the rest of the analytical stack via the signal_source parameter — you can run SentimentAnalyst, WaveAnalyst, or TechnicalAnalyst per ticker, and their (signal, confidence) outputs are fused into P(dominates) via a softmax. Each analyst is a 'sensor.' CompressionAnalyst is the portfolio layer that takes those sensors and translates them into an allocation.
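One way such a fusion could look, as a hedged sketch: each ticker's score is taken to be signal × confidence, and a softmax turns the scores into probabilities. The scoring rule, the `temperature` parameter, the function name, and the tickers are all assumptions for illustration, not the CompressionAnalyst internals.

```python
import math

def fuse_signals(readings, temperature=1.0):
    """Map {ticker: (signal, confidence)} to a probability per ticker.

    Assumes signal is in [-1, 1] and confidence in [0, 1] (hypothetical ranges).
    """
    scores = {t: s * c / temperature for t, (s, c) in readings.items()}
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(v - peak) for t, v in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

readings = {
    "AAA": (0.8, 0.9),   # strong bullish signal, high confidence
    "BBB": (0.2, 0.5),   # weak signal
    "CCC": (-0.6, 0.7),  # bearish signal
}
p_dominates = fuse_signals(readings)
print(p_dominates)
```

With zero confidence everywhere, every score is zero and the output falls back to a uniform distribution, which matches the no-edge baseline.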
This matters because Oscar's talk included a point about Hurst exponents and fractal structure, specifically in reference to Ramki's presentation on fractal trading, which ran just before his. I noted in my own talk that, in the information-theoretic frame, Hurst ≈ 0.5 is essentially saying the series is incompressible: a random walk has maximum entropy and provides no edge. Hurst > 0.5 (persistent, trending) is a compressibility signal. Hurst < 0.5 (mean-reverting) is a different compressibility signal. WaveAnalyst's fractal output is a direct proxy for compressibility, which maps cleanly onto the CompressionAnalyst frame. The three talks at HedgeWestSF were, unintentionally, three views of the same underlying idea.
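For readers who want to see the Hurst reading in practice, here is a rough sketch using a simple estimator: fit the slope of log(standard deviation of lagged differences) against log(lag). This is one common quick estimate, not necessarily what WaveAnalyst computes, and the synthetic series are generated with a fixed seed.

```python
import numpy as np

def hurst_exponent(series, max_lag=50):
    """Estimate the Hurst exponent from how lagged differences scale."""
    series = np.asarray(series, dtype=float)
    lags = np.arange(2, max_lag)
    spread = [np.std(series[lag:] - series[:-lag]) for lag in lags]
    slope, _intercept = np.polyfit(np.log(lags), np.log(spread), 1)
    return float(slope)

rng = np.random.default_rng(0)
steps = rng.standard_normal(20_000)
random_walk = np.cumsum(steps)  # cumulative sum of independent steps

h_walk = hurst_exponent(random_walk)  # expected to land near 0.5
h_noise = hurst_exponent(steps)       # stationary noise: well below 0.5
print(h_walk, h_noise)
```

The random walk sits near 0.5, the incompressible case, while the stationary noise series reads as strongly mean-reverting.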
The paper focuses on the Winner Fraction as the allocation. The implementation adds a Kelly sizing layer on top: given the Winner Fraction as a warm start, maximize E[log(1 + w·r)] via projected gradient ascent over the return history. This is Kelly's own objective — the one that appears in equation 60 — and it's the only expectation in the entire pipeline. Everything upstream is counting and logarithms; the Kelly step is where probability enters.
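A minimal sketch of that sizing step for the fully-invested, long-only case, assuming every simple return is strictly above -100%. The simplex projection routine, step size, and iteration count are my choices for illustration, and the return history is invented.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def kelly_weights(returns, warm_start, steps=500, lr=0.5):
    """Maximize mean(log(1 + returns @ w)) by projected gradient ascent."""
    returns = np.asarray(returns, dtype=float)
    w = project_to_simplex(np.asarray(warm_start, dtype=float))
    for _ in range(steps):
        growth = 1.0 + returns @ w                    # per-period wealth multiple
        grad = (returns / growth[:, None]).mean(axis=0)
        w = project_to_simplex(w + lr * grad)         # step, then re-project
    return w

history = np.array([[0.10, -0.05], [0.02, -0.01], [0.05, 0.00]])
w = kelly_weights(history, warm_start=[0.5, 0.5])
print(w)  # asset 0 won every period, so the weight moves onto it
```

Starting from the Winner Fraction rather than equal weights only changes `warm_start`; the objective is concave, so the warm start affects speed, not the destination.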
The risk-of-ruin constraint is enforced structurally. With max_leverage = 1.0 (fully-invested, long-only), 1 + w·r can never reach zero because simple returns are bounded below at -100%. The risk of ruin is not computed; it's architecturally impossible. If you want leverage (max_leverage > 1), the tool computes risk of ruin as the fraction of historical periods where the levered book would have been wiped out, and reports it explicitly. Oscar's point about log(0) = -∞ isn't a theoretical nicety — it's an engineering constraint baked into the sizing layer.
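The levered check described above can be sketched in a few lines. It assumes leverage simply scales the portfolio return each period; the function name and the sample history are hypothetical.

```python
import numpy as np

def historical_ruin_fraction(returns, weights, max_leverage=1.0):
    """Fraction of past periods in which the levered book hit zero or below."""
    returns = np.asarray(returns, dtype=float)
    weights = np.asarray(weights, dtype=float)
    book = 1.0 + max_leverage * (returns @ weights)  # wealth multiple per period
    return float(np.mean(book <= 0.0))

history = [[-0.5, -0.4], [0.1, 0.2], [-0.3, -0.6], [0.05, 0.0]]  # made-up data
weights = [0.5, 0.5]

unlevered = historical_ruin_fraction(history, weights, max_leverage=1.0)
levered = historical_ruin_fraction(history, weights, max_leverage=3.0)
print(unlevered, levered)
```

At `max_leverage=1.0` the result is zero for any long-only weights as long as no asset loses 100% in a period; at 3x the two -45% periods in this history would each have wiped out the book.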
Oscar presented the theory. I'd already shipped the practice. When he showed equation 69 — the entropy bound on growth shortfall — I had code running that computes exactly that number for any candidate set, any signal, any look-back. When he talked about the no-edge baseline and the requirement to 'disagree with the market while being closer to the truth,' I had a tool that labels its own output levels with exactly that warning.
The gap between a white paper and a live analyst is where most implementations get lost. A paper can say 'the edge is given.' A tool has to produce the edge, or be honest that it hasn't. A paper can leave the Kelly sizing as an implication of the math. A tool has to implement it, handle numerical edge cases, and enforce the risk-of-ruin floor as a hard constraint rather than a theoretical property. A paper can describe type classes analytically. A tool has to decide when to enumerate actual sequences and when to report the analytic approximation — and label both clearly.
That's what I mean when I talk about turning a white paper into a live analyst. It's not about automating the paper's conclusions. It's about making every decision the paper leaves open into an explicit, labeled, honest choice — so the person using the tool knows exactly what they're getting and what they're not.
CompressionAnalyst in Sourcetable:
The next HedgeWestSF meetup will be interesting. There are at least two presentations' worth of conversation sitting between Oscar's information-theoretic framework and the practical question of where the edge actually comes from — which is, ultimately, the only question in active management that matters.
The paper describes a framework you'd re-apply each time you have new data. The implementation has to do better than that. The entire point of a compounding world — which is what information-theoretic investing is about — is that your edge compounds too. A single run of CompressionAnalyst is interesting. A saved workflow that re-runs it automatically on fresh data, every day, and routes the allocation to live execution is a different category of tool entirely.
In Sourcetable, you build a workflow once: connect your data sources, configure your signal pipeline (SentimentAnalyst → CompressionAnalyst → Kelly sizing), set your tickers and parameters. Save it. Now it's a persistent, reusable analysis — not a one-off chat session you restart each morning. Change the candidate set from sector ETFs to individual stocks, or swap the sentiment signal for wave analysis, and the same workflow structure runs cleanly on the new dataset. The compression framework doesn't care what the assets are; it only cares about dominance probabilities and the entropy of your winner distribution.
The Robinhood integration takes this from persistent analysis to automated execution. Sourcetable connects directly to Robinhood via a patent-pending encrypted credential system — the credentials never touch our servers in plaintext, and execution is protected against man-in-the-middle attacks. When the workflow produces a Kelly-sized allocation, you can have it execute automatically. Or review it first and approve. The loop is fully configurable: how often it runs, what it does with the output, whether it requires a human in the decision.
This is what the gap looks like between Oscar's white paper and a live trading system. The paper proves that maximizing expected log-wealth is the right objective and that the Winner Fraction bounds your shortfall. A live system has to refresh market data daily, re-run the dominance counts, update Bayesian priors with new evidence, recompute Kelly weights, and route the result to execution — all reliably, all with an audit trail, all without requiring you to remember to run it. Saved workflows in Sourcetable handle the refresh loop. Robinhood handles the execution. The math is Oscar's.
The full automated trading loop in Sourcetable: