Adaptive Landscapes

60 min read

Toy Coffee Fitness Landscape: Build a Complete Search Model

Build a small fitness landscape from binary strings, score every configuration, choose a neighborhood, then compare search rules on the same terrain.


Encode choices

Turn five coffee decisions into binary strings and enumerate the full configuration space.

Score interactions

Assign values to all 32 configurations so epistasis, local peaks, and valleys become visible.

Compare walkers

Run different search rules on the same scored network to see how dynamics changes the outcome.

A toy binary configuration lattice lifts into a value surface while a search trace moves through neighbors.

A Landscape You Can Hold in Your Head

The previous module defined the landscape as a four-component system: configuration space (X), neighborhood (N), value function (f), dynamics (D). You saw each component mapped onto protein engineering, coffee recipes, and neural networks. This module puts the framework to work: you will build a complete landscape from scratch, encode it in binary, score it by hand, define neighborhoods, and run search algorithms across it. Coffee carries the main example. Pick a system of your own to build alongside: a workout split, a recipe, a study schedule, anything you have tried to improve through repeated adjustment.

A home brewer spends months chasing a better cup, changing the roast, the grind size, the water temperature, whether to add milk, whether to grind beans fresh each morning. Some changes help. Some make it worse. For a while, the search stalls: every tweak fixes one thing while breaking another. A finer grind makes the flavor richer but introduces bitterness. Milk softens the bitterness but buries the subtle notes that made the dark roast worth drinking.

Three questions locate you on a landscape you cannot yet see:

  1. What did you change last time you tried to improve it?
  2. How did you know whether the change helped?
  3. Have you ever been stuck where nothing you tried seemed to make it better?

"Stuck" is the lived experience of a local optimum: a configuration where every allowed one-step change makes things worse, even though a much better configuration may exist in territory you have not explored. What counts as an "allowed" change is a modeling choice this module will define precisely. For now, think of it as the kind of change you normally try: adjusting one variable at a time. The landscape framework exists to explain why you get stuck and what you can do about it.

These traps are real. They show up in coffee brewing, workout programming, drug design, neural network training. Researchers study them in full-scale systems all the time. The reason to start small: five binary variables produce 32 configurations, enough to enumerate every one, score them all, trace every path, and verify by direct inspection where the traps form and why. Real coffee involves continuous grind settings, precise temperatures, brew times, bean origins, water mineral content, pour rates. The space is too large to see whole, and every measurement carries noise. A toy makes the entire landscape visible, so you build the structural vocabulary on a system you can hold in your head, then carry it into systems where the space outgrows inspection.

Classify

A coffee experimenter tries five single-variable changes to their current recipe. Each one makes the coffee worse. They conclude their recipe is the best possible. What is wrong with that conclusion?

The experimenter found a local optimum: a configuration that scores higher than every configuration reachable by a single change. That does not make it globally optimal. A far better recipe may exist elsewhere in the configuration space, separated by intermediate recipes that score worse. "No one-step improvement exists from here" is a different claim from "no better configuration exists anywhere." The landscape framework makes this distinction precise: a local optimum is defined relative to a neighborhood (the set of configurations reachable by one allowed move), and the global optimum may sit beyond that neighborhood's reach. In a clean additive model where each position contributes independently, with no tied weights and no constraints, there is one strict global optimum, and local optimality does imply global optimality. When positions interact (a property called epistasis), multiple peaks become possible, and getting stuck at one says nothing about the height of the others.

Why a Toy, and Why Binary

A toy model is a deliberate simplification designed to make one mechanism visible by stripping away everything else. A wind tunnel model of a car has no seats, no stereo, no paint job. It needs to behave correctly with respect to drag. The parts it leaves out are the price of isolating the part that matters.

The toy for fitness landscapes uses binary strings: sequences of zeros and ones, where each position represents a single on/off decision. Why binary? Because it makes the combinatorial structure visible with minimal overhead. Each position has two states, written as 0 or 1. Binary strips away magnitude, continuity, and domain-specific meaning, preserving the structure this module needs: which combinations of the retained decisions exist and which configurations differ by one change. This is a modeling choice. Many real variables are continuous or ordered, and binarizing them destroys that information. The cost is worth paying here because it makes the entire space small enough to inspect.

Start with one decision: roast. A single bit encodes it: 0 for light, 1 for dark. One position, two configurations, done.

Add grind as a second position, and each recipe becomes a two-character string read left to right: 00 is light roast with coarse grind, 10 is dark roast with coarse grind, 11 is dark roast with fine grind. Two decisions, four combinations. The pattern scales to any number of positions.

The coffee example uses five:

| Position | Decision | 0 | 1 |
|---|---|---|---|
| 1 | Roast | Light | Dark |
| 2 | Grind | Coarse | Fine |
| 3 | Water temperature | Below 92 C | Above 92 C |
| 4 | Milk | No milk | Milk added |
| 5 | Fresh beans | Pre-ground | Ground today |

Reading a five-bit string works the same way: go left to right, one digit per row. 11101 means position 1 is 1 (dark roast), position 2 is 1 (fine grind), position 3 is 1 (hot water), position 4 is 0 (no milk), position 5 is 1 (freshly ground).
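The reading rule can be checked mechanically. This is an illustrative sketch, not part of the course materials: the `DECISIONS` list and `decode` helper simply restate the table above in Python.

```python
# Illustrative encoding of the five coffee decisions from the table above.
DECISIONS = [
    ("Roast", "Light", "Dark"),
    ("Grind", "Coarse", "Fine"),
    ("Water temperature", "Below 92 C", "Above 92 C"),
    ("Milk", "No milk", "Milk added"),
    ("Fresh beans", "Pre-ground", "Ground today"),
]

def decode(bits):
    """Read a five-bit string left to right, one decision per position."""
    return [off if b == "0" else on
            for b, (_, off, on) in zip(bits, DECISIONS)]

print(decode("11101"))
# ['Dark', 'Fine', 'Above 92 C', 'No milk', 'Ground today']
```

Swapping in a different `DECISIONS` table (workouts, study habits) changes the labels without touching the structure, which is the point of the encoding.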

Five binary decisions. Each one throws away information. Grind size is really a spectrum from Turkish to French press, collapsed here to two bins. Water temperature has an infinite range, split at 92 C. Brew time, water mineral content, bean origin, pour rate: all excluded entirely.

What does the encoding lose? It cannot distinguish a 91 C brew from an 80 C brew; both register as "below 92 C." If the optimal temperature is 88 C, the model cannot represent it. That is the price of making the whole space small enough to inspect, and it is the first genuine modeling decision the toy asks you to make.

Each five-bit string carries the combinatorial structure of the retained choices. The domain meaning (coffee, workouts, study habits) can be swapped without changing the structure. That is the point.

Your turn: Pick 4-5 binary decisions for the system you identified in Section 1. Write them down in a table like the one above. For each decision, name what 0 means and what 1 means. Then name at least one variable you excluded and say what you lose by excluding it.

Classify

You encode your system as 5 binary positions. Which structural property does the encoding preserve from the original system?

Binary encoding preserves combinatorial structure: which combinations of the retained decisions exist and how they neighbor each other in the space. It destroys magnitudes (a "coarse" grind is just 0, carrying no indication of how coarse) and ordering (there is no "medium" between 0 and 1). Whether those decisions interact is a separate question that only becomes visible once you add a value function: the encoding defines the states and their adjacency, and the scoring rule reveals whether positions help independently or interfere. The trade for these losses is tractability: with binary, the space is finite, enumerable, and free of domain noise, which lets you observe how configuration space, value functions, neighborhoods, and dynamics interact without measurement error or continuous optimization clouding the structural picture.

Build the Configuration Space

The configuration space is the set of every possible combination of your binary decisions. Formally, X = \{0, 1\}^n, where n is the number of positions.

For the coffee model, n = 5, so |X| = 2^5 = 32 configurations. Here are a few of them, translated back into coffee:

00000 = Light roast, coarse grind, cool water, no milk, pre-ground
10101 = Dark roast, coarse grind, hot water, no milk, ground today
11101 = Dark roast, fine grind, hot water, no milk, ground today
01010 = Light roast, fine grind, cool water, milk, pre-ground
11111 = Dark roast, fine grind, hot water, milk, ground today

Thirty-two recipes, all of them enumerable. That is the point of a toy: the space is small enough to see whole.

Watch what happens when the space grows. Add one more decision (stirring vs. not stirring) and the space doubles to 64. Add another (filtered vs. unfiltered water) and it reaches 128. Each binary decision doubles the total:

|X| = 2^n.

At n = 5, there are 32 configurations. At n = 10, there are 2^{10} = 1{,}024. At n = 20, there are 2^{20} = 1{,}048{,}576. Twenty binary decisions, each one just an on/off choice, produce over a million possible configurations. The difficulty of search is not that any single choice is complicated; the difficulty is that choices multiply.

This is why you cannot simply try everything. Exhaustive search works at n = 5: an experimenter could brew all 32 coffees in a weekend. At n = 20, that same experimenter would need nearly 2,900 years of daily brewing. Search algorithms exist because configuration spaces outgrow enumeration fast.
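Both claims, full enumeration at n = 5 and hopeless enumeration at n = 20, are a few lines of standard-library Python to verify (a sketch, not course code):

```python
from itertools import product

# Enumerate X = {0,1}^n as strings; at n = 5 the whole space fits in a list.
n = 5
configs = ["".join(bits) for bits in product("01", repeat=n)]
print(len(configs))             # 32
print(configs[0], configs[-1])  # 00000 11111

# The same enumeration at n = 20, at one brew per day, takes millennia.
print(round(2 ** 20 / 365))     # 2873 years of daily brewing
```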

Your turn: You committed to 4-5 binary decisions in the previous section. Count your configurations: if you chose 5 decisions, you have 32 candidates. Write down 4-5 of them in binary and translate each one back into plain language. You have just defined your configuration space.

Paste this into Claude (artifact mode), Gemini Canvas, or ChatGPT canvas. It will generate an interactive app you can use in your browser.

Build a single-page interactive HTML/CSS/JavaScript app that visualizes binary configuration spaces. No frameworks, no build tools, just vanilla JS with inline styles.

DOMAIN CONTEXT: A "configuration" is a binary string of length n. Each bit represents an on/off decision. For the coffee example, 5 bits encode: (1) Roast: Light=0 / Dark=1, (2) Grind: Coarse=0 / Fine=1, (3) Water temp: Cool=0 / Hot=1, (4) Milk: No=0 / Yes=1, (5) Beans: Pre-ground=0 / Fresh=1. The string 11101 means dark roast, fine grind, hot water, no milk, freshly ground. With n bits there are 2^n total configurations.

THE APP SHOULD HAVE TWO PANELS SIDE BY SIDE:

LEFT PANEL - Configuration table:

  • A slider to set n from 1 to 8 (default 5).
  • Show ALL 2^n configurations in a scrollable table.
  • Each row: the binary string, plus a human-readable label. For n=5, use the coffee meanings above. For other n values, label positions generically (Decision 1: OFF/ON, etc.).
  • Highlight the row the user hovers over. Clicking a row selects it and highlights all its Hamming-1 neighbors (strings differing in exactly 1 bit) in a different color.
  • Show total count: "2^n = [value] configurations."

RIGHT PANEL - Exponential growth chart:

  • Plot 2^n for n = 1 to 25 on a log-scale y-axis (use canvas or SVG).
  • Draw a vertical marker at the current slider value of n.
  • Label key milestones on the chart: n=5 → 32, n=10 → 1,024, n=20 → 1,048,576.

Use a dark background (#1a1a2e or similar). Make it look clean.

Classify

A configuration space has 10 binary positions. You add one more position (a new on/off decision). How does the space change?

Each new binary position doubles the total. With 10 positions you have 2^{10} = 1{,}024 configurations. Adding an 11th position means every existing configuration now has two versions (the new bit is 0 or 1), giving 2^{11} = 2{,}048. This multiplicative compounding is why exhaustive search becomes impossible long before the space feels large in any individual dimension. Twenty binary decisions produce over a million configurations; thirty produce over a billion.

Configuration space

The 32-Recipe Hypercube

The landscape is coming into view.

Score Every Configuration by Hand

A configuration space alone is just a set of possibilities. To create a landscape, you need a value function: a rule that assigns a score to every configuration. The value function gives every configuration a height. Which configurations count as peaks depends on both the value function and the neighborhood (defined in the next section): a peak is a configuration whose score is higher than every configuration reachable by one allowed move.

Before writing any scoring code, score a deliberately chosen sample by hand. The scores below are one brewer's personal ratings, 0 to 10, for how much they enjoy each cup. The sample includes matched pairs so you can see whether a change has the same effect in different backgrounds:

| Config | Roast | Grind | Temp | Milk | Fresh | Score |
|---|---|---|---|---|---|---|
| 10100 | Dark | Coarse | Hot | No | Pre-ground | 4 |
| 10101 | Dark | Coarse | Hot | No | Fresh | 7 |
| 11101 | Dark | Fine | Hot | No | Fresh | 8 |
| 01001 | Light | Fine | Cool | No | Fresh | 10 |
| 01101 | Light | Fine | Hot | No | Fresh | 6 |
| 01111 | Light | Fine | Hot | Milk | Fresh | 7 |
| 11111 | Dark | Fine | Hot | Milk | Fresh | 5 |
| 11001 | Dark | Fine | Cool | No | Fresh | 4 |
| 00000 | Light | Coarse | Cool | No | Pre-ground | 2 |
| 10111 | Dark | Coarse | Hot | Milk | Fresh | 6 |
| 11100 | Dark | Fine | Hot | No | Pre-ground | 5 |
| 00011 | Light | Coarse | Cool | Milk | Fresh | 3 |

Look at position 5 (fresh beans). To test whether fresh beans have an additive effect, compare matched pairs: two recipes that differ only in the fresh-beans position. Compare 10100 (score 4) with 10101 (score 7): fresh beans add 3 points in that background. Compare 11100 (score 5) with 11101 (score 8): fresh beans add 3 points in that second background. In these two matched pairs, fresh beans add the same amount each time. That suggests an approximately additive effect for those backgrounds. To claim a truly stable additive contribution, we would need the same contrast across more backgrounds.
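The matched-pair comparison can be written out directly. The scores below are the brewer's ratings copied from the table above; only the two pairs that isolate the fresh-beans bit are included:

```python
# Brewer's ratings for the two matched pairs that isolate position 5.
scores = {"10100": 4, "10101": 7, "11100": 5, "11101": 8}

# Each pair differs only at position 5 (pre-ground vs. ground today).
for off, on in [("10100", "10101"), ("11100", "11101")]:
    print(f"{off} -> {on}: fresh beans add {scores[on] - scores[off]}")
# Both pairs report +3: consistent with an additive effect in these backgrounds.
```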

Now look at positions 1 and 4 (roast and milk). To test for interaction, use a clean 2×2 contrast where only roast and milk change and everything else stays fixed. Pull four recipes from the table that share the same grind, temperature, and freshness (fine, hot, fresh) and differ only in roast and milk:

| Roast | Milk | Config | Score |
|---|---|---|---|
| Light | No milk | 01101 | 6 |
| Dark | No milk | 11101 | 8 |
| Light | Milk | 01111 | 7 |
| Dark | Milk | 11111 | 5 |

Without milk, switching from light to dark adds 2 points. With milk, switching from light to dark subtracts 2 points. The effect of roast changes sign depending on milk. That is epistasis: the contribution of one position depends on the state of another.
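The sign flip falls out of simple subtraction on the 2×2 contrast above (a verification sketch using the brewer's scores):

```python
# The 2x2 contrast above: grind, temperature, and freshness held fixed.
scores = {"01101": 6, "11101": 8, "01111": 7, "11111": 5}

effect_no_milk = scores["11101"] - scores["01101"]  # light -> dark, no milk
effect_milk    = scores["11111"] - scores["01111"]  # light -> dark, with milk
print(effect_no_milk, effect_milk)  # 2 -2

# Opposite signs in the same contrast: the effect of roast depends on milk.
assert effect_no_milk * effect_milk < 0
```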

The table above revealed epistasis without any mathematical machinery. The interaction structure was already in the scoring; the toy model made it visible.

The epistasis might look like an artifact of subjective scoring. It is subjective, and that is the point: the interaction structure belongs to the scorer's preferences for this system. If those preferences were truly additive, hand-scoring would produce no interactions, because each position would contribute the same amount in every context. Additive means each improvement works alone; epistatic means improvements interfere.

The simplest value function treats each position as independent. The total score is the sum of each position's individual contribution, counted only when that position is ON.

Start with one position. Fresh beans contributed roughly 3 points in the hand-scoring above, so assign weight w_5 = 3 and write the value function as f(x) = w_5 \cdot x_5. Here x_5 is the value of position 5: either 0 or 1. When fresh beans are ON (x_5 = 1), the term contributes 3 \times 1 = 3. When OFF (x_5 = 0), it contributes 3 \times 0 = 0. The multiplication by x_5 is a switch: the weight counts when the position is ON and vanishes when it is OFF.

Add a second position. Dark roast contributes roughly 2 points, so w_1 = 2, and the function extends to two terms: f(x) = w_1 x_1 + w_5 x_5 = 2 x_1 + 3 x_5. A configuration with both ON scores 2 + 3 = 5; dark roast ON with pre-ground scores 2 + 0 = 2. Each term contributes independently.

The general form writes this pattern for all n positions:

f_{\text{additive}}(x) = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n

Because each position's contribution is fixed regardless of context, the optimal recipe is straightforward: turn ON every positive-weight position and OFF every negative-weight position. If every weight is nonzero and no constraints exclude any configuration, there is one strict global optimum. There are no traps because improving any single position always improves the total. If a weight is exactly zero, both settings tie for that position, and the single peak broadens into a plateau. The structural point holds either way: no interactions means no traps.
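A minimal sketch of the additive case, with illustrative weights (not fitted to the brewer's ratings):

```python
from itertools import product

# Sketch of an additive value function; the weights are illustrative.
weights = [2, 1, 1, -1, 3]  # w_1 ... w_5; a negative weight means ON hurts

def f_additive(bits):
    return sum(w * int(b) for w, b in zip(weights, bits))

# Read the optimum off the signs: ON where the weight is positive.
best = "".join("1" if w > 0 else "0" for w in weights)
print(best, f_additive(best))  # 11101 7

# Brute-force check: no configuration beats it, so there are no traps.
assert all(f_additive("".join(c)) <= f_additive(best)
           for c in product("01", repeat=5))
```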

Now add a second layer: pairwise interactions. The total score is the sum of individual contributions plus a bonus or penalty for every pair of positions that are both ON.

See this with two positions first. Dark roast contributes w_1 = 3 and milk contributes w_4 = 2 individually. Their additive baseline is f(x) = 3 x_1 + 2 x_4. To encode the interference observed in the hand-scoring, add one interaction term: f(x) = 3 x_1 + 2 x_4 + (-4) \cdot x_1 x_4. The product x_1 x_4 equals 1 only when both positions are ON; otherwise it contributes nothing. When both are ON: 3 + 2 + (-4) = 1 point total, less than either contributes alone. Each ingredient helps in isolation; together they partly cancel. The coefficient w_{1,4} = -4 is the interaction weight: it encodes the size and direction of the interference. The product x_1 x_4 is the gate that activates it.

The general form writes this structure for all positions and all pairs:

f_{\text{epistatic}}(x) = \sum_{i} w_i x_i + \lambda \sum_{(i,j)} w_{ij} \, x_i \, x_j

The first sum, \sum_{i} w_i x_i, is the additive baseline. The second sum adds a w_{ij} x_i x_j term for every pair (i, j). The parameter \lambda scales how strongly interactions matter relative to the individual weights.
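The general form translates directly into code. The weights below are illustrative, chosen to mirror the roast/milk clash worked out above; they are not the brewer's ratings.

```python
# Sketch of the epistatic form: additive baseline plus gated pairwise terms.
w = {1: 3, 4: 2}          # individual weights, 1-indexed positions
w_pair = {(1, 4): -4}     # interaction weight: penalty when both are ON
lam = 1.0                 # lambda: scales interactions vs. the baseline

def f_epistatic(bits):
    x = {i + 1: int(b) for i, b in enumerate(bits)}
    additive = sum(wi * x[i] for i, wi in w.items())
    pairwise = sum(wij * x[i] * x[j] for (i, j), wij in w_pair.items())
    return additive + lam * pairwise

print(f_epistatic("10000"))  # 3.0 : dark roast alone
print(f_epistatic("00010"))  # 2.0 : milk alone
print(f_epistatic("10010"))  # 1.0 : both ON, so the -4 gate fires
```

The product `x[i] * x[j]` is the gate: it is 1 only when both positions are ON, so each interaction term contributes nothing otherwise.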

Interactions make the landscape rugged. Ruggedness measures local unpredictability: how much the payoff of a single change varies depending on the rest of the configuration. Any epistasis creates ruggedness, because once a position's contribution depends on context, the same change can help in one background and hurt in another. As \lambda grows and interactions strengthen, the surprises get larger.

A rugged landscape can still have a single peak. Suppose interactions only change the magnitude of each position's effect: dark roast adds 3 points in one recipe and 1 point in another, always a gain. The landscape is bumpy, with uneven payoffs, yet every uphill path leads to the same summit.

Multiple peaks require reciprocal sign epistasis: two positions that each reverse the other's effect. The coffee scores demonstrate this. Roast reverses depending on milk: dark adds 2 points without milk, subtracts 2 with milk. Milk also reverses depending on roast: milk adds 1 point with light roast, subtracts 3 with dark roast. Each position negates the other. That mutual reversal creates a valley no single-step path can cross without losing score, splitting the landscape into separated peaks. Poelwijk, Tănase-Nicola, Kiviet, and Tans (2011) proved the general result: reciprocal sign epistasis between at least one pair of positions is necessary for multiple peaks to exist. A landscape with magnitude-only interactions can be arbitrarily rugged and still have one peak.

This quadratic model captures pairwise interactions. Higher-order interactions, where three or more positions affect each other simultaneously, exist in real systems; the N-K module later in this course introduces a framework for tuning that complexity.

Same configuration space, different value function, different landscape. Score the same 32 coffees by taste, and this brewer's global peak is light-roast, fine-grind, cool-water, no-milk, fresh: a delicate cold cup. The dark-roast, fine-grind, hot, no-milk, fresh recipe remains a strong local favorite, but not the best cup in the full space. Score the same configurations by affordability (higher score for cheaper cups), and the optimum shifts entirely: pre-ground light roast brewed with cool water wins. The value function is a choice about what to measure, and that choice reshapes the entire terrain.

The same principle applies within a single criterion. Two people scoring these 32 recipes by "taste" would assign different numbers, producing different peaks, different traps, and different interaction patterns over the same configuration space. Modeling both preferences would require two separate value functions: two landscapes, one per scorer. "Taste" is not a single function until you specify whose.

This structure appears in real biology. Weinreich, Delaney, DePristo, and Hartl (2006) studied five point mutations in beta-lactamase, an enzyme involved in antibiotic resistance. Because each mutation can be absent or present, the five mutations define 2^5 = 32 genotypes. The researchers scored each genotype by its resistance to the antibiotic cefotaxime: five binary positions, 32 configurations, a measured fitness value for each. The same combinatorial structure as the coffee toy. Epistasis pervaded the landscape. Some mutations that improved resistance in one genetic background reduced it in another. Because five mutations can be acquired in 5! = 120 possible orders, there are 120 direct mutational paths from the starting genotype to the five-mutation genotype. Of those 120 paths, 102 were inaccessible to strict step-by-step selection because they did not increase resistance at every step. Only 18 paths increased resistance monotonically. A five-bit landscape in nature, exhibiting the same structural traps this toy model makes visible.

Your turn: Score 10 of your own configurations on a 0-to-10 scale. As you score them, watch for interactions: does the effect of one position change depending on what another position is doing? If yes, your system has epistasis. If every position contributes roughly the same amount regardless of context, your system may be approximately additive.

Paste this into Claude (artifact mode), Gemini Canvas, or ChatGPT canvas. It will generate an interactive app you can use in your browser.

Build a single-page interactive HTML/CSS/JavaScript app that lets the user compare two value functions on the same configuration space. No frameworks, no build tools, just vanilla JS with inline styles.

DOMAIN CONTEXT: A configuration space is the set of all binary strings of length n. A "value function" scores every configuration. An ADDITIVE value function assigns a weight to each bit position; the score is the sum of weights for positions that are ON (1). Because each position contributes independently, there is exactly one global optimum: turn ON every positive-weight position. An EPISTATIC value function adds pairwise interaction terms: for every pair of positions (i,j), if both are ON, a bonus or penalty applies. Interactions can create multiple local peaks, where no single-bit flip improves the score, yet a much better configuration exists elsewhere.

THE APP:

  1. CONTROLS at the top:

    • Slider for n (number of bit positions), range 4 to 8, default 5.
    • A "Regenerate weights" button that picks new random weights.
    • A slider labeled "Epistasis strength" (0 to 3, default 1.5) that scales how strong the pairwise interaction terms are relative to the additive weights.
  2. TWO SIDE-BY-SIDE LANDSCAPE VIEWS (additive on left, epistatic on right):

    • Each view plots ALL 2^n configurations as dots on a chart: x-axis = Hamming weight (number of ON bits, 0 to n), y-axis = score.
    • Color-code the dots: local peaks in gold, the global optimum in bright green, all others in a muted color. A local peak is a configuration that scores higher than every configuration differing from it in exactly 1 bit.
    • Show a stats bar under each chart: total peaks, global optimum score, global optimum binary string.
  3. INTERACTION: hovering over a dot shows the binary string, its score, and its human-readable label. For n=5, use coffee labels: (1) Roast: Light=0 / Dark=1, (2) Grind: Coarse=0 / Fine=1, (3) Water temp: Cool=0 / Hot=1, (4) Milk: No=0 / Yes=1, (5) Beans: Pre-ground=0 / Fresh=1. For other values of n, label generically.

  4. KEY INSIGHT the app should make visible: the additive landscape always has exactly 1 peak; the epistatic landscape (at sufficient strength) typically has multiple peaks. The reader should be able to slide epistasis strength from 0 to 3 and watch peaks multiply.

Use a dark background. Make it look clean and educational. Use a seeded pseudorandom number generator so the same seed gives the same weights, and the "regenerate" button picks a new seed.

Predict

You score coffee configurations by hand. Position 1 (dark roast) adds 2 points when position 4 (milk) is OFF, but subtracts 2 points when position 4 is ON. What structural property have you observed?

When the effect of one position changes sign or magnitude depending on another position's state, the two positions interact. This is epistasis. It means you cannot optimize those positions independently: the best setting for roast depends on what milk is doing. In a landscape with epistasis, flipping a single bit can help in one configuration and hurt in another. That kind of sign reversal makes multiple peaks possible, though epistasis alone does not guarantee them. Reciprocal sign epistasis, where two positions each reverse the other's effect, is a necessary condition for multiple peaks. Compare this with position 5 (fresh beans), where the effect is roughly +3 across the tested backgrounds: that is an additive contribution, and positions like that can be optimized independently regardless of context.

Rugged terrain

The Valley of Bad Taste


Define What Counts as One Step

When the coffee experimenter tries to improve a recipe, how big is a typical change? Most experimenters change one thing at a time: switch from light to dark roast, or switch from coarse to fine grind. Rarely would someone change the roast AND the grind AND the temperature AND the milk AND the beans simultaneously.

That restraint reflects the cost structure of experimentation. Changing one variable makes attribution possible: the experimenter can see what that change did. Changing five at once makes attribution impossible: if the coffee improved, there is no way to tell which change mattered. This experimental logic is also a modeling decision: it defines the neighborhood of each configuration.

For binary strings, the simplest neighborhood is Hamming distance 1: two strings are neighbors if they differ at exactly one position. In notation, N_1(x) is the set of all configurations y whose Hamming distance from x is exactly 1, meaning they differ at one position. For a string of length n, this gives exactly n neighbors: one for each position you could flip.

Here is the one-bit neighborhood of the brewer's dark-roast local favorite, 11101 (dark, fine, hot, no milk, fresh):

| Neighbor | Change | What it means |
|---|---|---|
| 01101 | Flip position 1 | Switch to light roast |
| 10101 | Flip position 2 | Switch to coarse grind |
| 11001 | Flip position 3 | Switch to cool water |
| 11111 | Flip position 4 | Add milk |
| 11100 | Flip position 5 | Switch to pre-ground |

Five neighbors, each reached by changing exactly one decision.
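Generating that table is a one-liner per flip. A sketch (the helper name is ours, not from the course):

```python
def hamming1_neighbors(bits):
    """All strings differing from `bits` at exactly one position."""
    flip = {"0": "1", "1": "0"}
    return [bits[:i] + flip[b] + bits[i + 1:] for i, b in enumerate(bits)]

print(hamming1_neighbors("11101"))
# ['01101', '10101', '11001', '11111', '11100']
```

A string of length n always yields exactly n one-bit neighbors, one per flippable position.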

What happens when the experimenter relaxes that restraint and changes two things at once? Clean attribution is sacrificed: if the coffee improves, there is no way to tell which change mattered. The benefit is access to configurations that no single change can reach from the current position.

A wider neighborhood allows flipping one or two positions simultaneously. See this at minimal scale first. Shrink to two decisions: roast and grind. Starting at 10 (dark roast, coarse grind), the one-bit neighbors are 00 (switch roast) and 11 (switch grind). The two-bit neighbor is 01 (switch both). The full neighborhood under one-or-two-bit moves is {00, 11, 01}: every other configuration in this small space.

The formal notation writes this rule for any configuration x:

N_2(x) = \{y \in X : 1 \le d_H(x, y) \le 2\}

Read this as: N_2(x) is the set of all configurations y in the space X whose Hamming distance d_H(x, y) (the count of positions where they differ) is between 1 and 2. The constraint 1 \le d_H \le 2 means you can flip one position or two, but you cannot stay in place. This gives n + \binom{n}{2} neighbors. The notation \binom{n}{2}, read "n choose 2," counts the ways to pick which two positions to flip. For n = 5: 5 one-bit changes plus \binom{5}{2} = 10 two-bit changes, giving 15 neighbors total. The two-bit neighbors include moves like "switch to light roast AND add milk" or "switch to coarse grind AND use pre-ground": changes that involve two simultaneous adjustments.

Configuration 11101 scores 8 in our brewer's ratings. Under Hamming-1, every single-bit neighbor scores lower: 01101 gets 6, 10101 gets 7, 11001 gets 4, 11111 gets 5, 11100 gets 5. Then 11101 is a local peak: better than everything one step away. The search is stuck; every single change makes the coffee worse.

Under Hamming-2, 10 additional neighbors become visible. One of them, 01001 (light roast, fine grind, cool water, no milk, fresh), scores 10. That recipe requires flipping two positions simultaneously. It was invisible under one-step moves. The same configuration, the same score, the same position in the space, went from "peak" to "not a peak" because the neighborhood widened.
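The peak-status flip can be verified against the hand scores. This sketch only compares 11101 against the neighbors that appear in the brewer's table (the scores below are copied from it); an exhaustive check would need all 32 scores:

```python
from itertools import combinations

# Brewer's scores: 11101, its five 1-bit neighbors, one 2-bit neighbor.
scores = {"11101": 8, "01101": 6, "10101": 7, "11001": 4,
          "11111": 5, "11100": 5, "01001": 10}

def flip(bits, positions):
    return "".join({"0": "1", "1": "0"}[b] if i in positions else b
                   for i, b in enumerate(bits))

def neighbors(bits, radius):
    """Every configuration within Hamming distance 1..radius of `bits`."""
    return [flip(bits, set(pos))
            for r in range(1, radius + 1)
            for pos in combinations(range(len(bits)), r)]

x = "11101"
print(len(neighbors(x, 2)))  # 15 = 5 one-bit moves + 10 two-bit moves

def is_peak(bits, radius):
    # Only scored neighbors are checked; unscored ones are skipped.
    return all(scores[bits] > scores[y]
               for y in neighbors(bits, radius) if y in scores)

print(is_peak(x, 1))  # True: beats every scored 1-bit neighbor
print(is_peak(x, 2))  # False: 01001 scores 10, two flips away
```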

A peak is a configuration better than all its neighbors. Change what counts as a neighbor, and you change what counts as a peak.

Is a peak a property of the coffee system, or a property of how the experimenter chose to search?

Your turn: Look at your own scored configurations. Pick your best one. List its Hamming-1 neighbors (flip each position one at a time and translate each into plain language). Is your best configuration better than all of them? If so, it is a local peak under one-step moves. Then consider: would it survive as a peak if you could change two things at once?

Paste this into Claude (artifact mode), Gemini Canvas, or ChatGPT canvas. It will generate an interactive app you can use in your browser.

Build a single-page interactive HTML/CSS/JavaScript app that demonstrates how changing the neighborhood definition changes which configurations count as peaks. No frameworks, no build tools, just vanilla JS with inline styles.

DOMAIN CONTEXT: Two binary strings are "Hamming-1 neighbors" if they differ in exactly 1 bit. They are "Hamming-2 neighbors" if they differ in 1 or 2 bits. A configuration is a "local peak" if its score is higher than ALL of its neighbors. Widening the neighborhood (from Hamming-1 to Hamming-2) gives each configuration more neighbors to beat, so some peaks under the narrow neighborhood stop being peaks under the wide one. This is the key insight: peaks are properties of the neighborhood definition, not just the scoring function.

SETUP (generate this automatically when the app loads):

  • Use n=6 binary positions (64 configurations). Generate random additive weights for each position (drawn from normal distribution) plus random pairwise interaction terms scaled by an epistasis_strength parameter (default 1.5). Use a seeded PRNG so results are reproducible, with a "New landscape" button to regenerate.

THE APP:

  1. A toggle switch at the top: "Hamming-1 neighborhood" vs. "Hamming-2 neighborhood."

  2. A dot plot of all 64 configurations: x-axis = Hamming weight (number of ON bits), y-axis = score. Color the dots by status:

    • Gold: local peak under the CURRENT neighborhood setting.
    • Red with a strikethrough or X marker: was a peak under Hamming-1 but is NOT a peak under Hamming-2 (only visible when Hamming-2 is selected, to show which peaks vanished).
    • Muted: everything else.
  3. Stats bar showing: "Peaks under Hamming-1: [X] | Peaks under Hamming-2: [Y] | Peaks eliminated: [X-Y]"

  4. Click any dot to see a detail panel: the binary string, its score, all its neighbors under the current neighborhood, and whether any neighbor beats it. If it lost peak status when switching to Hamming-2, highlight the specific 2-bit neighbor that beats it.

  5. An epistasis strength slider (0 to 3) that regenerates the landscape and lets the reader see how interaction strength affects peak counts under both neighborhoods.

Use a dark background. Make it visually clear which peaks survive and which vanish.

Predict

You have a rugged landscape with 12 local peaks under one-bit moves. You switch to a two-bit neighborhood (allowing two simultaneous bit flips). What happens to the number of local peaks?

Widening the neighborhood gives each configuration more neighbors it must beat to qualify as a local peak. Some configurations that were better than all one-bit neighbors will have a better two-bit neighbor they could not previously see. Those configurations stop being peaks. When the wider neighborhood includes all the old neighbors plus new ones (a nested neighborhood), the count of local peaks can only stay the same or decrease. In practice, it usually decreases substantially. The peak was real relative to the old neighborhood, and gone relative to the new one: a property of the search process, encoded in the tuple.
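The nesting argument can be checked directly: on any landscape, every Hamming-2 peak is also a Hamming-1 peak. A brute-force sketch on a random toy landscape (the weights and interaction strengths here are arbitrary, not the coffee scores):

```python
import itertools, random

random.seed(0)
n = 6
configs = [''.join(bits) for bits in itertools.product('01', repeat=n)]

# Random additive weights plus pairwise epistatic terms (arbitrary toy values).
w = [random.gauss(0, 1) for _ in range(n)]
pair = {p: random.gauss(0, 1.5) for p in itertools.combinations(range(n), 2)}

def score(c):
    s = sum(w[i] for i in range(n) if c[i] == '1')
    return s + sum(v for (i, j), v in pair.items() if c[i] == c[j] == '1')

def peaks(radius):
    def within(c, d):
        return 0 < sum(a != b for a, b in zip(c, d)) <= radius
    return {c for c in configs
            if all(score(c) > score(d) for d in configs if within(c, d))}

p1, p2 = peaks(1), peaks(2)
print(len(p1), len(p2), p2 <= p1)  # the Hamming-2 peaks are a subset of the Hamming-1 peaks
```

The subset relation holds for every seed, because a Hamming-2 peak had to beat all the Hamming-1 neighbors plus the new ones. The counts, though, depend on the particular landscape drawn.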

Neighborhood trap

Trapped in a Local Maximum

The landscape is coming into view.

The Controlled Experiment

The four-component tuple is a laboratory instrument: X for the configuration space, N for the neighborhood, f for the value function, D for the dynamics. Its power comes from one operation: change exactly one component, hold the other three fixed, observe what happens.

Experiment 1: Change the value function. Keep X (the 32 coffee configurations), keep N (one-bit moves), keep D (greedy hill-climbing). Switch f from "taste score" to "affordability" (higher score for cheaper cups). Under the taste function, the global peak was light roast, fine grind, cool water, no milk, freshly ground: a delicate cold cup. Under affordability, the peak shifts to light roast, coarse grind, cool water, no milk, pre-ground: the cheapest possible configuration. Same space, same walker, same step size; the value function alone moved the peaks.

Experiment 2: Change the neighborhood. Keep X, keep f (taste), keep D (greedy). Switch N from one-bit to two-bit moves. Some peaks that trapped the greedy walker under one-bit moves disappear, because they now have better two-bit neighbors. A wider neighborhood can remove traps and often improves greedy search in rugged landscapes, though the outcome should be measured for each case. The neighborhood changed which configurations count as traps.

Experiment 3: Change the dynamics. Keep X, keep N (one-bit), keep f (taste). Switch D from greedy hill-climbing to simulated annealing. Greedy gets stuck at the nearest peak; annealing sometimes accepts a worse recipe temporarily, crosses a shallow valley, and reaches a higher peak on the other side. The terrain is identical; the walker's willingness to take temporary losses changed the destination.

Each experiment produces one clean conclusion because only one variable moved. This ability to run controlled comparisons, where only one piece of the tuple changes, is the central methodological skill the toy builds.
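Experiment 1 is small enough to script. The sketch below runs the same greedy walker, same one-bit neighborhood, same 5-bit space, and swaps only f; the two value functions are hypothetical stand-ins, not the hand-scored coffee table:

```python
def greedy(start, score):
    """Greedy hill-climbing under one-bit moves: take the best neighbor until stuck."""
    current = start
    while True:
        nbrs = [current[:i] + ('1' if current[i] == '0' else '0') + current[i + 1:]
                for i in range(len(current))]
        best = max(nbrs, key=score)
        if score(best) <= score(current):
            return current
        current = best

# Two hypothetical value functions over the same configuration space.
def taste(c):  # rewards ON features, plus an interaction between bits 0 and 2
    return sum(int(b) for b in c) + (4 if c[0] == c[2] == '1' else 0)

def cost(c):   # affordability: fewer ON features, cheaper cup, higher score
    return -sum(int(b) for b in c)

print(greedy('00000', taste))  # climbs to 11111
print(greedy('00000', cost))   # stays at 00000: already the affordability peak
```

Same X, same N, same D; only f changed, and the walker's destination moved across the entire space.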

Predict

You fix X, N, and D (greedy hill-climbing) and switch f from an additive value function to an epistatic one. Under the additive function, greedy found the global optimum every time. What happens under the epistatic function?

Under the additive function, there was one peak, and greedy always climbed to it regardless of starting point. Under the epistatic function, the interaction pattern creates multiple peaks. Greedy still climbs to the nearest peak from each starting point, and now different starting points lead to different peaks, many of them inferior to the global optimum. The walker did not change; the terrain became rugged, and the same deterministic strategy that succeeded on additive terrain gets trapped on epistatic terrain.

Compare

You run greedy hill-climbing on the same epistatic landscape twice: once using Hamming-1 neighbors (one-bit moves) and once using Hamming-2 neighbors (two-bit moves). Which run is more likely to reach a higher peak?

The Hamming-2 walker can see more neighbors at each step, which means some configurations that trapped the Hamming-1 walker are no longer peaks. The wider neighborhood removes some traps, allowing greedy to climb past configurations that would have stopped it under one-bit moves. The wider neighborhood also changes the path greedy takes, so the outcome should be measured rather than assumed. The tradeoff: each step requires evaluating more neighbors (n + \binom{n}{2} instead of n), costing more computation per move. In real systems, this tradeoff between exploration range and evaluation cost is one of the fundamental design choices.
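The cost side of that tradeoff is a counting exercise, sketched here for a few sizes:

```python
from math import comb

def evals_per_step(n, radius):
    """Neighbors a greedy walker must score per move under a Hamming-`radius` neighborhood."""
    return sum(comb(n, r) for r in range(1, radius + 1))

for n in (5, 7, 12):
    print(n, evals_per_step(n, 1), evals_per_step(n, 2))
# 5 bits:  5 vs 15 evaluations per step
# 7 bits:  7 vs 28
# 12 bits: 12 vs 78
```

The Hamming-2 bill grows quadratically while the Hamming-1 bill grows linearly, which is why wide neighborhoods get expensive fast in real systems.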

Path dependence

Path Dependence

The landscape is coming into view.

Build Your Walkers

The landscape (X, N, f) defines the terrain: which configurations exist, how they connect, and how they score. The dynamics D is the search process that moves across this terrain. The coffee experimenter has options. Methodically try every single-variable change, always keeping the best? Sometimes try something unpromising, hoping it opens a path? Maintain several parallel experiments? Each search process has a character, and different characters, on the same terrain, reach different destinations.

Greedy hill-climbing. The experimenter evaluates every one-step change to the current recipe. Moves to whichever neighbor improves the score the most. Repeats until no single-step neighbor improves the score. It always stops at a local optimum under the chosen neighborhood, and that local optimum need not be global. No looking back. No accepting a worse cup. When the peak turns out to be mediocre, the search is stuck: every path to something better passes through something worse first.

Random mutation hill-climbing. The experimenter picks one random change (say, switch from fine to coarse grind). If the score improves, keep it. Otherwise, stay put and sample a different random change next step. It uses fewer evaluations per step than greedy (one neighbor instead of all n), but it may need many more steps. Because changes are sampled in random order, the search path can differ from greedy, and it sometimes lands on a different peak.

Simulated annealing. Why would the experimenter deliberately drink a bad cup of coffee? Because the current peak might be mediocre, and crossing a valley is the only route to higher ground. Early on, the experimenter accepts worse recipes: bitter, weak, or sour cups that would normally be rejected. Over time, the acceptance threshold tightens. The early tolerance lets the search cross shallow valleys: a recipe slightly worse than the current one might border a much better region. Greedy rejects every downhill step and stays trapped. Annealing, at high temperature, accepts the temporary loss and sometimes reaches a higher peak from the other side. It does not guarantee the global optimum. Deep valleys or distant peaks may remain unreachable within the cooling schedule. The advantage is structural: annealing can explore territory that greedy, by design, will never visit.

Population search. The experimenter maintains 10 recipes simultaneously. Each round, keep the 5 best, mutate each by flipping one random bit, taste all 10, repeat. The population explores multiple regions of the space at once, which helps when the landscape has several peaks of different heights scattered across the configuration space.

The landscape determines where the peaks are. The dynamics determines which peaks are reachable. A peak reachable by annealing (which can cross shallow valleys) may be unreachable by greedy hill-climbing (which cannot accept a single downhill step). The terrain stayed the same; the walker changed, and so did the destination.
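The acceptance rule that separates annealing from greedy fits in a few lines. A minimal sketch, with hypothetical parameters and a hypothetical rugged score function (not a tuned implementation):

```python
import math, random

def anneal(score, n, steps=200, t0=5.0, cooling=0.95, seed=1):
    """Random one-bit moves; downhill steps accepted with probability exp(-delta/T)."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n)]
    best, t = list(current), t0
    for _ in range(steps):
        neighbor = list(current)
        neighbor[rng.randrange(n)] ^= 1           # flip one random bit
        delta = score(current) - score(neighbor)  # positive means the neighbor is worse
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = neighbor                    # accept, possibly a temporary loss
        if score(current) > score(best):
            best = list(current)
        t *= cooling                              # tighten the acceptance threshold
    return best

# Hypothetical rugged score: additive term plus a bonus for one bit pattern.
def score(c):
    return sum(c) + (5 if c[:3] == [1, 0, 1] else 0)

print(anneal(score, 7))
```

Set `t0` near zero and the walker degenerates into random mutation hill-climbing: the entire difference in character lives in that one acceptance line.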

Your turn: Think about how you actually search your own system. Are you a greedy optimizer, always taking the best available improvement? Do you sometimes try changes that seem worse in the short term, hoping they lead somewhere better? Do you maintain several parallel approaches? Your real search process has a dynamics, and naming it is the final piece of the tuple.

Paste this into Claude (artifact mode), Gemini Canvas, or ChatGPT canvas. It will generate an interactive app you can use in your browser.

Build a single-page interactive HTML/CSS/JavaScript app that lets the user watch four search algorithms compete on the same rugged landscape. No frameworks, no build tools, just vanilla JS with inline styles.

DOMAIN CONTEXT: A fitness landscape is a set of binary strings (configurations), each with a score. A search algorithm starts at a random configuration and tries to find the highest-scoring one. Different algorithms behave differently on rugged landscapes (landscapes with multiple peaks created by epistatic interactions). The four algorithms below represent fundamentally different search strategies.

LANDSCAPE SETUP (generate automatically on load):

  • n=7 bit positions (128 configurations).
  • Additive weights: one random weight per position, drawn from a normal distribution.
  • Pairwise interactions: one random interaction weight for every pair, scaled by epistasis_strength = 1.5.
  • Use a seeded PRNG. Include a "New landscape" button.
  • Precompute all 128 scores and identify all local peaks (Hamming-1 neighborhood) so the app can display the terrain.

THE FOUR ALGORITHMS:

  1. GREEDY HILL-CLIMBING: Evaluate all Hamming-1 neighbors. Move to the best one if it improves the score. Stop when no neighbor is better.

  2. RANDOM MUTATION: Pick one random Hamming-1 neighbor per step. Accept it only if it improves the score. Run for 200 steps.

  3. SIMULATED ANNEALING: Pick a random Hamming-1 neighbor. Always accept improvements. Accept worse moves with probability exp(-delta/temperature), where delta = (current score - neighbor score). Start temperature at 5.0; multiply by 0.95 each step. Run for 200 steps.

  4. POPULATION SEARCH: Start with 20 random configurations. Each generation: keep the top 10 as parents, create 10 children by flipping one random bit in each parent, and carry the 10 parents plus 10 children forward as the next population of 20. Run 50 generations.


THE APP LAYOUT:

TOP: The landscape as a dot plot (x = Hamming weight, y = score). All 128 configurations as small dots. Local peaks in gold, global optimum in green, others muted.

MIDDLE: A "Run 100 trials" button. When clicked, run each algorithm 100 times from random starting points. Show a 4-column histogram of final scores (one histogram per algorithm, overlaid or side by side) that fills in as trials complete. Color each algorithm distinctly.

BOTTOM: A results table showing, for each algorithm: mean final score, best score found, number of distinct final configurations, and how often each one found the global optimum (hit rate %).

BONUS - SINGLE STEP MODE: A "Watch one run" button that animates a single run of a selected algorithm, highlighting the current configuration on the dot plot at each step, drawing the path it takes through the space. Let the user pick which algorithm to watch via radio buttons, and control the animation speed with a slider.

Use a dark background. Clean layout, clear labels, responsive. The key takeaway the reader should see: greedy gets trapped at nearby peaks; annealing and population search reach higher peaks more often.

Compare

You run greedy hill-climbing and simulated annealing 100 times each on the same rugged landscape. Both start from the same set of random initial configurations. Which statement is most likely true about their results?

Greedy hill-climbing takes the steepest uphill path from each starting point and stops at the first peak it reaches. Starting points near each other tend to converge on the same peak, so greedy typically finds fewer distinct endpoints. Simulated annealing can cross shallow valleys early in the search (when temperature is high), which lets it explore beyond the nearest peak and sometimes reach higher ones. Annealing does not guarantee finding the global optimum; deep valleys or distant peaks may still be unreachable within the cooling schedule. The distribution of final scores for annealing will typically have a higher mean and a longer right tail compared to greedy.

Search dynamics

Search Rules on the Same Terrain

The landscape is coming into view.

The Visualization Is a Projection

Plot the 32 coffee configurations on a chart: x-axis = number of ON features (the Hamming weight), y-axis = the brewer's taste score.

Hamming weight 0 means every position is OFF: light roast, coarse grind, cool water, no milk, pre-ground. Hamming weight 5 means everything is ON: dark roast, fine grind, hot water, milk, freshly ground. In between, Hamming weight 3 contains \binom{5}{3} = 10 different recipes: some that scored high for our brewer (dark roast, fine grind, hot water) and some that scored low (light roast, coarse grind, milk). All 10 stack in the same column.

That chart looks like a landscape. It has peaks and valleys. It is also misleading.

Two peaks that appear separated by a valley in this projection might be connected through the actual 5-dimensional configuration space. A path that flips one bit ON and another OFF at each step maintains the same Hamming weight while climbing in score. The projection collapses that entire path into a single column, making the connecting ridge invisible.

The distortion grows with n. For n = 12, the space has 4,096 configurations compressed into 13 columns. The middle column alone, Hamming weight 6, contains \binom{12}{6} = 924 configurations. Those 924 points, each with its own score, stack on top of each other. The 2D picture destroys almost all structural information.
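The counting behind that claim takes a few lines to verify:

```python
from math import comb

n = 12
columns = [comb(n, k) for k in range(n + 1)]  # configurations per Hamming-weight column
print(sum(columns), len(columns), max(columns))  # 4096 configs, 13 columns, 924 in the middle
```

Each column is a binomial coefficient, so the compression is worst exactly where most configurations live: the middle of the weight distribution.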

A landscape visualization is a projection, not the landscape itself. The actual structure lives in the configuration space X, the neighborhood N, and the value function f. A two-dimensional chart cannot represent their connectivity faithfully. When you encounter a landscape visualization, ask: what dimensions were collapsed to make this picture, and what structure did the collapsing hide?

Classify

You plot your landscape as a 2D chart (Hamming weight vs. fitness) and see two peaks separated by a deep valley. Does this prove the valley is uncrossable in the actual landscape?

A 2D projection compresses the configuration space into a single axis (Hamming weight), which collapses many structurally different configurations into the same column. Two peaks at different Hamming weights might be connected by a path that flips one bit ON and another OFF at each step, maintaining the same Hamming weight while climbing in fitness. That path would be invisible in the projection because every step stays in the same column. The true connectivity of the landscape lives in the full-dimensional space defined by X and N; the projection cannot represent it. A landscape visualization can be useful as a summary, yet it should never be mistaken for the landscape itself.

Break Your Own Model

You have a working toy landscape: a configuration space, a value function, a neighborhood, and search dynamics. You can change any one component and watch the consequences. That ability to run controlled experiments, where only one piece of the tuple changes at a time, is the central skill this module builds.

What did you leave out? The coffee model has 5 positions with 2 states each: 32 configurations total. Real coffee involves continuous grind settings (Turkish, espresso, drip, French press, and everything between), water temperatures measurable to the degree, brew times from seconds to hours, and dozens of excluded variables: bean origin, roast date, water mineral content, altitude, humidity, cup material. If the optimal cup depends on interactions between brew time and water mineral content, the model is blind to it.

Does complexity mean ruggedness? You might assume that a system with many variables produces a rugged landscape. The number of variables determines the size of the configuration space; ruggedness comes from interactions among those variables. A hundred independent variables produce an additive landscape with a single peak: large, but smooth. Five variables with strong epistasis can produce a landscape with many traps: small, but rugged. The distinction matters for strategy: additive landscapes reward persistent, methodical optimization; rugged landscapes reward exploration, restarts, and tolerance for temporary regression.
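The additive half of that claim is easy to verify by brute force. With random nonzero weights (hypothetical values) and no interaction terms, exactly one configuration beats all its one-bit neighbors, no matter how many positions you add:

```python
import itertools, random

random.seed(2)
n = 8
w = [random.gauss(0, 1) for _ in range(n)]  # purely additive: no pairwise terms

def score(c):
    return sum(w[i] for i in range(n) if c[i] == 1)

def is_peak(c):
    for i in range(n):
        d = list(c)
        d[i] ^= 1                    # flip one bit
        if score(tuple(d)) > score(c):
            return False
    return True

peaks = [c for c in itertools.product((0, 1), repeat=n) if is_peak(c)]
print(len(peaks))  # 1: a large space, but a single smooth peak
```

The unique peak is the configuration that turns ON every positive-weight bit and OFF every negative-weight bit; add one pairwise term with a large enough weight and the count can climb above one.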

Where does the visualization mislead? The Hamming-weight plot collapsed 32 configurations into 6 columns. Actual landscape plots of real systems collapse millions of configurations into even fewer dimensions. The visualization is always a shadow of the structure.

What happens when the landscape changes? The taste function is fixed: the same recipe always gets the same score. In reality, preferences shift: strong, simple, and fast at 6 AM; nuanced, aromatic, and leisurely at 3 PM. When the value function shifts over time, the landscape becomes a seascape (Mustonen and Lässig, 2009), and peaks that existed when the search began may no longer exist when it arrives.

Which tuple component would you change first? More positions? A ternary alphabet (0, 1, 2) instead of binary? A value function trained on actual tasting data instead of hand-scored? Higher-order interactions beyond pairs? A time-varying scoring rule? Each of these extensions is the subject of a later module in this course. Mount Fuji explores what happens when the value function is purely additive. Rough Mount Fuji asks how much noise an additive signal can tolerate. N-K makes the density of interactions tunable. Seascapes introduce time-varying terrain. The toy you just built is the simplest possible instance. Everything that follows adds structure, complexity, and realism one component at a time.

The four-component tuple shows you structure by choosing what to ignore.

The structural trap that stalled the coffee recipe at an 8 is the same structural trap that stalls antibiotic drug design at a locally optimal compound, organizational restructuring at a locally efficient process, and neural network training at a locally minimal loss. The tools this module builds (controlled experiments on the tuple, identification of interaction structure, comparison of search dynamics) apply wherever adaptive agents search combinatorial spaces.

Reflect

Look at the toy landscape you built. Which of the four tuple components would you change first to make it more realistic for the real system you identified at the start of this module?

There is no wrong answer. The question connects the formal structure back to the system you identified in Section 1. If your system has continuous dials (temperature, duration, dosage), you need a richer configuration space. If you care about multiple outcomes simultaneously (taste AND cost, strength AND injury risk), you need a multi-objective value function or a composite scoring rule. If your real changes are constrained (you can only afford one experiment per week, or changing one variable forces another), the neighborhood should encode those constraints. If you rely on expert advice, past literature, or transfer from related problems to jump across the space, your dynamics goes beyond a local random walk. The toy is the starting point; recognizing which component to enrich first is the modeling skill this module aimed to build.

Toy Model

Make one search concrete

Pick a familiar search, like choosing a coffee recipe, a route home, a workout, or a paragraph revision. Make it concrete before naming the landscape: options, score, one-step changes, and the rule that chooses the next change.

  • Options

    Write 3-5 actual states you could occupy: recipes, routes, schedules, drafts, or designs.

  • Score

    Pick one outcome that makes a state better: taste, time, recovery, clarity, cost, or stability.

  • Next move

    Name one reachable change, then say how it gets chosen: cheapest tweak, steepest gain, random trial, or expert guess.

References

  1. Stadler, P. (2002). Combinatorial Landscapes
Thomas Meli