
Where values enter science

Philosophy of science · 10 min
upstream-choice, resilience

The lake

A lake in southern Oregon. Two institutions look at the same water. The Fish and Wildlife Service, a federal agency, sees endangered species at risk: the shortnose sucker and the Lost River sucker, populations collapsing. The National Research Council, a scientific review body rather than a regulator, sees insufficient evidence for regulatory intervention. Same lake. Same data. Different conclusions.

The disagreement is not about what the data show. Both institutions have access to the same monitoring records, the same population surveys, the same water quality measurements. The disagreement is about what counts as data: what resolution to use, which variables to foreground, what statistical threshold to apply, how to handle uncertainty when the stakes are extinction.

The Fish and Wildlife Service used a precautionary threshold: when a species is declining and the cause is plausible, act before the evidence is conclusive. The National Research Council applied a more conservative standard: the evidence must meet a higher bar before it justifies restricting water allocation to agriculture. Same lake. Same fish. Same numbers. Different rules for what the numbers mean.

These upstream choices — resolution, threshold, scope, what counts as the same measurement — are where values enter science. Not in the interpretation of the data. In the construction of what counts as data in the first place.


The parameters

Before a scientist collects a single data point, five choices have already been made.

What domain to examine — the scope. Where to stand within it — the point of orientation. How finely to look — the resolution. What dimensions to measure — the measurement. And what counts as the same result when repeated — the rules for sameness. Each is a decision. Each involves judgment. Each shapes what can appear as evidence before any evidence arrives.

The DSM makes this visible. "Major depressive disorder" is not a condition the diagnostic manual tries to capture. It is its parameters: five of nine symptoms, present for at least two weeks, causing functional impairment, with exclusion criteria ruling out other causes. Change the parameters — require six symptoms instead of five, extend the duration to four weeks, alter the exclusion criteria — and you do not get a better view of the same condition. You get a different condition with different boundaries and a different population inside them.
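The point can be made concrete with a small sketch. Everything in it is an assumption for illustration: the `Criteria` and `Patient` types, the four invented patients, and the deliberate simplification that drops the exclusion criteria. It is not the DSM's procedure, only a toy showing that changing the parameters changes who falls inside the category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criteria:
    """A hypothetical, simplified parameter set (exclusion criteria omitted)."""
    min_symptoms: int  # how many of the nine symptoms are required
    min_weeks: int     # minimum duration in weeks


@dataclass(frozen=True)
class Patient:
    """An invented patient record, for illustration only."""
    name: str
    symptoms: int   # number of the nine symptoms present
    weeks: int      # how long they have been present
    impaired: bool  # whether functional impairment is present


def meets(patient: Patient, criteria: Criteria) -> bool:
    """Apply one parameter set to one patient."""
    return (
        patient.symptoms >= criteria.min_symptoms
        and patient.weeks >= criteria.min_weeks
        and patient.impaired
    )


def diagnosed(patients: list[Patient], criteria: Criteria) -> list[str]:
    """Return the names of everyone inside the category's boundary."""
    return sorted(p.name for p in patients if meets(p, criteria))


patients = [
    Patient("A", symptoms=5, weeks=2, impaired=True),
    Patient("B", symptoms=6, weeks=3, impaired=True),
    Patient("C", symptoms=7, weeks=5, impaired=True),
    Patient("D", symptoms=5, weeks=6, impaired=False),
]

five_of_nine = Criteria(min_symptoms=5, min_weeks=2)
six_of_nine = Criteria(min_symptoms=6, min_weeks=4)

print(diagnosed(patients, five_of_nine))  # ['A', 'B', 'C']
print(diagnosed(patients, six_of_nine))   # ['C']
```

The patients do not change between the two calls. Only the boundary does, and with it the population the category contains.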

Every classification works this way. The instrument is not a window onto a pre-existing world. It is a set of commitments about what matters — commitments made before the first patient walks through the door.

The same is true of every scientific measurement. The resolution of a climate model determines what weather patterns can appear. The significance threshold of a drug trial determines which effects count as real. The species concept in taxonomy determines where one organism ends and another begins. None of these are neutral. All of them are choices. And all of them are made before the data arrive.


The dispute

Two kinds of disagreement operate in science, and they are routinely confused.

In the first kind, two researchers look at the same data with the same method and reach different conclusions. This is a variable-disagreement — a disagreement about what the evidence says within a fixed framework. The scientific method is designed to resolve this. More data, better analysis, replication.

In the second kind, two researchers look at the same data with different methods — different thresholds, different resolution, different criteria for what counts as significant — and reach different conclusions. This is a rule-disagreement. More data cannot resolve it, because the disagreement is about how data becomes evidence. The rules that organize observations into conclusions are themselves the object of the dispute.
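A minimal sketch of the two kinds of disagreement, under loud assumptions: evidence is collapsed into a single hypothetical number between 0 and 1, and each rule is a single threshold. The names `decide`, `PRECAUTIONARY`, and `CONSERVATIVE` and all the numbers are invented for illustration and do not describe any real standard.

```python
def decide(evidence_strength: float, threshold: float) -> str:
    """Turn an evidence estimate into a decision under a given rule."""
    return "act" if evidence_strength >= threshold else "wait"


# Hypothetical rules: how strong must the evidence be before acting?
PRECAUTIONARY = 0.50
CONSERVATIVE = 0.95

# Variable-disagreement: one shared rule, two different estimates.
analyst_1 = decide(0.97, CONSERVATIVE)
analyst_2 = decide(0.90, CONSERVATIVE)
print(analyst_1, analyst_2)  # act wait

# Rule-disagreement: one shared estimate, two different rules.
shared_estimate = 0.80
reviewer_1 = decide(shared_estimate, PRECAUTIONARY)
reviewer_2 = decide(shared_estimate, CONSERVATIVE)
print(reviewer_1, reviewer_2)  # act wait
```

In the first pair, the analysts can converge by refining the estimate. In the second pair, the first argument to `decide` is already identical; what differs is the second argument, and nothing in the estimate says which value it should take.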

The Upper Klamath Lake case is a rule-disagreement. The Fish and Wildlife Service and the National Research Council did not disagree about the fish counts. They disagreed about what level of evidence justifies action, that is, about the rules that connect data to decision.

The value-free ideal — the principle that science should be insulated from non-epistemic values — handles variable-disagreements coherently. Keep your preferences out of the data analysis. But it has nothing to say about rule-disagreements, because it treats the rules as given. And the rules are where the consequential choices happen.

Consider money. Burning a bill destroys one token. Demonetization destroys the value of every token of that type. The difference between changing a variable and changing a rule is the difference between altering an outcome and altering the system that produces outcomes. In science, rule-disagreements are disagreements about the system itself — about which apparatus to use, which threshold to set, which logic to apply. No amount of data settles them, because they are upstream of data.


The test

If values enter science through the choice of parameters, and if those choices cannot be eliminated, then no classification is value-free. The difference that matters is whether a classification's results survive independently of the specific choices that produced it.

Change the instrument. Change the statistical method. Change the framing assumptions. If the result comes back, it was tracking something real. If it collapses, it was an artifact of how you chose to look.

The periodic table survives this test. Change the instrument — move from chemical reactivity to X-ray spectroscopy to mass spectrometry to quantum mechanical prediction — and the elements come back. The classification is resilient. The parameters found something the world supports.

Certain psychiatric categories do not survive this test. Change the diagnostic framework — move from the DSM-III to the DSM-5, shift from categorical to dimensional models, apply the same criteria across cultures — and some categories fragment. The boundaries were being held in place by the apparatus, not by what the apparatus was pointed at.

This is not a purity test. Every classification involves choices about scope, resolution, measurement. The test is whether the result survives when those choices change. A resilient classification tracks structure that persists across different ways of looking. A fragile classification tracks the way of looking itself.
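As a rough sketch of the test, assume the "result" is the claim that one group scores higher than another, and that "changing the apparatus" means swapping the summary statistic. The data, the `survives` helper, and the choice of three methods are all invented for illustration; a real robustness check would vary far more than this.

```python
from statistics import mean, median


def trimmed_mean(values, k=1):
    """Mean after dropping the k smallest and k largest values."""
    ordered = sorted(values)
    return mean(ordered[k:len(ordered) - k])


# Three hypothetical "instruments" for summarising a group.
METHODS = {"mean": mean, "median": median, "trimmed_mean": trimmed_mean}


def survives(group_a, group_b):
    """Check whether 'A is higher than B' holds under every method."""
    verdicts = {name: f(group_a) > f(group_b) for name, f in METHODS.items()}
    return all(verdicts.values()), verdicts


# Invented data: a difference that holds however you look.
robust_a, robust_b = [12, 13, 14, 15, 16], [7, 8, 9, 10, 11]

# Invented data: a difference produced by one extreme value.
fragile_a, fragile_b = [8, 9, 10, 11, 60], [10, 11, 12, 13, 14]

print(survives(robust_a, robust_b)[0])    # True
print(survives(fragile_a, fragile_b)[0])  # False
print(survives(fragile_a, fragile_b)[1])
# {'mean': True, 'median': False, 'trimmed_mean': False}
```

The fragile result appears only under the mean. Switch the method and it disappears, which is the signature of a finding that was tracking the way of looking.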

When a classification has been in place long enough — when its parameters have been embedded in training programs, funding structures, insurance codes, institutional habits — its boundaries start to look like nature. The categories feel discovered rather than drawn. Changing the apparatus is how you check. Some of what looked like the world turns out to have been the instrument.


The choice

The deepest version of this argument reaches the rules of reasoning themselves.

Classical logic says: if you find a contradiction, everything follows. A single inconsistency in your premises, and any conclusion is derivable. Other logics — paraconsistent, relevance logics — handle contradiction differently. They contain it locally. They restrict what follows from what. Each has defensible claims. Each produces different results from the same premises.
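The classical step can be spelled out. The following is the standard textbook derivation, offered as a sketch, with $P$ standing for any sentence and $Q$ for an arbitrary, unrelated one:

```latex
\begin{align*}
1.\quad & P           && \text{premise} \\
2.\quad & \neg P      && \text{premise} \\
3.\quad & P \lor Q    && \text{from 1, by disjunction introduction} \\
4.\quad & Q           && \text{from 2 and 3, by disjunctive syllogism}
\end{align*}
```

Each line follows by a rule classical logic accepts, so $Q$ can be anything at all. A logic that wants to contain the contradiction has to give up one of these steps; many paraconsistent and relevance systems restrict disjunctive syllogism, the move from line 3 to line 4.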

If reasoning were governed by immutable laws — if the rules of inference were discovered like the rules of physics — there would be one logic, the way there is one set of physical constants. The existence of multiple defensible logics is what you would expect in a domain governed by revisable rules rather than by the structure of reality itself.

Science operates under rules, not laws. The world gives no choices — gravity is gravity, boiling points are boiling points. But describing the world always involves choices about scope, resolution, measurement, inference, what counts as contradiction, when to revise. These choices run through every stage of the work.

The test remains the same at every level. Not purity — resilience. Which results survive when the choices change? Which classifications hold under different instruments, different methods, different logics? What comes back when you change the apparatus is what the apparatus was tracking all along. Everything else was the apparatus talking to itself.

A version of this argument was submitted to a special issue of American Philosophical Quarterly co-edited by Torsten Wilholt. Related work on model transfer will be presented at Leibniz University Hannover in April 2026.