Sean Ellis Test experiment

Tomáš Veselý - supported by AI
1 day ago

As part of our mission, we're building a comprehensive knowledge library about product development. The library is for anyone who wants to make better product decisions. Whether you're an inventor, a product manager, or a Chief Product Officer, using the right research methods and experiments increases your chances of building the right things for the right audience. Today we'll introduce the Sean Ellis Test validation method.

When to Use This Experiment

This method makes sense once a working product already exists with active users, and the goal is to measure how many of them consider it indispensable. It fits these circumstances:

A product exists along with a group of people who genuinely use it, not just people who tried it once.
A decision is needed on whether to invest in growth (marketing, sales) and scaling the product, or return to development first.
Users have experienced the product's core value, have used it at least twice, and did so within the last two weeks.
There's a risk of scaling prematurely, or growth is stalling and churn is rising, and it's necessary to find out whether weak product-market fit is the cause.
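
The usage condition above (used at least twice, active within the last two weeks) can be written as a simple filter. The following is a minimal sketch in Python, not a prescribed implementation: the `User` record, its fields `core_uses` and `last_active`, and the function `is_qualified` are hypothetical names chosen for illustration, and the sample data is made up.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class User:
    # Hypothetical fields; map them to whatever your analytics store provides.
    user_id: str
    core_uses: int      # how many times the user reached the product's core value
    last_active: date   # most recent day with any activity


def is_qualified(user: User, today: date,
                 min_uses: int = 2, window_days: int = 14) -> bool:
    """True if the user used the core at least `min_uses` times and was
    active within the last `window_days` days."""
    recently_active = today - user.last_active <= timedelta(days=window_days)
    return user.core_uses >= min_uses and recently_active


# Made-up example users.
users = [
    User("a", core_uses=5, last_active=date(2024, 5, 28)),  # active, repeat user
    User("b", core_uses=1, last_active=date(2024, 5, 30)),  # tried it only once
    User("c", core_uses=9, last_active=date(2024, 3, 1)),   # left long ago
]
today = date(2024, 6, 1)
qualified = [u.user_id for u in users if is_qualified(u, today)]
print(qualified)  # only user "a" passes both checks
```

The two thresholds are parameters so they can be adjusted for products with a naturally slower usage rhythm.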

Basic Experiment Principles

The principle rests on a single question that measures how dependent users are on the product — asking about disappointment (a negative emotion). The experiment reveals how indispensable the product is to its customers, which is a key argument for assessing product-market fit. The procedure is as follows:

Define the qualified segment. Reach out only to users who have experienced the product's core, used it at least twice, and were active in the last two weeks. This excludes both freshly registered users and those who left long ago, since either group would distort the score. Choosing when to send the survey matters just as much:
1. There's no point sending it right after signup or launch — the user needs enough experience to understand the product's value.
2. It usually pays to let a cohort stay active for a few weeks (for example, reaching out to those who registered at least four weeks ago) and, for products with a trial period, to survey users toward the end of the trial.
3. It's also worth running the survey after a major update or pivot, before deciding on scaling or a funding round, and during stalled growth or high churn, when it helps reveal whether weak product-market fit is the cause.
Choose the tool and channel. The survey is short and can be delivered by email with a link to a form (Typeform, Google Forms) or directly in the app. The key is that it takes one to two minutes to complete.
Ask the key question. The wording must be kept exact — small changes in phrasing affect the results: "How would you feel if you could no longer use [this product]?" with the options:
1. Very disappointed
2. Somewhat disappointed
3. Not disappointed (it isn’t really that useful)
4. N/A — I no longer use the [product].
Optionally, follow the key question with three open-ended questions: who would benefit most from the product, what its main benefit is, and how it could be improved.
Collect enough responses. Around 40 to 50 qualified responses are enough for a meaningful result. More important than the count is the diversity of respondents and the fact that they genuinely use the product.
Calculate and interpret the score. There is only one metric to track: the share of "very disappointed" responses out of all valid responses (N/A answers excluded). The decision rule is simple: a score of 40% or more is read as a sign that product-market fit has been reached and as a green light to invest in growth, while a score below 40% means returning to product development first. It helps to segment the results (paying vs. non-paying, new vs. long-term users), because fit can exist in one segment and be absent in another.
Identify the risks. The method doesn't validate true product-market fit, only perceived dependence, and it can produce false positives. Surveying only your most loyal fans inflates the score. Small samples are unstable — a few dozen responses give direction, not statistical certainty. Mixing different segments averages out the signal and hides where fit actually is. The 40% threshold is a guide, not a guarantee, and the numbers themselves should always be checked against actual user behavior.
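
The score calculation and the small-sample caveat can be illustrated together. The sketch below is one possible way to do it in Python, under stated assumptions: the answer labels (`"very"`, `"somewhat"`, `"not"`, `"na"`) and function names are illustrative, the sample responses are made up, and the Wilson score interval is a standard statistical tool added here to show uncertainty; it is not part of the Sean Ellis Test itself.

```python
import math
from collections import Counter

# Illustrative labels for the four answer options.
VERY, SOMEWHAT, NOT, NA = "very", "somewhat", "not", "na"


def sean_ellis_score(responses):
    """Return (score, very_count, valid_count), where score is the share of
    'very disappointed' answers among valid (non-N/A) responses."""
    counts = Counter(responses)
    valid = len(responses) - counts[NA]
    if valid == 0:
        raise ValueError("no valid responses")
    return counts[VERY] / valid, counts[VERY], valid


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Made-up sample: 18 very, 20 somewhat, 7 not disappointed, 5 N/A.
responses = [VERY] * 18 + [SOMEWHAT] * 20 + [NOT] * 7 + [NA] * 5
score, very, valid = sean_ellis_score(responses)
low, high = wilson_interval(very, valid)

print(f"score = {score:.0%} ({very}/{valid} valid responses)")
print(f"approximate 95% interval: {low:.0%} to {high:.0%}")
print("meets the 40% threshold" if score >= 0.40 else "below the 40% threshold")
```

With 45 valid responses, a score of 40% comes with an interval of roughly 27% to 55%, which is why a few dozen responses indicate direction rather than statistical certainty.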

Real-World Experiment Example

Link to research: How Superhuman Built an Engine to Find Product-Market Fit

In the summer of 2017, the email client Superhuman, led by founder Rahul Vohra, measured its product-market fit using this method. Users received an email with a link to a Typeform survey asking the key question: how they would feel if they could no longer use Superhuman. The result was 22% "very disappointed" — well below the 40% threshold.

Rather than guess, Vohra broke the responses into segments and focused only on those who would be very disappointed. From their answers about who would benefit most from the product, he built a profile of the so-called high-expectation customer — the most demanding customer in the target group. These turned out to be mainly founders, managers, and executives. The team then split its effort in two: strengthening what the fans loved about the product (above all, speed) and removing the objections of more lukewarm users.

To make the improvements measurable, Superhuman set up an OKR with a single key result — the share of very disappointed users — and continuously surveyed new users. Over three quarters the score rose from 22% to 58%, giving the team the confidence that it was time to scale.

What Can Be Tested With This Experiment?

The method's main strength is that it turns the abstract concept of product-market fit into one trackable number while also showing who needs the product and why. Specifically, you can test:

Product indispensability: whether people see the product as a must-have; confirmed by a share of 40% or more "very disappointed" responses.
Ideal customer profile: who the users are that would be very disappointed; their shared traits (role, industry, company size) reveal the target segment.
Core product value: which benefit users mention most often as their main reason for using it; a signal of what to strengthen and communicate further.
Readiness to scale: whether to accelerate growth or return to the product first; decided by crossing the 40% threshold.
Strength of individual segments: whether fit exists among paying users but is missing among non-paying ones (or vice versa); revealed by segmenting the score.
Barriers to enthusiasm: what holds "somewhat disappointed" users back from full dependence; their feedback shows which changes will move them higher.
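
Segmenting the score, as described in the list above, amounts to computing the same share separately for each group. A minimal sketch in Python, assuming each response is stored as a hypothetical `(segment, answer)` pair; the segment names, answer labels, and data are made up, and the sample is far too small for a real decision.

```python
from collections import defaultdict

# Hypothetical (segment, answer) pairs; the data is made up for illustration.
responses = [
    ("paying", "very"), ("paying", "very"), ("paying", "somewhat"),
    ("paying", "very"), ("free", "somewhat"), ("free", "not"),
    ("free", "very"), ("free", "somewhat"), ("free", "na"),
]


def score_by_segment(pairs):
    """Return {segment: share of 'very' among that segment's non-N/A answers}."""
    very = defaultdict(int)
    valid = defaultdict(int)
    for segment, answer in pairs:
        if answer == "na":
            continue  # N/A answers are excluded from the denominator
        valid[segment] += 1
        if answer == "very":
            very[segment] += 1
    return {segment: very[segment] / valid[segment] for segment in valid}


segment_scores = score_by_segment(responses)
for segment, score in segment_scores.items():
    print(f"{segment}: {score:.0%}")
```

In this made-up data the combined score is 50% (4 of 8 valid responses), which would hide that paying users score 75% while free users score only 25%.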