The HADI Hypothesis Board: A Framework for Structured Marketing Experiments
Random A/B tests produce one-off results that die with the campaign. The HADI framework - Hypothesis, Action, Data, Insight - turns every experiment into documented knowledge. A hypothesis board makes that knowledge visible, searchable, and reusable. Here is how it works.
HADI is a four-stage experiment framework for marketing teams: Hypothesis to Action to Data to Insight. A HADI hypothesis board is the visual management system - typically a kanban - that tracks every experiment through those four stages, stores the outcome, and makes confirmed findings reusable across future campaigns. For performance agencies, it is the infrastructure that turns ad hoc testing into a compounding knowledge advantage.
Key takeaways
The structural failure of unstructured A/B testing is that results live in individuals' heads and Slack threads
The same experiment gets repeated from zero across different clients or campaigns months later because no institutional record exists. Structured hypothesis documentation converts one-off test results into reusable knowledge.
Organisations running structured documented experiments improve performance 30 to 45 percent faster than ad hoc testers
Research across marketing and product teams shows this performance gap is not about the quality of individual tests - it is about whether outcomes accumulate into a queryable knowledge base. The documentation discipline produces the compounding advantage.
A HADI board converts one-off test results into reusable institutional knowledge
The insight from a positive UGC test on one client becomes the starting hypothesis for the next client facing the same creative problem. Without a HADI board, this transfer never happens systematically.
The minimum viable HADI record requires four fields: hypothesis, action, data, and insight
An outcome without all four still dies with the project. The insight field - the transferable rule derived from the data - is the specific field most often skipped, which is why most test results fail to inform future decisions.
The most common failure mode is running a positive test and moving on without writing the insight
The knowledge never enters any system that can be queried when the same problem appears on a different account 3 months later. The missing insight field is the gap between a successful test and an institutional asset.
Even structured A/B tests fail when underpowered. 41.4% of marketing A/B tests claim significance with insufficient statistical power; of those, only 28.4% replicate at full traffic. The fix is calculating required sample size before launch, not after. See why your A/B tests produce noise, not signal.
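As a rough illustration of sizing a test before launch, the sketch below uses the standard normal-approximation formula for comparing two conversion rates (two-sided test, equal traffic per arm). The function name, the default 5% significance level and 80% power, and the example rates are assumptions chosen for illustration. It applies to rate metrics such as conversion rate, not directly to ROAS.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(p_control: float, p_variant: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Visitors needed per arm for a two-sided two-proportion z-test."""
    if p_control == p_variant:
        raise ValueError("control and variant rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_variant - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical rates: 2.0% baseline conversion, 2.5% expected with the variant.
n = required_sample_size(0.020, 0.025)
print(n)
```

Smaller expected lifts push the required sample up sharply, because the effect size is squared in the denominator. Running the calculation first shows whether the account has enough traffic to test the hypothesis at all.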
Why unstructured A/B tests fail - and what they cost
The problem is not that agencies don't test. It's that most tests produce knowledge that immediately disappears.
A typical unstructured experiment: an account manager suspects that UGC video will outperform static creative on a Meta awareness campaign. They swap the creative, run it for two weeks, ROAS improves, they move on. Three months later, a different account manager on a different client runs the same creative swap for the same reason - starting from zero.
This is the structural failure of ad hoc testing. The experiment happened. The result was positive. But because there was no formal record - no documented hypothesis, no pre-defined success metric, no written Insight - the knowledge never entered any system that could be queried later.
Research across marketing and product teams consistently shows the same pattern: organisations running structured, documented experiments improve performance 30-45% faster than teams relying on judgment-based optimisation alone, because documented results compound while recalled impressions decay. Research on lean product development makes a related point: even a single undocumented experiment raises the risk of repeating a failed test and weakens the organisation's ability to build on prior learnings.
For an agency running campaigns across 10-30 client accounts, the cost of undocumented testing is not fixed - it grows with headcount and with the client roster. Every account manager who joins the team starts from zero. Every client who onboards gets experiments that were already run and failed on the previous client in the same vertical.
A HADI hypothesis board solves this by treating every experiment as a knowledge artifact, not just a campaign action.
HADI hypotheses test specific tactical learnings; OKRs commit to strategic outcomes. The two work together: HADI experiments produce the learnings that inform next-quarter OKR setting, and OKRs frame which experiments matter most. For the OKR structure with B2B SaaS and DTC examples, see the marketing OKRs template.
What the four HADI stages produce
Each stage of HADI has a defined output. The board only advances a card when that output exists.
Hypothesis is not a hunch. A valid HADI hypothesis names a specific causal claim, identifies the metric that will confirm or reject it, and sets the success threshold before the experiment runs. "Switching from 7-day click to 28-day click attribution will increase reported ROAS for this account from 2.1 to ≥2.6 without changing actual conversion volume" is a hypothesis. "Better targeting will improve results" is not.
The threshold commitment is what makes HADI intellectually honest. It prevents the most common testing failure: evaluating a result as positive because something improved, even if the specific metric you were testing did not hit the target.
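One way to picture the format requirement is as a small record that cannot be edited once written. This is a minimal sketch, not a prescribed schema: the class name, field names, and checks are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the threshold cannot be changed after it is written
class Hypothesis:
    claim: str        # the specific causal claim
    metric: str       # the metric that confirms or rejects it
    baseline: float   # value of the metric before the change
    threshold: float  # success threshold, fixed before launch

    def problems(self) -> list[str]:
        """Return reasons this hypothesis is not yet testable (empty list = valid)."""
        issues = []
        if not self.claim.strip():
            issues.append("missing causal claim")
        if not self.metric.strip():
            issues.append("missing metric")
        if self.threshold == self.baseline:
            issues.append("threshold equals baseline, so no change is being predicted")
        return issues


h = Hypothesis(
    claim="Switching from 7-day click to 28-day click attribution increases reported ROAS",
    metric="reported ROAS",
    baseline=2.1,
    threshold=2.6,
)
print(h.problems())  # []
```

A vague idea such as "Better targeting will improve results" has no metric and no threshold to fill in, so it fails the check and stays off the board.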
Action is the documented experiment design - produced before any campaign change is made. It records the control state (current campaign configuration), the variant (the specific change), the minimum run duration, the minimum spend threshold, and the decision date. The decision date is the most important field: it is a fixed calendar date on which the experiment will be evaluated and closed, regardless of whether the result looks ready. Without a fixed date, experiments get evaluated when they look favourable - which is confirmation bias built into the process.
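The fixed decision date can be sketched as a simple gate. The class and method names below are hypothetical, and the rule that the decision date may not fall before the minimum run duration ends is an assumption added for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ActionDesign:
    control: str          # current campaign configuration
    variant: str          # the specific change being tested
    start: date
    min_run_days: int     # minimum run duration
    min_spend: float      # minimum spend threshold
    decision_date: date   # fixed evaluation date, set before launch

    def __post_init__(self):
        earliest = self.start + timedelta(days=self.min_run_days)
        if self.decision_date < earliest:
            raise ValueError("decision date falls before the minimum run duration ends")

    def may_evaluate(self, today: date) -> bool:
        """Evaluation opens on the decision date, not when results look good."""
        return today >= self.decision_date


design = ActionDesign(
    control="static creative",
    variant="UGC video",
    start=date(2025, 3, 3),
    min_run_days=14,
    min_spend=1500.0,
    decision_date=date(2025, 3, 17),
)
print(design.may_evaluate(date(2025, 3, 10)))  # False
print(design.may_evaluate(date(2025, 3, 17)))  # True
```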
Data is the record of what actually happened: the primary metric result, total spend and impressions during the test window, any external confounds (platform algorithm changes, creative refreshes, budget edits), and the verdict, which is one of confirmed, rejected, or inconclusive. The verdict applies to the hypothesis as written, not to the general direction of results. A test that moved the target metric from 2.1 to 2.4 when the hypothesis required ≥2.6 is a rejected hypothesis, with the actual result noted. This precision matters when you read the record six months later.
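The verdict rule is mechanical enough to write down. In this sketch the function name and the spend figures are hypothetical, the metric is assumed to be one where higher is better, and treating under-spent or confounded tests as inconclusive is an assumption about how that third outcome gets used.

```python
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "confirmed"
    REJECTED = "rejected"
    INCONCLUSIVE = "inconclusive"


def judge(result: float, threshold: float, spend: float, min_spend: float,
          confounds: list[str]) -> Verdict:
    """Judge the hypothesis as written: the threshold is met or it is not."""
    # Assumption: under-spent or confounded tests cannot be judged either way.
    if spend < min_spend or confounds:
        return Verdict.INCONCLUSIVE
    return Verdict.CONFIRMED if result >= threshold else Verdict.REJECTED


# ROAS moved from 2.1 to 2.4, but the hypothesis required at least 2.6.
print(judge(2.4, 2.6, spend=5000, min_spend=4000, confounds=[]))  # Verdict.REJECTED
```

The improvement from 2.1 to 2.4 still belongs in the data record, but it does not change the verdict.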
Insight is the transferable learning - written for a future reader who was not present when the experiment ran. It answers four questions: what happened (one sentence, the metric result), why it probably happened (flagged as interpretation), which other client types or verticals this finding is likely to apply to, and what follow-on hypothesis this result suggests. The Insight entry is what makes HADI compound: it converts a single experiment result into reusable intelligence.
HADI hypotheses are tested through campaigns, which need briefs. A campaign brief documents the hypothesis being tested, the audience targeted, the channels deployed, the budget committed, and the success criteria - making the campaign itself an experiment with measurable outcomes. For the brief structure, see the marketing campaign brief template.
How a HADI hypothesis board works - columns, cards, and WIP limits
The hypothesis board is a four-column kanban that carries each card through the four HADI stages.
Backlog holds queued hypotheses that have been written but not yet started. A healthy backlog for a single client account contains 8-15 hypotheses at any time - enough to maintain experiment velocity without creating decision paralysis. Hypotheses in the Backlog column must meet the format requirement (falsifiable claim, named metric, stated threshold) before they are created as cards. Vague ideas go into a separate ideation list, not the board.
In Progress holds experiments that have moved from Hypothesis to Action - the experiment design is documented and the campaign change has been made. Apply a WIP limit of 1 active experiment per campaign objective. Running multiple simultaneous tests on the same campaign makes attribution of results impossible. An account managing four distinct campaign objectives (awareness, traffic, conversion, retention) can carry up to four In Progress cards - one per objective - without contaminating data.
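The WIP rule reduces to a single check before a card leaves the Backlog. This is a minimal sketch. The function and constant names are hypothetical.

```python
from collections import Counter

WIP_LIMIT_PER_OBJECTIVE = 1


def can_start(objective: str, in_progress: list[str]) -> bool:
    """in_progress lists the campaign objective of every In Progress card."""
    return Counter(in_progress)[objective] < WIP_LIMIT_PER_OBJECTIVE


active = ["awareness", "conversion"]
print(can_start("traffic", active))     # True
print(can_start("conversion", active))  # False
```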
Completed holds closed experiments with their full HADI record attached: hypothesis, action design, data, and Insight. This column is the knowledge base. Every card in Completed should be readable by someone who was not on the account and produce a clear understanding of what was tested and what was learned.
Archived holds superseded hypotheses - ideas that were valid at one point but are no longer relevant (a platform changed its algorithm, a client pivoted their product, a creative format was deprecated). Archiving is not deleting: these cards sometimes become relevant again when context changes.
The anatomy of a HADI card
Every card on the board carries a fixed set of fields:
- Hypothesis (the falsifiable claim with metric and threshold)
- Trigger (the data signal or observation that generated this hypothesis)
- Priority (High / Medium / Low - based on potential impact and ease of testing)
- Campaign objective (determines WIP slot)
- Action design (control, variant, duration, spend threshold, decision date)
- Data record (metric result, spend, impressions, confounds, verdict)
- Insight (transferable learning, applicability tag, follow-on hypothesis)
- Status (Backlog / In Progress / Completed / Archived)
The trigger field is often skipped. It should not be. Every hypothesis should trace back to a specific observation: a metric movement, a competitor ad spotted, a client brief, a finding from a previous HADI cycle. Hypotheses without triggers tend to be creative instincts dressed as structured experiments - and when they fail, there is no way to understand why.
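The card fields and the rule that a card only advances when its output exists could be modelled roughly as follows. The class name, the plain-string fields, and the mapping of required fields to columns are assumptions made for this sketch. Archiving is left out to keep it short.

```python
from dataclasses import dataclass

# Assumed mapping: fields that must be filled in before a card may enter each column.
REQUIRED = {
    "Backlog": ["hypothesis", "trigger", "priority", "campaign_objective"],
    "In Progress": ["action_design"],
    "Completed": ["data_record", "insight"],
}
ORDER = ["Backlog", "In Progress", "Completed"]


@dataclass
class HadiCard:
    hypothesis: str = ""
    trigger: str = ""
    priority: str = ""
    campaign_objective: str = ""
    action_design: str = ""
    data_record: str = ""
    insight: str = ""
    status: str = "Backlog"

    def missing_for(self, target: str) -> list[str]:
        """Empty fields that block a move into the target column."""
        needed = []
        for column in ORDER[: ORDER.index(target) + 1]:
            needed += REQUIRED[column]
        return [name for name in needed if not getattr(self, name).strip()]

    def advance(self, target: str) -> None:
        missing = self.missing_for(target)
        if missing:
            raise ValueError(f"cannot move to {target}: missing {', '.join(missing)}")
        self.status = target


card = HadiCard(
    hypothesis="UGC video lifts awareness-campaign ROAS from 2.1 to >=2.6",
    trigger="Static creative CTR fell for three consecutive weeks",
    priority="High",
    campaign_objective="awareness",
)
print(card.missing_for("Completed"))  # ['action_design', 'data_record', 'insight']
```

A card with data but no insight cannot reach Completed under this rule, which targets the failure mode of running a positive test and moving on.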
The structured experiment discipline most agencies skip
The problem this section addresses: most agencies treat experiment tracking as a reporting task - something done after the fact to explain what happened. HADI inverts this. The documentation work happens before the experiment runs, which changes the quality of the question being asked.
The lean product development research on hypothesis-driven kanban boards identifies one discipline as the single largest predictor of experiment quality: pre-commitment to a decision rule. Teams that write down what a positive result looks like before seeing any data produce significantly more reliable findings than teams that evaluate results after the fact. Post-hoc evaluation introduces confirmation bias at every step - what counts as the test period, which metrics matter, whether external factors explain the result.
For performance agencies, the pre-commitment rule has a practical corollary: the account manager who ran the experiment should not be the only person who evaluates its result. The decision date creates a forcing function: on that date, the account manager presents the data record to at least one other person (a team lead or a peer), and the verdict is agreed collectively. This one structural change removes the most common source of agency testing error: experiments that "worked" because the account manager wanted them to work.
For agencies building the experiment-tracking discipline alongside client reporting, the marketing analytics for agencies framework covers the broader infrastructure - how experiment results feed into the weekly client briefing.
How confirmed HADI outcomes feed an AI agent's memory
This is the dimension of HADI that most frameworks miss - and the one that changes its value proposition from "organised testing" to "compounding intelligence."
When a HADI cycle closes with a Confirmed verdict, the Insight entry is not just a record in a spreadsheet. In Prooflytics, confirmed hypothesis Insights are promoted to the AI agent's memory - a structured knowledge base that the agent queries when generating daily briefings, anomaly explanations, and campaign recommendations.
Here is what that means in practice. An account manager confirms that extending the Meta attribution window from 7-day click to 28-day click reveals a 22% ROAS gap for this client's vertical. That finding enters the agent's memory with a transferability tag: "ecommerce, high-ticket, 14-28 day consideration cycle." The next time the agent generates a briefing for a client in the same category and detects a ROAS anomaly, it queries the memory, finds the attribution window finding, and surfaces it as a candidate explanation - ranked by confidence.
This creates a feedback loop that gets more accurate over time. Each confirmed hypothesis makes the agent's recommendations more specific and more verifiable. Each rejected hypothesis prevents the agent from repeating the same recommendation when it does not apply.
Failed hypotheses are equally valuable for agent memory. A rejected finding - "extending attribution window to 28-day click did not change ROAS for this client" - tells the agent to deprioritise that explanation for similar client profiles. Without documented failures, AI agents tend to cycle through the same recommendations regardless of whether they have been tested and found inapplicable. HADI gives the agent a verified record of what has and has not worked across the portfolio.
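To make the mechanics concrete, here is a deliberately simple sketch of a tag-based memory lookup. It is not Prooflytics' implementation, which is not described here. Every name in it, and the scoring rule (shared tags count for a confirmed finding and against a rejected one), is an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    explanation: str       # short label for the candidate explanation
    verdict: str           # "confirmed" or "rejected"
    tags: frozenset[str]   # transferability tags, e.g. vertical and ticket size


def rank_explanations(memory: list[Finding], profile: set[str]) -> list[str]:
    """Rank explanations by tag overlap with the client profile."""
    scores: dict[str, int] = {}
    for finding in memory:
        overlap = len(finding.tags & profile)
        sign = 1 if finding.verdict == "confirmed" else -1  # rejections deprioritise
        scores[finding.explanation] = scores.get(finding.explanation, 0) + sign * overlap
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score > 0]


memory = [
    Finding("attribution window too short", "confirmed",
            frozenset({"ecommerce", "high-ticket"})),
    Finding("creative fatigue", "rejected", frozenset({"ecommerce"})),
]
print(rank_explanations(memory, {"ecommerce", "high-ticket"}))
# ['attribution window too short']
```

The rejected finding does real work here: without it, "creative fatigue" would carry no negative evidence and could be suggested again for the same client profile.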
Prooflytics HADI hypothesis testing surfaces this feedback loop in the daily briefing as hypothesis cards linked to campaign anomalies - when a metric shifts, the briefing shows which open hypotheses are relevant and which confirmed findings from similar accounts apply.
The cross-client knowledge transfer that multiplies every experiment
A confirmed HADI finding on one client account has potential value on every comparable account in the agency portfolio. Realising that value requires a deliberate transfer mechanism - it does not happen automatically.
The practical format: a monthly 60-minute session where account managers present 1-2 completed HADI cycles from the past month. The group identifies transferable findings, tags them by vertical and campaign type, and adds them to the agency-wide hypothesis library. Any new hypothesis generated by those findings is added to the relevant client backlogs immediately.
This session converts isolated client work into institutional knowledge. An agency running HADI across 12 client accounts and conducting monthly cross-client transfer sessions can extend each experiment result to as many as 12 accounts, because a finding confirmed on one account can be tested and verified on comparable accounts within the same month. An agency without this mechanism gets 12 unrelated data points that never compound.
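The tagging step in the transfer session can be sketched as a subset match between a finding's tags and each account's tags. The function name, account names, and tags below are hypothetical, and requiring every tag to match is one possible definition of "comparable".

```python
def transfer_targets(finding_tags: set[str], source_account: str,
                     accounts: dict[str, set[str]]) -> list[str]:
    """Accounts, other than the source, whose tags include every tag on the finding."""
    return sorted(
        name for name, tags in accounts.items()
        if name != source_account and finding_tags <= tags
    )


# Hypothetical portfolio, tagged by vertical and campaign type.
accounts = {
    "acme": {"ecommerce", "high-ticket"},
    "bolt": {"ecommerce", "high-ticket", "meta"},
    "cedar": {"saas"},
}
print(transfer_targets({"ecommerce", "high-ticket"}, "acme", accounts))  # ['bolt']
```

Each account returned would receive a new Backlog hypothesis derived from the finding, rather than having the finding applied without a test.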
For agencies producing weekly client deliverables, confirmed HADI Insights feed directly into the white-label weekly report - giving clients a narrative of what was tested and what was learned, not just what the metrics did.
For campaigns where testing a hypothesis requires knowing whether a channel's traffic is incremental at all, the incrementality testing guide covers the measurement methodology that provides the counterfactual baseline HADI needs for conversion-focused experiments.
Bottom line
- HADI turns each experiment into a knowledge artifact: hypothesis written before the test, verdict (confirmed, rejected, or inconclusive) recorded after, transferable Insight documented for future readers.
- The hypothesis board tracks experiments through four columns - Backlog, In Progress, Completed, Archived - with a WIP limit of one active experiment per campaign objective.
- Pre-commitment to a decision rule (metric, threshold, decision date) is the single discipline that separates reliable experiment results from post-hoc rationalisation.
- Confirmed Insights feed the AI agent's memory, making every future recommendation more specific and verifiable. Rejected findings are equally valuable: they prevent the agent from repeating inapplicable advice.
- Monthly cross-client transfer sessions multiply each experiment result across the portfolio - a finding confirmed on one account becomes a tested hypothesis on every comparable account.
- For the full operational workflow - how to write hypotheses, run cross-client sessions, and integrate results into client reporting - see how to run HADI hypothesis testing for client campaigns.
You can read independent reviews of Prooflytics on G2 and compare it to alternatives in the marketing intelligence category.
Start a free trial at Prooflytics - the HADI Kanban is live from day one, with competitor-to-HADI one-click import and AI agent memory that compounds with every confirmed experiment.
Frequently asked questions
What is a HADI hypothesis board?
A HADI hypothesis board is a kanban-style visual management system for tracking marketing experiments through the four stages of the HADI framework: Hypothesis, Action, Data, and Insight. Each experiment is a card that moves from Backlog through In Progress to Completed, carrying a full record of the hypothesis written before the test, the experiment design, the data result, and the transferable Insight. The Completed column becomes a searchable knowledge base of confirmed and rejected findings.
How is HADI different from a standard A/B testing process?
A/B testing is a method for splitting traffic between variants. HADI is the management framework around any experiment - including A/B tests, bid strategy changes, audience swaps, and budget reallocations. The key difference is documentation discipline: HADI requires a falsifiable hypothesis with a named metric and success threshold to be written before the experiment runs, and a verdict (confirmed, rejected, or inconclusive) to be recorded when it closes. Standard A/B testing has no equivalent pre-commitment requirement, which is why post-hoc rationalisation of results is common.
How many HADI experiments should run at once per client?
One active experiment per campaign objective. Running multiple simultaneous tests on the same campaign prevents you from attributing a result to a specific change. An account with four distinct campaign objectives - awareness, traffic, conversion, retention - can carry four concurrent HADI cycles without contaminating data. Running two tests on the same objective simultaneously produces inconclusive results that waste the experiment budget.
How does a hypothesis board feed an AI agent's memory in Prooflytics?
In Prooflytics, confirmed HADI Insights are promoted from the hypothesis board into the AI agent's memory - a structured knowledge base queried when the agent generates daily briefings and campaign recommendations. A confirmed finding tagged with a client vertical and campaign type becomes a candidate explanation the next time the agent detects a similar anomaly on a comparable account. Rejected findings prevent the agent from cycling through recommendations that have already been tested and found inapplicable. The result is an agent that gets more accurate over time as the hypothesis board grows.
What is the right WIP limit for a marketing hypothesis board?
For most performance agency accounts: 1 active experiment per campaign objective, with a total In Progress cap of 3-4 cards per client account. Research on lean product development kanban boards suggests that 3-4 concurrent items suit the longer feedback cycles in marketing (compared to software development), while still avoiding the context-switching cost of running too many simultaneous tests. The In Progress cap should be set lower for small-budget accounts, where spend would otherwise be spread thin across multiple tests.