When to Start A/B Testing (VWO vs Optimizely vs Intelligems)

The Sellarix team · 9 Jun 2026 · 5 min read

Here's the most expensive mistake I see in conversion optimization, and it has nothing to do with the tool you pick. It's running an A/B test, watching the variant "win" by 8% after four days, declaring victory, shipping it, and then quietly wondering why revenue didn't budge. The test wasn't a win. It was noise wearing a costume. The uncomfortable truth is that most stores don't have enough traffic to test the things they're trying to test. And the second the math doesn't work, the prettiest A/B testing tool in the world becomes a random number generator with a dashboard.

Why I care about this so much

I got burned by this early. I ran a button-color test, called it after a few days because the numbers looked great, rolled it out, and saw exactly nothing. When I finally ran the sample-size math properly, I realized I'd have needed weeks of traffic to detect a real change of that size. I'd been celebrating a coin flip. So before I talk tools, I want to talk about the gate almost nobody checks first: do you even have the traffic to test?

The traffic math, in plain English

A/B testing is just statistics, and statistics needs sample size. The number you need depends on three things: your baseline conversion rate, the minimum lift you want to reliably detect (the MDE), and your confidence threshold (95% is standard). Most calculators also ask for statistical power, the chance of catching a real effect of that size, and 80% is the usual default. The rule of thumb people quote for a highly reliable test is roughly 30,000 visitors and 3,000 conversions per variant. That's the gold standard, not the floor. As a practical minimum you want at least 1,000 visitors per variant before any result means much. And the smaller the change you're trying to detect, the more traffic you need. Chasing a 2% lift takes dramatically more traffic than chasing a 20% one.
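To make that concrete, here is a minimal sketch of the kind of calculation a sample-size calculator runs. It assumes a two-sided two-proportion z-test with the standard normal-approximation formula; the function name `visitors_per_variant` and its parameters are made up for illustration, and commercial tools (especially Bayesian or sequential engines) may compute things differently.

```python
from math import ceil, sqrt
from statistics import NormalDist


def visitors_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (two-sided two-proportion z-test).

    baseline_rate: control conversion rate, e.g. 0.02 for 2%
    relative_lift: minimum detectable effect, relative, e.g. 0.10 for +10%
    alpha:         1 - confidence (0.05 means 95% confidence)
    power:         chance of detecting a true lift of that size
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for confidence
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Illustrative inputs only: a 2% baseline and three different target lifts.
for lift in (0.02, 0.10, 0.20):
    print(f"{lift:.0%} relative lift: {visitors_per_variant(0.02, lift):,} visitors per variant")
```

The required sample grows roughly with the inverse square of the difference you are trying to detect, which is why small lifts get expensive so quickly. Treat the output as an estimate and check it against the calculator in whichever tool you use.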

Chart: Approximate visitors per variant needed at different baseline conversion rates to detect a 10% relative lift at 95% confidence and 80% power. Illustrative sample sizes (author's calculation using a standard two-proportion power formula); treat them as estimates and run your own calculator on your real baseline. Sources in the appendix.

Here's how I translate that into a decision. If you're doing under ~1,000 conversions a month total, classic on-page A/B testing is mostly a waste of time, and you should test bigger swings (whole page redesigns, radically different offers) or skip A/B testing for now and just ship informed bets. Once you're comfortably past a few thousand conversions a month, smaller, sharper tests start to pay off.

The general guidance is to run any test for at least 1-2 full weeks (ideally 2-6) so you capture weekday/weekend behavior, and to lock your sample size before you start rather than peeking and stopping when it looks good.
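The duration guidance can be sketched as a quick planning check. This is a rough illustration, not a feature of any of the tools below: `weeks_to_run` and `is_practical` are hypothetical helpers, the 2-week floor and 6-week ceiling are taken from the guidance above, and the traffic figures are invented.

```python
from math import ceil


def weeks_to_run(visitors_per_variant, weekly_visitors, variants=2, min_weeks=2):
    """Whole weeks needed to reach a pre-committed sample size.

    Rounds up to full weeks so weekday/weekend cycles are covered,
    and never returns fewer than min_weeks.
    """
    if weekly_visitors <= 0:
        raise ValueError("weekly_visitors must be positive")
    total_needed = visitors_per_variant * variants
    return max(min_weeks, ceil(total_needed / weekly_visitors))


def is_practical(weeks, max_weeks=6):
    """True if the test fits inside the assumed 6-week ceiling."""
    return weeks <= max_weeks


# Hypothetical store: needs 80,000 visitors per variant, gets 20,000 eligible visitors a week.
weeks = weeks_to_run(visitors_per_variant=80_000, weekly_visitors=20_000)
print(weeks, is_practical(weeks))
```

If the answer comes back well past the ceiling, that is the signal to test a bigger swing (a larger expected lift needs a smaller sample) instead of stretching the test or stopping early.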

The three tools, and who they're really for

Once the math says go, the tool matters. These three get pitched at the same buyer but they are genuinely different animals.

| Tool | Best for | Price (entry) | Stats engine | Who it fits |
| --- | --- | --- | --- | --- |
| VWO | All-round CRO: A/B, multivariate, heatmaps, surveys | ~$190/mo Growth, up to ~$1,049/mo Pro | Bayesian (SmartStats) | Mid-market teams wanting one CRO suite |
| Optimizely | Enterprise experimentation at scale, server-side | No public pricing; ~$36k/yr minimum, often $50-75k/yr | Frequentist sequential (Stats Engine) | Large orgs with dev + data teams |
| Intelligems | DTC price, shipping, bundle & margin testing on Shopify | $99/mo content; $499/mo profit; $999/mo Blue | Profit-per-visitor focused testing | Shopify DTC brands optimizing margin |

How I'd actually choose

If you want a Swiss-army CRO suite and you're not enterprise, VWO is the comfortable pick. You get A/B, multivariate, heatmaps, and session recordings in one place, its SmartStats Bayesian engine is friendly to read, and pricing is sane for mid-market. Note that VWO tightened its free tier in late 2025, so don't plan around the old free plan.

If you're a large organization running experiments across web and server-side, with a data team that wants statistical rigor and feature flagging, Optimizely is the heavyweight. But go in knowing what you're signing up for: there's no public pricing, the sales cycle can run 4-8 weeks, and you're looking at five figures a year minimum. For a 500k-visitor store running a handful of tests, people report Optimizely landing around $50-75k/year versus a few hundred a month for VWO. That gap only makes sense at real scale.

And then there's the one I'd point most Shopify DTC founders to first. Intelligems built its whole business around price and margin testing, which is the thing the other two largely don't touch. It tests actual prices, shipping thresholds, free-shipping bars, and bundles, and it reports on profit per visitor rather than just conversion rate. Optimizely will tell you which button converts better. Intelligems will tell you whether selling at $48 instead of $45 made you more money after the conversion drop. For a margin-obsessed DTC brand, that's a completely different and often more valuable question. Its entry tier at $99/mo also makes it far more approachable than Optimizely's $36k floor.
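To see why profit per visitor can disagree with conversion rate, here is a small worked sketch of the $45 versus $48 question. Everything except the two prices is an assumption made up for illustration: the $20 unit cost, the 3.0% and 2.8% conversion rates, and the simplification of one unit per order with no shipping or discount costs. It is not a description of how Intelligems computes its metric.

```python
def profit_per_visitor(price, unit_cost, conversion_rate):
    """Expected gross profit per visitor, assuming one unit per order."""
    return conversion_rate * (price - unit_cost)


# Hypothetical numbers: the higher price converts slightly worse.
control = profit_per_visitor(price=45.0, unit_cost=20.0, conversion_rate=0.030)
variant = profit_per_visitor(price=48.0, unit_cost=20.0, conversion_rate=0.028)

print(f"$45: ${control:.3f} profit per visitor")
print(f"$48: ${variant:.3f} profit per visitor")
print("Higher price wins on profit" if variant > control else "Lower price wins on profit")
```

With these made-up inputs the $48 variant loses on conversion rate but earns more per visitor, which is exactly the case a conversion-only readout would call a loss. The same sample-size caveats apply: a difference this small still needs enough traffic to be distinguishable from noise.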

The takeaway

Pick the tool second. Check the traffic first. If you can't clear roughly 1,000 conversions a month, test big swings or hold off, because small tests on thin traffic just manufacture false confidence. When you've got the volume, match the tool to the job: VWO for all-round CRO, Optimizely for enterprise-scale experimentation, and Intelligems when the question is really about price and margin, not button color. So before you open a single tool, run your numbers through a sample-size calculator and ask yourself: do you actually have enough traffic to detect the win you're hoping for, or are you about to celebrate a coin flip?

Sources
