There’s an insidious force that seeks to undermine your entire testing program. Very smart people cite it as a reason not to run a test at all, while the Google Optimize stats team has adopted novel methods to combat it. This dreaded menace is known as the newness effect.
To picture how it works, let’s say you sell shoes online. And let’s assume that the majority of your converting visitors are also returning visitors. The typical behavior is to visit the site, browse, make a tentative selection … maybe add to cart … and close the tab.
After this, the typical visitor shops around, looks at other shoes, consults with their spouse, maybe waits till payday. They come back a week later.
If you’ve launched a test during the week they spent mulling things over, and they’re bucketed into a variation, they’re now seeing a new experience. The impact that this “Hey wait a minute, things are different here 🤔” sensation has on their buying behavior … that’s the newness effect.
The newness effect works both ways
Does this update register as fresh, intriguing? These returning visitors may convert in higher numbers than the returning visitors in Control.
On the other hand, did you move stuff around in a way that’s disorienting? They may convert at a lower rate.
Either way, for the first week of your test, results are dominated by these returning visitors who’ve seen the before and after.
But starting on Day 8, your returning-ready-to-buy visitors will come from the cohort who first visited after launch. They’re seeing the same experience they saw a week ago. There’s no newness effect.
This means your results will normalize. Inflated conversion rates will drop, deflated rates will rise, and both will eventually settle. Wherever you land, that’s the conversion rate for visitors who see this experience consistently.
The newness effect is not always a thing
The newness effect only touches returning visitors who came to your site before launch and are familiar enough with it to notice (even subconsciously) what’s changed.
If most of your purchases happen in a single session, or within a day or two of the first session, this is not something you need to worry about.
What looks like a newness effect could also just be noise. Expect some actual losers to beat Control for the first week, and some actual winners to look like losers at first. (An A/A test, or an A/A/A/A/A test, can help you measure how large these chance swings get.)
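If you can’t spare the traffic for a real A/A test, a quick simulation gives a rough feel for the noise. This is only a sketch: the 3% conversion rate, the 2,000 visitors per arm, and the function name are made-up assumptions, so plug in your own numbers.

```python
import random

def simulate_aa_first_week(true_rate=0.03, visitors_per_arm=2000, runs=500, seed=42):
    """Simulate many A/A tests and return the relative gap between the two arms.

    Both arms share the same true conversion rate, so every gap is pure noise.
    All default values are illustrative assumptions, not benchmarks.
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(runs):
        # Each visitor converts independently with probability true_rate.
        a = sum(rng.random() < true_rate for _ in range(visitors_per_arm))
        b = sum(rng.random() < true_rate for _ in range(visitors_per_arm))
        if a > 0:  # skip the (very unlikely) run where arm A has no conversions
            gaps.append((b - a) / a)
    return gaps

gaps = simulate_aa_first_week()
share_big = sum(abs(g) >= 0.10 for g in gaps) / len(gaps)
print(f"A/A runs with a 10%+ gap after week one: {share_big:.0%}")
```

If a large share of these identical-arm runs show a double-digit gap, an early spike in a real test is weak evidence of a newness effect on its own.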
What to do about the newness effect
Figure out if it’s a thing. What proportion of your converting visitors are new versus returning? If you have lots of conversions from returning visitors, how much time typically passes between their initial visit and the conversion?
This period of time is approximately how long you can expect to observe a newness effect.
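Here is a minimal sketch of that check. It assumes you can export one row per converting visitor with their first-visit date and conversion date; the sample rows, the `lag_summary` name, and the rule that “returning” means converting on a later day than the first visit are all assumptions for illustration.

```python
from datetime import date
from statistics import median

# Hypothetical export: (first_visit_date, conversion_date) per converting visitor.
conversions = [
    (date(2024, 3, 1), date(2024, 3, 1)),
    (date(2024, 3, 1), date(2024, 3, 8)),
    (date(2024, 3, 2), date(2024, 3, 9)),
    (date(2024, 3, 3), date(2024, 3, 3)),
    (date(2024, 3, 4), date(2024, 3, 10)),
    (date(2024, 3, 5), date(2024, 3, 13)),
]

def lag_summary(rows):
    """Return (share of converters who are returning, median lag in days)."""
    lags = [(converted - first).days for first, converted in rows]
    # Assumption: a converter counts as "returning" if they bought on a later day.
    returning = [d for d in lags if d > 0]
    share_returning = len(returning) / len(lags)
    median_lag = median(returning) if returning else 0
    return share_returning, median_lag

share_returning, median_lag = lag_summary(conversions)
print(f"Returning share of converters: {share_returning:.0%}")
print(f"Median days from first visit to purchase: {median_lag}")
```

The median is used here because a few very slow buyers would drag an average upward; a higher percentile may be a safer window if your lags vary a lot.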
Do you see much higher (or lower) results right after launch, with the difference tapering off after the expected time?
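One rough way to look for that pattern is to compare the variation’s lift inside the expected window with its lift afterwards. The sketch below assumes daily totals per arm and a 7-day window; the numbers and function names are invented for illustration.

```python
def relative_lift(ctrl_conv, ctrl_vis, var_conv, var_vis):
    """Relative lift of the variation's conversion rate over Control's."""
    ctrl_rate = ctrl_conv / ctrl_vis
    var_rate = var_conv / var_vis
    return (var_rate - ctrl_rate) / ctrl_rate

def early_vs_late_lift(daily, window_days):
    """Split daily rows at window_days and return (early_lift, late_lift).

    Each row is (ctrl_conversions, ctrl_visitors, var_conversions, var_visitors).
    """
    def pooled_lift(rows):
        # Pool the counts across days before computing rates.
        totals = [sum(col) for col in zip(*rows)]
        return relative_lift(*totals)

    return pooled_lift(daily[:window_days]), pooled_lift(daily[window_days:])

# Hypothetical 14-day test: a strong first week, then almost no difference.
daily = [(30, 1000, 39, 1000)] * 7 + [(30, 1000, 31, 1000)] * 7

early, late = early_vs_late_lift(daily, window_days=7)
print(f"Lift during the window: {early:.1%}")
print(f"Lift after the window:  {late:.1%}")
```

A big early lift that shrinks later fits the newness story, but the same shape can come from noise alone, so treat this as a prompt to look closer rather than a verdict.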
If so, you may need to run tests longer. You may need to stop pausing “underperforming” variations early on. You may find this extends your test duration so much that it’s infeasible to run tests.
If you’re in that dire situation, there is still hope. You can use statistical methods to compensate for this effect (see the Google Optimize link above). And if you consistently see positive newness effects, there may be a case for testing personalization on the basis of new/returning visitors. (Test it, please don’t just do it 🙏)
Are you suffering under the newness effect? Want to chat about it? Hit reply and let’s commiserate.