
Ben Labay on SaaS Experiments


In this episode, I talk to Ben Labay, Managing Director of CX Experimentation at Speero.

He breaks down how to meet the challenge of doing user research when you’re selling a B2B SaaS product to a highly specialized market.

Listen on Apple Podcasts, Stitcher, Spotify, Google Podcasts, or right here👇

Listen on SoundCloud

SaaS Experiments · B2B User Research with Ben Labay

… or on YouTube


Transcript

Brian (00:01):
Hello, I’m Brian, and this is SaaS Experiments. Today I’m here with Ben Labay. Ben is the Managing Director of Speero, which is the CXL agency, and he’s worked there with brands that you know and love, like ADP and Codecademy. He teaches the Become Great at Voice of Customer Data course at CXL Institute, and you can find him on LinkedIn and YouTube, where he breaks down everything from mobile optimization and eye tracking to the mental models that underlie successful optimization. Ben, super excited to have you here. Thanks.

Ben (00:36):
I’m excited to be here and chat about, in this case, SaaS experiments on the B2B side. I do a lot of work there, and yeah, I’d love to rap about it.

Brian (00:47):
Yeah, let’s do it. I know you’ve got quite a wealth of experimentation experience, but I dragged you on here today to talk about the stuff that comes before — that supports or informs — the experimentation program. We’d broadly classify that as voice of customer research; it’s the thing you teach a course on. What motivated me to reach out to you is this: for a lot of B2B SaaS marketing departments, you’ve got a customer base, an audience, that is specialized, right? You’re selling to somebody like an HR manager, or somebody in the construction industry, and these people’s experiences and backgrounds and motivations are not your guy-on-the-street motivations and backgrounds. So when we think about the methods we might reach for — surveys, user testing, stuff like this — you want to find a relevant enough audience to carry out these research methods. But how do you do this? Broadly speaking, I just want to check in: first of all, if I’m the marketing director of a SaaS company that sells to HR departments, should I do user testing? Should I be thinking about how to recruit, find, and gather data from people in my target audience? Is that worth pursuing?

Ben (02:18):
Yeah, absolutely. And that can mean a lot of different things. Where I come at it from is the data science and research science side of things. So if there are questions about whether a particular monetization flow, or a particular marketing campaign landing page’s messaging, is going to work well — how do you measure the efficacy of it? It’s a measurement question, right? And on the B2B side there tend to be lower volumes and more specialized audiences, like you alluded to. The people coming through have specific language in their heads; they’ve got specific requirements. So if you want to use that data, or measure it, in an appropriate way, there are some nuances in how to set up that measurement system.

Ben (03:13):
On the B2B side there’s generally a lot more qualitative work to the game, because there’s less volume. I’ve worked with a lot of companies that don’t have the volumes to do A/B testing. We’ll come in and do some audits, some user testing, to get the more qualitative, formative types of data points — less the summative kind that surveying or A/B testing would provide. But certainly on the user testing side, if you have a digital experience, user test it: find friction, find issues with it. Yeah.

Brian (03:53):
You said it — this is my dilemma. I don’t have the volume to just run dozens of experiments on my site, because the leads aren’t coming through at that much of a rate. But I do know there’s much for me to learn about friction or confusion in my digital experience. So how do I find people I can trust to glean that information from? Maybe I come from a background where — with B2C, typically consumer goods — user testing is pretty easy. You can just go to a platform like UserTesting.com or UsabilityHub, and they’ll bring you testers, and you’ll get good stuff from them.

Ben (04:30):
Yeah — maybe not so much in B2B. Let me cycle through some tactical experience on this. UserTesting.com and UserZoom are both pretty robust: really nice screeners and pre-qualifications, and decent-size panels, so you can probably find who you’re looking for. Sometimes it’s not enough, and you’ve got to go deeper. We’ve also used some user testing services like Validately and Userlytics — those are some pretty good ones as well. Validately in particular has a weird pricing model but really robust tech. Those are cool, but when they’re not enough — and you’ll soon realize, oh, I’m struggling to get those 10 user testers through that screener — you’ve got to do some custom paneling work, and it’s really not as intimidating as you might think. Peanut Labs and Cint are two paneling services you can pay pretty good money for to do some user testing.

Ben (05:32):
Another way to do it, if you’re willing to invest and you do it periodically, is through Amazon MTurk, which has one of the largest US-based panels out there; in Europe, the equivalent would be Clickworker. But on the US side of things, it’s a sloppy panel, so you’ve got to be careful with it. It has hundreds of pre-qualifications — are you a cigarette smoker? do you work in finance? do you have a MySpace page? — hundreds of those pre-qualifications you can target. Then you can survey a few hundred people to get a smaller subset that fits your firmographic, and use that list of worker IDs to say: okay, there are 10 of you I want to put through this 20-minute moderated or unmoderated session.
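For anyone who wants to try the MTurk flow Ben describes, here’s a minimal sketch, assuming Python with boto3 and a list of worker IDs collected from a screener survey. The qualification name, worker IDs, reward, and question file are placeholders, not anything Ben prescribes:

```python
# Sketch: tag screened-in MTurk workers with a private qualification,
# then publish a longer, better-paid task that only they can see.
# Assumes AWS credentials are configured; all IDs/values are placeholders.
import boto3

mturk = boto3.client("mturk", region_name="us-east-1")  # MTurk lives in us-east-1

# Worker IDs who passed the broad screener survey (hypothetical).
qualified_workers = ["A1EXAMPLE", "A2EXAMPLE", "A3EXAMPLE"]

# A private qualification marks screened-in workers.
qual = mturk.create_qualification_type(
    Name="hr-software-panel",
    Description="Passed the HR software screener survey",
    QualificationTypeStatus="Active",
)
qual_id = qual["QualificationType"]["QualificationTypeId"]

for worker_id in qualified_workers:
    mturk.associate_qualification_with_worker(
        QualificationTypeId=qual_id,
        WorkerId=worker_id,
        IntegerValue=1,
        SendNotification=False,
    )

# The 20-minute session, discoverable only by qualified workers.
mturk.create_hit(
    Title="20-minute software walkthrough (invited participants)",
    Description="Session for pre-screened participants",
    Reward="20.00",  # dollars, as a string
    MaxAssignments=10,
    AssignmentDurationInSeconds=3600,
    LifetimeInSeconds=7 * 24 * 3600,
    Question=open("external_question.xml").read(),  # XML pointing at your testing tool
    QualificationRequirements=[{
        "QualificationTypeId": qual_id,
        "Comparator": "EqualTo",
        "IntegerValues": [1],
        "ActionsGuarded": "DiscoverPreviewAndAccept",
    }],
)
```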

Ben (06:23):
Right — and you’re pointing that paneling tool over to something like UserTesting or Validately, the tool that actually runs the user testing session. So that’s some tactical mechanics, and a bunch of name-drops of the tools we use. But I’ll also say there are different ways to go about it, and different goals with those ways. Traditional user testing is about observing user behavior. People that don’t know your digital experience are going to trip on your trip wires, because no matter what you look like, you’ve got a gauntlet — a gauntlet of information, or pages, or whatever. If you give them a set of common tasks for what people should be doing, aligned to the common user intents of your audiences: where are they tripping?

Ben (07:24):
Where are they struggling? You’re watching them struggle, watching them be confused, watching them use words that indicate they’re confused. That’s what you’re looking for — not perception or opinions, not “oh, this site looks nice” or “this sucks.” But you can also use those panels for that. For example, standardized survey techniques like the System Usability Scale, or NPS-type user experience and benchmarking metrics for your site. A really common one out there is the SUPR-Q survey. That lets you put a hundred people — not 10, a hundred — through your digital experience. You do a URL-based validation that they went through it, they give feedback, and then they take a standardized survey: ten Likert-scale questions, with some open-ended stuff at the back end. And then you have yourself a setup for benchmarking, right?

Ben (08:26):
It’s the same setup as user testing — the need for a panel: I need a bunch of HR managers, a bunch of people looking for construction management software. You’re still finding that panel, but you’re not looking for friction; you’re looking for perception data points. So user testing is about friction, observing behavior, and this other research method uses the same mechanic, in a way, but scaled. Then you can ask how they perceive loyalty, credibility, appearance, usability — you can get perception data there, which is pretty cool. So, I rambled on there.
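For reference, the System Usability Scale Ben mentions has a fixed scoring rule: ten 1-to-5 Likert items, odd items positively worded, even items negatively worded, rescaled to 0–100. A worked example, with made-up responses:

```python
# Score the System Usability Scale (SUS): ten 1-5 Likert answers -> 0-100.
def sus_score(responses: list[int]) -> float:
    """responses: ten answers on a 1-5 scale, in questionnaire order."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for i, r in enumerate(responses):
        if i % 2 == 0:          # items 1,3,5,7,9: positively worded
            total += r - 1
        else:                   # items 2,4,6,8,10: negatively worded
            total += 5 - r
    return total * 2.5          # rescale 0-40 raw points to 0-100

# One hypothetical respondent:
print(sus_score([4, 2, 5, 1, 4, 2, 4, 1, 5, 2]))  # -> 85.0
```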

Brian (09:08):
No, no — this is perfect, this is where I want to be. If I can recruit this magical panel, then I can get friction data from user testing, and I can know it’s legitimate friction — not friction because the person has no knowledge of my industry and is just pointing out stuff that’s confusing because they’re unqualified. I know it’s real friction; I can take action on it. And I can get perception data from a survey-type UX benchmarking approach, assuming I can bring the right people to it. I think I heard you say there’s a whole spectrum of services or options for getting this done. At the high end, I can just pay someone who does this all day long to do it for me and tell me what they saw. And assuming that’s not in my budget, there are some higher-end software platforms that are pretty robust in how much of the recruitment and setup they do for you. Can you talk a little about how you think about this: is it just easier to invest in the high-end software, versus rolling my own — working with MTurk and bringing my own panel? When does it make sense to do one versus the other?

Ben (10:23):
Yeah, that’s a good question. There are a lot of people using things like UserTesting.com and UserZoom, which can get really pricey for this sort of thing, especially if you want that perception data, because you need hundreds of people. And that’s where it can get a little tricky. If it’s a mainstream product — it’s still B2B SaaS, but it’s something like Miro, “my organization needs an online cloud whiteboard” — well, most people can imagine themselves in a business setting, and you can ask questions about onboarding into a SaaS tool and get them to do that without needing an expensive B2B panel. You can use a general consumer panel for that. But when it’s a really esoteric or niche product or service — supply chain management, or developer tools, or something like that — then the language barrier might just be too much; there might be too much friction in the language of the common tasks.

Ben (11:28):
Right, so there’s a barrier there. You need somebody that’s not familiar with your brand — and this is the rub: they’re not familiar with your brand or website, but they know the language and they know the ecosystem of the product and services, so they can absorb that scenario and set of tasks and not get hung up on those, let alone your digital experience. And I’ll say too — one thing I didn’t mention — our company CXL has a lot of different parts. There’s the agency, which is my realm; there’s CXL Institute and the training side; but there’s also CopyTesting.com, which just got rebranded to Wynter. The reason it got rebranded was a pivot specifically to B2B paneling.

Ben (12:21):
And this is doing message testing — messaging and positioning testing. Think of it as user testing, but on single pages, specific to messaging: message clarity, USP-type stuff. Does the copy resonate? Is it confusing, or is it boring, things like that. So it’s getting quantifiable data on that. But their trick, by the way — and you might’ve set me up for a question on this — their trick is an interesting one. I don’t know if Peep is going to get me for sharing this, but he’s going around finding newsletters in different industries to build the panels. So if you’re, say, a procurement officer in, I don’t know, the industrial air compressor industry —

Brian (13:15):
Yeah, perfect. Right — that’s it: how do I find these people?

Ben (13:19):
You find these people — you find some industry newsletters, and you advertise on those newsletters, looking to create a panel specific to that industry. And you go industry by industry: you find a bunch of those industry-based newsletters, which are growing like weeds, of course, because newsletters are popping up just like podcasts — and you find your panels. So every time a customer comes in and says, “oh, this is great, I need a panel for this” — okay, let’s go recruit, let’s get some hundreds of people willing to speak to that. And that’s how it grows.

Brian (13:59):
So — this is really opening things up for me — you can recruit on the platform, if the platform supports it, for example UserTesting.com. You can recruit on your own through a tool like MTurk. Or you can actually go guerrilla recruiting, I guess, in the same places where you might consider just plain marketing.

Ben (14:21):
So that guerrilla recruiting is the way Wynter is getting its panel people, right — that’s how it finds the right people and makes sure the panel is super rock-solid for whatever industry it speaks to. And you’ve got to be careful, because — again, what I said in the beginning — if you’re going to do copy testing or user testing, you don’t want people that know your jargon super, super well, because they’re going to be biased. They’ve already tripped on your wires; they already kind of know where you’re coming from. You need people that shake you out of what you assume when you’re doing these types of exercises, and people that know your stuff just won’t help you do that.

Ben (15:16):
They’re just like you. So that’s why — sometimes, when we really, really struggle, we’ve done this before — you use a tool like Hotjar. You identify new users to the site, and you have a pop-up: hey, have you ever been here before? No. Okay — do you want a $10 Amazon gift card to spend 10 minutes with us doing this exercise? Yes. Okay. And then they pop off the site, do the exercise, and go back to the site. So that’s a true guerrilla way.

Brian (15:50):
This is interesting. Maybe we’ll come back to this topic — we’ve now given two examples where you might accost someone for participation in a panel when you could have used that opportunity to focus on a conversion: on the site, with the Hotjar dialogue, and in a relevant newsletter, with advertising. So maybe we’ll come back to the question of trade-offs: when is it more important to recruit for a panel than to blast people with ads and offers?

Ben (16:21):
Well, I mean, how much is an MQL worth, right? One of the funnest examples of a strategy like this came from Guillaume Cabane — he does consulting and invests in all sorts of stuff now, but he worked at Drift for a long time, and at Segment before that, I believe. He said they would use Drift so that the question the bot would eventually get to was: where are you located? And then they would check a delivery service to see if coffee delivery was available to them — hey, would you care for a cup of coffee or tea?

Ben (17:08):
And then they would actually ask: sure — how do you take it? And they would deliver it: hey, you’ve got 15 minutes while you’re waiting for your coffee or tea, do you care to get a demo? So they would actually ship it to them, and it would cost them 20 bucks end to end. But in terms of lead costs and acquisition costs, that’s nothing. So think about it: let’s do user testing with new users coming to my site — we can pay you an Amazon gift card, a cup of coffee, to assess our situation and ask a bunch of questions, and now you’re way more informed about what we do. The goals aren’t mutually exclusive, but it’s messy: okay, who’s got what goal? Are you a user tester? Are you a marketer? What’s your metric? That kind of thing. So you’ve got to be careful not to be too biased, and to use that data with integrity. But otherwise, yeah, you can sort of overlap your setup like that. I’ve seen good things in those regards.

Brian (18:20):
Okay. Well, now my core question to you has gotten more and more complicated as we’ve gone on, because what I want to get at is the validation of a potential test subject or panel participant, right? How do we decide whether someone’s qualified? And it’s totally fair to take it from the perspective of one of these approaches — one of these channels for acquiring panel members — because it sounds like they’re a little different. There’s the MTurk approach; there’s the targeted newsletter, where you can make a lot more assumptions about anyone reading it; all these different areas to potentially find people; and there’s someone on your site, where you hope they’re qualified and aware of the problem space. So however you’d like to tackle it: what if I’ve never done this before? How should I think about making sure that I get the right people — qualified enough people?

Ben (19:13):
Going back to the way you put the question a couple back — do you go for the expensive paneling service, or one of these halfway approaches, whether it’s MTurk or a newsletter or even your own site? I think if you’ve never done it before, stick to something simple: buy a round of credits, like 10 or so, from UserZoom or UserTesting and do it through there. Concentrate not on the panel variables but on your script, your screener variables. Focus there, nail that, get the muscle memory of actually doing it and processing the data — processing it correctly, making sure you’re using the data correctly, which is a big, big problem with user testing data, by the way. In my mind it’s some of the most dangerous data, in that it gets misused quite a bit. It’s dangerously salient: someone’s face and video and voice is really powerful — it can persuade you, or your superior, into whatever you want.

Ben (20:30):
It can support your confirmation bias really quickly, especially toward those perception and opinion types of things, when you should only be using it for the friction and behavior type of stuff. So if you’re new to doing it, concentrate on those variables, use it correctly, get the muscle memory down with these types of research activities. And then, if you’re looking to do it periodically — if you’ve got a user research team that you’re resourcing — these tools get quite expensive, so you can start getting creative at that point.

Brian (21:10):
Okay. I think I heard you say: focus on your script and your screener questions. Did I get that right? Can you tell us a little more about how not to mess those up?

Ben (21:20):
Yeah. There’s a lot of classical stuff, like: no leading questions. You’re not looking to ask them to, say, “find the best thing you can do on this website” — I can’t think of a good example on the fly right now, but you don’t want to lead them. What you’re looking to do is get them to walk through a scenario: walk through their day, walk through a decision-making process, walk through a common set of tasks, considering a really common user intent. And the user intent is set up with the scenario and the screener — the screener and the scenario provide that frame of mind. It’s kind of a layered-cake game.

Ben (22:09):
We want to get them primed for the subject matter, primed for a day in the life of this type of decision you’re about to make, or this set of tasks you’re about to come along to. You’re this type of person; you’ve just landed on a Google search page, and you’re searching for this — now jump in: what would you do? Things like that. So that’s the scenario — getting into the frame of mind. The common mistakes: getting the wrong people, which has more to do with getting the intent right, but also making sure they don’t know you, they don’t know your website, they don’t know your brand — and that they’re relevant to you. If you sell clothing for teenagers, don’t talk to really old people. Those kinds of things, to make sure you get the right people.

Ben (23:05):
And then the next big common mistake: make sure you’re focused purely on observing behavior around friction and confusion. I saw a script just this week from John, at another agency we work alongside that does a lot of this work — a giant agency doing really, really good stuff. They’re about to user test 20 or so people, and they’re going to ask questions around opinions and perception, and also some segmentation types of data — like, “what kind of intent would you have if you were to do this?” — to put these 20 or so people into buckets of firmographics or demographics. And my data alarm bells started going off. You can collect this data, but you shouldn’t, because you’re going to misuse it.

Ben (24:00):
You’re going to have a hard time not looking at it and being influenced by it. And they can make the case — and they did — that it’s just formative, that it just gives us a feel for things. That’s when things get dangerous and go awry, in my opinion. I like to have really surgical goals and only collect the data that’s needed, not a ton. When I do surveys, I don’t like to have six or eight questions; I like four questions in a survey, and it’s only about motivation. Let’s get fears, uncertainties, and doubts in another one — an exit-intent poll. Let’s only ask about what motivated you, and not also what almost held you back, and all that kind of stuff. So don’t cross-wire the goals. With user testing, it’s surgically about friction. Yeah.

Brian (24:52):
Well, that’s a really good point — the playback of a user testing session gets you emotionally, so you really don’t want to be listening for the wrong things. Is your main objection to gathering opinions in a user testing session mainly about the sample size — the fact that I’ve only got a handful?

Ben (25:15):
Actually, I published this on LinkedIn last week. My main mental model for this is: knowledge equals experience plus sensitivity. We want to know what works on the site, so we look at a set of data from a user test — that’s our experience. But the question is: how sensitive is that data from the experience? Is it sensitive enough to draw, in our equation, the knowledge that all women think the website is lively? Can we really say that from a user test? No, because it doesn’t have the sensitivity to come to that type of knowledge. So again: knowledge equals experience plus sensitivity. This is why we have statistics — it measures the sensitivity of the experience, so we can be confident that we can draw conclusions and that the experience is transferable. And therefore we have knowledge.

Brian (26:19):
So let’s talk about the sensitivity of the right kind of data to pull from a user testing panel — which, if I’m hearing you correctly, is mainly focused on behavior, friction on the site. How do you evaluate, say, a point of friction that user testing uncovers?

Ben (26:35):
Yeah. Any time people struggle — and they can struggle with their words, they can struggle with their face, they can struggle with the journey they take: they take a left when they should be taking a right, that kind of thing. So there’s a task, and you should have in your mind a way to successfully and quickly perform that task. How do they struggle? Where do they struggle? That’s the heart of the goal of user testing: to find that struggle.

Brian (27:09):
And so we’re not doing math on these results — we’re just observing and making a list: okay, we saw X people take this weird route, and we saw Y people get confused at this point. And then from there, where would you take it?

Ben (27:26):
Well, real quickly, on the X and Y you mentioned: one of my pet peeves is when people report out on user testing results with the words “some,” or “most,” or “80%.” When they use those, my alarm bells go off. Don’t say “most” — you’re dealing with five people. It’s three of five. Say it right: three of five struggled to find testimonials on the site when they were looking for what other people were saying about the software. There you go. Or: two of three people skipped the cart altogether and didn’t have a chance to see the upsells, or whatever. So you’re observing their behavior and documenting the success rate of that small number of people who, again, struggled, and things like that.
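A quick illustration of why Ben insists on “three of five” rather than percentages: the uncertainty around a five-person count is enormous. A minimal sketch using the standard Wilson score interval, with illustrative numbers only:

```python
# Wilson 95% confidence interval for a proportion: shows how little
# a five-person sample pins down compared to a hundred-person panel.
from math import sqrt

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - margin, center + margin

print(wilson_interval(3, 5))    # ~(0.23, 0.88): "3 of 5" is nearly unbounded
print(wilson_interval(60, 100)) # ~(0.50, 0.69): n=100 starts to mean something
```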

Ben (28:27):
And then, what you do with that data — user testing is quite powerful for getting a list of JDI, “just do it,” types of things: let’s see about raising the awareness of this link in the header, or this type of messaging versus that type of messaging, things like that. But really what we use it for is to triangulate with other types of data. Is it related to data we saw from surveying around motivations? Is it related to data we saw from intercept polls about fears, uncertainties, and doubts? Triangulating the signal from multiple data points gives us more confidence that it’s an issue worth addressing, and that we should prioritize it — to address with a just-do-it fix, a redesign, or an A/B test, if the volume’s there. Yeah.

Brian (29:25):
Okay. Perfect. All right. So we can uncover patterns and we’ll start to see trends and user testing figures into that, but we’re not trying to do stats and make assertions based on five, 10, even 20 sessions with this particular tool. Number

Ben (29:42):
Once you start getting to 20 or 30, you can start to think about some stats — it gets loosey-goosey there. But definitely when you get to 50 and a hundred, you can start doing some cool stuff, some benchmarking-type work. Even those are dangerous in themselves. If you use your quote-unquote “user tests” — it’s not moderated, you’re not watching a hundred videos, but you get a hundred people to go through a particular journey with a couple of tasks and answer some questions in a survey afterwards — that’s effectively what an NPS metric is, but you can do it with a panel to get some other types of info. Then you have a data point that is sort of accurate but not precise, in a lot of ways.

Ben (30:31):
And there are trade-offs there. You can use it longitudinally, for your own site through time — that’s what a lot of people do with NPS, tracking it as it goes up and down — but you can also do it competitively. This is the fun part: you can send a hundred people through a competitor’s website and have them take the same surveys. What’s clear? What’s the usability, clarity, credibility, things like that. You can ask all those types of questions.

Brian (31:04):
Okay, perfect segue — let’s go there. I think you’ve made it really clear: if I’m new to this and I want to do some user testing, buy some credits, get good at producing a script and a screener, and just go do some — it’ll be okay. But if you do want to scale — if you want a panel of a hundred, and you want to do some benchmarking, or you want to send a hundred people through your own site and a hundred through a competitor’s site — now we potentially need to save a little money on recruitment. So tell me about the kinds of screening questions or qualifications you think about. How qualified do people need to be, and how do I figure that out?

Ben (31:48):
Yeah, there’s give and take, considering a panel provider. If you add too many qualifications, you’re going to restrict your pool of people. Do you really care if they’re married or not? Do you really care if they’re in this many age brackets? I think the demographics are pretty easy; it’s the firmographics that are tough. Do I need a senior-level IT manager, or can I deal with just anybody in information technology? So there are opportunity costs in getting more refined with your panel. Again, UserTesting and UserZoom can get pretty good. MTurk has those sloppy, user-reported qualifications — and that’s the key there, “user-reported”: you’ve got to do some due diligence and add some dummy questions and screeners to make sure people are who they say they are.

Ben (32:49):
But you can effectively create and curate your own panel through the whole thing, so it can be quite powerful. If I were in-house at a company, that’s what I would mess with. From the agency side, we’ve had some clients where we’ve had really good success with MTurk, and some where it’s failed miserably — and sort of continues to fail. If I were in-house, I would put a lot of resources into making it good; but we do it so periodically that it’s not so good. So it can be a bit hit or miss. But if you’re looking to scale — B2B SaaS, and ideally you’ve got some budget, because you’re B2B SaaS, right, just by nature — think about what your research plan might be throughout a year.

Ben (33:36):
Are you going to do a classic 10-person user test once a quarter, or once every six months, and maybe a benchmarking type of exercise once a year or something like that? Then you can cost that out and play with some of those numbers, based on how aggressive you want to be with that research methodology. It still might be advantageous to go with one of these panel providers. Yeah.

Brian (34:09):
Can you talk a little bit about screener questions? And the, the impression I get is that you’re asking kind of gotcha questions in case somebody is pretending to have a qualification that they don’t have, that you can filter them out. Is that the idea behind this?

Ben (34:25):
Yeah. So you potentially do have some screener questions where people get kicked out of the panel, but you’re also asking questions — not gotchas, but kind of random ones — to find your panel in the first place. You might have an initial survey — we do this with MTurk quite a bit — where the survey is just a random consumer survey: what’s your favorite car, what’s your favorite this, a bunch of random questions. And it’s got one in there about the thing you actually care about: are you in charge of HR services at your company, or something like that. I’m oversimplifying a little here, but generally you don’t want them going through thinking, “I’ll get paid if I just answer correctly.” You want to really identify the right people — and since they don’t know what the right answer is, they won’t know how to answer to get paid and go to the next level. That’s the principle behind it.

Brian (35:41):
Okay, that’s really helpful. I’m starting to see how this could take shape if I decided: okay, I want to be doing this year over year, my audience is super niche, my budget is fairly constrained, but this is important — I want to invest the time and energy to build out a bit of an internal process, a function within my team. So we’ve got this first-round, very broad screener, where you don’t even know what it’s about when you take it. How exacting would you be about the demographic or self-reported qualifications for that batch — just to take that screener quiz?

Ben (36:19):
So you can look at the pre-qualifications that are self-reported and say: okay, let’s talk to people with high income — say above a hundred K, self-reported — who are in the business administration space and in charge of buying software. So you’ve found them there. And it’s just a general survey, 10 questions; I can send it out, and the entire panel is going to have tens of thousands of people that sort of fit those. Then you ask a set of questions to help refine that group: male or female, what state are you located in, do you deal with HR, payroll, X, Y, or Z — those types of things.

Ben (37:09):
Do you work full-time or part-time? You’re just asking them this set of questions — okay, thank you very much for your time. And then, based on the one in particular you care about — they deal with payroll specifically, or retirement services specifically — you pluck those out. Now I’ve got a list of my worker IDs, or panel IDs — you can do the same thing with some other panels in other parts of the world, like Clickworker. It might be a hundred of them. Those are the ones I open up another job for — a little bit longer, the actual user testing, a 20-minute task — and I’m going to pay them 20 or 30 bucks or something like that to do it. Yeah.
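For the “pluck those out” step, here’s a minimal sketch of filtering a screener export down to the worker IDs worth inviting back; the file name, column names, and qualifying answers are all hypothetical:

```python
# Filter broad screener results down to the respondents who actually qualify.
import pandas as pd

responses = pd.read_csv("screener_results.csv")  # hypothetical export

panel = responses[
    (responses["employment"] == "full-time")
    & (responses["responsibility"].str.contains("payroll", case=False, na=False))
]

# These IDs feed the follow-up job (e.g., the qualification flow sketched earlier).
panel["worker_id"].to_csv("qualified_worker_ids.csv", index=False)
print(f"{len(panel)} of {len(responses)} respondents qualified")
```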

Brian (37:56):
Okay, that’s super helpful — I don’t think I would’ve ever figured that out on my own. It sounds like you’ve got your ideal customer persona in mind, you broaden out from that a bit, but you keep certain qualifications, like industry and seniority, to try to make sure you’ll get people who mostly work in that field. Then you go completely all over the place with questions, including a few that actually matter — that actually qualify them. And from that big batch, you funnel down to the handful that had the right answers, that do the right kind of work, that meet the requirements. And there you go: that’s your UX benchmarking panel, or your user testing panel.

Ben (38:40):
And the initial survey might be 10 questions, real quick to answer, and you spend like a dollar per survey, because it takes them two minutes to fill out. So you’ve spent a thousand dollars to get a smaller list of a hundred — these are who you want to talk to, you know who they are, and they’re not lying to you, necessarily. Now you open them up and pay them 30, 40, 50, 60 dollars for that longer session, whether it’s moderated or unmoderated. And you’ll probably do that periodically — in six months you’ll use it again. You got a hundred, but you used 10 of them, so now you’ve got 90 to hit up, maybe — though some of those may stop working on MTurk, so you may have to refresh your panel. But that’s the mechanics and the muscle memory you can kind of set yourself up with.

Brian (39:33):
Yeah, that’s great — that’s perfect. And that does seem like a reasonable step that somebody new to this could execute on. The way you’ve sketched it out, it’s clear enough what the steps are and what they look like. And it’s really helpful that you threw out some rough dollar estimates, to help compare against what it looks like to have someone do this for me, or to use a platform that might remove some steps of this process.

Ben (40:02):
Yeah. A service like Cint might charge $150 to find one person and run them through a panel, something like that. With UserTesting.com or UserZoom, you might end up paying 60, 70, 80 dollars — those are roughly the scales we’re working with. And if you nail your own system, you can be confident who you’re talking to and maybe pay in the 20-to-30-dollar range. Then you start to scale that, and those numbers might mean something. But the big part of it is the ownership of the data, and the certainty — because you’ve curated it, you know who you’re talking to, and you know how to talk to them.

Brian (40:49):
Yeah. And I know I put you on the spot with the example, but just to revisit: it sounded like for the initial qualifications — just who gets to take the first broad survey — it’s more about income level, industry, reported occupation, stuff like that. And the questions we’re targeting at fully qualified people are more about particular experience: the area they work in, or the types of tasks they carry out at work.

Ben (41:22):
Yeah. Again, the principle is that with MTurk in particular, and a lot of these other panels, it’s all self-reported. So you’re trying to set up a set of questions that let you get confident they are who they claim to be, and you’re setting up that survey to dig a little deeper without giving them an understanding of the best way to answer. Right. So you might have a little open-ended question — describe how you use payroll at your work, or something, a little open-ended blurb — and if they can’t say anything intelligent, kick them. But then you’ve got to read all of that. So there’s a bit of art in how you set that up, yeah.

Brian (42:10):
Yeah. It sounds like something that, for the right person — the person you want in your panel — will be pretty straightforward to answer, but for someone who’s faking, it’ll be immediately apparent that they’re not who they said they are. That does sound kind of fun, I guess. Okay, this is great. You’ve given me so much here about where this fits into overall voice of customer, and where that fits into overall experimentation and optimization — some good guidelines, some good processes. Is there anything I should’ve asked about that we didn’t get to?

Ben (42:46):
I slipped in some of my soapbox stuff around proper data usage.

Brian (42:51):
And I didn’t call this out at the beginning, but your background is in academia — you’ve been a researcher since way back. And I think that’s so valuable, because there are a lot of folks in marketing — never me, of course — a lot of us out here kind of making this stuff up as we go along, to some extent, some of these research methods. So you bring a rigor and perspective that is perhaps lacking in some quarters. So yeah — can you say more about how not to get this wrong, I guess?

Ben (43:22):
Yeah, that is my beloved soapbox area. I come from 10 years or so as a staff research scientist at the University of Texas at Austin, doing a lot of ecological modeling, a lot of climate change modeling work — doing stats, publishing papers; research science is that game — and a lot of fish work. And from the data side of things, I see both sides: I see people not being rigorous enough, and I see data traps where people are too rigorous, or misapplying precision — data that’s really holding people up, holding decisions up. I see both sides of it. So I like to talk about the proper use of data. And that mental model — knowledge equals experience plus sensitivity — really sums up the why behind it. That word “sensitivity” is about proper use of data: how sensitive is it to your goal? How sensitive do you need to be? Maybe you’re over-indexing on this and you don’t need that much data — let’s make faster decisions.

Brian (44:40):
Yeah. And you said, too — I think you talked about being laser-focused with each research method: what are we trying to learn from user testing, versus what are we trying to get from this survey? I think that’s a part of it.

Ben (44:56):
Yeah, yeah. That has to do with: we don’t user test to user test, we don’t experiment to experiment — all of this stuff is to make decisions faster, and we forget that. If you talk to a user tester, they’re like, “no, we have to user test” — that’s the hammer in their hand, right? That’s true, but we’ve got to realize what science is for. We don’t do science for science’s sake; we do it to search for knowledge, to search for truth. And on the business side, there’s a strategy — a business strategy — and it’s like a big ship. We’ve got to point that strategy in the right direction. The data helps make sure we’re going in the right direction with the strategy; it helps test the strategy itself.

Ben (45:37):
And the faster we can make decisions, the faster we can get ahead of the competition. So not just speed — speed and confidence, right, that spectrum. We want to be confident in our decisions, and we want to be fast in our decisions. Science is merely a tool for trying to be fast and confident at the same time. If you break it down like that, and think about how we’re using data, why we’re using data, where we’re using data, and what data it is, you can start to fit everything onto a lattice — a framework of understanding and context for why we’re doing what we’re doing. And this is where I can have a lot of fun in what I do, because it gives me inspiration.

Brian (46:24):
Yeah. Well, we’ve spoken for the most part about panel recruitment, which is a part of user testing and UX benchmarking, which is a part of voice of customer, which is a part of the broader optimization effort. We could probably go on for another couple of hours and hit all those points — what’s the right degree of sensitivity to make a decision in each of those areas? But maybe this is a segue: if somebody wants more perspective on that — I know you’ve talked about this in a number of places — is there anywhere in particular you’d point them?

Ben (47:01):
Well, with regards to the professional stuff, my voice is on LinkedIn — definitely follow me there; I have a lot of cool engagement. And I’ve taught a couple of courses at CXL: one on voice of customer, one on statistics. And I’m about to teach one on experimentation program management — I’m prepping my material for that now. So you can find me there as well.

Brian (47:29):
Oh — well, we’ll add those links; everybody go check them out. Ben, thank you so much. This has been mind-altering, as I expected. I really appreciate it.

Ben (47:40):
Yeah. Thanks, Brian.

