Synthetic data is as good as real – next comes synthetic strategy

June 14, 2024 12:19 pm By Mark Ritson

AI customers can now deliver a 95% match to real survey results, which will ultimately feed a fully automated process of marketing strategy and execution.

I’ve known Jon Lombardo and Peter Weinberg for almost 10 years. For most of that period, they ran the now legendary B2B Marketing Institute at LinkedIn. Their office was atop the Empire State Building and every time I made it out to New York City I would always put in some LinkedIn time to hang with the boys and talk B2B.

The very first time I visited them, we worked late and then collapsed out of the elevator into Midtown in search of food and booze. It’s not the best place to find the best place and we eventually grew desperate somewhere near Grand Central Station. “There’s an oyster bar under the station,” Pete (or Jon) said with a hunch. So we stumbled into this incredible basement restaurant with good wine and amazing seafood, and the distinct feeling that Cary Grant would turn up at any moment in his suit from North by Northwest and order a martini. All kinds of trouble ensued.

That spot became our annual hangout. Once a year we’d convene there to talk shop and catch up. And it was there that Jon (or Pete) told me that they were leaving LinkedIn. “You are both fucking morons,” I said supportively through a mouthful of chardonnay and Wellfleet oysters.

And I meant it. Both had great jobs. In a great company. That paid them well. And allowed them to do pretty much what they wanted to do. And valued them for it. Pete’s (or perhaps Jon’s) wife was newly pregnant. The whole thing was bogus. “Morons,” I said again, draining my glass.

Then Jon, it was definitely Jon, got out his phone and showed me something. It was a simple attempt to generate customer data without customers. That evening, under Grand Central, I got my first whiff of synthetic data. Generated by creating consumers using AI and then asking these newly formed neo-consumers both qualitative and quantitative questions. I scanned the data and then asked the obvious question that the duo would subsequently be asked dozens of times: “How close is it to real data?”

The two looked at each other, then looked at me. Both huddled up over the table. I had a sudden feeling I was in All the President’s Men or something. In hushed tones Pete, or Jon, or possibly both in unison, said: “Really fucking close.”

“You’re both geniuses,” I said. “You need to leave LinkedIn immediately.”

Last night we were back under Grand Central and out of control once again. Pete and Jon have come a long way in a year. They have assembled a team. Built and refined their process. Generated seven figures of revenue from a who’s who of big US brands. And last night they exited ‘stealth mode’ (way less exciting than it sounds) and launched their new company, Evidenza.AI.

Synthetic customers

I’m a very minor shareholder in Evidenza so I need to declare my interest before I go on. And you also need to remember that I lack market orientation, given I love the boys. But I would have written this column regardless of my interest. Evidenza is the real deal. And it’s already generating extensive amounts of data for a whole bunch of clients. There is some B2C work being done for a couple of very notable, very surprised clients. But the real goldmine is in B2B, where clients typically spend hundreds of thousands to get a wobbly sample of C-suite decision makers or – more usually – fly blind without any quant data at all.

For these B2B clients, Evidenza is now able to deliver three things. First, data where often there was none. Second, data that would have taken many months to collect but now takes hours. Third, data that usually costs high six figures but is now available at a fraction of that cost. Of course, your question is the same as mine a year ago, sitting shitfaced under Grand Central: how close is it to the real thing? Pretty fucking close remains the answer. And it’s getting ever closer. The experience of consulting and audit giant EY is perhaps the best way to demonstrate it.

EY Americas’ very well respected CMO Toni Clayton-Hine is not an easy woman to please. She admits she was initially sceptical about Evidenza’s claim that its synthetic data could provide EY with a treasure trove of B2B data. So she sent it EY’s annual brand questionnaire. It is an annual survey conducted on only the most senior executives at the very largest companies which measures general leadership sentiment and their perspective on EY. Clayton-Hine provided the questions and asked Evidenza to provide its synthetic answers. Ones she could then compare to the newly collected human sample she had just received.

Evidenza went to work. It created synthetic customers to match the profile and size of the sample Clayton-Hine had worked with for her actual survey. Then the team asked the synthetic customers the qual and quant questions and sent back their findings. EY’s CMO was dumbfounded. “It was astounding that the matches were so similar,” Clayton-Hine told Adweek on Tuesday. “I mean, it was 95% correlation.”

For a fraction of the cost, in a few days rather than months, Evidenza had proved its worth. And the 95% correlation does not necessarily mean that synthetic data has 5% of ground left to cover. It’s more likely that the inherent sampling errors, subject distractions and signalling biases mean it is the human subjects who were off the pace. Actual executives get bored (quickly) with a survey of more than 20 questions, for example. Synthetic customers never falter.
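The column does not say how the 95% figure was calculated. One plausible reading is a Pearson correlation between the per-question results of the human sample and the synthetic one, and the sketch below shows that calculation under that assumption. The scores are invented for illustration and are not EY or Evidenza data; the function and variable names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalised because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-question scores (for example, % agreeing with a statement).
# These numbers are made up purely to show the mechanics.
human_scores = [62.0, 48.0, 71.0, 35.0, 54.0]
synthetic_scores = [60.0, 50.0, 69.0, 38.0, 55.0]

r = pearson(human_scores, synthetic_scores)
print(f"correlation between human and synthetic results: {r:.2f}")
```

A high correlation only says the two sets of answers rise and fall together across questions. It does not say which of the two is closer to the truth, which is the point made above about human sampling error.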

The era of synthetic data is upon us. It’s bad news for market research companies because they are likely to experience a steam-train decade of oncoming paranoia, pressure and partial obsolescence. Naïve marketers might expect these firms to adopt and adapt to synthetic data but, of course, the skills and competences required to talk to humans and program computers are entirely different and probably prohibitive. The incumbent research firms will huff and puff for the next few years but there is relatively little they can do about it. Expect a phalanx of objections in the months and years ahead. But expect them also to pass without much impact. If we know anything from the dodgy programmatic revolution that took hold a decade ago, it’s that marketers will always prefer a technological black box over anything. Research included.

Focus groups on demand

It’s not just that synthetic data is cheaper and faster and potentially more accurate. It’s that it opens up the ability to do mind-bending things that simply are not possible with traditional organic subjects. I had a piss-my-pants moment in May when Evidenza sent me some synthetic data for my own Mini MBA product. They’d surveyed my target customers in 10 countries and for each produced a list of all the category entry points and how likely it was that Mini MBA could win in each one. That was amazing enough, but still possible with organic data if I had £300,000 and the patience of Solomon. The killer moment was when Jon (or Pete) asked if I’d like to chat with Helen about one of the category entry points.

“Who is Helen?” I asked, pointing at the attractive 40-something woman looking patiently at me from the screen. I genuinely did not know but had that creeping ‘holy shit’ feeling that is usually now absent in the lives of 50-somethings like me. Helen, it turned out, was one of the synthetic customers currently going through the category entry point of a job change. I was able to ask how it felt, what she was thinking of next and which training options she was considering to upskill for the job search ahead. It was mind-boggling and an illustration of the more fluid synthetic experience that enables marketers to go back to the data repeatedly, often in real time, to learn more.

Despite these inherent advantages, Evidenza is not aiming to be the world’s biggest synthetic data firm. Market research is a hard, low-margin, limited-impact sector – ask anyone who works in it. They are good men and women trapped working in a tough category. Once the synthetic data-generation process has been perfected, Evidenza will move to the more lucrative, difficult world of marketing strategy.

Once the system can generate its own data, it can use that data to find an extensive array of analyses that produce strategic decisions. Custom funnels. Perceptual maps. Positioning. Clustered segmentation. Distinctive brand assets. Category entry points. ESOV analysis. Budgeting recommendations. Econometric-style analysis with media mix recommendations. Contextual pricing analysis. Briefing. The whole artificial nine yards.

Testing endless options

And with all these different analytical capabilities, Evidenza can then do something even more special. Remember when Dr Strange explored 14 million different universes in Avengers: Infinity War to find the one possible way to defeat Thanos? Evidenza will eventually be capable of something similar. Not defeating Thanos, but running countless scenarios. With all the various spokes of the AI analysis in place, Evidenza can run thousands of permutations in which targeting, category entry points, objectives, positioning, budgets and media mix decisions are all iterated and re-iterated against each other. Imagine 50,000 different iterations with synthetic data providing the expected outcome in each case. At the end of the process, Evidenza should be able to provide a small set of strategic objectives and their expected financial value, and explain to marketers and the rest of the organisation why this is the optimum path to growth.
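To make the idea of iterating every decision against every other one concrete, here is a minimal sketch that enumerates all combinations of a few strategic levers and ranks them by expected value. It is an illustration under stated assumptions, not Evidenza's method: the levers, their options, and the `expected_value` scoring formula are all hypothetical placeholders. In the process imagined above, that score would come from synthetic customer data.

```python
from itertools import product

# Hypothetical decision levers; a real plan would have many more options each.
options = {
    "target": ["CFOs", "CMOs"],
    "entry_point": ["job change", "budget season"],
    "positioning": ["speed", "rigour"],
    "budget": [100_000, 250_000],
}


def expected_value(scenario):
    """Placeholder scoring function, invented purely for illustration.

    A real system would estimate this from data; here it is a toy formula.
    """
    score = scenario["budget"] / 1000
    if scenario["target"] == "CFOs" and scenario["positioning"] == "rigour":
        score *= 1.5
    if scenario["entry_point"] == "budget season":
        score += 20
    return score


def rank_scenarios(options, score, top_n=3):
    """Score every combination of options and return the best few."""
    keys = list(options)
    scenarios = [dict(zip(keys, combo)) for combo in product(*options.values())]
    return sorted(scenarios, key=score, reverse=True)[:top_n]


best = rank_scenarios(options, expected_value)
for scenario in best:
    print(scenario, expected_value(scenario))
```

Four levers with two options each give only 16 scenarios. The count multiplies with every lever and option added, which is how a plan reaches tens of thousands of permutations and why a human strategist considering two or three options cannot cover the same ground.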

I do this work currently, in a very shit human way. I think I am pretty good at it. But I will be unable to touch the feet of such an approach. Partly because it is built from the width and depth and interactivity of empirical data I have never been exposed to. Partly because the complete suite of analytics tools gives Evidenza an incredible superiority. And mostly because when I build a strategy for a brand we might, on a good day, consider two or three possible options. Not 400,000.

And this is a key point. We keep comparing AI to humans who are expert at their jobs. All of us know that marketers, often through no fault of their own, aren’t particularly good at marketing. They lack the data. They lack access to tools like MMM or pricing analysis that would let them see the world properly. Remember that the best data from Better Briefs suggests that more than 60% of big brand plans aren’t clear on who they are even targeting. This is not the story of AI versus the best of humanity. This will be the story of AI versus a half-assed, often hilariously bad process, which can be replaced by a superior system that works for a fraction of the cost and lead time of its inferior antecedent.

And, of course, once Evidenza has mastered synthetic data and marketing strategy, there is an obvious final part to its artificial triptych. Armed with a clear strategy and the data to assess its progress, Evidenza could connect into all the tactical options. The system will not just come up with the strategic objectives, it can also devise and then develop the digital media communications, portfolio changes, smart pricing and new product development to deliver these objectives. These tactical innovations can be run, tested and improved in real time, in a closed-loop system that Just. Gets. It. Done. The whole thing becomes a fluid, evolving system.

It’s all very ‘flying cars’ at this point. But Pete (perhaps Jon) has a telling phrase that he rolls out frequently. “This,” he tells clients, “is as bad as it will ever be. It will only get better from here.” Aside from the synthetic data, everything I’ve described above is impossible at the moment. But it all appears incredibly tenable in the near future. The analytic stuff like segmentation or category entry points is eminently doable if you know your marketing onions and have access to the data and programming power. Many of the AI tactical options are already being rudimentarily developed, with academic research confirming that things like product innovation and message creation can be done in a manner significantly superior to that produced by humans.

The missing element is time, and it’s the 2030s that will usher in this grand new era of automation. The implications are enormous. The process ahead is unclear. But the era of AI marketing has begun.