Curation Stinks
I have been given a gift, not-so-rarely afforded to those in my position, wherein Ad Tech has created another thing that I think is dumb, but all of you seem to be buying. That thing is “curation.”
In light of that, a disclaimer – I may at some point sell some of you some kind of product that does this. I will not be happy that I am selling it. I will take no joy in it. But damnit, if you want to pay for something that does this, do not consider this article a statement that I will never build it, because I might.
With no further ado, let us explore Curation in the depths of its silliness.
What Is Curation
In simple terms, Curation is a kind-of new form of “bundling.” Bundling means grouping the inventory of multiple publishers together using some kind of logic, and then going out and pitching agencies/advertisers directly on your shiny new “bundle.” The defining logic of the bundle could be anything – certain pieces of inventory, data, sitelists, whatever. You name it, someone is bundling it, and someone is pitching it.
Historically, bundling was done as a standalone platform business model. This is why people refer to curation as the new form of Ad Network – because this was literally the pitch of ad networks for a decade. Have a contextual data set? Great, let’s sign up some publishers, layer the contextual data set on top of it, and close some direct advertiser deals. How about cookie data about users? Awesome, let’s start TACODA, the 2005 Behavioral Ad Network. Unique (or not-so-unique) data has been and continues to be the lynchpin of countless ad network businesses.
But now it’s 2024, and appropriately, bundling in its modern form uses the favorite mutant offspring of the Ad Tech middle man – the Deal ID. Deal IDs are the foundational technology of PMPs (Private Marketplaces), which have been “Curating” for years. Deal IDs are logic-based labels that are dynamically applied to bid requests by an ad exchange. In direct opposition to bundling of old, where you had a separate technological entity, Deal IDs allow you to bundle within a pre-existing piece of tech that already has inventory integrated with it – the ad exchange. We’ve made a frictionless transition to this for reasons I will describe later.
Deal IDs are a strange creature. They were originally invented for publishers to do their own bundling, and I would argue that this was in a sense “okay,” while still not being optimal. Their initial use was meant to be a semi-direct-sold methodology, something in between programmatic guaranteed and open auction. But in truth, the notion of a publisher needing to label something that was typically already readily available for DSPs to target always rubbed me the wrong way, often with some pulled-out-of-a-hat pricing tacked on to sweeten the inefficiency deal. Don’t get me wrong, I’m sure there were some people who used them properly and had a good reason for doing so, but over time they became progressively more meaningless.
I think the reason Deal IDs became so fraught is that, by design, they add an inefficient component to programmatic auctions. This is because a Deal ID system is actually a targeting system: the system user sets up a list of things they’d like to apply a Deal ID to, and then the Deal ID system applies it when appropriate. In the case of modern curation, the targeting system is used to integrate the dataset, whatever it may be, with the bid requests generated by the ad exchange. So the flow is as follows:
Publisher loads website. Prebid runs, sends requests to ad exchanges. Maybe the publisher or a publisher vendor passes in some KVPs for targeting, maybe they don’t.
Requests land in the ad exchange. The ad exchange executes a lookup against a BUNCH of Deal ID rules, each with its own set of qualifying targeting (inventory, geo, data provider, blah blah).
Qualifying requests get the appropriate Deal ID appended to them.
Requests go out to DSPs, with a subset of requests having the relevant Deal ID(s) appended, to be labeled as “PMPs”
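To make that lookup concrete, here is a minimal TypeScript sketch of what steps 2 and 3 amount to. Everything here is illustrative: the rule fields and request shape are loosely inspired by OpenRTB (deals living under imp[].pmp.deals[]), not any particular exchange’s actual implementation.

interface BidRequest {
  id: string;
  site: { domain: string; page: string };
  device: { geo: { country: string } };
  imp: { id: string; pmp?: { deals: { id: string }[] } }[];
  ext?: { data?: Record<string, string> }; // whatever KVPs the publisher passed in, if any
}

interface DealRule {
  dealId: string;
  domains?: string[];       // "curated" sitelist
  countries?: string[];     // geo targeting
  requiredSegment?: string; // e.g. a curator's contextual or behavioral segment
}

// A rule qualifies when every condition it specifies matches the request.
function matches(rule: DealRule, req: BidRequest): boolean {
  if (rule.domains && !rule.domains.includes(req.site.domain)) return false;
  if (rule.countries && !rule.countries.includes(req.device.geo.country)) return false;
  if (rule.requiredSegment && req.ext?.data?.segment !== rule.requiredSegment) return false;
  return true;
}

// The exchange runs every request through the whole rule table and appends
// the matching Deal IDs before the request goes out to DSPs.
function applyDeals(req: BidRequest, rules: DealRule[]): BidRequest {
  const deals = rules.filter(r => matches(r, req)).map(r => ({ id: r.dealId }));
  if (deals.length > 0) {
    req.imp.forEach(imp => { imp.pmp = { deals }; });
  }
  return req;
}

Notice that this is just a targeting rule engine, which is the whole point: it duplicates, in a cruder form, the targeting the DSP is already going to run.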
The problem I have with targeting in this context is that it’s way too close to the concept of a “campaign” in a DSP for me to believe it’s efficient. In fact, the universal Deal ID troubleshooting email consists of the following: “HEY DO YOU HAVE TARGETING TURNED ON IN ADDITION TO THE DEAL ID BECAUSE ITS NOT DELIVERING AND THAT’S PROBABLY WHY.” A DSP campaign contains a list of rules of things for the campaign to target; meanwhile, the Deal ID system is executing a…list of rules of things for the Deal IDs to target.
When things are being done twice in the ad tech supply chain, it’s normally the sign of a problem. And not only are Deal IDs watered-down forms of DSP targeting; in their modern form, they’re often hurting more than they’re helping. This is because Deal IDs are arbitrary, middleware-introduced signals that could mean absolutely anything, including nothing at all. It could be “homepage ATF on News Sites” or it could be “users who like potatoes.” It could be a “whitelist” of literally every domain available in that exchange, packaged under a Deal ID because you’ve probably turned off your targeting for Deal IDs and this is a way for an ad exchange to slip Mr. Advertiser impressions that they don’t want. And this brings us to the krux (I spelled this like this in my first draft and I refuse to fix it) of the issue: we are using Deal IDs badly, and I’m not certain there’s a way to put the toothpaste back in the tube.
And it gets worse. Their original invention, while introducing targeting inefficiency, at the very least meant that Publishers or exchanges were using a shitty way to communicate optimization information to the buy side while keeping 100% of the fruits of their inefficient method. But modern “curation” has added a new component – the Deal ID middle man.
There are a few flavors of these middle men:
Inventory Curation – “we’re going to put websites and ad units into a Deal ID so that you don’t have to build your own whitelists of these things”
Data Based Curation – “we have contextual or behavioral data we’re using to govern when this Deal ID applies”
Where this gets spicy in the modern era is the sneaky-and-gross-but-totally-clever introduction of the third-party curation provider. You see, dear reader, exchanges have been scared of becoming pure proxy servers for some time (or are disturbingly aware of the fact that that’s what they are). Given this unpleasant truth, they quite cleverly had the idea: “what if we allow other companies to enrich the bid requests we’re sending out, without having to implement code anywhere else, and then they get a vig on the PMP.” In other words, exchanges had the realization that they can become an integration point for data, subtract a revenue share en route that they then pay to those third parties, and outsource the sales against that data.
This is great for exchanges because they make a vig on their PMPs and have someone else doing sales for them.
This is great for the “curators” because they make a vig on their PMPs, and they don’t have to provide any technology other than “when” to apply the PMP – they can be basically pure sales companies. Plus they can often activate without additional technology on publishers already integrated with the exchanges.
This sucks because the signals going to DSPs aren’t what they could be, and we’ve managed to introduce yet another middle man into the ad tech supply chain.
Per my many articles on trying to fix ad tech, there are certain places where certain activities should occur in the programmatic supply chain. Targeting and optimization are the purview of the DSP. This is a good thing. We shouldn’t all be trying to build targeting systems and optimization algorithms; let some people specialize and get super good at it. Ultimately, I don’t like Curation, or Deal IDs, because I think bundling is bad.
Bundling is bad because the application of a given bundle may contain many different components, some of which may or may not be transparent to the DSP (or, frankly, might be totally value-less and already available without the bundle). This results in signal reduction and signal muddying – where the primary differentiated signal the DSP has is the deal ID, when in reality, the deal ID itself might be composed of signals unavailable to the DSP.
By reducing and muddying signals sent to DSPs, we’re making campaigns perform worse. By making campaigns perform worse, we’re making ad tech worse. Now, the counterargument to this is that some signal is better than no signal – and while that may be true, the argument of “well this makes more money than the alternative so we’re going to let it happen” is the justification for most of the bad things in ad tech. And because so many people aren’t actually optimizing their performance properly, bundling becomes another way to make middle men rich and hoover more money out of working media.
So what’s the solution here? My solution is that any signal that would be used to append a Deal ID should be communicated in its raw form to the DSP, so that the DSP can more effectively optimize the campaign to the signals that lead to outcomes.
Curation Shouldn’t Exist, DSPs Should Just Be Better Listeners
I’ve done a lot of complaining about exchanges and middle men in this article, but at the end of the day, I believe the party most responsible for the emergence of curation middle men is the DSP. This is because I believe the right way to do this would be for Publishers to append whatever first (or third) party data they had to their bid requests in its raw form, instead of using a grouped Deal ID proxy, and DSPs should consume that data for optimization.
This would involve some OpenRTB cleanup – you might have a dozen different fields from a dozen different data providers flowing into the DSP, but in the age of artificial intelligence, more data is a good thing and should improve the performance of biddable media. DSPs should always want as much data as possible to back-correlate to campaign outcomes, and if they don’t, then new DSPs should fire up that do want to listen to that data and use it to back-test versions of their optimization algorithms. Allow me to illustrate this future with an example:
A curation provider provides “MFA-free” deal IDs.
Current Curation Implementation
Curation Provider crawls websites and has a list of websites it thinks are MFA as a result of its 10-point assessment system (I’m being generous here).
Bid request comes in, curation provider reads page URL, replies with a binary “Deal ID or No Deal ID”
All bid requests have normal information associated with them + non-MFA requests get DealID:NOMFATKS appended to them
Agency buys Deal. Agency Logic consists of “well, NOMFATKS performs better than Open Exchange Buying, so I guess it’s worth the 15%” (once again, being generous here, it could be that it performs worse for their campaigns but they just really don’t want that MFA)
Future Perfect State Implementation
All bid requests have normal information +
fpd.InventoryData:
    AdDensity: 1234
    AdCount: 6
    InView: Yes
    AttentionScore: 75
Basically, we pass each of the components that the Curator used to make the determination that it was MFA into the bid request, and we optimize to those. This means we do surgery instead of chopping with an axe. Theoretically, this could get messy if a publisher is working with a bunch of different data providers, but I would argue this is no different than working with 20 different userID providers, which people do today and nobody seems that upset about.
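To make this concrete, here is a minimal sketch of how a publisher (or their vendor) might pass those raw components today via Prebid’s first-party data (ortb2) config, assuming the SSPs in the path forward ortb2 site data to DSPs. The keys under ext.data are the made-up fields from the example above, not any standard taxonomy.

// Assumes Prebid.js is already on the page; window.pbjs is its global.
const pbjs = (window as any).pbjs || ((window as any).pbjs = { que: [] });

pbjs.que.push(() => {
  pbjs.setConfig({
    ortb2: {
      site: {
        ext: {
          data: {
            // the hypothetical components the curator would otherwise
            // hide behind a Deal ID
            adDensity: 1234,
            adCount: 6,
            inView: true,
            attentionScore: 75,
          },
        },
      },
    },
  });
});

The design point is that the DSP receives the components themselves and can correlate each one to outcomes, instead of receiving a single opaque Deal ID.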
Now the DSP correlates campaign performance to the components that comprise MFA. Maybe some impressions that would have been excluded before for being “MFA” now get purchased because of nuance in their scores, or some impressions from “non-MFA” sites that would’ve been purchased don’t get purchased because they have a component score that doesn’t correlate to outcomes.
The tricky part of this implementation is billing – Deal IDs are easy because middle men just get a vig (once again, the plague of all ad tech). This means that to do this the right way we need some Prebid work.
Each additional data provider needs to be A/B tested on a subsegment of traffic, and performance needs to be measured. The test needs to run long enough for DSPs to adapt to the signal (see the sketch after this list).
Data Providers can then bill on a usage fee, or on a % of lift, whatever they prefer.
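For illustration, here is a rough sketch of what that A/B split could look like. The test share, bucket label, and field paths are all assumptions, not any standard: enrich a slice of auctions with a given provider’s signals and tag every request with its bucket so lift can be measured per provider.

type Bucket = "control" | "test";

const TEST_SHARE = 0.1; // e.g. 10% of traffic gets the provider's signals

function assignBucket(): Bucket {
  return Math.random() < TEST_SHARE ? "test" : "control";
}

// ortb2 is the first-party data object that rides along on the bid request.
function enrichForTest(ortb2: any, providerSignals: Record<string, unknown>): any {
  const bucket = assignBucket();
  const enriched = structuredClone(ortb2 ?? {});
  enriched.site = enriched.site ?? {};
  enriched.site.ext = enriched.site.ext ?? {};
  enriched.site.ext.data = enriched.site.ext.data ?? {};
  // label the bucket so the measurement pipeline can split outcomes
  enriched.site.ext.data.providerTestBucket = bucket; // hypothetical label
  if (bucket === "test") {
    Object.assign(enriched.site.ext.data, providerSignals);
  }
  return enriched;
}

In practice you would want the bucket to be sticky per user or session rather than per request, and the test has to run long enough for DSP algorithms to adapt to the new signal before you read the lift.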
I don’t just think this reform is important, I think it is foundational. Deal IDs are a core thing holding back open web buying from performing as well as it could. My implementation would mean that we harness the power of open source to build optimization data that isn’t just “as good” as walled garden data, but better, because of how many different data providers can compete on the lift they can create for advertisers. In this architecture, which BTW is proof against the death of third-party cookies, Open Web buying could enter a new era of increasing performance, with DSPs truly competing on their ability to find signal in noise to drive outcomes for their clients.
So yea, I think this would be awesome. Oh, and I’ll let you guys know on LinkedIn when/if I have any curation products available. It’ll probably be soon.