This is another transaction ID post, because this shit is juicy. If you don’t know what a transaction ID is, please read my previous post here: https://www.garethhatesadtech.com/p/the-great-transaction-id-debate
After you’ve read that and you’re suffering like the rest of us, it’s time for you to engage intellectually with what Prebid has just done (available here: https://github.com/prebid/Prebid.js/pull/13800). If you’re not familiar with reading PRs, this PR makes transaction IDs bidder-specific. This means the little ID associated with an auction, which DSPs wanted to use to understand all of the different ways they receive requests for a single auction from all of the exchanges, can no longer be used for that purpose. Ostensibly, this means it can now be used to de-duplicate requests within one exchange and not between them…but then again, not even really that, because exchanges are now solely responsible for grading their own homework and could theoretically fabricate the IDs.
I want to talk through the technical details of why the old version of transaction ID (the one that was consistent across exchanges) could be bad for publishers, because I hear a lot of things bandied about as “reasons” that I’m not sure hold up to deep intellectual scrutiny. By the way, I would be happy to be wrong about any of this – just let me know in the comments with insults and ad homs if I’ve mischaracterized something, or if you’re a real lunatic – post the actual explanation of why I’m incorrect.
I will then conclude with my thoughts on the question behind the question here – because I think this is a turning point for our industry, and right now, we’ve taken a turn backwards. But let’s get on with it.
As a refresher – the old transaction ID passes an auction unique identifier downstream through all supply paths so that a buyer knows a given bid request is from a given auction for an ad unit.
Let’s hear some arguments
It allows for data leakage
The argument here is that transaction ID provides a join key that can be used to synchronize audiences across exchanges. Let’s chart this out
Exchange A
PMP 1 - Auto Intenders
Exchange User ID 1234
Exchange B
Open Market Ad Request
Exchange User ID 5678
In the current state, without cookie syncing, one cannot discern that this is the same auction and therefore the same user. However, it’s worth noting that while these ad exchange user IDs are distinct, each ad exchange needs to do a cookie sync with the DSP. This means that in reality, the requests to the DSP look like this:
Exchange A
PMP 1 - Auto Intenders
Exchange User ID 1234
DSP ID 0101
Exchange B
Open Market Ad Request
Exchange User ID 5678
DSP ID 0101
Therefore, in this scenario the DSP is fully capable of discerning that the open market ad request user is the same one as the auto intender PMP one – and could ostensibly “take” that data.
So why are people worried about the addition of a transaction ID? What data leakage are they referring to? From my understanding, here’s why :
Exchange A
PMP 1 - Auto Intenders
Exchange User ID 1234
DSP ID 0101
TransactionID 9999
Exchange B
Open Market Ad Request
Exchange User ID 5678
DSP ID 0101
TransactionID 9999
In this scenario, only the DSP has access to “DSP ID” – the join key – the synced cookie – but with the addition of the transaction ID, anybody with access to logs from both exchanges has a join key for the audience. Exchanges and DMPs have all sorts of deals for ID syncs and usage and hashing and permissioning and all kinds of madness – which means that the addition of this join key has the potential to create a data leakage issue via logs for anybody who has the logs. It has been voiced to me that this concern is about curators – and because curators have log access, ID vendors and publishers are worried about their audiences seeping out.
Is this a problem? Probably, for ad exchanges honoring their agreements. Is this transaction ID’s problem? I don’t think so. Exchanges don’t have to share these things with people. They ostensibly don’t share the DSP User IDs today, because if they did, they’d have the same exposure they’d have with transaction ID. So why do they have to share the transaction ID?
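To make the log-join concern concrete, here’s a minimal Python sketch of what somebody holding logs from both exchanges could do. The rows and field names are made up by me for illustration and are not any exchange’s real log schema.

```python
# Hypothetical log rows; field names are illustrative, not a real log schema.
exchange_a_log = [
    {"exchange_user_id": "1234", "deal": "PMP 1 - Auto Intenders", "transaction_id": "9999"},
]
exchange_b_log = [
    {"exchange_user_id": "5678", "deal": None, "transaction_id": "9999"},
]


def join_on_transaction_id(log_a, log_b):
    """Pair up rows from two logs that share a transaction ID."""
    rows_by_tid = {}
    for row in log_a:
        rows_by_tid.setdefault(row["transaction_id"], []).append(row)
    pairs = []
    for b_row in log_b:
        for a_row in rows_by_tid.get(b_row["transaction_id"], []):
            pairs.append((a_row, b_row))
    return pairs


# Exchange B's user ID now carries the audience segment that only
# Exchange A's deal was supposed to expose.
leaked = [
    (b_row["exchange_user_id"], a_row["deal"])
    for a_row, b_row in join_on_transaction_id(exchange_a_log, exchange_b_log)
    if a_row["deal"] is not None
]
print(leaked)  # [('5678', 'PMP 1 - Auto Intenders')]
```

Note that the join only works for somebody who actually has both sets of logs with the transaction ID in them, which is the whole point of the paragraph above.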
Transaction IDs will expose flooring strategies
I’ve also heard concerns about price floors, but they normally just consist of the sentence “my price flooring strategy will be exposed,” occasionally with the word “sequential” or “tiered.” In light of this, I’d like to explore what data DSPs already have when it comes to price floors – and what additional data they’d be able to procure about price floors should they have a “unified” view from transaction ID.
Right now, DSPs receive user IDs, pages, user agents, and a bajillion other things. This means that even without a transactionId join, their data set looks like this (assuming no ID bridging, which further complicates things as we’ll explore in a moment) :
Page : xyz.com/yay
userID : 1234
User agent : 1310923
Price floor : 1.00
Page : xyz.com/yay
userID : 1234
User agent : 1310923
Price floor : 2.00
Page : xyz.com/yay
userID : 1234
User agent : 1310923
Price floor : 0
In this scenario, the current state of things with no transaction ID, the supposition is that the various price floors – even with all other components of the bid requests held constant – result in additional yield. I’m not sure I understand how or why this would happen, but it’s certainly possible I guess.
The argument is then that if DSPs see this
Page : xyz.com/yay
userID : 1234
User agent : 1310923
Price floor : 1.00
TransactionID : 5678
Page : xyz.com/yay
userID : 1234
User agent : 1310923
Price floor : 2.00
TransactionID : 5678
Page : xyz.com/yay
userID : 1234
User agent : 1310923
Price floor : 0
TransactionID : 5678
And can use that transaction ID, the price flooring strategy that’s creating additional yield breaks. I’m not sure if I buy it – but I guess maybe?
Price floors alone don’t seem that compelling to me – because if the DSP is receiving the same user, on the same page, with all of the same attributes, a bunch of times (and has historicals for this) – but with very inconsistent price floors, why wouldn’t they just wait and bid for their valuation? Even without transaction ID, they have a join key – the user id – and they can see all the floors passed at the placement level for a given user. The transaction ID doesn’t actually afford them that much different data, and I’m hard pressed to come up with a “bid low” strategy that works with transaction IDs but doesn’t work with User IDs.
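Here’s a rough Python sketch of why I’m skeptical. It groups the example bid requests two ways, once by the fields a DSP already has and once by transaction ID, and compares the floors each grouping exposes. The field names are my own, and in real traffic a user-level key would span many auctions rather than one, so treat this as an illustration of the example and not a model of a real bidder.

```python
from collections import defaultdict

# Hypothetical bid requests mirroring the example; field names are my own.
bid_requests = [
    {"page": "xyz.com/yay", "user_id": "1234", "user_agent": "1310923",
     "floor": 1.00, "transaction_id": "5678"},
    {"page": "xyz.com/yay", "user_id": "1234", "user_agent": "1310923",
     "floor": 2.00, "transaction_id": "5678"},
    {"page": "xyz.com/yay", "user_id": "1234", "user_agent": "1310923",
     "floor": 0.0, "transaction_id": "5678"},
]


def floors_by_key(requests, key_fields):
    """Collect every floor seen for each combination of the given fields."""
    grouped = defaultdict(list)
    for req in requests:
        key = tuple(req[field] for field in key_fields)
        grouped[key].append(req["floor"])
    return dict(grouped)


by_user = floors_by_key(bid_requests, ("page", "user_id", "user_agent"))
by_tid = floors_by_key(bid_requests, ("transaction_id",))

# Either key exposes the same spread of floors for this impression.
print(list(by_user.values()) == list(by_tid.values()))  # True
```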
Transaction IDs break ID bridging
The real beast comes in when you add in the concept of ID bridging. When I look at the bid requests above, the thing that really stands out to me is the UserID. It’s a trivial deduplication effort, and trivial to just ignore the price floors, when the UserID is the same across all of the bid requests. Basically, any argument that you could make for using Transaction ID you could effectively execute using UserID with the same logic.
Frankly, user IDs should look like this and this should work. ID resolution should happen at the publisher level. But it doesn’t always – it often happens at the middle man level – and I think that’s why middle men are able to so effectively harvest profits from this (and not necessarily pass those through to publishers BTW).
When you add ID bridging, and don’t have transaction ID, the bid requests to the DSP now look like this :
Page : xyz.com/yay
DSP userID : 8574
User agent : 1310923
Price floor : 1.00
Page : xyz.com/yay
DSP userID : 2634
User agent : 1310923
Price floor : 2.00
Page : xyz.com/yay
DSP userID : 0984
User agent : 1310923
Price floor : 0
If I’m an algorithm, or even a human, and I’m looking at this – this really looks like different people getting different ads. I have two reactions to this – 1. I’m not at all surprised that this is an effective yield optimization technique, 2. I think this is a form of fraud.
It makes more money because different users will have wildly different valuations, and now the variable price flooring strategy can really work.
The reason I think this is a form of fraud is that this impression can only be one person. It can only be one DSP User ID, at least that’s how it’s supposed to be in the buyerUID field (which is what I’m referring to with user ID here). This means that someone in this scenario is lying – advertently or inadvertently. I think that someone is almost never the publisher, and almost always an intermediary.
But we need to ask ourselves as an industry – is this acceptable? At what point does this become willful misrepresentation?
So, DSPs are fed up with this. They want to have a transaction ID tied to this – so now these requests will look like this –
Page : xyz.com/yay
DSP userID : 8574
User agent : 1310923
Price floor : 1.00
TransactionID : 5678
Page : xyz.com/yay
DSP userID : 2634
User agent : 1310923
Price floor : 2.00
TransactionID : 5678
Page : xyz.com/yay
DSP userID : 0984
User agent : 1310923
Price floor : 0
TransactionID : 5678
This join key makes it very, very easy for them to see “the bid request from partner A came in with this userID, and the same bid request from partner B came in with this other userID.” And then they can drop the hammer on partner B if they want to, and also ensure they’re not bidding against themselves on what they think are different users but are actually the same one.
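As a sketch of how little work this takes, here’s the check in Python. I’ve invented a “partner” field and the partner labels for illustration. The idea is simply to group by transaction ID and flag any auction where the declared user IDs disagree.

```python
from collections import defaultdict

# Hypothetical bid requests; "partner" and its labels are my own invention.
bid_requests = [
    {"partner": "A", "dsp_user_id": "8574", "transaction_id": "5678"},
    {"partner": "B", "dsp_user_id": "2634", "transaction_id": "5678"},
    {"partner": "C", "dsp_user_id": "0984", "transaction_id": "5678"},
]


def conflicting_user_ids(requests):
    """Return, per transaction ID, who declared which user ID, but only
    for auctions where the declared user IDs are not all the same."""
    declared = defaultdict(dict)
    for req in requests:
        declared[req["transaction_id"]][req["partner"]] = req["dsp_user_id"]
    return {
        tid: by_partner
        for tid, by_partner in declared.items()
        if len(set(by_partner.values())) > 1
    }


print(conflicting_user_ids(bid_requests))
# {'5678': {'A': '8574', 'B': '2634', 'C': '0984'}}
```

This tells the DSP that somebody is misdeclaring, but not who. Working out which partner is wrong still needs a source of truth, such as the DSP’s own cookie.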
Now, is the transaction ID actually necessary? Eh, kind of. The DSP could also check its logs once it serves to the page and true up the IDs declared in the bid request with the IDs it found in the browser. I would argue this is just as effective for finding ID bridging, and just as effective for dinging supply paths. And if DSPs unraveled the ID bridging by refusing to pay for every ad rendered to a misdeclared buyerUID, they could then use UserID (with other fields) as a transaction ID.
So while the transactionID is a valuable datapoint in hunting for ID bridging malfeasance, I’m not certain it’s a necessary one. But I do also think that ID bridging is actually a far larger problem than has been discussed to date.
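The true-up approach that works without a transaction ID can be sketched the same way. This assumes the DSP logs both the buyerUID declared in the bid request and the ID it actually reads from the browser at render time. The record shape and partner labels are hypothetical.

```python
from collections import Counter

# Hypothetical render logs: what each partner declared in the bid request
# versus what the DSP's own cookie said once the ad rendered.
rendered = [
    {"partner": "A", "declared_buyeruid": "8574", "observed_buyeruid": "8574"},
    {"partner": "B", "declared_buyeruid": "2634", "observed_buyeruid": "8574"},
    {"partner": "B", "declared_buyeruid": "0984", "observed_buyeruid": "8574"},
]


def mismatches_by_partner(records):
    """Count rendered impressions where the declared ID was not the real one."""
    counts = Counter()
    for rec in records:
        if rec["declared_buyeruid"] != rec["observed_buyeruid"]:
            counts[rec["partner"]] += 1
    return dict(counts)


print(mismatches_by_partner(rendered))  # {'B': 2}
```

The catch is that this only covers impressions the DSP actually won and rendered, whereas the transaction ID check works on every bid request.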
Downstream of ID Bridging you have Bid Jamming
If you look at every one of my examples, the assumption is that DSPs are going to have the same thing presented to them multiple times, often with radically different valuations each time because of a bunch of different fields within each bid request. There’s a reason this isn’t something mentioned explicitly by publishers in all of the transactionID discussions – and it’s because people tacitly know that this isn’t something present in a healthy market. Basically, the ID Bridging problem gives birth to a bid jamming problem, where bid duplication has the effect of making the algorithm think you have a lot of differentiated inventory when in reality you’re selling one thing. You get to test out a bunch of different values in a bunch of different fields to see what elicits the highest bid from the algorithm, and this crushes it for yield.
The elephant in the room is that there’s a universe where transaction IDs, when leveraged to their full extent, could theoretically help to kill all of these things at once. A DSP could theoretically bid once per transaction ID. That is, assuming they’re capable of maintaining state in real time for “have I served to this transactionID” across like 10 gazillion QPS, which like, maybe?
But if this were to actually happen there are a lot of people who will make less money – a supposition substantiated by the fact that today they indisputably make more money by engaging in bid jamming combined with ID bridging.
While people may cite other things, this is what scares people. Because it’s created so much lift for publisher revenue, the notion of losing it is a frightening one.
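For what “bid once per transaction ID” means mechanically, here’s a toy Python sketch. It is a single-process, in-memory version with a time-to-live that I picked arbitrarily. A real DSP would need this state shared across many machines at very high QPS, which is exactly the hard part, so none of this should be read as how any actual bidder works.

```python
import time


class TransactionDeduper:
    """Toy 'bid once per transaction ID' gate with a time-to-live.

    Single-process and in-memory only; the TTL value is an assumption.
    """

    def __init__(self, ttl_seconds=1.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._expiry_by_tid = {}  # transaction ID -> time the entry expires

    def should_bid(self, transaction_id):
        now = self.clock()
        # Forget auctions that are long over. This is a linear scan, which is
        # fine for a sketch and not for production traffic.
        expired = [tid for tid, expiry in self._expiry_by_tid.items() if expiry <= now]
        for tid in expired:
            del self._expiry_by_tid[tid]
        if transaction_id in self._expiry_by_tid:
            return False  # already saw this auction via another supply path
        self._expiry_by_tid[transaction_id] = now + self.ttl_seconds
        return True


deduper = TransactionDeduper()
print(deduper.should_bid("5678"))  # True: first time this auction is seen
print(deduper.should_bid("5678"))  # False: same auction, different path
```

Bidding on the first path seen is the simplest policy. A DSP could just as easily wait briefly and pick the cheapest or most trusted path, which is where this turns into SPO.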
Conclusion About Stated Arguments About TransactionId
My goal with this examination is not to dismiss the publishers’ concerns with transactionId. Well, actually, it kind of was. But in the course of doing so I hope I also drove home the point that we wouldn’t necessarily need transactionId to solve a lot of the problems that transactionId solves if there weren’t a bunch of things happening at once that back DSPs into a corner. For instance, I think if we successfully killed ID bridging, it would be a lot easier for DSPs to SPO and deal with bid jamming, because user IDs would be a lot closer to transaction IDs. But that’s not happening – and I’m not sure whose fault that is at this point.
This is where we get into the foundation of the “public” sell-side argument. If these problems can theoretically be solved without transactionId, why do we need it? Why not just do what Gareth says and fix ID bridging and then build your own strategies for bid jamming?
And I understand that, I really do. However, the notion that taking away something that makes auctions more efficient (transactionId) could be good for the long term health of independent programmatic is a hard pill to swallow for anyone who believes in the future of the industry. Furthermore, I’d be remiss if I didn’t lament the lack of public debate about this outside of my articles and a few slack channels that most of the industry doesn’t participate in.
What We Should Actually Be Debating
But this is all for the “public” argument. I think there’s a far more consequential “private” debate that we should be having – and that’s “is bid request duplication sustainable?”
I don’t think people spend a lot of time thinking about this. Smart people know if you do it more, you make more money. But just because smart people can tickle the ivories of programmatic to make more money doesn’t mean that this is a “sell side” desirable thing. What’s good for the goose isn’t necessarily good for the gander, and not all publishers were created equal. Just because you’re a publisher doesn’t mean bid jamming necessarily benefits you – in fact, if bid jamming becomes an optimization vector, and you’re not a master of it, you’re being harmed by it. Spend is being routed away from you if you aren’t doing it as aggressively as possible today. Transaction ID therefore benefits publishers who are not bid jamming beasts, and hurts publishers who are. I would expect the publisher debate about this to be more robust, which is kind of why I find it surprising that this is a “DSP vs publisher” debate and not a “bid jammer vs not bid jammer” debate. DSPs are not going to suddenly “spend less money” because transactionId allows them to deduplicate bid streams. That’s as bad for them as it is for publishers!
Even beyond that, and beyond the arguments that have been publicly discussed, the problem here is a fundamental independent ad tech one. It’s that we are running our industry like Jon Hamm in Mad Men. In writing this article I remembered that originally people managed sales conflict by obfuscation – by hiding what people were actually buying from the buyers. I think this mentality carries over to today – where people think “information should command an additional arbitrary price,” and that somehow there’s more money to be made when you withhold data.
I have news for you. This strategy has failed. Programmatic is not gaining market share, it is not growing, and it is not trusted. Programmatic’s competition, the walled gardens, do not withhold information from themselves. From their clients? Maybe. But not from their algorithms.
We have enough challenges to overcome – it’s an own goal to keep trying to run our business like a 1950s advertising agency <> magazine publisher negotiation. There’s a simple heuristic to live by in programmatic – More Statistically Significant Data is Better. Full stop. I’m a believer that sharing data will result in everyone making more money – even if some buyers are able to get things cheaper.
Programmatic is a 30bn business. Facebook is over 100bn for display banners. There’s a lot of money that can come our way if we can make our channel actually work instead of infighting over the scraps of the walled gardens. And the buy side of the programmatic ecosystem has its role to play in this – if more data doesn’t lead to more advertising dollars and therefore more competition and money for publishers, then their home’s foundation will erode out from under them. I think they know that.
There are certainly people who benefit from obfuscation. We need to decide if the obfuscation is worth stifling the growth of our industry.