Facebook, data sharing, and ad targeting, oh my

Apr 27, 2018

I don’t know if you’ve heard, but Facebook’s been in the news recently, specifically around the exposure of millions of users’ worth of data to a firm called Cambridge Analytica. Cambridge Analytica allegedly used this to help various Republicans, including Donald Trump, during the 2016 election cycle. And to hear a lot of people talk about it, that’s the last sign before the start of the apocalypse, or something like that. To be honest it’s been hard to find a calm take on the whole thing, which has been part of the problem. People are shocked at how much data Facebook has on them. They can’t believe that Facebook lets people use this data to target ads to others. Or that companies may use this targeting for political advertising to try to swing an election. Or Facebook was “breached” (everyone else’s word, not the correct one) and this data leaked out. The truth is that while there were some problems with Facebook, and some bad actors at play, we’re focusing on the wrong things here, and it’s inspiring us to hysterics instead of reasoned analysis and reasonable responses.

I’m going to pause here and just make a note that I work for an email marketing company that emphasizes segmentation and targeted marketing. I’ve also written a Facebook application to create custom audiences on Facebook and keep them synced with their source mailing list. None of this requires data from Facebook users, so I don’t capture any sort of profile information, and all the opinions I’m writing here are my own, but it’s probably worth bearing in mind that my employment revolves around targeted marketing and I have done work to help marketers sync some of their “targeting” over to Facebook, so clearly I’m not as bothered by the concept of targeted advertising as some people may be.

Facebook’s role in all of this

Obviously, Facebook profiles, likes, etc. were the source of the data Cambridge Analytica used. That’s not the big issue here. The big issue here is that Cambridge Analytica was able to get the profile data of people who didn’t take the personality test they commissioned, because at the time Facebook was giving away the public profile information of a user’s friends along with the user’s basic profile information. There are lots of cynical and paranoid theories about this (even Stratechery got in a glib comment about how Facebook didn’t end the policy in question until after they had a dominant ad position), but if you listen to interviews with Mark Zuckerberg (even more recent ones), you very quickly realize that this is a company run by someone who really does believe that the more you share, and the more of your information is public, the better the world becomes. Lots of people probably have lots of different opinions on that attitude, but the fact remains that this is a pretty consistent vision from someone who’s idealistic about the idea that sharing more about ourselves to connect to more people makes the world a better place.

I’m not saying Facebook was right on that premise – it’s one thing to encourage people to overshare about themselves through the tyranny of the default but it’s something else entirely to empower them to give away other people’s data, even if it was just their public profile information. Facebook has long been overly generous with what it considered, and defaulted to, public. This is just the first time that it’s publicly bitten them in the butt. Let’s also not pretend to be surprised about Facebook’s rather generous definitions of “public” and “reasonable default” – Facebook’s been openly “public by default” for about as long as it’s existed.

Now, before you start saying that this data “breach” is proof that Facebook needs to be regulated to better protect the data it has on people, there are a few things you need to understand. First, this wasn’t a “breach” of anything other than Facebook’s developer agreement, and that was done by Cambridge Analytica and Aleksandr Kogan (more on them in a bit). You also need to remember that Facebook removed the ability for API users to get your friends’ public data back in 2015 – effectively self-regulating away that problem long before the last election. Facebook may have started with a loose relationship with privacy, but they have improved over the years.

Aleksandr Kogan

It may seem odd that I’m only singling out 1 person by name in this whole affair, but I think he bears calling out. Aleksandr Kogan wrote the personality quiz that actually pulled the data from Facebook. It appears that Cambridge Analytica paid him to write the app and for all the data he pulled from the people’s profiles. Here’s the issue at hand with this – there’s no record that Kogan represented himself to Facebook as a Cambridge Analytica employee or contractor – which meant Cambridge Analytica was a third party he wasn’t allowed to share data with. He also didn’t accurately describe what the data in the app would be used for (allegedly he didn’t know that it would be used to target voters, but that begs the question of just what he thought Cambridge Analytica was going to do with the data – surely that question came up when they were setting up the financial relationship where they were paying him to provide that data).

According to Facebook, Kogan claimed that the data he was pulling was going to be used for academic research, not for commercial purposes. From what little I can find on Kogan, analyzing things such as your “likes” to build a personality profile seems to be his area of research, so I’m sure he used his copy of the data for precisely that purpose. Now the pressing question I have here is this – isn’t sharing personal information gathered for research purposes a violation of professional ethics, at best? What other sensitive data has this guy sold to the highest bidder? Personally, I think if we’re going to declare someone or some organization to be the “villain” of this whole escapade, it should be this guy.

Kogan claims that he’s being used as a scapegoat by Cambridge Analytica and Facebook, but I’m not buying this. For starters, when you develop a Facebook application you agree that you’ll abide by their terms and conditions. Kogan says that Cambridge Analytica told him what they were doing was legal, but at the end of the day, Kogan signed that he would follow Facebook’s rules about what he could and couldn’t do with the data he pulled from the platform. Facebook associates apps on its platform with individuals, not with companies – it was Kogan’s responsibility to know what he could and couldn’t do and it was on him to make that clear to Cambridge Analytica, not the other way around.

Cambridge Analytica

This all leads us to Cambridge Analytica, which is certainly the most… “colorful” player in the whole saga. They’re the ones who paid for the “This is your digital life” app that accessed the publicly available Facebook data, allegedly with the intent of using that data for ad targeting for Republicans during the 2016 election cycle. A lot of the outrage seems to be around a couple of points with this – one being that companies were targeting their ads with high levels of specificity and the other being that they allegedly helped Donald Trump. As for the first point, that’s the whole point of Facebook (and Internet advertising in general) – companies spend billions of dollars per year on Facebook and Google because those platforms give them the ability to make sure their ads are appearing to the people most likely to respond to them. As for the second point, if you wouldn’t be equally ticked if they were used to help Hillary Clinton, then you don’t really have a coherent argument to offer to the discussion at hand.

There’s a general misconception about companies that offer this kind of targeted marketing, namely that they’re selling user data (thanks for nothing “you are the product” people). For the record, Facebook doesn’t sell user data – they sell advertising placement. For a calm take on this issue as it relates to Facebook ads, see this post (another disclosure – Ryan Cohn was a roommate of mine for a year while I was still in school). They make billions of dollars per year on said advertising placements because they give people the ability to be really specific about who they want to see those ads. What advertisers actually get from Facebook is a form that lets them specify who they do and don’t want to see their ads. They also get aggregated reports on their ads and how they performed, but those don’t have data on any particular user, just buckets of demographic info (e.g. age 30 – 35, 36 – 40, etc.).
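To make the "buckets, not users" point a little more concrete, here’s a rough Python sketch of what that kind of aggregation looks like. To be clear, this is not Facebook’s code or API. The function names, the bucket boundaries (other than the 30-35 and 36-40 ranges from the example above), and the sample records are all made up for illustration.

```python
from collections import Counter

# Hypothetical age buckets; only 30-35 and 36-40 come from the example above.
AGE_BUCKETS = [(18, 24), (25, 29), (30, 35), (36, 40), (41, 50)]


def bucket_for(age):
    """Return a label like '30-35' for an age, or 'other' if no bucket matches."""
    for low, high in AGE_BUCKETS:
        if low <= age <= high:
            return f"{low}-{high}"
    return "other"


def aggregate_report(impressions):
    """Collapse per-user impression records into per-bucket counts.

    Each record is a (user_id, age, clicked) tuple. The user_id is never
    copied into the output, so the report says nothing about any one person.
    """
    shown = Counter()
    clicked = Counter()
    for _user_id, age, did_click in impressions:
        label = bucket_for(age)
        shown[label] += 1
        if did_click:
            clicked[label] += 1
    return {
        label: {"impressions": shown[label], "clicks": clicked[label]}
        for label in sorted(shown)
    }


# Made-up sample data, purely for illustration.
sample = [
    ("u1", 31, True),
    ("u2", 34, False),
    ("u3", 38, True),
    ("u4", 22, False),
]
report = aggregate_report(sample)
```

The advertiser-facing side only ever sees something shaped like `report`: counts per bucket, with the individual records left behind on the platform’s side.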

Cambridge Analytica got data on Facebook users when Aleksandr Kogan used the permissions granted by people who took his quiz to query Facebook’s API for the data. They got that data in violation of the API access terms that Kogan agreed to (and Cambridge Analytica allegedly talked him into violating), and were told that they were in violation of Facebook’s terms of service and that they were supposed to delete the data they got from Kogan (which Cambridge Analytica allegedly lied about doing). As shady and dirty as they’ve made themselves sound – in this particular instance the only real “crime” they committed was improperly acquiring Facebook profile data, apparently deliberately.

“But the Obama campaign did it too”

1 of the immediate responses to the outrage over a company targeting people with political ads is to point out that the Obama campaign did it too. That’s true, but with 1 important distinction – the Obama campaign solicited people to give their Facebook information to the Obama campaign website, whereas Cambridge Analytica asked people to give their Facebook information to an online personality quiz. People knew they were giving their profile information to a political organization in the former case, but not in the latter, and that lack of transparency is important. That said, outside of how the data was collected, the linked article is correct in that there wasn’t a difference in how it was used. It’s also worth noting that at the time, the Obama campaign was also pulling in data about people’s friends, just like Cambridge Analytica, and that this was perfectly allowable by Facebook’s API permissions (again, at the time).

The reality is that for all the moral indignation about how Facebook’s “tracking you,” or all the information Facebook has on people, the fact that people used Facebook to target ads to very specific groups of people isn’t news (it’s Facebook’s entire business model, a fact that’s been public knowledge for years). Also, it’s worth pointing out here that there’s absolutely nothing wrong with targeted advertising, regardless of which campaign used it, or who won with it. There are 2 issues at play in this whole affair – the bad actions of Aleksandr Kogan and Cambridge Analytica, and Facebook’s overly generous public defaults.

What’s already been done about all of this

So a few things have already happened as a result of this whole brouhaha, and it’s worth noting them. First, while I’ve mentioned more than once that grabbing public profile data on a Facebook user’s friends was allowed by the Facebook API at the time all of this was actually happening (which was years ago) – Facebook actually disabled that behavior 3 years ago this month (“this month” is April 2018 if you’re that curious).

This presents a problem – Facebook had to testify to Congress and show that it was “doing something about this,” but it’s not like they could fix the issue again. So instead they made several other API changes, mostly removing user details from various endpoints, deprecating several other endpoints, and speeding up their Instagram API deprecation schedule from “later this year” to “we just turned it off an hour ago and totally broke your stuff” (for a better list of exactly what changed, see their breaking changes page). I’m not particularly familiar with the endpoints in question, but as best I can tell a lot of the deprecations make searching, events, and pages less useful to integrate with (my limited experience with the pages API hasn’t left a good taste in my mouth, and that was before Facebook limited the ability to publish tabs to pages so severely that it’s now impossible to test your code without doing so on a public, widely-visible page. Oh, dear Facebook API developers – f*** you).

Basically, Facebook is now exposing less information about users (“Look! We’re taking privacy seriously and not sharing data about people – even though the app would have had to explicitly request this kind of permission, we would have shown the user what would have been shared, they would have had to agree to it, and it would only impact the user signing up for the app!”), but in doing so making the “platform” less valuable to integrate with (fun fact, it doesn’t appear the main API documentation is updated with the deprecated endpoints, hopefully it’ll be better documented in the v2.13 release where presumably the endpoints will be removed entirely). I don’t like devaluing the ability to integrate with Facebook, especially since the real problem wasn’t related to the data coming from these endpoints, and the permissions around them were already pretty reasonable, and accessing those endpoints required a review from Facebook. It’s doing something for the sake of doing something, which isn’t solving the original problem (because, again, Facebook solved that problem years ago), and that “something” is just going to frustrate developers, who likely weren’t behaving badly since Facebook improved their permissions and review process.

At this point, it seems like even Facebook is giving up the pretense that it’s a platform, as opposed to an app with a corresponding API. That’s fine for Facebook, but begs the question of whether or not using their API for anything other than an ads integration or user sign-in is worth the effort. Personally, I think if this is the direction Facebook is moving with their API access, the answer is going to end up being “no.”

This is going to be where we see how much value 3rd-party apps brought to Facebook. If Facebook has enough commercial and staying power that people are willing to only interact with Facebook features on the Facebook site (or in the official Facebook app) itself, Facebook will be fine. If it turns out that the various apps that hooked into Facebook were what made the site so useful to people, we may finally see a significant drop in Facebook usage (although that doesn’t mean another social application is going to take over the market. Facebook still doesn’t have any real competition on that front, and likely won’t for a while, even if these API changes do backfire).

The Facebook/Cambridge Analytica story is one that’s easy to overhype, and has been made out to sound a lot worse than it actually is. The reality is that Facebook used to have overly-generous permissions, and a data analytics firm paid someone to create an app (while lying about the purpose of said app) to collect the data Facebook was offering, and then used it to buy ads. Facebook’s oversharing issue got fixed in 2015, and while Cambridge Analytica was sleazy in how they got the data, there’s nothing objectionable in what they did with it once they had it. In the years since, Facebook app reviews and permissions have improved greatly, but sadly, to save face and in a show of public contrition, Facebook broke every Instagram integration on the market, and heavily deprecated their own APIs. In the end, Facebook will be fine – it still owns the social application genre. But the end result of this panic isn’t going to be a better awareness of personal data sharing defaults, but a less useful API, some political grandstanding, and continued ignorance and outrage around Facebook.