Online Privacy and User Data

Thoughts on how big companies are using your data for their own profit and its consequences

Levels of Privacy

When discussing online privacy, it is important to note that our personal data is available at different levels. On the one hand, there is the data we choose to make public. For example, anyone can learn plenty of things about me just from what I write on this blog. Posting publicly on Instagram gives anyone access to your photos, and so on. This is relatively easy to limit, since most platforms give you fine control over what different groups of people can see. Limiting, or at least being aware of, how much we make public is the first step.

On the other hand, even if you choose to share something on Facebook only with your family, Facebook will still have access to whatever you shared. This holds for any online system you use. If you send an e-mail, the server you use, be it Gmail, Exchange, or another, will know who you write to, who replies, and so on. Therefore, the second level of privacy concerns what happens with the data you generate on a given platform. In principle, the data you generate on Facebook can be sold to other companies, and that is why it becomes a concern.

Advertisements and user data

Companies such as Facebook, Google, or Twitter survive only because of the data they accumulate about their users. You may have noticed that on Facebook and Instagram, about 25% of the content you are exposed to is advertising. The value of an online ad is normally calculated from the number of clicks it receives. It makes sense, therefore, that companies try to maximize the number of clicks, since more clicks mean more profit. Google has been doing this for a long time. If you have ever used AdWords, you probably noticed how complex the pricing becomes when setting target keywords.

To optimize the click-through rate of advertisements, one can rely on data from users. A common example: if you search Google for vacuum cleaners, there is a good chance that you would like to buy one. Showing you advertisements for vacuum cleaners therefore seems relevant enough. This is somewhat trivial nowadays, and it is a correlation everyone has noticed. Similarly, if you are a member of a Vacuum Cleaners Owners Group on Facebook, showing you a link to buy one is probably relevant.

However, the approach explained above has a limit. Google knows what you searched for, and Facebook knows what groups you belong to. But what happens when you leave Google or Facebook? The tentacles of these companies reach much further. Several years ago, Google developed a tool called Analytics, which lets website owners understand their visitors. In exchange, the only thing you need to do is give Google all the information you have on your visitors, including personally identifiable information. And they've made it dead simple: just add 3 lines of code to your pages.

In this way, Google has access not only to what you search for on their platform, but also to the pages you visit after you leave: how long you stay, how often you come back, what your porn preferences are, whether you cheat on your wife by visiting dating sites. And this was enabled by webmasters who have willingly given Google all that information for free. What Google decides to do with the information is another discussion. So now you have one clear example of how a company can track users across the internet, even before owning a mobile operating system or a browser.
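
To make the mechanism described above concrete, here is a minimal sketch in Python of how a script embedded on many unrelated sites can stitch together one visitor's browsing history. It is a toy simulation, not Google's actual implementation: the class name, the site names, and the visitor identifiers are all hypothetical, and a real tracker would rely on cookies or similar identifiers sent by the browser.

```python
from collections import defaultdict


class ThirdPartyTracker:
    """Toy stand-in for an analytics service embedded on many websites."""

    def __init__(self):
        # Maps a (hypothetical) visitor identifier to the (site, page) pairs seen.
        self.history = defaultdict(list)

    def record_visit(self, visitor_id, site, page):
        # Every site that embeds the script reports its visits to the same place.
        self.history[visitor_id].append((site, page))

    def sites_visited(self, visitor_id):
        # The tracker can list sites that have nothing to do with each other.
        return sorted({site for site, _ in self.history[visitor_id]})


tracker = ThirdPartyTracker()
tracker.record_visit("visitor-123", "news.example", "/politics")
tracker.record_visit("visitor-123", "shop.example", "/vacuum-cleaners")
tracker.record_visit("visitor-123", "dating.example", "/profiles")
tracker.record_visit("visitor-456", "news.example", "/sports")

print(tracker.sites_visited("visitor-123"))
```

None of the three sites knows about the others, yet the service in the middle sees all of them, simply because each webmaster chose to report visits to it.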

Facebook, of course, didn't stay behind and followed exactly those steps. They first introduced the like button, a way in which users of your website could share the love with you. This was quickly uncovered as a tracking scheme by Facebook and was abandoned. However, they offered an alternative. Imagine you buy advertisements on Facebook: you would like to know which users follow the link, how many go from your website to Facebook, and so on. That is why they developed the Facebook Pixel, a tool to track users even outside of their own platform. And webmasters followed to the letter the instructions on how to give away user information for free.

In the case of Facebook, another key parameter they want to optimize is the amount of time you spend on their platform. The more time you spend, the more valuable you are. The more they know you, the more content you generate so that others stay longer, and so on. Therefore, platforms such as Facebook or YouTube use all the information they have on their users to maximize the amount of time spent on their websites and apps. The news you see in your stream is not randomly chosen, but carefully selected by an algorithm. The suggested videos on YouTube are not selected based on quality, but on the likelihood of you watching until the end.
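
The difference between ranking by quality and ranking by expected engagement can be sketched with a few lines of Python. This is only an illustration of the idea, not how YouTube or Facebook actually rank content: the video titles, the quality scores, and the watch-to-end probabilities are all made-up numbers.

```python
# Hypothetical catalogue: every value below is invented for illustration.
videos = [
    {"title": "In-depth documentary", "quality": 9, "p_watch_to_end": 0.20},
    {"title": "Calm tutorial", "quality": 7, "p_watch_to_end": 0.35},
    {"title": "Outrage compilation", "quality": 3, "p_watch_to_end": 0.80},
]

# Ranking a viewer might hope for: best content first.
by_quality = sorted(videos, key=lambda v: v["quality"], reverse=True)

# Ranking that maximizes time on the platform: most "watchable" first.
by_engagement = sorted(videos, key=lambda v: v["p_watch_to_end"], reverse=True)

print([v["title"] for v in by_quality])
print([v["title"] for v in by_engagement])
```

With these made-up numbers the two orderings are exactly reversed, which is the point: what keeps you watching and what is good are not the same thing.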

Now you can see how two internet giants monitor user behavior across the internet: with complicit webmasters who hand over their own users' information, no questions asked. A healthier internet is everyone's responsibility, and especially webmasters'. The fact that they all aligned behind the same practices just shows how low we have fallen as a society.

What else can be done with data

If Facebook and Google were collecting data only for advertising purposes, one could argue that in the end it is not that bad. We get a good service in return, and more personalized advertisements, which may give us access to the next great vacuum cleaner at a better price. In 2015 (yes, that long ago), the first Facebook scandal came to light in The Guardian. Facebook enabled Cambridge Analytica to use the profiles of millions of US citizens to generate targeted political campaigns. It sounds dystopian, but in the end it is not that far-fetched.

The more data you have on people, the more you can profile them. A very silly example would be to gather information about gender and political preferences. Imagine you find in your data that 85% of women and 65% of men would vote for the red party (your party). If you want to increase your chances of winning, you had better focus on men, since there is more room for improvement. You would then target men even if you have no information about them specifically, just the average behavior.
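
The arithmetic behind that silly example can be written down in a few lines of Python. The support percentages come from the paragraph above; the group sizes are an assumption (equal numbers of men and women) made only to have concrete figures, and the function name is hypothetical.

```python
# Support percentages are from the example; the sizes are assumed for illustration.
groups = {
    "women": {"size": 1000, "support_pct": 85},
    "men": {"size": 1000, "support_pct": 65},
}


def room_for_improvement(group):
    """Number of people in the group who do not yet support the party."""
    return group["size"] * (100 - group["support_pct"]) // 100


remaining = {name: room_for_improvement(g) for name, g in groups.items()}

# Target the group with the most people left to convince.
best_target = max(remaining, key=remaining.get)

print(remaining)
print(best_target)
```

Under these assumptions there are 150 women and 350 men left to convince, so a campaign working only from averages would aim at men without knowing anything about any individual man.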

If you have access to more data, you can start finding much more detailed correlations. You can have access to geographic information, age, and social connections. You could target a family group, a school district, and so on. You may even know which topics people are more susceptible to, such as corruption scandals or global warming, and expose them to content you believe will tip the balance in your favor (or against your competitors). The consequences of this are widely studied, since it is what gives rise to fake news, confirmation bias, and so on.

And this behavior by Facebook has not changed after the 2015 scandal. There are many reports of how Facebook willingly gave away user information to companies that exploited it for political gain without any kind of oversight. And if you believe this is bad enough to grab your attention, it can get worse. Facebook was behind the incitement to genocide in Myanmar. You may argue that Facebook can't control everything that happens on their platform, which could be true, but their algorithms are fine-tuned for their own profit. And in this case it led to the spread of military propaganda against a minority.

And the fixation on optimizing the amount of time spent on a platform is not exclusive to Facebook. YouTube is another great example of how those algorithms can end up amplifying white supremacist speech. Since the platform only wants you to spend more time on it, it will show you whatever it deems appropriate for you to watch. However, it has been shown that radical videos are what the algorithm converges to. And this is a platform more popular among younger audiences (i.e. under 20).

What can we do

As individuals who are also consumers of those platforms, we can do our best to stop using them in order to force them to change direction. Our wallet is the best tool consumers have for shaping the companies with which they don't agree. Remember that the fact that we are on them is what generates value. People are on Facebook because their friends are on Facebook. YouTube is valuable because people upload videos to it. If more and more people stop using the platforms, eventually others will follow, and alternatives will arise.

For some, quitting Facebook is not an option because, for example, they find work through it. However, if enough people stop using Facebook, other channels to find work will become more relevant. It is common, for example, for people to have Facebook pages as their only possible means of contact. Once they realize they are losing business because of it, they will start, at least, using e-mail as a contact method.

As webmasters, however, the responsibility is at a different scale. We make choices that affect our visitors, not only ourselves. If you choose to send Google your users' data, or if you install the Facebook Pixel on your pages, you have to be aware of the consequences at different levels. Alternatives are available; it is only a matter of using them and showing appreciation.

If you create courses, expand your tool set. If you just replicate the same flow with the same tools, without exploring alternatives, you are part of the problem, and you are making it more and more acute.


Online privacy has many different angles, and it operates at many different levels. I have only discussed some examples, mostly based on Google and Facebook, and how some things work online. But the discussion is much longer, and I will be covering it step by step. Something we haven't discussed is how we can find alternatives to some of the tools we currently use, which ones are easy to replace and which ones aren't. I also haven't discussed the involvement of the state, also known as mass surveillance, nor data aggregators. The series of articles will finish with examples of how we can strengthen our online privacy.

Aquiles Carattino

I am the creator of Python for the Lab
