Watching Them Watching Us: How Websites are Leaking Sensitive Data to Third-Parties
What is a TellTale URL?
URLs are the most commonly tracked piece of information. The innocent choice to structure a URL based on page content can make it easier to learn a user’s browsing history, address, health information or other sensitive details. Such URLs either contain sensitive information themselves or lead to a page that contains it.
We call such URLs TellTaleURLs.
Let’s take a look at some examples of such URLs.
Website: donate.mozilla.org (Fixed)
After you have finished the payment process on donate.mozilla.org, you are redirected to a “thank you” page. If you look carefully at the URL shown in the screenshot below, it contains private information such as email, country, amount and payment method.
Because this page loads some resources from third-parties and the URL is not sanitised, the same information is also shared with those third-parties, both via the referrer and as a value inside the payload sent to them.
In this particular case, there were 7 third-parties with whom this information was shared.
Mozilla was prompt to fix these issues. More details can be found here: https://bugzilla.mozilla.org/show_bug.cgi?id=1516699
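To make "sanitising the URL" concrete, here is a minimal sketch of stripping sensitive query parameters before a URL is handed to any third-party. The parameter names in `SENSITIVE_PARAMS` and the example domain are assumptions for illustration; they are not the actual parameters used by donate.mozilla.org, and this is not how Mozilla fixed the issue.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names treated as sensitive (an assumption, not taken
# from any real website).
SENSITIVE_PARAMS = {"email", "country", "amount", "payment_method"}


def sanitise_url(url: str) -> str:
    """Return the URL without sensitive query parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SENSITIVE_PARAMS
    ]
    # Rebuild the URL from the remaining parameters; the fragment is dropped.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    print(sanitise_url(
        "https://donate.example.org/thank-you?email=jane%40example.com&amount=10&lang=en"
    ))
```

A deny-list like this only helps if it is kept up to date. Not putting private values into the URL in the first place is the more robust choice.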
Website: trainline.eu, JustFly.com (Last checked: Aug’18)
Once you finish a purchase such as train or flight tickets, you receive an email with a link to manage your booking. Most of the time, when you click on the link, you are shown the booking details without having to enter anything else, such as a booking code or username and password.
This means that the URL itself contains a token which is unique to the user and provides access to the user’s booking.
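One rough way to spot such access tokens is to flag query values that are long and look random. The sketch below uses Shannon entropy for that. The thresholds (`min_length`, `min_entropy`) and the example URL are assumptions chosen for illustration, and tokens embedded in the URL path are not covered.

```python
import math
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in Counter(value).values())


def token_like_params(url: str, min_length: int = 16, min_entropy: float = 3.0) -> list:
    """Return the names of query parameters whose values look like random tokens.

    The default thresholds are illustrative guesses, not tuned values.
    """
    query = urlsplit(url).query
    return [
        key
        for key, value in parse_qsl(query)
        if len(value) >= min_length and shannon_entropy(value) >= min_entropy
    ]


if __name__ == "__main__":
    url = "https://booking.example.com/manage?lang=en&token=9f8a7c6b5d4e3f2a1b0c9d8e7f6a5b4c"
    print(token_like_params(url))
```

This is only a heuristic: it will miss short tokens and may flag harmless values such as long search queries.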
Website: foodora.de, grubhub.com (Last checked: Aug’18)
One of the pre-requisites to order food online is entering the address where you want the food to be delivered.
Some popular food delivery websites convert the address to fine-grained latitude-longitude values and add them to the URL.
The URL is also shared with third-parties, potentially leaking where the user lives.
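As a rough illustration of what "fine" means here, the sketch below looks for decimal numbers with at least four fractional digits in a URL. Four decimal places of a degree of latitude correspond to roughly 11 metres. The regular expression, the range check and the example URL are assumptions for illustration, not a rule used by any of the websites mentioned.

```python
import re
from urllib.parse import unquote

# Decimal numbers with four or more fractional digits, not embedded in a
# longer number. This is an illustrative heuristic.
_FINE_DECIMAL = re.compile(r"(?<![\d.])-?\d{1,3}\.\d{4,}(?![\d.])")


def find_fine_coordinates(url: str) -> list:
    """Return values in the URL that could be precise latitude/longitude numbers."""
    candidates = [float(match) for match in _FINE_DECIMAL.findall(unquote(url))]
    # Keep only values inside the valid longitude range, which also covers latitude.
    return [value for value in candidates if -180.0 <= value <= 180.0]


if __name__ == "__main__":
    print(find_fine_coordinates(
        "https://food.example.com/restaurants?lat=52.520008&lng=13.404954"
    ))
```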
To be clear, it’s not just these websites that suffer from such leaks. This problem exists everywhere — it’s a default situation, not a rarity. We’ve seen it with Lufthansa, Spotify, Flixbus, Emirates, and even with medical providers.
Risks of TellTale URLs:
- Websites are carelessly leaking sensitive information to a plethora of third-parties.
- Most often without users’ consent.
- More dangerously: Most websites are not aware of these leaks while implementing third-party services.
Are these problems hard to fix?
As a Software Engineer who has worked for some of the largest eCommerce companies, I understand the need to use third party services for optimising and enhancing not only the Digital Product but also how users interact with the product.
It is not the usage of third party services that is of concern in this case, but the implementation of these services. Owners should always have control of their website and of what the website shares with third party services.
It is this control that needs to be exercised to limit the leakage of user information.
It is not a mammoth task. It is just a matter of commitment to preserving the basic right to privacy.
- Add noindex meta tags to private pages.
- Limit the presence of third-party services on private pages.
- Set a Referrer-Policy on pages with sensitive data.
- Implement CSP (Content Security Policy) and SRI (Subresource Integrity). Even with a huge footprint of third-party services, CSP and SRI are not enabled on the majority of websites.
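The measures above can be sketched in a few lines. The helper below builds response headers for a private page and computes an SRI value for a script. The function names and the specific policy values (`no-referrer`, `default-src 'self'`) are assumptions picked for illustration, and the right policy depends on the website. `X-Robots-Tag: noindex` is used here as the header counterpart of the noindex meta tag.

```python
import base64
import hashlib


def sri_hash(content: bytes) -> str:
    """Return a Subresource Integrity value (sha384) for the given file content."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def private_page_headers(allowed_script_hosts=()) -> dict:
    """Build illustrative response headers for a page that shows private data."""
    script_src = " ".join(["'self'", *allowed_script_hosts])
    return {
        # Do not send the page URL to other origins via the referrer.
        "Referrer-Policy": "no-referrer",
        # Ask search engines not to index this page.
        "X-Robots-Tag": "noindex",
        # Only allow scripts from this site and explicitly listed hosts.
        "Content-Security-Policy": f"default-src 'self'; script-src {script_src}",
    }


if __name__ == "__main__":
    print(private_page_headers(["https://cdn.example.com"]))
    print(sri_hash(b"console.log('hello');"))
```

The value returned by `sri_hash` would go into the `integrity` attribute of the `<script>` tag that loads the third-party file, so the browser refuses to run it if its content changes.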
Introducing Local Sheriff:
If such information leakage is dangerous to both users and organisations, why is it such a widespread problem?
One big reason these issues exist is a lack of awareness.
A good starting point for websites is to see what information is being leaked, or to detect the presence of TellTaleURLs.
But to find out whether the same is happening with the websites you maintain or visit, you need to learn some tools to inspect network traffic, understand the relationship between first-party and third-party, and then make sure you have these tools open during the transaction process.
To help bridge this gap, we wanted to build a tool with the following guidelines:
- Easy to install.
- Monitors and stores all data being exchanged between websites and third-parties, locally on the user’s machine.
- Helps users identify which companies are tracking them on the internet.
- Provides an interface to search the information being leaked to third-parties.
Given the above guidelines, a browser extension seemed like a reasonable choice. After you install Local-Sheriff, in the background it:
- Monitors the interaction between first-party and third-party using the WebRequest API.
- Classifies each URL as first-party or third-party.
- Ships with a copy of the database from WhoTracksMe, to map which domain belongs to which company.
- Provides an interface where you can search for values that you think are private to you and see which websites leak them to which third-parties, e.g. name, email, address, date of birth, cookie etc.
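The classification and search steps can be sketched as follows. This is not Local-Sheriff's actual implementation: the `RecordedRequest` type, the function names and the "last two labels" domain rule are assumptions for illustration. A real tool would use the Public Suffix List to find the registrable domain and would also look inside request bodies and cookies.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit, unquote


def naive_site(url: str) -> str:
    """Approximate the registrable domain by keeping the last two host labels."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def is_third_party(page_url: str, request_url: str) -> bool:
    """A request is third-party when it goes to a different site than the page."""
    return naive_site(page_url) != naive_site(request_url)


@dataclass
class RecordedRequest:
    page_url: str       # the page the user was visiting (first-party)
    request_url: str    # the resource that page requested
    referrer: str = ""  # referrer sent along with the request, if any


def find_leaks(requests, value: str) -> dict:
    """Map each first-party site to the third-party sites that received `value`."""
    needle = value.lower()
    leaks = {}
    for req in requests:
        if not is_third_party(req.page_url, req.request_url):
            continue
        # Search the decoded request URL and referrer for the private value.
        haystack = unquote(req.request_url + " " + req.referrer).lower()
        if needle in haystack:
            leaks.setdefault(naive_site(req.page_url), set()).add(naive_site(req.request_url))
    return leaks


if __name__ == "__main__":
    page = "https://donate.example.org/thanks?email=jane%40example.com"
    recorded = [
        RecordedRequest(page, "https://cdn.tracker.example.net/pixel.gif", referrer=page),
        RecordedRequest(page, "https://static.example.org/app.js", referrer=page),
        RecordedRequest(page, "https://fonts.example.com/font.woff"),
    ]
    print(find_leaks(recorded, "jane@example.com"))
```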
Revisiting EXAMPLE #1
- The user has Local-Sheriff installed and donates to mozilla.org.
- Clicks on the icon to open search interface.
- Enters the email ID used on the website donate.mozilla.org.
It can be seen that the email address used at the time of donation was shared with ~7 third-party domains.
You can try it yourself by installing it:
Source Code: https://github.com/cliqz-oss/local-sheriff
Thanks for reading and sharing! 🙂
Happy Hacking!
- Special thanks to Remi and Pallavi for reviewing this post 🙂
- The title “Watching them watching us” comes from a joint talk between Local Sheriff and Trackula at FOSDEM 2019.
Watching Them Watching Us: How Websites are Leaking Sensitive Data to Third-Parties was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.