What is the Human Web and what data is collected through it?

The Human Web is an open-source technology built by our parent company, Cliqz, that uses the power of anonymous group data to create innovative browser technologies that make the internet more private.  Users who participate in the Human Web contribute anonymous information related to trackers, websites, and search queries, which is then analyzed and evaluated for relevance and safety.  This data is used to create anonymous group models that power the private quick-search, anti-tracking, and anti-phishing technologies featured in Cliqz products, which will soon be featured in Ghostery as well.

The Human Web is built using world-leading privacy-by-design practices that ensure all data is collected anonymously, without any personally identifiable information.  To achieve this, the Human Web relies on two core components: its data collection framework and its proxy network.

The Human Web data collection framework requires that each data point contributed by users be evaluated only as a single, aggregated event, disentangling these signals from anything that could identify a user, such as timestamps or user IDs.  Furthermore, the Human Web filters out sensitive or personal information from URLs that are deemed unsafe (e.g., twitter.com/username) because they could be used to identify an individual person.  Thus, we can neither combine data from multiple entries or multiple visits to websites, nor link this information with any personal information, such as email addresses or user IDs, that could be used to identify an individual.
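For readers who think in code, the idea can be illustrated with a short Python sketch.  This is not the Human Web implementation (see the source code for that); the function names, the event fields, and the deliberately blunt rule of keeping only the site portion of a URL are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def truncate_to_site(url: str) -> str:
    """Keep only the scheme and host of a URL (hypothetical helper).

    The path, query string, fragment, port, and any embedded credentials
    are discarded, since they may contain usernames or other identifiers.
    """
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"not an absolute URL: {url!r}")
    return f"{parts.scheme}://{parts.hostname}/"


def build_signal(raw_event: dict) -> dict:
    """Build the record that would be contributed (hypothetical shape).

    Only explicitly allowed fields are copied, so anything else in the raw
    event, such as a user ID or timestamp, never leaves the device.
    """
    return {"url": truncate_to_site(raw_event["url"])}
```

Under these assumptions, an event such as {"url": "https://twitter.com/username", "user_id": "u1"} would be reduced to a record containing only the site address, with the username and user ID left out.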

As a further safety precaution, this information is sent through the Human Web proxy network, a series of peer-to-peer proxies that remove information such as users' IP addresses, making it virtually impossible to determine who or where the data comes from.  The proxy network itself is blind to the content of the data it is sending, adding a further security measure to the process.  Consequently, all data we collect is virtually unidentifiable by anyone, including ourselves, so that even if our security were breached by a hacker or an outside organization, there would be absolutely no way to tie this information to individuals.

The specific data contributed through the Human Web includes:

  • Non-Private URLs
  • Search queries along with Search Engine Results Pages
  • Suspicious URLs that are potential phishing websites
  • Information related to safe and unsafe trackers
  • Information related to the prevalence and performance of trackers  

Though the Human Web becomes more powerful as more Cliqz and Ghostery users join it, participation is completely optional.  If you do not want the Human Web to collect anonymous statistical data about your searches and website visits, you can adjust your settings in the Ghostery Menu.

If you’d like to dive into the weeds and learn more about the Human Web, you can check out the source code in our open-source GitHub repo.