Top-viewed Articles on a Wikipedia
This tool fetches yesterday's top-viewed articles for a given wiki. It shows two versions of the results: first the actual data (i.e. accurate counts without any noise added), then the same data after differential privacy (DP) has been applied, specifically with noise drawn from a Laplace distribution.
You can play around with the different hyperparameters to see how they affect the results. See this Facebook blog post for a good worked example.
Language: which Wikipedia language to query.
Privacy Unit: which unit of privacy to use. Selecting "pageview" provides a guarantee that individual pageviews will be private, whereas "user" provides a guarantee that user sessions will be private. A user session can be capped at 1, 5, or 10 views per session; these caps still cover a large share (80-99%) of traffic, depending on the cap and the size of the wiki.
Epsilon (ε): privacy parameter. Defaults to 0.1, but can also be 0.5, 1, or 2. The smaller you make it, the more privacy-preserving the differential privacy mechanism is, and the greater the data loss.
Delta (δ): the probability of information about the database accidentally being leaked, i.e. the probability that the ε guarantee does not hold. The smaller you make it, the less likely a leak is to happen. In Privacy on Beam, δ is also used to add noise to the threshold used to put a minimum bound on output values. Ideally, the value should be less than the inverse of any polynomial in the size of the database.
Sensitivity: the maximum amount that any individual can add to the result. With pageview-level privacy, this defaults to 1, as the maximum difference between two adjacent databases is 1 pageview. With user-level privacy, this can be set to either 1, 5, or 10, to simulate varying thresholds for adjacent databases.
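
To make the interplay between the privacy unit, ε, and sensitivity concrete, here is a minimal sketch in Python. It is not the tool's actual implementation (which uses Privacy on Beam): the function names, the (session_id, page) input format, and the choice to keep the first views of each session are all assumptions made for illustration. It caps each session's contribution and then adds Laplace noise with scale sensitivity / ε to each page count. It deliberately omits the δ-based thresholding step.

```python
import random
from collections import Counter

def cap_contributions(views, cap):
    """Keep at most `cap` views per session.

    `views` is a list of (session_id, page) pairs (hypothetical format).
    Keeping the first `cap` views is a simplification; a real pipeline
    might sample them at random instead.
    """
    kept, seen = [], Counter()
    for session_id, page in views:
        if seen[session_id] < cap:
            seen[session_id] += 1
            kept.append((session_id, page))
    return kept

def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with
    rate 1/scale is Laplace-distributed with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)

def noisy_counts(views, epsilon, sensitivity, rng):
    """Count views per page and add Laplace noise to each count.

    The noise scale is sensitivity / epsilon, so a smaller epsilon or a
    larger sensitivity both mean noisier results.
    """
    counts = Counter(page for _, page in views)
    scale = sensitivity / epsilon
    return {page: count + laplace_noise(scale, rng)
            for page, count in counts.items()}

if __name__ == "__main__":
    rng = random.Random(42)
    views = [("s1", "Cat"), ("s1", "Cat"), ("s1", "Dog"), ("s2", "Cat")]
    capped = cap_contributions(views, cap=1)  # user-level privacy, cap of 1
    print(noisy_counts(capped, epsilon=0.1, sensitivity=1, rng=rng))
```

With ε = 0.1 and sensitivity 1, the noise scale is 10, so counts for rarely viewed pages are dominated by noise while counts in the thousands are barely affected. Raising the cap to 5 or 10 keeps more traffic but multiplies the noise scale by the same factor.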