The Web's Sixth Sense:

A Study of Scripts Accessing Smartphone Sensors

Mobile browsers allow web pages you visit to access sensors on your smartphone. We performed a study to find out how this functionality is used in practice: which websites are using your sensors, what they are doing with the data, and what are the privacy implications. The results are published in a paper at ACM CCS'18. This companion website presents some of our high-level findings and data.


The study is a collaboration of Anupam Das¹, Gunes Acar², Nikita Borisov³, and Amogh Pradeep⁴.
¹ North Carolina State University
² Princeton University
³ University of Illinois at Urbana-Champaign
⁴ Northeastern University

All the findings below are based on a web crawl of the Alexa top 100K websites carried out in May 2018.

Our crawl found that sensor APIs were accessed on 3695 of the 100K websites by scripts served from 603 distinct domains. Motion and orientation sensors are by far the most frequently accessed, on 2653 and 2036 sites respectively. Light and proximity sensors, which were supported only by Firefox[1], are each accessed on fewer than 200 sites.

In addition to listing the number of sites that feature scripts, we use the prominence metric proposed by Englehardt and Narayanan (§5.2) to capture the popularity of the sites where scripts are present. It is calculated as the sum of the inverse ranks of those sites: \( \sum 1/\mathit{rank}_i \). For ease of interpretation, we normalize this by dividing by the prominence of all sites, \( \sum_{i=1}^{100000} 1/i \), and express the value as a percentage.
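The normalization described above can be sketched in a few lines of Python. This is only an illustration of the formula, not the code used in the study; the function names and the example ranks are hypothetical.

```python
def prominence(ranks):
    """Sum of inverse ranks over the sites where a script is present."""
    return sum(1.0 / r for r in ranks)


def normalized_prominence(ranks, total_sites=100_000):
    """Prominence as a percentage of the prominence of all crawled sites."""
    total = sum(1.0 / i for i in range(1, total_sites + 1))
    return 100.0 * prominence(ranks) / total


# Hypothetical example: a script present on the sites ranked 1, 10, and 100.
example_ranks = [1, 10, 100]
print(f"{normalized_prominence(example_ranks):.2f}%")  # prints 9.18%
```

Because the weights fall off as 1/rank, presence on a handful of top-ranked sites can outweigh presence on thousands of low-ranked ones, which is why a sensor can have fewer sites but a higher normalized prominence.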

| Sensor | # of sites | # of domains | Norm. prominence |
|---|---|---|---|
| Motion | 2653 | 384 | 1.82% |
| Orientation | 2036 | 420 | 4.34% |
| Proximity | 186 | 50 | 0.13% |
| Light | 181 | 35 | 0.13% |

We found several scripts that access and send sensor data to remote servers either in clear text or in base64 encoded form. To detect such exfiltrations, we analyzed HTTP request headers and POST request bodies obtained through OpenWPM-Mobile’s instrumentation.

| Domain (PS+1) | Sensors | Encoding | Num. of sites | Top site |
|---|---|---|---|---|
| b2c.com | Orientation, Motion, Proximity, Light | base64 | 53 | reuters.com |
| perimeterx.net | Motion | base64 | 45 | zillow.com |
| wayfair.com | Motion | base64 | 7 | wayfair.com |
| moatads.com | Orientation | raw | 5 | stuff.co.nz |
| queit.in | Motion, Orientation | raw | 3 | busbud.com |
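As a rough illustration of how raw and base64-encoded exfiltration could be spotted in a request payload, the sketch below searches for known sensor readings directly and inside decoded base64 tokens. It is not the detection code used in the study: the function name, the token-length threshold, and the assumption that readings appear as decimal strings are all hypothetical.

```python
import base64
import binascii
import re
from typing import List, Optional

# Runs of base64 alphabet characters long enough to plausibly carry data.
B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def find_sensor_leak(payload: str, sensor_values: List[str]) -> Optional[str]:
    """Return 'raw', 'base64', or None depending on how a reading appears."""
    if any(value in payload for value in sensor_values):
        return "raw"
    for token in B64_TOKEN.findall(payload):
        padded = token + "=" * (-len(token) % 4)  # restore stripped padding
        try:
            decoded = base64.b64decode(padded).decode("utf-8", "ignore")
        except (binascii.Error, ValueError):
            continue  # not valid base64 after all
        if any(value in decoded for value in sensor_values):
            return "base64"
    return None


# Hypothetical payloads carrying a made-up accelerometer reading.
reading = "9.80665"
encoded = base64.b64encode(b'{"acc_z": 9.80665}').decode("ascii")
print(find_sensor_leak('{"acc_z": 9.80665}', [reading]))  # raw
print(find_sensor_leak("d=" + encoded, [reading]))        # base64
print(find_sensor_leak("d=no-sensor-data", [reading]))    # None
```

Decoding each candidate token and searching the decoded text avoids the alignment problem that arises when searching for a base64-encoded substring inside a longer encoded blob.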

By clustering scripts based on features extracted from instrumentation data we were able to classify major use cases. We found that sensor data are commonly used for tracking and analytics, verifying ad impressions, and distinguishing real devices from bots.

| ID | Use case | % of JS | Num. of sites | Norm. prominence |
|---|---|---|---|---|
| 1 | Reacting to orientation, tilt, shake | 6.7% | 533 | 3.16% |
| 2 | Scripts clustered as noisy | 20.6% | 1804 | 1.47% |
| 3 | Tracking, analytics, fingerprinting and audience recognition | 36.8% | 1198 | 1.07% |
| 4 | Differentiating bots from real devices | 17.7% | 413 | 0.38% |
| 5 | Checks what HTML5 features are offered | 11.2% | 114 | 0.15% |
| 6 | Automatically resize contents in page or iframe | 3.2% | 103 | 0.04% |
| 7 | Parallax engine that reacts to orientation sensors | 3.4% | 35 | 0.03% |
| 8 | Use sensor data to add entropy to random numbers | 0.4% | 4 | 0.00% |

We also found that a large fraction of the scripts that access sensors perform browser fingerprinting as well.

| Sensor | Canvas FP | Canvas Font FP | Audio FP | WebRTC FP | Battery FP | Any FP | Total scripts |
|---|---|---|---|---|---|---|---|
| Motion | 56.7% | 0.2% | 19.8% | 6.8% | 5.6% | 62.7% | 501 |
| Orientation | 36.2% | 3.4% | 5.7% | 6.2% | 4.5% | 41.7% | 650 |
| Proximity | 2.1% | 0.0% | 47.9% | 0.0% | 49.0% | 51.0% | 96 |
| Light | 19.5% | 1.2% | 56.1% | 15.9% | 57.3% | 76.8% | 82 |

We measure the rate of blocking by three popular tracking protection lists: EasyList, EasyPrivacy, and Disconnect. The table below reports the percentage of sensor-accessing script domains that each list blocks. In general, a significant portion of the scripts that access sensors are missed by the popular blacklists, which is in line with previous research on tracking protection lists. However, the lists do tend to capture the most prominent scripts; the numbers in parentheses are the percentages of scripts weighted by their prominence.

| Sensor | Disconnect blocked | EasyList blocked | EasyPrivacy blocked |
|---|---|---|---|
| Motion | 1.8% (74.4%) | 1.8% (74.6%) | 2.9% (11.5%) |
| Orientation | 3.6% (60.4%) | 3.1% (44.6%) | 3.1% (5.8%) |
| Proximity | 6.0% (13.3%) | 2.0% (13.6%) | 4.0% (68.6%) |
| Light | 2.9% (13.2%) | 2.9% (13.6%) | 8.6% (70.8%) |
| Any sensor | 3.3% (58.5%) | 2.7% (43.0%) | 3.0% (10.8%) |
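To see how a list can block only a few percent of script domains yet a much larger prominence-weighted share, consider the sketch below. It reflects one plausible reading of the weighting (each domain weighted by its prominence), not the study's actual computation; the function name and the input values are hypothetical.

```python
def blocked_rates(domains):
    """Compute unweighted and prominence-weighted blocking percentages.

    `domains` is a list of (prominence, is_blocked) pairs, one per
    script domain. Returns (unweighted_pct, weighted_pct).
    """
    total_prominence = sum(p for p, _ in domains)
    blocked = [p for p, is_blocked in domains if is_blocked]
    unweighted = 100.0 * len(blocked) / len(domains)
    weighted = 100.0 * sum(blocked) / total_prominence
    return unweighted, weighted


# Hypothetical data: one prominent blocked domain, four obscure unblocked ones.
example = [(0.6, True)] + [(0.1, False)] * 4
unweighted, weighted = blocked_rates(example)
print(f"{unweighted:.1f}% ({weighted:.1f}%)")  # prints 20.0% (60.0%)
```

The gap between the two figures grows when the blocked domains are the ones served on highly ranked sites, which matches the pattern the table shows for the most prominent scripts.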

Scripts from the following domains were found to access one or more sensor APIs.

The list of websites where one or more sensors were accessed.

The code for OpenWPM-Mobile, a mobile version of the OpenWPM web privacy measurement framework, can be found on GitHub.

We have made all of the data supporting this research, including raw crawl data, extracted features, and clustering results, available via the Illinois Data Bank.


We would like to thank all the anonymous reviewers for their feedback. We would also like to thank Arvind Narayanan, Steven Englehardt, and our shepherd Ben Stock for their valuable feedback. This material is based in part upon work supported by National Science Foundation grants CNS 1739966 and CNS 1526353.

Reference: Anupam Das, Gunes Acar, Nikita Borisov, Amogh Pradeep. The Web's Sixth Sense: A Study of Scripts Accessing Smartphone Sensors. In Proceedings of the 25th ACM Conference on Computer and Communications Security (CCS), Toronto, Canada, October 15–19, 2018.

BibTeX:

@inproceedings{sensor-js-2018,
  author    = {Anupam Das and Gunes Acar and Nikita Borisov and Amogh Pradeep},
  title     = {The {Web's} Sixth Sense: A Study of Scripts Accessing Smartphone Sensors},
  booktitle = {Proceedings of the 25th ACM Conference on Computer and Communications Security (CCS)},
  year      = 2018,
  month     = oct,
  publisher = {ACM},
  doi       = {10.1145/3243734.3243860},
  url       = {https://doi.org/10.1145/3243734.3243860}
}