The Privacy Problematic with Contact Tracing Apps

June 16, 2020

Check Point Software Technologies Security Researcher Oleg Ilushin and Group Manager Jonathan Shimonovich walk us through existing coronavirus contact tracing technologies and weigh the need to manage the pandemic spread against the implications for loss of personal privacy.

By Oleg Ilushin and Jonathan Shimonovich

The coronavirus pandemic has taken a huge toll worldwide on both individuals and economies. As a precaution, many countries have implemented strict lockdown measures such as closing schools, restaurants and borders, while mask-wearing in public and social distancing have become a must.

The enforcement of these policies, along with the extensive testing of populations, has helped to minimise infection rates. However, the pandemic’s effect on each nation’s economy has been brutal.

When an individual is found to be infected with the coronavirus, the race is on to find those who have come into contact with them, as these people could be carriers or even be infected. This has led to hundreds of coronavirus contact-tracing mobile applications being developed worldwide and backed by various governments and national health authorities, as well as guidelines by the EU and special protocols developed by the two major smartphone OS vendors, Apple and Google. In some places, the use of such applications has been made mandatory for people who want to gain access to public spaces.

While the technology and algorithms differ between applications, the promise of most coronavirus contact tracing apps is the same:

The ability to detect close contact between individuals (i.e. within several metres) over a period of time. The parameters differ from one application to another, but as a guideline, the time interval is about 15 minutes. Proximity, in the majority of applications, is measured using either Bluetooth or GPS technology. In the case of Bluetooth, each device periodically broadcasts packets with some unique ID, allowing other devices to monitor them. In the case of GPS, the exact location of the user is logged at all times.

When a person tests positive for coronavirus, they can use the application to advertise either their locations or the Bluetooth identifiers from registered contacts. Applications then notify users who appear to have been in close proximity to an infected person. The information about contacts made by the users of the applications is eventually shared with the local health authority, and/or with other users. Of course, if such a system is to be effective in breaking infection chains, the application must have high adoption rates.

These observations, naturally, raise many questions around the privacy of individuals’ data that the app may access, and the potential abuse of such systems.

Some are concerned that contact-tracing apps are surveillance tools that invade individual privacy and disclose sensitive information. Therefore, any such app and tracing system must maintain a delicate balance between privacy and security, since poor implementation of security standards may put users’ data at risk.

This comes down to questions on what data is collected, how it is stored and how it is distributed. For example, is the data encrypted? Is there a proper authorisation or verification process to protect against abuse? Is user anonymity preserved given that personal identifiers such as phone number, name and IDs are being collected?

Another aspect is user consent – does the user submit their data voluntarily, or is the data being collected and uploaded without the user’s knowledge?

Let’s look a little closer at how different applications work to try and answer these questions.

The two most widely used techniques for detecting proximity between two devices are GPS and the Bluetooth Low Energy (BLE) protocol.

GPS location tracking

With this method, the apps obtain a user’s GPS position periodically, and save a log of the user’s locations and timestamps. This data may be later intersected with other users’ location logs.
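As an illustration only, intersecting two location logs could look like the sketch below. The `Fix` record, the 10 metre radius and the 5 minute window are assumptions made for the example and are not taken from any of the applications discussed here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    """One logged GPS position (hypothetical record layout)."""
    t: int        # Unix timestamp in seconds
    lat: float    # degrees
    lon: float    # degrees


def haversine_m(a: Fix, b: Fix) -> float:
    """Great-circle distance between two fixes, in metres."""
    r = 6_371_000.0  # mean Earth radius
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dphi = p2 - p1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def overlaps(mine, theirs, max_dist_m=10.0, max_dt_s=300):
    """Pairs of fixes that are close in both space and time (assumed thresholds)."""
    return [
        (m, o)
        for m in mine
        for o in theirs
        if abs(m.t - o.t) <= max_dt_s and haversine_m(m, o) <= max_dist_m
    ]
```

The quadratic comparison keeps the sketch short. A real service intersecting many logs would need some form of spatial indexing, and the raw coordinates it handles are exactly the sensitive data discussed below.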

This approach offers the flexibility to analyse the geography of the infection spread, and gives more options to governments and health authorities to localise infected regions and apply prevention policies accordingly. However, this also gives away very sensitive information, revealing users’ travels and locations over the previous few days or weeks.

Examples of mobile applications that utilise GPS logging are MIT’s SafePaths, Cyprus’ CovTracer (which is based on SafePaths), Israel’s Hamagen and India’s Aarogya Setu.

Bluetooth Low Energy (BLE)

Here, each device broadcasts pings over BLE. These pings are registered by other devices that are in Bluetooth range based on duration and signal strength. To work, both devices must be running the contact tracing app.
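A rough sketch of how duration and signal strength might be combined into a single "close contact" decision is shown below. The RSSI threshold, the maximum gap between pings and the function names are hypothetical. Real applications calibrate these values per device model.

```python
def close_contact_seconds(sightings, rssi_threshold=-65, max_gap_s=120):
    """Approximate time spent near a device.

    `sightings` is a list of (unix_seconds, rssi_dbm) pairs recorded for one
    broadcast identifier. Only pings at or above the (assumed) RSSI threshold
    count, and gaps longer than `max_gap_s` are treated as a break in contact.
    """
    strong = sorted(t for t, rssi in sightings if rssi >= rssi_threshold)
    total = 0
    for prev, cur in zip(strong, strong[1:]):
        gap = cur - prev
        if gap <= max_gap_s:
            total += gap
    return total


def is_close_contact(sightings, min_duration_s=15 * 60, **kwargs):
    """True if the accumulated close time reaches the 15 minute guideline."""
    return close_contact_seconds(sightings, **kwargs) >= min_duration_s
```

Signal strength is only a loose proxy for distance, since walls, pockets and antenna differences all affect it, so any threshold of this kind trades false alarms against missed contacts.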

This technology is widely used in coronavirus tracing applications, as it offers more privacy: the only information usually transmitted over Bluetooth is a cryptographic identifier that changes frequently and does not expose the user’s identity. In addition, BLE randomises the MAC address sent in each packet over the air and changes it every few minutes, making it difficult to track devices.

When a person tests positive for coronavirus, they can publish all the IDs collected in proximity to them. Each user can then check whether one of the IDs belongs to them and find out when, and for how long, they were in proximity with the infected person. Since the IDs are anonymised, only the end user can associate them with their device.

The downside of this approach is its inability to map the infection geographically. Despite this, BLE is by far the most popular method. Applications using BLE for contact tracing include UK’s “NHS COVID-19”, Singapore’s TraceTogether and Australia’s COVIDSafe.

Centralised approach

Regarding data distribution, applications can be classified, again, into two groups: those using a centralised approach and those using a decentralised approach.

Most of the currently deployed applications are built on the centralised approach including UK’s NHS COVID-19, Singapore’s TraceTogether, and Australia’s COVIDSafe.

With a centralised approach, the contact events log is uploaded from the device to a central server. Even if the user uploads the data only once they are diagnosed with coronavirus, it is stored and processed at the central server rather than on users’ devices.

This gives the authorities more power to analyse contact data and gain more insight into the spread of the virus, but it also gives them access to private information about the general population, such as the locations of individuals, or who met whom and when.

Decentralised approach

This is a more privacy-centric approach, meaning that the contact events log never leaves the device, and only minimal information is uploaded to the central server.

The application periodically downloads keys of positive diagnosed users, and matches them against contact logs stored on the device.
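The on-device matching step can be sketched as follows. This is a simplified illustration: the HMAC-based derivation, the 16-byte identifier length and the slot numbering are assumptions standing in for whatever a given protocol specifies.

```python
import hashlib
import hmac


def rolling_ids(key: bytes, slots: range) -> set:
    """Derive one short identifier per time slot from a published key.

    HMAC-SHA256 truncated to 16 bytes is an assumed stand-in for the
    protocol-specific derivation function.
    """
    return {
        hmac.new(key, slot.to_bytes(4, "big"), hashlib.sha256).digest()[:16]
        for slot in slots
    }


def find_exposures(local_log: dict, published_keys: list, slots: range) -> list:
    """Return the local slots in which an identifier from a diagnosed user was heard.

    `local_log` maps each identifier heard over BLE to the slot in which it
    was heard. It never leaves the device; only `published_keys` are downloaded.
    """
    hits = []
    for key in published_keys:
        for rid in rolling_ids(key, slots):
            if rid in local_log:
                hits.append(local_log[rid])
    return sorted(hits)
```

Because the comparison runs locally, the server learns nothing about who was near whom. It only distributes the keys of users who chose to report a positive diagnosis.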

Such an approach is used in the DP-3T open protocol, as well as in the “Exposure Notification” specification designed jointly by Google and Apple. Holland’s PrivateTracer uses the DP-3T decentralised model, while applications adopting the Google|Apple approach are only now starting to be released.

User Anonymity

Another important point in preserving privacy is whether an application that is running on a device can be associated with the real user. In order to preserve user anonymity, no personal identifiers (phone number, name, IDs etc) should be associated with the application at any time. This is achieved by using cryptographic keys that change frequently and serve as user identifiers transmitted over the air (via Bluetooth or Internet connections).

Usually, an application receives a one-time random unique key during installation or registration, and that key is used to derive rotating cryptographic identifiers that are broadcast over Bluetooth and uploaded to servers. It is important to stress, however, that while preserving privacy is crucial, so is the reliability of the application.

To illustrate this point, let’s consider the following common use case of contact tracing applications.

One of the features of contact tracing applications is that a user may submit a diagnosis report, and in many cases, there is a self-diagnosis questionnaire where the user fills in the symptoms they are experiencing as well as other information.

When a user submits such a report, some applications do not perform any verification, while others enforce some kind of validation by requiring a phone number to send a verification code via SMS.

While verification by SMS de-anonymises users, it protects against fake reports. On the other hand, without verification, the whole system can be undermined by multiple fake reports, causing fake alerts and nationwide panic.

Several standards and frameworks have been developed that implement contact tracing features, with privacy and security in mind.

PACT

PACT (Private Automated Contact Tracing) is a collaboration project led by MIT. The protocol is based on BLE technology and takes a decentralised data approach:

  • Each device emits and receives random “chirps” over BLE
  • Random 256-bit ‘seeds’ are generated on the device each hour
  • The phone stores a 3-month log of the seeds
  • A chirp is derived from the current seed and timestamp every few seconds
  • Infected users will need to upload seeds to a central database (without any contact data). The upload is authorised by health authorities issuing a permission number.

In order to check whether a person was in contact with an infected individual, they must download an exposure database, and match it against locally stored chirp logs.
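The device-side bookkeeping described in the list above could be sketched roughly as below. The class and method names are hypothetical, the seed length is a parameter (32 bytes is only an illustrative default), and truncated HMAC-SHA256 is an assumed stand-in for the chirp derivation the protocol actually specifies.

```python
import hashlib
import hmac
import secrets

HOUR = 3600
RETENTION_S = 90 * 24 * HOUR  # roughly three months of seeds


class SeedLog:
    """Hourly seeds, pruned after the retention period (illustrative only)."""

    def __init__(self, seed_bytes=32):
        self.seed_bytes = seed_bytes
        self.seeds = {}  # start of hour (Unix seconds) -> random seed

    def seed_for(self, now: int) -> bytes:
        hour = now - now % HOUR
        if hour not in self.seeds:
            self.seeds[hour] = secrets.token_bytes(self.seed_bytes)
        # Forget seeds that have aged out of the retention window.
        cutoff = now - RETENTION_S
        self.seeds = {h: s for h, s in self.seeds.items() if h >= cutoff}
        return self.seeds[hour]

    def chirp(self, now: int) -> bytes:
        """Value broadcast at time `now`, derived from the current seed and timestamp."""
        seed = self.seed_for(now)
        return hmac.new(seed, now.to_bytes(8, "big"), hashlib.sha256).digest()[:16]

    def upload_payload(self, permission_number: str) -> dict:
        """What an infected user would send: seeds only, no contact data."""
        return {
            "permission_number": permission_number,
            "seeds": sorted(self.seeds.items()),
        }
```

Other users who download published seeds can recompute the chirps for the relevant hours and compare them with the chirps they logged locally, which is the matching step described above.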

DP-3T

DP-3T (Decentralised Privacy-Preserving Proximity Tracing) is an open source framework that is also based on a decentralised approach, and uses BLE technology for registering contact events.

Each device locally generates frequently-changing ephemeral identifiers (EphIDs) and broadcasts them via BLE beacons.

Infected users upload only their own EphIDs from the previous 14 days. All other users periodically download an exposure database, which does not contain any contact information, but rather a list of EphIDs of infected users.

The upload of the data by the infected user must be authorised by health authorities to prevent the possibility of abuse / fake reports.
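One way to generate frequently changing identifiers locally is a daily key that is advanced by hashing, with the day’s EphIDs derived from it. The sketch below is loosely modelled on that idea and should not be read as the framework’s actual construction: the function names, the 96 identifiers per day and the use of HMAC-SHA256 as the expansion step are all assumptions.

```python
import hashlib
import hmac

EPHID_LEN = 16  # bytes per ephemeral identifier (assumed)


def next_day_key(sk: bytes) -> bytes:
    """Advance the daily key with a one-way hash, so old keys cannot be recovered."""
    return hashlib.sha256(sk).digest()


def ephids_for_day(sk: bytes, per_day: int = 96) -> list:
    """Derive the day's EphIDs from the daily key (HMAC used as a stand-in PRF/PRG)."""
    prf = hmac.new(sk, b"broadcast key", hashlib.sha256).digest()
    return [
        hmac.new(prf, i.to_bytes(4, "big"), hashlib.sha256).digest()[:EPHID_LEN]
        for i in range(per_day)
    ]


def ephids_for_window(first_sk: bytes, days: int = 14, per_day: int = 96) -> set:
    """All EphIDs covered by a report starting at `first_sk` and spanning `days` days."""
    ids, sk = set(), first_sk
    for _ in range(days):
        ids.update(ephids_for_day(sk, per_day))
        sk = next_day_key(sk)
    return ids
```

With a chain like this, publishing the key for the first day of the 14-day window is enough for other devices to recompute every EphID in that window, while nothing broadcast before that day can be linked to the user.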

BlueTrace

BlueTrace is an open framework backed by the Singapore government, on which TraceTogether and Australia’s COVIDSafe are based. It uses BLE for contact tracing and adopts a centralised data approach.

When a user downloads and registers with the application for the first time, the back-end service generates a unique, randomised UserID and associates it with the user’s phone number. Phone numbers are used to contact users if they are found to have been exposed to an infected user. Contacts are registered by exchanging TempIDs over Bluetooth. A TempID comprises the UserID, a creation timestamp and an expiration timestamp, all of which are encrypted with a symmetric key. In addition, an initialisation vector and an authentication tag are transmitted without encryption. Only the health authority holds the key and can encrypt or decrypt TempIDs. It is also important to note that TempIDs have a lifespan of 15 minutes.

Because the device cannot create TempIDs itself, the application downloads batches of the user’s forward-dated TempIDs over the Internet. When a user becomes infected, they are asked by the health authority to upload their contact history via the application. This uploading of one’s contact history is authorised by a code provided by the health authority.
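The handling of forward-dated TempIDs on the device can be pictured with the sketch below. It is a simplification: the `TempID` tuple and function names are hypothetical, and the `blob` field is just random bytes standing in for the encrypted payload that only the health authority can produce.

```python
import bisect
import secrets
from typing import NamedTuple, Optional

LIFETIME_S = 15 * 60  # TempID lifespan of 15 minutes


class TempID(NamedTuple):
    start: int    # Unix seconds from which the TempID is valid
    expiry: int   # Unix seconds at which it stops being valid
    blob: bytes   # opaque placeholder for the encrypted UserID and timestamps


def issue_batch(first_start: int, count: int) -> list:
    """Server side: a run of back-to-back, forward-dated TempIDs."""
    return [
        TempID(
            first_start + i * LIFETIME_S,
            first_start + (i + 1) * LIFETIME_S,
            secrets.token_bytes(32),
        )
        for i in range(count)
    ]


def current_temp_id(batch: list, now: int) -> Optional[TempID]:
    """Device side: pick the TempID valid at `now`, or None if a new batch is needed."""
    starts = [t.start for t in batch]
    i = bisect.bisect_right(starts, now) - 1
    if i >= 0 and now < batch[i].expiry:
        return batch[i]
    return None
```

Downloading a batch in advance lets the phone keep broadcasting valid TempIDs while offline, at the cost of having to contact the server again once the batch runs out.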

Google|Apple Exposure Notification

This framework, developed jointly by Google and Apple, uses a decentralised approach and is based on BLE for registration of contact events. Each device generates cryptographic keys, which are changed at intervals of 15 minutes and advertised over Bluetooth to nearby devices to facilitate contact discovery. Devices store the contact IDs locally.

When a person tests positive for the coronavirus, they can voluntarily upload their diagnosis data (daily keys from the previous 14 days) to a central server. Contact data is not shared.

Each device periodically downloads diagnosis data from a server and matches it against the proximity IDs of all the devices it has come into contact with in the last 14 days. If a match is found, the user is notified and asked to get tested and/or self-isolate.

GPS usage is prohibited, and applications need a special permission to use the framework, which is granted only to certified governmental bodies.

The Exposure Notification framework has been released, and is available for iOS 13.5+ devices. Several applications based on the Google|Apple Exposure Notification API have already been released. SwissCovid is one of them, and is currently being piloted in Switzerland. Other examples are Italy’s Immuni application and Latvia’s Apturi.

Possible App Security Issues

So much for the different approaches and frameworks which aim to preserve privacy. What about the apps’ security? Here, there are several points that need to be addressed by the developers of the applications.

Device traceability

When Bluetooth technology is used for contact tracing, devices broadcast packets frequently over the air. These packets contain unique/cryptographic IDs to facilitate registration of a contact by other devices.

It should not be possible for someone listening to these broadcasts to correlate IDs and devices. As a built-in feature of BLE, the MAC address of the sent packets is randomised periodically to protect against device tracking.

But, in order to comply with BLE specifications, the broadcast cryptographic identifiers should be rotated at the same intervals as the BLE MAC addresses. Otherwise, an observer could use whichever value has not yet changed to link the old and the new one, making it possible to trace devices beyond the time interval between rotations of the cryptographic IDs.
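The effect of unsynchronised rotation can be demonstrated with a small simulation. Everything here is illustrative: `broadcast` fabricates what a passive listener would capture from a single device, and `linked_groups` merges captures that share either a MAC address or an identifier.

```python
def broadcast(n_ticks, mac_period, id_period, id_offset=0):
    """Simulated (mac, identifier) pairs sent by one device, one per tick."""
    return [
        (f"mac{t // mac_period}", f"id{(t + id_offset) // id_period}")
        for t in range(n_ticks)
    ]


def linked_groups(observations):
    """Group observations that an eavesdropper can tie together.

    Two captures are linked if they share a MAC address or an identifier;
    links are transitive (tracked with a small union-find structure).
    """
    parent = {}

    def find(x):
        while parent.setdefault(x, x) != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for mac, pid in observations:
        parent[find(("mac", mac))] = find(("id", pid))

    groups = {}
    for mac, pid in observations:
        groups.setdefault(find(("id", pid)), []).append((mac, pid))
    return list(groups.values())


# Rotating both values together leaves four unlinkable segments,
# while a 7-tick offset lets the listener chain everything into one track.
in_sync = linked_groups(broadcast(60, 15, 15))
out_of_sync = linked_groups(broadcast(60, 15, 15, id_offset=7))
```

In the synchronised case the listener sees four unrelated segments. In the offset case each rotation overlaps with an unchanged value, so the whole hour collapses into a single trackable device.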

Local Storage

Naturally, applications store contact logs, encryption keys and other sensitive data on devices. Sensitive data should be encrypted and stored in the application sandbox and not in shared locations. Even within the sandbox, an attacker who gains root privileges or physical access to the device could compromise the data, all the more so if sensitive information such as GPS locations is stored.

Fake reports

It is important that applications perform authentication when information is submitted to their servers, such as when a user posts their diagnosis and contact logs. Without proper authorisation in place, it could be possible to flood the servers with fake reports and undermine the reliability of the whole system.
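One common way to gate uploads is a single-use code issued by the health authority and checked by the server before a report is accepted. The sketch below shows the idea with an HMAC-tagged nonce. The class, the code format and the in-memory bookkeeping are assumptions for illustration and do not describe any particular application.

```python
import hashlib
import hmac
import secrets


class HealthAuthority:
    """Issues and verifies single-use upload codes (illustrative only)."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # secret known only to the authority
        self._used = set()                   # nonces that were already redeemed

    def _tag(self, nonce: str) -> str:
        return hmac.new(self._key, nonce.encode(), hashlib.sha256).hexdigest()[:16]

    def issue_code(self) -> str:
        """Code handed to a patient together with a positive test result."""
        nonce = secrets.token_hex(8)
        return f"{nonce}.{self._tag(nonce)}"

    def accept_upload(self, code: str) -> bool:
        """Accept a report only if the code is genuine and has not been used."""
        nonce, _, tag = code.partition(".")
        genuine = hmac.compare_digest(tag.encode(), self._tag(nonce).encode())
        if not genuine or nonce in self._used:
            return False
        self._used.add(nonce)
        return True
```

A scheme along these lines blocks forged and replayed reports without asking for a phone number, although the authority that hands out the code still knows which patient received it.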

Encrypted Communications

To avoid the possibility of man-in-the-middle attacks and the interception of the application’s traffic, all communications with the application’s back-end server should be encrypted. Certificate pinning should also be considered to prevent malicious SSL proxies from intercepting connections.
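The idea behind certificate pinning is to compare the certificate presented by the server with a fingerprint shipped inside the application. Mobile apps would normally rely on their platform’s own pinning facilities. The Python sketch below only illustrates the check itself, and the function names and the choice to pin the whole leaf certificate are assumptions.

```python
import hashlib
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def is_pinned(der_cert: bytes, pinned: set) -> bool:
    """True if the certificate matches one of the fingerprints shipped with the app."""
    return fingerprint(der_cert) in pinned


def fetch_and_check(host: str, pinned: set, port: int = 443, timeout: float = 5.0) -> bool:
    """Connect with normal chain validation, then additionally enforce the pin."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            der = tls.getpeercert(binary_form=True)
    return is_pinned(der, pinned)
```

With a pin in place, a proxy presenting a certificate signed by a locally installed root would pass ordinary validation but fail the fingerprint comparison. The trade-off is that pins must be updated whenever the server certificate is renewed.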

When we look at the adoption rates of coronavirus contact tracing applications in different countries, India’s Aarogya Setu leads the way with more than 100 million downloads from Google Play Store. This is largely because public and private workers in India are required to use it.

Gerak Malaysia has more than a million downloads from Google Play Store, while Singapore’s TraceTogether and Australia’s COVIDSafe each have over 500,000 downloads.

In Europe, UK’s NHS COVID-19 has yet to be deployed across the country, but is currently being piloted on the Isle of Wight. It currently has more than 50,000 downloads. Austria’s Stopp Corona has been downloaded more than 100,000 times, as has Norway’s Smittestopp.

Germany and France have yet to release an application, but there are plans to do so soon.

It looks like coronavirus contact tracing applications are here to stay. But in order for them to be successful, it is essential that people have full trust that their privacy is being preserved and their data is protected from misuse. Given the abundance of frameworks and protocols that have prioritised privacy and security, and the fact that many official applications have their sources published, it looks like things are going in the right direction. With the recent release of the Google|Apple Exposure Notification framework, we expect more applications based on this framework to be released, as well as some existing applications shifting to this approach. However, it is still up to the developers of the applications to comply with standards by implementing them in a secure manner. We strongly recommend that government agencies rely on sound protocols such as those mentioned above and publish the source code of their apps in order to increase user confidence and acceptance.

As multiple fake apps have already been detected during the pandemic, our recommendation for end users is to only install coronavirus contact-tracing applications from official app stores, since they only allow authorised government agencies to publish such apps. In addition, we recommend that users download and install a mobile security solution to scan applications and protect the device against malware, as well as verify that the device has not been compromised. We will continue researching applications and frameworks, and publish any new or emerging issues we find.

(Ed. Author Jonathan Shimonovich says he has more than 12 years of R&D experience ranging from embedded to web services, big data and cybersecurity. Shimonovich’s professional experience includes six years in the Israeli Defence Force’s elite military technological unit, where he says he served as a captain. Shimonovich holds a B.Sc summa cum laude and an M.Sc in Electrical Engineering from Technion, Israel Institute of Technology, Haifa. Author Oleg Ilushin says he has more than 10 years’ experience in reverse engineering and vulnerability research of network protocols and mobile applications. His current research focuses on vulnerability assessment of the Android and iOS platforms. Oleg holds an M.Sc in Applied Mathematics from Technion, Israel Institute of Technology.)
