Views on Personal Data Tracking by Internet Service Providers From Georgia Tech Law and Ethics Professor Peter Swire

With the Federal Communications Commission moving to release regulations on how Internet service providers (ISPs) collect the personal information of customers, understanding the technology utilized by ISPs to track information has become an increasingly important policy issue.

Bloomberg BNA Privacy & Data Security News Senior Legal Editor George R. Lynch posed a series of questions to Peter Swire, Georgia Institute of Technology Nancy J. and Lawrence P. Huang Professor of Law and Ethics and senior counsel at Alston & Bird LLP, on a study he recently completed—“Online Data Privacy and ISPs”—about the ability of ISPs to access consumer data from users. Swire served as chief privacy counselor for the Clinton administration and on President Barack Obama’s Review Group on Intelligence and Communications Technology. He was co-chairman of the global Do Not Track process for the World Wide Web Consortium.

Bloomberg BNA:

Would you briefly explain the report's conclusion that Internet service providers (ISPs) have neither comprehensive nor unique access to user data?

Peter Swire:

The Working Paper focuses on how technical developments have limited ISP access to data, which makes that access less than comprehensive. The Paper also discusses how other companies often have access to more information and a wider range of user information than ISPs, which makes the data ISPs can access less than unique as well.

The recent shift to prevalent encryption is the biggest reason that ISPs today have less than comprehensive visibility into user Internet activity. According to public data about Internet backbone activity, HTTPS, the encrypted version of the standard Internet protocol, accounted for only 13 percent of traffic in April 2014, rising to 49 percent in February 2016, and is expected to reach 70 percent by the end of 2016. Today, all of the top ten Internet sites encrypt by default or when a user logs in, as do 42 of the top 50 sites. For HTTPS, ISPs are blocked (even if they try) from seeing the content that a user accesses, as well as the detailed URLs that give commercially important clues into user activity. ISPs can see the host names that a user visits, such as www.example.com, but no deeper. (They also can see other information not historically used for advertising purposes, such as length of session and number of bits transferred.)
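
As an editorial illustration of the point above, the following minimal Python sketch models which parts of a URL an on-path observer such as an ISP could read. It is a simplification under stated assumptions, not a description of any ISP's actual systems: the function name `visible_to_network_observer` and the example URLs are hypothetical, and the model treats the host name as always visible and the path and query as hidden whenever the scheme is HTTPS.

```python
from urllib.parse import urlsplit


def visible_to_network_observer(url):
    """Return the URL parts an on-path observer could read (simplified model).

    Assumption: under HTTPS only the host name is exposed; under plain HTTP
    the path and query string are exposed as well.
    """
    parts = urlsplit(url)
    visible = {"host": parts.hostname}
    if parts.scheme == "https":
        # Encrypted: the detailed URL is hidden from the network.
        visible["path"] = None
        visible["query"] = None
    else:
        # Unencrypted: the detailed URL travels in the clear.
        visible["path"] = parts.path
        visible["query"] = parts.query
    return visible


if __name__ == "__main__":
    print(visible_to_network_observer("http://www.example.com/shoes?q=white+sneakers"))
    print(visible_to_network_observer("https://www.example.com/shoes?q=white+sneakers"))
```

In this model the HTTPS request reveals only `www.example.com`, while the HTTP request also reveals the path and the query string that carry the commercially useful detail.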

At the same time that technological and marketplace developments are reducing the online visibility of ISPs, non-ISPs are increasingly gathering commercially valuable information about online user activity from multiple contexts, such as social networks, search engines, webmail and messaging, operating systems, mobile apps, interest-based advertising, browsers, Internet video and e-commerce. ISPs aren't market leaders in any of these major areas; rather, they're just starting to compete in some of them.

Similarly, non-ISPs currently dominate in both cross-device and cross-context tracking. In short, the most commercially valuable information about online users, which can be used for targeted advertising and other purposes, is coming from other contexts such as social networks and search.

Bloomberg BNA:

In addition to how the use of HTTPS prevents ISPs from seeing a significant amount of Web activity, are there other contexts where its use might impact personal data collection?

Swire:

Yes. HTTPS creates technical separation between different contexts. The Paper describes these technical separations in detail for ISPs, but search provides another useful example. Encrypted search means that the websites referred to by a search engine don't see the detailed search query, but rather just see that a user is arriving from a particular search engine. In the old days, the user's HTTP referrer information from the search engine would have told a visited website that the user searched for “Women's white sneakers.” If the user today performs an encrypted search, however, that site only sees that the user came from that particular search engine, so the details of the search are masked.
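
The search example can be sketched in a few lines of Python. This is an editorial illustration under assumptions, not how any particular search engine or browser is implemented: the host `search.example`, the `q` parameter name, and both function names are hypothetical, and the encrypted case is modeled simply as the destination site receiving only the search engine's origin.

```python
from urllib.parse import urlsplit, parse_qs


def referrer_seen_by_site(search_url, encrypted_search):
    """Model the referrer a destination site receives after a search click."""
    parts = urlsplit(search_url)
    if encrypted_search:
        # Assumption: only the search engine's origin is passed along.
        return f"{parts.scheme}://{parts.hostname}/"
    # Unencrypted search: the full results URL, query included, is passed along.
    return search_url


def search_terms_from_referrer(referrer):
    """Extract the search terms from a referrer URL, or None if absent."""
    query = parse_qs(urlsplit(referrer).query)
    return query.get("q", [None])[0]


if __name__ == "__main__":
    old = referrer_seen_by_site(
        "http://search.example/results?q=women%27s+white+sneakers", False
    )
    new = referrer_seen_by_site(
        "https://search.example/results?q=women%27s+white+sneakers", True
    )
    print(search_terms_from_referrer(old))
    print(search_terms_from_referrer(new))
```

With the unencrypted referrer the visited site can recover the search terms; with the encrypted search it learns only which search engine the user came from.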

There are also many security risks for data that's not encrypted, particularly in the Internet of things context. As more devices are created with Internet connectivity, each provides a potential new source of data across networks. Those devices can be designed to encrypt the data they send while it's in transit, both protecting it from malicious activity and ensuring that the data is available only to the parties intended to receive it. Conversely, devices that don't encrypt traffic create security risks for users, including the interception of personal data without their knowledge or consent.

The use of proxy services can also impact how personal user data is collected. Google's “Data Saver” proxy service, for example, acts as an intermediary between the user's browser and website destination by performing DNS requests, retrieving Internet content on behalf of the user, and encapsulating all Internet content inside of an encrypted tunnel. Data Saver aggregates and compresses the various requests that complicate and enlarge the size of Internet content so that users' devices don't have to individually request each feature. In 2013, Facebook purchased Onavo, which offers a similar compression/proxy service for mobile devices. These sorts of changes, if broadly adopted, will similarly limit which parties can access user data outside of these services.

Bloomberg BNA:

Are there other actors across the Internet, such as social networks, operating systems and search engines, that actually collect a greater amount of user data, and if so, which among them is most significant?

Swire:

The Paper doesn't make quantitative judgments about which companies collect the most data, but we do note that a number of companies, including Apple Inc., Facebook Inc., Google Inc., Microsoft Corp. and Yahoo! Inc., are very active in multiple contexts and lead in more market contexts than many of the traditional ISPs. While any one of these services or platforms can provide volumes of data about users, often with insights into content, the real insights come from combining information from multiple services and platforms. This is what we call “cross-context tracking,” whether linked to a particular user device or across devices.

Additionally, while the Working Paper doesn't discuss data brokers in depth, it is important to recognize that companies that don't collect data in a given context may still be able to supplement their databases with data purchased from data brokers. The Working Paper doesn't attempt to quantify how much data any individual company actually collects or uses, but rather seeks to explain what types of data are available in each context discussed, and how that data can be useful for generating advertising revenue.

Bloomberg BNA:

Would you explain a bit the term “cross-context tracking” and how it may or may not work in conjunction with cross-device tracking to gather personal information?

Swire:

Cross-context tracking can occur on a single device, such as a laptop that someone uses for multiple activities. But cross-context tracking can also occur across devices, for instance when a user's social network activity on one device is linked to search or other activity on another device. A user might rely heavily on one favorite device, such as a laptop or smartphone, and cross-context tracking would then show the powerful insights that come from combining the different uses of that single device. If the user also has activity elsewhere, such as using a social network on a phone instead of the laptop, linking it reveals more data.

We highlight cross-context in the Working Paper because the emphasis to date has been on cross-device tracking, which becomes more and more important as users move to multiple devices. The Paper is new in highlighting how effective insights into users and possible targeting of ads can be even on the same device by combining data across different contexts. The two techniques can also work in conjunction. For example, a user may do some online shopping on their laptop at home, and also do online shopping on their mobile device while at different stores. Cross-device tracking would make it possible for one service to know it was the same user making purchases from each device. Cross-context tracking would additionally let a company that provides both the e-commerce context and the search context know what kind of searches the user performed and which links they clicked before ultimately making a purchase.
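
The distinction between the two techniques can be illustrated with a toy Python sketch. Everything here is hypothetical and editorial: the event records, the field names (`account`, `device`, `context`, `detail`) and the two helper functions are invented for illustration, and the sketch assumes that activity has already been tied to a single account, which is the hard part in practice.

```python
from collections import defaultdict

# Hypothetical activity log, already attributed to one account.
events = [
    {"account": "u1", "device": "laptop", "context": "search", "detail": "white sneakers"},
    {"account": "u1", "device": "laptop", "context": "e-commerce", "detail": "bought sneakers"},
    {"account": "u1", "device": "phone", "context": "e-commerce", "detail": "bought socks"},
]


def cross_device(events, context):
    """Link one context's activity across all of an account's devices."""
    linked = defaultdict(list)
    for event in events:
        if event["context"] == context:
            linked[event["account"]].append((event["device"], event["detail"]))
    return dict(linked)


def cross_context(events, device):
    """Link activity from different contexts on a single device."""
    linked = defaultdict(list)
    for event in events:
        if event["device"] == device:
            linked[event["account"]].append((event["context"], event["detail"]))
    return dict(linked)


if __name__ == "__main__":
    print(cross_device(events, "e-commerce"))
    print(cross_context(events, "laptop"))
```

The first call shows the same shopper buying on both laptop and phone; the second shows the search that preceded the laptop purchase, which is the additional insight that combining contexts provides.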

Bloomberg BNA:

You've had an interesting career in privacy and security that has ranged from privacy advisor to President Clinton to public interest advocate to law firm counsel to professor. What advice do you have for young attorneys just starting out in the privacy space?

Swire:

Young attorneys in the privacy and cybersecurity space need to learn the technical side and not just the law. A comparison is the way that antitrust attorneys over the years had to learn to be economists as well. Privacy and cybersecurity are examples of fields where the most convincing arguments are often based on what is technically feasible or not. That doesn't mean you need to write code yourself, but it does mean you should try to learn the material well enough that you can converse intelligently with people who do write code. This type of technical fluency will make you better able to serve and advise your clients, and to better understand the factual nuances of evolving privacy issues.

One of the things I love about the field is how you have to keep re-educating yourself constantly. Ten years ago, social networks were getting off the ground. Today it is big data, drones and the Internet of things. Moore's Law is driving computational capacity that also applies to personal information. So the privacy field is fascinating because you have to keep learning what technology is beginning to do today and will do at scale tomorrow.