Monday, February 6, 2012

Microsoft researchers say anonymized data isn't so anonymous

"Seemingly innocuous Web log data can identify the activity on individual machines and threaten online privacy

"Data routinely gathered in Web logs -- IP address, cookie ID, operating system, browser type, user-agent strings -- can threaten online privacy because they can be used to identify the activity of individual machines, Microsoft researchers say.

"At the same time, analysis of such data when anonymized can help detect malicious activity and so improve overall Internet security, they add.

"The researchers found that 62 percent of the time, HTTP user-agent information alone can accurately tag a host. Combine that same information with the IP address, and the accuracy jumps to 80.6 percent. If the user-agent information is combined with just the IP prefix the accuracy is still 79.3 percent, they say.

"The highest accuracy came when more than one user ID was linked to a single host, as would be the case in a family that shares a single computer. In such cases, multiple IDs would accurately represent that one host computer. The accuracy rate was 92.8 percent...

"They found that service providers can recognize 88 percent of devices that receive a cookie, clear the cookie, then return to the site, if they examine other identifying factors they gathered during the initial connection. Even if they use private browsing mode, which is designed to protect user identity, they can still be identified, the researchers say.

"Our analysis suggests that users who do not wish to be tracked should do much more than clear cookies," the researchers say, and note that in some circumstances clearing cookies can help identify a particular host. "Uncommon behaviors such as clearing cookies for each request may instead distinguish a host from others who do not do so."...

By Tim Greene | InfoWorld
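
To make the figures quoted above more concrete, here is a minimal sketch of how a user-agent string and an IP prefix could be combined into a host key, so that a returning machine is recognized even after it clears its cookie. This is an illustration only and not the researchers' method: the function names, the /24 prefix length, the hashing step, and the sample records are all assumptions made for this example.

```python
import hashlib
import ipaddress
from collections import defaultdict


def ip_prefix(ip: str, prefix_len: int = 24) -> str:
    """Return the network prefix of an IPv4 address (a /24 is an assumed choice)."""
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network)


def fingerprint(user_agent: str, ip: str) -> str:
    """Hypothetical host key: hash of the user-agent string plus the IP prefix."""
    key = f"{user_agent}|{ip_prefix(ip)}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def group_by_fingerprint(records):
    """Map each fingerprint to the set of cookie IDs seen with it."""
    groups = defaultdict(set)
    for rec in records:
        groups[fingerprint(rec["user_agent"], rec["ip"])].add(rec["cookie_id"])
    return dict(groups)


# Made-up log records: the first two share a user-agent and a /24 prefix
# but carry different cookie IDs, as if the cookie had been cleared.
records = [
    {"user_agent": "ExampleBrowser/1.0 (ExampleOS 5.1)", "ip": "203.0.113.7", "cookie_id": "cookie-a"},
    {"user_agent": "ExampleBrowser/1.0 (ExampleOS 5.1)", "ip": "203.0.113.200", "cookie_id": "cookie-b"},
    {"user_agent": "OtherBrowser/9.2 (OtherOS 10.6)", "ip": "198.51.100.7", "cookie_id": "cookie-c"},
]

for fp, cookies in group_by_fingerprint(records).items():
    print(fp, sorted(cookies))
```

In this sketch the first two records land under the same key despite their different cookie IDs, which is the sense in which clearing cookies alone does not prevent a host from being linked across visits.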
