Users’ Fingerprinting Techniques from TCP Traffic / Vassio, Luca; Giordano, Danilo; Trevisan, Martino; Mellia, Marco; Couto da Silva, Ana Paula. - Electronic. - (2017). (Paper presented at the ACM SIGCOMM Workshop on Big Data Analytics and Machine Learning for Data Communication Networks, held in Los Angeles, California, USA, August 21-25, 2017) [10.1145/3098593.3098602].

Users’ Fingerprinting Techniques from TCP Traffic

Vassio, Luca; Giordano, Danilo; Trevisan, Martino; Mellia, Marco; Couto da Silva, Ana Paula
2017

Abstract

Encryption at the application layer is often promoted as a way to protect privacy, i.e., to prevent someone in the network from observing users’ communications. In this work we explore how to build a profile of a target user by observing only the names of the services contacted during browsing, names that are still unencrypted and easily accessible to passive probes. Is it possible to uniquely identify a target user within a large population that accesses the same network? To verify whether and how this is possible, we propose and compare three methodologies for computing similarities between users’ profiles. Using real data collected in networks, we evaluate their performance and discuss the impact of the quality of the data being used. To this end, we propose a machine learning methodology to extract the services intentionally requested by users, which turn out to be important for profiling. Results show that the classification problem can be solved with good accuracy (up to 94%), provided some ingenuity is used to build the model.

Files in this item:
File: user-fingerprint (1).pdf
Access: open access
Description: Camera Ready
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 728.79 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2674705