Nowadays, large volumes of data and measurements are being continuously generated by computer and telecommunication networks, but such volumes make it difficult to extract meaningful knowledge from them. This paper presents SaFe-NeC, an innovative methodology for analyzing network traffic by exploiting data mining techniques, i.e. clustering and classification algorithms, focusing on self-learning capabilities of state-of-the-art scalable approaches. Self-learning algorithms, coupled with self-assessment indicators and domain-driven semantics enriching data mining results, are able to build a model of the data with minimal user intervention and highlight possibly meaningful interpretations to domain experts. Furthermore, a self-evolving model evaluation phase is included to continuously track the quality degradation of the model itself, whose rebuilding is triggered as soon as quality indicators fall below a threshold of tolerance. The proposed methodology can exploit the computational advantages of distributed computing frameworks, as the current implementation runs on Apache Spark. Preliminary experimental results on a real traffic dataset show the full potential of the proposed methodology to characterize network traffic data.

SaFe-NeC: A scalable and flexible system for network data characterization / Apiletti, Daniele; Baralis, ELENA MARIA; Cerquitelli, Tania; Garza, Paolo; Venturini, Luca. - ELETTRONICO. - (2016), pp. 812-816. (Intervento presentato al convegno 2016 IEEE/IFIP Network Operations and Management Symposium, NOMS 2016 tenutosi a Istanbul (Turkey) nel 2016) [10.1109/NOMS.2016.7502905].

SaFe-NeC: A scalable and flexible system for network data characterization

APILETTI, DANIELE;BARALIS, ELENA MARIA;CERQUITELLI, TANIA;GARZA, PAOLO;VENTURINI, LUCA
2016

Abstract

Nowadays, large volumes of data and measurements are being continuously generated by computer and telecommunication networks, but such volumes make it difficult to extract meaningful knowledge from them. This paper presents SaFe-NeC, an innovative methodology for analyzing network traffic by exploiting data mining techniques, i.e. clustering and classification algorithms, focusing on self-learning capabilities of state-of-the-art scalable approaches. Self-learning algorithms, coupled with self-assessment indicators and domain-driven semantics enriching data mining results, are able to build a model of the data with minimal user intervention and highlight possibly meaningful interpretations to domain experts. Furthermore, a self-evolving model evaluation phase is included to continuously track the quality degradation of the model itself, whose rebuilding is triggered as soon as quality indicators fall below a threshold of tolerance. The proposed methodology can exploit the computational advantages of distributed computing frameworks, as the current implementation runs on Apache Spark. Preliminary experimental results on a real traffic dataset show the full potential of the proposed methodology to characterize network traffic data.
2016
9781509002238
9781509002238
File in questo prodotto:
File Dimensione Formato  
safenec.pdf

accesso aperto

Descrizione: postprint draft
Tipologia: 2. Post-print / Author's Accepted Manuscript
Licenza: PUBBLICO - Tutti i diritti riservati
Dimensione 110.12 kB
Formato Adobe PDF
110.12 kB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11583/2650989
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo