PaMPa-HD: A Parallel MapReduce-Based Frequent Pattern Miner for High-Dimensional Data / Apiletti, Daniele; Baralis, Elena Maria; Cerquitelli, Tania; Garza, Paolo; Michiardi, Pietro; Pulvirenti, Fabio. - (2015), pp. 839-846. (Paper presented at The 3rd International Workshop on High Dimensional Data Mining (HDM’15), held in Atlantic City, NJ, USA, on 14 November 2015) [10.1109/ICDMW.2015.18].

PaMPa-HD: A Parallel MapReduce-Based Frequent Pattern Miner for High-Dimensional Data

Apiletti, Daniele; Baralis, Elena Maria; Cerquitelli, Tania; Garza, Paolo; Michiardi, Pietro; Pulvirenti, Fabio
2015

Abstract

Frequent closed itemset mining is among the most complex exploratory techniques in data mining and makes it possible to discover hidden correlations in transactional datasets. The explosion of Big Data is driving the development of new parallel and distributed approaches. Unfortunately, most of them are designed to cope with low-dimensional datasets, and no distributed frequent closed itemset mining algorithm exists for high-dimensional datasets. This work introduces PaMPa-HD, a parallel MapReduce-based frequent closed itemset mining algorithm for high-dimensional datasets, based on Carpenter. Experiments performed on both real and synthetic datasets show the efficiency and scalability of PaMPa-HD.
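
To make the central notion concrete: an itemset is frequent when its support (the number of transactions containing it) reaches a minimum threshold, and closed when no proper superset has the same support. The sketch below is a brute-force illustration of that definition only; it is not PaMPa-HD, Carpenter, or a MapReduce job. The function name `frequent_closed_itemsets`, the toy transactions, and the threshold are assumptions made for the example. It relies on the fact that every closed itemset equals the intersection of the transactions that contain it, so intersecting every group of at least `min_support` transactions yields exactly the frequent closed itemsets.

```python
from itertools import combinations


def frequent_closed_itemsets(transactions, min_support):
    """Return {itemset: support} for all non-empty frequent closed itemsets.

    Brute force over groups of transactions (exponential in the number of
    rows); meant only to illustrate the definition on tiny inputs.
    """
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    rows = [frozenset(t) for t in transactions]
    closed = {}
    for k in range(min_support, len(rows) + 1):
        for group in combinations(rows, k):
            # Items shared by every transaction in this group.
            common = frozenset.intersection(*group)
            if common:
                # Support counts all transactions containing the itemset,
                # which may be more than the k rows in this group.
                closed[common] = sum(1 for r in rows if common <= r)
    return closed


# Hypothetical toy dataset: four transactions over items a, b, c, d.
example = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}]
result = frequent_closed_itemsets(example, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

In this toy dataset the itemset {a} appears in three transactions but is not reported, because {a, b} appears in the same three transactions; only {a, b}, {c}, and {d} are both frequent and closed.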
ISBN: 978-1-4673-8493-3
Files in this item:
Pampa_Draft.pdf (open access)
Description: Draft
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 520.75 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2639879