Semi-automatic knowledge extraction to enrich open linked data / Baralis, Elena Maria; Bruno, Giulia; Cerquitelli, Tania; Chiusano, Silvia Anna; Fiori, Alessandro; Grand, Alberto. - In: Cases on Open-Linked Data and Semantic Web Applications / Patricia Ordoñez de Pablos, Miltiadis D. Lytras, Robert Tennyson, Jose Emilio Labra Gayo. - Print. - [s.l.] : IGI Global, 2013. - ISBN 9781466628274. - pp. 156-180 [10.4018/978-1-4666-2827-4.ch008]

Semi-automatic knowledge extraction to enrich open linked data

Baralis, Elena Maria; Bruno, Giulia; Cerquitelli, Tania; Chiusano, Silvia Anna; Fiori, Alessandro; Grand, Alberto
2013

Abstract

In this chapter we present an analysis of the Wikipedia collection carried out with the ELiDa framework, with the aim of enriching linked data. ELiDa is based on association rule mining, an exploratory technique that discovers relevant correlations hidden in the analyzed data. A persistent structure is used to store the large volume of extracted knowledge compactly and to retrieve it efficiently for further analysis. The domain expert is in charge of selecting the relevant knowledge by setting filtering parameters, assessing the quality of the extracted knowledge, and enriching it with semantic expressiveness that cannot be inferred automatically. As representative document collections, we consider seven datasets extracted from the Wikipedia collection. Each dataset has been analyzed from two points of view (i.e., transactions by documents and transactions by sentences) to highlight relevant knowledge at different levels of abstraction.
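
As a rough illustration of the two transaction views mentioned in the abstract, the following Python sketch builds one transaction per document and one per sentence, then derives simple one-to-one association rules filtered by support and confidence. It is not the ELiDa implementation: the function names, the naive tokenization and sentence splitting, the thresholds, and the sample texts are all assumptions made for illustration.

```python
import re
from itertools import combinations


def transactions_by_document(docs):
    """One transaction per document: the set of distinct lowercase words."""
    return [set(re.findall(r"[a-z]+", d.lower())) for d in docs]


def transactions_by_sentence(docs):
    """One transaction per sentence, splitting naively on '.', '!' and '?'."""
    transactions = []
    for d in docs:
        for sentence in re.split(r"[.!?]+", d):
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            if words:  # skip empty fragments left by the split
                transactions.append(words)
    return transactions


def pair_rules(transactions, min_support, min_confidence):
    """Return rules (x, y, support, confidence) meaning x -> y.

    support    = fraction of transactions containing both x and y
    confidence = support(x, y) / support(x)
    Only single-item antecedents and consequents are considered.
    """
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for t in transactions:
        for item in t:
            item_count[item] = item_count.get(item, 0) + 1
        for a, b in combinations(sorted(t), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


# Hypothetical sample texts, not taken from the Wikipedia datasets.
docs = [
    "Turin is in Italy. Turin hosts a university.",
    "Rome is in Italy. Rome is a capital.",
]
by_doc = transactions_by_document(docs)
by_sent = transactions_by_sentence(docs)
rules_doc = pair_rules(by_doc, min_support=1.0, min_confidence=1.0)
rules_sent = pair_rules(by_sent, min_support=0.5, min_confidence=1.0)
```

The same texts yield different rules depending on the granularity: at the document level only words shared by every document survive the thresholds, while at the sentence level a rule such as `rome -> is` emerges from co-occurrence within individual sentences. In this sketch the thresholds stand in for the filtering parameters that the abstract assigns to the domain expert.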
Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2502974