A systematic literature review of open data quality in practice

Il contenuto (Full text) non è disponibile all'interno di questo archivio. Spedisci una richiesta all'autore per una copia del documento
Tipo di pubblicazione: Articolo in atti di convegno
Tipologia MIUR: Contributo in Atti di Convegno (Proceeding) > Contributo in atti di convegno
Titolo: A systematic literature review of open data quality in practice
Autori: Rashid, Mohammad; Torchiano, Marco
Autori di ateneo:
Tipo di referee: Comitato scientifico
Titolo del convegno: Open Data Research Symposium
Luogo dell'evento: Madrid (Spagna)
Data dell'evento: October 5
Abstract: Context: The main objective of open data initiatives is to make information freely available through easily accessible mechanisms and facilitate exploitation. In practice openness should be accompanied with a certain level of trustwor- thiness or guarantees about the quality of data. Traditional data quality is a thoroughly researched field with several benchmarks and frameworks to grasp its dimensions. However, quality assessment in open data is a complicated process as it consists of stakeholders, evaluation of datasets as well as the publishing platform. Objective: In this work, we aim to identify and synthesize various features of open data quality approaches in practice. We applied thematic synthesis to identify the most relevant research problems and quality assessment methodologies. Method: We undertook a systematic literature review to summarize the state of the art on open data quality. The review process starts by developing the review protocol in which all steps, research questions, inclusion and exclusion criteria and analysis procedures are included. The search strategy retrieved 9323 publications from four scientific digital libraries. The selected papers were published between 2005 and 2015. Finally, through a discussion between the authors, 63 paper were included in the final set of selected papers. Results: Open data quality, in general, is a broad concept, and it could apply to multiple areas. There are many quality issues concerning open data hindering their actual usage for real-world applications. The main ones are unstruc- tured metadata, heterogeneity of data formats, lack of accuracy, incompleteness and lack of validation techniques. Furthermore, we collected the existing quality methodologies from selected papers and synthesized under a unifying classification schema. Also, a list of quality dimensions and metrics from selected paper is reported. Conclusion: In this research, we provided an overview of the methods related to open data quality, using the instru- ment of systematic literature reviews. Open data quality methodologies vary depending on the application domain. Moreover, the majority of studies focus on satisfying specific quality criteria. With metrics based on generalized data attributes a platform can be created to evaluate all possible open dataset. Also, the lack of methodology validation remains a major problem. Studies should focus on validation techniques
Status: In stampa
Lingua della pubblicazione: Inglese
Parole chiave: systematicreview, opendata, dataquality
Dipartimenti (originale): DAUIN - Dipartimento di Automatica Informatica
Dipartimenti: DAUIN - Dipartimento di Automatica e Informatica
URL correlate:
    Area disciplinare: Area 09 - Ingegneria industriale e dell'informazione > SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI
    Data di deposito: 14 Set 2016 17:13
    Data ultima modifica (IRIS): 14 Set 2016 15:13:50
    Data inserimento (PORTO): 16 Set 2016 04:52
    Permalink: http://porto.polito.it/id/eprint/2648966
    Link resolver URL: Link resolver link

    Azioni (richiesto il login)

    Visualizza il documento (riservato amministratori) Visualizza il documento (riservato amministratori)