List of Publications Related to the S-PIC4CHU Project

Each publication is categorized by its associated Work Package (WP).

WP1 Models and Architectures for DPPs and STPs

1.1. Semantic architecture and quality measures

Criscuolo, C., Salnitri, M., & Martinenghi, D. (2025). FAIR-CARE: A comparative evaluation of unfairness mitigation approaches. Information and Software Technology, 107898. - Link

Quintarelli, E., Schreiber, F. A., Stefanidis, K., Tanca, L., & Oliboni, B. (2025). A Conceptual Model for Context Awareness in Ethical Data Management. arXiv preprint arXiv:2511.21942. - Link

Pano, A., Lanti, D., & Calvanese, D. (2025, October). Virtual Knowledge Graphs over Earth Observation Data. In International Semantic Web Conference (pp. 149-166). Cham: Springer Nature Switzerland. - Link

1.2. Building blocks for data preparation

Chapman, A., Lauro, L., Missier, P., & Torlone, R. (2024). Supporting better insights of data science pipelines with fine-grained provenance. ACM Transactions on Database Systems, 49(2), 1-42. - Link

Lazzaro, P. L., Lazzaro, M., Missier, P., & Torlone, R. (2025). PROLIT: Supporting the Transparency of Data Preparation Pipelines through Narratives over Data Provenance. In EDBT (pp. 1138-1141). - Link

Karkee, S., Botoeva, E., Coombes, S., Jordanous, A., Kafali, Ö., & Lanti, D. (2025). Accessing Semi-structured Data with RML and LLMs. - Link

1.3. System architecture design

WP2 Improving the Quality of Data

2.1. Managing data inconsistency and incompleteness

2.2. Managing biased data

Tahmasebi, A. (2024). FairDataFlow: a modern solution to ensure data quality and fairness. - Link

Criscuolo, C., Martinenghi, D., & Piccirillo, G. (2025). A tutorial on intersectionality in fair rankings. arXiv preprint arXiv:2502.05333. - Link

Ciaccia, P., & Martinenghi, D. (2025). Optimization strategies for parallel computation of skylines. Distributed and Parallel Databases, 43(1), 10. - Link

Criscuolo, C., Martinenghi, D., & Huang, J. (2025). Reconciling Statistical and Causal Metrics of Fairness in Machine Learning on Data-Driven Systems. - Link

Ciaccia, P., & Martinenghi, D. (2025). Relevant, yet hard to find: Directional queries to the rescue. In Proceedings of the 33rd Symposium on Advanced Database Systems. - Link

Balzotti, L., Firmani, D., Mathew, J. G., Torlone, R., & Amer-Yahia, S. (2025, July). R-Fairness: Assessing Fairness of Ranking in Subjective Data. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 32187-32199). - Link

2.3. Data reduction

WP3 Semantic Enrichment, Provenance, and Explanation

3.1. Semantic enrichment

Xiao, G., Ren, L., Qi, G., Xue, H., Di Panfilo, M., & Lanti, D. (2025, August). LLM4VKG: Leveraging large language models for virtual knowledge graph construction. In Proceedings of the 34th International Joint Conference on Artificial Intelligence (IJCAI). - Link

Baura, D., & Calvanese, D. (2025). Real-world Assessment of Policy-Protected OBDA. - Link

3.2. Management of provenance data

Gregori, L., Lazzaro, P. L., Lazzaro, M., Missier, P., & Torlone, R. (2025). An LLM-guided platform for multi-granular collection and management of data provenance. Journal of Big Data, 12(1), 187. - Link

3.3. Provenance and explanation in the context of OBDA

WP4 Coordination, Experimentation and Impact Enactment

4.1. Progress Report

Alfano, G., Bartolini, I., Calvanese, D., Ciaccia, P., Greco, S., Lanti, D., ... & Trubitsyna, I. (2025). S-PIC4CHU: Semantics-based Provenance, Integrity, and Curation for Consistent, High-quality, and Unbiased Data Science. - Link

Alfano, G., Bartolini, I., Calvanese, D., Ciaccia, P., Greco, S., Lanti, D., ... & Trubitsyna, I. (2025). S-PIC4CHU: Semantics-Enriched Techniques for Data Preparation in Data Science. In Proceedings of the 4th Italian Conference on Big Data and Data Science (ITADATA 2025), Turin, Italy, September 9-11, 2025 (pp. 1-8). - Link