[EN][Research Paper] Modern Big Data Platform

Feb 22, 2025 · 1 min read

A double-blind peer-reviewed research paper published in Springer Nature’s journal Discover Applied Sciences.

Full text: Springer Nature Discover Applied Sciences - Modern Data Platform with DataOps, Kubernetes, and Cloud-Native Ecosystem

  • This research creates a blueprint that uses DataOps, Kubernetes, and the Cloud-Native ecosystem to build a resilient Big Data platform following the Data Lakehouse architecture, as a foundation for Machine Learning and Artificial Intelligence.
  • Using an iterative approach, we architected and implemented the core of the platform, which is composable and cloud-agnostic: it avoids vendor lock-in and can run on any Cloud provider or on-premises.
  • Initial benchmarking showed that the platform could efficiently handle millions of records, benefiting from Apache Iceberg features.

The paper is based on my master’s dissertation in Data Engineering at Edinburgh Napier University, Scotland, UK (2023).

More details: My Master’s Dissertation - Modern Data Platform with DataOps, Kubernetes, and Cloud-Native Ecosystem.

It was a great experience at Edinburgh Napier University, one of the top 10 UK modern universities and the top Scottish modern university for research power and impact.