PADL: a Language for the Operationalization of Distributed Analytical Pipelines over Edge/Fog Computing Environments

Abstract

In this paper we introduce PADL, a language for modeling and deploying data-based analytical pipelines. The novelty of this language lies in its independence from both the infrastructure and the technologies used on it. Specifically, this descriptive language aims to accommodate the particularities and constraints of highly demanding deployment models, such as strict requirements on latency, privacy and performance, by providing fully compliant schemas for implementing data analytics workloads. Adopting PADL provides a means of operationalizing these pipelines in a reproducible and resilient fashion. In addition, PADL is able to take full advantage of the Edge and Fog computing layers. The feasibility of the language has been validated with an analytical pipeline deployed over an Edge computing environment to solve an Industry 4.0 use case. The promising results obtained pave the way towards the widespread adoption of the proposed language for deploying data analytical pipelines in real application scenarios.