A Visual Programming Environment for Describing Complex Big Data Functions

Category

Conference Article

Published

21 November 2023

Abstract

The rapid advancement of technology and the increasing reliance on computer systems have elevated the significance of programming skills in various domains. Traditional programming techniques, primarily text-based languages, have been the backbone of software development for decades. However, these techniques often involve a steep learning curve, limiting their accessibility to individuals without technical backgrounds. To address this issue, Visual Programming Environments (VPEs) have emerged as a powerful alternative, offering intuitive interfaces built from graphical components such as buttons, icons, or moving elements. The simple visual layout of these building blocks makes programming more accessible to non-technical users and enables them to understand, design, and explain complex big data pipelines. While VPEs offer significant benefits, they are not without limitations. First, these environments may constrain the flexibility and expressiveness of programming constructs, limiting advanced programming techniques that require fine-grained control over complex algorithms. Additionally, the graphical nature of VPEs may result in less efficient coding practices, as users might prioritize visual aesthetics over performance optimization. In big data scenarios, it is also important that these services can be containerized, so that they can be easily deployed and scaled. In this study, a VPE named Pipeline Modeler is implemented, capable of translating visual graph representations into operations, which can then be executed in big data scenarios, such as analytics pipelines and flow management on finance datasets. Using node blocks, one can build a graph that declares a complex mathematical flow or a pipeline of linked operations to be executed on a given set of data.
Evaluation and experimentation indicate that this service is well suited to multiple big data and microservices scenarios, particularly pipeline management cases, which makes the Pipeline Modeler a promising solution in the pursuit of inclusive and efficient software development and data management.
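
As an illustration of the general idea of turning a graph of node blocks into executable operations, the following is a minimal Python sketch. It is not the Pipeline Modeler's implementation, whose internals are not described in this abstract. The function run_graph, the pipeline dictionary, the node names, and the sample prices are all hypothetical, chosen only to show how linked operations might be run in dependency order on a small dataset.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_graph(nodes, source):
    """Run each node after its upstream nodes (hypothetical executor).

    nodes maps a node name to (operation, list of upstream node names).
    A node with no upstream nodes receives the source data directly.
    """
    # TopologicalSorter takes {node: predecessors} and yields a valid run order.
    dependencies = {name: deps for name, (_, deps) in nodes.items()}
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        operation, deps = nodes[name]
        inputs = [results[d] for d in deps] if deps else [source]
        results[name] = operation(*inputs)
    return results


# Assumed sample data and node blocks, for illustration only.
prices = [10.0, 12.0, 9.0, 15.0]
pipeline = {
    "clean": (lambda xs: [x for x in xs if x > 0], []),
    "total": (sum, ["clean"]),
    "count": (len, ["clean"]),
    "mean": (lambda total, count: total / count, ["total", "count"]),
}

results = run_graph(pipeline, prices)
print(results["mean"])
```

Representing each block as an operation plus a list of upstream blocks keeps the visual graph and the executed flow in one structure; a real system would also need validation, error handling, and distributed execution, none of which this sketch attempts.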