EverAnalyzer: A self-adjustable big data management platform exploiting the Hadoop ecosystem

Published

3 February 2023

Contributing Authors

Panagiotis Karamolegkos, Argyro Mavrogiorgou, Athanasios Kiourtis, Dimosthenis Kyriazis

Abstract

Big Data is a phenomenon that affects today’s world, with new data being generated every second. Today’s enterprises face major challenges from increasingly diverse data, as well as from indexing, searching, and analyzing such enormous amounts of data. In this context, several frameworks and libraries for processing and analyzing Big Data exist. Among these, Hadoop MapReduce, Mahout, Spark, and MLlib appear to be the most popular, although it is unclear which of them is best suited to, and performs best in, each data processing and analysis scenario. This paper proposes EverAnalyzer, a self-adjustable Big Data management platform built to fill this gap by exploiting all of these frameworks. The platform collects data in both streaming and batch modes, and it records metadata about the processing and analytical tasks that its users apply to the collected data. Based on this metadata, the platform recommends the optimum framework for the data processing or analytical task that a user aims to execute. To verify the platform’s efficiency, numerous experiments were carried out using 30 diverse datasets related to various diseases. The results revealed that EverAnalyzer correctly suggested the optimum framework in 80% of the cases, indicating that the platform made the best selection in the majority of the experiments.
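
The metadata-driven recommendation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how EverAnalyzer chooses a framework, so everything below is an assumption made for illustration: the record fields (`task`, `size_mb`, `framework`, `runtime_s`), the function name `recommend_framework`, the size-similarity rule, and the use of mean runtime as the selection criterion are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical metadata log of past executions. The field names and values
# are illustrative assumptions, not the platform's actual schema or results.
history = [
    {"task": "clustering", "size_mb": 100, "framework": "Mahout", "runtime_s": 120.0},
    {"task": "clustering", "size_mb": 120, "framework": "MLlib", "runtime_s": 45.0},
    {"task": "clustering", "size_mb": 90, "framework": "MLlib", "runtime_s": 55.0},
    {"task": "clustering", "size_mb": 5000, "framework": "Mahout", "runtime_s": 10.0},
    {"task": "word_count", "size_mb": 100, "framework": "Hadoop MapReduce", "runtime_s": 80.0},
    {"task": "word_count", "size_mb": 110, "framework": "Spark", "runtime_s": 30.0},
]


def recommend_framework(history, task, size_mb, size_tolerance=0.5):
    """Pick the framework with the lowest mean runtime on comparable past runs.

    A past run is comparable when it has the same task and its dataset size
    lies within `size_tolerance` (a fraction) of the requested size.
    Returns None when no comparable run exists.
    """
    runtimes = defaultdict(list)
    for record in history:
        if record["task"] != task:
            continue
        if abs(record["size_mb"] - size_mb) <= size_tolerance * size_mb:
            runtimes[record["framework"]].append(record["runtime_s"])
    if not runtimes:
        return None
    # Lowest average runtime wins; a real system could weigh other metrics.
    return min(runtimes, key=lambda framework: mean(runtimes[framework]))


if __name__ == "__main__":
    print(recommend_framework(history, "clustering", 100))
    print(recommend_framework(history, "word_count", 100))
```

In this sketch, every completed job would append one record to `history`, so the recommendation adjusts as more executions are logged. That is one simple reading of "self-adjustable"; the platform itself may rely on different metadata and a different selection rule.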
