Self-Adaptable Infrastructure Management for Analyzing the Efficiency of Big Data Stores
Abstract
An ever-increasing amount of data is generated and processed on a daily basis to improve decision-making and facilitate the gaining of insights. In this context, the current era is characterized as the “Era of Big Data”, with data characteristics including high volume, velocity, variety, and veracity, creating multiple opportunities and challenges. Several Information and Communication Technology (ICT) firms, enterprises and research projects are working on the overall Big Data challenges, devoting increasing effort to identifying the means of effectively and efficiently collecting, storing, retrieving, analyzing and reusing Big Data in order to improve their services, increase their competitive advantage and support competent decisions. Such approaches span several sectors, including healthcare, agriculture, environment, transportation, governance, and insurance. Towards this goal, in order to identify the most efficient and least time-consuming database for using and reusing stored data, in this paper we contribute to the selection of the most appropriate database for efficiently storing and retrieving Big Data. More specifically, considering the challenges and the nature of Big Data, as well as the main categories of databases that currently exist, three (3) NoSQL document-based databases, namely ArangoDB, MongoDB and CouchDB, are described and compared under different working environments and conditions. These working environments depend on the Diastema platform, which provides the ability of adaptive allocation and management of infrastructures based on the …