Big Data

Implementation of a Hadoop-based system for Big Data Management

The objective of the thesis will be the study and development of a Hadoop-based web application following the microservices architecture, implementing the complete data path of Big Data, from data collection to data analysis, in the fields of either eHealth or environment.

Required Skills: Java/Python, Web technologies


Comparative study of the performance of MapReduce and Spark ecosystems for analyzing Big Data

The objective of the thesis will be the study and comparison of the performance of the MapReduce and Spark ecosystems, focusing especially on their tools for analyzing Big Data of diverse types (structured, semi-structured, unstructured) and on visualizing the results obtained.

Required Skills: Java/Python


Comparative study of the performance and analysis of RDF Storage Models and Query Optimization Techniques

The objective of the thesis will be to analyse and compare the various tools and methodologies that exist for RDF Storage Models and Query Optimization Techniques (GraphDB, Neo4j, Owlready2).

Required Skills: Python, SPARQL, Graph Theory


Automatic application deployment form and hardware accelerator selection

This thesis is about the establishment of a model that automatically selects the deployment form (Docker, serverless, VM) and the most suitable hardware accelerator (GPU, FPGA) for an application in a cloud/edge environment.

Required Skills: Python, Java, Docker, Serverless, Machine learning


Comparative Study of algorithms and techniques established for “small data” situations

The target of this thesis will be the study and presentation of modern techniques and algorithms developed specifically for situations where the available data are limited in quantity. These techniques include methods that enhance and expand a dataset, such as data enrichment and synthetic data creation. In addition, approaches such as Few-Shot Learning, which are designed specifically to cope with the lack of large data volumes, will be addressed.

Required Skills: Python, Machine Learning, Neural Networks
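
As an illustration of the synthetic data creation mentioned in this topic, the following is a minimal sketch of one simple approach, jittering, in which noisy copies of the existing samples are appended to a small dataset. The function name `augment_with_jitter` and its parameters (`copies`, `sigma`, `seed`) are hypothetical and chosen only for this example; the thesis itself may rely on entirely different techniques.

```python
import random


def augment_with_jitter(samples, copies=2, sigma=0.05, seed=0):
    """Return the original samples followed by noisy copies of each one.

    samples: list of numeric feature vectors (lists of floats).
    copies:  number of synthetic copies generated per original sample.
    sigma:   standard deviation of the Gaussian noise added to each feature.
    seed:    seed for the random generator, so the result is reproducible.
    """
    rng = random.Random(seed)
    # Keep the original data first, copied so the input is not mutated.
    augmented = [list(row) for row in samples]
    for row in samples:
        for _ in range(copies):
            # Each synthetic sample is the original plus small Gaussian noise.
            augmented.append([x + rng.gauss(0.0, sigma) for x in row])
    return augmented


if __name__ == "__main__":
    small_dataset = [[1.0, 2.0], [3.0, 4.0]]
    enlarged = augment_with_jitter(small_dataset, copies=3)
    print(len(small_dataset), "->", len(enlarged), "samples")
```

Jittering assumes that small perturbations of a sample keep its label valid, which holds for many continuous measurements but not for categorical features.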


Automated Machine Learning for Time Series data

This study will explore how Automated Machine Learning (AutoML), the domain of automated algorithm selection and hyperparameter tuning, can be applied to time series data. Time series data arrive at regular time intervals (stock prices are a typical example), which raises specific challenges, such as deciding when the model or its parameters used for prediction need to change. The study will implement concepts that take such aspects into account, for example drift.

Required Skills: Python, Machine Learning, Optimization
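
The question of when a model needs to change can be made concrete with a very simple drift check. The sketch below flags the time steps at which the rolling mean of the prediction errors exceeds a multiple of a baseline error. The function name `drift_points` and the parameters `window` and `factor` are hypothetical and serve only as an illustration; established drift detectors are considerably more elaborate.

```python
from collections import deque


def drift_points(errors, window=5, factor=2.0):
    """Return indices where the rolling mean error exceeds factor * baseline.

    errors: absolute prediction errors, one per time step.
    window: number of recent errors averaged at each step.
    factor: how many times the baseline the rolling mean must exceed.
    The baseline is the mean error over the first `window` observations.
    """
    if len(errors) < window:
        return []
    baseline = sum(errors[:window]) / window
    # The deque drops the oldest error automatically once it is full.
    recent = deque(errors[:window], maxlen=window)
    flagged = []
    for i in range(window, len(errors)):
        recent.append(errors[i])
        if sum(recent) / window > factor * baseline:
            flagged.append(i)  # candidate point for retraining the model
    return flagged


if __name__ == "__main__":
    observed_errors = [1, 1, 1, 1, 1, 1, 1, 5, 5, 5]
    print(drift_points(observed_errors))
```

In an AutoML setting, a flagged index could trigger a new round of model selection or hyperparameter tuning instead of a fixed retraining schedule.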