Study, Collection, and Implementation of Data Cleaning Techniques in the Electronic Health Sector

This work studies and catalogues data cleaning techniques in the field of Electronic Health, with the aim of implementing an application that provides a cleaning mechanism responsible for fully cleaning the data supplied to it.

Based on this study, a complete library offering multiple data cleaning methods will be implemented. The user will dynamically select the dataset to be cleaned and will set specific criteria from a provided list of data cleaning criteria. Depending on the criteria set, the corresponding cleaning methods will be selected automatically. In addition, the application will visualize statistics on the results, such as the percentages of the data that were cleaned, deleted, predicted, etc., according to the cleaning rules set each time.
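As an illustration only, the selection of cleaning methods from user-set criteria and the resulting percentages could be sketched as follows. Every name here (METHOD_REGISTRY, the criterion names "required", "impute_mean" and "trim_text", the clean function, and the sample patient records) is a hypothetical assumption and not part of the work's specification. Mean imputation stands in for the "predicted" category, and the mean is computed over all input records before any deletion.

```python
from collections import Counter

# Each factory returns a method: record -> (new_record_or_None, action).
# All names below are hypothetical and chosen only for this sketch.

def make_drop_missing(field, records):
    """Delete any record whose `field` is missing."""
    def method(record):
        if record.get(field) is None:
            return None, "deleted"
        return record, "unchanged"
    return method

def make_impute_mean(field, records):
    """Fill a missing numeric `field` with the mean of the observed values."""
    observed = [r[field] for r in records if r.get(field) is not None]
    mean = sum(observed) / len(observed) if observed else None
    def method(record):
        if record.get(field) is None and mean is not None:
            return {**record, field: mean}, "predicted"
        return record, "unchanged"
    return method

def make_trim_text(field, records):
    """Strip leading and trailing whitespace from a text `field`."""
    def method(record):
        value = record.get(field)
        if isinstance(value, str) and value != value.strip():
            return {**record, field: value.strip()}, "cleaned"
        return record, "unchanged"
    return method

# Assumed mapping from a user-facing criterion to a cleaning method factory.
METHOD_REGISTRY = {
    "required": make_drop_missing,
    "impute_mean": make_impute_mean,
    "trim_text": make_trim_text,
}

# When several methods touch one record, report the most significant action.
PRIORITY = ["deleted", "predicted", "cleaned", "unchanged"]

def clean(records, criteria):
    """Apply the methods selected by `criteria`, a list of (criterion, field)
    pairs, and return the cleaned records plus percentages per outcome."""
    methods = [METHOD_REGISTRY[name](field, records) for name, field in criteria]
    cleaned, outcomes = [], Counter()
    for record in records:
        current, actions = record, set()
        for method in methods:
            current, action = method(current)
            actions.add(action)
            if current is None:  # record deleted, skip remaining methods
                break
        outcomes[next((a for a in PRIORITY if a in actions), "unchanged")] += 1
        if current is not None:
            cleaned.append(current)
    total = len(records)
    stats = {a: 100.0 * outcomes[a] / total for a in PRIORITY} if total else {}
    return cleaned, stats

# Usage with made-up sample data.
records = [
    {"patient_id": 1, "name": " Ann ", "heart_rate": 60},
    {"patient_id": 2, "name": "Bob", "heart_rate": None},
    {"patient_id": None, "name": "Cy", "heart_rate": 80},
    {"patient_id": 4, "name": "Dee", "heart_rate": 70},
]
criteria = [("required", "patient_id"), ("impute_mean", "heart_rate"), ("trim_text", "name")]
cleaned, stats = clean(records, criteria)
print(stats)
```

The returned stats dictionary holds one percentage per outcome, which is the kind of figure a chart in the application could display. A registry keyed by criterion name is one possible design; it keeps adding a new cleaning method down to writing one factory and one dictionary entry.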

Technical prerequisites

  • Python