An Optimized KDD Process for Collecting and Processing Healthcare Data
Abstract
Organizations today are surrounded by enormous amounts of data and risk losing the important information that resides in it. Knowledge Discovery in Databases (KDD) can help organizations transform this data into valuable information by extracting complex patterns and relationships from it. To that end, various KDD techniques and tools have been proposed, yielding impressive outcomes in several domains, especially healthcare. Given the huge amount of data available within healthcare systems, data mining is extremely important for the healthcare sector. Equally important, however, is the way in which the data is collected, preprocessed, and integrated, considering its heterogeneous and diverse nature and format. To address these challenges, this paper proposes a generalized KDD approach that supplements the existing approaches, which study and analyse the data mining part of the KDD process. The approach concentrates primarily on the selection, preprocessing, and transformation of the collected healthcare data, phases that are considered to be of great importance for its successful mining, analysis, and interpretation. A prototype of the proposed approach provides an example of the developed mechanism, explaining its phases in detail and verifying its potential for wide applicability and adoption in various healthcare scenarios.