Work package 11: Patterns for Data mining and Predictive Analytics

Brief description and aims of work

The objective of this work package is to translate data mining techniques into practical solutions for the analysis and prediction of patient data, thus creating a benefit for personalized medicine. Its focus is therefore less on the development of novel data mining techniques and more on supporting the transition of state-of-the-art data mining solutions to clinical practice. In addition, the actual predictive performance of selected techniques will be evaluated.

To facilitate a seamless transition of methods from data mining research to medical research, a pattern-based approach will be followed. A data mining pattern can be viewed as a template for concrete data mining workflows. It acts as a formalized common language between data mining experts and medical researchers or bioinformaticians. By a data mining pattern we mean a re-usable workflow template plus a description of the requirements and steps that are necessary to apply the generic data mining solution to a specific problem. It thus bridges the gap between a data mining algorithm and a tool or workflow that applies that algorithm to answer a specific research question on a given data set. This work package will develop the basic support for data mining patterns in the context of the p-medicine system, and will build re-usable solutions for current challenges in the predictive analysis of patient data.
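The notion of a pattern as a workflow template plus its requirements can be illustrated with a minimal sketch. All names here (`DataMiningPattern`, `instantiate`, `select_column`, `threshold`, and the toy `marker` data) are hypothetical and chosen for illustration only; they are not part of the p-medicine system, and the two toy steps merely stand in for real data mining components.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# A step takes the current data and the problem-specific bindings.
Step = Callable[[Any, Dict[str, Any]], Any]


@dataclass
class DataMiningPattern:
    """Hypothetical pattern: a generic workflow template plus its requirements."""
    name: str
    requirements: List[str]  # parameters that must be supplied for a concrete problem
    steps: List[Step]        # generic workflow steps, applied in order

    def instantiate(self, **bindings: Any) -> Callable[[Any], Any]:
        """Bind the template to a specific problem and return a concrete workflow."""
        missing = [r for r in self.requirements if r not in bindings]
        if missing:
            raise ValueError(f"unmet requirements: {missing}")

        def workflow(data: Any) -> Any:
            for step in self.steps:
                data = step(data, bindings)
            return data

        return workflow


def select_column(rows, params):
    # Toy step: extract the attribute named by the "target" binding.
    return [row[params["target"]] for row in rows]


def threshold(values, params):
    # Toy step: flag values at or above the "cutoff" binding.
    return [v >= params["cutoff"] for v in values]


pattern = DataMiningPattern(
    name="threshold-classifier",
    requirements=["target", "cutoff"],
    steps=[select_column, threshold],
)

# Applying the generic pattern to one specific data set and question.
workflow = pattern.instantiate(target="marker", cutoff=0.5)
result = workflow([{"marker": 0.2}, {"marker": 0.9}])
```

In this sketch the same pattern can be bound again with a different target or cutoff, while a binding that omits a requirement is rejected before any data is processed.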

Data mining patterns will be defined for the cases of the analysis of very large data sets, the analysis of privacy-sensitive data, data mining to support collaborative systems, and predictive literature mining. These data mining scenarios were selected because they share the common properties that

  • they have the potential to make a huge impact on the analysis of clinical data when applied correctly,
  • there exist proven solutions for a variety of specific application cases, and
  • it is currently hard to widely apply these solutions in practice,

because translating a solution to a new scenario or data set involves much manual work. This work package will demonstrate that data mining patterns substantially improve the re-usability of solutions in these scenarios.

Work package leader

Dennis Wegener

Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e. V.
Institute for Intelligent Analysis and Information Systems (IAIS)
Schloss Birlinghoven
53757 Sankt Augustin/Germany