Organisation profile
Introduction / mission
The chair studies data mining (DM) techniques and knowledge discovery approaches that are at the core of data science. The group is known for its contributions to the areas of predictive analytics, automation of machine learning and networked science, subgroup discovery and exceptional model mining, and similarity computations on complex data. Its research is inspired by theoretical computer science, systems development and real-world applications of (big) data-driven discovery in healthcare, banking, energy, retail, telecom, and education among others.
Organisational profile
We develop generic approaches and specialized techniques that cover a wide range of descriptive, predictive and prescriptive analytics and work effectively with text, image, transactional, graph and time-series data in a responsible manner. For example, we use deep learning methods to develop models for high-dimensional, heterogeneous, unstructured and evolving data, and apply these models to areas such as medical imaging, genomics, anomaly detection and sentiment analysis. We also work on methods for analyzing and explaining models' decisions and performance, and on facilitating effective DM with domain experts in the loop.
Success stories
We have created OpenML, an online collaborative platform for studying machine learning techniques. OpenML is used by almost 2,000 researchers, students, and practitioners worldwide, and contains around 20,000 datasets, 3,000 machine learning workflows, and 1.7 million shared experiments. It has won the Dutch Data Prize and has received backing from Microsoft Research. It is crucial for the development of automated machine learning that is adopted by companies such as Philips.
Further information at OpenML.org
- NWO RATE-Analytics (with Tilburg University, Rabobank and Achmea) "Next generation predictive analytics for data-driven banking and insurance".
- ImpulseKYC-Analytics (with Rabobank) "Know your customer predictive analytics" project aims at developing approaches for effective DM on heterogeneous and evolving data sources with expert-in-the-loop.
- STW CAPA (with Adversitement and StudyPortals) "Context-aware predictive analytics" advanced the current state of the art in Web analytics.
- NWO Veni "Detection methods for similarity structures in time-dependent data" develops foundations for advanced time-series and trajectory clustering.
- H2020 SODA (ICT-2016-1; Big Data PPP) "Scalable Oblivious Data Analytics" facilitates secure DM; together with the Crypto group, we develop practical approaches for DM with multi-party computation.
Profiles
-
Adam Arafan
- Mathematics and Computer Science, Data Mining - Doctoral Candidate
Person: Prom. : doctoral candidate (PhD)
-
Elahe Arani
- Mathematics and Computer Science, Data Mining - University Researcher
Person: OWP : University Teacher / Researcher
-
Guido Budziak, MSc
- Mathematics and Computer Science, Data Mining - University Researcher
Person: OWP : University Teacher / Researcher
Projects
- 2 Finished
-
-
Interoperability of Heterogeneous IoT Platforms
Mocanu, D. C. & Exarchakos, G.
1/01/16 → 31/12/18
Project: Research direct
Research output
-
Algorithmic Unfairness through the Lens of EU Non-Discrimination Law: Or Why the Law is not a Decision Tree
Weerts, H., Xenidis, R., Tarissan, F., Olsen, H. P. & Pechenizkiy, M., 12 Jun 2023, Proceedings of the 6th ACM Conference on Fairness, Accountability, and Transparency, FAccT 2023. Association for Computing Machinery, Inc, p. 805-816 12 p.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review
Open Access
-
An AI-empowered infrastructure for risk prevention during medical examination
Shah, S. I. H., Naeem, M., Paragliola, G., Coronato, A. & Pechenizkiy, M., 1 Sept 2023, In: Expert Systems with Applications. 225, 10 p., 120048.
Research output: Contribution to journal › Article › Academic › peer-review
-
Analyzing the Posterior Collapse in Hierarchical Variational Autoencoders
Kuzina, A. & Tomczak, J. M., 20 Feb 2023, In: CoRR. 2023, 18 p., 2302.09976.
Research output: Contribution to journal › Article › Academic
Open Access
File
12 Downloads (Pure)
Datasets
-
Histopathology data of bone marrow biopsies (HistBMP or HistMNIST)
Tomczak, J. (Contributor), Zenodo, 18 Mar 2018
Dataset
-
Microscope images of human cancer cell lines (U2OS and HL-60)
Lavitt, F. (Creator), Rijlaarsdam, D. J. (Creator), van der Linden, D. (Creator), Weglarz-Tomczak, E. (Contributor) & Tomczak, J. (Creator), Zenodo, 8 Jan 2021
Dataset
-
Random forest models for gene expression experiments in Transformational Machine Learning
Soldatova, L. N. (Creator), King, R. D. (Creator), Davis, A. M. (Creator), Dash, T. (Creator), Vanschoren, J. (Creator), Olier, I. (Creator) & Orhobor, O. I. (Creator), SciLifeLab, 10 Jan 2022
DOI: 10.17044/scilifelab.16837084
Dataset
Prizes
-
Best Demo Paper Award of IEEE ICDE 2023
Halstead, Ben (Recipient), Koh, Yun Sing (Recipient), Riddle, Patricia (Recipient), Pechenizkiy, Mykola (Recipient) & Bifet, Albert (Recipient), 2023
Prize: Other › Career, activity or publication related prizes (lifetime, best paper, poster etc.) › Scientific
File
-
Best Paper Award ICPM 2021
Menkovski, V. (Recipient), Sommers, Dominique (Recipient) & Fahland, Dirk (Recipient), 4 Nov 2021
Prize: Other › Career, activity or publication related prizes (lifetime, best paper, poster etc.) › Scientific
-
Best Paper Award of ALA 2022
Sokar, Ghada (Recipient), Mocanu, Elena (Recipient), Mocanu, Decebal C. (Recipient), Pechenizkiy, Mykola (Recipient) & Stone, Peter (Recipient), 2022
Prize: Other › Career, activity or publication related prizes (lifetime, best paper, poster etc.) › Scientific
Activities
-
2020 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2020)
Shiwei Liu (Organiser)
13 Sept 2020 → 18 Sept 2020
Activity: Participating in or organising an event types › Conference › Scientific
-
Machine Learning, better, together.
Joaquin Vanschoren (Speaker)
8 Dec 2018
Activity: Talk or presentation types › Invited talk › Scientific
-
Tutorial on Automatic Machine Learning
Frank Hutter (Speaker) & Joaquin Vanschoren (Speaker)
3 Dec 2018
Activity: Talk or presentation types › Keynote talk › Scientific
Press/Media
-
Reports from University of Technology Sydney Add New Data to Findings in Technology (Reinforcement Learning With Multiple Relational Attention for Solving Vehicle Routing Problems)
1/09/23
1 item of Media coverage
Press/Media: Expert Comment
-
New Gels Research Study Findings Recently Were Reported by a Researcher at Huaqiao University (Drying Process of HPMC-Based Hard Capsules: Visual Experiment and Mathematical Modeling)
16/06/23
1 item of Media coverage
Press/Media: Expert Comment
-
Huaqiao University Researcher Describes Findings in Plasticizers (Enhancing Pullulan Soft Capsules with a Mixture of Glycerol and Sorbitol Plasticizers: A Multi-Dimensional Study)
30/05/23
1 item of Media coverage
Press/Media: Expert Comment
Student theses
-
3D Face Reconstruction Using Deep Learning
Author: Jawahar, P., 20 Jan 2020
Supervisor: Medeiros de Carvalho, R. (Supervisor 1), Gallucci, A. (Supervisor 2) & Vanschoren, J. (Supervisor 2)
Student thesis: Master
File
-
Activity Recognition Using Deep Learning in Videos under Clinical Setting
Author: Srinivasan, V., 28 Jan 2020
Supervisor: Duivesteijn, W. (Supervisor 1), Papapetrou, O. (Supervisor 2), Zhang, L. (External person) (External coach) & Vasu, J. D. (External coach)
Student thesis: Master
File
-
A Data Cleaning Assistant
Author: Niederle, J. M., 2020
Supervisor: Vanschoren, J. (Supervisor 1)
Student thesis: Bachelor
File