Organization profile
Introduction / mission
The chair studies data mining (DM) techniques and knowledge discovery approaches that are at the core of data science. The group is known for its contributions to the areas of predictive analytics, automation of machine learning and networked science, subgroup discovery and exceptional model mining, and similarity computations on complex data. Its research is inspired by theoretical computer science, systems development and real-world applications of (big) data-driven discovery in healthcare, banking, energy, retail, telecom, and education, among others.
Organisational profile
We develop generic approaches and specialized techniques that cover a wide range of descriptive, predictive and prescriptive analytics and work effectively with text, image, transactional, graph and time-series data in a responsible manner. For example, we use deep learning methods to develop models for high-dimensional, heterogeneous, unstructured and evolving data, and apply these models to areas such as medical imaging, genomics, anomaly detection and sentiment analysis. We also work on methods for analyzing and explaining a model's decisions and performance, and on facilitating effective DM with domain experts in the loop.
Success stories
We have created OpenML: an online collaborative platform for studying machine learning techniques. OpenML is used by almost 2,000 researchers, students, and practitioners worldwide, and contains around 20,000 datasets, 3,000 machine learning workflows, and 1.7 million shared experiments. It has won the Dutch Data Prize and received backing from Microsoft Research. It is crucial for the development of automated machine learning, which is adopted by companies such as Philips.
Further information at OpenML.org
- NWO RATE-Analytics (with Tilburg University, Rabobank and Achmea) "Next generation predictive analytics for data-driven banking and insurance".
- ImpulseKYC-Analytics (with Rabobank) "Know your customer predictive analytics" aims to develop approaches for effective DM on heterogeneous and evolving data sources with an expert in the loop.
- STW CAPA (with Adversitement and StudyPortals) "Context-aware predictive analytics" advanced the state of the art in Web analytics.
- NWO Veni "Detection methods for similarity structures in time-dependent data" develops foundations for advanced clustering of time series and trajectories.
- H2020 SODA (ICT-2016-1; Big Data PPP) "Scalable Oblivious Data Analytics" facilitates secure DM; together with the Crypto group, we develop practical approaches for DM with multi-party computation.
Profiles
-
Elahe Arani
- Mathematics and Computer Science, Data Mining - University Researcher
Person: OWP : University Teacher / Researcher
-
Guido Budziak, MSc
- Mathematics and Computer Science, Data Mining - University Researcher
Person: OWP : University Teacher / Researcher
-
Israel Campero Jurado, MSc
- Mathematics and Computer Science, Data Mining - Doctoral Candidate
Person: Prom. : doctoral candidate (PhD)
Projects
-
Smart One W&I TKI KPN Flagship
Pechenizkiy, M. & d'Hondt, T.
1/08/18 → 31/07/22
Project: Research direct
-
Interoperability of Heterogeneous IoT Platforms
Exarchakos, G. & Mocanu, D. C.
1/01/16 → 31/12/18
Project: Research direct
Research output
-
Automated Reinforcement Learning: An Overview
Refaei Afshar, R., Zhang, Y., Vanschoren, J. & Kaymak, U., 13 Jan 2022, In: arXiv. 2022, 47 p., 2201.05000.
Research output: Contribution to journal › Article › Academic
Open Access
File
-
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
Liu, S., Chen, T., Atashgahi, Z., Chen, X., Sokar, G., Mocanu, E., Pechenizkiy, M., Wang, Z. & Mocanu, D. C., 20 Jan 2022, (Accepted/In press) International Conference on Learning Representations, ICLR 2022.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review
File
-
Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance
Liu, S., Tian, Y., Chen, T. & Shen, L., 1 Mar 2022, (Submitted) In: International Journal of Computer Vision. XX, X
Research output: Contribution to journal › Article › Academic › peer-review
Datasets
-
Random forest models for gene expression experiments in Transformational Machine Learning
Soldatova, L. N. (Creator), King, R. D. (Creator), Davis, A. M. (Creator), Dash, T. (Creator), Vanschoren, J. (Creator), Olier, I. (Creator) & Orhobor, O. I. (Creator), SciLifeLab, 10 Jan 2022
DOI: 10.17044/scilifelab.16837084
Dataset
Prizes
-
Best Paper Award ICPM 2021
Menkovski, V. (Recipient), Sommers, Dominique (Recipient) & Fahland, Dirk (Recipient), 4 Nov 2021
Prize: Other › Career, activity or publication related prizes (lifetime, best paper, poster etc.) › Scientific
-
Best PhD Thesis
de Campos, Cassio (Recipient), 2006
Prize: Other › Career, activity or publication related prizes (lifetime, best paper, poster etc.) › Scientific
-
IEEE CASE2021 Best Student Paper Award Finalists
Zhu, Aiyu (Recipient), Pauwels, Pieter (Recipient), de Vries, Bauke (Recipient) & Fang, Meng (Recipient), 2021
Prize: Other › Career, activity or publication related prizes (lifetime, best paper, poster etc.) › Scientific
File
Activities
-
2020 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2020)
Shiwei Liu (Organiser)
13 Sep 2020 → 18 Sep 2020
Activity: Participating in or organising an event types › Conference › Scientific
-
Machine Learning, better, together.
Joaquin Vanschoren (Speaker)
8 Dec 2018
Activity: Talk or presentation types › Invited talk › Scientific
-
Tutorial on Automatic Machine Learning
Frank Hutter (Speaker) & Joaquin Vanschoren (Speaker)
3 Dec 2018
Activity: Talk or presentation types › Keynote talk › Scientific
Press / Media
-
Leiden University: How to make AI systems learn better
23/03/22
2 items of Media coverage
Press/Media: Expert Comment
-
Toloka to present new dataset at prestigious Data-Centric AI workshop launched by Andrew Ng
18/11/21
1 item of Media coverage
Press/Media: Expert Comment
-
Andrew Ng Announces The Launch Of NeurIPS Data-Centric AI Workshop
12/09/21
1 item of Media coverage
Press/Media: Expert Comment
Student theses
-
3D Face Reconstruction Using Deep Learning
Author: Jawahar, P., 20 Jan 2020
Supervisor: Medeiros de Carvalho, R. (Supervisor 1), Gallucci, A. (Supervisor 2) & Vanschoren, J. (Supervisor 2)
Student thesis: Master
File
-
Activity Recognition Using Deep Learning in Videos under Clinical Setting
Author: Srinivasan, V., 28 Jan 2020
Supervisor: Duivesteijn, W. (Supervisor 1), Papapetrou, O. (Supervisor 2), Zhang, L. (External person) (External coach) & Vasu, J. D. (External coach)
Student thesis: Master
File
-
A Data Cleaning Assistant
Author: Quadt, T. J., 2020
Supervisor: Vanschoren, J. (Supervisor 1)
Student thesis: Bachelor
File