
URL study guide

https://tue.osiris-student.nl/onderwijscatalogus/extern/cursus?cursuscode=2PDDSDMM&collegejaar=2025&taal=en

Description

Data Mining is the subfield of Data Science most strongly connected to the data modeling techniques that emerged from the classical field of Statistics. The purposes of data mining include, among others, data segmentation, data classification, and predictive and rule-based modeling. For most standard data mining techniques and algorithms, R and Python are the most suitable software platforms. The basic techniques within Data Mining are:
  1. Linear Regression, Logistic Regression, Clustering (hierarchical and non-hierarchical) and k-Nearest Neighbors
  2. Decision Trees and Random Forests as segmentation-with-a-purpose strategies
  3. Neural Networks for predictive classification and feature extraction, Support Vector Machines, Rule Induction Systems
  4. Deep Learning, Convolutional Algorithms, Bayesian Networks, Statistical Learning, Genetic Algorithms
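
As a small illustration of one technique from the first group, the sketch below classifies a point with k-Nearest Neighbors in plain Python. It is a minimal sketch, not course material: the function name `knn_predict`, the toy data, and the choice of Euclidean distance with k = 3 are all assumptions made for the example.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups in the plane.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

In practice a library implementation such as the one in scikit-learn would normally be used; the hand-written version only makes the distance-then-vote logic visible.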
From the many references, we mention the one by Alex Berson et al. and the references therein: An Overview of Data Mining Techniques, excerpted from the book Building Data Mining Applications for CRM by Alex Berson, Stephen Smith, and Kurt Thearling, 2005. http://weber.itn.liu.se/~jimjo94/courses/TNM048/documents/DM-Techniques.pdf
 
Relevant open source courses
Mining Massive Data Sets / Stanford Coursera & Digital & Book $58
Introduction to Information Retrieval / Stanford Digital & Book $56
OSDSM Specialization Web Scraping & Crawling
Machine Learning Ng Stanford / Coursera
A Course in Machine Learning UMD / Digital Book
The Elements of Statistical Learning / Stanford Digital & Book $80 & Study Group
Neural Networks Andrej Karpathy / Python Walkthrough
Neural Networks U Toronto / Coursera
 
Software Packages
scikit-learn - Tools for Data Mining & Analysis
Data Science in IPython Notebooks (Linear Regression, Logistic Regression, Random Forests, K-Means Clustering)

Method of Assessment

Assignment
Course period: 1/09/18 to 31/08/26
Course format: Course