URL study guide
https://tue.osiris-student.nl/onderwijscatalogus/extern/cursus?cursuscode=JBG060&collegejaar=2025&taal=en
Description
Students from outside the Bachelor Data Science who want to register for this course should reach out to program management via [email protected] for formal approval.
The objective of the Data Challenge courses is to teach students how to perform large-scale data-driven analyses themselves, combining technical skills acquired earlier with insights gained in methodological courses.
The focus of Data Challenge 3 is to take students through the entire life-cycle of a data analysis for public stakeholders, starting from a typical situation in which neither the analysis question is sufficiently clear nor the dataset sufficiently well understood. The data challenge has a societal, a methodological, and a technical aspect. The societal aspect is to answer analysis questions driven by public interests and to communicate findings to public stakeholders (governance or society). Methodologically, students must answer an analysis question that is initially only partially defined, using data that is large and complex: on the one hand it contains much more data than is needed to answer the question, and on the other hand it may lack data of sufficient quality to answer the original question. Technically, the data has a temporal dimension, which requires students to include more dimensions in the analysis, to conduct the analysis from multiple angles (with reduced dimensions), and to develop more dedicated visualizations to communicate the findings to stakeholders.
A fundamental element of the project is that students learn to resolve uncertainty about the aims, objectives, and feasibility of the analysis over multiple iterations. Students ultimately learn how to rescope and redefine a project so that, despite insufficient data and limited time, it still provides a meaningful answer to the stakeholder.
In the course, students work in groups using Scrum and follow the complete CRISP-DM lifecycle. They use data exploration and visualization to gain an understanding of the data and acquire domain knowledge, from which they derive clearly formulated research questions suitable for the stakeholder. They then develop and conduct an analysis that can handle the various dimensions of the data and the research question, and validate their findings both technically and through visualizations adequate for stakeholders.
Objectives
After taking this course students should be able to independently:
• recognize the phases of data analytics research and divide the research process into these phases
• familiarize themselves with techniques to understand a complex dataset through querying, visual analytics, and pre-processing from multiple angles
• rescope a given domain-specific analysis question into a well-defined research question based on stakeholder interests, properties of the datasets, independent research on relevant domain knowledge, and the limited time available
• iteratively develop a repeatable method for answering a chosen research question while making justifiable assumptions
• identify and familiarize themselves with qualitative and quantitative data analysis and visualization techniques to answer a chosen research question
• implement a repeatable data analysis according to the chosen method
• validate the results on their own and through external feedback, and qualitatively analyze the shortcomings of their method and how these can be overcome
• document their research in a presentation/report/poster/prototype suitable for expert and non-expert audiences
• reflect critically on the choices they make in a data analysis and on what those choices mean for stakeholders and citizens, and reflect on data analysis in the public interest