Recent technological developments have made it possible to generate and store increasingly large datasets. The value of data, however, is determined not by its amount but by the ability to interpret and analyze it, and to base future policies and decisions on the outcome of that analysis. The amounts of data collected nowadays offer unprecedented opportunities to improve decision procedures for companies and governments, but they also pose great challenges. Many pre-existing data analysis tools did not scale up to the current data sizes. From this need, the research field of data mining emerged. In this chapter we position data mining with respect to other data analysis techniques and introduce the most important classes of techniques developed in the area: pattern mining, classification, and clustering and outlier detection. Related supporting techniques, such as pre-processing and database coupling, are also discussed.
|Title of host publication||Discrimination and Privacy in the Information Society: Effects of Data Mining and Profiling Large Databases|
|Editors||B.H.M. Custers, T.G.K. Calders, B.W. Schermer, T.Z. Zarsky|
|Place of Publication||Berlin|
|Publication status||Published - 2013|
|Name||Studies in Applied Philosophy, Epistemology and Rational Ethics|