Introduction to Data Mining

Chapter 2. Data Preparation and Preprocessing

2.1   Data Types and Forms

2.2   Data Preparation Methods
2.2.1   Data Normalization
2.2.2   Dealing with Temporal Data
2.2.3   Outlier Removal Techniques

2.3   Data Preprocessing Methods
2.3.1   Data Cleaning
2.3.1.1   Missing Data
2.3.1.2   Noisy Data
2.3.1.3   Inconsistent Data
2.3.2   Data Reduction
2.3.2.1   Dimension Reduction
2.3.2.1.1   Feature Selection
2.3.2.1.2   Feature Transformation
2.3.2.2   Instance Selection
2.3.2.3   Value Discretization
2.3.2.3.1   Binning
2.3.2.3.2   Entropy-Based
2.3.2.3.3   ChiMerge and Chi2

2.4   Chapter Quiz

2.5   Chapter Review Exercises
