Feature Selection
What is Feature Selection
Characterize the problem with the smallest set of features.
Feature Selection Methods
Adding Features
New Features derived from existing features
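As a rough illustration of deriving a new feature from an existing one, the sketch below adds a boolean `in_state` flag computed from a `state` field. The records, the field names, and the `HOME_STATE` value are all hypothetical and are not taken from the original material.

```python
# Hypothetical customer records; the field names are illustrative only.
customers = [
    {"name": "Ana", "state": "CA"},
    {"name": "Ben", "state": "NY"},
]

HOME_STATE = "CA"  # assumed reference value for the derived feature


def add_in_state(record, home_state=HOME_STATE):
    """Return a copy of the record with a derived boolean 'in_state' feature."""
    derived = dict(record)  # copy so the original record is left untouched
    derived["in_state"] = record["state"] == home_state
    return derived


augmented = [add_in_state(c) for c in customers]
```

The derived feature carries no information that was not already in `state`, but it may expose that information in a form that is easier for a model to use.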
Removing Features
- Features that are very correlated
- Features with a lot of missing values
- Irrelevant features : ID, row number, etc.
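The three removal criteria above can be sketched in plain Python as follows. This is only one possible approach: the function name `prune_features`, the thresholds `max_missing` and `max_corr`, and the sample column names are assumptions made for illustration, and missing values are assumed to be stored as `None`.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def prune_features(columns, id_columns=(), max_missing=0.5, max_corr=0.95):
    """Drop ID columns, mostly-missing columns, and columns that are highly
    correlated with a column already kept.

    columns maps a feature name to a non-empty list of values (None = missing).
    """
    kept = {}
    for name, values in columns.items():
        if name in id_columns:
            continue  # irrelevant feature such as an ID or row number
        if sum(v is None for v in values) / len(values) > max_missing:
            continue  # too many missing values
        redundant = False
        for other in kept.values():
            # Compare only rows where both columns have a value.
            pairs = [(a, b) for a, b in zip(values, other)
                     if a is not None and b is not None]
            if len(pairs) >= 2:
                xs, ys = zip(*pairs)
                if abs(pearson(xs, ys)) > max_corr:
                    redundant = True
                    break
        if not redundant:
            kept[name] = values
    return kept


# Hypothetical data set used only to exercise the function.
columns = {
    "row_id": [1, 2, 3, 4],
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "height_in": [59.1, 63.0, 66.9, 70.9],
    "income": [None, None, None, 50.0],
    "age": [30, 22, 41, 35],
}
selected = prune_features(columns, id_columns=("row_id",))
```

Which of two correlated features is kept depends here purely on column order; in practice domain knowledge would usually guide that choice.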
Combining Features
Recoding Features
Examples
- Discretization: re-format continuous feature as discrete
- Customer’s age → {teenager, young, adult, senior}
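A minimal sketch of the age discretization above, using the four category labels from the example. The cut points in `AGE_BOUNDS` are assumptions chosen for illustration; the original does not specify where one category ends and the next begins.

```python
from bisect import bisect_right

# Assumed cut points: ages below 20 are "teenager", 20-29 "young",
# 30-59 "adult", and 60 or above "senior".
AGE_BOUNDS = [20, 30, 60]
AGE_LABELS = ["teenager", "young", "adult", "senior"]


def discretize_age(age):
    """Map a continuous age to one of the discrete labels."""
    # bisect_right counts how many bounds are <= age, which is the label index.
    return AGE_LABELS[bisect_right(AGE_BOUNDS, age)]


labels = [discretize_age(a) for a in (15, 20, 45, 72)]
```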
Breaking Up Features
Feature Selection Summary
- Goal: Select the smallest set of features that best captures the data for the application
- Domain Knowledge is important
- Also known as “Feature Engineering”
Feature Transformation
Feature transformation maps the values of a feature to a new set of values so that the data is in a more suitable form, or is easier to process, for downstream analysis.
1) Scaling
- Changing the range of values for a feature to another specified range
- Done to avoid allowing features with large values to dominate the analysis results
Scaling to a range
Scaling to a range maps all values of a feature to a specific range, such as between 0 and 1.
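One common way to do this is min-max scaling, x' = new_min + (x - min) / (max - min) * (new_max - new_min). The sketch below assumes that formula; the function name and the handling of a constant feature (every value maps to `new_min`) are illustrative choices, not something the original prescribes.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread to rescale; map everything to new_min.
        return [new_min for _ in values]
    span = hi - lo
    return [new_min + (v - lo) / span * (new_max - new_min) for v in values]


scaled = min_max_scale([10, 20, 30])
```

Because the result depends on the observed minimum and maximum, a single extreme outlier can compress all other values into a narrow band.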
Zero-Normalization / Standardization
Transform each feature so that its values have zero mean and unit standard deviation.
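This is usually computed as z = (x - mean) / std. The sketch below assumes the population standard deviation (`statistics.pstdev`); using the sample standard deviation instead is an equally reasonable choice, and the function name is illustrative.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Return z-scores: subtract the mean, then divide by the population standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature has no variation; every z-score is defined as 0 here.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


z_scores = standardize([2, 4, 4, 4, 5, 5, 7, 9])
```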
2) Filtering
A low-pass filter removes components above a certain frequency, allowing the rest to pass through unaltered.
Remove grainy appearance in images
Filter noise from audio signal
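As a simple illustration of noise filtering, the sketch below applies a moving average to a one-dimensional signal. A moving average is only a crude low-pass filter: it attenuates rapid fluctuations rather than cutting off sharply at a given frequency, so it approximates the ideal behavior described above. The function name and the sample signal are assumptions made for illustration.

```python
def moving_average(signal, window=3):
    """Smooth a signal by averaging each run of `window` consecutive samples.

    The output has len(signal) - window + 1 samples.
    """
    if window < 1 or window > len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    return [
        sum(signal[i:i + window]) / window
        for i in range(len(signal) - window + 1)
    ]


# A rapidly alternating (high-frequency) signal is flattened completely.
noisy = [1, 5, 1, 5, 1, 5]
smoothed = moving_average(noisy, window=2)

# A constant (zero-frequency) signal passes through unchanged.
steady = moving_average([4, 4, 4, 4], window=2)
```

A larger window smooths more aggressively, which is where the risk of filtering out meaningful detail comes from.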
3) Aggregation
Combines values for a feature in order to summarize the data or to reduce variability
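For example, individual transactions might be summarized as a mean per group. The sketch below is a minimal version of that idea; the `store` and `amount` field names and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def aggregate_mean(rows, key, value):
    """Group rows by `key` and return the mean of `value` within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {group: fmean(vals) for group, vals in groups.items()}


# Hypothetical per-transaction records.
sales = [
    {"store": "A", "amount": 10},
    {"store": "A", "amount": 20},
    {"store": "B", "amount": 5},
]
mean_by_store = aggregate_mean(sales, key="store", value="amount")
```

The same pattern applies to aggregating over time, for instance replacing daily values with weekly averages to reduce day-to-day variability.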
Feature Transformation Summary
- What: Map feature values to new set of values
- Why: Have data in format suitable for analysis
- Caveat: Take care not to filter out important characteristics of data