Building a Model
- Creating a model is an iterative process:
- Gather the data needed
- Selection of features - collect past observations (houses for sale) and targets (sale prices)
- Choice of algorithm - template for relationship between the input and the output
- Selection of values for hyperparameters (settings fixed before training; the dials that control how the model learns)
- Selection of Loss (cost) function
- Train the model using past data collected
- Evaluate the performance
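The steps above can be sketched end to end in a few lines of plain Python. This is only an illustration under stated assumptions: the house data is synthetic (the relationship price = 3 * size + 50 plus noise is made up), the algorithm is a single-feature linear model, and the learning rate and epoch count are arbitrary hyperparameter choices, not recommended values.

```python
import random

# 1. Gather the data (synthetic here; the true relationship is an assumption).
random.seed(0)
sizes = [random.uniform(50, 200) for _ in range(200)]            # feature: size in m^2
prices = [3.0 * s + 50.0 + random.gauss(0, 10) for s in sizes]   # target: sale price

# 2. Features and targets: hold out the last 20% for evaluation,
#    then rescale the feature using training statistics only.
split = int(0.8 * len(sizes))
train_sizes, test_sizes = sizes[:split], sizes[split:]
y_train, y_test = prices[:split], prices[split:]
mean_size = sum(train_sizes) / len(train_sizes)
std_size = (sum((s - mean_size) ** 2 for s in train_sizes) / len(train_sizes)) ** 0.5
x_train = [(s - mean_size) / std_size for s in train_sizes]
x_test = [(s - mean_size) / std_size for s in test_sizes]

# 3. Algorithm: a straight line, y_hat = w * x + b.
def predict(w, b, xs):
    return [w * xi + b for xi in xs]

# 4. Hyperparameters (illustrative values, chosen before training).
LEARNING_RATE = 0.1
EPOCHS = 200

# 5. Loss function: mean squared error.
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# 6. Train on past data with batch gradient descent on the MSE.
w, b = 0.0, 0.0
n = len(x_train)
for _ in range(EPOCHS):
    preds = predict(w, b, x_train)
    grad_w = sum(2 * (p - t) * xi for p, t, xi in zip(preds, y_train, x_train)) / n
    grad_b = sum(2 * (p - t) for p, t in zip(preds, y_train)) / n
    w -= LEARNING_RATE * grad_w
    b -= LEARNING_RATE * grad_b

# 7. Evaluate on held-out data, against a "predict the average price" baseline.
test_mse = mse(y_test, predict(w, b, x_test))
baseline_mse = mse(y_test, [sum(y_train) / n] * len(y_test))
print(f"test MSE: {test_mse:.1f}  baseline MSE: {baseline_mse:.1f}")
```

If the evaluation is unsatisfactory, the loop starts over: revisit the features, the algorithm, or the hyperparameters and train again.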
Feature Selection
- Features - characteristics of the input data
- Important to define features at the intersection of:
- Factors that might influence the solution (year the house was built, city, neighborhood)
- Data that we are actually able to collect
- Methods of selection:
- Speak with experts - domain expertise
- Collect data - visualization
- Past data collection - statistical correlation
- Collect as much as you can, then narrow down the data - modeling
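As a rough illustration of the statistical-correlation method, the sketch below ranks candidate features by the absolute Pearson correlation with the target. The data is synthetic and the feature names (`size_m2`, `year_built`, `door_number`) are hypothetical; correlation only captures linear, one-feature-at-a-time relationships, so it is a first filter rather than a final answer.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    return cov / (spread_x * spread_y)

# Synthetic past observations (assumed relationship, for illustration only).
random.seed(1)
n_houses = 300
size_m2 = [random.uniform(50, 200) for _ in range(n_houses)]
year_built = [random.randint(1950, 2020) for _ in range(n_houses)]
door_number = [random.randint(1, 99) for _ in range(n_houses)]  # irrelevant on purpose
price = [
    3.0 * s + 3.0 * (y - 1950) + random.gauss(0, 20)
    for s, y in zip(size_m2, year_built)
]

features = {
    "size_m2": size_m2,
    "year_built": year_built,
    "door_number": door_number,
}

# Rank features by strength of linear association with the target.
correlations = {name: pearson(values, price) for name, values in features.items()}
ranking = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)

for name in ranking:
    print(f"{name:12s} r = {correlations[name]:+.2f}")
```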
Algorithm Selection
- Dependent on the task that needs to be solved
- Best approach - train several candidate algorithms and compare their results
- Criteria for algorithm selection:
- Accuracy - expected predictive performance of the model
- Interpretability - how easy or hard it is to understand and explain how the model arrives at its output
- Computational Efficiency - computing resources (time, memory) needed to train the model and generate results
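The "train several candidates and compare" approach might look like the sketch below. It assumes synthetic single-feature data and three hand-written candidates (a mean baseline, least-squares linear regression, and 5-nearest-neighbors); it measures accuracy as validation MSE and computational cost as wall-clock time. Interpretability is not computed here, since it is a judgment call: a slope and an intercept are easy to explain, a nearest-neighbor lookup less so.

```python
import random
import time

def fit_mean(xs, ys):
    """Baseline: always predict the average training target."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Closed-form least squares for one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_knn(xs, ys, k=5):
    """k-nearest neighbors: average the targets of the k closest training points."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(p[1] for p in nearest) / k
    return predict

# Synthetic data (assumed relationship), split into training and validation sets.
random.seed(7)
sizes = [random.uniform(50, 200) for _ in range(300)]
prices = [3.0 * s + 50.0 + random.gauss(0, 10) for s in sizes]
x_train, y_train = sizes[:240], prices[:240]
x_val, y_val = sizes[240:], prices[240:]

candidates = {
    "mean_baseline": fit_mean,
    "linear_regression": fit_linear,
    "knn_k5": fit_knn,
}

results = {}
for name, fit in candidates.items():
    start = time.perf_counter()
    model = fit(x_train, y_train)
    preds = [model(x) for x in x_val]
    elapsed = time.perf_counter() - start
    mse = sum((t - p) ** 2 for t, p in zip(y_val, preds)) / len(y_val)
    results[name] = {"mse": mse, "seconds": elapsed}

best = min(results, key=lambda name: results[name]["mse"])
for name, r in results.items():
    print(f"{name:18s} MSE = {r['mse']:10.1f}  time = {r['seconds']:.4f}s")
print("lowest validation MSE:", best)
```

The lowest validation error is only one of the three criteria; a slightly less accurate model may still be preferred if it is easier to explain or cheaper to run.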
Model Complexity
- Depends on:
- Number of Features
- Algorithm - linear regression vs neural networks
- Hyperparameter Values
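One rough proxy for complexity is the number of learnable parameters. The sketch below counts them for a linear regression and for a fully connected neural network; it shows how all three factors above move the count. The helper names and layer sizes are hypothetical, and parameter count is only one of several ways to gauge complexity.

```python
def linear_regression_params(n_features):
    """One weight per feature plus one intercept."""
    return n_features + 1

def mlp_params(layer_sizes):
    """Weights plus biases for each pair of consecutive fully connected layers.

    layer_sizes lists the width of every layer, input first, output last.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

n_features = 10
print(linear_regression_params(n_features))   # 11
print(mlp_params([n_features, 32, 1]))        # 385
print(mlp_params([n_features, 64, 64, 1]))    # 4929
```

Adding features grows both models, switching from linear regression to a neural network grows the count by orders of magnitude, and the hidden-layer widths (hyperparameters) control how large the network becomes.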
Bias-Variance Trade-off
- Bias - error from modeling a complex problem with too simple a model; the model is incapable of fully capturing the underlying patterns in the data
- Variance - sensitivity of the model to small fluctuations in the training data; ex: interpreting noise as actual patterns
- Simpler model = higher bias; lower variance
- Complex model = lower bias; higher variance
- Total Error = Bias^2 + Variance + Irreducible Error (noise)
- Need to be careful to avoid both underfitting (bias too high) and overfitting (variance too high)
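The trade-off can be observed with a small simulation. The sketch below is built on assumptions: the true function (x squared), the noise level, and the two models are chosen purely for illustration. It refits a very simple model (predict the mean) and a very flexible one (1-nearest-neighbor) on many freshly sampled training sets, then estimates squared bias and variance of their predictions at fixed test points.

```python
import random

random.seed(42)
NOISE_STD = 0.3  # standard deviation of the irreducible noise (assumed)

def true_f(x):
    """The underlying pattern the models try to learn (assumed for the demo)."""
    return x ** 2

def sample_training_set(n=30):
    xs = [random.random() for _ in range(n)]
    ys = [true_f(x) + random.gauss(0, NOISE_STD) for x in xs]
    return xs, ys

def fit_constant(xs, ys):
    """Simple model: ignores x and predicts the average target."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_1nn(xs, ys):
    """Flexible model: copies the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def bias2_and_variance(fit, test_xs, trials=200):
    """Estimate squared bias and variance, averaged over the test points."""
    preds = [[] for _ in test_xs]
    for _ in range(trials):
        model = fit(*sample_training_set())
        for i, x in enumerate(test_xs):
            preds[i].append(model(x))
    bias2 = variance = 0.0
    for x, p in zip(test_xs, preds):
        mean_p = sum(p) / len(p)
        bias2 += (mean_p - true_f(x)) ** 2
        variance += sum((v - mean_p) ** 2 for v in p) / len(p)
    return bias2 / len(test_xs), variance / len(test_xs)

test_xs = [0.05 + 0.1 * i for i in range(10)]
simple_bias2, simple_var = bias2_and_variance(fit_constant, test_xs)
complex_bias2, complex_var = bias2_and_variance(fit_1nn, test_xs)

for name, b2, var in [
    ("constant (simple)", simple_bias2, simple_var),
    ("1-NN (complex)", complex_bias2, complex_var),
]:
    total = b2 + var + NOISE_STD ** 2  # bias^2 + variance + irreducible error
    print(f"{name:18s} bias^2 = {b2:.4f}  variance = {var:.4f}  total = {total:.4f}")
```

In this setup the constant model should show the larger squared bias (underfitting) and the 1-NN model the larger variance (overfitting), while the noise term is the same for both and cannot be reduced by any model choice.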