Thursday, November 2, 2023

Building a Machine Learning Model

 Building a Model

  • Creating a model is an iterative process:
    • Gather the data needed
    • Selection of features - collect past observations (houses for sale) and targets (sale prices)
    • Choice of algorithm - template for relationship between the input and the output
    • Selection of values for hyperparameters (tuning the dials of the model so that it performs well)
    • Selection of Loss (cost) function
    • Train the model using past data collected
    • Evaluate the performance
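
The steps above can be sketched end to end in a few lines of plain Python. This is only an illustration under stated assumptions: the house data is synthetic, the "true" price relationship, the learning rate, and the epoch count are all made up, and the algorithm is a one-feature linear model trained with gradient descent on a mean squared error loss.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# 1. Gather data: synthetic (size in square meters, sale price) pairs.
#    The relationship price = 3 * size + 50 plus noise is an assumption.
sizes = [random.uniform(50, 200) for _ in range(200)]
prices = [3.0 * s + 50.0 + random.gauss(0, 10) for s in sizes]

# 2. Features and targets: one feature (size), scaled to at most 1.
X = [s / 200.0 for s in sizes]
y = prices

# Hold out the last 20% of the observations for evaluation.
split = int(0.8 * len(X))
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

# 3. Algorithm: a linear template, prediction = w * x + b.
def predict(w, b, x):
    return w * x + b

# 4. Hyperparameters: picked by hand here, normally tuned.
learning_rate = 0.5
epochs = 2000

# 5. Loss function: mean squared error.
def mse(w, b, xs, ys):
    return sum((predict(w, b, x) - t) ** 2 for x, t in zip(xs, ys)) / len(xs)

# 6. Train: batch gradient descent on the training set.
w, b = 0.0, 0.0
n = len(X_train)
for _ in range(epochs):
    errors = [predict(w, b, x) - t for x, t in zip(X_train, y_train)]
    grad_w = sum(2 * e * x for e, x in zip(errors, X_train)) / n
    grad_b = sum(2 * e for e in errors) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

# 7. Evaluate on data the model has not seen during training.
train_loss = mse(w, b, X_train, y_train)
test_loss = mse(w, b, X_test, y_test)
print(f"w={w:.1f}, b={b:.1f}, train MSE={train_loss:.1f}, test MSE={test_loss:.1f}")
```

If the test error is much worse than the training error, or both are poor, that is the cue to go back to an earlier step (more data, different features, another algorithm, other hyperparameter values).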

Feature Selection

  • Features - characteristics of the input data
  • Important to define the features - a combination of factors that:
    • Might influence the solution (year the house was built, city, neighborhood)
    • We can realistically collect data for
  • Methods of selection:
    • Speak with experts - domain expertise
    • Collect data - visualization
    • Past data collection - statistical correlation
    • Collect as much as you can, but narrow down the data - Modeling
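
As a rough illustration of the statistical correlation method, the sketch below ranks candidate features by the absolute value of their Pearson correlation with the target. The feature names and every number in it are invented for the example, and correlation only captures linear relationships, so it is a first filter rather than a final answer.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up past observations; all values are illustrative only.
features = {
    "size_sqm":    [70, 85, 100, 120, 150, 180],
    "year_built":  [1975, 1990, 1984, 2005, 2012, 2019],
    "door_number": [12, 3, 44, 7, 29, 15],
}
sale_price = [210, 250, 290, 350, 430, 520]  # in thousands

# Rank features by the strength of their linear relationship to the target.
ranked = sorted(
    ((name, pearson(values, sale_price)) for name, values in features.items()),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name:12s} r = {r:+.2f}")
```

Features near the bottom of the ranking are candidates for dropping, ideally after checking with a domain expert that they really are irrelevant.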

Algorithm Selection

  • Dependent on the task that needs to be solved
  • Best approach - train several different algorithms and compare them
  • Criteria for algorithm selection:
    • Performance Accuracy - expected performance of the model
    • Interpretability - how easy/hard it is to interpret and predict the result of the model computation
    • Computational Efficiency - the computing power and time needed to train the model and generate results
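
One way to picture "train several algorithms and compare" is the sketch below. It assumes synthetic one-feature data with an invented linear trend, and three hand-written candidates (a mean baseline, least squares, and 1-nearest-neighbor) chosen only for illustration. Each is scored on held-out validation data and timed.

```python
import random
import time

random.seed(1)

# Synthetic data: a made-up linear trend plus noise.
xs = [random.uniform(0, 10) for _ in range(300)]
ys = [2.0 * x + 1.0 + random.gauss(0, 1) for x in xs]
train = list(zip(xs[:200], ys[:200]))
valid = list(zip(xs[200:], ys[200:]))

def fit_mean(data):
    """Baseline: always predict the average target."""
    avg = sum(t for _, t in data) / len(data)
    return lambda x: avg

def fit_linear(data):
    """Ordinary least squares for one feature (closed form)."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(t for _, t in data) / n
    sxy = sum((x - mx) * (t - my) for x, t in data)
    sxx = sum((x - mx) ** 2 for x, _ in data)
    slope = sxy / sxx
    return lambda x: slope * x + (my - slope * mx)

def fit_nearest(data):
    """1-nearest-neighbor: copy the target of the closest training point."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def mse(model, data):
    return sum((model(x) - t) ** 2 for x, t in data) / len(data)

results = {}
for name, fit in [("mean", fit_mean), ("linear", fit_linear), ("nearest", fit_nearest)]:
    start = time.perf_counter()
    model = fit(train)
    error = mse(model, valid)  # accuracy criterion
    elapsed = time.perf_counter() - start  # efficiency criterion
    results[name] = error
    print(f"{name:8s} validation MSE = {error:7.2f}  time = {elapsed * 1000:.2f} ms")

best = min(results, key=results.get)
print("lowest validation error:", best)
```

Interpretability is not measured here, but it differs too: the linear model is summarized by a slope and an intercept, while the nearest-neighbor model has no compact description.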

Model Complexity

Depends on:
  • Number of Features
  • Algorithm - linear regression vs neural networks
  • Hyperparameter Values
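
A simple proxy for complexity is the number of trainable parameters. The sketch below counts them for linear regression and for a small fully connected neural network. The hidden layer widths (32 and 16) are hypothetical hyperparameter values chosen for illustration, and parameter count is only one of several possible complexity measures.

```python
def linear_regression_params(n_features):
    """One weight per feature plus one intercept."""
    return n_features + 1

def dense_network_params(layer_sizes):
    """Weights and biases of a fully connected network.

    layer_sizes lists the width of every layer, input first, output last.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += (n_in + 1) * n_out  # n_in weights + 1 bias per output unit
    return total

for n_features in (5, 50):
    linear = linear_regression_params(n_features)
    # Hidden widths 32 and 16 are assumed values, not a recommendation.
    network = dense_network_params([n_features, 32, 16, 1])
    print(f"{n_features:3d} features: linear = {linear:4d} params, "
          f"network = {network:5d} params")
```

All three factors show up in the count: more features, a more flexible algorithm, or larger hyperparameter values (wider or deeper layers) each increase it.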

Bias-Variance Tradeoff

  • Bias - modeling a complex problem using a simple model; the model is incapable of fully capturing the depth and underlying patterns in the data.
  • Variance - sensitivity of the model to small fluctuations in the data; ex: interpreting noise as actual patterns
Typically:
  • Simpler model = higher bias; lower variance
  • Complex model = lower bias; higher variance
  • Total Error = Bias² + Variance + Irreducible Error (noise variance, σ²)
  • Be careful to avoid both underfitting (too much bias) and overfitting (too much variance)
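
The tradeoff can be estimated empirically by training the same kind of model on many resampled training sets and looking at how its predictions behave at fixed test points. The sketch below does this under stated assumptions: the true function (a sine wave), the noise level, the sample sizes, and the two models (a constant mean as the simple model, 1-nearest-neighbor as the complex one) are all chosen just for illustration.

```python
import math
import random

random.seed(2)

def true_f(x):
    return math.sin(2 * math.pi * x)  # assumed ground truth

NOISE_SD = 0.3   # assumed noise level
N_TRAIN = 20     # points per training set
N_TRIALS = 300   # number of resampled training sets
test_xs = [i / 20 + 0.025 for i in range(20)]

def sample_training_set():
    xs = [random.random() for _ in range(N_TRAIN)]
    return [(x, true_f(x) + random.gauss(0, NOISE_SD)) for x in xs]

def fit_mean(data):
    avg = sum(t for _, t in data) / len(data)
    return lambda x: avg

def fit_nearest(data):
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def bias_variance(fit):
    # predictions[i][j]: model trained on set i, evaluated at test point j
    predictions = []
    for _ in range(N_TRIALS):
        model = fit(sample_training_set())
        predictions.append([model(x) for x in test_xs])
    bias_sq = 0.0
    variance = 0.0
    for j, x in enumerate(test_xs):
        column = [row[j] for row in predictions]
        avg_pred = sum(column) / N_TRIALS
        bias_sq += (avg_pred - true_f(x)) ** 2
        variance += sum((p - avg_pred) ** 2 for p in column) / N_TRIALS
    return bias_sq / len(test_xs), variance / len(test_xs)

results = {}
for name, fit in [("mean (simple)", fit_mean), ("1-NN (complex)", fit_nearest)]:
    b2, var = bias_variance(fit)
    results[name] = (b2, var)
    total = b2 + var + NOISE_SD ** 2  # Bias^2 + Variance + noise variance
    print(f"{name:15s} bias^2 = {b2:.3f}  variance = {var:.3f}  "
          f"expected total error = {total:.3f}")
```

In this setup the constant model underfits (its bias term dominates), while the nearest-neighbor model follows the noise in each training set (its variance term is the larger of the two).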
