Explaining 8 AI/ML Terms for Those New to the Field
Classification: The task of assigning data points to predefined categories, using a model trained on labeled examples. For example, sorting emails into 'spam' and 'not spam' categories.
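As a minimal sketch of classification, the toy example below labels an email by finding the closest class average (a nearest-centroid classifier). The two features (number of links, number of exclamation marks) and all values are made up for illustration, and nearest-centroid is only one of many possible classification methods.

```python
import math

# Hypothetical labeled training data: each email is described by
# (number of links, number of exclamation marks). Values are invented.
training_data = {
    "spam": [(8, 5), (6, 7), (9, 6)],
    "not spam": [(1, 0), (0, 1), (2, 1)],
}

def centroid(points):
    """Average each feature over all points of one class."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

# "Training" here just means computing one average point per category.
centroids = {label: centroid(pts) for label, pts in training_data.items()}

def classify(email_features):
    """Return the category whose centroid is closest to the new email."""
    return min(centroids, key=lambda label: math.dist(email_features, centroids[label]))

print(classify((7, 6)))  # an email with many links and exclamation marks
print(classify((1, 1)))  # an email with few of either
```

The key point is that the output is one of a fixed set of categories, not a number.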
Regression: A method for predicting continuous numerical outcomes from input data. It models the relationship between inputs and output, like predicting house prices based on features such as size.
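To make the house-price idea concrete, here is a small sketch that fits a straight line with ordinary least squares. The sizes and prices are invented numbers, and a single-feature linear model is an assumption chosen for simplicity; real regression models often use many features.

```python
# Hypothetical data: house size (square meters) and price (in thousands).
sizes = [50, 80, 100, 120]
prices = [170, 260, 320, 380]

mean_size = sum(sizes) / len(sizes)
mean_price = sum(prices) / len(prices)

# Ordinary least squares for one feature:
# slope = covariance(size, price) / variance(size)
slope = sum((x - mean_size) * (y - mean_price) for x, y in zip(sizes, prices)) / sum(
    (x - mean_size) ** 2 for x in sizes
)
intercept = mean_price - slope * mean_size

def predict_price(size):
    """Predict a numerical price for a house of the given size."""
    return slope * size + intercept

print(predict_price(90))
```

Unlike classification, the prediction is a number on a continuous scale.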
Underfitting: When a model is too simple, or not trained enough, to capture the main trends in the data. It performs poorly on both the training data and new data, and is often seen when simple assumptions are applied to complex data.
Overfitting: Occurs when a model learns the noise and irrelevant details of the training data along with the real pattern. The model becomes too tailored to the training data, so it scores well there but performs poorly on new data.
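The contrast between underfitting and overfitting can be sketched by comparing training error with error on unseen data. Everything below is an illustrative assumption: the data follows y = 2x with hand-written noise, the "underfit" model always predicts the average, and the "overfit" model simply memorizes the nearest training point.

```python
# Noisy training data around the assumed true relation y = 2x.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [3, 3, 7, 7, 11, 11]  # 2x plus noise of +1 or -1

# Unseen data (kept noise-free for simplicity).
test_x = [1.4, 2.4, 3.4, 4.4, 5.4]
test_y = [2 * x for x in test_x]

def mse(model, xs, ys):
    """Mean squared error of a model over a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

mean_x = sum(train_x) / len(train_x)
mean_y = sum(train_y) / len(train_y)

def constant_model(x):
    """Underfit: ignores the input and always predicts the average."""
    return mean_y

def memorizing_model(x):
    """Overfit: copies the target of the nearest training point, noise included."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

# Reasonable fit: a least-squares straight line.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y)) / sum(
    (x - mean_x) ** 2 for x in train_x
)
intercept = mean_y - slope * mean_x

def linear_model(x):
    return slope * x + intercept

for name, model in [
    ("underfit (constant)", constant_model),
    ("overfit (memorizer)", memorizing_model),
    ("linear fit", linear_model),
]:
    train_error = mse(model, train_x, train_y)
    test_error = mse(model, test_x, test_y)
    print(f"{name}: train MSE = {train_error:.2f}, test MSE = {test_error:.2f}")
```

In this toy setup the constant model has high error everywhere, while the memorizer has zero training error but a worse test error than the straight line.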
Loss function and Cost function: A loss function measures the error between the actual and predicted value for one data point. A cost function aggregates those errors across the entire dataset, typically as an average. Both measure how far a model's predictions are from the truth, and the two terms are sometimes used interchangeably.
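A short sketch of the distinction, assuming squared error as the loss and its average (mean squared error) as the cost. Other choices of loss exist; the values below are made up.

```python
def squared_error_loss(actual, predicted):
    """Loss: the error for a single data point."""
    return (actual - predicted) ** 2

def mean_squared_error_cost(actuals, predictions):
    """Cost: the average loss over the whole dataset."""
    losses = [squared_error_loss(a, p) for a, p in zip(actuals, predictions)]
    return sum(losses) / len(losses)

# Hypothetical actual and predicted values.
actuals = [3.0, 5.0, 2.0]
predictions = [2.5, 5.0, 4.0]

per_point_losses = [squared_error_loss(a, p) for a, p in zip(actuals, predictions)]
cost = mean_squared_error_cost(actuals, predictions)

print(per_point_losses)  # one loss value per data point
print(cost)              # one cost value for the dataset
```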
Neural Network: A type of ML model loosely inspired by the human brain's structure. It passes data through layers of interconnected nodes, each combining its inputs with learned weights, which lets it perform complex tasks.
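To illustrate what "interconnected layers" means, the sketch below runs one input through a tiny network with a single hidden layer. The weights are hand-picked, hypothetical numbers (a real network learns them), and the ReLU and sigmoid activation functions are assumptions chosen because they are common.

```python
import math

def relu(value):
    """Common activation: pass positive values through, zero out negatives."""
    return max(0.0, value)

def sigmoid(value):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))

# Hypothetical weights and biases for 2 inputs -> 2 hidden nodes -> 1 output.
hidden_weights = [[0.5, -0.2], [0.3, 0.8]]  # one row per hidden node
hidden_biases = [0.1, -0.1]
output_weights = [1.0, -1.5]
output_bias = 0.2

def forward(inputs):
    """Pass the inputs through the hidden layer, then the output node."""
    hidden = [
        relu(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

output = forward([1.0, 2.0])
print(output)  # a value between 0 and 1
```

Each node computes a weighted sum of the previous layer's outputs and applies an activation function; stacking such layers is what gives the network its flexibility.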
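Parameters and Hyperparameters: Parameters are values, such as weights, that a model learns from training data and uses to make predictions. Hyperparameters are settings, such as the learning rate or number of layers, that control the learning process. They are set externally before training rather than learned from the data, and tuning them influences a model's performance.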
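The difference can be seen in a small gradient-descent sketch. Here `learning_rate` and `epochs` are hyperparameters chosen by hand, while `weight` is a parameter learned from the data. The model y = weight * x and all numbers are assumptions made for illustration.

```python
# Hypothetical training data following y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# Hyperparameters: set externally, before training starts.
learning_rate = 0.05
epochs = 200

# Parameter: starts at an arbitrary value and is learned from the data.
weight = 0.0

for _ in range(epochs):
    # Gradient of the mean squared error with respect to the weight.
    gradient = sum(2 * (weight * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    weight -= learning_rate * gradient

print(weight)  # approaches 2.0, the slope hidden in the data
```

Changing `learning_rate` or `epochs` changes how (and whether) training converges, but neither value is learned by the model itself.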
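Test Data: Data held out from training and validation, used to evaluate the final trained model. Like training and validation data, test data has labels, but the model never sees them during training, so its performance on the test set estimates how it will do on unseen, real-world data.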
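A minimal sketch of how data is commonly split and how the test set is used. The dataset, the 6/2/2 split, and the simple threshold "model" are all hypothetical. Note that the test examples carry labels, which is what makes it possible to score the predictions.

```python
# Hypothetical labeled dataset, already in random order: (feature, label).
dataset = [(2, 0), (7, 1), (4, 0), (9, 1), (1, 0),
           (6, 1), (3, 0), (8, 1), (0, 0), (5, 1)]

# Split once: training for learning, validation for tuning, test for the final check.
train_set = dataset[:6]
validation_set = dataset[6:8]
test_set = dataset[8:]

# "Train" a toy model: put a threshold halfway between the two classes,
# using the training set only.
largest_negative = max(x for x, label in train_set if label == 0)
smallest_positive = min(x for x, label in train_set if label == 1)
threshold = (largest_negative + smallest_positive) / 2

def predict(feature):
    return 1 if feature >= threshold else 0

def accuracy(examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for x, label in examples if predict(x) == label)
    return correct / len(examples)

print(accuracy(validation_set))  # used while tuning the model
print(accuracy(test_set))        # reported once, for the final model
```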