Topic: Machine learning (ML fundamentals)

Q1: What is AUC (Area under the ROC Curve)?

A: A number between 0.0 and 1.0 representing a binary classification model's ability to separate positive classes from negative classes.
B: A metric used to evaluate the performance of a classification model by calculating the area under the ROC curve.
C: A measure of the model's ability to correctly classify both positive and negative instances across different classification thresholds.
D: A statistical measure that quantifies the overall performance of a binary classification model across all possible classification thresholds, considering both true positive and false positive rates.
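
Illustrative sketch (not one of the answer choices): one common way to read AUC is as the fraction of (positive, negative) example pairs in which the positive example receives the higher score, with ties counted as half. The function name `auc_by_ranking` and the sample labels and scores below are made up for illustration; this is a minimal pure-Python sketch under that pairwise interpretation, not a reference implementation.

```python
def auc_by_ranking(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive class, 0 = negative class.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auc_by_ranking(labels, scores)
print(auc)  # 3 of the 4 pairs are ranked correctly, so this prints 0.75
```

A model that ranks every positive above every negative scores 1.0 under this definition, and one that assigns the same score to everything scores 0.5.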

Q2: What is overfitting?

A: A state where the model's performance on the training data significantly improves while its performance on unseen data degrades.
B: A situation where the model learns the training data too well, capturing noise and irrelevant details, leading to poor generalization.
C: Creating a model that matches the training data so closely that the model fails to make correct predictions on new data.
D: A phenomenon where the model's performance on the training data plateaus, indicating that further training will not lead to significant improvements.

Q3: What is an activation function?

A: A function that takes an input value and produces an output value within a specific range, typically between 0 and 1.
B: A function that introduces non-linearity into the output of a neuron in a neural network.
C: A mathematical function that maps any input value to a value between 0 and 1, often used to represent probabilities.
D: A function that scales the input values to a specific range, typically between -1 and 1, to improve model performance.
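
Illustrative sketch (not one of the answer choices): three widely used activation functions, written in plain Python so their output ranges can be compared directly. The helper names `relu` and `sigmoid` and the sample inputs are assumptions chosen for this example; `math.tanh` is the standard-library hyperbolic tangent.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive inputs through, clips negatives to 0."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid: squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical sample inputs to a single neuron.
for x in [-2.0, 0.0, 3.0]:
    print(x, relu(x), sigmoid(x), math.tanh(x))
```

Note that the three functions have different output ranges: ReLU is unbounded above, the sigmoid stays within (0, 1), and tanh stays within (-1, 1).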

Q4: What is inference?

A: The process of making predictions by applying a trained model to unlabeled examples.
B: The process of determining the ideal parameters of a model by feeding it labeled data.
C: The process of evaluating a trained model's performance on a separate dataset to assess its generalization ability.
D: In machine learning, the process of making predictions by applying a trained model to unlabeled examples.

Q5: What is L1 regularization?

A: A type of regularization that penalizes weights in proportion to the sum of the absolute values of the weights.
B: A type of regularization that penalizes weights in proportion to the sum of the squares of the weights.
C: A type of regularization that randomly drops out neurons during training to prevent overfitting.
D: A type of regularization that stops training early when the model's performance on a validation set starts to decrease.
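
Illustrative sketch (not one of the answer choices): two of the options above describe weight penalties that can be written as one-line formulas. The sketch below computes an absolute-value penalty and a squared penalty for the same weights so the two can be compared. The function names, the regularization rate `lam`, and the weight values are all hypothetical.

```python
def abs_penalty(weights, lam):
    """Penalty proportional to the sum of the absolute values of the weights."""
    return lam * sum(abs(w) for w in weights)


def squared_penalty(weights, lam):
    """Penalty proportional to the sum of the squares of the weights."""
    return lam * sum(w * w for w in weights)


# Hypothetical weights and regularization rate.
weights = [0.5, -2.0, 0.0, 1.5]
lam = 0.1

print(abs_penalty(weights, lam))      # 0.1 * (0.5 + 2.0 + 0.0 + 1.5)
print(squared_penalty(weights, lam))  # 0.1 * (0.25 + 4.0 + 0.0 + 2.25)
```

In training, a penalty like this is typically added to the loss, so larger weights make the total objective worse.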

Q6: What is a classification model?

A: A model that predicts a continuous output variable, such as price or temperature.
B: A model that predicts the probability of an event occurring.
C: A model whose prediction is a class or category.
D: A model that groups similar data points together based on their characteristics.

Q7: What is a confusion matrix?

A: A table that summarizes the performance of a classification model by showing the counts of true positives, true negatives, false positives, and false negatives.
B: An NxN table that summarizes how many correct and incorrect predictions a classification model made.
C: A visualization tool that plots the true positive rate against the false positive rate for different classification thresholds.
D: A graph that shows the relationship between precision and recall for different classification thresholds.
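
Illustrative sketch (not one of the answer choices): for a binary classifier, the four cells of a confusion matrix can be tallied by comparing each actual label with the predicted label. The function name `confusion_counts` and the toy labels are made up for this example; 1 is treated as the positive class and 0 as the negative class.

```python
def confusion_counts(y_true, y_pred):
    """Tally true/false positives and negatives for binary labels (1 = positive)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["TP"] += 1   # predicted positive, actually positive
        elif actual == 0 and predicted == 1:
            counts["FP"] += 1   # predicted positive, actually negative
        elif actual == 1 and predicted == 0:
            counts["FN"] += 1   # predicted negative, actually positive
        else:
            counts["TN"] += 1   # predicted negative, actually negative
    return counts


# Hypothetical ground-truth labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)

# Print as a 2x2 table: rows are actual classes, columns are predicted classes.
print("            pred=1  pred=0")
print(f"actual=1    {counts['TP']:6d}  {counts['FN']:6d}")
print(f"actual=0    {counts['FP']:6d}  {counts['TN']:6d}")
```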

Q8: What is feature engineering?

A: The process of transforming raw data into a format suitable for training a machine learning model.
B: The process of selecting the most relevant features from a dataset to improve model performance.
C: The process of creating new features from existing ones to enhance model accuracy.
D: Determining which features might be useful in training a model and converting raw data from the dataset into efficient versions of those features.

Q9: What is generalization?

Q10: What is loss?

A: A measure of how well a model's predictions match the actual values in the training data.
B: A function that calculates the difference between predicted and actual values.
C: A measure of how far a model's predictions are from their labels during training.
D: A metric that evaluates the overall performance of a model on a test dataset.
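
Illustrative sketch (not one of the answer choices): mean squared error is one common loss function for regression, used here only as an example of turning predictions and labels into a single number. The function name and the sample values are hypothetical.

```python
def mean_squared_error(labels, predictions):
    """Average of the squared differences between each label and its prediction."""
    if len(labels) != len(predictions) or not labels:
        raise ValueError("labels and predictions must be non-empty and the same length")
    return sum((y - p) ** 2 for y, p in zip(labels, predictions)) / len(labels)


# Hypothetical labels and model predictions for three examples.
labels = [3.0, 5.0, 2.0]
predictions = [2.0, 5.0, 4.0]
mse = mean_squared_error(labels, predictions)
print(mse)  # squared errors are 1, 0, and 4, so the mean is 5/3
```

A perfect set of predictions gives a loss of 0, and the value grows as predictions move away from their labels.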