Topic: Machine learning (ML fundamentals)

Q1: What is AUC (Area under the ROC Curve)?

  • A: A number between 0.0 and 1.0 representing a binary classification model's ability to separate positive classes from negative classes.
  • B: A metric used to evaluate the performance of a classification model by calculating the area under the ROC curve.
  • C: A measure of the model's ability to correctly classify both positive and negative instances across different classification thresholds.
  • D: A statistical measure that quantifies the overall performance of a binary classification model across all possible classification thresholds, considering both true positive and false positive rates.
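
Sketch for Q1: the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative example, with ties counted as half. The snippet below computes AUC that way. The function name `auc_by_pairs` and the sample labels and scores are illustrative assumptions, not part of the quiz.

```python
def auc_by_pairs(labels, scores):
    """Return AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values (hypothetical sample data).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Three of the four positive/negative pairs are ordered correctly.
print(auc_by_pairs([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This quadratic loop is meant for reading, not for large datasets; a sort-based implementation gives the same value faster.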

Q2: What is overfitting?

  • A: A state where the model's performance on the training data significantly improves while its performance on unseen data degrades.
  • B: A situation where the model learns the training data too well, capturing noise and irrelevant details, leading to poor generalization.
  • C: Creating a model that matches the training data so closely that the model fails to make correct predictions on new data.
  • D: A phenomenon where the model's performance on the training data plateaus, indicating that further training will not lead to significant improvements.

Q3: What is an Activation Function?

  • A: A function that takes an input value and produces an output value within a specific range, typically between 0 and 1.
  • B: A function that introduces non-linearity into the output of a neuron in a neural network.
  • C: A mathematical function that maps any input value to a value between 0 and 1, often used to represent probabilities.
  • D: A function that scales the input values to a specific range, typically between -1 and 1, to improve model performance.
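
Sketch for Q3: three commonly used activation functions and a single neuron that applies one of them to its weighted sum. Note that the output ranges differ: ReLU is unbounded above, sigmoid lies in (0, 1), and tanh lies in (-1, 1). The helper `neuron` and the sample inputs and weights are illustrative assumptions.

```python
import math


def relu(z):
    # Passes positive values through and clips negatives to zero.
    return max(0.0, z)


def sigmoid(z):
    # Squashes any real input into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    # Squashes any real input into the open interval (-1, 1).
    return math.tanh(z)


def neuron(inputs, weights, bias, activation):
    # Weighted sum (a linear step) followed by a non-linear activation.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical inputs: the weighted sum is 0.5 - 2.0 = -1.5, which ReLU clips.
print(neuron([1.0, 2.0], [0.5, -1.0], 0.0, relu))  # 0.0
```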

Q4: What is inference?

  • A: The process of making predictions by applying a trained model to unlabeled examples.
  • B: The process of determining the ideal parameters of a model by feeding it labeled data.
  • C: The process of evaluating a trained model's performance on a separate dataset to assess its generalization ability.
  • D: In machine learning, the process of making predictions by applying a trained model to unlabeled examples.

Q5: What is L1 Regularization?

  • A: A type of regularization that penalizes weights in proportion to the sum of the absolute values of the weights.
  • B: A type of regularization that penalizes weights in proportion to the sum of the squares of the weights.
  • C: A type of regularization that randomly drops out neurons during training to prevent overfitting.
  • D: A type of regularization that stops training early when the model's performance on a validation set starts to decrease.
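
Sketch for Q5: the two weight penalties described in the options above, written out so the difference is easy to see. The function names, the weight vector, and the regularization strength `lam` are illustrative assumptions.

```python
def l1_penalty(weights, lam):
    # Penalty proportional to the sum of absolute weight values.
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    # Penalty proportional to the sum of squared weight values.
    return lam * sum(w * w for w in weights)


weights = [0.5, -2.0, 0.0]  # hypothetical model weights
lam = 0.1                   # hypothetical regularization strength

print(l1_penalty(weights, lam))  # 0.1 * (0.5 + 2.0 + 0.0)
print(l2_penalty(weights, lam))  # 0.1 * (0.25 + 4.0 + 0.0)
```

In training, a penalty like this is typically added to the data loss, so larger weights cost more.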

Q6: What is a classification model?

  • A: A model that predicts a continuous output variable, such as price or temperature.
  • B: A model that predicts the probability of an event occurring.
  • C: A model whose prediction is a class or category.
  • D: A model that groups similar data points together based on their characteristics.

Q7: What is a confusion matrix?

  • A: A table that summarizes the performance of a classification model by showing the counts of true positives, true negatives, false positives, and false negatives.
  • B: An NxN table that summarizes how many correct and incorrect predictions a classification model made.
  • C: A visualization tool that plots the true positive rate against the false positive rate for different classification thresholds.
  • D: A graph that shows the relationship between precision and recall for different classification thresholds.
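
Sketch for Q7: building a confusion matrix by counting (actual, predicted) pairs. Rows are actual classes and columns are predicted classes, which is one common convention and an assumption here. The function name and the sample labels are illustrative.

```python
def confusion_matrix(actual, predicted, classes):
    """Return an NxN list of lists: rows = actual class, columns = predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical labels for a binary problem.
actual    = ["pos", "pos", "neg", "neg", "neg"]
predicted = ["pos", "neg", "neg", "pos", "neg"]

for row in confusion_matrix(actual, predicted, ["pos", "neg"]):
    print(row)
# [1, 1]   -> 1 true positive, 1 false negative
# [1, 2]   -> 1 false positive, 2 true negatives
```

The diagonal holds the correct predictions; every off-diagonal cell is a specific kind of mistake.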

Q8: What is feature engineering?

  • A: The process of transforming raw data into a format suitable for training a machine learning model.
  • B: The process of selecting the most relevant features from a dataset to improve model performance.
  • C: The process of creating new features from existing ones to enhance model accuracy.
  • D: Determining which features might be useful in training a model and converting raw data from the dataset into efficient versions of those features.

Q9: What is generalization?

  • A: A model's ability to perform well on unseen data.
  • B: A model's ability to perfectly fit the training data.
  • C: A model's tendency to overfit the training data.
  • D: A model's inability to learn from the training data.

Q10: What is loss?

  • A: A measure of how well a model's predictions match the actual values in the training data.
  • B: A function that calculates the difference between predicted and actual values.
  • C: A measure of how far a model's prediction is from its label during training.
  • D: A metric that evaluates the overall performance of a model on a test dataset.
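
Sketch for Q10: mean squared error, used here as one example of a loss (an assumption; the quiz does not name a specific loss). It averages the squared distance between each prediction and its label. The function name and the sample values are illustrative.

```python
def mean_squared_error(labels, predictions):
    # Average of squared differences between each label and its prediction.
    if len(labels) != len(predictions) or not labels:
        raise ValueError("labels and predictions must be non-empty and equal in length")
    return sum((y - p) ** 2 for y, p in zip(labels, predictions)) / len(labels)


# Errors are -1 and +2, so the loss is (1 + 4) / 2.
print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))  # 2.5
```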