Machine Learning Interview Question Part 3

What Are the Different Types of Machine Learning?

There are three types of machine learning:

Supervised Learning
In supervised machine learning, a model makes predictions or decisions based on past or labeled data. Labeled data refers to sets of data that are given tags or labels, and thus made more meaningful.

Unsupervised Learning
In unsupervised learning, we don’t have labeled data. A model can identify patterns, anomalies, and relationships in the input data.

Reinforcement Learning
In reinforcement learning, the model learns from the rewards it receives for its previous actions.

Consider an environment where an agent is working. The agent is given a target to achieve. Every time the agent takes some action toward the target, it is given positive feedback. And, if the action taken is going away from the goal, the agent is given negative feedback.
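
As a rough illustration of this feedback loop, here is a minimal sketch of tabular Q-learning. The corridor environment, the reward values of +1 and -1, and the names train_corridor_agent, alpha, gamma, and epsilon are all assumptions made for this example; they are not part of the description above.

```python
import random

def train_corridor_agent(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
                         epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1-D corridor; the goal is the rightmost cell."""
    rng = random.Random(seed)
    actions = (-1, +1)  # index 0 = step left, index 1 = step right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore occasionally, otherwise take the best-known action.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state = min(max(state + actions[a], 0), goal)
            # Positive feedback for moving toward the goal, negative otherwise.
            reward = 1.0 if next_state > state else -1.0
            best_next = max(q[next_state])
            q[state][a] += alpha * (reward + gamma * best_next - q[state][a])
            state = next_state
    return q

q_table = train_corridor_agent()
# Read off the learned policy for every non-goal cell.
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(policy)
```

After training, the learned policy should prefer stepping right in every cell, because only that action is rewarded.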

What is Overfitting, and How Can You Avoid It? 

Overfitting occurs when a model learns the training set too well, treating random fluctuations in the training data as if they were real concepts. Because those fluctuations don’t apply to new data, they hurt the model’s ability to generalize.

On the training data, an overfit model shows close to 100 percent accuracy (technically, a very small loss). On the test data, however, the error is noticeably higher. This gap is the symptom of overfitting.

There are multiple ways of avoiding overfitting, such as:

  • Regularization: add a penalty term on the model parameters to the objective function
  • Making a simpler model: with fewer variables and parameters, the variance can be reduced
  • Cross-validation methods such as k-fold, which help detect overfitting
  • If some model parameters are likely to cause overfitting, regularization techniques like LASSO can be used to penalize those parameters
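
To make the idea of a penalty term concrete, here is a minimal sketch of a LASSO-style objective for a linear model. The function name lasso_cost, the parameter lam, and the toy data are assumptions chosen for illustration only.

```python
def lasso_cost(weights, features, targets, lam=0.1):
    """Mean squared error plus an L1 (LASSO) penalty on the weights."""
    n = len(targets)
    predictions = [sum(w * x for w, x in zip(weights, row)) for row in features]
    mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n
    penalty = lam * sum(abs(w) for w in weights)
    return mse + penalty

X = [[1.0, 2.0], [2.0, 0.0], [3.0, 1.0]]
y = [1.0, 2.0, 3.0]
# The weights [1, 0] fit this toy data exactly, so only the penalty remains.
weak_penalty = lasso_cost([1.0, 0.0], X, y, lam=0.1)
strong_penalty = lasso_cost([1.0, 0.0], X, y, lam=1.0)
print(weak_penalty, strong_penalty)
```

A larger lam makes large weights more expensive, which pushes an optimizer toward simpler models.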

What Are the ‘Training Set’ and ‘Test Set’ in a Machine Learning Model? How Much Data Will You Allocate for Your Training, Validation, and Test Sets?

There is a three-step process followed to create a model:

  1. Train the model
  2. Test the model 
  3. Deploy the model

Training Set
  • The training set is the set of examples given to the model to analyze and learn from
  • 70% of the total data is typically taken as the training dataset
  • This is labeled data used to train the model

Test Set
  • The test set is used to test the accuracy of the hypothesis generated by the model
  • The remaining 30% is taken as the testing dataset
  • We test without showing the labels to the model and then verify the results against the labels

Consider a case where you have labeled data for 1,000 records. One way to train the model is to expose all 1,000 records during the training process. Then you take a small set of the same data to test the model, which would give good results in this case.

But, this is not an accurate way of testing. So, we set aside a portion of that data called the ‘test set’ before starting the training process. The remaining data is called the ‘training set’ that we use for training the model. The training set passes through the model multiple times until the accuracy is high, and errors are minimized.

Now, we pass the test data to check if the model can accurately predict the values and determine if training is effective. If you get errors, you either need to change your model or retrain it with more data.

Regarding the question of how to split the data into a training set and test set, there is no fixed rule, and the ratio can vary based on individual preferences. 
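
The hold-out procedure described above can be sketched in a few lines. This is only an illustration: the function name train_test_split, the fixed seed, and the 70/30 default are assumptions for the example, and the integers stand in for the 1,000 labeled records.

```python
import random

def train_test_split(records, test_fraction=0.3, seed=42):
    """Shuffle a copy of the records, then hold out a fraction for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

records = list(range(1000))  # stand-in for 1,000 labeled records
train_set, test_set = train_test_split(records)
print(len(train_set), len(test_set))  # prints 700 300
```

Shuffling before the split matters when the records are stored in some meaningful order, such as by date or by class.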

How Do You Handle Missing or Corrupted Data in a Dataset?

One of the easiest ways to handle missing or corrupted data is to drop the affected rows or columns, or to replace the affected values with some other value.

There are three useful methods in Pandas:

  • isnull() and dropna() help find the columns/rows with missing data and drop them
  • fillna() replaces the missing values with a placeholder value
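
A short sketch of these methods on a toy DataFrame follows. The column names and values are made up for illustration, and filling the numeric column with its mean is just one possible choice of placeholder.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [25.0, None, 31.0, 40.0],
    "city": ["Pune", "Delhi", None, "Mumbai"],
})

missing_per_column = df.isnull().sum()  # count of missing values per column
rows_dropped = df.dropna()              # keep only rows with no missing values
# Fill each column with its own placeholder: the mean age and a fixed string.
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})

print(missing_per_column)
print(rows_dropped)
print(filled)
```

Dropping rows is simplest but discards data, while filling keeps every row at the cost of introducing values that were never observed.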

How Can You Choose a Classifier Based on a Training Set Data Size?

When the training set is small, a model with high bias and low variance tends to work better because it is less likely to overfit. Naive Bayes is an example.

When the training set is large, models with low bias and high variance tend to perform better because they can capture complex relationships.

Explain the Confusion Matrix with Respect to Machine Learning Algorithms.

A confusion matrix (or error matrix) is a specific table that is used to measure the performance of an algorithm. It is mostly used in supervised learning; in unsupervised learning, it’s called the matching matrix.

The confusion matrix has two parameters:

  • Actual
  • Predicted 

It also has identical sets of features in both of these dimensions.

Consider the binary confusion matrix below:

                 Actual Yes    Actual No
Predicted Yes        12            3
Predicted No          1            9

For actual values:

Total Yes = 12+1 = 13

Total No = 3+9 = 12 

Similarly, for predicted values:

Total Yes = 12+3 = 15

Total No = 1+9 = 10 

For a model to be accurate, the values along the main diagonal should be high. The sum of all the values in the matrix equals the total number of observations in the test data set.

For the above matrix, total observations = 12+3+1+9 = 25

Now, accuracy = (sum of the values on the main diagonal) / (total number of observations)

= (12+9) / 25

= 21 / 25

= 84%
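
The same arithmetic can be checked in code. The sketch below counts the four cells of a binary confusion matrix from two label lists; the function name confusion_counts and the way the labels are laid out are assumptions made so that the counts match the worked example above.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive="yes"):
    """Count TP, FP, FN, TN for a binary problem."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts

# Labels arranged to reproduce the worked example: TP=12, FN=1, FP=3, TN=9.
actual = ["yes"] * 13 + ["no"] * 12
predicted = ["yes"] * 12 + ["no"] * 1 + ["yes"] * 3 + ["no"] * 9
c = confusion_counts(actual, predicted)
accuracy = (c["TP"] + c["TN"]) / sum(c.values())
print(dict(c), accuracy)
```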

What Is a False Positive and False Negative and How Are They Significant?

False positives are cases that are wrongly classified as True but are actually False.

False negatives are cases that are wrongly classified as False but are actually True.

In the term ‘False Positive,’ the word ‘Positive’ refers to the ‘Yes’ row of the predicted value in the confusion matrix. The complete term indicates that the system has predicted it as a positive, but the actual value is negative. 

So, looking at the confusion matrix, we get:

False Positive = 3

True Positive = 12

Similarly, in the term ‘False Negative,’ the word ‘Negative’ refers to the ‘No’ row of the predicted value in the confusion matrix. And the complete term indicates that the system has predicted it as negative, but the actual value is positive.

So, looking at the confusion matrix, we get:

False Negative = 1

True Negative = 9

What Are the Three Stages of Building a Model in Machine Learning?

The three stages of building a machine learning model are:

  1. Model Building: Choose a suitable algorithm for the model and train it according to the requirement
  2. Model Testing: Check the accuracy of the model through the test data
  3. Applying the Model: Make the required changes after testing and use the final model for real-time projects

Here, it’s important to remember that once in a while, the model needs to be checked to make sure it’s working correctly. It should be modified to make sure that it is up-to-date.

What is Deep Learning?

Deep learning is a subset of machine learning that uses artificial neural networks, loosely inspired by the human brain, to learn from data. The term ‘deep’ comes from the fact that the networks can have many layers.

One of the primary differences between machine learning and deep learning is that feature engineering is done manually in machine learning. In the case of deep learning, the model consisting of neural networks will automatically determine which features to use (and which not to use).

What Are the Differences Between Machine Learning and Deep Learning?

Machine Learning
  • Enables machines to take decisions on their own, based on past data
  • It needs only a small amount of data for training
  • Works well on low-end systems, so you don’t need large machines
  • Most features need to be identified in advance and manually coded
  • The problem is divided into smaller parts, which are solved individually and then combined

Deep Learning
  • Enables machines to take decisions with the help of artificial neural networks
  • It needs a large amount of training data
  • Needs high-end machines because it requires a lot of computing power
  • The machine learns the features from the data it is provided
  • The problem is solved in an end-to-end manner

What Are the Applications of Supervised Machine Learning in Modern Businesses?

Applications of supervised machine learning include:

  • Email Spam Detection: Here we train the model using historical data that consists of emails categorized as spam or not spam. This labeled information is fed as input to the model.
  • Healthcare Diagnosis: By providing images regarding a disease, a model can be trained to detect if a person is suffering from the disease or not.
  • Sentiment Analysis: This refers to the process of using algorithms to mine documents and determine whether they’re positive, neutral, or negative in sentiment.
  • Fraud Detection: By training the model to identify suspicious patterns, we can detect instances of possible fraud.

What is Semi-supervised Machine Learning?

Supervised learning uses data that is completely labeled, whereas unsupervised learning uses data with no labels at all.

In the case of semi-supervised learning, the training data contains a small amount of labeled data and a large amount of unlabeled data.

What Are Unsupervised Machine Learning Techniques? 

There are two techniques used in unsupervised learning: clustering and association.


In clustering problems, the data is divided into subsets. These subsets, also called clusters, contain data points that are similar to each other. Unlike classification or regression, clustering has no predefined labels; different clusters reveal different details about the objects.


In an association problem, we identify patterns of associations between different variables or items.

For example, an e-commerce website can suggest other items for you to buy, based on the prior purchases that you have made, spending habits, items in your Wishlist, other customers’ purchase habits, and so on.

What is the Difference Between Supervised and Unsupervised Machine Learning?

  • Supervised learning – This model learns from the labeled data and makes a future prediction as output 
  • Unsupervised learning – This model uses unlabeled input data and allows the algorithm to act on that information without guidance.

What is the Difference Between Inductive Machine Learning and Deductive Machine Learning? 

Inductive Learning
  • It observes instances based on defined principles to draw a conclusion
  • Example: Explaining to a child to keep away from fire by showing a video where fire causes damage

Deductive Learning
  • It draws conclusions from experience
  • Example: Allow the child to play with fire. If the child gets burned, they will learn that it is dangerous and will refrain from making the same mistake again

Compare K-means and KNN Algorithms.

K-Means
  • K-Means is unsupervised
  • K-Means is a clustering algorithm
  • The points in each cluster are similar to each other, and each cluster is different from its neighboring clusters

KNN
  • KNN is supervised in nature
  • KNN is a classification algorithm
  • It classifies an unlabeled observation based on its K (can be any positive integer) surrounding neighbors

What Is ‘naive’ in the Naive Bayes Classifier?

The classifier is called ‘naive’ because it makes assumptions that may or may not turn out to be correct. 

The algorithm assumes that, given the class variable, the presence of one feature of a class is not related to the presence of any other feature (conditional independence of features).

For instance, a fruit may be considered to be a cherry if it is red in color and round in shape, regardless of other features. This assumption may or may not be right (as an apple also matches the description).
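
Written as a formula, the independence assumption lets the classifier score each class by multiplying per-feature probabilities. The notation here is an assumption for illustration: y is the class and x_1 through x_n are the feature values.

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

In the fruit example, the score for ‘cherry’ would be the prior probability of a cherry multiplied by the probability of being red given a cherry and the probability of being round given a cherry.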

Explain How a System Can Play a Game of Chess Using Reinforcement Learning.

Reinforcement learning has an environment and an agent. The agent performs some actions to achieve a specific goal. Every time the agent performs a task that is taking it towards the goal, it is rewarded. And, every time it takes a step that goes against that goal or in the reverse direction, it is penalized. 

Earlier, chess programs had to determine the best moves after much research on numerous factors. Building a machine designed to play such games would require many rules to be specified. 

With reinforcement learning, we don’t have to deal with this problem as the learning agent learns by playing the game. It will make a move (decision), check if it’s the right move (feedback), and keep the outcomes in memory for the next step it takes (learning). There is a reward for every correct decision the system takes and a penalty for every wrong one.

How Will You Know Which Machine Learning Algorithm to Choose for Your Classification Problem?

While there is no fixed rule to choose an algorithm for a classification problem, you can follow these guidelines:

  • If accuracy is a concern, test different algorithms and cross-validate them
  • If the training dataset is small, use models that have low variance and high bias
  • If the training dataset is large, use models that have high variance and little bias

How is Amazon Able to Recommend Other Things to Buy? How Does the Recommendation Engine Work?

Once a user buys something from Amazon, Amazon stores that purchase data for future reference and finds products that are most likely to be bought next. This is possible because of association algorithms, which can identify patterns in a given dataset.

When Will You Use Classification over Regression?

Classification is used when your target is categorical, while regression is used when your target variable is continuous. Both classification and regression belong to the category of supervised machine learning algorithms.

Examples of classification problems include:

  • Predicting yes or no
  • Estimating gender
  • Breed of an animal
  • Type of color

Examples of regression problems include:

  • Estimating sales and price of a product
  • Predicting the score of a team
  • Predicting the amount of rainfall

How Do You Design an Email Spam Filter?

Building a spam filter involves the following process:

  1. The email spam filter is fed thousands of emails
  2. Each of these emails already has a label: ‘spam’ or ‘not spam’
  3. The supervised machine learning algorithm then determines which types of emails are being marked as spam, based on spam words like lottery, free offer, no money, full refund, etc.
  4. The next time an email is about to hit your inbox, the spam filter uses statistical analysis and algorithms like Decision Trees and SVM to determine how likely the email is to be spam
  5. If the likelihood is high, it labels the email as spam, and the email won’t hit your inbox
  6. After testing all the candidate models, we use the algorithm with the highest accuracy

What is a Random Forest?

A ‘random forest’ is a supervised machine learning algorithm that is generally used for classification problems. It operates by constructing multiple decision trees during the training phase. The random forest chooses the decision of the majority of the trees as the final decision.
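
The final voting step can be sketched as follows. This shows only the majority vote, not how the trees are trained; the function name majority_vote and the example votes are assumptions for illustration.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical predictions from five trees for a single email.
votes = ["spam", "not spam", "spam", "spam", "not spam"]
print(majority_vote(votes))  # prints "spam"
```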

Considering a Long List of Machine Learning Algorithms, given a Data Set, How Do You Decide Which One to Use?

There is no master algorithm for all situations. Choosing an algorithm depends on the following questions:

  • How much data do you have, and is it continuous or categorical?
  • Is the problem related to classification, association, clustering, or regression?
  • Are the variables predefined (labeled), unlabeled, or a mix?
  • What is the goal?

The answers to these questions narrow down which algorithms are suitable.

What is Bias and Variance in a Machine Learning Model?

Bias in a machine learning model is the error that arises when the predicted values are systematically far from the actual values. Low bias indicates a model whose predictions are very close to the actual ones.

Underfitting: High bias can cause an algorithm to miss the relevant relations between features and target outputs.

Variance refers to the amount the learned model will change when trained with different training data. For a good model, the variance should be minimized.

Overfitting: High variance can cause an algorithm to model the random noise in the training data rather than the intended outputs.

What is the Trade-off Between Bias and Variance?

The bias-variance decomposition expresses the expected learning error of any algorithm as the sum of the squared bias, the variance, and an irreducible error due to noise in the underlying dataset.

Typically, if you make the model more complex and add more variables, you’ll reduce bias but gain variance. To get the lowest total error, you have to trade off bias against variance. Neither high bias nor high variance is desired.

High bias and low variance algorithms train models that are consistent, but inaccurate on average.

High variance and low bias algorithms train models that are accurate but inconsistent.
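
For squared-error loss, the decomposition can be written out explicitly. The notation is an assumption for illustration: the data is taken to follow y = f(x) + ε with zero-mean noise of variance σ², the learned model is written as f̂, and the expectation is over training sets and noise. The bias term appears squared.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```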

Define Precision and Recall.


Precision is the ratio of the number of events you correctly recall to the total number of events you recall (a mix of correct and wrong recalls).

Precision = (True Positive) / (True Positive + False Positive)


Recall is the ratio of the number of events you correctly recall to the total number of events.

Recall = (True Positive) / (True Positive + False Negative)
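
As a quick check of both formulas, the sketch below applies them to the counts from the confusion matrix example earlier in this article (True Positive = 12, False Positive = 3, False Negative = 1). The function name precision_recall is an assumption for illustration.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

precision, recall = precision_recall(tp=12, fp=3, fn=1)
print(round(precision, 3), round(recall, 3))  # prints 0.8 0.923
```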

What is a Decision Tree Classification?

A decision tree builds classification (or regression) models as a tree structure. While the tree is being built, the dataset is broken up into ever-smaller subsets, with branches and nodes representing the splits. Decision trees can handle both categorical and numerical data.

What is Pruning in Decision Trees, and How Is It Done?

Pruning is a technique in machine learning that reduces the size of decision trees. It reduces the complexity of the final classifier, and hence improves predictive accuracy by the reduction of overfitting.

Pruning can occur in:

  • Top-down fashion. It will traverse nodes and trim subtrees starting at the root
  • Bottom-up fashion. It will begin at the leaf nodes

There is a popular pruning algorithm called reduced error pruning, in which:

  • Starting at the leaves, each node is replaced with its most popular class
  • If the prediction accuracy is not affected, the change is kept
  • There is an advantage of simplicity and speed

Briefly Explain Logistic Regression.

Logistic regression is a classification algorithm used to predict a binary outcome for a given set of independent variables.

The model outputs a probability between 0 and 1, which is converted to a class label of 0 or 1 using a threshold, generally 0.5. Any value above 0.5 is considered as 1, and any value below 0.5 is considered as 0.
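
The thresholding step can be sketched as below. The weights and bias are illustrative values, not a fitted model, and the function names predict_proba and predict_label are assumptions for this example.

```python
import math

def predict_proba(features, weights, bias):
    """Sigmoid of a linear combination: the estimated probability of class 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 label using the threshold."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

weights, bias = [2.0, -1.0], -0.5  # illustrative values, not fitted
print(predict_label([1.0, 0.5], weights, bias))  # z = 1.0, so prints 1
print(predict_label([0.0, 1.0], weights, bias))  # z = -1.5, so prints 0
```

The threshold does not have to be 0.5; lowering it trades more false positives for fewer false negatives.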

Explain the K Nearest Neighbor Algorithm.

The K nearest neighbor algorithm is a classification algorithm in which a new data point is assigned to the neighboring group to which it is most similar.

In K nearest neighbors, K can be any integer greater than or equal to 1. So, for every new data point we want to classify, we compute which neighboring group it is closest to.

Let us classify an object using the following example. Consider there are three clusters:

  • Football
  • Basketball 
  • Tennis ball

Let the new data point to be classified be a black ball. We use KNN to classify it. Assume K = 5 (initially).

Next, we find the K (five) nearest data points.

Observe that the five selected points do not all belong to the same cluster. There are three tennis balls and one each of basketball and football.

When multiple classes are involved, we prefer the majority. Here the majority is with the tennis ball, so the new data point is assigned to this cluster.
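
The example above can be sketched in code. The 2-D coordinates are hypothetical features invented for illustration (for instance, size and weight), chosen so that the five nearest neighbors are three tennis balls, one basketball, and one football. The function name knn_classify is also an assumption.

```python
import math
from collections import Counter

def knn_classify(query, labeled_points, k=5):
    """Label `query` by majority vote among its k nearest labeled points."""
    by_distance = sorted(labeled_points, key=lambda item: math.dist(query, item[0]))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical 2-D features for each known ball.
points = [
    ((1.0, 1.0), "tennis ball"), ((1.2, 0.9), "tennis ball"), ((0.9, 1.1), "tennis ball"),
    ((1.6, 1.5), "basketball"), ((3.0, 3.0), "basketball"),
    ((1.5, 0.4), "football"), ((4.0, 1.0), "football"),
]
print(knn_classify((1.1, 1.0), points, k=5))  # prints "tennis ball"
```

An odd K is often chosen for two-class problems so that the vote cannot tie.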

What is a Recommendation System?

Anyone who has used Spotify or shopped at Amazon will recognize a recommendation system: It’s an information filtering system that predicts what a user might want to hear or see based on choice patterns provided by the user.

What is Kernel SVM?

Kernel SVM is the abbreviated version of the kernel support vector machine. Kernel methods are a class of algorithms for pattern analysis, and the most common one is the kernel SVM.

What Are Some Methods of Reducing Dimensionality?

You can reduce dimensionality by combining features with feature engineering, removing collinear features, or using algorithmic dimensionality reduction.

How can you avoid overfitting?

Overfitting typically happens when you have a small dataset and try to learn from it, so using a lot of data helps avoid it. But if you only have a small dataset and are forced to build a model from it, you can use a technique known as cross-validation. In this method, the dataset is split into two sections, a training dataset and a testing dataset. The data points in the training dataset are used to build the model, while the testing dataset is used only to test it.

In this technique, a model is usually given a dataset of known data on which training is run (the training dataset) and a dataset of unknown data against which the model is tested. The idea of cross-validation is to define a dataset to “test” the model during the training phase.
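
One common form of this idea is k-fold cross-validation, where each part of the data takes a turn as the held-out set. The sketch below only generates the index splits; the function name k_fold_indices is an assumption, and training and scoring a model on each split is left out.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(10, k=5))
print(folds[0])  # prints ([2, 3, 4, 5, 6, 7, 8, 9], [0, 1])
```

Averaging the validation score over all folds gives a more stable estimate than a single split, which matters most when the dataset is small.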

What is inductive machine learning?

Inductive machine learning is the process of learning by examples, where a system tries to induce a general rule from a set of observed instances.

What are the five popular algorithms of Machine Learning?

  • Decision Trees
  • Neural Networks (back propagation)
  • Probabilistic networks
  • Nearest Neighbor
  • Support vector machines

What are the different Algorithm techniques in Machine Learning?

The different types of techniques in Machine Learning are:

  • Supervised Learning
  • Unsupervised Learning
  • Semi-supervised Learning
  • Reinforcement Learning
  • Transduction
  • Learning to Learn

What are the three stages to build the hypotheses or model in machine learning?

  • Model building
  • Model testing
  • Applying the model

What is the standard approach to supervised learning?

The standard approach to supervised learning is to split the set of examples into a training set and a test set.

What is ‘Training set’ and ‘Test set’?

In various areas of information science like machine learning, the set of data used to discover a potentially predictive relationship is known as the ‘training set’. The training set is the set of examples given to the learner, while the test set is used to test the accuracy of the hypotheses generated by the learner; it is the set of examples held back from the learner. The training set is distinct from the test set.

What are the various approaches to machine learning?

The different approaches in Machine Learning are:

  • Concept Vs Classification Learning
  • Symbolic Vs Statistical Learning
  • Inductive Vs Analytical Learning

What is not Machine Learning?

  • Artificial Intelligence
  • Rule based inference

What is the function of ‘Unsupervised Learning’?

  • Find clusters of the data
  • Find low-dimensional representations of the data
  • Find interesting directions in data
  • Find interesting coordinates and correlations
  • Find novel observations / clean the database

What is the function of ‘Supervised Learning’?

  • Classifications
  • Speech recognition
  • Regression
  • Predict time series
  • Annotate strings

What is algorithm independent machine learning?

Machine learning whose mathematical foundations are independent of any particular classifier or learning algorithm is referred to as algorithm-independent machine learning.

What is the difference between artificial intelligence and machine learning?

Designing and developing algorithms that learn behaviors from empirical data is known as machine learning. Artificial intelligence, in addition to machine learning, also covers other aspects like knowledge representation, natural language processing, planning, robotics, etc.

What is classifier in machine learning?

A classifier in machine learning is a system that takes a vector of discrete or continuous feature values as input and outputs a single discrete value, the class.

What are the advantages of Naive Bayes?

A Naive Bayes classifier will converge quicker than discriminative models like logistic regression, so you need less training data. The main disadvantage is that it can’t learn interactions between features.

In what areas is pattern recognition used?

Pattern recognition can be used in:

  • Computer Vision
  • Speech Recognition
  • Data Mining
  • Information Retrieval
