April 13, 2021

20 Cool Machine Learning and Data Science Concepts (Simple Definitions)

Hello everyone, as you know, I’m Kindson The Genius. I would like to share with you these 20 cool Machine Learning and Data Science concepts, along with a brief explanation of each.

I assume you are learning Machine Learning, and I would like to encourage you to keep learning and not give up, even if it appears a bit tough initially. Just keep moving. In the end, your efforts will pay off.

I made this article because it is easier to read and understand a concept if you already have an overview of what it is all about.

Feel free to reach me via my website www.kindsonthegenius.com

1. Supervised Learning: This is a class of Machine Learning problem where a dataset is provided (the training dataset) and each observation has a corresponding class, label, or target vector.

2. Unsupervised Learning: This is a class of Machine Learning problem where a dataset is provided but no classes are provided.

3. Reinforcement Learning: This is a class of Machine Learning problem concerned with finding the best action to take in a given situation to maximize the expected outcome (its standard textbook, by Sutton and Barto, was first published in 1998, although the ideas are older). One application area of reinforcement learning is game playing, where the objective at each point is to take an action that increases the chances of winning.
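To make "finding the best action" a little more concrete, here is a minimal sketch of an epsilon-greedy agent choosing between two actions. Everything in it is an assumption made for illustration: the function name `run_bandit`, the win probabilities, and the settings for `epsilon`, `steps` and `seed` do not come from any particular game or library.

```python
import random

def run_bandit(win_probs, steps=2000, epsilon=0.1, seed=0):
    """Estimate the average reward of each action by trial and error."""
    rng = random.Random(seed)
    counts = [0] * len(win_probs)
    values = [0.0] * len(win_probs)  # running average reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action now and then.
            action = rng.randrange(len(win_probs))
        else:
            # Exploit: take the action that currently looks best.
            action = max(range(len(win_probs)), key=lambda a: values[a])
        # Hypothetical environment: reward 1 for a win, 0 otherwise.
        reward = 1.0 if rng.random() < win_probs[action] else 0.0
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values

values = run_bandit([0.2, 0.8])  # made-up win chances for two actions
best_action = max(range(len(values)), key=lambda a: values[a])
print(values, best_action)
```

The agent is never told which action wins more often. It only sees rewards, and over time its estimates should favour the action with the higher chance of winning.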

4. Deep Learning: This is a class of machine learning algorithms, which may be either supervised or unsupervised, that cascades multiple layers of computational units to extract features from input data. Deep Learning is a subset of Machine Learning. See also Deep Neural Network (item 17).

5. Ensemble Learning: This is a learning process that combines a number of learning algorithms to improve on the performance of any single model.
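One simple way to combine models is majority voting. The sketch below assumes three made-up rule-based classifiers (`rule_a`, `rule_b`, `rule_c`) standing in for trained models; real ensembles such as bagging or boosting are more involved.

```python
from collections import Counter

# Three hypothetical weak classifiers, each a simple rule on one number.
def rule_a(x):
    return 1 if x > 2 else 0

def rule_b(x):
    return 1 if x > 4 else 0

def rule_c(x):
    return 1 if x > 6 else 0

def majority_vote(models, x):
    """Ask every model for a prediction and return the most common answer."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]

ensemble = [rule_a, rule_b, rule_c]
predictions = [majority_vote(ensemble, x) for x in (1, 3, 5, 7)]
print(predictions)
```

With an odd number of voters and two classes there can be no tie, which keeps the sketch simple.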

6. Classification: Classification is a supervised learning task that attempts to assign each of a series of observations to one of two or more classes. In classification, a dataset of observations is provided along with their classes. The goal is to use the given dataset to create a model that can predict the class of a new observation supplied without its class. When there are just two classes (say 1 and 0), it is called binary classification.
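As a rough illustration of binary classification, the sketch below builds a nearest-centroid classifier from a tiny made-up dataset and uses it to predict the class of a new observation. The data and the helper names `fit_centroids` and `predict` are assumptions chosen for this example, and nearest-centroid is only one of many possible classifiers.

```python
# Hypothetical training data: two features per observation, classes 0 and 1.
train_X = [(1.0, 1.2), (0.8, 0.9), (3.0, 3.1), (3.2, 2.9)]
train_y = [0, 0, 1, 1]

def fit_centroids(X, y):
    """Compute the mean point (centroid) of each class."""
    sums, counts = {}, {}
    for point, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, point):
    """Return the class whose centroid is closest to the point."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(point, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

centroids = fit_centroids(train_X, train_y)
new_observation = (2.9, 3.0)  # arrives without a class
predicted_class = predict(centroids, new_observation)
print(predicted_class)
```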

7. Regression: This is a class of supervised learning task similar to classification, but in this case the targets are continuous values rather than discrete classes, and the task is to find the function that relates the feature set to the targets. Example: given features X = X1, X2, . . . , Xp for p features and targets Y = Y1, Y2, . . . , Yn for n observations, the goal is to find f(.) such that f(X) ≈ Y.
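For the simplest case of a single feature, f can be a straight line fitted by least squares. The sketch below assumes made-up, noise-free data generated from y = 2x + 1; the helper name `fit_line` is hypothetical.

```python
# Hypothetical data generated from y = 2x + 1, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(xs, ys)

def f(x):
    """The fitted function relating the feature to the target."""
    return slope * x + intercept

print(slope, intercept, f(5.0))
```

Unlike a classifier, f returns a continuous value, so it can be evaluated at points such as x = 5 that were never in the data.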

8. Clustering (Cluster Analysis): Clustering is a class of unsupervised learning tasks that is concerned with finding subsets or clusters within a dataset. We try to group elements in such a way that elements within the same cluster have certain similarity while being different from elements outside the cluster.  Two main clustering methods are:

9. Dimensionality Reduction: This is a statistical process of reducing the number of variables being considered to a smaller number of variables that can still explain the given dataset. Types of dimensionality reduction include:

  • feature extraction
  • feature selection.

10. Principal Components Analysis (PCA): PCA is a statistical technique used to map data from a higher-dimensional space to a lower-dimensional space such that the lower-dimensional data retains the maximum variance of the original data. In PCA, the principal components are obtained by eigen-decomposition of the data's covariance matrix.
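The eigen-decomposition step can be sketched with NumPy as follows. This is a minimal illustration on made-up 2-D data, not a full PCA implementation; the variable names are assumptions, and the only library calls used are `np.cov`, `np.linalg.eigh` and `np.argsort`.

```python
import numpy as np

# Hypothetical 2-D data lying close to the line y = x.
X = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8], [5.0, 5.0]])

# 1. Centre the data so every feature has zero mean.
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix of the features (columns are features).
cov = np.cov(X_centered, rowvar=False)

# 3. Eigen-decomposition; eigh returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]  # largest variance first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# 4. Project onto the first principal component: 2 dimensions become 1.
first_component = eigenvectors[:, 0]
X_reduced = X_centered @ first_component

# Share of the total variance kept by the first component.
explained_ratio = eigenvalues[0] / eigenvalues.sum()
print(X_reduced.shape, explained_ratio)
```

Because the points lie almost on a line, a single component keeps nearly all of the variance.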

11. Singular Value Decomposition (SVD): This is a powerful matrix factorization, often used for dimensionality reduction, in which the original data matrix is decomposed into three factors, one of which holds its ‘singular values’.

Given a matrix X which is an m x n matrix, we can decompose X into three matrices:

X = UΣV*


  • X is the original matrix
  • U is an m x m unitary matrix
  • Σ is an m x n rectangular diagonal matrix holding the non-negative singular values
  • V* is the conjugate transpose of an n x n unitary matrix V
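The shapes above can be checked with NumPy's `np.linalg.svd`. The sketch below uses a small made-up 2 x 3 matrix. Note that NumPy returns the singular values as a 1-D array, so Σ has to be rebuilt by hand before multiplying the factors back together.

```python
import numpy as np

# Hypothetical matrix with m = 2 rows and n = 3 columns.
X = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

# full_matrices=True gives U as m x m and Vh (that is, V*) as n x n.
U, s, Vh = np.linalg.svd(X, full_matrices=True)

# Place the singular values on the diagonal of an m x n matrix.
Sigma = np.zeros(X.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

# Multiplying the three factors should give back X (up to rounding).
X_rebuilt = U @ Sigma @ Vh

# Keeping only the largest singular value gives a rank-1 approximation,
# which is the idea behind using SVD for dimensionality reduction.
X_rank1 = s[0] * np.outer(U[:, 0], Vh[0, :])
print(U.shape, Sigma.shape, Vh.shape)
```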

12. Support Vector Machine (SVM): Support Vector Machine is a tool used for classification. SVM performs classification by determining a line or a plane (hyperplane) that separates the dataset into two classes such that the margin between the classes is maximum.

13. Hyperplane: A hyperplane is a subspace whose dimension is one less than that of the ambient space. This simply means that if we have a set of points in a 2d plot, then the hyperplane would be a 1d line. Similarly, the hyperplane for a 3d space is a 2d plane, and so on.
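In a 2d plot, a hyperplane is a line that can be written as w1*x1 + w2*x2 + b = 0, and a point is classified by which side of the line it falls on. The sketch below assumes hand-picked values for `w` and `b` rather than values learned by an SVM, so it illustrates only the separating step, not the margin maximization.

```python
import math

# Hypothetical, hand-picked parameters (not learned): the line x1 + x2 - 4 = 0.
w = (1.0, 1.0)
b = -4.0

def decision_value(point):
    """Positive on one side of the hyperplane, negative on the other."""
    return sum(wi * xi for wi, xi in zip(w, point)) + b

def classify(point):
    """Assign class +1 or -1 according to the side of the hyperplane."""
    return 1 if decision_value(point) >= 0 else -1

def distance_to_hyperplane(point):
    """Perpendicular distance from the point to the hyperplane."""
    return abs(decision_value(point)) / math.sqrt(sum(wi * wi for wi in w))

print(classify((3.0, 3.0)), classify((1.0, 1.0)))
print(distance_to_hyperplane((1.0, 1.0)))
```

An SVM would choose w and b so that the smallest of these distances over the training points, the margin, is as large as possible.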

14. Neural Network: This is also called Artificial Neural Network (ANN). It is a network made up of interconnected nodes called neurons, which tries to mimic the functioning of the neural network (nervous system) of living things. The nodes in a neural network are not scattered randomly but arranged in layers, as shown in the figure:

  • the first layer is the input layer
  • the last layer is the output layer
  • in between, there are 1 or more layers called hidden layers
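The layered arrangement can be sketched as a forward pass through a tiny network with two inputs, one hidden layer of two neurons, and one output neuron. All weights and biases below are made-up numbers, and the sigmoid activation is an assumption chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Each neuron takes a weighted sum of the inputs plus a bias, then applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

# Hypothetical parameters: one row of weights per neuron.
hidden_weights = [[0.5, -0.5], [0.3, 0.8]]
hidden_biases = [0.0, -0.1]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

inputs = [1.0, 2.0]                                            # input layer
hidden = layer_forward(inputs, hidden_weights, hidden_biases)  # hidden layer
output = layer_forward(hidden, output_weights, output_biases)  # output layer
print(hidden, output)
```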

15. Neuron: This is the basic computing unit of a Neural Network. The neurons are the nodes of the network (the edges are the weighted connections between them); each neuron combines its inputs and produces an output.

16. Perceptron: A perceptron is the simplest neural network you can think of. It is a neural network made up of a single node. A perceptron has an input (or set of inputs), a neuron, and an output (or set of outputs).
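A single-node perceptron can learn a simple rule such as the logical AND of two inputs. The sketch below uses the classic perceptron update rule. The learning rate, number of epochs and helper names are assumptions made for the example.

```python
def step(z):
    """Fire (1) when the weighted sum reaches the threshold 0, otherwise 0."""
    return 1 if z >= 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    """Classic perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            prediction = step(weights[0] * x1 + weights[1] * x2 + bias)
            error = target - prediction
            weights[0] += lr * error * x1
            weights[1] += lr * error * x2
            bias += lr * error
    return weights, bias

# Truth table of logical AND: ((input_1, input_2), expected_output).
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
outputs = [step(weights[0] * x1 + weights[1] * x2 + bias) for (x1, x2), _ in and_gate]
print(weights, bias, outputs)
```

A single perceptron can only learn rules that a straight line can separate, which is why AND works here while XOR would not.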

17. Deep Neural Network: A neural network is referred to as a deep neural network if it is made up of more than one hidden layer.

18. Recurrent Neural Network (RNN): This is a class of neural networks where the connections between nodes form a directed graph along a sequence, so that the state computed at one step is fed back in at the next. This property makes RNNs suitable for modelling temporal behavior.
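The feedback idea can be sketched with a single recurrent unit whose hidden state h is updated at every step from the current input and the previous state. The scalar weights `w_x`, `w_h` and `b` below are made-up values, and real RNNs use weight matrices and are trained rather than hand-set.

```python
import math

def rnn_forward(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Run one recurrent unit over a sequence and return the state after each step."""
    h = 0.0  # initial hidden state
    states = []
    for x in sequence:
        # The new state depends on the current input AND the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# Same values, different order: the final state differs because order matters.
states_a = rnn_forward([1.0, 0.0, 0.0])
states_b = rnn_forward([0.0, 0.0, 1.0])
print(states_a, states_b)
```

In the first sequence the input at step one still influences the state at step three, which is the kind of memory that makes these networks useful for temporal data.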

19. Convolutional Neural Network (CNN): CNN is a class of neural networks that uses convolution operations in at least one of its layers and is widely applied in image recognition and analysis.
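The core operation of such a network is sliding a small filter (kernel) over an image. The sketch below applies a made-up vertical-edge kernel to a tiny made-up image. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the elementwise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# Hypothetical 3 x 4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [[-1, 1], [-1, 1]]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The non-zero column in the result marks where the edge sits in the image. In a trained CNN the kernel values are learned rather than written by hand.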

20. Activation Function: The activation function is the function that produces the output of a neuron in a neural network. The weighted sum of the inputs plus the bias is passed into the activation function. With a step function, the neuron fires if a threshold is reached; other activation functions produce a graded output.

Some common activation functions include:

  • Step Function
  • Linear Function
  • Sigmoid Function
  • Hyperbolic Tangent Function (Tanh)
  • Rectified Linear Unit (ReLU)
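The five functions listed above are short enough to write out directly. The sketch below uses their usual textbook definitions; the helper `neuron_output` and its example numbers are assumptions added to show how inputs, weights and bias feed into an activation function.

```python
import math

def step(z, threshold=0.0):
    """Fires (1.0) once the threshold is reached, otherwise 0.0."""
    return 1.0 if z >= threshold else 0.0

def linear(z):
    """Passes the value through unchanged."""
    return z

def sigmoid(z):
    """Squashes any value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Squashes any value into the range (-1, 1)."""
    return math.tanh(z)

def relu(z):
    """Keeps positive values and replaces negative ones with 0."""
    return max(0.0, z)

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical neuron with two inputs: z = 1.0*0.5 + 2.0*0.25 + 0.5 = 1.5
for activation in (step, linear, sigmoid, tanh, relu):
    print(activation.__name__, neuron_output([1.0, 2.0], [0.5, 0.25], 0.5, activation))
```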
Sam Sandqvist
10 months ago

Number 19 should be *convolutional* neural network.