
Supervised vs. Unsupervised Machine Learning

Posted: Thu Jan 30, 2025 3:51 am
by Reddi1
In the field of machine learning, a basic distinction is made between three different categories:

In supervised learning, predictions about future events are made from structured, historical input and result data (e.g. user data and conversions). The output value serves as the dependent "label" that the model learns to predict from the independent input variables.
It is estimated that around 70% of all machine learning applications can currently be found in this well-researched area. Typical supervised ML applications in marketing include data-driven revenue forecasting (regression) and the assignment of customers to predefined segments and classes (classification).

Two typical supervised machine learning problems: regression and classification
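
To make the regression case more concrete, here is a minimal sketch of a revenue forecast learned from labeled historical data. The numbers and the names `ad_spend`, `revenue` and `fit_line` are made up for illustration and are not from any real report; a real project would more likely use a library such as scikit-learn.

```python
# Hypothetical historical data: monthly ad spend (input) and revenue (label).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [3.0, 5.0, 7.0, 9.0, 11.0]  # chosen so that revenue = 2 * spend + 1


def fit_line(xs, ys):
    """Ordinary least squares for one input variable; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": learn the relationship between input and label.
slope, intercept = fit_line(ad_spend, revenue)

# "Prediction": estimate the label for an input the model has not seen.
forecast = slope * 6.0 + intercept
print(f"forecast for spend 6.0: {forecast:.1f}")
```

The key point is that the label (revenue) has to be known for every historical row, otherwise there is nothing to fit against.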

This is contrasted with unsupervised learning. These algorithms and models are used to recognize patterns in large amounts of data without predefined labels, dependencies or correlations. In online marketing, for example, this is useful for dividing website visitors into different, as yet undefined groups in order to then address each group with an advertising message tailored to it (clustering).

Unsupervised algorithms reliably recognize patterns in large amounts of data
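
As a rough illustration of clustering, the following sketch groups visitors by a single value without any labels, using a very small k-means loop. The session durations, the starting centers and the function name `kmeans_1d` are assumptions for this example only; real visitor data would have many more dimensions.

```python
def kmeans_1d(values, centers, iterations=10):
    """Very small k-means for one-dimensional data (iterations must be >= 1)."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: move each center to the mean of its group.
        # An empty group keeps its old center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical session durations in minutes for six visitors, no labels given.
durations = [1.0, 2.0, 3.0, 20.0, 21.0, 22.0]
centers, groups = kmeans_1d(durations, centers=[0.0, 10.0])
print(centers)
print(groups)
```

Nobody told the algorithm that there are "short" and "long" sessions. It only received the number of groups to look for, and what those groups mean still has to be interpreted by a human afterwards.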

In reinforcement learning, the machine's learning process takes place through interaction with a generally predefined environment. Positive behavior is rewarded and the system is thus encouraged to act in the same way again in the future. Many of these applications and developments can currently be found in the areas of gaming and robotics.
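
The reward idea can be sketched with a tiny trial-and-error loop. This is only a toy under stated assumptions: the two actions, the `environment` function and the parameters `epsilon` and `alpha` are invented for illustration, and the simple epsilon-greedy strategy used here is one common choice among many.

```python
import random


def train(reward_for, actions, steps=200, epsilon=0.2, alpha=0.5, seed=0):
    """Learn an estimated value per action from rewards alone."""
    rng = random.Random(seed)           # fixed seed so runs are repeatable
    value = {a: 0.0 for a in actions}   # current estimate of each action's reward
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)                    # explore: try something
        else:
            action = max(actions, key=lambda a: value[a])   # exploit: best so far
        reward = reward_for(action)
        # Move the estimate a step towards the reward that was just observed.
        value[action] += alpha * (reward - value[action])
    return value


def environment(action):
    """Hypothetical environment: only moving right is rewarded."""
    return 1.0 if action == "right" else 0.0


values = train(environment, ["left", "right"])
best = max(values, key=values.get)
print(best, values)
```

The system is never told which action is correct. It only sees the reward after acting, and the rewarded action ends up with the higher estimate and is therefore chosen more often.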
The challenge in supervised learning lies primarily in data preprocessing and in defining, in advance, the right questions that the system should learn to answer. The model can only solve the tasks and problems that we teach it and only understand the correlations that we predefine. Accordingly, human bias plays a greater role here than in unsupervised learning. Teams at Google, among others, are currently working on this problem.

Even before training, the data can be prepared so that the model requires less computing power and thus delivers solid results more quickly. Examples of such adjustments include removing unnecessary columns (such as the currency column in keyword reports) and label encoding strings (text values such as keywords or locations), i.e. converting them into numerical values that can be more easily interpreted by the system.
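
The two preprocessing steps mentioned above might look roughly like this. The report rows and the helper names `drop_column` and `label_encode` are hypothetical; in practice this is usually done with a data frame library, and the plain-Python version is only meant to show what happens to the data.

```python
# Hypothetical keyword report rows.
rows = [
    {"keyword": "running shoes", "location": "Berlin", "currency": "EUR", "clicks": 120},
    {"keyword": "trail shoes", "location": "Munich", "currency": "EUR", "clicks": 45},
    {"keyword": "running shoes", "location": "Berlin", "currency": "EUR", "clicks": 98},
]


def drop_column(rows, name):
    """Return copies of the rows without the given column."""
    return [{k: v for k, v in row.items() if k != name} for row in rows]


def label_encode(rows, name):
    """Replace each distinct string in a column with an integer code."""
    codes = {}
    encoded = []
    for row in rows:
        # First time a value is seen it gets the next free integer.
        code = codes.setdefault(row[name], len(codes))
        encoded.append({**row, name: code})
    return encoded, codes


# The currency column is identical in every row, so it carries no information.
cleaned = drop_column(rows, "currency")
encoded, codes = label_encode(cleaned, "location")
print(codes)
print(encoded)
```

One caveat with label encoding: the integer codes imply an order (0 < 1 < 2) that the original strings do not have, so some models may read more into the numbers than is really there.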

By visualizing the data before the actual training, for example using histograms or scatter plots, we can get an overview in advance of whether the set contains enough data for the machine to learn from. In this way, we can improve the informative value of our model at an early stage and reduce the risk of phenomena such as overfitting and underfitting.
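
A quick look at the distribution does not need a plotting library. The following sketch counts values per bin and prints a text histogram; the click numbers, the bin width and the function name `histogram` are assumptions chosen for illustration.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall into each bin; keys are the bin start values."""
    counts = Counter(int(v // bin_width) for v in values)
    return {b * bin_width: counts[b] for b in sorted(counts)}


# Hypothetical clicks per keyword.
clicks = [3, 7, 12, 15, 18, 41]
bin_width = 10
bins = histogram(clicks, bin_width)

# One line per non-empty bin, one '#' per value in that bin.
for start, count in bins.items():
    print(f"{start:>3}-{start + bin_width - 1:<3} {'#' * count}")
```

In this toy data only a single keyword has more than 40 clicks, so a model would have very little to learn from for high-traffic keywords. Gaps like this are much easier to spot in a histogram than in the raw table.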