What benefits does normalisation offer?
Hi, I understand that normalising my data puts the values into a specific range.
I know that this can help for machine learning purposes, but I'm unclear on how.
Would someone mind clearing this up for me?
Thanks again
-Madcap
Best Answers
varunm1 Moderator, Member Posts: 1,207 Unicorn
Hello @Madcap,
Normalizing puts values into a specific range, true. More precisely, it brings all the predictors' values into the same range, for example 0 to 1. Many ML and statistical models are sensitive to the scale of their inputs. The main use of normalization is when we have predictors (attributes) whose scales (ranges) vary a lot. For example, if the dataset has one attribute with values between 0 and 10 and another with values between 1000 and 10000, the algorithm may treat the attribute with the higher values (1000 to 10000) as the stronger predictor. This might not be true in reality. Note that rescaling does not make the attributes normally distributed; it only puts them on a comparable scale, so that during training their influence reflects how informative they actually are rather than the units they happen to be measured in. This also supports stable convergence of the algorithm.
Regards,
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
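To make the range example above concrete, here is a minimal sketch in plain Python of the two most common rescalings: min-max scaling to the range 0 to 1, and z-score standardization. The function names and the sample values are made up for illustration; they are not taken from RapidMiner or from the posts in this thread.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale linearly so the smallest value maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant attribute: there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Shift to mean 0 and scale to (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical attributes with the ranges mentioned in the answer.
small = [0, 2, 5, 10]              # values between 0 and 10
large = [1000, 2800, 5500, 10000]  # values between 1000 and 10000

small_scaled = min_max_normalize(small)
large_scaled = min_max_normalize(large)
small_z = z_score_normalize(small)

print(small_scaled)
print(large_scaled)
```

After min-max scaling both attributes lie between 0 and 1, so neither dominates purely because of its units. Neither rescaling changes the shape of the distribution; a skewed attribute stays skewed.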
IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
Just to add to the great explanation of @varunm1: Normalization is especially important for all distance-based learners like k-NN. Without normalization, attributes with a very large range would simply overwhelm other attributes with smaller ranges, not because they are actually more important as predictors, but simply because they have a bigger range. For other learning schemes, e.g. Decision Tree, this does not matter, and in fact I would recommend against normalization (in most cases), since it changes the range of your input data and makes the model harder to understand for somebody who is familiar with the application domain.
Hope this helps,
Ingo
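The k-NN point can be sketched with a tiny nearest-neighbour lookup in plain Python. The data points and helper names below are hypothetical and chosen only to show the effect: the first attribute ranges from 0 to 10, the second from 1000 to 10000.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def column_ranges(rows):
    """Return (min, max) for each attribute (column) of the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale_row(row, ranges):
    """Min-max scale one row; assumes every attribute has max > min."""
    return tuple((v - lo) / (hi - lo) for v, (lo, hi) in zip(row, ranges))


def nearest_index(query, rows):
    """Index of the row closest to the query (1-nearest neighbour)."""
    return min(range(len(rows)), key=lambda i: euclidean(query, rows[i]))


# Hypothetical training rows: (small-range attribute, large-range attribute).
train = [(5, 5900), (10, 5100), (0, 1000), (10, 10000)]
query = (5, 5000)

# Without normalization the second attribute dominates the distance.
raw_nearest = nearest_index(query, train)

# Scale the training rows and the query with the training ranges.
ranges = column_ranges(train)
scaled_train = [scale_row(r, ranges) for r in train]
scaled_query = scale_row(query, ranges)
scaled_nearest = nearest_index(scaled_query, scaled_train)

print(raw_nearest, scaled_nearest)  # prints "1 0"
```

On the raw data the nearest neighbour is row 1, which is far away on the first attribute but only 100 units away on the second. After scaling, row 0 wins because it matches the query exactly on the first attribute and differs by only a tenth of the range on the second.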
Answers
Very helpful
Thanks again
-Madcap