What are outliers in data? | Data Science Basics | Understanding Fundamentals

Sarthak Niwate
1 min read · Dec 9, 2020

“Outliers are anomalous observations: values that are extremely high or low compared with the rest of the dataset. Outlier detection is the process of identifying them.”

Mistakes such as computational errors, wrong data entry, sampling errors, and value errors can introduce outliers into the data.

For example, a record showing a person's age as 250 or 300 years could distort the model, program, or prediction built on that data. Various algorithms exist to treat outliers and reduce their effect on the dataset.
Removing outliers blindly, however, can cause the loss of very important information. In some scenarios the outliers are exactly what we are looking for: fraudulent credit card transactions, or signs that a person will or will not repay a loan, can be spotted with the help of outliers.

We can detect outliers by sorting the data, by computing z-scores, or by plotting a boxplot of the distribution, which relies on the interquartile range (IQR).
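As a rough illustration of the z-score and interquartile range approaches, here is a minimal Python sketch using only the standard library. The sample `ages` list, the function names `zscore_outliers` and `iqr_outliers`, the z-score cutoff of 3, and the 1.5 × IQR fences are all assumptions made for this example (the cutoffs are common conventions, not values taken from this post), so treat it as a starting point rather than a definitive method.

```python
import statistics

# Hypothetical sample: plausible ages plus one wrong entry (250).
ages = [22, 25, 27, 29, 31, 33, 34, 36, 38, 40,
        41, 43, 45, 47, 50, 52, 55, 58, 61, 250]


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs((v - mean) / std) > threshold]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], the boxplot fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles (Python 3.8+)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Sorting is the simplest check: extreme values end up at the ends.
print(sorted(ages)[-3:])
print(zscore_outliers(ages))
print(iqr_outliers(ages))
```

On this sample both functions flag only the age of 250. On very small datasets the z-score approach can miss an obvious outlier, because the extreme value itself inflates the standard deviation, so the IQR fences are often the safer first check.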

In the next few posts, we will look at outlier treatment techniques...
Stay Tuned!
