What is Exploratory Data Analysis? Steps & Examples
One of the most important things you can do when approaching a data science project is really understand the dataset you’re working with as a first step. Without a proper data exploration process in place, it becomes much more challenging to identify critical issues or successfully carry out a deeper analysis of the dataset.
Table Of Contents
What Is Exploratory Data Analysis?
Exploratory Data Analysis (EDA) in Data Science is a step in the analysis process that uses several techniques to visualize, analyze, and find patterns in the data. John Turkey, who developed the EDA method, likened it to detective work because you have to dig for clues and evidence before making any assumptions about the outcome.
A complete and solid Exploratory Data Analysis can help identify issues in your data like missing or wrong values, typos, and anomalies (outliers). In addition, you will learn about the distribution of the data, the relationship between variables, and find variables that may not affect the desired outcome.
In this article, we’ll explore the principle techniques of Exploratory Data Analysis, tools, and graphs that help to understand the data better so you can ultimately answer business questions and find insights that may surprise your stakeholders.
Types of Variables in the Exploratory Data Analysis
When you start to explore the dataset, the first thing you have to evaluate is the attributes of the data you’re working on. Understanding the type of each variable will help you in the process of choosing the proper technique for the attribute analysis.