Introduction to visualization
Our eyes are drawn to colors and patterns.
We can quickly tell red from blue and a square from a circle.
Our culture is visual, including everything from art and advertisements to TV and movies.
Data visualization is another form of visual art that grabs our interest and keeps our eyes on the message.
When we see a chart, we quickly see trends and outliers.
If we can see something, we internalize it quickly.
It's storytelling with a purpose.
Data visualization is a graphical representation of information and data.
By using visual elements like charts, graphs, and maps, data visualization tools provide an easy way to understand trends, outliers, and patterns in data.
Numerical data may be encoded using dots, lines, or bars, to visually communicate a quantitative message.
Effective visualization helps users analyze and reason about data and evidence.
Data visualization is both an art and a science.
To learn more, let's dive deeper into visualization.
Quantitative message
Quantitative data is data expressing a certain quantity, amount or range.
Usually, there are measurement units associated with the data, e.g. meters, in the case of the height of a person.
Stephen Few described eight types of quantitative messages:
1. Time-series: Capturing a single variable over a period of time (e.g. the unemployment rate over a period of time).
2. Ranking: Ranking categorical subdivisions in ascending or descending order.
3. Part-to-whole: Measuring categorical subdivisions as a ratio of the whole (i.e. as a percentage out of 100%).
4. Deviation: Comparison of categorical subdivisions against a reference (e.g. comparison of actual expenses vs budget expenses).
5. Frequency distribution: Shows the number of observations of a particular variable.
6. Correlation: Comparison between observations represented by two variables (x, y) to determine if they tend to move in the same or opposite direction.
7. Nominal comparison: Comparing categorical subdivisions in no particular order, such as the sales volume by product code.
8. Geographic or Geo-spatial: Comparison of a variable across a map (e.g. unemployment rate by state).
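Three of these messages (ranking, part-to-whole, and deviation) come down to simple arithmetic on categorical data. The sketch below is only an illustration: the department names, the figures, and the function names are all invented for this example.

```python
# Hypothetical actual and budgeted expenses per department (invented figures).
actual = {"Sales": 120, "Marketing": 80, "IT": 50}
budget = {"Sales": 100, "Marketing": 90, "IT": 50}


def ranking(values):
    """Ranking: category names sorted from the largest value to the smallest."""
    return sorted(values, key=values.get, reverse=True)


def part_to_whole(values):
    """Part-to-whole: each category as a percentage of the total."""
    total = sum(values.values())
    return {name: 100 * value / total for name, value in values.items()}


def deviation(values, reference):
    """Deviation: each category's value minus its reference value."""
    return {name: values[name] - reference[name] for name in values}


print(ranking(actual))
print(part_to_whole(actual))
print(deviation(actual, budget))
```

A positive deviation here means a department spent more than its budget, and a negative one means it spent less.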
Wow, quantitative messages seem to be a vast area...
No doubt all data scientists are geniuses.
Let's see how much we have learned so far.
Which phase is data visualization helpful in?
Select the correct answer
A. acquiring data
B. cleaning data
C. reporting
D. analysis
Answer : C.
Comparison of your actual vs budget expenses for several departments of a business for a given time period would fall under which category of a quantitative message?
Select the correct answer
A. time series
B. geographic
C. nominal
D. deviation
Answer : D.
Examples of diagrams used for data visualization
Data visualization involves specific terminology, some of which is derived from statistics.
Categorical: Text labels describing the nature of the data, such as "Name" or "Age". This term also covers qualitative (non-numerical) data.
Quantitative: Numerical measures, such as "25" to represent the age in years.
A table contains quantitative data organized into rows and columns with categorical labels.
A graph is primarily used to show relationships among data and portrays values encoded as visual objects (e.g. lines, bars, or points).
Bar Chart
Comparison of values, such as sales performance for several persons or businesses in a single time period. For a single variable measured over time (trend), a line chart is preferable.
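To make the idea concrete, here is a minimal text-only sketch of a bar chart, where the length of each bar is proportional to its value. The names, the sales figures, and the text_bar_chart helper are invented for this example. A real project would normally use a plotting library instead.

```python
# Hypothetical sales figures for a single time period (invented for illustration).
sales = {"Asha": 42, "Ben": 30, "Chen": 55}


def text_bar_chart(values, width=40):
    """Return one line per category, with bar length proportional to the value."""
    largest = max(values.values())
    label_width = max(len(name) for name in values)
    lines = []
    for name, value in values.items():
        # Scale so that the largest value fills the full width.
        bar = "#" * round(width * value / largest)
        lines.append(f"{name.ljust(label_width)} | {bar} {value}")
    return lines


for line in text_bar_chart(sales):
    print(line)
```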
Histogram
Determining the frequency of annual stock market percentage returns within particular ranges (bins) such as 0-10%, 11-20%, etc.
The height of the bar represents the number of observations (years) with a return % in the range represented by the bin.
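The binning step behind a histogram can be sketched in a few lines. The returns below are invented, and the sketch assumes half-open bins of equal width (0 up to but not including 10, 10 up to but not including 20, and so on), which is one common convention rather than the only one.

```python
import math
from collections import Counter

BIN_WIDTH = 10  # assumed bin width, in percentage points

# Hypothetical annual returns in percent (invented for illustration).
annual_returns = [4.2, 12.5, -3.1, 18.0, 7.7, 25.3, 9.9, 15.0, -12.4, 1.0]


def histogram(values, bin_width=BIN_WIDTH):
    """Count the values falling in each bin [start, start + bin_width)."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


for start, count in histogram(annual_returns).items():
    # The bar length is the number of years whose return falls in this bin.
    print(f"{start:>4}% to {start + BIN_WIDTH:>3}%: {'#' * count} ({count})")
```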
Scatter plot
Determining the relationship (e.g. correlation) between unemployment (x) and inflation (y) for multiple time periods.
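The relationship that a scatter plot shows visually can also be summarized as a single number. The sketch below computes the Pearson correlation coefficient, one common measure of linear association, on invented unemployment and inflation figures. A value near -1 means the two variables tend to move in opposite directions, and a value near +1 means they tend to move together.

```python
import math

# Hypothetical (unemployment %, inflation %) pairs, one per period (invented).
unemployment = [3.5, 4.0, 5.0, 6.5, 8.0]
inflation = [4.0, 3.4, 2.9, 2.0, 1.2]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


r = pearson(unemployment, inflation)
print(f"correlation: {r:.3f}")
```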
How does visualization work?
The visualization software pulls in data from the sources and applies a graphic type to the data.
Data visualization software allows the user to select the best way of presenting the data but, increasingly, the software is automating this step.
Some tools automatically interpret the shape of the data and detect correlations between certain variables.
They then place these discoveries into the chart type that the software determines is optimal.
Typically, data visualization software has a dashboard component that allows users to pull multiple visualizations of analyses into a single interface, generally a web portal.
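As a rough illustration of how a tool might pick a chart type automatically, here is a toy rule of thumb based on the kinds of the columns involved. The suggest_chart function and its rules are hypothetical and only echo the chart descriptions in this tutorial. Real visualization software uses far richer logic.

```python
def suggest_chart(x_kind, y_kind=None):
    """Toy heuristic mapping column kinds to a chart type.

    Each kind is "categorical", "quantitative", or "time".
    This is a hypothetical rule set, not the logic of any real tool.
    """
    if y_kind is None:
        # One column: show how a quantitative variable is distributed,
        # or how often each category occurs.
        return "histogram" if x_kind == "quantitative" else "bar chart"
    if x_kind == "time" and y_kind == "quantitative":
        return "line chart"
    if x_kind == "categorical" and y_kind == "quantitative":
        return "bar chart"
    if x_kind == "quantitative" and y_kind == "quantitative":
        return "scatter plot"
    # Fall back to a plain table when no rule applies.
    return "table"


print(suggest_chart("time", "quantitative"))
print(suggest_chart("quantitative", "quantitative"))
```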
Let's revise
You have to compare your company's sales growth with last year's.
What graphical representation would you choose in this case?
Select the correct answer
A. histogram
B. scatter plot
C. bar graph
D. none of the above
Answer : C.
Let's revise again
Which of the following statements is not true?
Select the correct answer
A. A table contains quantitative data organized into rows and columns with categorical labels.
B. A graph is primarily used to show relationships among data and portrays values encoded as visual objects.
C. Text labels describing the nature of the data, such as "Name" or "Age", are quantitative data.
D. Data visualization comes under reporting.
Answer : C.