Introduction to Data Science
World & Data
Data Science Basics
It's certainly something that seems to be on everyone's minds lately.
But what is it?
We’ll find out together…
Have you ever wondered how,
Amazon recommends items for you to buy?
Netflix recommends movies to you?
Spotify recommends music to you?
P.S. Every click you make on the internet contributes to data sets stored all over the world.
Data is a collection of plain facts.
When data is processed and organized so that it becomes useful, it is called information.
Earlier, data was limited and structured.
Today, the entire world is on the Internet!
And we now have a variety of structured, semi-structured and unstructured data.
What do we mean when we say structured and unstructured data?
Suppose you are the principal of an institute.
You have a lot of data about all the students in your institute.
But your data is not sorted in any manner.
This can be considered unstructured data.
You ask your assistant to create a single file per student.
Each file should contain only that particular student’s data.
Now that can be considered structured data.
Structured data can be stored in traditional database systems.
What are traditional database systems?
A database in which data is stored in tables, i.e. in the form of rows and columns, is called a relational database system or traditional database system.
Unstructured data does not fit neatly into rows and columns, so it cannot be stored in traditional database systems in the same way.
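To make the rows-and-columns idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and student records are made up for illustration and are not part of the lesson.

```python
import sqlite3

# Structured data: every student record has the same named columns.
conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)"
)
conn.executemany(
    "INSERT INTO students (id, name, grade) VALUES (?, ?, ?)",
    [(1, "Asha", 9), (2, "Ravi", 10), (3, "Meera", 9)],  # hypothetical rows
)

# Because the structure is known in advance, a precise question is one query away.
grade_9 = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM students WHERE grade = ? ORDER BY name", (9,)
    )
]
conn.close()

# Unstructured data: free text with no fixed fields to query.
note = "Ravi joined in June; he moved here from Pune and likes chess."

print(grade_9)
```

The free-text note holds useful facts too, but there is no column to filter on, which is why it needs a different kind of processing before it can answer the same sort of question.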
Fun fact: most Internet of Things (IoT) data is unstructured data.
Coming back to data: it is being produced constantly, every minute and every second, at a very high speed!
Data Science is a blend of machine learning algorithms, statistics, business intelligence and programming.
It helps in discovering hidden patterns in raw data.
Well done!
Hopefully you are now familiar with the word and world of DATA.
Which of the following falls under the category of unstructured data?
Select one or more answers
A. Survey responses
B. Employee data of an organisation
C. Log files from websites
D. Atmospheric data
Answer: A, C, D
Problems solved by data science
Data science can solve problems you’d expect it to. For example:
Netflix uses filtering algorithms to recommend movies to you.
Many social media websites are powered by data science,
whether it is the construction of your Facebook news feed
or a suggestion of new people to follow on Twitter.
Data science is widespread.
Websites like Zomato, Uber, Swiggy, etc. keep a vast amount of user data.
They do so to customize the user experience.
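The recommendation examples above can be illustrated with a toy version of one common idea, collaborative filtering: suggest titles liked by people whose tastes overlap with yours. This is only a sketch under simple assumptions, not how Netflix or any of these services actually works; the users, titles and the `recommend` helper are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets, from 0.0 (nothing shared) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, likes):
    """Suggest titles liked by similar users that `user` has not seen yet."""
    seen = likes[user]
    scores = {}
    for other, other_likes in likes.items():
        if other == user:
            continue
        weight = jaccard(seen, other_likes)  # how similar is this person?
        for title in other_likes - seen:
            scores[title] = scores.get(title, 0.0) + weight
    # Highest score first; ties broken alphabetically for stable output.
    candidates = [t for t in scores if scores[t] > 0]
    return sorted(candidates, key=lambda t: (-scores[t], t))


# Hypothetical users and the titles each one liked.
likes = {
    "ana": {"Heist Night", "Space Saga", "Cooking Duel"},
    "ben": {"Heist Night", "Space Saga", "Robot Diaries"},
    "cho": {"Cooking Duel", "Garden Tales"},
}

suggestions = recommend("ana", likes)
print(suggestions)
```

Here "ben" shares two titles with "ana" while "cho" shares only one, so the title only "ben" liked is ranked ahead of the title only "cho" liked.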
Data Science can also solve problems you’d never expect.
Urban planning has been made easier with data science.
With the better understanding that data provides, planners can factor in the social facilities and amenities required to support residents at a more localized level.
Number crunching (i.e. the work involved in bringing a numerical perspective to a situation) plays a huge role in astronomy these days, and so does data crunching.
What is data crunching?
- Data crunching is an overall term covering the analysis of data so that it becomes useful in making decisions.
NBA teams are using data science to improve their training.
They use cameras that record each player's movements.
Later, this data helps in training new players and in gaining a competitive advantage.
Data Science: discovery of data insight
This aspect of data science is all about uncovering findings from data.
It enables companies to make smarter decisions.
Let's look at the example of Netflix:
Netflix data mines movie viewing patterns,
understands what drives the user interest, and
uses that to make decisions on which Netflix original series to produce next.
Can you think of any other such example?
Congratulations!
You are one step closer to being a data scientist.
How do data scientists mine out insights?
It starts with data exploration.
When given a challenging question, data scientists become detectives.
They investigate leads and try to understand patterns or characteristics within the data.
P.S. Don't be like the man in the image. ;)
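As a rough illustration of what a first pass of data exploration can look like, the sketch below summarizes a tiny, made-up viewing log using only Python's standard library. The `sessions` records and their field layout are assumptions for the example; real exploration would work on far larger data, often with dedicated tools.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical viewing log: (user, genre, minutes watched)
sessions = [
    ("u1", "drama", 42), ("u2", "comedy", 25), ("u1", "drama", 55),
    ("u3", "documentary", 90), ("u2", "drama", 48), ("u3", "comedy", 20),
]

# First lead: how long do people typically watch?
minutes = [m for _, _, m in sessions]
summary = {
    "count": len(minutes),
    "mean": mean(minutes),
    "median": median(minutes),
}

# Second lead: which genre shows up most often?
genre_counts = Counter(genre for _, genre, _ in sessions)
top_genre, top_count = genre_counts.most_common(1)[0]

print(summary)
print(top_genre, top_count)
```

Simple summaries like these do not answer the question by themselves, but they point to which patterns are worth investigating further.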
Data Science: development of data product
A data product is a technical asset that:
utilizes data as input, and
processes that data to return algorithmically-generated results.
The Gmail spam filter is a data product:
an algorithm behind the scenes processes incoming emails and determines whether a message is junk or not.
This is different from the data insights described above.
A data insight informs a decision; in contrast, a data product is technical functionality that encapsulates an algorithm and is designed to integrate directly into core applications.
Data scientists play a central role in developing data products.
They serve as technical developers, building assets that can be leveraged at a wide scale.
Hopefully, you are now clear on the concepts of data insight and data product.
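The "data in, algorithmic result out" shape of a data product can be sketched with a toy spam check. This is a deliberately naive keyword rule, not how Gmail's filter works; the word list, the threshold and the function names are all assumptions made for the example.

```python
import re

# Hypothetical trigger words; a real filter would learn these from labeled mail.
SPAM_WORDS = {"winner", "free", "prize", "urgent", "claim"}


def spam_score(message):
    """Fraction of words in the message that appear in SPAM_WORDS."""
    words = re.findall(r"[a-z']+", message.lower())
    if not words:
        return 0.0
    return sum(w in SPAM_WORDS for w in words) / len(words)


def is_spam(message, threshold=0.2):
    """Data in (an email), algorithmic result out (junk or not)."""
    return spam_score(message) >= threshold


inbox = [
    "Urgent! Claim your free prize now, winner!",
    "Lunch at noon tomorrow? I booked the usual place.",
]
labels = [is_spam(m) for m in inbox]
print(labels)
```

The point is the packaging rather than the rule itself: the logic sits behind a small function that an application, such as a mail client, could call for every incoming message.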
Which of the following does not fall under the category of data insight?
Select the right answer
A. What your customers are saying on relevant forums and websites
B. Online reviews used to analyze different brands
C. Customer surveys such as NPS, where you ask what your customers think about your products
D. Computer vision used for self-driving cars
Answer: D
Data science is not simply a trendy new way to think about tech problems.
It is also a tool that can be used to solve problems in many fields.