DATA SCIENCE: TOOLS,
TECHNIQUES and APPLICATIONS
Dr. Meenakshi Srivastava
Dr. Ranjana Rajnish
Assistant Professor
Amity University
msrivastava@lko.amity.edu
What and Why?
• WHAT is Data Science?
• WHY is Data Science important?
• WHY are Data Scientists in high demand?
• WHY Data Science in academia?
Applications of Data Science:
Some Examples
I. HEALTHCARE
• Survival analysis
– Analyze survival statistics for different patient
attributes (age, blood type, gender, etc.) and
treatments.
• Medication (dosage) effectiveness
– Analyze the effects of administering different types and
dosages of medication for a disease.
• Re-admission risk
– Predict the risk of re-admission based on patient
attributes, medical history, diagnosis and treatment.
II. MARKETING
• Predicting Lifetime Value (LTV)
What for: if you can predict the
characteristics of high-LTV customers, you can
support customer segmentation, identify
upsell opportunities and support other
marketing initiatives.
• Demand Forecasting
III. LOGISTICS
• How many of each item will customers need, and
where will they need them?
(Answering this enables lean inventory and prevents
out-of-stock situations.)
MOST IMPORTANT QUESTION
HOW DOES DATA SCIENCE DO ALL THIS?
What is Data Science?
Data Science is a broad umbrella term
for applying scientific methods, mathematics,
statistics, etc. to data sets in
order to extract KNOWLEDGE and
INSIGHT.
DATA SCIENCE: A MASH-UP OF
DISCIPLINES
• Another View
THE DATA SCIENCE UNICORN
• In medieval times, a Unicorn was a
rare and mythical creature with
great powers.
• In today’s world, a similar mythical
creature is the Data Science Unicorn,
who knows Technology, Data Science, and
Business equally well.
• Such a professional is the most
valuable resource on any data
science team.
• Many data professionals are
experts in the first two areas –
technology and data science, but
lack business/domain skills.
You Are All OUR FUTURE UNICORNS
How To Become A Data Science
UNICORN?
Data Science UNICORN: Do Whatever Is
Necessary To Extract Value from the Data
• Statistics: Take a sample (data) and answer questions about the process that
produced it. Is it a normal distribution? Estimate its mean.
• Machine Learning: Take a sample (data) and build a model to answer
questions about future samples.
– Given a sample of named faces, design a model for naming a new, unseen
face.
• Data Mining: Mine a huge data store for interesting patterns or
relationships.
– Given a database of transactions, apply tools and algorithms to find frequent product
bundles.
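The "estimate its mean" step from the Statistics bullet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the blood-pressure readings are made-up numbers, the function name is hypothetical, and the interval uses the normal approximation (z = 1.96), which is rough for a sample this small.

```python
import math
import statistics


def mean_with_confidence_interval(sample, z=1.96):
    """Return (mean, low, high): the sample mean and an approximate 95% interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean, using the sample standard deviation (n - 1 denominator).
    std_err = statistics.stdev(sample) / math.sqrt(n)
    return mean, mean - z * std_err, mean + z * std_err


# Hypothetical sample: systolic blood pressure readings (mmHg) for eight patients.
readings = [118, 122, 125, 130, 121, 127, 119, 126]
mean, low, high = mean_with_confidence_interval(readings)
print(f"estimated mean = {mean:.1f}, approx. 95% interval = ({low:.1f}, {high:.1f})")
```

The sample answers a question about the process that produced it: the interval says how far the true mean could plausibly be from the estimate.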
Machine Learning
Machine
Learning refers to a
computer’s ability to
learn from a dataset and
adapt accordingly
without having been
explicitly programmed
to do so.
Examples: Regression,
Decision Trees, Neural
Networks, etc.
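To make "learning without being explicitly programmed" concrete, here is a minimal sketch of a one-level decision tree (a decision stump). The pass/fail threshold is never written into the program; it is found from the data. The data set and the function names `fit_stump` and `predict` are assumptions made for this illustration, and the stump only learns rules of the form "x at or above the threshold means True".

```python
def fit_stump(xs, labels):
    """Learn a threshold t so that predicting `x >= t` matches the labels best."""
    best_threshold, best_correct = None, -1
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    for t in candidates:
        correct = sum((x >= t) == label for x, label in zip(xs, labels))
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


def predict(threshold, x):
    """Apply the learned rule to a new, unseen value."""
    return x >= threshold


# Hypothetical training data: hours studied -> passed the exam?
hours = [1, 2, 3, 4, 6, 7, 8, 9]
passed = [False, False, False, False, True, True, True, True]

threshold = fit_stump(hours, passed)
print(threshold, predict(threshold, 5.5), predict(threshold, 2.5))
```

Changing the training data changes the learned threshold without any change to the code, which is the point of the definition above.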
Data Mining
• To most people, data mining
goes something like this: tons of
data are collected, then quant
wizards work their arcane magic,
and then they know all of this
amazing stuff.
• BUT WHAT DO THEY ACTUALLY DO?
• They can tell us that "one of
these things is not like the other",
or they can show us categories and
then sort things into those
predetermined categories (classes).
HOW TO DO ALL THIS ??
COMPUTATIONAL TOOLS
• With the help of existing computational tools,
you can analyze your data quite easily.
• No programming skills required.
• No in-depth knowledge of Statistics, Machine
Learning, Data Mining, etc. is required.
Common Computational Tools
• RapidMiner (Open Source and Free):
A very popular, ready-made, open-source tool
that requires no coding and provides
advanced analytics. Written in Java, it incorporates
multifaceted data mining functions such as data
preprocessing, visualization and predictive analysis, and
it can be integrated with WEKA and R so that
models can be built directly from scripts written in
those two tools.
• WEKA (Open Source & Free):
A Java-based, customizable tool that is
free to use. It includes visualization,
predictive analysis and modeling techniques such as
clustering, association, regression and
classification.
• R (Open Source and Free):
R is written in C and Fortran and lets
data miners write scripts in a full programming
language. Hence, it is used to build
statistical and analytical software for data mining. It
supports graphical analysis, both linear and
nonlinear modeling, classification, clustering and
time-series analysis.
• Python-based Orange and NLTK:
Python is very popular due to its ease of use and
powerful features. Orange is an open-source
tool written in Python, with useful data
analytics, text analysis, and machine-learning
features embedded in a visual programming
interface. NLTK (Natural Language Toolkit), also written in Python, is a
powerful language-processing and text-mining tool
with data mining, machine learning,
and data scraping features that can easily be
built upon for customized needs.
• Rattle (Open Source and Free):
Rattle is a GUI tool built on the R
statistical programming language. It exposes the
statistical power of R by providing considerable
data mining functionality. Rattle has
an extensive, well-developed UI, and its
built-in Log tab records the equivalent R
code for every action performed in the GUI.
• DataMelt (Open Source and Free):
DataMelt, also known as DMelt, is a computation
and visualization environment that provides an
interactive framework for data analysis and
visualization. It is designed mainly for engineers,
scientists and students.
How Computational Tools Work
• They ship with methods developed using Statistics,
Machine Learning and Data Mining.
• These pre-developed methods can be easily
applied to your data set.
• They provide built-in support for data
visualization.
WHAT CAN I DO WITH MY DATA?
• Regression:
In statistics, regression is a classic technique for
identifying the quantitative relationship between two
or more variables, in its simplest form by fitting a straight line to
the observed values.
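Fitting a straight line to two variables can be sketched with ordinary least squares, where the slope is the covariance of x and y divided by the variance of x. The advertising figures and the name `fit_line` are hypothetical, and the data were chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend (x) vs. units sold (y), lying exactly on y = 2x + 1.
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(spend, sales)
print(f"sales = {slope:.2f} * spend + {intercept:.2f}")
```

Real data will not fall exactly on a line; the same formula then returns the line that minimizes the squared vertical distances to the points.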
• Classification:
This is a machine-learning technique that learns
from labeled training examples how to
label new observations. With it, we can classify
observations into one or more classes.
Predicting the likelihood of a sale, online fraud detection, and
cancer classification (in medical science) are
common applications of classification.
Google Mail uses this technique to classify
e-mails as spam or not.
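The spam example can be sketched with a small naive Bayes text classifier. This is an illustrative technique chosen for the sketch, not a description of how any real mail service works; the four training messages and the names `train` and `classify` are made up for the example.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label). Returns per-label word counts and message counts."""
    word_counts = {}          # label -> Counter of words seen under that label
    label_counts = Counter()  # label -> number of training messages
    for text, label in messages:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # log P(label) + sum of log P(word | label), with add-one smoothing.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(counts.values())
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled training examples.
training = [
    ("win a free prize now", "spam"),
    ("free money claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
word_counts, label_counts = train(training)
print(classify("claim your free prize", word_counts, label_counts))
print(classify("team meeting on monday", word_counts, label_counts))
```

The labels come only from the training examples, so the same code can be retrained for fraud detection or any other labeling task.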
• Clustering:
This technique is all about organizing similar items
from a given collection into groups.
User segmentation and image compression are
among the most common applications of clustering;
others include market segmentation, social network analysis,
organizing computing clusters, and
astronomical data analysis.
• Google News
uses these techniques to group similar news items
into the same category.
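Grouping similar items can be sketched with k-means (Lloyd's algorithm), one common clustering method; the slides do not name a specific algorithm, so this choice is an assumption. The customer points and starting centroids are made up, and the starting centroids are fixed rather than random so the run is repeatable.

```python
def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate between assigning points and moving centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers: (visits per month, average basket size).
customers = [(1, 2), (2, 1), (1, 1), (8, 9), (9, 8), (9, 9)]
centroids, clusters = k_means(customers, centroids=[(1, 2), (8, 9)])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No labels are supplied here, which is the difference from classification: the groups emerge from the distances between the items themselves.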
• Recommendation:
Recommendation algorithms power
recommender systems, which are among
the most immediately recognizable machine
learning applications in use today. Web content
recommendations may include similar websites,
blogs, videos, or related content. Recommending
online items can also be helpful
for cross-selling and up-selling.
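One simple way to recommend "related content" is item-based similarity: two items are similar if the same users rated them alike. The sketch below uses cosine similarity between rating vectors; the ratings table and function names are hypothetical, and real recommender systems use considerably richer models.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_item(ratings, target):
    """ratings: item -> list of user ratings (0 = not rated), same user order for every item."""
    scores = {
        item: cosine(ratings[target], vector)
        for item, vector in ratings.items()
        if item != target
    }
    return max(scores, key=scores.get)


# Hypothetical ratings from four users for three videos.
ratings = {
    "video_a": [5, 4, 0, 1],
    "video_b": [4, 5, 0, 0],
    "video_c": [0, 1, 5, 4],
}
print(most_similar_item(ratings, "video_a"))
```

A viewer who liked `video_a` would be shown the returned item next, which is the cross-selling idea in miniature.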
• Association Rules:
This data mining technique helps to find
associations between two or more items. It
discovers hidden patterns in the data set.
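Two standard measures describe an association rule such as "bread → butter": support (how often the items appear together) and confidence (how often the consequent appears when the antecedent does). The sketch below computes both on a made-up set of shopping baskets; the helper names are illustrative only.

```python
def count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t)


def support(transactions, items):
    """Fraction of all transactions that contain the whole itemset."""
    return count(transactions, items) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return count(transactions, both) / count(transactions, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]
print(support(baskets, {"bread", "butter"}))
print(confidence(baskets, {"bread"}, {"butter"}))
```

Here three of the five baskets contain both bread and butter, and three of the four bread baskets also contain butter. Rule-mining tools search for all rules whose support and confidence exceed chosen thresholds.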
• Outlier Detection:
This data mining technique identifies
data items in the dataset that
do not match an expected pattern or expected
behavior. It can be used in a variety
of domains, such as intrusion detection and fraud
or fault detection. Outlier detection is also
called outlier analysis or outlier mining.
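A simple rule for flagging values that "do not match an expected pattern" is the interquartile-range (IQR) fence: anything more than 1.5 IQRs below the first quartile or above the third is treated as an outlier. This is one common heuristic, chosen here as an assumption; the transaction amounts are made up.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical card transaction amounts; one is far larger than the rest.
amounts = [42, 38, 45, 40, 41, 39, 44, 43, 400]
print(iqr_outliers(amounts))
```

In fraud detection the flagged transactions would be passed on for review rather than rejected automatically, since an unusual value is not necessarily a fraudulent one.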
• Prediction:
Prediction uses a combination of other
data mining techniques such as trend analysis, sequential
patterns, clustering and classification. It
analyzes past events or instances in the right
sequence to predict a future event.
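As a minimal illustration of using past values in sequence to predict the next one, the sketch below forecasts with a moving average of the most recent observations. This is a deliberately simple stand-in, not the combination of techniques described above; the demand figures and the function name are hypothetical.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("need at least `window` observations")
    recent = history[-window:]
    return sum(recent) / window


# Hypothetical monthly demand for one product, oldest first.
monthly_demand = [100, 104, 98, 110, 120, 115]
print(moving_average_forecast(monthly_demand))
```

The order of the observations matters: shuffling the history would change which values fall in the window and therefore the forecast.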
ADVANTAGES
Use computational tools to predict the
behavior of your compound.
Use computational tools to analyze the same
data from a different perspective.
Cost cutting.
Time saving.
A very clear picture for your research.
QUESTIONS ?
THANKS
