Henry R. Kang (1/2010)
General Chemistry
Lecture 5
Statistical Data Analysis
Henry R. Kang (7/2008)
Outline
• Fundamental Statistics
• Accuracy and Precision
• Data Rejection
Accuracy & Precision
• Accuracy
▪ Accuracy is a measure of the closeness of a measured quantity to the true value.
• Precision
▪ Precision is a measure of the agreement of replicate measurements, that is, how closely two or more measurements of the same quantity agree with one another.
Fundamental Statistics
Errors
• All measurements contain errors.
• Types of errors
▪ Systematic errors
❖ One-sided errors (either positive or negative)
• Usually from a single source
• Resulting data are consistently high or low
❖ Results may be precise but inaccurate
• Examples: a balance that is incorrectly zeroed; an incorrect constant used in calculations.
▪ Random errors
❖ Occur randomly
❖ Positive and negative deviations occur with equal frequency and size.
• The deviations follow a bell-shaped curve (Gaussian or normal distribution)
❖ The source of the error is usually not known
Gaussian Distribution
• A Gaussian distribution describes how data points are distributed about the true value. It gives a bell-shaped curve, as shown in the figure.
▪ The closer a value is to the true value, the higher its probability.
[Figure: bell-shaped Gaussian curve; x-axis: standard deviation (−3 to 3); y-axis: probability (0 to 0.4)]
Measuring Accuracy
• Percent error (used when the true value is known)
$$\%\ \text{error} = \frac{|\text{true value} - \text{experimental value}|}{|\text{true value}|} \times 100$$
• Parts per thousand (ppt)
$$\text{ppt} = \frac{|\text{true value} - \text{experimental value}|}{|\text{true value}|} \times 1000$$
• Parts per million (ppm)
$$\text{ppm} = \frac{|\text{true value} - \text{experimental value}|}{|\text{true value}|} \times 10^{6}$$
• Unfortunately, the true value is often not known.
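The three accuracy measures differ only in their scale factor, which a short Python sketch can make explicit. The function names and the sample numbers below are illustrative assumptions, not part of the lecture.

```python
def relative_error(true_value: float, experimental_value: float) -> float:
    """Return |true - experimental| / |true| as a plain fraction."""
    if true_value == 0:
        raise ValueError("true_value must be non-zero")
    return abs(true_value - experimental_value) / abs(true_value)


def percent_error(true_value: float, experimental_value: float) -> float:
    """Relative error scaled by 100 (percent)."""
    return relative_error(true_value, experimental_value) * 100


def error_ppt(true_value: float, experimental_value: float) -> float:
    """Relative error scaled by 1000 (parts per thousand)."""
    return relative_error(true_value, experimental_value) * 1000


def error_ppm(true_value: float, experimental_value: float) -> float:
    """Relative error scaled by 10**6 (parts per million)."""
    return relative_error(true_value, experimental_value) * 1e6


if __name__ == "__main__":
    # Hypothetical numbers: true value 10.00, measured value 9.95.
    print(round(percent_error(10.00, 9.95), 3), "%")
    print(round(error_ppt(10.00, 9.95), 3), "ppt")
    print(round(error_ppm(10.00, 9.95), 1), "ppm")
```

All three functions share one helper, so the only difference between percent, ppt, and ppm is the multiplier.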
Measuring Precision
• Mean (or Average)
• Deviation and Absolute Deviation
• Absolute Average Deviation
• Relative Deviation
• Relative Average Deviation (RAD)
• Standard Deviation
• Relative Standard Deviation
Mean (Average)
• For multiple measurements of a given quantity, we have numerical values $x_1, x_2, x_3, \ldots, x_n$, where $n$ is the number of measurements.
• The sum is defined as
$$\text{Sum} = x_1 + x_2 + x_3 + \cdots + x_n = \sum x_i$$
• The mean $x_{\text{avg}}$ is defined as
$$x_{\text{avg}} = \frac{\text{Sum}}{n} = \frac{\sum x_i}{n}$$
Deviation & Absolute Deviations
• Deviation is the difference (or variation) of a single measurement, $x_i$, from the mean value, $x_{\text{avg}}$.
▪ $d_1 = x_1 - x_{\text{avg}}$
▪ $d_2 = x_2 - x_{\text{avg}}$
▪ $d_3 = x_3 - x_{\text{avg}}$
▪ ...
▪ $d_n = x_n - x_{\text{avg}}$
• Absolute deviation is the magnitude of the deviation, so it is never negative.
▪ $d_1 = |x_1 - x_{\text{avg}}|$
▪ $d_2 = |x_2 - x_{\text{avg}}|$
▪ $d_3 = |x_3 - x_{\text{avg}}|$
▪ ...
▪ $d_n = |x_n - x_{\text{avg}}|$
Absolute Average Deviation
• Absolute average deviation, $d_{\text{avg}}$, is the arithmetic mean of the individual absolute deviations, $d_i$, where $d_i = |x_i - x_{\text{avg}}|$ for $i = 1, 2, 3, \ldots, n$.
$$d_{\text{avg}} = \frac{\sum d_i}{n}$$
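The chain from mean to deviations to absolute average deviation can be sketched in a few lines of Python. The function names and the three sample values are assumptions chosen for illustration.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


def deviations(values):
    """Signed deviation of each measurement from the mean."""
    x_avg = mean(values)
    return [x - x_avg for x in values]


def average_absolute_deviation(values):
    """Mean of the absolute deviations from the mean."""
    return mean([abs(d) for d in deviations(values)])


if __name__ == "__main__":
    data = [10.1, 10.3, 10.2]  # hypothetical replicate measurements
    print("mean:", round(mean(data), 4))
    print("deviations:", [round(d, 4) for d in deviations(data)])
    print("d_avg:", round(average_absolute_deviation(data), 4))
```

Note that the signed deviations always sum to (approximately) zero, which is why the absolute values are averaged instead.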
Relative Deviation
• Relative deviation, $D_i$, is the ratio of an individual absolute deviation, $d_i$, to the mean value, $x_{\text{avg}}$.
$D_1 = d_1 / x_{\text{avg}} = |x_1 - x_{\text{avg}}| / x_{\text{avg}}$
$D_2 = d_2 / x_{\text{avg}} = |x_2 - x_{\text{avg}}| / x_{\text{avg}}$
$D_3 = d_3 / x_{\text{avg}} = |x_3 - x_{\text{avg}}| / x_{\text{avg}}$
...
$D_i = d_i / x_{\text{avg}} = |x_i - x_{\text{avg}}| / x_{\text{avg}}$
Relative Average Deviation
• Relative average deviation (RAD) is the absolute average deviation relative to the mean $x_{\text{avg}}$.
$$\text{RAD (ppt)} = \frac{d_{\text{avg}}}{x_{\text{avg}}} \times 1000$$
• A precision of 3 ppt or less is considered very good.
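As a rough illustration, RAD in parts per thousand can be computed directly from a list of replicate values. The function names and the sample volumes are hypothetical; the 3 ppt threshold is the one quoted in the slide.

```python
def relative_deviations(values):
    """D_i = |x_i - x_avg| / x_avg for each measurement."""
    x_avg = sum(values) / len(values)
    return [abs(x - x_avg) / x_avg for x in values]


def rad_ppt(values):
    """Relative average deviation in parts per thousand."""
    x_avg = sum(values) / len(values)
    d_avg = sum(abs(x - x_avg) for x in values) / len(values)
    return d_avg / x_avg * 1000


if __name__ == "__main__":
    volumes = [25.02, 25.06, 25.04]  # hypothetical replicate readings
    rad = rad_ppt(volumes)
    print("RAD (ppt):", round(rad, 3))
    print("very good precision:", rad <= 3.0)
```

Averaging the relative deviations and multiplying by 1000 gives the same RAD, since every $D_i$ shares the same denominator.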
Standard Deviation
• Standard deviation (σ) describes the spread of data points that follow a Gaussian distribution (a bell-shaped curve).
▪ ($x_{\text{avg}} \pm \sigma$) incorporates 68.3% of the data points.
▪ ($x_{\text{avg}} \pm 3\sigma$) incorporates 99.7% of the data points.
▪ The smaller the σ, the smaller the spread of data points.
▪ With the deviations $d_i = x_i - x_{\text{avg}}$ for $i = 1, 2, 3, \ldots, n$:
$$\sigma = \sqrt{\frac{\sum d_i^2}{n - 1}} = \sqrt{\frac{d_1^2 + d_2^2 + d_3^2 + \cdots + d_n^2}{n - 1}}$$
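A minimal Python sketch of the n − 1 formula follows, checked against the standard library's `statistics.stdev`, which uses the same sample (n − 1) definition. The function name and the data set are illustrative assumptions.

```python
import math
import statistics


def sample_std_dev(values):
    """Square root of the sum of squared deviations divided by (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two measurements are required")
    x_avg = sum(values) / n
    squared_deviations = [(x - x_avg) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))


if __name__ == "__main__":
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical values
    print("manual:    ", round(sample_std_dev(data), 4))
    print("statistics:", round(statistics.stdev(data), 4))
```

`statistics.pstdev` divides by n instead and would give a slightly smaller number for the same data.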
Relative Standard Deviation
• Relative standard deviation ($\sigma_r$) is the standard deviation relative to the mean value.
▪ With $d_i = x_i - x_{\text{avg}}$ and $D_i = d_i / x_{\text{avg}}$ for $i = 1, 2, 3, \ldots, n$:
$$\sigma_r = \sqrt{\frac{\sum (d_i / x_{\text{avg}})^2}{n - 1}} = \sqrt{\frac{D_1^2 + D_2^2 + D_3^2 + \cdots + D_n^2}{n - 1}}$$
where $n$ is the number of measurements, or
$$\sigma_r\ (\text{ppt}) = \frac{\sigma}{x_{\text{avg}}} \times 1000$$
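The two forms of the relative standard deviation are algebraically the same, which a short Python sketch can confirm numerically. The function names and sample readings are assumptions, and a positive mean is assumed so that the ratio is meaningful.

```python
import math
import statistics


def rsd_from_relative_deviations(values):
    """sqrt(sum(D_i**2) / (n - 1)) with D_i = (x_i - x_avg) / x_avg."""
    n = len(values)
    x_avg = sum(values) / n
    return math.sqrt(sum(((x - x_avg) / x_avg) ** 2 for x in values) / (n - 1))


def rsd_ppt(values):
    """Sample standard deviation divided by the mean, in parts per thousand."""
    return statistics.stdev(values) / statistics.mean(values) * 1000


if __name__ == "__main__":
    volumes = [25.02, 25.06, 25.04]  # hypothetical replicate readings
    print("sigma_r:", round(rsd_from_relative_deviations(volumes), 6))
    print("sigma_r (ppt):", round(rsd_ppt(volumes), 3))
```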
Gaussian Distribution
• A Gaussian distribution describes how data points are distributed about the true value. It gives a bell-shaped curve, as shown in the figure.
[Figure: bell-shaped Gaussian curve; x-axis: standard deviation (−3 to 3); y-axis: probability (0 to 0.4)]
• The Gaussian equation is
$$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - X)^2}{2\sigma^2}\right]$$
where σ is the standard deviation and X is the true value.
▪ The closer a value is to the true value, the higher its probability.
▪ The area under the curve (the integral of the Gaussian function) over an interval gives the fraction of data points expected in that interval:
❖ ($x_{\text{true}} \pm \sigma$) incorporates 68.3% of the data points.
❖ ($x_{\text{true}} \pm 3\sigma$) incorporates 99.7% of the data points.
❖ ($x_{\text{true}} \pm 3.8906\sigma$) incorporates 99.99% of the data points.
❖ ($x_{\text{true}} \pm 4.4172\sigma$) incorporates 99.999% of the data points.
❖ ($x_{\text{true}} \pm 6\sigma$) incorporates nearly 100% of the data points.
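The coverage percentages can be checked numerically. For a Gaussian, the fraction of the area within ±kσ of the center equals erf(k/√2), so a minimal Python sketch needs only the standard `math` module. The function names are illustrative assumptions.

```python
import math


def gaussian_pdf(x, true_value, sigma):
    """Gaussian probability density centered on true_value."""
    exponent = -((x - true_value) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (math.sqrt(2 * math.pi) * sigma)


def coverage(k):
    """Fraction of the area under the curve within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))


if __name__ == "__main__":
    # Peak height for sigma = 1, matching the top of the plotted curve.
    print("peak height:", round(gaussian_pdf(0.0, 0.0, 1.0), 4))
    for k in (1, 3, 6):
        print(f"+/- {k} sigma covers {coverage(k) * 100:.5f}%")
```

The peak height scales as 1/σ, so halving σ doubles the height of the curve while the total area stays equal to 1.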
Standard Deviation & Data Distribution
• The smaller the σ, the smaller the spread of data points.
[Figure: Gaussian curves for σ = 0.5, σ = 1.0, and σ = 2.0; x-axis: standard deviation (−4 to 4); y-axis: probability (0 to 0.8)]
Approximation of Standard Deviation
• Computing the standard deviation takes several steps; a good approximation exists that requires much less computation.
• š = Ř / √N
▪ Ř is the range of the data points, from the lowest value to the highest value: Ř = $x_{\text{max}} - x_{\text{min}}$
▪ N is the number of data points.
• For a small number of measurements, the approximation is accurate enough to replace the formal standard deviation.
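To get a feel for how close the range-based estimate is, the sketch below compares it with the formal sample standard deviation for one small data set. The function name is an assumption and the three values are illustrative; the agreement will vary with the data.

```python
import math
import statistics


def range_std_estimate(values):
    """Range-based estimate: (x_max - x_min) / sqrt(N)."""
    if len(values) < 2:
        raise ValueError("at least two measurements are required")
    return (max(values) - min(values)) / math.sqrt(len(values))


if __name__ == "__main__":
    data = [28.72, 28.40, 28.57]  # three illustrative replicate values
    print("range estimate:", round(range_std_estimate(data), 3))
    print("formal (n - 1):", round(statistics.stdev(data), 3))
```

For these three values the two results agree to within a few hundredths, which is the sense in which the shortcut is "accurate enough" for small data sets.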
Accuracy and Precision
Accuracy & Precision of Measurements
• Accuracy is a measure of the closeness of a measured quantity to
the true value.
• Precision is a measure of the agreement of replicate
measurements.
• Measurements can be precise but not accurate, accurate but not precise, or neither. The best result is, of course, both accurate and precise.
[Figure: four targets labeled "accurate and precise", "precise but not accurate", "accurate but not precise", and "not accurate and not precise"]
Example 1 of Accuracy and Precision
• Measured %S values in H₂SO₄ are 28.72%, 28.40%, and 28.57%; the true value is 32.69%. Determine the accuracy and precision.
• Answer:
▪ Mean: xM = (28.72% + 28.40% + 28.57%) / 3 = 28.56%
▪ Estimated precision by using the approximation š = Ř / √N:
❖ š = (28.72 − 28.40)% / √3 = 0.32% / 1.732 = 0.18%
▪ Relative standard deviation: sr = š / xM
❖ sr = 0.18% / 28.56% = 0.0063
▪ Accuracy = |X − xM| = |32.69% − 28.56%| = 4.13%
▪ Relative accuracy = Accuracy / True value = 4.13% / 32.69% = 0.126
• These results indicate that the data are precise but inaccurate.
Example 2 of Accuracy and Precision
• Measured %S values in H₂SO₄ are 28.89%, 32.56%, and 36.64%; the true value is 32.69%. Determine the accuracy and precision.
• Answer:
▪ Mean: xM = (28.89% + 32.56% + 36.64%) / 3 = 32.70%
▪ Estimated precision by using the approximation š = Ř / √N:
❖ š = (36.64 − 28.89)% / √3 = 7.75% / 1.732 = 4.47%
▪ Relative standard deviation: sr = š / xM
❖ sr = 4.47% / 32.70% = 0.137
▪ Accuracy = |X − xM| = |32.69% − 32.70%| = 0.01%
▪ Relative accuracy = Accuracy / True value = 0.01% / 32.69% = 0.0003
• These results indicate that the data are imprecise but accurate.
Example 3 of Accuracy and Precision
• Measured %S values in H₂SO₄ are 25.62%, 33.56%, and 27.93%; the true value is 32.69%. Determine the accuracy and precision.
• Answer:
▪ Mean: xM = (25.62% + 33.56% + 27.93%) / 3 = 29.04%
▪ Estimated precision by using the approximation š = Ř / √N:
❖ š = (33.56 − 25.62)% / √3 = 7.94% / 1.732 = 4.58%
▪ Relative standard deviation: sr = š / xM
❖ sr = 4.58% / 29.04% = 0.158
▪ Accuracy = |X − xM| = |32.69% − 29.04%| = 3.65%
▪ Relative accuracy = Accuracy / True value = 3.65% / 32.69% = 0.112
• These results indicate that the data are imprecise and inaccurate.
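The same calculation can be run for all three %S data sets at once. This is a sketch only: the function name and dictionary keys are assumptions, and because the code carries unrounded intermediate values, the last digit may differ slightly from a hand calculation that rounds at each step.

```python
import math

TRUE_VALUE = 32.69  # true %S in H2SO4, as given in the examples


def summarize(values, true_value):
    """Mean, range-based precision estimate, and accuracy for one data set."""
    n = len(values)
    x_mean = sum(values) / n
    s_est = (max(values) - min(values)) / math.sqrt(n)  # range / sqrt(N)
    accuracy = abs(true_value - x_mean)
    return {
        "mean": x_mean,
        "s_est": s_est,
        "rel_std": s_est / x_mean,
        "accuracy": accuracy,
        "rel_accuracy": accuracy / true_value,
    }


if __name__ == "__main__":
    examples = {
        "Example 1": [28.72, 28.40, 28.57],
        "Example 2": [28.89, 32.56, 36.64],
        "Example 3": [25.62, 33.56, 27.93],
    }
    for name, values in examples.items():
        result = summarize(values, TRUE_VALUE)
        print(name, {key: round(val, 4) for key, val in result.items()})
```

A small `rel_std` with a large `rel_accuracy` signals precise but inaccurate data; the reverse signals accurate but imprecise data.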
Data Rejection
• Replicate measurements of a given quantity are usually scattered.
▪ Some values are closer together than others.
• Which values should be kept (or discarded)?
▪ If a single result differs greatly from the others because of a particular error made by the experimenter, that result should be discarded.
▪ If a result is significantly "off" but there is no known error in the experiment, the result should, in general, be kept.
• If in doubt, use the rejection coefficient (Q) test.
• Do not discard any result just to get "good precision".
Q Test
• The Q test is used to test the extreme values (the highest and lowest values).
• Procedure
▪ Calculate the range.
❖ Range = $x_{\text{max}} - x_{\text{min}}$
▪ Calculate the difference between each extreme value and its nearest neighbor.
❖ $d_{\text{hi}} = x_{\text{max}} - x_{\text{nbor,hi}}$; $d_{\text{lo}} = |x_{\text{min}} - x_{\text{nbor,lo}}|$
▪ Calculate the ratio (Q value) of the difference to the range.
❖ $Q_{\text{hi}} = d_{\text{hi}} / \text{Range}$; $Q_{\text{lo}} = d_{\text{lo}} / \text{Range}$
• Compare the resulting Q value with the rejection table at the 90% confidence level (or another selected confidence level).
▪ If the calculated Q value is greater than the Q value given in the table, reject the value.
Rejection Q Table
Number of data   Q90    Q96    Q99
3                0.94   0.98   0.99
4                0.76   0.85   0.93
5                0.64   0.73   0.82
6                0.56   0.64   0.74
7                0.51   0.59   0.68
8                0.47   0.54   0.63
9                0.44   0.51   0.60
10               0.41   0.48   0.57
Q Test - Example
• Data: 35.00, 35.05, 35.10, 35.80
• Calculate the range.
▪ Range = $x_{\text{max}} - x_{\text{min}}$ = 35.80 − 35.00 = 0.80
• Calculate the difference between each extreme value and its nearest neighbor.
▪ $d_{\text{hi}} = x_{\text{max}} - x_{\text{nbor,hi}}$ = 35.80 − 35.10 = 0.70
▪ $d_{\text{lo}} = |x_{\text{min}} - x_{\text{nbor,lo}}|$ = |35.00 − 35.05| = 0.05
• Calculate the Q values (the ratio of each difference to the range).
▪ $Q_{\text{hi}} = d_{\text{hi}} / \text{Range}$ = 0.70 / 0.80 = 0.88
▪ $Q_{\text{lo}} = d_{\text{lo}} / \text{Range}$ = 0.05 / 0.80 = 0.063
• Compare the resulting Q values with the rejection table at the 90% confidence level.
▪ For 4 data points, the Q value in the table is 0.76.
▪ $Q_{\text{hi}}$ > 0.76; therefore, the highest value, 35.80, can be dropped.
▪ $Q_{\text{lo}}$ < 0.76; therefore, the lowest value, 35.00, is kept.
▪ Once a value is dropped, it is no longer in the data set and should not be used in calculating the mean and the various deviations.
Number of data   Q90
3                0.94
4                0.76
5                0.64
6                0.56
7                0.51
8                0.47
9                0.44
10               0.41
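The Q test procedure and the worked example above can be sketched in Python as follows. The Q90 values are copied from the rejection table; the function names and the returned dictionary layout are assumptions made for illustration.

```python
# Q values at the 90% confidence level, keyed by number of data points.
Q90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56, 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}


def q_values(data):
    """Return (Q_lo, Q_hi) for the lowest and highest values in data."""
    xs = sorted(data)
    if len(xs) < 3:
        raise ValueError("the Q test needs at least three values")
    data_range = xs[-1] - xs[0]
    if data_range == 0:
        raise ValueError("all values are identical; the range is zero")
    q_lo = (xs[1] - xs[0]) / data_range    # gap to nearest neighbor, low end
    q_hi = (xs[-1] - xs[-2]) / data_range  # gap to nearest neighbor, high end
    return q_lo, q_hi


def q_test(data, table=Q90):
    """Flag each extreme value whose Q exceeds the tabulated Q."""
    n = len(data)
    if n not in table:
        raise ValueError(f"no tabulated Q value for {n} data points")
    q_lo, q_hi = q_values(data)
    return {"reject_low": q_lo > table[n], "reject_high": q_hi > table[n]}


if __name__ == "__main__":
    data = [35.00, 35.05, 35.10, 35.80]
    q_lo, q_hi = q_values(data)
    print("Q_lo:", round(q_lo, 3), "Q_hi:", round(q_hi, 3))
    print(q_test(data))
```

Passing a different dictionary as `table` would apply another confidence level, such as the Q96 or Q99 column.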

More Related Content

What's hot

Solution
SolutionSolution
Solution
Usman Shah
 
Non aqueous titration
Non aqueous titrationNon aqueous titration
Non aqueous titration
Shashank shekher mishra
 
Chemical kinetics
Chemical kineticsChemical kinetics
Chem 2 - Chemical Kinetics III - Determining the Rate Law with the Method of ...
Chem 2 - Chemical Kinetics III - Determining the Rate Law with the Method of ...Chem 2 - Chemical Kinetics III - Determining the Rate Law with the Method of ...
Chem 2 - Chemical Kinetics III - Determining the Rate Law with the Method of ...
Lumen Learning
 
GC-S010-Nomenclature
GC-S010-NomenclatureGC-S010-Nomenclature
GC-S010-Nomenclature
henry kang
 
Chapter17
Chapter17Chapter17
Redox titration for mpharm ist year
Redox titration for mpharm ist year Redox titration for mpharm ist year
Redox titration for mpharm ist year
prakash64742
 
Predicting products cheat sheet
Predicting products cheat sheetPredicting products cheat sheet
Predicting products cheat sheet
Timothy Welsh
 
IR Spectroscopy
IR SpectroscopyIR Spectroscopy
IR Spectroscopy
Ämzâd Hûssåiñ
 
Column selectivity HPLC Vanquish
Column selectivity HPLC VanquishColumn selectivity HPLC Vanquish
Column selectivity HPLC Vanquish
Oskari Aro
 
Hoofdstuk 3 - Conductometrie
Hoofdstuk 3 - ConductometrieHoofdstuk 3 - Conductometrie
Hoofdstuk 3 - Conductometrie
Tom Mortier
 
Potassium permanganate titrations
Potassium permanganate titrationsPotassium permanganate titrations
Potassium permanganate titrations
Hardeep Kaur
 
Acid base titration
Acid base titrationAcid base titration
Acid base titration
Debbra Marcel
 
Ionic equilibrium
Ionic equilibriumIonic equilibrium
Ionic equilibrium
Shivani Singh
 
Ionic equilibria | chemical equilibria |Types of electrolyte |Degree of disso...
Ionic equilibria | chemical equilibria |Types of electrolyte |Degree of disso...Ionic equilibria | chemical equilibria |Types of electrolyte |Degree of disso...
Ionic equilibria | chemical equilibria |Types of electrolyte |Degree of disso...
shubhada walawalkar
 
Acid base equilibria
Acid base equilibriaAcid base equilibria
Acid base equilibria
Suresh Selvaraj
 
general chemistry ch1
general chemistry ch1general chemistry ch1
general chemistry ch1
Hülya Saraç
 
Distillation
DistillationDistillation
Distillation
bhagwadgeeta
 
Derivation of spreading coefficient
Derivation of spreading coefficientDerivation of spreading coefficient
Derivation of spreading coefficient
Sonu Sharma
 

What's hot (19)

Solution
SolutionSolution
Solution
 
Non aqueous titration
Non aqueous titrationNon aqueous titration
Non aqueous titration
 
Chemical kinetics
Chemical kineticsChemical kinetics
Chemical kinetics
 
Chem 2 - Chemical Kinetics III - Determining the Rate Law with the Method of ...
Chem 2 - Chemical Kinetics III - Determining the Rate Law with the Method of ...Chem 2 - Chemical Kinetics III - Determining the Rate Law with the Method of ...
Chem 2 - Chemical Kinetics III - Determining the Rate Law with the Method of ...
 
GC-S010-Nomenclature
GC-S010-NomenclatureGC-S010-Nomenclature
GC-S010-Nomenclature
 
Chapter17
Chapter17Chapter17
Chapter17
 
Redox titration for mpharm ist year
Redox titration for mpharm ist year Redox titration for mpharm ist year
Redox titration for mpharm ist year
 
Predicting products cheat sheet
Predicting products cheat sheetPredicting products cheat sheet
Predicting products cheat sheet
 
IR Spectroscopy
IR SpectroscopyIR Spectroscopy
IR Spectroscopy
 
Column selectivity HPLC Vanquish
Column selectivity HPLC VanquishColumn selectivity HPLC Vanquish
Column selectivity HPLC Vanquish
 
Hoofdstuk 3 - Conductometrie
Hoofdstuk 3 - ConductometrieHoofdstuk 3 - Conductometrie
Hoofdstuk 3 - Conductometrie
 
Potassium permanganate titrations
Potassium permanganate titrationsPotassium permanganate titrations
Potassium permanganate titrations
 
Acid base titration
Acid base titrationAcid base titration
Acid base titration
 
Ionic equilibrium
Ionic equilibriumIonic equilibrium
Ionic equilibrium
 
Ionic equilibria | chemical equilibria |Types of electrolyte |Degree of disso...
Ionic equilibria | chemical equilibria |Types of electrolyte |Degree of disso...Ionic equilibria | chemical equilibria |Types of electrolyte |Degree of disso...
Ionic equilibria | chemical equilibria |Types of electrolyte |Degree of disso...
 
Acid base equilibria
Acid base equilibriaAcid base equilibria
Acid base equilibria
 
general chemistry ch1
general chemistry ch1general chemistry ch1
general chemistry ch1
 
Distillation
DistillationDistillation
Distillation
 
Derivation of spreading coefficient
Derivation of spreading coefficientDerivation of spreading coefficient
Derivation of spreading coefficient
 

Viewers also liked

GC-S006-Graphing
GC-S006-GraphingGC-S006-Graphing
GC-S006-Graphing
henry kang
 
Determination of the accuracy of linear and volumetric measurements on CBCT i...
Determination of the accuracy of linear and volumetric measurements on CBCT i...Determination of the accuracy of linear and volumetric measurements on CBCT i...
Determination of the accuracy of linear and volumetric measurements on CBCT i...
enasanter
 
Accuracy & Precision
Accuracy & PrecisionAccuracy & Precision
Accuracy & Precision
TekZeno
 
Accuracy and Precision
Accuracy and PrecisionAccuracy and Precision
Accuracy and Precision
Simple ABbieC
 
I010315762
I010315762I010315762
I010315762
IOSR Journals
 
Power point estrada
Power point estradaPower point estrada
Power point estrada
alexhiithazz
 
dissertation_final_sarpakunnas
dissertation_final_sarpakunnasdissertation_final_sarpakunnas
dissertation_final_sarpakunnas
Tuomas Sarpakunnas
 
I0814852
I0814852I0814852
I0814852
IOSR Journals
 
TireAngel Telematics 2014-12
TireAngel Telematics 2014-12TireAngel Telematics 2014-12
TireAngel Telematics 2014-12
Xuelin Zhou
 
B017250715
B017250715B017250715
B017250715
IOSR Journals
 
Ouranos hemeljesuschristus
Ouranos hemeljesuschristusOuranos hemeljesuschristus
Ouranos hemeljesuschristus
Lord Jesus Christ
 
Conoscere l'editoria 2
Conoscere l'editoria 2Conoscere l'editoria 2
Conoscere l'editoria 2
Cosi Repossi
 
Jeshua february2016
Jeshua february2016Jeshua february2016
Jeshua february2016
Lord Jesus Christ
 
H017235155
H017235155H017235155
H017235155
IOSR Journals
 
Derivation and Application of Six-Point Linear Multistep Numerical Method for...
Derivation and Application of Six-Point Linear Multistep Numerical Method for...Derivation and Application of Six-Point Linear Multistep Numerical Method for...
Derivation and Application of Six-Point Linear Multistep Numerical Method for...
IOSR Journals
 
Using Kentico EMS to optimize the B2B sales process
Using Kentico EMS to optimize the B2B sales processUsing Kentico EMS to optimize the B2B sales process
Using Kentico EMS to optimize the B2B sales process
James Williamson
 
Madcom analyzes the need for broadband in eastern pa
Madcom analyzes the need for broadband in eastern paMadcom analyzes the need for broadband in eastern pa
Madcom analyzes the need for broadband in eastern pa
Rich Frank
 
D018132226
D018132226D018132226
D018132226
IOSR Journals
 
LordJeshuainheritancemay2016
LordJeshuainheritancemay2016LordJeshuainheritancemay2016
LordJeshuainheritancemay2016
Lord Jesus Christ
 

Viewers also liked (20)

GC-S006-Graphing
GC-S006-GraphingGC-S006-Graphing
GC-S006-Graphing
 
Determination of the accuracy of linear and volumetric measurements on CBCT i...
Determination of the accuracy of linear and volumetric measurements on CBCT i...Determination of the accuracy of linear and volumetric measurements on CBCT i...
Determination of the accuracy of linear and volumetric measurements on CBCT i...
 
Accuracy & Precision
Accuracy & PrecisionAccuracy & Precision
Accuracy & Precision
 
Accuracy and Precision
Accuracy and PrecisionAccuracy and Precision
Accuracy and Precision
 
I010315762
I010315762I010315762
I010315762
 
Power point estrada
Power point estradaPower point estrada
Power point estrada
 
dissertation_final_sarpakunnas
dissertation_final_sarpakunnasdissertation_final_sarpakunnas
dissertation_final_sarpakunnas
 
I0814852
I0814852I0814852
I0814852
 
TireAngel Telematics 2014-12
TireAngel Telematics 2014-12TireAngel Telematics 2014-12
TireAngel Telematics 2014-12
 
startup_inside_FINAL
startup_inside_FINALstartup_inside_FINAL
startup_inside_FINAL
 
B017250715
B017250715B017250715
B017250715
 
Ouranos hemeljesuschristus
Ouranos hemeljesuschristusOuranos hemeljesuschristus
Ouranos hemeljesuschristus
 
Conoscere l'editoria 2
Conoscere l'editoria 2Conoscere l'editoria 2
Conoscere l'editoria 2
 
Jeshua february2016
Jeshua february2016Jeshua february2016
Jeshua february2016
 
H017235155
H017235155H017235155
H017235155
 
Derivation and Application of Six-Point Linear Multistep Numerical Method for...
Derivation and Application of Six-Point Linear Multistep Numerical Method for...Derivation and Application of Six-Point Linear Multistep Numerical Method for...
Derivation and Application of Six-Point Linear Multistep Numerical Method for...
 
Using Kentico EMS to optimize the B2B sales process
Using Kentico EMS to optimize the B2B sales processUsing Kentico EMS to optimize the B2B sales process
Using Kentico EMS to optimize the B2B sales process
 
Madcom analyzes the need for broadband in eastern pa
Madcom analyzes the need for broadband in eastern paMadcom analyzes the need for broadband in eastern pa
Madcom analyzes the need for broadband in eastern pa
 
D018132226
D018132226D018132226
D018132226
 
LordJeshuainheritancemay2016
LordJeshuainheritancemay2016LordJeshuainheritancemay2016
LordJeshuainheritancemay2016
 

Similar to GC-S005-DataAnalysis

Statistics chm 235
Statistics chm 235Statistics chm 235
Statistics chm 235
Alex Robianes Hernandez
 
Statistics
StatisticsStatistics
Statistics
megamsma
 
lecture-2.ppt
lecture-2.pptlecture-2.ppt
lecture-2.ppt
Noorelhuda2
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or VarianceEstimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
Long Beach City College
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
Long Beach City College
 
presentation_statistics_1448025870_153985.ppt
presentation_statistics_1448025870_153985.pptpresentation_statistics_1448025870_153985.ppt
presentation_statistics_1448025870_153985.ppt
AKSAKS12
 
Ch3_Statistical Analysis and Random Error Estimation.pdf
Ch3_Statistical Analysis and Random Error Estimation.pdfCh3_Statistical Analysis and Random Error Estimation.pdf
Ch3_Statistical Analysis and Random Error Estimation.pdf
Vamshi962726
 
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdfDr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
HassanMohyUdDin2
 
Measures of Variation
Measures of Variation Measures of Variation
Measures of Variation
Long Beach City College
 
Measures of Dispersion
Measures of DispersionMeasures of Dispersion
Measures of Dispersion
KainatIqbal7
 
Variance & standard deviation
Variance & standard deviationVariance & standard deviation
Variance & standard deviation
Faisal Hussain
 
Standard deviation and standard error
Standard deviation and standard errorStandard deviation and standard error
Standard deviation and standard error
Shahla Yasmin
 
Unit-I Measures of Dispersion- Biostatistics - Ravinandan A P.pdf
Unit-I Measures of Dispersion- Biostatistics - Ravinandan A P.pdfUnit-I Measures of Dispersion- Biostatistics - Ravinandan A P.pdf
Unit-I Measures of Dispersion- Biostatistics - Ravinandan A P.pdf
Ravinandan A P
 
9618821.pdf
9618821.pdf9618821.pdf
9618821.pdf
UMAIRASHFAQ20
 
9618821.ppt
9618821.ppt9618821.ppt
9618821.ppt
UMAIRASHFAQ20
 
Statistics 3, 4
Statistics 3, 4Statistics 3, 4
Statistics 3, 4
Diana Diana
 
Quantitative Analysis for Emperical Research
Quantitative Analysis for Emperical ResearchQuantitative Analysis for Emperical Research
Quantitative Analysis for Emperical Research
Amit Kamble
 
Measure of dispersion
Measure of dispersionMeasure of dispersion
Measure of dispersion
Waqar Abbasi
 
Measures of Dispersion .pptx
Measures of Dispersion .pptxMeasures of Dispersion .pptx
Measures of Dispersion .pptx
Vishal543707
 
Cairo 02 Stat Inference
Cairo 02 Stat InferenceCairo 02 Stat Inference
Cairo 02 Stat Inference
ahmad bassiouny
 

Similar to GC-S005-DataAnalysis (20)

Statistics chm 235
Statistics chm 235Statistics chm 235
Statistics chm 235
 
Statistics
StatisticsStatistics
Statistics
 
lecture-2.ppt
lecture-2.pptlecture-2.ppt
lecture-2.ppt
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or VarianceEstimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
 
presentation_statistics_1448025870_153985.ppt
presentation_statistics_1448025870_153985.pptpresentation_statistics_1448025870_153985.ppt
presentation_statistics_1448025870_153985.ppt
 
Ch3_Statistical Analysis and Random Error Estimation.pdf
Ch3_Statistical Analysis and Random Error Estimation.pdfCh3_Statistical Analysis and Random Error Estimation.pdf
Ch3_Statistical Analysis and Random Error Estimation.pdf
 
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdfDr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
 
Measures of Variation
Measures of Variation Measures of Variation
Measures of Variation
 
Measures of Dispersion
Measures of DispersionMeasures of Dispersion
Measures of Dispersion
 
Variance & standard deviation
Variance & standard deviationVariance & standard deviation
Variance & standard deviation
 
Standard deviation and standard error
Standard deviation and standard errorStandard deviation and standard error
Standard deviation and standard error
 
Unit-I Measures of Dispersion- Biostatistics - Ravinandan A P.pdf
Unit-I Measures of Dispersion- Biostatistics - Ravinandan A P.pdfUnit-I Measures of Dispersion- Biostatistics - Ravinandan A P.pdf
Unit-I Measures of Dispersion- Biostatistics - Ravinandan A P.pdf
 
9618821.pdf
9618821.pdf9618821.pdf
9618821.pdf
 
9618821.ppt
9618821.ppt9618821.ppt
9618821.ppt
 
Statistics 3, 4
Statistics 3, 4Statistics 3, 4
Statistics 3, 4
 
Quantitative Analysis for Emperical Research
Quantitative Analysis for Emperical ResearchQuantitative Analysis for Emperical Research
Quantitative Analysis for Emperical Research
 
Measure of dispersion
Measure of dispersionMeasure of dispersion
Measure of dispersion
 
Measures of Dispersion .pptx
Measures of Dispersion .pptxMeasures of Dispersion .pptx
Measures of Dispersion .pptx
 
Cairo 02 Stat Inference
Cairo 02 Stat InferenceCairo 02 Stat Inference
Cairo 02 Stat Inference
 

More from henry kang

GC-S009-Substances
GC-S009-SubstancesGC-S009-Substances
GC-S009-Substances
henry kang
 
GC-S008-Mass&Mole
GC-S008-Mass&MoleGC-S008-Mass&Mole
GC-S008-Mass&Mole
henry kang
 
GC-S007-Atom
GC-S007-AtomGC-S007-Atom
GC-S007-Atom
henry kang
 
GC-S004-ScientificNotation
GC-S004-ScientificNotationGC-S004-ScientificNotation
GC-S004-ScientificNotation
henry kang
 
GC-S003-Measurement
GC-S003-MeasurementGC-S003-Measurement
GC-S003-Measurement
henry kang
 
GC-S002-Matter
GC-S002-MatterGC-S002-Matter
GC-S002-Matter
henry kang
 
RC3-deScreen_s
RC3-deScreen_sRC3-deScreen_s
RC3-deScreen_s
henry kang
 
RC2-filterDesign_s
RC2-filterDesign_sRC2-filterDesign_s
RC2-filterDesign_s
henry kang
 
GenChem000-WhatIsChemistry
GenChem000-WhatIsChemistryGenChem000-WhatIsChemistry
GenChem000-WhatIsChemistry
henry kang
 
GenChem001-ScientificMethod
GenChem001-ScientificMethodGenChem001-ScientificMethod
GenChem001-ScientificMethod
henry kang
 

More from henry kang (10)

GC-S009-Substances
GC-S009-SubstancesGC-S009-Substances
GC-S009-Substances
 
GC-S008-Mass&Mole
GC-S008-Mass&MoleGC-S008-Mass&Mole
GC-S008-Mass&Mole
 
GC-S007-Atom
GC-S007-AtomGC-S007-Atom
GC-S007-Atom
 
GC-S004-ScientificNotation
GC-S004-ScientificNotationGC-S004-ScientificNotation
GC-S004-ScientificNotation
 
GC-S003-Measurement
GC-S003-MeasurementGC-S003-Measurement
GC-S003-Measurement
 
GC-S002-Matter
GC-S002-MatterGC-S002-Matter
GC-S002-Matter
 
RC3-deScreen_s
RC3-deScreen_sRC3-deScreen_s
RC3-deScreen_s
 
RC2-filterDesign_s
RC2-filterDesign_sRC2-filterDesign_s
RC2-filterDesign_s
 
GenChem000-WhatIsChemistry
GenChem000-WhatIsChemistryGenChem000-WhatIsChemistry
GenChem000-WhatIsChemistry
 
GenChem001-ScientificMethod
GenChem001-ScientificMethodGenChem001-ScientificMethod
GenChem001-ScientificMethod
 

GC-S005-DataAnalysis

  • 1. Henry R. Kang (1/2010) General Chemistry Lecture 5 Statistical Data Analysis
  • 2. Henry R. Kang (7/2008) Outlines • Fundamental Statistics • Accuracy and Precision • Data Rejection
  • 3. Henry R. Kang (1/2010) Accuracy & Precision • Accuracy  Accuracy is a measure of the closeness of a measured quantity to the true value. • Precision  How close two or more measurements of the quantity agree with one another.  Precision is a measure of the agreement of replicate measurements.
  • 4. Henry R. Kang (7/2008) Fundamental Statistics
  • 5. Henry R. Kang (7/2008) Errors • All Measurements Contain Errors. • Types of Errors  Systematic errors  One-sided errors (either positive or negative) • Usually from a single source • Resulting data are consistently high or low  Results may be precise but inaccurate • Examples: Balance is incorrectly zeroed. Use incorrect constant for calculations.  Random errors  Randomly occurred  Positive and negative deviations occur with equal frequency and size. • A bell shape curve (Gaussian or normal distribution)  The source of the error is usually not known
  • 6. Henry R. Kang (7/2008) Gaussian Distribution • Gaussian distribution gives the distribution of data points with respect to the true value. It gives a bell-shaped curve as shown in the figure.  The closer to the true value, the higher the probability. 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 -3 -2 -1 0 1 2 3 Standard Deviation Probability
  • 7. Henry R. Kang (7/2008) Measuring Accuracy • Percent Error  If the true value is known • Part Per Thousand (PPT) • Part Per Million (PPM) • Unfortunately, the true value is often not known. % error = | true value – experimental value | | True value | × 100 PPT = | true value – experimental value | | true value – experimental value | | True value | | True value | × 1000 × 106 PPM =
  • 8. Henry R. Kang (7/2008) Measuring Precision • Mean (or Average) • Deviation and Absolute Deviation • Absolute Average Deviation • Relative Deviation • Relative Average Deviation (RAD) • Standard Deviation • Relative Standard Deviation
  • 9. Henry R. Kang (7/2008) Mean (Average) • For multiple measurements of a given quantity, we have numerical values x1, x2, x3, - - - -, xn, where n is the number of measurements. • Sum is defined as Sum = x1 + x2 + x3 + - - - + xn = ∑ xi • Mean xavg is defined as ∑ xiSum n n=xavg =
  • 10. Henry R. Kang (7/2008) Deviation & Absolute Deviations • Deviation is the difference (or variation) of a single measurement, xi, away from the mean value, xavg.  d1 = x1 – xavg  d2 = x2 – xavg  d3 = x3 – xavg  -- - -- -- - -- --  -- - -- -- - -- --  dn = xn – xavg • Absolute deviation is always positive.  d1 = | x1 – xavg|  d2 = | x2 – xavg|  d3 = |x3 – xavg|  -- - -- -- - -- --  -- - -- -- - -- --
  • 11. Henry R. Kang (7/2008) Absolute Average Deviation • Absolute average deviation, davg, is the arithmetic mean of individual absolute deviations, di. d1 = | x1 – xavg| d2 = | x2 – xavg| d3 = | x3 – xavg| --------- --- --------- --- dn = | xn – xavg| ∑ di n=davg
  • 12. Henry R. Kang (7/2008) Relative Deviation • Relative deviation, Di, is the ratio of individual absolute deviations, di, to the mean value, xavg. D1 = d1 / xavg = | x1 – xavg| / xavg D2 = d2 / xavg = | x2 – xavg| / xavg D3 = d3 / xavg = | x3 – xavg| / xavg ------------ Di = di / xavg = | xi – xavg| / xavg ------------
  • 13. Henry R. Kang (7/2008) Relative Average Deviation • Relative average deviation (RAD) is the absolute average deviation relative to the mean xavg A precision of 3 ppt or less is considered very good. RAD (ppt) = × 1000 davg xavg
  • 14. Henry R. Kang (7/2008) Standard Deviation • Standard deviation (σ) is useful in estimating data points distribution in the form of the Gaussian distribution (a bell-shaped curve).  (xavg ± σ) incorporates 68.3% of the data points.  (xavg ± 3σ) incorporates 99.7% of the data points.  The smaller the σ, the less spread of data points.  d1 = x1 – xavg d2 = x2 – xavg d3 = x3 – xavg ------------ dn = xn – xavg ∑ di 2 n – 1 =σ √ = √ d1 2 + d2 2 + d3 2 + - - - - + dn 2 n – 1
  • 15. Henry R. Kang (7/2008) Relative Standard Deviation • Relative standard deviation (σr) is the standard deviation relative to the mean value.  d1 = x1 – xavg d2 = x2 – xavg d3 = x3 – xavg --------- --- dn = xn – xavg where n is the number of measurements ∑ (di /xavg)2 n – 1 =σr √ = √ D1 2 +D2 2 +D3 2 + - - - - +Dn 2 n – 1 or σr (ppt) = (σ / xavg ) × 1000
  • 16. Henry R. Kang (7/2008) Gaussian Distribution
    • Gaussian distribution gives the distribution of data points with respect to the true value. It gives a bell-shaped curve as shown in the figure.
    [Figure: bell-shaped Gaussian curve; x-axis: standard deviation (−3 to 3); y-axis: probability (0 to 0.4)]
    • The Gaussian equation is $P(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - X)^2}{2\sigma^2}\right]$, where σ is the standard deviation and X is the true value.
      ◦ The closer to the true value, the higher the probability.
      ◦ The area under the curve (the integral of the Gaussian function) over an interval gives the fraction of data points expected in that interval.
      ◦ ($x_{true} \pm \sigma$) incorporates 68.3% of the data points.
      ◦ ($x_{true} \pm 3\sigma$) incorporates 99.7% of the data points.
      ◦ ($x_{true} \pm 3.8906\sigma$) incorporates 99.99% of the data points.
      ◦ ($x_{true} \pm 4.4172\sigma$) incorporates 99.999% of the data points.
      ◦ ($x_{true} \pm 6\sigma$) incorporates nearly 100% of the data points.
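The coverage percentages follow from integrating the Gaussian function. For an interval of ±kσ around the true value, the integral reduces to erf(k/√2). A minimal sketch using only the standard library is shown here; the function names are illustrative assumptions.

```python
import math


def gaussian_pdf(x, true_value, sigma):
    """Gaussian probability density P(x) centered on the true value."""
    return math.exp(-((x - true_value) ** 2) / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi)
    )


def coverage(k):
    """Fraction of a Gaussian distribution lying within +/- k sigma."""
    return math.erf(k / math.sqrt(2))


for k in (1, 3, 6):
    print(f"+/- {k} sigma covers {coverage(k) * 100:.7f}% of the data points")
```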
  • 17. Henry R. Kang (7/2008) Standard Deviation & Data Distribution
    • The smaller the σ, the less the data points spread.
    [Figure: Gaussian curves for σ = 0.5, σ = 1.0, and σ = 2.0; x-axis: standard deviation (−4 to 4); y-axis: probability (0 to 0.8)]
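One way to see why a smaller σ gives a narrower, taller curve: the peak of the Gaussian function, at the true value, equals 1/(σ√(2π)), and the total area stays fixed at 1. The short sketch below evaluates that peak for the three σ values in the figure; the function name is an assumption.

```python
import math


def peak_height(sigma):
    """Height of the Gaussian curve at its center, 1 / (sigma * sqrt(2 * pi))."""
    return 1.0 / (sigma * math.sqrt(2 * math.pi))


for sigma in (0.5, 1.0, 2.0):
    print(f"sigma = {sigma}: peak height = {peak_height(sigma):.3f}")
```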
  • 18. Henry R. Kang (7/2008) Approximation of Standard Deviation
    • Computing the standard deviation takes a fair amount of work, so a good approximation that needs much less computation is often used instead.
    • š = Ř / √N
      ◦ Ř is the range of the data points, from the lowest value to the highest value: Ř = xmax − xmin
      ◦ N is the number of data points.
    • For a small number of measurements, the approximation is accurate enough to replace the formal standard deviation.
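To get a feel for how the range-based estimate compares with the formal standard deviation, the two can be computed side by side. The sketch below uses the three %S values from Example 1 of this lecture. The function name `range_estimate` is an assumption, and `statistics.stdev` is the standard-library sample standard deviation (n − 1 denominator).

```python
import math
import statistics


def range_estimate(values):
    """Approximate standard deviation: range divided by sqrt(N)."""
    return (max(values) - min(values)) / math.sqrt(len(values))


data = [28.72, 28.40, 28.57]   # %S values from Example 1
print(f"range-based estimate      = {range_estimate(data):.3f}")
print(f"formal standard deviation = {statistics.stdev(data):.3f}")
```

For this small data set the two numbers are of the same size but not identical, which is what one would expect from an approximation.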
  • 19. Henry R. Kang (7/2008) Accuracy and Precision
  • 20. Henry R. Kang (1/2010) Accuracy & Precision of Measurements
    • Accuracy is a measure of the closeness of a measured quantity to the true value.
    • Precision is a measure of the agreement of replicate measurements.
    • Measurements can be precise but not accurate, accurate but not precise, or neither. The best result is, of course, both accurate and precise.
    [Figure: four panels labeled "accurate & precise", "precise but not accurate", "accurate but not precise", and "not accurate & not precise"]
  • 21. Henry R. Kang (1/2010) Example 1 of Accuracy and Precision
    • Measured %S values in H2SO4 are 28.72%, 28.40%, and 28.57%, where the true value is 32.69%. Determine the accuracy and precision.
    • Answer:
      ◦ Mean = (28.72% + 28.40% + 28.57%) / 3 = 28.56%
      ◦ Estimated precision by using the approximation: š = Ř / √N
      ◦ š = (28.72 − 28.40)% / √3 = 0.32% / 1.732 = 0.18%
      ◦ Relative standard deviation: sr = š / xM
      ◦ sr = 0.18% / 28.56% = 0.0063
      ◦ Accuracy = |X − xM| = |32.69% − 28.56%| = 4.13%
      ◦ Relative accuracy = Accuracy / True value = 4.13% / 32.69% = 0.126
    • These results indicate that the data are precise but inaccurate.
  • 22. Henry R. Kang (1/2010) Example 2 of Accuracy and Precision
    • Measured %S values in H2SO4 are 28.89%, 32.56%, and 36.64%, where the true value is 32.69%. Determine the accuracy and precision.
    • Answer:
      ◦ Mean = (28.89% + 32.56% + 36.64%) / 3 = 32.70%
      ◦ Estimated precision by using the approximation: š = Ř / √N
      ◦ š = (36.64 − 28.89)% / √3 = 7.75% / 1.732 = 4.47%
      ◦ Relative standard deviation: sr = š / xM
      ◦ sr = 4.47% / 32.70% = 0.137
      ◦ Accuracy = |X − xM| = |32.69% − 32.70%| = 0.01%
      ◦ Relative accuracy = Accuracy / True value = 0.01% / 32.69% = 0.0003
    • These results indicate that the data are imprecise but accurate.
  • 23. Henry R. Kang (1/2010) Example 3 of Accuracy and Precision
    • Measured %S values in H2SO4 are 25.62%, 33.56%, and 27.93%, where the true value is 32.69%. Determine the accuracy and precision.
    • Answer:
      ◦ Mean = (25.62% + 33.56% + 27.93%) / 3 = 29.04%
      ◦ Estimated precision by using the approximation: š = Ř / √N
      ◦ š = (33.56 − 25.62)% / √3 = 7.94% / 1.732 = 4.58%
      ◦ Relative standard deviation: sr = š / xM
      ◦ sr = 4.58% / 29.04% = 0.158
      ◦ Accuracy = |X − xM| = |32.69% − 29.04%| = 3.65%
      ◦ Relative accuracy = Accuracy / True value = 3.65% / 32.69% = 0.112
    • These results indicate that the data are imprecise and inaccurate.
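The arithmetic in the three examples follows one pattern, which can be sketched as a single helper. The function name `summarize` and the dictionary keys are hypothetical. The code carries full floating-point precision, so its output may differ in the last digit from slide values that were rounded at intermediate steps.

```python
import math

TRUE_VALUE = 32.69  # %S in H2SO4, as given in the examples


def summarize(values, true_value):
    """Mean, range-based precision estimate, and accuracy for replicate data."""
    n = len(values)
    mean = sum(values) / n
    s_est = (max(values) - min(values)) / math.sqrt(n)   # s = R / sqrt(N)
    accuracy = abs(true_value - mean)                    # |X - x_M|
    return {
        "mean": mean,
        "s_est": s_est,
        "rel_std": s_est / mean,
        "accuracy": accuracy,
        "rel_accuracy": accuracy / true_value,
    }


examples = {
    "Example 1": [28.72, 28.40, 28.57],
    "Example 2": [28.89, 32.56, 36.64],
    "Example 3": [25.62, 33.56, 27.93],
}

for name, values in examples.items():
    r = summarize(values, TRUE_VALUE)
    print(
        f"{name}: mean = {r['mean']:.2f}%, s = {r['s_est']:.2f}%, "
        f"sr = {r['rel_std']:.4f}, accuracy = {r['accuracy']:.2f}%, "
        f"relative accuracy = {r['rel_accuracy']:.4f}"
    )
```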
  • 24. Henry R. Kang (7/2008) Data Rejection
  • 25. Henry R. Kang (7/2008) Data Rejection
    • Replicate measurements of a given quantity are usually scattered.
      ◦ Some values are closer together than others.
    • Which values to keep (or which values to discard)
      ◦ If a single result differs greatly from the others because of a known error by the experimenter, then this result should be discarded.
      ◦ If a result is significantly "off" but there was no error in the experiment, then the result should, in general, be kept.
    • If in doubt, use the rejection coefficient (Q) test.
    • Do not discard any result just to get "good precision".
  • 26. Henry R. Kang (7/2008) Q Test
    • The Q test is used to test the extreme values (the highest and lowest values).
    • Procedure
      ◦ Calculate the range: Range = xmax − xmin
      ◦ Calculate the difference between each extreme value and its nearest neighbor: dhi = xmax − xnbor,hi; dlo = |xmin − xnbor,lo|
      ◦ Calculate the ratio (Q value) of the difference to the range: Qhi = dhi / Range; Qlo = dlo / Range
    • Compare the resulting Q value with the rejection table at the 90% confidence level (or another selected confidence level).
      ◦ If the calculated Q value is greater than the Q value given in the table, then reject the value.
  • 27. Henry R. Kang (7/2008) Rejection Q Tables
      Number of Data   Q90    Q96    Q99
      3                0.94   0.98   0.99
      4                0.76   0.85   0.93
      5                0.64   0.73   0.82
      6                0.56   0.64   0.74
      7                0.51   0.59   0.68
      8                0.47   0.54   0.63
      9                0.44   0.51   0.60
      10               0.41   0.48   0.57
  • 28. Henry R. Kang (7/2008) Q Test - Example
    • Data: 35.00, 35.05, 35.10, 35.80
    • Calculate the range.
      ◦ Range = xmax − xmin = 35.80 − 35.00 = 0.80
    • Calculate the difference between each extreme value and its nearest neighbor.
      ◦ dhi = xmax − xnbor,hi = 35.80 − 35.10 = 0.70
      ◦ dlo = |xmin − xnbor,lo| = |35.00 − 35.05| = 0.05
    • Calculate the Q values as the ratio of each difference to the range.
      ◦ Qhi = dhi / Range = 0.70 / 0.80 = 0.88
      ◦ Qlo = dlo / Range = 0.05 / 0.80 = 0.063
    • Compare the resulting Q values with the rejection table at the 90% confidence level.
      ◦ For 4 samples, the Q90 value in the table is 0.76.
      ◦ Qhi > 0.76; therefore, the highest value, 35.80, can be dropped.
      ◦ Qlo < 0.76; therefore, the lowest value, 35.00, is kept.
      ◦ Once a value is dropped, it is no longer in the data set and should not be used in calculating the mean and the various deviations.
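The Q test procedure can be sketched in Python as follows. The function name `q_test` and the returned dictionary keys are assumptions made for illustration; the critical values are the Q90 column of the lecture's rejection table.

```python
# Q90 critical values from the lecture's rejection table (3 to 10 data points).
Q90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56, 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}


def q_test(values, q_table=Q90):
    """Apply the Q test to the highest and lowest values of a data set."""
    data = sorted(values)
    n = len(data)
    if n not in q_table:
        raise ValueError("the table only covers 3 to 10 data points")
    data_range = data[-1] - data[0]
    if data_range == 0:
        raise ValueError("all values are identical; the Q test does not apply")
    q_hi = (data[-1] - data[-2]) / data_range   # highest value vs. its neighbor
    q_lo = (data[1] - data[0]) / data_range     # lowest value vs. its neighbor
    q_crit = q_table[n]
    return {
        "q_hi": q_hi,
        "q_lo": q_lo,
        "q_crit": q_crit,
        "reject_high": q_hi > q_crit,
        "reject_low": q_lo > q_crit,
    }


result = q_test([35.00, 35.05, 35.10, 35.80])
print(f"Q_hi = {result['q_hi']:.3f}, reject highest value: {result['reject_high']}")
print(f"Q_lo = {result['q_lo']:.3f}, reject lowest value: {result['reject_low']}")
```

For the example data, the sketch flags only the highest value, 35.80, for rejection at the 90% confidence level.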