The document discusses measures of dispersion, which describe how spread out or varied data values are from the average. Dispersion is important because averages alone do not reveal the full nature of how concentrated or scattered the data is. Some key measures of dispersion include range, interquartile range, mean deviation, variance, and standard deviation. These help provide a more complete picture of the data distribution compared to averages alone. Measures of dispersion are useful for tasks like quality control, data analysis, and comparing variability in different data sets.
The document discusses measures of dispersion used to describe how spread out or varied data values are around a central measure like the mean. It defines dispersion as the degree of scatteredness of data values and explains that measures of dispersion coupled with measures of central tendency provide a more complete picture of a data distribution than central measures alone. The document then covers different types of dispersion measures, both absolute measures expressed in data units and relative measures for comparing distributions with different units. Specific measures discussed include range, interquartile range, mean deviation, variance, and standard deviation.
The document discusses measures of dispersion, which describe how varied or spread out a data set is around the average value. It defines several measures of dispersion, including range, interquartile range, mean deviation, and standard deviation. The standard deviation is described as the most important measure because it takes into account all values in the data set. It is also described as not overly influenced by outliers, although squaring the deviations in fact makes it sensitive to extreme values. The document provides a detailed example of calculating the standard deviation: find each value's difference from the mean, square those differences, sum them, divide by the number of values (or by one less for a sample), and take the square root.
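The calculation steps summarized above can be sketched in a few lines of Python. This is a minimal illustration, not the summarized document's own example: the function name `population_std` and the data are hypothetical, and the population form (dividing by the number of values) is assumed.

```python
import math

def population_std(values):
    """Population standard deviation, computed step by step."""
    n = len(values)
    mean = sum(values) / n                              # central value
    squared_diffs = [(x - mean) ** 2 for x in values]   # squared deviations from the mean
    variance = sum(squared_diffs) / n                   # average squared deviation
    return math.sqrt(variance)                          # back to the original units

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5
print(population_std(scores))      # 2.0
```

For a sample rather than a whole population, the sum of squared differences would usually be divided by `n - 1` instead of `n`.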
This document discusses measures of central tendency and dispersion in statistics. It defines central tendency as a single value that describes the center of a data distribution. Common measures include the mean, median, and mode. The mean is the average value calculated by adding all values and dividing by the total number. The median is the middle value when data is ordered from lowest to highest. The mode is the most frequent value. Dispersion measures the spread of data and includes the range, mean deviation, standard deviation, and variance. Standard deviation summarizes how far data points are from the mean. Variance is the square of the standard deviation. The document provides examples of calculating these measures and their characteristics and uses.
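As a small illustration of the three central measures named above, the following sketch uses Python's standard `statistics` module. The data values are hypothetical and chosen only so that the three measures differ.

```python
import statistics

values = [1, 2, 2, 3, 7]  # hypothetical data

mean_value = statistics.mean(values)      # sum of values divided by their count
median_value = statistics.median(values)  # middle value of the ordered data
mode_value = statistics.mode(values)      # most frequent value

print(mean_value, median_value, mode_value)  # 3 2 2
```

The single large value 7 pulls the mean above the median, which is one reason a measure of spread is reported alongside the center.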
Lecture. Introduction to Statistics (Measures of Dispersion).pptx (NabeelAli89)
1) The document discusses various measures of dispersion used to quantify how spread out or varied a set of data values is around the average.
2) There are two types of dispersion - absolute dispersion measures how varied data values are in the original units, while relative dispersion compares variability between datasets with different units.
3) Common measures of absolute dispersion include range, variance, and standard deviation. Range is the difference between highest and lowest values, while variance and standard deviation take into account how far all values are from the mean.
This document provides an overview of measures of variation that can be used to analyze and describe datasets. It discusses absolute measures like range, interquartile range, variance and standard deviation, which provide a measure of variability within a single dataset. It also discusses the relative measure of coefficient of variation, which can be used to compare variability across different datasets even if they use different units of measurement. The learning objectives are to be able to calculate these dispersion measures, understand their strengths and limitations, and properly interpret what they convey about a dataset's spread or variability.
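To illustrate why a relative measure is needed when units differ, here is a hedged sketch of the coefficient of variation, taken as the standard deviation divided by the mean and expressed as a percentage. The function name and both data sets are hypothetical, and the population standard deviation is assumed.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / mean * 100

heights_cm = [160, 170, 180]  # hypothetical data in centimetres
weights_kg = [50, 70, 90]     # hypothetical data in kilograms

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)
print(f"heights: {cv_heights:.1f}%  weights: {cv_weights:.1f}%")
```

Because the result is unit-free, the two percentages can be compared directly: in this made-up data the weights vary more, relative to their mean, than the heights do.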
The document discusses various concepts related to variability and measures of dispersion in statistics:
- Variability refers to the spread or deviation of scores from the mean in a data set. Measures of variability quantify how concentrated or dispersed the data is.
- Common measures of variability include range, quartile deviation, mean deviation, variance, standard deviation, and coefficient of variation. Range simply measures the difference between the highest and lowest scores, while the other measures account for dispersion across all scores.
- The standard deviation is the most widely used measure of variability as it expresses dispersion in the same units as the original data. It quantifies how far scores deviate from the mean on average.
- Understanding variability is important for determining if averages
This document provides an overview of biostatistics. It defines biostatistics as the branch of statistics dealing with biological and medical data, especially relating to humans. Some key points covered include:
- Descriptive statistics are used to describe data through methods like graphs and quantitative measures. Inferential statistics are used to characterize populations based on sample results.
- Biostatistics applies statistical techniques to collect, analyze, and interpret data from biological studies and health/medical research. It is used for tasks like evaluating vaccine effectiveness and informing public health priorities.
- Common analyses in biostatistics include measures of central tendency like the mean, median, and mode to summarize data, and measures of dispersion to quantify variation. Frequency distributions are
The document discusses measures of variability in statistics including range, interquartile range, standard deviation, and variance. It provides examples of calculating each measure using sample data sets. The range is the difference between the highest and lowest values, while the interquartile range is the difference between the third and first quartiles. The standard deviation represents the average amount of dispersion from the mean, and variance is the average of the squared deviations from the mean. Both standard deviation and variance increase with greater variability in the data set.
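The range and interquartile range described above can be sketched as follows. The helper names and data are hypothetical, and the "inclusive" quartile method of `statistics.quantiles` is an assumption: textbooks use several quartile conventions that can give slightly different results.

```python
import statistics

def value_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)

def interquartile_range(values):
    """Difference between the third and first quartiles (inclusive method assumed)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical data

print(value_range(data))          # 8
print(interquartile_range(data))  # 4.0
```

Unlike the range, the interquartile range ignores the lowest and highest quarters of the data, so a single extreme value changes it very little.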
Measures of dispersion are of two types: absolute measures and graphical measures. Other types are also covered.
The points discussed in these slides are:
1. Dispersion and its types
2. Definition
3. Use
4. Merits
5. Demerits
6. Formula & math
7. Graph and pictures
8. Real life application.
This document discusses measures of dispersion, which characterize how spread out values are from the central tendency in a data set. It defines absolute and relative measures of dispersion. Absolute measures indicate variation in raw units, while relative measures are dimensionless and allow comparison between data sets. The document focuses on range and coefficient of range as simple absolute and relative measures of dispersion. It provides examples of calculating range and coefficient of range from data sets and exercises for the reader to practice.
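A minimal sketch of the two measures this summary focuses on is given below. It assumes the common textbook definition of the coefficient of range, (largest − smallest) / (largest + smallest); the function names and data are hypothetical.

```python
def value_range(values):
    """Absolute measure: largest value minus smallest value, in the data's units."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """Relative, unit-free measure: (largest - smallest) / (largest + smallest)."""
    largest, smallest = max(values), min(values)
    if largest + smallest == 0:
        raise ValueError("coefficient of range is undefined when max + min is zero")
    return (largest - smallest) / (largest + smallest)

data = [10, 20, 30, 40, 50]  # hypothetical data

print(value_range(data))                     # 40
print(round(coefficient_of_range(data), 4))  # 0.6667
```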
This document provides an overview of key concepts in statistics. It discusses how statistics is used to collect, organize, summarize, present, and analyze numerical data to derive valid conclusions. It defines common statistical terminology like data, quantitative vs. qualitative data, measures of central tendency (mean, median, mode), measures of variability (range, standard deviation), the normal distribution curve, and coefficient of variation. The document also explains common statistical tests like the z-test, t-test, ANOVA, chi-square test and concepts like sensitivity and specificity. Overall, the document serves as a high-level introduction to foundational statistical methods and analyses.
This document discusses various measures of dispersion used to quantify how spread out or variable a data set is. It defines dispersion and explains the purposes of measuring it. The key measures of dispersion discussed are range, quartile deviation, mean deviation, variance, standard deviation, and coefficient of variation. Formulas are provided for calculating each measure along with their merits and limitations. The conclusion emphasizes that measures of dispersion are useful for comparing distributions and further statistical analysis.
This document discusses various measures used to describe data, including measures of central tendency (mean, median, mode) and measures of variation (range, variance, standard deviation). It provides definitions and formulas for calculating different statistical measures, along with their properties and appropriate uses. Measures of central tendency indicate the central or typical value of a data set, while measures of variation describe how spread out or dispersed the data are around the central value. The document compares absolute and relative measures and discusses specific measures like range, quartile deviation, average deviation, and standard deviation.
This document describes how to characterize the distribution of a quantitative variable in three steps: reporting the center, deviations from the center, and general shape. It discusses various measures of central tendency (mean, median, mode), variation (range, standard deviation, average deviation), and distribution shape (normal curve, skewness). The mean, median, and mode are introduced as measures of central tendency, along with how to calculate each one. Measures of variation like range, standard deviation, and average deviation are also defined and the formulas to compute them provided. Finally, the document discusses the normal distribution curve and how skewness indicates a distribution's departure from symmetry.
Measure of dispersion refers to statistical measures that quantify how data values are spread out or vary from the average value. There are various measures of dispersion including standard deviation, mean deviation, variance, range, and quartile deviation. These measures capture how concentrated or scattered the data is and help analyze the characteristics of data sets. Common measures include range, which is the difference between highest and lowest values; variance and standard deviation, which measure average deviation from the mean; and mean deviation, which averages the absolute deviations from the mean or median. Measures can be absolute, using the same units as the data, or relative by standardizing against the mean. Skewness and kurtosis further characterize the shape and outliers of a distribution.
Descriptions of data statistics for research (Harve Abella)
This document defines and describes various measures of central tendency and variation that are used to summarize and describe sets of data. It discusses the mean, median, mode, midrange, percentiles, quartiles, range, variance, standard deviation, interquartile range, coefficient of variation, measures of skewness and kurtosis. Examples are provided to demonstrate how to compute and interpret these statistical measures.
Statistical analysis is an important tool for researchers to analyze collected data. There are two major areas of statistics: descriptive statistics which develops indices to describe data, and inferential statistics which tests hypotheses and generalizes findings. Descriptive statistics measures central tendency (mean, median, mode), dispersion (range, standard deviation), and skewness. Relationship between variables is measured using correlation and regression analysis. Statistical tools help summarize large datasets, identify patterns, and make reliable inferences.
This document summarizes various statistical measures used to analyze and describe data distributions, including measures of central tendency (mean, median, mode), dispersion (range, standard deviation, variance), skewness, and kurtosis. It provides formulas and methods for calculating each measure along with interpretations of the results. Measures of central tendency provide a single value to represent the center of the data set. Measures of dispersion describe how spread out or varied the data values are. Skewness and kurtosis measure the symmetry and peakedness of distributions compared to the normal curve.
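Skewness and kurtosis can be illustrated with a moment-based sketch. The definitions used here (third and fourth standardized central moments of the population) are one common convention and are an assumption, since the summary does not say which formulas the document uses. All names and data are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment: average of (x - mean) ** k."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """Third standardized moment; 0 for a symmetric data set."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Fourth standardized moment; the normal curve has a value of 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [1, 2, 3, 4, 5]         # hypothetical data
right_skewed = [1, 1, 2, 2, 3, 10]  # hypothetical data with a long right tail

print(skewness(symmetric))          # 0.0
print(kurtosis(symmetric))
print(skewness(right_skewed) > 0)   # True
```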
STATISTICS.pptx for the scholars and students (ssuseref12b21)
The document provides an overview of statistics, including definitions, types, and key concepts. It defines statistics as the science of collecting, presenting, analyzing, and interpreting data. It discusses descriptive statistics, which summarize and organize raw data, and inferential statistics, which allow generalization from samples to populations. The document also covers variables, scales of measurement, measures of central tendency (mean, median, mode), measures of dispersion (range, standard deviation), and other statistical terminology.
This document provides an introduction to analyzing experimental errors and data. It discusses evaluating potential sources of errors before, during, and after an analysis. There are two types of experimental errors - determinate errors that affect accuracy and indeterminate errors that affect precision. Determinate errors can be constant or proportional while indeterminate errors are random. The document outlines various sources of these errors and methods to identify and minimize them, such as analyzing samples of different sizes or using reference standards.
A teacher calculated the standard deviation of test scores to see how closely students scored to the mean grade of 65%. She found the standard deviation was high, indicating that scores were widely spread and that outliers may have pulled the mean down. An employer also calculated the standard deviation to analyze salary fairness, finding it slightly high because long-time employees earned more. Standard deviation measures dispersion from the mean, with low values showing close grouping and high values showing a wider spread. It is calculated as the square root of the variance, that is, the sum of the squared differences from the mean divided by the number of values.
In this lesson, students are shown that measures of central tendency alone are not enough to describe a data set, by scrutinizing two different data sets with the same measures of central tendency. We illustrate this using data on stock returns in which not only the mean, median and mode are the same but also other measures of location such as the minimum and maximum. However, the spread of the observations is different, which means that to further describe the data sets we need additional measures of dispersion, i.e. range, interquartile range, variance, standard deviation, and coefficient of variation. The standard deviation can also be viewed as a measure of risk, specifically when investing in the stock market: the smaller the standard deviation, the smaller the risk.
MEANING OF DISPERSION: In statistics, this term is commonly used to mean scatter, deviation, fluctuation, spread, or variability of data. The degree to which the individual values of the variate scatter away from the average or central value is called dispersion.
The document discusses various measures of central tendency including mean, median, and mode. It provides definitions and examples of calculating each measure. The mean is the sum of all values divided by the number of values and is the most commonly used measure. The median is the middle value when values are arranged in order. The mode is the value that occurs most frequently. The document also discusses weighted mean, geometric mean, and harmonic mean as alternatives to the standard mean in some situations.
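The alternative means mentioned above can be sketched with Python's standard library. The weighted-mean helper and all data are hypothetical; `statistics.geometric_mean` and `statistics.harmonic_mean` are standard-library functions.

```python
import statistics

def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical exam scores where the second exam counts three times as much.
print(weighted_mean([80, 90], [1, 3]))        # 87.5

# Geometric mean suits multiplicative data such as growth factors.
print(statistics.geometric_mean([1, 4, 16]))  # approximately 4.0

# Harmonic mean suits rates, e.g. equal distances driven at 40 and 60 km/h.
print(statistics.harmonic_mean([40, 60]))     # approximately 48.0
```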
This document provides an overview of key concepts in psychological statistics. It defines statistics as procedures for organizing, summarizing, and interpreting information using facts and figures. It discusses populations and samples, variables and data, parameters and statistics, descriptive and inferential statistics, sampling error, and experimental and nonexperimental methods. It also covers scales of measurement, frequency distributions, measures of central tendency and variability, and the importance of measurement in research.
There are several measures of variability used to determine how scores are spread out around the center, including range, mean deviation, variance, and standard deviation. Range is simply the difference between the highest and lowest scores, while mean deviation is the average absolute distance of all scores from the mean. Variance measures how far scores deviate from the mean by squaring the differences and averaging them, providing a measure of precision. It is calculated by subtracting the mean from each score, squaring the differences, summing them, and dividing by the total number of scores.
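The mean deviation described above differs from the variance in that it averages absolute rather than squared distances. A minimal sketch, with a hypothetical function name and data:

```python
def mean_deviation(values):
    """Average absolute distance of the values from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

scores = [2, 4, 6, 8, 10]  # hypothetical data, mean = 6

print(mean_deviation(scores))  # 2.4
```

Taking absolute values keeps positive and negative deviations from cancelling out; without that step the deviations from the mean would always sum to zero.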
This document provides an overview of descriptive statistics and related concepts. It begins with an introduction to descriptive analysis and then covers various types of variables and levels of measurement. It describes measures of central tendency including mean, median and mode. Measures of dispersion like range, standard deviation and normal distribution are also discussed. The document also covers measures of asymmetry, relationship and concludes with emphasizing the importance of statistical planning in research.
This document discusses measures of dispersion in statistics. It defines dispersion as the extent of variation in a data set from the average value. There are two main types of dispersion - absolute and relative. Absolute measures express variation in units of the data and include range, variance, standard deviation, and quartile deviation. Relative measures allow comparison between data sets by being unit-free, such as the coefficient of variation. Key absolute measures are then explained in more detail, along with their merits and demerits.
Deep learning is a subset of machine learning that uses artificial neural networks. Neural networks are composed of interconnected layers of nodes that process input data. Activation functions introduce non-linearity between layers to increase the model's ability to learn complex patterns. Models are trained via backpropagation to minimize loss by adjusting weights to better match predictions to actual outputs. Overfitting can occur if the model becomes too complex for the data.
The document discusses different machine learning algorithms including supervised learning algorithms like regression and classification. It also discusses unsupervised and semi-supervised learning used in recommendation systems. A large portion of the document is dedicated to evaluating machine learning model performance using classification metrics like accuracy, recall, precision and confusion matrices.
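The classification metrics named above can all be derived from the four counts in a binary confusion matrix. The sketch below uses the standard definitions; the function name and the counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all predictions that are correct
    precision = tp / (tp + fp)                  # share of predicted positives that are correct
    recall = tp / (tp + fn)                     # share of actual positives that are found
    return accuracy, precision, recall

# Hypothetical counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
accuracy, precision, recall = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(accuracy, precision, round(recall, 4))  # 0.7 0.8 0.6667
```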
There are several measures of variability used to determine how scores are distributed around the center and spread out, including range, mean deviation, variance, and standard deviation. Range is simply the difference between the highest and lowest scores, while mean deviation is the average distance of all scores from the mean. Variance measures how far scores deviate from the mean by squaring the differences and averaging them, providing a measure of precision. It is calculated by subtracting each score from the mean, squaring the differences, summing them, and dividing by the total number of scores.
This document provides an overview of descriptive statistics and related concepts. It begins with an introduction to descriptive analysis and then covers various types of variables and levels of measurement. It describes measures of central tendency including mean, median and mode. Measures of dispersion like range, standard deviation and normal distribution are also discussed. The document also covers measures of asymmetry, relationship and concludes with emphasizing the importance of statistical planning in research.
This document discusses measures of dispersion in statistics. It defines dispersion as the extent of variation in a data set from the average value. There are two main types of dispersion - absolute and relative. Absolute measures express variation in units of the data and include range, variance, standard deviation, and quartile deviation. Relative measures allow comparison between data sets by being unit-free, such as the coefficient of variation. Key absolute measures are then explained in more detail, along with their merits and demerits.
Similar to Topic 4 Measures of Dispersion.pptx (20)
Deep learning is a subset of machine learning that uses artificial neural networks. Neural networks are composed of interconnected layers of nodes that process input data. Activation functions introduce non-linearity between layers to increase the model's ability to learn complex patterns. Models are trained via backpropagation to minimize loss by adjusting weights to better match predictions to actual outputs. Overfitting can occur if the model becomes too complex for the data.
The document discusses different machine learning algorithms including supervised learning algorithms like regression and classification. It also discusses unsupervised and semi-supervised learning used in recommendation systems. A large portion of the document is dedicated to evaluating machine learning model performance using classification metrics like accuracy, recall, precision and confusion matrices.
Artificial Neural Networks are computer systems inspired by biological neural networks in the brain. They are made up of interconnected nodes that process information using a connectionist approach to computation. ANNs can be used to model complex relationships between inputs and outputs and discover hidden patterns in data.
The document discusses data warehousing, data mining, and business intelligence. It defines each topic and explains their key processes and purposes. Data warehousing involves collecting, storing, and managing large amounts of data from different sources for analysis and decision making. Data mining analyzes large datasets to identify patterns and relationships for informed decisions. Business intelligence provides technologies and methods to analyze business data for insights, performance improvement, and informed decision making.
The document discusses database management systems (DBMS) and relational database management systems (RDBMS). It defines key concepts like data, structured, semi-structured and unstructured data, databases, tables, relationships, and SQL. A DBMS stores data across various formats and provides features for data validation, integrity, and sharing. An RDBMS is designed for structured data in tables with relationships and uses SQL. The document provides examples of creating tables and programming in SQL with queries, inserts, updates and joins.
Regression analysis is used to predict the value of a dependent variable based on the value of one or more independent variables. The dependent variable is what we want to predict, while the independent variables are what we use to explain the dependent variable. Simple linear regression uses one independent variable to describe the linear relationship between it and the dependent variable, assuming changes in the dependent variable are caused by changes in the independent variable. Multiple regression extends this to use two or more independent variables.
The document discusses different machine learning algorithms including supervised learning algorithms like regression and classification. It also discusses unsupervised and semi-supervised learning used in recommendation systems. A large portion of the document is dedicated to evaluating machine learning model performance using classification metrics like accuracy, recall, precision and confusion matrices. It provides definitions for these key evaluation metrics.
This document provides an introduction to decision theory and different methods for decision making under uncertainty and risk. It defines the key elements of decision theory as actions/alternatives, states of nature, outcomes, and objective variables. For decision making under uncertainty when probabilities are not known, it describes non-probability methods like maximax, maximin, and minimax regret. Maximax seeks to maximize the maximum possible outcome, maximin seeks to maximize the minimum outcome, while minimax regret takes a more balanced approach weighing both profits and losses.
This document provides an overview of business analytics and reasons for learning data analytics. It discusses different levels of business analytics from descriptive to predictive and prescriptive. Descriptive analytics describes what happened in the past while predictive analytics predicts the future. The document also introduces some statistical methods used in analytics like descriptive statistics, measures of central tendency, and data visualization techniques.
This document provides summaries of topics related to business information systems including: market basket analysis, global information systems, prototyping, change management, optimization, competitive advantages, electronic data interchange, business process management, and cyber security. It defines each topic and provides key details about components, types, and the importance of each within business contexts.
Artificial intelligence, machine learning, deep learning, and expert systems are all related fields involving the simulation of human intelligence in machines. Machine learning and deep learning are subfields of artificial intelligence where systems are able to learn from data to perform tasks without being explicitly programmed. Expert systems are a type of artificial intelligence application that uses a knowledge base of expert information to solve complex problems and provide expert-level advice.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms.
The document discusses data warehousing, data mining, and business intelligence. It defines data warehousing as a solution for fast analysis of information that operational systems cannot provide, due to limitations like unavailable historical data and poor query performance. It describes the architecture of data warehousing and lists databases, data warehouses, and transactional data as sources for data mining. The data mining process involves data collection, feature extraction, cleaning, and analytical algorithms. Common techniques are discussed as well. Business intelligence is defined as converting corporate data through processing and analysis into useful information and knowledge to trigger profitable business decisions.
This document discusses probability distributions, including binomial and Poisson distributions. It defines key terms like random variables, discrete and continuous probability distributions, and the assumptions and constants of binomial distributions. Specifically, it explains that a binomial distribution describes experiments with two possible outcomes (success and failure) where there is a fixed number of trials, the probability of success is the same for each trial, and trial results are independent. The mean of a binomial distribution is np and the variance is npq, where n is the number of trials, p is the probability of success, and q is the probability of failure.
Basic probability concepts are introduced including experiments, outcomes, events, sample space, elementary events, simple and joint probabilities. Key terms like mutually exclusive, independent and dependent events are defined. Formulas for calculating probabilities of simple, joint, union and intersection of events are provided. Examples of tossing coins, rolling dice and selecting items from sets are used to illustrate concepts. Probability relationships like complement, addition rule for mutually exclusive events and general addition rule are explained using Venn diagrams and examples.
The document provides an introduction to basic database terminology and concepts. It defines key terms like data, data item, entity, entity set, record, file, key, and information. It then discusses common data organization issues such as data redundancy, inconsistency, difficulty accessing data, isolation, integrity problems, and security issues that databases aim to address. It provides an overview of the difference between file systems and database management systems (DBMS), and how DBMS solutions are better suited to organizing large amounts of structured data for efficient querying and sharing across users.
Basic probability concepts are introduced including experiments, outcomes, events, sample space, and definitions of probability. Probability is defined numerically between 0 and 1. Key terms like elementary events, joint events, and mutually exclusive events are explained. Formulas for calculating probability of single events, multiple events, unions, and intersections of events are provided. Venn diagrams are used to illustrate relationships between events. Examples demonstrate calculating probability for independent and dependent events using multiplication rules and conditional probability.
This document discusses the system development life cycle (SDLC) process for developing IT solutions within an organization. The SDLC includes 5 phases - investigation, analysis, design, implementation, and maintenance. The analysis phase involves gathering requirements and modeling the system using tools like data flow diagrams to understand how data will flow through the various processes. This helps identify what needs to be done and how during the design phase.
E-commerce refers to the buying and selling of goods or services using the internet, and involves several types of online transactions between businesses and consumers. It allows for a low-cost way for businesses to access global markets and consumers to conveniently shop online. Key aspects of e-commerce include business-to-business (B2B), business-to-consumer (B2C), and consumer-to-consumer (C2C) transactions, as well as the historical development and common processes of online shopping.
This document discusses covariance and correlation. It begins by providing an example dataset showing the age and speed of motorcycles. It then defines covariance as a measure of how much two random variables vary together, while correlation measures the degree of relationship between variables. Covariance can be positive, negative, or zero, indicating the direction of the relationship. Correlation is standardized to always be between -1 and 1. The document provides formulas for covariance, correlation, and discusses different types of correlation based on variables, linearity, and other factors. It provides examples of calculating correlation coefficients and interpreting the results.
How to Create a Stage or a Pipeline in Odoo 17 CRMCeline George
ย
Using CRM module, we can manage and keep track of all new leads and opportunities in one location. It helps to manage your sales pipeline with customizable stages. In this slide letโs discuss how to create a stage or pipeline inside the CRM module in odoo 17.
Brand Guideline of Bashundhara A4 Paper - 2024khabri85
ย
It outlines the basic identity elements such as symbol, logotype, colors, and typefaces. It provides examples of applying the identity to materials like letterhead, business cards, reports, folders, and websites.
Artificial Intelligence (AI) has revolutionized the creation of images and videos, enabling the generation of highly realistic and imaginative visual content. Utilizing advanced techniques like Generative Adversarial Networks (GANs) and neural style transfer, AI can transform simple sketches into detailed artwork or blend various styles into unique visual masterpieces. GANs, in particular, function by pitting two neural networks against each other, resulting in the production of remarkably lifelike images. AI's ability to analyze and learn from vast datasets allows it to create visuals that not only mimic human creativity but also push the boundaries of artistic expression, making it a powerful tool in digital media and entertainment industries.
Information and Communication Technology in EducationMJDuyan
ย
(๐๐๐ ๐๐๐) (๐๐๐ฌ๐ฌ๐จ๐ง 2)-๐๐ซ๐๐ฅ๐ข๐ฆ๐ฌ
๐๐ฑ๐ฉ๐ฅ๐๐ข๐ง ๐ญ๐ก๐ ๐๐๐ ๐ข๐ง ๐๐๐ฎ๐๐๐ญ๐ข๐จ๐ง:
Students will be able to explain the role and impact of Information and Communication Technology (ICT) in education. They will understand how ICT tools, such as computers, the internet, and educational software, enhance learning and teaching processes. By exploring various ICT applications, students will recognize how these technologies facilitate access to information, improve communication, support collaboration, and enable personalized learning experiences.
๐๐ข๐ฌ๐๐ฎ๐ฌ๐ฌ ๐ญ๐ก๐ ๐ซ๐๐ฅ๐ข๐๐๐ฅ๐ ๐ฌ๐จ๐ฎ๐ซ๐๐๐ฌ ๐จ๐ง ๐ญ๐ก๐ ๐ข๐ง๐ญ๐๐ซ๐ง๐๐ญ:
-Students will be able to discuss what constitutes reliable sources on the internet. They will learn to identify key characteristics of trustworthy information, such as credibility, accuracy, and authority. By examining different types of online sources, students will develop skills to evaluate the reliability of websites and content, ensuring they can distinguish between reputable information and misinformation.
How to Create User Notification in Odoo 17Celine George
ย
This slide will represent how to create user notification in Odoo 17. Odoo allows us to create and send custom notifications on some events or actions. We have different types of notification such as sticky notification, rainbow man effect, alert and raise exception warning or validation.
Environmental science 1.What is environmental science and components of envir...Deepika
ย
Environmental science for Degree ,Engineering and pharmacy background.you can learn about multidisciplinary of nature and Natural resources with notes, examples and studies.
1.What is environmental science and components of environmental science
2. Explain about multidisciplinary of nature.
3. Explain about natural resources and its types
Post init hook in the odoo 17 ERP ModuleCeline George
ย
In Odoo, hooks are functions that are presented as a string in the __init__ file of a module. They are the functions that can execute before and after the existing code.
2. Introduction
• An average, or measure of central tendency, tells us where the center of the data lies but not how the items of the set are distributed around that center.
• Two sets may have the same average, yet the items in one may scatter widely around the center while in the other all the items lie close to the average.
• Example: Consider the minimum temperatures recorded during one week of winter in two cities, A and B.
City A: 10°, 12°, 8°, 9°, 6°, 4°, 8°
City B: 0°, 12°, 8°, 14°, 11°, 4°, 8°
The average of both data sets is 8.14°, i.e. the average minimum temperature during the week is the same in both cities. In City B, however, the values lie farther from the average of 8.14° and are more scattered.
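The two-city example can be checked with a short Python sketch. The variable names and the `spread` helper are illustrative choices, and the range (largest minus smallest value) is used here only as a quick indicator of scatter.

```python
from statistics import mean

# Minimum temperatures for the week, as listed in the example
city_a = [10, 12, 8, 9, 6, 4, 8]
city_b = [0, 12, 8, 14, 11, 4, 8]

def spread(values):
    """Range of the data: largest value minus smallest value."""
    return max(values) - min(values)

for name, temps in (("City A", city_a), ("City B", city_b)):
    print(f"{name}: mean = {mean(temps):.2f}, range = {spread(temps)}")
# City A: mean = 8.14, range = 8
# City B: mean = 8.14, range = 14
```

Both cities share the same mean, but the range of City B is nearly twice that of City A, which is the scatter the average alone hides.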
3. What Is Dispersion?
• Thus the average alone does not give a full picture of a set of observations. A further description of the degree of scatteredness is needed to describe the data better.
• The meaning of dispersion is "scatteredness."
• The degree to which numerical data tend to spread around an average value is called the variation or dispersion of the data.
4. Same measures of central tendency but different spread (variability)
[Figure: two cash-flow series with the same mean, one labelled "No Variability in Cash Flow" and the other "Variability in Cash Flow"]
6. • Measures of dispersion (or variability), coupled with measures of central tendency, give a fairly good (though not complete) idea of the nature of a distribution.
• For a complete idea of the distribution, we also need measures of skewness and kurtosis.
• Dispersion is the spread or scatter of values about a measure of central tendency.
• A measure of dispersion is designed to state the extent to which individual observations (or items) vary from their average.
• Only the amount of variation, not its direction, is taken into account.
• It is measured as an average deviation about a central value.
7. Need to study dispersion
• The value of dispersion indicates how reliable a measure of central tendency is.
• Usually, a small value of dispersion indicates that the measure of central tendency is more reliable, and vice versa.
• Many powerful analytical tools in statistics, such as correlation analysis, hypothesis testing, analysis of variance, statistical quality control, and regression analysis, are based on measures of variation of one kind or another.
• The degree of spread also helps in analyzing the importance of different components of a system. For a financial analyst, for example, it is important to know the dispersion of a firm's earnings: if the earnings are highly dispersed, i.e. varying from extremely low to very high, this indicates a higher risk to the creditor or stockholder. Similarly, for a quality control expert, a drug that is average in purity but ranges from very pure to highly impure may endanger lives.
• Dispersion is also used to compare the uniformity of different data such as income, temperature, rainfall, weight, height, etc.
8. • It is useful to determine the nature and cause of variation in order to control the variation itself.
• Health: variations in body temperature, pulse rate, and blood pressure are basic guides to diagnosis.
• A greater amount of dispersion means a lack of uniformity or consistency in the data. In such a case no average will reliably represent the series.
• It helps us determine whether a measure of central tendency truly represents the series.
9. Importance of Dispersion
• Conclusions drawn from central tendency alone carry little meaning without knowing how the items of the series vary from the average.
• Inequalities in the distribution of wealth and income can be measured by dispersion.
• Dispersion is used to compare and measure the concentration of economic power and monopoly in a country.
11. Classification of Measures of Dispersion
A measure of dispersion is always a non-negative real number. If all the individual observations are identical to the measure of central tendency, the dispersion is zero. As the deviation of the observations from the central tendency increases, the dispersion also increases, but it never becomes negative.
There are two types of measures of dispersion:
1. Absolute measures of dispersion: Absolute measures of dispersion are expressed in the same unit as the data.
2. Relative measures of dispersion: Relative measures of dispersion are useful in comparing two sets of data which have different units of measurement. Relative measures of dispersion are pure, unitless numbers and are generally called coefficients of dispersion.
12. Absolute vs Relative Measures of Dispersion
• Absolute measures are expressed in the units of measurement, while a relative measure is a ratio and is independent of the units of measurement.
• Absolute measures cannot be used to compare the variability of two distributions expressed in different units.
• Distributions are compared with regard to their variability about a central value by means of relative measures of dispersion.
13. Methods of Measuring Dispersion
The following are some of the important and widely used methods of measuring dispersion:
1. Range
2. Interquartile range and Quartile deviation
3. Mean deviation or average deviation
4. Variance
5. Standard deviation
14. Properties of a Good Measure of Dispersion
1. A good measure of dispersion should have characteristics similar to those of a good measure of central tendency.
2. It should be clearly defined, leaving no scope for subjectivity in its computation or its interpretation.
3. It should be easy to compute, understand and interpret. All individual observations should be used in its estimation, and it should be free from bias, including bias due to any extreme value.
4. Since dispersion is also used to estimate many more complex statistical properties of data, a measure of dispersion should lend itself easily to algebraic operations.
5. Finally, a measure of dispersion should be least affected by sampling fluctuations, i.e. it should have a high degree of sampling stability.
15. Range:
For raw data range is defined as the difference between the smallest and
the greatest values in a distribution.
Symbolically R= L-S
where L is the largest observation, S the smallest observation, and R the
range.
Range is an absolute measure of dispersion. The relative measure of dispersion for range is called the coefficient of range and is calculated by the following formula:
$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$
Thus in the coefficient of range, the range L − S is standardized by L + S.
16. Range:
In the case of grouped data, the range is estimated as the difference between the upper limit of the highest class interval and the lower limit of the lowest class interval.
Symbolically R = ULI − LFI
where ULI is the upper limit of the last class interval, LFI is the lower limit of the first class interval, and R is the range.
To make it free of units, the relative measure of range is defined as
$$\text{Coefficient of Range} = \frac{\text{ULI} - \text{LFI}}{\text{ULI} + \text{LFI}}$$
17. Illustration 1
Following are the wages of 8 workers of a factory. Find the range and the
coefficient of range. Wages in Dollars 1400, 1450, 1520, 1380, 1485, 1495,
1575, 1440.
18. Illustration 1 Solution
• Here the largest value L = 1575 and the smallest value S = 1380.
Range = L − S = 1575 − 1380 = 195
$$\text{Coefficient of Range} = \frac{L - S}{L + S} = \frac{1575 - 1380}{1575 + 1380} = \frac{195}{2955} \approx 0.066$$
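The arithmetic of Illustration 1 can be reproduced with a minimal Python sketch. The function names `value_range` and `coefficient_of_range` are illustrative, and the sketch assumes L + S is not zero.

```python
def value_range(values):
    """Range R = L - S (largest minus smallest observation)."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """Coefficient of range (L - S) / (L + S); assumes L + S != 0."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)

# Wages of the 8 workers in Illustration 1
wages = [1400, 1450, 1520, 1380, 1485, 1495, 1575, 1440]

print(value_range(wages))                     # 195
print(round(coefficient_of_range(wages), 3))  # 0.066
```

Because the coefficient divides by L + S, it is unit-free, which is what allows data sets measured on different scales to be compared.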
19. Illustration 2
Let us take two sets of observations. Set A contains the marks of five students in Mathematics out of 25 marks, and Set B contains the marks of the same students in English out of 100 marks.
Set A: 10, 15, 18, 20, 20
Set B: 30, 35, 40, 45, 50
Calculate the range and the coefficient of range for each set.
20. Illustration 2 Solution
Set                   Range           Coefficient of Range
Set A (Mathematics)   20 − 10 = 10    (20 − 10)/(20 + 10) = 0.33
Set B (English)       50 − 30 = 20    (50 − 30)/(50 + 30) = 0.25
21. Illustration 2 Solution
• In set A the range is 10 and in set B the range is 20. At first glance it seems that there is greater dispersion in set B, but this is not true. The range of 20 in set B is for large observations and the range of 10 in set A is for small observations, so 20 and 10 cannot be compared directly. Their base is not the same: marks in Mathematics are out of 25 and marks in English are out of 100. When we convert these two values into coefficients of range, we see that the coefficient of range for set A (0.33) is greater than that for set B (0.25). Thus there is greater dispersion or variation in set A. The marks of the students in English are more stable than their marks in Mathematics.
22. Illustration 3
Find the range and coefficient of range of the weights of the students of a university.
Weights (Kg)         60–62  63–65  66–68  69–71  72–74
Number of Students   55     18     42     27     8
23. Solution – Method I
Weights (Kg)   Class Boundaries   Mid Value   No. of Students
60–62          59.5–62.5          61          55
63–65          62.5–65.5          64          18
66–68          65.5–68.5          67          42
69–71          68.5–71.5          70          27
72–74          71.5–74.5          73          8
Here the upper class boundary of the last class = ULI = 74.5
and the lower class boundary of the first class = LFI = 59.5
Range = ULI − LFI = 74.5 − 59.5 = 15 kilograms
$$\text{Coefficient of Range} = \frac{\text{ULI} - \text{LFI}}{\text{ULI} + \text{LFI}} = \frac{74.5 - 59.5}{74.5 + 59.5} = \frac{15}{134} \approx 0.1119$$
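The grouped-data calculation can be sketched in Python as follows. The helper `class_boundaries` is a hypothetical name, and it assumes equally spaced classes, so that half the gap between consecutive class limits (0.5 kg here) converts limits into boundaries.

```python
# Class limits from Illustration 3: (lower limit, upper limit)
class_limits = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]

def class_boundaries(limits):
    """Convert class limits to class boundaries.

    Assumes equally spaced classes: half the gap between one class's
    upper limit and the next class's lower limit is moved to each side.
    """
    half_gap = (limits[1][0] - limits[0][1]) / 2
    return [(lower - half_gap, upper + half_gap) for lower, upper in limits]

boundaries = class_boundaries(class_limits)
lfi = boundaries[0][0]    # lower boundary of the first class
uli = boundaries[-1][1]   # upper boundary of the last class

grouped_range = uli - lfi
coefficient = grouped_range / (uli + lfi)

print(boundaries[0], boundaries[-1])         # (59.5, 62.5) (71.5, 74.5)
print(grouped_range, round(coefficient, 4))  # 15.0 0.1119
```

Note that the class frequencies play no part in the range of grouped data; only the outermost boundaries matter.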
24. Illustration 4
The following distribution gives the number of persons per house and the number of houses. Calculate the range and coefficient of range.
Number of Persons   1   2    3    4   5   6   7   8   9   10
Number of Houses    26  113  120  95  60  42  21  14  55  44
25. Illustration 4 Solution
Here the largest value L = 10 and the smallest value S = 1.
Range = L − S = 10 − 1 = 9
$$\text{Coefficient of Range} = \frac{L - S}{L + S} = \frac{10 - 1}{10 + 1} = \frac{9}{11} \approx 0.818$$
26. Application of range
(i) Quality Control:
In quality control of manufactured products, range is used to study the variation in the quality of the
units manufactured. Even with the most modern mechanical equipment there may be a small,
almost insignificant, difference in the different units of a commodity manufactured. Thus, if a
company is manufacturing bottles of a particular type, there may be a slight variation in the size or
shape of the bottles manufactured. In such cases a range is usually determined, and all the units
which fall within these limits are accepted while those which fall outside the limits are rejected.
(ii) Variation in Money Rates, Share values, Exchange Rates and Gold prices, etc:
Variations in money rates, share values, gold prices and exchange rates are commonly studied through the range because the fluctuations in them are not very large. In fact, the range should generally be used as a measure of dispersion only when the variations in the value of the variable are small.
(iii) Weather forecasting:
Range gives an idea of the variation between maximum and minimum levels of temperature. From
day to day the range would not vary much and it is helpful in studying the vagaries of nature if
variations suddenly rise or fall.
27. Interquartile range and Quartile deviation
Interquartile range is another positional, absolute measure of dispersion. It tries to reduce the error of the range as a measure of dispersion by avoiding the extreme values and using instead the difference between the first quartile Q1 and the third quartile Q3.
This measure of dispersion ignores fifty per cent of the observations (the first 25 per cent and the last 25 per cent).
Interquartile range (IQR) = Q3 − Q1
Half the distance between Q1 and Q3 is called the Semi-Interquartile Range or Quartile Deviation (QD). Thus
$$QD = \frac{Q_3 - Q_1}{2}$$
NOTE: For a symmetrical distribution, the span Median ± QD contains 50% of the data. For an approximately normal distribution, QD also provides a shortcut for estimating the standard deviation through the approximate relation
6 Q.D. ≈ 5 M.D. ≈ 4 S.D.
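The relation between Q.D., M.D. and S.D. is only approximate, and it can be checked numerically for a normal distribution. The sketch below assumes a standard normal distribution (S.D. = 1) and uses two standard facts: the quartile deviation equals the 75th-percentile distance from the mean, and the mean deviation about the mean is σ√(2/π).

```python
import math
from statistics import NormalDist

sigma = 1.0  # assumed standard deviation of the normal distribution

# For a symmetric distribution Q.D. = Q3 - median; for the normal this is about 0.6745 * sigma
qd = NormalDist(mu=0.0, sigma=sigma).inv_cdf(0.75)

# Mean absolute deviation about the mean of a normal distribution: sigma * sqrt(2 / pi)
md = sigma * math.sqrt(2 / math.pi)

print(round(6 * qd, 3), round(5 * md, 3), 4 * sigma)  # 4.047 3.989 4.0
```

The three quantities agree to within about 1.5%, so the relation is a rule of thumb for roughly normal data rather than an identity.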
28. Interquartile range and Quartile deviation: Important Remarks
1. The interquartile range is best reported as an interval: always give both quartiles, Q1 and Q3, not just the difference between them. You can then explain that half the sample readings lie between these two values, a quarter are smaller than the lower quartile, and a quarter are higher than the upper quartile.
2. The median is not necessarily midway between Q1 and Q3; in a symmetrical distribution, however, it lies in the middle of the two quartiles.
3. The median and quartiles divide the data into equal numbers of values but do not necessarily divide the data into equally wide intervals.
29. Coefficient of Quartile Deviation
A relative measure of dispersion based on the quartile deviation is called the coefficient of quartile deviation. It is defined as
$$\text{Coefficient of Quartile Deviation} = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
It is a pure number, free of any units of measurement, and can be used to compare the dispersion of two or more sets of data.
30. Illustration 5
Following are the responses of 55 students to the question of how much money they spent every day. Calculate the range and the interquartile range, and interpret your result.
55 60 80 80 80 85 85 85 90 90 90
90 92 94 95 95 95 95 100 100 100 100
100 100 105 105 105 105 109 110 110 110 110
112 115 115 115 115 115 120 120 120 120 120
124 125 125 125 130 130 140 140 140 145 150
31. Solution
Range = Largest observation − Smallest observation = 150 − 55 = 95
$$Q_1 = \text{value of the } \frac{n+1}{4}\text{th item} = \text{value of the } \frac{55+1}{4}\text{th item} = \text{value of the 14th item} = 94$$
$$Q_3 = \text{value of the } \frac{3(n+1)}{4}\text{th item} = \text{value of the } \frac{3(55+1)}{4}\text{th item} = \text{value of the 42nd item} = 120$$
Interquartile range (IQR) = Q3 − Q1 = 120 − 94 = 26
Interpretation
A. Amongst all the students there is a variability of Rs. 95 in their daily spending.
B. 25% of the values lie below Q1 = 94. The variation from the smallest value to Q1 is 94 − 55 = Rs. 39. Thus the lower 25% of students have a variability of Rs. 39 in their daily spending.
C. 25% of the values lie above Q3 = 120. The variation from Q3 to the largest value is 150 − 120 = Rs. 30. Thus the upper 25% of students have a variability of Rs. 30 in their daily spending.
D. 50% of the values lie between Q1 and Q3. The interquartile range is Rs. 26. Thus the middle 50% of students have a variability of Rs. 26 in their daily spending.
39. Illustration 9
• The wheat production (in Kg) of 20 acres is given as: 1120, 1240, 1320, 1040, 1080, 1200, 1440, 1360, 1680, 1730, 1785, 1342, 1960, 1880, 1755, 1720, 1600, 1470, 1750, and 1885. Find the quartile deviation and the coefficient of quartile deviation.
40.
After arranging the observations in ascending order, we get 1040, 1080, 1120, 1200, 1240, 1320, 1342, 1360,
1440, 1470, 1600, 1680, 1720, 1730, 1750, 1755, 1785, 1880, 1885, 1960.
Solution 9
$Q_1$ = value of the $\frac{n+1}{4}$th item = value of the $\frac{20+1}{4}$th item = value of the 5.25th item
= 5th item + 0.25 × (6th item - 5th item)
= 1240 + 0.25 × (1320 - 1240) = 1240 + 20 = 1260
$Q_3$ = value of the $\frac{3(n+1)}{4}$th item = value of the $\frac{3(20+1)}{4}$th item = value of the 15.75th item
= 15th item + 0.75 × (16th item - 15th item)
= 1750 + 0.75 × (1755 - 1750) = 1753.75
$$\text{Q.D.} = \frac{Q_3 - Q_1}{2} = \frac{1753.75 - 1260}{2} = \frac{493.75}{2} = 246.875$$
$$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{1753.75 - 1260}{1753.75 + 1260} = \frac{493.75}{3013.75} \approx 0.164$$
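The interpolation step can be sketched in Python as follows. The function name `quartile` is hypothetical; it implements the k(n+1)/4 position rule used in this solution, which is only one of several quartile conventions, so library routines may give slightly different values.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) by the k(n+1)/4 position rule.

    Assumes at least three observations, so the position always falls
    inside the ordered data.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = k * (n + 1) / 4      # 1-based position, may be fractional
    whole = int(position)           # item at or just below the position
    fraction = position - whole
    lower = ordered[whole - 1]
    if fraction == 0:
        return float(lower)
    upper = ordered[whole]          # next item, used for interpolation
    return lower + fraction * (upper - lower)


wheat = [1120, 1240, 1320, 1040, 1080, 1200, 1440, 1360, 1680, 1730,
         1785, 1342, 1960, 1880, 1755, 1720, 1600, 1470, 1750, 1885]

q1 = quartile(wheat, 1)
q3 = quartile(wheat, 3)
quartile_deviation = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1)

print(q1, q3, quartile_deviation, round(coefficient_qd, 3))
# 1260.0 1753.75 246.875 0.164
```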
41. Calculate the quartile deviation and coefficient of quartile deviation
from the data given below:
Illustration 10
Maximum Load (short-tons)    Number of Cables
9.3-9.7                      2
9.8-10.2                     5
10.3-10.7                    12
10.8-11.2                    17
11.3-11.7                    14
11.8-12.2                    6
12.3-12.7                    3
12.8-13.2                    1
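The slides do not show the worked solution for this grouped table, so the following is only a sketch under stated assumptions: quartiles are located at the N/4-th and 3N/4-th cumulative frequencies, class boundaries are obtained by splitting the 0.1 gap between consecutive class limits (for example 9.25 to 9.75), and values are interpolated linearly within the quartile class. The function name `grouped_quartile` is hypothetical.

```python
# (lower limit, upper limit, number of cables)
classes = [
    (9.3, 9.7, 2), (9.8, 10.2, 5), (10.3, 10.7, 12), (10.8, 11.2, 17),
    (11.3, 11.7, 14), (11.8, 12.2, 6), (12.3, 12.7, 3), (12.8, 13.2, 1),
]


def grouped_quartile(classes, k):
    """k-th quartile of grouped data: L + (kN/4 - cf) / f * h."""
    total = sum(f for _, _, f in classes)
    target = k * total / 4
    # Assumption: the gap between class limits is constant, and half of it
    # is given to each neighbouring class to form the class boundaries.
    half_gap = (classes[1][0] - classes[0][1]) / 2
    cumulative = 0
    for lower, upper, f in classes:
        if cumulative + f >= target:
            boundary = lower - half_gap              # lower class boundary L
            width = (upper + half_gap) - boundary    # class width h
            return boundary + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("k must be 1, 2 or 3")


q1 = grouped_quartile(classes, 1)
q3 = grouped_quartile(classes, 3)
quartile_deviation = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1)

print(round(q1, 3), round(q3, 3),
      round(quartile_deviation, 3), round(coefficient_qd, 4))
```

A different boundary convention or position rule would shift the results slightly, so treat the output as illustrative rather than as the slide author's answer.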
43. Mean deviation or average deviation
The mean deviation, or average deviation, is defined as the mean of the absolute
deviations of the observations from some suitable average, which may be the arithmetic
mean, the median or the mode. The difference (X - average) is called the deviation;
when we ignore its sign, the deviation is written as |X - average| and is read
as the mod (absolute) deviation. The mean of these absolute deviations is called the mean
deviation or the mean absolute deviation. Thus, for sample data in which the suitable
average is the arithmetic mean $\bar{X}$, the mean deviation M.D. is given by the relation:
$$\text{M.D.} = \frac{\sum |X - \bar{X}|}{n}$$
46. Illustration 11
Calculate the mean deviation from (1) the arithmetic mean, (2) the median and (3) the mode for
the marks obtained by nine students given below, and show that the mean deviation from the
median is the minimum.
Marks (out of 25): 7, 4, 10, 9, 15, 12, 7, 9, 7
48. Solution 11 (Contd.)
From the above calculations, it is clear that the
mean deviation from the median has the least
value.
Further, it may be interpreted that the average
absolute discrepancy of the marks from their
median is 2.33.
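Since the intermediate calculations are not reproduced in this transcript, the comparison can be checked with a small Python sketch. The helper name `mean_deviation` is illustrative; `mean`, `median` and `mode` come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode

marks = [7, 4, 10, 9, 15, 12, 7, 9, 7]


def mean_deviation(values, centre):
    """Mean of the absolute deviations of the values from the given centre."""
    return sum(abs(x - centre) for x in values) / len(values)


md_mean = mean_deviation(marks, mean(marks))        # about the arithmetic mean
md_median = mean_deviation(marks, median(marks))    # about the median (9)
md_mode = mean_deviation(marks, mode(marks))        # about the mode (7)

print(round(md_mean, 2), round(md_median, 2), round(md_mode, 2))
# 2.35 2.33 2.56
```

The median gives the smallest of the three values, in line with the conclusion of the solution.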
50. Solution 12
It may be interpreted that the average
absolute discrepancy in the size of the items
from their mean is 0.915.
51. The following data represent two income groups of five and seven
workers working in two different branches of a firm. Determine the
average absolute discrepancies. In which group do you feel that such
discrepancy is less?
Illustration 13
Branch I         Branch II
Income (Rs.)     Income (Rs.)
4000             3000
4200             4000
4400             4200
4600             4400
4800             4600
                 4800
                 5800
52. Illustration 13 (Cont.)
Branch I (Median = 4400)              Branch II (Median = 4400)
Income (Rs.)   |X - Median|           Income (Rs.)   |X - Median|
4000           400                    3000           1400
4200           200                    4000           400
4400           0                      4200           200
4600           200                    4400           0
4800           400                    4600           200
                                      4800           400
                                      5800           1400
N = 5          Σ|X - Median| = 1200   N = 7          Σ|X - Median| = 4000
Branch I
$$\text{MAD} = \frac{\sum |X - \text{Median}|}{N} = \frac{1200}{5} = 240$$
$$\text{Coefficient of MAD} = \frac{\text{MAD}}{\text{Median}} = \frac{240}{4400} \approx 0.055$$
Branch II
$$\text{MAD} = \frac{\sum |X - \text{Median}|}{N} = \frac{4000}{7} \approx 571.43$$
$$\text{Coefficient of MAD} = \frac{\text{MAD}}{\text{Median}} = \frac{571.43}{4400} \approx 0.13$$
1. The average absolute discrepancy from the median for Branch I is Rs. 240.
2. The average absolute discrepancy from the median for Branch II is Rs. 571.43.
3. To compare the two branches we use a relative measure of dispersion, namely the coefficient of mean
absolute deviation (about the median), and observe that it is smaller in Branch I (0.055) than in
Branch II (0.13).
Thus one may interpret that there is more uniformity of income in Branch I than in Branch II.
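The branch comparison can be reproduced with the sketch below. The function names are illustrative; MAD here means the mean absolute deviation taken about the median, as in the illustration.

```python
from statistics import median


def mad_about_median(incomes):
    """Mean absolute deviation of the incomes about their median."""
    centre = median(incomes)
    return sum(abs(x - centre) for x in incomes) / len(incomes)


def coefficient_of_mad(incomes):
    """Relative measure: MAD divided by the median (unit-free)."""
    return mad_about_median(incomes) / median(incomes)


branch_1 = [4000, 4200, 4400, 4600, 4800]
branch_2 = [3000, 4000, 4200, 4400, 4600, 4800, 5800]

for name, incomes in (("Branch I", branch_1), ("Branch II", branch_2)):
    print(name,
          round(mad_about_median(incomes), 2),
          round(coefficient_of_mad(incomes), 4))
```

The branch with the smaller coefficient has the more uniform incomes.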
55. Illustration 15: Calculate the mean deviation from the
mean for the following weight distribution.
Use the short cut method, with assumed mean = 12.5 and
step deviation = 5.
58. • Absolute deviations from the A.M. are used so that the negative signs are ignored
• Another way to remove the negative signs is to square the deviations
• The population variance is the average of the squared deviations of each
observation from the arithmetic mean, taken over the set of all observations
• The population variance is denoted by σ² (sigma squared)
$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$$
where μ is the population mean and N is the size of the population
Variance
59. The variance is difficult to interpret because it is expressed in squared
units: squaring the deviations from the mean puts it on a different scale
from the data themselves.
This problem can be solved by working with the square root of the
variance, which is called the standard deviation.
$$\text{Population standard deviation } \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$$
Standard Deviation
62. The wholesale prices of a commodity for seven consecutive days in a
month are as follows:
Days : 1 2 3 4 5 6 7
Commodity price/quintal : 240 260 270 245 255 286 264
Calculate the variance and standard deviation.
Illustration 16
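The worked solution for this illustration is not included in the transcript. The sketch below applies the population formulas from the preceding slides (dividing by N). Whether the original solution used N or n - 1 is not known, so that choice is an assumption.

```python
from math import sqrt

prices = [240, 260, 270, 245, 255, 286, 264]   # price per quintal, days 1 to 7

n = len(prices)
mean_price = sum(prices) / n
squared_deviations = [(x - mean_price) ** 2 for x in prices]

variance = sum(squared_deviations) / n   # population form: divide by N
std_dev = sqrt(variance)                 # same units as the prices

print(mean_price, variance, round(std_dev, 2))  # 260.0 206.0 14.35
```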
64. Illustration 17
• Find the mean deviation from the mean and the standard deviation from the
following sample observations on the weight (g) of a certain product
• 19, 22, 20, 21, 18, 23, 21, 22, 20, 21, 21, 22, 21, 18, 21, 26
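As a hedged check on this illustration, the sketch below uses Python's standard `statistics` module. The slides define only the population formula, so both the population (divide by N) and sample (divide by n - 1) standard deviations are shown; which one the original solution reports is not stated here.

```python
from statistics import fmean, pstdev, stdev

weights = [19, 22, 20, 21, 18, 23, 21, 22, 20, 21, 21, 22, 21, 18, 21, 26]

centre = fmean(weights)
mean_dev = sum(abs(x - centre) for x in weights) / len(weights)

sd_population = pstdev(weights)   # divides the squared deviations by N
sd_sample = stdev(weights)        # divides the squared deviations by n - 1

print(centre, mean_dev, round(sd_population, 3), round(sd_sample, 3))
```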
70. Standard Deviation by Short Cut
Method Formula
$$\sigma = \sqrt{\frac{\sum f d^2}{\sum f} - \left(\frac{\sum f d}{\sum f}\right)^2} \times h$$
where d = (mid point - A)/h, A = assumed mean and
h = class interval
71. A study of 100 engineering companies gives the following information:
Profit (Rs. in crore) :  0-10  10-20  20-30  30-40  40-50  50-60
No. of companies      :  8     12     20     30     20     10
Calculate the standard deviation of the profit earned by the short cut
method, taking assumed mean A = 35.
Illustration 19: Short Cut Method Formulae
Standard deviation by the short cut method:
$$\sigma = \sqrt{\frac{\sum f d^2}{\sum f} - \left(\frac{\sum f d}{\sum f}\right)^2} \times h$$
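The short cut (step deviation) formula can be sketched in Python as below. The function and variable names are illustrative; the class interval h = 10 is read off the profit classes, and A = 35 is the assumed mean given in the illustration.

```python
from math import sqrt


def shortcut_std_dev(class_limits, frequencies, assumed_mean, width):
    """Standard deviation of grouped data by the step deviation method."""
    midpoints = [(low + high) / 2 for low, high in class_limits]
    # Step deviations d = (mid point - A) / h
    d = [(m - assumed_mean) / width for m in midpoints]
    total_f = sum(frequencies)
    sum_fd = sum(f * di for f, di in zip(frequencies, d))
    sum_fd2 = sum(f * di ** 2 for f, di in zip(frequencies, d))
    return sqrt(sum_fd2 / total_f - (sum_fd / total_f) ** 2) * width


profit_classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
companies = [8, 12, 20, 30, 20, 10]

sigma = shortcut_std_dev(profit_classes, companies, assumed_mean=35, width=10)
print(round(sigma, 2))  # 13.86 (Rs. in crore)
```

With these figures Σfd = -28 and Σfd² = 200, so the expression under the square root is 2 - 0.0784 = 1.9216.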
75. Coefficient of Variation
• Example: comparing the variability of the heights and weights of
students in a class, which are measured in different units (ft, kg)
• A relative measure of dispersion; it has no units
• Usually expressed as a percentage
• Used to compare the variability of two or more distributions
• The distribution with the higher coefficient of variation is less stable, less
uniform, less consistent, less homogeneous and less equitable, i.e.
MORE VARIABLE
77. In a small business firm, two typists are employed: typist A and typist B.
Typist A types out, on an average, 30 pages per day with a standard
deviation of 6. Typist B, on an average, types out 45 pages with a
standard deviation of 10. Which typist shows greater consistency in his
output?
Illustration 19
78. Illustration 19-Solution
Coefficient of variation of typist A:
$$CV_A = \frac{\sigma}{\bar{X}} \times 100 = \frac{6}{30} \times 100 = 20\%$$
Coefficient of variation of typist B:
$$CV_B = \frac{\sigma}{\bar{X}} \times 100 = \frac{10}{45} \times 100 = 22.2\%$$
Thus although typist B types out more pages, there is a greater
variation in his output as compared to that of typist A. We can say
this in a different way: Though typist A's daily output is much less, he
is more consistent than typist B.
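The same comparison as a minimal Python sketch; the function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean (unit-free)."""
    return std_dev / mean * 100


cv_a = coefficient_of_variation(6, 30)    # typist A
cv_b = coefficient_of_variation(10, 45)   # typist B

# The typist with the smaller coefficient of variation is the more consistent.
more_consistent = "A" if cv_a < cv_b else "B"

print(round(cv_a, 1), round(cv_b, 1), more_consistent)  # 20.0 22.2 A
```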
79. Illustration 20: Which firm has greater
variability, and what is the average wage?
90. Standard Deviation
• Rigidly defined
• Based on all observations
• Variance and SD are zero when all values are equal
• SD is independent of change of origin: if the same value is added to or
subtracted from all values, the variance and SD remain unchanged
• SD depends on change of scale: if all values are multiplied by the same
quantity, the SD is multiplied by the absolute value of that quantity
• When a number of samples are drawn from the same population, SD
varies least from sample to sample compared with other measures
of dispersion
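The origin and scale properties listed above can be checked numerically. The sketch below reuses the commodity prices from Illustration 16; the shift of 100 and the factor of 3 are arbitrary values chosen for illustration.

```python
from math import isclose
from statistics import pstdev

values = [240, 260, 270, 245, 255, 286, 264]

shifted = [x + 100 for x in values]   # change of origin
scaled = [x * 3 for x in values]      # change of scale
constant = [5, 5, 5, 5]               # all values equal

# Adding a constant leaves the SD unchanged.
print(isclose(pstdev(shifted), pstdev(values)))      # True
# Multiplying by 3 multiplies the SD by 3.
print(isclose(pstdev(scaled), 3 * pstdev(values)))   # True
# Equal values have zero SD.
print(pstdev(constant) == 0)                         # True
```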