If there is one prayer that you should pray/sing every day and every hour, it is the LORD's prayer (Our FATHER in Heaven prayer).
It is the most powerful prayer. A pure heart, a clean mind, and a clear conscience are necessary for it.
- Samuel Dominic Chukwuemeka

For in GOD we live, and move, and have our being. - Acts 17:28

The Joy of a Teacher is the Success of his Students. - Samuel Dominic Chukwuemeka

# Welcome to Statistics

I greet you this day,
Second: view the videos.
Third: solve the questions/solved examples.
Fourth: check your solutions with my thoroughly-explained solutions.
Comments, ideas, areas of improvement, questions, and constructive criticisms are welcome. You may contact me.
If you are my student, please do not contact me here. Contact me via the school's system.
Thank you for visiting.

Samuel Dominic Chukwuemeka (Samdom For Peace) B.Eng., A.A.T, M.Ed., M.S

## Introductory Statistics

### Objectives

Students will:
(1.) Discuss statistics.
(2.) Discuss the basic terms used in statistics.
(3.) Discuss the reasons for studying statistics.
(4.) Define data.
(5.) Identify the population, sample, and individual in scenarios.
(6.) Identify the statistic and/or parameter in scenarios.
(7.) Discuss the statistical process.
(8.) Classify variables as qualitative or quantitative.
(9.) Classify quantitative variables as discrete or continuous.
(10.) Classify variables based on the level of measurement of the variable.

### Definition

Statistics is the science that deals with the:
Collection
Organization
Presentation
Analysis and
Interpretation of data so as to make the right decision and the right conclusion.

The main reason for studying statistics is to make the right decision and the right conclusion.
The goal of learning statistics is to distinguish between statistical conclusions that are likely to be valid and those that are seriously flawed.

Teacher: Would you not like to make the right decision in anything you want to do?
What are some of those things? 😊😊😊 Note students' responses.
Would you want to be able to differentiate between results that are valid and results that are flawed?
Would you want to be able to prevent any sources of bias when making important decisions?
Welcome to Statistics!

There are basically two types of statistics:
Descriptive Statistics and
Inferential Statistics

Descriptive Statistics is the science that deals with the collection, organization, and presentation of data.
Inferential Statistics is the science that uses methods that take the results obtained from a sample, extend them to the population, and measure the reliability of the results.

### Why do we learn Statistics?

(1.) The media uses statistics to predict the outcomes of elections, such as the Presidential election, and to nominate people for awards, among others.

Example 1: President Barack Obama vs Governor Mitt Romney 2012 Presidential Poll - Gallup Polls

Example 2: Who do Americans blame for the Saturday, December 22, 2018 partial government shutdown?
More Americans blame President Trump for government shutdown - Reuters/Ipsos Polls

Discuss some statistics in those links.
For Example 2, read the second to the last paragraph.
Discuss more real-world examples if time is available.
Ask students to research more "valid" sites.
Hmmmm...how do you know if a website is valid? How do you know if a website is not biased?

(2.) School administrators use statistics to know the performance of the schools in their district, and make decisions as necessary.

Example 3: The Nation's Report Card - The National Assessment of Educational Progress (NAEP)

Discuss how NAEP obtains its data (data collection) to rate schools in each state.
Discuss some statistics based on their results.

(3.) Health professionals use statistics to know how different people react to different medicines.

(4.) Teachers use statistics to know how to meet the learning needs of their students.

(5.) People use statistics to make informed decisions on who to marry (typical in Africa and India), what professor's class to take (typical in the United States where "unknown people" insult their teachers and professors), what car to buy, and what school to attend, among others.

### Data

Data is the list of observed values for a variable.
Data is the fact used to make a conclusion or decision.
It is also referred to as Information.
It can be collected from a survey, an experiment, or a historical record, among others.
It can be numeric (numbers) such as age, weight, etc. It is more than just numbers, because it has context.
It can also be non-numeric such as color, gender, etc.
The process of posing a question, collecting data, analyzing data, and interpreting the data is known as a Data Cycle.
Data vary. They change within an individual. They also change among individuals.
Understanding the variability of data is very important in statistics.
Statistical studies rely on two major concepts: Data and Variation.
Collecting data about something involves a study of that thing.
In such a study, the data could be obtained by measurement or by observation.

### Population, Sample, and Individual

Population is the entire group of individuals or things that is being studied.
It contains all subjects of interest.
Example: All student matadors (Arizona Western College students).

Sample is a proper subset (part) of the population being studied.
It contains some members of the population.
It contains some of the subjects of interest.
Example: AWC students in South Yuma campus.

Bring it to Algebra: what is the difference between a subset and a proper subset?
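
One way to explore this question is with Python's built-in sets, where `<=` tests "is a subset of" and `<` tests "is a proper subset of". This is only a small sketch: the student names are invented for illustration and do not come from any real roster.

```python
# Hypothetical population and sample (names are invented for illustration).
population = {"Ada", "Bola", "Chidi", "Dayo", "Emeka"}
sample = {"Ada", "Chidi", "Emeka"}

# Every set is a subset of itself, but no set is a proper subset of itself.
print(population <= population)  # True
print(population < population)   # False

# A sample that leaves out at least one member is a proper subset.
print(sample <= population)  # True
print(sample < population)   # True
```

In this sketch the sample is a proper subset because at least one member of the population (for example, "Bola") is not in it.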

Individual is a member of the population being studied.
It is a subject of interest.
Example: An AWC student in San Luis campus.

Exercise 1
For each of these scenarios, identify the population, sample, and individual.

(1.) A 2012 survey of 100 million Nigerians in Nigeria found that they would prefer the South to secede from the North.

Population: All Nigerians in Nigeria
Sample: 100 million Nigerians in Nigeria
Individual: A Nigerian in Nigeria

(2.) 300 ladies aged 19 to 35 who live in the United States were contacted in a poll.
The poll asked whether they use abstinence as a form of birth control.
Hmmmm...what do you think the results would be?

Population: All ladies aged 19 to 35 who live in the United States
Sample: 300 ladies aged 19 to 35 who live in the United States
Individual: A lady aged 19 to 35 who lives in the United States

(3.) Naboth randomly sampled 125 plants in his farm on June 30 and weighed the chlorophyll in each plant.

Population: All plants in his farm on June 30
Sample: 125 plants in his farm on June 30
Individual: A plant in his farm on June 30

### From Question 1 in Exercise 1

Assume that 95 million of the 100 million Nigerians surveyed said they were ready to secede immediately.
This means that 95% of the 100 million Nigerians who were surveyed are ready for immediate secession.
This describes the results of the sample without making any conclusions about the population. (Descriptive Statistics).
Note the population here is the entire Nigerian population.
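
The percentage in this scenario is simply the number who answered "immediately" divided by the number surveyed. A minimal sketch of that arithmetic, using the assumed figures from the scenario (the variable names are hypothetical):

```python
# Assumed figures from the scenario in the text.
surveyed = 100_000_000         # size of the sample
ready_to_secede = 95_000_000   # respondents who said "immediately"

# Proportion of the sample, which describes the sample only.
sample_proportion = ready_to_secede / surveyed
print(f"{sample_proportion:.0%}")  # 95%
```

The result describes only the people surveyed; nothing in this calculation says anything yet about the whole population.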

### Statistic and Parameter

Statistic is a numerical summary of a sample.

Please note: It is "Statistic", not "Statistics".
No, Statistic is not the singular form of Statistics! 😊😊😊

In our example, the 95% is the statistic.
Suppose we now take this 95% and extend it to the entire Nigerian population. (Inferential Statistics).
Assume we now say that 95% of all Nigerians in Nigeria said they were ready to secede immediately; then the 95% becomes the parameter.

Parameter is a numerical summary of a population.

Did you notice how we went from Descriptive Statistics to Inferential Statistics?
Did you notice how we went from Sample to Population?
Did you notice how we went from Statistic to Parameter?
Is it making sense?

Exercise 2
For each of these scenarios, identify whether the underlined is a statistic or parameter.

(4.) A sample of London residents was surveyed and it was found that 85% had a bachelor's degree or higher.

85% is a statistic because it is the numerical summary of the sample of London residents.

(5.) In a study of all 16000 students of Divine Mercy Academy, it was found that 99% of them speak in tongues.

99% is a parameter because it is the numerical summary of the population, $16000$ students of Divine Mercy Academy.

(6.) By 2014, around 38% of all mobile phone users were smartphone users.
By 2018, this number is expected to reach over 50%.
(Number of mobile phone users worldwide from 2015 to 2020 (in billions) - Statista - The Statistics Portal)

38% is a parameter because it is the numerical summary of the population of all mobile phone users.
50% is a parameter because it is the (projected) numerical summary of the population of all mobile phone users.

(7.) 26 of the 50 states in the United States voted for Barack Obama in the 2012 Presidential Elections.
(Presidential Election Results - NBC News)

26 is a parameter because it is a numerical summary of the population: all 50 states in the United States were counted, not just a sample of them.
50 is also a parameter because it is the size of the population of states in the United States.

(8.) A homeowner in the City of Truth or Consequences, New Mexico measured the voltage supplied to his home on 6 days of a given week, and found that the average value was 120 volts.

120 is a statistic because it is the numerical summary of a sample, 6 days of a given week.

(9.) The Federal Republic of Nigeria has 36 states.
Assume the areas of 3 of the Southeastern states are added and the sum is divided by 3, the result is 5301.70566 square kilometers.

5301.70566 is a statistic because it is the numerical summary of a sample, 3 states of the 36 states of Nigeria.

(10.) Median weekly earnings of full-time workers were \$887 in the third quarter of 2018.
Women had median weekly earnings of \$796, or 81.8 percent of the \$973 median for men.
(Usual Weekly Earnings of Wage and Salary Workers Third Quarter 2018 - Bureau of Labor Statistics)

887 is a parameter because it is the numerical summary of the population of the nation's 117.2 million full-time wage and salary workers.

(11.) Of the 100 United States Senators, 77 voted for the Iraq war, which was a very big error.
(Senate Roll Call: Iraq Resolution - The Washington Post)

77 is a parameter because it is the numerical summary of the population of the nation's 100 senators.

(12.) A study by Harvard University researchers of 93,600 women aged between 25 and 42 found that three or more servings of berries per week may slash the risk of a heart attack by 33%.
(Berries may lower women’s heart attack risk - Harvard School of Public Health, published in the January 14, 2013 issue of the American Heart Association’s (AHA) journal Circulation)

33% is a statistic because it is the numerical summary of the sample of 93,600 women aged between 25 and 42.

### Statistical Process

Statistics is a science because its process follows the scientific method.
The basic steps of a statistical process are:

(1.) Identify the research objective
What do you want to find out about?
What are the necessary questions to be asked?
What is the population of the study?

(2.) Collect the data needed to answer the questions
Use appropriate data collection techniques. (Data Collection)
Gaining access to an entire population is usually difficult. So, a sample is needed.
How randomly did you take your sample? (Sampling Methods)
How large is your sample size?

(3.) Describe the data
Obtain descriptive statistics of your sample data. (Descriptive Statistics)
Organize your data. (Data Organization)
Present your data properly. (Data Presentation)
Analyze your data. (Data Analysis)

(4.) Perform Inference
Apply appropriate techniques to extend the results of your sample data to the population of your study. (Inferential Statistics)
Report a level of reliability of the results.
What is the confidence level of your results?
What is the margin of error?

Once a research objective is stated and the population is identified, the researcher must create a list of information about the individuals of the population.
This leads us to...

### Variables

A variable is a characteristic of the individuals of the population being studied.
Vocabulary Words/Hint: vary, varies, variable, variability, variation
As the name implies, it always "varies".

Variables can be classified as:
Qualitative Variables or Categorical Variables
Vocabulary Words/Hint: quality, category
and
Quantitative Variables or Numerical Variables
Vocabulary Words/Hint: quantity, numerical (number)

Quantitative variables can be further classified as:
Discrete Variables
Vocabulary Words/Hint: quantity you can count
and
Continuous Variables
Vocabulary Words/Hint: quantity you can measure

### Qualitative and Quantitative Variables

Qualitative Variables (also known as Categorical Variables) are variables that express qualitative attributes of the individuals of a population.
They are not measurable.
They are usually not numerical values.
Examples are: gender; color such as eye color, hair color; religion; street names; and zip codes (yes, because even though USA zip codes are numbers, they are not countable or measurable), among others.

Even though categorical variables are not numeric, we can use numeric values to represent parts of a category or to differentiate a category from a non-category.
For example: for the variable Gender, we can represent the Female gender with 0 and the Male gender with 1.
Also, we can represent a Smoker with a 1 and a Non-smoker with a 0.
The process of representing categorical variables with numbers is known as Coding.
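
A minimal sketch of coding, assuming the codes mentioned in the text (Female as 0, Male as 1, Non-smoker as 0, Smoker as 1). The respondent records and the names `GENDER_CODES` and `SMOKER_CODES` are invented for illustration.

```python
# Hypothetical records; the values are invented for illustration.
respondents = [
    {"gender": "Female", "smoker": "Non-smoker"},
    {"gender": "Male", "smoker": "Smoker"},
    {"gender": "Female", "smoker": "Smoker"},
]

# Codes taken from the text: Female -> 0, Male -> 1; Non-smoker -> 0, Smoker -> 1.
GENDER_CODES = {"Female": 0, "Male": 1}
SMOKER_CODES = {"Non-smoker": 0, "Smoker": 1}

# Replace each category label with its numeric code.
coded = [
    {"gender": GENDER_CODES[r["gender"]], "smoker": SMOKER_CODES[r["smoker"]]}
    for r in respondents
]
print(coded)
```

The codes are still labels. With 0/1 codes, adding a column simply counts the individuals coded as 1 (here, the number of smokers); it does not turn the variable into a quantitative one.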
Sometimes, it is necessary to code categorical data to work with some statistical software, especially if the categorical data is part of a numerical data set (if one of the columns of the data is categorical and the other columns are numerical).

Quantitative Variables are variables that express numerical measures of the individuals of a population.
They are measurable or countable.
They have a numerical value (number value).
Examples are: number of ... "anything you can count", price, age, area, volume, temperature, weight, height, size, length, etc.

### Quantitative Variables: Discrete Variables and Continuous Variables

Discrete Variables are quantitative variables that have a finite or countable number of values.
If you can count to get the value of the quantitative variable, then that variable is discrete.
Examples are: the number of ... "anything you can count" such as the number of GNTC (Georgia Northwestern Technical College) students, capacities of different buildings, etc.
Students should give more examples.

Continuous Variables are quantitative variables that have an infinite, uncountable number of values.
If you can measure to get the value of the quantitative variable, then that variable is continuous.
Examples are: weight, height, size, percentage, volume, area, time, distance, temperature, pressure, length, etc.
Students should give more examples.

Exercise 3
(13.) Is age a discrete or continuous variable?

Well, it depends!
Age (in years only) is a discrete variable. You can count it. For example: 25 years, 30 years, etc.
Age (that includes years, months, weeks, days, hours, minutes, and seconds) is a continuous variable. You cannot really count it. For example: 25.5 years, etc.

We also have ...

### Dependent and Independent Variables

Dependent Variable is:
A variable that depends on another variable/other variables.
Also known as the response variable
Variable that is predicted
Outcome/result of a study
The y-value of a function

Independent Variable is:
Variable that is not dependent on any other variable.
Also known as the explanatory or predictor variable
Variable that explains the response variable
The x-value of a function

Recall: In Algebra and Calculus, $y = f(x)$
y is the dependent variable.
x is the independent variable.

Bring it to Statistics
y is the response variable.
x is the predictor or explanatory variable.

Bring it to Philosophy
y is the effect.
x is the cause.
Depending on the class and time, you may explain the topic in Philosophy (Interdisciplinary connection) about the existence of GOD based on cause-effect relationship.

Bring it to Economics/Business
y is the output.
x is the input.

Bring it to Psychology/Human Behavior/Sociology
y is the consequence.
x is the action.

Examples:
(1.) The weight (quantitative: continuous variable) I gained in the United States (I was skinny in Nigeria) was dependent on the number of McDonald's cheeseburgers I ate (quantitative: discrete variable) 😊😊😊
In this case, weight is the dependent variable and number of burgers is the independent variable.

(2.) GPA (grade point average; quantitative: continuous variable) is dependent on the number of "meaningful" hours of study (quantitative: discrete variable).
In this case, GPA is the dependent variable and number of meaningful study hours is the independent variable.

Students should give more examples.

### Stacked (Narrow/Long) Data and Unstacked (Wide) Data

My preference is unstacked data because it is organized data.
However, some data downloaded from the Internet or some raw data you collect may be stacked.
Hence, it is important to know the meaning of both forms of data.

Unstacked Data, also known as Wide Data, is the data table where the first row (table headings) contains the variables and subsequent rows contain the values (observations) of the variables.
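
To make the wide layout concrete, here is a small sketch that stores the Therapy Dogs values used in this section's example tables in unstacked form and then rearranges them so that one column holds the variable names and another holds the values (the stacked layout defined next). The helper name `stack` is hypothetical, not part of any library.

```python
# Wide (unstacked) layout: one row per dog breed, one column per variable.
wide = [
    {"Dog Breed": "French Bull Dog", "Size": "Small", "Temperament": "Playful"},
    {"Dog Breed": "Labrador Retriever", "Size": "Medium-Large", "Temperament": "Intelligent"},
    {"Dog Breed": "Toy Poodle", "Size": "Small", "Temperament": "Smart"},
]


def stack(rows, id_column):
    """Hypothetical helper: return one (id, variable, value) row per cell."""
    long_rows = []
    for row in rows:
        for variable, value in row.items():
            if variable != id_column:
                long_rows.append(
                    {id_column: row[id_column], "Variable": variable, "Value": value}
                )
    return long_rows


stacked = stack(wide, "Dog Breed")
for row in stacked:
    print(row)
```

Three wide rows with two variables each become six long rows, which is why stacked data is also called narrow or long data.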
Stacked Data, also known as Narrow Data or Long Data, is the data table where one column contains the variables and other columns contain the values (observations) of the variables.

Let us review examples.

Unstacked Table
Data: Therapy Dogs

| Dog Breed | Size | Temperament |
| --- | --- | --- |
| French Bull Dog | Small | Playful |
| Labrador Retriever | Medium-Large | Intelligent |
| Toy Poodle | Small | Smart |

Stacked Table
Data: Therapy Dogs

| Dog Breed | Variable | Value |
| --- | --- | --- |
| French Bull Dog | Size | Small |
| French Bull Dog | Temperament | Playful |
| Labrador Retriever | Size | Medium-Large |
| Labrador Retriever | Temperament | Intelligent |
| Toy Poodle | Size | Small |
| Toy Poodle | Temperament | Smart |

Exercise 4
(14.) A sample of students were questioned to determine how much they would be willing to pay to see a movie in a theater that served dinner at the seats, with the accompanying results (in dollars).
(a.) Write these data as they might appear in stacked format with codes.
(b.) Write these data as they might appear in unstacked format.

(a.) There are 5 male students and 4 female students.
Coding the male students as 1 and the female students as 0, the stacked format with codes will show five 1's and four 0's in one column, alongside the respective costs provided by the male and female student respondents in another column.
This implies that the correct answer is Option B.

(b.) The unstacked data (without codes, because the question did not ask us to provide codes) will show the costs provided by the male students in one column and the costs provided by the female students in another column.
It is okay for the headings to be labeled as Male and Female respectively.
However, it is much better for the headings to be labeled appropriately such as: Cost by Male Students and Cost by Female Students respectively.
The correct answer is Option B.

### Data and Variables

The type of variable dictates the methods that can be used to analyze the data.
Qualitative data are observations corresponding to a qualitative variable.
Quantitative data are observations corresponding to a quantitative variable.
Discrete data are observations corresponding to a discrete variable.
Continuous data are observations corresponding to a continuous variable.

We can also classify variables based on the ...

### Level of Measurement of a Variable

The level of measurement of a variable determines the types of descriptive statistics and inferential statistics that may be applied to a variable.
It is an important factor in determining what tools may be used to describe the variable and what means of analysis to use for inference about the variable.
Rather than classify a variable as qualitative or quantitative, we can assign a level of measurement to the variable.

The levels of measurement of a variable are:
Nominal level of measurement
Ordinal level of measurement
Interval level of measurement
Ratio level of measurement

### Nominal Level of Measurement

A variable is at the nominal level of measurement if the variable deals with name, label, category, or code and where the order of ranking is not relevant.
Vocabulary Words/Hint: "nominal" means "name"

Examples are:
Race: African-American, Alaskan native, American Indian, Asian, Caucasian, Pacific Islander, etc.
Ask students if they have filled any application for employment or internship. Did they realize they were doing some Statistics!?
Nationality: Nigeria, United States, etc.
Religion: Christianity, Judaism, Islam, etc.
Marital Status: Married, Single
Gender: Female, Male
Favorite sports of people identified as $1$ for Soccer, $2$ for Basketball, $3$ for Football (the order of ranking is not important)
Survey responses of "yes" or "no" (the order of ranking is not important)
Social security numbers
Types of food dishes
Types of music
Types of movies
Companies that closed locations and fired workers in $2018$
Companies that filed for bankruptcy but paid the CEOs a lot of bonuses
among others.

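
A short sketch of why order and arithmetic do not matter for nominal codes, using the sports codes from the examples above (1 for Soccer, 2 for Basketball, 3 for Football). The list of responses is hypothetical.

```python
from collections import Counter
from statistics import mean

# Codes from the text: 1 = Soccer, 2 = Basketball, 3 = Football.
SPORT_LABELS = {1: "Soccer", 2: "Basketball", 3: "Football"}

# Hypothetical responses, invented for illustration.
responses = [1, 3, 3, 2, 1, 3]

# Counting how often each category occurs is meaningful for nominal data.
counts = Counter(SPORT_LABELS[code] for code in responses)
print(counts.most_common(1))  # [('Football', 3)]

# The mean of the codes can be computed, but it does not describe any sport.
print(mean(responses))
```

Relabeling the sports with different numbers would change the mean but not the counts, which suggests that only the counts carry information at this level.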
### Ordinal Level of Measurement

A variable is at the ordinal level of measurement if the variable deals with name, label, category, or code where the order of ranking is relevant, but the differences between the values of the variable cannot be found or the differences between the values of the variable can be found but are not meaningful.
Vocabulary Words/Hint: "ordinal" means "order"

Examples are:
Likert Scales: Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree, etc.
Ask students if they have filled surveys or polls. Of course they have! Or they may ... 😊😊😊 in evaluating the professor!
Grades: A, B, C, D, F, etc.
Rankings or Ratings: 1st, 2nd, 3rd, five stars, three stars, etc.
Levels: High, Medium, Low, etc.
Thumbs up, Thumbs down
Internet speed levels of fast, medium, slow
Alert levels identified as $10$ for Low, $20$ for Medium, $30$ for High (the order of ranking is important)
Positions of people in a line
among others.

### Interval Level of Measurement

A variable is at the interval level of measurement if the variable deals with name, label, category, or code, where the order of ranking is relevant, the differences between the values of the variable can be found and are meaningful, and there is no natural starting point.

Examples are:
calendar dates
Celsius temperatures
Fahrenheit temperatures
years in which an economic recession occurred
among others.

### Ratio Level of Measurement

A variable is at the ratio level of measurement if the variable deals with name, label, category, or code, where the order of ranking is relevant, the differences between the values of the variable can be found and are meaningful, and there is a natural starting zero point.

Examples are:
time in minutes, time in hours
acres of land
ages in years
weights in kilograms
Kelvin temperatures
number of buildings
among others.

Exercise 5
(15.) Identify the individuals, variables and their corresponding data, and the type of variable in the table.

| Participants | Weight(lb.) | Type | Price($) |
| --- | --- | --- | --- |
| A | 160 | Athletic | 25 |
| B | 250 | Muscular | 50 |
| C | 120 | Athletic | 16 |
| D | 100 | Skinny | 10 |
| E | 300 | Obese | 93 |

Individuals are the participants: A, B, C, D, and E

Variables are: Weight(lb.), Type, and Price($)

Variables and their corresponding data are:
Weight(lb.): 160, 250, 120, 100, 300
Type: Athletic, Muscular, Athletic, Skinny, Obese
Price($): 25, 50, 16, 10, 93

Variables and the types of variables are:
Weight(lb.) is a quantitative variable: continuous variable
Type is a qualitative variable
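
The answer to Exercise 5 can also be sketched in code by storing each row of the table as one record. The helper `kind` is a hypothetical, rough heuristic (numbers are treated as quantitative, text as qualitative); it is an assumption made for illustration, not a general rule.

```python
# Rows of the Exercise 5 table; each dictionary is one individual.
participants = [
    {"Participant": "A", "Weight(lb.)": 160, "Type": "Athletic", "Price($)": 25},
    {"Participant": "B", "Weight(lb.)": 250, "Type": "Muscular", "Price($)": 50},
    {"Participant": "C", "Weight(lb.)": 120, "Type": "Athletic", "Price($)": 16},
    {"Participant": "D", "Weight(lb.)": 100, "Type": "Skinny", "Price($)": 10},
    {"Participant": "E", "Weight(lb.)": 300, "Type": "Obese", "Price($)": 93},
]

# The individuals are the participants; the other columns are the variables.
individuals = [row["Participant"] for row in participants]
variables = [name for name in participants[0] if name != "Participant"]


def kind(values):
    """Rough heuristic: all-numeric columns are labeled quantitative."""
    is_number = all(isinstance(v, (int, float)) for v in values)
    return "quantitative" if is_number else "qualitative"


print("Individuals:", individuals)
for name in variables:
    column = [row[name] for row in participants]
    print(name, column, kind(column))
```

The heuristic would mislabel numeric codes such as zip codes, which the notes classify as qualitative, so the final classification still needs human judgment about what the numbers mean.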

## Data Presentation

### Objectives

Students will:
(1.) Represent data using several data presentation tools.
(2.) Calculate the sectorial angles of the variables in pie charts.
(3.) Calculate the percentages of the variables in pie charts.
(4.) Interpret the data presented with several data presentation tools.

### Vocabulary Words

frequency distribution table, frequency table, dotplot, boxplot, box-and-whisker plot, stemplot, stem-and-leaf plot, scatter plot, scatter diagram, normal quantile plot, quantile-quantile plot, QQ plot, line graph, bar graph, bar chart, circle graph, pie chart, cumulative frequency graph, ogive, cumulative frequency curve, Pareto chart, pictogram, histogram, frequency polygon, cumulative frequency polygon, percentages, two-way table

Percentages or rates are often better than counts for making comparisons because they account for possible differences among the sizes of groups.
A two-way table is used to summarize two potentially related categorical variables.
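
A small sketch of both ideas, using hypothetical survey records invented for illustration: it builds a two-way table of counts for two categorical variables (campus and answer) and then compares the groups by rate instead of by count.

```python
from collections import Counter

# Hypothetical survey records (group, answer), invented for illustration.
records = (
    [("Campus A", "Yes")] * 30 + [("Campus A", "No")] * 20
    + [("Campus B", "Yes")] * 40 + [("Campus B", "No")] * 160
)

# Two-way table of counts: rows are groups, columns are answers.
counts = Counter(records)
groups = sorted({group for group, _ in records})
answers = ["Yes", "No"]

for group in groups:
    total = sum(counts[(group, a)] for a in answers)
    yes_rate = counts[(group, "Yes")] / total
    print(group, [counts[(group, a)] for a in answers], f"{yes_rate:.0%} Yes")
```

In this made-up data, Campus B has more "Yes" answers by count (40 versus 30) but a much lower "Yes" rate (20% versus 60%) because its group is four times larger.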
