For in GOD we live, and move, and have our being. - Acts 17:28

The Joy of a Teacher is the Success of his Students. - Samuel Dominic Chukwuemeka

# Welcome to Statistics

I greet you this day,
First: read the notes. Second: view the videos. Third: solve the questions/solved examples. Fourth: check your solutions with my thoroughly-explained solutions. Fifth: check your answers with the calculators as applicable.
Comments, ideas, areas of improvement, questions, and constructive criticisms are welcome. You may contact me.
If you are my student, please do not contact me here. Contact me via the school's system. Thank you for visiting!!!

Samuel Dominic Chukwuemeka (Samdom For Peace) B.Eng., A.A.T, M.Ed., M.S

## Introductory Statistics

### Objectives

Students will:
(1.) Discuss statistics.
(2.) Discuss the basic terms used in statistics.
(3.) Discuss the reasons for studying statistics.
(4.) Define data.
(5.) Identify the population, sample, and individual in scenarios.
(6.) Identify the statistic and/or parameter in scenarios.
(7.) Discuss the statistical process.
(8.) Classify variables as qualitative or quantitative.
(9.) Classify quantitative variables as discrete or continuous.
(10.) Classify variables based on the level of measurement of the variable.

### Definition

Statistics is the science that deals with the:
Collection
Organization
Presentation
Analysis and
Interpretation of data so as to make the right decision and the right conclusion.

The main reason for studying statistics is to make the right decision and the right conclusion.
The goal of learning statistics is to distinguish between statistical conclusions that are likely to be valid and those that are seriously flawed.

Teacher: Would you not like to make the right decision in anything you want to do?
What are some of those things? ☺☺☺ Note students' responses.
Would you want to be able to differentiate between results that are valid and results that are flawed?
Would you want to be able to prevent any sources of bias when making important decisions?
Welcome to Statistics!

There are basically two types of statistics:
Descriptive Statistics and
Inferential Statistics

Descriptive Statistics is the science that deals with the collection, organization, and presentation of data.
Inferential Statistics is the science that uses methods that take the results obtained from a sample, extend them to the population, and measure the reliability of the results.

### Why do we learn Statistics?

(1.) The media uses statistics to report and predict election outcomes through polls, such as Presidential election polls, and to nominate people for awards, among others.

Example 1: President Barack Obama vs Governor Mitt Romney 2012 Presidential Poll - Gallup Polls

Example 2: Who do Americans blame for the Saturday, December 22, 2018 partial government shutdown?
More Americans blame President Trump for government shutdown - Reuters/Ipsos Polls

Discuss some statistics in those links.
For Example 2, read the second to the last paragraph.
Discuss more real-world examples if time is available.
Ask students to research more "valid" sites.
Hmmmm...how do you know if a website is valid? How do you know if a website is not biased?

(2.) School administrators use statistics to know the performance of the schools in their district, and make decisions as necessary.

Example 3: The Nation's Report Card - The National Assessment of Educational Progress (NAEP)

Discuss how NAEP obtains its data (data collection) to rate schools in each state.
Discuss some statistics based on their results.

(3.) Health professionals use statistics to know how different people react to different medicines.

(4.) Teachers use statistics to know how to meet the learning needs of their students.

(5.) People use statistics to make informed decisions on who to marry (typical in Africa and India), what professor's class to take (typical in the United States where "unknown people" insult their teachers and professors), what car to buy, and what school to attend, among others.

### Data

Data is the list of observed values for a variable.
Data is the fact used to make a conclusion or decision.
It is also referred to as "Information".
It is collected from surveys, experiments, and historical records, among others.
It can be numeric (numbers) such as age, weight, etc.
It can also be non-numeric such as color, gender, etc.
Data varies. It changes within an individual. It also changes among individuals.
Understanding the variability of data is very important in statistics.
Collecting data about something involves a study of that thing.
This study could be measured or observed.

### Population, Sample, and Individual

Population is the entire group of individuals or things being studied.
It contains all subjects of interest.
Example: All student matadors (Arizona Western College students).

Sample is a proper subset (part) of the population being studied.
It contains some members of the population.
It contains some of the subjects of interest.
Example: AWC students in South Yuma campus.

Bring it to Algebra: what is the difference between a subset and a proper subset?
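
One way to explore the question is with Python's built-in `set` type, where `<=` tests for a subset and `<` tests for a proper subset. This is only a minimal sketch; the student IDs are made up for illustration.

```python
# Hypothetical student IDs, made up for illustration.
population = {"S1", "S2", "S3", "S4", "S5"}
sample = {"S2", "S4"}

# A <= B tests "A is a subset of B" (A may be equal to B).
# A <  B tests "A is a proper subset of B" (A is a subset and A != B).
is_subset = sample <= population
is_proper_subset = sample < population
population_subset_of_itself = population <= population
population_proper_subset_of_itself = population < population

print(is_subset, is_proper_subset)
print(population_subset_of_itself, population_proper_subset_of_itself)
```

Every set is a subset of itself but never a proper subset of itself, so the last comparison is the only one that is false.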

Individual is a member of the population being studied.
It is a subject of interest.
Example: An AWC student in San Luis campus.

Exercise 1
For each of these scenarios, identify the population, sample, and individual.

(1.) A $2012$ survey of $100$ million Nigerians in Nigeria found that they would prefer the South to secede from the North.

Population: All Nigerians in Nigeria
Sample: $100$ million Nigerians in Nigeria
Individual: A Nigerian in Nigeria

(2.) $300$ ladies aged $19$ to $35$ who live in the United States were contacted in a poll.
The poll asked whether they use abstinence as a form of birth control.
Hmmmm...what do you think the results would be?

Population: All ladies aged $19$ to $35$ who live in the United States
Sample: $300$ ladies aged $19$ to $35$ who live in the United States
Individual: A lady aged $19$ to $35$ who lives in the United States

(3.) Naboth randomly sampled $125$ plants in his farm on June $30$ and weighed the chlorophyll in each plant.

Population: All plants in his farm on June $30$
Sample: $125$ plants in his farm on June $30$
Individual: A plant in his farm on June $30$

### From Question $1$ in Exercise $1$

Assume that $95$ million of the $100$ million Nigerians surveyed said they were ready to secede immediately.
This means that $95\%$ of the $100$ million Nigerians that were surveyed are ready for the secession immediately.
This describes the results of the sample without making any conclusions about the population. (Descriptive Statistics).
Note the population here is the entire Nigerian population.
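
The percentage in this scenario is simple arithmetic on the sample counts. A minimal Python sketch, using the assumed figures from the scenario (the variable names are illustrative only):

```python
# Assumed figures from the scenario (not real survey results).
surveyed = 100_000_000        # Nigerians in the sample
ready_to_secede = 95_000_000  # respondents ready to secede immediately

# Sample percentage: multiply first so the arithmetic stays exact.
sample_percent = 100 * ready_to_secede / surveyed

print(f"{sample_percent:.0f}% of the sample is ready to secede immediately")
```

The result describes only the people who were surveyed, which is why it belongs to Descriptive Statistics.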

### Statistic and Parameter

Statistic is a numerical summary of a sample.

Please note: It is "Statistic", not "Statistics".
No, Statistic is not the singular form of Statistics! ☺☺☺

In our example, the $95\%$ is the statistic.
Suppose we now take this $95\%$ and extend it to the entire Nigerian population. (Inferential Statistics).
If we now say that $95\%$ of all Nigerians in Nigeria are ready to secede immediately, then the $95\%$ is being used as the parameter (more precisely, as an estimate of it).

Parameter is a numerical summary of a population.

Did you notice how we went from Descriptive Statistics to Inferential Statistics?
Did you notice how we went from Sample to Population?
Did you notice how we went from Statistic to Parameter?
Is it making sense?
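
The difference between a statistic and a parameter can also be sketched in Python. The population below is entirely hypothetical (600 "yes" and 400 "no" answers), and the sample size of 50 and the seed are arbitrary choices made only for illustration.

```python
import random

# Hypothetical population of 1,000 individuals: True = "yes", False = "no".
# These values are made up for illustration only.
population = [True] * 600 + [False] * 400

# Parameter: numerical summary of the whole population.
parameter = sum(population) / len(population)

# Statistic: numerical summary of a sample drawn from that population.
random.seed(0)  # fixed seed so the sketch is repeatable
sample = random.sample(population, 50)
statistic = sum(sample) / len(sample)

print(f"parameter (population proportion): {parameter:.2f}")
print(f"statistic (sample proportion):     {statistic:.2f}")
```

The statistic is usually close to the parameter but not exactly equal to it. Measuring how reliable that estimate is belongs to Inferential Statistics.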

Exercise 2
For each of these scenarios, identify whether the underlined is a statistic or parameter.

(4.) A sample of London residents was surveyed, and it was found that $\underline{85\%}$ had a bachelor's degree or higher.

$85\%$ is a statistic because it is the numerical summary of the sample of London residents.

(5.) In a study of all $16000$ students of Omega Bible Institute, it was found that $\underline{99\%}$ of them speak in tongues.

$99\%$ is a parameter because it is the numerical summary of the population, $16000$ students of Omega Bible Institute.

(6.) By $2014$, around $\underline{38\%}$ of all mobile phone users were smartphone users.
By $2018$, this number is expected to reach over $\underline{50\%}$.
(Number of mobile phone users worldwide from 2015 to 2020 (in billions) - Statista - The Statistics Portal)

$38\%$ is a parameter because it is the numerical summary of all mobile phone users (the population).
$50\%$ is a parameter because it is the numerical summary of all mobile phone users (the population).

(7.) $\underline{26}$ of the $\underline{50}$ states in the United States voted for Barack Obama in the $2012$ Presidential Elections.
(Presidential Election Results - NBC News)

$26$ is a statistic because it is the numerical summary of a sample of the states in the United States.
$50$ is a parameter because it is the numerical summary of the population of the states in the United States.

(8.) A homeowner in the City of Truth or Consequences, New Mexico measured the voltage supplied to his home on $6$ days of a given week, and found that the average value was $\underline{120}$ volts.

$120$ is a statistic because it is the numerical summary of a sample, $6$ days of a given week.

(9.) The Federal Republic of Nigeria has $36$ states.
Assume the areas of $3$ of the Southeastern states are added and the sum is divided by $3$; the result is $\underline{5301.70566}$ square kilometers.

$\underline{5301.70566}$ is a statistic because it is the numerical summary of a sample, $3$ states of the $36$ states of Nigeria.

(10.) Median weekly earnings of full-time workers were $\$\underline{887}$ in the third quarter of $2018$.
Women had median weekly earnings of $\$796$, or 81.8 percent of the $\$973$ median for men.
(Usual Weekly Earnings of Wage and Salary Workers Third Quarter 2018 - Bureau of Labor Statistics)

$\$887$ is a parameter because it is the numerical summary of the population of the nation's $117.2$ million full-time wage and salary workers.

(11.) Of the $100$ United States Senators, $\underline{77}$ of them voted for the "very big error" Iraq war.
(Senate Roll Call: Iraq Resolution - The Washington Post)

$77$ is a parameter because it is the numerical summary of the population of the nation's $100$ senators.

(12.) A study from Harvard University researchers found that of $93,600$ women aged between $25$ and $42$, three or more servings of berries per week may slash the risk of a heart attack by $\underline{33}\%$.
(Berries may lower women’s heart attack risk - Harvard School of Public Health published in the January 14, 2013 issue of the American Heart Association’s (AHA) journal Circulation)

$33\%$ is a statistic because it is the numerical summary of the sample of $93,600$ women aged between $25$ and $42$.

### Statistical Process

Statistics is a science because its process follows the scientific method.
The basic steps of a statistical process are:

(1.) Identify the research objective
What do you want to find out about?
What are the necessary questions to be asked?
What is the population of the study?

(2.) Collect the data needed to answer the questions
Use appropriate data collection techniques. (Data Collection)
Gaining access to an entire population is usually difficult. So, a sample is needed.
How randomly did you take your sample? (Sampling Methods)
How large is your sample size?

(3.) Describe the data
Obtain descriptive statistics of your sample data. (Descriptive Statistics)
Organize your data. (Data Organization)
Present your data properly. (Data Presentation)
Analyze your data. (Data Analysis)

(4.) Perform Inference
Apply appropriate techniques to extend the results of your sample data to the population of your study. (Inferential Statistics)
Report a level of reliability of the results.
What is the confidence level of your results?
What is the margin of error?

Once a research objective is stated and the population is identified, the researcher must create a list of information about the individuals of the population.
This leads us to...

### Variables

A variable is a characteristic of the individuals of the population being studied.
Vocabulary Words/Hint: vary, varies, variable, variability, variation
As the name implies, it always "varies".

Variables can be classified as:
Qualitative Variables or Categorical Variables
Vocabulary Words/Hint: quality, category
and
Quantitative Variables
Vocabulary Words/Hint: quantity

Quantitative variables can be further classified as:
Discrete Variables
Vocabulary Words/Hint: quantity you can count
and
Continuous Variables
Vocabulary Words/Hint: quantity you can measure

### Qualitative and Quantitative Variables

Qualitative Variables are variables that express qualitative attributes of the individuals of a population.
They are not measurable.
They are usually not numerical values.
Examples are: gender, color, religion, street names, zip codes (yes, because even though USA zip codes are numbers, they are not countable or measurable), etc.

Quantitative Variables are variables that express numerical measures of the individuals of a population.
They are measurable or countable.
They have a numerical value (number value).
Examples are: number of ...."anything you can count", price, age, area, volume, temperature, weight, height, size, length, etc.

### Quantitative Variables - Discrete Variables and Continuous Variables

Discrete Variables are quantitative variables that have a finite or countable number of values.
If you can count to get the value of the quantitative variable, then that variable is discrete.
Examples are: the number of ...."anything you can count" such as the number of GNTC (Georgia Northwestern Technical College) students, capacities of different buildings, etc.
Students should give more examples.

Continuous Variables are quantitative variables that have an infinite or uncountable number of values.
If you can measure to get the value of the quantitative variable, then that variable is continuous.
Examples are: weight, height, size, percentage, volume, area, time, distance, temperature, pressure, length, etc.
Students should give more examples.

Exercise 3
(13.) Is age a discrete or continuous variable?

Well, it depends!
Age (in years only) is a discrete variable. You can count it. For example: $25$ years, $30$ years, etc.
Age (that includes years, months, weeks, days, hours, minutes, and seconds) is a continuous variable. You cannot really count it. For example: $25.5$ years, etc.

We also have....

### Dependent and Independent Variables

Dependent Variable is:
A variable that depends on another variable/other variables.
Also known as the response variable
Variable that is predicted
Outcome/result of a study
The $y$-value of a function

Independent Variable is:
Variable that is not dependent on any other variable.
Also known as the explanatory or predictor variable
Variable that explains the response variable
The $x$-value of a function

Recall: In Algebra and Calculus, $y = f(x)$
$y$ is the dependent variable.
$x$ is the independent variable.

Bring it to Statistics
$y$ is the response variable.
$x$ is the predictor or explanatory variable.

Bring it to Philosophy
$y$ is the effect.
$x$ is the cause.
Talk about the existence of GOD based on cause-effect relationship
GOD exists!!!

Bring it to Economics/Business
$y$ is the output.
$x$ is the input.

Bring it to Psychology/Human Behavior/Sociology
$y$ is the consequence.
$x$ is the action.

Examples:
(1.) The weight (quantitative - continuous variable) I gained in the United States (I was skinny in Nigeria) was dependent on the number of McDonald's cheeseburgers I ate (quantitative - discrete variable) ☺☺☺
In this case, weight is the dependent variable and number of burgers is the independent variable.

(2.) GPA (grade point average - quantitative - continuous variable) is dependent on the number of "meaningful" hours of study (quantitative - discrete variable)
In this case, GPA is the dependent variable and number of meaningful study hours is the independent variable.

Students should give more examples.

### Data and Variables

The type of variable dictates the methods that can be used to analyze the data.
Qualitative data are observations corresponding to a qualitative variable.
Quantitative data are observations corresponding to a quantitative variable.
Discrete data are observations corresponding to a discrete variable.
Continuous data are observations corresponding to a continuous variable.

We can also classify variables based on....

### Level of Measurement of a Variable

The level of measurement of a variable determines the types of descriptive statistics and inferential statistics that may be applied to a variable.
It is an important factor in determining what tools may be used to describe the variable and what means of analysis to use for inference about the variable.
Rather than classify a variable as qualitative or quantitative, we can assign a level of measurement to the variable.

The levels of measurement of a variable are:
Nominal level of measurement
Ordinal level of measurement
Interval level of measurement
Ratio level of measurement

### Nominal Level of Measurement

A variable is at the nominal level of measurement if the variable deals with name, label, category, or code and where the order of ranking is not relevant.
Vocabulary Words/Hint: "nominal" means "name"

Examples are:
Race: African-American, Alaskan native, American Indian, Asian, Caucasian, Pacific Islander, etc.
Ask students if they have filled any application for employment or internship.
Did they realize they were doing some Statistics!?
Nationality: Nigeria, United States, etc.
Religion: Christianity, Judaism, Islam, etc.
Marital Status: Married, Single
Gender: Female, Male
Favorite sports of people identified as $1$ for Soccer, $2$ for Basketball, $3$ for Football (the order of ranking is not important)
Survey responses of "yes" or "no" (the order of ranking is not important)
Social security numbers
Types of food dishes
Types of music
Types of movies
Companies that closed locations and fired workers in $2018$
Companies that filed for bankruptcy but paid the CEOs a lot of bonuses
among others.

### Ordinal Level of Measurement

A variable is at the ordinal level of measurement if the variable deals with name, label, category, or code where the order of ranking is relevant, but the differences between the values of the variable cannot be found or the differences between the values of the variable can be found but are not meaningful.
Vocabulary Words/Hint: "ordinal" means "order"

Examples are:
Likert Scales: Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree etc.
Ask students if they have filled surveys or polls.
Of course they have! or they may...☺☺☺ in evaluating the professor!
Grades: A, B, C, D, F etc.
Rankings or Ratings: $1^{st}$, $2^{nd}$, $3^{rd}$, five stars, three stars, etc.
Levels: High, Medium, Low, etc.
Thumbs up, Thumbs down
Internet speed levels of fast, medium, slow
Alert levels identified as $10$ for Low, $20$ for Medium, $30$ for High (the order of ranking is important)
Positions of people in a line
among others.
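
The contrast between nominal and ordinal codes can be sketched in Python. The codings below are the ones used in the examples above ($1$, $2$, $3$ for sports; $10$, $20$, $30$ for alert levels); the lists of responses are hypothetical and made up for illustration.

```python
from collections import Counter

# Nominal: the codes are only labels (coding taken from the sports example).
SPORT = {1: "Soccer", 2: "Basketball", 3: "Football"}
# Ordinal: the codes carry a rank (coding taken from the alert-level example).
ALERT = {10: "Low", 20: "Medium", 30: "High"}

# Hypothetical responses, made up for illustration.
sport_codes = [1, 3, 1, 2, 1]
alert_codes = [30, 10, 20, 10, 30]

# For a nominal variable, counting categories (e.g. the mode) is meaningful,
# but sorting or averaging the codes is not.
most_common_code, count = Counter(sport_codes).most_common(1)[0]
print("Most common sport:", SPORT[most_common_code], f"({count} responses)")

# For an ordinal variable, ranking the values is also meaningful.
ranked = [ALERT[code] for code in sorted(alert_codes)]
highest = ALERT[max(alert_codes)]
print("Alerts from lowest to highest:", ranked)
print("Highest alert:", highest)
```

Note that even for the ordinal codes, the gap between $10$ and $20$ is not a meaningful difference; only the order is.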
### Interval Level of Measurement

A variable is at the interval level of measurement if the variable deals with name, label, category, or code, where the order of ranking is relevant, the differences between the values of the variable can be found and are meaningful, and there is no natural starting point.

Examples are:
calendar dates
Celsius temperatures
Fahrenheit temperatures
years in which an economic recession occurred
among others.

### Ratio Level of Measurement

A variable is at the ratio level of measurement if the variable deals with name, label, category, or code, where the order of ranking is relevant, the differences between the values of the variable can be found and are meaningful, and there is a natural starting zero point.

Examples are:
time in minutes, time in hours
acres of land
ages in years
weights in kilograms
Kelvin temperatures
number of buildings
among others.

Exercise 4
(14.) Identify the individuals, variables and their corresponding data, and the type of variable in the table.

| Participants | Weight(lb.) | Type | Price(\$) |
|---|---|---|---|
| A | $160$ | Athletic | $25$ |
| B | $250$ | Muscular | $50$ |
| C | $120$ | Athletic | $16$ |
| D | $100$ | Skinny | $10$ |
| E | $300$ | Obese | $93$ |

Individuals are the participants: A, B, C, D, and E

Variables are: Weight(lb.), Type, and Price(\$)

Variables and their corresponding data are:
Weight(lb.) - $160, 250, 120, 100, 300$
Type - Athletic, Muscular, Athletic, Skinny, Obese
Price(\$) - $25, 50, 16, 10, 93$

Variables and the types of variables are:
Weight(lb.) is a quantitative variable - continuous variable
Type is a qualitative variable
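
As a rough check, the qualitative/quantitative split for the Exercise 4 table can be sketched in Python. The data is copied from the table; the `classify` helper is hypothetical and applies a naive rule of thumb (all-numeric observations are treated as quantitative), which is an assumption rather than a general test.

```python
# Data copied from the Exercise 4 table.
table = {
    "Weight(lb.)": [160, 250, 120, 100, 300],
    "Type": ["Athletic", "Muscular", "Athletic", "Skinny", "Obese"],
    "Price($)": [25, 50, 16, 10, 93],
}


def classify(values):
    """Naive rule of thumb (an assumption, not a general test):
    all-numeric observations -> quantitative, otherwise qualitative."""
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if numeric else "qualitative"


kinds = {name: classify(values) for name, values in table.items()}
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

This rule would misclassify numeric labels such as zip codes or social security numbers, which are qualitative even though they are written as numbers, so the final call always depends on what the variable means.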