Upload the data set called MATHSAT.SAV (in the Start Here folder). Replicate the correlational/regression analyses that are described in Example 4.3 and Example 4.4.
Add the categorical variable, PCT30, to the scatterplot.
Do the analysis for each group separately as defined by the categorical variable (PCT30).
Write out the overall regression equation (you can do that in words if you do not like or understand the statistical notation), as well as the equation for each subgroup (the two PCT30 groups).
Define/explain the intercept and slope (beta coefficient) from each of the three regression models.
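As a rough illustration of the three regression models (overall plus one per PCT30 group), here is a minimal Python sketch. The rows are made-up placeholder values, not data from MATHSAT.SAV, the names "predictor" and "outcome" stand in for whichever variables Example 4.3 and Example 4.4 use, and the 0/1 coding of PCT30 is an assumption.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # change in y per one-unit change in x
    intercept = y_bar - slope * x_bar    # predicted y when x equals 0
    return intercept, slope

# Hypothetical rows: (predictor, outcome, PCT30 group coded 0 or 1).
rows = [
    (10, 520, 0), (20, 510, 0), (25, 505, 0), (28, 500, 0),
    (35, 490, 1), (50, 470, 1), (60, 455, 1), (70, 445, 1),
]

def fit_subset(rows, group=None):
    """Fit all rows when group is None, otherwise only the given PCT30 group."""
    subset = [r for r in rows if group is None or r[2] == group]
    return fit_line([r[0] for r in subset], [r[1] for r in subset])

models = {
    "overall": fit_subset(rows),
    "PCT30 = 0": fit_subset(rows, 0),
    "PCT30 = 1": fit_subset(rows, 1),
}
for name, (b0, b1) in models.items():
    print(f"{name}: predicted outcome = {b0:.2f} + ({b1:.3f}) * predictor")
```

In each printed equation, the first number is the intercept (the predicted outcome when the predictor is 0) and the second is the slope (the predicted change in the outcome for a one-unit increase in the predictor).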
Category: Statistics
-
Title: “Exploring the Relationship between Math Scores and Socioeconomic Factors: A Regression Analysis with Categorical Variables”
-
MAT 240 Project Two: Analyzing Housing Sales Market for a Region of the United States Project Two: Hypothesis Testing and Confidence Intervals for House Listing Prices in Different Regions
MAT 240 Project Two Guidelines and Rubric
Competency
In this project, you will demonstrate your mastery of the following competency:
Apply statistical techniques to address research problems
Perform hypothesis testing to address an authentic problem
Overview
In this project, you will apply inference methods for means to test your hypotheses about the housing sales market for a region of the United States. You will use appropriate sampling and statistical methods.
Scenario
You have been hired by your regional real estate company to determine if your region’s housing prices and housing square footage are significantly different from those of the national market. The regional sales director has three questions that they want to see addressed in the report:
Are housing prices in your regional market lower than the national market average?
Is the square footage for homes in your region different than the average square footage for homes in the national market?
For your region, what is the range of values for the 95% confidence interval of square footage for homes in your market?
You are given a real estate data set that has houses listed for every county in the United States. In addition, you have been given national statistics and graphs that show the national averages for housing prices and square footage. Your job is to analyze the data, complete the statistical analyses, and provide a report to the regional sales director. You will do so by completing the Project Two Template located in the What to Submit area below.
Directions
Introduction
Region: Start by picking one region from the following list of regions:
West South Central, West North Central, East South Central, East North Central, Mid Atlantic
Purpose: What is the purpose of your analysis?
Sample: Define your sample. Take a random sample of 500 house sales for your region.
Describe what is included in your sample (i.e., states, region, years or months).
Questions and type of test: For your selected sample, define two hypothesis questions (see the Scenario above) and the appropriate type of test for each. Address the following for each hypothesis:
Describe the population parameter for the variable you are analyzing.
Describe your hypothesis in your own words.
Identify the hypothesis test you will use (1-Tail or 2-Tail).
Level of confidence: Discuss how you will use estimation and confidence intervals to help you solve the problem.
1-Tail Test
Hypothesis: Define your hypothesis.
Define the population parameter.
Write null (Ho) and alternative (Ha) hypotheses. Note: For means, define a hypothesis that is less than the population parameter.
Specify your significance level.
Data analysis: Summarize your sample data using appropriate graphical displays and summary statistics and confirm assumptions have not been violated to complete this hypothesis test.
Provide at least one histogram of your sample data.
In a table, provide summary statistics including sample size, mean, median, and standard deviation. Note: For quartiles 1 and 3, use the quartile function in Excel:
=QUARTILE([data range], [quartile number])
Summarize your sample data, describing the center, spread, and shape in comparison to the national information (under Supporting Materials, see the National Summary Statistics and Graphs House Listing Price by Region PDF). Note: For shape, think about the distribution: skewed or symmetric.
Check the conditions.
Determine if the normal condition has been met.
Determine if there are any other conditions that you should check and whether they have been met. Note: Think about the central limit theorem and sampling methods.
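For readers who want to cross-check the Excel summary table, the following is a minimal Python sketch of the same statistics. The prices are hypothetical placeholders, not values from the course data set. The "inclusive" quantile method is used because it applies the same interpolation as Excel's QUARTILE function, and `statistics.stdev` is the sample standard deviation.

```python
import statistics

def summarize(values):
    """Return sample size, mean, median, sample standard deviation, Q1 and Q3."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # sample (n - 1) standard deviation
        "q1": q1,
        "q3": q3,
    }

# Hypothetical listing prices; replace with your own random sample.
sample_prices = [185000, 210000, 232500, 249900, 275000, 310000, 365000, 499000]

summary = summarize(sample_prices)
for label, value in summary.items():
    print(f"{label:>8}: {value:,.2f}")
```

A mean that sits well above the median, as in this made-up sample, is one sign of a right-skewed distribution.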
Hypothesis test calculations: Complete hypothesis test calculations.
Calculate the hypothesis statistics.
Determine the appropriate test statistic (t). Note: This calculation is (mean – target)/standard error. In this case, the mean is your regional mean, and the target is the national mean.
Calculate the probability (p value). Note: This calculation is done with the T.DIST function in Excel:
=T.DIST([test statistic], [degrees of freedom], TRUE)
The degrees of freedom are calculated by subtracting 1 from your sample size.
Interpretation: Interpret your hypothesis test results using the p value method to reject or not reject the null hypothesis.
Relate the p value and significance level.
Make the correct decision (reject or fail to reject).
Provide a conclusion in the context of your hypothesis.
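The 1-tail calculation above can be sketched in Python as a cross-check on the Excel work. This is an illustrative sketch only: the sample values and the national mean are hypothetical, and because the standard library has no t distribution, the left-tail probability (what T.DIST returns with its cumulative argument set to TRUE) is approximated here by numerically integrating the t density with Simpson's rule.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3              # probability between 0 and |t|
    return 0.5 + area if t > 0 else 0.5 - area

def one_tail_lower_test(sample, target, alpha=0.05):
    """Test Ha: population mean < target. Returns (t, p value, reject H0?)."""
    n = len(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - target) / std_err
    p_value = t_cdf(t_stat, n - 1)    # left-tail probability
    return t_stat, p_value, p_value < alpha

# Hypothetical regional prices and national mean, for illustration only.
regional_sample = [212000, 198500, 240000, 225000, 187000, 231500, 205000, 219000]
national_mean = 288000

t_stat, p_value, reject = one_tail_lower_test(regional_sample, national_mean)
print(f"t = {t_stat:.3f}, p = {p_value:.6f}, reject H0: {reject}")
```

If the p value is below the significance level, reject the null hypothesis; otherwise, fail to reject it.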
2-Tail Test
Hypotheses: Define your hypothesis.
Define the population parameter.
Write null and alternative hypotheses. Note: For means, define a hypothesis that is not equal to the population parameter.
State your significance level.
Data analysis: Summarize your sample data using appropriate graphical displays and summary statistics and confirm assumptions have not been violated to complete this hypothesis test.
Provide at least one histogram of your sample data.
In a table, provide summary statistics including sample size, mean, median, and standard deviation. Note: For quartiles 1 and 3, use the quartile function in Excel:
=QUARTILE([data range], [quartile number])
Summarize your sample data, describing the center, spread, and shape in comparison to the national information. Note: For shape, think about the distribution: skewed or symmetric.
Check the assumptions.
Determine if the normal condition has been met.
Determine if there are any other conditions that should be checked on and whether they have been met. Note: Think about the central limit theorem and sampling methods.
Hypothesis test calculations: Complete hypothesis test calculations.
Calculate the hypothesis statistics.
Determine the appropriate test statistic (t). Note: This calculation is (mean – target)/standard error. In this case, the mean is your regional mean, and the target is the national mean.
Determine the probability (p value). Note: This calculation is done with the T.DIST.2T function in Excel:
=T.DIST.2T([test statistic], [degrees of freedom])
The degrees of freedom are calculated by subtracting 1 from your sample size. T.DIST.2T accepts only nonnegative values, so if your test statistic is negative, enter its absolute value.
Interpretation: Interpret your hypothesis test results using the p value method to reject or not reject the null hypothesis.
Compare the p value and significance level.
Make the correct decision (reject or fail to reject).
Provide a conclusion in the context of your hypothesis.
Comparison of the test results: Revisit Question 3 from the Scenario section: For your region, what is the range of values for the 95% confidence interval of square footage for homes?
Calculate and report the 95% confidence interval. Show or describe your method of calculation.
Final Conclusions
Summarize your findings: In one paragraph, summarize your findings in clear and concise plain language.
Discuss: Discuss whether you were surprised by the findings. Why or why not?
You can use the following tutorial that is specifically about this assignment:
MAT-240 Module 7 Project Two Video
What to Submit
To complete this project, you must submit the following:
Project Two Template Word Document Use this template to structure your report, and submit the finished version as a Word document.
Supporting Materials
The following resources may help support your work on the project:
Data Set: MAT 240 House Listing Price by Region Spreadsheet
Use this data for input in your project report.
Document: National Summary Statistics and Graphs House Listing Price by Region PDF
Use this data for input in your project report. -
Title: “Exploring the Relationship between Parental Involvement and Academic Achievement: A Model Drawing Analysis”
Please follow the link provided below and read/watch all the instructions and examples on that page.
I have also attached the research topic and variables needed to complete the two proposed model drawings in the assignment. Below you can find the instructions and an example of exactly how the assignment should look. You are to follow the Hayes Model. -
Title: Using the Normal Distribution and Confidence Intervals to Analyze BMI Differences Between Smokers and Non-Smokers
Part 1: Using the Normal Distribution
Given that BMI is approximately normally distributed, present the following summary statistics for BMI of smokers and nonsmokers in the table below.
Smokers Non-smokers
Mean
Standard Deviation
Sample size (n)
Use this table to answer the following questions based on your individual data. Compute probabilities based on the normal distribution. For each portion write a 1-sentence explanation as to how Excel was used to assist you in your computations.
Find the percent of smokers expected to have a BMI of greater than 25 (overweight).
Find the percent of nonsmokers expected to have a BMI of less than 18.5 (underweight).
A normal BMI range is between 18.5 and 24.9. What percentage of smokers are expected to be within this range? What percentage of nonsmokers are expected to be within this range?
A researcher is interested in which BMI represents the 90th percentile (where 90% are at this BMI level or lower). What BMI score represents the 90th percentile cut-off rate?
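The Part 1 probabilities can be cross-checked with a short Python sketch. The means and standard deviations below are hypothetical placeholders for your individual data, and the source does not say which group the 90th percentile refers to, so the sketch computes it for both. In Excel, the corresponding functions are NORM.DIST (cumulative probability) and NORM.INV (percentile).

```python
from statistics import NormalDist

# Hypothetical summary statistics; replace with the values from your table.
smokers = NormalDist(mu=26.0, sigma=4.0)
nonsmokers = NormalDist(mu=24.5, sigma=3.5)

def pct_between(dist, low, high):
    """Percent of the distribution lying between low and high."""
    return 100 * (dist.cdf(high) - dist.cdf(low))

# Percent of smokers with BMI greater than 25 (upper tail).
pct_smokers_over_25 = 100 * (1 - smokers.cdf(25))

# Percent of nonsmokers with BMI less than 18.5 (lower tail).
pct_nonsmokers_under_18_5 = 100 * nonsmokers.cdf(18.5)

print(f"Smokers with BMI > 25: {pct_smokers_over_25:.2f}%")
print(f"Nonsmokers with BMI < 18.5: {pct_nonsmokers_under_18_5:.2f}%")
for name, dist in (("Smokers", smokers), ("Nonsmokers", nonsmokers)):
    print(f"{name} in normal range: {pct_between(dist, 18.5, 24.9):.2f}%")
    print(f"{name} 90th percentile BMI: {dist.inv_cdf(0.90):.2f}")
```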
Part 2: Creating Confidence Intervals
Using the information from Part 1 (above), calculate a 90%, 95%, and 99% confidence interval for both groups. Then, complete the following table:
Group 90% Confidence Interval 95% Confidence Interval 99% Confidence Interval
Smokers
Non-smokers
Use this table to answer the following questions based on your individual data.
As the level of confidence increases, what happens to the width of the confidence interval? Does it increase or decrease? Explain one (1) reason why this would happen.
Using the 95% confidence interval, compare the BMI of smokers versus nonsmokers. Write a 2- to 3-sentence paragraph explaining if these intervals overlap or not. What does the comparison indicate regarding the differences in BMI between the two groups?
Explain, in 1–2 sentences, one situation in which you (as a researcher) might choose a 99% confidence interval and one in which you might choose a 90% interval. Fully justify your choice. -
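To see how the interval width changes with the confidence level, here is a minimal Python sketch. The mean, standard deviation, and sample size are hypothetical, and the sketch assumes z critical values from the normal distribution; if your course uses t critical values instead, the intervals will be slightly wider, but the pattern is the same.

```python
import math
from statistics import NormalDist

def z_interval(mean, std_dev, n, confidence):
    """Confidence interval for a mean using a normal (z) critical value."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. about 1.96 for 95%
    margin = z * std_dev / math.sqrt(n)             # margin of error
    return mean - margin, mean + margin

# Hypothetical summary statistics for one group.
mean_bmi, sd_bmi, n = 26.0, 4.0, 100

widths = {}
for confidence in (0.90, 0.95, 0.99):
    low, high = z_interval(mean_bmi, sd_bmi, n, confidence)
    widths[confidence] = high - low
    print(f"{confidence:.0%} CI: ({low:.2f}, {high:.2f}), width {high - low:.2f}")
```

The critical value grows as the confidence level rises, so the margin of error and the interval width grow with it.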
“Exploring Probability Distributions and the Central Limit Theorem”
Answer the questions on the worksheet that I attached under additional files. The questions cover the uniform distribution, the normal distribution, the Central Limit Theorem, the InvNorm function, and related topics. For each question, you need to show the calculator function that you used and the work that you did on the calculator to find the solution.
-
Title: Survey Analysis for UMGC New Student Recruitment
Your work on the new student committee was a huge success! The director of new student recruitment has requested that you continue your work on the committee. Specifically, the director would like you to distribute a small survey to the students who attended the weekend event, gauging their level of interest in studying at UMGC. The director is interested in obtaining demographic information from the prospective students, the academic program into which they would enroll, and their overall level of interest in attending UMGC. The survey questions and results are below:
Survey questions given to prospective students
What is your age?
Would you live in on-campus housing or off-campus housing?
Into which academic program would you enroll?
How likely are you to attend UMGC in the next year? (Rate: 1–4, 1 is not likely and 4 is very likely)
Survey results:
Student  Age  Housing     Academic Program   Likely to attend UMGC
1        18   Off campus  Political science  4
2        19   Off campus  History            1
3        17   On campus   Cybersecurity      2
4        30   Off campus  Nursing            4
5        18   On campus   History            3
6        21   On campus   Psychology         4
7        45   Off campus  Business           2
8        20   On campus   Business           3
9        18   On campus   Accounting         4
10       36   Off campus  Nursing            4
11       25   Off campus  History            2
12       29   Off campus  Sociology          2
13       31   Off campus  Spanish            2
14       19   On campus   Psychology         2
Your first task is to define the data resulting from each survey question as qualitative or quantitative. If the variable is qualitative, indicate if it is nominal or ordinal. If it is quantitative, indicate whether it is discrete or continuous and whether it is interval or ratio (see graphic below).
Next, create a table (a frequency distribution, stem and leaf plot, or a grouped frequency distribution) to organize the data from one of the variables. Include the table in your post. Does including the relative frequency or cumulative frequency make the table more meaningful? Why do you feel this table best organizes the data?
Then, consider how you might visually display the results as a graph (bar graph, Pareto chart, dot plot, line graph, histogram, pie chart, or box plot). Include the graph in your post. Why did you choose this graph? Explain why you believe this graph is the best choice to display the data.
Finally, find the mean, median, and mode for one of the variables. Which of these measures of central tendency do you think is the best choice for “average” and why? Find the range and standard deviation (measures of dispersion) for the variable. What would a narrower or wider deviation signify in the context of this data? -
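As one way to organize a single survey variable, the sketch below builds a frequency distribution with relative and cumulative frequencies for the "Likely to attend UMGC" ratings listed in the survey results. It is an illustration of the table layout, not a required format, and the function name is hypothetical.

```python
from collections import Counter

# "Likely to attend UMGC" ratings for students 1 through 14, from the survey results.
ratings = [4, 1, 2, 4, 3, 4, 2, 3, 4, 4, 2, 2, 2, 2]

def frequency_table(values):
    """Return rows of (value, frequency, relative frequency, cumulative relative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows, cumulative = [], 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((value, counts[value], counts[value] / total, cumulative / total))
    return rows

print(f"{'Rating':>6} {'Frequency':>9} {'Relative':>9} {'Cumulative':>10}")
for value, freq, rel, cum in frequency_table(ratings):
    print(f"{value:>6} {freq:>9} {rel:>9.3f} {cum:>10.3f}")
```

The relative column shows each rating's share of the 14 responses, and the cumulative column shows the share of students at or below each rating.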
“Exploring Statistical Concepts with Statkey: A Hands-On Approach”
Complete the statistics work using StatKey. I attached the link for StatKey and the sources to be used in answering.
Statkey link: https://www.lock5stat.com/StatKey/ -
Title: Summary of Papers on Probability and Statistics: Methods, Results, and Conclusions Introduction: Probability and statistics are essential fields in mathematics that are used to analyze and interpret data in various disciplines. In recent years, there have been several papers
Write a professional, typewritten report of at least 3 pages that summarizes the papers on topics such as probability and statistics and includes:
(1) A short introduction
(2) A brief description of the materials/methods used in the paper
(3) The most significant results obtained by the authors
(4) Conclusion(s)
Please use single spacing with Times New Roman 12pt font and 1” margins. At the end of the report, cite the paper discussed. The following reference styles are acceptable: Chicago, Harvard, Vancouver, APA, and MLA. -
“Analyzing Non-Book Data Using Chi-Square Test”
Instructions for the assignment are in the Week 6 document. Please do not use AI.
The chi-square data set is for the non-book question. -
“Applying Confidence Intervals in the Health Sciences: A Practical Demonstration”
Scenario/Summary
The highlight of this week’s lab is confidence intervals and the use of these intervals in the health sciences. There is a short reading that specifically relates confidence intervals to health sciences and then you are asked to demonstrate your knowledge of confidence intervals by applying them in a practical manner.
Instructions
Steps to Complete the Week 7 Lab
Step 1: Find these articles in the Chamberlain Library. Once you click each link, you will be logged into the Library and then click on “PDF Full Text”.
First Article: Confidence Intervals, Part 1
Second Article: Confidence Intervals, Part 2
Step 2: Consider the use of confidence intervals in health sciences with these articles as inspiration and insights.
Step 3: Using the data you collected for the Week 5 Lab (heights of 10 different people that you work with plus the 10 heights provided by your instructor), discuss your method of collection for the values that you are using in your study (systematic, convenience, cluster, stratified, simple random). What are some faults with this type of data collection? What other types of data collection could you have used, and how might this have affected your study?
Step 4: Now use the Week 6 Spreadsheet to help you with calculations for the following questions/statements.
a) Give a point estimate (mean) for the average height of all people at the place where you work. Start by putting the 20 heights you are working with into the blue Data column of the spreadsheet. What is your point estimate, and what does this mean?
b) Find a 95% confidence interval for the true mean height of all the people at your place of work. What is the interval? [see screenshot below]
c) Give a practical interpretation of the interval you found in part b, and explain carefully what the output means. (For example, you might say, “I am 95% confident that the true mean height of all of the people in my company is between 64 inches and 68 inches”).
d) Post a screenshot of your work from the t value Confidence Interval for µ on the Confidence Interval tab of the Week 6 Excel spreadsheet.
Step 5: Now, change your confidence level to 99% for the same data, and post a screenshot of this table, as well.
Step 6: Compare the margins of error from the two screenshots. Would the margin of error be larger or smaller for the 99% CI? Explain your reasoning.
Step 7: Save the Week 7 Lab document with your answers and include your name in the title.
Step 8: Submit the document.
Requirements
The deliverable is a Word document with your answers to the questions posed below based on the article you find.
Required Software
Microsoft Word
Internet access to read articles
Grading
This activity will be graded based on the Week 7 Lab Rubric.
Outcomes
CO 8: Given a sample dataset, estimate and interpret the confidence intervals for population mean or proportion.
Due Date
By 11:59 p.m. MT on Sunday