Section 6 Homework

Reminder: You are allowed to work with other students on the homework assignments but you must acknowledge who you worked with at the top of your homework assignment.

At the top of your assignment, list any collaborators:

Statement of Integrity: All work submitted is my own, and I have followed all rules for collaboration.

Signature:

At the top of your assignment, either copy the entire statement of integrity or write the phrase “Statement of Integrity”, then sign your name to it.

All homework should be handwritten (unless otherwise noted).


Exercise 1. Data was collected on home sales in Pierce County, Washington. The data set was obtained from the OpenIntro textbook site at https://www.openintro.org/data/index.php?data=pierce_county_house_sales. Included among the variables measured on the 16814 homes are:

We want to determine if there is evidence that the average home price in Pierce County is different from 350,000 dollars, which is approximately the average home price in the United States. We also want a confidence interval for the average home price in Pierce County. Below is a histogram of sale_price (note that a few houses with extremely high prices sit on the very far right of the histogram), along with a table of summary statistics.

\(\hat{\mu}\)    s         n
461233           236083    16814

  1. Prepare. Write the null and alternative hypotheses in notation for the question of interest.

  2. Write the fitted model for sale price.

  3. Check. Assume independence of observations is satisfied. Check the other condition for the hypothesis test.

  • Though the sample size is large, there are a few extreme outliers so it is possible that the sampling distribution of the sample mean will not be approximately normally distributed.

Regardless of whether you say the check condition holds, complete the rest of the hypothesis test.

  4. Calculate. Calculate a 99% confidence interval for the average home price in Pierce County. State the degrees of freedom for the \(t^*\) value that you found with StatKey.
  • your interval should be: (456542.8, 465923.2) dollars
  5. Calculate the test statistic, \(T\), for the hypothesis test for this example. State the degrees of freedom. Then, draw a graph of the t-distribution assuming the null hypothesis is true, mark where your observed \(T\) falls on the distribution, and shade the area that represents the p-value.
  • test statistic: T = 61.09
  6. Use StatKey to find the p-value for the test.

  7. Conclude. Interpret your confidence interval in context of the problem.

  8. Write a conclusion for your hypothesis test in context of the problem.

  • Hint: Is there evidence that the mean price of houses in Pierce County is different from the national average of 350,000 dollars? Is the evidence strong, moderate, weak, or absent? Did you include an explicit statement about the point estimate in context of the problem?
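
The arithmetic behind the two Calculate hints in this exercise can be sketched in Python. This is only a sketch, not the StatKey workflow the assignment asks for: it assumes the usual one-sample t formulas, and `t_star = 2.576` is an assumed stand-in for the 99% critical value you would read from StatKey, so the last digit of the interval may differ slightly from the hint. All variable names are illustrative.

```python
import math

# Summary statistics copied from the table in Exercise 1.
mean_price = 461233   # sample mean sale price (dollars)
sd_price = 236083     # sample standard deviation (dollars)
n = 16814             # number of homes
null_value = 350000   # hypothesized mean under the null hypothesis

df = n - 1                      # degrees of freedom for a one-sample t procedure
se = sd_price / math.sqrt(n)    # standard error of the sample mean

# ASSUMPTION: approximate 99% critical value; replace with your StatKey value.
t_star = 2.576

ci_low = mean_price - t_star * se
ci_high = mean_price + t_star * se
t_stat = (mean_price - null_value) / se

print(f"df = {df}")
print(f"SE = {se:.2f}")
print(f"99% CI: ({ci_low:.1f}, {ci_high:.1f})")
print(f"T = {t_stat:.2f}")
```

The standard error is computed once and reused, since the confidence interval and the test statistic differ only in how they combine it with the sample mean.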


Exercise 2. Recall from Section 2 the textbook data where, for individual textbooks, the price of each book was recorded at the UCLA bookstore and at Amazon. Suppose we want to determine if the average price of a new textbook at UCLA is different from the average price of a new textbook from Amazon. We take a random sample of 73 UCLA courses and record the textbook price at the UCLA bookstore and on Amazon. The following table gives the first 3 observations in the data set.

In Section 2, we took a more subjective approach to answering this question. Now that we have the tools to conduct a formal hypothesis test, we will use them to determine if there is statistical evidence of a difference in average price.

dept_abbr  course  isbn            ucla_new  amaz_new
Am Ind     C170    978-0803272620  27.67     27.95
Anthro     9       978-0030119194  40.59     31.14
Anthro     135T    978-0300080643  31.68     32
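
Because the analysis works on one difference per book, it can help to see how those differences are formed. The short Python sketch below uses only the three rows shown above and takes each difference as UCLA minus Amazon, the direction used in the model statement later in this exercise. The variable names are illustrative, and three rows are far too few to reproduce the summary statistics, which come from all 73 books.

```python
# First three rows of the textbook table: (dept_abbr, course, ucla_new, amaz_new).
books = [
    ("Am Ind", "C170", 27.67, 27.95),
    ("Anthro", "9", 40.59, 31.14),
    ("Anthro", "135T", 31.68, 32.00),
]

# One difference per book, UCLA minus Amazon, rounded to the nearest cent.
diffs = [round(ucla - amazon, 2) for _, _, ucla, amazon in books]

for (dept, course, _, _), d in zip(books, diffs):
    # A positive value means the book costs more at the UCLA bookstore.
    print(f"{dept} {course}: {d:+.2f}")
```

Each book contributes a single number, so the paired analysis is a one-sample analysis of these differences.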

A histogram of the differences is also shown:

Finally, some summary statistics on the differences are shown here:

mean_diff  sd_diff  n_diff
12.76      14.26    73

  1. Review. Why is this data paired?
  • This data is paired because we have the price of the same textbook from each store (UCLA and Amazon).
  2. Prepare. Write the theoretical/population model for the difference in textbook prices.
  • \(Y_{diff} = \mu_{diff} + Error\), where \(Y_{diff}\) is the difference in textbook price (UCLA minus Amazon) and \(\mu_{diff}\) is the true mean difference in textbook price (UCLA minus Amazon).
  3. Prepare. Write the null and alternative hypotheses to test for a difference in mean textbook price across the two sources in statistical notation.
  • \(H_0: \mu_{diff} = 0\)
  • \(H_a: \mu_{diff} \ne 0\)
  4. Check. Assume that each difference is independent of other differences. Check the other condition for the hypothesis test.

Regardless of whether you say the check condition holds, complete the rest of the hypothesis test.

  5. Calculate. Calculate a 90% confidence interval for the average difference in textbook price. State the degrees of freedom for the \(t^*\) value that you found with StatKey.
  • your 90% confidence interval should be: (9.97, 15.55) dollars.
  6. Calculate the test statistic, \(T\), for the hypothesis test for this example. State the degrees of freedom. Then, draw a graph of the t-distribution assuming the null hypothesis is true, mark where your observed \(T\) falls on the distribution, and shade the area that represents the p-value.
  • test statistic: T = 7.645
  7. Use StatKey to find the p-value for the test.

  8. Conclude. Interpret your confidence interval in context of the problem.

  9. Write a conclusion for your hypothesis test in context of the problem.
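
As a check on the arithmetic behind the two Calculate hints in this exercise, the standard error, test statistic, and interval can be worked from the summary table. This is a sketch under stated assumptions: it applies the usual one-sample t formulas to the differences, writes \(\hat{\mu}_{diff}\), \(s_{diff}\), and \(n_{diff}\) for the mean_diff, sd_diff, and n_diff columns (notation introduced here for illustration), and uses \(t^* \approx 1.67\) as an assumed table value for 72 degrees of freedom, so your StatKey value may differ slightly.

\[ SE = \frac{s_{diff}}{\sqrt{n_{diff}}} = \frac{14.26}{\sqrt{73}} \approx 1.669, \qquad T = \frac{\hat{\mu}_{diff} - 0}{SE} = \frac{12.76}{1.669} \approx 7.645 \]

\[ \hat{\mu}_{diff} \pm t^* \cdot SE = 12.76 \pm 1.67 \times 1.669 \approx (9.97,\ 15.55) \]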