**Project One: Multiple Regression, Qualitative Variables Interactions, Quadratic Regression**

For Project One, you have been asked to create different regression models analyzing a housing data set. Before beginning work on the project, be sure to read through the Project One Guidelines and Rubric to understand what you need to do and how you will be graded on this assignment. Be sure to carefully review the Project One Summary Report template, which contains all of the questions that you will need to answer about the regression analyses you are performing.

For this project, you will be writing all the scripts yourself. You may reference the textbook and your previous work on the problem sets to help you write the scripts.

**Scenario**

You are a data analyst working for a real estate company. You have access to a large set of historical data that you can use to analyze relationships between different attributes of a house (such as square footage or the number of bathrooms) and the house’s selling price. You have been asked to create different regression models to predict sale prices for houses based on key variables. These regression models will help your company set better prices when listing a home for a client. Setting better prices helps ensure that listings sell within a reasonable amount of time.

There are several variables in this data set, but you will be working with the following important variables:

| Variable | What does it represent? |
|---|---|
| price | Sale price of the home |
| bedrooms | Number of bedrooms |
| bathrooms | Number of bathrooms |
| sqft_living | Size of the living area in sqft |
| sqft_above | Size of the upper level in sqft |
| sqft_lot | Size of the lot in sqft |
| age | Age of the home |
| grade | Measure of craftsmanship and the quality of materials used to build the home |
| appliance_age | Average age of all appliances in the home |
| crime | Crime rate per 100,000 people |
| backyard | Home has a backyard (backyard=1) or not (backyard=0) |
| view | Home backs out to a lake (view=2), backs out to trees (view=1), or backs out to a road (view=0) |

**Prepare Your Data Set**

In the following code block, you have been given the R code to prepare your data set.

Click the **Run** button on the toolbar to run this code.

In [1]:

    housing <- read.csv(file="housing.csv", header=TRUE, sep=",")

    # converting appropriate variables to factors
    housing <- within(housing, {
      view <- factor(view)
      backyard <- factor(backyard)
    })

    # number of columns
    ncol(housing)

    # number of rows
    nrow(housing)

22

2692
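
Converting `view` and `backyard` to factors matters for the models that follow, because R's `lm()` expands a factor into indicator (dummy) variables and, under its default treatment contrasts, uses the first level as the baseline. The sketch below illustrates that coding on a tiny made-up data frame; `toy` and `design` are hypothetical names, and the three rows are not taken from the housing data.

```r
# Hypothetical toy data, NOT the housing data set: one home per view level
toy <- data.frame(view = factor(c(0, 1, 2)))

# Levels of the factor; the first level ("0") acts as the baseline
levels(toy$view)

# Default treatment contrasts: one indicator column per non-baseline level
contrasts(toy$view)

# Design matrix that lm() would build from this factor
design <- model.matrix(~ view, data = toy)
design
```

For the housing data, this suggests that a model including `view` will report coefficients named `view1` and `view2`, each measured relative to homes that back out to a road (`view=0`).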

**Model #1 – First Order Regression Model with Quantitative and Qualitative Variables**

You have been asked to create a first order regression model for *price* as the response variable, and *sqft_living*, *grade*, *bathrooms*, and *view* as predictor variables. Before writing any code, review Section 3 of the Summary Report template to see the questions you will be answering about your first order multiple regression model.

Run your scripts to get the outputs of your regression analysis. Then use the outputs to answer the questions in your summary report.

**Note: Use the + (plus) button to add new code blocks, if needed.**

In [ ]:

    # quantitative variables for Model #1
    # (view is a factor, so it is left out of the correlation matrix)
    myvars <- c("price", "sqft_living", "grade", "bathrooms")
    housing_subset <- housing[myvars]

    # Print the first six rows
    print("head")
    head(housing_subset, 6)

    # Print the correlation matrix
    print("cor")
    corr_matrix <- cor(housing_subset, method = "pearson")
    round(corr_matrix, 4)
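
The first order model described in this section could be fit along the following lines. This is a minimal sketch, not the required solution: the helper name `fit_model1` is hypothetical, and the code assumes the `housing` data frame has already been prepared as in the first code block, with `view` stored as a factor. The residual plot is one common diagnostic and may not match what the Summary Report template asks for.

```r
# Sketch: first order model with quantitative predictors and the factor `view`.
# ASSUMPTION: `data` has numeric columns price, sqft_living, grade, bathrooms
# and a factor column view (as created in the data preparation block).
fit_model1 <- function(data) {
  lm(price ~ sqft_living + grade + bathrooms + view, data = data)
}

# Runs only when the prepared `housing` data frame is in the workspace
if (exists("housing") && is.data.frame(housing)) {
  model1 <- fit_model1(housing)

  # Coefficient estimates, t-tests, R-squared, and the overall F-test
  print(summary(model1))

  # Residuals against fitted values, to look for non-constant variance
  plot(fitted(model1), resid(model1),
       xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 2)
}
```

Because `view` is a factor, the summary lists separate coefficients for the non-baseline levels rather than a single slope for `view`.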

**Model #2 – Complete Second Order Regression Model with Quantitative Variables**

You have been asked to create a complete second order regression model for *price* as the response variable, and *appliance_age* and *crime* as predictor variables. Before writing any code, review Section 4 of the Summary Report template to see the questions you will be answering about your complete second order multiple regression model.

Run your scripts to get the outputs of your regression analysis. Then use the outputs to answer the questions in your summary report.

**Note: Use the + (plus) button to add new code blocks, if needed.**
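
For two quantitative predictors, a complete second order model contains both first order terms, their interaction, and both squared terms. Writing $x_1$ for *appliance_age* and $x_2$ for *crime*, the general form is

$$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$$

One way to express this in R is sketched below. The helper name `fit_model2` is hypothetical, and the code assumes `housing` has been prepared as in the first code block. Squared terms are wrapped in `I()` so that `^` is treated as arithmetic rather than as formula syntax.

```r
# Sketch: complete second order model for price on appliance_age and crime.
# ASSUMPTION: `data` has numeric columns price, appliance_age, and crime.
fit_model2 <- function(data) {
  lm(price ~ appliance_age + crime +   # first order terms
       appliance_age:crime +           # interaction term
       I(appliance_age^2) +            # quadratic term for appliance_age
       I(crime^2),                     # quadratic term for crime
     data = data)
}

# Runs only when the prepared `housing` data frame is in the workspace
if (exists("housing") && is.data.frame(housing)) {
  model2 <- fit_model2(housing)
  print(summary(model2))
}
```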

In [ ]:

``

**Nested Models F-Test**

You have been asked to create a reduced model and compare it with the complete second order model (Model #2 above). Before writing any code, review Section 5 of the Summary Report template to see the questions you will need to answer.

Run your scripts to get the outputs of your regression analysis. Then use the outputs to answer the questions in your summary report.

**Note: Use the + (plus) button to add new code blocks, if needed.**
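
A nested models F-test compares a reduced model against a complete model that contains all of the reduced model's terms plus some extra ones. In R, passing both fitted models to `anova()` produces this test. The sketch below is illustrative only: the helper name `compare_nested` is hypothetical, and the choice of reduced model (dropping the two squared terms) is an assumption, so use the reduced model specified in the Summary Report template if it differs.

```r
# Sketch: nested F-test of a reduced model against the complete second order model.
# ASSUMPTION: `data` has numeric columns price, appliance_age, and crime.
compare_nested <- function(data) {
  # Complete second order model (Model #2)
  complete <- lm(price ~ appliance_age + crime + appliance_age:crime +
                   I(appliance_age^2) + I(crime^2), data = data)

  # ASSUMED reduced model: the same model without the two squared terms
  reduced <- lm(price ~ appliance_age + crime + appliance_age:crime,
                data = data)

  # Row 2 of the table holds the F statistic and its p-value
  anova(reduced, complete)
}

# Runs only when the prepared `housing` data frame is in the workspace
if (exists("housing") && is.data.frame(housing)) {
  print(compare_nested(housing))
}
```

Under this assumed reduced model, the null hypothesis is that both squared-term coefficients are zero. A small p-value in the second row of the table would suggest that at least one squared term improves the fit.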

In [ ]:

``

**End of Project One Jupyter Notebook**

The HTML output can be downloaded by clicking **File**, then **Download as**, then **HTML**. Be sure to answer all of the questions in the Summary Report template for Project One, and to include your completed Jupyter Notebook scripts as part of your submission.