Assignment 1
Total Points: 100
In this exercise, you will implement linear regression from scratch using the programming language of your choice. Please make sure to avoid using toolbox from R, MATLAB, Python, or any other programming language. You will implement the Gradient Descent Algorithm that we have discussed in class to find out the parameters for ⊖. One way to verify that gradient descent is working correctly is to look at the value of J (⊖) and check that it is decreasing with each iteration.
For implementing some of the principles of programming, try to modularize the code as much as possible and for improved code readability, please make sure to thoroughly comment the code clearly explaining what you did and why you did what you did. In this assignment, we will use Parkinson’s dataset for experiments. Assignment 1 contains four sections. Analysis is a crucial aspect of the assignment, so for each subpart try to answer the question in more detail to receive full credit and justify what you did in your implementation as well as the results you obtained.
Also, please divide the data into training and test data and use the test data to evaluate
performance.
1. Linear Regression with One Variables: (15 points)
a. Implement linear regression to predict motor_UPDRS using “PPE” feature from dataset.
“motor_UPDRS” will be the target variable.
b. Evaluate performance using metrics (such as Mean Squared Error (MSE), R-
squared (R²), and Adjusted R-squared (Adjusted R²). You may also use graphs for explaining your observations.
2. Linear Regression with Two Variables: (15 points)
a. Add ‘NHR' as an additional input feature to the previous linear regression model. Does adding this feature improve the performance of the model? Compare the performance of the models in Question 1 and Question 2 using evaluation metrics (such as Mean Squared Error (MSE), R-squared (R²), and Adjusted R-squared (Adjusted R²). You may also use graphs for explaining your observations.
3. Stepwise Linear Regression: (50 points)
In this section, you will explore stepwise linear regression to determine the most relevant features for predicting motor_UPDRS.
a. Select any 5 features out of the 10 provided below. Implement forward stepwise linear regression with the chosen features. The process involves iteratively adding one feature at a time from your selection. After adding each feature, evaluate the model's
performance using metrics such as Mean Squared Error, R-squared, Adjusted R- squared, or BIC-Bayesian Information Criterion (preferred). Choose the feature that contributes the most to improving the model's performance and add it to the model. Continue this iterative process for a total of 5 iterations. Explain your selection criteria for adding or removing features.
Features: 'age', 'Jitter (%)’, ‘Shimmer', 'Shimmer(dB)', 'Shimmer: APQ5',
'Shimmer: APQ11', 'HNR', 'RPDE', 'DFA', 'PPE'
b. For this task, you will be given a list of 10 features as below. Implement backward
stepwise linear regression, beginning with a model that includes all 10 features. Remove one feature at a time from the model and evaluate the model's performance after each removal. Remove the feature that has the minimal adverse effect on the model's performance. Continue this iterative process for a total of 5 iterations. Please provide a detailed explanation of your criteria for selecting which features to remove or retain during this process.
Features: 'age', 'Jitter (%)’, ‘Shimmer', 'Shimmer(dB)', 'Shimmer: APQ5',
'Shimmer: APQ11', 'HNR', 'RPDE', 'DFA', 'PPE'
c. Compare the final model obtained from forward stepwise regression with the final model obtained from backward stepwise regression. Which one is better? Discuss the
differences in terms of the selected features, model performance.
d. Compare the performance of the model built using the features from Q.2 (a) with the
resultant accuracies of the model built using the selected features Q.3(c). Which set of
features performed better?
Evaluate model performance using metrics like Mean Squared Error, R-squared, Adjusted R-squared or BIC-Bayesian Information Criterion (preferred). You may also use graphs for explaining your observation.
4. Regularization and Feature Scaling: (20 points)
a. For the best performing model in Q.3 (Model from Q.3(d)), does regularization improve the performance?
b. Does Feature Scaling improve the performance for the model in Question 3(d)?
Evaluate performance using metrics like Mean Squared Error, R-squared, Adjusted Rsquared. You may also use graphs for explaining your observation.
You can use NumPy, pandas, and Matplotlib libraries for your assignment, but
please do not use built-in libraries such as scikit-learn, statsmodels, etc.
Submission Instructions:
- Please upload a zipped file named 'Assignment-1_YourName' to Dropbox.
The zip file should contain the following items: a dataset, a code file (please
use relative paths when reading/importing the dataset), a PDF-format
report, and a README.txt.
- Please ensure that your code is error-free; there should be no errors or
warnings when running it on my machine.
- If you are using a programming language other than Python, please provide a
README.txt file explaining how to run your code to obtain the result.
Mention any libraries or dependencies that need to be installed before
running the code, if necessary.
Academic Integrity
Discussion of course contents with other students is an important part of the academic
process and is encouraged. However, it is expected that course programming
assignments, homework assignments, and other course assignments will be completed
on an individual basis (unless specified otherwise). Students may discuss general concepts
with one another, but may not, under any circumstances, work together on the actual
implementation of any course assignment. If you work with other students on “general
concepts” be certain to acknowledge the collaboration and its extent in the assignment.
Unacknowledged collaboration will be considered dishonest. “Code sharing” (including
code from previous quarters) is strictly disallowed. “Copying” or significant collaboration
on any graded assignments will be considered a violation of the university guidelines for
academic honesty.
If the same work is turned in by two or more students, all parties involved will be held
equally accountable for violation of academic integrity. You are responsible for ensuring
that other students do not have access to your work: do not give another student access
to your account, do not leave printouts in the recycling bin, pick up your printouts
promptly, do not leave your workstation unattended, etc. If you suspect that your work
has been compromised notify me immediately. If you have any questions about
collaboration or any other issues related to academic integrity, please see me
immediately for clarification. In addition to the policy stated in this syllabus, students are
expected to comply with the Wright State University Code of Student Conduct
(http://www.wright.edu/students/judicial/conduct.html) and in particular the portions
Pertaining to Academic Integrity http://www.wright.edu/students/judicial/integrity.html at all times.
Note: In cases where there is suspicion of academic dishonesty, the professor and
teaching assistant reserve the right to address the matter by calling in the student for an in-person question and answer session.
Need Help for this Assignment ? Just Leave a WhatsApp message at : +91-995 3141 035 (For quick response)
Solution Includes: AI writing Detection and Plagiarism report with 100% Accuracy.

Comments