The goal of this assignment is to write code that trains a Multilayer Perceptron (MLP) regression model and makes predictions with that model. To do this, we will use the PyBrain Python package for constructing custom neural networks. PyBrain is compatible with Linux, macOS, and Windows, so feel free to choose whichever environment you prefer. In many Python environments, installing PyBrain is as simple as
pip install pybrain. If that does not work, or if you want an alternative installation method, there are other ways available.
For this assignment, place your Python code in the
homework08 folder of your assignments GitLab
repository and push your work by 11:59 PM Friday, November 30.
To create a
homework08 branch in your local repository, follow the
$ cd path/to/cse-40171-fa18-assignments # Go to assignments repository
$ git remote add upstream https://gitlab.com/wscheirer/cse-40171-fa18-assignments # Switch back over to the main class repository
$ git fetch upstream # Toggle the upstream branch
$ git pull upstream master # Pull the files for homework08
$ git checkout -b homework08 # Create homework08 branch and check it out
$ cd homework08 # Go into homework08 folder
We've spent a bit of time in class discussing machine learning tasks like classification and clustering, but we haven't said much about regression, where the goal is not to assign a class label but to predict a continuous value for a feature vector. In other words, our output y is a real-valued prediction. This setup is useful for many problems in social and behavioral science. One example problem is predicting the median value of owner-occupied homes given a set of attributes about the home and the surrounding neighborhood.
The Boston Housing Dataset, which contains data for this problem, is a classic dataset used by the machine learning community to evaluate regressors. Given a set of 13 continuous feature dimensions, the task is to predict the median housing value in suburbs of Boston, expressed in thousands of dollars. Download the individual training, validation, and testing files that have been prepared for this assignment. The feature dimensions are as follows (the first thirteen are the x values):
Use PyBrain to write a training program called
trainNet.py that will learn an MLP regression model from the
housing-validation.csv files. Here are some guidelines:
(n,) so make sure to reshape that vector:
y = y.reshape( -1, 1 ). The data set can then be prepared as follows:
dataset = SupervisedDataSet(inputSize, targetSize)
train() method outputs mean-square error by default). If training is working correctly, you should observe the error scores decreasing (with some fluctuations) as training proceeds.
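The reshaping step in the guidelines above can be sketched with NumPy alone. This is a minimal illustration rather than the required implementation: the toy array and the names `data`, `x`, `y`, `input_size`, and `target_size` are assumptions made for the example, and it assumes the target sits in the last column, after the feature columns.

```python
import numpy as np

# Hypothetical toy data: two feature columns followed by one target column.
# The real housing files have thirteen feature columns before the target.
data = np.array([
    [0.1, 2.0, 24.0],
    [0.3, 1.5, 21.6],
    [0.2, 3.1, 34.7],
])

x = data[:, :-1]  # every column except the last: shape (n, d)
y = data[:, -1]   # last column only: shape (n,)

# A 1-D target of shape (n,) becomes a column of shape (n, 1),
# matching the 2-D layout used for the inputs.
y = y.reshape(-1, 1)

input_size = x.shape[1]   # number of feature dimensions
target_size = y.shape[1]  # 1 for single-output regression

print(x.shape, y.shape)
```

The two sizes computed at the end are the kind of values that would be passed as `inputSize` and `targetSize` when the dataset is constructed.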
The PyBrain documentation may be useful. Also feel free to use other reference code available on the web (cite any sources you used in your
After training, note the root-mean-square-error value achieved at epoch 1000 in your
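Since the training output is a mean-square error and the value to report is a root-mean-square error, a conversion is needed. The sketch below shows that conversion; the helper name `rmse_from_mse` and the error values are hypothetical and only illustrate the arithmetic.

```python
import math


def rmse_from_mse(mse):
    """Return the root-mean-square error for a given mean-square error."""
    return math.sqrt(mse)


# Hypothetical per-epoch mean-square errors, for illustration only.
mse_history = [25.0, 16.0, 9.0]
rmse_history = [rmse_from_mse(m) for m in mse_history]

print(rmse_history)
```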
Use PyBrain to write a prediction program called predictNet.py that will use the trained model from Activity 1 to make predictions for the feature vectors in the
housing-testing.csv file. Here are the guidelines for this activity:
y_dummy = np.zeros(y.shape). The ground-truth targets are not needed, thus you can construct the PyBrain dataset for making predictions as follows:
ds = SupervisedDataSet(inputSize, targetSize)
housing-testing.csv output a predicted target value (y). Save these predictions in plaintext format to the file
predictions.txt, and turn this file in with your submission.
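Saving the predictions in plaintext could look like the sketch below. It assumes one value per line with four decimal places, which is a formatting choice made for this example and not something the assignment specifies; the helper `save_predictions` and the predicted values are hypothetical.

```python
import os
import tempfile


def save_predictions(predictions, path):
    """Write one predicted value per line in plain text."""
    with open(path, "w") as handle:
        for value in predictions:
            handle.write(f"{value:.4f}\n")


# Hypothetical predicted values, for illustration only.
predictions = [24.1, 21.9, 33.5]

# Written under a temporary directory here so the example does not
# overwrite a real file; the assignment asks for predictions.txt.
out_path = os.path.join(tempfile.mkdtemp(), "predictions.txt")
save_predictions(predictions, out_path)

with open(out_path) as handle:
    print(handle.read())
```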
How did your trained model do making predictions on the test data? Add your answer to your
If you have any questions, comments, or concerns regarding the course, please
provide your feedback at the end of your
To submit your assignment, please commit your work to the
homework08 branch in your assignment's GitLab repository:
$ cd path/to/cse-40171-fa18-assignments # Go to assignments repository
$ git checkout master # Make sure we are in master branch
$ git pull --rebase # Make sure we are up-to-date with GitLab
$ git checkout -b homework08 # Create homework08 branch and check it out
$ cd homework08 # Go to homework08 directory
...
$ $EDITOR README.md # Edit appropriate README.md
$ git add README.md # Mark changes for commit
$ git commit -m "homework08: complete" # Record changes
...
$ git push -u origin homework08 # Push branch to GitLab
Procedure for submitting your work: create a merge request by the process that is described here, but make sure to change the target branch from wscheirer/cse-40171-fa18-assignments to your personal fork's master branch so that your code is not visible to other students. Additionally, assign this merge request to your TA and add wscheirer, agraese, and AndroidKitKat as approvers (so all class staff can track your submission). Your assigned TA is agraese if you have a last name starting with A through Ki, or AndroidKitKat if you have a last name starting with Kl through W.