Hyperopt is a Python library for optimizing over awkward search spaces with real-valued, discrete, and conditional dimensions. It is very useful for automating the tuning of hyperparameters and other aspects of artificial neural network training. In this assignment you will have the opportunity to search for neural network models for a computer vision task and a natural language processing task, with the objective of maximizing performance on those particular tasks. This is akin to the high-throughput screening approach to AI search problems we discussed in class. Let's see how well you can do given your choice of free parameters to optimize, search spaces, and amount of time spent searching.

For this assignment, record your responses to the following activities in the README.md file in the homework04 folder of your assignments GitLab repository and push it and any code you developed by 11:59 PM Friday, October 18.

Activity 0: Branching

As discussed in class, each homework assignment must be completed in its own git branch; this will allow you to separate the work of each assignment and to use the merge request workflow.

First, follow these instructions to set up your git environment.

To create a homework04 branch in your local repository, follow the instructions below:

$ cd path/to/cse-40171-fa19-assignments   # Go to assignments repository
$ git checkout master                     # Make sure we are in master branch
$ git pull --rebase                       # Make sure we are up-to-date with GitLab
$ git checkout -b homework04              # Create homework04 branch and check it out
$ cd homework04                           # Go into homework04 folder

Once these commands have completed successfully, you are ready to add, commit, and push any work required for this assignment.

Activity 1: Installing and Testing Hyperopt

Let's begin by installing the Hyperopt package. On the same system or virtual machine you installed PyTorch on in Homework 01, simply run pip3 install hyperopt. Alternatively, you can try the developer installation:

git clone https://github.com/jaberg/hyperopt.git
(cd hyperopt && python setup.py develop)
(cd hyperopt && nosetests)

If the installation was successful, you should be able to run this test code and approximately reproduce the expected results in the comments at the end of the code snippet. Note that Hyperopt can make use of a distributed computing environment by utilizing a MongoDB database. For this assignment, there is no need to set that up (but you can if you want). If you received some errors during the pip3 install, you are likely missing dependencies. All of those can also be installed via pip3.

Take a look at Hyperopt's documentation. Pay particular attention to the three major components of a Hyperopt search job: defining an objective function, defining a search space, and minimizing the objective over the space. Looking at the test code you just ran, you will see examples of all three. For a neural network model search problem, the objective function will wrap a training routine and return a value for Hyperopt to minimize. For clarity's sake, you will likely want to move your actual training code to separate methods, leaving the objective as scaffolding that passes values from the search space to the training routines. Hyperopt will perform the search for you via the fmin method.
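To make those three components concrete, here is a minimal, self-contained sketch with a toy quadratic objective (illustrative only, not part of the assignment code):

from hyperopt import fmin, tpe, hp, STATUS_OK

# Objective: fmin will try to minimize the 'loss' value this returns
def objective(args):
    x = args['x']
    return {'loss': (x - 3) ** 2, 'status': STATUS_OK}

# Search space: a single real-valued dimension
space = {'x': hp.uniform('x', -10, 10)}

# Minimize the objective over the space
best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=100)
print(best)  # should be close to {'x': 3.0}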

Tip: get started on this assignment right away. The search tasks in the following activities will require quite a bit of time to run. If you encounter any trouble installing Hyperopt, contact the course staff ASAP. We are available if you need help.

Activity 2: Searching for a Good Insect Image Classifier (50 Points)

Now that you have a working installation of Hyperopt and some familiarity with how to use it, let's turn to a real model screening task: searching for the best possible classifier that can distinguish between images of ants and bees. We'll treat training in this instance as a fine-tuning task, where we will begin with a Convolutional Neural Network (CNN) model that is pre-trained on the ImageNet dataset. To ensure that you will be able to train this model in a reasonable amount of time using only a CPU, we will use a PyTorch implementation of ResNet18 as our pre-trained model. The goal of fine-tuning for our classifier training in this instance is to save time by skipping the random initialization of the weights in the entire network, and instead freeze the weights for all layers except the last fully-connected layer. This last layer is replaced with a new one with random weights, which will be adjusted during training. We want to perform this training many times via Hyperopt to search for a good configuration of hyperparameters that will maximize the performance of the classifier on the validation set that is available during training.
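In PyTorch, the freeze-and-replace step described above looks roughly like the following sketch, assuming torchvision's pre-trained ResNet18 (the A2.py code you download below does the equivalent work):

import torch.nn as nn
from torchvision import models

# Load a ResNet18 pre-trained on ImageNet
model = models.resnet18(pretrained=True)

# Freeze all pre-trained weights so they are not updated during training
for param in model.parameters():
    param.requires_grad = False

# Replace the last fully-connected layer with a randomly initialized one
# sized for our two classes (ants and bees); new modules default to
# requires_grad=True, so only this layer will be trained
model.fc = nn.Linear(model.fc.in_features, 2)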

[Sample photos of ants and bees from the dataset]

First, download the image data (photos of ants and bees like the ones above) and unzip it into your homework04/ folder. Next, download the PyTorch code for the standalone fine-tuning of ResNet18 on the insect classification problem, also into the homework04/ folder. Examining the code, you will see that it performs some image pre-processing, loads the pre-trained ResNet18 model, and sets things up to train the last fully-connected layer with a set of specified hyperparameters. Run the code; the resulting model shouldn't be very good:

$ python3 A2.py
Epoch 0/24
----------
train Loss: 0.7963 Acc: 0.7172
val Loss: 2.8741 Acc: 0.5294
<SNIP>
Epoch 24/24
----------
train Loss: 0.5503 Acc: 0.7377
val Loss: 0.5414 Acc: 0.7124

Training complete in 9m 5s
Best val Acc: 0.777778

You should see accuracy results that are reasonably close to the ones above. Fortunately, it is possible to do much better on this task. Code up a Hyperopt search framework, including an objective function and search space, that is able to randomly search for good hyperparameter configurations for this ResNet18 architecture over a period of time, using the A2.py code as the basis for training. It is up to you to choose which hyperparameters you want to optimize. Minimize your objective function using accuracy on the validation set as the statistic; since fmin can only minimize, your objective should return the negative of the validation accuracy. Put your code in a source file named A2Opt.py.
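As a starting point, the scaffolding in A2Opt.py might look something like the sketch below. Here train_model is a hypothetical helper holding your refactored A2.py training loop and returning the best validation accuracy; the particular hyperparameters and ranges are illustrative choices, not requirements.

from hyperopt import fmin, rand, hp, STATUS_OK

def objective(params):
    # train_model is a hypothetical helper: your refactored A2.py
    # training loop, returning the best validation accuracy
    val_acc = train_model(lr=params['lr'],
                          momentum=params['momentum'],
                          batch_size=int(params['batch_size']))
    # fmin minimizes, so return the negative of the validation accuracy
    return {'loss': -val_acc, 'status': STATUS_OK}

space = {
    'lr': hp.loguniform('lr', -8, 0),               # roughly 3e-4 to 1
    'momentum': hp.uniform('momentum', 0.5, 0.99),
    'batch_size': hp.quniform('batch_size', 4, 32, 4),
}

# rand.suggest performs the random search; max_evals caps the job count
best = fmin(fn=objective, space=space, algo=rand.suggest, max_evals=50)
print(best)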

Run a search for as long as you'd like. If you have created a good search problem, then you should be able to beat the baseline accuracy of the A2.py code. Document your best validation accuracy and the set of hyperparameters that led to it in your README.md file. Also document how many search jobs you ran, and how long it took.

Activity 3: Searching for a Good N-Gram Language Model (50 Points)

We can also perform a similar search for natural language processing problems. Word embeddings make use of a language model and feature learning strategy to map words or phrases from a text to vectors of real numbers. This is useful because it lets us go from a high-dimensional space that is inefficient to process to a much lower-dimensional space that better represents the relationships between linguistic elements. If we've done a good job of building the embedding model, we can pursue applications related to semantic analysis, syntactic parsing, and sentiment analysis. Let's see how well we can do looking for high-quality language models that make use of N-Gram features to form an embedding space.

First, download the PyTorch code for the standalone training and evaluation of an N-Gram language model for this poem by Shel Silverstein, also into the homework04/ folder. After training, the code will predict which of the two words, "Hook" or "Sardine", is the more probable match for "Captain". Given the text of the poem, the correct answer, of course, is "Hook". Out of the box, this code needs some optimization. As you can see, it is not always correct:

$ python3 A3.py
[(['Captain', 'Hook'], 'must'), (['Hook', 'must'], 'remember'), (['must', 'remember'], 'Not')]
[220.69850540161133, 220.6711208820343, 220.6437439918518, 220.61636543273926, 220.5889949798584, 220.5616250038147, 220.53425526618958, 220.50688862800598, 220.47952270507812, 220.45215702056885, 220.4247977733612, 220.3974359035492, 220.37009930610657, 220.3427698612213, 220.3154377937317, 220.2881097793579, 220.2607879638672, 220.2334656715393, 220.20614457130432, 220.1788260936737, 220.15150952339172, 220.1241946220398, 220.0968885421753, 220.06957125663757, 220.04225492477417, 220.01494073867798, 219.9876356124878, 219.96036577224731, 219.93311190605164, 219.90585827827454, 219.87860560417175, 219.85135293006897, 219.82410502433777, 219.79685640335083, 219.76960682868958, 219.74237489700317, 219.71514010429382, 219.68790912628174, 219.6606798171997, 219.63345170021057, 219.6062252521515, 219.5789976119995, 219.55177927017212, 219.52455925941467, 219.49735593795776, 219.47014594078064, 219.44294118881226, 219.4157338142395, 219.3885314464569, 219.36131286621094, 219.33409476280212, 219.30687928199768, 219.27966690063477, 219.2524540424347, 219.22524046897888, 219.1980481147766, 219.17085647583008, 219.1436686515808, 219.11648035049438, 219.0892903804779, 219.06210827827454, 219.0349247455597, 219.0077440738678, 218.98056173324585, 218.95338678359985, 218.9262125492096, 218.89904689788818, 218.87188076972961, 218.84471535682678, 218.81755137443542, 218.79039096832275, 218.7632441520691, 218.73609852790833, 218.70895862579346, 218.68181824684143, 218.65468311309814, 218.62754440307617, 218.60040712356567, 218.57327032089233, 218.54613876342773, 218.51900672912598, 218.49187684059143, 218.46475458145142, 218.43763303756714, 218.4105088710785, 218.383385181427, 218.35626554489136, 218.3291506767273, 218.3020396232605, 218.27492928504944, 218.247820854187, 218.2207226753235, 218.19362211227417, 218.16651797294617, 218.13939309120178, 218.11226987838745, 218.08514738082886, 218.05805277824402, 218.03100061416626, 218.0039541721344]
Predicted word (probability): sardine

Try running A3.py ten times to verify that the final prediction is not always correct. Let's unpack the operation of the code a bit further. The first line of output shows some example word-level trigrams that were extracted from the text. These are the base-level features used as input to train the language model. Trigrams are handy in that they express the probabilistic relationship between sequences of words: some words are far more likely than others to follow a given pair of words. Using this information, the code then trains a neural language model to build the embedding space. The long second line in the output above shows the loss at each training epoch. Given the very slow decrease in loss across epochs, it looks like this run wasn't particularly good. That is confirmed by the final, wrong prediction when we apply the language model to assess word similarity.
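The trigram features on that first line can be produced with a simple sliding window over the tokenized text, as in this sketch (the exact text handling in A3.py may differ):

# Pair each two-word context with the word that follows it
words = "Captain Hook must remember Not to scratch his toes".split()
trigrams = [([words[i], words[i + 1]], words[i + 2])
            for i in range(len(words) - 2)]
print(trigrams[:3])
# [(['Captain', 'Hook'], 'must'), (['Hook', 'must'], 'remember'),
#  (['must', 'remember'], 'Not')]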

It is possible to change the hyperparameters in this code to reliably train a more effective model. Code up a Hyperopt search framework, including an objective function and search space, that is able to randomly search for good hyperparameter configurations for this N-Gram language model architecture over a period of time, using the A3.py code as the basis for training. It is up to you to choose which hyperparameters you want to optimize. Minimize your objective function using the training loss as the statistic. Put your code in a source file named A3Opt.py.
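A3Opt.py can follow the same pattern as A2Opt.py. The sketch below assumes a hypothetical train_ngram_model helper built from your refactored A3.py training loop; it also shows how a discrete dimension (hp.choice) can sit alongside continuous ones. The hyperparameters and ranges are illustrative, not prescribed.

from hyperopt import fmin, rand, hp, STATUS_OK

def objective(params):
    # train_ngram_model is a hypothetical helper: your refactored A3.py
    # training loop, returning the final training loss
    final_loss = train_ngram_model(embedding_dim=params['embedding_dim'],
                                   lr=params['lr'],
                                   epochs=int(params['epochs']))
    # The training loss is already the quantity to minimize
    return {'loss': final_loss, 'status': STATUS_OK}

space = {
    'embedding_dim': hp.choice('embedding_dim', [10, 32, 64, 128]),
    'lr': hp.loguniform('lr', -7, -1),   # roughly 1e-3 to 0.37
    'epochs': hp.quniform('epochs', 50, 500, 50),
}

best = fmin(fn=objective, space=space, algo=rand.suggest, max_evals=100)
print(best)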

Run a search for as long as you'd like. If you have created a good search problem, then you should be able to reliably get the correct prediction of "Hook" after each training run. Verify this by training with the best set of hyperparameters ten times, checking the prediction after each training run. Document the lowest loss value achieved by Hyperopt and the set of hyperparameters that led to it in your README.md file. Also document how many search jobs you ran, and how long it took.
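One quick way to run that ten-trial check from the shell, assuming your tuned A3.py prints its prediction on the last line of output as shown above:

$ for i in $(seq 10); do python3 A3.py | tail -n 1; done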

Feedback

If you have any questions, comments, or concerns regarding the course, please provide your feedback at the end of your README.md.

Submission

Remember to put your name in the README.md file. To submit your assignment, commit your work to the homework04 folder of the homework04 branch in your assignments GitLab repository:

$ cd path/to/cse-40171-fa19-assignments   # Go to assignments repository
$ git checkout master                     # Make sure we are in master branch
$ git pull --rebase                       # Make sure we are up-to-date with GitLab
$ git checkout homework04                 # Check out the homework04 branch created in Activity 0
$ cd homework04                           # Go to homework04 directory
...
$ $EDITOR README.md                       # Edit appropriate README.md
$ git add README.md                       # Mark changes for commit
$ git commit -m "homework04: complete"    # Record changes
...
$ git push -u origin homework04           # Push branch to GitLab

To submit your work, create a merge request following the process described here, but make sure to change the target branch from wscheirer/cse-40171-fa19-assignments to your personal fork's master branch so that your code is not visible to other students. Additionally, assign the merge request to our TA (sabraha2) and add wscheirer as an approver (so all class staff can track your submission).