One of the biggest problems practitioners face in machine learning is choosing the correct set of hyperparameters, and tuning them to squeeze out the last few points of accuracy takes a lot of time.
For instance, take SVC from the well-known library Scikit-Learn: it exposes a whole set of hyperparameters (C, kernel, gamma, and so on), each of which affects the resulting model.
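A quick way to see how many knobs there are (this snippet is my own illustration, not from the original post):

```python
from sklearn.svm import SVC

# every key in this dictionary is a hyperparameter we could tune
print(SVC().get_params())
# e.g. {'C': 1.0, 'kernel': 'rbf', 'gamma': ..., 'degree': 3, ...}
```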
Introducing SMBO: Sequential Model-Based Global Optimization. SMBO is one of the earlier families of hyperparameter optimization algorithms; it chooses the next set of hyperparameters to try based on the observational history of previous trials. The Gaussian Process (GP) approach is one such method. A more modern one is TPE (Tree-structured Parzen Estimator), an advanced tree-based hyperparameter optimization algorithm. The paper linked later in this article covers both in detail.
In this article I will be highlighting Hyperopt, a module which implements TPE and uses MongoDB-backed trials to take the optimization to the next level.
Let's first see what needs to be done while optimizing hyperparameters. An ML algorithm generally has a loss/cost function which needs to be minimized, and its value depends on both the parameters and the hyperparameters of the algorithm. Weights and biases are decided by the algorithm's own optimization routine; selecting good hyperparameters further minimizes the loss function and makes the model more robust and accurate, increasing its overall efficiency. For instance, in gradient descent it is very important to select a good learning rate to reach the convergence point in the shortest time: if it is too large, the cost function will overshoot and never find the minimum, and if it is too small, it will take forever to get there.
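To make the learning-rate point concrete, here is a minimal sketch (my own illustration, not from the original article) of gradient descent on f(x) = x**2, where the step size decides whether we converge, crawl, or diverge:

```python
# gradient descent on f(x) = x**2, whose gradient is 2*x
def gradient_descent(lr, steps=50, x=5.0):
    for _ in range(steps):
        x = x - lr * 2 * x  # update: x <- x - lr * f'(x)
    return x

print(gradient_descent(lr=0.1))   # converges close to the minimum at 0
print(gradient_descent(lr=1e-5))  # too small: barely moves from 5.0
print(gradient_descent(lr=1.1))   # too large: overshoots and diverges
```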
So, hyperopt enables us to minimize a loss function of any number of variables. Let's work on a toy example: take a simple function of two variables, x and y. We have to minimize it such that x and y lie within certain ranges R1 and R2.
To begin, let's get the installation out of the way. I am assuming that you have Python and sklearn installed; Hyperopt is a simple pip installation:
pip install hyperopt
One of the most common errors you'll face while using hyperopt for the first time is an incompatible version of networkx; downgrade it to version 1.11 and you are good to go:
pip install networkx==1.11
Now that you have installed hyperopt, let's see a code sample that minimizes the function above using hyperopt:
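The original snippet is not preserved here, so the following is a minimal sketch. The objective f(x, y) = x**2 + y**2 and the bounds (-10, 10) are my own stand-ins, but the structure matches the description that follows: a list of two hp expressions, and a single args parameter that is unpacked inside the objective.

```python
from hyperopt import fmin, tpe, hp

def objective(args):
    # fmin passes the whole search space as one argument,
    # so unpack x and y from the list
    x, y = args
    return x ** 2 + y ** 2  # assumed toy function

# one hp expression per variable, joined together in a list
space = [hp.uniform('x', -10, 10), hp.uniform('y', -10, 10)]

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=100)
print(best)  # e.g. {'x': 0.0013, 'y': -0.0021}
```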
Now let's go through how search spaces are created; then we will minimize the loss for a scikit-learn classifier.
Hyperopt Search Space
Unlike Scikit-Learn's grid search, a Hyperopt search space does not have to be a dictionary. As can be seen in the example, I used a list to define the spaces for the two arguments x and y.
fmin() passes only one parameter to the objective function, so all the different spaces must be joined into one using either a list, a tuple, or a dictionary. You can see in the example above that I extracted the two search spaces from the args parameter. More about the different hyperopt spaces can be found here: https://github.com/hyperopt/hyperopt/wiki/FMin#21-parameter-expressions
I am mentioning a few here which we will be using later (a quick sampling sketch follows the list):
hp.choice(label, options): label is a string which names the hyperparameter, and options is a list; one element of the list will be returned for that particular label.
hp.uniform(label, low, high): again, label is the string naming the hyperparameter; this returns a value uniformly distributed between low and high. When optimizing, this variable is constrained to a two-sided interval.
hp.lognormal(label, mu, sigma): returns a value drawn according to exp(normal(mu, sigma)), so that the logarithm of the return value is normally distributed. When optimizing, this variable is constrained to be positive.
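To get a feel for what these expressions produce, hyperopt's pyll module can draw random samples from a space. A small sketch (the labels and bounds here are made up for illustration):

```python
from hyperopt import hp
import hyperopt.pyll.stochastic

space = {
    'kernel': hp.choice('kernel', ['linear', 'rbf']),
    'C': hp.uniform('C', 0.1, 10.0),
    'scale': hp.lognormal('scale', 0, 1),
}

# draw one random point from the space
print(hyperopt.pyll.stochastic.sample(space))
# e.g. {'C': 3.72, 'kernel': 'rbf', 'scale': 0.87}
```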
Now we'll use a more complex search space with a sklearn classifier and the iris dataset. Remember, all you have to do is extract the correct hyperparameter search space from the list/tuple/dictionary; we'll use a dictionary here:
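Again the original code block is not preserved, so here is a sketch reconstructed from the walkthrough below. The two candidate classifiers (SVC and KNeighborsClassifier) and their ranges are my assumptions; the shape of the code (hp.choice over two dictionaries, an if/else on the classifier type, fitting on x_train and y_train) follows the description:

```python
from hyperopt import fmin, tpe, hp
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0)

# hp.choice returns one of the two dictionaries on each evaluation
space = hp.choice('classifier', [
    {'type': 'svm',
     'C': hp.uniform('C', 0.1, 10.0),
     'kernel': hp.choice('kernel', ['linear', 'rbf'])},
    {'type': 'knn',
     'n_neighbors': hp.choice('n_neighbors', list(range(1, 30)))},
])

def objective(args):
    # branch on which classifier hp.choice picked
    if args['type'] == 'svm':
        clf = SVC(C=args['C'], kernel=args['kernel'])
    else:
        clf = KNeighborsClassifier(n_neighbors=args['n_neighbors'])
    clf.fit(x_train, y_train)
    # fmin minimizes, so return the negative training accuracy
    return -clf.score(x_train, y_train)

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50)
print(best)
```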
See how I created the space: first I used hp.choice, so one of the two dictionaries is returned. Since the hyperparameters depend on which classifier was chosen, I used a simple if/else statement to check which one it is, assigned the hyperopt values to the classifier's arguments accordingly, defined the clf model, and fit x_train and y_train to it. The fmin function will iterate over different algorithms and their hyperparameters and return the set on which the loss is minimal. Note that I minimized the loss on x_train and y_train; you can use cross-validation to prevent overfitting, as sketched below.
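For instance, a cross-validated variant of the objective could look like this (my own addition; build_clf is a hypothetical helper wrapping the if/else from the example above):

```python
from sklearn.model_selection import cross_val_score

def objective_cv(args):
    clf = build_clf(args)  # hypothetical helper: the if/else from above
    # 5-fold cross-validated accuracy instead of training accuracy
    return -cross_val_score(clf, x_train, y_train, cv=5).mean()
```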
Another wrapper built on hyperopt is Hyperopt-Sklearn; check it out here: https://github.com/hyperopt/hyperopt-sklearn
However, TPE can be exploited even further when we evaluate models with different sets of hyperparameters in parallel instead of serially: the tree-structured form of TPE makes it easy to draw many candidates at a time and evaluate them based on maximization of Expected Improvement. Refer to https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf to check out the mathematics. Note that Hyperopt-Sklearn currently does not support MongoDB trials.
Installation of MongoDB is pretty easy: just go here https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/#import-the-public-key-used-by-the-package-management-system and follow the instructions.
Let's start using Mongo with an example and minimize a simple sin function. The code will be:
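The snippet itself isn't preserved, so here is a minimal sketch. It assumes a local server on the default port and a database called foo_db (both placeholders); the exp_key just names this particular experiment:

```python
import math
from hyperopt import fmin, tpe, hp
from hyperopt.mongoexp import MongoTrials

# the only new line compared to a plain fmin() call
trials = MongoTrials('mongo://localhost:27017/foo_db/jobs', exp_key='exp1')

best = fmin(fn=math.sin,
            space=hp.uniform('x', -2, 2),
            algo=tpe.suggest,
            max_evals=100,
            trials=trials)
print(best)
```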
Only one line needed to change: we added a trials parameter. Now let's go step by step through how to execute this code.
Start a local MongoDB server
To start a MongoDB server as a daemon, use the command:

sudo service mongod start

This will get you the port number on which the server was created:

lsof -i | grep mongo

Or you can check the log files, though you'll need root access:

tail -f /var/log/mongodb/mongod.log

By default the server listens on localhost:27017.
Configure and run the file
As can be seen in the above example, a line calling MongoTrials was added:
trials = MongoTrials('mongo://localhost:27017/foo_db/jobs', exp_key='exp1')
Make sure you don't remove jobs from the end of the path; the current implementation requires jobs at the end of the database name. Run the Python file; it will hang until a worker connects to the MongoDB server.
Running Hyperopt Mongo Worker
hyperopt-mongo-worker --mongo=localhost:27017/foo_db --poll-interval=0.1
This will start the worker, and the jobs assigned by MongoTrials will be sent to the server, which parallelizes the evaluations. Note that unlike the MongoTrials path, the --mongo argument takes only the database name, without the trailing /jobs.
So it was an easy three-step process for a simple sin() function. The problem occurs, however, when you define your own objective function, so in this final section we'll discuss the errors you usually see while executing the worker. Again, let's consider an example where we want to minimize an objective function:
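As before, the original snippet is missing; this sketch assumes a toy objective (x - 1)**2 defined in the same script that calls fmin, which is exactly what triggers the error discussed next:

```python
from hyperopt import fmin, tpe, hp
from hyperopt.mongoexp import MongoTrials

def objective(x):
    # defined in __main__, which the remote worker cannot import
    return (x - 1) ** 2

trials = MongoTrials('mongo://localhost:27017/foo_db/jobs', exp_key='exp2')
best = fmin(fn=objective,
            space=hp.uniform('x', -5, 5),
            algo=tpe.suggest,
            max_evals=100,
            trials=trials)
```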
So start the Mongo server and run the file, but when you execute the hyperopt-mongo-worker command you may hit an AttributeError from the worker saying it can't find the objective function: the worker deserializes each job on its side, and it cannot import a function that was defined in your script's __main__.
To solve this error, make a separate file (e.g. objective.py) for the objective function:
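A sketch of that file, keeping the assumed toy objective from above:

```python
# objective.py
def objective(x):
    return (x - 1) ** 2
```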
and import it inside the main file where you are using hyperopt (don't name this file hyperopt.py, or it will shadow the hyperopt package itself when you run it; I'll call it main.py):
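A sketch of main.py under the same assumptions as before:

```python
# main.py
from hyperopt import fmin, tpe, hp
from hyperopt.mongoexp import MongoTrials
from objective import objective  # now importable by the worker too

trials = MongoTrials('mongo://localhost:27017/foo_db/jobs', exp_key='exp2')
best = fmin(fn=objective,
            space=hp.uniform('x', -5, 5),
            algo=tpe.suggest,
            max_evals=100,
            trials=trials)
print(best)
```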
Still one more step to go: copy objective.py to the directory where hyperopt-mongo-worker lives, so the worker can import it. Mine was /home/greatskull/anaconda2/bin/, as you can see in the worker's error output.
Now run main.py and connect it to the worker again using the same worker command:
hyperopt-mongo-worker --mongo=localhost:27017/foo_db --poll-interval=0.1
It should work now.
If you plan to save results from objective.py, e.g. on each iteration, check the current working directory while executing the program:
import os
print(os.getcwd())
Doubts? Ask in comments.
Tanay Agrawal
tanay_agrawal@hotmail.com
Machine Learning/Deep Learning Enthusiast