In-Class Challenge Assignment: Experimenting with the Perceptron

Day 19 Extension#

CMSE 202#

Now that you have a working Perceptron Classifier… let’s experiment with it a bit!#

When building and testing your Perceptron Classifier you used a simplified version of the iris dataset that has been reduced to just two features and two class labels.

Will your Perceptron classifier work on a more complex dataset?#

Another widely used dataset for experimenting with binary classification is the sonar dataset.

A version of this dataset can be found here:

https://raw.githubusercontent.com/msu-cmse-courses/cmse202-supplemental-data/main/data/sonar.csv

Make sure you take a moment to read the UC Irvine Machine Learning Repository page to understand exactly what is in this dataset, but essentially it is a collection of sonar measurements of rocks and “mines” (metal cylinders).

Testing your new tool and exploring others#

With any time that you have left in class, see if you can accomplish the following:

Load up the sonar dataset and change the class labels so that they can be used with the Perceptron classifier.
Use the Perceptron classifier you built from scratch to see how well you can do at distinguishing rocks from mines. You may need to make some modifications to your code if you didn’t build it to be flexible enough to accept an arbitrary number of data features. Experiment with the learning rate and number of iterations to see how high an accuracy you can get with your classifier.
If you get your Perceptron classifier working, can you figure out how to use the Perceptron Classifier that is available in scikit-learn? You may need to do a bit of Google searching and exploration of the documentation to figure this out. How well does the scikit-learn version do compared to the one you built?

The logistic regression model (Day 15) is also a multi-variable classifier. Use it on the same dataset. Compare the results of your perceptron classifier against those obtained from logistic regression and discuss your observations.

✅ Do This: Load up the sonar data.

# Put code here to start and generate new cells as needed.
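As a starting point, the sketch below shows one way to turn string class labels into the +1/-1 values a perceptron expects. It assumes the raw labels are the strings "M" (mine) and "R" (rock) and that the class label sits in the last column; check the course CSV first, since its label encoding and header row may differ. The three rows are made-up stand-ins for the real data, and `relabel` is a hypothetical helper name.

```python
import numpy as np

def relabel(raw_labels, positive="M"):
    """Map the `positive` label to +1 and every other label to -1."""
    raw_labels = np.asarray(raw_labels).astype(str)
    return np.where(raw_labels == positive, 1, -1)

# Made-up rows standing in for the real data: two features, then the label.
rows = np.array([
    [0.02, 0.37, "R"],
    [0.45, 0.11, "M"],
    [0.26, 0.58, "M"],
], dtype=object)

features = rows[:, :-1].astype(float)   # every column except the last
labels = relabel(rows[:, -1])           # last column holds the class label

print(features.shape, labels)
```

For the real file, `pandas.read_csv` with the URL above is one way to load the table before applying the same column slicing; whether a `header=None` argument is needed depends on how the CSV is laid out.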

Copy your perceptron class to the cell below. Note that sklearn also provides a class named Perceptron, so give your own class a different name to avoid a clash.

# put your code here
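If your class hard-codes two features, the sketch below may help as a comparison. It is a minimal perceptron, not the version built in class: it sizes the weight vector from the data, so it accepts any number of features, and it assumes labels of +1 and -1. The name `SimplePerceptron` is hypothetical, chosen only to avoid clashing with sklearn's `Perceptron`, and the demo data is synthetic and linearly separable by construction.

```python
import numpy as np

class SimplePerceptron:
    """Perceptron for labels in {+1, -1}; works for any number of features."""

    def __init__(self, learning_rate=0.1, n_iter=100):
        self.learning_rate = learning_rate
        self.n_iter = n_iter

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        # One weight per feature, sized from the data rather than hard-coded.
        self.weights = np.zeros(X.shape[1])
        self.bias = 0.0
        for _ in range(self.n_iter):
            errors = 0
            for xi, target in zip(X, y):
                if self.predict(xi) != target:
                    # Nudge the boundary toward the misclassified point.
                    self.weights += self.learning_rate * target * xi
                    self.bias += self.learning_rate * target
                    errors += 1
            if errors == 0:   # every training point is classified correctly
                break
        return self

    def predict(self, X):
        return np.where(np.dot(X, self.weights) + self.bias >= 0, 1, -1)

    def accuracy(self, X, y):
        return float(np.mean(self.predict(X) == np.asarray(y)))

# Synthetic, linearly separable data with five features (not the sonar data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
true_w /= np.linalg.norm(true_w)
scores = X @ true_w
keep = np.abs(scores) > 0.5            # drop points close to the boundary
X, y = X[keep], np.where(scores[keep] > 0, 1, -1)

model = SimplePerceptron(learning_rate=0.1, n_iter=1000).fit(X, y)
print(f"training accuracy: {model.accuracy(X, y):.2f}")
```

The sonar data is unlikely to be this cleanly separable, so the early exit may never trigger there and the accuracy will depend on the learning rate and iteration count you choose.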

Train your perceptron class with the sonar data. What’s the accuracy?

# put your code here

✅ Do This: Use the Perceptron class from the sklearn library to classify the same dataset in the cell below. How does the performance of the sklearn perceptron compare to that of your perceptron classifier?

# put your code here
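For orientation, here is a minimal sketch of the scikit-learn `Perceptron` interface. It runs on synthetic data from `make_classification` rather than the sonar file, so the accuracy it prints says nothing about the real dataset. The values for `eta0` (the learning rate) and `max_iter` are arbitrary starting points, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron

# Synthetic stand-in for the sonar data (same shape idea: many numeric features).
X, y = make_classification(n_samples=208, n_features=60, n_informative=10,
                           random_state=0)
y = 2 * y - 1   # map {0, 1} to {-1, +1} to match the from-scratch classifier

# eta0 is the learning rate; max_iter caps the number of passes over the data.
clf = Perceptron(eta0=0.1, max_iter=1000, tol=1e-3, random_state=0)
clf.fit(X, y)

# score() returns the mean accuracy on the data it is given.
sk_accuracy = clf.score(X, y)
print(f"scikit-learn Perceptron accuracy: {sk_accuracy:.2f}")
```

This scores the model on the same data it was trained on, which mirrors the earlier step. A held-out test set gives a fairer comparison.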

✅ Do This: Use the logistic regression model from the statsmodels library to classify the same dataset in the cell below.

Note that the full sonar dataset contains some values that cause singularity problems in the logistic regression fit. Thus, use only the first 40 attributes (columns) in the sonar dataset. The class label is still the last column in the sonar dataset.
Add a constant to the model, which is equivalent to the bias weight.
The Logit model requires the labels to be 1 or 0. You’ll need to replace ‘-1’ in the labels with ‘0’ for the Logit model.
Set test_size = 0.15 in the train-test split, and fit the model using the training set.
Predict the labels of the test set. What is the accuracy?

# put your code here

Train your perceptron class with the training set.

Don’t forget that your perceptron uses ‘-1’, not ‘0’, as the second class label. You will probably need to convert ‘0’ in the train_labels and test_labels back to ‘-1’.
Use the features in the test set in your perceptron prediction function to predict the labels.

# put your code here
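Converting the labels back can be done in one line per array. The sketch below uses made-up 0/1 arrays in place of the real `train_labels` and `test_labels`; the `_pm` suffix (for "plus/minus") is a hypothetical naming choice.

```python
import numpy as np

# Made-up 0/1 labels, as they would look after preparing them for Logit.
train_labels = np.array([1, 0, 0, 1, 0])
test_labels = np.array([0, 1])

# Replace every 0 with -1 and leave the 1 entries untouched.
train_labels_pm = np.where(train_labels == 0, -1, train_labels)
test_labels_pm = np.where(test_labels == 0, -1, test_labels)

print(train_labels_pm, test_labels_pm)
```

Writing the result to new variables keeps the 0/1 versions available in case you want to rerun the logistic regression cell.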

✅ Do This: Give a short discussion of the comparison between the results from the different classifiers.

Congratulations, we’re done!#

Now, you just need to submit this assignment by uploading it to today’s submission folder on the course Desire2Learn web page. (Don’t forget to add your names in the first cell.)