In-Class Challenge Assignment: Experimenting with the Perceptron

Day 19 Extension

CMSE 202

Now that you have a working Perceptron classifier, let’s experiment with it a bit!

When building and testing your Perceptron classifier, you used a simplified version of the iris dataset that had been reduced to just two features and two class labels.

Will your Perceptron classifier work on a more complex dataset?

Another widely used dataset for experimenting with binary classification is the sonar dataset.

A version of this dataset can be found here:

https://raw.githubusercontent.com/msu-cmse-courses/cmse202-supplemental-data/main/data/sonar_pca10_attr.csv

Make sure you take a moment to read the UC Irvine Machine Learning Repository page to understand exactly what is in this dataset. Essentially, it is a collection of sonar measurements of rocks and “mines” (metal cylinders). We are using a version of the dataset that has been reduced to 10 principal components (instead of the original 60 attributes) to make it easier to work with. The class labels are the same as in the original dataset and are located in the last column.

Testing your new tool and exploring others

In class today, try to accomplish the following:

Load up the sonar dataset and change the class labels so that they can be used with the Perceptron classifier.
Use the Perceptron classifier you built from scratch to see how well you can do at distinguishing rocks from mines. You may need to modify your code if you didn’t build it to be flexible enough to accept an arbitrary number of data features. Experiment with the learning rate and number of iterations to see how high an accuracy you can get with your classifier.
If you get your Perceptron classifier working, can you figure out how to use the Perceptron Classifier that is available in scikit-learn? You may need to do a bit of Google searching and exploration of the documentation to figure this out. How well does the scikit-learn version do compared to the one you built?

The logistic regression model (Day 15) is also a multi-variable classifier. Use it on the same dataset. Compare the results of your perceptron classifier against those obtained from logistic regression and discuss your observations.

✅ Do This: Load the sonar data. Make a variable with the features and a variable with the labels.

# Put code here to start and generate new cells as needed.
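One way the loading and relabeling step could look is sketched below. It is a minimal sketch under stated assumptions: the tiny inline table, its column names, and the "R"/"M" label values are hypothetical stand-ins, so check the actual file before relying on them. In the notebook you would pass the dataset URL given earlier to `pd.read_csv` instead of the text buffer.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the sonar CSV (made-up values and column names).
# In the notebook, pass the dataset URL to pd.read_csv instead of this buffer.
csv_text = """pc1,pc2,pc3,label
0.5,-1.2,0.3,R
-0.7,0.4,1.1,M
1.3,0.2,-0.6,R
-0.1,-0.9,0.8,M
"""
df = pd.read_csv(io.StringIO(csv_text))

# Features: every column except the last. Labels: the last column.
features = df.iloc[:, :-1].to_numpy(dtype=float)
raw_labels = df.iloc[:, -1]

# Map the two class names to -1 and +1 (assumes exactly two classes).
# The first name in sorted order becomes -1, the other becomes +1.
classes = sorted(raw_labels.unique())
labels = np.where(raw_labels == classes[0], -1, 1)

print(features.shape, labels)
```

Slicing by position rather than by column name keeps the sketch independent of whatever header the real file uses.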

Copy your perceptron class to the cell below. Note that sklearn also provides a class named Perceptron, so give your own class a different name to avoid a clash.

Note: You may need to modify your code to take the new label definitions into account.

# put your code here

Train your perceptron class with the sonar data. What’s the accuracy?

# put your code here
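For reference, here is a minimal sketch of a perceptron that accepts any number of features and reports its training accuracy. The class name `SketchPerceptron`, its parameter names, and the synthetic data are all hypothetical; your own class from the earlier assignment may be organized differently, and the real sonar data will not necessarily be linearly separable the way this toy data is.

```python
import numpy as np


class SketchPerceptron:
    """Hypothetical stand-in for your own class; labels must be -1 or +1."""

    def __init__(self, learning_rate=0.01, n_iter=50):
        self.learning_rate = learning_rate
        self.n_iter = n_iter

    def fit(self, X, y):
        # One weight per feature plus a bias, so any number of columns works.
        self.weights = np.zeros(X.shape[1])
        self.bias = 0.0
        for _ in range(self.n_iter):
            for xi, target in zip(X, y):
                # The update is zero whenever the sample is already correct.
                update = self.learning_rate * (target - self.predict(xi))
                self.weights += update * xi
                self.bias += update
        return self

    def predict(self, X):
        # Sign of the weighted sum, reported as -1 or +1.
        return np.where(X @ self.weights + self.bias >= 0.0, 1, -1)


# Synthetic 10-feature data with labels from a random linear rule
# (an assumption made only so the sketch runs without the real file).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_w = rng.normal(size=10)
y = np.where(X @ true_w >= 0, 1, -1)

model = SketchPerceptron(learning_rate=0.1, n_iter=200).fit(X, y)
accuracy = np.mean(model.predict(X) == y)
print(f"training accuracy: {accuracy:.2f}")
```

Accuracy here is simply the fraction of predictions that match the labels, which is the quantity to track while you vary the learning rate and the number of iterations.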

✅ Do This: Use the Perceptron class from the sklearn library to classify the same dataset in the cell below. How does its performance compare to that of your own perceptron classifier?

# put your code here
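A minimal sketch of the scikit-learn route follows, again on synthetic data standing in for the sonar features (an assumption so the block runs on its own). `eta0` is scikit-learn's name for the learning rate and `max_iter` caps the number of passes over the data; the specific values shown are illustrative, not recommendations.

```python
import numpy as np
from sklearn.linear_model import Perceptron

# Synthetic stand-in for the sonar features and -1/+1 labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_w = rng.normal(size=10)
y = np.where(X @ true_w >= 0, 1, -1)

# eta0 is the learning rate; max_iter limits the number of epochs.
clf = Perceptron(eta0=0.1, max_iter=1000, tol=1e-3, random_state=0)
clf.fit(X, y)

# score() returns the mean accuracy on the data it is given.
sk_accuracy = clf.score(X, y)
print(f"sklearn Perceptron training accuracy: {sk_accuracy:.2f}")
```

Importing the class under its own name here is one reason to call your hand-built class something else.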

✅ Do This: Use the logistic regression model from the statsmodels library to classify the same dataset in the cell below.

Remember that we need to add a constant to the input features, which is equivalent to the bias weight.
The Logit function requires the labels to take different values than the current [-1, 1] values (why? Discuss with your group). You’ll need to replace the -1 label to make it work.
Let’s set the test set size to 15% in the train-test split, and fit the model using the training set.
Predict the labels of the test set. What is the accuracy on the test set?

# put your code here

✅ Do This: In the cell below, figure out how the accuracy compares between the training set and the test set for this logistic regression model. Discuss your observations with your group.

# Put your code here

✅ Do This: Now, train your custom perceptron class with the training set.

Make sure your labels are correctly defined for the perceptron classifier.
Use the features in the test set in your perceptron prediction function to predict the labels.
What is the accuracy of your perceptron classifier on the test set? How does it compare to the logistic regression model?

# put your code here

✅ Do This: Provide a short discussion of the comparison between the results from the different classifiers.

Congratulations, we’re done!

Now, you just need to submit this assignment by uploading it to today’s submission folder on the course Desire2Learn web page (don’t forget to add your names in the first cell).