Mushroom Classification


Mushroom classification is a well-known ML project because all of its features are categorical. The aim of this project is to build an end-to-end ML model with database integration that stores user inputs.

DATA

This dataset was originally contributed to the UCI Machine Learning Repository nearly 30 years ago. It includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota family, drawn from The Audubon Society Field Guide to North American Mushrooms (1981). Each species is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended. This last class was combined with the poisonous one.

Technical Aspect

This is a simple Flask app for classifying mushrooms as edible or poisonous.

Tools: Python (Flask, NumPy, Pandas), HTML, CSS, MySQL

The project is divided into three parts.
  1. Exploring the data to find insights.
    • The dataset has ~8000 rows and 23 columns (all categorical features).
    • Odor is a strong indicator of edibility: edible and poisonous mushrooms have largely distinct smells. Edible mushrooms smell of almond or anise, or have no odor; any other smell indicates a poisonous mushroom. Odorless mushrooms, however, can be either edible or poisonous. Bruising does not separate the classes easily. Only edible mushrooms have an abundant or numerous population, while most mushrooms with a "several" population are poisonous. Only edible mushrooms grow in the waste habitat; in every other habitat, mushrooms can be either edible or poisonous.
    • Most mushrooms in the dataset have free gill attachment, and most fall into the broad gill-size category. Gill attachment, gill size, and gill spacing alone do not separate the classes. Only poisonous mushrooms have buff or green gill color. Only edible mushrooms have red or orange gill color.
    • Only poisonous mushrooms lack rings (ring type "none"), and only edible mushrooms have the flaring ring type. Only edible mushrooms have brown veil color and only poisonous mushrooms have yellow veil color. Every mushroom in the dataset has the partial veil type.
    • The most common stalk surface type is smooth, and the stalk is most commonly white both above and below the ring. Not all white mushrooms are edible, so white color alone does not make a mushroom safe to eat. Only edible mushrooms have gray, red, or orange stalk color above the ring. Only poisonous mushrooms have buff, cinnamon, or yellow stalk color above the ring. Below the ring, gray, red, or orange stalk color occurs only in edible mushrooms, whereas buff occurs only in poisonous ones.
    • Stalk shape does not separate the classes easily. Only edible mushrooms have a rooted stalk root. With a bulbous stalk root, a mushroom is about equally likely to be edible or poisonous.
    • Only edible mushrooms have green or purple cap-color. Only poisonous mushrooms have grooves as cap-surface. Only poisonous mushrooms have conical cap shape and only edible mushrooms have sunken cap shape. Only edible mushrooms have buff, orange, purple or yellow spore print color and only poisonous mushrooms have green spore print color.
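Observations of the form "only edible mushrooms have value X" come from counting each feature value per class. The sketch below shows one way to do that count with the standard library; the rows are hypothetical toy data, not taken from the dataset, and the function names are illustrative only. With Pandas, `pd.crosstab` would give the same table.

```python
from collections import Counter, defaultdict

# Toy rows (hypothetical, not from the real dataset): (class, odor).
rows = [
    ("edible", "almond"),
    ("edible", "anise"),
    ("edible", "none"),
    ("poisonous", "none"),
    ("poisonous", "foul"),
    ("poisonous", "pungent"),
    ("edible", "none"),
]

def class_counts_by_value(rows):
    """Count how often each feature value appears in each class."""
    counts = defaultdict(Counter)
    for label, value in rows:
        counts[value][label] += 1
    return counts

def single_class_values(counts):
    """Return {value: class} for values that occur in exactly one class."""
    return {
        value: next(iter(by_class))
        for value, by_class in counts.items()
        if len(by_class) == 1
    }

counts = class_counts_by_value(rows)
pure = single_class_values(counts)

# Values missing from `pure` appear in both classes and are reported as mixed.
for value in sorted(counts):
    print(value, dict(counts[value]), "->", pure.get(value, "mixed"))
```

A value that maps to a single class is a candidate rule, while a mixed value (such as "none" in the toy rows) needs other features to decide.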

  2. Training classification model
    • Encoded categorical features
    • Applied ML models: Logistic Regression, Decision Tree Classifier, KNN Classifier, SVM, Random Forest Classifier, Gradient Boosting Classifier, Extreme Gradient Boosting, Naive Bayes Classifier
    • Most of the models overfit the data, so a chi-square test was applied to identify the important features.
    • The best model was a linear SVM, with an accuracy of 0.913.
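The chi-square step can be illustrated by computing the Pearson statistic on a feature-versus-class contingency table. This is a minimal sketch under stated assumptions: the tables and the names `feature_a` and `feature_b` are hypothetical, and the project does not say which implementation it used (scikit-learn's `SelectKBest` with `chi2` is one common choice).

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the feature and the class were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are feature values, columns are (edible, poisonous).
tables = {
    "feature_a": [[40, 10], [10, 40]],  # value strongly tied to the class
    "feature_b": [[25, 25], [25, 25]],  # value unrelated to the class
}

scores = {name: chi_square_statistic(t) for name, t in tables.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranked)
```

A larger statistic means the feature's values are distributed less independently of the class, so features can be ranked by score and the weakest ones dropped. A formal test would also convert the statistic to a p-value using the table's degrees of freedom.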

  3. Building a Flask web app for end users.
    • Saved the best model as a pickle file and developed the web app using Flask
    • Created database and table as per the data requirements
    • Integrated web page with MySQL to store user inputs for prediction
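Storing each submitted form next to its prediction could look like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a database server. The table name `user_inputs`, the three feature columns, and the helper names are assumptions for illustration, not the project's actual schema. With a MySQL driver such as PyMySQL or mysql-connector-python, the placeholder would be `%s` instead of `?`.

```python
import sqlite3

# Hypothetical subset of the form fields; a real table would have one
# column per feature the model uses.
FEATURES = ("odor", "gill_color", "spore_print_color")

def create_table(conn):
    """Create the table that holds submitted inputs and predictions."""
    columns = ", ".join(f"{name} TEXT NOT NULL" for name in FEATURES)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_inputs ("
        f"id INTEGER PRIMARY KEY, {columns}, prediction TEXT NOT NULL)"
    )

def save_input(conn, form, prediction):
    """Store one submitted form and its prediction; return the new row id."""
    # Column names come from the fixed FEATURES tuple, never from the user;
    # user-supplied values are passed only through placeholders.
    names = ", ".join(FEATURES)
    placeholders = ", ".join("?" for _ in FEATURES)
    values = [form[name] for name in FEATURES]
    cur = conn.execute(
        f"INSERT INTO user_inputs ({names}, prediction) "
        f"VALUES ({placeholders}, ?)",
        values + [prediction],
    )
    conn.commit()
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
create_table(conn)

# The prediction string is a placeholder, not real model output.
row_id = save_input(
    conn,
    {"odor": "almond", "gill_color": "brown", "spore_print_color": "purple"},
    prediction="edible",
)
stored = conn.execute(
    "SELECT odor, prediction FROM user_inputs WHERE id = ?", (row_id,)
).fetchone()
print(stored)
```

In a Flask view, `form` would typically be built from `request.form`, and the prediction would come from the unpickled model before the row is saved.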

Demo

