{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!-- TODO review here 8/11/23 -->\n",
    "Running Python code requires a running Python kernel. Click the {fa}`rocket` --> {guilabel}`Live Code` button above on this page to run the code below.\n",
    "\n",
    "```{warning}\n",
    "🚧 This site is under construction! As of now, the Python kernel may not run on the page or have very long wait times. Also, expect typos.👷🏽‍♀️\n",
    "```\n",
    "(sup_class_ex)=\n",
    "# Example: Supervised Classification App\n",
    "\n",
    "A supervised classification method fits the project requirements well and is so a good place to start. The nature of your Data and organizational needs dictate which methods you can use. So what type of data works with supervised classification methods? \n",
    "\n",
    "- One of the features (columns) contains mutually exclusive *categories* you want to predict (the dependent variable).\n",
    "- At least one other feature (the independent variable(s)).\n",
    "\n",
    ":::{margin}\n",
    "Classifying non-mutually exclusive categories is called *multi-label* or *mult-output* classification. Not to be confused with *multiclass* classification presented in this example, multi-label classification requires different techniques, particularly with measuring accuracy. See [Introduction to Multi-label Classification](https://www.geeksforgeeks.org/an-introduction-to-multilabel-classification/) for more information.     \n",
    ":::\n",
    "\n",
    "This will be a simple example. Simple data. Simple model. Simple interface. However, it does demonstrate the minimum requirements for [part C](task2c). We'll also show how things can progressively be improved, building on the *working* code. Simple is a great place to start -scaling up is typically easier than going in the other direction. \n",
    "\n",
    "<p float=\"center\">\n",
    "  <img src='https://raw.githubusercontent.com/ashejim/C964/main/url_images/iris_dim.png' height=\"250\" alt =\"Purple iris with arrows denoting the width and length of the petal and sepal.\"/>\n",
    "  <img src='https://raw.githubusercontent.com/ashejim/C964/main/url_images/plot_iris_svc.png' height=\"250\" alt=\"Gird of four panels showing 2D regions defining iris type classification using independent variables sepal length and width determined by SVC using linear, linearSVC, RBF, and degree 3 polynomial kernels. The data points in each panel are divided into blue, light-blue, and red regions.\"/> \n",
    "</p>\n",
    "\n",
    "Let's look at the famous [Fisher's Iris data set](https://en.wikipedia.org/wiki/Iris_flower_data_set): "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "b053efd1",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>sepal-length</th>\n",
       "      <th>sepal-width</th>\n",
       "      <th>petal-length</th>\n",
       "      <th>petal-width</th>\n",
       "      <th>type</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>5.1</td>\n",
       "      <td>3.5</td>\n",
       "      <td>1.4</td>\n",
       "      <td>0.2</td>\n",
       "      <td>Iris-setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>4.9</td>\n",
       "      <td>3.0</td>\n",
       "      <td>1.4</td>\n",
       "      <td>0.2</td>\n",
       "      <td>Iris-setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>4.7</td>\n",
       "      <td>3.2</td>\n",
       "      <td>1.3</td>\n",
       "      <td>0.2</td>\n",
       "      <td>Iris-setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4.6</td>\n",
       "      <td>3.1</td>\n",
       "      <td>1.5</td>\n",
       "      <td>0.2</td>\n",
       "      <td>Iris-setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5.0</td>\n",
       "      <td>3.6</td>\n",
       "      <td>1.4</td>\n",
       "      <td>0.2</td>\n",
       "      <td>Iris-setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>145</th>\n",
       "      <td>6.7</td>\n",
       "      <td>3.0</td>\n",
       "      <td>5.2</td>\n",
       "      <td>2.3</td>\n",
       "      <td>Iris-virginica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>146</th>\n",
       "      <td>6.3</td>\n",
       "      <td>2.5</td>\n",
       "      <td>5.0</td>\n",
       "      <td>1.9</td>\n",
       "      <td>Iris-virginica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>147</th>\n",
       "      <td>6.5</td>\n",
       "      <td>3.0</td>\n",
       "      <td>5.2</td>\n",
       "      <td>2.0</td>\n",
       "      <td>Iris-virginica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>148</th>\n",
       "      <td>6.2</td>\n",
       "      <td>3.4</td>\n",
       "      <td>5.4</td>\n",
       "      <td>2.3</td>\n",
       "      <td>Iris-virginica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>149</th>\n",
       "      <td>5.9</td>\n",
       "      <td>3.0</td>\n",
       "      <td>5.1</td>\n",
       "      <td>1.8</td>\n",
       "      <td>Iris-virginica</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     sepal-length  sepal-width  petal-length  petal-width            type\n",
       "0             5.1          3.5           1.4          0.2     Iris-setosa\n",
       "1             4.9          3.0           1.4          0.2     Iris-setosa\n",
       "2             4.7          3.2           1.3          0.2     Iris-setosa\n",
       "3             4.6          3.1           1.5          0.2     Iris-setosa\n",
       "4             5.0          3.6           1.4          0.2     Iris-setosa\n",
       "..            ...          ...           ...          ...             ...\n",
       "145           6.7          3.0           5.2          2.3  Iris-virginica\n",
       "146           6.3          2.5           5.0          1.9  Iris-virginica\n",
       "147           6.5          3.0           5.2          2.0  Iris-virginica\n",
       "148           6.2          3.4           5.4          2.3  Iris-virginica\n",
       "149           5.9          3.0           5.1          1.8  Iris-virginica"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "#We'll import libraries as needed, but when submitting, \n",
    "# it's best having them all at the top.\n",
    "import pandas as pd\n",
    "\n",
    "# Load this well-worn dataset:\n",
    "url = \"https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv\"\n",
    "df = pd.read_csv(url) #read CSV into Python as a DataFrame\n",
    "df # displays the DataFrame\n",
    "\n",
    "column_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'type']\n",
    "df = pd.read_csv(url, names = column_names) #read CSV into Python as a DataFrame\n",
    "pd.options.display.show_dimensions = False #suppresses dimension output\n",
    "display(df)\n",
    "#Code hide and toggle managed with Jupyter meta-code 'tags.'"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "3e1087d3",
   "metadata": {},
   "source": [
    "Though we described everything as \"simple,\" we'll also see that this dataset is quite *rich* with angles to investigate. At this point, we have many options, but for a classification project we need a categorical feature as our dependent variable, and for this, we only have the choice: **type**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "9edff8f0-c39e-41f0-ba20-738072a29da3",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_0f91b_row0_col4, #T_0f91b_row1_col4, #T_0f91b_row2_col4, #T_0f91b_row3_col4, #T_0f91b_row4_col4, #T_0f91b_row5_col4, #T_0f91b_row6_col4, #T_0f91b_row7_col4, #T_0f91b_row8_col4, #T_0f91b_row9_col4, #T_0f91b_row10_col4 {\n",
       "  background-color: yellow;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_0f91b\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_0f91b_level0_col0\" class=\"col_heading level0 col0\" >sepal-length</th>\n",
       "      <th id=\"T_0f91b_level0_col1\" class=\"col_heading level0 col1\" >sepal-width</th>\n",
       "      <th id=\"T_0f91b_level0_col2\" class=\"col_heading level0 col2\" >petal-length</th>\n",
       "      <th id=\"T_0f91b_level0_col3\" class=\"col_heading level0 col3\" >petal-width</th>\n",
       "      <th id=\"T_0f91b_level0_col4\" class=\"col_heading level0 col4\" >type</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
       "      <td id=\"T_0f91b_row0_col0\" class=\"data row0 col0\" >5.1</td>\n",
       "      <td id=\"T_0f91b_row0_col1\" class=\"data row0 col1\" >3.5</td>\n",
       "      <td id=\"T_0f91b_row0_col2\" class=\"data row0 col2\" >1.4</td>\n",
       "      <td id=\"T_0f91b_row0_col3\" class=\"data row0 col3\" >0.2</td>\n",
       "      <td id=\"T_0f91b_row0_col4\" class=\"data row0 col4\" >Iris-setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
       "      <td id=\"T_0f91b_row1_col0\" class=\"data row1 col0\" >4.9</td>\n",
       "      <td id=\"T_0f91b_row1_col1\" class=\"data row1 col1\" >3.0</td>\n",
       "      <td id=\"T_0f91b_row1_col2\" class=\"data row1 col2\" >1.4</td>\n",
       "      <td id=\"T_0f91b_row1_col3\" class=\"data row1 col3\" >0.2</td>\n",
       "      <td id=\"T_0f91b_row1_col4\" class=\"data row1 col4\" >Iris-setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
       "      <td id=\"T_0f91b_row2_col0\" class=\"data row2 col0\" >4.7</td>\n",
       "      <td id=\"T_0f91b_row2_col1\" class=\"data row2 col1\" >3.2</td>\n",
       "      <td id=\"T_0f91b_row2_col2\" class=\"data row2 col2\" >1.3</td>\n",
       "      <td id=\"T_0f91b_row2_col3\" class=\"data row2 col3\" >0.2</td>\n",
       "      <td id=\"T_0f91b_row2_col4\" class=\"data row2 col4\" >Iris-setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
       "      <td id=\"T_0f91b_row3_col0\" class=\"data row3 col0\" >4.6</td>\n",
       "      <td id=\"T_0f91b_row3_col1\" class=\"data row3 col1\" >3.1</td>\n",
       "      <td id=\"T_0f91b_row3_col2\" class=\"data row3 col2\" >1.5</td>\n",
       "      <td id=\"T_0f91b_row3_col3\" class=\"data row3 col3\" >0.2</td>\n",
       "      <td id=\"T_0f91b_row3_col4\" class=\"data row3 col4\" >Iris-setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
       "      <td id=\"T_0f91b_row4_col0\" class=\"data row4 col0\" >5.0</td>\n",
       "      <td id=\"T_0f91b_row4_col1\" class=\"data row4 col1\" >3.6</td>\n",
       "      <td id=\"T_0f91b_row4_col2\" class=\"data row4 col2\" >1.4</td>\n",
       "      <td id=\"T_0f91b_row4_col3\" class=\"data row4 col3\" >0.2</td>\n",
       "      <td id=\"T_0f91b_row4_col4\" class=\"data row4 col4\" >Iris-setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row5\" class=\"row_heading level0 row5\" >...</th>\n",
       "      <td id=\"T_0f91b_row5_col0\" class=\"data row5 col0\" >...</td>\n",
       "      <td id=\"T_0f91b_row5_col1\" class=\"data row5 col1\" >...</td>\n",
       "      <td id=\"T_0f91b_row5_col2\" class=\"data row5 col2\" >...</td>\n",
       "      <td id=\"T_0f91b_row5_col3\" class=\"data row5 col3\" >...</td>\n",
       "      <td id=\"T_0f91b_row5_col4\" class=\"data row5 col4\" >...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row6\" class=\"row_heading level0 row6\" >145</th>\n",
       "      <td id=\"T_0f91b_row6_col0\" class=\"data row6 col0\" >6.7</td>\n",
       "      <td id=\"T_0f91b_row6_col1\" class=\"data row6 col1\" >3.0</td>\n",
       "      <td id=\"T_0f91b_row6_col2\" class=\"data row6 col2\" >5.2</td>\n",
       "      <td id=\"T_0f91b_row6_col3\" class=\"data row6 col3\" >2.3</td>\n",
       "      <td id=\"T_0f91b_row6_col4\" class=\"data row6 col4\" >Iris-virginica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row7\" class=\"row_heading level0 row7\" >146</th>\n",
       "      <td id=\"T_0f91b_row7_col0\" class=\"data row7 col0\" >6.3</td>\n",
       "      <td id=\"T_0f91b_row7_col1\" class=\"data row7 col1\" >2.5</td>\n",
       "      <td id=\"T_0f91b_row7_col2\" class=\"data row7 col2\" >5.0</td>\n",
       "      <td id=\"T_0f91b_row7_col3\" class=\"data row7 col3\" >1.9</td>\n",
       "      <td id=\"T_0f91b_row7_col4\" class=\"data row7 col4\" >Iris-virginica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row8\" class=\"row_heading level0 row8\" >147</th>\n",
       "      <td id=\"T_0f91b_row8_col0\" class=\"data row8 col0\" >6.5</td>\n",
       "      <td id=\"T_0f91b_row8_col1\" class=\"data row8 col1\" >3.0</td>\n",
       "      <td id=\"T_0f91b_row8_col2\" class=\"data row8 col2\" >5.2</td>\n",
       "      <td id=\"T_0f91b_row8_col3\" class=\"data row8 col3\" >2.0</td>\n",
       "      <td id=\"T_0f91b_row8_col4\" class=\"data row8 col4\" >Iris-virginica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row9\" class=\"row_heading level0 row9\" >148</th>\n",
       "      <td id=\"T_0f91b_row9_col0\" class=\"data row9 col0\" >6.2</td>\n",
       "      <td id=\"T_0f91b_row9_col1\" class=\"data row9 col1\" >3.4</td>\n",
       "      <td id=\"T_0f91b_row9_col2\" class=\"data row9 col2\" >5.4</td>\n",
       "      <td id=\"T_0f91b_row9_col3\" class=\"data row9 col3\" >2.3</td>\n",
       "      <td id=\"T_0f91b_row9_col4\" class=\"data row9 col4\" >Iris-virginica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_0f91b_level0_row10\" class=\"row_heading level0 row10\" >149</th>\n",
       "      <td id=\"T_0f91b_row10_col0\" class=\"data row10 col0\" >5.9</td>\n",
       "      <td id=\"T_0f91b_row10_col1\" class=\"data row10 col1\" >3.0</td>\n",
       "      <td id=\"T_0f91b_row10_col2\" class=\"data row10 col2\" >5.1</td>\n",
       "      <td id=\"T_0f91b_row10_col3\" class=\"data row10 col3\" >1.8</td>\n",
       "      <td id=\"T_0f91b_row10_col4\" class=\"data row10 col4\" >Iris-virginica</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x20204957810>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "##preserves Jupyter preview style (the '...') after applying .style. This is for presentation only. \n",
    "def display_df(dataframe, column_names, highlighted_col, precision=2):\n",
    "    pd.set_option(\"display.precision\", 2)\n",
    "    columns_dict = {}\n",
    "    for i in column_names:\n",
    "        columns_dict[i] ='...'\n",
    "    df2 = pd.concat([dataframe.iloc[:5,:],\n",
    "                       pd.DataFrame(index=['...'], data=columns_dict),\n",
    "                       dataframe.iloc[-5:,:]]).style.format(precision = precision).set_properties(subset=[highlighted_col], **{'background-color': 'yellow'})\n",
    "    pd.options.display.show_dimensions = True\n",
    "    display(df2)\n",
    "\n",
    "#display dataframe with highlighted column \n",
    "display_df(df, column_names, 'type', 1)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "96de1903-332e-49f8-b7e7-96b50fc940ee",
   "metadata": {
    "tags": []
   },
   "source": [
    ":::{sidebar} Watch\n",
    "<iframe src=\"https://wgu.hosted.panopto.com/Panopto/Pages/Embed.aspx?id=ae4e4987-3196-4b67-9752-ae010137b64c&autoplay=false&offerviewer=true&showtitle=true&showbrand=true&captions=true&interactivity=all\" title=\"Simple ML supervised classification example \" style=\"border: 1px solid #464646;\" class=\"center\" allowfullscreen allow=\"autoplay\" alt= \"Preview screenshot for the video: Simple machine learning supervised classification coding python example by Dr. Jim Ashe. You can click on the image to play the video.\">\n",
    "</iframe>\n",
    ":::\n",
    "\n",
    "The highlighted column, **type** provides a category to predict/classify (dependent variables), and the non-highlighted columns are something by which to make that prediction/classification (independent variables)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.1"
  },
  "vscode": {
   "interpreter": {
    "hash": "3ff4b9f9a77e43d422b45ad0e34f66a3a995e732d437005df0ccbc0093bddc0e"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}