# This just repeats the data importation and model training from the previous section
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import svm
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
column_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'type']
df = pd.read_csv(url, names = column_names) #read CSV into Python as a dataframe
X = df.drop(columns=['type']).values #independent variables
#Converting X to an array avoids the missing-feature-names UserWarning when predict is later given a plain 2D array.
y = df[['type']].copy() #dependent variable
y = y['type'].values #converts y to a 1D array.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.333, random_state=41)
svm_model = svm.SVC(gamma='scale', C=1) #Creates an SVM model object. Note: 'scale' and 1.0 are the defaults for gamma and C, respectively
svm_model.fit(X_train,y_train)
SVC(C=1)
User Interface#
We’ve made an application. Now we need a way for the user to apply it. There are no specific requirements for how this must be done. Following the User Guide in your documentation, the evaluator must be able to get it to work and meet the needs of the problem described in the documentation. We’ll present a few options. Remember that simpler interfaces need a more detailed User Guide.
User inputs and runs code#
The user can be instructed to input and run code. If you do this, provide explicit instructions and provide an example that can be copied and pasted.
To make a prediction for variables x1, x2, x3, and x4, type
print(svm_model.predict([[x1,x2,x3,x4]]))
into the code cell below and press the ‘Run’ button in the menu. Example:
# Note that the model was trained with X = df.drop(columns=['type']).values
print(svm_model.predict([[5, 4, 1, .5]]))
['Iris-setosa']
Note
When making predictions, the input should look exactly like the data the model was trained with. In this example, ‘svm_model’ was trained with a 2D array. Hence the double brackets, [[5,4,1,.5]], which avoid a ValueError: Expected 2D array, got 1D array instead, and the conversion of X with .values, which avoids a UserWarning for missing feature names.
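One way to guard against the shape error is to build the input through a small helper. The sketch below is only an illustration: `make_sample` is a hypothetical helper name (not part of scikit-learn), and it assumes the model expects exactly four numeric features in the same column order used for training.

```python
def make_sample(sepal_length, sepal_width, petal_length, petal_width):
    """Return one flower's measurements as a 2D (1 row x 4 columns) nested list.

    Hypothetical helper: the outer list is the "samples" axis and the inner
    list is the "features" axis, which is the layout predict() expects.
    """
    row = [float(sepal_length), float(sepal_width),
           float(petal_length), float(petal_width)]
    return [row]


sample = make_sample(5, 4, 1, .5)
print(sample)  # [[5.0, 4.0, 1.0, 0.5]]

# With the model from this section in memory, the call would be:
# print(svm_model.predict(sample))
```

Converting each value with `float` also means that text typed by a user, such as "5.1", is either turned into a number or rejected with an error before it reaches the model.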
User inputs with Widget sliders#
Sliders provide a user-friendly experience that can easily be modified to control input range and increments. While sliders might not be the best choice here (text entry might be easier for selecting precise values), we’ll present an example as, in many cases, sliders work great.
Implementation is almost identical to that of text entry. Reviewing the data’s statistics, we set the sliders’ ranges to capture approximately 95% of the flower’s parameter values:
\(\text{range}= \text{mean}\pm 2(\text{standard deviation})\)
This assumes each feature is approximately normally distributed (it’s close enough for our purposes). Using 3 standard deviations to capture 99.7% of the data might have been better, but you get the idea.
For example,
df.describe()
| | sepal-length | sepal-width | petal-length | petal-width |
|---|---|---|---|---|
| count | 150.000000 | 150.000000 | 150.000000 | 150.000000 |
| mean | 5.843333 | 3.054000 | 3.758667 | 1.198667 |
| std | 0.828066 | 0.433594 | 1.764420 | 0.763161 |
| min | 4.300000 | 2.000000 | 1.000000 | 0.100000 |
| 25% | 5.100000 | 2.800000 | 1.600000 | 0.300000 |
| 50% | 5.800000 | 3.000000 | 4.350000 | 1.300000 |
| 75% | 6.400000 | 3.300000 | 5.100000 | 1.800000 |
| max | 7.900000 | 4.400000 | 6.900000 | 2.500000 |
feature = 'petal-width'
r_max = str(df[feature].describe()['mean']+2*df[feature].describe()['std'])
r_min = str(df[feature].describe()['mean']-2*df[feature].describe()['std'])
print('min='+r_min +', max='+r_max)
min=-0.3276548167350155, max=2.724988150068349
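The same calculation can be repeated for every feature at once. The following is a minimal sketch, assuming the means and standard deviations are copied by hand from the df.describe() output above. `slider_range` is a hypothetical helper, and clamping the lower bound at zero is our own choice, since a length cannot be negative.

```python
# Means and standard deviations copied from the df.describe() output.
stats = {
    'sepal-length': (5.843333, 0.828066),
    'sepal-width': (3.054000, 0.433594),
    'petal-length': (3.758667, 1.764420),
    'petal-width': (1.198667, 0.763161),
}


def slider_range(mean, std, k=2):
    """Return (low, high) = mean -/+ k*std, rounded to 2 decimals.

    The lower bound is clamped at 0 because lengths cannot be negative.
    """
    low = max(0.0, mean - k * std)
    high = mean + k * std
    return round(low, 2), round(high, 2)


ranges = {name: slider_range(m, s) for name, (m, s) in stats.items()}
for name, (low, high) in ranges.items():
    print(f'{name}: min={low}, max={high}')
```

Passing `k=3` gives the wider three-standard-deviation range. Because the results are rounded, they may differ slightly from limits that were typed in by hand.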
The ranges for the other independent variables are found the same way and used to set up the sliders. Because a petal width cannot be negative, its lower bound is set to 0.
import ipywidgets as widgets
from IPython.display import display

#The sliders where the user can input values. Min and max are set from the complete dataset's statistics (approximately mean ± 2 standard deviations).
sl_widget = widgets.FloatSlider(description='sepal L:', min=4.19, max=7.4)
sw_widget = widgets.FloatSlider(description='sepal W:', min=2.19, max=3.9)
pl_widget = widgets.FloatSlider(description='petal L:', min=0.23, max=7.29)
pw_widget = widgets.FloatSlider(description='petal W:', min=0.0, max=2.72)
#A button for the user to get predictions using the input values.
button_predict = widgets.Button(description='Predict')
button_output = widgets.Label(value='Enter values and press the "Predict" button.')
#Defines what happens when you click the button
def on_click_predict(b):
    prediction = svm_model.predict([[
        sl_widget.value, sw_widget.value, pl_widget.value, pw_widget.value]])
    button_output.value = 'Prediction = ' + str(prediction[0])
button_predict.on_click(on_click_predict)
#Displays the sliders, button, and output label inside a VBox
vb = widgets.VBox([sl_widget, sw_widget, pl_widget, pw_widget, button_predict, button_output])
print('\033[1m' + 'Enter parameter values (in cm) and make a prediction:' + '\033[0m')
display(vb)
Enter parameter values (in cm) and make a prediction:
To update values automatically from a widget, see “get the current value of a widget” and “automatically run code after altering widgets.”
Other Input Methods#
What user interface approaches are allowed? Anything the evaluator (playing the role of the client) can get to work following your instructions. Consider user-friendliness when choosing your interface. With four independent variables, four input boxes or sliders work fine. But what if your model uses 400 variables? Or say the client needed to classify not one but hundreds of flowers? In such cases, the user could be directed to copy or upload their data, say a .xlsx or .csv file, to a location the model can retrieve and analyze. Whatever method you choose, don’t make things difficult for the evaluator. Provide explicit instructions, examples, or (when appropriate) example data files.
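As a rough illustration of the file-based approach, the sketch below reads many flowers from a .csv source and collects them into the 2D layout the model expects. It is a sketch under stated assumptions, not a required design: `read_measurements` and the file name `flowers.csv` are hypothetical, and the file is assumed to have no header row and exactly four numeric columns in the training order.

```python
import csv
import io


def read_measurements(csv_file, n_features=4):
    """Parse rows of numeric measurements from an open CSV file object.

    Returns a list of rows (a 2D structure). Blank lines are skipped and a
    ValueError is raised if a row has the wrong number of columns.
    """
    rows = []
    for line_number, record in enumerate(csv.reader(csv_file), start=1):
        if not record:
            continue  # ignore blank lines
        if len(record) != n_features:
            raise ValueError(
                f'line {line_number}: expected {n_features} values, '
                f'got {len(record)}')
        rows.append([float(value) for value in record])
    return rows


# Stand-in for an uploaded file; in practice something like
# open('flowers.csv', newline='') would be used instead.
example_file = io.StringIO('5.0,4.0,1.0,0.5\n6.3,2.9,5.6,1.8\n')
samples = read_measurements(example_file)
print(samples)  # [[5.0, 4.0, 1.0, 0.5], [6.3, 2.9, 5.6, 1.8]]

# With the trained model in memory, all rows can be classified in one call:
# print(svm_model.predict(samples))
```

Reporting the offending line number gives the evaluator a concrete hint when an uploaded file does not match the format described in the User Guide.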