A. Train and Test Sets

Imbalanced data typically refers to a problem with classification tasks where the classes are not represented equally. Most classification data sets do not have an exactly equal number of instances in each class, but a small difference often does not matter. You therefore need to make sure that both classes of wine are present in the training data. What's more, the number of instances of the two wine types needs to be more or less equal, so that you do not favour one class over the other in your predictions.
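One way to check and preserve the class balance is a stratified split. A minimal sketch, assuming a `wines` DataFrame with a `type` column like the one used later in this tutorial (the DataFrame below is a hypothetical stand-in):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the wines DataFrame used in this tutorial:
# a 'type' column encodes red (0) vs. white (1) wine.
rng = np.random.default_rng(42)
wines = pd.DataFrame({
    "fixed_acidity": rng.normal(size=120),
    "type": [0] * 90 + [1] * 30,  # deliberately imbalanced
})

# First, inspect the class balance
print(wines["type"].value_counts())

# A stratified split keeps the red/white proportions (here 75%/25%)
# the same in the train and test sets, so neither class is favoured.
train, test = train_test_split(
    wines, test_size=0.33, random_state=42, stratify=wines["type"]
)
print(train["type"].mean(), test["type"].mean())
```

With `stratify` set, both splits keep roughly the same fraction of each wine type; without it, a small or unlucky split can end up dominated by one class.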
—————————————————————————
# Import `train_test_split` from `sklearn.model_selection`
from sklearn.model_selection import train_test_split

# Specify the data (`.ix` is deprecated; use `.iloc` instead)
X = wines.iloc[:, 0:11]

# Specify the target labels and flatten the array
y = np.ravel(wines.type)

# Split the data up in train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
—————————————————————————

Standardization is a way to deal with feature values that lie far apart. The scikit-learn package offers a great and quick way of getting your data standardized: import the StandardScaler module from sklearn.preprocessing.

—————————————————————————
# Import `StandardScaler` from `sklearn.preprocessing`
from sklearn.preprocessing import StandardScaler

# Define the scaler
scaler = StandardScaler().fit(X_train)

# Scale the train set
X_train = scaler.transform(X_train)

# Scale the test set
X_test = scaler.transform(X_test)
—————————————————————————

B. Creating the Model

We start with the Keras Sequential model: a linear stack of layers. You can easily create the model by passing a list of layer instances to the constructor, or by running model = Sequential() and adding layers one by one. The model will be implemented as a multilayer perceptron network, whose structure consists of an input layer, some hidden layers and an output layer.
We need to take into account that the first layer has to make the input shape clear. The model needs to know what input shape to expect, which is why the input_shape, input_dim, input_length, or batch_size arguments are used to pass that information. In this case, we are using a Dense layer, which is a fully connected layer. Dense layers implement the following operation: output = activation(dot(input, kernel) + bias).
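The Dense operation above can be spelled out in plain NumPy. A minimal sketch with toy shapes matching the text (12 input features, 12 units); the random kernel stands in for weights that would normally be learned:

```python
import numpy as np

def relu(x):
    # ReLU activation: element-wise max(0, x)
    return np.maximum(0, x)

# Toy shapes matching the text: 12 input features, 12 hidden units
rng = np.random.default_rng(0)
inputs = rng.normal(size=(1, 12))   # one sample, shape (*, 12)
kernel = rng.normal(size=(12, 12))  # weight matrix, learned during training
bias = np.zeros(12)                 # one bias value per unit

# The Dense-layer operation from the text:
# output = activation(dot(input, kernel) + bias)
output = relu(np.dot(inputs, kernel) + bias)
print(output.shape)  # (1, 12)
```

Without the `relu` call, this would indeed be just the two linear operations mentioned below: a dot product and an addition.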
Note that without the activation function, the Dense layer would consist only of two linear operations: a dot product and an addition. In the first layer, the activation argument takes the value relu. Next, the input_shape is defined: this is the input to the operation described above. The model takes as input arrays of shape (12,), or (*, 12). The first layer has 12 as the first value of the units argument of Dense(), which is the dimensionality of the output space; in other words, there are 12 hidden units.
This means that the model will output arrays of shape (*, 12):this is
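The layer setup discussed so far can be sketched with the Keras Sequential API. The sigmoid output layer for a binary red/white prediction is an assumption on our part; the text above only specifies the first layer:

```python
# A minimal sketch, assuming the Keras Sequential API described above.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()

# First layer: 12 hidden units, relu activation, inputs of shape (*, 12)
model.add(Dense(12, activation='relu', input_shape=(12,)))

# Hypothetical output layer: one unit for a binary wine-type prediction
model.add(Dense(1, activation='sigmoid'))

model.summary()
```

The first Dense layer contributes 12 × 12 weights plus 12 biases, and the output layer 12 weights plus 1 bias, which `model.summary()` reports per layer.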