In the previous post, we trained DynaML's feed forward neural networks on the wine quality data set. Let's now see how a single layer feed forward neural network stacks up against a simple logistic regression model trained using gradient descent. The TestLogisticWineQuality program in the examples package does precisely that (check out the source code below).
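
Before looking at the results, here is a rough picture of what such a model does under the hood: batch gradient descent on the logistic loss with an L2 penalty. The sketch below is plain Scala written purely for illustration; it does not use DynaML's classes, and the stepSize, maxIt and regularization names simply mirror the arguments of TestLogisticWineQuality. Feature vectors are assumed to already include a constant 1 for the bias term.

// Illustrative sketch (not DynaML's API): binary logistic regression
// trained with batch gradient descent and an L2 penalty.
object LogisticGDSketch {

  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // One full pass of gradient descent over the training set.
  def step(data: Seq[(Array[Double], Double)],
           w: Array[Double],
           stepSize: Double,
           regularization: Double): Array[Double] = {
    val n    = data.size.toDouble
    val grad = Array.fill(w.length)(0.0)
    data.foreach { case (x, y) =>
      val err = sigmoid(dot(w, x)) - y            // prediction error for this sample
      x.indices.foreach(i => grad(i) += err * x(i) / n)
    }
    // The L2 penalty shrinks the weights towards zero.
    w.indices.map(i => w(i) - stepSize * (grad(i) + regularization * w(i))).toArray
  }

  def train(data: Seq[(Array[Double], Double)],
            dim: Int, stepSize: Double, maxIt: Int, regularization: Double): Array[Double] =
    (1 to maxIt).foldLeft(Array.fill(dim)(0.0)) {
      (w, _) => step(data, w, stepSize, regularization)
    }
}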

Red Wine

TestLogisticWineQuality(stepSize = 0.2, maxIt = 120,
  mini = 1.0, training = 800,
  test = 800, regularization = 0.2,
  wineType = "red")
16/04/01 15:21:57 INFO BinaryClassificationMetrics: Classification Model Performance
16/04/01 15:21:57 INFO BinaryClassificationMetrics: ============================
16/04/01 15:21:57 INFO BinaryClassificationMetrics: Accuracy: 0.8475
16/04/01 15:21:57 INFO BinaryClassificationMetrics: Area under ROC: 0.7968417788802267
16/04/01 15:21:57 INFO BinaryClassificationMetrics: Maximum F Measure: 0.7493563745371187

[Figure: ROC curve for the red wine logistic regression model]

[Figure: F-measure vs. decision threshold for the red wine logistic regression model]
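
The F-measure plot traces the F1 score as a function of the decision threshold, and the "Maximum F Measure" line in the log is the peak of that curve. A minimal sketch of that computation, assuming scored holds (score, label) pairs with labels coded as 0.0 and 1.0, could look like this (again plain Scala, not DynaML's metrics classes):

// Sweep the decision threshold and report the best F1 score,
// which is what the "Maximum F Measure" log line summarises.
object FMeasureSweep {

  def fMeasureAt(scored: Seq[(Double, Double)], threshold: Double): Double = {
    val tp = scored.count { case (s, y) => s >= threshold && y == 1.0 }
    val fp = scored.count { case (s, y) => s >= threshold && y == 0.0 }
    val fn = scored.count { case (s, y) => s <  threshold && y == 1.0 }
    val precision = if (tp + fp == 0) 0.0 else tp.toDouble / (tp + fp)
    val recall    = if (tp + fn == 0) 0.0 else tp.toDouble / (tp + fn)
    if (precision + recall == 0.0) 0.0
    else 2.0 * precision * recall / (precision + recall)
  }

  // Maximum F1 over a simple grid of 101 candidate thresholds in [0, 1].
  def maxFMeasure(scored: Seq[(Double, Double)]): Double =
    (0 to 100).map(i => fMeasureAt(scored, i / 100.0)).max
}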

White Wine

TestLogisticWineQuality(stepSize = 0.26, maxIt = 300,
  mini = 1.0, training = 3800,
  test = 1000, regularization = 0.0,
  wineType = "white")
16/04/01 15:27:17 INFO BinaryClassificationMetrics: Classification Model Performance
16/04/01 15:27:17 INFO BinaryClassificationMetrics: ============================
16/04/01 15:27:17 INFO BinaryClassificationMetrics: Accuracy: 0.829
16/04/01 15:27:17 INFO BinaryClassificationMetrics: Area under ROC: 0.7184782682020251
16/04/01 15:27:17 INFO BinaryClassificationMetrics: Maximum F Measure: 0.7182203962483446

[Figure: ROC curve]

[Figure: F-measure vs. decision threshold]

Comparison with Neural Networks

Considering that a simple logistic regression model performs quite well on the data, and that logistic regression is equivalent to a neural network with no hidden layers and a single sigmoid output unit, we can train a neural net with 0 hidden layers using the TestNNWineQuality program.
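
To see why the two models are equivalent, note that a network with no hidden layers and a sigmoid output unit computes sigma(w . x + b), which is exactly the logistic regression hypothesis; the two differ only in how they are described and trained. The snippet below is a plain Scala illustration, not DynaML code:

// A zero-hidden-layer network with a sigmoid output unit and logistic
// regression compute the same function: sigma(w . x + b).
object PerceptronEquivalence {

  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // Output of the single sigmoid unit ("perceptron") ...
  def singleUnit(w: Array[Double], b: Double, x: Array[Double]): Double =
    sigmoid(w.zip(x).map { case (wi, xi) => wi * xi }.sum + b)

  // ... and the logistic regression class probability: the same expression.
  def logisticRegression(w: Array[Double], b: Double, x: Array[Double]): Double =
    sigmoid(w.zip(x).map { case (wi, xi) => wi * xi }.sum + b)
}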

TestNNWineQuality(0, List(), List("tansig"), stepSize = 0.2, maxIt = 120,
  mini = 1.0, alpha = 0.0, training = 1200, test = 400, regularization = 0.0,
  wineType = "red")
16/04/01 14:04:34 INFO BinaryClassificationMetrics: Classification Model Performance
16/04/01 14:04:34 INFO BinaryClassificationMetrics: ============================
16/04/01 14:04:34 INFO BinaryClassificationMetrics: Accuracy: 0.895
16/04/01 14:04:34 INFO BinaryClassificationMetrics: Area under ROC: 0.8209578913532626
16/04/01 14:04:34 INFO BinaryClassificationMetrics: Maximum F Measure: 0.7975192758967482

This gives performance in the same ball park as the logistic regression model. It must be noted, however, that we used a larger training set (1200 samples versus 800) and a hyperbolic tangent (tansig) activation function.
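
The tansig activation is just a rescaled logistic sigmoid, since tanh(x) = 2*sigma(2x) - 1, so the zero-hidden-layer network still induces a linear decision boundary; the difference is in the output range and the resulting gradients during training. The short check below verifies the identity numerically (plain Scala, for illustration only):

// Numerical check of the identity tanh(x) = 2 * sigmoid(2x) - 1,
// relating the tansig activation to the logistic sigmoid.
object TansigVsSigmoid extends App {

  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  val maxDiff = (-50 to 50)
    .map(_ / 10.0)                                          // grid of points in [-5, 5]
    .map(x => math.abs(math.tanh(x) - (2.0 * sigmoid(2.0 * x) - 1.0)))
    .max

  println(f"max |tanh(x) - (2*sigmoid(2x) - 1)| = $maxDiff%.2e")  // effectively zero
}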

Source Code