In this post, we use the DynaML Scala machine learning environment to train Gaussian Process models to analyse time series data taken from a coal power plant.


Abbott Power Plant: Representative Image


The Data Set

From the DaISy system identification database, we download the Abbott power plant data. The characteristics of the data are summarized below.

Description: The data comes from a model of a steam generator at Abbott Power Plant in Champaign, IL.

Sampling Interval: 3 sec

Number of Samples: 9600

Inputs:
1. Fuel (scaled 0-1)
2. Air (scaled 0-1)
3. Reference level (inches)
4. Disturbance defined by the load level

Outputs:
5. Drum pressure (PSI)
6. Excess oxygen in exhaust gases (%)
7. Level of water in the drum
8. Steam flow (kg/s)

Nonlinear AutoRegressive with eXogenous inputs (NARX)

In a NARX model, a candidate output signal is modeled as a function of its own previous values and the previous values of the exogenous inputs.
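To make the idea concrete, here is a minimal sketch of how lagged NARX features could be assembled from a raw time series. The object `NarxFeatures` and its method `lagged` are hypothetical helpers written for illustration, not part of DynaML, and the sketch assumes that `deltaT` (the parameter used in the model calls later in this post) denotes the number of past time steps included in each feature vector.

```scala
// Hypothetical helper for illustration only; not part of DynaML.
object NarxFeatures {

  /** Builds (features, target) pairs for a NARX model.
    *
    * @param output the output signal, one value per time step
    * @param inputs the exogenous inputs, inputs(t) holds all input values at time t
    * @param deltaT assumed here to be the number of past time steps to include
    */
  def lagged(
      output: Vector[Double],
      inputs: Vector[Vector[Double]],
      deltaT: Int
  ): Vector[(Vector[Double], Double)] = {
    require(deltaT > 0, "deltaT must be positive")
    require(output.length == inputs.length, "series must have equal length")

    // The first deltaT time steps have no complete history, so start at t = deltaT.
    (deltaT until output.length).toVector.map { t =>
      // Previous output values y(t-1), ..., y(t-deltaT)
      val pastOutputs = (1 to deltaT).toVector.map(k => output(t - k))
      // Previous exogenous inputs u(t-1), ..., u(t-deltaT), flattened
      val pastInputs = (1 to deltaT).toVector.flatMap(k => inputs(t - k))
      (pastOutputs ++ pastInputs, output(t))
    }
  }
}
```

Each resulting pair can then serve as one training example for a regression model, with the lagged values as the input vector and the current output as the target.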


Gaussian Processes

Gaussian Processes are powerful non-parametric methods for solving regression and classification problems. They are based on a structural assumption about the finite-dimensional distributions over spaces of functions: the values of the function at any finite set of inputs follow a joint Gaussian distribution.

Formulation
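As a sketch of the standard formulation, written in a notation chosen here for illustration (the symbols $m$, $k$, $K$ and $\sigma^2$ are assumptions and may differ from the notation originally used in this post), a Gaussian Process prior over a function $f$ and the corresponding noisy observation model can be stated as:

```latex
\begin{align*}
  f(x) &\sim \mathcal{GP}\big(m(x),\, k(x, x')\big) \\
  \big(f(x_1), \dots, f(x_n)\big)^\top &\sim \mathcal{N}(\mathbf{m}, K),
    \qquad \mathbf{m}_i = m(x_i), \quad K_{ij} = k(x_i, x_j) \\
  y_i &= f(x_i) + \varepsilon_i,
    \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
\end{align*}
```

Here $m$ is the mean function, $k$ is the covariance (kernel) function, and $\sigma^2$ is the variance of the observation noise.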

Posterior Predictive Distribution

In the presence of training data, one may use Bayes' theorem to calculate the posterior predictive distribution, assuming the test inputs are known.
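For reference, the standard closed-form result for Gaussian Process regression is sketched here. The notation is an assumption made for this illustration: $X$ and $\mathbf{y}$ are the training inputs and targets, $X_*$ the test inputs, $\mathbf{f}_*$ the function values at the test inputs, $K(\cdot, \cdot)$ the matrix of kernel evaluations, and $\sigma^2$ the noise variance, with a zero prior mean.

```latex
\begin{align*}
  \mathbf{f}_* \mid X, \mathbf{y}, X_* &\sim
    \mathcal{N}\big(\bar{\mathbf{f}}_*,\, \operatorname{cov}(\mathbf{f}_*)\big) \\
  \bar{\mathbf{f}}_* &= K(X_*, X)\,\big[K(X, X) + \sigma^2 I\big]^{-1}\, \mathbf{y} \\
  \operatorname{cov}(\mathbf{f}_*) &= K(X_*, X_*)
    - K(X_*, X)\,\big[K(X, X) + \sigma^2 I\big]^{-1}\, K(X, X_*)
\end{align*}
```

The posterior mean serves as the prediction at the test inputs, and the diagonal of the posterior covariance gives the predictive uncertainty of each prediction.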


For an in-depth treatment of Gaussian Processes, refer to the book.


Modelling Power Plant Outputs

Drum pressure PSI

AbottPowerPlant(new PolynomialKernel(2, 0.49), new DiracKernel(0.09),
opt = Map("globalOpt" -> "GS", "grid" -> "4", "step" -> "0.004"),
num_training = 200, num_test = 1000, deltaT = 2, column = 5)
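The model calls in this section combine a polynomial kernel with a Dirac kernel. As a rough sketch of what such a combination computes, the code below builds a covariance matrix from the two. It is not DynaML's implementation: the object `KernelSketch` is hypothetical, the form `(x . y + offset)^degree` is an assumed reading of `PolynomialKernel(2, 0.49)`, and treating the Dirac kernel as additive observation noise is likewise an assumption.

```scala
// Hypothetical sketch for illustration only; not DynaML's kernel classes.
object KernelSketch {
  type Kernel = (Vector[Double], Vector[Double]) => Double

  // Assumed polynomial kernel form: (x . y + offset)^degree
  def polynomial(degree: Int, offset: Double): Kernel =
    (x, y) => math.pow(x.zip(y).map { case (a, b) => a * b }.sum + offset, degree)

  // Dirac (white noise) kernel: returns `noise` only when both points coincide
  def dirac(noise: Double): Kernel =
    (x, y) => if (x == y) noise else 0.0

  // Covariance matrix of the training points: signal kernel plus noise kernel
  def gram(
      points: Vector[Vector[Double]],
      signal: Kernel,
      noise: Kernel
  ): Vector[Vector[Double]] =
    points.map(x => points.map(y => signal(x, y) + noise(x, y)))
}
```

Under these assumptions, the Dirac term only adds to the diagonal of the matrix (for distinct points), which is what keeps the matrix well conditioned when it is inverted during prediction.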




Excess Oxygen in exhaust gases (as %)

AbottPowerPlant(new PolynomialKernel(2, 0.49), new DiracKernel(0.09),
opt = Map("globalOpt" -> "GS", "grid" -> "4", "step" -> "0.004"),
num_training = 200, num_test = 1000, deltaT = 2, column = 6)




Level of water in the drum

AbottPowerPlant(new PolynomialKernel(2, 0.49), new DiracKernel(0.09),
opt = Map("globalOpt" -> "GS", "grid" -> "4", "step" -> "0.004"),
num_training = 200, num_test = 1000, deltaT = 2, column = 7)




Steam Flow (kg/s)

AbottPowerPlant(new PolynomialKernel(2, 0.49), new DiracKernel(0.09),
opt = Map("globalOpt" -> "GS", "grid" -> "4", "step" -> "0.004"),
num_training = 200, num_test = 1000, deltaT = 2, column = 8)




Source Code

Below is the example program as a GitHub gist; the original program is available in the DynaML repository.