In this post, we use the DynaML Scala machine learning environment to train Gaussian Process models to analyse time series data taken from a coal power plant.

### The Data Set

From the DaISy system identification database, we download the Abbott power plant data. The characteristics of the data set are summarized below.

**Description**:
The data comes from a model of a steam generator at Abbott Power Plant in Champaign, IL.

**Sampling Interval**:
3 sec

**Number of Samples**:
9600

**Inputs**:
1. Fuel (scaled 0-1)
2. Air (scaled 0-1)
3. Reference level (inches)
4. Disturbance defined by the load level

**Outputs**:
5. Drum pressure (PSI)
6. Excess oxygen in exhaust gases (%)
7. Level of water in the drum
8. Steam flow (kg/s)

### Nonlinear AutoRegressive with eXogenous inputs (NARX)

In a NARX model, a candidate output signal is modeled as a function of its own previous values and the previous values of the exogenous inputs.
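Written out, with $p$ denoting the number of lags, this reads $y_t = f(y_{t-1}, \ldots, y_{t-p}, u_{t-1}, \ldots, u_{t-p}) + \epsilon_t$. As a minimal sketch of how such lagged feature vectors could be assembled in plain Scala (the object `NarxFeatures` and its `build` method are hypothetical names for illustration and are not part of DynaML; the `deltaT` argument used in the calls further down plausibly plays the role of $p$, but that is an assumption):

```scala
object NarxFeatures {
  /** Build (features, target) pairs for a NARX model.
    *
    * @param outputs the output signal y_0, y_1, ...
    * @param inputs  the exogenous input vectors u_0, u_1, ...
    * @param lags    the number of previous time steps p to include
    */
  def build(outputs: Vector[Double],
            inputs: Vector[Vector[Double]],
            lags: Int): Vector[(Vector[Double], Double)] = {
    require(lags > 0, "lags must be positive")
    require(outputs.length == inputs.length, "signals must have equal length")
    (lags until outputs.length).toVector.map { t =>
      // Previous outputs y_{t-1}, ..., y_{t-lags}
      val pastY = (1 to lags).toVector.map(i => outputs(t - i))
      // Previous exogenous inputs u_{t-1}, ..., u_{t-lags}, flattened
      val pastU = (1 to lags).toVector.flatMap(i => inputs(t - i))
      (pastY ++ pastU, outputs(t))
    }
  }
}
```

Each resulting pair can then be handed to any regression model, with the first element as the input vector and the second as the target.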

### Gaussian Processes

Gaussian Processes are powerful non-parametric methods for solving regression and classification problems. They rest on a structural assumption about the finite-dimensional distributions over spaces of functions: any finite collection of function values is jointly Gaussian.

#### Formulation
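A sketch of the usual formulation for regression, in commonly used notation (the symbols $m$, $K$ and $\sigma^2$ are assumed here): the latent function $f$ is given a Gaussian Process prior with mean function $m$ and covariance (kernel) function $K$, and the observations are corrupted by independent Gaussian noise.

$$
\begin{aligned}
y &= f(x) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma^2) \\
f &\sim \mathcal{GP}\big(m(x), K(x, x')\big) \\
\begin{pmatrix} f(x_1) \\ \vdots \\ f(x_n) \end{pmatrix}
  &\sim \mathcal{N}(\mathbf{m}, \mathbf{K}), \qquad
  \mathbf{m}_i = m(x_i), \quad \mathbf{K}_{ij} = K(x_i, x_j)
\end{aligned}
$$

The last line is the structural assumption in concrete form: for any finite set of inputs $x_1, \ldots, x_n$, the corresponding function values follow a multivariate normal distribution whose mean and covariance are read off from $m$ and $K$.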

#### Posterior Predictive Distribution

Given training data, one may use *Bayes' Theorem* to calculate the posterior predictive distribution, assuming the test inputs are known.
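For regression with Gaussian noise, this distribution has a well-known closed form. As a sketch, assuming a zero prior mean and writing $X$ for the training inputs, $\mathbf{y}$ for the training targets, $X_*$ for the test inputs, $K(\cdot, \cdot)$ for the kernel matrix between two sets of inputs and $\sigma^2$ for the noise variance (all notation assumed here):

$$
\begin{aligned}
\mathbf{f}_* \mid X, \mathbf{y}, X_* &\sim \mathcal{N}\big(\bar{\mathbf{f}}_*, \operatorname{cov}(\mathbf{f}_*)\big) \\
\bar{\mathbf{f}}_* &= K(X_*, X)\,\big[K(X, X) + \sigma^2 I\big]^{-1}\,\mathbf{y} \\
\operatorname{cov}(\mathbf{f}_*) &= K(X_*, X_*) - K(X_*, X)\,\big[K(X, X) + \sigma^2 I\big]^{-1}\,K(X, X_*)
\end{aligned}
$$

The predictive mean is a linear combination of the training targets, while the predictive covariance shrinks the prior covariance by the amount of information the training data provide about the test points.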

For an in-depth treatment of *Gaussian Processes*, refer to the book.

## Modelling Power Plant Outputs

### Drum pressure (PSI)

```
AbottPowerPlant(new PolynomialKernel(2, 0.49), new DiracKernel(0.09),
opt = Map("globalOpt" -> "GS", "grid" -> "4", "step" -> "0.004"),
num_training = 200, num_test = 1000, deltaT = 2, column = 5)
```
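The two kernels passed in the call above can be illustrated with their usual textbook definitions. The following is a minimal sketch in plain Scala, not DynaML's own implementation: `KernelSketch` and its methods are hypothetical names, and reading the constructor arguments as (degree, offset) for the polynomial kernel and as a noise level for the Dirac kernel is an assumption.

```scala
object KernelSketch {
  /** Inner product of two equal-length vectors. */
  def dot(x: Vector[Double], y: Vector[Double]): Double = {
    require(x.length == y.length, "vectors must have equal length")
    x.zip(y).map { case (a, b) => a * b }.sum
  }

  /** Polynomial kernel: (x . y + offset)^degree. */
  def polynomial(degree: Int, offset: Double)
                (x: Vector[Double], y: Vector[Double]): Double =
    math.pow(dot(x, y) + offset, degree)

  /** Dirac (white noise) kernel: noiseLevel when x equals y, zero otherwise. */
  def dirac(noiseLevel: Double)
           (x: Vector[Double], y: Vector[Double]): Double =
    if (x == y) noiseLevel else 0.0
}
```

Under these definitions, the polynomial kernel captures the signal part of the covariance, while the Dirac kernel only contributes on the diagonal of the kernel matrix and so acts as the observation noise term.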

### Excess oxygen in exhaust gases (%)

```
AbottPowerPlant(new PolynomialKernel(2, 0.49), new DiracKernel(0.09),
opt = Map("globalOpt" -> "GS", "grid" -> "4", "step" -> "0.004"),
num_training = 200, num_test = 1000, deltaT = 2, column = 6)
```

### Level of water in the drum

```
AbottPowerPlant(new PolynomialKernel(2, 0.49), new DiracKernel(0.09),
opt = Map("globalOpt" -> "GS", "grid" -> "4", "step" -> "0.004"),
num_training = 200, num_test = 1000, deltaT = 2, column = 7)
```

### Steam flow (kg/s)

```
AbottPowerPlant(new PolynomialKernel(2, 0.49), new DiracKernel(0.09),
opt = Map("globalOpt" -> "GS", "grid" -> "4", "step" -> "0.004"),
num_training = 200, num_test = 1000, deltaT = 2, column = 8)
```

## Source Code

Below is the example program as a GitHub gist. To view the original program in DynaML, click here.