This function generates a realistic, irregularly sampled functional dataset from given mean and covariance functions.

generate_data(
  n,
  m,
  model_mean,
  covariance,
  model_noise,
  lambda,
  ti = NULL,
  grid = seq(0, 1, length.out = 101),
  p = 0.2,
  k = 1
)

Arguments

n

Number of curves to generate.

m

Mean number of observation points per curve.

model_mean

glmnet model for the mean curve.

covariance

Matrix for the covariance surface.

model_noise

Object of class 'gam' from the function learn_noise.

lambda

Value of the penalty parameter for the mean curve.

ti

Sampling points of each curve, default=NULL.

grid

Common grid for the curves, default=seq(0, 1, length.out = 101).

p

Uncertainty for the number of observations per curve, default=0.2.

k

Multiplicative factor for the noise variance, default=1.

Value

List containing n entries. Each entry represents a simulated curve and is itself a list with three entries:

  • $t the sampling points.

  • $x the observed points.

  • $x_true the values at the sampling points without noise.
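
As a rough illustration of working with this return structure, the sketch below builds a small hand-made list in the same shape (the numbers are invented for illustration and are not output of generate_data) and flattens it into a long data frame. The names curves, long and n_obs are hypothetical.

```r
# Hand-made stand-in for the output of generate_data():
# one list per curve with entries t, x and x_true (values are made up).
curves <- list(
  list(t = c(0.1, 0.4, 0.9), x = c(1.2, 0.8, 1.1), x_true = c(1.0, 0.9, 1.0)),
  list(t = c(0.2, 0.7),      x = c(0.5, 0.3),      x_true = c(0.6, 0.2))
)

# Stack all curves into one long data frame, tagging each row with a curve id.
long <- do.call(rbind, lapply(seq_along(curves), function(i) {
  data.frame(
    id     = i,
    t      = curves[[i]]$t,
    x      = curves[[i]]$x,
    x_true = curves[[i]]$x_true
  )
}))

# Number of observation points per curve (irregular by design).
n_obs <- vapply(curves, function(cv) length(cv$t), integer(1))
```

The long format is convenient for plotting or for passing the simulated curves to modelling functions that expect one row per observation.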

Details

The data are generated as

$$X = \mu + \Sigma u + \epsilon,$$

where \(\mu\) is the mean function, \(\Sigma\) is the square root of the covariance matrix, and \(u\) and \(\epsilon\) are normal random variables. Heteroscedasticity is allowed through the model_noise argument.
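
The formula above can be sketched in base R for a single curve. This is a minimal illustration under stated assumptions, not the package's implementation: the mean function, the covariance kernel, the noise standard deviation and the way sampling points are drawn are all hypothetical choices, and the symmetric square root via eigen() is only one possible way to obtain \(\Sigma\).

```r
set.seed(1)

grid <- seq(0, 1, length.out = 101)

# Hypothetical mean function and covariance surface on the grid.
mu <- sin(2 * pi * grid)
K  <- exp(-outer(grid, grid, function(s, t) (s - t)^2) / 0.02)

# Symmetric square root of K via its eigendecomposition.
# Tiny negative eigenvalues from rounding are clamped to zero.
e     <- eigen(K, symmetric = TRUE)
Sigma <- e$vectors %*% diag(sqrt(pmax(e$values, 0))) %*% t(e$vectors)

# Noise-free curve: X_true = mu + Sigma u, with u standard normal.
u      <- rnorm(length(grid))
x_true <- as.vector(mu + Sigma %*% u)

# Heteroscedastic noise: the standard deviation varies along the grid
# (assumed form, chosen only for illustration).
noise_sd <- 0.1 * (1 + grid)
eps      <- rnorm(length(grid), sd = noise_sd)

# Irregular sampling: keep a random subset of 40 grid points.
idx   <- sort(sample(seq_along(grid), size = 40))
curve <- list(t = grid[idx], x = (x_true + eps)[idx], x_true = x_true[idx])
```

The resulting curve has the same three entries as one element of the returned list, which makes it easy to compare against actual output of generate_data.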

Examples

if (FALSE) {
if (interactive()) {
  attach(powerconsumption)
  # Learn the mean curve, covariance surface and noise model from the data
  mod <- learn_mean(df = powerconsumption, k = 50)
  cov <- learn_covariance(powerconsumption, 'lm')
  coefs <- learn_noise(df = powerconsumption)
  # Simulate 10 curves with about 40 observation points each
  df <- generate_data(n = 10, m = 40, model_mean = mod, covariance = cov,
                      model_noise = coefs, lambda = exp(-3.5),
                      ti = NULL, grid = seq(0, 1, length.out = 101),
                      p = 0.2, k = 1)
}
}