mfd class

Let us show how the funcharts package works through an
example with the dataset air, which has been included from
the R package FRegSigCom and is used in the paper of Qi and
Luo (2019).
NOTE: since the objective of this vignette is only to illustrate how the package works, in the following we will use only 5 basis functions and a fixed smoothing parameter to reduce the computational time.

mfd class

We provide the mfd class for multivariate functional
data. It inherits from the fd class but provides some
additional features:

It forces the coef argument to be an array even when the number of
functional observations and/or the number of functional variables is one.
It provides a subsetting operator [ that never drops dimensions, so it
always returns an mfd object with a three-dimensional coef array;
moreover, it allows extracting observations and variables by name.

The first step is to obtain the mfd object from discrete
data. We currently allow two types of input with the two functions:

get_mfd_data.frame: the first input must be a data frame in
the long format, with:
    an arg column giving the argument (x) values,
    an id column indicating the functional observation,
    columns giving the y values of the functional variables.
get_mfd_list: the first input must be a list of matrices,
for the case in which all functional data are observed on the same grid.

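To make the two input layouts concrete, the following base R sketch builds a toy list of matrices and reshapes it into the long format. It does not call any funcharts function; the variable names X1 and X2 and all values are made up for illustration, and the matrix orientation (rows are observations, columns are grid points) is assumed from the way the air dataset is used in this vignette.

```r
# Toy data: 3 functional observations of 2 variables on a common grid of 4 points.
# X1, X2 and their values are hypothetical, used only for illustration.
grid <- 1:4
mat_list <- list(
  X1 = matrix(1:12, nrow = 3, ncol = 4),     # rows = observations, columns = grid points
  X2 = matrix(101:112, nrow = 3, ncol = 4)
)

# Long format: one row per (observation, grid point) pair,
# with one column per functional variable.
n_obs <- nrow(mat_list$X1)
long_df <- data.frame(
  id  = rep(seq_len(n_obs), times = length(grid)),
  arg = rep(grid, each = n_obs)
)
for (v in names(mat_list)) {
  # as.vector() unrolls a matrix column by column, matching id/arg above
  long_df[[v]] <- as.vector(mat_list[[v]])
}

# Sort so that each observation's curve is stored contiguously
long_df <- long_df[order(long_df$id, long_df$arg), ]
rownames(long_df) <- NULL
head(long_df)
```

The list layout is more compact when every curve shares the same grid, while the long layout carries the grid explicitly in the arg column.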
In this example, the dataset air is in the second format
(list of matrices, with data observed on the same grid):

library(funcharts)
data("air")
fun_covariates <- names(air)[names(air) != "NO2"]
mfdobj_x <- get_mfd_list(air[fun_covariates],
                         grid = 1:24,
                         n_basis = 5,
                         lambda = 1e-2)

In order to perform the statistical process monitoring analysis, we divide the dataset into a phase I and a phase II dataset.

rows1 <- 1:300
rows2 <- 301:355
mfdobj_x1 <- mfdobj_x[rows1]
mfdobj_x2 <- mfdobj_x[rows2]

Now we extract the scalar response variable, i.e. the mean of
NO2 at each observation:
y <- rowMeans(air$NO2)
y1 <- y[rows1]
y2 <- y[rows2]

We also provide plotting functions using ggplot2.

plot_mfd(mfdobj_x1)

plot_mfd(mfdobj_x1[1:10, c("CO", "C6H6")])

The layer geom_mfd, which is basically a geom_line, can be added to
ggplot() to plot functional data. It can also plot
the original raw data when the argument
type_mfd = "raw" is supplied. geom_mfd accepts the
argument data, which must be a data frame with two
columns, id and var, so that aesthetic mappings can be used,
for example, to colour different functions according
to columns in this data frame.
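
library(dplyr)    # provides %>% and mutate()
library(ggplot2)  # provides ggplot() and aes()

# Flag the observations whose id is greater than 100
dat <- data.frame(id = unique(mfdobj_x1$raw$id)) %>%
  mutate(id_greater_than_100 = as.numeric(id) > 100)

# Colour each curve according to the flag
ggplot() +
  geom_mfd(mapping = aes(col = id_greater_than_100),
           mfdobj = mfdobj_x1,
           data = dat,
           alpha = .2,
           lwd = .3)

For class mfd we provide a function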
pca_mfd, which is a wrapper to pca.fd. It
returns multivariate functional principal component scores summed over
variables (fda::pca.fd instead returns an array of scores when
given a multivariate functional data object). Moreover, the
eigenfunctions, i.e. the multivariate functional principal components, stored in the
harmonics element are converted to the mfd
class. We also provide a plot function for the eigenfunctions (the
argument harm selects which components to plot).

pca <- pca_mfd(mfdobj_x1)
plot_pca_mfd(pca, harm = 1:3)
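
The idea of scores "summed over variables" can be sketched in base R. This is only an illustration, not the code used inside pca_mfd: the array below is simulated, and its layout (observations by components by variables) is an assumption made for the example.

```r
# Hypothetical score array: 4 observations x 2 components x 3 variables.
# The values are simulated; only the shape matters for this illustration.
set.seed(1)
scores_array <- array(rnorm(4 * 2 * 3), dim = c(4, 2, 3))

# Sum over the third (variable) dimension, keeping the first two,
# to obtain one score per observation and component.
scores <- rowSums(scores_array, dims = 2)
dim(scores)
```

Collapsing the variable dimension gives a single score matrix, which is easier to use in downstream monitoring or regression steps than a three-dimensional array.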