This vignette is a copy and slightly edited FAQ-entries and other short tutorials from the old page. The content comes is slightly random order, sorry about that.
If you want to use a prior for the hyperparameter that is not yet implemented there are two choices. If you think that your prior should be on the list and that other might use it to, please let us know. Alternatively, you can define your own prior using \verb|prior = “expression: ….”|, or by specifiying a table of \(x\) and \(y\) values which define the prior distribution.
There are three ways to specify prior distributions for hyperparameters in INLA:
In the following we will provide more details regarding the two last options. Finally, we will present an example illustrating (and comparing) the three different possibilities by means of the log-gamma distribution for the precision parameter.
A user can specify any (univariate) prior distribution for the hyperparameter \(\theta\) by defining an expression for the log-density log \(\pi(\theta)\), as a function of the corresponding \(\theta\). It is important to be aware that \(\theta\) is on the internal scale.
The format is
expression: statement; statement; ...; return(value)
where “statement” is any regular statement (more below) and “value” is the value for the log-density of the prior, evaluated at the current value for \(\theta\).
Here, is an example defining the log-gamma distribution:
prior.expression = "expression:
a = 1;
b = 0.1;
precision = exp(log_precision);
logdens = log(b^a) - lgamma(a)
+ (a-1)*log_precision - b*precision;
log_jacobian = log_precision;
return(logdens + log_jacobian);"
Some syntax specific notes: * No white-space before “(.)” in the return statement. * A “;” is needed to terminate each expression. * A "_" is allowed in variable names.
Known functions that can be used within the expression statement are
Instead of defining a prior distribution function, it is possible to provide a table of suitable values \(x\) (internal scale) and the corresponding log-density values \(y\). INLA fits a spline through the provided points and continues with this in the succeeding computations. Note, there is no transformation into a functional form performed or required. The input-format for the table is a string, which starts with and is then followed by a block of \(x\)-values and a block of the corresponding \(y\)-values, which represent the values of the log-density evaluated on x. Thus
table: x_1 ... x_n y_1 ... y_n
We illustrate all three different ways of defining a prior distribution for the precision of a normal likelihood. To show that the three definitions lead to the same result we inspect the logmarginal likelihood.
## the loggamma-prior
prior.function = function(log_precision) {
a = 1;
b = 0.1;
precision = exp(log_precision);
logdens = log(b^a) - lgamma(a) + (a-1)*log_precision - b*precision;
log_jacobian = log_precision;
return(logdens + log_jacobian)
}
## implementing the loggamma-prior using "expression:"
prior.expression = "expression:
a = 1;
b = 0.1;
precision = exp(log_precision);
logdens = log(b^a) - lgamma(a)
+ (a-1)*log_precision - b*precision;
log_jacobian = log_precision;
return(logdens + log_jacobian);"
## use suitable support points x
lprec = seq(-10, 10, len=100)
## link the x and corresponding y values into a
## string which begins with "table:""
prior.table = paste(c("table:", cbind(lprec,
prior.function(lprec))), collapse=" ", sep="")
# simulate some data
n = 50
y = rnorm(n)
## use the built-in loggamma prior
r1 = inla(y~1,data = data.frame(y),
control.family = list(hyper = list(prec = list(
prior = "loggamma", param = c(1, 0.1)))))
## use the definition using expression
r2 = inla(y~1, data = data.frame(y),
control.family = list(hyper = list(
prec = list(prior = prior.expression))))
## use a table of x and y values representing the loggamma prior
r3 = inla(y~1, data = data.frame(y),
control.family = list(hyper = list(
prec = list(prior = prior.table))))
print(round(c(r1$mlik[1], r2$mlik[1], r3$mlik[1]), dig=3))
## [1] -70.657 -70.657 -70.657
Yes, the type of link function is given in the statement using , and the type of link-functions implemented are listed on the documentation for each likelihood. The default link is which corresponds to the second link function in the list. Here is an example
n = 100
z = rnorm(n)
eta = 1 + 0.1*z
N = 2
p = inla.link.invlogit(eta)
y = rbinom(n, size = N, prob = p)
r = inla(y ~ 1 + z, data = data.frame(y, z), family = "binomial", Ntrials = rep(N, n),
control.family = list(control.link = list(model="logit")),
control.predictor = list(compute=TRUE))
p = inla.link.invprobit(eta)
y = rbinom(n, size = N, prob = p)
rr = inla(y ~ 1 + z, data = data.frame(y, z), family = "binomial", Ntrials = rep(N, n),
control.family = list(control.link = list(model="probit")),
control.predictor = list(compute=TRUE))
p = inla.link.invcloglog(eta)
y = rbinom(n, size = N, prob = p)
rrr = inla(y ~ 1 + z, data = data.frame(y, z), family = "binomial", Ntrials = rep(N, n),
control.family = list(control.link = list(model="cloglog")),
control.predictor = list(compute=TRUE))
Other linkfunctions/models are also avilable from within , see
In there is no function as for in . Predictions must to done as a part of the model fitting itself. As prediction is the same as fitting a model with some missing data, we can simply set for those “locations” we want to predict. Here is a simple example
n = 100
n.pred = 10
y = arima.sim(n=n, model=list(ar=0.9))
N = n + n.pred
yy = c(y, rep(NA, n.pred))
i = 1:N
formula = yy ~ f(i, model="ar1")
r = inla(formula, data = data.frame(i,yy),
control.family = list(initial = 10, fixed=TRUE)) ## no observational noise
which gives predictions
r$summary.random$i[(n+1):N, c("mean", "sd") ]
## mean sd
## 101 2.813609 1.577280
## 102 2.540447 1.788873
## 103 2.298450 1.947177
## 104 2.083634 2.069371
## 105 1.892564 2.165538
## 106 1.722280 2.242245
## 107 1.570227 2.304051
## 108 1.434194 2.354250
## 109 1.312265 2.395293
## 110 1.202776 2.429044
Quantiles such like and , if computed, use the identity link if by default. If you want the computed with a different link function, then there are two ways to doit.
In the case you want to use the link-function from the likelihood already used (most often the case), there is the argument in . If the response , then set , to indicate that you want to compute that fitted value using the link function from . With several likelihoods, set to family-index which is correct, ie the column number in the response. The following example shows the usage:
## simple poisson regression
n = 100
x = sort(runif(n))
eta = 1 + x
lambda = exp(eta)
y = rpois(n, lambda = lambda)
## missing values:
y[1:3] = NA
y[(n-2):n] = NA
## link = 1 is a shortcut for rep(1, n) where n is the appropriate
## length. here '1' is a reference to the first 'family', ie
## 'family[1]'
r = inla(y ~ 1 + x, family = "poisson",
data = data.frame(y, x),
control.predictor = list(link = 1))
plot(exp(eta),type ="l")
points(r$summary.fitted.values$mean, pch=19)
We only need to define where there are missing values. Entries for which the observation is not , is ignored.
For more than one likelihood, use ‘2’ to refer to the second likelihood. Here is an example where we split the data in two, and assign the second half the distribution.
n2 = n %/% 2L
Y = matrix(NA, n, 2)
Y[1:n2, 1] = y[1:n2]
Y[1:n2 + n2, 2] = y[1:n2 + n2]
link = rep(NA, n)
link[which(is.na(y[1:n2]))] = 1
link[n2 + which(is.na(y[1:n2 + n2]))] = 2
r = inla(Y ~ 1 + x, family = c("poisson", "nbinomial"),
data = list(Y=Y, x=x),
control.predictor = list(link = link))
plot(exp(eta),type ="l")
points(r$summary.fitted.values$mean, pch=19)
We can transform marginals manually using the function or compute expectations using , like in this example (taken from ).
## Load the data
data(Tokyo)
summary(Tokyo)
## y n time
## Min. :0.0000 Min. :1.000 Min. : 1.00
## 1st Qu.:0.0000 1st Qu.:2.000 1st Qu.: 92.25
## Median :0.0000 Median :2.000 Median :183.50
## Mean :0.5246 Mean :1.997 Mean :183.50
## 3rd Qu.:1.0000 3rd Qu.:2.000 3rd Qu.:274.75
## Max. :2.0000 Max. :2.000 Max. :366.00
Tokyo$y[300:366] <- NA
## Define the model
formula = y ~ f(time, model="rw2", scale.model=TRUE,
constr=FALSE, cyclic=TRUE,
hyper = list(prec=list(prior="pc.prec",
param=c(2,0.01)))) -1
## We'll get a warning since we have not defined the link argument
result = inla(formula, family="binomial", Ntrials=n, data=Tokyo,
control.predictor=list(compute=T))
## need to recompute the fitted values for those with data[i] = NA,
## as the identity link is used.
n = 366
fitted.values.mean = numeric(n)
for(i in 1:366) {
if (is.na(Tokyo$y[i])) {
if (FALSE) {
## either like this, which is slower
marg = inla.marginal.transform(
function(x) exp(x)/(1+exp(x)),
result$marginals.fitted.values[[i]] )
fitted.values.mean[i] = inla.emarginal(function(x) x, marg)
} else {
## or like this, which is faster
fitted.values.mean[i] = inla.emarginal(
function(x) exp(x)/(1 +exp(x)),
result$marginals.fitted.values[[i]])
}
} else {
fitted.values.mean[i] = result$summary.fitted.values[i,"mean"]
}
}
plot(fitted.values.mean)
Some of the models in needs the user to specify a graph, saying which nodes are neighbours to each other. A ‘graph’ can be specified in three different ways.
A graph defined in an ascii-file, must have the following format. The first entry is the number of nodes in the graph, \(n\). The nodes in the graph are labelled \(1, 2, \ldots, n\). The next entries, specify the number of neighbours and the neighbours for each node. A simple example is the following
4
1 2 3 4
2 0
3 1 1
4 1 1
This defines a graph with four nodes, where node 1 has 2 neighbours 3 and 4, node 2 as 0 neighbours, node 3 has 1 neighbour 1, and node 4 has 1 neighbour 1, and the graph looks like this
g = inla.read.graph("4 1 2 3 4 2 0 3 1 1 4 1 1")
plot(g)
Note that we need to specify the implied symmetry as well. In this example 4 is a neighbour of 1, then we also need to specify that 1 is a neighbour of 4.
Instead of storing the graph specification on a file, it can also be specified as a character string with the same contents as a file, like
"4 1 2 3 4 2 0 3 1 1 4 1 1"
as used in above.
Due to imitations of the length of a string/line, so in practice, this way specifying the graph, seems more useful for teaching or demonstration purposes than for practical analysis.
Within , this would look like
formula = y ~ f(idx, model = "besag", graph = "graph.dat")
or
formula = y ~ f(idx, model = "besag", graph = "4 1 2 3 4 2 0 3 1 1 4 1 1")
A graph can also be defined as a symmetric (dense or sparse) matrix, where the non-zero pattern of the matrix defines the graph A neighbour matrix is often used for defining which nodes that are neighbours, with the convention that if then \(i\) and \(j\) are neighbours if \(i\not=j\).
For example, the (dense) matrix C
C = matrix(c(1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1),4,4)
C
## [,1] [,2] [,3] [,4]
## [1,] 1 0 1 1
## [2,] 0 1 0 0
## [3,] 1 0 1 0
## [4,] 1 0 0 1
defines the same graph show above.
Since graphs tends to be large, we often define them as a sparse matrix
C.sparse= inla.as.sparse(C)
C.sparse
## 4 x 4 sparse Matrix of class "dgTMatrix"
##
## [1,] 1 . 1 1
## [2,] . 1 . .
## [3,] 1 . 1 .
## [4,] 1 . . 1
We can then use or in the formula.
We can also define the graph as an -object, which is used internally, represent a graph. For example
str(g)
## List of 4
## $ n : int 4
## $ nnbs: num [1:4] 2 0 1 1
## $ nbs :List of 4
## ..$ : int [1:2] 3 4
## ..$ : num(0)
## ..$ : int 1
## ..$ : int 1
## $ cc :List of 3
## ..$ id : int [1:4] 1 2 1 1
## ..$ n : int 2
## ..$ nodes:List of 2
## .. ..$ : int [1:3] 1 3 4
## .. ..$ : int 2
## - attr(*, "class")= chr "inla.graph"
and use as the argument.
The internal format, are as follows. is the size of the graph. are the number of neighbours to each node, list all the neighbours to each node, and the class is . The -list is for internal use only and specify the connected components in the graph.
has some functions to work with graphs, and here is a short summary.
and , read and write graphs using any of the graph specifications above.
You can plot a -object using and get a summary using . The plotting requires the package.
From a graph specification, you can generate the neighbour matrix, using . You can plot a graph specification as a neighbour matrix, using
If you have ‘errors’ in your graph, you may read it using . This is only available for a graph specification in an ascii-file.
For a formula like
formula = y ~ x + f(k, model= <some model>)
then ’s in either \(y\), \(x\) or \(k\) are treated differently.
’s in the response \(y\). If , this means that is not observed, hence gives no contribution to the likelihood.
’s in fixed effect \(x\). If this means that is not part of the linear predictor for . For fixed effects, this is equivalent to , hence internally we make this change: .
’s in random effect \(k\). If , this means that the random effect does not contribute to the linear predictor for .
’s in a factor \(x\). ’s in a factor \(x\) is not allowed unless is a level in itself, or
control.fixed = list(expand.factor.strategy = "inla")
is set. With this option, then is interpreted similarly as a fixed effect, where means no contribution from x. The effect of , is best explained with an example.
r = inla(y ~ 1 + x, data = data.frame(y=1:3, x=factor(c("a","b","c"))))
as.matrix(r$model.matrix)
## (Intercept) xb xc
## 1 1 0 0
## 2 1 1 0
## 3 1 0 1
for default value of the argument . The effect of is removed to make the corresponding matrix non-singular. If we want to expand into each of each three effects, then we can do
r = inla(y ~ 1 + x, data = data.frame(y=1:3,x=factor(c("a","b","c"))),
control.fixed = list(expand.factor.strategy="inla"))
as.matrix(r$model.matrix)
## (Intercept) xa xb xc
## 1 1 1 0 0
## 2 1 0 1 0
## 3 1 0 0 1
As we see, each level of the factor is now treated symetrically. Although the corresponding frequentist matrix is singular as we have confounding with the intercept, the Bayesian posterior is still proper with proper priors.
With a in , we get
r = inla(y ~ 1 + x, data = data.frame(y=1:3,x=factor(c("a","b",NA))),
control.fixed = list(expand.factor.strategy="inla"))
as.matrix(r$model.matrix)
## (Intercept) xa xb
## 1 1 1 0
## 2 1 0 1
## 3 1 0 0
so that the 3rd element of the linear predictor has no contribution from , as it should.
No, INLA has no generic way to “impute” or integrate-out missing covariates. You have to adjust your model to account for missing covariates, like using one of the measurement error models (“meb”, “mec”), or construct a joint model for the data and the covariates, but this is case-specific.
provides two types of leave-one-out predictive measures of fit. It is the CPO value, which is \[
\text{Prob}(y_i | y_{-i}),
\]
the PIT value \[
\text{Prob}(y_i^\text{new} \le y_i | y_{-i})
\] To enable the computation of these quantities, you will need to add the argument
control.compute=list(cpo=TRUE)
We can also compute PO values \[
\text{Prob}(y_i | y),
\]
when argument is added.
If the resulting object is , then you will find the predictive quantities as and .
Implicit assumptions made in for computations, and there are internal checks that these are satisfied. The results of these checks will appear as . In short, if then some assumption is violated, the higher the value (maximum 1) the more seriously. If then the assumptions should be ok.
You want to recompute those with non-zero failure. However, this must be done by removing from the dataset, fit the model and then predict . To provide a more efficient implementation of this, we have provided
improved.result = inla.cpo(result)
which take an inla-object which is the output from , and recompute (in an efficient way) the cpo/pit for which , and return ‘result’ with the improved estimates of cpo/pit. See for details.
Yes! This option allow to use a remote server to do the computations. In order to use this feature, you need to do some setup which is different from (various) Linux distributions, Mac and Windows. In short:
You can also submit a job on the remote server, so you do not need to sit and wait for it to finish, but you can collect the results later. Basically, you do
r = inla(..., inla.call = "submit")
which will start the job on the server. You can start many jobs, and list them using
inla.qstat()
and you can fetch the results (for the job above) using
r = inla.qget(r)
You can also delete jobs and fetch the jobs from another machine; see for further details.
HOWTO setup ssh-keys: For the unexperienced user, this is somewhat tricky; sorry about that. The easiest is to find a friend that knows this and can help you. Newer system do a lot of these things very nicely these days.
It is also possible to setup this from Windows using CYGWIN, and INLA can work with this interface as well. Please see the old web-page for details, which are long and technical. HOWEVER, I am no longer convinced that this work anymore, as I haven’t seen this is use for years. It is much much easier to use a virtual machine with Linux on Windows.
I have some linear combinations of the nodes in the latent field that I want to compute the posterior marginal of, is that possible? Yes! These are called ‘linear combinations’. There are handy functions, ‘inla.make.lincomb()’ and ‘inla.make.lincombs()’, to define one or many such linear combinations. Single linear combinations made by using ‘inla.make.lincomb()’ can easily be joined into many. Its use is easiest explained using a rather long example…
Here is the example, that explains these features.
## A simple model
n = 100
a = rnorm(n)
b = rnorm(n)
idx = 1:n
y = rnorm(n) + a + b
formula = y ~ 1 + a + b + f(idx, model="iid")
## assume we want to compute the posterior for
##
## 2 * beta_a + 3 * beta_b + idx[1] - idx[2]
##
## which we specify as follows (and giving it a unique name)
lc1 = inla.make.lincomb(a=2, b=3, idx = c(1,-1,rep(NA,n-2)))
names(lc1) = "lc1"
## strictly speaking, it is sufficient to use `idx = c(1,-1)', as the
## remaining idx's are not used in any case.
r = inla(formula, data = data.frame(a,b,y),
## add the linear combinations here
lincomb = lc1,
## force noise variance to be essiatially zero
control.family = list(initial=10, fixed=TRUE))
## to verify the result, we can compare the mean but the variance and
## marginal cannot be computed from the simpler marginals alone.
lc1.1 = 2 * r$summary.fixed["a", "mean"] + 3 * r$summary.fixed["b",
"mean"] + r$summary.random$idx$mean[1] -
r$summary.random$idx$mean[2]
lc1.2= r$summary.lincomb.derived$mean
print(round(c(lc1.1 = lc1.1, lc1.2 = lc1.2), dig=3))
## lc1.1 lc1.2
## 5.152 5.152
The marginals are available as
There is an another function which is handy for specifying many linear combinations at once, that is (note the plural s). Here each ‘row’ define one linear combination
## let wa and wb be vectors, and we want to compute the marginals for
## beta_a * wa[i] + beta_b * wb[i], for i=1..m. this is done
## conveniently as follows
m = 10
wa = runif(m)
wb = runif(m)
lc.many = inla.make.lincombs(a = wa, b=wb)
## we can give them names as well, but there are also default names, like
print(names(lc.many))
## [1] "lc01" "lc02" "lc03" "lc04" "lc05" "lc06" "lc07" "lc08" "lc09" "lc10"
r = inla(formula, data = data.frame(a,b,y),
lincomb = lc.many,
control.family = list(initial=10, fixed=TRUE))
print(round(r$summary.lincomb.derived, dig=3))
## ID mean sd 0.025quant 0.5quant 0.975quant mode kld
## lc01 1 0.402 0.040 0.324 0.402 0.481 0.402 0
## lc02 2 1.096 0.088 0.924 1.096 1.269 1.096 0
## lc03 3 0.945 0.077 0.794 0.945 1.097 0.945 0
## lc04 4 1.158 0.121 0.919 1.158 1.396 1.158 0
## lc05 5 0.389 0.034 0.323 0.389 0.456 0.389 0
## lc06 6 1.570 0.120 1.334 1.570 1.807 1.570 0
## lc07 7 1.453 0.128 1.201 1.453 1.704 1.453 0
## lc08 8 1.074 0.084 0.908 1.074 1.240 1.074 0
## lc09 9 0.974 0.073 0.829 0.974 1.117 0.974 0
## lc10 10 1.495 0.120 1.259 1.495 1.731 1.495 0
Terms like ‘idx’ above, can be added as into , where IDX is a matrix. Again, each column of the arguments define one linear combination.
There is a further option available for the derived linear combinations, that is the option to compute also the posterior correlation matrix between all the linear combinations. To activate this option, use
control.inla = list(lincomb.derived.correlation.matrix = TRUE)
and you will find the resulting posterior correlation matrix as
result$misc$lincomb.derived.correlation.matrix
Here is a small example where we compute the correlation matrix for the predicted values of a hidden AR(1) model with an intercept.
n = 100
nPred = 10
phi = 0.9
x = arima.sim(n, model = list(ar=phi)) * sqrt(1-phi^2)
y = 1 + x + rnorm(n, sd=0.1)
time = 1:(n + nPred)
Y = c(y, rep(NA, nPred))
formula = Y ~ 1 + f(time, model="ar1")
## make linear combinations which are the nPred linear predictors
B = matrix(NA, nPred, n+nPred)
for(i in 1:nPred) {
B[i, n+i] = 1
}
lcs = inla.make.lincombs(Predictor = B)
r = inla(formula, data = data.frame(Y, time),
control.predictor = list(compute=TRUE),
lincomb = lcs,
control.inla = list(lincomb.derived.correlation.matrix=TRUE))
print(round(r$misc$lincomb.derived.correlation.matrix,dig=3))
## lc01 lc02 lc03 lc04 lc05 lc06 lc07 lc08 lc09 lc10
## lc01 1.000 0.608 0.426 0.316 0.242 0.190 0.152 0.124 0.103 0.086
## lc02 0.608 1.000 0.697 0.514 0.392 0.306 0.244 0.197 0.162 0.135
## lc03 0.426 0.697 1.000 0.734 0.557 0.432 0.341 0.274 0.224 0.185
## lc04 0.316 0.514 0.734 1.000 0.754 0.581 0.456 0.363 0.294 0.241
## lc05 0.242 0.392 0.557 0.754 1.000 0.765 0.596 0.471 0.377 0.307
## lc06 0.190 0.306 0.432 0.581 0.765 1.000 0.772 0.605 0.481 0.387
## lc07 0.152 0.244 0.341 0.456 0.596 0.772 1.000 0.777 0.612 0.488
## lc08 0.124 0.197 0.274 0.363 0.471 0.605 0.777 1.000 0.780 0.616
## lc09 0.103 0.162 0.224 0.294 0.377 0.481 0.612 0.780 1.000 0.783
## lc10 0.086 0.135 0.185 0.241 0.307 0.387 0.488 0.616 0.783 1.000
The methodology needs the full conditional density for the latent field to be “near” Gaussian. This is usually achived by either replications or smoothing/“borrowing strength”. A simple example which do not have this, is the following:
n = 100
u = rnorm(n)
eta = 1 + u
p = exp(eta)/(1+exp(eta))
y = rbinom(n, size=1, prob = p)
idx = 1:n
result = inla(y ~ 1 + f(idx, model="iid",
hyper = list(prec = list(prior="pc.prec",
prior = c(1,0.01)))),
data =data.frame(y,idx), family = "binomial",
Ntrials = 1)
summary(result)
##
## Call:
## c("inla(formula = y ~ 1 + f(idx, model = \"iid\", hyper = list(prec =
## list(prior = \"pc.prec\", ", " prior = c(1, 0.01)))), family =
## \"binomial\", data = data.frame(y, ", " idx), Ntrials = 1)")
## Time used:
## Pre = 0.196, Running = 0.075, Post = 0.0162, Total = 0.287
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 0.407 0.205 0.01 0.406 0.813 0.402 0
##
## Random effects:
## Name Model
## idx IID model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant 0.975quant mode
## Precision for idx 175679.28 4301373.42 7.42 203.38 146087.59 10.29
##
## Expected number of effective parameters(stdev): 1.46(0.908)
## Number of equivalent replicates : 68.56
##
## Marginal log-Likelihood: -68.05
For each binary observation there is an iid “random effect” \(u\), and there is no smoothing/``borrowing strength’’ (apart from the weak intercept). If you plot the loglikelihood for eta for \(y=1\), say, then its an increasing function for increasing eta, so the likelihood itself would like \(\eta = \infty\). With an unknown precision for \(u\) we run into problems; INLA has a tendency to estimate a to high precision for \(u\). However, it must be noted that the model is almost singular and you’ll have a strong prior sensitivity in the (exact) results as well. There is a similar discussion in here as well for the Salamander data example.
Yes, this is possible. Essentially, you have to set the linear predictor for the first model equal to ‘u’, and then you can copy ‘u’ and use the scaling to get the regression coefficient. A simple example will illustrate the idea:
## simple example
n = 100
x1 = rnorm(n)
eta1 = 1 + x1
x2 = rnorm(n)
eta2 = 2 + 2*eta1 + 2*x2
y1 = rnorm(n, mean=eta1, sd = 0.01)
y2 = rnorm(n, mean=eta2, sd = 0.01)
## the trick is to create a vector 'u' (iid) which is
## equal to eta1, and then we can copy 'u' to
## create beta*u or beta*eta1. we do this by
## using 0 = eta1 -u + tiny.noise
formula = Y ~ -1 + intercept1 + X1 + intercept2 + f(u, w, model="iid",
hyper = list(prec = list(initial = -6, fixed=TRUE))) + f(b.eta2,
copy="u", hyper = list(beta = list(fixed = FALSE))) + X2
Y = matrix(NA, 3*n, 3)
## part 1: y1
intercept1 = rep(1, n)
X1 = x1
intercept2 = rep(NA, n)
u = rep(NA, n)
w = rep(NA, n)
b.eta2 = rep(NA, n)
X2 = rep(NA, n)
Y[1:n, 1] = y1
## part 2: 0 = eta1 - u + tiny.noise
intercept1 = c(intercept1, intercept1)
X1 = c(X1, x1)
intercept2 = c(intercept2, rep(NA, n))
u = c(u, 1:n)
w = c(w, rep(-1, n))
b.eta2 = c(b.eta2, rep(NA, n))
X2 = c(X2, rep(NA, n))
Y[n + 1:n, 2] = 0
## part 3: y2
intercept1 = c(intercept1, rep(NA, n))
X1 = c(X1, rep(NA, n))
intercept2 = c(intercept2, rep(1, n))
u = c(u, rep(NA, n))
w = c(w, rep(NA, n))
b.eta2 = c(b.eta2, 1:n)
X2 = c(X2, x2)
Y[2*n + 1:n, 3] = y2
r = inla(formula,
data = list(Y=Y, intercept1=intercept1, X1=X1,
intercept2=intercept2, u=u, w=w, b.eta2=b.eta2, X2=X2),
family = rep("gaussian", 3),
control.inla = list(h = 1e-3),
control.family = list(
list(),
list(hyper = list(prec = list(initial = 10, fixed=TRUE))),
list()))
summary(r)
##
## Call:
## c("inla(formula = formula, family = rep(\"gaussian\", 3), data = list(Y
## = Y, ", " intercept1 = intercept1, X1 = X1, intercept2 = intercept2, ",
## " u = u, w = w, b.eta2 = b.eta2, X2 = X2), control.family =
## list(list(), ", " list(hyper = list(prec = list(initial = 10, fixed =
## TRUE))), ", " list()), control.inla = list(h = 0.001))")
## Time used:
## Pre = 0.335, Running = 0.267, Post = 0.0224, Total = 0.625
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## intercept1 0.999 0.001 0.997 0.999 1.000 0.999 0
## X1 1.002 0.001 1.000 1.002 1.003 1.002 0
## intercept2 2.005 0.003 1.999 2.005 2.011 2.005 0
## X2 1.999 0.002 1.995 1.999 2.002 1.999 0
##
## Random effects:
## Name Model
## u IID model
## b.eta2 Copy
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 17278.62 131.050 16239.94 17300.20
## Precision for the Gaussian observations[3] Inf NaN 0.00 0.00
## Beta for b.eta2 2.00 0.002 1.99 2.00
## 0.975quant mode
## Precision for the Gaussian observations 18336.63 17320.81
## Precision for the Gaussian observations[3] Inf NaN
## Beta for b.eta2 2.00 2.00
##
## Expected number of effective parameters(stdev): 125.55(0.30)
## Number of equivalent replicates : 2.39
##
## Marginal log-Likelihood: 209.41
The list of latent models, likelihood and priors implemented, can be found by doing (or give a spesific section, see )
inla.list.models()
## Section [group]
## ar AR(p) correlations
## ar1 AR(1) correlations
## besag Besag model
## exchangeable Exchangeable correlations
## exchangeablepos Exchangeable positive correlations
## iid Independent model
## rw1 Random walk of order 1
## rw2 Random walk of order 2
## Section [hazard]
## iid An iid model for the log-hazard
## rw1 A random walk of order 1 for the log-hazard
## rw2 A random walk of order 2 for the log-hazard
## Section [latent]
## 2diid (This model is obsolute)
## ar Auto-regressive model of order p (AR(p))
## ar1 Auto-regressive model of order 1 (AR(1))
## ar1c Auto-regressive model of order 1 w/covariates
## besag The Besag area model (CAR-model)
## besag2 The shared Besag model
## besagproper A proper version of the Besag model
## besagproper2 An alternative proper version of the Besag model
## bym The BYM-model (Besag-York-Mollier model)
## bym2 The BYM-model with the PC priors
## clinear Constrained linear effect
## copy Create a copy of a model component
## crw2 Exact solution to the random walk of order 2
## dmatern Dense Matern field
## fgn Fractional Gaussian noise model
## fgn2 Fractional Gaussian noise model (alt 2)
## generic A generic model
## generic0 A generic model (type 0)
## generic1 A generic model (type 1)
## generic2 A generic model (type 2)
## generic3 A generic model (type 3)
## iid Gaussian random effects in dim=1
## iid1d Gaussian random effect in dim=1 with Wishart prior
## iid2d Gaussian random effect in dim=2 with Wishart prior
## iid3d Gaussian random effect in dim=3 with Wishart prior
## iid4d Gaussian random effect in dim=4 with Wishart prior
## iid5d Gaussian random effect in dim=5 with Wishart prior
## intslope Intecept-slope model with Wishart-prior
## linear Alternative interface to an fixed effect
## log1exp A nonlinear model of a covariate
## logdist A nonlinear model of a covariate
## matern2d Matern covariance function on a regular grid
## meb Berkson measurement error model
## mec Classical measurement error model
## ou The Ornstein-Uhlenbeck process
## revsigm Reverse sigmoidal effect of a covariate
## rgeneric Generic latent model spesified using R
## rw1 Random walk of order 1
## rw2 Random walk of order 2
## rw2d Thin-plate spline model
## rw2diid Thin-plate spline with iid noise
## seasonal Seasonal model for time series
## sigm Sigmoidal effect of a covariate
## slm Spatial lag model
## spde A SPDE model
## spde2 A SPDE2 model
## spde3 A SPDE3 model
## z The z-model in a classical mixed model formulation
## Section [likelihood]
## beta The Beta likelihood
## betabinomial The Beta-Binomial likelihood
## betabinomialna The Beta-Binomial Normal approximation likelihood
## bgev The blended Generalized Extreme Value likelihood
## binomial The Binomial likelihood
## cbinomial The clustered Binomial likelihood
## cenpoisson Then censored Poisson likelihood
## circularnormal The circular Gaussian likelihoood
## coxph Cox-proportional hazard likelihood
## dgp Discrete generalized Pareto likelihood
## exponential The Exponential likelihood
## exponentialsurv The Exponential likelihood (survival)
## fmri fmri distribution (special nc-chi)
## fmrisurv fmri distribution (special nc-chi)
## gamma The Gamma likelihood
## gammacount A Gamma generalisation of the Poisson likelihood
## gammajw A special case of the Gamma likelihood
## gammajwsurv A special case of the Gamma likelihood (survival)
## gammasurv The Gamma likelihood (survival)
## gaussian The Gaussian likelihoood
## gev The Generalized Extreme Value likelihood
## gp Generalized Pareto likelihood
## gpoisson The generalized Poisson likelihood
## iidgamma (experimental)
## iidlogitbeta (experimental)
## loggammafrailty (experimental)
## logistic The Logistic likelihoood
## loglogistic The loglogistic likelihood
## loglogisticsurv The loglogistic likelihood (survival)
## lognormal The log-Normal likelihood
## lognormalsurv The log-Normal likelihood (survival)
## logperiodogram Likelihood for the log-periodogram
## nbinomial The negBinomial likelihood
## nbinomial2 The negBinomial2 likelihood
## nmix Binomial-Poisson mixture
## nmixnb NegBinomial-Poisson mixture
## poisson The Poisson likelihood
## poisson.special1 The Poisson.special1 likelihood
## pom Likelihood for the proportional odds model
## qkumar A quantile version of the Kumar likelihood
## qloglogistic A quantile loglogistic likelihood
## qloglogisticsurv A quantile loglogistic likelihood (survival)
## simplex The simplex likelihood
## sn The Skew-Normal likelihoood
## stochvol The Gaussian stochvol likelihood
## stochvolnig The Normal inverse Gaussian stochvol likelihood
## stochvolt The Student-t stochvol likelihood
## t Student-t likelihood
## tstrata A stratified version of the Student-t likelihood
## tweedie Tweedie distribution
## weibull The Weibull likelihood
## weibullcure The Weibull-cure likelihood (survival)
## weibullsurv The Weibull likelihood (survival)
## wrappedcauchy The wrapped Cauchy likelihoood
## xbinomial The Binomial likelihood (expert version)
## xpoisson The Poisson likelihood (expert version)
## zeroinflatedbetabinomial0 Zero-inflated Beta-Binomial, type 0
## zeroinflatedbetabinomial1 Zero-inflated Beta-Binomial, type 1
## zeroinflatedbetabinomial2 Zero inflated Beta-Binomial, type 2
## zeroinflatedbinomial0 Zero-inflated Binomial, type 0
## zeroinflatedbinomial1 Zero-inflated Binomial, type 1
## zeroinflatedbinomial2 Zero-inflated Binomial, type 2
## zeroinflatedcenpoisson0 Zero-inflated censored Poisson, type 0
## zeroinflatedcenpoisson1 Zero-inflated censored Poisson, type 1
## zeroinflatednbinomial0 Zero inflated negBinomial, type 0
## zeroinflatednbinomial1 Zero inflated negBinomial, type 1
## zeroinflatednbinomial1strata2 Zero inflated negBinomial, type 1, strata 2
## zeroinflatednbinomial1strata3 Zero inflated negBinomial, type 1, strata 3
## zeroinflatednbinomial2 Zero inflated negBinomial, type 2
## zeroinflatedpoisson0 Zero-inflated Poisson, type 0
## zeroinflatedpoisson1 Zero-inflated Poisson, type 1
## zeroinflatedpoisson2 Zero-inflated Poisson, type 2
## zeroninflatedbinomial2 Zero and N inflated binomial, type 2
## zeroninflatedbinomial3 Zero and N inflated binomial, type 3
## Section [link]
## cauchit The cauchit-link
## cloglog The complementary log-log link
## default The default link
## identity The identity link
## inverse The inverse link
## log The log-link
## loga The loga-link
## logit The logit-link
## logitoffset Logit-link with an offset
## loglog The log-log link
## logoffset Log-link with an offset
## neglog The negative log-link
## pquantile The population quantile-link
## probit The probit-link
## quantile The quantile-link
## robit Robit link
## sn Skew-normal link
## special1 A special1-link function (experimental)
## special2 A special2-link function (experimental)
## sslogit Logit link with sensitivity and specificity
## tan The tan-link
## test1 A test1-link function (experimental)
## Section [mix]
## gaussian Gaussian mixture
## loggamma LogGamma mixture
## mloggamma Minus-LogGamma mixture
## Section [predictor]
## predictor (do not use)
## Section [prior]
## betacorrelation Beta prior for the correlation
## dirichlet Dirichlet prior
## expression: A generic prior defined using expressions
## flat A constant prior
## gamma Gamma prior
## gaussian Gaussian prior
## invalid Void prior
## jeffreystdf Jeffreys prior for the doc
## linksnintercept Skew-normal-link intercept-prior
## logflat A constant prior for log(theta)
## loggamma Log-Gamma prior
## logiflat A constant prior for log(1/theta)
## logitbeta Logit prior for a probability
## logtgaussian Truncated Gaussian prior
## logtnormal Truncated Normal prior
## minuslogsqrtruncnormal (obsolete)
## mvnorm A multivariate Normal prior
## none No prior
## normal Normal prior
## pc Generic PC prior
## pc.alphaw PC prior for alpha in Weibull
## pc.ar PC prior for the AR(p) model
## pc.cor0 PC prior correlation, basemodel cor=0
## pc.cor1 PC prior correlation, basemodel cor=1
## pc.dof PC prior for log(dof-2)
## pc.fgnh PC prior for the Hurst parameter in FGN
## pc.gamma PC prior for a Gamma parameter
## pc.gammacount PC prior for the GammaCount likelihood
## pc.gevtail PC prior for the tail in the GEV likelihood
## pc.matern PC prior for the Matern SPDE
## pc.mgamma PC prior for a Gamma parameter
## pc.prec PC prior for log(precision)
## pc.range PC prior for the range in the Matern SPDE
## pc.sn PC prior for the skew-normal
## pc.spde.GA (experimental)
## pom #classes-dependent prior for the POM model
## ref.ar Reference prior for the AR(p) model, p<=3
## table: A generic tabulated prior
## wishart1d Wishart prior dim=1
## wishart2d Wishart prior dim=2
## wishart3d Wishart prior dim=3
## wishart4d Wishart prior dim=4
## wishart5d Wishart prior dim=5
## Section [wrapper]
## joint (experimental)
We often encounter the situation where an element of a model is needed more than once for each observation. One example is where
y = a + b*w + ...
for fixed weights \(w\) and where \((a_i, b_i)\) is bivariate Normal and all 2-vectors are independent.
Using the model
f(idx, model="iid2d", n=2*m, ...)
provide a random vector \(v\), say, with length \(2m\) stored as \[ v = (a_1, a_2, ..., a_m, b_1, b_2, ...., b_m). \] To implement this, we simply create an indentical copy of \(v\), \(v^*\), where \(v == v^*\) (nearly). Using the copy-feature, we can do
idx = 1:m
idx.copy = m + 1:m
formula = y ~ f(idx, model="iid2d", n=2*m) + f(idx.copy, w, copy="idx") + ....
recalling that the first \(m\) elements is \(a\) and the last \(m\) elements are \(b\), and where \(w\) are the weights.
The second terms define itself as a copy of , and it inherit some of its features, like and .
A copied model may also have an unknown scaling (hyperparameter), which is default fixed to be 1. In the following example, we will use this feature to estimate the unknown scaling (in this case, scaling is 2), assuming we know the precision for \(z\).
n=1000
i=1:n
j = i
z = rnorm(n)
w = runif(n)
y = z + 2*z*w + rnorm(n)
formula = y ~ f(i, model="iid",initial=0, fixed=T) +
f(j, w, copy="i", fixed=FALSE)
r = inla(formula, data = data.frame(i,j,w,y))
summary(r)
##
## Call:
## "inla(formula = formula, data = data.frame(i, j, w, y))"
## Time used:
## Pre = 0.27, Running = 2.15, Post = 0.0642, Total = 2.49
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) -0.071 0.068 -0.205 -0.071 0.062 -0.071 0
##
## Random effects:
## Name Model
## i IID model
## j Copy
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 0.65 0.114 0.456 0.64
## Beta for j 1.72 0.146 1.436 1.72
## 0.975quant mode
## Precision for the Gaussian observations 0.904 0.619
## Beta for j 2.011 1.716
##
## Expected number of effective parameters(stdev): 663.66(49.41)
## Number of equivalent replicates : 1.51
##
## Marginal log-Likelihood: -2244.52
If the scaling parameter is within given range, then option \verb|range = c(low, high)|, can be given. In this case
beta = low + (high-low)*exp(beta.local)/(1+exp(beta.local))
and the prior is defined on .
If or , then the identity mapping is used. If and |verb|low!=Inf|, then the mapping is used. The case low=Inf and high!=Inf is not yet implemented.
A model or a copied model can be copied several times. The degree of closeness of \(v\) and \(v^*\) is specified by the argument , as the precision of the noise added to \(v\) to get \(v^*\).
Independent replications of a model with the same hyperparmeters can be defined using the argument ,
f(idx, model = .., replicate = r)
Here, is a vector of the same length as . In this case, we use a two-dimensional index to index this (sub-)model: , so identify the first value of the second replication of this model (component). Number of replications are defined as , unless it is defined by the argument .
One example is the model `iid’:
i = 1:n
formula = y ~ f(i, model = "iid") + ...
which has an alternative equivalent formulation as `n’ replications of an iid-model with length 1
i = rep(1,n)
r = 1:n
formula = y ~ f(i, model="iid", replicate = r) + ...
In the following example, we estimate the parameters in three AR(1) time-series with the same hyperparameters (ie its replicated) but with separate means:
n = 100
y1 = arima.sim(n=n, model=list(ar=c(0.9)))+10
y2 = arima.sim(n=n, model=list(ar=c(0.9)))+20
y3 = arima.sim(n=n, model=list(ar=c(0.9)))+30
formula = y ~ mean -1 + f(i, model="ar1", replicate=r)
y = c(y1,y2,y3)
i = rep(1:n, 3)
r = rep(1:3, each=n)
mean = as.factor(r)
result = inla(formula, family = "gaussian",
data = data.frame(y, i, mean),
control.family = list(initial = 12, fixed=TRUE))
summary(result)
##
## Call:
## c("inla(formula = formula, family = \"gaussian\", data = data.frame(y,
## ", " i, mean), control.family = list(initial = 12, fixed = TRUE))" )
## Time used:
## Pre = 0.235, Running = 0.473, Post = 0.0232, Total = 0.732
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## mean1 10.086 1.448 7.235 10.070 13.038 10.048 0.001
## mean2 18.122 1.457 15.054 18.170 20.893 18.229 0.002
## mean3 31.072 1.464 27.958 31.131 33.820 31.204 0.002
##
## Random effects:
## Name Model
## i AR1 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant 0.975quant mode
## Precision for i 0.132 0.041 0.061 0.129 0.218 0.124
## Rho for i 0.942 0.018 0.904 0.943 0.973 0.945
##
## Expected number of effective parameters(stdev): 300.00(0.00)
## Number of equivalent replicates : 1.00
##
## Marginal log-Likelihood: -431.25
All other arguments is interpreted for the basic model and also replicated. Like argument , is interpreted as each replication sums to zero, and additional constraints are also replicated.
There is no constraint in INLA that the type of likelihood must be the same for all observations. In fact, every observation could have its own likelihood. Extentions include more than one familily, like the Normal, Poisson, etc, but also having in the model groups of observations with separate hyperparameters within each group, where the family, for example, can be the same.
In the formula
y ~ a + 1
we allow to be a matrix. In this case each column of define one likelihood where the family is the same the hyperparameters are the same. For each row, only one of the columns could (but don’t have to) have an observation (non- value), the other colums must have value . All other parameters to the likelihood, like , and are used as appropriate. Example: If row \(i\) column \(j\) is a Poission observation, then is used as the scaling. Similar with the others. This works as only one column for each row is non-.
The argument is in the case where is a matrix, a list of families. The argument is then a list of lists; one for each family.
The first example, is a simple linear regression, where the first half of the data is observed with unknown precision (with a ‘default’ noninformative prior) and the second half of the data is observed with unknown precision . Otherwise, the two models have the same form for the linear predictor.
## Simple linear regression with observations with two different
## variances.
n = 100
N = 2*n
y = numeric(N)
x = rnorm(N)
y[1:n] = 1 + x[1:n] + rnorm(n, sd = 1/sqrt(1))
y[1:n + n] = 1 + x[1:n + n] + rnorm(n, sd = 1/sqrt(2))
Y = matrix(NA, N, 2)
Y[1:n, 1] = y[1:n]
Y[1:n + n, 2] = y[1:n + n]
formula = Y ~ x + 1
result = inla(
formula,
data = list(Y=Y, x=x),
family = c("gaussian", "gaussian"),
control.family = list(list(prior = "flat", param = numeric()),
list()))
summary(result)
##
## Call:
## c("inla(formula = formula, family = c(\"gaussian\", \"gaussian\"), data
## = list(Y = Y, ", " x = x), control.family = list(list(prior = \"flat\",
## param = numeric()), ", " list()))")
## Time used:
## Pre = 0.185, Running = 0.162, Post = 0.0164, Total = 0.363
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 1.029 0.051 0.929 1.029 1.128 1.029 0
## x 1.033 0.048 0.938 1.033 1.128 1.032 0
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 0.996 0.141 0.742 0.988
## Precision for the Gaussian observations[2] 3.002 0.424 2.240 2.979
## 0.975quant mode
## Precision for the Gaussian observations 1.30 0.975
## Precision for the Gaussian observations[2] 3.90 2.939
##
## Expected number of effective parameters(stdev): 2.00(0.00)
## Number of equivalent replicates : 99.88
##
## Marginal log-Likelihood: -248.42
The second example shows how to use information from two sources to estimate the effect of the covariate .
## Simple example with two types of likelihoods
n = 10
N = 2*n
## common covariates
x = rnorm(n)
## Poisson, depends on x
E1 = runif(n)
y1 = rpois(n, lambda = E1*exp(x))
## Binomial, depends on x
size = sample(1:10, size=n, replace=TRUE)
prob = exp(x)/(1+exp(x))
y2 = rbinom(n, size= size, prob = prob)
## Join them together
Y = matrix(NA, N, 2)
Y[1:n, 1] = y1
Y[1:n + n, 2] = y2
## The E for the Poisson
E = numeric(N)
E[1:n] = E1
E[1:n + n] = NA
## Ntrials for the Binomial
Ntrials = numeric(N)
Ntrials[1:n] = NA
Ntrials[1:n + n] = size
## Duplicate the covariate which is shared
X = numeric(N)
X[1:n] = x
X[1:n + n] = x
## Formula involving Y as a matrix
formula = Y ~ X - 1
## `family' is now
result = inla(formula,
family = c("poisson", "binomial"),
data = list(Y=Y, X=X),
E = E, Ntrials = Ntrials)
summary(result)
##
## Call:
## c("inla(formula = formula, family = c(\"poisson\", \"binomial\"), data
## = list(Y = Y, ", " X = X), E = E, Ntrials = Ntrials)")
## Time used:
## Pre = 0.218, Running = 0.121, Post = 0.0131, Total = 0.352
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## X 1.054 0.182 0.674 1.062 1.39 1.078 0
##
## Expected number of effective parameters(stdev): 1.00(0.00)
## Number of equivalent replicates : 20.00
##
## Marginal log-Likelihood: -28.28
If the covariate ‘x’ is different for the two families, and , say, then we only need to make the following changes
X = numeric(N)
X[1:n] = x
X[1:n + n] = NA
XX = numeric(N)
XX[1:n] = NA
XX[1:n + n] = xx
formula = Y ~ X + XX -1
and add into the data.frame. Note how we can express the joint model as a ‘union’ of models with the use of ’s to remove terms.
In the next example, we use also the feature to estimate three replicated AR(1) models with the same hyperparamters, each observed differently.
## An example with three independent AR(1)'s with separate means, but
## with the same hyperparameters. These are observed with three
## different likelihoods.
n = 100
x1 = arima.sim(n=n, model=list(ar=c(0.9))) + 0
x2 = arima.sim(n=n, model=list(ar=c(0.9))) + 1
x3 = arima.sim(n=n, model=list(ar=c(0.9))) + 2
## Binomial observations
Nt = 10 + rpois(n,lambda=1)
y1 = rbinom(n, size=Nt, prob = exp(x1)/(1+exp(x1)))
## Poisson observations
Ep = runif(n, min=1, max=10)
y2 = rpois(n, lambda = Ep*exp(x2))
## Gaussian observations
y3 = rnorm(n, mean=x3, sd=0.1)
## stack these in a 3-column matrix with NA's where not observed
y = matrix(NA, 3*n, 3)
y[1:n, 1] = y1
y[n + 1:n, 2] = y2
y[2*n + 1:n, 3] = y3
## define the model
r = c(rep(1,n), rep(2,n), rep(3,n))
rf = as.factor(r)
i = rep(1:n, 3)
formula = y ~ f(i, model="ar1", replicate=r, constr=TRUE) + rf -1
data = list(y=y, i=i, r=r, rf=rf)
## parameters for the binomial and the poisson
Ntrial = rep(NA, 3*n)
Ntrial[1:n] = Nt
E = rep(NA, 3*n)
E[1:n + n] = Ep
result = inla(formula, family = c("binomial", "poisson", "normal"),
data = data, Ntrial = Ntrial, E = E,
control.family = list(
list(),
list(),
list()))
summary(result)
##
## Call:
## c("inla(formula = formula, family = c(\"binomial\", \"poisson\",
## \"normal\"), ", " data = data, E = E, Ntrials = Ntrial, control.family
## = list(list(), ", " list(), list()))")
## Time used:
## Pre = 0.259, Running = 0.526, Post = 0.0292, Total = 0.814
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## rf1 -0.480 0.095 -0.670 -0.478 -0.298 -0.475 0
## rf2 1.490 0.040 1.409 1.491 1.566 1.493 0
## rf3 1.107 0.001 1.105 1.107 1.109 1.107 0
##
## Random effects:
## Name Model
## i AR1 model
##
## Model hyperparameters:
## mean sd 0.025quant
## Precision for the Gaussian observations[3] 2.10e+04 2.13e+04 1732.326
## Precision for i 2.30e-01 4.70e-02 0.144
## Rho for i 8.61e-01 2.90e-02 0.800
## 0.5quant 0.975quant mode
## Precision for the Gaussian observations[3] 1.47e+04 7.71e+04 4846.618
## Precision for i 2.28e-01 3.29e-01 0.225
## Rho for i 8.62e-01 9.13e-01 0.863
##
## Expected number of effective parameters(stdev): 241.81(2.40)
## Number of equivalent replicates : 1.24
##
## Marginal log-Likelihood: -848.11
In some cases, the data/response might depend on a linear combination of the “linear predictor” defined in the formula, like
y ~ 1 + z
then this implies that depends on . Suppose if depends on ? Although it is possible to express this, using the tools we already have, it is more convenient to add another layer into the model. Let be a matrix, which defines new linear predictors, from , like
eta~ = A %*% eta
Here, is the ordinary linear predictor defined using the formula, and the data depends on the linear predictor . So we might express this as
y ~ 1 + z, with addition matrix A
meaning in short, that
y ~ eta~ ## no intercept...
eta~ = A %*% eta
eta = intercept + beta*z
This is specified by adding the \(A\)-matrix, using
control.predictor=list(A=A)
The argument , which might be defined in the formula as or as an argument , does have different interpretation in the case where the \(A\)-matrix is used. The rule is that in the formula, goes into , whereas an argument goes into . If we write out the expressions above adding offsets, and , we get
eta~ = A %*% eta + offset.arg
eta = intercept + beta*z + offset.formula
In the case where there is no \(A\)-matrix, then \verb|offset.total = offset.arg + offset.formula|.
The following example should provide more insight. You may change \(n\) and \(m\), such that \(m < n\), \(m=n\) or \(m > n\). Note that since the response has dimension \(m\) and the covariates dimension \(n\), we need to use and not a . This example also illustrates the use of offset’s.
## 'm' is the number of observations of eta*, where eta* = A eta +
## offset.arg, and A is a fixed m x n matrix, and eta has length n. An
## offset in the formula goes into 'eta' whereas an offset in the
## argument of the inla-call, goes into eta*
n = 10
m = 100
offset.formula = 10+ 1:n
offset.arg = 1 + 1:m
## a covariate
z = runif(n)
## the linear predictor eta
eta = 1 + z + offset.formula
## the linear predictor eta* = A eta + offset.arg.
A = matrix(runif(n*m), m, n);
##A = inla.as.sparse(A) ## sparse is ok
## need 'as.vector', as 'Eta' will be a sparseMatrix if 'A' is sparse
## even if ncol(Eta) = 1
Eta = as.vector(A %*% eta) + offset.arg
s = 1e-6
Y = Eta + rnorm(m, sd=s)
## for a check, we can use several offsets. here, m1=-1 and p1=1, so
## they m1+p1 = 0.
r = inla(Y ~ 1+z + offset(offset.formula) + offset(m1) + offset(p1),
## The A-matrix defined here
control.predictor = list(A = A, compute=TRUE, precision = 1e6),
## we need to use a list() as the different lengths of Y
## and z
data = list(Y=Y, z=z,
m1 = rep(-1, n),
p1 = rep(1, n),
offset.formula = offset.formula,
offset.arg = offset.arg),
## this is the offset defined in the argument of inla
offset = offset.arg,
##
control.family = list(initial = log(1/s^2), fixed=TRUE))
## Warning in inla(Y ~ 1 + z + offset(offset.formula) + offset(m1) + offset(p1), : The A-matrix in the predictor (see ?control.predictor) is specified
## but an intercept is in the formula. This will likely result
## in the intercept being applied multiple times in the model, and is likely
## not what you want. See ?inla.stack for more information.
## You can remove the intercept adding ``-1'' to the formula.
summary(r)
##
## Call:
## c("inla(formula = Y ~ 1 + z + offset(offset.formula) + offset(m1) + ",
## " offset(p1), data = list(Y = Y, z = z, m1 = rep(-1, n), p1 = rep(1, ",
## " n), offset.formula = offset.formula, offset.arg = offset.arg), ", "
## offset = offset.arg, control.predictor = list(A = A, compute = TRUE, ",
## " precision = 1e+06), control.family = list(initial = log(1/s^2), ", "
## fixed = TRUE))")
## Time used:
## Pre = 0.183, Running = 0.0685, Post = 0.0172, Total = 0.269
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) 1 0.002 0.997 1 1.003 1 0
## z 1 0.003 0.993 1 1.007 1 0
##
## Expected number of effective parameters(stdev): 100.00(0.00)
## Number of equivalent replicates : 1.00
##
## Marginal log-Likelihood: -53381.68
## Posterior marginals for the linear predictor and
## the fitted values are computed
## this should be a small number
print(max(abs(r$summary.linear.predictor$mean - c(Eta, eta))))
## [1] 8.041647e-06
Here is a another example where the informal formula is
y = intercept + s[j] + 0.5*s[k] + noise
Instead of using the copy feature, we can implement this model using the A-matrix feature. What we do, is to first define a linear predictor being the intercept and \(s\), then we use the \(A\)-matrix to ‘construct the model’.
n = 100
s = c(-1, 0, 1)
nS = length(s)
j = sample(1L:nS, n, replace=TRUE)
k = j
k[j == 1L] = 2
k[j == 2L] = 3
k[k == 3L] = 1
noise = rnorm(n, sd=0.0001)
y = 1 + s[j] + 0.5*s[k] + noise
## build the formula such that the linear predictor is the intercept
## (index 1) and the 's' term (index 2:(n+1)). then kind of
## 'construct' the model using the A-matrix.
formula = y ~ -1 + intercept + f(idx)
A = matrix(0, n, nS+1L)
for(i in 1L:n) {
A[i, 1L] = 1
A[i, 1L + j[i]] = 1
A[i, 1L + k[i]] = 0.5
}
data = list(intercept = c(1, rep(NA, nS)), idx = c(NA, 1L:nS))
result = inla(formula, data=data, control.predictor=list(A=A))
## should be a straight line
plot(result$summary.random$idx$mean, s, pch=19)
abline(a=0,b=1)