Package 'cbass' reference manual

Title:	Classification -- Bayesian Adaptive Smoothing Splines
Description:	Fit multiclass Classification version of Bayesian Adaptive Smoothing Splines (CBASS) to data using reversible jump MCMC. The multiclass classification problem consists of a response variable that takes on unordered categorical values with at least three levels, and a set of inputs for each response variable. The CBASS model consists of a latent multivariate probit formulation, and the means of the latent Gaussian random variables are specified using adaptive regression splines. The MCMC alternates updates of the latent Gaussian variables and the spline parameters. All the spline parameters (variables, signs, knots, number of interactions), including the number of basis functions used to model each latent mean, are inferred. Functions are provided to process inputs, initialize the chain, run the chain, and make predictions. Predictions are made on a probabilistic basis, where, for a given input, the probabilities of each categorical value are produced. See Marrs and Francom (2023) "Multiclass classification using Bayesian multivariate adaptive regression splines" Under review.
Authors:	Frank Marrs [aut, cre] , Devin Francom [aut]
Maintainer:	Frank Marrs <[email protected]>
License:	GPL-3
Version:	0.1
Built:	2024-11-05 06:12:48 UTC
Source:	https://github.com/cran/cbass

Augment X for missing data approach for MNAR

Description

Augment X for missing data approach for MNAR

Usage

augment.X(X)
augment.X(X)

Arguments

`X`	matrix of covariates, including some missing values (NAs)

Value

Matrix same size as X, with augmented columns and zeros in the missing spots

Examples

set.seed(1)
n <- 100
X <- matrix(runif(n*2, 0, 1), ncol=2)
X[sample(1:length(X), round(.1*length(X)))] <- NA
X.new <- augment.X(X)
sum(is.na(X.new))
set.seed(1)
n <- 100
X <- matrix(runif(n*2, 0, 1), ncol=2)
X[sample(1:length(X), round(.1*length(X)))] <- NA
X.new <- augment.X(X)
sum(is.na(X.new))

Fit CBASS model using reversible jump MCMC

Description

Fit CBASS model using reversible jump MCMC

Usage

fit.cbass(
  X,
  y,
  max.int = 3,
  max.basis = 10 * ncol(X),
  tau2 = 10,
  nmcmc = 10000,
  nburn = round(nmcmc/2),
  nthin = 10,
  h1 = 4,
  h2 = 20 * (length(unique(y)) - 1)/nrow(X),
  p.int.prior = 1/(1:max.int),
  verbose = FALSE,
  print.interval = round(nmcmc/100),
  init1 = FALSE,
  ordinal = NULL,
  writeout = FALSE,
  writedir = tempdir(),
  mod = NULL,
  restart = FALSE
)
fit.cbass(
  X,
  y,
  max.int = 3,
  max.basis = 10 * ncol(X),
  tau2 = 10,
  nmcmc = 10000,
  nburn = round(nmcmc/2),
  nthin = 10,
  h1 = 4,
  h2 = 20 * (length(unique(y)) - 1)/nrow(X),
  p.int.prior = 1/(1:max.int),
  verbose = FALSE,
  print.interval = round(nmcmc/100),
  init1 = FALSE,
  ordinal = NULL,
  writeout = FALSE,
  writedir = tempdir(),
  mod = NULL,
  restart = FALSE
)

Arguments

`X`	n by p matrix of inputs on unit invteral
`y`	n-length factor factor of categories
`max.int`	maximum number of interactions, default 3
`max.basis`	maximum number of basis functions for each latent mean function, default ncol(X)*10
`tau2`	prior variance of basis regression coefficients, default 10
`nmcmc`	number of MCMC samples, default 1e4
`nburn`	number of MCMC samples to burn, default nmcmc/2
`nthin`	number of samples by which to thin, default 10
`h1`	first parameter for Gamma hyperprior on tau2, default 4
`h2`	second parameter for Gamma hyperprior on tau2, default 20(d-1)/n
`p.int.prior`	prior for number of interactions, default 1/(1:max.int)
`verbose`	should progress be printed out? default false
`print.interval`	how often should progress be printed out? default every 1%
`init1`	should model be initialized with single interaction model? default FALSE
`ordinal`	indicator of ordinal predictors (non-categorical), usually computed automatically
`writeout`	should samples be written out to text file? default FALSE
`writedir`	where should text files be written? default tempdir()
`mod`	initial / previous model fit, default NULL
`restart`	should initial input model be used for starting chain? default FALSE

Value

A list of CBASS model parameters. LIST THEM.

Examples

set.seed(1)
n <- 100;  d <- 3
X <- matrix(runif(n*2, 0, 1), ncol=2)
mu <- scale(X)
bound <- qnorm(1/d^(1/(d-1)))
mu <- cbind(bound, mu)
z <- mu
z[,-1] <- rnorm(length(mu[,-1]), mu[,-1], 1)
y <- apply(z, 1, which.max)
mod <- fit.cbass(X, y, max.int=1, max.basis=10, nmcmc=1e3, nburn=500, nthin=10)
pred.chain <- pred.cbass(mod, X)
mu.hat <- apply(pred.chain, 2:3, mean)
mean(abs(mu - mu.hat))
plot(c(mu), c(mu.hat))
set.seed(1)
n <- 100;  d <- 3
X <- matrix(runif(n*2, 0, 1), ncol=2)
mu <- scale(X)
bound <- qnorm(1/d^(1/(d-1)))
mu <- cbind(bound, mu)
z <- mu
z[,-1] <- rnorm(length(mu[,-1]), mu[,-1], 1)
y <- apply(z, 1, which.max)
mod <- fit.cbass(X, y, max.int=1, max.basis=10, nmcmc=1e3, nburn=500, nthin=10)
pred.chain <- pred.cbass(mod, X)
mu.hat <- apply(pred.chain, 2:3, mean)
mean(abs(mu - mu.hat))
plot(c(mu), c(mu.hat))

Predict vector of probabilities from vector of latent means

Description

Predict vector of probabilities from vector of latent means

Usage

p.mu(mu, d = NULL, bound = NULL, npts = 100, rel.tol = 1e-04)
p.mu(mu, d = NULL, bound = NULL, npts = 100, rel.tol = 1e-04)

Arguments

`mu`	d-length vector of latent means
`d`	input to avoid computing length of mu
`bound`	input of mu[1] to avoid computation
`npts`	number of integration points, default 100
`rel.tol`	number of integration points, default 1e-4

Value

A d-length numeric vector of probabilities given input latent means

Examples

set.seed(1)
mu <- rnorm(5)
p.mu(mu)
set.seed(1)
mu <- rnorm(5)
p.mu(mu)

Generate chain of latent normal random variables for a given X, for values saved in 'mod'

Description

Generate chain of latent normal random variables for a given X, for values saved in 'mod'

Usage

pred.cbass(mod, X, nburn = 0, nsub = NULL)
pred.cbass(mod, X, nburn = 0, nsub = NULL)

Arguments

`mod`	CBASS model list
`X`	matrix of covariates of same size / makeup as that used to create mod. If matrix not scaled to the unit interval, then it will be
`nburn`	Number of samples to burn from the chain in mod, default 0
`nsub`	Number of samples to subset to, default to those stored in mod

Value

An array of latent variables, nsub by nrow(X) by d

Examples

set.seed(1)
n <- 100;  d <- 3
X <- matrix(runif(n*2, 0, 1), ncol=2)
mu <- scale(X)
bound <- qnorm(1/d^(1/(d-1)))
mu <- cbind(bound, mu)
z <- mu
z[,-1] <- rnorm(length(mu[,-1]), mu[,-1], 1)
y <- apply(z, 1, which.max)
mod <- fit.cbass(X, y, max.int=1, max.basis=10, nmcmc=1e3, nburn=500, nthin=10)
pred.chain <- pred.cbass(mod, X)
mu.hat <- apply(pred.chain, 2:3, mean)
round(p.mu(mu.hat[1,]), 3)
set.seed(1)
n <- 100;  d <- 3
X <- matrix(runif(n*2, 0, 1), ncol=2)
mu <- scale(X)
bound <- qnorm(1/d^(1/(d-1)))
mu <- cbind(bound, mu)
z <- mu
z[,-1] <- rnorm(length(mu[,-1]), mu[,-1], 1)
y <- apply(z, 1, which.max)
mod <- fit.cbass(X, y, max.int=1, max.basis=10, nmcmc=1e3, nburn=500, nthin=10)
pred.chain <- pred.cbass(mod, X)
mu.hat <- apply(pred.chain, 2:3, mean)
round(p.mu(mu.hat[1,]), 3)

Draw samples of independent normals (matrix) given previous sample, and maximal values

Description

Draw samples of independent normals (matrix) given previous sample, and maximal values

Usage

sample.z(mu, y, z)
sample.z(mu, y, z)

Arguments

`mu`	n by d matrix of latent means
`y`	n-length vector of maximal indices
`z`	n by d matrix of latent random variables

Value

A new sample of n by d matrix of latent random variables

Examples

set.seed(1)
n <- 100;  d <- 3
mu <- matrix(rnorm(n*d), n, d)
bound <- qnorm(1/d^(1/(d-1)))
mu[,1] <- bound
z <- mu
z[,-1] <- rnorm(length(mu[,-1]), mu[,-1], 1)
y <- apply(z, 1, which.max)
z.new <- sample.z(mu, y, z)
all(apply(z.new, 1, which.max) == y)
set.seed(1)
n <- 100;  d <- 3
mu <- matrix(rnorm(n*d), n, d)
bound <- qnorm(1/d^(1/(d-1)))
mu[,1] <- bound
z <- mu
z[,-1] <- rnorm(length(mu[,-1]), mu[,-1], 1)
y <- apply(z, 1, which.max)
z.new <- sample.z(mu, y, z)
all(apply(z.new, 1, which.max) == y)

Package 'cbass'

Help Index

Augment X for missing data approach for MNAR

Description

Usage

Arguments

Value

Examples

Fit CBASS model using reversible jump MCMC

Description

Usage

Arguments

Value

Examples

Predict vector of probabilities from vector of latent means

Description

Usage

Arguments

Value

Examples

Generate chain of latent normal random variables for a given X, for values saved in 'mod'

Description

Usage

Arguments

Value

Examples

Draw samples of independent normals (matrix) given previous sample, and maximal values

Description

Usage

Arguments

Value

Examples