Title: | Robust Kernel Unsupervised Methods |
---|---|
Description: | Robust kernel center matrix and robust kernel cross-covariance operator for kernel unsupervised methods, kernel canonical correlation analysis, and the influence function for identifying significant outliers or atypical objects from multimodal datasets. Alam, M. A., Fukumizu, K., Wang, Y.-P. (2018) <doi:10.1016/j.neucom.2018.04.008>. Alam, M. A., Calhoun, V. D., Wang, Y.-P. (2018) <doi:10.1016/j.csda.2018.03.013>. |
Authors: | Md Ashad Alam |
Maintainer: | Md Ashad Alam <[email protected]> |
License: | GPL-3 |
Version: | 0.1.1.1 |
Built: | 2024-11-10 04:14:07 UTC |
Source: | https://github.com/cran/RKUM |
Many radial basis function kernels, such as the Gaussian kernel, map X into an infinite-dimensional space. While the Gaussian kernel has a free parameter (the bandwidth), it still enjoys a number of theoretical properties such as boundedness, consistency, universality and robustness. It is the most widely applicable of the positive definite kernels.
gkm(X)
X |
a data matrix. |
Many radial basis function kernels, such as the Gaussian kernel, map the input space into an infinite-dimensional space. The Gaussian kernel has a number of theoretical properties such as boundedness, consistency, universality and robustness.
K |
a Gram (kernel) matrix |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Hui-Yi Lin, Hong-Wen Deng, Vince D. Calhoun and Yu-Ping Wang (2018), A kernel machine method for detecting higher order interactions in multimodal datasets: Application to schizophrenia, Journal of Neuroscience Methods, Vol. 309, 161-174.
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
## Dummy data:
X <- matrix(rnorm(1000), 100)
gkm(X)
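Although gkm computes the Gram matrix directly, the underlying formula can be sketched in base R. The bandwidth choice below (the median pairwise distance) is an illustrative assumption, not necessarily how gkm selects it.

```r
# Gaussian (RBF) Gram matrix: K[i, j] = exp(-||x_i - x_j||^2 / (2 * s^2))
# The median-distance bandwidth default is an illustrative assumption.
gaussian_gram <- function(X, s = median(dist(X))) {
  D2 <- as.matrix(dist(X))^2   # squared pairwise Euclidean distances
  exp(-D2 / (2 * s^2))         # symmetric, positive semi-definite, unit diagonal
}

X <- matrix(rnorm(1000), 100)
K <- gaussian_gram(X)
dim(K)              # 100 x 100
all(diag(K) == 1)   # each point has unit self-similarity
```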
# A matrix decomposition function
gm3edc(Amat, Bmat, Cmat)
Amat |
a square matrix |
Bmat |
a square matrix |
Cmat |
a square matrix |
Md Ashad Alam <[email protected]>
# A matrix decomposition function
gmedc(A, B = diag(nrow(A)))
A |
a square matrix |
B |
a diagonal matrix |
Md Ashad Alam <[email protected]>
### A function to compute a generalized inverse of a matrix
gmi(X, tol = sqrt(.Machine$double.eps))
X |
a square matrix |
tol |
a numerical tolerance |
Md Ashad Alam <[email protected]>
##The ratio of the first derivative of the Hampel loss function to its argument. Tuning constants are fixed at different quantiles.
hadr(u)
u |
a vector of values |
a real value
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
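The weight hadr returns can be sketched in base R with Hampel's redescending psi function; the specific tuning constants below (quantiles of |u|) are an illustrative assumption, not necessarily the ones hadr fixes internally.

```r
# Hampel weight w(u) = psi(u) / u for tuning constants a < b < c:
# full weight up to a, downweighting up to b, redescending to zero at c.
hampel_weight <- function(u, a, b, c) {
  au <- abs(u)
  w <- ifelse(au <= a, 1,
       ifelse(au <= b, a / au,
       ifelse(au <= c, a * (c - au) / ((c - b) * au), 0)))
  w[u == 0] <- 1   # limit of psi(u)/u as u -> 0
  w
}

u <- rnorm(100)
q <- quantile(abs(u), c(0.5, 0.75, 0.95))   # illustrative quantile constants
w <- hampel_weight(u, q[1], q[2], q[3])
range(w)   # all weights lie in [0, 1]
```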
#Tuning constants of the Hampel loss function are fixed at different quantiles of the arguments.
halfun(u)
u |
vector of values. |
comp1 |
a real number |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
See also: hulfun, hadr, hudr
The objective function of Hampel's loss.
halofun(x)
x |
a vector of values |
a real value
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
See also: hulofun
The ratio of the first derivative of the Huber loss function to its argument. The tuning constant is fixed at a median value.
hudr(x)
x |
a vector of values |
y |
a real value |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
See also: hadr
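A base-R sketch of this ratio (the Huber weight); the default tuning constant below, the median of |x|, follows the description but is an assumption about hudr's internals.

```r
# Huber weight w(x) = psi(x) / x, where psi(x) = x for |x| <= k
# and psi(x) = k * sign(x) otherwise.
huber_weight <- function(x, k = median(abs(x))) {
  pmin(1, k / abs(x))   # 1 inside [-k, k], k/|x| outside; w(0) = 1
}

x <- rnorm(100)
w <- huber_weight(x)
all(w[abs(x) <= median(abs(x))] == 1)   # inliers keep full weight
```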
Tuning constants of the Huber loss function are fixed at different quantiles of the arguments.
hulfun(x)
x |
a vector of values |
The tuning constant of the Huber function is fixed at the median.
a real number
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
See also: halfun
The objective function of Huber's loss.
hulofun(x)
x |
a vector of values |
a real value
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
See also: halofun
For GWASs, a kernel captures the pairwise similarity across a number of SNPs in each gene. The kernel projects the genotype data from the original high-dimensional space to a feature space. One of the more popular kernels used for genomic similarity is the identity-by-state (IBS) kernel, a non-parametric function of the genotypes.
ibskm(Z)
Z |
a data matrix |
For a genome-wide association study, a kernel captures the pairwise similarity across a number of SNPs in each gene. The kernel projects the genotype data from the original high-dimensional space to a feature space. One popular kernel used for genomic similarity is the identity-by-state (IBS) kernel. The IBS kernel does not need any assumption on the type of genetic interactions.
K |
a Gram (kernel) matrix |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Hui-Yi Lin, Hong-Wen Deng, Vince D. Calhoun and Yu-Ping Wang (2018), A kernel machine method for detecting higher order interactions in multimodal datasets: Application to schizophrenia, Journal of Neuroscience Methods, Vol. 309, 161-174.
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
## Dummy data:
X <- matrix(rnorm(200), 50)
ibskm(X)
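Under the standard definition (an assumption about ibskm's internals), with genotypes coded 0/1/2 as minor-allele counts, the IBS kernel can be sketched in base R:

```r
# IBS kernel for genotypes coded 0/1/2 (minor-allele counts):
# K[i, j] = sum_m (2 - |Z[i, m] - Z[j, m]|) / (2 * p)
ibs_kernel <- function(Z) {
  n <- nrow(Z); p <- ncol(Z)
  K <- matrix(0, n, n)
  for (i in seq_len(n))
    for (j in i:n) {
      K[i, j] <- sum(2 - abs(Z[i, ] - Z[j, ])) / (2 * p)
      K[j, i] <- K[i, j]   # the kernel matrix is symmetric
    }
  K
}

Z <- matrix(sample(0:2, 50 * 10, replace = TRUE), 50)   # 50 samples, 10 SNPs
K <- ibs_kernel(Z)
all(diag(K) == 1)   # identical genotypes have similarity 1
```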
##To define robustness in statistics, different approaches have been proposed, for example, the minimax approach, the sensitivity curve, the influence function (IF) and the finite sample breakdown point. Due to its simplicity, the IF is the most useful approach in statistical machine learning.
ifcca(X, Y, gamma = 1e-05, ncomps = 2, jth = 1)
X |
a data matrix indexed by row |
Y |
a data matrix indexed by row |
gamma |
the regularization hyper-parameter |
ncomps |
the number of canonical vectors |
jth |
the index of the canonical vector whose influence function is computed |
iflccor |
Influence values of the data by linear canonical correlation |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
## Dummy data:
X <- matrix(rnorm(500), 100)
Y <- matrix(rnorm(500), 100)
ifcca(X, Y, 1e-05, 2, 2)
## To define robustness in statistics, different approaches have been proposed, for example, the minimax approach, the sensitivity curve, the influence function (IF) and the finite sample breakdown point. Due to its simplicity, the IF is the most useful approach in statistical machine learning.
ifmkcca(xx, yy, zz, kernel = "rbfdot", gamma = 1e-05, ncomps = 1, jth=1)
xx |
a data matrix indexed by row |
yy |
a data matrix indexed by row |
zz |
a data matrix indexed by row |
kernel |
a positive definite kernel |
ncomps |
the number of canonical vectors |
gamma |
the regularization hyper-parameter |
jth |
the index of the canonical vector whose influence function is computed |
iflccor |
Influence values of the data by multiple kernel canonical correlation |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
See also: ifcca
## Dummy data:
X <- matrix(rnorm(500), 100)
Y <- matrix(rnorm(500), 100)
Z <- matrix(rnorm(500), 100)
ifmkcca(X, Y, Z, "rbfdot", 1e-05, 2, 1)
##To define robustness in statistics, different approaches have been proposed, for example, the minimax approach, the sensitivity curve, the influence function (IF) and the finite sample breakdown point. Due to its simplicity, the IF is the most useful approach in statistical machine learning.
ifrkcca(X, Y, lossfu = "Huber", kernel = "rbfdot", gamma = 0.00001, ncomps = 10, jth = 1)
X |
a data matrix indexed by row |
Y |
a data matrix indexed by row |
lossfu |
a loss function: square, Hampel's or Huber's loss |
kernel |
a positive definite kernel |
gamma |
the regularization hyper-parameter |
ncomps |
the number of canonical vectors |
jth |
the index of the canonical vector whose influence function is computed |
ifrkcor |
Influence values of the data by robust kernel canonical correlation |
ifrkxcv |
Influence values of the canonical vector of the X dataset |
ifrkycv |
Influence values of the canonical vector of the Y dataset |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
## Dummy data:
X <- matrix(rnorm(500), 100)
Y <- matrix(rnorm(500), 100)
ifrkcca(X, Y, lossfu = "Huber", kernel = "rbfdot", gamma = 0.00001, ncomps = 10, jth = 2)
#A function
lcv(X, Y, res)
X |
a matrix |
Y |
a matrix |
res |
a real value |
Md Ashad Alam <[email protected]>
The linear kernel uses the underlying Euclidean space to define the similarity measure. Whenever the dimensionality is high, it may allow for more complexity in the function class than we could measure and assess otherwise.
lkm(X)
X |
a data matrix |
The linear kernel uses the underlying Euclidean space to define the similarity measure. Whenever the dimensionality of the data is high, it may allow for more complexity in the function class than we could measure and assess otherwise.
K |
a kernel matrix. |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Hui-Yi Lin, Hong-Wen Deng, Vince D. Calhoun and Yu-Ping Wang (2018), A kernel machine method for detecting higher order interactions in multimodal datasets: Application to schizophrenia, Journal of Neuroscience Methods, Vol. 309, 161-174.
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
Md Ashad Alam, Vince D. Calhoun and Yu-Ping Wang (2018), Identifying outliers using multiple kernel canonical correlation analysis with application to imaging genetics, Computational Statistics and Data Analysis, Vol. 125, 70-85.
## Dummy data:
X <- matrix(rnorm(500), 100)
lkm(X)
The median of the pairwise distances of the data
mdbw(X)
X |
a data matrix |
While the Gaussian kernel has a free parameter (the bandwidth), it still enjoys a number of theoretical properties such as boundedness, consistency, universality and robustness. It is the most widely applicable kernel. In a Gaussian RBF kernel, we need to select an appropriate bandwidth. It is well known that this parameter has a strong influence on the result of kernel methods. For the Gaussian kernel, we can use the median of the pairwise distances as the bandwidth.
s |
a median of the pairwise distance of the X dataset |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Hui-Yi Lin, Hong-Wen Deng, Vince D. Calhoun and Yu-Ping Wang (2018), A kernel machine method for detecting higher order interactions in multimodal datasets: Application to schizophrenia, Journal of Neuroscience Methods, Vol. 309, 161-174.
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
Md Ashad Alam and Kenji Fukumizu (2015), Higher-order regularized kernel canonical correlation analysis, International Journal of Pattern Recognition and Artificial Intelligence, Vol. 29(4), 1551005.
Arthur Gretton, Kenji Fukumizu, C. H. Teo, L. Song, B. Scholkopf and A. Smola (2008), A kernel statistical test of independence, Advances in Neural Information Processing Systems, Vol. 20, 585-592.
## Dummy data:
X <- matrix(rnorm(1000), 100)
mdbw(X)
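As a check on what mdbw returns, the median heuristic can be written in one line of base R. Whether mdbw uses plain or squared distances is an assumption; the plain Euclidean distance is shown.

```r
# Median of the pairwise Euclidean distances, used as a Gaussian bandwidth
median_bandwidth <- function(X) median(dist(X))

X <- matrix(rnorm(1000), 100)
s <- median_bandwidth(X)
s   # a single positive bandwidth value
```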
# A function
medc(A, fn = sqrt)
A |
a matrix |
fn |
a function |
Md Ashad Alam <[email protected]>
## A function
mvnod(n = 1, mu, Sigma, tol = 1e-06, empirical = FALSE, EISPACK = FALSE)
n |
an integer number |
mu |
a vector of means |
Sigma |
a covariance matrix |
tol |
a tolerance (correction) factor |
empirical |
a logical value |
EISPACK |
a logical value |
Md Ashad Alam <[email protected]>
A function
ranuf(p)
p |
a real value |
Md Ashad Alam <[email protected]>
#Robust kernel canonical correlation analysis
rkcca(X, Y, lossfu = "Huber", kernel = "rbfdot", gamma = 1e-05, ncomps = 10)
X |
a data matrix indexed by row |
Y |
a data matrix indexed by row |
lossfu |
a loss function: square, Hampel's or Huber's loss |
kernel |
a positive definite kernel |
gamma |
the regularization hyper-parameter |
ncomps |
the number of canonical vectors |
An S3 object containing the following slots:
rkcor |
Robust kernel canonical correlation |
rxcoef |
Robust kernel canonical coefficients of the X dataset |
rycoef |
Robust kernel canonical coefficients of the Y dataset |
rxcv |
Robust kernel canonical vectors of the X dataset |
rycv |
Robust kernel canonical vectors of the Y dataset |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
See also: ifcca, rkcca, ifrkcca
## Dummy data:
X <- matrix(rnorm(1000), 100)
Y <- matrix(rnorm(1000), 100)
rkcca(X, Y, "Huber", "rbfdot", 1e-05, 10)
# A function
rkcco(X, Y, lossfu = "Huber", kernel = "rbfdot", gamma = 1e-05)
X |
a data matrix indexed by row |
Y |
a data matrix indexed by row |
lossfu |
a loss function: square, Hampel's or Huber's loss |
kernel |
a positive definite kernel |
gamma |
the regularization hyper-parameter |
rkcmx |
Robust kernel center matrix of the X dataset |
rkcmy |
Robust kernel center matrix of the Y dataset |
rkcmx |
Robust kernel covariance operator of the X dataset |
rkcmy |
Robust kernel covariance operator of the Y dataset |
rkcmx |
Robust kernel cross-covariance operator of the X and Y datasets |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
M. Romanazzi (1992), Influence in canonical correlation analysis, Psychometrika, Vol. 57(2), 237-259.
See also: rkcca, snpfmridata, ifrkcca
## Dummy data:
X <- matrix(rnorm(2000), 200)
Y <- matrix(rnorm(2000), 200)
rkcco(X, Y, "Huber", "rbfdot", 1e-05)
# A function
rkcm(X, lossfu = "Huber", kernel = "rbfdot")
X |
a data matrix indexed by row |
lossfu |
a loss function: square, Hampel's or Huber's loss |
kernel |
a positive definite kernel |
rkcm |
a square robust kernel center matrix |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
Md Ashad Alam, Vince D. Calhoun and Yu-Ping Wang (2018), Identifying outliers using multiple kernel canonical correlation analysis with application to imaging genetics, Computational Statistics and Data Analysis, Vol. 125, 70-85.
See also: ifcca, rkcca, ifrkcca
## Dummy data:
X <- matrix(rnorm(2000), 200)
rkcm(X, "Huber", "rbfdot")
#A function to calculate the generalized logit function.
rlogit(x)
x |
a real value to be transformed |
Md Ashad Alam <[email protected]>
#A function
snpfmridata(n = 300, gamma=0.00001, ncomps = 2, jth = 1)
n |
the sample size |
gamma |
the regularization hyper-parameter |
ncomps |
the number of canonical vectors |
jth |
the index of the canonical vector whose influence function is computed |
IFCCAID |
Influence value of canonical correlation analysis for the ideal data |
IFCCACD |
Influence value of canonical correlation analysis for the contaminated data |
IFKCCAID |
Influence value of kernel canonical correlation analysis for the ideal data |
IFKCCACD |
Influence value of kernel canonical correlation analysis for the contaminated data |
IFHACCAID |
Influence value of robust (Hampel's loss) canonical correlation analysis for the ideal data |
IFHACCACD |
Influence value of robust (Hampel's loss) canonical correlation analysis for the contaminated data |
IFHUCCAID |
Influence value of robust (Huber's loss) canonical correlation analysis for the ideal data |
IFHUCCACD |
Influence value of robust (Huber's loss) canonical correlation analysis for the contaminated data |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
Md Ashad Alam, Vince D. Calhoun and Yu-Ping Wang (2018), Identifying outliers using multiple kernel canonical correlation analysis with application to imaging genetics, Computational Statistics and Data Analysis, Vol. 125, 70-85.
See also: rkcca, ifrkcca, snpfmrimth3D
## Dummy data:
n <- 100
snpfmridata(n, 0.00001, 10, jth = 1)
#A function
snpfmrimth3D(n = 500, gamma = 1e-05, ncomps = 1, jth=1)
n |
the sample size |
gamma |
the regularization hyper-parameter |
ncomps |
the number of canonical vectors |
jth |
the index of the canonical vector whose influence function is computed |
IFim |
Influence value of multiple kernel canonical correlation analysis for the ideal data |
IFcm |
Influence value of multiple kernel canonical correlation analysis for the contaminated data |
Md Ashad Alam <[email protected]>
Md Ashad Alam, Kenji Fukumizu and Yu-Ping Wang (2018), Influence Function and Robust Variant of Kernel Canonical Correlation Analysis, Neurocomputing, Vol. 304, 12-29.
Md Ashad Alam, Vince D. Calhoun and Yu-Ping Wang (2018), Identifying outliers using multiple kernel canonical correlation analysis with application to imaging genetics, Computational Statistics and Data Analysis, Vol. 125, 70-85.
See also: rkcca, snpfmridata, ifrkcca
## Dummy data:
n <- 100
snpfmrimth3D(n, 0.00001, 10, 1)
### A function to measure a system's floating-point computing power
udtd(x)
x |
a real value |
Md Ashad Alam <[email protected]>