Getting fhat values of 0 in my bivariate kernel regression smoothing function - r

I have written a kernel regression smoothing function below.
#simulated data: b0 + b1x1 + b2x2 + e
x1 <- runif(100)
x2 <- runif(100)
y <- 5 + 7 * x1 + 5 * x2 + rnorm(100,0,.1)
sample <- cbind(x1, x2, y)
sample <- as.data.frame(sample)
Kregsmooth2 <- function(sample, h){
  output <- matrix(0, nrow = 100, ncol = 100)
  grid.x1 <- seq(min(sample$x1), max(sample$x1), length.out = 100)
  grid.x2 <- seq(min(sample$x2), max(sample$x2), length.out = 100)
  for (j in 1:length(grid.x2)){
    for (i in 1:length(grid.x1)){
      output[i,j] <- sum(sample$y * dnorm((grid.x1[i]-sample$x1)/h)) * sum(sample$y * dnorm((grid.x2[j]-sample$x2)/h)) / (sum(dnorm((grid.x1[i]-sample$x1)/h)) * sum(dnorm((grid.x2[j]-sample$x2)/h)))
    }
    return(list(x1 = grid.x1, x2 = grid.x2, output=output))
  }
}
fit <- Kregsmooth2(sample, 1)
fit
When I run this function, only the first column ([,1]) of the output matrix is filled in; columns [,2:100] are all 0s. I have a feeling it's the way I'm storing the output, but I can't figure out why I have this issue. Any help would be appreciated, thanks!

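You put return() inside the j-loop, so the function exits at the end of the first pass (j = 1) and columns 2 to 100 keep their initial value of 0. Move the return() line down below the next }, so that it runs only after both loops have finished.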
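A minimal sketch of the corrected function, with the return value moved after both loops. The estimator formula is kept exactly as in the question; the `n.grid` argument and the `set.seed()` call are additions for illustration and are not part of the original code.

```r
set.seed(1)  # added only so the example is reproducible
x1 <- runif(100)
x2 <- runif(100)
y <- 5 + 7 * x1 + 5 * x2 + rnorm(100, 0, 0.1)
sample <- data.frame(x1, x2, y)

# n.grid is a hypothetical extra argument replacing the hard-coded 100
Kregsmooth2 <- function(sample, h, n.grid = 100) {
  output <- matrix(0, nrow = n.grid, ncol = n.grid)
  grid.x1 <- seq(min(sample$x1), max(sample$x1), length.out = n.grid)
  grid.x2 <- seq(min(sample$x2), max(sample$x2), length.out = n.grid)
  for (j in seq_along(grid.x2)) {
    for (i in seq_along(grid.x1)) {
      k1 <- dnorm((grid.x1[i] - sample$x1) / h)  # kernel weights in x1
      k2 <- dnorm((grid.x2[j] - sample$x2) / h)  # kernel weights in x2
      output[i, j] <- sum(sample$y * k1) * sum(sample$y * k2) /
        (sum(k1) * sum(k2))
    }
  }
  # returned only after every (i, j) cell has been filled
  list(x1 = grid.x1, x2 = grid.x2, output = output)
}

fit <- Kregsmooth2(sample, 1)
range(fit$output[, 2:100])  # no longer all zeros
```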

Related

Calculating RSS manually with given pairs of beta0 and beta1

I am trying to manually calculate the RSS for a dataset with given pairs of beta0 and beta1. For each (beta_0,beta_1) pair of values, I need to calculate the residual sum of squares. Store it as a vector in data called RSS. Here's the code provided.
x = pinotnoir$Aroma
y = pinotnoir$Quality
fit = lm(y ~ x)
summary(fit)
b0s <- seq(0, 10, .1)
b1s <- seq(0, 4, .01)
data <- expand.grid(beta0=b0s, beta1=b1s)
Here's what I have so far. I think the residual calculation is wrong but I'm not sure how to fix it.
rows = length(b1s)
rsd <- rep(NA,rows)
for (i in 1:rows){
residual = (y - (b0s[i] + b1s[i] * x))^2
rsd[i] <- residual
}
data <- expand.grid(beta0=b0s, beta1=b1s, RSS=rsd)
Any help would be appreciated. Thanks in advance!
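I am not sure this is exactly what you are aiming for, but by adapting your code slightly you can get the sum of squared residuals for each pair and find which betas minimize it (using the mtcars data for the example). The main changes are to loop over the rows of data rather than over b1s, and to sum the squared residuals so that each pair yields a single number.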
mtcars
x = mtcars$drat
y = mtcars$wt
(fit = lm(y ~ x))
summary(fit)
grid_len <- 20
b0s <- seq(5, 10, length.out = grid_len)
b1s <- seq(-3, -1, length.out = grid_len)
(data <- expand.grid(beta0=b0s, beta1=b1s))
rows = nrow(data)
resids <- rep(NA,rows)
for (i in 1:rows) {
  fitted <- (data$beta0[i] + (data$beta1[i] * x))
  squared_resid <- (y - fitted)^2
  SSR <- sum(squared_resid)
  resids[i] <- SSR
  cat(i, ": ", SSR, "\n")
}
data[which.min(resids), ]
fit
Results:
> data[which.min(resids), ]
       beta0     beta1
332 7.894737 -1.315789
> fit

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
      7.906       -1.304
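The same grid search can also be written without an explicit loop, storing the result as an RSS column in data as the question asked. This is only a sketch on the mtcars example above; the helper name `rss` is made up for illustration.

```r
x <- mtcars$drat
y <- mtcars$wt

grid_len <- 20
data <- expand.grid(beta0 = seq(5, 10, length.out = grid_len),
                    beta1 = seq(-3, -1, length.out = grid_len))

# hypothetical helper: residual sum of squares for one (beta0, beta1) pair
rss <- function(b0, b1) sum((y - (b0 + b1 * x))^2)

# one RSS value per row of the grid
data$RSS <- mapply(rss, data$beta0, data$beta1)

data[which.min(data$RSS), ]
```

Because the least-squares fit minimizes the RSS over all possible pairs, the smallest value on the grid can never be lower than the RSS of lm(y ~ x), which makes a handy sanity check.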

Can't add numbers at a specific index to a double in R

I want to create 10 b1's, one for each value of x1 and x2 in the xp and yp lists, by optimizing the res formula below. However, my values are somehow not added to created.b1: I get created.b1 = 0 when I check after running the code. How can I make the code work?
y <- matrix(c(1,2,3,4,2,6,7,8,9,10),ncol = 1)
x1 <- matrix(c(2,4,6,5,10,12,14,16,18,20),ncol =1)
x2 <- matrix(c(1,4,9,16,25,25,48,64,81,99),ncol = 1)
x <- cbind(x1,x2)
created.b1 = 0
normal <- function(b0,y,xp,yp,x1,x2){for (i in xp){
res <- sum((y- (b0 + x1[i]*xp[i] + x2[i]*yp[i]))^2)
optobj <- optimize(normal,c(-10,10),y =y ,xp = xp,yp =yp, x1 = x1,x2 = x2)
created.b1[i] = obtobj$minimum[i]
}
}
I think this does what you want, but please cross-check.
created.b1 <- numeric(length = 10)
for (i in 1:10) {
  opt_obj <- optimise(
    f = function(b0, y, xp, yp, x1, x2) sum((y - (b0 + (x1 * xp) + (x2 * yp)))^2),
    interval = c(-10, 10),
    y = y,
    xp = xp[i],
    yp = yp[i],
    x1 = x1,
    x2 = x2
  )
  created.b1[i] <- opt_obj$minimum
}
created.b1
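Since xp and yp are not shown in the question, the sketch below makes up values for them purely to have something runnable. It also compares the optimiser with a closed form: the objective is quadratic in b0, so whenever the minimiser lies inside the search interval it should equal the mean of y - x1 * xp - x2 * yp.

```r
y  <- c(1, 2, 3, 4, 2, 6, 7, 8, 9, 10)
x1 <- c(2, 4, 6, 5, 10, 12, 14, 16, 18, 20)
x2 <- c(1, 4, 9, 16, 25, 25, 48, 64, 81, 99)

# hypothetical values: the question does not define xp and yp
xp <- seq(0.1, 1, length.out = 10)
yp <- seq(0.01, 0.1, length.out = 10)

created.b1  <- numeric(10)
closed.form <- numeric(10)
for (i in 1:10) {
  # sum of squared residuals as a function of b0 alone
  obj <- function(b0) sum((y - (b0 + x1 * xp[i] + x2 * yp[i]))^2)
  created.b1[i]  <- optimise(obj, interval = c(-10, 10))$minimum
  closed.form[i] <- mean(y - x1 * xp[i] - x2 * yp[i])
}

cbind(created.b1, closed.form)
```

Pre-allocating created.b1 with numeric(10) and assigning inside a plain for loop is what lets each result land at its own index.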

Why do MASS::lm.ridge coefficients differ from those calculated manually?

When performing ridge regression manually, as it is defined
solve(t(X) %*% X + lbd*I) %*%t(X) %*% y
I get different results from those calculated by MASS::lm.ridge. Why? For ordinary linear regression the manual method (computing the pseudoinverse) works fine.
Here is my Minimal, Reproducible Example:
library(tidyverse)
ridgeRegression = function(X, y, lbd) {
Rinv = solve(t(X) %*% X + lbd*diag(ncol(X)))
t(Rinv %*% t(X) %*% y)
}
# generate some data:
set.seed(0)
tb1 = tibble(
x0 = 1,
x1 = seq(-1, 1, by=.01),
x2 = x1 + rnorm(length(x1), 0, .1),
y = x1 + x2 + rnorm(length(x1), 0, .5)
)
X = as.matrix(tb1 %>% select(x0, x1, x2))
# sanity check: force ordinary linear regression
# and compare it with the built-in linear regression:
ridgeRegression(X, tb1$y, 0) - coef(summary(lm(y ~ x1 + x2, data=tb1)))[, 1]
# looks the same: -2.94903e-17 1.487699e-14 -2.176037e-14
# compare manual ridge regression to MASS ridge regression:
ridgeRegression(X, tb1$y, 10) - coef(MASS::lm.ridge(y ~ x0 + x1 + x2 - 1, data=tb1, lambda = 10))
# noticeably different: -0.0001407148 0.003689412 -0.08905392
MASS::lm.ridge scales the data before modelling - this accounts for the difference in the coefficients.
You can confirm this by checking the function code by typing MASS::lm.ridge into the R console.
Here is the lm.ridge function with the scaling portion commented out:
X = as.matrix(tb1 %>% select(x0, x1, x2))
n <- nrow(X); p <- ncol(X)
#Xscale <- drop(rep(1/n, n) %*% X^2)^0.5
#X <- X/rep(Xscale, rep(n, p))
Xs <- svd(X)
rhs <- t(Xs$u) %*% tb1$y
d <- Xs$d
lscoef <- Xs$v %*% (rhs/d)
lsfit <- X %*% lscoef
resid <- tb1$y - lsfit
s2 <- sum(resid^2)/(n - p)
HKB <- (p-2)*s2/sum(lscoef^2)
LW <- (p-2)*s2*n/sum(lsfit^2)
k <- 1
dx <- length(d)
div <- d^2 + rep(10, rep(dx,k))
a <- drop(d*rhs)/div
dim(a) <- c(dx, k)
coef <- Xs$v %*% a
coef
# x0 x1 x2
#[1,] 0.01384984 0.8667353 0.9452382
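Going the other way, the lm.ridge numbers can be reproduced by hand by keeping the scaling step. As far as I can tell from the MASS source, when the formula has no intercept lm.ridge divides each column by its root mean square (the Xscale line), applies the penalty on that scale, and coef() then divides the result by the same scales. The sketch below follows that reading; it rebuilds the question's data with base R only, and the names `b.scaled`, `b.manual` and `b.mass` are mine.

```r
set.seed(0)
x1 <- seq(-1, 1, by = 0.01)
x2 <- x1 + rnorm(length(x1), 0, 0.1)
d  <- data.frame(x0 = 1, x1 = x1, x2 = x2,
                 y = x1 + x2 + rnorm(length(x1), 0, 0.5))
X   <- as.matrix(d[, c("x0", "x1", "x2")])
lbd <- 10

# root mean square of each column, as in the Xscale line of lm.ridge
Xscale <- sqrt(colMeans(X^2))
Xs <- sweep(X, 2, Xscale, "/")

# ridge on the scaled columns, then map the coefficients back
b.scaled <- solve(crossprod(Xs) + lbd * diag(ncol(Xs)), crossprod(Xs, d$y))
b.manual <- drop(b.scaled) / Xscale

b.mass <- drop(coef(MASS::lm.ridge(y ~ x0 + x1 + x2 - 1, data = d, lambda = lbd)))
rbind(manual = b.manual, MASS = b.mass)
```

Put differently, under this reading the penalty in lm.ridge acts like lbd * diag(Xscale^2) on the original scale rather than lbd * I, which is why the plain formula gives different numbers.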

How to create covariance matrix in R?

I'm trying to build a covariance matrix from scratch (that is, to reimplement the cov() function). My task is not to use any package. Hence I created my own functions:
meanf <- function(x){
sum(x) / length(x)
}
sampleCov <- function(x,y){
stopifnot(identical(length(x), length(y)))
sum((x - meanf(x)) * (y - meanf(y))) / (length(x) - 1)
}
> sampleCov(winequality_red$quality, winequality_red$alcohol)
[1] 0.409789
Unfortunately, I'm stuck here. None of the loops I have tried got me anywhere. Of course it's possible to just copy the sampleCov call for every possible combination of columns, but that's not the point.
If I understand you correctly, you want to recreate a covariance matrix like the one returned by the cov function.
OP's given function:
meanf <- function(x){
sum(x) / length(x)
}
sampleCov <- function(x,y){
stopifnot(identical(length(x), length(y)))
sum((x - meanf(x)) * (y - meanf(y))) / (length(x) - 1)
}
You can try this way, I have taken mtcars data here:
Covariance function:
vars <- names(mtcars)
egrid <- expand.grid(vars, vars)
egrid <- data.frame(sapply(egrid, as.character),stringsAsFactors = F)
egrid <- egrid[order(egrid$Var1, egrid$Var2),]
mat <- vector("list", nrow(egrid))
for(i in 1:nrow(egrid)){
mat[[i]] <- sampleCov(mtcars[,egrid[i,"Var1"]], mtcars[,egrid[i,"Var2"]])
}
finaldat <- cbind(egrid, cov = do.call('rbind', mat))
finaldat_list <- split(finaldat, finaldat$Var1)
mat_form <- do.call('cbind', finaldat_list)
cov_values <- mat_form[,grepl("\\.cov",names(mat_form))]
col_values <- mat_form[,paste0(egrid$Var1[1],".Var2")]
final_matrix_cov <- cbind(col_values, cov_values)
Sample Output:
> final_matrix_cov
col_values am.cov carb.cov cyl.cov disp.cov
9 mpg 1.80393145 -5.36310484 -9.1723790 -633.09721
20 cyl -0.46572581 1.52016129 3.1895161 199.66028
31 disp -36.56401210 79.06875000 199.6602823 15360.79983
42 hp -8.32056452 83.03629032 101.9314516 6721.15867
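A more direct variant of the same idea is to fill a square matrix cell by cell with the OP's sampleCov. This is a sketch only; `covMatrix` is a hypothetical helper name, and mtcars again stands in for the wine data.

```r
meanf <- function(x) sum(x) / length(x)

sampleCov <- function(x, y) {
  stopifnot(identical(length(x), length(y)))
  sum((x - meanf(x)) * (y - meanf(y))) / (length(x) - 1)
}

# hypothetical helper: apply sampleCov to every pair of columns
covMatrix <- function(df) {
  p <- ncol(df)
  out <- matrix(NA_real_, p, p, dimnames = list(names(df), names(df)))
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      out[i, j] <- sampleCov(df[[i]], df[[j]])
    }
  }
  out
}

covMatrix(mtcars)[1:3, 1:3]
```

The result is a plain numeric matrix with variable names on both dimensions, the same layout that cov() returns.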
You need the matrix multiplication %*%.
sampleCov <- function(x,y){
stopifnot(identical(length(x), length(y)))
sum((x - mean(x)) %*% (y - mean(y))) / (length(x) - 1)
}
> sampleCov(rnorm(10000),rnorm(10000))
[1] 0.01808466
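For a single pair of vectors, the sum of the elementwise product and the inner product give the same number, so either form works there. Where %*% pays off is in computing the whole covariance matrix at once: centre each column, then take t(Xc) %*% Xc / (n - 1). A sketch with mtcars as stand-in data (the variable names are mine):

```r
X  <- as.matrix(mtcars)
n  <- nrow(X)
Xc <- sweep(X, 2, colSums(X) / n)   # subtract each column's mean
covMat <- t(Xc) %*% Xc / (n - 1)    # all pairwise covariances in one step

# elementwise product and inner product agree for a single pair
x <- mtcars$mpg
y <- mtcars$wt
a <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)
b <- drop((x - mean(x)) %*% (y - mean(y))) / (n - 1)
c(elementwise = a, inner = b)

covMat[1:3, 1:3]
```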
This is probably a little more than you need, but it should answer your question, and I think it is a nice illustration of the practical application of covariances, correlations, etc.
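# packages used below
library(data.table)
library(ggplot2)
library(scales)  # provides percent() for the axis labels
# load the data
link <- "https://raw.githubusercontent.com/DavZim/Efficient_Frontier/master/data/mult_assets.csv"
df <- data.table(read.csv(link))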
# calculate the necessary values:
# I) expected returns for the two assets
er_x <- mean(df$x)
er_y <- mean(df$y)
# II) risk (standard deviation) as a risk measure
sd_x <- sd(df$x)
sd_y <- sd(df$y)
# III) covariance
cov_xy <- cov(df$x, df$y)
# create 1000 portfolio weights (omegas)
x_weights <- seq(from = 0, to = 1, length.out = 1000)
# create a data.table that contains the weights for the two assets
two_assets <- data.table(wx = x_weights,
wy = 1 - x_weights)
# calculate the expected returns and standard deviations for the 1000 possible portfolios
two_assets[, ':=' (er_p = wx * er_x + wy * er_y,
sd_p = sqrt(wx^2 * sd_x^2 +
wy^2 * sd_y^2 +
2 * wx * (1 - wx) * cov_xy))]
two_assets
# lastly plot the values
ggplot() +
geom_point(data = two_assets, aes(x = sd_p, y = er_p, color = wx)) +
geom_point(data = data.table(sd = c(sd_x, sd_y), mean = c(er_x, er_y)),
aes(x = sd, y = mean), color = "red", size = 3, shape = 18) +
# Miscellaneous Formatting
theme_bw() + ggtitle("Possible Portfolios with Two Risky Assets") +
xlab("Volatility") + ylab("Expected Returns") +
scale_y_continuous(labels = percent, limits = c(0, max(two_assets$er_p) * 1.2)) +
scale_x_continuous(labels = percent, limits = c(0, max(two_assets$sd_p) * 1.2)) +
scale_color_continuous(name = expression(omega[x]), labels = percent)
See the link below for all details.
https://datashenanigan.wordpress.com/2016/05/24/a-gentle-introduction-to-finance-using-r-efficient-frontier-and-capm-part-1/
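To see where the covariance enters, the portfolio standard deviation formula used above can be checked against the standard deviation of the weighted returns themselves. This is a small sketch on simulated returns, not the data from the link; the weight of 0.4 and the simulation parameters are arbitrary choices.

```r
set.seed(42)
# simulated returns standing in for the two assets
x <- rnorm(500, mean = 0.010, sd = 0.05)
y <- 0.3 * x + rnorm(500, mean = 0.005, sd = 0.03)

w <- 0.4  # weight on asset x; asset y gets 1 - w

# two-asset formula: variances plus twice the weighted covariance
sd_formula <- sqrt(w^2 * var(x) + (1 - w)^2 * var(y) +
                   2 * w * (1 - w) * cov(x, y))

# standard deviation of the actual portfolio return series
sd_direct <- sd(w * x + (1 - w) * y)

c(formula = sd_formula, direct = sd_direct)
```

The two numbers agree up to floating-point error, since the formula is just the variance of a weighted sum written out.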

Reproduce Fisher linear discriminant figure

Many books illustrate the idea of Fisher linear discriminant analysis using the following figure (this particular one is from Pattern Recognition and Machine Learning, p. 188).
I wonder how to reproduce this figure in R (or in any other language). Pasted below is my initial effort in R. I simulate two groups of data and draw linear discriminant using abline() function. Any suggestions are welcome.
set.seed(2014)
library(MASS)
library(DiscriMiner) # For scatter matrices
# Simulate bivariate normal distribution with 2 classes
mu1 <- c(2, -4)
mu2 <- c(2, 6)
rho <- 0.8
s1 <- 1
s2 <- 3
Sigma <- matrix(c(s1^2, rho * s1 * s2, rho * s1 * s2, s2^2), byrow = TRUE, nrow = 2)
n <- 50
X1 <- mvrnorm(n, mu = mu1, Sigma = Sigma)
X2 <- mvrnorm(n, mu = mu2, Sigma = Sigma)
y <- rep(c(0, 1), each = n)
X <- rbind(x1 = X1, x2 = X2)
X <- scale(X)
# Scatter matrices
B <- betweenCov(variables = X, group = y)
W <- withinCov(variables = X, group = y)
# Eigenvectors
ev <- eigen(solve(W) %*% B)$vectors
slope <- - ev[1,1] / ev[2,1]
intercept <- ev[2,1]
par(pty = "s")
plot(X, col = y + 1, pch = 16)
abline(a = intercept, b = slope, lwd = 2, lty = 2)  # a is the intercept, b the slope
MY (UNFINISHED) WORK
I pasted my current solution below. The main question is how to rotate (and move) the density plot according to decision boundary. Any suggestions are still welcome.
require(ggplot2)
library(grid)
library(MASS)
# Simulation parameters
mu1 <- c(5, -9)
mu2 <- c(4, 9)
rho <- 0.5
s1 <- 1
s2 <- 3
Sigma <- matrix(c(s1^2, rho * s1 * s2, rho * s1 * s2, s2^2), byrow = TRUE, nrow = 2)
n <- 50
# Multivariate normal sampling
X1 <- mvrnorm(n, mu = mu1, Sigma = Sigma)
X2 <- mvrnorm(n, mu = mu2, Sigma = Sigma)
# Combine into data frame
y <- rep(c(0, 1), each = n)
X <- rbind(x1 = X1, x2 = X2)
X <- scale(X)
X <- data.frame(X, class = y)
# Apply lda()
m1 <- lda(class ~ X1 + X2, data = X)
m1.pred <- predict(m1)
# Compute intercept and slope for abline
gmean <- m1$prior %*% m1$means
const <- as.numeric(gmean %*% m1$scaling)
z <- as.matrix(X[, 1:2]) %*% m1$scaling - const
slope <- - m1$scaling[1] / m1$scaling[2]
intercept <- const / m1$scaling[2]
# Projected values
LD <- data.frame(predict(m1)$x, class = y)
# Scatterplot
p1 <- ggplot(X, aes(X1, X2, color=as.factor(class))) +
geom_point() +
theme_bw() +
theme(legend.position = "none") +
scale_x_continuous(limits=c(-5, 5)) +
scale_y_continuous(limits=c(-5, 5)) +
geom_abline(intercept = intercept, slope = slope)
# Density plot
p2 <- ggplot(LD, aes(x = LD1)) +
geom_density(aes(fill = as.factor(class), y = ..scaled..)) +
theme_bw() +
theme(legend.position = "none")
grid.newpage()
print(p1)
vp <- viewport(width = .7, height = 0.6, x = 0.5, y = 0.3, just = c("centre"))
pushViewport(vp)
print(p2, vp = vp)
Basically you need to project the data along the direction of the classifier, plot a histogram for each class, and then rotate the histogram so its x axis is parallel to the classifier. Some trial and error with scaling the histogram is needed in order to get a nice result. Here's an example of how to do it in Matlab for the naive classifier (difference of the class means). For the Fisher classifier it is of course similar; you just use a different direction w. I changed the parameters from your code so the plot is more similar to the one you gave.
rng('default')
n = 1000;
mu1 = [1,3]';
mu2 = [4,1]';
rho = 0.3;
s1 = .8;
s2 = .5;
Sigma = [s1^2,rho*s1*s2;rho*s1*s2, s2^2];
X1 = mvnrnd(mu1,Sigma,n);
X2 = mvnrnd(mu2,Sigma,n);
X = [X1; X2];
Y = [zeros(n,1);ones(n,1)];
scatter(X1(:,1), X1(:,2), [], 'b' );
hold on
scatter(X2(:,1), X2(:,2), [], 'r' );
axis equal
m1 = mean(X(1:n,:))';
m2 = mean(X(n+1:end,:))';
plot(m1(1),m1(2),'bx','markersize',18)
plot(m2(1),m2(2),'rx','markersize',18)
plot([m1(1),m2(1)], [m1(2),m2(2)],'g')
%% classifier taking only means into account
w = m2 - m1;
w = w / norm(w);
% project data onto w
X1_projected = X1 * w;
X2_projected = X2 * w;
% plot histogram and rotate it
angle = 180/pi * atan(w(2)/w(1));
[hy1, hx1] = hist(X1_projected);
[hy2, hx2] = hist(X2_projected);
hy1 = hy1 / sum(hy1); % normalize
hy2 = hy2 / sum(hy2); % normalize
scale = 4; % set manually
h1 = bar(hx1, scale*hy1,'b');
h2 = bar(hx2, scale*hy2,'r');
set([h1, h2],'ShowBaseLine','off')
% rotate around the origin
rotate(get(h1,'children'),[0,0,1], angle, [0,0,0])
rotate(get(h2,'children'),[0,0,1], angle, [0,0,0])
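For the Fisher classifier, only the direction w changes. Below is a sketch in R (the language of the question) that computes both directions and the 1-D projections the rotated histograms would be built from. It assumes equal class sizes and the usual Fisher direction, proportional to the inverse within-class scatter times the difference of the class means; the helper names `unit` and `J` are made up, and the data are simulated with base R rather than MASS.

```r
set.seed(2014)
n   <- 200
mu1 <- c(1, 3)
mu2 <- c(4, 1)
rho <- 0.3; s1 <- 0.8; s2 <- 0.5
Sigma <- matrix(c(s1^2, rho * s1 * s2, rho * s1 * s2, s2^2), nrow = 2)

# draw correlated normals with a Cholesky factor (base R only)
R  <- chol(Sigma)
X1 <- sweep(matrix(rnorm(2 * n), n) %*% R, 2, mu1, "+")
X2 <- sweep(matrix(rnorm(2 * n), n) %*% R, 2, mu2, "+")

m1 <- colMeans(X1)
m2 <- colMeans(X2)
Sw <- cov(X1) * (n - 1) + cov(X2) * (n - 1)  # within-class scatter

unit <- function(v) v / sqrt(sum(v^2))       # hypothetical helper
w_naive  <- unit(m2 - m1)                    # difference of the class means
w_fisher <- unit(solve(Sw, m2 - m1))         # Fisher direction

# Fisher criterion: squared mean separation over within-class scatter
J <- function(w) drop(sum(w * (m2 - m1))^2 / (t(w) %*% Sw %*% w))
c(naive = J(w_naive), fisher = J(w_fisher))

# projections onto the Fisher direction, one vector per class
p1 <- drop(X1 %*% w_fisher)
p2 <- drop(X2 %*% w_fisher)
```

Histograms or densities of p1 and p2 can then be drawn and rotated by atan2(w_fisher[2], w_fisher[1]), in the same way the Matlab code rotates its bars.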
