Related
I am new here in this forum and i am a Beginner in learning R. I tried to do a Ridge Regression but it failed somehow and I do not understand why.
I have a Matrix x(dim 6,28) with 6 observation and a vector y with 6 rows.
x <- structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0,
0, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, -1, -1, -1, -1,
-1, -1, 0, -1, -1, -1, -1, -1, 0, 0, -1, -1, -1, -1, 0, -1, -1,
-1, -1, -1), dim = c(6L, 28L))
y <- c(2, 0, 3, 0, 0, 1)
w <- c(0.02222222, 0.125, 0.1, 0.33333333, 0.14285714, 0.04761905)
# Ridge regression
model_cv <- glmnet::cv.glmnet(x = x, y = y, weights = w, alpha = 0)
When I run cv.glmnet, however, I get the following error message:
Error: (converted from warning) Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
Thanks for the help in advance.
I created a binary matrix and I wanna plot 1's as black square.
How can I write it without using any package?
For example, my matrix is:
m <- matrix(c(0,1,1,0,0,1,0,1,1),nrow=3, ncol=3)
Do you want this?
m <- matrix(c(0,1,1,0,0,1,0,1,1), nrow=3, ncol=3)
image(m, main = "My binary matrix plot", col = c("white", "black"))
If image doesn't suffice, we could write a generalized function using mapply like this one.
chessplot <- function(m, col=1, border=NA) {
stopifnot(dim(m)[1] == dim(m)[2]) ## allows only square matrices
n <- nrow(m)
plot(n, n, type='n', xlim=c(0, n), ylim=c(0, n))
mapply(\(i, j, m) {
rect(-1 + i, n - j, 0 + i, n - j + 1, col=m, border=border)
}, seq(n), rep(seq(n), each=n), t(m)) |> invisible()
}
Gives:
chessplot(m3)
chessplot(m4)
chessplot(m8)
Data:
m3 <- structure(c(0, 1, 1, 0, 0, 1, 0, 1, 1), .Dim = c(3L, 3L))
m4 <- structure(c(0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0), .Dim = c(4L,
4L))
m8 <- structure(c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0,
1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1,
0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1,
0, 1, 0, 1, 0), .Dim = c(8L, 8L))
I have the following matrix:
m = np.asarray(
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0],
[0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0],
[0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0],
[0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0],
[0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0],
[0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
], dtype=np.uint8
)
And I apply the following few gradient transforms:
gradient = cv2.morphologyEx(m, cv2.MORPH_GRADIENT, np.ones((2,2),np.uint8))
gradient = cv2.morphologyEx(m, cv2.MORPH_GRADIENT, np.ones((3,3),np.uint8))
And I get the following results, respectively:
What I was expecting was the opposite of the image; is there a straight forward, fool-proof way to get the perimeters?
There seems to be an error in the way OpenCV applies the gradient transform.
By running the following command, I was able to obtain the correct result:
something = cv2.morphologyEx(m, cv2.MORPH_GRADIENT, np.ones((3,3),np.uint8))
gradient = something - m
gradient[gradient > 1] = 0
Use care copying my answer so that the numbers in the resultant matrix are what you need them to be.
I have two logical vectors which look like this:
x = c(0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
y = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0)
I would like to count the intersections between ranges of consecutive values. Meaning that consecutive values (of 1s) are handled as one range. So in the above example, each vector contains one range of 1s and these ranges intersect only once.
Is there any R package for range intersections which could help here?
I think this should work (calling your logical vectors x and y):
sum(rle(x & y)$values)
A few examples:
x = c(0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
y = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0)
sum(rle(x & y)$values)
# [1] 1
x = c(1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
y = c(0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0)
sum(rle(x & y)$values)
# [1] 2
x = c(1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0)
y = c(0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0)
sum(rle(x & y)$values)
# [1] 3
By way of explanation, x & y gives the intersections on a per-element level, rle collapses runs of adjacent intersections, and sum counts.
I have a question about PCA using the caret package and an error message I'm getting, "cannot rescale a constant/zero column to unit variance".
Consider two sets of similar code. The first works just fine:
a = c(0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, -1, -1, NA)
b = c(1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, -1, -1, NA)
c = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0)
df = data.frame(a, b, c)
trans = preProcess(df, method = c("center", "scale", "pca"))
The variance of each column can be seen as:
apply(df, 2, var, na.rm=TRUE)
Note that the variance of column "c" is 0.11
Let's say I change the second to last integer in column "c" to 1 instead of 0, and then run the same code:
a = c(0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, -1, -1, NA)
b = c(1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, -1, -1, NA)
c = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0)
df = data.frame(a, b, c)
trans = preProcess(df, method = c("center", "scale", "pca"))
I get an error message:
Error in prcomp.default(x, scale = TRUE, retx = FALSE) :
cannot rescale a constant/zero column to unit variance
If you look at the variance for column c, it's 0.059:
apply(df, 2, var, na.rm=TRUE)
Can anyone please help me understand the difference between these two sets of code and why the second gives an error when the first does not?
Thank you
PCA only uses complete observations. In your second definition of df above, a PCA analysis will drop the last row due to missingness. And column c is constant within the remaining rows.
Note: my answer is around PCA generally and not specific to the caret package.