Basically, I have a matrix, and for every row with an "A" in it I want to append a "1" to a list; otherwise I want to append a "0".
The code is as follows:
is.there.A <- function(a,b,c,d,e) {
library(combinat)
x <- c(a,b,c,d,e)
matrix <- matrix(combn(x,3), ncol=3, byrow=T)
row <- nrow(matrix)
list <- list()
for (i in seq(row)) {
if (matrix[i,] %in% "A") {c(list, "1")}
else {c(list, "0")}
print(list)
}
}
But it doesn't work and this shows up.
Warning messages:
1: In if (matrix[i, ] %in% "A") { :
the condition has length > 1 and only the first element will be used
How can I overcome this and achieve the objective?
You can avoid the explicit loop by using apply:
is.there.A <- function(a,b,s,d,e) {
library(combinat)
x <- c(a,b,s,d,e)
.matrix <- matrix(combn(x,3), ncol=3, byrow=T)
any_A <- apply(.matrix, 1, `%in%`, x = 'A')
as.list(as.numeric(any_A))
}
Never grow an object within a for loop; pre-allocate, then fill.
Avoid naming objects after functions (e.g. c, matrix or list).
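To illustrate the pre-allocation advice, here is a minimal sketch. The names m and has_A are made up for this example, and it assumes the five inputs are the letters A to E. combn here is the one from the utils package that ships with R, so no extra package is loaded.

```r
# One 3-element combination per row (10 rows for 5 letters)
m <- t(combn(c("A", "B", "C", "D", "E"), 3))

# Pre-allocate the result instead of growing it inside the loop
has_A <- integer(nrow(m))

for (i in seq_len(nrow(m))) {
  # `"A" %in% m[i, ]` is a single TRUE/FALSE, so it is safe to use per row
  has_A[i] <- as.integer("A" %in% m[i, ])
}

has_A
```

The loop assigns into has_A[i], which is the step the original code was missing: c(list, "1") builds a new object but never stores it.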
You meant to test for "A" %in% matrix[i,], not the other way around. However, note that
row <- nrow(matrix)
list <- list()
for (i in seq(row)) {
if ("A" %in% matrix[i,]) {c(list, "1")}
else {c(list, "0")}
}
can be rewritten
rowSums(matrix == "A") > 0
It returns a vector of logicals (TRUE/FALSE) which is the most appropriate output for your function. However, if you really need a list of '1' or '0', you can wrap it as follows:
as.list(ifelse(rowSums(matrix == "A") > 0, "1", "0"))
Also note that it is a bad idea to name an object matrix since it is also the name of a function in R.
I tried to write an if/else statement to recode my variable into a dummy variable.
I know there is the ifelse() function and the fastDummies package, but I tried it this way without success.
Why does this not work? I want to learn and understand R better.
if(df$iscd115==1){
df$iscd1151 <- 1
} else {
df$iscd1151 <- 0
}
This should be a reasonable solution.
First we find the positions of the two relevant columns (this assumes df already contains an iscd1151 column). Then we apply a function over the rows (MARGIN = 1) that checks whether the source column is 1 and sets the other column to 1 or 0 accordingly.
col1 <- which(names(df) == "iscd115")
col2 <- which(names(df) == "iscd1151")
mat <- apply(df, MARGIN = 1, function(x) {
  # x is one row of df; set the dummy column from the source column
  if (x[col1] == 1) {
    x[col2] <- 1
  } else {
    x[col2] <- 0
  }
  x
})
Unfortunately, this transforms the original data frame into a transposed matrix. We can re-transpose the matrix back and turn it back into a data frame with the following.
new_df <- as.data.frame( t(mat))
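The original if/else fails because if expects a single TRUE or FALSE, while df$iscd115 == 1 is a whole vector of them (since R 4.2.0 this is an error rather than a warning). A vectorised comparison avoids both the loop and the transpose. The sketch below uses a small made-up data frame; only the column names come from the question.

```r
# Toy data, assumed for illustration only
df <- data.frame(iscd115 = c(1, 3, 1, 2))

# ifelse() evaluates the condition element by element
df$iscd1151 <- ifelse(df$iscd115 == 1, 1, 0)

# Equivalent shortcut: TRUE/FALSE converts to 1/0
df$iscd1151_alt <- as.integer(df$iscd115 == 1)

df
```

Both columns hold the same dummy coding; iscd1151_alt is a hypothetical second column added only to show the shortcut side by side.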
I have some R code that takes in the args string from the command line and then filters a dataframe based on values in a column; the args string contains the column names. Right now I'm doing it by looping through the vector but something tells me that there has to be a better way. Is there a way to optimize this code?
args = c("col1","col2")
for(i in args){
df = df[df[,i]==0,]
}
If I understand correctly, you want to keep the rows where all of the args are equal to 0 (or any other given value).
First get the indices of the columns you're interested in:
idx <- match(args, colnames(df))
Then you can simply do:
df <- df[apply(df[, idx], 1, function(x) all(x == 0)), ]
Another possibility:
df <- df[rowSums(df[, idx] != 0) == 0, ]
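As a small worked example of the rowSums approach, the sketch below uses an invented data frame with a third column that is not filtered on. It indexes by column name directly and adds drop = FALSE, an extra precaution not in the answer above, so that the code still works when args contains a single column name.

```r
# Toy data, assumed for illustration only
df <- data.frame(col1 = c(0, 1, 0, 0),
                 col2 = c(0, 0, 2, 0),
                 col3 = 1:4)
args <- c("col1", "col2")

# TRUE for rows where every column named in args equals 0
keep <- rowSums(df[, args, drop = FALSE] != 0) == 0

filtered <- df[keep, ]   # keeps rows 1 and 4
filtered
```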
my_list = list()
my_list[[1]] = c("Fast","Slow","Heavy","Light")
my_list[[2]] = c("Fast","Small","Intelligent","Light")
my_list[[3]] = c("Dumb","Slow","Heavy","Light")
my_list[[4]] = c("Slow","Intelligent","Dumb","Heavy")
my_list[[5]] = c("Heavy","Light","Intelligent","Tall")
This is a simplified version of what I am trying to do: how can I filter a list so that any vector containing two contradictory strings (i.e. Fast and Slow, Tall and Small, Heavy and Light, or Intelligent and Dumb) is removed, leaving a final list of sensible vectors?
I have been trying to do this with an if statement; is that the most appropriate way?
This would be what you want:
cont_check <- function(x) {
cont_words <- list(c("Fast", "Slow"),
c("Heavy", "Light"))
found <- sum(sapply(cont_words, function (y) sum(y %in% x) == 2)) > 0
return(found)
}
sapply(my_list, cont_check)
# keep only the vectors without a contradictory pair
# my_list[!sapply(my_list, cont_check)]
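The answer's cont_words lists only two of the four pairs named in the question. The sketch below is a variant covering all four; has_contradiction and flags are names invented here, and vapply is used in place of sapply only to pin down the return type.

```r
my_list <- list(
  c("Fast", "Slow", "Heavy", "Light"),
  c("Fast", "Small", "Intelligent", "Light"),
  c("Dumb", "Slow", "Heavy", "Light"),
  c("Slow", "Intelligent", "Dumb", "Heavy"),
  c("Heavy", "Light", "Intelligent", "Tall")
)

# All four contradictory pairs from the question
cont_words <- list(c("Fast", "Slow"),
                   c("Tall", "Small"),
                   c("Heavy", "Light"),
                   c("Intelligent", "Dumb"))

# TRUE if x contains both members of at least one pair
has_contradiction <- function(x) {
  any(vapply(cont_words, function(p) all(p %in% x), logical(1)))
}

flags <- vapply(my_list, has_contradiction, logical(1))
my_list[!flags]   # vectors with no contradictory pair
```

With the sample data, only the second vector survives, because each of the others contains at least one full pair.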
R has problems when reading .csv files with column names that begin with a number; it changes these names by putting an "X" as the first character.
I am trying to write a function which simply solves this problem (although: is this the easiest way?)
As an example file, I simply created two new (nonsensical) columns in iris:
iris$X12.0 <- iris$Sepal.Length
iris$X18.0 <- iris$Petal.Length
remv.X <- function(x){
if(substr(colnames(x), 1, 1) == "X"){
colnames(x) <- substr(colnames(x), 2, 100)
}
else{
colnames(x) <- substr(colnames(x), 1, 100)
}
}
remv.X(iris)
When I print the result, I get a warning and nothing has changed.
What am I doing wrong?
check.names=FALSE
Use the read.table/read.csv argument check.names = FALSE to turn off column name mangling.
For example,
read.csv(text = "1x,2x\n10,20", check.names = FALSE)
giving:
1x 2x
1 10 20
Removing X using sub
If for some reason you did have an unwanted X character at the beginning of some column names, it could be removed like this. This only removes an X at the beginning of column names for which the next character is a digit. If the next character is not a digit, or if there is no next character, the column name is left unchanged.
names(iris) <- sub("^X(\\d.*)", "\\1", names(iris))
or as a function:
rmX <- function(data) setNames(data, sub("^X(\\d.*)", "\\1", names(data)))
# test
iris <- rmX(iris)
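To see which names the pattern touches, here is a quick check on a plain character vector. The names in nms are invented for illustration, apart from the two X-prefixed columns from the question.

```r
nms <- c("X12.0", "X18.0", "Xylophone", "X", "Sepal.Length")

# Strip a leading X only when a digit follows it
cleaned <- sub("^X(\\d.*)", "\\1", nms)
cleaned
```

Only the first two names change; "Xylophone" and a bare "X" are left alone because no digit follows the X.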
Problem with code in question
There are two problems with the code in the question.
1. In if (condition) ... the condition is a vector but must be a scalar.
2. The data frame is never returned.
Here it is fixed up. We have also factored out the LHS of the two legs of the if.
remv.X2 <- function(x) {
for (i in seq_along(x)) {
colnames(x)[i] <- if (substr(colnames(x)[i], 1, 1) == "X") {
substr(colnames(x)[i], 2, 100)
} else {
substr(colnames(x)[i], 1, 100)
}
}
x
}
iris <- remv.X2(iris)
or maybe even:
remv.X3 <- function(x) {
setNames(x, substr(colnames(x), (substr(colnames(x), 1, 1) == "X") + 1, 100))
}
iris <- remv.X3(iris)
I want to search through a vector for the sequence of strings "hello" "world". When I find this sequence, I want to copy it, including the 10 elements before and after, as a row in a data.frame to which I'll apply further analysis.
My problem: I get an error "new column would leave holes after existing columns". I'm new to coding, so I'm not sure how to manipulate data.frames. Maybe I need to create rows in the loop?
This is what I have:
df = data.frame()
i <- 1
for(n in 1:length(v))
{
if(v[n] == 'hello' & v[n+1] == 'world')
{
df[i,n-11:n+11] <- v[n-10:n+11]
i <- i+1
}
}
Thanks!
Maybe this helps:
indx <- which(v1[-length(v1)]=='hello'& v1[-1]=='world')
lst <- Map(function(x, y) {
  s1 <- seq(x, y)
  # keep only positions that exist in v1 (the last element is a valid index)
  v1[s1[s1 > 0 & s1 <= length(v1)]]
}, indx - 10, indx + 11)
len <- max(sapply(lst, length))
d1 <- as.data.frame(do.call(rbind,lapply(lst, `length<-`, len)))
data
set.seed(496)
v1 <- sample(c(letters[1:3], 'hello', 'world'), 100, replace=TRUE)
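An alternative that keeps every window aligned is to build a matrix of positions and mark out-of-range positions as NA, so that "hello" always lands in the same column. This is only a sketch: extract_windows and its before/after parameters are hypothetical names, and the short vector v is invented so the result is easy to check by eye.

```r
# Hypothetical helper: one row per "hello" "world" match, NA-padded at the edges
extract_windows <- function(v, before = 10, after = 10) {
  n <- length(v)
  # positions i where v[i] is "hello" and v[i + 1] is "world"
  starts <- which(v[-n] == "hello" & v[-1] == "world")
  # offsets relative to "hello": `before` back, then "world" plus `after` forward
  offsets <- seq(-before, after + 1)
  idx <- outer(starts, offsets, `+`)
  idx[idx < 1 | idx > n] <- NA          # positions outside v become NA
  out <- matrix(v[idx], nrow = length(starts), ncol = length(offsets))
  as.data.frame(out, stringsAsFactors = FALSE)
}

v <- c("a", "hello", "world", "b", "c", "hello", "world")
res <- extract_windows(v, before = 2, after = 2)
res
```

Unlike padding each window at the end with `length<-`, this pads on whichever side runs off the vector, at the cost of building the full index matrix up front.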