R error: recursive indexing failed at level 2 (matrix)

Each row of the matrix book describes one point: columns 2 and 3 store the longitude and latitude (longt, latit), and columns 6 to n store the indices of the points that lie within 600 m of the ith point. In the code below, I check whether any point listed in the ith row is within range of the jth point. If so, I append the indices of both rows. While doing so, I get this error: Error in *tmp*[[j]] : recursive indexing failed at level 2
This is the data set:
vehicle_id longt latit date B B B B B B B B B B B B B B B
1 19967 86.2885 23.8210 27 3 1 2 6 0 0 0 0 0 0 0 0 0 0 0
2 19967 86.2891 23.8200 27 2 2 6 0 0 0 0 0 0 0 0 0 0 0 0
3 19967 86.5343 23.8254 27 1 3 0 0 0 0 0 0 0 0 0 0 0 0 0
4 19967 86.7273 23.8200 27 1 4 0 0 0 0 0 0 0 0 0 0 0 0 0
5 19967 86.1362 23.7538 28 1 5 0 0 0 0 0 0 0 0 0 0 0 0 0
6 19967 86.2839 23.8212 28 1 6 0 0 0 0 0 0 0 0 0 0 0 0 0
B B B B B B B B B B B B B
1 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 0 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0 0 0 0 0 0
I would like to know why I am getting this error and how I can resolve it.
library(geosphere)  # provides distm() and distHaversine()
S=0
for(i in 1:(nrow(book)-1))
{
  if( book[i, 1] != book[i+1,1] )
  {
    next
  }
  for(j in i:nrow(book))
  {
    if( book[i,1]!=book[j,1])
    {
      break
    }
    if( book[i,1]==book[j,1] & (book[i,5] > 2 ))
    {
      for( k in 7:(5+book[i,5]))
        if(distm(c(book[book[i,k],3], book[book[i,k],2]), c(book[j,3], book[j,2]), fun = distHaversine) < 600)
        {
          S=book[i,5]+book[j,5]
          if (S-k > 0)
          {
            B<- matrix(0,nrow(book),(S-k))
            book <- cbind(book,B)
            book[i,5]=book[i,5]+book[j,5]
          }
          book[i,(6+book[i,5]):((6+book[i,5])+book[j,5])] <- book[j,(6:(5+book[j,5]))]
        }
    }
  }
  for( k in 7:(5+book[i,5]))
  {
    if(i!=book[i,k])
    {
      book[book[i,k],1]=0;
    }
  }
}
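One way a message of this kind can arise, offered only as a guess because the posted loop itself contains no [[ call: when [[ receives an index vector with several elements on a list, R indexes recursively and fails as soon as it reaches an element that is not a list. If book is actually a data frame or list rather than a numeric matrix, an assignment with a multi-element index could end up in that situation. The sketch below uses a made-up list, lst, purely for illustration.

```r
# Hypothetical two-element list, not the asker's data.
lst <- list(1, 2)

# `[[` with a multi-element index is read as lst[[1]][[2]][[3]]:
# R steps into lst[[1]], finds a plain number rather than a list,
# and cannot index any deeper, so it raises an error.
msg <- tryCatch(
  lst[[c(1, 2, 3)]],
  error = function(e) conditionMessage(e)
)
print(msg)

# `[` with a multi-element index simply selects several elements.
sub <- lst[c(1, 2)]
print(length(sub))
```

If this is the cause, checking is.matrix(book) before the loop, and converting with as.matrix(book) where appropriate, is one thing to try.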

Related

Multiplying multiple columns with each other into a new dataframe in R

I want to multiply many of my binary variables with each other to create new columns, so-called interaction variables. My dataset is structured like this:
YearCountry <- data.frame( Time = c("2000","2001", "2002", "2003",
"2000","2001", "2002", "2003",
"2000","2001", "2002", "2003"),
AL = c(1,1,1,1,0,0,0,0,0,0,0,0),
FR = c(0,0,0,0,1,1,1,1,0,0,0,0),
UK = c(0,0,0,0,0,0,0,0,1,1,1,1),
Y2000d = c(1,0,0,0,1,0,0,0,1,0,0,0),
Y2001d = c(0,1,0,0,0,1,0,0,0,1,0,0),
Y2002d = c(0,0,1,0,0,0,1,0,0,0,1,0),
Y2003d = c(0,0,0,1,0,0,0,1,0,0,0,1))
YearCountry
Time AL FR UK Y2000d Y2001d Y2002d Y2003d
1 2000 1 0 0 1 0 0 0
2 2001 1 0 0 0 1 0 0
3 2002 1 0 0 0 0 1 0
4 2003 1 0 0 0 0 0 1
5 2000 0 1 0 1 0 0 0
6 2001 0 1 0 0 1 0 0
7 2002 0 1 0 0 0 1 0
8 2003 0 1 0 0 0 0 1
9 2000 0 0 1 1 0 0 0
10 2001 0 0 1 0 1 0 0
11 2002 0 0 1 0 0 1 0
12 2003 0 0 1 0 0 0 1
I need to multiply the binary variable for each of the countries (AL,FR,UK) with each of the binary variables for a given year so that I get #country x #year new variables. In this case I have three countries and four years which gives 12 new variables. My full data contains 105 countries/regions and stretches over twenty years. I therefore need a general formula. I want data that looks like this
Interact <- data.frame(Time = c("2000","2001", "2002", "2003",
"2000","2001", "2002", "2003",
"2000","2001", "2002", "2003"),
Y2000xAL = c(1,0,0,0,0,0,0,0,0,0,0,0),
Y2001xAL = c(0,1,0,0,0,0,0,0,0,0,0,0),
Y2002xAL = c(0,0,1,0,0,0,0,0,0,0,0,0),
Y2003xAL = c(0,0,0,1,0,0,0,0,0,0,0,0),
Y2000xFR = c(0,0,0,0,1,0,0,0,0,0,0,0),
Y2001xFR = c(0,0,0,0,0,1,0,0,0,0,0,0),
Y2002xFR = c(0,0,0,0,0,0,1,0,0,0,0,0),
Y2003xFR = c(0,0,0,0,0,0,0,1,0,0,0,0),
Y2000xUK = c(0,0,0,0,0,0,0,0,1,0,0,0),
Y2001xUK = c(0,0,0,0,0,0,0,0,0,1,0,0),
Y2002xUK = c(0,0,0,0,0,0,0,0,0,0,1,0),
Y2003xUK = c(0,0,0,0,0,0,0,0,0,0,0,1))
Interact
Time Y2000xAL Y2001xAL Y2002xAL Y2003xAL Y2000xFR Y2001xFR Y2002xFR Y2003xFR Y2000xUK Y2001xUK Y2002xUK Y2003xUK
1 2000 1 0 0 0 0 0 0 0 0 0 0 0
2 2001 0 1 0 0 0 0 0 0 0 0 0 0
3 2002 0 0 1 0 0 0 0 0 0 0 0 0
4 2003 0 0 0 1 0 0 0 0 0 0 0 0
5 2000 0 0 0 0 1 0 0 0 0 0 0 0
6 2001 0 0 0 0 0 1 0 0 0 0 0 0
7 2002 0 0 0 0 0 0 1 0 0 0 0 0
8 2003 0 0 0 0 0 0 0 1 0 0 0 0
9 2000 0 0 0 0 0 0 0 0 1 0 0 0
10 2001 0 0 0 0 0 0 0 0 0 1 0 0
11 2002 0 0 0 0 0 0 0 0 0 0 1 0
12 2003 0 0 0 0 0 0 0 0 0 0 0 1
Here's an approach with dplyr::across. We can make the final result into a plain data.frame with purrr::invoke, as demonstrated in this answer.
library(dplyr)
library(purrr)
library(stringr)  # for str_replace()
YearCountry %>%
  mutate(across(AL:UK, ~ . * select(cur_data(), Y2000d:Y2003d))) %>%
  select(-(Y2000d:Y2003d)) %>%
  invoke(.f = data.frame) %>%
  rename_with(~str_replace(.,"\\.",""))
Time ALY2000d ALY2001d ALY2002d ALY2003d FRY2000d FRY2001d FRY2002d FRY2003d UKY2000d UKY2001d UKY2002d UKY2003d
1 2000 1 0 0 0 0 0 0 0 0 0 0 0
2 2001 0 1 0 0 0 0 0 0 0 0 0 0
3 2002 0 0 1 0 0 0 0 0 0 0 0 0
4 2003 0 0 0 1 0 0 0 0 0 0 0 0
5 2000 0 0 0 0 1 0 0 0 0 0 0 0
6 2001 0 0 0 0 0 1 0 0 0 0 0 0
7 2002 0 0 0 0 0 0 1 0 0 0 0 0
8 2003 0 0 0 0 0 0 0 1 0 0 0 0
9 2000 0 0 0 0 0 0 0 0 1 0 0 0
10 2001 0 0 0 0 0 0 0 0 0 1 0 0
11 2002 0 0 0 0 0 0 0 0 0 0 1 0
12 2003 0 0 0 0 0 0 0 0 0 0 0 1
1) model.matrix We split the names by the number of characters in them (the countries have 2 characters in their names and the years have 6) and paste pluses in each. (Alternately use Plus(grep("^..$", nms, value = TRUE)) to get the country names and use that in place of spl["2"] and similarly Plus(grep("^Y....d$", nms, value = TRUE)) in place of spl["6"].)
c(`2` = "AL+FR+UK", `6` = "Y2000d+Y2001d+Y2002d+Y2003d")
and from that the formula:
~(AL + FR + UK):(Y2000d + Y2001d + Y2002d + Y2003d) + 0
and then compute its model matrix.
The formula could also be expanded to one accepted by lm by modifying the sprintf format, so we might not even need to create the model matrix. For example, if we had a response vector R then we could write: s <- sprintf("R ~ (%s)*(%s)", spl["2"], spl["6"]); fo <- formula(s); lm(fo, YearCountry) to include all variables and the interactions of countries and years as well as an intercept.
Plus <- function(x) paste(x, collapse = "+")
nms <- names(YearCountry)[-1]
spl <- sapply(split(nms, nchar(nms)), Plus)
s <- sprintf("~ (%s):(%s)+0", spl["2"], spl["6"])
fo <- formula(s)
model.matrix(fo, YearCountry)
giving this matrix:
AL:Y2000d AL:Y2001d AL:Y2002d AL:Y2003d FR:Y2000d FR:Y2001d FR:Y2002d FR:Y2003d UK:Y2000d UK:Y2001d UK:Y2002d UK:Y2003d
1 1 0 0 0 0 0 0 0 0 0 0 0
2 0 1 0 0 0 0 0 0 0 0 0 0
3 0 0 1 0 0 0 0 0 0 0 0 0
4 0 0 0 1 0 0 0 0 0 0 0 0
5 0 0 0 0 1 0 0 0 0 0 0 0
6 0 0 0 0 0 1 0 0 0 0 0 0
7 0 0 0 0 0 0 1 0 0 0 0 0
8 0 0 0 0 0 0 0 1 0 0 0 0
9 0 0 0 0 0 0 0 0 1 0 0 0
10 0 0 0 0 0 0 0 0 0 1 0 0
11 0 0 0 0 0 0 0 0 0 0 1 0
12 0 0 0 0 0 0 0 0 0 0 0 1
attr(,"assign")
[1] 1 2 3 4 5 6 7 8 9 10 11 12
Alternately we can write it compactly like this:
Plus <- function(x) paste(x, collapse = "+")
nms <- names(YearCountry)
s <- sprintf("~ (%s):(%s)+0", Plus(nms[2:4]), Plus(nms[5:8]))
fo <- formula(s)
model.matrix(fo, YearCountry)
2) eList Another approach is to use list comprehensions. With the eList package we can do this:
library(eList)
DF(for(i in YearCountry[2:4]) for(j in YearCountry[5:8]) i*j)
giving this data frame. Use as.matrix(...) on it if you want a matrix.
AL.Y2000d AL.Y2001d AL.Y2002d AL.Y2003d FR.Y2000d FR.Y2001d FR.Y2002d FR.Y2003d UK.Y2000d UK.Y2001d UK.Y2002d UK.Y2003d
1 1 0 0 0 0 0 0 0 0 0 0 0
2 0 1 0 0 0 0 0 0 0 0 0 0
3 0 0 1 0 0 0 0 0 0 0 0 0
4 0 0 0 1 0 0 0 0 0 0 0 0
5 0 0 0 0 1 0 0 0 0 0 0 0
6 0 0 0 0 0 1 0 0 0 0 0 0
7 0 0 0 0 0 0 1 0 0 0 0 0
8 0 0 0 0 0 0 0 1 0 0 0 0
9 0 0 0 0 0 0 0 0 1 0 0 0
10 0 0 0 0 0 0 0 0 0 1 0 0
11 0 0 0 0 0 0 0 0 0 0 1 0
12 0 0 0 0 0 0 0 0 0 0 0 1
3) listcompr listcompr is another list comprehension package. Note that the development version of this package is needed in order to use bycol=. Replace gen.named.matrix with gen.named.data.frame if you want a data frame.
# devtools::install_github("patrickroocks/listcompr")
library(listcompr)
nms <- names(YearCountry)
gen.named.matrix("{nms[i]}.{nms[j]}", YearCountry[[i]] * YearCountry[[j]],
i = 2:4, j = 5:8, bycol = TRUE)
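
For comparison, here is a minimal base R sketch of the same idea that needs no extra packages. It assumes, as in the example data, that the year dummies are named like Y2000d and that the country columns are known by name; the helper names countries, years, pairs and inter are made up for illustration. Columns are named YearxCountry to match the layout asked for in the question.

```r
# Same example data as in the question, written compactly.
YearCountry <- data.frame(
  Time   = rep(c("2000", "2001", "2002", "2003"), times = 3),
  AL     = rep(c(1, 0, 0), each = 4),
  FR     = rep(c(0, 1, 0), each = 4),
  UK     = rep(c(0, 0, 1), each = 4),
  Y2000d = rep(c(1, 0, 0, 0), times = 3),
  Y2001d = rep(c(0, 1, 0, 0), times = 3),
  Y2002d = rep(c(0, 0, 1, 0), times = 3),
  Y2003d = rep(c(0, 0, 0, 1), times = 3)
)

countries <- c("AL", "FR", "UK")
years <- grep("^Y[0-9]{4}d$", names(YearCountry), value = TRUE)

# All year/country pairs; the first column (year) varies fastest.
pairs <- expand.grid(year = years, country = countries,
                     stringsAsFactors = FALSE)

# One product column per pair, collected into a matrix.
inter <- mapply(function(y, cn) YearCountry[[y]] * YearCountry[[cn]],
                pairs$year, pairs$country)
colnames(inter) <- paste0(sub("d$", "", pairs$year), "x", pairs$country)

Interact <- data.frame(Time = YearCountry$Time, inter)
print(names(Interact))
```

With 105 countries and twenty years, only the countries vector and the pattern passed to grep would need to change.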

Filling a table with additional columns if they don't exist

I have the following problem. Here is a short example of my data. Assume that I have two data sets (my real case has about 20). The data frames are returned as a list by a self-written function called with lapply, so I put the data frames in my example in a list, too. Then I rbind them to compute a frequency table.
df1 <- data.frame(rev(seq(12:0)), paste0("a=",sample(0:12, 13, replace=T)))
colnames(df1) <- c("k", "a")
df2 <- data.frame(rev(seq(12:0)), paste0("a=",sample(0:12, 13, replace=T)))
colnames(df2) <- c("k", "a")
list_df <- list(df1,df2)
df_combine<- plyr::ldply(list_df, rbind)
freq_foo <- table(df_combine$k,df_combine$a)
I get a frequency table of the following form.
a=0 a=11 a=12 a=2 a=5 a=6 a=7 a=8 a=3 a=9
1 1 0 0 0 0 0 0 1 0 0
2 1 0 0 0 0 0 0 0 0 1
3 1 0 0 0 0 1 0 0 0 0
4 0 0 0 1 0 1 0 0 0 0
5 0 0 0 1 1 0 0 0 0 0
6 0 0 0 0 0 0 1 0 0 1
7 0 1 1 0 0 0 0 0 0 0
8 1 0 0 0 0 1 0 0 0 0
9 0 0 0 0 0 0 2 0 0 0
10 0 0 1 0 1 0 0 0 0 0
11 1 1 0 0 0 0 0 0 0 0
12 0 0 0 0 0 0 1 0 1 0
13 1 0 1 0 0 0 0 0 0 0
I want to extend and manipulate my table in the following way:
First, the table should cover the range a=0 to a=15, so any missing column should be added. Second, I want to order the columns from 0 to 15.
For the first problem I tried
if(freq_foo$paste0("a=",0:15) == F){freq_foo$paste("a=",0:15) <- 0}
but I think this would work only for data frames, not for tables. Also, I have no idea how to sort the columns in ascending order. The data type isn't important to me because I just want to use the output for further calculations, so it can also be a data frame instead of a table.
#convert freq_foo table to dataframe
df <- as.data.frame.matrix(freq_foo)
#add all zeros column for missing column name in 0:15 series
df[, paste0("a=", c(0:15)[!(c(0:15) %in% as.numeric(gsub(".*=(\\d+)", "\\1", names(df))))])] <- 0
#order columns from 0 to 15
df <- df[, order(as.numeric(gsub(".*=(\\d+)", "\\1", names(df))))]
Output is:
a=0 a=1 a=2 a=3 a=4 a=5 a=6 a=7 a=8 a=9 a=10 a=11 a=12 a=13 a=14 a=15
1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0
2 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0
3 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0
5 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0
6 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
7 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0
8 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0
9 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0
10 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
11 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
12 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0
13 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0
(Edit: Updated code after getting a requirement clarification from OP)
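
An alternative sketch, assuming it is acceptable to fix the set of categories before tabulating: converting a to a factor with all sixteen levels makes table() emit every column, in level order, with zeros for values that never occur. The data below are simulated with an arbitrary seed, and all_levels is a made-up helper name.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Simulated stand-in for the combined data frame from the question.
df_combine <- data.frame(
  k = rep(13:1, times = 2),
  a = paste0("a=", sample(0:12, 26, replace = TRUE))
)

# Declare every category up front, in the desired column order.
all_levels <- paste0("a=", 0:15)

# table() keeps unused factor levels, so missing columns appear as zeros.
freq_foo <- table(df_combine$k, factor(df_combine$a, levels = all_levels))
print(colnames(freq_foo))
```

Because the order comes from the levels argument, no separate sorting step is needed.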

how to convert a matrix of values into a binary matrix

I'd like to convert a matrix of values into a matrix of 'bits'.
I have been looking for solutions and found this, which seems to be part of a solution.
I'll try to explain what I am looking for.
I have a matrix like
> x<-matrix(1:20,5,4)
> x
[,1] [,2] [,3] [,4]
[1,] 1 6 11 16
[2,] 2 7 12 17
[3,] 3 8 13 18
[4,] 4 9 14 19
[5,] 5 10 15 20
which I would like to convert into
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
2 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
3 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0
4 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0
5 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
so for each value in the row a "1" in the corresponding column.
If I use
> table(sequence(length(x)),t(x))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
5 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
7 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
9 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
11 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
13 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
14 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
15 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
17 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
18 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
This is close to what I am looking for, but it returns one line per value.
I only need to consolidate all values from one row of x into one row of the result.
A plain
> table(x)
x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
gives the counts for the whole matrix, so what do I need to do to get the values per row?
Here is another option using table() function:
table(row(x), x)
# x
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
# 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
# 2 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
# 3 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0
# 4 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0
# 5 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
bit_x = matrix(0, nrow = nrow(x), ncol = max(x))
for (i in 1:nrow(x)) {bit_x[i,x[i,]] = 1}
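A short sketch of why the loop above works, under the assumption that x holds positive integers: for each row i, the values in x[i, ] are used directly as column positions, so those cells are switched to 1. The check against table(row(x), x) is only there to show that both approaches agree on the example matrix.

```r
x <- matrix(1:20, 5, 4)

# Start from an all-zero matrix with one column per possible value.
bit_x <- matrix(0, nrow = nrow(x), ncol = max(x))

# For row i, the values x[i, ] act as column indices to set to 1.
for (i in seq_len(nrow(x))) {
  bit_x[i, x[i, ]] <- 1
}

# Cross-tabulating row number against value gives the same layout here.
tab <- table(row(x), x)
same <- all(bit_x == tab)
print(same)
```

Note that the loop yields 0/1 indicators, whereas table() yields counts; the two differ when a value is repeated within a row.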
Let
(x <- matrix(c(1, 3), 2, 2))
[,1] [,2]
[1,] 1 1
[2,] 3 3
One approach would be
M <- matrix(0, nrow(x), max(x))
M[cbind(c(row(x)), c(x))] <- 1
M
# [,1] [,2] [,3]
# [1,] 1 0 0
# [2,] 0 0 1
In one line:
replace(matrix(0, nrow(x), max(x)), cbind(c(row(x)), c(x)), 1)
Following your approach, and similarly to @Psidom's suggestion:
table(rep(1:nrow(x), ncol(x)), x)
# x
# 1 3
# 1 2 0
# 2 0 2
We can use the reshape2 package.
library(reshape2)
# At first we make the matrix you provided
x <- matrix(1:20, 5, 4)
# then melt it based on first column
da <- melt(x, id.var = 1)
# then cast it
dat <- dcast(da, Var1 ~ value, fill = 0, fun.aggregate = length)
which gives us this
Var1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
2 2 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
3 3 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0
4 4 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0
5 5 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1

Losing observations when I use reshape in R

I have this data set:
> head(pain_subset2, n= 50)
PatientID RSE SE SECODE
1 1001-01 0 0 0
2 1001-01 0 0 0
3 1001-02 0 0 0
4 1001-02 0 0 0
5 1002-01 0 0 0
6 1002-01 1 2a 1
7 1002-02 0 0 0
8 1002-02 0 0 0
9 1002-02 0 0 0
10 1002-03 0 0 0
11 1002-03 0 0 0
12 1002-03 1 1 1
> dim(pain_subset2)
[1] 817 4
> table(pain_subset2$RSE)
0 1
788 29
> table(pain_subset2$SE)
0 1 2a 2b 3 4 5
788 7 5 1 6 4 6
> table(pain_subset2$SECODE)
0 1
788 29
I want to create an n x 6 matrix (n: number of PatientIDs; columns: 6 levels of SE).
When I use reshape, I lose many observations:
> dim(p)
[1] 246 9
My code:
p <- reshape(pain_subset2, timevar = "SE", idvar = c("PatientID","RSE"),v.names = "SECODE", direction = "wide")
p[is.na(p)] <- 0
> table(p$RSE)
0 1
226 20
Compared with the original table of RSE, I have lost 9 of the entries with RSE = 1.
This is the output I get:
PatientID RSE SECODE.0 SECODE.2a SECODE.1 SECODE.5 SECODE.3 SECODE.2b SECODE.4
1 1001-01 0 0 0 0 0 0 0 0
3 1001-02 0 0 0 0 0 0 0 0
5 1002-01 0 0 0 0 0 0 0 0
6 1002-01 1 0 1 0 0 0 0 0
7 1002-02 0 0 0 0 0 0 0 0
10 1002-03 0 0 0 0 0 0 0 0
12 1002-03 1 0 0 1 0 0 0 0
13 1002-04 0 0 0 0 0 0 0 0
15 1003-01 0 0 0 0 0 0 0 0
18 1003-02 0 0 0 0 0 0 0 0
21 1003-03 0 0 0 0 0 0 0 0
24 1003-04 0 0 0 0 0 0 0 0
27 1003-05 0 0 0 0 0 0 0 0
30 1003-06 0 0 0 0 0 0 0 0
32 1003-07 0 0 0 0 0 0 0 0
35 1004-01 0 0 0 0 0 0 0 0
36 1004-01 1 0 0 0 1 0 0 0
40 1004-02a 0 0 0 0 0 0 0 0
Does anyone know what is happening? I would really appreciate any help.
Thanks.
Try:
library(dplyr)
library(tidyr)
pain_subset2 %>%
spread(SE, SECODE)
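
A possible explanation, hedged because the full data are not shown: reshape with direction = "wide" returns one row per unique combination of the idvar columns, so 817 long rows collapsing to 246 is expected behaviour rather than lost data, and a patient with several RSE = 1 rows contributes only one wide row. If the goal is one row per patient with an indicator per SE level, a cross-tabulation is one alternative. The small data frame pain below is invented for illustration and is not the asker's data.

```r
# Invented long-format data: several rows per patient.
pain <- data.frame(
  PatientID = c("1001-01", "1001-01", "1002-01", "1002-01",
                "1002-03", "1002-03", "1002-03"),
  SE        = c("0", "0", "0", "2a", "0", "1", "2a"),
  stringsAsFactors = FALSE
)

# Count how often each SE level occurs for each patient.
counts <- table(pain$PatientID, pain$SE)

# Turn counts into 0/1 indicators and drop the "no side effect" level.
indicator <- (counts > 0) * 1
indicator <- indicator[, colnames(indicator) != "0", drop = FALSE]
print(indicator)
```

Every patient keeps exactly one row, and a patient with two different side effects has a 1 in both columns.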

Using lapply and else if

Using R, I have a table, let's say 'locations':
head(locations, n=10)
apillar fender fwheel fdoor compart rdoor rwheel boot
1 0 0 0 0 0 0 0 1
2 0 0 0 1 0 0 0 0
3 0 0 0 0 1 0 0 0
4 0 1 0 0 0 0 0 0
5 1 0 1 0 0 0 0 0
6 1 0 0 1 0 0 0 0
7 0 0 0 0 0 0 0 0
8 0 0 0 0 1 0 0 0
9 0 0 0 1 0 0 0 0
10 0 0 0 0 0 1 0 0
Now I want to create a new variable "cat" that groups the impacts into location categories.
I have been using if, else if and else, but I cannot get it to work.
The command is:
cat <- lapply(locations, function(x) if (apillar|fender|fwheel == 1)print("front") else if (fdoor|compart|rdoor == 1)print("middle") else if(rwheel|boot ==1)print("rear") else print("NA")
so that cat should read rear, middle, middle, middle, front, etc.
When vectors of TRUE or FALSE statements are involved, I usually prefer not to work with if to avoid loops. I find conditional referencing to be more elegant in this case. See below.
locations <- read.table(header=TRUE, text=
"apillar fender fwheel fdoor compart rdoor rwheel boot
1 0 0 0 0 0 0 0 1
2 0 0 0 1 0 0 0 0
3 0 0 0 0 1 0 0 0
4 0 1 0 0 0 0 0 0
5 1 0 1 0 0 0 0 0
6 1 0 0 1 0 0 0 0
7 0 0 0 0 0 0 0 0
8 0 0 0 0 1 0 0 0
9 0 0 0 1 0 0 0 0
10 0 0 0 0 0 1 0 0")
locations$cat <- NA
within(locations,{
cat[apillar|fender|fwheel] <- "front"
cat[fdoor|compart|rdoor] <- "middle"
cat[rwheel|boot] <- "rear"
})
Result:
apillar fender fwheel fdoor compart rdoor rwheel boot cat
1 0 0 0 0 0 0 0 1 rear
2 0 0 0 1 0 0 0 0 middle
3 0 0 0 0 1 0 0 0 middle
4 0 1 0 0 0 0 0 0 front
5 1 0 1 0 0 0 0 0 front
6 1 0 0 1 0 0 0 0 middle
7 0 0 0 0 0 0 0 0 <NA>
8 0 0 0 0 1 0 0 0 middle
9 0 0 0 1 0 0 0 0 middle
10 0 0 0 0 0 1 0 0 middle
Cheers!
Corrected your own code:
locations$cat= with(locations, ifelse(apillar|fender|fwheel, "front", ifelse(fdoor|compart|rdoor,"middle",ifelse(rwheel|boot, "rear", "NA"))) )
> locations
apillar fender fwheel fdoor compart rdoor rwheel boot cat
1 0 0 0 0 0 0 0 1 rear
2 0 0 0 1 0 0 0 0 middle
3 0 0 0 0 1 0 0 0 middle
4 0 1 0 0 0 0 0 0 front
5 1 0 1 0 0 0 0 0 front
6 1 0 0 1 0 0 0 0 front
7 0 0 0 0 0 0 0 0 NA
8 0 0 0 0 1 0 0 0 middle
9 0 0 0 1 0 0 0 0 middle
10 0 0 0 0 0 1 0 0 middle
>
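The two answers above disagree on row 6, where both apillar and fdoor are 1. The sketch below, on a reduced made-up data frame loc, illustrates the reason: with sequential logical-index assignment a later category overwrites an earlier one, whereas with nested ifelse the first matching condition wins.

```r
# Reduced, made-up example: row 1 matches both "front" and "middle".
loc <- data.frame(
  apillar = c(1, 0, 0),
  fdoor   = c(1, 1, 0),
  boot    = c(0, 0, 1)
)

# Sequential assignment: each later line overwrites earlier matches.
cat_seq <- rep(NA_character_, nrow(loc))
cat_seq[loc$apillar == 1] <- "front"
cat_seq[loc$fdoor == 1]   <- "middle"
cat_seq[loc$boot == 1]    <- "rear"

# Nested ifelse: conditions are checked in order, first match wins.
cat_first <- with(loc,
  ifelse(apillar == 1, "front",
         ifelse(fdoor == 1, "middle",
                ifelse(boot == 1, "rear", NA_character_))))

print(cat_seq)    # "middle" "middle" "rear"
print(cat_first)  # "front"  "middle" "rear"
```

Which behaviour is right depends on how an impact spanning two zones should be classified; reordering the assignments changes the first variant's priority.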
