I'm trying to take three different variables representing partisanship and combine them into one. The data looks like this, where each respondent has data on only one of the three variables as either a 1 or 2:
PARTISANSHIP_D PARTISANSHIP_I PARTISANSHIP_R
1 NA NA
2 NA NA
NA 1 NA
What I'm trying to create is one variable on a 1:6 scale based on the responses to all three. I've tried to do this using dplyr:
survey$partisan <- mutate(survey, partisan = ifelse(PARTISANSHIP_D==1, 6,
ifelse(PARTISANSHIP_D==2, 5,
ifelse(PARTISANSHIP_I==1, 4, ifelse(PARTISANSHIP_I==2, 3, ifelse(
PARTISANSHIP_R==2, 2, 1)
)))))
the car package:
survey$partisan <- Recode(survey$PARTISANSHIP_D, "1=6; 2=5",
survey$PARTISANSHIP_I, "1=4; 2=3",
survey$PARTISANSHIP_R, "1=1; 2=2")
and plain ifelse commands like this:
survey$partisan <- ifelse(survey$PARTISANSHIP_D == 1, 6,
ifelse(survey$PARTISANSHIP_D == 2, 5,
ifelse(survey$PARTISANSHIP_I == 1, 4,
ifelse(survey$PARTISANSHIP_I == 2, 3,
ifelse(survey$PARTISANSHIP_R == 2, 2, 1)))))
But none of these is working. Any pointers on what I'm doing wrong?
I got your mutate to work by doing a couple of things: change the NA in your survey dataframe to 0:
survey[is.na(survey)]<-0
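This is needed because a comparison such as NA == 1 evaluates to NA, and ifelse returns NA wherever its test is NA instead of moving on to the nested ifelse.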
And don't assign the mutate result to survey$partisan. Rather, assign it to the whole dataframe:
survey <- mutate(survey, partisan = ifelse(PARTISANSHIP_D==1, 6,
ifelse(PARTISANSHIP_D==2, 5,
ifelse(PARTISANSHIP_I==1, 4, ifelse(PARTISANSHIP_I==2, 3, ifelse(
PARTISANSHIP_R==2, 2, 1)
)))))
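If you would rather keep the NAs in the data, one alternative (not what the answer above used) is dplyr's case_when, which treats an NA condition as not matched and moves on to the next one. This is a minimal sketch; the six-row survey data frame below is made up for illustration, extending the three rows shown in the question.

```r
library(dplyr)

# Hypothetical sample data: each respondent answers exactly one of the three items
survey <- data.frame(
  PARTISANSHIP_D = c(1, 2, NA, NA, NA, NA),
  PARTISANSHIP_I = c(NA, NA, 1, 2, NA, NA),
  PARTISANSHIP_R = c(NA, NA, NA, NA, 2, 1)
)

# case_when checks the conditions in order; an NA condition counts as "no match",
# so the NAs do not need to be recoded to 0 first
survey <- survey %>%
  mutate(partisan = case_when(
    PARTISANSHIP_D == 1 ~ 6,
    PARTISANSHIP_D == 2 ~ 5,
    PARTISANSHIP_I == 1 ~ 4,
    PARTISANSHIP_I == 2 ~ 3,
    PARTISANSHIP_R == 2 ~ 2,
    PARTISANSHIP_R == 1 ~ 1
  ))

survey$partisan
```

Rows that match none of the conditions end up as NA rather than falling through to a default of 1, which may or may not be what you want.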
You are looking to pivot and reshape into a tidy format.
Try this (gather comes from tidyr, not dplyr):
library(tidyr)
tidysurvey <- gather(survey, ## the source DF
key = Partisanship, ## a name for the new key variable
value = Code, ## a name for the new values variable
PARTISANSHIP_D:PARTISANSHIP_R) ## the columns of the source DF to reshape
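Since every respondent has a value in only one of the three columns, the reshaped data will mostly be NA rows. A small sketch of how that could be handled, assuming a made-up id column (not in the original data) to keep track of respondents, and using gather's na.rm argument:

```r
library(tidyr)

# Hypothetical sample data; the id column is an assumption added for illustration
survey <- data.frame(
  id = 1:3,
  PARTISANSHIP_D = c(1, 2, NA),
  PARTISANSHIP_I = c(NA, NA, 1),
  PARTISANSHIP_R = c(NA_real_, NA_real_, NA_real_)
)

# na.rm = TRUE drops the rows where the gathered value is NA,
# leaving one row per respondent
tidysurvey <- gather(survey,
                     key = Partisanship,
                     value = Code,
                     PARTISANSHIP_D:PARTISANSHIP_R,
                     na.rm = TRUE)

tidysurvey
```

From there the 1:6 scale can be derived from the Partisanship and Code columns together.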
I have a dataset with these three columns and other additional columns
structure(list(from = c(1, 8, 3, 3, 8, 1, 4, 5, 8, 3, 1, 8, 4,
1), to = c(8, 3, 8, 54, 3, 4, 1, 6, 7, 1, 4, 3, 8, 8), time = c(1521823032,
1521827196, 1521827196, 1522678358, 1522701516, 1522701993, 1522702123,
1522769399, 1522780956, 1522794468, 1522794468, 1522794468, 1522794468,
1522859524)), class = "data.frame", row.names = c(NA, -14L))
I need the code to take all indices less than a number (e.g. 5) and for each of them do the following: Subset the data set if the index is either in column "from" or in column "to" and calculate a function (e.g the difference between the min and max in time). As a result I expect a dataframe with the indexes and the results of the calculation.
This is what I have, but it does not work.
dur<-function(x)max(x)-min(x) #The function to calculate the difference. In other cases I need to use other functions of my own
filternumber <- function(number,x){ #A function to filter data x by the number in the two columns
x <- x%>% subset(from == number | to == number)
return(x)
}
lista <- unique(c(data$from, data$to)) # Creates a list with all the indexes in the data. I do this to avoid having non-existing indexes
lista <-lista[lista <= 5] #Limit the list to 5. In my code this number would be an argument to a function
result <- lista %>% filternumber(., data) %>% select(time) %>% dur() #I use select because I have many other columns in the data
The result in this case should be a dataframe with 1036492 for 1, 967272 for 3 and 92475 for 4
I've also tried putting filternumber(., data) %>% select(time) %>% dur() inside mutate, but that does not work either.
Perhaps you are looking for something like this:
library(purrr)
library(dplyr)
index <- c(1, 3, 4)
names(index) <- index
index %>%
map_dfr(~ df %>%
filter(from == .x | to == .x) %>%
summarize(result = dur(time)),
.id = "index")
This returns
index result
1 1 1036492
2 3 967272
3 4 92475
filternumber compares with ==, which is elementwise, so passing the whole vector lista at once does not filter per index. Instead, we can loop over the indices:
library(dplyr)
library(purrr)
map_dbl(lista, ~ filternumber(.x, data) %>%
select(time) %>%
dur)
[1] 1036492 967272 92475 0
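For comparison, the same per-index calculation can be sketched in base R only, returning the data frame of indexes and results that the question asks for. The name data for the data frame follows the question; sorting the indexes is an added assumption for readability.

```r
# The example data from the question
data <- structure(list(
  from = c(1, 8, 3, 3, 8, 1, 4, 5, 8, 3, 1, 8, 4, 1),
  to   = c(8, 3, 8, 54, 3, 4, 1, 6, 7, 1, 4, 3, 8, 8),
  time = c(1521823032, 1521827196, 1521827196, 1522678358, 1522701516,
           1522701993, 1522702123, 1522769399, 1522780956, 1522794468,
           1522794468, 1522794468, 1522794468, 1522859524)),
  class = "data.frame", row.names = c(NA, -14L))

dur <- function(x) max(x) - min(x)

# All indexes that actually occur in either column, limited to 5
idx <- sort(unique(c(data$from, data$to)))
idx <- idx[idx <= 5]

# For each index, keep the rows where it appears in from or to, then apply dur
result <- data.frame(
  index  = idx,
  result = vapply(idx,
                  function(i) dur(data$time[data$from == i | data$to == i]),
                  numeric(1))
)

result
```

Index 5 appears in a single row, so its duration is 0, matching the fourth value in the output above.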
Hope you have a nice day.
Today I was trying to split one big column into two smaller ones in R. However, I haven't found a way to do it.
I have something like this (however, it is way bigger)
name3 <- c(1, 2, 3, 4, 5, 6)
df1 <- data.frame(name3)
print(df1)
I want to do something like this. My intention is just to take the total number of values and divide them into two equal groups.
name <- c(1, 2, 3)
name1 <- c(4, 5, 6)
df <- data.frame(name, name1)
print (df)
Thanks in advance!
One way to do it: first write the vector as a matrix in which you specify the number of columns,
then transform the matrix into a dataframe.
From the dataframe you can convert each column to a vector.
This is how I did it:
name3 <- c(1, 2, 3, 4, 5, 6)
df <- as.data.frame(matrix(name3, ncol = 2))
name1 <- df$V1
name2 <- df$V2
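The matrix approach relies on the vector having an even length. A minimal sketch of the same first-half/second-half split with head and tail, with that assumption checked explicitly (the column names name and name1 follow the question):

```r
name3 <- c(1, 2, 3, 4, 5, 6)

# Assumption: the vector can be divided into two equal groups
stopifnot(length(name3) %% 2 == 0)

half <- length(name3) / 2

# First half goes into one column, second half into the other
df <- data.frame(name  = head(name3, half),
                 name1 = tail(name3, half))

print(df)
```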
Staying as close to base R as possible, this would be my method if the order of the sub-vectors doesn't matter:
# simple function to test whether a number is even
is.even <- function(x) x %% 2 == 0
# define my vector of values
name3 <- c(1, 2, 3, 4, 5, 6)
# split the vector by even or odd index; seq_along() is base R, so no extra package is needed
split(name3, f = is.even(seq_along(name3)))
Result:
$`FALSE`
[1] 1 3 5
$`TRUE`
[1] 2 4 6
I have two vectors and I need to find out the unique elements in both, together.
I tried doing length(summary(merge(v1, v2))) but summary aggregates a bunch of my dataset because there is only one of those entries, so I get an incorrect length.
E.g.:
list_1 <- c(1,2,3,4,5,5,6,1,2,3)
list_2 <- c(2,3,4,5,10,11,10)
and the outcome should be
1,2,3,4,5,6,10,11
P.S. bonus points if you can return all the unique elements in a vector... :-)
It sounds like you're looking for union:
> union(v1, v2)
[1] 1 2 3 4 5 6 10 11
here is my solution.
p1 <- c(1, 4, 1, 1, 4, 5, 6, 7, 8)
p2 <- c(3, 4, 1, 6, 90, 10, 32)
unique(c(p1, p2))
If a and b are lists rather than plain vectors, you can wrap union in unlist to get a vector back:
unlist(union(a,b))
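Putting the pieces together with the example vectors from the question, a short sketch that returns both the unique elements and their count:

```r
list_1 <- c(1, 2, 3, 4, 5, 5, 6, 1, 2, 3)
list_2 <- c(2, 3, 4, 5, 10, 11, 10)

# union() drops duplicates within and across both vectors
all_unique <- union(list_1, list_2)

# The number of distinct values across both vectors
n_unique <- length(all_unique)

all_unique
n_unique
```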
I have an R dataframe that contains 18 columns. I would like to write a function that compares column 1 to column 2 and, depending on whether both columns contain the same value, writes a logical result of T or F to a new column (this part is not too hard for me). However, I would like to repeat this process for each following pair of columns and write T/F to a new column each time.
values col 1 = values col 2, write T/F to new column, values col 3 = values col 4, write T/F to a new column (or write results to a new dataframe)
I have been trying to do this with the purrr package, and use the pmap/map function, but I know I am making a mistake and missing some important part.
This function should work if I understand your problem correctly.
df <-
data.frame(a = c(18, 6, 2 ,0),
b = c(0, 6, 2, 18),
c = c(1, 5, 6, 8),
d = c(3, 5, 9, 2))
compare_columns <-
function(x){
n_columns <- ncol(x)
odd_columns <- 2*1:(n_columns/2) - 1
even_columns <- 2*1:(n_columns/2)
comparisons_list <-
lapply(seq_len(n_columns/2),
function(y){
x[, odd_columns[y]] == x[, even_columns[y]]  # use the argument x, not the global df
})
comparisons_df <-
as.data.frame(comparisons_list,
col.names = paste0("column", odd_columns, "_column", even_columns))
return(cbind(x, comparisons_df))
}
compare_columns(df)
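A more compact variant of the same idea can be sketched with mapply, which walks over the odd and even columns in parallel. The "_eq_" naming of the new columns is an arbitrary choice for this illustration, and an even number of columns is assumed.

```r
df <- data.frame(a = c(18, 6, 2, 0),
                 b = c(0, 6, 2, 18),
                 c = c(1, 5, 6, 8),
                 d = c(3, 5, 9, 2))

# Positions of the left and right column of each pair
odd  <- seq(1, ncol(df), by = 2)
even <- seq(2, ncol(df), by = 2)

# Compare each pair elementwise; the result is a logical matrix
# with one column per pair
same <- mapply(function(left, right) left == right, df[odd], df[even])
colnames(same) <- paste0(names(df)[odd], "_eq_", names(df)[even])

result <- cbind(df, same)
result
```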
I want to multiply two data.frames that are of unequal length
If I have a data frame of observations (in reality this is around 30000 entries long)
Species number
1 3
1 3
3 5
4 40
5 22
and another data frame with conversion ratios for each species present in the first data frame (this is only about 120 entries in length)
species conversion ratio
1 3
2 5
3 4
4 2
5 2
and I want to multiply each number column entry by the conversion ratio entry associated with that Species, how might I go about doing this in R?
I've attempted using the match function to no avail, and my attempts at working with arrays have only resulted in errors, as well.
See ?merge. Assuming the species column has the same name in both data frames (including capitalization):
df3 <- merge(df1,df2)
df3$number*df3$conversion.ratio
You could merge the two data frames.
## Your example data
df.number <- matrix(c(1, 1, 3, 4, 5, 3, 3, 5, 40, 22), ncol = 2)
colnames(df.number) <- c("species", "number")
df.ratio <- matrix(c(1, 2, 3, 4, 5, 3, 5, 4, 2, 2), ncol = 2)
colnames(df.ratio) <- c("species", "ratio")
## Merge the two matrices
dat <- merge(df.number, df.ratio, by = "species")
## Multiply for your result
result <- with(dat, number * ratio)
Edit
@Frank: In your comment to James, you say that the resulting data frame after the merge is too long. Do you mean that you want to remove duplicated rows? If so:
dat2 <- subset(dat, subset = !duplicated(dat))
result2 <- with(dat2, number * ratio)
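Since the question mentions trying match, here is a sketch of how that route could look. It keeps the observations in their original order and length, so no rows are added or dropped. The data frame and column names below are assumptions based on the example tables.

```r
# Observations (example values from the question)
obs <- data.frame(species = c(1, 1, 3, 4, 5),
                  number  = c(3, 3, 5, 40, 22))

# Conversion ratios, one row per species
ratios <- data.frame(species = c(1, 2, 3, 4, 5),
                     ratio   = c(3, 5, 4, 2, 2))

# match() gives, for each observation, the row of ratios with the same species
obs$converted <- obs$number * ratios$ratio[match(obs$species, ratios$species)]

obs
```

A species missing from the ratio table would give NA for that row instead of silently disappearing, as it would with a default merge.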