I’m having trouble putting two conditions into a subset. The result is a whole bunch of NA.
> df[(df$col > 0) && (df$col < 4), ]
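You only need one '&'. '&&' looks at a single TRUE/FALSE value, while '&' compares the two conditions element by element, which is what row subsetting needs. (The space after the ',' makes no difference.)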
df[df$col > 0 & df$col < 4,]
If you still get NA rows, check whether col itself contains NA (an NA in the condition gives a row of NA), or whether you actually want OR (|) instead of AND (&).
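To see the difference on a small example, here is a minimal sketch. df and col are the names from the question; the values and the id column are invented for illustration. It also shows one way NA rows can appear, and how which() avoids them.

```r
# Toy data; the values are invented for illustration
df <- data.frame(col = c(-1, 2, NA, 3, 5), id = 1:5)

# Element-wise AND gives one TRUE/FALSE/NA per row
keep <- df$col > 0 & df$col < 4

# An NA in the logical index yields a row filled with NA
with_na <- df[keep, ]

# which() keeps only the positions that are TRUE, dropping the NA
without_na <- df[which(keep), ]
```

Whether dropping the NA rows is appropriate depends on what a missing col means in your data.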
How could I identify a column in an R dataframe using a variable? In the following code, I used paste0 to identify a column with a variable. Is there any alternative?
if ((leadsnp4[[paste0('Z_in_',trait1)]] > 0) & (leadsnp4[[paste0('Z_in_',trait2)]] > 0))
{leadsnp4$ConcordEffect='Yes'} else if ((leadsnp4[[paste0('Z_in_',trait1)]] < 0) & (leadsnp4[[paste0('Z_in_',trait2)]] < 0))
{leadsnp4$ConcordEffect='Yes'} else if ((leadsnp4[[paste0('Z_in_',trait1)]] > 0) & (leadsnp4[[paste0('Z_in_',trait2)]] < 0))
{leadsnp4$ConcordEffect='No'} else if ((leadsnp4[[paste0('Z_in_',trait1)]] < 0) & (leadsnp4[[paste0('Z_in_',trait2)]] > 0))
{leadsnp4$ConcordEffect='No'}
leadsnp4 is a dataframe. trait1 and trait2 are user-defined variables. The above code gives me the warning "The condition has length > 1 and only the first element will be used", and I am also not getting the expected output.
Not sure what is wrong here. Maybe there are other alternatives for the above if else statements. Any help?
The way you're selecting columns is fine. Using df[[col_name]] (list context) is the same as df[, col_name] -- each returns a vector copy of column col_name. You can save the column name as a variable instead of using paste0 directly in the selection.
The reason you're getting the warning is that if is not vectorized and you're giving it a vector with length > 1. In this case, if uses only the first value in the vector, but warns that it's doing so. ifelse is the vectorized version in base R (there's also dplyr::if_else). If I understand your code, the below should be close to what you're looking for.
t1 <- paste0('Z_in_', trait1)
t2 <- paste0('Z_in_', trait2)
# a single boolean vector indicating if trait1 and trait2 are
# both positive or both negative
same_sign <- ((leadsnp4[, t1] > 0) & (leadsnp4[, t2] > 0)) |
((leadsnp4[, t1] < 0) & (leadsnp4[, t2] < 0))
leadsnp4$ConcordEffect <- ifelse(same_sign, "Yes", "No")
Note that rows where trait1 and/or trait2 equal 0 will be assigned "No". You'll need to modify the logic if this is not the desired behavior.
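As a rough illustration of the same idea, here is a self-contained sketch on made-up data. The trait names and Z values are invented, and using sign() is a swapped-in shortcut for the two-sided comparison above, not something from the original code.

```r
# Toy stand-in for leadsnp4; trait names and Z values are invented
trait1 <- "BMI"
trait2 <- "T2D"
leadsnp4 <- data.frame(Z_in_BMI = c(2.1, -1.5, 3.0, 0, -0.7),
                       Z_in_T2D = c(1.2, -2.2, -0.4, 1.0, 0.9))

t1 <- paste0("Z_in_", trait1)
t2 <- paste0("Z_in_", trait2)

# sign() returns -1, 0 or 1, so the product is 1 only when both
# values are non-zero and share a sign
same_sign <- sign(leadsnp4[[t1]]) * sign(leadsnp4[[t2]]) == 1
leadsnp4$ConcordEffect <- ifelse(same_sign, "Yes", "No")
```

As with the version above, a zero in either column ends up as "No".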
Here is an explanation for why pasting will not work for creating a column reference and one suggestion for what you can do instead: Dynamically select data frame columns using $ and a character value
I have the following R statement. Basically it goes through the entire matchesData data frame and checks if the conditions are matched for each row.
If it matches, put a '1' at matchesData$isRedPreferredLineup.
matchesData$isRedPreferredLineup <- ifelse((matchesData$redTop==red_poplist[1] &
matchesData$redADC==red_poplist[2] &
matchesData$redJungle==red_poplist[3] &
matchesData$redSupport==red_poplist[4] &
matchesData$redMiddle==red_poplist[5] &
matchesData$YearSeason==Season), 1,
matchesData$isRedPreferredLineup)
However, now I need the condition to be flexible. Meaning, if
matchesData$redTop==red_poplist[1]
matchesData$redADC==red_poplist[2]
matchesData$redJungle==red_poplist[3]
conditions are matched, or if
matchesData$redJungle==red_poplist[3]
matchesData$redSupport==red_poplist[4]
matchesData$redMiddle==red_poplist[5]
conditions are matched, or any other permutation comprising 3 or more of the following conditions are matched, I would like to put '1' at matchesData$isRedPreferredLineup.
(matchesData$redTop==red_poplist[1] &
matchesData$redADC==red_poplist[2] &
matchesData$redJungle==red_poplist[3] &
matchesData$redSupport==red_poplist[4] &
matchesData$redMiddle==red_poplist[5] &
matchesData$YearSeason==Season)
How can I do so in a vectorized ifelse statement like this?
Or is there a better way to do this?
Please bear with me, I am pretty new to R. Thanks.
Maybe this could work:
selectIndex <- apply(matchesData,1,function(row){
# count how many of the six conditions hold for this row;
# the comparison with 3 goes outside sum()
sum(c(row['redTop'] == red_poplist[1],
row['redADC'] == red_poplist[2],
row['redJungle'] == red_poplist[3],
row['redSupport'] == red_poplist[4],
row['redMiddle'] == red_poplist[5],
row['YearSeason'] == Season)) >= 3
})
matchesData$isRedPreferredLineup[selectIndex] <- 1
You could vectorise the TRUE/FALSE statements like this:
my.conditions <- cbind(matchesData$redTop==red_poplist[1], matchesData$redADC==red_poplist[2],
matchesData$redJungle==red_poplist[3], matchesData$redSupport==red_poplist[4],
matchesData$redMiddle==red_poplist[5], matchesData$YearSeason==Season)
Then S1 <- rowSums(my.conditions) gives you the number of TRUEs in each row of my.conditions. Your final condition boils down to ifelse(S1 > 2, 1, ...), or equivalently:
matchesData$isRedPreferredLineup[which(S1 > 2)] <- 1
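Here is a small runnable sketch of the rowSums idea. The data are invented, and the roles vector and the mapply call are just one assumed way of building the condition matrix without typing each comparison by hand.

```r
# Toy stand-ins; the champion names and seasons are invented
red_poplist <- c("Garen", "Jinx", "LeeSin", "Thresh", "Ahri")
Season <- "2016Spring"
matchesData <- data.frame(
  redTop     = c("Garen", "Garen", "Darius"),
  redADC     = c("Jinx", "Ashe", "Ashe"),
  redJungle  = c("LeeSin", "LeeSin", "Elise"),
  redSupport = c("Thresh", "Lulu", "Thresh"),
  redMiddle  = c("Ahri", "Zed", "Zed"),
  YearSeason = c("2016Spring", "2016Spring", "2015Summer"),
  isRedPreferredLineup = 0,
  stringsAsFactors = FALSE
)

# Compare the five role columns to red_poplist, column by column
roles <- c("redTop", "redADC", "redJungle", "redSupport", "redMiddle")
hits <- mapply(function(column, wanted) column == wanted,
               matchesData[roles], red_poplist)
hits <- cbind(hits, matchesData$YearSeason == Season)

# Count TRUEs per row and flag rows with three or more
matchesData$isRedPreferredLineup[rowSums(hits) >= 3] <- 1
```

If any of the compared columns can contain NA, rowSums(hits, na.rm = TRUE) may be the safer call.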
I'm currently researching a matching-to-sample task in monkeys. I want to evaluate how often a certain stimulus was chosen, regardless of correctness of the choice.
To do so, I have a dataframe df with 6288 rows and 6 columns ("Monkey", "Session", "Sample", "Match", "Foil", "Success"), of which only the last three are important now.
The data in df$Match and df$Foil are the names of the stimuli (string) and df$Success is binary. df$Match and df$Foil are made up of 65 distinct stimuli names, which I included in a vector Match.Foil.
Now I want to count how often a picture (part of the vector Match.Foil) is clicked in all 6288 trials. That is, every time the name is either in df$Match with df$Success == "1" OR in df$Foil with df$Success == "0".
I tried to build a vector with the number of times clicked for each part of Match.Foil like this:
Pic.clicked= vector(mode="numeric", length= length(Match.Foil))
for (i in 1:length(Match.Foil)){
Pic.clicked[i] = ifelse(
(df$Match == Match.Foil[i] & df$Success == "1")|
(df$Foil== Match.Foil[i] & df$Success == "0"),
Pic.clicked[i] +1,
Pic.clicked[i] +0)
}
So, as you see, I wanted to use the expressions Pic.clicked + 1 and Pic.clicked + 0 as the return values when the statement is TRUE or FALSE. It does not work and gives me the warning:
In Pic.clicked[i] = ifelse((df$Match == Match.Foil[i] & ... :
number of items to replace is not a multiple of replacement length
Does anybody have an idea, how to build an appropriate counter? I thought about using switch, but I don't have any experience with that function and it seems not to work like I need it. I also tried running it for 6288 loops, but that produces the same warning.
You can use sum(), which counts each TRUE in a logical vector as 1:
for (i in seq_along(Match.Foil)) {
Pic.clicked[i] <- sum((df$Match == Match.Foil[i] & df$Success == "1") |
(df$Foil == Match.Foil[i] & df$Success == "0"))
}
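The loop can also be avoided altogether. The sketch below is one possible alternative on invented data: it first works out which picture was clicked in each trial and then tabulates. It assumes Success only ever takes the values "1" and "0".

```r
# Toy stand-in for df; stimulus names and outcomes are invented
df <- data.frame(Match   = c("A", "B", "A", "C"),
                 Foil    = c("B", "C", "C", "A"),
                 Success = c("1", "0", "0", "1"),
                 stringsAsFactors = FALSE)
Match.Foil <- c("A", "B", "C")

# The clicked picture is the match on success, the foil otherwise
clicked <- ifelse(df$Success == "1", df$Match, df$Foil)

# factor() with explicit levels keeps stimuli that were never clicked
Pic.clicked <- table(factor(clicked, levels = Match.Foil))
```

Pic.clicked is then a named table with one count per stimulus, in the order of Match.Foil.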
I am trying to get my head around daa.R, one of the functions in the matchingMarkets R library (links are to GitHub repositories). On lines 134-135, one finds the following if statement
if (0 %in% (c.hist[[j]] & any(c.prefs[ ,j]==proposers[k]))){ # if no history and proposer is on preference list
c.hist[[j]][c.hist[[j]]==0][1] <- proposers[k] # then accept
}
where c.hist and proposers are a list and c.prefs a matrix.
I am puzzled by the parentheses in the conditional statement. Instead of the above syntax, I would have opted for
if (0 %in% c.hist[[j]] & any(c.prefs[ ,j]==proposers[k]))
I don't understand how the original condition may work. How could R possibly check whether 0 is in (c.hist[[j]] & any(c.prefs[ ,j]==proposers[k]))?
I am a beginner in R, so I wanted to make sure I was not missing something and tried to replicate a similar syntax with other conditions such as,
> x = list(4,3)
> y = list(5,2)
> if (3 %in% (x & any(y == 5))){z = 8}
As I expected, I got an error message
Error in x & any(y == 5) : operations are possible only for numeric, logical or complex types
whereas things go just fine when I write
if (3 %in% x & any(y == 5)){z = 8}
instead.
What am I missing? Why would the kind of conditional syntax I am puzzled by work in daa.R and not with the other conditions I tried?
When you ask R whether 0 %in% x where x is a logical vector, R first converts x to a numeric vector where FALSE becomes 0 and TRUE becomes 1. So asking if 0 %in% x is essentially asking whether x contains any FALSE. This is arguably pretty bad practice. A better approach would be to test any(!x) or !all(x). Worse, if x has length 1, as seems to be the case here, the test reduces to just !x.
In light of the contorted usage, you are raising a very good question: is the code doing what it was really meant to do? In R, the %in% operator has higher precedence than & (see ?Syntax), so these two statements are not the same:
0 %in% (c.hist[[j]] & any(c.prefs[ ,j]==proposers[k])) # original code
0 %in% c.hist[[j]] & any(c.prefs[ ,j]==proposers[k]) # what you suggested
and we would need to look closely at what the code is supposed to be doing to decide if it is correct or wrong. I will just point out that you did not test your assumption properly: the error you got ("unexpected '{'") is because you forgot a closing parenthesis:
if (3 %in% (x & any(y == 5)){z = 8}
should be
if (3 %in% (x & any(y == 5))){z = 8}
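To make the precedence point concrete, here is a minimal sketch with invented values. hist_j and on_list are hypothetical stand-ins for c.hist[[j]] and any(c.prefs[ ,j]==proposers[k]); they are chosen so that the two groupings give different answers.

```r
# Invented values, chosen so the two groupings disagree
hist_j <- c(7, 0)      # hypothetical stand-in for c.hist[[j]]
on_list <- FALSE       # hypothetical stand-in for the any(...) test

# Parenthesised form: hist_j & on_list is evaluated first, giving
# c(FALSE, FALSE), which %in% then treats as zeros
grouped <- 0 %in% (hist_j & on_list)

# Unparenthesised form: %in% binds tighter than &, so this is
# (0 %in% hist_j) & on_list
ungrouped <- 0 %in% hist_j & on_list
```

Here grouped is TRUE while ungrouped is FALSE, so which form is right depends on what the function is meant to check.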
I have a dataframe where the dates are given as hydrological years (October to September). To change this I am trying to use an if statement:
if(cet$month== 10|cet$month==11|cet$month==12)
cet$year <- substr(as.character(cet[,2]),1,4) else
cet$year <- substr(as.character(cet[,2]),6,9)
but I get an error:
the condition has length > 1 and only the first element will be used
Reading the "if" help file I realized that the condition has to be a length-one logical vector. Is there no way of using an "or" with an "if"? All I want is to apply that expression if the month is October, November or December.
ifelse is the vectorised version. You can also use %in% to reduce the number of statements.
cet$year <- ifelse(cet$month%in%(10:12), substr(as.character(cet[,2]),1,4), substr(as.character(cet[,2]),6,9))
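A self-contained sketch of that one-liner on invented data follows. The column name hydro and the "YYYY-YYYY" label format are assumptions, picked only so that characters 1 to 4 and 6 to 9 hold the two calendar years as in the substr calls above.

```r
# Toy stand-in for cet; the "YYYY-YYYY" label format is an assumption
cet <- data.frame(month = c(10, 12, 1, 9),
                  hydro = c("1990-1991", "1990-1991", "1990-1991", "1990-1991"),
                  stringsAsFactors = FALSE)

# October to December fall in the first calendar year of the label,
# January to September in the second
cet$year <- ifelse(cet$month %in% 10:12,
                   substr(cet$hydro, 1, 4),
                   substr(cet$hydro, 6, 9))
```

The result is a character column; wrap it in as.integer() if you need numeric years.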
Ok, here's a reproducible example that should help to clarify things:
# generate some vector
x <- c(1,2,4,4,5,5,6,6,6)
# have a check using OR, return values
x[x == 2 | x == 1]
## or return TRUE / FALSE
(x == 2 | x == 1)
or check ?ifelse
EDIT: Note that for characters you need to use "", like x == "yourchars" | x == "someotherchars"
Here's also some simple reference and how to work with operators: QuickR
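Inside if(), the scalar OR operator is the double pipe:
| => || in the if()
Note that || works on single values only, so it is no help when the condition is a whole column; use ifelse for that.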
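A minimal sketch of how | and || differ, using an invented vector x. The use of any() to collapse the vector before if() is one common option, not the only one.

```r
x <- c(1, 2, 4, 4, 5)

# | compares element by element and returns one value per element
elementwise <- x == 1 | x == 2

# if() needs a single TRUE/FALSE, so collapse the vector first
if (any(elementwise)) {
  msg <- "at least one element is 1 or 2"
} else {
  msg <- "no element is 1 or 2"
}

# || is meant for single values, e.g. a length-one test
first_is_small <- x[1] == 1 || x[1] == 2
```

For a per-row decision across a whole column, ifelse with | is still the tool to reach for.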