I have a data matrix of 1200 rows (sample names) by 20000 columns (gene names). I want to delete a row when all 5 of my genes of interest have a value of zero.
The command I used for a single gene:
allexp <-preallexp[preallexp$GZMB > 0, ]
But I want to use AND in the above command, like this:
allexp <-preallexp[preallexp$GZMB && preallexp$TP53 && preallexp$EGFR && preallexp$BRAF && preallexp$VGEF > 0, ]
But this command doesn't work. Please, I need your help: how do I use AND in the above command?
EDIT: in response to OP.
I'm sure there's a much more efficient way to code this, but this is what you're after:
allexp <-preallexp[preallexp$GZMB + preallexp$TP53 + preallexp$EGFR +
preallexp$BRAF + preallexp$VGEF > 0, ]
Unless you have negative expression values, I would have thought mkt's answer should work. But here is mine. It removes the rows where each of the 5 genes has a value of 0.
which(preallexp$GZMB == 0 & preallexp$TP53 == 0 &
  preallexp$EGFR == 0 & preallexp$BRAF == 0 & preallexp$VGEF == 0)
This gives the rows where all 5 genes have a value of zero. Note the single &, which compares element by element; && is not vectorised.
So we can remove these rows from the data frame as follows:
allexp <- preallexp[
  -(which(preallexp$GZMB == 0 & preallexp$TP53 == 0 &
  preallexp$EGFR == 0 & preallexp$BRAF == 0 & preallexp$VGEF == 0)), ]
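A minimal sketch of the same filter written with rowSums(), assuming preallexp has columns named after the genes. The toy data and the genes vector are made up for illustration. It filters with a negated logical vector rather than a negative which() index, because -which(...) drops every row when nothing matches (which() then returns integer(0)).

```r
# Toy stand-in for preallexp: 4 samples x 5 genes (values are made up)
preallexp <- data.frame(
  GZMB = c(0, 1, 0, 0),
  TP53 = c(0, 0, 0, 2),
  EGFR = c(0, 0, 0, 0),
  BRAF = c(0, 3, 0, 0),
  VGEF = c(0, 0, 0, 0),
  row.names = c("s1", "s2", "s3", "s4")
)
genes <- c("GZMB", "TP53", "EGFR", "BRAF", "VGEF")

# TRUE for rows where every one of the five genes is exactly zero
all_zero <- rowSums(preallexp[, genes] != 0) == 0

# Keep everything else; logical negation is safe even when no row matches
allexp <- preallexp[!all_zero, ]
```

With a vector of gene names the condition does not need to be retyped when the list of genes changes.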
I want to add the output of this loop as a new column, "Compared_data".
The data set is Libraries_four.
for (i in 1:20)
{
if ((Libraries_four[i,"PhyloAlps_iden"] == 1) & (Libraries_four[i,"ArctBorBryo_iden"] == 1 |
Libraries_four[i,"EMBL_143_iden"] == 1 | Libraries_four[i,"PhyloNorway_iden"] == 1 ))
{
print(TRUE)
}
else
{
print(FALSE)
}
}
The code works fine, but when I tried the mutate function to create the new column it did not work. Is there another way to add a new variable/column?
R is a vectorised language, so you rarely need an explicit for loop. Try this:
library(dplyr)
Libraries_four <- Libraries_four %>%
  mutate(result = PhyloAlps_iden == 1 & (ArctBorBryo_iden == 1 |
    EMBL_143_iden == 1 | PhyloNorway_iden == 1))
This would create a new column called result in the Libraries_four dataset. The parentheses around the | terms are needed because & binds more tightly than |.
You can also do this in base R:
Libraries_four <- transform(Libraries_four, result = PhyloAlps_iden == 1 & (ArctBorBryo_iden == 1 | EMBL_143_iden == 1 | PhyloNorway_iden == 1))
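A small illustration of why the grouping matters, using made-up logical vectors (a, b and c3 are hypothetical stand-ins for the column comparisons). In R, & has higher precedence than |, so a & b | c3 is read as (a & b) | c3, which is not what the loop in the question tests.

```r
# Hypothetical stand-ins for the comparisons on three rows
a  <- c(TRUE, FALSE, FALSE)   # e.g. PhyloAlps_iden == 1
b  <- c(TRUE, FALSE, FALSE)   # e.g. ArctBorBryo_iden == 1
c3 <- c(FALSE, TRUE, FALSE)   # e.g. EMBL_143_iden == 1

# Without parentheses R evaluates (a & b) | c3
no_parens <- a & b | c3

# With parentheses, a is required in every case
with_parens <- a & (b | c3)
```

The two results differ on the second row, where a is FALSE but c3 is TRUE.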
I am running the code below. It runs, but it does not show me any output:
for (name in tita$name){
if (tita$sex == 'female' && tita$embarked == 'S' && tita$age > 33.00)
{
print (name)
}
}
It just shows me ****** in RStudio. When I check the dataset, it does contain females older than 33 who embarked from S, but this statement does not show them. When I change the value from 33 to 28, the same code does show me a result. Why is that?
I am using the following dataset:
https://biostat.app.vumc.org/wiki/pub/Main/DataSets/titanic3.csv
I think you're mixing loops and vectorization where you shouldn't. As I mentioned in the comments, your conditions are vectorized, but it looks like you're trying to evaluate each element in a loop.
You should do either:
# loop through elements
for (i in seq_along(tita$name)){
  # isTRUE() treats an NA result (e.g. a missing age) as FALSE instead of raising an error
  if (isTRUE(tita$sex[i] == 'female' & tita$embarked[i] == 'S' & tita$age[i] > 33.00)){
    print(tita$name[i])
  }
}
OR use vectorization (this will be faster and is recommended):
conditions <- tita$sex == 'female' & tita$embarked == 'S' & tita$age > 33.00
names <- tita$name[conditions]
Here conditions is a logical vector of TRUE and FALSE values, TRUE where all the conditions are met. We can use this to subset in R. For more information on what I mean by vectorization please see this link.
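As a hedged illustration of the vectorized approach, here is a sketch on a small made-up data frame (not the Titanic data). In the original loop the condition compares whole columns and never refers to the current name, so it is the same on every iteration. In older versions of R, && silently used only the first element of each vector, which would explain why changing 33 to 28 flipped between printing nothing and printing every name if the first passenger's age lies between the two. The element-wise & gives one value per row instead, and which() drops rows where the result is NA, for example because the age is missing.

```r
# Made-up data with the same column names as in the question
tita <- data.frame(
  name     = c("A", "B", "C", "D"),
  sex      = c("female", "female", "male", "female"),
  embarked = c("S", "S", "S", "S"),
  age      = c(29, 40, 50, NA),
  stringsAsFactors = FALSE
)

# One logical value per row; NA where the age is missing
conditions <- tita$sex == "female" & tita$embarked == "S" & tita$age > 33

# which() keeps only the positions that are TRUE, dropping NA
matched <- tita$name[which(conditions)]
```

Subsetting with conditions directly would return an extra NA entry for row "D".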
I am struggling with this loop. I want to get "6" in the second row of the column "newcolumn". I get the following error:
Error in if (mydata$type_name[i] == "a" && mydata$type_name[i - :
missing value where TRUE/FALSE needed.
The data (with the desired newcolumn values) and the code that I created:
id type_name name score newcolumn
1 a Car 2 2
1 a van 2 6
1 b Car 2 2
1 b Car 2 2
mydata$newcolumn <-c(0)
for (i in 1:length(mydata$id)){
if ((mydata$type_name [i] == "a") && (mydata$type_name[i-1] == "a") && ((mydata$name[i]) != (mydata$name[i-1]))){
mydata$newcolumn[i]=mydata$score[i]*3 }
else {
mydata$newcolumn[i]=mydata$score[i]*1
}
}
Thank you very much in advance
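Indexing starts at 1 in R, but your loop starts at i = 1 and uses i-1, so on the first iteration you are indexing position 0. mydata$type_name[0] returns an empty vector, so the condition evaluates to NA instead of TRUE or FALSE, which is what the error message is telling you. Start the loop at 2 and handle the first row separately.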
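One way to avoid the i-1 problem altogether is to build "previous row" vectors and compare whole columns at once. This is only a sketch on the four rows shown in the question; prev_type, prev_name and triple are names introduced here for illustration.

```r
mydata <- data.frame(
  id        = c(1, 1, 1, 1),
  type_name = c("a", "a", "b", "b"),
  name      = c("Car", "van", "Car", "Car"),
  score     = c(2, 2, 2, 2),
  stringsAsFactors = FALSE
)

n <- nrow(mydata)
# Values from the previous row; the first row has no predecessor, so use NA
prev_type <- c(NA, mydata$type_name[-n])
prev_name <- c(NA, mydata$name[-n])

# TRUE where this row and the previous row are both type "a" and the names differ
triple <- mydata$type_name == "a" & prev_type == "a" & mydata$name != prev_name
triple[is.na(triple)] <- FALSE  # the first row never qualifies

mydata$newcolumn <- mydata$score * ifelse(triple, 3, 1)
```

On this data only the second row meets the condition, so its score is multiplied by 3.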
I've read my SPSS file into R and want to create a new variable when certain conditions hold. To be specific:
I want to turn my spssdata_sub$gest variable into a new variable if the following conditions are met:
spssdata_sub$indusert != 2 & spssdata_sub$ivf != 1 & spssdata_sub$leie != 3 & spssdata_sub$svkompl_II != 7 & spssdata_sub$svkompl_II != 2 & spssdata_sub$svkompl_II != 1
Can anyone here help me with the code?
Does one of the following work for you?
Either this adapted version of Renu's solution:
spssdata_sub$gest <- ifelse(spssdata_sub$indusert != 2 & spssdata_sub$ivf != 1 & spssdata_sub$leie != 3 & spssdata_sub$svkompl_II != 7 & spssdata_sub$svkompl_II != 2 & spssdata_sub$svkompl_II != 1, spssdata_sub$gest, NA)
or this code for filtering observations:
library(dplyr)
spssdata_sub_new <- spssdata_sub %>%
  filter(indusert != 2 & ivf != 1 & leie != 3 & svkompl_II != 7 & svkompl_II != 2 & svkompl_II != 1)
One way is the following, if you really mean that any one of the conditions is enough:
Mynewdata <- dplyr::filter(spssdata, indusert != 2, ivf != 1, leie != 3,
svkompl_II != 7 & svkompl_II != 2 & svkompl_II != 1)
This only keeps entries that match none of the excluded values. Put the other way, it excludes entries that have indusert = 2 or ivf = 1, etc.; any one of the conditions is enough to exclude an entry.
Add-on: or something like this:
Mynewdata <- dplyr::filter(spssdata, indusert != 2, ivf != 1, leie != 3,
  !(svkompl_II %in% c(7, 2, 1)))
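The chained != form and the %in% form agree unless the column contains missing values, which is worth checking for data imported from SPSS. A small base R sketch with made-up values shows the difference: a comparison with NA yields NA (and filter() drops rows where the condition is NA), while %in% returns FALSE for NA, so the negated form keeps that row.

```r
# Made-up values, including one missing value
svkompl_II <- c(1, 2, 3, 7, NA)

# Chained comparisons: the result is NA where the value is missing
by_neq <- svkompl_II != 7 & svkompl_II != 2 & svkompl_II != 1

# %in% never returns NA, so the negation is TRUE for the missing value
by_in <- !(svkompl_II %in% c(7, 2, 1))
```

Which behaviour is wanted depends on whether rows with a missing svkompl_II should be kept.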
I want to make a new variable "Churned" by taking into account five variables:
Include in churn
A-Churn
B-Churn
C-Churn
D-Churn
My condition is: if the variable "Include in churn" is 1 and any one of the other variables is 1, then my new variable "Churned" should be 1, else 0. I am a newbie at using the mutate function.
Please help me create this new variable with the mutate function.
If I understand your formulation logically, you want
mutate(data, Churned = Include.in.Churn == 1 & (A.Churn == 1 | B.Churn == 1 | C.Churn == 1 | D.Churn == 1))
This will make Churned a logical. If you really need an integer, as.integer will produce 1 for TRUE and 0 for FALSE.
If all the mentioned variables are either 1 or 0, you can also use the possibly faster
mutate(data, Churned = Include.in.Churn * (A.Churn + B.Churn + C.Churn + D.Churn) >= 1)
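For reference, a base R sketch of the same rule that returns 1/0 directly. The data frame below is made up, and the column names assume the dotted form used in the answer above; churn_cols and any_churn are names introduced here for illustration.

```r
# Made-up data using the dotted column names from the answer
data <- data.frame(
  Include.in.Churn = c(1, 1, 0, 0),
  A.Churn = c(0, 1, 1, 0),
  B.Churn = c(0, 0, 0, 0),
  C.Churn = c(0, 1, 1, 0),
  D.Churn = c(0, 0, 0, 0)
)
churn_cols <- c("A.Churn", "B.Churn", "C.Churn", "D.Churn")

# TRUE where at least one of the four churn flags is 1
any_churn <- rowSums(data[, churn_cols] == 1) > 0

# as.integer() turns TRUE/FALSE into 1/0
data$Churned <- as.integer(data$Include.in.Churn == 1 & any_churn)
```

Only the second row has both "Include in churn" set and at least one churn flag, so it is the only row with Churned equal to 1.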