This question already has answers here:
Using ifelse in R
(2 answers)
Closed 6 years ago.
I'm quite new to R (I picked it up less than two weeks ago) and was wondering why this doesn't work. Basically, I'm trying to loop through a newly added column, compare the value of another column in the same row against a reference value, and, based on that condition, change the value in the column I'm looping over.
myDataFrame["column2"] <- "a"
refValue = x
for(i in nrow(myDataFrame){
if(column1[i] >= refValue){
column2[i] <- "b"
}}
I ran it, but the values don't change:
View(myDataFrame)
So at the moment myDataFrame is:
column1---------column2
someValue------a
someValue------a
someValue------a
someValue------a
After the loop finishes, I want some of the 'a's changed to 'b's, depending on a condition on the value in the corresponding row of column1.
There is no need for a loop here. You can replace your code with:
myDataFrame$column2 <- with(myDataFrame, ifelse(column1 >= x, "b", "a"))
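For reference, the loop in the question fails for a few reasons: `for(i in nrow(myDataFrame))` visits only the last row index (and is missing a closing parenthesis), and `column1[i]` / `column2[i]` refer to free-standing vectors rather than to columns of the data frame. A minimal sketch of a working loop next to the vectorised form, assuming made-up values for `column1` and `refValue`:

```r
# Hypothetical sample data: the column1 values and refValue are assumptions.
myDataFrame <- data.frame(column1 = c(1, 5, 3, 8))
refValue <- 4

# Loop version: seq_len(nrow(...)) visits every row index, and the
# assignment goes into the data frame column itself.
myDataFrame$column2 <- "a"
for (i in seq_len(nrow(myDataFrame))) {
  if (myDataFrame$column1[i] >= refValue) {
    myDataFrame$column2[i] <- "b"
  }
}

# Vectorised equivalent, computed for comparison.
vectorised <- ifelse(myDataFrame$column1 >= refValue, "b", "a")
```

Both approaches should give the same column; the vectorised one is shorter and is usually preferred in R.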
This question already has answers here:
How to remove rows with 0 values using R
(2 answers)
Closed 2 years ago.
I searched many of the questions that Stack Overflow suggested before posting, but I couldn't find what I was looking for, so I decided to ask here. I have a data file:
https://github.com/learnseq/learning/blob/main/GSE133399_Fig2_FPKM.csv
The file has 9 columns: the first column has names and the other 8 columns have values. I want to store in an object all columns that do not have zeros and save them in CSV format.
I had a look at your data set: it contains some rows in which every value is zero, except the identifier. I assume you want to omit the rows that are entirely zeros. This code does the job:
data1 = read.csv("GSE133399_Fig2_FPKM.csv")
## Apply <all> on each row.
allZero = apply(data1[, -1] == 0, 1, all)
data2 = data1[!allZero, ]
Now, data2 is the same as data1, but without the rows having only zeros.
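The question also asks to save the result as CSV. A minimal sketch of the full round trip on toy data; the column names `gene`, `s1`, `s2` and the temporary output path are assumptions for illustration, not taken from the real file:

```r
# Hypothetical toy data standing in for the FPKM file.
data1 <- data.frame(
  gene = c("g1", "g2", "g3"),
  s1   = c(0, 1.5, 0),
  s2   = c(0, 0, 2),
  stringsAsFactors = FALSE
)

# TRUE for rows whose value columns (everything but the first) are all zero.
allZero <- apply(data1[, -1] == 0, 1, all)
data2 <- data1[!allZero, ]

# Write the filtered rows to a CSV file without the row-name column.
outFile <- tempfile(fileext = ".csv")
write.csv(data2, outFile, row.names = FALSE)
```

In real use, `outFile` would be replaced by whatever file name you want the result stored under.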
This question already has answers here:
How to remove columns with same value in R
(4 answers)
Closed 2 years ago.
I have a really large dataset and I want to filter out some of the columns because they contain the same value throughout (e.g. company name is always "Walmart"). I could go through and do this manually, but I'm looking for code that does it automatically.
I had in mind a function that subsets based on whether sum(unique(colnam)) == 1, but I'm not sure how to get it to work. Thanks.
which(sapply(dat, function(col) length(unique(col)) == 1))
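The line above only reports which columns are constant. A minimal sketch of actually dropping them, assuming a small made-up `dat` (the column names are hypothetical):

```r
# Hypothetical example data: 'company' holds a single repeated value.
dat <- data.frame(
  company = rep("Walmart", 3),
  sales   = c(10, 20, 30),
  region  = c("N", "S", "N"),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE for columns with exactly one distinct value.
isConstant <- sapply(dat, function(col) length(unique(col)) == 1)

# Keep only the non-constant columns; drop = FALSE keeps a data frame
# even if a single column remains.
datFiltered <- dat[, !isConstant, drop = FALSE]
```

Note that `unique()` treats `NA` as a distinct value, so a column that is constant apart from missing values would be kept by this sketch.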
This question already has answers here:
Extract a column from a data.table as a vector, by position
(2 answers)
Closed 3 years ago.
Let's say I have the data.table below:
library(data.table)
DT <- data.table(x=sample(letters, 1e6, TRUE), y=rnorm(1e6), v=runif(1e6))
Now I want to subset DT to the rows where the value in the 1st column is one of letters[1:2].
If I choose the column by name, this is straightforward:
DT[x %in% letters[1:2]]
However, I want to select the column by position, i.e. the first column, the 4th column, etc.
The code below doesn't work:
DT[1 %in% letters[1:2]]
Any pointer on the right syntax for selecting a column by position would be helpful.
You can do this:
DT[DT[[1]] %in% letters[1:2]]
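The attempt in the question fails because `1 %in% letters[1:2]` tests the number 1 itself, not the first column, and so evaluates to a single `FALSE`. A minimal sketch on a small deterministic table, assuming the data.table package is installed; `colPos` is a hypothetical variable holding the column position:

```r
library(data.table)

# Small deterministic table instead of the random one from the question.
DT <- data.table(x = c("a", "c", "b", "d"), y = 1:4)

# Position of the column to filter on (hypothetical variable name).
colPos <- 1L

# DT[[colPos]] extracts the column as a plain vector, which can then be
# used to build the logical row filter.
byPosition <- DT[DT[[colPos]] %in% letters[1:2]]

# Same filter by name, for comparison.
byName <- DT[x %in% letters[1:2]]
```

Both results should contain the same rows.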
This question already has answers here:
Split delimited strings in a column and insert as new rows [duplicate]
(6 answers)
Split comma-separated strings in a column into separate rows
(6 answers)
Closed 6 years ago.
I've tried to write a function, in base R, that replaces semicolon-containing elements in a data frame column with the split entries, placed at the bottom of the column. The main purpose is to use this function with apply and make the addition whenever an entry containing a semicolon is detected.
The main problem with my code is that it returns exactly the same data frame, without any additional values.
> df
rs2480711
rs74832092
rs4648658
rs4648659
rs61763535
rs28733941;rs67677371
>x
"rs28733941;rs67677371"
function(x){
semiCols = length(unlist(strsplit(x, ";")))
elementsRs = unlist(strsplit(x, ";"))
if(semiCols>1){
for(i in 1:semiCols){
df = rbind(df, elementsRs[i])
}}}
I would also like to know how I can extend the code to split rows based on one value while leaving all the others unchanged. For example, this
>df
0 rs61763535 T1
1 rs28733941;rs67677371 T2
will look like this
>df2
0 rs61763535 T1
1 rs28733941 T2
1 rs67677371 T2
If I understood correctly, this will work:
unlist(strsplit(as.character(df$V1),split = ";"))
For the second part, I'm not sure I understood you properly, but maybe you are looking for this:
apply(df,2,function(t) unlist(strsplit(as.character(t),split = ";")))
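As for why the function in the question returns the same data frame: the `df = rbind(...)` assignment inside the function only changes a local copy, and the function never returns it. For the row-splitting case, here is a minimal base R sketch, assuming hypothetical column names `id`, `V1` and `V2` for the three columns shown in the question:

```r
# Hypothetical reconstruction of the two-row example from the question.
df <- data.frame(
  id = c(0, 1),
  V1 = c("rs61763535", "rs28733941;rs67677371"),
  V2 = c("T1", "T2"),
  stringsAsFactors = FALSE
)

# Split each entry of V1 on ";" into a list of character vectors.
parts <- strsplit(as.character(df$V1), ";", fixed = TRUE)

# Repeat each row once per split piece, then overwrite V1 with the pieces.
df2 <- df[rep(seq_len(nrow(df)), lengths(parts)), ]
df2$V1 <- unlist(parts)
rownames(df2) <- NULL
```

The other columns are copied unchanged for every piece, which matches the desired `df2` layout in the question.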
This question already has answers here:
Subset dataframe by multiple logical conditions of rows to remove
(8 answers)
Closed 7 years ago.
I'm trying to delete every row of a data frame whose value in a column is below the average of that column. For some reason, the code seems to pick and choose which rows it deletes. All help is appreciated. Thank you; here is my code.
k<-c(HW2$AGE)
j<-mean(k)
for (i in HW2$AGE)
if (j>i){
HW2 <- HW2[-i, ]
}
You don't need a loop for this. Instead, I would use vectorised subsetting, as below.
Sample data
x <- data.frame("A"= runif(10), "B" = runif(10))
Calculate mean
xMean <- mean(x[,"A"])
Exclude rows
y <- x[x$A < xMean,]
This is probably the most obvious way of excluding unwanted rows.
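The loop in the question misbehaves because `for (i in HW2$AGE)` iterates over the age values rather than row indices, so `HW2[-i, ]` removes the row whose position happens to equal an age, and the data frame shrinks while it is being looped over. A minimal sketch of the same subsetting idea applied to the question's goal (drop rows whose AGE is below the mean), assuming made-up ages:

```r
# Hypothetical stand-in for HW2: the ages are made-up values.
HW2 <- data.frame(AGE = c(20, 35, 50, 65))

# Mean of the AGE column.
meanAge <- mean(HW2$AGE)

# Keep rows whose AGE is at or above the mean, i.e. drop those below it.
HW2kept <- HW2[HW2$AGE >= meanAge, , drop = FALSE]
```

Note the direction of the comparison: the answer's sample keeps rows below the mean, whereas this sketch keeps rows at or above it, as the question describes.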