It has to be really simple but it looks like my mind is not working properly anymore.
So, what I would like to do is store one of the columns from mtcars as a vector, but only after subsetting the rows. I need one line of code that does both the subsetting and the assignment to a vector.
This is what I would like to achieve, but in one line:
data <- mtcars[mtcars[,11]==4,]
vec <- data[,1]
Thx!
vec<-mtcars[mtcars[,11]==4,][,1]
Here mtcars[,11]==4 is the logical row index, and selecting column index 1 returns the first column for only the rows that meet the condition.
mtcars[mtcars[,11]==4, 1]
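A quick check of the one-liners above, as a minimal sketch. It assumes the built-in mtcars data set, where column 11 is carb and column 1 is mpg. The name-based form is an equivalent alternative added for illustration, not something from the original answers.

```r
# Index-based one-liners from the answers above
vec_a <- mtcars[mtcars[, 11] == 4, ][, 1]
vec_b <- mtcars[mtcars[, 11] == 4, 1]

# Equivalent name-based form (column 11 is carb, column 1 is mpg)
vec_c <- mtcars$mpg[mtcars$carb == 4]

# All three are plain numeric vectors holding the same values
identical(vec_a, vec_b)
identical(vec_b, vec_c)
```

The name-based form tends to be easier to read and does not break if the column order changes.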
Related
I have two dataframes.
dataframe1 has around a million rows and two columns named 'row' and 'columns', which hold the row and column indexes into another dataframe (i.e. dataframe2).
I want to extract the values from dataframe2 at the indexes given in the 'row' and 'columns' columns, for each row of dataframe1.
I used a simple for loop to get the solution, but it is slow and takes around 9 minutes. Is there any other way to solve this problem with functions in R?
for (i in 1:nrow(dataframe1)) {
  dataframe1$value[i] = dataframe2[dataframe1$row[i], dataframe1$columns[i]]
}
You don't actually need a for loop to do this. Index the second data frame with a two-column matrix of (row, column) pairs and assign the result as the new column:
dataframe1$value <- dataframe2[cbind(dataframe1$row, dataframe1$columns)]
This should run a lot faster. Note that dataframe2[dataframe1$row, dataframe1$columns], with two separate index vectors, would not work here, because it returns every combination of the listed rows and columns rather than one value per pair. If you want to try it a different way, you could collect the values in a new vector and then use cbind to join the vector to the data frame. Updating the data frame on every iteration of the loop is most likely what's slowing it down.
Maybe you can try the code below
dataframe1$value <- dataframe2[as.matrix(dataframe1[c("row","columns")])]
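To see that matrix indexing gives the same result as the loop, here is a minimal sketch on toy data. The small data frames and the name loop_value are hypothetical stand-ins, and the sketch assumes all columns of dataframe2 have the same type (matrix indexing on a data frame converts it to a matrix first).

```r
# Hypothetical small stand-ins for the two data frames in the question
dataframe2 <- data.frame(x = c(10, 20, 30), y = c(40, 50, 60), z = c(70, 80, 90))
dataframe1 <- data.frame(row = c(1, 3, 2, 3), columns = c(2, 1, 3, 3))

# Loop version from the question, kept as a reference result
loop_value <- numeric(nrow(dataframe1))
for (i in seq_len(nrow(dataframe1))) {
  loop_value[i] <- dataframe2[dataframe1$row[i], dataframe1$columns[i]]
}

# Vectorized version: each row of idx is one (row, column) pair
idx <- as.matrix(dataframe1[c("row", "columns")])
dataframe1$value <- dataframe2[idx]

# Both approaches pick the same cells
identical(dataframe1$value, loop_value)
```

If dataframe2 mixes numeric and character columns, the extracted values would come back as character, so the result may need converting afterwards.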
Since your loop only considers the rows in df1, you can cut the surplus rows from df2 and then use cbind:
dataframe2 <- dataframe2[seq_len(nrow(dataframe1)),]
df3 <- cbind(dataframe1, dataframe2)
I have a data frame (df) read from an Excel sheet. The first row of the data frame always contains "correct" or "wrong"; the other rows are filled with data.
Now I want to select all the columns where the first row says "correct" by using the function apply.
I tried:
apply(df,2,function(df) grepl ("correct",df))
The answer is just a data frame with TRUE and FALSE. How can I select the columns without losing the data in the other rows?
You shouldn't need a loop. The following should work:
df[,df[1,] == 'correct']
i <- sapply(df, function(x) x[1] =='correct')
df[,i]
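A minimal sketch of the sapply approach on toy data. The data frame below and the names keep and selected are made up for illustration; drop = FALSE is an addition that keeps the result a data frame even when only one column matches.

```r
# Hypothetical data frame: the first row holds the "correct"/"wrong" flag
df <- data.frame(
  q1 = c("correct", "5", "7"),
  q2 = c("wrong", "1", "2"),
  q3 = c("correct", "8", "9"),
  stringsAsFactors = FALSE
)

# Logical vector with one entry per column
keep <- sapply(df, function(x) x[1] == "correct")

# Keep only the flagged columns; all rows of those columns are retained
selected <- df[, keep, drop = FALSE]
names(selected)
```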
So I have two columns and need to add a third. The third column needs to contain A for a first specified number of rows and B for a second specified number of rows.
I tried adding this: data_exercise_3["newcolumn"] <- (1:6)
but it didn't work. Can someone tell me what I'm doing wrong please?
Looks like you're having a problem with subsetting a data frame correctly. I'd recommend reviewing this concept before you proceed much further, either via a Coursera course or on a website like this UCLA R learning module on subsetting data frames. Subsetting is a crucial component of data wrangling with R, and you'll go much faster with a solid foundation of the basics!
You can assign values to a subset of a data frame by using [row, column] notation. Since your data frame is called data_exercise_3 and the column you'd like to assign values to is called 'newcolumn', then assuming you want the first 6 rows as 'A' and the next 3 as 'B', you could write it like this:
data_exercise_3[1:6,'newcolumn'] <- 'A'
data_exercise_3[7:9,'newcolumn'] <- 'B'
data_exercise_3$category <- c(rep("A",6),rep("B",6))
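Both answers can be combined into a single assignment with rep(). This is a minimal sketch assuming a 9-row data frame (6 rows of "A", then 3 of "B"); the columns x and y are hypothetical stand-ins for the two existing columns.

```r
# Hypothetical 9-row data frame standing in for data_exercise_3
data_exercise_3 <- data.frame(x = 1:9, y = 11:19)

# Build the labels in one step: 6 times "A", then 3 times "B"
data_exercise_3$newcolumn <- rep(c("A", "B"), times = c(6, 3))

table(data_exercise_3$newcolumn)
```

The counts passed to times have to add up to the number of rows in the data frame, otherwise the assignment fails.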
I have a data frame(A) of size (92047x2) and a list(B) of size (1829). I want to create a new data frame with all rows of A whose first column value is present in B.
How to use which()? Or any other good way to approach this?
All the values are character strings (e.g. "Vc2345").
You can do it like this:
dfA=data.frame(C1=sample(1:92047), C2=sample(1:92047))
listB=list(sample(1:1829))
dfAinB=dfA[which(dfA$C1 %in% unlist(listB)),]
str(dfAinB)
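Since the question mentions character values, here is a minimal sketch of the same idea with strings. The data and the names A, B and A_in_B are hypothetical. Because %in% already returns a logical vector, which() is optional.

```r
# Hypothetical character data, similar in form to "Vc2345"
A <- data.frame(id = c("Vc1", "Vc2", "Vc3", "Vc4"),
                score = c(10, 20, 30, 40),
                stringsAsFactors = FALSE)
B <- list("Vc2", "Vc4", "Vc9")

# Keep the rows of A whose first column appears anywhere in B
A_in_B <- A[A$id %in% unlist(B), ]
A_in_B
```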
Is it possible to append a column to data frame in the following scenario?
dfWithData <- data.frame(start=c(1,2,3), end=c(11,22,33))
dfBlank <- data.frame()
How can I append the column start from dfWithData to dfBlank?
It looks like the data has to be added when the data frame is being initialized. I can do this:
dfBlank <- data.frame(dfWithData[1])
but I am more interested if it is possible to append columns to an empty (but inti)
I would suggest simply subsetting the data frame you get back from the RODBC call. Something like:
df[,c('A','B','D')]
perhaps, or you can also subset the columns you want with their numerical position, i.e.
df[,c(1,2,4)]
dfBlank[1:nrow(dfWithData),"start"] <- dfWithData$start
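An alternative sketch, assuming the number of rows is known up front: create the "blank" data frame with row names but no columns, after which ordinary column assignment works. This is a swapped-in technique for illustration, not the method from the answer above.

```r
dfWithData <- data.frame(start = c(1, 2, 3), end = c(11, 22, 33))

# A data frame with 3 rows and 0 columns instead of 0 rows and 0 columns
dfBlank <- data.frame(row.names = seq_len(nrow(dfWithData)))

# Column assignment now succeeds because the row counts match
dfBlank$start <- dfWithData$start
dim(dfBlank)
```

With a truly empty data.frame(), the same dfBlank$start assignment fails because the replacement has more rows than the data frame.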