Refactor whole data frame [duplicate] - r

This question already has answers here:
Drop unused factor levels in a subsetted data frame
(16 answers)
Closed 8 years ago.
I have a data frame that I have read from disk and then applied a filter:
df <- df[ df$x > 10, ]
Question: How can I refactor all factors in the data frame now that several rows have been removed?

The following worked for me:
df <- as.data.frame(lapply(df, function (x) if (is.factor(x)) factor(x) else x))
Source: http://r.789695.n4.nabble.com/Refactor-all-factors-in-a-data-frame-tp826749p826754.html
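Base R also ships droplevels, which drops unused levels from every factor column of a data frame in one call. A minimal sketch on made-up data (the column names x and g are assumptions chosen for illustration):

```r
# Hypothetical data: a numeric column and a factor with three levels
df <- data.frame(x = c(5, 12, 20, 30),
                 g = factor(c("a", "b", "c", "b")))

df <- df[df$x > 10, ]   # the filter from the question
levels(df$g)            # "a" is still listed: "a" "b" "c"

df <- droplevels(df)    # drop unused levels in all factor columns
levels(df$g)            # "b" "c"
```

Non-factor columns are left unchanged, so this should give the same result as the lapply approach for a plain data frame.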

Related

How to loop through a vector of data frame names to print first columns of the df's? [duplicate]

This question already has answers here:
How to extract certain columns from a list of data frames
(3 answers)
Closed 2 years ago.
x is a character vector holding the names of several data frames. I am trying to print the first column of each data frame named in the vector. So far I have tried the following, but neither attempt works:
x = (c('Ethereum,another Df..., another DF...,'))
for (i in x){
print(i[,1])
}
sapply(toString(Ethereum), function(i) print(i[1]))
You can try this, using get to look up each data frame by its name:
x <- c('Ethereum','anotherDf',...)
for (i in x){
print(get(i)[,1])
}
You can use mget to fetch the data frames into a list, then use lapply to extract the first column of each data frame in the list.
data <- lapply(mget(x), `[`, 1)
#Use `[[` to get it as vector.
#data <- lapply(mget(x), `[[`, 1)
A similar solution using purrr::map:
data <- purrr::map(mget(x), `[`, 1)
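To make the mget approach concrete, here is a small self-contained sketch. The two data frames and their columns are invented for illustration (only the name Ethereum comes from the question; Bitcoin is a hypothetical second data frame).

```r
# Hypothetical data frames standing in for the ones named in x
Ethereum <- data.frame(price = c(1.5, 2.5), volume = c(10, 20))
Bitcoin  <- data.frame(price = c(100, 200), volume = c(1, 2))

x <- c("Ethereum", "Bitcoin")

dfs <- mget(x)                        # named list of data frames, looked up by name
first_cols <- lapply(dfs, `[[`, 1)    # first column of each, as a plain vector

first_cols$Bitcoin                    # 100 200
```

The result is a named list, so each first column can be retrieved by the name of the data frame it came from.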

R make a list of dataframes by subsetting from a dataframe [duplicate]

This question already has answers here:
Split a large dataframe into a list of data frames based on common value in column
(3 answers)
Closed 4 years ago.
I have a dataframe like the following
x <- c(1:100)
y <- c("a","b","c","d","e","f","g","h","i","j")
y<-rep(y, each=10)
df<-data.frame(x,y)
I would like to make a list of dataframes by subsetting by values in the y column. The end result would produce the same output as something like this:
df1 <- data.frame(df[df$y=="a",])
df2 <- data.frame(df[df$y=="b",])
...
df10 <- data.frame(df[df$y=="j",])
list <- list(df1,df2.....df10)
... but without all of the repetition. Thanks!
# Split on the y column of df itself rather than on the global vector y
split(df, df$y)
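A short sketch of what split returns for the question's data, splitting on the column df$y. The variable name pieces is an assumption introduced here for illustration.

```r
# Data from the question
x <- 1:100
y <- rep(letters[1:10], each = 10)
df <- data.frame(x, y)

# One data frame per distinct value of df$y, returned as a named list
pieces <- split(df, df$y)

length(pieces)    # 10
nrow(pieces$a)    # 10
```

Because the list is named after the values of y, pieces$a or pieces[["j"]] takes the place of the numbered variables df1 to df10.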

Split into list of data frames by column index [duplicate]

This question already has answers here:
How do I split a data frame among columns, say at every nth column?
(1 answer)
What is the algorithm behind R core's `split` function?
(1 answer)
Closed 4 years ago.
Is there an easy way in base R to split a data frame into a list of data frames based on an index of factor levels (taken from another data frame)?
For example,
x = data.frame(num1 = 1:26, let = letters, num2 = 10:35, LET = LETTERS)
ls = list(x[, 1:2], x[, 3:4])
But let's say we had an index indicating factor levels for the columns. Can split be used?
indx = c(1,1,2,2)
? split(x, indx)
Call the default method of split directly. It treats the data frame as a list of columns, so the index groups columns rather than rows:
out <- split.default(x, indx)
identical(ls, setNames(out, NULL))
#[1] TRUE
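As a variation on the same idea, the column index can be a character vector, in which case the groups carry readable names. This is only a sketch; the labels "lower" and "upper" are assumptions chosen for illustration.

```r
x <- data.frame(num1 = 1:26, let = letters, num2 = 10:35, LET = LETTERS)

# One label per column; columns sharing a label end up in the same piece
indx <- c("lower", "lower", "upper", "upper")

out <- split.default(x, indx)

names(out)          # "lower" "upper"
names(out$upper)    # "num2" "LET"
```

Each piece keeps all 26 rows, because only the columns are being grouped.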

From a dataframe extract columns with numerical values [duplicate]

This question already has answers here:
Selecting only numeric columns from a data frame
(12 answers)
Closed 4 years ago.
I would like to extract all columns for which the values are numeric from a dataframe, for a large dataset.
#generate mixed data
dat <- matrix(rnorm(100), nrow = 20)
df <- data.frame(letters[1 : 20], dat)
I was thinking of something along the lines of:
numdat <- df[,df == "numeric"]
That however leaves me without variables. The following gives an error.
dat <- df[,class == "numeric"]
Error in class == "numeric" :
comparison (1) is possible only for atomic and list types
What should I do instead?
Use sapply to test the class of each column:
numdat <- df[, sapply(df, function(x) class(x) == "numeric")]
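One caveat with comparing class(x) to "numeric" is that integer columns have class "integer" and would be skipped. A hedged alternative is to test with is.numeric, which is TRUE for both double and integer columns. The sketch below uses invented columns id and n alongside data generated as in the question.

```r
# Mixed data: a character column, five double columns, one integer column
df <- data.frame(id = letters[1:20],
                 matrix(rnorm(100), nrow = 20),
                 n = 1:20)

# vapply guarantees exactly one logical value per column
is_num <- vapply(df, is.numeric, logical(1))

# drop = FALSE keeps a data frame even if only one column matches
numdat <- df[, is_num, drop = FALSE]

ncol(numdat)    # 6
```

Whether integer columns should count as numeric depends on the use case. If only doubles are wanted, is.double can be substituted for is.numeric.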

Delete Every even(uneven) row from a dataset [duplicate]

This question already has answers here:
Selecting multiple odd or even columns/rows for dataframe
(5 answers)
Closed 6 years ago.
What is the most elegant way to delete every even/uneven row from a data-frame in R? My first try was to enter the numbers of every second row "by hand" (See below)
(SmallerDataFrame <- OriginalDataFrame[-c(2,4,6,8,10,12),])
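Try this. For a vector:
dat <- mtcars$cyl
toDelete <- seq(2, length(dat), 2)  # even positions
toDelete
dat <- dat[-toDelete]  # a vector has a single dimension, so no comma
For a data frame:
dat <- mtcars
toDelete <- seq(2, nrow(dat), 2)  # even row numbers
dat[-toDelete, ]  # drops the even rows and keeps the odd ones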
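An alternative sketch that avoids building the index with seq is a logical mask computed from the row numbers. The names is_odd, odd_rows, and even_rows are assumptions introduced here for illustration.

```r
dat <- mtcars

# TRUE for rows 1, 3, 5, ... and FALSE for rows 2, 4, 6, ...
is_odd <- seq_len(nrow(dat)) %% 2 == 1

odd_rows  <- dat[is_odd, ]     # every even row deleted
even_rows <- dat[!is_odd, ]    # every odd row deleted

nrow(odd_rows)                 # 16
```

The mask also behaves sensibly for data frames with zero or one row, where seq(2, nrow(dat), 2) would raise an error.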
