a<- data.frame(sex=c(1,1,2,2,1,1),bq=factor(c(1,2,1,2,2,2)))
library(Hmisc)
label(a$sex)<-"gender"
label(a$bq)<-"xxx"
str(a)
b<-data.frame(lapply(a, as.character), stringsAsFactors=FALSE)
str(b)
When I convert the columns of data frame a to character, the column labels disappear. My data frame has many columns; the example here uses only two. How can I keep the column labels when converting numeric columns to character? Thank you!
Labels are not a commonly used R feature, and as.character() drops attributes such as the label. Unfortunately, you will have to copy the label over yourself:
b <- data.frame(lapply(a, function(x) { y <- as.character(x); label(y) <- label(x); y }), stringsAsFactors = FALSE)
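The same idea can be sketched in base R without Hmisc. This is only an illustration, assuming a plain "label" attribute as a stand-in for the Hmisc label; the helper name to_char_keep_label is made up for this example.

```r
# Example data from the question, with labels stored as a plain attribute
# (an assumption: this stands in for Hmisc::label())
a <- data.frame(sex = c(1, 1, 2, 2, 1, 1), bq = factor(c(1, 2, 1, 2, 2, 2)))
attr(a$sex, "label") <- "gender"
attr(a$bq, "label") <- "xxx"

# Hypothetical helper: convert to character, then copy the label back
to_char_keep_label <- function(x) {
  y <- as.character(x)                      # the conversion drops attributes
  attr(y, "label") <- attr(x, "label", exact = TRUE)
  y
}

b <- a
b[] <- lapply(a, to_char_keep_label)        # b[] keeps the data frame structure
str(b)
```

Assigning into b[] rather than rebuilding the data frame keeps the column names and row names of the original.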
I have a character vector of the type a<- c('ES1-5', 'ES14-26', 'ES27-38', 'ES6-13', 'SA1-13', 'SA14-25') and it is a column of a dataframe.
What I would like to do is transform it into a factor with levels 1,2,3,4,5,6, then transform that into a numeric vector of 1,2,3,4,5,6 and cbind it to the dataframe.
Could someone give me an elegant way to do this, please?
You can use as.numeric(factor(...)):
library(tibble)
## Some random data
a <- c('ES1-5', 'ES14-26', 'ES27-38', 'ES6-13', 'SA1-13', 'SA14-25')
x <- tibble(x = rnorm(6),
            y = rnorm(6),
            a = a) ## Add vector a as a column of the data frame
##Make a into a factor and append to the dataframe
x$a_factor <- as.numeric(factor(x$a))
x
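One caveat with this approach: factor() sorts its levels, so the codes follow the sorted order of the values, not the order in which they appear. If the codes should follow order of first appearance instead, something like the following sketch may help (the vector here is made up for illustration):

```r
# Made-up vector with a repeated value, deliberately not in sorted order
a <- c("SA1-13", "ES1-5", "SA1-13", "ES6-13")

# Codes numbered by first appearance
codes <- match(a, unique(a))

# Equivalent route through a factor with explicitly ordered levels
codes_via_factor <- as.integer(factor(a, levels = unique(a)))

codes
codes_via_factor
```

Both vectors contain the same codes; the factor route is handy when the level labels are needed later.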
I have around 200 columns in my dataframe.
I want to convert the columns of type character into factors and then into their integer level codes.
For example, Man becomes 1.
The code below works manually for a single column:
df$colName1 <- as.factor(df$colName1)
df$colName1 <- as.integer(df$colName1)
But how can I check every column in a loop and convert the ones that are character?
Thanks.
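df[] <- lapply(df, function(x) {
  if (is.character(x)) {
    x <- as.integer(factor(x))  # integer codes 1, 2, ... in sorted level order
  }
  x  # columns of any other type are returned unchanged
})
## lapply is used rather than apply, which would first coerce df to a matrix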
With tidyverse, the syntax would be
library(tidyverse)
df %>%
  mutate_if(is.character, ~ as.integer(factor(.)))
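For a concrete check, here is a small base R sketch on a made-up data frame (the column names and values are invented for illustration). It converts only the character columns and leaves the rest alone:

```r
# Made-up example data: two character columns and one numeric column
df <- data.frame(gender = c("Man", "Woman", "Man"),
                 age    = c(30, 25, 41),
                 city   = c("Oslo", "Bergen", "Oslo"),
                 stringsAsFactors = FALSE)

# Logical vector marking which columns are character
char_cols <- vapply(df, is.character, logical(1))

# Replace each character column by the integer codes of its factor levels
df[char_cols] <- lapply(df[char_cols], function(x) as.integer(factor(x)))

str(df)
```

Here Man becomes 1 and Woman becomes 2 because factor levels are sorted alphabetically; the numeric age column is untouched.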
I have a data frame with factors and characters. I want to convert the columns whose names start with the prefix "ID_" from factor to character.
I tried the code below, but it changes the whole data frame to character; I only want to change the columns named "ID_...". I don't know how many "ID_" columns will end up in the data frame (this is part of a larger function that will loop across data frames with various numbers of "ID_" columns).
###Changes the whole dataframe to character rather than only the intended columns
df.loc[] <- lapply(df.loc[, grepl("ID_", colnames(df.loc))], as.character)
The problem is you assign to the whole data frame with df.loc[] <-. Try this:
my_cols <- grepl("ID_", colnames(df.loc))
df.loc[my_cols] <- lapply(df.loc[my_cols], as.character)
Here is a tidyverse solution:
library(tidyverse)
food <- tibble(
"ID_fruits" = factor(c("apple", "banana", "cherry")),
"vegetables" = factor(c("asparagus", "broccoli", "cabbage")),
"ID_drinks" = factor(c("absinthe", "beer", "cassis"))
)
food %>%
mutate_at(vars(starts_with("ID_")), as.character)
You can also do this with ifelse:
df[] <- ifelse(grepl("^ID_", colnames(df)), lapply(df, as.character), df)
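Since the question asks about a prefix, it may be worth noting that grepl("ID_", ...) without the ^ anchor also matches names that contain "ID_" elsewhere. A base R sketch using startsWith(), with the food example rebuilt as a plain data frame (an assumption made so that it runs without the tidyverse):

```r
# The food example as a base data frame with factor columns
food <- data.frame(
  ID_fruits  = factor(c("apple", "banana", "cherry")),
  vegetables = factor(c("asparagus", "broccoli", "cabbage")),
  ID_drinks  = factor(c("absinthe", "beer", "cassis"))
)

# TRUE only for names that begin with the prefix
id_cols <- startsWith(names(food), "ID_")

# Convert just those columns; the others keep their class
food[id_cols] <- lapply(food[id_cols], as.character)

vapply(food, class, character(1))
```

This also works when no column matches, in which case nothing is changed.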
Input:
df = data.frame(col1 = 1:5, col2 = 5:9)
rownames(df) <- letters[1:5]
#add jitter
jitter(df) #Error in jitter(df) : 'x' must be numeric
Expected output: jitter will be added to the columns of df. Thanks!
jitter is a function that takes numeric as input. You cannot simply run jitter on the whole data.frame. You need to loop through the columns. You can do:
data.frame(lapply(df, jitter))
jitter must be applied to a numeric vector, not a data frame.
If you want to apply jitter to all your columns, this should do (note that apply returns a matrix, not a data frame):
apply(df, 2, jitter)
Just adding random numbers?
df_jit <- df + matrix(rnorm(nrow(df) * ncol(df), sd = 0.1), ncol = ncol(df))
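If the result should stay a data frame with the original row names, one option is to assign the jittered columns back into a copy of df. This is a sketch on the data from the question; set.seed() is added only to make the run repeatable:

```r
set.seed(1)  # only for reproducibility of this example

df <- data.frame(col1 = 1:5, col2 = 5:9)
rownames(df) <- letters[1:5]

# Jitter each column and write the result back into a copy of df;
# assigning into df_jit[] keeps the column names and row names
df_jit <- df
df_jit[] <- lapply(df, jitter)

df_jit
```

By contrast, the list returned by lapply carries no row names, so wrapping it in data.frame() produces default row names.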
I've got a frame with a set of different variables - integers, factors, logicals - and I would like to recode all of the "NAs" as a numeric across the whole dataset while preserving the underlying variable class. For example:
frame <- data.frame("x" = rnorm(10), "y" = rep("A", 10))
frame[6,] <- NA
dat <- as.data.frame(apply(frame,2, function(x) ifelse(is.na(x)== TRUE, -9, x) ))
dat
str(dat)
However, here the integers turn into factors; when I include as.numeric(x) in the apply() function, this introduces errors. Thanks for any and all thoughts on how to deal with this.
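apply first coerces the data frame to a matrix, and because one column is not numeric the result is a character matrix. as.data.frame then turns the character columns into factors (the default before R 4.0.0). Instead, you could loop over the columns with lapply: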
dat <- as.data.frame(lapply(frame, function(x) ifelse(is.na(x), -9, x) ) )
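One thing to watch with ifelse() is that it strips attributes, so a factor column comes back as its integer codes. If factor columns should stay factors, a per-class helper is one possibility. The sketch below is an assumption about what "preserving the class" should mean here, and recode_na is a made-up name:

```r
frame <- data.frame(x = rnorm(10), y = factor(rep("A", 10)))
frame[6, ] <- NA

# Hypothetical helper: replace NA by `value` without changing the column class
recode_na <- function(x, value = -9) {
  if (is.factor(x)) {
    # a factor needs the replacement added as a level first
    levels(x) <- c(levels(x), as.character(value))
    x[is.na(x)] <- as.character(value)
  } else {
    # numeric stays numeric; note that a logical column would become numeric
    x[is.na(x)] <- value
  }
  x
}

dat <- frame
dat[] <- lapply(frame, recode_na)
str(dat)
```

With this helper, x remains numeric and y remains a factor with the extra level "-9".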