R programming: ifelse() [duplicate]

I have a dataset named malt, where one of the columns is named ka. I want to replace the NA values in the ka column with the mean of malt$ka and leave the other values as they are, so I tried ifelse:
malt$ka <- ifelse(malt$ka=="NA", mean(malt$ka), "malt$AcqCostPercust")
This does not seem to work, and I am confused about how to replace the NA values.

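The comparison malt$ka == "NA" tests against the string "NA" rather than a missing value, and mean() returns NA unless na.rm = TRUE is set. Use is.na() to find the missing values instead: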
malt$ka[is.na(malt$ka)] <- mean(malt$ka, na.rm = TRUE)

Or, keeping ifelse:
x <- mean(malt$ka, na.rm = TRUE)  # mean of the non-missing values
malt$ka <- ifelse(is.na(malt$ka), x, malt$ka)
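
As a minimal sketch of why the original attempt fails and how the fix behaves, assuming a small made-up malt data frame (the values are illustrative and not from the question):

```r
# Toy stand-in for the asker's data; the values are made up for illustration.
malt <- data.frame(ka = c(2, NA, 4, NA, 6))

# Comparing with the string "NA" never matches a missing value:
# the comparison itself yields NA for the missing entries.
malt$ka == "NA"

# mean() without na.rm = TRUE returns NA when any value is missing.
mean(malt$ka)

# Working replacement: locate the NAs with is.na() and use the NA-free mean.
ka_mean <- mean(malt$ka, na.rm = TRUE)
filled <- ifelse(is.na(malt$ka), ka_mean, malt$ka)
filled
```

The name ka_mean and the result vector filled are hypothetical helpers; in practice the result would be assigned back to malt$ka as in the answers above.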

Related

replace the NA value [duplicate]

In the following case I want to replace the NA with the corresponding z column value. How do I assign it?
df <- data.frame(x = c(1, 2, 3, 4, 5, NA, 7, NA, 9, 10),
                 z = c(1:10))
With the code below, the NAs are replaced with the first z values (1 and 2), but I need 6 and 8.
df$x[is.na(df$x)] <- (df$z)
We can use the same logical index on the right-hand side, so that the replacement values come from the same rows:
df$x[is.na(df$x)] <- df$z[is.na(df$x)]
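
A short sketch using the data frame from the question shows the effect. In the original attempt the right-hand side has 10 values for only 2 slots, so R takes the first two (with a warning); indexing both sides with the same mask avoids that. The variable name missing is an assumption added for readability.

```r
df <- data.frame(x = c(1, 2, 3, 4, 5, NA, 7, NA, 9, 10), z = 1:10)

missing <- is.na(df$x)          # TRUE at positions 6 and 8
df$x[missing] <- df$z[missing]  # both sides select the same rows
df$x
```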

Is there a way to set all cells in a dataframe that hold a vector to NA? [duplicate]

I have a dataframe in R, and I am trying to set every cell that holds a vector, such as c(1,2,3) or 1:2, to NA. Is there an easy way to do this?
You can use lengths to count the number of elements in each cell of the column, then set the cells to NA where the length is greater than 1. Here I assume the dataframe is named df and the column col_name. Change them according to your data.
df$col_name[lengths(df$col_name) > 1] <- NA
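
For illustration, here is a sketch on a made-up data frame with a list column. The names df and col_name follow the answer's placeholders, and the cell contents are assumptions.

```r
# Hypothetical data frame with a list column; the contents are made up.
df <- data.frame(id = 1:4)
df$col_name <- list(5, c(1, 2, 3), 1:2, "a")

lengths(df$col_name)   # number of elements held by each cell

# Cells holding more than one element become NA; single values are untouched.
df$col_name[lengths(df$col_name) > 1] <- NA
df$col_name
```

The column stays a list after the assignment, so it may still need unlist() or similar if an atomic column is wanted.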

Replace value in dataframe by value divided by mean of the entire column [duplicate]

I need to normalise my data by dividing each value by the mean of the entire column, preferably using dplyr.
Assume:
inputs <- c(3,5,3,9,12)
mydata = data.frame(inputs)
I would like all the values replaced by themselves divided by the mean, which is 6.4.
Any straightforward suggestion?
We can use sapply in base R for a generalized approach (note that it returns a matrix rather than a data frame):
sapply(mydata, function(x) x/mean(x))
Or with colMeans, which also works when there is more than one column:
mydata/colMeans(mydata)[col(mydata)]
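
As a base R variant of the same idea (a sketch, not the answer's own code), lapply combined with assignment into normalised[] keeps the result a data frame. The name normalised is an assumption.

```r
inputs <- c(3, 5, 3, 9, 12)
mydata <- data.frame(inputs)

# Divide every column by its own mean; assigning into normalised[]
# replaces the columns while keeping the data frame class.
normalised <- mydata
normalised[] <- lapply(mydata, function(x) x / mean(x))

normalised$inputs         # each value divided by the column mean, 6.4
mean(normalised$inputs)   # a column divided by its mean has mean 1
```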

How to replace a certain value with the smallest/largest value in column [duplicate]

This question already has answers here:
Replace all values lower than threshold in R
(4 answers)
Closed 3 years ago.
I have some values such as -77777 which denote a special type of missing info in my dataset. I want to replace these with the smallest value or the largest value in their own column. Say I'm working with dataset HLDE and column is RTLM.
HLDE <- data.frame(RTLM = c(0:9, -77777))
This is not a duplicate! The so-called duplicate has no resemblance.
Use conditional assignment with max or min. To make it more robust set na.rm=TRUE.
HLDE[HLDE$RTLM == -77777, "RTLM"] <- max(HLDE$RTLM, na.rm=TRUE)
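
One caveat when the smallest value is wanted: min over the whole column would return the sentinel itself. The sketch below, which uses the question's data with hypothetical helper names sentinel, is_sentinel and valid, computes the extreme from the genuine values only.

```r
HLDE <- data.frame(RTLM = c(0:9, -77777))

sentinel <- -77777
is_sentinel <- HLDE$RTLM %in% sentinel   # %in% never returns NA, unlike ==

# Take the extreme from the genuine values only, so the sentinel cannot win min().
valid <- HLDE$RTLM[!is_sentinel]
HLDE$RTLM[is_sentinel] <- min(valid, na.rm = TRUE)   # or max(valid, na.rm = TRUE)
HLDE$RTLM
```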

How to calculate NA's in each column and drop columns having more than 100 NA's [duplicate]

I am a beginner and I am trying something like this:
for (i in newTrain) {
count = 0
count = length(which(is.na(newTrain$i)))
names(-which(count>100))
}
but this isn't working at all for me.
We could first apply is.na to the entire dataframe and then sum the NAs in every column with colSums. Then select the columns that have at most 100 NAs, which drops those with more than 100.
newTrain[colSums(is.na(newTrain)) <= 100]
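
A small sketch on made-up data shows the two steps separately. The columns a, b and c and the names na_counts and kept are assumptions. The comparison is written as <= 100 so that a column with exactly 100 NAs is kept, matching "more than 100" in the question.

```r
# Hypothetical stand-in for newTrain: 150 rows, columns with different NA counts.
newTrain <- data.frame(
  a = c(rep(NA, 120), 1:30),   # 120 NAs
  b = c(rep(NA, 10), 1:140),   # 10 NAs
  c = 1:150                    # no NAs
)

na_counts <- colSums(is.na(newTrain))   # named vector: NA count per column
na_counts

kept <- newTrain[na_counts <= 100]      # keep columns with at most 100 NAs
names(kept)
```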
