I have a column Total with about 300 values in a data frame. The first 30 values are NA, and I would like to fill them in with the values of a vector, c(233,423,545,354,223,646,243,553,634----231).
Do you have any suggestions for getting it done?
You could just do:
df[1:30, "Total"] <- yourvector
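A minimal sketch of the same idea on a made-up data frame (the names df and new_values and the numbers are hypothetical). Indexing with is.na() instead of 1:30 avoids hard-coding the positions, but it assumes the replacement vector has exactly as many values as there are NAs:

```r
# Hypothetical data: the first three values of Total are missing
df <- data.frame(id = 1:6, Total = c(NA, NA, NA, 40, 50, 60))
new_values <- c(233, 423, 545)

# Stop early if the number of NAs and replacement values differ
missing_rows <- which(is.na(df$Total))
stopifnot(length(missing_rows) == length(new_values))

# Fill the missing positions in order
df[missing_rows, "Total"] <- new_values
```

If the NAs are known to sit in the first 30 rows, the 1:30 version above does the same thing.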
I have a .csv file of 39 variables and 713 rows, each containing a count of plastic items. I have another column which is the survey length, and I want to standardise each count of items by a survey length of 100. I am unsure how to create a loop to run through each row and cell individually to do this. Many also have NA values.
Any ideas would be great.
Thank you.
Consider applying the formula directly to the columns, with no need for a loop:
# RETRIEVE ALL COLUMN NAMES (MINUS SURVEY LENGTH)
vars <- names(df)[!grepl("survey_length", names(df))]
# EXPAND SINGLE COLUMN TO EQUAL DIMENSION OF DATA FRAME
survey_length_mat <- matrix(df$survey_length, ncol=length(vars), nrow=nrow(df))
# APPLY FORMULA
df[vars] <- (df[vars] / survey_length_mat) * 100
df
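As far as I can tell, the expanded matrix is optional, because dividing a data frame by a vector with one value per row recycles that vector down each column. A small sketch on invented data (counts, bottles and bags are hypothetical names), which also shows that NA counts simply stay NA:

```r
# Hypothetical survey data with some missing counts
counts <- data.frame(
  bottles = c(25, NA, 4),
  bags = c(2, 6, NA),
  survey_length = c(50, 200, 100)
)

# Every column except the survey length is an item count
item_cols <- setdiff(names(counts), "survey_length")

# The survey_length vector is recycled down every item column
counts[item_cols] <- counts[item_cols] / counts$survey_length * 100
```

Each count is now expressed per 100 units of survey length, and survey_length itself is left untouched.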
I have a huge dataset of about 1.6 million rows, and the variable (column) I need to focus on is 'temperature'. The temperature column has many NA values, and the other variable columns have NA values throughout as well. I want to remove only the rows with NA values in the temperature column, I don't particularly care about the NA values in the other columns. How can I do this? If I end up needing to remove rows with NA values for more than just my temperature column, (eg the depth column) how can I select two columns? This is my code:
otn <- tidync(filename, row.names=TRUE) %>% activate('D0')
glider_table <- hyper_tibble(otn)
attach(glider_table)
summary(temperature)
na.omit(glider_table)
na.omit() removes all rows with NA values regardless of which column they're in, so I need something more selective.
You can use the drop_na() function from the tidyr package. The first argument is the data frame, and the remaining arguments are optional: they name the specific columns whose NA values should cause a row to be dropped.
Like this: drop_na(dataset, column)
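If tidyr is not available, roughly the same result can be had in base R. This is only a sketch on a made-up stand-in for glider_table (the data and the salinity column are invented), with the matching drop_na() calls shown as comments:

```r
# Hypothetical stand-in for glider_table
glider <- data.frame(
  temperature = c(4.1, NA, 5.0, 4.7),
  depth = c(10, 20, NA, 40),
  salinity = c(NA, 35, 34, NA)
)

# Drop rows where temperature is NA; NAs in other columns are kept
temp_ok <- glider[!is.na(glider$temperature), ]

# Drop rows where temperature or depth is NA
both_ok <- glider[complete.cases(glider[, c("temperature", "depth")]), ]

# With tidyr loaded, the equivalent calls would be:
# drop_na(glider, temperature)
# drop_na(glider, temperature, depth)
```

The salinity NAs survive in both results because that column is never part of the test.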
Suppose I have a data frame with 6 columns.
How do I replace all the NA values in the first 4 columns with a 0?
I have tried:
grades[is.na(grades), 1:4] = 0
Here is.na is applied to the full data frame, so it returns a logical matrix with the same dimensions as the whole data frame, and that cannot be used as a row index next to 1:4. It is better to subset the first four columns, apply is.na to that subset to get a logical matrix, and then assign 0 to the TRUE positions of the same subset:
grades[1:4][is.na(grades[1:4])] <- 0
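A quick way to check this on a toy data frame (the column names q1 to q6 and the values are made up for illustration):

```r
# Hypothetical grades with NAs scattered over all six columns
grades <- data.frame(
  q1 = c(90, NA), q2 = c(NA, 80), q3 = c(70, 75),
  q4 = c(NA, 65), q5 = c(NA, 60), q6 = c(85, NA)
)

# Replace NAs in the first four columns only
grades[1:4][is.na(grades[1:4])] <- 0
```

Columns q5 and q6 still contain their NAs afterwards, since they are outside the subset.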
I am trying to subset a data frame by taking the integer values of 2 columns in my data frame
Subs1<-subset(DATA,DATA[,2][!is.na(DATA[,2])] & DATA[,3][!is.na(DATA[,3])])
but it gives me an error: longer object length is not a multiple of shorter object length.
How can I construct a subset which is composed of the non-NA values of column 2 AND column 3?
Thanks a lot.
Try this:
Subs1<-subset(DATA, (!is.na(DATA[,2])) & (!is.na(DATA[,3])))
The second argument of subset is a logical vector with the same length as nrow(DATA), indicating for each row whether to keep it.
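To see where the length problem in the original attempt comes from, here is a small sketch with an invented DATA (the columns id, x and y are hypothetical):

```r
# Hypothetical data: one NA in column 2 and one in column 3
DATA <- data.frame(id = 1:4, x = c(1, NA, 3, 4), y = c(5, 6, NA, 8))

# Dropping the NAs first shortens the vector, so it no longer lines up with the rows
length(DATA[, 2][!is.na(DATA[, 2])])   # 3 values for 4 rows

# is.na() on the full column always returns one value per row
keep <- !is.na(DATA[, 2]) & !is.na(DATA[, 3])
Subs1 <- DATA[keep, ]
```

Indexing with keep gives the same rows as the subset() call above.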
The na.omit function can be an answer to your question. Note that, written this way, the result contains only columns 2 and 3:
Subs1 <- na.omit(DATA[2:3])
[https://stat.ethz.ch/R-manual/R-patched/library/stats/html/na.fail.html]
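If the other columns are needed as well, one option is to read back which rows na.omit removed. The sketch below uses an invented DATA and relies on the "na.action" attribute that na.omit attaches to its result:

```r
# Hypothetical data: id is a column we want to keep
DATA <- data.frame(id = 1:3, x = c(1, NA, 3), y = c(NA, 2, 3))

kept <- na.omit(DATA[2:3])
names(kept)   # only x and y, the id column is gone

# Row positions removed by na.omit (empty if nothing was removed)
dropped <- as.integer(attr(kept, "na.action"))

# Keep every column of DATA for the rows that survived
keep_rows <- setdiff(seq_len(nrow(DATA)), dropped)
Subs1 <- DATA[keep_rows, ]
```

setdiff() is used rather than a negative index so that the case with no dropped rows still returns the full data frame.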
Here is an example.
a, b and c are three vectors, of which a and b each have a missing value.
Once they are created, I use cbind to bind them into one matrix, which is then converted to a data frame.
The final result is a data frame where 2 out of 3 columns have a missing value.
So we need to keep only the rows with complete cases. DATA[complete.cases(DATA), ] keeps only the rows that have no missing value in any column, and the subset object holds those rows.
a <- c(1,NA,2)
b <- c(NA,1,2)
c <- c(1,2,3)
DATA <- as.data.frame(cbind(a,b,c))
subset <- DATA[complete.cases(DATA), ]
I am new to R with a fairly simple question, I just can't figure out the answer. For my example I will use a data frame with 3 columns, but my actual data set is 139 columns with 10000 rows.
I want to replace all of the values in a given row with NA if the value in the same row in column C contains a value < 10.
Assume that all of my columns are either numeric or integer values.
so I want to take the data frame:
x=data.frame(c(5,9,2),c(3,4,6),c(12,9,11))
names(x)=c("A","B","C")
and replace row 2 with NA to create
y=data.frame(c(5,NA,2),c(3,NA,6),c(12,NA,11))
names(y)=c("A","B","C")
Thanks!
How about:
x[x$C <10 ,] <- NA
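One caveat, as far as I know: if column C itself contains NA, the comparison x$C < 10 returns NA for those rows, and data frame assignment does not accept NA in the row index. Wrapping the test in which() sidesteps that. A sketch using the data from the question (low_rows is just an illustrative name):

```r
x <- data.frame(A = c(5, 9, 2), B = c(3, 4, 6), C = c(12, 9, 11))

# which() returns only the positions where the test is TRUE and drops NA results
low_rows <- which(x$C < 10)

# Blank out every column of the matching rows
x[low_rows, ] <- NA
```

Row 2 ends up as NA in every column and the other rows are unchanged.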