How to get date difference if column is NA [closed] - r

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I am trying to work out the difference between two date columns in a dataframe, df$started and df$done. The result is put in a third column called df$diff. Some rows of df$diff are NA, and in these rows I want to enter the difference between the current date and df$started. How can I do this?

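In the future, please follow the question guidelines.
library(lubridate) # not strictly needed: only base R functions are used below
# Example data: the second row has no end date yet
start = as.Date(c("14.01.2015", "26.03.2015"), format = "%d.%m.%Y")
end = as.Date(c("18.01.2015", NA), format = "%d.%m.%Y")
# Where end is known use end - start, otherwise use the time elapsed since start
diff = ifelse(!is.na(end), difftime(end, start, units = "days"), difftime(Sys.time(), start, units = "days"))
df = data.frame(start, end, diff)
View(df)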
       start        end     diff
1 2015-01-14 2015-01-18   4.0000
2 2015-03-26         NA 174.7846
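The fractional value in the second row comes from comparing a Date with Sys.time(). If whole days are wanted, one possible variation is to fill the missing end dates with a reference date first and subtract afterwards. This is only a sketch: `today`, `end_filled` and `diff_days` are names introduced here, and `today` is pinned to a hypothetical fixed date so the result is reproducible (in practice it would be Sys.Date()).

```r
# Same example data as in the answer above
start <- as.Date(c("14.01.2015", "26.03.2015"), format = "%d.%m.%Y")
end   <- as.Date(c("18.01.2015", NA), format = "%d.%m.%Y")

# Hypothetical fixed reference date; replace with Sys.Date() for real use
today <- as.Date("2015-09-16")

# Fill missing end dates with the reference date, then subtract
end_filled <- end
end_filled[is.na(end_filled)] <- today

# Date - Date gives a difftime in days; as.numeric() drops the units
diff_days <- as.numeric(end_filled - start)

df <- data.frame(start, end, diff = diff_days)
print(df)
```

Filling before subtracting avoids ifelse(), which strips the difftime class from its result.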

Related

How to find the percentage of observations in each day in R? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 17 days ago.
I have a dataset with a time column, where the observations are 6 hours apart, and a column called heaterstatus.
The dataset: [image not included]
I would like to know the percentage of zeros in heaterstatus for each day. I am new to R, so any suggestion will be helpful.
Not tested since you only provided data as an image, but this should do what you want:
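library(dplyr)
# mean() of a logical vector is the proportion of TRUE values;
# multiply by 100 if a percentage is needed
dat %>%
  group_by(day = as.Date(Time)) %>%
  summarize(pct_0 = mean(HeaterStatus == 0))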
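Since the original data was only posted as an image, here is a base R sketch of the same idea on made-up data. The column names Time and HeaterStatus follow the answer above; the values themselves, and the names `day` and `pct_zero`, are hypothetical and only there to make the example runnable.

```r
# Hypothetical data: 8 readings, 6 hours apart, covering two days
dat <- data.frame(
  Time = seq(as.POSIXct("2023-01-01 00:00:00", tz = "UTC"),
             by = "6 hours", length.out = 8),
  HeaterStatus = c(0, 1, 0, 0, 1, 1, 0, 1)
)

# Calendar day of each reading (time zone stated explicitly)
day <- as.Date(dat$Time, tz = "UTC")

# Share of zero readings per day, expressed as a percentage
pct_zero <- tapply(dat$HeaterStatus == 0, day, mean) * 100
print(pct_zero)
```

With this toy data the first day has three zeros out of four readings and the second day has one out of four.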

How to convert dates in m/d/y and m-d-y to a single format in R [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed last month.
activity$ActivityDate = as.POSIXct(activity$ActivityDate, format="%d/%m/%y", tz=Sys.timezone()) returns NA in the ActivityDate and Date columns in R.
I can only guess what you would like to achieve based on your question title - it would help if you could include a minimal reproducible example and some prose describing your problem.
It seems that lubridate's parsing functions might do what you need:
ActivityDate <- c("07/31/2022", "07-31-2022")
lubridate::mdy(ActivityDate)
#> [1] "2022-07-31" "2022-07-31"
Created on 2022-12-21 with reprex v2.0.2
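If lubridate is not available, a rough base R equivalent is to normalize the separator and then parse with a single format string. This sketch assumes every value is month, day, four-digit year, as in the example vector above; `normalized` and `parsed` are names introduced here.

```r
ActivityDate <- c("07/31/2022", "07-31-2022")

# Replace "-" with "/" so both variants share one format
normalized <- gsub("-", "/", ActivityDate, fixed = TRUE)

# %m = month, %d = day, %Y = four-digit year
parsed <- as.Date(normalized, format = "%m/%d/%Y")
print(parsed)
```

Note that a format such as "%d/%m/%y" reads the first field as the day and the second as the month, so a value like "07/31/2022" cannot be parsed (there is no month 31) and becomes NA.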

create an extra column based on 16 other numeric columns [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
I have 16 variables that are numeric, and I need to create an extra column that is YES (otherwise NO) when 3 or more variables out of those 16 have a value above 1015.
How could I do that?
Thanks
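You can try rowSums:
# Positions of the 16 numeric columns (adjust if they are not the first 16)
cols <- 1:16
# Count, per row, how many of those columns exceed 1015 (NA values are ignored)
df$res <- ifelse(rowSums(df[cols] > 1015, na.rm = TRUE) >= 3, 'Yes', 'No')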
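To see what the rowSums approach does, here is a small sketch with four made-up columns instead of sixteen. The data frame, its column names and `n_above` are hypothetical; only the threshold of 1015 and the "3 or more" rule come from the question.

```r
# Hypothetical data: 3 rows, 4 numeric columns, one missing value
df <- data.frame(
  a = c(1020, 1000, 1016),
  b = c(1016, 1010, NA),
  c = c(1030, 1020, 1017),
  d = c(900, 1000, 1018)
)
cols <- 1:4

# df[cols] > 1015 is a logical matrix; rowSums counts the TRUE values per row
n_above <- rowSums(df[cols] > 1015, na.rm = TRUE)

# Flag rows where at least 3 columns exceed the threshold
df$res <- ifelse(n_above >= 3, "Yes", "No")
print(df)
```

With na.rm = TRUE a missing value simply does not count towards the total, so the third row still reaches three values above 1015.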

Changing Horizontal Label axis in curve fitting in R [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I have tried to plot sales orders against time. The graph is as follows: [image not included]
I want to replace the number of days on the horizontal axis with dates, which means replacing:
0 by 22/09/2016
50 by 11/11/2016
100 by 31/12/2016
150 by 19/02/2017
200 by 10/04/2017
I don't know how to proceed.
You will want to make use of the xts package, which can handle daily values:
library(xts)
v <- 0:200 # your data here
d <- seq.Date(as.Date('2016-09-22'), as.Date('2017-04-10'), length.out = 201)
plot(xts(v, order.by = as.POSIXct(d)))
Here is a great cheat sheet:
https://www.datacamp.com/community/blog/r-xts-cheat-sheet#gs.1XjXRyI
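If you would rather keep a plain base R plot, another option is to suppress the default x axis and draw your own with date labels. The sketch below assumes that day 0 corresponds to 22/09/2016, as stated in the question; `origin`, `ticks`, `tick_labels` and `plot_with_dates()` are names introduced here, and the plotted values are a placeholder.

```r
# Day 0 of the series, taken from the question
origin <- as.Date("2016-09-22")

# Tick positions in days, and the matching date labels
ticks <- c(0, 50, 100, 150, 200)
tick_labels <- format(origin + ticks, "%d/%m/%Y")
print(tick_labels)

# Hypothetical helper: plots v against day number with date labels on the x axis
plot_with_dates <- function(v) {
  days <- seq_along(v) - 1
  plot(days, v, type = "l", xaxt = "n", xlab = "Date", ylab = "Sales orders")
  axis(1, at = ticks, labels = tick_labels)
}
```

Calling plot_with_dates(0:200) draws the placeholder series; pass your own vector of 201 daily values instead.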

Calculate churn in R given data variables [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm trying to calculate the number of retained students using R. The two variables I'm working with are 'registration_date' (mm/dd/yy) and 'date_of_last_login' (mm/dd/yy). A student is considered retained if they logged in within the preceding 30 days.
ID                  1       2        3        4        5
registration_date   2/1/15  2/1/15   3/15/15  2/10/15  4/15/15
date_of_last_login  2/3/15  3/15/15  4/30/15  4/25/15  5/16/15
I imagine the idea is to create a new variable: 'retained students' but I am not sure how to set up the formula in R.
Assuming you mean the 30 days previous to today:
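# First three login dates from the question, as mm/dd/yy strings
last_login <- c("2/3/15","3/15/15","4/30/15")
login <- as.Date(last_login, format = '%m/%d/%y')
# TRUE where the last login was fewer than 30 days before today
retained_students <- (Sys.Date()-login < 30)
retained_students
retained_students is then a logical vector with TRUE or FALSE for each login.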
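To get an actual count, and to make the result reproducible, the same comparison can be run against a fixed reference date on all five records from the question. This is a sketch under assumptions: the reference date 2015-05-20 is hypothetical (the question does not say what "preceding" is measured from), "within 30 days" is read as 30 days or fewer, and `students`, `days_since` and `n_retained` are names introduced here.

```r
# The five records from the question
students <- data.frame(
  ID = 1:5,
  date_of_last_login = c("2/3/15", "3/15/15", "4/30/15", "4/25/15", "5/16/15"),
  stringsAsFactors = FALSE
)

# Hypothetical reference date; use Sys.Date() to measure from today
reference_date <- as.Date("2015-05-20")

# Parse mm/dd/yy strings and compute days since the last login
last_login <- as.Date(students$date_of_last_login, format = "%m/%d/%y")
days_since <- as.numeric(reference_date - last_login)

# Retained = last login at most 30 days before the reference date
students$retained <- days_since <= 30
n_retained <- sum(students$retained)
print(students)
print(n_retained)
```

sum() on a logical vector counts the TRUE values, which gives the number of retained students directly.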
