Calculation inaccuracy in date difference [closed] - r

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 4 years ago.
I tried to compute the number of days from each start date to the present using Sys.Date(), with the command below. However, the output in R is completely wrong.
person$duration <- lubridate::interval(as.Date(person$create_date, "%m/%d/%y"),Sys.Date()) %/% days()
The output:
Name  create_date  duration
A     09/23/2014   -811
B     05/05/2014   -670
It is supposed to be 1380 days, not -811. I am not sure why it is negative, or why it is '-811' or '-670' specifically.

You were very close. Since your year consists of 4 digits, you need a capital Y.
library(lubridate)
interval(as.Date("09/23/2014", "%m/%d/%Y"),Sys.Date()) %/% days()
gives 1380.
With %y, your code read only the first two digits of the year ("20") and ignored the rest, so "2014" was parsed as the year 2020. To be exact: when a year is given as two digits, values from 69 to 99 are converted to 1969-1999, and values from 00 to 68 to 2000-2068.
interval(as.Date("09/23/2020", "%m/%d/%Y"),Sys.Date()) %/% days()
gives -811 as well.
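The difference between the two format codes can be checked directly in base R. This is a minimal sketch using the date from the question; the two-digit inputs ("68" and "69") are added here only to illustrate the pivot described above.

```r
# %y consumes only two digits of the year; %Y consumes all four.
two_digit  <- as.Date("09/23/2014", "%m/%d/%y")  # reads "20", ignores the trailing "14"
four_digit <- as.Date("09/23/2014", "%m/%d/%Y")

format(two_digit)   # "2020-09-23"
format(four_digit)  # "2014-09-23"

# Pivot for genuine two-digit years: 00-68 -> 20xx, 69-99 -> 19xx
format(as.Date("01/01/68", "%m/%d/%y"))  # "2068-01-01"
format(as.Date("01/01/69", "%m/%d/%y"))  # "1969-01-01"
```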

Use a simple difference:
as.numeric(as.Date("2018-07-05") - Sys.Date())
# wrap the result in abs() if the date is in the past
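For a reproducible check, the same subtraction can be run against a fixed reference date instead of Sys.Date(). The reference date 2018-07-04 below is an assumption, chosen because it reproduces the 1380 days reported in the question.

```r
start     <- as.Date("09/23/2014", "%m/%d/%Y")
reference <- as.Date("2018-07-04")  # assumed "today"; use Sys.Date() for live data

# Subtracting two Date objects gives a difftime measured in days
days_elapsed <- as.numeric(reference - start)
days_elapsed  # 1380

# difftime() makes the unit explicit
as.numeric(difftime(reference, start, units = "days"))  # 1380
```

Putting the later date first keeps the result positive, so abs() is not needed.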

Related

Why means are different in Rstudio? [closed]

I am using the mean function in R, but I get a different answer every time I change the order of the inputs.
I can't understand why. Please help me.
Thanks in advance.
> mean(10,12,13)
[1] 10
> mean(12,10,13)
[1] 12
> mean(13,10,12)
[1] 13
You should pass numbers as a vector to mean and not as separate values. See ?mean.
mean(x, ...)
So when you call mean(10,12,13), only the first argument (10) is treated as x. You get the mean of 10 alone, hence the same number is returned. The same happens with mean(12,10,13).
Pass them as a vector with c(...).
mean(c(10,12,13))
#[1] 11.66667
mean(c(12,10,13))
#[1] 11.66667
This differs from functions like sum/min/max, which accept several numbers as comma-separated arguments.
sum(10, 12, 13)
#[1] 35
min(10, 12, 9)
#[1] 9
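If calling with separate values is more convenient, a small wrapper can collect them into a vector first. The helper name mean_of below is hypothetical (it is not part of base R); this is only a sketch of the idea.

```r
# Only the first positional argument is x; the others are not averaged.
first_only <- mean(10, 12, 13)      # 10
as_vector  <- mean(c(10, 12, 13))   # 11.66667

# Hypothetical helper: gather the separate values into one vector, then average
mean_of <- function(...) mean(c(...))
via_helper <- mean_of(10, 12, 13)   # 11.66667
```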

Make the Day of Week Variable Binary with Baseline [closed]

I want to make the day of the week variable binary, but also have a baseline which should be Saturday.
I have tried the following, but this does not give me my intended result at all. Is there a need to use a logical?
The range of the days of the week is 1 to 7, where 1 would be Monday.
First I renamed the numbers to the days of the week.
df$DayofW <- recode(df$dowc,
"1"="Monday",
"2"="Tuesday",
"3"="Wednesday",
"4"="Thursday",
"5"="Friday",
"6"="Saturday",
"7"="Sunday")
df$DayofW <- ifelse((df$dowc == 6), 1, -1)
The issue is assigning (=) instead of comparing (==) in the test. According to ?ifelse, the first argument is
test - an object which can be coerced to logical mode.
So it needs a comparison operator:
ifelse((df$dowc == 6), 1, -1)
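A small end-to-end sketch on made-up data may help. The sample values and the column name is_saturday are assumptions for illustration. The second half shows one common way to make Saturday the baseline, using relevel() on a factor, which may or may not match what the asker intended.

```r
df <- data.frame(dowc = c(1, 6, 7, 6, 3))  # made-up sample values

# Binary indicator: 1 for Saturday (code 6), -1 otherwise
df$is_saturday <- ifelse(df$dowc == 6, 1, -1)

# Factor with Saturday as the reference (baseline) level
day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")
df$DayofW <- relevel(factor(df$dowc, levels = 1:7, labels = day_names),
                     ref = "Saturday")

df$is_saturday        # -1  1 -1  1 -1
levels(df$DayofW)[1]  # "Saturday"
```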

Get distinct values from columns in R [closed]

I have the following kind of data in my csv file
DriveNo  Date and Time        Longitude
156      2014-01-31 23:00:00  41.88367183
187      2014-01-31 23:00:01  41.92854
The data are noisy. Sometimes a driver (the DriveNo is unique) appears in two different locations at the same time, which is impossible and must be noise. I tried to remove these rows using distinct(select(five,DriveNo,Date and Time))
but I get the following error
Error: unexpected symbol in "distinct(select(five,DriveNo,Date and"
However, when i try
distinct(select(five,DriveNo,Longitude))
it works. But I need it with DriveNo and Date and Time.
You can escape the column name with backticks, like:
df %>%
select(DriveNo, `Date and Time`, Longitude) %>%
distinct()
or using group_by, like:
df %>%
group_by(DriveNo, `Date and Time`) %>%
select(Longitude) %>%
unique()
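A base R sketch of the same idea is shown below, keeping one row per driver and timestamp. The sample rows are made up (the second row is a hypothetical conflicting reading), and this swaps dplyr for duplicated(). In dplyr, distinct() with .keep_all = TRUE on the two key columns should behave similarly.

```r
# check.names = FALSE keeps the column name with spaces intact
five <- data.frame(
  DriveNo = c(156, 156, 187),
  `Date and Time` = c("2014-01-31 23:00:00",
                      "2014-01-31 23:00:00",   # same driver, same time
                      "2014-01-31 23:00:01"),
  Longitude = c(41.88367183, 41.90, 41.92854),
  check.names = FALSE
)

# Columns with spaces can be addressed by string, so no backticks are needed here
key     <- five[c("DriveNo", "Date and Time")]
deduped <- five[!duplicated(key), ]  # keeps the first row for each driver/time pair

nrow(deduped)  # 2
```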

conditionally selecting two numeric row values in R [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 6 years ago.
I'm trying to subset a data set to remove all values before the 7th month of the year 2011. I have Years and Months in different columns.
I know what I am doing is logically wrong (and it gives the wrong output), but I can't seem to figure out the right way to do this:
state_in2_check <- subset(state_in2, Month > 6 & Year > 2011)
@thelatemail has given you a workable solution in the comments. Your problem is that you're asking R to apply two logical checks independently, but each of those checks depends on the other. You won't, for example, get any January dates (because you're only accepting months greater than 6), even though "Jan-2013" would be fine. @thelatemail's solution separates the checks, so that months of 6 or lower are accepted as long as they're in years greater than 2011.
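The comment itself is not reproduced here, so the following is only a sketch of what such a combined condition could look like, run on made-up rows: keep everything after 2011, plus the months of 2011 from July onward.

```r
state_in2 <- data.frame(Year  = c(2010, 2011, 2011, 2012, 2013),
                        Month = c(12,   6,    7,    1,    1))  # made-up rows

# Later years pass regardless of month; 2011 passes only from July onward
state_in2_check <- subset(state_in2,
                          Year > 2011 | (Year == 2011 & Month >= 7))

state_in2_check$Year   # 2011 2012 2013
state_in2_check$Month  # 7 1 1
```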
Another way would be to convert to a date while subsetting, which makes the process a little more logical:
Month <- 7
Year <- 2011
as.Date( paste( Year, Month, 15, sep = "-" ) )
[1] "2011-07-15"
You can use that simple conversion to subset in a more (in my opinion) logical way:
state_in2_check <- subset(state_in2,
as.Date( paste( Year, Month, 15, sep = "-" ) ) >
as.Date( "2011-06-15" )
)
Note I've made the day of the month the same in both date conversions, which will mean they're compared only according to month/year.
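The date-based filter can be tried on a few made-up rows. As a cross-check, the sketch also builds a single numeric year-month key (Year * 100 + Month), which is a different technique from the one in the answer and is included only for comparison.

```r
state_in2 <- data.frame(Year  = c(2010, 2011, 2011, 2012, 2013),
                        Month = c(12,   6,    7,    1,    1))  # made-up rows

# Date-based filter, as described in the answer
by_date <- subset(state_in2,
                  as.Date(paste(Year, Month, 15, sep = "-")) >
                    as.Date("2011-06-15"))

# Alternative (not from the answer): compare one numeric year-month key
by_key <- subset(state_in2, Year * 100 + Month >= 201107)

identical(by_date, by_key)  # TRUE
```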

less than negative in R [closed]

How can I filter a data frame column by a range? The range is: less than a negative number and greater than a positive number. For the positive number there is no problem, but for the negative number R treats it as an assignment operator. Code for reference is given below:
Resited<-Reap[mean < -5 & mean > 5,]
"mean" cannot be less than -5 and more than 5 at the same time. Did you mean logical OR? If both abs values are the same, you could simply write
Resited <- Reap[abs(mean) > 5, ]
The simplest way, in my opinion, is just to put parentheses around your negative value:
Resited<-Reap[mean<(-5) | mean>5,]
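To see why spacing or parentheses matter, the parsed expressions can be inspected with quote(), which returns the unevaluated call. This is a small sketch; the variable names are only for illustration.

```r
tight  <- quote(mean<-5)    # no space: parsed as an assignment
spaced <- quote(mean < -5)  # space before the minus: parsed as a comparison
parens <- quote(mean<(-5))  # parentheses also force a comparison

# The first element of each call is the operator R actually parsed
as.character(tight[[1]])   # "<-"
as.character(spaced[[1]])  # "<"
as.character(parens[[1]])  # "<"
```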
For the case where your range is not symmetric around zero, R allows you to define your own operators:
`%between%` = function(x,range) x>range[1] & x<range[2]
-6 %between% c(-7,-2)
[1] TRUE
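Since the question asks for values outside a range, the complement may be closer to what is needed. The operator name %outside%, the sample data, and the bounds c(-5, 7) below are all assumptions for illustration.

```r
`%between%` <- function(x, range) x > range[1] & x < range[2]
# Hypothetical companion operator: TRUE for values strictly outside the range
`%outside%` <- function(x, range) x < range[1] | x > range[2]

Reap <- data.frame(id = 1:5, mean = c(-8, -3, 0, 4, 9))  # made-up data

# Refer to the column as Reap$mean so it is not confused with the mean() function
Resited <- Reap[Reap$mean %outside% c(-5, 7), ]

Resited$id  # 1 5
```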
