conditionally selecting two numeric row values in R [closed] - r

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 6 years ago.
I'm trying to subset a data set to remove all values before the 7th month of the year 2011. I have Years and Months in different columns.
I know what I'm doing is logically wrong (and it gives the wrong output), but I can't figure out the right way to do this:
state_in2_check <- subset(state_in2, Month > 6 & Year > 2011)

@thelatemail has given you a workable solution in the comments. Your problem is that you're asking R to apply two logical checks independently, when each of those checks is dependent on the other. You won't, for example, get any January dates (because you're only accepting months greater than 6), even though "Jan-2013" would be fine. Your condition also requires Year > 2011, so it drops all of 2011, including July to December. @thelatemail's solution restructures the checks so that months of 6 or lower are accepted, as long as they're in years greater than 2011.
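The comment itself is not reproduced on this page, so the following is only a sketch of a condition with the shape described above. The toy data frame and its value column are made up for illustration; only the Year and Month column names come from the question.

```r
# Hypothetical toy data (not the asker's data set)
state_in2 <- data.frame(
  Year  = c(2010, 2011, 2011, 2012, 2013),
  Month = c(12, 6, 7, 1, 1),
  value = 1:5
)

# Keep every month of any later year, or July onward within 2011
state_in2_check <- subset(state_in2,
                          Year > 2011 | (Year == 2011 & Month > 6))

state_in2_check$value  # 3 4 5
```

The January 2012 and January 2013 rows survive because the month check only applies to 2011.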
Another way would be to convert to a date at the same time as subsetting, which makes the process a little more logical:
Month <- 7
Year <- 2011
as.Date( paste( Year, Month, 15, sep = "-" ) )
[1] "2011-07-15"
You can use that simple conversion to subset in a more (in my opinion) logical way:
state_in2_check <- subset(state_in2,
as.Date( paste( Year, Month, 15, sep = "-" ) ) >
as.Date( "2011-06-15" )
)
Note I've made the day of the month the same in both date conversions, which will mean they're compared only according to month/year.
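As a rough check of the date-based filter, here is a minimal sketch on made-up data. The data frame contents and the extra date column are assumptions for illustration; storing the date in a column is optional, but it can be handy for sorting or plotting later.

```r
# Hypothetical toy data (not the asker's data set)
state_in2 <- data.frame(
  Year  = c(2010, 2011, 2011, 2012, 2013),
  Month = c(12, 6, 7, 1, 1),
  value = 1:5
)

# Build one Date per row, always using day 15 so only month/year matter
state_in2$date <- as.Date(paste(state_in2$Year, state_in2$Month, 15, sep = "-"))

# Keep everything after June 2011
kept <- state_in2[state_in2$date > as.Date("2011-06-15"), ]

kept$value  # 3 4 5
```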

Related

Change dates to number [closed]

tsReturns = xts(x = returns, order.by = dates)
I have a time series ordered by date. How can I change it to be ordered by numbers (1 to n) instead? The first date should correspond to 1 and the last date to n, without changing the data values.
1) xts does not support a plain numeric index. It requires one of several date or datetime classes; however, a plain numeric index could be achieved with zoo or ts. If x is an xts object and we want the plain numeric index to be consecutive numbers from 1 to nrow(x) then:
zoo(coredata(x))
ts(coredata(x))
2) If instead:
the desired index of the ith row is to be the number of days since the first date plus 1 and
the index of x is of Date class and
the dates are not consecutive but are unique, e.g. there are gaps for weekends
then this will give a non-consecutive index for zoo. Since ts can only represent regularly spaced series, the ts solution below will fill in the values with NA where there is no date in the input.
tt <- as.numeric(time(x))
z <- zoo(coredata(x), tt - tt[1] + 1)
as.ts(z)
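Both cases can be tried on a small made-up series. This is only a sketch: the three dates and values are assumptions chosen to include a weekend gap, and it requires the xts and zoo packages to be installed.

```r
library(xts)  # also attaches zoo

# Hypothetical series: Friday, then Monday and Tuesday (weekend gap)
dates <- as.Date(c("2020-01-03", "2020-01-06", "2020-01-07"))
x <- xts(c(10, 20, 30), order.by = dates)

# Case 1: consecutive index 1..nrow(x)
z1 <- zoo(coredata(x))
index(z1)      # 1 2 3

# Case 2: index = days since the first date, plus 1
tt <- as.numeric(time(x))
z2 <- zoo(coredata(x), tt - tt[1] + 1)
index(z2)      # 1 4 5

# Converting to ts pads the missing days (2 and 3) with NA
t2 <- as.ts(z2)
```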

Calculation inaccuracy in date difference [closed]

I tried to find the number of days from a starting date until the present, using Sys.Date(). I use the following command to find the duration. However, the output in R is totally wrong.
person$duration <- lubridate::interval(as.Date(person$create_date, "%m/%d/%y"),Sys.Date()) %/% days()
The output:
Name create_date duration
A 09/23/2014 -811
B 05/05/2014 -670
It is supposed to be 1380 days, NOT -811. I am not sure why it is negative, or why it is '-811' or '-670' specifically.
You were very close. Since your year consists of 4 digits, you need a capital Y.
library(lubridate)
interval(as.Date("09/23/2014", "%m/%d/%Y"),Sys.Date()) %/% days()
gives 1380.
In your code, %y read only the first two digits of the year ("20") and assumed you wanted the current century, so the date was parsed as year 2020. To be exact: when a two-digit year is supplied, values between 69 and 99 are converted to 1969-1999, and values between 00 and 68 to 2000-2068.
interval(as.Date("09/23/2020", "%m/%d/%Y"),Sys.Date()) %/% days()
gives -811 as well.
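The difference between the two format codes can be seen with base R alone. In this sketch the reference date 2018-07-05 is an assumption, fixed so that the result does not depend on when the code is run; with that date the -811 from the question is reproduced.

```r
# Assumed reference date standing in for Sys.Date()
today <- as.Date("2018-07-05")

wrong <- as.Date("09/23/2014", "%m/%d/%y")  # %y consumes only "20" -> year 2020
right <- as.Date("09/23/2014", "%m/%d/%Y")  # %Y consumes "2014"

format(wrong)              # "2020-09-23"
format(right)              # "2014-09-23"

as.numeric(today - wrong)  # -811
as.numeric(today - right)  # 1381
```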
Use the simple difference:
as.numeric(as.Date("2018-07-05") - Sys.Date())
# the result is negative if the date is in the past; wrap it in abs() if you only want the magnitude

Calculate Mean for Parts of time Series [closed]

I have a data frame of annual values that looks something like this:
Time Value
01/2000-12/2000 123
01/2001-12/2001 126
01/2002-12/2002 129
...
01/2040-12/2040 223
I would like to calculate the mean for certain parts of the time series (e.g. 2010-2015; 2015-2020; etc.)
Can anyone tell me how to do it?
# first extract the year
df$year <- as.numeric(sub(".*\\/", "", df$Time))
# then a simple mean() does the work for you!
mean(df$Value[df$year >= 2000 & df$year <= 2005])
You can do something like this once your Time column has been converted to Date format.
To convert the column into Date format, use:
my.data.frame$Date = as.Date(paste("01.01.",sub(".*\\/", "", my.data.frame$Time),sep = ""),format = "%d.%m.%Y")
Then to calculate the mean:
mean(my.data.frame[my.data.frame$Date >= "2016-01-01" & my.data.frame$Date <= "2020-01-01","Value"])
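If the mean is needed for several periods at once, one option is to bin the years with cut() and average per bin. This is a sketch on made-up data: the Value column below is synthetic, and the half-open periods (2010-2014, 2015-2019) are an assumption, since the question's ranges overlap at 2015.

```r
# Synthetic data in the same "01/YYYY-12/YYYY" layout as the question
yrs <- 2000:2040
df <- data.frame(
  Time  = sprintf("01/%d-12/%d", yrs, yrs),
  Value = 123 + 3 * (yrs - 2000)
)

# Extract the year after the last slash
df$year <- as.numeric(sub(".*/", "", df$Time))

# Assign each year to a period; years outside the breaks become NA
df$period <- cut(df$year,
                 breaks = c(2010, 2015, 2020),
                 right  = FALSE,
                 labels = c("2010-2014", "2015-2019"))

# Mean of Value per period
period_means <- tapply(df$Value, df$period, mean)
period_means  # 2010-2014: 159, 2015-2019: 174
```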

categorizing date in R [closed]

I'm working with a dataset in R where the main area of interest is the date. (It has to do with army skirmishes and the date of the skirmish is recorded). I wanted to check if these were more likely to happen in a given season, or near a holiday, etc, so I want to be able to see how many dates there are in the summer, winter, etc but I'm sort of at a loss for how to do that.
A general recommendation: use the package lubridate for converting from strings to dates if you're having trouble with that. Use cut() to divide dates into ranges, like so:
someDates <- c( '1-1-2013',
'2-14-2013',
'3-5-2013',
'8-21-2013',
'9-15-2013',
'11-28-2013',
'12-22-2013')
cutpoints<- c('1-1-2013',# start of range 'winter'
'3-20-2013',# spring
'6-21-2013',# summer
'9-23-2013',# fall
'12-21-2013',# winter
'1-1-2014')# end of range
library(lubridate)
temp <- cut(mdy(someDates),
mdy(cutpoints),
labels=FALSE)
someSeasons <- c('winter',
'spring',
'summer',
'fall',
'winter')[temp]
Now use 'someSeasons' to group your data into date ranges with your favorite statistical analysis. For a choice of statistical analysis, Poisson regression adjusting for exposure (i.e. the length of the season) comes to mind, but that is probably a better question for Cross Validated.
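For the simpler goal of seeing how many dates fall in each season, a count is enough. The sketch below repeats the example above with base R's as.Date() in place of lubridate's mdy() (a substitution made here only to keep the snippet dependency-free) and tabulates the result.

```r
# Same sample dates and cut points as above, parsed with base R
someDates <- as.Date(c("1-1-2013", "2-14-2013", "3-5-2013", "8-21-2013",
                       "9-15-2013", "11-28-2013", "12-22-2013"),
                     format = "%m-%d-%Y")
cutpoints <- as.Date(c("1-1-2013", "3-20-2013", "6-21-2013",
                       "9-23-2013", "12-21-2013", "1-1-2014"),
                     format = "%m-%d-%Y")

# Integer bin per date, then map bins to season names
bin <- cut(someDates, cutpoints, labels = FALSE)
someSeasons <- c("winter", "spring", "summer", "fall", "winter")[bin]

# Number of dates per season
counts <- table(someSeasons)
counts  # fall 1, summer 2, winter 4
```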
You can make a vector of cut points with regular intervals like so:
cutpoints<- c('3-20-2013',# spring
'6-21-2013',# summer
'9-23-2013',# fall
'12-21-2013')# winter
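# the 2013 cut points shifted by 1 to 5 years cover 2014-03-20 up to 2018-12-21;
# dates outside that range (including the 2013 sample dates above) come back as NA
temp <- cut(mdy(someDates),
outer(mdy(cutpoints), years(1:5),`+`),
labels=FALSE)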
someSeasons <- c('spring',
'summer',
'fall',
'winter')[(temp-1)%% 4 + 1] #the index is just a little tricky...

I found ways to plot a graph in R using plot function. but I am looking to create plot with part of the data [closed]

I am trying to plot a graph in R for data covering the years 1900 to 2010, with an output value for each month of the year. I need to select the years 1950-2001 and the months November to February. How can I select part of the data for plotting this graph?
Since I am a rookie at R programming (or any programming), an easy-to-follow example would be of great help.
thanks
GRV
I am not sure exactly what you mean by
select years between 1950-2001 against months of Nov-february
But the following should get you started on a reproducible example...
#create a vector of months from 1900 through 2010
months <- seq(as.Date("1900/1/1"), as.Date("2010/12/31"), "months")
#assign a random vector of equal length
output <- rnorm(length(months))
#assign both vectors to a data frame
data <- data.frame(months = months, output = output)
Based on your description, your data should look something like this data frame, called data.
From here, you can make use of the subset function to help you on your way. The first example subsets to data from 1950 through 2001. The next further restricts that subset to the months of November through February.
#subset to just 1950 through 2001
data_sub <- subset(data, months >= as.Date("1950-01-01") & months <= as.Date("2001-12-31"))
#subset the 1950 to 2001 data to just Nov-feb months (i.e. c(11,12,1,2))
data_sub_nf <- subset(data_sub, as.numeric(format(data_sub$months, "%m")) %in% c(11,12,1,2))
You should also read Why is `[` better than `subset`? to move beyond subset.
As stated, after the data has been subset, you can use plot or any other plotting function to graph your data.
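Putting the pieces together, here is a minimal sketch that selects the rows with `[` instead of subset() and then plots them. The data are random, as in the example above; set.seed() and the axis labels are additions for illustration.

```r
set.seed(1)  # only so the random example is repeatable

# One row per month from 1900 through 2010, with random output
months <- seq(as.Date("1900-01-01"), as.Date("2010-12-01"), by = "month")
data <- data.frame(months = months, output = rnorm(length(months)))

# Year and month number of each row
yr <- as.numeric(format(data$months, "%Y"))
mo <- as.numeric(format(data$months, "%m"))

# Keep 1950 through 2001, November to February only
keep <- yr >= 1950 & yr <= 2001 & mo %in% c(11, 12, 1, 2)
data_nf <- data[keep, ]

nrow(data_nf)  # 208 (52 years x 4 months)

plot(data_nf$months, data_nf$output,
     xlab = "Month", ylab = "Output")
```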
