I have quite a simple problem that I've not found anywhere on here.
I have dates in this format:
times = c("Dec_2011" , "July_2011", "Dec_2010" ,"July_2010" , "Dec_2009" , "July_2009", "Dec_2008" ,
"July_2008" ,"Dec_2007" , "July_2007", "Dec_2006" , "July_2006" ,"Dec_2005" , "July_2005",
"Dec_2004" , "July_2004" ,"Dec_2003" , "July_2003", "Dec_2002" , "July_2002", "Dec_2001" ,
"July_2001", "Dec_2000" , "July_2000")
How can I get these into date format:
31-07-2000, 31-07-2001, etc...
31-12-2000, 31-12-2001, etc...
I've tried:
times <- format(as.Date(times, "%B_%Y"))
times
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
times <- format(as.Date(times, "%B_%Y"), "31-%m-%Y")
times
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
times <- as.Date(paste("31", times, sep="-"), "%d-%m-%Y")
times
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
times <- format(as.Date(times, "%b_%Y"), "31-%m-%Y")
# NA
I'm not quite sure how to proceed.
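The attempts return NA because the strings carry no day of the month, and as.Date needs one. If 31 is the day wanted for every element, use paste to append 31, convert to Date class, and produce the desired layout with format. This works here because July and December both have 31 days.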
format(as.Date(paste(times, "31", sep="_"), "%b_%Y_%d"), "%d-%m-%Y")
#[1] "31-12-2011" "31-07-2011" "31-12-2010" "31-07-2010" "31-12-2009" "31-07-2009" "31-12-2008" "31-07-2008" "31-12-2007" "31-07-2007" "31-12-2006"
#[12] "31-07-2006" "31-12-2005" "31-07-2005" "31-12-2004" "31-07-2004" "31-12-2003" "31-07-2003" "31-12-2002" "31-07-2002" "31-12-2001" "31-07-2001"
#[23] "31-12-2000" "31-07-2000"
Instead of manually pasting 31, we can automate this with as.yearmon from zoo. The advantage is that for months with fewer than 31 days, as.Date with frac=1 still returns the last day of the month.
library(zoo)
format(as.Date(as.yearmon(times, "%b_%Y"), frac=1), "%d-%m-%Y")
#[1] "31-12-2011" "31-07-2011" "31-12-2010" "31-07-2010" "31-12-2009" "31-07-2009" "31-12-2008" "31-07-2008" "31-12-2007" "31-07-2007" "31-12-2006"
#[12] "31-07-2006" "31-12-2005" "31-07-2005" "31-12-2004" "31-07-2004" "31-12-2003" "31-07-2003" "31-12-2002" "31-07-2002" "31-12-2001" "31-07-2001"
#[23] "31-12-2000" "31-07-2000"
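A base R alternative, in case installing zoo is not an option: a minimal sketch that computes the last day of each month as "first day of the following month minus one". The helper name last_day_of_month is made up for illustration, and the month is looked up in month.abb by its first three letters (an assumption that fits "Dec" and "July"), so the result does not depend on the locale.

```r
# Hypothetical helper (not from the answer above): last day of the month
# for strings such as "Dec_2011" or "July_2011".
last_day_of_month <- function(x) {
  # Month number from the first three letters; assumes English month names.
  mon  <- match(substr(x, 1, 3), month.abb)
  year <- as.integer(sub(".*_", "", x))
  # First day of the following month, rolling December over into January.
  next_mon  <- mon %% 12L + 1L
  next_year <- year + (mon == 12L)
  next_first <- as.Date(sprintf("%d-%02d-01", next_year, next_mon))
  # One day earlier is the last day of the requested month.
  format(next_first - 1, "%d-%m-%Y")
}

last_day_of_month(c("Dec_2011", "July_2011", "Feb_2012"))
```

The Feb_2012 element is included only to show a shorter month; it comes back as 29-02-2012 rather than NA.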
Related
I'm trying to convert a column of time (which I imported from Excel) that R has converted into a decimal/character string back into hh:mm:ss time. I have seen many good answers (using library chron, for example), but I keep getting these errors:
My data:
> head(env$Time, 10)
[1] "0.41736111111111113" "0.6020833333333333" "0.45" "0.47222222222222227" "0.5131944444444444"
[6] "0.51250000000000007" "0.47361111111111115" "0.44791666666666669" "0.35138888888888892" "0.45277777777777778"
times(env$Time)
Error in convert.times(times., fmt) : format h:m:s may be incorrect
In addition: Warning message:
In unpaste(times, sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
8173 entries set to NA due to wrong number of fields
chron(times(env$Time))
Error in convert.times(times., fmt) : format h:m:s may be incorrect
In addition: Warning message:
In unpaste(times, sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
8173 entries set to NA due to wrong number of fields
strptime(env$Time, format = "%H:%M:%S")
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[38] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Found this answer here: How to express a decimal time in HH:MM
x <- as.numeric(env$Time) # store the time variable as a numeric vector
# Save as a new column in the data frame. There is no need to divide by 24
# when the decimal is already a fraction of a day, as in my data.
env$Time2 <- sub(":00$", "", round(times(x), "min"))
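For comparison, the same conversion can be sketched in base R without chron, assuming (as above) that each value is a fraction of a 24-hour day. The helper name frac_to_hms is hypothetical.

```r
# Hypothetical helper: fraction of a day -> "hh:mm:ss" string.
frac_to_hms <- function(x) {
  # Assumes x holds fractions of a day in [0, 1), possibly stored as character.
  secs <- as.integer(round(as.numeric(x) * 86400))
  sprintf("%02d:%02d:%02d",
          secs %/% 3600L,            # whole hours
          (secs %% 3600L) %/% 60L,   # remaining whole minutes
          secs %% 60L)               # remaining seconds
}

frac_to_hms(c("0.41736111111111113", "0.6020833333333333", "0.45"))
```

Rounding to whole seconds before splitting avoids results such as 10:00:59 caused by floating-point error in the Excel fractions.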
I want to convert these dates into this format: "1995-01", "1995-02", etc.
Here is some of my data
Date Change
1 January-1995 0.01417476
2 February-1995 0.01427050
3 March-1995 0.01556348
4 April-1995 0.01644737
5 May-1995 0.01603727
6 June-1995 0.01627500
7 July-1995 0.01557800
8 August-1995 0.01429773
9 September-1995 0.01344300
10 October-1995 0.01334667
11 November-1995 0.01328429
12 December-1995 0.01345368
13 January-1996 0.01293091
14 February-1996 0.01301762
15 March-1996 0.01289048
16 April-1996 0.01268476
17 May-1996 0.01287364
18 June-1996 0.01253400
19 July-1996 0.01254591
20 August-1996 0.01271238
21 September-1996 0.01245700
22 October-1996 0.01201636
23 November-1996 0.01191300
24 December-1996 0.01195600
I tried this:
date <- as.Date(Data$Date,format="%B/%Y")
and this
date <- as.Date(paste0("01/", Data$Date),format = "%m/%Y")
But it just returns
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
I'm stuck!
The last option can be modified to include %d for the day, since we are pasting a day in front. The format must also use the correct delimiter (- between month and year) and %B (full month name) rather than %m (month as a decimal number).
as.Date(paste0("01/", Data$Date),format = "%d/%B-%Y")
Or use lubridate
library(lubridate)
my(Data$Date)
my("January-1995")
[1] "1995-01-01"
Here is another base R option, but with output of type string (not Date)
x <- c("January-1995", "February-1996")
paste0(
gsub("\\D", "", x),
"-",
sprintf("%02d", match(gsub("-.*", "", x), month.name))
)
which gives
[1] "1995-01" "1996-02"
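If a Date object is preferred as the intermediate step, the "1995-01" strings the question asks for can be obtained by parsing and then formatting. A minimal sketch, assuming an English locale so that %B recognises the month names:

```r
x <- c("January-1995", "February-1996")
# Prepend a day so each string is a complete date, then parse it.
d <- as.Date(paste0("01/", x), format = "%d/%B-%Y")
# Keep only year and month in the output string.
ym <- format(d, "%Y-%m")
ym
```

Keeping d around as a Date is useful if the values are later sorted or plotted; ym is a plain character vector.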
I have some data with a time column expressed as week.year and the corresponding units measured in that week.
Week-Year Units
01.2020 39.12727273
02.2020 33.34545455
03.2020 118.7181818
04.2020 83.71818182
05.2020 58.56985
. .
52.2020 89.54651534
I have to create a ts object which takes these Week-Year values as input.
The reason for requiring this step is that values are sometimes missing for certain weeks, so using an auto-generated time scale (start=, end=, frequency=) will misalign the readings.
Is there any way of achieving this, or any way to accommodate such a situation?
R novice here, would really appreciate some guidance. :)
Assuming the input is the data frame DF shown reproducibly in the Note at the end, convert it to a zoo object and then use as.ts to create a ts series with frequency 52.
library(zoo)
week <- as.integer(DF[[1]])
year <- as.numeric(sub("...", "", DF[[1]]))
z <- zoo(DF[[2]], year + (week - 1) / 52)
tt <- as.ts(z)
tt
## Time Series:
## Start = c(2020, 1)
## End = c(2020, 52)
## Frequency = 52
## [1] 39.12727 33.34545 118.71818 83.71818 58.56985 NA NA
## [8] NA NA NA NA NA NA NA
## [15] NA NA NA NA NA NA NA
## [22] NA NA NA NA NA NA NA
## [29] NA NA NA NA NA NA NA
## [36] NA NA NA NA NA NA NA
## [43] NA NA NA NA NA NA NA
## [50] NA NA 89.54652
frequency(tt)
## [1] 52
class(tt)
## [1] "ts"
Note
Lines <- " Week-Year Units
01.2020 39.12727273
02.2020 33.34545455
03.2020 118.7181818
04.2020 83.71818182
05.2020 58.56985
52.2020 89.54651534"
DF <- read.table(text = Lines, header = TRUE, colClasses = c("character", NA))
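If zoo is not available, the same idea can be sketched in base R by placing each observation at its week position in a vector of NAs. This assumes all rows belong to one year and that week numbers run from 1 to 52; the column names in the data frame below are illustrative, not taken from the question.

```r
# Illustrative copy of the data; the column names are assumptions.
DF <- data.frame(
  WeekYear = c("01.2020", "02.2020", "03.2020", "04.2020", "05.2020", "52.2020"),
  Units    = c(39.12727273, 33.34545455, 118.7181818, 83.71818182,
               58.56985, 89.54651534),
  stringsAsFactors = FALSE
)

week <- as.integer(substr(DF$WeekYear, 1, 2))
year <- as.integer(substr(DF$WeekYear, 4, 7))
# Guard the assumptions stated above.
stopifnot(length(unique(year)) == 1, all(week >= 1 & week <= 52))

# Start from 52 missing values and fill in the weeks that were observed.
vals <- rep(NA_real_, 52)
vals[week] <- DF$Units

tt <- ts(vals, start = c(year[1], 1), frequency = 52)
tt
```

Weeks without a reading stay NA, so each observed value keeps its correct position on the time axis.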
dates <- as.Date(dli$Dates)
class(dates)
[1] "Date"
dates
[1] "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" "2016-01-05" "2016-01-06"
[7] "2016-01-07" "2016-01-08" "2016-01-09" "2016-01-10" "2016-01-11" "2016-01-12"
[13] "2016-01-13" "2016-01-14" "2016-01-15" "2016-01-16" "2016-01-17" "2016-01-18"
[19] "2016-01-19" "2016-01-20" "2016-01-21" "2016-01-22" "2016-01-23" "2016-01-24"
[25] "2016-01-25" "2016-01-26" "2016-01-27" "2016-01-28" "2016-01-29" "2016-01-30"
[31] "2016-01-31" "2016-02-01" "2016-02-02" "2016-02-03" "2016-02-04" "2016-02-05"
[37] "2016-02-06" "2016-02-07" "2016-02-08" "2016-02-09" "2016-02-10" "2016-02-11"
This is my date format, so I need to convert it into "2016-month-day".
I am getting NA values:
dates <- as.Date(dli$Dates,"%d/%b/%Y")
class(dates)
[1] "Date"
dates
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[31] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[61] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[91] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[121] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Can you give any suggestions?
Thanks in advance.
Good practice is to store dates in R as YYYY-MM-DD, and your strings already seem to be in that format, but:
The format you pass to as.Date must describe what the strings contain, not what you expect as output.
"%d/%b/%Y" stands for "day as a number (01-31), slash, abbreviated month name, slash, 4-digit year", whereas your strings are "4-digit year, hyphen, month as a number, hyphen, day as a number".
If you want to format the date, you need to call format:
> date <- "2016-01-01"
> date <- as.Date(date, format = "%Y-%m-%d")
> date
[1] "2016-01-01"
> format(date, "%d/%b/%Y")
[1] "01/jan/2016"
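A short sketch of the distinction, using a numeric month (%m) so the result does not depend on the locale: the format given to as.Date has to match the input string, while the format given to format describes the output.

```r
s <- "2016-01-01"
# Wrong: the pattern describes the desired output, not the input, so parsing fails.
wrong <- as.Date(s, format = "%d/%m/%Y")
# Right: the pattern matches the input; format() then produces the new layout.
right <- as.Date(s, format = "%Y-%m-%d")
out   <- format(right, "%d/%m/%Y")

is.na(wrong)  # TRUE
out           # "01/01/2016"
```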
To obtain your required format, i.e. 2016-month-day, you can use the format function once you have converted the vector of strings to Date type.
I hope the code snippet below makes this clear.
> d = c("2016-02-08","2016-02-18","2015-02-08","2016-02-02")
> class(d)
[1] "character"
> d = as.Date(d)
> class(d)
[1] "Date"
> d = format(d,"%Y-%b-%d")
> d
[1] "2016-Feb-08" "2016-Feb-18" "2015-Feb-08" "2016-Feb-02"
The format function converts Date objects into character strings in the required format. Refer to this link for more information on date type formatting.
If you just want to render your dates in this format, then use format:
x <- as.Date("2016-01-01")
format(x, "%Y %b %a %d")
[1] "2016 Jan Fri 01"
There is a separation of concerns here. If you already have your date information stored in R as date types, then you need not change anything internally to extract further information from those dates.
You would use as.Date() to convert between dates saved as character and Date objects.
If you want to change the format of a Date object, you can use format().
You have specified "2016-month-day" as the desired format of the dates in the question, but in the code you provide you are using "%d/%b/%Y". The way this works is: the % indicates that the next character is a conversion specification; everything else (e.g. - or /) is treated as a literal delimiter, matched when parsing and inserted when formatting (see ?strptime for details).
So in your case, just use
dates <- format(dli$Dates, format = "%Y-%b-%d")
to get the result specified in the text of the question:
[1] "2016-Jan-01" "2016-Jan-02" "2016-Jan-03" "2016-Jan-04" "2016-Jan-05"
or this:
dates <- format(dli$Dates, format = "%Y/%b/%d")
to get what you have used in the code snippet:
[1] "2016/Jan/01" "2016/Jan/02" "2016/Jan/03" "2016/Jan/04" "2016/Jan/05"
You can use the lubridate package to convert to a date with ymd, then format it the way you want it displayed:
Dates_df <- mutate(Dates, dli = format(ymd(dli), "%Y/%b/%d"))
(I use dplyr here; I assume you have other variables in Dates.)
Without dplyr, if you just want to keep the dates in a vector:
Dates_vec <- format(ymd(Dates$dli), "%Y/%b/%d")
I am pretty sure this is quite simple, but I seem to have got stuck. I have two xts vectors that have been merged together, which contain numeric values and NAs.
I would like to get the rowSums for each index period, but keeping the NA values.
Below is a reproducible example
library(xts)
set.seed(120)
dd <- xts(rnorm(100),Sys.Date()-c(100:1))
dd1 <- ifelse(dd<(-0.5),dd*-1,NA)
dd2 <- ifelse((dd^2)>0.5,dd,NA)
mm <- merge(dd1,dd2)
mm$m <- rowSums(mm,na.rm=TRUE)
tail(mm,10)
dd1 dd2 m
2013-08-02 NA NA 0.000000
2013-08-03 NA NA 0.000000
2013-08-04 NA NA 0.000000
2013-08-05 1.2542692 -1.2542692 0.000000
2013-08-06 NA 1.3325804 1.332580
2013-08-07 NA 0.7726740 0.772674
2013-08-08 0.8158402 -0.8158402 0.000000
2013-08-09 NA 1.2292919 1.229292
2013-08-10 NA NA 0.000000
2013-08-11 NA 0.9334900 0.933490
In the above example on the 10th Aug 2013 I was hoping it would say NA instead of 0, the same goes for the 2nd-4th Aug 2013.
Any suggestions for an elegant way of getting NAs in the relevant places?
If you have a variable number of columns you could try this approach:
mm <- merge(dd1,dd2)
mm$m <- rowSums(mm, na.rm=TRUE) * ifelse(rowSums(is.na(mm)) == ncol(mm), NA, 1)
# or, as @JoshuaUlrich commented:
#mm$m <- ifelse(apply(is.na(mm),1,all),NA,rowSums(mm,na.rm=TRUE))
tail(mm, 10)
# dd1 dd2 m
#2013-08-02 NA NA NA
#2013-08-03 NA NA NA
#2013-08-04 NA NA NA
#2013-08-05 1.2542692 -1.2542692 0.000000
#2013-08-06 NA 1.3325804 1.332580
#2013-08-07 NA 0.7726740 0.772674
#2013-08-08 0.8158402 -0.8158402 0.000000
#2013-08-09 NA 1.2292919 1.229292
#2013-08-10 NA NA NA
#2013-08-11 NA 0.9334900 0.933490
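The same idea on a plain matrix, as a small self-contained sketch (the values are made up for illustration): sum with na.rm = TRUE, then put NA back wherever every column was missing.

```r
# Made-up data: row 1 is entirely NA, row 2 cancels to zero.
m <- cbind(a = c(NA, 1, NA, 2),
           b = c(NA, -1, 3, NA))

s <- rowSums(m, na.rm = TRUE)             # all-NA rows become 0 here
all_na <- rowSums(is.na(m)) == ncol(m)    # TRUE where every column is NA
s[all_na] <- NA                           # restore NA for those rows only
s
```

Row 2 stays 0 because its values genuinely cancel, while row 1 becomes NA because nothing was observed.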
Use logical indexing with [ and is.na(·) to locate the entries where both are NA and then replace them with NA.
Try this:
> mm[is.na(mm$dd1) & is.na(mm$dd2), "m"] <- NA
> mm
dd1 dd2 m
2013-08-02 NA NA NA
2013-08-03 NA NA NA
2013-08-04 NA NA NA
2013-08-05 1.2542692 -1.2542692 0.000000
2013-08-06 NA 1.3325804 1.332580
2013-08-07 NA 0.7726740 0.772674
2013-08-08 0.8158402 -0.8158402 0.000000
2013-08-09 NA 1.2292919 1.229292
2013-08-10 NA NA NA
2013-08-11 NA 0.9334900 0.933490
mm$m <- "is.na<-"(rowSums(mm, na.rm = TRUE), !rowSums(!is.na(mm)))
> tail(mm)
# dd1 dd2 m
# 2013-08-06 NA 1.3325804 1.332580
# 2013-08-07 NA 0.7726740 0.772674
# 2013-08-08 0.8158402 -0.8158402 0.000000
# 2013-08-09 NA 1.2292919 1.229292
# 2013-08-10 NA NA NA
# 2013-08-11 NA 0.9334900 0.933490
My solution would be
library(magrittr)
mm <- mm %>%
transform(ccardNA = rowSums(!is.na(.))/rowSums(!is.na(.)), m = rowSums(., na.rm = TRUE)) %>%
transform(m = ifelse(is.nan(ccardNA), NA, m), ccardNA = NULL) %>%
as.xts()