I am starting with a date_of_survey variable that is a string formatted as YYYY-MM-DD. I then run the following commands to convert it to a date variable, and display that variable in a useful format:
gen date = date(date_of_survey, "YMD")
gen date_clean = date
format date_clean %dM_d,_CY
drop date_of_survey
That leaves me with a "date_clean" variable displayed as "September 3, 2020" and a corresponding "date" variable displayed as "22161" (equal to days since January 1, 1960).
I now need to create a variable that indicates the year and quarter of each observation, preferably in YYYY-QQ format. I assumed this shouldn't be difficult, but no matter how I have coded it, I wind up with years in the 7000s and inaccurate quarters. I must be misunderstanding how the dates are stored. My first instinct was to try a simple format date %tq command, but I'm still not getting the output I need. Any help is much appreciated. I read over the help files, and can't find the discrepancy that's causing this little problem.
ANSWER: I needed to convert the daily date variable into quarters since the first quarter of 1960. A qofd() function call before the format %tq did the trick!
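To see why the years came out in the 7000s, here is a minimal arithmetic sketch in Python (not Stata). It assumes only what is stated above: daily dates count days from January 1, 1960, and a %tq display expects a count of quarters from the first quarter of 1960. The helper names are hypothetical and only mimic the conversion that qofd() performs.

```python
from datetime import date, timedelta

STATA_EPOCH = date(1960, 1, 1)  # daily dates count days from this day


def stata_days_to_date(days):
    """Turn a days-since-1960 count into a calendar date."""
    return STATA_EPOCH + timedelta(days=days)


def quarter_index(d):
    """Quarters elapsed since 1960q1 (the kind of value qofd() produces)."""
    return (d.year - 1960) * 4 + (d.month - 1) // 3


def label_quarter(q):
    """Render a quarters-since-1960q1 count as YYYY-Qn."""
    return f"{1960 + q // 4}-Q{q % 4 + 1}"


d = stata_days_to_date(22161)
q = quarter_index(d)
print(d, q, label_quarter(q))   # 2020-09-03 242 2020-Q3

# Reading the raw day count as if it were a quarter count gives nonsense:
print(label_quarter(22161))     # 7500-Q2
```

The last line is the mistake in the question: 22161 read as quarters lands thousands of years in the future, which is why the value has to be converted before the quarterly format is applied.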
In Stata I have a variable yearmonth which is formatted as 201201, 201202 etc. for the years 2012 - 2019, monthly with no gaps. When I format the variable as
format yearmonth %tm
The results look like: 2.0e+05 for all periods, with the exact same number each time. A Dickey-Fuller test tells me I have gaps in my data (I don't) and a tsfill command generates dozens of empty observations between each period.
How do I properly format my yearmonth variable so I can set it as a monthly date?
You do have gaps — between 201212 and 201301, for example. Consider a statement like
gen wanted = ym(floor(yearmonth/100), mod(yearmonth, 100))
which parses your integers like 201201 into year and month components. So floor(201201/100) is floor(2012.01) and so 2012 while mod(201201, 100) is 1. The two components are then the arguments of ym() which expects a year and a month argument.
Then and only then will your format statement do what you want. The format command by itself won't create date variables; it only changes how existing values are displayed.
See help datetime in Stata for more information, and the thread "Problem with displaying reformatted string into a four-digit year in Stata 17" for an explanation of the difference between a date value and a date display format.
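The parsing step can be checked with a short arithmetic sketch in Python (not Stata). The functions below are hypothetical stand-ins: parse_yearmonth() mirrors the floor() and mod() calls, and ym() mirrors the monthly count from January 1960 that a %tm format expects.

```python
def parse_yearmonth(yyyymm):
    """Split an integer such as 201201 into (year, month)."""
    return yyyymm // 100, yyyymm % 100


def ym(year, month):
    """Months elapsed since 1960m1, the kind of value a %tm format expects."""
    return (year - 1960) * 12 + (month - 1)


print(parse_yearmonth(201201))        # (2012, 1)
print(ym(*parse_yearmonth(201201)))   # 624

# December 2012 and January 2013 are consecutive once converted...
print(ym(*parse_yearmonth(201212)), ym(*parse_yearmonth(201301)))  # 635 636

# ...but 89 apart as raw integers, which is the gap that tsfill tries to fill.
print(201301 - 201212)                # 89
```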
I searched on Stack Overflow and was able to write a formula that reads the date and time in Column H and returns the first day of the following month in Column A:
Cell A2: =DATE(YEAR(H2),MONTH(H2)+1,1), which reads 3/30/2021 5:09:55 PM from H2 and returns 4/1/2021.
Then I wrote a formula in Column I that reads A2 and gives me the Month Year format in I2:
=TEXT(A2,"mmmm yyyy"), which returns April 2021. I then copy and paste it as a value into Column B (B2 is April 2021).
Is there an IF statement or a formula I can write that scans Column I (I2 = "April 2021") and gives me the Month~MonthEnd-Date~Year format, i.e. "April 2021" becomes "April 30th 2021"?
This might require a new thread but how do I "turn-on" the auto feature whenever I add a new row to keep the new rows as part of a Table?
If you are just looking for the start of the following month in column A, your formula works fine. An alternative formula for column A could be:
=EOMONTH(H2,0)+1
As an alternative to converting the date to a text value, you can format the cell to display the date in the same format. The advantage is that the cell is still an Excel date that formulas can easily work with.
Again to get the last day of the month use EOMONTH formula. If you want to keep your text method going, use the following formula:
=TEXT(EOMONTH(A2,0),"mmmm dd yyyy")
Alternatively if you format the cell appropriately, you can just use:
=EOMONTH(A2,0)
Note that if you really need "st" for days ending in 1, "nd" for days ending in 2, and so on, it gets more complicated. I am assuming that they are not really needed.
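To show what the more complicated ordinal case involves, here is a rough sketch in Python rather than an Excel formula. All function names are hypothetical. It finds the last day of the month, as EOMONTH does, and then picks the suffix, treating 11, 12 and 13 as special cases.

```python
import calendar
from datetime import date


def end_of_month(d):
    """Last day of d's month, like EOMONTH(d, 0)."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=last_day)


def ordinal(n):
    """1 -> '1st', 2 -> '2nd', 3 -> '3rd', 11 -> '11th', 30 -> '30th'."""
    if 11 <= n % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def month_end_label(d):
    """Format d's month end as 'Month DDth YYYY' (English month names assumed)."""
    eom = end_of_month(d)
    return f"{calendar.month_name[eom.month]} {ordinal(eom.day)} {eom.year}"


print(month_end_label(date(2021, 4, 1)))   # April 30th 2021
print(month_end_label(date(2021, 3, 30)))  # March 31st 2021
```

Month ends only ever fall on the 28th, 29th, 30th or 31st, so in practice the suffix is "th" for every case except the 31st.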
My data contains several measurements per day and is stored in a CSV file.
The V1 column is of type factor, so I'm adding an extra column of date-time type: vd$Vdate <- as_datetime(vd$V1)
Then I'm trying to convert the vd data into a time series: vd.ts <- ts(vd, frequency = 365)
But then the dates are gone.
I just cannot work out what I am doing wrong! Could someone help me, please?
Your dates are gone because a ts object does not keep a date column. Build the ts object from your measurement variables (V1, ... V7) only, leaving out the date field, and let ts() generate the time index from the start and frequency arguments.
Also, I noticed that you seem to have hourly data, so you need to provide a frequency that matches your time step, not 365. From what you posted, your sampling interval seems a bit irregular, so I recommend finding a way to establish the frequency correctly. For example, if I have hourly data for every day of the year, then I have a frequency of 365.25*24 (the 0.25 accounts for leap years).
So the following is just an example; it still won't work properly with the data I can see (I only have a limited view of your dataset, so I am not 100% sure).
# Build ts data (univariate)
vd.ts <- ts(vd$V1, frequency = 365, start = c(2019, 4))
# Check that it is structured correctly
print(vd.ts, calendar = TRUE)
Finally my time series is working properly. I used
ts <- zoo(measurements, date_times)
and I found out that date_times had to be converted with as_datetime(), as otherwise the values were of character type. The measurements were converted into a data.frame.
I have data spread over a period of two months. When I graph data points for each day, dates (dd/mm/yyyy) are overlapping and it is not possible to make sense of which date a certain point refers to. I tried to remove years from the date as they are not useful for the info I have and the dd/mm should leave enough space.
df$date<-as.Date(df$date, format="%d/%m")
However, it transforms the 01/09/2014 to 2015-09-01. I read that when the year is missing as.Date assumes current year and inputs it. Can I avoid this automatic insertion somehow?
Something like this?
date <- as.Date("01/09/2014", format = "%d/%m/%Y")
format(date, "%d/%m")
# [1] "01/09"
I've been trying to do a time series on my dataframe, and I need to strip times from my csv. This is what I've got:
campbell <-read.csv("campbell.csv")
campbell$date = strptime(campbell$date, "%m/%d")
campbell.ts <- xts(campbell[,-1],order.by=campbell[,1])
First, what I'm trying to do is just get xts to show the dates as "xx/xx", meaning just the month and day. I have no year for my data. When I run that second line of code and look at the date column, it has been converted to "2013-xx-xx". These months and days have no year associated with them, and I can't figure out how to get rid of the 2013. (The CSV file I'm reading has the dates in the format "9/30", "10/1", etc.)
Secondly, once I try and make a time series (the third line), I am unsure what the "order.by" command is calling on. What am I indexing?
Any help??
Thanks!
strptime() needs a full date, i.e. day, month and year. If any of these is missing, the current value is taken from the system clock and filled in. So, if you want to keep the dates as you read them, first store a copy in a temporary variable and then apply strptime() to campbell$date to convert it into a date format R can work with. Since the year does not matter to you, you can ignore the one that strptime() fills in automatically.
campbell <-read.csv("campbell.csv")
date <- campbell$date
campbell$date <- strptime(campbell$date, "%m/%d")
Secondly, with the third line (xts(campbell[,-1],order.by=campbell[,1])) you are telling xts to order all the data in campbell except the first column (campbell[,-1]) by the index given by the time data in the first column (campbell[,1]). So it only works if the date is in the first column.
After ordering the data as a time series, you can put date back in place of the campbell$date column to recover the format you wanted (although you first have to order date as well, as shown below).
date <- xts(date, order.by=campbell[,1]) # assuming campbell$date is campbell[,1]
campbell.ts <- xts(campbell[,-1], order.by=campbell[,1])
campbell.ts <- cbind(date, campbell.ts)
Alternatively, to display only the month and day:
format(as.Date(campbell$date, "%m/%d"), "%m/%d")