I have a CSV file that I open using the read.csv() function.
It has 3 columns, Id, Time and Value.
The Time column is formatted like this: 4/12/2016 7:21:00 AM
First, I want to split it into Date (4/12/2016) and Time (7:21:00).
Second, I want to convert the Time into 24-hour format instead of AM and PM.
How can this be accomplished?
If you just want to split the string, you can use str_split() from the stringr package:
library(stringr)
my_var <- str_split(string = my_df$Time, pattern = " ", n = 2, simplify = TRUE)
my_df$Date <- my_var[, 1] # This column holds the date
my_df$Time <- my_var[, 2] # This column holds the time (still with AM/PM)
lubridate is your friend (along with the hms package). First convert your Time variable into a date-time object, then use helper functions to pull out the Date and Time. mdy_hms() returns a POSIXct value, which R prints in 24-hour format.
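library(dplyr)
library(lubridate)
my_df %>%
  mutate(Time = mdy_hms(Time),       # parse "4/12/2016 7:21:00 AM" as a date-time
         Date = as_date(Time),       # keep only the date part
         Time = hms::as_hms(Time))   # keep only the time of day (24-hour)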
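For comparison, the same result can be sketched in base R without extra packages. This is only an illustration: the sample data frame below is made up, and it assumes the column is a character vector in month/day/year order with a 12-hour clock.

```r
# Hypothetical sample data mirroring the format in the question
my_df <- data.frame(Id = 1:2,
                    Time = c("4/12/2016 7:21:00 AM", "4/12/2016 7:21:00 PM"),
                    stringsAsFactors = FALSE)

# %I is the hour on a 12-hour clock, %p is the AM/PM marker
parsed <- as.POSIXct(my_df$Time, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

my_df$Date <- as.Date(parsed)             # Date class
my_df$Time <- format(parsed, "%H:%M:%S")  # 24-hour time, stored as character
```

Note that %p depends on the locale, so AM/PM parsing may behave differently on systems with a non-English locale.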
I have a csv file with a column values like "20140929120000" which gives the date and time.
After importing it into R, I want to format this as a date variable while keeping the time part as well (if that is possible).
So, the output should be a date variable '2014-09-29'.
How would I get the time part as a separate column with value "12:00:00"?
How about:
install.packages('lubridate')
library(lubridate)
library(dplyr)
y <- "20140929120000"  # keep the timestamp as a character string
df %>%
  mutate(Date = as_date(ymd_hms(y)),
         Time = format(ymd_hms(y), "%H:%M:%S"))
Just replace y in the mutate() call with your date column.
We can use the POSIXct type here:
val <- "20140929120000"
mask <- "%Y%m%d%H%M%S"
as.POSIXct(strptime(val, mask, tz = "UTC"))
[1] "2014-09-29 12:00:00 UTC"
To see the various components of the timestamp, try:
unclass(strptime(val, mask))
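To get the date and the time as two separate values, as the question asks, a minimal base-R sketch could look like this. The variable names ts, date_part and time_part are only illustrative, and the time zone is assumed to be UTC.

```r
val <- "20140929120000"

# Parse the compact timestamp; tz = "UTC" is an assumption about the data
ts <- as.POSIXct(val, format = "%Y%m%d%H%M%S", tz = "UTC")

date_part <- as.Date(ts)             # Date class
time_part <- format(ts, "%H:%M:%S")  # character string
```

The time part is a plain character string here, because base R has no dedicated time-of-day class.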
For each county, I want a continuous run of dates from 2020-01-01 to 2020-06-01. Dates before the given date should be "BO"; dates after it should be "DO".
The sample data is here.
The desired output is like this
Generally, you can do a simple if_else command with your cut date as condition. This works well if your date variable is formatted as date.
Using some of your data you linked and reading it in as a data frame df you can use:
library(tidyverse)
df %>%
  mutate(date_date = as.Date(date_first, format = "%m/%d/%Y"),
         OrderStatus = if_else(date_date > "2020-04-01", "BO", "DO"))
The if_else command is just an example. You can change this to the condition you need.
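As a rough sketch of the full task in base R: build the daily sequence, cross it with the counties, and label each row relative to that county's date. The counties data frame and its values are invented for illustration, and labelling the given date itself as "DO" is an assumption, since the question does not say which side it belongs to.

```r
# Hypothetical input: one row per county with its cut-off date
counties <- data.frame(county = c("A", "B"),
                       date_first = as.Date(c("2020-03-15", "2020-04-01")),
                       stringsAsFactors = FALSE)

# Every day from 2020-01-01 to 2020-06-01
all_dates <- seq(as.Date("2020-01-01"), as.Date("2020-06-01"), by = "day")

# merge() with no shared column names returns the Cartesian product
grid <- merge(counties, data.frame(date = all_dates))

# "BO" strictly before the county's date, "DO" on or after it (assumption)
grid$OrderStatus <- ifelse(grid$date < grid$date_first, "BO", "DO")
```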
My current setup
How do I filter the end_time column for data only after 12/01/2018 and then sum these data after this date?
Below is what I have already tried.
setwd("/Users/jackbell/Desktop")
bookings<- read.csv("bookings_data_data_analyst_test.csv", header= TRUE)
end_time<- bookings %>%select(end_time)
end_time
new_date <- filter(end_time< as.Date("12/01/2018"))
We need to convert the column to the Date class. Based on the image and the OP's code, 'end_time' is the column name, and an object with the same name was also created. In the last step, filter() was called without a data object: the data ('end_time') was never passed in. Secondly, the dates are in day/month/year format. By default, as.Date() only parses strings in year-month-day form (YYYY-MM-DD or YYYY/MM/DD); for any other layout, specify the format explicitly.
library(tidyverse)
library(lubridate)  # provides dmy(); older tidyverse versions do not attach it
end_time %>%
  filter(dmy(end_time) < dmy("12/01/2018"))
In the above code, we used dmy from lubridate package. If we use as.Date, it would be
end_time %>%
  filter(as.Date(end_time, format = "%d/%m/%Y") < as.Date("2018-01-12"))
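The question also asks for a sum of the rows after the cut-off date. A hedged base-R sketch follows: the amount column and the sample values are hypothetical, since the real column names are not shown, and 12/01/2018 is read as 12 January 2018 (day/month/year), as in the answer above.

```r
# Hypothetical stand-in for the bookings data
bookings <- data.frame(
  end_time = c("05/01/2018", "12/01/2018", "20/01/2018", "03/02/2018"),
  amount   = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

end_date <- as.Date(bookings$end_time, format = "%d/%m/%Y")
cutoff   <- as.Date("2018-01-12")

# Keep rows strictly after the cut-off, then sum the (hypothetical) amount column
after_cutoff <- bookings[end_date > cutoff, ]
total_after  <- sum(after_cutoff$amount)
```

Use >= instead of > if rows dated exactly on the cut-off should be included.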
I need to sort a data frame by date in R. The dates are all in the form of "dd/mm/yyyy". The dates are in the 3rd column. The column header is V3. I have seen how to sort a data frame by column and I have seen how to convert the string into a date value. I can't combine the two in order to sort the data frame by date.
Assuming your data frame is named d,
d[order(as.Date(d$V3, format="%d/%m/%Y")),]
Read my blog post, Sorting a data frame by the contents of a column, if that doesn't make sense.
Nowadays, the most convenient approach is to use the lubridate and dplyr packages.
lubridate contains a number of functions that make parsing dates into POSIXct or Date objects easy. Here we use dmy which automatically parses dates in Day, Month, Year formats. Once your data is in a date format, you can sort it with dplyr::arrange (or any other ordering function) as desired:
d$V3 <- lubridate::dmy(d$V3)
dplyr::arrange(d, V3)
If you want to sort dates in descending order, note that the minus sign doesn't work with Dates inside order(). However, you can get the same effect with a general-purpose function: rev(). Combine rev() and order() like this:
#init data
DF <- data.frame(ID=c('ID3', 'ID2','ID1'), end=c('4/1/09 12:00', '6/1/10 14:20', '1/1/11 11:10'))
#change order (the format assumes month/day/two-digit year; without it, as.Date() misreads these strings)
out <- DF[rev(order(as.Date(DF$end, format = "%m/%d/%y"))),]
Hope it helped.
You can use order() to sort date data.
# Sort date ascending order
d[order(as.Date(d$V3, format = "%d/%m/%Y")),]
# Sort date descending order
d[rev(order(as.Date(d$V3, format = "%d/%m/%Y"))),]
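As an alternative to wrapping order() in rev(), order() also accepts decreasing = TRUE. A small sketch, with a made-up data frame d whose third column V3 holds dd/mm/yyyy strings as in the question:

```r
# Hypothetical sample data; only the V3 layout comes from the question
d <- data.frame(V1 = 1:3,
                V2 = c("a", "b", "c"),
                V3 = c("15/03/2016", "01/01/2017", "28/02/2015"),
                stringsAsFactors = FALSE)

dates <- as.Date(d$V3, format = "%d/%m/%Y")

# Newest first
d_desc <- d[order(dates, decreasing = TRUE), ]
```

The two approaches can differ in how they order tied dates, which only matters if the original row order among ties is important.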
Hope this helps,
Link to my quora answer https://qr.ae/TWngCe
Thanks
If your data is already sorted from newest to oldest and you just want to flip it to oldest to newest in R, you can simply reverse the row order:
dataframe <- dataframe[nrow(dataframe):1,]
It has saved me from exporting to Excel and back just to sort Yahoo Finance data.
The only way I found to work with hours when the source uses a US format (mm/dd/yyyy HH:MM:SS AM/PM):
df_dataSet$time <- as.POSIXct( df_dataSet$time , format = "%m/%d/%Y %I:%M:%S %p" , tz = "GMT")
class(df_dataSet$time)
# order the rows by the parsed time column
df_dataSet <- df_dataSet[order(df_dataSet$time), ]
You could also use arrange from the dplyr library.
The following snippet will modify your original date string to a date object, and order by it. This is a good approach, as you store a date as a date, not just a string of characters.
library(dplyr)
dates <- dates %>%
  mutate(date = as.Date(date, "%d/%m/%Y")) %>%
  arrange(date)
If you want to keep the column as a string and only order by the parsed date (usually an inferior option), you can do this:
dates <- dates %>%
  arrange(as.Date(date, "%d/%m/%Y"))
If you have a dataset named daily_data:
daily_data <- daily_data[order(as.Date(daily_data$date, format="%d/%m/%Y")),]