R: Date discrepancy when creating a dataframe with a date variable

I'm trying to create a data frame in R with a date column that is populated from a pre-defined variable. When I do this, the date is changed to a seemingly random future date. Here is what I have:
DateVar <- as.Date("2/6/2020", format = "%m/%d/%Y")
Size <- 50
Results <- data.frame(COMP_NAM=character(Size),
COMM_DTE=as.Date(Size, origin = DateVar),
ID=character(Size),
stringsAsFactors=FALSE)
Then when I look at the dataframe, column COMM_DTE is populated with "2020-04-29". But when I print DateVar R returns "2020-02-06".
Does anyone know why this is happening and how to fix it? Thank you!
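One likely cause, sketched here with base R only: as.Date() with a numeric first argument and an origin returns the date that many days after the origin, so as.Date(Size, origin = DateVar) yields a single shifted date that data.frame() then recycles down the column. Assuming the goal is Size copies of DateVar, rep() may be what is wanted. The name shifted is introduced only for illustration.

```r
DateVar <- as.Date("2/6/2020", format = "%m/%d/%Y")
Size <- 50

# A numeric first argument is read as "days after origin",
# so this is one date, Size days later than DateVar
shifted <- as.Date(Size, origin = DateVar)

# rep() repeats the original date once per row and keeps the Date class
Results <- data.frame(COMP_NAM = character(Size),
                      COMM_DTE = rep(DateVar, Size),
                      ID = character(Size),
                      stringsAsFactors = FALSE)
```

With this version, Results$COMM_DTE stays a Date column and every row holds the same value as DateVar.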

Related

How to rename the columns of a large data table to dates

I am working from an nc file. After extracting the data to a matrix, the time variable became the column variable, and the columns were simply numbered 1:2087 over the time range of the dataset. I would like to rename them to the dates they should be (from 1981/12/31 to 2021/12/31, where each column is a week). I tried to change the names using
colnames(tmp_mat) <- rep(seq(as.Date('1982-01-05'), as.Date('2021-12-28'), by = 'weeks'))
This changed the column names, but to numbers (the number of days for each date since 1970/01/01).
Does anyone have any suggestions on how to make this work?
Your data is a matrix. Convert it to a data.frame, then apply your code:
tmp_mat = data.frame(tmp_mat)
colnames(tmp_mat) <- rep(seq(as.Date('1982-01-05'), as.Date('2021-12-28'), by = 'weeks'))
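If the object needs to stay a matrix, another option is to convert the dates to text before assigning them as column names. This is a minimal sketch on a small stand-in matrix; the 2 x 3 shape and the use of length.out are assumptions made for illustration, not the asker's real data.

```r
# Hypothetical stand-in for tmp_mat with three weekly columns
tmp_mat <- matrix(1:6, nrow = 2)

# One date per column, one week apart
week_dates <- seq(as.Date("1982-01-05"), by = "week",
                  length.out = ncol(tmp_mat))

# as.character() turns the dates into text such as "1982-01-05"
# before they are stored as dimnames, so they are not reduced to day counts
colnames(tmp_mat) <- as.character(week_dates)
```

format(week_dates, "%Y/%m/%d") could be used instead of as.character() if a different layout is preferred.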

R: How to fill in values in a new column using the values of another column

I have a dataset in R with a column called event_date.
The values look like this:
31-Dec-18
30-Dec-18
28-Dec-18
And so on.
I want to create a new column called date where I separate out the day of the event. So it looks like:
31
30
28
I'm pretty new to working with R, so I'm wondering whether a for loop is the way to go, or if there's a more efficient way I don't know about.
If the dates are of type character, keep the digits before the first hyphen:
df$date <- sub("^(\\d+)-.*", "\\1", df$event_date)
Otherwise, you can look into Date objects in R.
If the days are two digit, then substr would be faster
df$day <- substr(df$event_date, 1, 2)
Or convert to Date class and extract the day
df$day <- format(as.Date(df$event_date, "%d-%b-%y"), "%d")
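A quick check on the three sample values from the question, using a regex anchored at the start of the string and the fixed-width substr() approach. The data frame df here is a hypothetical stand-in, and the column names day_regex, day_substr and day_num are introduced only for this sketch.

```r
# Hypothetical data frame holding the sample values from the question
df <- data.frame(event_date = c("31-Dec-18", "30-Dec-18", "28-Dec-18"),
                 stringsAsFactors = FALSE)

# Regex anchored at the start: keep the digits before the first hyphen
df$day_regex <- sub("^(\\d+)-.*$", "\\1", df$event_date)

# Fixed-width alternative, valid only when the day always has two digits
df$day_substr <- substr(df$event_date, 1, 2)

# Convert to integer if the day is needed as a number
df$day_num <- as.integer(df$day_substr)
```

All of these are vectorised, so no for loop is needed even on a large column.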

Change the class of a cell of a data-frame to Date

everyone!
As part of my clinical study I created an xlsx spreadsheet containing a data set. Only columns 2 to 12 and rows 1 to 307 are useful to me. I now manipulate my spreadsheet in R, after importing it (read_excel, etc.).
In my columns 11 and 12 ('data' and 'raw_data'), some cells correspond to dates (for example the first 2 rows of 'data' and 'raw_data'): these are the patient's visit dates. However, as you can see, these dates are given to me as a number of days since the origin "1899-12-30", and I would like to transform them into a standard date format (2019-07-05).
My problem is that these columns don't only contain dates; they also hold different numerical results (times, means, scores, etc.).
I started by transforming the class of my columns from character to factor/numeric so that I could better manipulate the columns later. But I can't change only the format of cells corresponding to a date.
Do you know if it is possible to transform only the cells concerned and if so how?
I attach my code and a preview of my data frame.
Part "Unsuccessful trial": I tried with this kind of thing. Of course the date changes format here but as soon as I try to make this change in the data frame it doesn't work.
Thank you for your help!
library(readxl)  # read_excel, cell_cols
library(dplyr)   # %>%, mutate_at, vars
# Indicate the id of the patient
id <- "01_AA"
# Get protocol data of patient
idlst <- dir("/data/protocolData", full.names = TRUE, pattern = id)
# Convert the xlsx database into dataframe
idData <- data.table::rbindlist(lapply(
  idlst,
  read_excel,
  n_max = 307,
  range = cell_cols("B:M") # just keep the table
), fill = TRUE)
idData <- tibble::as_tibble(idData)
idData <- idData %>%
  mutate_at(vars(1:10), as.factor) %>%
  mutate_at(vars(11:length(idData)), as.numeric)
# Unsuccessful trial
as.Date.character(data[1:2,11:12], origin ='1899-12-30')
Thank you for your comments and indeed this is one of the problems with R.
I solved my problem with the following code where idData is my df.
# Change the data format of the date cells of the column Data and Raw_data:
idData$Data[grepl("date",idData$Measure)] <- as.character(as.Date(
as.numeric(
idData$Data[grepl("date",idData$Measure)]),
origin = "1899-12-30"))
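The same idea on a small, self-contained example. The column names Measure and Data follow the code above, but the rows and values are hypothetical, and Data is kept as character here because one column cannot hold both Date and numeric cells.

```r
# Hypothetical toy data: Measure labels each row, Data holds mixed values as text
idData <- data.frame(
  Measure = c("visit_date", "score", "mean_time"),
  Data    = c("43651", "12", "3.5"),
  stringsAsFactors = FALSE
)

# Rows whose Measure label contains "date" hold Excel serial day numbers
is_date <- grepl("date", idData$Measure)

# Convert only those cells, using the Excel origin from the question
idData$Data[is_date] <- as.character(
  as.Date(as.numeric(idData$Data[is_date]), origin = "1899-12-30")
)
```

The other cells can still be converted with as.numeric() when they are needed for calculations.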

How to avoid date formatted values getting converted to numeric when assigned to a matrix or data frame?

I have run into an issue I do not understand, and I have not been able to find an answer to this issue on this website (I keep running into answers about how to convert dates to numeric or vice versa, but that is exactly what I do not want to know).
The issue is that R converts values that are formatted as a date (for instance "20-09-1992") to numeric values when you assign them to a matrix or data frame.
For example, we have "20-09-1992" with a date format, we have checked this using class().
as.Date("20-09-1992", format = "%d-%m-%Y")
class(as.Date("20-09-1992", format = "%d-%m-%Y"))
We now assign this value to a matrix, imaginatively called Matrix:
Matrix <- matrix(NA,1,1)
Matrix[1,1] <- as.Date("20-09-1992", format = "%d-%m-%Y")
Matrix[1,1]
class(Matrix[1,1])
Suddenly the previously date formatted "20-09-1992" has become a numeric with the value 8298. I don't want a numeric with the value 8298, I want a date that looks like "20-09-1992" in date format.
So I was wondering whether this is simply how R works, and we are not allowed to assign dates to matrices and data frames (somehow I have managed to have dates in other matrices/data frames, but it beats me why those other times were different)? Is there a special method to assigning dates to data frames and matrices that I have missed and have failed to deduce from previous (somehow successful) attempts at assigning dates to data frames/matrices?
I don't think you can store dates in a matrix. Use a data frame or data table. If you must store dates in a matrix, you can use a matrix of lists.
Matrix <- matrix(NA,1,1)
Matrix[1,1] <- as.list(as.Date("20-09-1992", format = "%d-%m-%Y"),1)
Matrix
[[1]]
[1] "1992-09-20"
Edited: I also just re-read you had this issue with data frame. I'm not sure why.
mydate<-as.Date("20-09-1992", format = "%d-%m-%Y")
mydf<-data.frame(mydate)
mydf
mydate
1 1992-09-20
Edited: This has been a learning experience for me with R and dates. Apparently the date you supplied was converted to the number of days since the origin, which is January 1st, 1970. To convert it back to a date at some point, pass that origin to as.Date:
Matrix
[,1]
[1,] 8298
as.Date(Matrix, origin ="1970-01-01")
[1] "1992-09-20"
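A minimal base R sketch of that round trip. It also shows one way to keep readable dates inside a matrix by storing them as text with format(); that choice is an assumption about what is acceptable, since a character matrix no longer supports date arithmetic. The names days, m, restored and m_chr are introduced only for illustration.

```r
d <- as.Date("20-09-1992", format = "%d-%m-%Y")

# A Date is a number of days since 1970-01-01 plus a class attribute
days <- as.numeric(d)

# Assigning into a numeric matrix keeps only the number
m <- matrix(NA_real_, 1, 1)
m[1, 1] <- d

# Restore the Date class explicitly when reading the value back
restored <- as.Date(m[1, 1], origin = "1970-01-01")

# Alternative: store formatted text so the matrix shows dates directly
m_chr <- matrix(format(d, "%d-%m-%Y"), 1, 1)
```

A data frame avoids the issue because each column keeps its own class.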
Try the following: first define your date vector, then use
rownames(mat) <- as.character(date_vector)
The dates will then appear as text.
This mostly happens when loading an Excel workbook.
You need to add detectDates = TRUE to the read.xlsx call (the argument exists in openxlsx::read.xlsx):
DataFrame <- read.xlsx("File_Name", sheet = 3, detectDates = TRUE)

Splitting up unusual date & time column format in R

It's likely a trivial question, but I'm attempting to break date and time into their own variables on a GPS data frame containing 1.4 million rows. The timestamp format is:
2015-11-19T03:27:56
I've been able to extract the date without any trouble, but the 'T' is giving trouble when attempting to extract the time. The following code extracts the date just fine:
sater001$utc_d <- as.Date(sater001$utc_time_stamp)
Here 'sater001' is my data frame and 'utc_time_stamp' is the variable I wish to split.
But running:
sater001$utc_t <- format(as.POSIXlt(sater001$utc_time_stamp) ,format = "T%H:%M:%S")
Gives me a column filled with T00:00:00 values.
What am I missing here?
We need to also include the T in the format
v2 <- as.POSIXct(v1, format = '%Y-%m-%dT%H:%M:%S')
v2
#[1] "2015-11-19 03:27:56 IST"
Now, we can extract the hms portion
format(v2, "%H:%M:%S")
NOTE: We don't need any additional packages to get the expected result.
data
v1 <- "2015-11-19T03:27:56"
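Putting both pieces together, a sketch that parses the timestamp once and derives the date and the time from the parsed value. Setting tz = "UTC" is an assumption suggested by the column name utc_time_stamp; without it, the local time zone is used, which is why the output above shows IST.

```r
v1 <- "2015-11-19T03:27:56"

# Parse with the literal T in the format; tz = "UTC" is an assumption
v2 <- as.POSIXct(v1, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# Date part as a Date object
utc_d <- as.Date(v2)

# Time part as a character string
utc_t <- format(v2, "%H:%M:%S")
```

Both calls are vectorised, so the same lines should work on a whole column such as sater001$utc_time_stamp.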
