Hope you can help me out!
For every date in a column, I would like to get the range covering the 14 days up to that date. For example, if the first date in my column is 29-04-2021, I would like to get the dates from 15-04-2021 until 29-04-2021. I found the function seq, which does this for a single date, but to do it for all the values in my column I think I need to put seq in a for loop.
This is what I tried but the output is only the last row and the date format changed. This is my code (test_IIVAC$'Vacdate 1' is my column with the dates):
df <- data.frame()
for(i in 1:length(test_IIVAC$`Vacdate 1`)){
te <- as.Date(seq(test_IIVAC$`Vacdate 1`[i]-14, test_IIVAC$`Vacdate 1`[i], by = "day"))
df1 <- rbind(df, te)
}
Can anyone help me out in getting the ranges of all the dates in the column and place them in one dataframe with the Date format? The desired output would be:
[image: desired output]
Thanks a bunch!
You can use any of the apply functions to generate a sequence of dates for each value and add the result as a new list column in the original dataframe.
test_IIVAC$dates <- lapply(test_IIVAC$`Vacdate 1`, function(x) seq(x - 14, x, by = "day"))
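If the goal is a single data frame with one row per date rather than a list column, the same lapply idea can be combined with rbind. This is only a sketch: the sample data and the names vacdate, date, ranges and result are made up for illustration, and it assumes the column is already of class Date.

```r
# Hypothetical sample data standing in for test_IIVAC
test_IIVAC <- data.frame(`Vacdate 1` = as.Date(c("2021-04-29", "2021-05-10")),
                         check.names = FALSE)

# One small data frame per vaccination date: the date itself plus its 15-day window
ranges <- lapply(test_IIVAC$`Vacdate 1`, function(x) {
  data.frame(vacdate = x, date = seq(x - 14, x, by = "day"))
})

# Stack all windows at once; the columns keep the Date class
result <- do.call(rbind, ranges)
```

Binding once at the end avoids the problem in the original loop, where rbind(df, te) always started from the empty df and the dates were coerced to plain numbers.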
Let's say I have a data df like this:
df<-data.frame(date=c(202203,202204,202205,202206))
202203 means March, 2022.
So all the values of the date column represent a year and a month.
However, since I don't know the exact day, I want to append 01 to every value of the date column. That is, 202203 should become 20220301: March 1st, 2022.
My expected output is
df<-data.frame(date=c(20220301,20220401,20220501,20220601))
I tried to use gsub, but the output was not what I expected.
df$date <- as.numeric(paste0(df$date, "01"))
oh hang on, got a better one:
df$date <- df$date * 100 + 1
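If the values are later needed as real dates rather than eight-digit numbers, they can be parsed with as.Date. A minimal sketch, assuming the YYYYMM layout from the question; the column names date_num and date_real are made up for illustration.

```r
# Hypothetical sample data in the same YYYYMM layout as the question
df <- data.frame(date = c(202203, 202204, 202205, 202206))

# Append "01" arithmetically, as in the answer above
df$date_num <- df$date * 100 + 1

# Parse the eight-digit number as a Date (first day of each month)
df$date_real <- as.Date(as.character(df$date_num), format = "%Y%m%d")
```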
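You can also use paste0 in a loop, as long as you assign element by element (assigning to the whole column inside the loop would overwrite every row with the last value).
For example:
for (i in seq_along(df$date)) {
  # append "01" to the i-th value and store it back in the same position
  df$date[i] <- as.numeric(paste0(df$date[i], "01"))
}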
The goal is to get a dataframe with two columns: the first column would be the month and the second column would be the year. I'd like the for loop to take me to two years from now. I left the for loop empty given that I was nowhere near finding the solution.
D <- data.frame(month(Sys.Date()), year(Sys.Date()))
D <- rename(D, Month = month.Sys.Date..., Year = year.Sys.Date...)
for (x in 1:24) {
D1 <- return()
}
We don't need a loop. One option is to paste 'Year' and 'Month' together, convert the result to the yearmon class (from zoo), and add a sequence of months:
library(zoo)
as.yearmon(paste0(D$Year, "-", D$Month)) + 0:24/12
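For the two-column Month/Year data frame the question asks for, a base R sketch without zoo could look like the following. The helper name month_year_table and its arguments are made up for illustration; it assumes "two years from now" means the current month plus the next 24.

```r
# Hypothetical helper: one row per month, from the month of `start` to n_months later
month_year_table <- function(start = Sys.Date(), n_months = 24) {
  # Anchor on the first of the month so seq() never skips short months
  first <- as.Date(format(start, "%Y-%m-01"))
  months <- seq(first, by = "month", length.out = n_months + 1)
  data.frame(Month = as.integer(format(months, "%m")),
             Year  = as.integer(format(months, "%Y")))
}

# Fixed start date so the example is reproducible
D <- month_year_table(as.Date("2022-03-15"))
```

Calling month_year_table() with no arguments starts from the current month instead.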
I am working with a CSV file that has a column named "statistics_lastLocatedTime", as shown in this image:
[image: csv file]
I would like to subtract the second row of "statistics_lastLocatedTime" from the first row, the third row from the second row, and so on until the last row, then store all these differences in a separate column and combine that column with the other related columns, as shown in the code below:
##select related features
data <- read.csv("D:/smart tech/store/2016-10-11.csv")
(columns <- data[with(data, macAddress == "7c:11:be:ce:df:1d" ),
c(2,10,11,38,39,48,50) ])
write.csv(columns, file = "updated.csv", row.names = FALSE)
## take time difference
date_data <- read.csv("D:/R/data/updated.csv")
(dates <- date_data[1:40, c(2)])
NROW(dates)
for (i in 1:NROW(dates)) {
j <- i+1
r1 <- strptime(paste(dates[i]),"%Y-%m-%d %H:%M:%S")
r2 <- strptime(paste(dates[j]),"%Y-%m-%d %H:%M:%S")
diff <- as.numeric(difftime(r1,r2))
print (diff)
}
## combine time difference with other related columns
combine <- cbind(columns, diff)
combine
The problem is that I am able to get the row differences, but I cannot store these values as a column and then combine that column with the other related columns. Please help me. Thanks in advance.
This is a four-liner:
Define a custom class 'myDate', and a converter function for your custom datetime, as per Specify custom Date format for colClasses argument in read.table/read.csv
Read in the datetimes as actual datetimes; no need to repeatedly convert later.
Simply use the vectorized diff operator on your date column (it sees their type, and automatically dispatches a diff function for POSIXct Dates). No need for for-loops:
setClass('myDate') # this is not strictly necessary
setAs('character','myDate', function(from) {
  as.POSIXct(from, format='%d-%m-%y %H:%M', tz='UTC') # or whatever timezone
})
data <- read.csv("D:/smart tech/store/2016-10-11.csv",
colClasses=c('character','myDate','myDate','numeric','numeric','integer','factor'))
# ...
data$date_diff <- c(NA, diff(data$statistics_lastLocatedTime))
Note that diff() produces a result one element shorter than the vector it was applied to. Hence we have to pad it (e.g. with a leading NA, or whatever you want).
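A small self-contained sketch of the padding step, using made-up timestamps in the assumed "%d-%m-%y %H:%M" layout; the names times and gap_mins are illustrative only.

```r
# Hypothetical timestamps in day-month-year hour:minute form
times <- as.POSIXct(c("11-10-16 09:00", "11-10-16 09:05", "11-10-16 09:12"),
                    format = "%d-%m-%y %H:%M", tz = "UTC")

# diff() returns length(times) - 1 values; fix the units so they are comparable,
# then pad with a leading NA so the result lines up with the original rows
gap_mins <- c(NA, as.numeric(diff(times), units = "mins"))
```

Fixing the units explicitly avoids relying on the automatic unit choice of difftime, which depends on the size of the differences.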
Consider directly assigning the diff variable using vapply. Also, there is no need for the separate date_data df as all operations can be run on the columns df. Notice too the change in time format to align to the format currently in dataframe:
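columns$diff <- vapply(seq_len(nrow(columns)), function(i) {
  r1 <- strptime(paste(columns$statistics_lastLocatedTime[i]), "%d-%m-%y %H:%M")
  r2 <- strptime(paste(columns$statistics_lastLocatedTime[i + 1]), "%d-%m-%y %H:%M")
  # units are fixed to minutes here (an assumption) so every element is comparable;
  # the last row has no successor and yields NA
  as.numeric(difftime(r1, r2, units = "mins"))
}, numeric(1))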
My aim is to count days of exceedance per year for each column of a dataframe. I want to do this with one fixed value for the whole dataframe, as well as with different values for each column. For one fixed value for the whole dataframe, I found a solution using count with aggregate and another solution using the package plyr with ddply and colwise. But I couldn't figure out how to do this with different values for each column.
Approach for one fixed value:
# create example data
date <- seq(as.Date("1961/1/1"), as.Date("1963/12/31"), "days") # create dates
date <- date[(format.Date(as.Date(date), "%m %d") !="02 29")] # delete leap days
TempX <- rep(airquality$Temp, length.out=length(date))
TempY <- rep(rev(airquality$Temp), length.out=length(date))
df <- data.frame(date, TempX, TempY)
# This approach works fine for one specific value using aggregate.
library(plyr)
dyear <- as.numeric(format(df$date, "%Y")) # year vector
fa80 <- function (fT) {cft <- count(fT>=80); return(cft[2,2])}; # function for counting days of exceedance
aggregate(df[,-1], list(year=dyear), fa80) # use aggregate to apply function to dataframe
# Another approach using ddply with colwise, which works fine for one specific value.
fd80 <- function (fT) {cft <- count(fT>=80); cft[2,2]}; # function to count days of exceedance
ddply(cbind(df[,-1], dyear), .(dyear), colwise(fd80)) # use ddply to apply function colwise to dataframe
In order to use a specific value for each column separately, I tried passing a second argument to the function, but this didn't work.
# pass second argument to function
Oc <- c(80,85) # values
fo80 <- function (fT,fR) {cft <- count(fT>=fR); return(cft[2,2])}; # function for counting days of exceedance
aggregate(df[,-1], list(year=dyear), fo80, fR=Oc) # use aggregate to apply function to dataframe
I tried using apply.yearly, but it didn't work with count. I want to avoid using a loop, as it is slow and I have a lot of dataframes with > 100 columns and long time series to process.
Furthermore the approach has to work for subsets of the dataframe as well.
# subset of dataframe
dfmay <- df[(format.Date(as.Date(df$date),"%m")=="05"),] # subset dataframe - only may
dyearmay <- as.numeric(format(dfmay$date, "%Y")) # year vector
aggregate(dfmay[,-1],list(year=dyearmay),fa80) # use aggregate to apply function to dataframe
I am out of ideas on how to solve this problem. Any help will be appreciated.
You could try something like this:
#set the target temperature for each column
targets<-c(80,80)
dyear <- as.numeric(format(df$date, "%Y"))
#for each row of the data, check if the temp is above the target limit
#this will return a matrix of TRUE/FALSE
exceedance<-t(apply(df[,-1],1,function(x){x>=targets}))
#aggregate by year and sum
aggregate(exceedance,list(year=dyear),sum)
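The same idea can be written without a row-wise apply, which may matter for wide data frames. The sketch below swaps in sweep and rowsum (both base R) and uses the question's own thresholds of 80 and 85. The names targets, exceed and counts are illustrative, and whether this is faster on real data is untested.

```r
# Example data as in the question (airquality ships with base R)
date <- seq(as.Date("1961/1/1"), as.Date("1963/12/31"), "days")
date <- date[format(date, "%m %d") != "02 29"]   # drop leap days
TempX <- rep(airquality$Temp, length.out = length(date))
TempY <- rep(rev(airquality$Temp), length.out = length(date))
df <- data.frame(date, TempX, TempY)

# One threshold per column, matched by name
targets <- c(TempX = 80, TempY = 85)
dyear <- as.numeric(format(df$date, "%Y"))

# Compare each column against its own threshold in one vectorized step
exceed <- sweep(as.matrix(df[, names(targets)]), 2, targets, FUN = ">=")

# Sum the TRUE values per year; rows of the result are named by year
counts <- rowsum(exceed * 1L, dyear)
```

Because the grouping vector is computed from whatever data frame is passed in, the same lines work unchanged on a subset such as dfmay.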
I have some data in the following format:
date x
2001/06 9949
2001/07 8554
2001/08 6954
2001/09 7568
2001/10 11238
2001/11 11969
... more rows
I want to compute the mean of x for each month. I tried some code with aggregate, but it failed. Thanks for any help on doing this.
Here I simulate a data frame called df with more data:
df <- data.frame(
date = apply(expand.grid(2001:2012,1:12),1,paste,collapse="/"),
x = rnorm(12^2,1000,1000),
stringsAsFactors=FALSE)
Given the way your date vector is constructed, you can obtain the months by removing the first four digits and the forward slash that follows them. Here I use the result as the indexing variable in tapply to compute the means:
with(df, tapply(x, gsub("\\d{4}/","",date), mean))
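Since the question mentions aggregate, here is a sketch of the same computation with the formula interface. The sample values and the names month and monthly are made up for illustration, and it assumes every date is a "YYYY/MM" string with a two-digit month.

```r
# Hypothetical sample data in the question's "YYYY/MM" layout
df <- data.frame(date = c("2001/06", "2001/07", "2002/06", "2002/07"),
                 x = c(10, 20, 30, 40),
                 stringsAsFactors = FALSE)

# Characters 6 and 7 hold the month
df$month <- substr(df$date, 6, 7)

# Mean of x per month, pooled across years
monthly <- aggregate(x ~ month, data = df, FUN = mean)
```

The result is a data frame with one row per month, which is often easier to merge or plot than the named vector returned by tapply.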
Sorry... I just created a month-sequence vector and then used tapply.
It was very easy:
m.seq = rep(c(6:12, 1:5), length = nrow(data))
m.means = tapply(data$x, m.seq, mean)
But thanks for the comments anyway!