RapidMiner: converting Unix timestamp - r

Does anybody know a way to convert a Unix timestamp to a date_time attribute?
I tried R extensions (my operators are mainly written in R), using functions such as as.POSIXct to convert the timestamps, but RapidMiner doesn't seem to like it and keeps ignoring the result.
Any help is appreciated.
Thanks

A little-known feature of generating attributes is that the input attribute can also be the output attribute, so no new one is created. In addition, the type of the attribute is changed.
In other words, a construction like this works as long as the input is milliseconds since the epoch.
unixtime = date_parse(unixtime)
If the timestamps are in seconds, as Unix timestamps usually are, multiply by 1000 first:
unixtime = date_parse(unixtime * 1000)

Related

How to convert string as YYYY-MM-DD:HH:mm:SS:sss into epoch time?

I have a string in the form YYYY-MM-DD:HH:mm:SS:sss (e.g. 2017-10-11:04:36:26.376). I want to convert it into epoch time. What would be a programmatic approach for this?
I am programming in C++ and can already extract the individual fields into variables.
It turns out there is a formula, but it's fairly ugly. I originally implemented something similar in BASIC 2.0 in 1982 (when each byte counted), and later converted it to Perl:
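sub datestar {
    # Takes "YYYYMMDD"; returns seconds since the Unix epoch at 00:00:00 UTC.
    my ($y, $m, $d) = $_[0] =~ /^(....)(..)(..)/;
    # Treat January and February as the last two months of the previous year.
    my $fy = $y - ($m < 3);
    my $jd = $fy*365 + int($fy/4) - int($fy/100) + int($fy/400)
           + int(((($m - 3 + 12*($m < 3))*30.6) + .5) + $d);
    # 719469 is the day number this formula yields for 1970-01-01.
    return 86400*($jd - 719469);
}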
Note that this takes something like "20171011", not "2017-10-11", and doesn't convert hours/minutes/seconds (which are easy to convert).
As always, doublecheck code before use, and use it as a template to write your own code if you really want to.
However, you would be infinitely better off using your programming language's existing functions to do this.
As others said, the formula is complex and would make the whole code a mess. To avoid that, I am calculating the number of days between the input date and 01-01-2000. Since I know the epoch time at 01-01-2000, I can get the total epoch time by counting the days in between, taking leap years into account.
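For comparison, here is a minimal Python sketch of the same day-count formula, extended to the hours, minutes, seconds and milliseconds of the example string in the question. The function names are made up for illustration, the input is assumed to be UTC and to use a dot before the milliseconds (as in the example), and `(153 * m + 2) // 5` is a swapped-in integer equivalent of the 30.6 multiplication in the Perl version. The arithmetic should port to C++ directly.

```python
def days_since_epoch(year, month, day):
    """Days from 1970-01-01 to the given Gregorian date."""
    # Count January and February as the last two months of the previous year,
    # so the leap day falls at the very end of the shifted year.
    shifted_year = year - 1 if month < 3 else year
    shifted_month = month + 9 if month < 3 else month - 3  # March = 0
    leap_days = shifted_year // 4 - shifted_year // 100 + shifted_year // 400
    days_before_month = (153 * shifted_month + 2) // 5
    # 719469 is the value of the expression for 1970-01-01.
    return shifted_year * 365 + leap_days + days_before_month + day - 719469


def to_epoch_ms(text):
    """Parse 'YYYY-MM-DD:HH:mm:SS.sss' (assumed UTC) into epoch milliseconds."""
    date_part, hh, mm, ss = text.split(":")
    year, month, day = (int(p) for p in date_part.split("-"))
    seconds, _, millis = ss.partition(".")
    total_seconds = (days_since_epoch(year, month, day) * 86400
                     + int(hh) * 3600 + int(mm) * 60 + int(seconds))
    # Pad or truncate the fraction to exactly three digits.
    return total_seconds * 1000 + int(millis.ljust(3, "0")[:3])


print(to_epoch_ms("2017-10-11:04:36:26.376"))  # 1507696586376
```

The same check works for the 01-01-2000 approach: `days_since_epoch(2000, 1, 1) * 86400` gives 946684800, the epoch time at the start of that day.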

Mismatch when converting unix timestamp to date

I have a database (in CSV) with Unix timestamps. I am trying to convert them into human-readable dates in LibreOffice Calc. Everything is fine, except for a one-day lag.
For example, my timestamp is -518144400 (in cell E2).
My formula is: =E2/86400+DATEVAL("1/1/1970").
I obtain 19572,9583333333, which corresponds to 1953-07-31.
This online calculator confirms the result.
What is the problem? Just that the right answer is 1953-08-01.
At first I thought the timestamps contained a mistake. But if I paste -518144400 as a parameter in the URL of this PHP calendar, it works: the online calendar associates this timestamp with (what I think is) the right answer.
I don't understand what is happening. What did I miss?
One solution could be to add +1 to my formula as a correction. But I'm not satisfied with that; I'd like to understand...
It depends on the time zone used for the conversion: -518144400 is 1953-07-31 23:00:00 in GMT (the ,9583 fraction in your result is 23:00), while it is already 1953-08-01 in every time zone that is one hour or more ahead of GMT.
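The effect is easy to reproduce outside the spreadsheet. The Python sketch below converts the same timestamp with two fixed UTC offsets; the function name is made up for illustration, and a plain +1 hour offset is an assumption standing in for whatever zone the PHP calendar actually uses (no daylight-saving rules are applied).

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_to_local(ts, utc_offset_hours):
    """Convert Unix seconds to a datetime in a fixed-offset zone (no DST)."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    # Adding a timedelta to the epoch also handles negative (pre-1970) values.
    return (EPOCH + timedelta(seconds=ts)).astimezone(zone)


print(timestamp_to_local(-518144400, 0))  # 1953-07-31 23:00:00+00:00
print(timestamp_to_local(-518144400, 1))  # 1953-08-01 00:00:00+01:00
```

So rather than adding a full day, the spreadsheet formula would need the zone offset added, for example one hour (1/24 of a day) under the +1 assumption.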

Want only the time portion of a date-time object in R

I have a vector of times in R, all_symbols$Time and I am trying to find out how to get JUST the times (or convert the times to strings without losing information). I use
strptime(all_symbol$Time[j], format="%H:%M:%S")
which for some reason assumes the date is today and returns
[1] "2013-10-18 09:34:16"
Date and time formatting in R is quite annoying. I am trying to get the time only without adding too many packages (really any--I am on a school computer where I cannot install libraries).
Once you use strptime you will of necessity get a date-time object and the default behavior for no date in the format string is to assume today's date. If you don't like that you will need to prepend a string that is the date of your choice.
@James' suggestion is equivalent to what I was going to suggest:
format(all_symbol$Time[j], format="%H:%M:%S")
The only package I know of that has time classes (i.e time of day with no associated date value) is package:chron. However I find that using format as a way to output character values from POSIXt objects lends itself well to functions that require factor input.
In the decade since this was written there is now a package named “hms” that has some sort of facility for hours, minutes, and seconds.
hms: Pretty Time of Day
Implements an S3 class for storing and formatting time-of-day values, based on the 'difftime' class.
I came across the same problem recently and found this and other posts, such as "R: How to handle times without dates?", inspiring. I'd like to contribute a little for whoever has similar questions.
If you only want to use base R, take advantage of as.POSIXct(..., format = "...") to transform your date-time into a standard format (as.Date would discard the time of day). Then you can use substr to extract the time, e.g. substr("2013-10-01 01:23:45 UTC", 12, 16) gives you 01:23.
If you can use package lubridate, functions like mdy_hms will make life much easier. And substr works most of the time.
If you want to compare the time, it should work if they are in Date or POSIXt objects. If you only want the time part, maybe force it into numeric (you may need to transform it back later). e.g. as.numeric(hm("00:01")) gives 60, which means it's 60 seconds after 00:00:00. as.numeric(hm("23:59")) will give 86340.

Speed up conversion of 2 million rows of date strings to POSIXct

I have a csv which includes about 2 million rows of date strings in the format:
2012/11/13 21:10:00
Let's call that csv$Date.and.Time.
I want to convert these dates (and their accompanying data) to xts as fast as possible.
I have written a script which performs the conversion just fine (see below), but it's terribly slow and I'd like to speed this up as much as possible.
Here is my current methodology. Does anyone have any suggestions on how to make this faster?
dt <- as.POSIXct(csv$Date.and.Time,tz="UTC")
idx <- format(dt,tz=z,usetz=TRUE)
So the script converts these date strings to POSIXct. It then does a time zone conversion using format (z is a variable holding the time zone I am converting to). I then do a regular xts call to make this an xts series with the rest of the data in the csv.
This works 100%. It's just very, very slow. I've tried running this in parallel (it doesn't do anything; if anything it makes it worse). What do I mean by 'slow'?
user system elapsed
155.246 16.430 171.650
That's on a 3 GHz, 16 GB RAM 2012 MacBook Pro. I can get about half that on a similar processor with 32 GB RAM on a Win7 machine.
I'm sure someone has a better idea - I'm open to suggestions via Rcpp etc. However, ideally the solution works with the csv rather than some other method, like setting up a database. Having said that, I'm up to doing this via whatever method is going to give the fastest conversion.
I'd be super appreciative of any help at all. Thanks in advance.
You want the small and simple fasttime package by Simon which does this in the fastest possible way---by not calling time parsing functions but just using C-level string functions.
It does not support as many formats as strptime. In fact, it doesn't even have a format string. But well-formed ISO format variants, that is yyyy-mm-dd hh:mm:ss.fff will work, and your / separator may just work too.
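To illustrate what "just using string functions" can mean, here is a minimal Python sketch that reads each field from a fixed character position instead of interpreting a format string. It is not fasttime's actual implementation; parse_fixed is a hypothetical name, and the input is assumed to be well-formed "YYYY/MM/DD HH:MM:SS" in UTC.

```python
from datetime import datetime, timezone


def parse_fixed(stamp):
    """Parse 'YYYY/MM/DD HH:MM:SS' (assumed UTC) by slicing fixed positions."""
    # No format string and no regex: every field sits at a known offset,
    # so the separators are never inspected at all.
    return datetime(
        int(stamp[0:4]), int(stamp[5:7]), int(stamp[8:10]),
        int(stamp[11:13]), int(stamp[14:16]), int(stamp[17:19]),
        tzinfo=timezone.utc,
    )


rows = ["2012/11/13 21:10:00", "2012/11/13 21:15:00"]
epoch_seconds = [int(parse_fixed(r).timestamp()) for r in rows]
print(epoch_seconds)
```

Because the separators are skipped rather than matched, the same slicing handles "-" and "/" alike, which is the trade-off of this approach: speed and simplicity in exchange for accepting only one rigid layout.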
Try using lubridate - it does all date time parsing using regular expressions, so not only is it much faster, it's also much more flexible.

Operating with time intervals like 08:00-08:15

I would like to import a time-series where the first field indicates a period:
08:00-08:15
08:15-08:30
08:30-08:45
Does R have any features to do this neatly?
Thanks!
Update:
The most promising solution I found, as suggested by Godeke, was the chron package, using substring() to extract the start of the interval.
I'm still working on related issues, so I'll update with the solution when I get there.
CRAN shows a package that is actively updated called "chron" that handles dates. You might want to check that and some of the other modules found here: http://cran.r-project.org/web/views/TimeSeries.html
xts and zoo handle irregular time series data on top of that. I'm not familiar with these packages, but a quick look over indicates you should be able to use them fairly easily by splitting on the hyphen and loading into the structures they provide.
So you're given a character vector like c("08:00-08:15","08:15-08:30") and you want to convert it to an internal R data type for consistency? Check out the help files for POSIXt and strftime.
How about a function like this:
importTimes <- function(x) {
  # Split each "HH:MM-HH:MM" string into its start and end parts
  parts <- strsplit(x, "-")
  # The sample data has no seconds, so the format is "%H:%M"
  lapply(parts, strptime, format = "%H:%M")
}
This will take a character vector like the one you described and return a list of the same length, each element of which is a POSIXt 2-vector giving the start and end times (on today's date). If the date matters, you could add a paste("1970-01-01", x) inside the function and use format = "%Y-%m-%d %H:%M" to standardize it.
Does that help at all?
