Package to parse dates in Common Lisp? - common-lisp

I'm writing a simple web scraper in Common Lisp (SBCL) as a learning exercise, & would like to sort by date. To do this, I'll need to parse dates in the format "MM/DD/YYYY" into universal time.
I could simply tokenise the string & pass the bits into encode-universal-time, but I figure that there must be a built-in function (or popular third-party package) for date parsing. I'd greatly appreciate someone recommending one :-)

This answer is very late but the local-time library is featureful and widely used. It is based on the article The long painful history of time.
It supports:
Time and date arithmetic
ISO 8601 timestring formatted output and parsing
Reader macros to embed timestrings directly in code
Timezone handling (will read unix tzfile format)
Conversion between universal and unix time epochs
Julian date calculation

See the net-telent-date and simple-date-time libraries for Common Lisp. The former has a parse-time function you can use (see parse-time.lisp). Both are included in the QuickLisp library collection.

You could try net-telent-date, which has PARSE-TIME which I think will do what you want.
It's now 2022, and net-telent-date is on github and is also deprecated. Better to find something else.

Many implementations have a UNIX interface and, in some cases, this includes the strptime function.

Antik handles dates and times and includes date/time parsers. The result is a "timepoint" which by default is UTC (CL's "universal-time" is something different, but it can be converted to that).

I use local-time and cl-date-time-parser:
edit: and chronicity for parsing natural language dates and times.
(local-time:parse-timestring "2019-11-13T18:09:06.313650+01:00") ;; OK
(local-time:parse-timestring "2019-11-13") ;;OK
This fails with local-time by default:
(local-time:parse-timestring "2019/11/13")
but it works with Chronicity:
(chronicity:parse "2019/11/13")
;; @2019-11-13T00:00:00.000000+01:00
and we can set the date separator of local-time to "/":
(local-time:parse-timestring "2019/11/13" :date-separator #\/) ;; OK
There are also time and date-time separators (:time-separator and :date-time-separator).
Now a format like "Wed Nov 13 18:13:15 2019" will fail. We'll use the
cl-date-time-parser library:
(cl-date-time-parser:parse-date-time "Wed Nov 13 18:13:15 2019")
;; 3782657595
;; 0
It returns the universal time which, in turn, we can ingest with the
local-time library:
(local-time:universal-to-timestamp *)
;; @2019-11-13T19:13:15.000000+01:00

Related

ISO datetime with timezone issue

I am just trying to print the ISO datetime with timezone, as described in the documentation below:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a003169814.htm
This is my code
TimeZone tz = TimeZone.getTimeZone("UTC");
DateFormat df = new SimpleDateFormat("yyyy-mm-dd'T'hh:mm:ss.nnnnnn+|-hh:mm");
df.setTimeZone(tz);
dateTimeWithTimeZone = df.format(new Date());
However, I am getting this exception:
Illegal pattern character 'n'
Can't I use this format directly in Java?
java.time
dateTimeWithTimeZone = Instant.now().toString();
System.out.println(dateTimeWithTimeZone);
When I ran this snippet just now, I got this output:
2019-03-18T22:28:13.549319Z
It’s not clear from the page you link to, but it’s an ISO 8601 string in UTC, so should be all that you need. I am taking advantage of the fact that the classes of java.time produce ISO 8601 output from their toString methods. The linked page does show the format with hyphens, T and colons (2008-09-15T15:53:00+05:00), it shows another example with decimals on the seconds (15:53:00.322348) and a third one with Z meaning UTC (20080915T155300Z), so I would expect that the combination of all three of these would be OK too.
The format you used in the question seems to try to get the offset as +00:00 rather than Z. If this is a requirement, it’s only a little bit more complicated. We are using an explicit formatter to control the variations within ISO 8601:
DateTimeFormatter iso8601Formatter
= DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSSSSxxx");
dateTimeWithTimeZone = OffsetDateTime.now(ZoneOffset.UTC).format(iso8601Formatter);
System.out.println(dateTimeWithTimeZone);
2019-03-18T22:28:13.729711+00:00
What went wrong in your code?
You tried to use the formatting symbols from your source with SimpleDateFormat. First, you should never, and especially not in Java 8 or later, want to use SimpleDateFormat. That class is notoriously troublesome and long outdated. Second, some of its format pattern letters agree with the symbols from your source and some of them don’t, so you cannot just use the symbol string from there. Instead you need to read the documentation and find the correct format pattern letters to use for year, month, etc. And be aware that they are case sensitive: MM and mm are different.
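To make the case-sensitivity point concrete, here is a minimal sketch with java.time. The class name PatternCaseDemo and the fixed sample value are assumptions made purely for illustration. It formats the same date-time twice, once with the intended letters and once with the lower-case mm and hh copied from the question's pattern.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class, not part of any library.
class PatternCaseDemo {

    // Formats one fixed sample value (an assumption for illustration) with the given pattern.
    static String format(String pattern) {
        LocalDateTime sample = LocalDateTime.of(2019, 3, 18, 22, 28, 13);
        return sample.format(DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        // MM = month of year, HH = hour of day (0-23)
        System.out.println(format("uuuu-MM-dd'T'HH:mm:ss")); // 2019-03-18T22:28:13
        // mm = minute of hour, hh = clock hour of AM/PM (1-12)
        System.out.println(format("uuuu-mm-dd'T'hh:mm:ss")); // 2019-28-18T10:28:13
    }
}
```

With the lower-case letters the month slot silently shows the minutes (28) and the hour is printed on a 12-hour clock without any AM/PM marker, so the output looks plausible but is wrong.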
Link: Oracle Tutorial: Date Time, explaining how to use java.time.

Apache Nifi Expression Language - toDate formatting

I am trying to format a date string using the Apache NiFi expression language and the ReplaceText processor (regex). Given a date string
date_str : "2018-12-05T11:44:39.717+01:00",
I wish to convert this to:
correct_mod_date_str: "2018-12-05 10:44:39.717",
(notice how the date is converted to UTC, and character 'T' replaced by a space.)
To do this, I am currently using:
toDate("yyyy-MM-dd'T'HH:mm:ss.SSSSSSXXX"):format("yyyy-MM-dd HH:mm:ss.SSS", '+00:00')
and this works perfectly.
However, when the date string has 6 digits in ms, rather than 3, things break:
another_date_str: "2018-12-05T11:44:39.717456+01:00"
is converted to:
incorrect_mod_date_str: "2018-12-05 10:56:36.456"
It seems the first 3 digits of the ms precision interfere with the conversion.
I'd appreciate any input on resolving this. What am I missing?
Regards
This seems to be a limitation in Java.
According to the Java documentation, more than 3 millisecond digits are not supported.
https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html
The simplest way is to remove the extra digits like this:
attr:replaceAll('(\.\d{3})\d*','$1'):toDate("yyyy-MM-dd'T'HH:mm:ss.SSSXXX"):format("yyyy-MM-dd HH:mm:ss.SSS", '+00:00')
I ran into a similar issue with a date-time encoded in ISO 8601. The problem is that the digits after the seconds are defined as a fraction of a second, not as milliseconds.
See answer to related topic
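The 10:56:36.456 result can be reproduced in plain Java. This is a sketch, assuming that toDate delegates to SimpleDateFormat as the linked documentation suggests. The class name FractionDemo and its two methods are hypothetical names chosen for the illustration. SimpleDateFormat reads 717456 as a count of milliseconds (11 minutes 57.456 seconds) and adds it to 11:44:39, whereas java.time treats the same digits as a fraction of a second.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of NiFi or any library.
class FractionDemo {
    static final String INPUT = "2018-12-05T11:44:39.717456+01:00";

    // Legacy parsing: S means milliseconds, so 717456 ms is rolled over into minutes and seconds.
    static String legacy() {
        SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSSSSXXX", Locale.ROOT);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ROOT);
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Date parsed = in.parse(INPUT);
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    // java.time parsing: the digits after the seconds are a fraction of a second.
    static String modern() {
        OffsetDateTime parsed = OffsetDateTime.parse(INPUT);
        return parsed.withOffsetSameInstant(ZoneOffset.UTC)
                .format(DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSS"));
    }

    public static void main(String[] args) {
        System.out.println(legacy()); // 2018-12-05 10:56:36.456
        System.out.println(modern()); // 2018-12-05 10:44:39.717
    }
}
```

This is why trimming the string to three fractional digits before calling toDate, as in the replaceAll expression above, gives the expected value.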

Format POSIX in R (quantstrat)

I'm working on extracting a date from a variable: "curIndex."
Here's what the code looks like:
show(txntime1 <- timestamp(mktdata[curIndex+1L])[,1])
show(txntime <- strftime(txntime1, '%Y-%m-%d %H:%M:%OS6'))
And the output is this:
"##------ Tue Mar 08 14:31:58 2016 ------##"
"NULL"
I'm working within ruleOrderProc of the quantstrat package.
The order time needs to be POSIXlt for the order book. Does anyone know what to do with this funky date format that I'm getting?
If so, thanks!
When all else fails, read the documentation. ;-) ?timestamp says:
The timestamp function writes a timestamp (or other message)
into the history and echos it to the console. On platforms that
do not support a history mechanism only the console message is
printed.
You probably meant to call time or index. Also, the time needs to be POSIXct for the order book, not POSIXlt.

Timezone as +0000 for format-dateTime

I'm trying to format the current datetime in XSLT with an explicit UTC offset (and no other literals and no millis), like: 20140710163601+0200.
However, this <x:value-of select="format-dateTime(current-dateTime(), '[Y0001][M01][D01][H01][m01][s01][z]')"/> gives me this: 20140710164200GMT+02:00. Note that I do not want the GMT part.
If there is no offset, I get 20140710144546.
Is there any way to force an explicit offset and set it to the format I want? Obviously, I could do some string manipulation, but maybe there's a library function I'm overlooking. And then there's the no-timezone result I have to force the format for.
Note that it's no problem for me to build a function around this, but rather I'd use something built in or more elegant.
The XSLT 2.0 spec of format-dateTime() is a bit muddled about timezones, so it may depend on which processor you are using. In 3.0 it's specified that you get the format you want with [Z0000]. Recent versions of Saxon implement the function according to the 3.0 spec, but other processors may well do something different. You might be better off using timezone-from-dateTime() to extract the timezone, and then formatting it using format-number().
Try using (capital) Z instead of (lower-case) z.

DateTime issue on different platforms (.NET 2.0)

On a 32 bit OS, with an Intel processor,
DateTime e.g. 2/17/2009 12:00:00 AM
Notice that it is: MM/DD/yyyy
On a 64 bit OS, with an AMD processor,
DateTime e.g. 17-02-2009 00:00:00
Now when I try to parse the 1st format, it throws an error on the 2nd platform.
That means - DateTime.Parse("2/17/2009 12:00:00 AM") - throws an error - cannot convert.
whereas, on the same platform,
DateTime.Parse("17/2/2009 12:00:00 AM") works! That means DD/MM is fine, MM/DD is not.
What is causing this? The 64-bit OS? The processor?
How do I get rid of the problem?
DateTimes themselves don't have formats. You parse them or format them into strings. (It's like numbers - integers aren't stored in hex or decimal, they're just integers. You can format them in hex or decimal, but the value itself is just a number.)
The format will depend on the culture of the operating system (or more accurately, the culture of the thread, which is typically the same as the operating system one).
Personally I like to explicitly set the format I use for either parsing or formatting, unless I'm actually displaying the string to the user and know that the culture is appropriate already.
Check your "Date and time formats" in the "Region and Language" control panel.
Also, if you want DateTime to generate a specific format, don't just call plain ToString(), but pass it a parameter indicating the format you want. Similarly, if you know the format of the date you are asking it to parse, call TryParseExact(), and tell it the format you are providing.
See also MSDN's Standard Date and Time Format Strings
