Output int32 time dimension in netCDF using xarray - netcdf

Let's say I have time data that looks like this in an xarray Dataset:
ds = xr.Dataset({'time': pd.date_range('2000-01-01', periods=10)})
ds.to_netcdf('asdf.nc')
xarray's to_netcdf() method outputs the time dimension as int64:
$ ncdump -v time asdf.nc
netcdf asdf {
dimensions:
time = 10 ;
variables:
int64 time(time) ;
time:units = "days since 2000-01-01 00:00:00" ;
time:calendar = "proleptic_gregorian" ;
data:
time = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ;
}
Because I'm working with a THREDDS server which does not support int64, I would like for these time data to be int32. Is this possible to do using xarray?

You can specify the data type of each output variable via the encoding property or the encoding keyword argument to to_netcdf. In your example, this would simply look like:
ds.to_netcdf('asdf.nc', encoding={'time': {'dtype': 'i4'}})
More information on writing encoded data can be found in the xarray documentation: http://xarray.pydata.org/en/latest/io.html#writing-encoded-data
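When several variables need the same treatment, the encoding argument can be built programmatically. A minimal sketch, assuming a hypothetical helper name int32_encoding (it is not part of xarray) and that every listed variable should be written as int32:

```python
def int32_encoding(var_names):
    """Build an `encoding` mapping that requests int32 ('i4') per variable.

    `int32_encoding` is a hypothetical helper for illustration only;
    xarray just receives the resulting nested dict.
    """
    return {name: {"dtype": "i4"} for name in var_names}


encoding = int32_encoding(["time"])
print(encoding)  # {'time': {'dtype': 'i4'}}

# The mapping is then passed to to_netcdf exactly as in the answer:
# ds.to_netcdf('asdf.nc', encoding=encoding)
```

The outer keys are variable names and the inner dicts hold the per-variable settings, which is the same shape as the literal used in the answer.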

Related

Go time comparison

I'm trying to create a simple function that changes the time zone of a time to another (let's assume UTC to +0700 WIB). Here is the source code. I have two functions. The first, GenerateWIB, changes only the time zone to +0700 WIB and keeps the same date and time. The second, GenerateUTC, changes the given time's time zone to UTC. GenerateUTC works perfectly, while GenerateWIB does not.
expect := time.Date(2016, 12, 12, 1, 2, 3, 4, wib)
t1 := time.Date(2016, 12, 12, 1, 2, 3, 4, time.UTC)
res := GenerateWIB(t1)
if res != expect {
fmt.Printf("WIB Expect %+v, but get %+v", expect, res)
}
The res != expect condition is always true, with this result:
WIB Expect 2016-12-12 01:02:03.000000004 +0700 WIB, but get 2016-12-12 01:02:03.000000004 +0700 WIB
But it is the same time, right? Did I miss something?
There is an .Equal() method to compare dates:
if !res.Equal(expect) {
...
Quoting the doc:
Note that the Go == operator compares not just the time instant but also the Location and the monotonic clock reading. Therefore, Time values should not be used as map or database keys without first guaranteeing that the identical Location has been set for all values, which can be achieved through use of the UTC or Local method, and that the monotonic clock reading has been stripped by setting t = t.Round(0). In general, prefer t.Equal(u) to t == u, since t.Equal uses the most accurate comparison available and correctly handles the case when only one of its arguments has a monotonic clock reading.
If you look at the code for the time.Time(*) struct, you can see that this struct has three private fields:
type Time struct {
...
wall uint64
ext int64
...
loc *Location
}
and the comments about those fields clearly indicate that, depending on how the Time struct was built, two Time values describing the same point in time may have different values for these fields.
Running res == expect compares the values of these inner fields,
running res.Equal(expect) tries to do the thing you expect.
(*) time/time.go source code on master branch as of Oct 27th, 2020
Dates in Go should be compared with the Equal method. The Date function returns a Time value.
func Date(year int, month Month, day, hour, min, sec, nsec int, loc *Location) Time
and the Time type has an Equal method.
func (t Time) Equal(u Time) bool
Equal reports whether t and u represent the same time instant. Two times can be equal even if they are in different locations. For example, 6:00 +0200 CEST and 4:00 UTC are Equal. See the documentation on the Time type for the pitfalls of using == with Time values; most code should use Equal instead.
Example
package main
import (
"fmt"
"time"
)
func main() {
secondsEastOfUTC := int((8 * time.Hour).Seconds())
beijing := time.FixedZone("Beijing Time", secondsEastOfUTC)
// Unlike the equal operator, Equal is aware that d1 and d2 are the
// same instant but in different time zones.
d1 := time.Date(2000, 2, 1, 12, 30, 0, 0, time.UTC)
d2 := time.Date(2000, 2, 1, 20, 30, 0, 0, beijing)
datesEqualUsingEqualOperator := d1 == d2
datesEqualUsingFunction := d1.Equal(d2)
fmt.Printf("datesEqualUsingEqualOperator = %v\n", datesEqualUsingEqualOperator)
fmt.Printf("datesEqualUsingFunction = %v\n", datesEqualUsingFunction)
}
datesEqualUsingEqualOperator = false
datesEqualUsingFunction = true
Resources
Time type documentation
Equal method documentation
time.Date

matplotlib date formatting from UTC millisecond

I have quote data of forex currencies which I am trying to plot using matplotlib.
I can plot the data but the dates on the x-axis are in milliseconds and do not help.
I have managed to convert the dates to a datetime format, but it comes out as follows:
datetime.datetime(2015, 12, 12, 2, 0)
When I try to run my code I get the following error:
line 804, in _candlestick
xy=(t - OFFSET, lower),
TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'float'
Process finished with exit code 1
My code is basically the following:
quotes = [dates, open, highest, low, close, volume]
ax = plt.subplot2grid((6, 4), (1, 0), rowspan=6, colspan=4, facecolor='#FFFFFF')
ax.xaxis.set_major_formatter(mdates.DateFormatter('%y-%m-%d %H:%M:%S'))
MOCHLV = zip(quotes[0], quotes[1], quotes[2], quotes[3], quotes[4], quotes[5])
matplotlib.finance.candlestick_ohlc(ax, MOCHLV, width=1.9, colorup='#53c156', colordown='#ff1717')
plt.ylabel('BTC/USD')
plt.xlabel('Date Hours:Minutes')
plt.show()

What format are the timestamps returned by the LinkedIn API?

LinkedIn's API returns the following value:
[creationTimestamp] => 1407247548000
It looks similar to a UNIX timestamp, but there are three "extra" zeros at the end. What format is this in, and how can I decode it?
It is a timestamp in milliseconds. Handling this is language dependent: some languages expect a timestamp in milliseconds, while others expect it in seconds. Python 3, for example, expects seconds, but accepts a fractional part with microsecond resolution (1 millisecond = 1000 microseconds).
from datetime import datetime
ts = 1407247548124
dt = datetime.utcfromtimestamp(ts / 1000)
print(dt) # datetime(2014, 8, 5, 14, 5, 48, 124000)
Python 2 doesn't handle milliseconds directly: ts / 1000 is integer division there, so the fractional part is lost, and you need to split the milliseconds out separately.
from datetime import datetime
ts = 1407247548124
secs, millis = divmod(ts, 1000)
dt = datetime.utcfromtimestamp(secs).replace(microsecond=millis * 1000)
print(dt) # datetime(2014, 8, 5, 14, 5, 48, 124000)
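On current Python 3 releases, the same conversion can also be done without floating-point division by adding a timedelta to the epoch. A minimal sketch, assuming the value is milliseconds since 1970-01-01 UTC; from_millis is a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware datetime
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_millis(ms):
    """Return an aware UTC datetime for `ms` milliseconds since the epoch.

    Hypothetical helper; integer milliseconds are added exactly, so no
    float rounding is involved.
    """
    return EPOCH + timedelta(milliseconds=ms)


dt = from_millis(1407247548124)
print(dt.isoformat())  # 2014-08-05T14:05:48.124000+00:00
```

Unlike utcfromtimestamp, which returns a naive datetime, the result here carries its UTC offset.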

How do you use matrices in Nimrod?

I found this project on GitHub; it was the only search term returned for "nimrod matrix". I took the bare bones of it and changed it a little bit so that it compiled without errors, and then I added the last two lines to build a simple matrix, and then output a value, but the "getter" function isn't working for some reason. I adapted the instructions for adding properties found here, but something isn't right.
Here is my code so far. I'd like to use the GNU Scientific Library from within Nimrod, and I figured that this was the first logical step.
type
TMatrix*[T] = object
transposed: bool
dataRows: int
dataCols: int
data: seq[T]
proc index[T](x: TMatrix[T], r,c: int): int {.inline.} =
if r<0 or r>(x.rows()-1):
raise newException(EInvalidIndex, "matrix index out of range")
if c<0 or c>(x.cols()-1):
raise newException(EInvalidIndex, "matrix index out of range")
result = if x.transposed: c*x.dataCols+r else: r*x.dataCols+c
proc rows*[T](x: TMatrix[T]): int {.inline.} =
## Returns the number of rows in the matrix `x`.
result = if x.transposed: x.dataCols else: x.dataRows
proc cols*[T](x: TMatrix[T]): int {.inline.} =
## Returns the number of columns in the matrix `x`.
result = if x.transposed: x.dataRows else: x.dataCols
proc matrix*[T](rows, cols: int, d: openarray[T]): TMatrix[T] =
## Constructor. Initializes the matrix by allocating memory
## for the data and setting the number of rows and columns
## and sets the data to the values specified in `d`.
result.dataRows = rows
result.dataCols = cols
newSeq(result.data, rows*cols)
if len(d)>0:
if len(d)<(rows*cols):
raise newException(EInvalidIndex, "insufficient data supplied in matrix constructor")
for i in countup(0,rows*cols-1):
result.data[i] = d[i]
proc `[][]`*[T](x: TMatrix[T], r,c: int): T =
## Element access. Returns the element at row `r` column `c`.
result = x.data[x.index(r,c)]
proc `[][]=`*[T](x: var TMatrix[T], r,c: int, a: T) =
## Sets the value of the element at row `r` column `c` to
## the value supplied in `a`.
x.data[x.index(r,c)] = a
var m = matrix( 2, 2, [1,2,3,4] )
echo( $m[0][0] )
This is the error I get:
c:\program files (x86)\nimrod\config\nimrod.cfg(36, 11) Hint: added path: 'C:\Users\H127\.babel\libs\' [Path]
Hint: used config file 'C:\Program Files (x86)\Nimrod\config\nimrod.cfg' [Conf]
Hint: system [Processing]
Hint: mat [Processing]
mat.nim(48, 9) Error: type mismatch: got (TMatrix[int], int literal(0))
but expected one of:
system.[](a: array[Idx, T], x: TSlice[Idx]): seq[T]
system.[](a: array[Idx, T], x: TSlice[int]): seq[T]
system.[](s: string, x: TSlice[int]): string
system.[](s: seq[T], x: TSlice[int]): seq[T]
Thank you, guys!
I'd like to first point out that the matrix library you refer to is three years old. For a programming language still in development that's a long time, and the library no longer compiles with the current Nimrod git version:
$ nimrod c matrix
...
private/tmp/n/matrix/matrix.nim(97, 8) Error: ']' expected
It fails on the double array accessor, whose syntax seems to have changed. I suspect a double [][] accessor is problematic because it is ambiguous: are you calling the object's double-bracket accessor, or indexing the nested array returned by the first brackets? I had to change the proc to the following:
proc `[]`*[T](x: TMatrix[T], r,c: int): T =
After that change you also need to change the way to access the matrix. Here's what I got:
for x in 0 .. <2:
for y in 0 .. <2:
echo "x: ", x, " y: ", y, " = ", m[x,y]
Basically, instead of specifying two bracket accesses you pass all the parameters inside a single bracket. That code generates:
x: 0 y: 0 = 1
x: 0 y: 1 = 2
x: 1 y: 0 = 3
x: 1 y: 1 = 4
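For comparison, the flat row-major layout and the single-bracket access can be sketched in Python, where m[r, c] passes the tuple (r, c) to __getitem__. This is an illustration under assumed names (the Matrix class below is hypothetical and omits the transposed flag), not a port of the Nimrod library:

```python
class Matrix:
    """Dense matrix stored as one flat list in row-major order."""

    def __init__(self, rows, cols, data):
        if len(data) < rows * cols:
            raise ValueError("insufficient data supplied in matrix constructor")
        self.rows = rows
        self.cols = cols
        self.data = list(data[: rows * cols])

    def _index(self, r, c):
        # Map (row, column) to a position in the flat list.
        if not (0 <= r < self.rows and 0 <= c < self.cols):
            raise IndexError("matrix index out of range")
        return r * self.cols + c

    def __getitem__(self, key):
        r, c = key  # m[r, c] arrives as the tuple (r, c)
        return self.data[self._index(r, c)]

    def __setitem__(self, key, value):
        r, c = key
        self.data[self._index(r, c)] = value


m = Matrix(2, 2, [1, 2, 3, 4])
for x in range(2):
    for y in range(2):
        print("x:", x, "y:", y, "=", m[x, y])
```

Because both indices reach the accessor in one call, there is no intermediate "row" object and no ambiguity about which brackets belong to which accessor.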
With regard to finding software for Nimrod, I recommend using Nimble, Nimrod's package manager. Once you have it installed you can search the available, maintained packages. The command nimble search math shows two potential packages: linagl and extmath. I'm not sure they are what you are looking for, but at least they seem fresher.

Python Pandas Series gives NaN data when passing a dict with large index values

I am trying to build a Pandas series by passing it a dictionary containing index and data pairs. While doing so I noticed an interesting quirk. If the index of the data pair is a very large integer the data will show up as NaN. This is fixed by reducing the size of the index values, or creating the Series using two lists instead of a single dict. I have large index values because I am using time-stamps in microseconds-since-1970 format. Am I doing something wrong or is this a bug?
Here's an example:
import pandas as pd
test_series_time = [1357230060000000, 1357230180000000, 1357230300000000]
test_series_value = [1, 2, 3]
series = pd.Series(test_series_value, test_series_time, name="this works")
test_series_dict = {1357230060000000: 1, 1357230180000000: 2, 1357230300000000: 3}
series2 = pd.Series(test_series_dict, name="this doesn't")
test_series_dict_smaller_index = {1357230060: 1, 1357230180: 2, 1357230300: 3}
series3 = pd.Series(test_series_dict_smaller_index, name="this does")
print(series)
print(series2)
print(series3)
and the output:
1357230060000000 1
1357230180000000 2
1357230300000000 3
Name: this works
1357230060000000 NaN
1357230180000000 NaN
1357230300000000 NaN
Name: this doesn't
1357230060 1
1357230180 2
1357230300 3
Name: this does
So what's up with this?
I bet you are on 32-bit; on 64-bit this works fine. In 0.10.1, the default of creation via dicts is to use the default numpy integer creation, which is system dependent (e.g. int32 on 32-bit, and int64 on 64-bit). You are overflowing the dtype, which results in unpredictable behavior.
In 0.11 (coming out this week!), this will work as it will default to creating int64s regardless of the system.
In [12]: np.iinfo(np.int32).max
Out[12]: 2147483647
In [13]: np.iinfo(np.int64).max
Out[13]: 9223372036854775807
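Comparing the index values from the question against these limits shows the overflow directly. A minimal sketch using plain Python integers; fits_int32 is a hypothetical helper:

```python
INT32_MAX = 2**31 - 1  # 2147483647, same as np.iinfo(np.int32).max
INT64_MAX = 2**63 - 1  # 9223372036854775807, same as np.iinfo(np.int64).max


def fits_int32(value):
    """Return True if `value` is representable as a signed 32-bit integer."""
    return -(2**31) <= value <= INT32_MAX


# Microsecond and second timestamps taken from the question
for ts in (1357230060000000, 1357230060):
    print(ts, fits_int32(ts), ts <= INT64_MAX)
# 1357230060000000 False True
# 1357230060 True True
```

The microsecond values exceed the int32 maximum but fit comfortably in int64, while the second-resolution values fit in both.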
Convert your microseconds to Timestamps (multiply by 1000 to get nanoseconds, which is what Timestamp accepts as an integer input), and then you are good to go:
In [5]: pd.Series(test_series_value,
[ pd.Timestamp(k*1000) for k in test_series_time ])
Out[5]:
2013-01-03 16:21:00 1
2013-01-03 16:23:00 2
2013-01-03 16:25:00 3
