TypeError: invalid type promotion xarray spatial mean - netcdf

I'm trying to calculate a meridional, cosine-weighted mean of a sub-region and time slice of a netCDF dataset. Here is my code:
from netCDF4 import Dataset
import xarray as xr
import numpy as np
min_lat=2
max_lat=9
datestr="2017-03-10"
olrfile="olr-daily_v01r02_20170101_20171231.nc"
ds=xr.open_dataset(olrfile)
olr=ds.sel(lat=slice(min_lat,max_lat),time=datestr)
weights=np.cos(np.deg2rad(ds.lat))
olrw=olr.weighted(weights)
olrm=olrw.mean(dim=('lat'))
The final mean statement falls over with this error:
File "/afs/ictp/home/t/tompkins/.local/lib/python3.6/site-packages/numpy/core/einsumfunc.py", line 1350, in einsum
return c_einsum(*operands, **kwargs)
TypeError: invalid type promotion
and I have no idea what is wrong... I know I can do this with CDO, but I thought I would try to do it inline in xarray for speed.
The link to the netcdf file dir is here.

The problem appears to be that your current code is calculating the mean for all variables in the Dataset. As more of a CDO user personally, I get confused by xarray treating bnds as variables. In this case the file has time_bnds as a variable, and your code is trying to calculate the weighted mean for it as well, which fails because (I think) it has no lat dimension.
You just need to select the olr variable before applying the weighting.
import xarray as xr
import numpy as np

min_lat = 2
max_lat = 9
datestr = "2017-03-10"
olrfile = "olr-daily_v01r02_20170101_20171231.nc"

ds = xr.open_dataset(olrfile)
olr = ds["olr"].sel(lat=slice(min_lat, max_lat), time=datestr)  # a DataArray, not the whole Dataset
weights = np.cos(np.deg2rad(ds.lat))
olrw = olr.weighted(weights)
olrm = olrw.mean(dim="lat")
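For intuition, here is a minimal sketch (with made-up latitudes and OLR values, not the real file) of the cosine-of-latitude weighted mean that weighted(...).mean("lat") computes under the hood:

```python
import numpy as np

# Synthetic stand-in for the sub-region: four latitudes between 2N and 9N
lat = np.array([2.5, 4.5, 6.5, 8.5])          # degrees north (hypothetical grid)
olr = np.array([240.0, 245.0, 250.0, 255.0])  # hypothetical OLR values, W/m^2

# Cosine-of-latitude weights, as in the question
weights = np.cos(np.deg2rad(lat))

# Weighted mean: sum(x * w) / sum(w)
weighted_mean = np.sum(olr * weights) / np.sum(weights)

# Slightly below the unweighted mean (247.5), because cos(lat)
# gives the equatorward values marginally more weight.
print(weighted_mean)
```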

Related

Load NPZ sparse matrix in R

How can I read a sparse matrix that I saved with Python as a *.npz file into R? I already came across two answers* on Stack Overflow, but neither seems to do the job in my case.
The data set was created with Python from a Pandas data frame via:
scipy.sparse.save_npz(
"data.npz",
scipy.sparse.csr_matrix(DataFrame.values)
)
It seems like the first steps for importing the data set in R are as follows.
library(reticulate)
np = import("numpy")
npz1 <- np$load("data.npz")
However, this does not yield a data frame yet.
*1 Load sparce NumPy matrix into R
*2 Reading .npz files from R
I cannot access your dataset, so I can only speak from experience. When I try loading a sparse CSR matrix with numpy, it does not work; the class of the resulting object is numpy.lib.npyio.NpzFile, which I can't use in R.
The way I found to import the matrix into an R object, as has been said in a post you've linked, is to use scipy.sparse.
library(reticulate)
scipy_sparse = import("scipy.sparse")
csr_matrix = scipy_sparse$load_npz("path_to_your_file")
csr_matrix, which was a scipy.sparse.csr_matrix object in Python (Compressed Sparse Row matrix), is automatically converted into a dgRMatrix from the R package Matrix. Note that if you had used scipy.sparse.csc_matrix in Python, you would get a dgCMatrix (Compressed Sparse Column matrix). The actual function doing the hard work of converting the Python object into something R can use is py_to_r.scipy.sparse.csr.csr_matrix, from the reticulate package.
If you want to convert the dgRMatrix into a data frame, you can simply use
df <- as.data.frame(as.matrix(csr_matrix))
although this might not be the best thing to do memory-wise if your dataset is big.
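On the Python side, you can verify that save_npz and load_npz are the matching pair (which is why scipy.sparse$load_npz works via reticulate) with a small round trip on synthetic data:

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Synthetic matrix standing in for DataFrame.values
m = sparse.csr_matrix(np.array([[0.0, 1.0], [2.0, 0.0]]))

# Save exactly as in the question, then load with the counterpart function
path = os.path.join(tempfile.mkdtemp(), "data.npz")
sparse.save_npz(path, m)
loaded = sparse.load_npz(path)

# loaded is a csr_matrix again; (loaded != m).nnz counts differing entries
print(type(loaded).__name__)
print((loaded != m).nnz)
```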
I hope this helped!

gremlin-python - get nodes with greater than two edges

I am currently using gremlin-python to study a graph. I want to get all vertices having more than two out-edges. I am using an anonymous traversal to filter vertices based on the edge count, but below is the error I am getting.
AttributeError: 'list' object has no attribute 'out'
I am new to this, not sure what I am doing wrong here. This is the way described in the limited gremlin-python tutorials/docs available.
It would be helpful if you could include a code snippet showing the imports you used as well as the query. In your Python code did you remember to import this class?
from gremlin_python.process.graph_traversal import __
I am able to run your query without any issues using one of my graphs
g.V().hasLabel('airport').where(__.out().count().is_(P.gte(2))).count().next()
If you do not have that import you will see an error like the one you are seeing.
There is a list of the most commonly needed imports when using gremlin-python at this location
EDITED to add:
As Stephen points out in the comment below, since you only ever need to know whether there are at least two outgoing edges, you can reduce the work the query engine has to do (some optimizers may not need this) by adding a limit step.
g.V().hasLabel('airport').where(__.out().limit(2).count().is_(P.gt(1))).count().next()
from gremlin_python.structure.graph import Graph
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
graph = Graph()
graph_db_uri = 'ws://localhost:8182/gremlin'
g = graph.traversal().withRemote(DriverRemoteConnection(graph_db_uri,'g'))
c=g.V().hasLabel('node_name').count().next()
print(c)

R - package marmap - read.bathy - depth values NA

I'm using RStudio and the marmap package to import my 3-column txt file (latitude, longitude and depth values).
After the import, all the depth values appear as NA.
I've tried a different file format (csv) but the result was the same.
Can someone help me?
Thanks,
Paulo
Have you tried plot(praia)? It seems you have an enormous number of NAs, but you also have a few (726!) non-missing values.
Since the data you're trying to import are obviously not regularly spaced, you should check ?griddify()

how to initialize layers by numpy array in keras

I want to convert a pre-trained Caffe model to Keras, so I need to initialize the layers, layer by layer.
I saved the weights and biases in a .mat file and loaded them into my Python workspace.
I know the "weights" parameter takes NumPy arrays, but I don't know in what format.
Thanks
You can find more information about how to set the weights of a model in the Keras Layers documentation. Basically you use:
layer.set_weights(weights): sets the weights of the layer from a list of Numpy arrays (with the same shapes as the output of get_weights).
Or you can initialize them directly when you create the layer. Every layer has a weights parameter that you can set with a list of NumPy arrays. Read each layer's documentation to feed it the right weights format. For example, Dense() layers accept this format for the weights parameter:
List of Numpy arrays to set as initial weights. The list should have 2 elements, of shape (input_dim, output_dim) and (output_dim,) for weights and biases respectively. source
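A minimal sketch of that list format, using random synthetic arrays in place of the ones loaded from the .mat file (the dimensions here are hypothetical):

```python
import numpy as np

# Hypothetical layer dimensions for illustration
input_dim, output_dim = 4, 3

# The two arrays Dense() expects: kernel then bias
W = np.random.randn(input_dim, output_dim)  # shape (input_dim, output_dim)
b = np.zeros(output_dim)                    # shape (output_dim,)

# This is the list you would pass, e.g.:
#   layer = Dense(output_dim, weights=[W, b])   # at construction, or
#   layer.set_weights([W, b])                   # after the layer is built
weights_list = [W, b]

print([a.shape for a in weights_list])  # -> [(4, 3), (3,)]
```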

Import non-contiguous cell range from Excel to an R object

Does anyone know of a function that could import non-contiguous cells from MS Excel into one R object in a simple way? That is, a function that would take a non-contiguous cell range (e.g. 'A1:B10,D1:C10') in a given Excel worksheet as an input, and return an R object (e.g. numeric vector). readNamedRegion() from the XLConnect package does not work with non-contiguous cell ranges and using appendNamedRegion() would introduce unwanted complexity in my code.