xarray to_netcdf not netCDF - netcdf

I'm using xarray for some wave model processing, and it works great. As long as I don't try to read to_netcdf results externally. Even with the example from http://xarray.pydata.org/en/stable/io.html, the resulting file can't be ncdump-ed.
import xarray as xr
import numpy as np
import pandas as pd
ds = xr.Dataset({'foo': (('x', 'y'), np.random.rand(4, 5))},
coords={'x': [10, 20, 30, 40],
'y': pd.date_range('2000-01-01', periods=5),
'z': ('x', list('abcd'))})
ds.to_netcdf('saved_on_disk.nc')
-
prompt> ncdump -h saved_on_disk.nc
ncdump: saved_on_disk.nc: NetCDF: Unknown file format
I'm on a Mac, with netCDF4 1.2.4, python 2.7, and xarray 0.9.6

Related

I want to order the data through jupyter notebook of the graph

(https://i.stack.imgur.com/oyGBO.png)
(https://i.stack.imgur.com/nDmnh.png)
be able to sort the values ​​and have them relocated to their labels correctly.
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import csv
import pandas as pd
data_path = "id.csv"
id_csv = pd.read_csv(data_path)
etiquetas=['10','1','5']
rating=id_csv.rating.value_counts()
rating_count=id_csv.rating_count.value_counts()
confirmation_count=id_csv.confirmation_count.value_counts()
n = len(rating)
x = np.arange(n)
width = 0.25
fig, ax = plt.subplots()
plt.bar(x - width,rating,width=width, label='Rating',color=['green'])
plt.bar(x,rating_count, width=width, label='Rating Count',color=['gold'])
plt.bar(x + width,confirmation_count, width=width, label='Confirmation Count', color=['red'])
for i,j in zip (x, rating):
ax.annotate(j,xy=(i-0.3, j+0.3))
for i,j in zip (x, rating_count):
ax.annotate(j,xy=(i+0.0, j+0.3))
for i,j in zip (x, confirmation_count):
ax.annotate(j,xy=(i+0.2, j+0.3))
ax.set_title("Ratings Graph")
ax.set_ylabel("Population")
ax.set_xlabel("assessment")
ax.set_xticks(x)
plt.legend(loc='best')
ax.set_xticklabels(etiquetas)

not able to load the image and pass it to preprocessing for the model prediction

I am trying to upload the image from the local system within the same directory. Post uploading, when I am passing through open cv split and merge for b,g, and r colors, i get the error ValueError: not enough values to unpack (expected 3, got 0)
Error :
this is the error that is showing is there any possibility to debug in the streamlit where I can track changes at different lines of code? (As in the image path,) when executed in a google collab as individual ipynb files run properly and I get by required classification
ValueError: not enough values to unpack (expected 3, got 0)
Traceback:
File "C:\Users\ADARSH\anaconda3\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 564, in _run_script
exec(code, module.__dict__)
File "C:\Users\ADARSH\streamlit\deploy_test.py", line 76, in <module>
main()
File "C:\Users\ADARSH\streamlit\deploy_test.py", line 68, in main
mask = imageToTensor('image')
File "C:\Users\ADARSH\streamlit\deploy_test.py", line 44, in imageToTensor
b,g,r = cv2.split(bgr_img)
My entire streamlit app code
from pathlib import Path
import cv2
import numpy as np
import pandas as pd
import os
import cv2
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import random
from sklearn.utils import shuffle
from tqdm import tqdm_notebook
import streamlit as st
from PIL import Image as impo
from fastai import *
from fastai.vision import *
from torchvision.models import *
class MyImageItemList(ImageList):
def open(self, fn:PathOrStr)->Image:
img = readCroppedImage(fn.replace('/./','').replace('//','/'))
# This ndarray image has to be converted to tensor before passing on as fastai Image, we can use pil2tensor
return vision.Image(px=pil2tensor(img, np.float32))
def read_image(name):
image = st.file_uploader("Upload an "+ name, type=["png", "jpg", "jpeg",'tif'])
if image is not None:
im = impo.open(image)
im.filename = image.name
return image
def imageToTensor(image):
sz = 68
bgr_img = cv2.imread(image)
b,g,r = cv2.split(bgr_img)
rgb_img = cv2.merge([r,g,b])
# crop to center to the correct size and convert from 0-255 range to 0-1 range
H,W,C = rgb_img.shape
rgb_img = rgb_img[(H-sz)//2:(sz +(H-sz)//2),(H-sz)//2:(sz +(H-sz)//2),:] / 256
return vision.Image(px=pil2tensor(rgb_img, np.float32))
def learn_infernce():
return load_learner('./')
def get_prediction(image):
if st.button('Classify'):
pred, pred_idx, probs = learn_inference.predict(image)
classes = ['negative', 'tumor']
st.write(f'Prediction: {pred}; Probability: {probs[pred_idx]:.04f}')
else:
st.write(f'Click the button to classify')
def main():
st.set_page_config(page_title='Cancer detection', page_icon=None, layout='centered', initial_sidebar_state='auto')
image = read_image('image')
mask = imageToTensor('image')
if mask is not None:
get_prediction('mask')
if __name__ == "__main__":
main()
In your main function you are passing 'str' instead of variables, and also I think your read_image is not well structured.
What you should do is to first save the uploaded file in a directory and fetch the file from that directory and pass it to imageToTensor() as a parameter. That's one work around which will give you a total control over the file. Otherwise you will get other error messages after the first error is fixed.
You can automate some few lines of code in a separate python file to delete the uploaded file from the directory with a given duration.
Note: Keep an eye on the imports because I skipped them to keep the code short
class MyImageItemList(ImageList):
def open(self, fn:PathOrStr)->Image:
img = readCroppedImage(fn.replace('/./','').replace('//','/'))
# This ndarray image has to be converted to tensor before passing on as fastai Image, we can use pil2tensor
return vision.Image(px=pil2tensor(img, np.float32))
# Refactured read_image()
def get_uploaded_image():
upload = st.file_uploader("Upload an image", type=["png", "jpg", "jpeg",'tif'])
if upload is not None:
st.write(upload.name)
# Create a directory and save the image file before proceeding.
file_path = os.path.join("data/uploadedImages/", upload.name)
with open(file_path, "wb") as user_file:
user_file.write(upload.getbuffer())
return file_path # fixed indentation
def imageToTensor(image):
sz = 68
bgr_img = cv2.imread(image)
b,g,r = cv2.split(bgr_img)
rgb_img = cv2.merge([r,g,b])
# crop to center to the correct size and convert from 0-255 range to 0-1 range
H,W,C = rgb_img.shape
rgb_img = rgb_img[(H-sz)//2:(sz +(H-sz)//2),(H-sz)//2:(sz +(H-sz)//2),:] / 256
return vision.Image(px=pil2tensor(rgb_img, np.float32))
def learn_infernce():
return load_learner('./')
def get_prediction(image):
if st.button('Classify'):
pred, pred_idx, probs = learn_inference.predict(image)
classes = ['negative', 'tumor']
st.write(f'Prediction: {pred}; Probability: {probs[pred_idx]:.04f}')
else:
st.write(f'Click the button to classify')
def main():
st.set_page_config(page_title='Cancer detection', page_icon=None, layout='centered', initial_sidebar_state='auto')
# Holds the saved file path
user_image = get_uploaded_image()
if user_image is not None:
# Pass the path to imageToTensor() as a parameter.
mask = imageToTensor(user_image)
get_prediction(mask)
if __name__ == "__main__":
main()
output:

Call Python from Julia

I am new to Julia and I have a Python function that I want to use in Julia. Basically what the function does is to accept a dataframe (passed as a numpy ndarray), a filter value and a list of column indices (from the array) and run a logistic regression using the statsmodels package in Python. So far I have tried this:
using PyCall
py"""
import pandas as pd
import numpy as np
import random
import statsmodels.api as sm
import itertools
def reg_frac(state, ind_vars):
rows = 2000
total_rows = rows*13
data = pd.DataFrame({
'state': ['a', 'b', 'c','d','e','f','g','h','i','j','k','l','m']*rows, \
'y_var': [random.uniform(0,1) for i in range(total_rows)], \
'school': [random.uniform(0,10) for i in range(total_rows)], \
'church': [random.uniform(11,20) for i in range(total_rows)]}).to_numpy()
try:
X, y = sm.add_constant(np.array(data[(data[:,0] == state)][:,ind_vars], dtype=float)), np.array(data[(data[:,0] == state), 1], dtype=float)
model = sm.Logit(y, X).fit(cov_type='HC0', disp=False)
rmse = np.sqrt(np.square(np.subtract(y, model.predict(X))).mean())
except:
rmse = np.nan
return [state, ind_vars, rmse]
"""
reg_frac(state, ind_vars) = (py"reg_frac"(state::Char, ind_vars::Array{Any}))
However, when I run this, I don't expect the results to be NaN. I think it is working but I am missing something.
reg_frac('b', Any[i for i in 2:3])
0.000244 seconds (249 allocations: 7.953 KiB)
3-element Array{Any,1}:
'b'
[2, 3]
NaN
Any help is appreciated.
In Python code you have strs while in Julia code you have Chars - it is not the same.
Python:
>>> type('a')
<class 'str'>
Julia:
julia> typeof('a')
Char
Hence your comparisons do not work.
Your function could look like this:
reg_frac(state, ind_vars) = (py"reg_frac"(state::String, ind_vars::Array{Any}))
And now:
julia> reg_frac("b", Any[i for i in 2:3])
3-element Array{Any,1}:
"b"
[2, 3]
0.2853707270515166
However, I recommed using Vector{Float64} that in PyCall gets converted in-flight into a numpy vector rather than using Vector{Any} so looks like your code still could be improved (depending on what you are actually planning to do).

R package called from python - Error

I am trying to call R packages from python (python 2.7.9) and trying to call Apriori function.
import rpy2
from rpy2 import *
import rpy2.interactive as r
arules = r.packages.importr("arules")
from rpy2.robjects.vectors import ListVector
od = r.OrderedDict()
od["supp"] = 0.0005
od["conf"] = 0.7
od["target"] = 'rules'
result = ListVector(od)
dataset = 'c:/Apriori/testcase.txt'
my_rules = arules.apriori(dataset, parameter=result)
print('my_rules',my_rules)
I am failing in rules. The error is below:
AttributeError: 'module' object has no attribute 'packages'
Please help
from rpy2.robjects.packages import importr
arules = importr("arules")

Plotting with date times and matplotlib

So, I'm using a function from this website to (try) to make stick plots of some netCDF4 data. There is an excerpt of my code below. I got my data from here.
The stick_plot(time,u,v) function is EXACTLY as it appears in the website I linked which is why I did not show a copy of that function below.
When I run my code I get the following error. Any idea on how to get around this?
AttributeError: 'numpy.float64' object has no attribute 'toordinal'
The description of time from the netCDF4 file:
<type 'netCDF4._netCDF4.Variable'>
float64 time(time)
long_name: time
standard_name: time
units: days since 1900-01-01 00:00:00Z
axis: T
ancillary_variables: time_quality_flag
data_min: 2447443.375
data_max: 2448005.16667
unlimited dimensions:
current shape = (13484,)
filling off
Here is an excerpt of my code:
imports:
import matplotlib.pyplot as plot
import numpy as np
from netCDF4 import Dataset
import os
from matplotlib.dates import date2num
from datetime import datetime
trying to generate the plots:
path = '/Users/Kyle/Documents/Summer_Research/east_coast_currents/'
currents = [x for x in os.listdir('%s' %(path)) if '.DS' not in x]
for datum in currents:
working_data = Dataset('%s' %(path+datum), 'r', format = 'NETCDF4')
u = working_data.variables['u'][:][:100]
v = working_data.variables['v'][:][:100]
time = working_data.variables['time'][:][:100]
q = stick_plot(time,u,v)
ref = 1
qk = plot.quiverkey(q, 0.1, 0.85, ref,
"%s N m$^{-2}$" % ref,
labelpos='N', coordinates='axes')
_ = plot.xticks(rotation=70)
Joe Kington answered my question. The netCDF4 file read the times in as a datetime object. All I had to do was replace date2num(time) with time which fixed everything.

Resources