sign recognition like hand written digits example in scikit-learn (python) - multidimensional-array

I watch out this example: http://scikit-learn.org/stable/auto_examples/plot_digits_classification.html#example-plot-digits-classification-py
on handwritten digits in scikit-learn python library.
i would like to prepare a 3d array (N * a* b) where N is my images number (75) and a* b is the matrix of an image (like in the example a 8x8 shape).
My problem is: i have signs in a different shapes for every image: (202, 230), (250, 322).. and give me
this error: ValueError: array dimensions must agree except for d_0 in this code:
#here there is the error:
grigiume = np.dstack(listagrigie)
print(grigiume.shape)
grigiume=np.rollaxis(grigiume,-1)
print(grigiume.shape)
There is a manner to resize all images in a standard size (i.e. 200x200) or a manner to have a 3d array with matrix(a,b) where a != from b and do not give me an error in this code:
data = digits.images.reshape((n_samples, -1))
classifier.fit(data[:n_samples / 2], digits.target[:n_samples / 2])
My code:
import os
import glob
import numpy as np
from numpy import array
listagrigie = []
path = 'resize2/'
for infile in glob.glob( os.path.join(path, '*.jpg') ):
print("current file is: " + infile )
colorato = cv2.imread(infile)
grigiscala = cv2.cvtColor(colorato,cv2.COLOR_BGR2GRAY)
listagrigie.append(grigiscala)
print(len(listagrigie))
#here there is the error:
grigiume = np.dstack(listagrigie)
print(grigiume.shape)
grigiume=np.rollaxis(grigiume,-1)
print(grigiume.shape)
#last step
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
# Create a classifier: a support vector classifier
classifier = svm.SVC(gamma=0.001)
# We learn the digits on the first half of the digits
classifier.fit(data[:n_samples / 2], digits.target[:n_samples / 2])
# Now predict the value of the digit on the second half:
expected = digits.target[n_samples / 2:]
predicted = classifier.predict(data[n_samples / 2:])
print "Classification report for classifier %s:\n%s\n" % (
classifier, metrics.classification_report(expected, predicted))
print "Confusion matrix:\n%s" % metrics.confusion_matrix(expected, predicted)
for index, (image, prediction) in enumerate(
zip(digits.images[n_samples / 2:], predicted)[:4]):
pl.subplot(2, 4, index + 5)
pl.axis('off')
pl.imshow(image, cmap=pl.cm.gray_r, interpolation='nearest')
pl.title('Prediction: %i' % prediction)
pl.show()

You have to resize all your images to a fixed size. For instance using the Image class of PIL or Pillow:
from PIL import Image
image = Image.open("/path/to/input_image.jpeg")
image.thumbnail((200, 200), Image.ANTIALIAS)
image.save("/path/to/output_image.jpeg")
Edit: the above won't work, try instead resize:
from PIL import Image
image = Image.open("/path/to/input_image.jpeg")
image = image.resize((200, 200), Image.ANTIALIAS)
image.save("/path/to/output_image.jpeg")
Edit 2: there might be a way to preserve the aspect ratio and pad the rest with black pixels but I don't know how to do in a few PIL calls. You could use PIL.Image.thumbnail and use numpy to do the padding though.

Related

RuntimeError: quantile() q tensor must be same dtype as the input tensor in pytorch-forecasting

PyTorch-Forecasting version: 0.10.2
PyTorch version:1.12.1
Python version:3.10.4
Operating System: windows
Expected behavior
No Error
Actual behavior
The Error is
File c:\Users\josepeeterson.er\Miniconda3\envs\pytorch\lib\site-packages\pytorch_forecasting\metrics\base_metrics.py:979, in DistributionLoss.to_quantiles(self, y_pred, quantiles, n_samples)
977 except NotImplementedError: # resort to derive quantiles empirically
978 samples = torch.sort(self.sample(y_pred, n_samples), -1).values
--> 979 quantiles = torch.quantile(samples, torch.tensor(quantiles, device=samples.device), dim=2).permute(1, 2, 0)
980 return quantiles
RuntimeError: quantile() q tensor must be same dtype as the input tensor
How do I set them to be of same datatype? This is happening internally. I do not have control over this. I am not using any GPUs.
The link to the .csv file with input data is https://github.com/JosePeeterson/Demand_forecasting
The data is just sampled from a negative binomila distribution wiht parameters (9,0.5) every 4 hours. the time inbetween is all zero.
I just want to see if DeepAR can learn this pattern.
Code to reproduce the problem
from pytorch_forecasting.data.examples import generate_ar_data
import matplotlib.pyplot as plt
import pandas as pd
from pytorch_forecasting.data import TimeSeriesDataSet
from pytorch_forecasting.data import NaNLabelEncoder
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
import pytorch_lightning as pl
from pytorch_forecasting import NegativeBinomialDistributionLoss, DeepAR
import torch
from pytorch_forecasting.data.encoders import TorchNormalizer
data = [pd.read_csv('1_f_nbinom_train.csv')]
data["date"] = pd.Timestamp("2021-08-24") + pd.to_timedelta(data.time_idx, "H")
data['_hour_of_day'] = str(data["date"].dt.hour)
data['_day_of_week'] = str(data["date"].dt.dayofweek)
data['_day_of_month'] = str(data["date"].dt.day)
data['_day_of_year'] = str(data["date"].dt.dayofyear)
data['_week_of_year'] = str(data["date"].dt.weekofyear)
data['_month_of_year'] = str(data["date"].dt.month)
data['_year'] = str(data["date"].dt.year)
max_encoder_length = 60
max_prediction_length = 20
training_cutoff = data["time_idx"].max() - max_prediction_length
training = TimeSeriesDataSet(
data.iloc[0:-620],
time_idx="time_idx",
target="value",
categorical_encoders={"series": NaNLabelEncoder(add_nan=True).fit(data.series), "_hour_of_day": NaNLabelEncoder(add_nan=True).fit(data._hour_of_day), \
"_day_of_week": NaNLabelEncoder(add_nan=True).fit(data._day_of_week), "_day_of_month" : NaNLabelEncoder(add_nan=True).fit(data._day_of_month), "_day_of_year" : NaNLabelEncoder(add_nan=True).fit(data._day_of_year), \
"_week_of_year": NaNLabelEncoder(add_nan=True).fit(data._week_of_year), "_year": NaNLabelEncoder(add_nan=True).fit(data._year)},
group_ids=["series"],
min_encoder_length=max_encoder_length,
max_encoder_length=max_encoder_length,
min_prediction_length=max_prediction_length,
max_prediction_length=max_prediction_length,
time_varying_unknown_reals=["value"],
time_varying_known_categoricals=["_hour_of_day","_day_of_week","_day_of_month","_day_of_year","_week_of_year","_year" ],
time_varying_known_reals=["time_idx"],
add_relative_time_idx=False,
randomize_length=None,
scalers=[],
target_normalizer=TorchNormalizer(method="identity",center=False,transformation=None )
)
validation = TimeSeriesDataSet.from_dataset(
training,
data.iloc[-620:-420],
# predict=True,
stop_randomization=True,
)
batch_size = 64
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=8)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=8)
# save datasets
training.save("training.pkl")
validation.save("validation.pkl")
early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=5, verbose=False, mode="min")
lr_logger = LearningRateMonitor()
trainer = pl.Trainer(
max_epochs=10,
gpus=0,
gradient_clip_val=0.1,
limit_train_batches=30,
limit_val_batches=3,
# fast_dev_run=True,
# logger=logger,
# profiler=True,
callbacks=[lr_logger, early_stop_callback],
)
deepar = DeepAR.from_dataset(
training,
learning_rate=0.1,
hidden_size=32,
dropout=0.1,
loss=NegativeBinomialDistributionLoss(),
log_interval=10,
log_val_interval=3,
# reduce_on_plateau_patience=3,
)
print(f"Number of parameters in network: {deepar.size()/1e3:.1f}k")
torch.set_num_threads(10)
trainer.fit(
deepar,
train_dataloaders=train_dataloader,
val_dataloaders=val_dataloader,
)
Need to cast samples to torch.tensor as shown below. Then save this base_metrics.py and rerun above code.
except NotImplementedError: # resort to derive quantiles empirically
samples = torch.sort(self.sample(y_pred, n_samples), -1).values
quantiles = torch.quantile(torch.tensor(samples), torch.tensor(quantiles, device=samples.device), dim=2).permute(1, 2, 0)
return quantiles

How compare two images using sikuli in robotframework

am totaly new to the sikuli with robotframework. I have some want idea of click and action of the window based but i dont have the concept of compare the image using sikuli with robotframework. Can any one help me?
This is RaiMan from SikuliX.
I recommend to use
https://github.com/rainmanwy/robotframework-SikuliLibrary
I have developed a python script for image comparison, which gets two images to compare the images and write the difference.
You can use this as robot keyword for image comparison. Challenge is that you have to crop the image from desktop application. you can able to do this using robot framework sikuli library keyword crop region by providing x,y, width and height values
compare images.py
# importing the necessary packages
#from skimage.measure import compare_ssim
from skimage.metrics import structural_similarity
import argparse
import imutils
import cv2
import time
import logging
def Compare_Image(image1,image2,Original_image_path,Modified_image_path):
# Reading images
imageA = cv2.imread(image1)
imageB = cv2.imread(image2)
# convert the images to grayscale
grayA = cv2.cvtColor(imageA, cv2.COLOR_BGR2GRAY)
grayB = cv2.cvtColor(imageB, cv2.COLOR_BGR2GRAY)
#getting the structural similarity between the images
(score, diff) = structural_similarity(grayA, grayB, full=True)
diff = (diff * 255).astype("uint8")
print("SSIM: {}".format(score))
#Threshold differences
thresh = cv2.threshold(diff, 0, 255,
cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
# loop over the contours
for c in cnts:
#Displaying Rectangles on image difference
(x, y, w, h) = cv2.boundingRect(c)
cv2.rectangle(imageA, (x, y), (x + w, y + h), (0, 0, 255), 2)
cv2.rectangle(imageB, (x, y), (x + w, y + h), (0, 0, 255), 2)
# show the output images
image1=cv2.imshow("Original", imageA)
image2=cv2.imshow("Modified", imageB)
cv2.imwrite(Original_image_path, imageA)
cv2.imwrite(Modified_image_path, imageB)
# cv2.imshow("Diff", diff)
# cv2.imshow("Thresh", thresh)
#cv2.waitKey(0)
#cv2.destroyAllWindows()
#Validating and Logging Image Results
if score==1.0:
print("Reference and Test images are same")
else:
print("Reference and Test images are different")
Sample.robot
* Setting *
Library compare_Images.py
* Test Cases *
Check_Image_Compare
Compare Image path_to_image1\\img1.png path_to_image2\\image2.png
path_to_saveorginalimage1\\orgimage.png path_to_savemodifiedimage\\modifiedimage.png
Log <img src="path_to_saveorginalimage1\\orgimage.png"> HTML
Log <img src="path_to_savemodifiedimage\\modifiedimage.png"> HTML
On executing the python script it will give two images (original image- image1 to compare Modified image- image2 to compare) which will be saved in specified location and logged as in robot file above.

Pitch Calculation Error via Autocorrelation Method

Aim : Pitch Calculation
Issue : The calculated pitch does not match the expected one. For instance, the output is approx. 'D3', however the expected output is 'C5'.
Source Sound : https://freewavesamples.com/1980s-casio-celesta-c5
Source Code
library("tuneR")
library("seewave")
#0: Acquisition of sample sound
snd_smpl = readWave(paste("~/Music/sample/1980s-Casio-Celesta-C5.wav"),
from = 0, to = 1, units = "seconds")
dur_smpl = duration(snd_smpl)
len_smpl = length(snd_smpl)
#1 : Pre-Processing Stage
#1.1 : Application of Hanning Window
n = 1:len_smpl
han_win = 0.5-0.5*cos(2*pi*n/(len_smpl-1))
wind_sig = han_win*snd_smpl#left
#2.1 : Auto-Correlation Calculation
rev_wind_sig = rev(wind_sig) #Reversing the windowed signal
acorr_1 = convolve(wind_sig, rev_wind_sig, type = "open")
# Obtaining the 2nd half of the correlation, to simplify calculation
n = 2*len_smpl-1
acorr_2 = (1/len_smpl)*acorr_1[len_smpl:n]
#2.2 : Note Calculation
min_index = which.min(acorr_2)
print(min_index)
fs = 44100
fo = fs/min_index #To obtain fundamental frequency
print(fo)
print(notenames(noteFromFF(fo)))
Output
> print(min_index)
[1] 37
> fs = 44100
> fo = fs/min_index
> print(fo)
[1] 1191.892
> print(notenames(noteFromFF(fo)))
[1] "d'''"
The entire calculation is performed in the Time Domain.
I'm currently using autocorrelation as a base to understand more about Pitch Detection & Analysis. I've tried to analyse the sample with 'Audacity' and the result is 'C5'. Hence, I'm wondering where actually the issue is.
Can you all help me find it?
Also, there are a few but important doubts:
How small should actually my analysis window be (20ms, 1s,..)?
Will reinforcement of the Autocorrelation Algorithm with AMDF and other similar algorithms make this Pitch Detection module more robust?
This whole analysis seems not correct. You should not use windowing in time domain analysis.
Attached a short solution in the python language; you can use it as pseudocode
from soundfile import read
from glob import glob
from scipy.signal import correlate, find_peaks
from matplotlib.pyplot import plot, show, xlim, title, xlabel
import numpy as np
%matplotlib inline
name = glob('*wav')[0]
samples, fs = read(name)
corr = correlate(samples, samples)
corr = corr[corr.size / 2:]
time = np.arange(corr.size) / float(fs)
ind = find_peaks(corr[time < 0.002])[0]
plot(time, corr)
plot(time[ind], corr[ind], '*')
xlim([0, 0.005])
title('Frequency = {} Hz'.format(1 / time[ind][0]))
xlabel('Time [Sec]')
show()

Error when implementing RBF kernel bandwidth differentiation in Pytorch

I'm implementing an RBF network by using some beginer examples from Pytorch Website. I have a problem when implementing the kernel bandwidth differentiation for the network. Also, Iwould like to know whether my attempt ti implement the idea is fine. This is a code sample to reproduce the issue. Thanks
# -*- coding: utf-8 -*-
import torch
from torch.autograd import Variable
def kernel_product(x,y, mode = "gaussian", s = 1.):
x_i = x.unsqueeze(1)
y_j = y.unsqueeze(0)
xmy = ((x_i-y_j)**2).sum(2)
if mode == "gaussian" : K = torch.exp( - xmy/s**2) )
elif mode == "laplace" : K = torch.exp( - torch.sqrt(xmy + (s**2)))
elif mode == "energy" : K = torch.pow( xmy + (s**2), -.25 )
return torch.t(K)
class MyReLU(torch.autograd.Function):
"""
We can implement our own custom autograd Functions by subclassing
torch.autograd.Function and implementing the forward and backward passes
which operate on Tensors.
"""
#staticmethod
def forward(ctx, input):
"""
In the forward pass we receive a Tensor containing the input and return
a Tensor containing the output. ctx is a context object that can be used
to stash information for backward computation. You can cache arbitrary
objects for use in the backward pass using the ctx.save_for_backward method.
"""
ctx.save_for_backward(input)
return input.clamp(min=0)
#staticmethod
def backward(ctx, grad_output):
"""
In the backward pass we receive a Tensor containing the gradient of the loss
with respect to the output, and we need to compute the gradient of the loss
with respect to the input.
"""
input, = ctx.saved_tensors
grad_input = grad_output.clone()
grad_input[input < 0] = 0
return grad_input
dtype = torch.cuda.FloatTensor
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold input and outputs, and wrap them in Variables.
x = Variable(torch.randn(N, D_in).type(dtype), requires_grad=False)
y = Variable(torch.randn(N, D_out).type(dtype), requires_grad=False)
# Create random Tensors for weights, and wrap them in Variables.
w1 = Variable(torch.randn(H, D_in).type(dtype), requires_grad=True)
w2 = Variable(torch.randn(H, D_out).type(dtype), requires_grad=True)
# I've created this scalar variable (the kernel bandwidth)
s = Variable(torch.randn(1).type(dtype), requires_grad=True)
learning_rate = 1e-6
for t in range(500):
# To apply our Function, we use Function.apply method. We alias this as 'relu'.
relu = MyReLU.apply
# Forward pass: compute predicted y using operations on Variables; we compute
# ReLU using our custom autograd operation.
# y_pred = relu(x.mm(w1)).mm(w2)
y_pred = relu(kernel_product(w1, x, s)).mm(w2)
# Compute and print loss
loss = (y_pred - y).pow(2).sum()
print(t, loss.data[0])
# Use autograd to compute the backward pass.
loss.backward()
# Update weights using gradient descent
w1.data -= learning_rate * w1.grad.data
w2.data -= learning_rate * w2.grad.data
# Manually zero the gradients after updating weights
w1.grad.data.zero_()
w2.grad.data.zero_()
However I get this error, which dissapears when I simply use a fixed scalar in the default input parameter of kernel_product():
RuntimeError: eq() received an invalid combination of arguments - got (str), but expected one of:
* (float other)
didn't match because some of the arguments have invalid types: (str)
* (Variable other)
didn't match because some of the arguments have invalid types: (str)
Well, you are calling kernel_product(w1, x, s) where w1, x and s are torch Variable while the definition of the function is: kernel_product(x,y, mode = "gaussian", s = 1.). Seems like s should be a string specifying the mode.

ImageView.setImage axes parameter does not switch X-Y dimensions

I have modified the ImageView example by adding the statement data[:, ::10, :] = 0, which sets every tenth element of the middle dimension to 0. The program now shows horizontal lines. This is consistent with the documentation of the ImageView.setImage function: the default axes dictionary is {'t':0, 'x':1, 'y':2, 'c':3}. However, when I change this to {'t':0, 'x':2, 'y':1, 'c':3}, nothing changes where I would expect to get vertical rows.
So my question is: how can I give the row dimension a higher precedence in PyQtGraph? Of course I can transpose all my arrays myself before passing them to the setImage function but I prefer not to. Especially since both Numpy and Qt use the row/column convention and not X before Y. I don't see why PyQtGraph chooses the latter.
For completeness, find my modified ImageView example below.
import numpy as np
from pyqtgraph.Qt import QtCore, QtGui
import pyqtgraph as pg
app = QtGui.QApplication([])
## Create window with ImageView widget
win = QtGui.QMainWindow()
win.resize(800,800)
imv = pg.ImageView()
win.setCentralWidget(imv)
win.show()
win.setWindowTitle('pyqtgraph example: ImageView')
## Create random 3D data set with noisy signals
img = pg.gaussianFilter(np.random.normal(size=(200, 200)), (5, 5)) * 20 + 100
img = img[np.newaxis,:,:]
decay = np.exp(-np.linspace(0,0.3,100))[:,np.newaxis,np.newaxis]
data = np.random.normal(size=(100, 200, 200))
data += img * decay
data += 2
## Add time-varying signal
sig = np.zeros(data.shape[0])
sig[30:] += np.exp(-np.linspace(1,10, 70))
sig[40:] += np.exp(-np.linspace(1,10, 60))
sig[70:] += np.exp(-np.linspace(1,10, 30))
sig = sig[:,np.newaxis,np.newaxis] * 3
data[:,50:60,50:60] += sig
data[:, ::10, :] = 0 # Make image a-symmetrical
## Display the data and assign each frame a time value from 1.0 to 3.0
imv.setImage(data, xvals=np.linspace(1., 3., data.shape[0]),
axes={'t':0, 'x':2, 'y':1, 'c':3}) # doesn't help
## Start Qt event loop unless running in interactive mode.
if __name__ == '__main__':
import sys
if (sys.flags.interactive != 1) or not hasattr(QtCore, 'PYQT_VERSION'):
QtGui.QApplication.instance().exec_()
Looking through ImageView.py, setImage() parses the axes dictionary and based on presence of 't' it builds the z-axis/frame slider, and that's it. Rearranging the axes seems unimplemented yet.

Resources