Librosa Display Waveplot, why are they totally blue and flat? - wav

I followed this example of Music Synchronization with Dynamic Time Warping
However, when I do this:
import matplolib.pyplot as plt
import librosa
import librosa.display
x_1, fs = librosa.load('musicdata/slow_melody.wav')
plt.figure(figsize=(16, 4))
librosa.display.waveplot(x_1, sr=fs)
plt.title('Slower Version $X_1$')
plt.tight_layout()
and same for the faster version, I get this result:
I can properly reach the pitch classes of the wav files in chroma representations and there are no problems in the wav files.
I created the fast and slow versions of the wav files like this:
# Tone-duration sequence
melody = [('E', 0.3), ('E', 0.3), ('F', 0.3), ('G', 0.3)]
slow_melody = [('E', 0.6), ('E', 0.6), ('F', 0.6), ('G', 0.6)]
melody_output = np.array([])
# Construct the audio signal based on the chord sequence
for item in melody:
input_tone = item[0]
duration = item[1]
synthesized_tone = synthesizer(tone_freq_map[input_tone], duration, amplitude, sampling_freq)
melody_output = np.append(melody_output, synthesized_tone, axis=0)
# Write to the output file
name = 'melody' + '.wav'
write(name, sampling_freq, melody_output)
slow_melody_output = np.array([])
# Construct the audio signal based on the chord sequence
for item in slow_melody:
input_tone = item[0]
duration = item[1]
synthesized_tone = synthesizer(tone_freq_map[input_tone], duration, amplitude, sampling_freq)
slow_melody_output = np.append(slow_melody_output, synthesized_tone, axis=0)
# Write to the output file
name = 'slow_melody' + '.wav'
write(name, sampling_freq, slow_melody_output)
I get the tone frequencies from:
{
"A": 440,
"Asharp": 466,
"B": 494,
"C": 523,
"Csharp": 554,
"D": 587,
"Dsharp": 622,
"E": 659,
"F": 698,
"Fsharp": 740,
"G": 784,
"Gsharp": 831
}
Synthesizer is:
def synthesizer(freq, duration, amp=1.0, sampling_freq=44100):
# Build the time axis
t = np.linspace(0, duration, duration * sampling_freq)
# Construct the audio signal
audio = amp * np.sin(2 * np.pi * freq * t)
return audio.astype(np.int16)
Also, the input parameters are:
duration = 2
amplitude = 10000
sampling_freq = 44100
So, why couldn't I properly visualize the waveplots? What could be the reason that they appear like this?

I believe there is something wrong in the tutorial you are following. librosa.display.waveplot() doesn't plot anything by itself, you still have to call plt.show() to visualize it. From the official documentation, here's an example of it's usage:
y, sr = librosa.load(librosa.util.example_audio_file(), duration=10)
librosa.display.waveplot(y_harm, sr=sr, alpha=0.25)
plt.tight_layout()
plt.show()
You can find more info here https://librosa.github.io/librosa/generated/librosa.display.waveplot.html

Related

urequests.get cant get answer anymore

I was running code on my NodeMCU esp8266 for switching light in my fish tank according to sunset and sunrise. Today code frozen on line 8, and I spend few hours to figure out but it seems I can't find solution. I tried to add headers and proxies to
r = requests.get(url) but no success
boot.py
import network
ssid = 'SSID'
password = 'PASS'
station = network.WLAN(network.STA_IF)
station.active(True)
station.connect(ssid, password)
while station.isconnected() == False:
pass
main.py
import urequests as requests
import ujson, ntptime, utime
import ssd1306
from machine import Pin, PWM, SoftI2C
from utime import time, sleep
url = 'https://api.sunrise-sunset.org/json?lat=50.147240&lng=18.838700&formatted=0'
r = requests.get(url)
timezone_hour = 2 # timezone offset (hours)
Blue = PWM(Pin(14), 1000)
Red = PWM(Pin(12), 1000)
White_1 = PWM(Pin(13), 1000)
i2c = SoftI2C(scl=Pin(5), sda=Pin(4))
while True:
ntptime.settime()
now = utime.localtime()
day = now[0],now[1],now[2]
hours = now[3]+timezone_hour,now[4]
data = ujson.loads(r.content)
sunrise = data['results']['sunrise']
sunset = data['results']['sunset']
sunrise_time = int(sunrise[11:13])+timezone_hour, int(sunrise[14:16])
sunset_time = int(sunset[11:13])+timezone_hour, int(sunset[14:16])
hours_string = str(hours)
sunrise_time_string = str(sunrise_time)
sunset_time_string = str(sunset_time)
oled_width = 128
oled_height = 64
oled = ssd1306.SSD1306_I2C(oled_width, oled_height, i2c)
if hours > sunrise_time and hours < sunset_time:
Blue.duty(1024)
Red.duty(1024)
White_1.duty(1024)
oled.fill(0)
oled.text(hours_string, 0, 0)
oled.text(sunrise_time_string, 0, 10)
oled.text(sunset_time_string, 0, 20)
oled.text("sunrise", 0, 30)
oled.show()
print(hours_string)
else:
Blue.duty(5)
Red.duty(0)
White_1.duty(0)
oled.fill(0)
oled.text(hours_string, 0, 0)
oled.text(sunrise_time_string, 0, 10)
oled.text(sunset_time_string, 0, 20)
oled.text("sunset", 0, 30)
oled.show()
sleep(60)
if str(sunrise[0:10]) != str(day):
r = requests.get(url)
errors:
Traceback (most recent call last):
File "<stdin>", line 8, in <module>
File "urequests.py", line 116, in get
File "urequests.py", line 62, in request
OSError: -40

What problems can lead to a CuDNNError with ConvolutionND

I am using three-dimensional convolution links (with ConvolutionND) in my chain.
The forward computation run smoothly (I checked intermediate result shapes to be sure I understood correctly the meaning of the parameters of convolution_nd), but during the backward a CuDNNError is raised with the message CUDNN_STATUS_NOT_SUPPORTED.
The cover_all parameter of ConvolutionND as its default value of False, so from the doc I don't see what can be the cause of the error.
Here is how I defind one of the convolution layers :
self.conv1 = chainer.links.ConvolutionND(3, 1, 4, (3, 3, 3)).to_gpu(self.GPU_1_ID)
And the call stack is
File "chainer/function_node.py", line 548, in backward_accumulate
gxs = self.backward(target_input_indexes, grad_outputs)
File "chainer/functions/connection/convolution_nd.py", line 118, in backward
gy, W, stride=self.stride, pad=self.pad, outsize=x_shape)
File "chainer/functions/connection/deconvolution_nd.py", line 310, in deconvolution_nd
y, = func.apply(args)
File chainer/function_node.py", line 258, in apply
outputs = self.forward(in_data)
File "chainer/functions/connection/deconvolution_nd.py", line 128, in forward
return self._forward_cudnn(x, W, b)
File "chainer/functions/connection/deconvolution_nd.py", line 105, in _forward_cudnn
tensor_core=tensor_core)
File "cupy/cudnn.pyx", line 881, in cupy.cudnn.convolution_backward_data
File "cupy/cuda/cudnn.pyx", line 975, in cupy.cuda.cudnn.convolutionBackwardData_v3
File "cupy/cuda/cudnn.pyx", line 461, in cupy.cuda.cudnn.check_status
cupy.cuda.cudnn.CuDNNError: CUDNN_STATUS_NOT_SUPPORTED
So are there special points to take care of when using ConvolutionND ?
A failing code is for instance :
import chainer
from chainer import functions as F
from chainer import links as L
from chainer.backends import cuda
import numpy as np
import cupy as cp
chainer.global_config.cudnn_deterministic = False
NB_MASKS = 60
NB_FCN = 3
NB_CLASS = 17
class MFEChain(chainer.Chain):
"""docstring for Wavelphasenet."""
def __init__(self,
FCN_Dim,
gpu_ids=None):
super(MFEChain, self).__init__()
self.GPU_0_ID, self.GPU_1_ID = (0, 1) if gpu_ids is None else gpu_ids
with self.init_scope():
self.conv1 = chainer.links.ConvolutionND(3, 1, 4, (3, 3, 3)).to_gpu(
self.GPU_1_ID
)
def __call__(self, inputs):
### Pad input ###
processed_sequences = []
for convolved in inputs:
## Transform to sequences)
copy = convolved if self.GPU_0_ID == self.GPU_1_ID else F.copy(convolved, self.GPU_1_ID)
processed_sequences.append(copy)
reprocessed_sequences = []
with cuda.get_device(self.GPU_1_ID):
for convolved in processed_sequences:
convolved = F.expand_dims(convolved, 0)
convolved = F.expand_dims(convolved, 0)
convolved = self.conv1(convolved)
reprocessed_sequences.append(convolved)
states = F.vstack(reprocessed_sequences)
logits = states
ret_logits = logits if self.GPU_0_ID == self.GPU_1_ID else F.copy(logits, self.GPU_0_ID)
return ret_logits
def mfe_test():
mfe = MFEChain(150)
inputs = list(
chainer.Variable(
cp.random.randn(
NB_MASKS,
11,
in_len,
dtype=cp.float32
)
) for in_len in [53248]
)
val = mfe(inputs)
grad = cp.ones(val.shape, dtype=cp.float32)
val.grad = grad
val.backward()
for i in inputs:
print(i.grad)
if __name__ == "__main__":
mfe_test()
cupy.cuda.cudnn.convolutionBackwardData_v3 is incompatible with some specific parameters, as described in an issue in official github.
Unfortunately, the issue only dealt with deconvolution_2d.py (not deconvolution_nd.py), therefore the decision-making about whether cudnn is used or not failed in your case, I guess.
you can check your parameter by confirming
check whether dilation parameter (!=1) or group parameter (!=1) is passed to the convolution.
print chainer.config.cudnn_deterministic, configuration.config.autotune, and configuration.config.use_cudnn_tensor_core.
Further support may be obtained by raising an issue in the official github.
The code you showed is much complicated.
To clarify the problem, the code below would help.
from chainer import Variable, Chain
from chainer import links as L
from chainer import functions as F
import numpy as np
from six import print_
batch_size = 1
in_channel = 1
out_channel = 1
class MyLink(Chain):
def __init__(self):
super(MyLink, self).__init__()
with self.init_scope():
self.conv = L.ConvolutionND(3, 1, 1, (3, 3, 3), nobias=True, initialW=np.ones((in_channel, out_channel, 3, 3, 3)))
def __call__(self, x):
return F.sum(self.conv(x))
if __name__ == "__main__":
my_link = MyLink()
my_link.to_gpu(0)
batch = Variable(np.ones((batch_size, in_channel, 3, 3, 3)))
batch.to_gpu(0)
loss = my_link(batch)
loss.backward()
print_(batch.grad)

How to use pykalman filter_update for online regression

I want to use Kalman regression recursively on an incoming stream of price data using kf.filter_update() but I can't make it work. Here's the example code framing the problem:
The dataset (i.e. the stream):
DateTime CAT DOG
2015-01-02 09:01:00, 1471.24, 9868.76
2015-01-02 09:02:00, 1471.75, 9877.75
2015-01-02 09:03:00, 1471.81, 9867.70
2015-01-02 09:04:00, 1471.59, 9849.03
2015-01-02 09:05:00, 1471.45, 9840.15
2015-01-02 09:06:00, 1471.16, 9852.71
2015-01-02 09:07:00, 1471.30, 9860.24
2015-01-02 09:08:00, 1471.39, 9862.94
The data is read into a Pandas dataframe and the following code simulates the stream by iterating over the df:
df = pd.read_csv('data.txt')
df.dropna(inplace=True)
history = {}
history["spread"] = []
history["state_means"] = []
history["state_covs"] = []
for idx, row in df.iterrows():
if idx == 0: # Initialize the Kalman filter
delta = 1e-9
trans_cov = delta / (1 - delta) * np.eye(2)
obs_mat = np.vstack([df.iloc[0].CAT, np.ones(df.iloc[0].CAT.shape)]).T[:, np.newaxis]
kf = KalmanFilter(n_dim_obs=1, n_dim_state=2,
initial_state_mean=np.zeros(2),
initial_state_covariance=np.ones((2, 2)),
transition_matrices=np.eye(2),
observation_matrices=obs_mat,
observation_covariance=1.0,
transition_covariance=trans_cov)
state_means, state_covs = kf.filter(np.asarray(df.iloc[0].DOG))
history["state_means"], history["state_covs"] = state_means, state_covs
slope=state_means[:, 0]
print "SLOPE", slope
else:
state_means, state_covs = kf.filter_update(history["state_means"][-1], history["state_covs"][-1], observation = np.asarray(df.iloc[idx].DOG))
history["state_means"].append(state_means)
history["state_covs"].append(state_covs)
slope=state_means[:, 0]
print "SLOPE", slope
The Kalman filter initializes properly and I get the first regression coefficient, but the subsequent updates throws an exception:
Traceback (most recent call last):
SLOPE [ 6.70319125]
File "C:/Users/.../KalmanUpdate_example.py", line 50, in <module>
KalmanOnline(df)
File "C:/Users/.../KalmanUpdate_example.py", line 43, in KalmanOnline
state_means, state_covs = kf.filter_update(history["state_means"][-1], history["state_covs"][-1], observation = np.asarray(df.iloc[idx].DOG))
File "C:\Python27\Lib\site-packages\pykalman\standard.py", line 1253, in filter_update
2, "observation_matrix"
File "C:\Python27\Lib\site-packages\pykalman\standard.py", line 38, in _arg_or_default
+ ' You must specify it manually.') % (name,)
ValueError: observation_matrix is not constant for all time. You must specify it manually.
Process finished with exit code 1
It seems intuitively clear that the observation matrix is required (it's provided in the initial step, but not in the updating steps), but I cannot figure out how to set it up properly. Any feedback would be highly appreciated.
Pykalman allows you to declare the observation matrix in two ways:
[n_timesteps, n_dim_obs, n_dim_obs] - once for the whole estimation
[n_dim_obs, n_dim_obs] - separately for each estimation step
In your code you used the first option (that's why "observation_matrix is not constant for all time"). But then you used filter_update in the loop and Pykalman could not understand what to use as the observation matrix in each iteration.
I would declare the observation matrix as a 2-element array:
from pykalman import KalmanFilter
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data.txt')
df.dropna(inplace=True)
n = df.shape[0]
n_dim_state = 2;
history_state_means = np.zeros((n, n_dim_state))
history_state_covs = np.zeros((n, n_dim_state, n_dim_state))
for idx, row in df.iterrows():
if idx == 0: # Initialize the Kalman filter
delta = 1e-9
trans_cov = delta / (1 - delta) * np.eye(2)
obs_mat = [df.iloc[0].CAT, 1]
kf = KalmanFilter(n_dim_obs=1, n_dim_state=2,
initial_state_mean=np.zeros(2),
initial_state_covariance=np.ones((2, 2)),
transition_matrices=np.eye(2),
observation_matrices=obs_mat,
observation_covariance=1.0,
transition_covariance=trans_cov)
history_state_means[0], history_state_covs[0] = kf.filter(np.asarray(df.iloc[0].DOG))
slope=history_state_means[0, 0]
print "SLOPE", slope
else:
obs_mat = np.asarray([[df.iloc[idx].CAT, 1]])
history_state_means[idx], history_state_covs[idx] = kf.filter_update(history_state_means[idx-1],
history_state_covs[idx-1],
observation = df.iloc[idx].DOG,
observation_matrix=obs_mat)
slope=history_state_means[idx, 0]
print "SLOPE", slope
plt.figure(1)
plt.plot(history_state_means[:, 0], label="Slope")
plt.grid()
plt.show()
It results in the following output:
SLOPE 6.70322464199
SLOPE 6.70512037269
SLOPE 6.70337808649
SLOPE 6.69956406785
SLOPE 6.6961767953
SLOPE 6.69558438828
SLOPE 6.69581682668
SLOPE 6.69617670459
The Pykalman is not really good documented and there are mistakes on the official page. That's why I recomend to test the result using the offline estimation in one step. In this case the observation matrix has to be declared as you did it in your code.
from pykalman import KalmanFilter
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data.txt')
df.dropna(inplace=True)
delta = 1e-9
trans_cov = delta / (1 - delta) * np.eye(2)
obs_mat = np.vstack([df.iloc[:].CAT, np.ones(df.iloc[:].CAT.shape)]).T[:, np.newaxis]
kf = KalmanFilter(n_dim_obs=1, n_dim_state=2,
initial_state_mean=np.zeros(2),
initial_state_covariance=np.ones((2, 2)),
transition_matrices=np.eye(2),
observation_matrices=obs_mat,
observation_covariance=1.0,
transition_covariance=trans_cov)
state_means, state_covs = kf.filter(df.iloc[:].DOG)
print "SLOPE", state_means[:, 0]
plt.figure(1)
plt.plot(state_means[:, 0], label="Slope")
plt.grid()
plt.show()
The result is the same.

PyTorch RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed

I’m trying to create a basic binary classifier in Pytorch that classifies whether my player plays on the right or the left side in the game Pong. The input is an 1x42x42 image and the label is my player's side (right = 1 or left = 2). The code:
class Net(nn.Module):
def __init__(self, input_size, hidden_size, num_classes):
super(Net, self).__init__()
self.fc1 = nn.Linear(input_size, hidden_size)
self.relu = nn.ReLU()
self.fc2 = nn.Linear(hidden_size, num_classes)
def forward(self, x):
out = self.fc1(x)
out = self.relu(out)
out = self.fc2(out)
return out
net = Net(42 * 42, 100, 2)
# Loss and Optimizer
criterion = nn.CrossEntropyLoss()
optimizer_net = torch.optim.Adam(net.parameters(), 0.001)
net.train()
while True:
state = get_game_img()
state = torch.from_numpy(state)
# right = 1, left = 2
current_side = get_player_side()
target = torch.LongTensor(current_side)
x = Variable(state.view(-1, 42 * 42))
y = Variable(target)
optimizer_net.zero_grad()
y_pred = net(x)
loss = criterion(y_pred, y)
loss.backward()
optimizer.step()
The error I get:
File "train.py", line 109, in train
loss = criterion(y_pred, y)
File "/home/shani/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/shani/anaconda2/lib/python2.7/site-packages/torch/nn/modules/loss.py", line 321, in forward
self.weight, self.size_average)
File "/home/shani/anaconda2/lib/python2.7/site-packages/torch/nn/functional.py", line 533, in cross_entropy
return nll_loss(log_softmax(input), target, weight, size_average)
File "/home/shani/anaconda2/lib/python2.7/site-packages/torch/nn/functional.py", line 501, in nll_loss
return f(input, target)
File "/home/shani/anaconda2/lib/python2.7/site-packages/torch/nn/_functions/thnn/auto.py", line 41, in forward
output, *self.additional_args)
RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /py/conda-bld/pytorch_1493676237139/work/torch/lib/THNN/generic/ClassNLLCriterion.c:57
For most of deeplearning library, target(or label) should start from 0.
It means that your target should be in the range of [0,n) with n-classes.
It looks like PyTorch expect to get zero-based labels (0/1 in your case) and you probably feed it with one-based labels (1/2)
I had the same error in my program and i just realized that the problem was in the number of output nodes in my neural network
In my program the number of output nodes of my model was not equal to the number of labels in dataset
the number of output was 1 and the number of target labels was 10. then i changed the number of output to 10, there was no error

Categorical image classification always predicts one class, though calculated accuracy reaches 100%

I followed the Keras cat/dog image classification tutorial
Keras Image Classification tutorial
and found similar results to the reported values. I then took the code from the first example in that tutorial Tutorial Example 1 code, slightly altered a few lines, and trained the model for a dataset of grayscale images (~150 thousand images across 7 classes).
This gave me great initial results ( ~84% accuracy), which I am happy with.
Next I tried implementing the image batch generator myself, which is where I am having trouble. Briefly, the code seems to run well, except the reported accuracy of the model quickly shoots to >= 99% within two epochs. Due to noise in the dataset, this amount of accuracy is not believable. After using the trained model to predict a new batch of data ( images outside of the training or validation dataset ), I find the model always predicts the first class ( i.e. [1.,0.,0.,0.,0.,0.,0.]. The loss function is forcing the model to predict a single class 100% of the time, even though the labels I pass in are distributed across all the classes.
After 28 epochs of training, I see the following output:
320/320 [==============================] - 1114s - loss: 1.5820e-07 - categorical_accuracy: 1.0000 - sparse_categorical_accuracy: 0.0000e+00 - val_loss: 16.1181 - val_categorical_accuracy: 0.0000e+00 - val_sparse_categorical_accuracy: 0.0000e+00
When I examine the batch generator output from the tutorial code, and compare my batch generator output, the shape, datatype, and range of values are identical between both generators. I would like to emphasize that the generator passes y labels from each category, not just array([ 1.., 0., 0., 0., 0., 0., 0.], dtype=float32). Therefore, I am lost as to what I am doing incorrectly.
Since I posted this code several days ago, I have used the default Keras image generator, and successfully trained the network on the same dataset and same network architecture. Therefore, something about how I load and pass the data in the generator must be incorrect.
Here is the code I implemented:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.optimizers import SGD
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
import imgaug as ia
from imgaug import augmenters as iaa
import numpy as np
import numpy.random as nprand
import imageio
import os, re, random, sys, csv
import scipy
img_width, img_height = 112, 112
input_shape = (img_width,img_height,1)
batch_size = 200
epochs = 2
train_image_directory = '/PATH/To/Directory/train/'
valid_image_directory = '/PATH/To/Directory/validate/'
video_info_file = '/PATH/To/Directory/train_labels.csv'
train_image_paths = [train_image_directory + m.group(1) for m in [re.match(r"(\d+_\d+\.png)", fname) for fname in os.listdir(train_image_directory)] if m is not None]
valid_image_paths = [valid_image_directory + m.group(1) for m in [re.match(r"(\d+_\d+\.png)", fname) for fname in os.listdir(valid_image_directory)] if m is not None]
num_train_images = len(train_image_paths)
num_val_images = len(valid_image_paths)
label_map = {}
label_decode = {
'0': [1.,0.,0.,0.,0.,0.,0.],
'1': [0.,1.,0.,0.,0.,0.,0.],
'2': [0.,0.,1.,0.,0.,0.,0.],
'3': [0.,0.,0.,1.,0.,0.,0.],
'4': [0.,0.,0.,0.,1.,0.,0.],
'5': [0.,0.,0.,0.,0.,1.,0.],
'6': [0.,0.,0.,0.,0.,0.,1.]
}
with open(video_info_file) as f:
reader = csv.reader(f)
for row in reader:
key = row[0]
if key in label_map:
pass
label_map[key] = label_decode[row[1]]
sometimes = lambda aug: iaa.Sometimes(0.5,aug)
seq = iaa.Sequential(
[
iaa.Fliplr(0.5),
iaa.Flipud(0.2),
sometimes(iaa.Crop(percent=(0, 0.1))),
sometimes(iaa.Affine(
scale={"x": (0.8, 1.2), "y": (0.8, 1.2)},
translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)},
rotate=(-5, 5),
shear=(-16, 16),
order=[0, 1],
cval=(0, 1),
mode=ia.ALL
)),
iaa.SomeOf((0, 3),
[
sometimes(iaa.Superpixels(p_replace=(0, 0.40), n_segments=(20, 100))),
iaa.Sharpen(alpha=(0, 1.0), lightness=(0.75, 1.5)),
iaa.Emboss(alpha=(0, 1.0), strength=(0, 1.0)),
iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255)),
iaa.OneOf([
iaa.Dropout((0.01, 0.1)),
iaa.CoarseDropout((0.03, 0.15), size_percent=(0.02, 0.05)),
]),
iaa.Invert(0.05),
iaa.Add((-10, 10)),
iaa.Multiply((0.5, 1.5), per_channel=0.5),
iaa.ContrastNormalization((0.5, 2.0)),
sometimes(iaa.ElasticTransformation(alpha=(0.5, 1.5), sigma=0.2)),
sometimes(iaa.PiecewiseAffine(scale=(0.01, 0.03))) # sometimes move parts of the image around
],
random_order=True
)
],
random_order=True)
def image_data_generator(image_paths, labels, batch_size, training):
while(1):
image_paths = nprand.choice(image_paths, batch_size)
X0 = np.asarray([imageio.imread(x) for x in image_paths])
Y = np.asarray([labels[x] for x in image_paths],dtype=np.float32)
if(training):
X = np.divide(np.expand_dims(seq.augment_images(X0)[:,:,:,0],axis=3),255.)
else:
X = np.expand_dims(np.divide(X0[:,:,:,0],255.),axis=3)
X = np.asarray(X,dtype=np.float32)
yield X,Y
def predict_videos(model,video_paths):
i=0
predictions=[]
while(i < len(video_paths)):
video_reader = imageio.get_reader(video_paths[i])
X0 = np.expand_dims([ im[:,:,0] for x,im in enumerate(video_reader) ],axis=3)
prediction = model.predict(X0)
i=i+1
predictions.append(prediction)
return predictions
train_gen = image_data_generator(train_image_paths,label_map,batch_size,True)
val_gen = image_data_generator(valid_image_paths,label_map,batch_size,False)
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.4))
model.add(Dense(7))
model.add(Activation('softmax'))
model.load_weights('/PATH/To_pretrained_weights/pretrained_model.h5')
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer='sgd',
metrics=['categorical_accuracy','sparse_categorical_accuracy'])
checkpointer = ModelCheckpoint('/PATH/To_pretrained_weights/pretrained_model.h5', monitor='val_loss', verbose=0, save_best_only=True, save_weights_only=False, mode='auto', period=1)
reduceLR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=20, verbose=0, mode='auto', cooldown=0, min_lr=0)
early_stop = EarlyStopping(monitor='val_loss', patience=20, verbose=1)
callbacks_list = [checkpointer, early_stop, reduceLR]
model.fit_generator(
train_gen,
steps_per_epoch = -(-num_train_images // batch_size),
epochs=epochs,
validation_data=val_gen,
validation_steps = -(-num_val_images // batch_size),
callbacks=callbacks_list)
For some reason that I cannot fully determine, if you do not give the fit_generator function accurate numbers for steps per epoch or steps for validation, the result is inaccurate reporting of the accuracy metric and strange gradient descent steps.
You can fix this problem by using the Train_on_batch function in Keras instead of the fit generator, or by accurately reporting these step numbers.

Resources