Calculate RSI indicator according to tradingview? - python-3.6

I would like to calculate RSI 14 in line with the TradingView chart.
According to their wiki, this should be the solution:
https://www.tradingview.com/wiki/Talk:Relative_Strength_Index_(RSI)
I implemented this in an object called RSI.
Calling within the RSI object:
self.df['rsi1'] = self.calculate_RSI_method_1(self.df, period=self.period)
Implementation of the calculation code:
def calculate_RSI_method_1(self, ohlc: pd.DataFrame, period: int = 14) -> pd.Series:
    delta = ohlc["close"].diff()
    ohlc['up'] = delta.copy()
    ohlc['down'] = delta.copy()
    ohlc['up'] = pd.to_numeric(ohlc['up'])
    ohlc['down'] = pd.to_numeric(ohlc['down'])
    ohlc['up'][ohlc['up'] < 0] = 0
    ohlc['down'][ohlc['down'] > 0] = 0
    # This one below is not correct, but why?
    ohlc['_gain'] = ohlc['up'].ewm(com=(period - 1), min_periods=period).mean()
    ohlc['_loss'] = ohlc['down'].abs().ewm(com=(period - 1), min_periods=period).mean()
    ohlc['RS'] = ohlc['_gain'] / ohlc['_loss']
    ohlc['rsi'] = pd.Series(100 - (100 / (1 + ohlc['RS'])))
    self.currentvalue = round(self.df['rsi'].iloc[-1], 8)
    print(self.currentvalue)
    self.exportspreadsheetfordebugging(ohlc, 'calculate_RSI_method_1', self.symbol)
I tested several other solutions, e.g. the following, but none return a good value:
https://github.com/peerchemist/finta
https://gist.github.com/jmoz/1f93b264650376131ed65875782df386
Therefore I created a unittest based on:
https://school.stockcharts.com/doku.php?id=technical_indicators:relative_strength_index_rsi
I created an input file (see Excel image below) and an output file (see Excel image below).
Running the unittest (unittest code not included here) should pass the check below; it only checks the last value:
if result == 37.77295211:
    log.info("Unit test 001 - PASSED")
    return True
else:
    log.error("Unit test 001 - NOT PASSED")
    return False
But again, I cannot pass the test.
I checked all values with the help of Excel, so now I'm a little bit lost.
I also followed this question:
Calculate RSI indicator from pandas DataFrame?
But this does not give any value in the gain.
a) How should the calculation be done in order to pass the unittest?
b) How should the calculation be done in order to align with TradingView?

Here is a Python implementation of the current RSI indicator version in TradingView:
https://github.com/lukaszbinden/rsi_tradingview/blob/main/rsi.py

I had the same issue: my calculated RSI differed from TradingView.
I found the RSI Step 2 formula described on Investopedia and changed the code as below:
N = 14
close_price0 = float(klines[0][4])
gain_avg0 = loss_avg0 = close_price0
for kline in klines[1:]:
    close_price = float(kline[4])
    if close_price > close_price0:
        gain = close_price - close_price0
        loss = 0
    else:
        gain = 0
        loss = close_price0 - close_price
    close_price0 = close_price
    # Wilder's smoothing: weight the previous average by (N - 1)/N
    # and the newest gain/loss by 1/N
    gain_avg = (gain_avg0 * (N - 1) + gain) / N
    loss_avg = (loss_avg0 * (N - 1) + loss) / N
    rsi = 100 - 100 / (1 + gain_avg / loss_avg)
    gain_avg0 = gain_avg
    loss_avg0 = loss_avg
N is the number of periods for calculating the RSI (14 by default).
The code is put in a loop to calculate all RSI values for a series.
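For comparison, here is a minimal pandas sketch of the same Wilder smoothing (the RMA that TradingView's built-in RSI uses). Note that ewm(com=period - 1) already gives the right alpha = 1/period, but pandas defaults to adjust=True; passing adjust=False is what makes the recursive formula above, and TradingView's values, come out:
import pandas as pd

def rsi_wilder(close: pd.Series, period: int = 14) -> pd.Series:
    # Wilder's smoothing is an exponential moving average with
    # alpha = 1/period applied recursively, i.e. adjust=False in pandas.
    delta = close.diff()
    gain = delta.clip(lower=0)
    loss = -delta.clip(upper=0)
    avg_gain = gain.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)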

For those who experience the same:
my raw data contained ticks where the volume is zero. Filtering out these OHLCV rows directly gives the correct results.
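As a sketch of that filter (assuming a DataFrame with a 'volume' column; names are hypothetical):
import pandas as pd

def drop_zero_volume(ohlcv: pd.DataFrame) -> pd.DataFrame:
    # Keep only rows where some volume was traded, so flat
    # zero-volume ticks don't distort the gain/loss averages.
    return ohlcv[ohlcv['volume'] > 0]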

Related

For loop in R with many values

I have equations which I would like to test with many different values to find the one solution for those equations. The idea is to find T_out; all the other values are known. For this I want to test values from 45 down to 30 in 0.001 steps.
T_out = 326.5
for (i in seq(45, 30, by = -0.001)) {
    T_out = T_out - i
    Q_in = 0.16*0.8*(316-300.4-T_out-300.4) / (log(316-300.4 / T_out-300.4))
    Q_out = 0.00762*1512*(316-T_out)
    if (Q_in - Q_out == 0) {
        break
    }
    T_out_new = T_out - i
}
But nothing happens. Do you know what the mistake is?
These are vectorised functions, so you don't need a for loop for this.
T_out = 326.5
vals = seq(45, 30, by = -0.001)
T_val = T_out - vals
Q_in = 0.16*0.8*(316-300.4-T_val-300.4) / (log(316-300.4 / T_val-300.4))
Q_out = 0.00762*1512*(316-T_val)
vals[which((Q_in - Q_out) == 0)]
However, none of the numbers satisfies the condition exactly, and it returns numeric(0). Maybe you need to adjust some values?
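Exact floating-point equality is the likely reason nothing matches. A sketch of the usual fix (shown in Python/NumPy to match the rest of this page; in R the same idea is which.min(abs(Q_in - Q_out))): pick the candidate with the smallest residual instead of demanding exactly zero:
import numpy as np

T_out = 326.5
vals = np.arange(45, 30, -0.001)
T_val = T_out - vals
# Same formulas as in the question, operator precedence kept as written
Q_in = 0.16 * 0.8 * (316 - 300.4 - T_val - 300.4) / np.log(316 - 300.4 / T_val - 300.4)
Q_out = 0.00762 * 1512 * (316 - T_val)
best = np.argmin(np.abs(Q_in - Q_out))  # smallest residual instead of exact zero
print(vals[best], Q_in[best] - Q_out[best])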

Improving the speed

Here is my code in Julia, and I would like to improve its speed since it is slow for large datasets. I have provided the code with a small example so it can be executed and produce the results. I think the bottleneck is using the find function in the loop, which makes the code very slow, but I don't know how to replace it with something faster.
A = [[1,2,3,4,5], [2,3,4,5,6,7,8], [4,7,8,9], [9,10], [2,3,4,5]]
mx = maximum(maximum.(A))
idx_new = zeros(Int, mx)
flag = ones(Int, mx)
Hscore = rand(1, length(A))
thresh = 0.2 * sum(Hscore)
acc_q = 0
pos = sortperm(vec(Hscore))
iter = 1
val = 0
while acc_q < thresh
    acc_q = acc_q + Hscore[pos[iter]]
    nd = A[pos[iter]]
    fd_flag = flag[nd]
    cc = in.(fd_flag, 2)
    node = nd[findall(x -> x == 0, cc)]
    dd = nd[findall(x -> x != 0, cc)]
    TF = isempty(dd)
    if TF == true
        q_val = Hscore[pos[iter]]
        acc_q = acc_q + q_val
        idx_new[vec(node)] .= (val + 1)
        flag[node] .= 2
        val = val + 1
        iter = iter + 1
    end # end of if TF
end # end of while loop
While "please improve my code" is not a right question style for StackOverflow, generally when searching many times for element among many many options these are the first two that you might consider:
Sort the list of elements (with sort!) and use searchsorted to find the desired element
Use Set(mylist) to create a hash set and than search within the set.
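A rough sketch of both strategies, shown in Python to match the rest of this page (sort!/searchsorted and Set are the Julia counterparts of sorted/bisect and set):
import bisect

options = [11, 3, 9, 1, 7]

# Strategy 1: sort once, then binary-search each query in O(log n).
sorted_opts = sorted(options)
def contains_sorted(x):
    i = bisect.bisect_left(sorted_opts, x)
    return i < len(sorted_opts) and sorted_opts[i] == x

# Strategy 2: build a hash set once; each membership test is O(1) on average.
opt_set = set(options)

print(contains_sorted(9), 9 in opt_set)  # True True
print(contains_sorted(4), 4 in opt_set)  # False False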

The Julia Language - JuMP package, unexpected model feasibility result using GLPK

I'm new to both the Julia language and its optimization package JuMP. I'm trying to solve a really simple optimization problem; the objective is the minimization of the fixed costs for a facility location problem:
o = Model(with_optimizer(GLPK.Optimizer))
@variable(o, x[i = 1:size(candidates, 1)], Bin) # creating variables for each possible location
@variable(o, ϑ[i = 1:size(demand, 2)] >= 0) # creating one variable for each demand scenario (in this case only one scenario is considered)
@objective(o, Min, dot(x, cost_facility) * (1 + M) + sum(capacities .* x .* TH .* lab_cost_plant) + 1/size(demand, 2) * sum(ϑ))
@constraint(o, 30000x[1] + 20000x[2] + 30000x[3] + 20000x[4] >= 1500.0)
@constraint(o, ϑ[1] + 1.6317502375004835e10x[1] + 1.0878334916669891e10x[2] + 1.6318862646433956e10x[3] >= 2.9076517866671824e9)
JuMP.optimize!(o)
st = MOI.get(o, MOI.TerminationStatus())
@info "Status $st"
And I get the following result:
┌ Info: Status INFEASIBLE
└ @ Main In[37]:3
I couldn't see why such a problem could be infeasible, considering that those two constraints were the only ones present. So I tried to modify them to understand what was wrong, and it turned out that by substituting the second constraint with an equality constraint (and keeping everything else unchanged):
@constraint(o, ϑ[1] + 1.6317502375004835e10x[1] + 1.0878334916669891e10x[2] + 1.6318862646433956e10x[3] == 2.9076517866671824e9)
An optimal solution is found:
┌ Info: Status OPTIMAL
└ @ Main In[43]:3
I couldn't find any explanation for that; shouldn't the first problem be feasible too? Is there any mistake in the code? Thank you in advance for your help.
candidate_plant = ["Roma", "London", "Berlin", "Milano"]
candidate_whs = ["Munich", "Glasgow","Madrid", "Lione"]
capacity_plant = [30000, 20000, 30000, 20000]
capacity_whs = [400000, 300000, 400000, 300000]
cost_plant = [174739, 293932, 174739, 293932]
cost_whs = [124739, 213932, 124739, 213932]
demand = [10000; 10000; 10000]
TH = 20
M = 0.3
lab_cost_plant = 8.64
candidates = vcat(candidate_plant, candidate_whs)
capacities = vcat(capacity_plant, capacity_whs)
cost_facility = vcat(cost_plant, cost_whs)
This is the input data used.
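For what it's worth, the asker's intuition can be checked by hand. A quick sketch (plain Python arithmetic, not JuMP) evaluating both inequality constraints at the candidate point x = (1, 1, 1, 1), ϑ₁ = 0 shows it satisfies both, so the >= model does admit feasible points:
# Evaluate both constraints at x = (1, 1, 1, 1), theta_1 = 0
x = [1, 1, 1, 1]
theta1 = 0.0
c1 = 30000*x[0] + 20000*x[1] + 30000*x[2] + 20000*x[3]
c2 = theta1 + 1.6317502375004835e10*x[0] + 1.0878334916669891e10*x[1] + 1.6318862646433956e10*x[2]
print(c1 >= 1500.0)                # True
print(c2 >= 2.9076517866671824e9)  # True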

Tensorflow: 6 layer CNN: OOM (use 10Gb GPU memory)

I am using the following code for running a 6-layer CNN with 2 FC layers on top (on a Tesla K80 GPU).
Somehow it consumes the entire 10 GB of memory and dies out of memory. I know that I can reduce the batch_size and then run it, but I also want to run with 15 or 20 CNN layers. What's wrong with the following code, and why does it take all the memory? How should I run the code for a 15-layer CNN?
Code:
import model

with tf.Graph().as_default() as g_train:
    filenames = tf.train.match_filenames_once(FLAGS.train_dir + '*.tfrecords')
    filename_queue = tf.train.string_input_producer(filenames, shuffle=True, num_epochs=FLAGS.num_epochs)
    feats, labels = get_batch_input(filename_queue, batch_size=FLAGS.batch_size)
    ### feats size=(batch_size, 100, 50)
    logits = model.inference(feats, FLAGS.batch_size)
    loss = model.loss(logits, labels, feats)
    tvars = tf.trainable_variables()
    global_step = tf.Variable(0, name='global_step', trainable=False)
    # Add to the Graph operations that train the model.
    train_op = model.training(loss, tvars, global_step, FLAGS.learning_rate, FLAGS.clip_gradients)
    # Add the Op to compare the logits to the labels during evaluation.
    eval_correct = model.evaluation(logits, labels, feats)
    summary_op = tf.merge_all_summaries()
    saver = tf.train.Saver(tf.all_variables(), max_to_keep=15)
    # The op for initializing the variables.
    init_op = tf.initialize_all_variables()
    sess = tf.Session()
    sess.run(init_op)
    summary_writer = tf.train.SummaryWriter(FLAGS.model_dir, graph=sess.graph)
    # Start input enqueue threads.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        step = 0
        while not coord.should_stop():
            _, loss_value = sess.run([train_op, loss])
            if step % 100 == 0:
                print('Step %d: loss = %.2f' % (step, loss_value))
                # Update the events file.
                summary_str = sess.run(summary_op)
                summary_writer.add_summary(summary_str, step)
            if (step == 0) or (step + 1) % 1000 == 0 or (step + 1) == FLAGS.max_steps:
                ckpt_model = os.path.join(FLAGS.model_dir, 'model.ckpt')
                saver.save(sess, ckpt_model, global_step=step)
                #saver.save(sess, FLAGS.model_dir, global_step=step)
            step += 1
    except tf.errors.OutOfRangeError:
        print('Done training for %d epochs, %d steps.' % (FLAGS.num_epochs, step))
    finally:
        coord.join(threads)
        sess.close()
###################### File model.py ####################
def conv2d(x, W, b, strides=1):
    # Conv2D wrapper, with bias and relu activation
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)

def maxpool2d(x, k=2, s=2):
    # MaxPool2D wrapper
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, s, s, 1], padding='SAME')

def inference(feats, batch_size):
    # feats size (batch_size,100,50,1) #batch_size=256
    conv1_w = tf.get_variable("conv1_w", [filter_size, filter_size, 1, 256],
                              initializer=tf.uniform_unit_scaling_initializer())
    conv1_b = tf.get_variable("conv1_b", [256])
    conv1 = conv2d(feats, conv1_w, conv1_b, 2)
    conv1 = maxpool2d(conv1, k=2, s=2)
    ### This was replicated for 6 layers and the 2 FC connected layers are added
    return logits

def training(loss, train_vars, global_step, learning_rate, clip_gradients):
    # Add a scalar summary for the snapshot loss.
    tf.scalar_summary(loss.op.name, loss)
    grads, _ = tf.clip_by_global_norm(tf.gradients(loss, train_vars, aggregation_method=1), clip_gradients)
    optimizer = tf.train.AdamOptimizer(learning_rate)
    train_op = optimizer.apply_gradients(zip(grads, train_vars), global_step=global_step)
    return train_op
I am not too sure what the model Python library is. If it is something you wrote and you can change the settings in the optimizer, I would suggest the following, which I use in my own code:
train_step = tf.train.AdamOptimizer(learning_rate).minimize(cost, aggregation_method = tf.AggregationMethod.EXPERIMENTAL_ACCUMULATE_N)
By default the aggregation_method is ADD_N, but if you change it to EXPERIMENTAL_ACCUMULATE_N or EXPERIMENTAL_TREE this can greatly save memory. The main memory hog in these programs is that TensorFlow must save the output values at every neuron so that it can compute the gradients. Changing the aggregation_method helps a lot in my experience.
Also, by the way, I don't think there is anything wrong with your code; I can run out of memory on small conv-nets as well.
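To see why the saved activations dominate, here is a back-of-envelope sketch (assuming the shapes mentioned in the question: batch 256, 100x50 input, 256 filters, a stride-2 first conv, float32 activations):
# Rough activation memory for just the first conv layer
batch, h, w, filters = 256, 100, 50, 256
out_h, out_w = h // 2, w // 2       # stride-2 'SAME' conv halves each dimension
bytes_per_float = 4                 # float32
conv1_bytes = batch * out_h * out_w * filters * bytes_per_float
print(conv1_bytes / 2**20, 'MiB')   # ~312 MiB for one layer's outputs
Multiply that by the number of layers, plus the gradient buffers that ADD_N aggregation keeps alive simultaneously, and 10 GB disappears quickly.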

vectorize complex slicing with pandas dataframe

I'd like to be able to vectorize this piece of code for speed. The purpose is to calculate a function, in this case a standard deviation, from a tuple of pairs of dates that are contained in two separate arrays.
import pandas as pd
import numpy as np
asd_1 = pd.Series(0.01 * np.random.randn(252), index=pd.date_range('2011-1-1', periods=252))
index_1 = pd.to_datetime(['2011-2-2', '2011-4-3', '2011-5-1',])
index_2 = pd.to_datetime(['2011-2-15', '2011-4-16', '2011-5-17',])
index_tot = list(zip(index_1,index_2))
aux_learning_std = pd.DataFrame([np.nanstd(asd_1.loc[i:j]) for i, j in index_tot], index=index_1)
The solution, which works, is performed through a loop, but I'd rather vectorize it through numpy/pandas, which is much faster. Initially I thought about using something like:
df_aux = pd.concat([asd_1 for _ in range(len(index_1))], axis=1)
results = df_aux.apply(lambda x: np.nanstd(x.loc[i,j]), axis = 0)
but here I fail to put together the vectors into one operation.
Any and all advice is welcome.
P.S.: below there is an image for explanatory purposes.
Vectorized standard deviation across ranges in an array
def get_ranges_arr(starts, ends):
    # Taken from http://stackoverflow.com/a/37626057/3293881
    counts = ends - starts
    counts_csum = counts.cumsum()
    id_arr = np.ones(counts_csum[-1], dtype=int)
    id_arr[0] = starts[0]
    id_arr[counts_csum[:-1]] = starts[1:] - ends[:-1] + 1
    return id_arr.cumsum()

def ranged_std(arr, starts, ends):
    # Get all indices and the IDs corresponding to same groups
    idx = get_ranges_arr(starts, ends)
    id_arr = np.repeat(np.arange(starts.size), ends - starts)
    # Extract relevant data
    slice_arr = arr[idx]
    # Simulate standard deviation implementation for a number of groups
    # using id_arr as the basis to perform various mathematical operations
    # within each group. Since std. deviation performs sum/mean reduction,
    # we can simply use np.bincount for an efficient implementation.
    # Std. deviation formula used:
    # https://github.com/numpy/numpy/blob/v1.11.0/numpy/core/fromnumeric.py#L2939
    grp_counts = np.bincount(id_arr)
    mean_vals = np.bincount(id_arr, slice_arr) / grp_counts
    abs_vals = np.abs(slice_arr - mean_vals[id_arr]) ** 2
    return np.sqrt(np.bincount(id_arr, abs_vals) / grp_counts)
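To see what get_ranges_arr actually builds, here is a quick check with the same starts/ends as the sample run below; it is the concatenation of np.arange(s, e) for every (s, e) pair, computed without a Python-level loop:
import numpy as np

starts = np.array([2, 6, 11])
ends = np.array([8, 9, 15])
print(get_ranges_arr(starts, ends))
# [ 2  3  4  5  6  7  6  7  8 11 12 13 14]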
Sample run (verify against a loopy version)
In [173]: arr = np.random.randint(0,9,(20))
In [174]: starts = np.array([2,6,11])
In [175]: ends = np.array([8,9,15])
In [176]: [np.std(arr[i:j]) for i,j in zip(starts,ends)]
Out[176]: [1.9720265943665387, 0.81649658092772603, 0.82915619758884995]
In [177]: ranged_std(arr,starts,ends)
Out[177]: array([ 1.97202659, 0.81649658, 0.8291562 ])
Runtime test
Case #1 : Very small number of ranges 3
In [21]: arr = np.random.randint(0,9,(20))
In [22]: starts = np.array([2,6,11])
In [23]: ends = np.array([8,9,15])
In [24]: %timeit [np.std(arr[i:j]) for i,j in zip(starts,ends)]
10000 loops, best of 3: 146 µs per loop
In [25]: %timeit ranged_std(arr,starts,ends)
10000 loops, best of 3: 45 µs per loop
Case #2 : Decent number of ranges 1000
In [32]: arr = np.random.randint(0,9,(1010))
In [33]: starts = np.random.randint(0,9,(1000))
In [34]: ends = starts + np.random.randint(0,9,(1000))
In [35]: %timeit [np.std(arr[i:j]) for i,j in zip(starts,ends)]
10 loops, best of 3: 47.5 ms per loop
In [36]: %timeit ranged_std(arr,starts,ends)
1000 loops, best of 3: 217 µs per loop
Case #3 : Large number of ranges 10000
In [60]: arr = np.random.randint(0,9,(1010))
In [61]: arr = np.random.randint(0,9,(10010))
In [62]: starts = np.random.randint(0,9,(10000))
In [63]: ends = starts + np.random.randint(0,9,(10000))
In [64]: %timeit [np.std(arr[i:j]) for i,j in zip(starts,ends)]
1 loops, best of 3: 474 ms per loop
In [65]: %timeit ranged_std(arr,starts,ends)
100 loops, best of 3: 2.17 ms per loop
Really amazing speedups of 200x+!
Using ranged_std to solve our case
# Get start, stop numeric indices as needed for getting ranges array later on
starts = asd_1.index.searchsorted(index_1)
ends = asd_1.index.searchsorted(index_2)
# Create final dataframe output using ranged_std func
df = pd.DataFrame(ranged_std(asd_1.values,starts,ends+1),index=index_1)
Sample run for verification -
In [17]: asd_1 = pd.Series(0.01 * np.random.randn(252), index=\
...: pd.date_range('2011-1-1', periods=252))
...:
...: index_1 = pd.to_datetime(['2011-2-2', '2011-4-3', '2011-5-1',])
...: index_2 = pd.to_datetime(['2011-2-15', '2011-4-16', '2011-5-17',])
...:
...: index_tot = list(zip(index_1,index_2))
...: aux_learning_std = pd.DataFrame([np.nanstd(asd_1.loc[i:j]) for i, j in \
...: index_tot], index=index_1)
...:
In [18]: starts = asd_1.index.searchsorted(index_1)
...: ends = asd_1.index.searchsorted(index_2)
...: df = pd.DataFrame(ranged_std(asd_1.values,starts,ends+1),index=index_1)
...:
In [19]: aux_learning_std
Out[19]:
0
2011-02-02 0.007244
2011-04-03 0.012862
2011-05-01 0.010155
In [20]: df
Out[20]:
0
2011-02-02 0.007244
2011-04-03 0.012862
2011-05-01 0.010155
