How to measure the length of a call stack? - recursion

Recently I had a test question asking how deep the call stack gets for fact1 when n = 5. Here is the code:
int fact1(int n)
{
    if (n == 1)
    {
        return 1;
    }
    else {
        return n * fact1(n - 1);
    }
}
The answer on the test was 5, but I believe it is 4; I don't think the first call should be counted in the number of calls.

Actually, every function call ends up on the call stack.
Your example looks like C; in C there is always a main function, and even main ends up on the call stack.
I don't think there is a portable way to examine the call stack in C, especially since the compiler is allowed to optimise away whatever it wants. For instance, it could optimise tail recursion, and then the call stack would be smaller than you'd expect.
In Python the call stack is easy to examine: just crash the function wherever you want by raising an exception (for instance with assert(False)). The program will then produce an error message containing the full "stack trace", which lists every function on the stack.
Here is an example of a stack trace in Python:
def fact1(n):
    assert(n != 1)
    return n * fact1(n-1)

def main():
    f = fact1(3)
    print(f)

main()
Output:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in main
File "<stdin>", line 3, in fact1
File "<stdin>", line 3, in fact1
File "<stdin>", line 2, in fact1
AssertionError
And another example just for fun:
def print_even(n):
    if (n <= 1):
        print('yes' if n == 0 else 'no')
        assert(False)
    else:
        print_odd(n-1)

def print_odd(n):
    if (n <= 1):
        print('yes' if n == 1 else 'no')
        assert(False)
    else:
        print_even(n-1)

def main():
    n = 5
    print('Is {} even?'.format(n))
    print_even(n)

main()
Output:
Is 5 even?
no
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 4, in main
File "<stdin>", line 6, in print_even
File "<stdin>", line 6, in print_odd
File "<stdin>", line 6, in print_even
File "<stdin>", line 6, in print_odd
File "<stdin>", line 4, in print_even
AssertionError
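If you would rather count the frames without crashing the program, the standard inspect module can list the live stack frames. Here is a minimal sketch (not part of the original question) that counts how many fact1 frames are on the stack at the deepest point of the recursion:
import inspect

def fact1(n):
    if n == 1:
        # count only the fact1 frames currently on the call stack
        depth = sum(1 for frame in inspect.stack() if frame.function == 'fact1')
        print('fact1 frames on the stack:', depth)
        return 1
    return n * fact1(n - 1)

fact1(5)   # prints: fact1 frames on the stack: 5
Counted this way, the test's answer of 5 is the number of fact1 activations, sitting on top of whatever called it (main, the module, and so on).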

Related

How to calculate the runtime of MPI program in mpi4py

I wrote this code and tried some functions to calculate the time, but it shows an error while running. I tried to use the approach from C of timing each processor with Wtime() and MPI_Reduce. Wtime() works, I guess, but the reduce function does not work the way it should.
The code is
from mpi4py import MPI
import mpi4py
import numpy as np

comm = MPI.COMM_WORLD
size = comm.size
rank = comm.Get_rank()

time1 = mpi4py.MPI.Wtime()
a = np.random.randint(10, size=(10, 10))
if rank == 0:
    b = np.random.randint(10, size=(10, 10))
    print(b)
else:
    b = None
b = comm.bcast(b, root=0)
c = np.dot(a, b)
if size == 1:
    result = np.dot(a, b)
else:
    if rank == 0:
        a_row = a.shape[0]
        if a_row >= size:
            split = np.array_split(a, size, axis=0)
    else:
        split = None
    split = comm.scatter(split, root=0)
    split = np.dot(split, b)
    data = comm.gather(split, root=0)
time2 = mpi4py.MPI.Wtime()
duration = time2 - time1
totaltime = comm.reduce(duration, op = sum, root = 0)
print("Runtime at %d is %f" % (rank, duration))
if rank == 0:
    result = np.vstack(data)
    print(result)
    print(totaltime)
The error it shows is
Runtime at 3 is 0.000574
Traceback (most recent call last):
File "matrixmultMPI.py", line 48, in <module>
totaltime = comm.reduce(duration,op = sum, root = 0)
File "mpi4py/MPI/Comm.pyx", line 1613, in mpi4py.MPI.Comm.reduce
File "mpi4py/MPI/msgpickle.pxi", line 1322, in mpi4py.MPI.PyMPI_reduce
File "mpi4py/MPI/msgpickle.pxi", line 1254, in mpi4py.MPI.PyMPI_reduce_intra
File "mpi4py/MPI/msgpickle.pxi", line 1126, in mpi4py.MPI.PyMPI_reduce_p2p
TypeError: 'float' object is not iterable
Traceback (most recent call last):
File "matrixmultMPI.py", line 48, in <module>
totaltime = comm.reduce(duration,op = sum, root = 0)
File "mpi4py/MPI/Comm.pyx", line 1613, in mpi4py.MPI.Comm.reduce
File "mpi4py/MPI/msgpickle.pxi", line 1322, in mpi4py.MPI.PyMPI_reduce
File "mpi4py/MPI/msgpickle.pxi", line 1254, in mpi4py.MPI.PyMPI_reduce_intra
File "mpi4py/MPI/msgpickle.pxi", line 1126, in mpi4py.MPI.PyMPI_reduce_p2p
TypeError: 'float' object is not iterable
Traceback (most recent call last):
File "matrixmultMPI.py", line 48, in <module>
totaltime = comm.reduce(duration,op = sum, root = 0)
File "mpi4py/MPI/Comm.pyx", line 1613, in mpi4py.MPI.Comm.reduce
File "mpi4py/MPI/msgpickle.pxi", line 1322, in mpi4py.MPI.PyMPI_reduce
File "mpi4py/MPI/msgpickle.pxi", line 1254, in mpi4py.MPI.PyMPI_reduce_intra
File "mpi4py/MPI/msgpickle.pxi", line 1126, in mpi4py.MPI.PyMPI_reduce_p2p
TypeError: 'float' object is not iterable
What's wrong with it, and how do I fix it?
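The traceback points at op = sum: Python's built-in sum expects an iterable as its first argument, so when mpi4py's pickle-based reduce applies it to plain floats it fails with 'float' object is not iterable. A minimal sketch of the likely fix, assuming the goal is simply to add up the per-rank durations, is to use the MPI built-in reduction operation MPI.SUM (which is also the default):
totaltime = comm.reduce(duration, op=MPI.SUM, root=0)   # MPI.SUM instead of Python's sum()
if rank == 0:
    # comm.reduce returns the combined value only on the root rank
    print("Total runtime over all ranks: %f" % totaltime)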

Is there a "next" type of function in Julia, as in Python?

Is there something comparable to Python's "next" for going through a Julia iterable? I can't seem to find anything in the documentation.
https://docs.julialang.org/en/v1/base/collections/
next = iterate(iter)
(i, state) = next
Alternatively, it appears peek will get the first element, but won't advance the iterator.
Here is the closest you can achieve, using Stateful from the Base.Iterators module:
Julia
julia> it = Iterators.Stateful((1, 2));
julia> popfirst!(it)
1
julia> popfirst!(it)
2
julia> popfirst!(it)
ERROR: EOFError: read end of file
Stacktrace:
 [1] popfirst!(s::Base.Iterators.Stateful{Tuple{Int64, Int64}, Any})
   @ Base.Iterators ./iterators.jl:1355
 [2] top-level scope
   @ REPL[5]:1
Python
>>> it = iter((1, 2))
>>> next(it)
1
>>> next(it)
2
>>> next(it)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
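As a side note on the Python half of the comparison, next also accepts a default value, which is returned instead of raising StopIteration once the iterator is exhausted; this is often the closest analogue to checking whether Julia's iterate returned nothing:
>>> it = iter((1, 2))
>>> next(it, None)
1
>>> next(it, None)
2
>>> next(it, None)   # exhausted: returns the default (None) instead of raising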

Rserve: pyServe not able to call basic R functions

I'm calling Rserve from Python and it works for basic operations, but not if I call basic functions such as min:
import pyRserve
conn = pyRserve.connect()
cars = [1, 2, 3]
conn.r.x = cars
print(conn.eval('x'))
print(conn.eval('min(x)'))
The result is:
[1, 2, 3]
Traceback (most recent call last):
File "test3.py", line 9, in <module>
print(conn.eval('min(x)'))
File "C:\Users\acastro\.windows-build-tools\python27\lib\site-packages\pyRserve\rconn.py", line 78, in decoCheckIfClosed
return func(self, *args, **kw)
File "C:\Users\acastro\.windows-build-tools\python27\lib\site-packages\pyRserve\rconn.py", line 191, in eval
raise REvalError(errorMsg)
pyRserve.rexceptions.REvalError: Error in min(x) : invalid 'type' (list) of argument
Do you know where the problem is?
Thanks
You should try min(unlist(x)).
If the list is simple, you may just try as.data.frame(x).
For more complicated lists, Stack Overflow has many other answers.
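Applied to the snippet in the question, that means doing the unlist inside the eval call. A minimal sketch (reusing the same pyRserve connection and the x variable from above):
import pyRserve

conn = pyRserve.connect()
conn.r.x = [1, 2, 3]                    # arrives in R as a list
print(conn.eval('min(unlist(x))'))      # unlist() flattens it to a vector, so min() works and prints 1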

Range function is not working properly in jupyter notebook

for i in range():
    for j in range(i):
        print("*",end = " ")
    print()
Error:
TypeError                                 Traceback (most recent call last)
<ipython-input-81-6bc40b03c32e> in <module>
----> 1 for i in range():
      2     for j in range(i):
      3         print("*",end = " ")
      4     print()
      5
TypeError: 'int' object is not callable
In Python, range() with no arguments is not valid; you need to provide at least one argument so it knows what range to create. (The 'int' object is not callable message in your traceback also suggests that the name range has been reassigned to an integer earlier in the notebook; restarting the kernel, or avoiding assignments like range = ..., will bring the built-in back.)
For example:
range(3) will give you a range object for the integers { 0, 1, 2 };
range(3, 20, 7) will generate { 3, 10, 17 };
range() will just raise an error :-)
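For instance, wrapping the calls in list() shows exactly what they produce (a quick interpreter check):
>>> list(range(3))
[0, 1, 2]
>>> list(range(3, 20, 7))
[3, 10, 17]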
You have to pass an int argument to range(), just the way you did in the second for loop, e.g. range(10):
for i in range(10):
    for j in range(i):
        print("*",end = " ")
    print()

How to convert dict to RDD in PySpark

I am learning the Word2Vec model to process my data.
I am using Spark 1.6.0.
Let me use the example from the official documentation to explain my problem:
from pyspark.mllib.feature import Word2Vec
sentence = "a b " * 100 + "a c " * 10
localDoc = [sentence, sentence]
doc = sc.parallelize(localDoc).map(lambda line: line.split(" "))
model = Word2Vec().setVectorSize(10).setSeed(42).fit(doc)
The vectors are as follows:
>>> model.getVectors()
{'a': [0.26699373, -0.26908076, 0.0579859, -0.080141746, 0.18208595, 0.4162335, 0.0258975, -0.2162928, 0.17868409, 0.07642203], 'b': [-0.29602322, -0.67824656, -0.9063686, -0.49016926, 0.14347662, -0.23329848, -0.44695938, -0.69160634, 0.7037, 0.28236762], 'c': [-0.08954003, 0.24668643, 0.16183868, 0.10982372, -0.099240996, -0.1358507, 0.09996107, 0.30981666, -0.2477713, -0.063234895]}
When I use getVectors(), I get a map from each word to its vector representation. How do I convert it into an RDD so I can pass it to the KMeans model?
EDIT:
I did what @user9590153 suggested.
>>> v = sc.parallelize(model.getVectors()).values()
# the above code is successful.
>>> v.collect()
The PySpark shell shows another problem:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\spark-1.6.3-bin-hadoop2.6\python\pyspark\rdd.py", line 771, in collect
port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
File "D:\spark-1.6.3-bin-hadoop2.6\python\lib\py4j-0.9-src.zip\py4j\java_gateway.py", line 813, in __call__
File "D:\spark-1.6.3-bin-hadoop2.6\python\pyspark\sql\utils.py", line 45, in deco
return f(*a, **kw)
File "D:\spark-1.6.3-bin-hadoop2.6\python\lib\py4j-0.9-src.zip\py4j\protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 8.0 failed 1 times, most recent failure: Lost task 3.0 in stage 8.0 (TID 29, localhost): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "D:\spark-1.6.3-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\worker.py", line 111, in main
File "D:\spark-1.6.3-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\worker.py", line 106, in process
File "D:\spark-1.6.3-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\serializers.py", line 263, in dump_stream
vs = list(itertools.islice(iterator, batch))
File "D:\spark-1.6.3-bin-hadoop2.6\python\pyspark\rdd.py", line 1540, in <lambda>
return self.map(lambda x: x[1])
IndexError: string index out of range
at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:166)
at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:207)
at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:125)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Just parallelize:
sc.parallelize(model.getVectors()).values()
Parallelized collections will help you here.
val data = Array(1, 2, 3, 4, 5)       // data here is the collection
val distData = sc.parallelize(data)   // converted into an RDD
For your case:
sc.parallelize(model.getVectors()).values()
For your doubt:
The collect() action is the most common and simplest operation that returns the entire content of an RDD to the driver program.
A typical use of collect() is unit testing, where the whole RDD is expected to fit in memory; that makes it easy to compare the RDD's result with the expected result.
The constraint on collect() is that all the data has to fit on a single machine, since it is copied to the driver.
So you cannot collect() an RDD that is too large to fit in the driver's memory.
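For what it is worth, the IndexError in the edit above is what you get when the dict itself is parallelized: sc.parallelize(dict) distributes only the keys, and values() on a pair RDD maps each element x to x[1], which falls over on the one-character string keys. A sketch of one way to get an RDD of the vectors themselves (an assumption about what KMeans should receive, not a quote from the answer):
# build an RDD of (word, vector) pairs first, then keep just the vectors
vectors = model.getVectors()                        # dict: word -> list of floats
pairs_rdd = sc.parallelize(list(vectors.items()))   # RDD of (word, vector) tuples
vec_rdd = pairs_rdd.values()                        # RDD of vectors, ready for KMeans.train
print(vec_rdd.collect())                            # small enough here to bring back to the driver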
