Jupyter Notebook, NameError: is not defined, %%time prevents assignment - jupyter-notebook

I came across a very strange bug running a Jupyter Notebook (IPython: 7.4.0) where a variable was not assigned as it normally would be. It took me quite a bit of time to figure out the cause; I searched in vain through variable scope, type conversion and TensorFlow intricacies ;(
In fact, using the %%time cell magic was preventing assignment of the variable in the cell. Therefore the assigned variable was not defined in the cell below, giving a characteristic error message: "NameError: name 'xxx' is not defined."
It seems to be a known issue; I hope this helps someone else.

The solution is simple: just remove %%time from the cell.
To time the code, use this instead:
from timeit import default_timer as timer
from datetime import timedelta
start = timer()
# Process
# ...
end = timer()
print ("Execution time HH:MM:SS:",timedelta(seconds=end-start))
Source: Stackoverflow - Measure time elapsed in Python?
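If you time many cells this way, the snippet above can be wrapped in a small context manager so the boilerplate is not repeated. This is only a sketch; the name `elapsed` is made up for illustration and is not part of IPython or timeit.

```python
from contextlib import contextmanager
from datetime import timedelta
from timeit import default_timer as timer


@contextmanager
def elapsed(label="Execution time"):
    """Time the enclosed block and print the duration (hypothetical helper)."""
    result = {}
    start = timer()
    try:
        yield result
    finally:
        result["seconds"] = timer() - start
        print(f"{label} HH:MM:SS:", timedelta(seconds=result["seconds"]))


# Variables assigned inside the block remain defined afterwards,
# because a with-statement does not create a new scope.
with elapsed() as t:
    total = sum(range(1000))
```

Since the timing happens in plain Python rather than in a cell magic, `total` is an ordinary assignment and is available to later cells.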

Related

How to get the 'execution count' of the most recent execution in IPython or a Jupyter notebook?

IPython and Jupyter notebooks keep track of an 'execution count'. This is e.g. shown in the prompts for input and output: In[...] and Out[...].
How can you get the latest execution count programmatically?
The variable Out cannot be used for this:
Out is a dictionary, with execution counts as keys, but only for those cells that yielded a result; note that not all (cell) executions need to yield a result (e.g. print(...)).
The variable In seems to be usable:
In is a list of all inputs; it seems that In[0] is always primed with an empty string, so that you can use the 1-based execution count as index.
However, when using len(In) - 1, in your code, you need to take into account that the execution count seems to be updated before execution of that new code. So, actually, it seems you need to use len(In) - 2 for the execution count of the most recent already completed execution.
Questions:
Is there a better way to get the current execution count?
Can you rely on the above observations (the 'seems')?
After more digging, I found the (surprisingly simple) answer:
from IPython import get_ipython
ipython = get_ipython()
... ipython.execution_count ...
This does not show up in the IPython documentation, though could (should?) have been mentioned here: https://ipython.readthedocs.io/en/stable/api/generated/IPython.core.interactiveshell.html (I guess attributes are not documented). Here you do find run_line_magic (which was mentioned in a comment to my question).
The way I found this attribute is by defining ipython as above and then doing code completion (TAB) on ipython. (but it does not have documentation).
help(ipython)
gives you documentation about InteractiveShell, and it does mention execution_count (though it does not confirm its purpose):
| execution_count
| An int trait.
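Put together, the lookup could look like the sketch below. The helper name `current_execution_count` is hypothetical; the guard clauses are there because `get_ipython()` returns None when the code is not running under IPython. Which number you see while a cell is still executing is worth checking against the In[...] prompt in your own IPython version.

```python
def current_execution_count():
    """Return IPython's execution count, or None outside IPython."""
    try:
        from IPython import get_ipython
    except ImportError:
        return None  # IPython is not installed
    shell = get_ipython()  # None when not running under IPython
    if shell is None:
        return None
    return shell.execution_count


count = current_execution_count()
print("execution count:", count)
```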

Non-blocking cell execution in Jupyter

In Jupyter with an ipython kernel, is there a canonical way to execute cells in a non-blocking fashion?
Ideally I'd like to be able to run a cell
%%background
time.sleep(10)
print("hello")
such that I can start editing and running the next cells and in 10 seconds see "hello" appear in the output of the original cell.
I have tried two approaches, but haven't been happy with either.
(1) Create a thread by hand:
import threading
import time
def foo():
    time.sleep(10)
    print("hello")
threading.Thread(target=foo).start()
The problem with this is that "hello" is printed in whatever cell is active in 10 seconds, not necessarily in the cell where the thread was started.
(2) Use an ipywidgets.Output widget.
import ipywidgets
def foo(out):
    time.sleep(10)
    out.append_stdout("hello")
out = ipywidgets.Output()
display(out)
threading.Thread(target=foo, args=(out,)).start()
This works, but there are problems when I want to update the output (think of monitoring something like memory consumption):
import datetime
def foo(out):
    while True:
        time.sleep(1)
        out.clear_output()
        out.append_stdout(str(datetime.datetime.now()))
out = ipywidgets.Output()
display(out)
threading.Thread(target=foo, args=(out,)).start()
The output now constantly switches between 0 and 1 lines in size, which results in flickering of the entire notebook.
This should be solvable by passing wait=True in the call to clear_output. Alas, for me that results in the output never showing anything.
I could have asked about that issue, which seems to be a bug, specifically, but I wondered whether there is maybe another solution that doesn't require me doing all of this by hand.
I've experienced some issues like this when plotting to an output. It looks like you have followed the examples in the ipywidgets documentation on async output widgets.
The other approach I have found sometimes helpful (particularly if you know the size of the desired output) is to fix the height of your output widget when you create it.
out = ipywidgets.Output(layout=ipywidgets.Layout(height='25px'))
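A variation on the two approaches from the question is to keep the worker from printing at all: hand it a callback and let the caller decide where the text ends up. The sketch below uses only the standard library; `run_in_background` and `emit` are hypothetical names. In a notebook you could pass `out.append_stdout` from an ipywidgets.Output as `emit`, as in approach (2).

```python
import threading
import time


def run_in_background(target, emit):
    """Run target(emit) in a daemon thread and return the thread.

    `emit` is any callable that accepts one string.
    """
    thread = threading.Thread(target=target, args=(emit,), daemon=True)
    thread.start()
    return thread


def foo(emit):
    time.sleep(0.1)  # stand-in for the 10-second job in the question
    emit("hello")


collected = []  # list.append is atomic in CPython, fine for this sketch
worker = run_in_background(foo, collected.append)
# The call above returns immediately; join() is used here only so the
# example finishes deterministically.
worker.join()
print(collected)
```

This does not fix the flickering described in the question; it only decouples the worker from stdout so the output destination is explicit.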

Qt error is printed on the console; how to see where it originates from?

I'm getting this on the console in a QML app:
QFont::setPointSizeF: Point size <= 0 (0.000000), must be greater than 0
The app is not crashing so I can't use the debugger to get a backtrace for the exception. How do I see where the error originates from?
If you know the function the warning occurs in (in this case, QFont::setPointSizeF()), you can put a breakpoint there. Following the stack trace will lead you to the code that calls that function.
If the warning doesn't include the name of the function and you have the source code available, use git grep with part of the warning to get an idea of where it comes from. This approach can be a bit of trial and error, as the code may span more than one line, etc, and so you might have to try different parts of the string.
If the warning doesn't include the name of the function, you don't have the source code available and/or you don't like the previous approach, use the QT_MESSAGE_PATTERN environment variable:
QT_MESSAGE_PATTERN="%{function}: %{message}"
For the full list of variables at your disposal, see the qSetMessagePattern() docs:
%{appname} - QCoreApplication::applicationName()
%{category} - Logging category
%{file} - Path to source file
%{function} - Function
%{line} - Line in source file
%{message} - The actual message
%{pid} - QCoreApplication::applicationPid()
%{threadid} - The system-wide ID of current thread (if it can be obtained)
%{qthreadptr} - A pointer to the current QThread (result of QThread::currentThread())
%{type} - "debug", "warning", "critical" or "fatal"
%{time process} - time of the message, in seconds since the process started (the token "process" is literal)
%{time boot} - the time of the message, in seconds since the system boot if that can be determined (the token "boot" is literal). If the time since boot could not be obtained, the output is indeterminate (see QElapsedTimer::msecsSinceReference()).
%{time [format]} - system time when the message occurred, formatted by passing the format to QDateTime::toString(). If the format is not specified, the format of Qt::ISODate is used.
%{backtrace [depth=N] [separator="..."]} - A backtrace with the number of frames specified by the optional depth parameter (defaults to 5), and separated by the optional separator parameter (defaults to "|"). This expansion is available only on some platforms (currently only platforms using glibc). Names are only known for exported functions. If you want to see the name of every function in your application, use QMAKE_LFLAGS += -rdynamic. When reading backtraces, take into account that frames might be missing due to inlining or tail call optimization.
On an unrelated note, the %{time [format]} placeholder is quite useful to quickly "profile" code by qDebug()ing before and after it.
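If you launch the app from a script, the environment variable can be set for just that one process. The following is a minimal Python sketch; the helper name `qt_env_with_pattern` and the "./myapp" path are placeholders, not anything Qt provides, and the %{backtrace} part is subject to the platform limits listed above.

```python
import os


def qt_env_with_pattern(pattern, base_env=None):
    """Return a copy of the environment with QT_MESSAGE_PATTERN set."""
    env = dict(os.environ if base_env is None else base_env)
    env["QT_MESSAGE_PATTERN"] = pattern
    return env


# Prefix every Qt message with the reporting function and a backtrace.
pattern = "%{function}: %{message} [%{backtrace depth=10}]"
env = qt_env_with_pattern(pattern)

# Hypothetical launch; "./myapp" is a placeholder for your own binary.
# import subprocess
# subprocess.run(["./myapp"], env=env)
```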
I think you can use qInstallMessageHandler (Qt5) or qInstallMsgHandler (Qt4) to specify a callback which will intercept all qDebug() / qInfo() / etc. messages (example code is in the link). Then you can just add a breakpoint in this callback function and get a nice callstack.
Aside from the obvious, searching your code for calls to setPointSize[F], you can try the following depending on your environment (which you didn't disclose):
If you have the debugging symbols of the Qt libs installed and are using a decent debugger, you can set a conditional breakpoint on the first line in QFont::setPointSizeF() with the condition set to pointSize <= 0. Even if conditional breakpoints don't work you should still be able to set one and step through every call until you've found the culprit.
On Linux there's the tool ltrace which displays all calls of a binary into shared libs, and I suppose there's something similar in the M$ VS toolbox. You can grep the output for calls to setPointSize directly, but of course this won't work for calls within the lib itself (which I guess could be the case when it handles the QML internally).

How can I configure my IPython notebook so it always shows the execution time as part of the output?

Sometimes I execute a method that takes a long time to compute
In [1]:
long_running_invocation()
Out[1]:
Often I am interested in knowing how much time it took, so I have to write this:
In[2]:
import time
start = time.time()
long_running_invocation()
end = time.time()
print(end - start)
Out[2]: 1024
Is there a way to configure my IPython notebook so that it automatically prints the execution time of every call I am making like in the following example?
In [1]:
long_running_invocation()
Out[1] (1.59s):
This ipython extension does what you want: https://github.com/cpcloud/ipython-autotime
Load it by putting this at the top of your notebook:
%install_ext https://raw.github.com/cpcloud/ipython-autotime/master/autotime.py
%load_ext autotime
Once loaded, every subsequent cell execution will include the time it took to execute as part of its output.
I haven't found a way to have every cell output the time it takes to execute the code, but instead of what you have, you can use the magics %time or %timeit.
ipython cell magics
You can now just use the %%time magic at the beginning of the cell like this:
%%time
data = sc.textFile(sample_location)
doc_count = data.count()
doc_objs = data.map(lambda x: json.loads(x))
which when executed will print out an output like:
CPU times: user 372 ms, sys: 48 ms, total: 420 ms
Wall time: 37.7 s
The simplest way to have your IPython notebook show the execution time automatically, without putting %%time, %%timeit or time.time() in each cell, is the ipython-autotime package.
Install the package at the beginning of the notebook:
pip install ipython-autotime
Then load the extension:
%load_ext autotime
Once it is loaded, every cell you run afterwards reports its execution time.
If you want to turn it off, just unload the extension:
%unload_ext autotime
It is simple to use whenever you need it.
For more details, refer to the ipython-autotime documentation or its GitHub source.
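If installing an extension is not an option, a similar effect can be approximated by hooking IPython's `pre_run_cell` and `post_run_cell` events yourself. This is a rough sketch, not the code of ipython-autotime; `CellTimer` and `install_cell_timer` are hypothetical names, and outside IPython the installer simply returns None.

```python
import time


class CellTimer:
    """Print wall-clock time per cell via IPython's cell events."""

    def __init__(self):
        self.start_time = None
        self.last_elapsed = None

    def start(self, *args):
        # Extra args absorb the info object newer IPython versions pass.
        self.start_time = time.perf_counter()

    def stop(self, *args):
        if self.start_time is None:
            return  # no matching start, e.g. the cell that installed us
        self.last_elapsed = time.perf_counter() - self.start_time
        self.start_time = None
        print(f"time: {self.last_elapsed:.2f} s")


def install_cell_timer():
    """Register the timer if running under IPython; return it or None."""
    try:
        from IPython import get_ipython
    except ImportError:
        return None
    shell = get_ipython()
    if shell is None:
        return None
    timer = CellTimer()
    shell.events.register("pre_run_cell", timer.start)
    shell.events.register("post_run_cell", timer.stop)
    return timer


cell_timer = install_cell_timer()
```

To switch it off again, the same callbacks can be passed to `shell.events.unregister`.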

How to limit the number of output lines in a given cell of the IPython notebook?

Sometimes my IPython notebooks crash because I left a print statement in a big loop or in a recursive function. The kernel shows busy and the stop button is usually unresponsive. Eventually Chrome asks me if I want to kill the page or wait.
Is there a way to limit the number of output lines in a given cell? Or any other way to avoid this problem?
You can suppress the displayed result of a cell by putting ';' at the end of its last line.
Note that this hides only the Out[...] value, not text written by print.
Perhaps create a condition in your loop to suppress output past a certain threshold.
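Such a threshold can live in a small wrapper around print instead of in the loop itself. A minimal sketch, assuming you are free to replace the print calls in the noisy code; `make_limited_print` is a made-up helper name:

```python
import io


def make_limited_print(max_lines, file=None):
    """Return a print-like function that goes quiet after max_lines calls."""
    state = {"count": 0}

    def limited_print(*args, **kwargs):
        if file is not None:
            kwargs.setdefault("file", file)
        state["count"] += 1
        if state["count"] <= max_lines:
            print(*args, **kwargs)
        elif state["count"] == max_lines + 1:
            # Emit a single notice, then stay silent.
            print(f"... output truncated after {max_lines} lines", **kwargs)

    return limited_print


buffer = io.StringIO()  # capture output here so the result is easy to inspect
lprint = make_limited_print(3, file=buffer)
for i in range(100):
    lprint(i)
print(buffer.getvalue(), end="")
```

The loop still runs to completion; only the amount of text sent to the notebook is capped.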
For anyone else stumbling across:
If you want to see some of the output rather than suppress the output entirely, there is an extension called limit-output.
You'll have to follow the installation instructions for the extensions at the first link. Then I ran the following code to update the maximum number of characters output by each cell:
from notebook.services.config import ConfigManager
cm = ConfigManager().update('notebook', {'limit_output': 10})
Note: you'll need to run the block of code, then restart your notebook server entirely (not just the kernel) for the change to take effect.
Results on jupyter v4.0.6 running a Python 2.7.12 kernel
for i in range(0,100):
    print i
0
1
2
3
4
limit_output extension: Maximum message size exceeded
