Jupyter and Spyder behaviors are different

I have the following data as an example:
data = (
    [1,230.1,37.8,69.2,22.1],
    [2,44.5,39.3,45.1,10.4],
    [3,17.2,45.9,69.3,9.3],
    [4,151.5,41.3,58.5,18.5],
    [5,180.8,10.8,58.4,12.9],
)
I use the same code in Spyder and in a Jupyter notebook, but I get different output. For example:
If I enter data (without print()) in Spyder, I get no output, just a new line; but if I enter data (again without print()) in Jupyter, I get the full data in the output:
([1, 230.1, 37.8, 69.2, 22.1],
[2, 44.5, 39.3, 45.1, 10.4],
[3, 17.2, 45.9, 69.3, 9.3],
[4, 151.5, 41.3, 58.5, 18.5],
[5, 180.8, 10.8, 58.4, 12.9],
)
The only way to get the output in Spyder is to use the print() command
print(data)
([1, 230.1, 37.8, 69.2, 22.1], [2, 44.5, 39.3, 45.1, 10.4], [3, 17.2, 45.9, 69.3, 9.3], [4, 151.5, 41.3, 58.5, 18.5], [5, 180.8, 10.8, 58.4, 12.9])
You can also see that the output format differs between the two cases.
1- Can someone please explain why the behavior is different?
2- Can I set up Spyder to behave the same way Jupyter Notebook does?
Thanks

Regarding '1' at the bottom of your post:
The short answer to #1 is that in IPython and Jupyter the last line is special. IPython/Jupyter will try to represent whatever is on the last line of an entered block of code (IPython) or cell (Jupyter) as richly as the interface allows. (This comes from the 'print' part of the REPL, the read-eval-print loop: it will try to 'print' whatever the last line evaluates to. I am deliberately putting the word print in quotes because 'print' in the Jupyter model can mean 'represent it nicely', since the interface has more display abilities.) Plain Python doesn't do this for you, so you have to say explicitly how to display whatever you consider the result; usually you just use print(), as in your example. (I added some information in my comments to your post on the evolution of the ecosystem that may help you understand what to expect in each situation.)
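As a rough sketch of what that last-line handling amounts to (an illustration only, not what Jupyter literally runs internally):
data = ([1, 230.1, 37.8, 69.2, 22.1], [2, 44.5, 39.3, 45.1, 10.4])

data          # in a Jupyter cell this last line would be displayed automatically (its representation)
print(data)   # in a plain script nothing is shown unless you ask for it explicitly, like this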
One of the biggest differences is the handling of Pandas dataframes, if you've come across them. Jupyter has a nice representation of them without you needing to do anything, and in part that is only possible because it runs in a browser. In plain Python you have to tell it how and what specifically to display of the dataframe, and in a text-based interface it will just be plain text.
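A small sketch of that difference, assuming pandas is installed and the code runs in a Jupyter cell (the dataframe is just placeholder data):
import pandas as pd
from IPython.display import display

df = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})   # placeholder data

print(df)     # plain text, which is all a text-based interface can show
display(df)   # explicitly asks Jupyter for the rich (HTML table) representation
df            # last line of the cell: Jupyter renders this as an HTML table too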
Plots are another similar case: Jupyter displays them nicely right in the interface. Handling of plots has evolved further in Jupyter recently, so this no longer has to be invoked on the last line to trigger the 'print'-like handling; these days Jupyter will try to display a plot object made in a cell if it properly detects one has been made. (You'll see a lot of outdated code suggesting that you need to invoke showing it on the last line by referencing the plot or calling something like plt.show(), and sometimes that is worth trying when troubleshooting, but mostly it is not needed now. So it somewhat confuses the paradigm of the last line being the important one for output handling.) By contrast, in plain Python you have to have something set up to handle displaying a plot, such as installing Qt so the plot is shown in a Qt window, or using your Python code to send the plot object to an image file and then viewing that file on your system as you would any other image.
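A minimal sketch of the send-it-to-a-file approach for plain Python scripts (the filename is just an example):
import matplotlib
matplotlib.use("Agg")            # non-interactive backend: no Qt or other window toolkit needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
fig.suptitle("example plot")
ax.plot([0, 1, 2], [0, 1, 4])
fig.savefig("my_plot.png")       # then open the file with any image viewer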
Regarding '2' at the bottom of your post:
Check out the Spyder-notebook plugin.

Related

don't echo to R console

Is there a way to turn echoing to the R console on or off (without using source())?
For example, let's say I have a long .R script and I wish to run only one line from it. Say that line is x <- 9.
In RStudio, I can go to the line in question and use the "Run Selected Line(s)" command from the "Code" menu (or the keyboard shortcut ctrl-enter on my PC). What happens upon doing that is that the console prints x <- 9 (and then, obviously, R creates a variable called x and assigns it the value 9).
Is there a way to not have this line echoed in the console (but still create the variable)?
The reason I ask is that I have lengthy lines of code that just define functions, and every time I want to update a function it echoes the whole thing to the console, which burns a lot of time.
Thanks.

Hiding JupyterLab cell's output by default

I am using JupyterLab to build a bioinformatics pipeline that uses both bash and python scripts.
The first bash script gives a lot of feedback on every step of the process. However, this feedback is not helpful (unless there was an error) and makes the document less readable.
I would like to be able to hide this cell's output by default, but also to be able to open it when necessary to troubleshoot. I know it's possible to click 3 times on the output to collapse it; I was just wondering whether there is a way to do so by default.
I tried to add the tag specified on here (https://jupyterbook.org/features/hiding.html#Hiding-outputs) to the cell, but it does not seem to work for me.
Thanks for your help.
You may just want to suppress the output using the %%capture cell magic, as illustrated here. Then you simply remove that magic command from the first line of the cell for the times you want to see the output, such as when troubleshooting.
If you want to be able to decide later, after running the cell, whether to review what was captured, you can use the %%capture magic more as it was meant to be used. By assigning what is captured to a name, you can also do something like what the %%bash cell magic allows for handling output streams (see here). As described and illustrated here, using the captured output object you can easily get the stdout and/or stderr as a string; see http://ipython.readthedocs.io/en/stable/api/generated/IPython.utils.capture.html.
So say you put the following at the top of your cell to assign what was captured to out:
%%capture out
You can review the stdout stream later with the following:
print(out.stdout)
Or if you just want part of it, something like print(out.stdout[1:500]). I have some fancier handling illustrated in some blocks of code here.
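Putting it together, a minimal sketch of the pattern (the cell contents below are just placeholders for a verbose pipeline step):
%%capture out
# everything this cell writes to stdout/stderr is captured instead of displayed
!echo "verbose pipeline step running..."
print("lots of progress output")
Then, in another cell, only when you need to troubleshoot:
print(out.stdout)     # the captured stdout as a string
# print(out.stderr)   # or the captured stderr
# out.show()          # or replay everything exactly as it would have appeared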

how to print jupyter background job output in cell it was launched from?

When launching a background job from an IPython Jupyter Notebook, how can I make the printed output appear in the cell it was launched from, rather than in the cell I am currently working in?
The print() command seems to print into the current working cell, not into the cell the background job was launched from. Is there a way to make it print nicely in the cell it was launched from? This is particularly relevant when running multiple sets of background jobs, so that I can determine which job set was responsible for which line of output.
Edit: it occurs with any code but here is a small snippet to reproduce it:
from IPython.lib import backgroundjobs as bg
import time

def launchJob():
    for i in range(100):
        print('x')
        time.sleep(1)

jobs = bg.BackgroundJobManager()
for r in range(3):
    jobs.new("launchJob()")
That does exactly what you'd expect it to do: print 3 x's every second in the output under the cell. Now go to the next cell, type 1+1 and execute it. The output 2 appears, but any remaining x's also get printed in this new cell rather than in the original cell.
I am looking for a way to specifically tell my job to always print to the original cell it was executed from, so as to obtain a sort of log in the background, or to generate a bunch of plots in the background, or any kind of data that I want in one place rather than appearing all over my notebook.
The IPython.lib.backgroundjobs.BackgroundJobManager.new documentation states:
All threads running share the same standard output. Thus, if your background jobs generate output, it will come out on top of whatever you are currently writing. For this reason, background jobs are best used with silent functions which simply return their output.
In GitHub pull request #856 of IPython (October 2011), the developers discussed this issue and concluded that keeping the output in the original cell would be difficult to implement. Hence, they decided to table the idea and in the latest version (7.9.0 at the time of writing) it still has not been solved.
In the meantime, a workaround could be to store the output in a variable, or to write/pickle the output to a file and print/display it once the background jobs are finished.
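A minimal sketch of that workaround, adapted from the snippet in the question, collecting output in a shared list instead of printing (the list name and the job-id argument are hypothetical additions):
from IPython.lib import backgroundjobs as bg
import time

log = []   # shared list; each job appends here instead of printing

def launchJob(job_id):
    for i in range(100):
        log.append(f"job {job_id}: iteration {i}")   # nothing printed, so nothing leaks into other cells
        time.sleep(1)

jobs = bg.BackgroundJobManager()
for r in range(3):
    jobs.new(launchJob, r)   # jobs.new also accepts a callable plus its arguments
Once the jobs have finished (jobs.status() shows their state), print("\n".join(log)) in whichever cell you like.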
While it is not a direct answer to my question, as it does not work with the print command, I did manage to solve my problem partially in the sense that I can have graphs updating in the background (and can hence log any kind of data on them as we go without any need to re-run).
Some proof of concept code below, based on What is the currently correct way to dynamically update plots in Jupyter/iPython?
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
import time
from IPython.lib import backgroundjobs as bg

def pltsin(ax, colors=['b']):
    x = np.linspace(0,1,100)
    if ax.lines:
        for line in ax.lines:
            line.set_xdata(x)
            y = np.random.random(size=(100,1))
            line.set_ydata(y)
    else:
        for color in colors:
            y = np.random.random(size=(100,1))
            ax.plot(x, y, color)
    fig.canvas.draw()

fig, ax = plt.subplots(1,1)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_xlim(0,1)
ax.set_ylim(0,1)

def launchJob():
    for i in range(100):
        pltsin(ax, ['b', 'r'])
        time.sleep(1)

jobs = bg.BackgroundJobManager()
jobs.new("launchJob()")
While this is running, typing 1+1 in another cell does not disrupt the updating of the figure. For me this was a game changer, so I'll post this in case anyone is helped by it.
@Peter: as other answers address the method of doing this using pickle or an output file, I would like to suggest an alternate method.
I would advise initiating the print operation in the parent (the cell the job was launched from). When the chosen criterion is reached in the child process, return those values to the print call in the parent.
Use %matplotlib inline to print the values in the parent's cell. This prints the outputs and visuals in the same cell.

Speeding up matlab file import

I am trying to load a matlab file with the R.matlab package. The problem is that it keeps loading indefinitely (e.g. table <- readMat("~/desktop/hg18_with_miR_20080407.mat")). I have a genome file from the Broad Institute (hg18_with_miR_20080407.mat).
You can find it at:
http://genepattern.broadinstitute.org/ftp/distribution/genepattern/dev_archive/GISTIC/broad.mit.edu:cancer.software.genepattern.module.analysis/00125/1.1/
I was wondering: has anyone tried the package and have similar issues?
(Hopefully helpful, but not really an answer; there was too much formatting for a comment.)
I think you may need to get a friend with matlab access to save the file into a more reasonable format or use python for data processing. It "hangs" for me as well (OS X 10.9, R 3.1.1). The following works in python:
import scipy.io
mat = scipy.io.loadmat("hg18_with_miR_20080407.mat")
(You can see and work with the rg and cyto crufty numpy arrays, but they can't be converted to JSON with json.dumps, and even jsonpickle.encode coughs up a lungful of errors (i.e. you won't be able to use rPython to get access to the object, which was the original workaround I was looking for), so there is no serialization to a file either (and I have to believe the resultant JSON would have been ugly to deal with).)
Your options are to:
get a friend to convert it (as suggested previously)
make CSV files out of the numpy arrays in python
use matlab
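For the second option, a rough sketch (the array names rg and cyto mentioned above come from inspecting this particular file; nested MATLAB structs or cell arrays may still need manual handling):
import numpy as np
import scipy.io

mat = scipy.io.loadmat("hg18_with_miR_20080407.mat")

# Write each top-level array to its own CSV so R can read it with read.csv()
for key, value in mat.items():
    if key.startswith("__"):   # skip loadmat metadata (__header__, __version__, __globals__)
        continue
    try:
        np.savetxt(key + ".csv", np.asarray(value), delimiter=",", fmt="%s")
    except Exception as err:   # e.g. nested structs that don't flatten cleanly
        print("could not export", key, "-", err)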

how to change line length on Rterm.exe

I am using R 2.15.2 on windows XP.
I was used to Rgui.exe, but it lacks the UNIX-style features I like to use, such as CTRL+R (backward search) and CTRL+U (erase line)...
If I missed something, please tell me!
Then I tried Rterm.exe (which looks identical to R.exe to me), which has all those nice features. I found out how to tune it by right-clicking on the top of the window to set height and width (it is like tuning the window you get from cmd.exe).
The problem is that now I cannot see more than 75 characters in the window, with a $ at the end, like this:
R) ppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppp$
Not sure if it is an R option or a Windows one, but if I set options("width"=180) I can see a data.frame across the full width of the window...
Not sure what is happening; can I modify this?
We still do not know the answer to that one, so I guess the 50 pts go to Oscar de León... good for him, too bad for me...
Sadly, it appears to be built in.
There used to be a problem with R when trying to print long strings. Apparently it was fixed first in Rterm and other versions of R before being fixed in Rgui.
When Rgui was fixed, it was possibly by a different means, since this issue can be fixed in Rgui but not in other Windows versions of R. You can change the width of the console for output both in Rgui and (later) Rterm.
The prompt is another story. It is actually not the same as the output space, and thus is controlled with a different option; but, this only works for Rgui. To do it, set pgcolumns=180 in the Rconsole file under [R HOME]\etc\. This modifies the width of the internal pager of the Rgui console, and effectively enables you to type up to 180 characters per input prompt.
Possibly there is a way to integrate that behavior into Rterm, and maybe Duncan Murdoch can point you in the correct direction (or prove me completely wrong).
I'm not really sure what is being requested. If what is needed in Rterm.exe is to display the end of a long line (and position the cursor there), then use CTRL-E. You can go back to the beginning of a line with CTRL-A. One can go back and forth repeatedly as needed until the line is submitted with ENTER.
The control characters of readline seem to be active; for instance, CTRL-P scrolls back one command and CTRL-N brings up the "next" command from history if you hit CTRL-P too many times. (These are the same behavior as the up/down arrow keys.) See link for other expected readline behaviors.
On my machine alt-f and alt-b (which should have been meta-f and meta-b) did not natively move forward or backward by words, but ESC-b and ESC-f did so on a line that exceeded the console width and had the $'s marking the right or left extent as having further material to consider.
If you want to wrap display lines, then you need to consider alternatives or additions to readline: link. But that is an untested suggestion and merely the result of a search for "readline wrap display".
The command should be options(width = 180) (without the quotes around width), but when you run Rterm in the Windows shell, it doesn't respect changes to this value; it just prints output as wide as the console.
The best way of working with R is (almost always) to use an IDE. Try emacs + ESS or one of the many vim plugins (R.vim, vim-R, VIM:r-plugin) if you want something UNIXy.

Resources