Can Jupyter run a separate R notebook from within a Python notebook?

I have a Jupyter notebook (python3 kernel) that serves as a batch job -- it runs three separate Python 3 notebooks using %run. I want to invoke a fourth, R-kernel notebook from my batch.
Is there a way to execute an external R notebook from a Python notebook in Jupyter/IPython?
Current setup:
run_all.ipynb (python3 kernel):
%run '1_py3.ipynb'
%run '2_py3.ipynb'
%run '3_py3.ipynb'
%run '4_R.ipynb'
The three Python 3 notebooks run correctly. The R notebook runs correctly when opened separately in Jupyter; however, it fails when called with %run from run_all.ipynb. It is interpreted as Python, and the cell gives a Python error on the first line:
cacheDir <- "caches"
TypeError: bad operand type for unary -: 'str'
I am interested in any solution for running a separate R notebook from a Python notebook -- Jupyter magic, shell, Python library, et cetera. I would also be interested in a workaround -- e.g. a method (like a shell script) that would run all four notebooks (both Python 3 and R), even if this can't be done from inside a Python 3 notebook.
(NOTE: I already understand how to embed %%R in a cell. That is not what I am trying to do; I want to call a complete, separate R notebook.)

I don't think you can use the %run magic that way, as it executes the file in the current kernel.
nbconvert has an execution API that allows you to execute notebooks. So you could create a shell script that executes all your notebooks, like so:
#!/bin/bash
jupyter nbconvert --to notebook --execute 1_py3.ipynb
jupyter nbconvert --to notebook --execute 2_py3.ipynb
jupyter nbconvert --to notebook --execute 3_py3.ipynb
jupyter nbconvert --to notebook --execute 4_R.ipynb
Since your notebooks require no shared state, this should be fine. Alternatively, if you really want to do it in a notebook, you can use the execute Python API to call nbconvert from your notebook:
import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

with open("1_py3.ipynb") as f1, open("2_py3.ipynb") as f2, open("3_py3.ipynb") as f3, open("4_R.ipynb") as f4:
    nb1 = nbformat.read(f1, as_version=4)
    nb2 = nbformat.read(f2, as_version=4)
    nb3 = nbformat.read(f3, as_version=4)
    nb4 = nbformat.read(f4, as_version=4)

ep_python = ExecutePreprocessor(timeout=600, kernel_name='python3')
# Use `jupyter kernelspec list` to find out what the kernel is called on your system
ep_R = ExecutePreprocessor(timeout=600, kernel_name='ir')

# `path` specifies which folder to execute the notebooks in, so set it to the
# one you need so that your file path references are correct
ep_python.preprocess(nb1, {'metadata': {'path': 'notebooks/'}})
ep_python.preprocess(nb2, {'metadata': {'path': 'notebooks/'}})
ep_python.preprocess(nb3, {'metadata': {'path': 'notebooks/'}})
ep_R.preprocess(nb4, {'metadata': {'path': 'notebooks/'}})

with open("1_py3.ipynb", "wt") as f1, open("2_py3.ipynb", "wt") as f2, open("3_py3.ipynb", "wt") as f3, open("4_R.ipynb", "wt") as f4:
    nbformat.write(nb1, f1)
    nbformat.write(nb2, f2)
    nbformat.write(nb3, f3)
    nbformat.write(nb4, f4)
Note that this is pretty much just the example copied from the nbconvert execute API docs: link
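If you'd rather not repeat yourself, the same API can be driven from a loop. A minimal sketch of that variant (my own rearrangement, not from the docs), assuming the same kernel names and notebooks/ working directory as above:

import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

# Pair each notebook with the kernel it needs
# (check the names with `jupyter kernelspec list`).
notebooks = [
    ("1_py3.ipynb", "python3"),
    ("2_py3.ipynb", "python3"),
    ("3_py3.ipynb", "python3"),
    ("4_R.ipynb", "ir"),
]

for path, kernel in notebooks:
    nb = nbformat.read(path, as_version=4)  # nbformat.read also accepts a plain path
    ep = ExecutePreprocessor(timeout=600, kernel_name=kernel)
    ep.preprocess(nb, {"metadata": {"path": "notebooks/"}})
    nbformat.write(nb, path)  # overwrites in place, as in the example above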

I was able to use the answer above to implement two solutions for running an R notebook from a python3 notebook.
1. Call nbconvert from a ! shell command
Add a simple ! shell command to the python3 notebook:
!jupyter nbconvert --to notebook --execute r.ipynb
So the notebook looks like this:
%run '1_py3.ipynb'
%run '2_py3.ipynb'
%run '3_py3.ipynb'
!jupyter nbconvert --to notebook --execute 4_R.ipynb
This seems simple and easy to use.
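By default this writes the executed copy to 4_R.nbconvert.ipynb next to the original; nbconvert also takes an --output flag if you want to control that name (4_R_out.ipynb here is just an example name):

!jupyter nbconvert --to notebook --execute --output 4_R_out.ipynb 4_R.ipynb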
2. Invoke nbformat in a cell
Add this to a cell in the batch notebook:
import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

rnotebook = "r.ipynb"
rnotebook_out = "r_out.ipynb"
rnotebook_path = '/home/jovyan/work/'

with open(rnotebook) as f1:
    nb1 = nbformat.read(f1, as_version=4)

ep_R = ExecutePreprocessor(timeout=600, kernel_name='ir')
ep_R.preprocess(nb1, {'metadata': {'path': rnotebook_path}})

with open(rnotebook_out, "wt") as f1:
    nbformat.write(nb1, f1)
This is based on the answer from Louise Davies (which is in turn based on the nbconvert docs example), but it only processes one file -- the non-R files can be processed in separate cells with %run.
If the batch notebook is in the same folder as the notebook it is executing, the path variable can be set with the %pwd magic, which returns the path of the batch notebook.
When we use nbformat.write, we choose between replacing the original notebook (convenient and intuitive, but it could corrupt or destroy the file) and creating a new file for the output. A third option, if the cell output isn't needed (e.g. in a workflow that manipulates files and writes logs), is to skip writing the output notebook entirely.
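For instance, here is a sketch of a single notebook cell combining both of those ideas (%pwd for the path, and no output write), assuming the batch and child notebooks share a folder:

import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

nb_path = %pwd  # IPython magic: the directory of the running batch notebook

nb = nbformat.read("r.ipynb", as_version=4)
ep_R = ExecutePreprocessor(timeout=600, kernel_name="ir")
ep_R.preprocess(nb, {"metadata": {"path": nb_path}})
# No nbformat.write here: we only care about side effects (files, logs),
# not about keeping the executed cell output.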
Drawbacks
One drawback of both methods is that they do not pipe cell results back into the master notebook display, the way %run displays a child notebook's output in its result cell. The !jupyter nbconvert method at least shows stdout from nbconvert, while the nbformat method showed me nothing.
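If you do need to see the child notebook's output, one workaround (a sketch, using the r_out.ipynb file written in method 2) is to read the executed copy back and print its stream outputs:

import nbformat

nb = nbformat.read("r_out.ipynb", as_version=4)
for cell in nb.cells:
    # markdown cells have no "outputs" key, so use .get with a default
    for out in cell.get("outputs", []):
        if out.get("output_type") == "stream":
            print(out.get("text", ""), end="")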

Related

Running Jupyter notebook (and generating plots) from the command line

I'm trying to use the terminal to run a Jupyter notebook (kernel: Julia v1.6.2), which contains plots generated using Plots.jl, before uploading the notebook to GitHub for viewing on nbviewer.com.
Following this question:
How to run an .ipynb Jupyter Notebook from terminal?
I have been using nbconvert as follows:
jupyter nbconvert --execute --to notebook --inplace <notebook>
This runs the notebook (if you tweak the timeout limits); however, it does not display plots made with Plots.jl, even when I explicitly call display(plot()) at the end of a cell.
Does anyone have any idea how notebooks can be run remotely in such a manner that plots will be generated and displayed, particularly when using Julia?
I managed to get Plots.jl plots generated by asking IJulia for the same configuration it uses to run notebooks (this is probably the surest way when you have many Pythons, etc.).
using Conda, IJulia
Conda.add("nbconvert") # I made sure nbconvert is installed
mycmd = IJulia.find_jupyter_subcommand("nbconvert")
append!(mycmd.exec, ["--ExecutePreprocessor.timeout=600","--to", "notebook" ,"--execute", "note1.ipynb"])
Now mycmd has exactly the same environment as seen by IJulia, so we can do run(mycmd):
julia> run(mycmd)
[NbConvertApp] Converting notebook note1.ipynb to notebook
Starting kernel event loops.
[NbConvertApp] Writing 23722 bytes to note1.nbconvert.ipynb
The output is saved to note1.nbconvert.ipynb; I opened it with nteract to confirm that the graphs actually got generated.
Alternatively: launch the notebook with using IJulia and notebook() in the REPL.

Invoke several Jupyter Notebooks sequentially

Our team has developed Python scripts to process data in 10 separate but functionally related Jupyter notebooks, i.e. the output of notebook 1 is used as input for notebook 2, and so on.
Our next step is to automate this data processing. Are there any ways to invoke the Jupyter notebooks sequentially?
nbconvert allows you to run notebooks. To run a notebook and replace the existing output with the new one, you can use:
jupyter nbconvert --execute --to notebook --inplace <notebook>
For more options and different approaches, you can have a look at this.
You can create a script with the above command for each notebook, and this should execute the notebooks in sequential order.
Script:
jupyter nbconvert --execute --to notebook --inplace <notebook1>
jupyter nbconvert --execute --to notebook --inplace <notebook2>
Run the script.
An alternative way is to arrange the file names in alphabetical order, and then in the terminal run
jupyter nbconvert --inplace --execute *.ipynb
Note that --inplace overwrites each notebook in place (same file) after execution.
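If you'd rather drive the same alphabetical-order trick from Python instead of the shell, a rough equivalent looks like this (a sketch; without kernel_name, each notebook runs on the kernel recorded in its own metadata):

import glob
import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

for path in sorted(glob.glob("*.ipynb")):
    nb = nbformat.read(path, as_version=4)
    ExecutePreprocessor(timeout=600).preprocess(nb, {"metadata": {"path": "."}})
    nbformat.write(nb, path)  # the --inplace equivalent: overwrite the source file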

Can I run a jupyter notebook with an R kernel from a jupyter notebook with a python kernel?

I have some R code to update a database, stored in update_db.ipynb. When I try to %run update_db.ipynb from a Jupyter notebook with a Python kernel, I get an error:
File "<ipython-input-8-815efb9473c5>", line 14
city_weather <- function(start,end,airports){
^
SyntaxError: invalid syntax
Looks like it thinks that update_db.ipynb is written in python. Can I specify which kernel to use when I use %run?
Your error is not due to the kernel selected. Your %run command is made to run Python only, and it has to be a script, not a notebook. You can check the details of the IPython magic commands.
For your use case I would suggest installing both the Python and R kernels in Jupyter. Then you can use the cell magic %%R to run a cell on R inside the Python notebook. Source: this great article on Jupyter, tip 19.
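For reference, the %%R route requires the rpy2 extension to be loaded first. A minimal sketch, assuming rpy2 and R are installed -- first, in one cell:

%load_ext rpy2.ipython

Then, in a separate cell (%%R must be the first line of the cell):

%%R
x <- c(1, 2, 3)
mean(x)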
Another solution is to put your R code in an R script and then execute it from a Jupyter notebook. For this you can run a shell command from the notebook that executes the script:
!Rscript path/to/script.r

How to update output file after executing each cell in notebook with nbconvert

I'm using the following command to execute "file1.ipynb" and write the output to "file2.ipynb".
jupyter nbconvert file1.ipynb --to notebook --execute --output file2.ipynb
It appears "file2.ipynb" is created only after the whole notebook has been executed. However, I want to see the output of already-executed cells halfway through the run (as opposed to seeing it only at the very end).
Is there a way to update the output file after executing each cell?
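There is no built-in flag for this as far as I know, but one possible workaround (an untested sketch) is to subclass ExecutePreprocessor and write the notebook back to disk after every cell. The IncrementalWriter name is hypothetical, and this relies on the preprocessor exposing the in-progress notebook as self.nb, which recent nbconvert versions do:

import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

class IncrementalWriter(ExecutePreprocessor):
    """Hypothetical helper: snapshots the notebook after each executed cell."""

    def __init__(self, out_path, **kwargs):
        super().__init__(**kwargs)
        self.out_path = out_path

    def preprocess_cell(self, cell, resources, index):
        cell, resources = super().preprocess_cell(cell, resources, index)
        nbformat.write(self.nb, self.out_path)  # flush progress so far
        return cell, resources

nb = nbformat.read("file1.ipynb", as_version=4)
IncrementalWriter("file2.ipynb", timeout=600).preprocess(nb, {"metadata": {"path": "."}})
nbformat.write(nb, "file2.ipynb")  # write the final state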

Save ipython notebook as script programmatically

The excellent IPython notebook has a handy --script command-line flag that automatically saves a copy of the notebook as a .py script file (removing any header and markdown cells). Is there a way to switch this feature on from inside the notebook itself after the notebook is opened? Apparently, this option is not accessible to the %config magic.
Is there a way to have a cell that does this conversion? Is there any command-line tool I could use to do the conversion, so that I could just run it as a shell command from the notebook? (It seems that nbconvert does not output to .py.)
The reason I ask is that I have a git repository of notebooks, and I need to make sure the .py files are kept up to date when users change the notebooks themselves because the .py files are used to create c++ code from the contents of the notebooks. But I can't rely on users to set the --script flag because they'll always forget. (And I include myself in that group of users.)
Better yet (at least for my purposes): ipython respects local copies of the ipython_notebook_config.py file. So I can just add
c = get_config()
c.NotebookManager.save_script = True
to such a file in my notebook directory. Apparently, ipython first reads ~/.ipython/profile_default/ipython_notebook_config.py, and then reads the local copy of that file. So it's safe to use without worrying about demolishing the user settings.
This was not at all clear to me from the documentation, but I just tried it and it worked.
Oh. My mistake. nbconvert can handle conversions to script. So I can do something like this:
!ipython nbconvert --to python MyNB.ipynb
Of course, this line will get saved to the script, which means the script will try to re-save the notebook to itself every time it's executed. That's a bit circular, and I can imagine it could cause problems with some of my more outlandish hacks. Instead, we can ensure that it's only run from ipython by wrapping it as follows:
try:
    if __IPYTHON__:
        !ipython nbconvert --to python MyNB.ipynb
except NameError:
    pass
Note that the conversion process will automatically convert the ! syntax to something that is acceptable in plain Python. This is apparently not the case with the --script conversion. So the extra-safe way to do this is:
try:
    if __IPYTHON__:
        get_ipython().system(u'ipython nbconvert --to python MyNB.ipynb')
except NameError:
    pass
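For what it's worth, on current Jupyter versions the ipython nbconvert entry point is gone and the equivalent conversion is spelled with jupyter and --to script:

!jupyter nbconvert --to script MyNB.ipynb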
