How can I reduce the file size of my IPython notebook? - jupyter-notebook

I have an IPython notebook which is several megabytes big although the code inside is just about 100 lines. I think it is that huge because I load several images inside.
I would like to add this notebook to a git repository. However, I don't want to upload something that big which can easily be generated again.
Is it possible to save just the code of an IPython notebook to reduce its size?

You can try the following steps, which worked for me:
Select "Cell" -> then "All Outputs" -> and there choose the "Clear" option.
Then save the file.
This will reduce the size of your file (from MBs to kBs). It will also reduce the time needed to load the notebook the next time you open it in your browser.
As I understand it, this clears all the output created by executing the code. The notebook file holds the output in addition to the code, images, and comments, and that output is what increases the size of the notebook.
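The same clearing can be sketched outside the browser, since an .ipynb file is JSON. This is a minimal sketch, assuming the nbformat 4 layout (a top-level "cells" list in which code cells carry "outputs" and "execution_count"); the function names are made up for illustration and are not part of Jupyter.

```python
import json


def clear_outputs(notebook: dict) -> dict:
    """Remove the outputs from every code cell of a notebook dict (in place)."""
    # Assumption: nbformat 4, where cells live in a top-level "cells" list.
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_outputs_in_file(src_path: str, dst_path: str) -> None:
    """Read a notebook, clear its outputs, and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as f:
        notebook = json.load(f)
    clear_outputs(notebook)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1)
        f.write("\n")
```

Writing to a separate destination file keeps the original notebook untouched until you have checked that the result opens correctly.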

I ran into the exact same problem with one of my notebooks, which I solved by changing my df to df.head(5). I did this instead of clearing all outputs because I still wanted to show on GitHub how my code changed the data inside the columns of my df.
You can also run !ls -lh in the last cell of your notebook to check the size of your notebook before saving. This will give you an idea of whether you need to clear outputs, replace df with df.head(), or remove images in order to reduce the size and be able to save it on GitHub.
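To see which cells are responsible for the size, something like the following may help. It is only a sketch, assuming the nbformat 4 JSON layout; the function names are hypothetical. Note that it reads the file on disk, so it reflects the last saved state of the notebook.

```python
import json
import os


def output_sizes(notebook: dict) -> list:
    """Return (cell_index, bytes) for each code cell's outputs, largest first."""
    sizes = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        # Approximate the stored size by re-serializing the outputs as JSON.
        size = len(json.dumps(cell.get("outputs", [])).encode("utf-8"))
        sizes.append((index, size))
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)


def report(path: str, top: int = 5) -> None:
    """Print the file size and the cells with the largest outputs."""
    print(f"{path}: {os.path.getsize(path) / 1024:.1f} KiB on disk")
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    for index, size in output_sizes(notebook)[:top]:
        print(f"  cell {index}: about {size / 1024:.1f} KiB of output")
```

Calling report with the path of your saved notebook lists the heaviest cells, which are the candidates for df.head() or for removing images.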

You can generate a simple script linked to the notebook with jupytext, which others can rerun.
If you need to keep the images in the notebook (because, for example, you are sharing it with someone who does not want to or cannot rerun it), you might want to try reducing the size of the images.
I found this module, ipynbcompress, which seems to do exactly this, but so far I have not been able to install it.

Related

gnuplot - How can I save a graphics file of a plot that is the same as I designed it in xterminal?

I have been making plots for some time now, and they are precisely the way I like them, on screen. The data is coming in from sensors related to solar power collection and storage.
Plotted on screen they look great so I do a screen region capture to save them.
So now I would like to automate the saving process.
Here is what I have done so far:
I set up a cron job so the plots are generated right at midnight, capturing the whole day and saving it as a .png file.
Then it moves the "today.dat" data file to the archive named by date.
This part is all working as designed.
EXCEPT, by using .PNG the images do not look the same.
I really thought png would be the best option, but it turns out that the font used for the X-axis (HH:MM ticks) is too thick and they run together. It looks like a crayon-drawn version of my plot designs.
Can someone please give me some guidance on how best to programmatically generate the plots for saving so that they look the way I designed them?
As pointed out in the comments above, the best way is probably to use a different terminal for output to an image file, and simply ignore the fact that the generated images are not identical to what you see on your screen when using the x11 terminal. However, if you really need an exact copy, there are (at least) two options:
You could automate the process of taking a screenshot. You can even do this from within gnuplot, where it might come in handy that the GPVAL_TERM_WINDOWID variable contains the X window ID of the current plot window. You can use that to take a screenshot of the window after you have made the plot:
system(sprintf("xwd -id 0x%x | convert xwd:- screenshot.png", GPVAL_TERM_WINDOWID))
Here I included a call to convert to convert the xwd file format to png.
Another option is to use the xlib terminal, which saves the sequence of commands that the gnuplot_x11 helper application turns into the window you see on the screen. For example,
set term push; set term xlib; set output "file.xlib"; replot; set output; set term pop
will create the file file.xlib that has all the information of the last plot. To later view this plot, use
gnuplot_x11 -noevents -persist < file.xlib
where you might have to specify the path to gnuplot_x11.
Similar to what @user8153 suggested for x11, you can use import, which, like convert, is an ImageMagick tool:
system("import -window ".GPVAL_TERM_WINDOWID." screenshot.png")
Also convenient is a shortcut that copies the image to the clipboard so you can paste it elsewhere with Ctrl+V:
bind Ctrl-c 'system("import -window ".GPVAL_TERM_WINDOWID." png:- | xclip -sel clip -t image/png")'
See also Show graph on display and save it to file simultaneously in gnuplot.

Saving .R script File Using Script

I am using R Studio and I want to save my script (i.e., the upper left panel). However, the only ways that I can find to do it are by either clicking the blue floppy disk icon to save or using the drop down menu File > Save > name.R
Is there any way besides using these shortcuts to save the script to a .R file or is the shortcut the only way?
Thanks.
You can use rstudioapi::documentSave() to save the currently open script file to disk.
From the source documentation, one can see that it can be used in conjunction with the id returned with getActiveDocumentContext()$id to make sure the document saved is the one running the script.
For your intended use, try:
rstudioapi::documentSave(rstudioapi::getActiveDocumentContext()$id)
For future reference, here is the reference manual of rstudioapi:
https://cran.rstudio.com/web/packages/rstudioapi/rstudioapi.pdf
I'm not yet allowed to comment, but this refers to the comment above that this does not work with .rmd files:
rstudioapi::documentSave(rstudioapi::getActiveDocumentContext()$id)
I tried it, and in RStudio version 1.2.5042 it does seem to work.
Every new tab in RStudio (created with Ctrl+Shift+N) can be saved independently by pressing Ctrl+S in the respective tab. If you intend to rename the file, though, you can do it as you would rename any file in Windows (go to the file location and single-click on the filename). Hope my answer was of some help!

Loading (Large) R Script File

Good Morning.
I am new to R.
I need to load a 4.5 MB R script file using RStudio (ver. 3.0.2). Unfortunately, it returns an error, as in the image below.
Apparently the maximum script file size is 2 MB.
Is there a way to load what RStudio considers a large script file without dividing it into 3 different script files?
I looked for a place in the global settings that sets the maximum script file size to 2 MB, but did not find such a parameter there.
Hope you can guide.
Thanks.
Yes, I can open it in RGui and execute it. I have since reworked the R script that was given to me. Many lines were repeated with different combinations of options, so I now use variables <- c('Option1', 'Option2', ...) and have a dynamic script with fewer than 300 lines!

How to load a notebook without the outputs?

I mistakenly printed too much to the output during a single cell's execution, and now the browser tab completely freezes every time that notebook is opened. I tried restarting IPython and it didn't help (I am guessing that each time the notebook is loaded, the whole chunk of text is loaded with it).
Is there a way to load a notebook with outputs suspended or clear?
One hack if you're desperate: open the .ipynb file, which is a text file. Scroll down to the lengthy cell output and delete it. Of course, you need to be careful that the result is still a valid .ipynb file.
nbstripout is a simple tool that removes all output from a notebook (without needing to open the notebook in your browser).
Your notebook is saved in the form of JSON. Open it with a JSON viewer, carefully delete the unwanted cell output, and save it back.
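If hand-editing the JSON feels risky, the deletion can be sketched in a few lines of Python that remove only the oversized output and leave the rest alone. This assumes the nbformat 4 layout; the function name and the 100 kB default threshold are arbitrary choices for illustration.

```python
import json


def drop_large_outputs(notebook: dict, max_bytes: int = 100_000) -> int:
    """Empty the outputs of code cells whose serialized outputs exceed max_bytes.

    Returns the number of cells that were cleared.
    """
    cleared = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        size = len(json.dumps(cell.get("outputs", [])).encode("utf-8"))
        if size > max_bytes:
            cell["outputs"] = []
            cleared += 1
    return cleared


def repair_file(src_path: str, dst_path: str, max_bytes: int = 100_000) -> int:
    """Load a notebook, drop oversized outputs, and write a repaired copy."""
    with open(src_path, encoding="utf-8") as f:
        notebook = json.load(f)
    cleared = drop_large_outputs(notebook, max_bytes)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1)
    return cleared
```

Because the file is parsed and re-serialized with the json module, the result stays valid JSON, which is the main hazard when deleting text by hand.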

R Studio/R Accessing Blocks of Previously Run Code

Is there a way to display past commands in R/RStudio? I know that in RStudio there is a shortcut (Ctrl+Up Arrow) that lets you see past lines you have run. But this shortcut gives access to only a single line, not a block of previously run code. Is there a package, or some other way in R, to display and select blocks of past code in R/RStudio?
Here you can find a description of the panes in R Studio, including the "history" pane.
There you can select several lines and paste them into the code.
Also, you can use the command savehistory() to save your history in a file that you can then modify. If you want to choose the name of the file to be saved, use
savehistory(file = filename)
The same option is available in the basic R GUI (MS Windows), with "Save history" in the "File" menu (as a .Rhistory file). Then you can open it with any text editor and edit your history into a script.
To see a specific number of lines, you can use history(25) (for 25 previous lines).
