How to reset Jupyter notebook metadata but keep the content, so it is easier to manage with Git

Is it possible to reset a Jupyter notebook's run information and keep only its content?
Every time I run a notebook, Git thinks the file has changed. I don't remember whether I changed the notebook's content (sometimes I keep a notebook open for days), so I can't just check out the file from Git history. If I commit the notebook to the Git server regardless of whether I made a "real" change, my Git log becomes very messy.
Some execution information is not kept in the .ipynb_checkpoints directory:
for example:
Another really messy part is the output of cells.

You could add the following line to your .gitignore to simply make Git ignore the Jupyter Notebook checkpoint files:
.ipynb_checkpoints
If you don't already have a .gitignore file, this is a file that tells Git which files it can safely ignore. You can create it by simply making a file named ".gitignore", then adding the line above to the file.
If you're using Windows, this is a little harder than it should be, since the ".gitignore" file doesn't technically have a file name (only a file extension). Here's how you can do it anyway.
Also note that if you have already added any .ipynb_checkpoints files to your Git repo, you need to manually remove them before this will work. The .gitignore file does not work on files that are already tracked.
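Ignoring checkpoints does not touch the run information stored inside the .ipynb file itself. As a minimal sketch, assuming the notebook is nbformat 4 JSON (where code cells carry "outputs" and "execution_count"), that information could be cleared before committing. The function names strip_run_info and strip_file are hypothetical helpers, not part of Jupyter:

```python
import json
from pathlib import Path


def strip_run_info(notebook: dict) -> dict:
    """Clear outputs and execution counts in place; cell sources are untouched."""
    for cell in notebook.get("cells", []):
        # Assumption: nbformat 4, where only code cells carry run information.
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def strip_file(path: str) -> None:
    """Rewrite the notebook at `path` without its run information."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    strip_run_info(notebook)
    nb_path.write_text(json.dumps(notebook, indent=1) + "\n", encoding="utf-8")
```

Calling strip_file("analysis.ipynb") (a placeholder file name) before git add should leave mostly real content changes in the diff.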

Related

Different local and remote organisation R Project and GitHub

I want to version control my R scripts so I've created an R project and a GitHub repo. My scripts are scattered through several directories within the same directory where the R project is.
I would like my GitHub repository to hold only the scripts, independently of the folders they are stored in locally. However, when I run the commands below:
git add folder/file.R
git commit -m "my_message"
git push -u origin master
A directory named folder is created containing file.R, but I'd like to see just file.R without the folder. Do you know how I can do this? Also, would it be good practice? My local folders are organized so that each directory contains its own scripts and results, which is why the scripts are separated.
Thank you very much
Is there a way to add the file.R without specifying the path?
Not using git add, no. The design constraint for git add is that it should store the file's name exactly as it appears, including the forward slashes, so if the file's name is folder/file.R, that's the file's name.
You have some options here though:
You can make a parallel directory where you put the files with the names you want them to have. Run git init in that directory, copy the folder/file.R file to file.R in that directory. Then cd ../gitdir or whatever is appropriate to get there, and git add file.R.
This method is probably the best because it's the simplest.
You can write your own programs using git hash-object -w and git update-index, which are two of Git's plumbing commands. A plumbing command, in Git, is basically a command that exists so that you can build user-facing commands: they're not meant to be run by humans but rather by other programs. So you write a program (in whatever language you like) that uses these plumbing programs to achieve whatever you want.
In particular, you can create or find a Git blob object holding the contents of file.R as read from anywhere you like, then use git update-index to create an index entry holding whatever path you like and referring to the blob object you created (or found) with git hash-object with the -w flag.
Since Git is a suite of tools, not a solution, you can come up with your own method. The tools in Git are made with particular approaches in mind, but they are flexible enough to be repurposed.
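The plumbing approach could be sketched in Python roughly as follows. This is only an illustration under assumptions: the helper name stage_under_name is hypothetical, the file mode 100644 assumes a regular non-executable file, and git must be on the PATH:

```python
import subprocess
from pathlib import Path


def stage_under_name(repo: Path, source: Path, index_path: str) -> str:
    """Stage the contents of `source` in `repo`'s index under the name `index_path`."""
    # Write the file's contents into the object database as a blob;
    # git hash-object prints the blob's hash on stdout.
    blob = subprocess.run(
        ["git", "hash-object", "-w", str(source.resolve())],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout.strip()
    # Create (or replace) an index entry with the chosen path that refers to that blob.
    subprocess.run(
        ["git", "update-index", "--add", "--cacheinfo", f"100644,{blob},{index_path}"],
        cwd=repo, check=True,
    )
    return blob
```

For example, stage_under_name(Path("."), Path("folder/file.R"), "file.R") followed by git commit would record file.R at the top level. Note that the working tree still holds folder/file.R, so git status will show the mismatch.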

How to restore a deleted Jupyter notebook file

I accidentally deleted a Jupyter notebook file on my Google Cloud instance. I wonder if there's any way to restore/recover the file?
Thanks to this link, I found the solution. Files deleted in the browser should probably be in a Trash folder. In my case and on my Google Cloud instance, the deleted files were in the following path.
cd ~/.local/share/Trash/files/
Use ls to list the files and check whether your file is in this folder. If it is, simply use the mv command to move the deleted file to the path you want.
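The same check-and-move step could be scripted. The sketch below assumes the Trash location mentioned above; restore_from_trash is a hypothetical helper name and the default path may differ on other systems:

```python
import shutil
from pathlib import Path


def restore_from_trash(name: str, destination: str,
                       trash_dir: str = "~/.local/share/Trash/files") -> Path:
    """Move `name` out of the Trash folder to `destination` (a file or directory path)."""
    source = Path(trash_dir).expanduser() / name
    # Equivalent of checking the `ls` output before running `mv`.
    if not source.is_file():
        raise FileNotFoundError(f"{name} is not in {source.parent}")
    target = Path(destination).expanduser()
    # shutil.move returns the final path of the moved file.
    return Path(shutil.move(str(source), str(target)))
```

A call such as restore_from_trash("analysis.ipynb", "~/notebooks") uses placeholder names; substitute your own file and target directory.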

How to fix "warning: could not open directory" after "git add ." command on Mac OS X Maverick

I'm new to R and RStudio and am currently taking online classes to learn more about data science. In one of my lectures, I'm being asked to create a project in RStudio prior to creating a repository in github and linking the project with git. In order to make a pre-existing project interact with git, the instructions in my lecture are telling me to navigate to the directory containing my project file by using the "cd" command followed by the location of the file and file name. My project file is currently located on my desktop so I typed in "cd ~ /Desktop/temporary_no_version_control" however, the directory doesn't seem to change and remains set on the original location of the file which was in Users/savannahkeiffer. Just so I could complete the assignment, I re-located the file to my user file and tried to follow the rest of the instructions which told me to type "git init" followed by "git add ." which is where I run into the "warning: could not open directory" warning.
I have a macbook which runs on OS X Maverick. I went into my system preferences > security and privacy and selected Full Disk Access where I manually allowed terminal to have access to all the files on my laptop. However, after closing and re-opening RStudio and attempting the commands again, I got the same error.
This is what I entered when I tried to change the directory
Savannahs-MacBook-Air-2:~ savannahkeiffer$ cd ~ /Desktop/temporary_no_version_control
Savannahs-MacBook-Air-2:~ savannahkeiffer$ git init
Reinitialized existing Git repository in /Users/savannahkeiffer/.git/
And what I got when I changed the location of the project on my laptop in order to complete the assignment (after already giving access to terminal)
Savannahs-MacBook-Air-2:~ savannahkeiffer$ cd ~ /Users/savannahkeiffer/first project/temporary_no_version_control
Savannahs-MacBook-Air-2:~ savannahkeiffer$ git init
Reinitialized existing Git repository in /Users/savannahkeiffer/.git/
Savannahs-MacBook-Air-2:~ savannahkeiffer$ git add .
warning: could not open directory 'Pictures/Photos Library.photoslibrary/': Operation not permitted
warning: could not open directory 'Library/Application Support/MobileSync/': Operation not permitted
warning: could not open directory 'Library/Application Support/CallHistoryTransactions/': Operation not permitted
warning: could not open directory 'Library/Application Support/com.apple.TCC/': Operation not permitted
warning: could not open directory 'Library/Application Support/AddressBook/': Operation not permitted
And so on.. Is this a directory problem or a "git add ." command problem?
It looks like what happened is that when you typed the cd command, you left a space in between the tilde and the rest of the path, so you changed back into your home directory (represented by the tilde). Then, when you tried to do a git init, you tried to initialize your home directory as a Git repository, and then ran into the fact that macOS restricts some programs (in your case, not Terminal, but maybe still Git) from accessing certain directories.
In the shell, the tilde is just a fancy way of spelling the environment variable $HOME, which points to your home directory (in this case, /Users/savannahkeiffer), so it should immediately precede the rest of the path without a space in between.
The best thing to do in this case is switch into your project directory and then initialize a repository there:
cd ~/Desktop/temporary_no_version_control # note the lack of space after the tilde
git init
If you didn't intend for your home directory to be a repository (i.e., you're not storing your dotfiles in a repository there), then you will probably also want to remove the .git directory from your home directory by running rm -fr ~/.git. Be careful when typing this, as rm removes data without prompting and an unfortunate space could result in all your data being deleted.
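The effect of the stray space can be illustrated outside the shell. The sketch below uses Python's shlex module, which only approximates POSIX shell word splitting, and os.path.expanduser as a stand-in for the shell's tilde expansion:

```python
import os
import shlex

typed = "cd ~ /Desktop/temporary_no_version_control"      # space after the tilde
intended = "cd ~/Desktop/temporary_no_version_control"    # no space

typed_words = shlex.split(typed)
intended_words = shlex.split(intended)

# With the space, "~" becomes an argument of its own, i.e. the home directory.
print(typed_words)     # ['cd', '~', '/Desktop/temporary_no_version_control']
print(intended_words)  # ['cd', '~/Desktop/temporary_no_version_control']

# Without the space, the tilde is expanded as the prefix of a single path.
home_path = os.path.expanduser(intended_words[1])
print(home_path)
```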
Hello, I had this issue too, but on Windows. It was a simple fix: user error. I hadn't used Git Bash for a while, so I forgot how to work in it. My first mistake was running the git status command directly after opening Git Bash. That's when I got the "warning: could not open the directory" message. You need to use the cd (change directory) command and the dir (directory) command to navigate to the folder that has the files you want to "git add ." and "git commit -m". Once you are in that folder, you can use the "git status" command to see your changes and then proceed as normal. I had to post this because it took me hours to realize what I was doing wrong. No other Stack post pointed out this obvious user mistake. Hope it helps you.

How to recover deleted iPython Notebooks

I have iPython Notebook through Anaconda. I accidentally deleted an important notebook, and I can't seem to find it in trash (I don't think iPy Notebooks go to the trash).
Does anyone know how I can recover the notebook? I am using Mac OS X.
Thanks!
This is a bit of additional info on the answer by Thuener.
I did the following to recover my deleted .ipynb file:
The cache is in ~/.cache/chromium/Default/Cache/ (I use Chromium).
Use grep in binary search mode: grep -a 'import math' (replace the search string with a keyword specific to your code).
Edit the binary file in vim (it doesn't open in gedit).
The ipynb file should start with '{ "cells":' and end with '"nbformat": 4, "nbformat_minor": 2}'.
Remove everything outside these start and end points.
Rename the file to .ipynb and open it in your jupyter-notebook; it works.
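The manual trimming in vim could also be scripted. The following is a rough sketch, assuming the cached copy is stored uncompressed and that "cells" is the first key of the notebook object, as described above; extract_notebook is a hypothetical helper name:

```python
import json


def extract_notebook(raw: bytes) -> dict:
    """Find and parse the notebook JSON embedded in the bytes of a cache file."""
    # Undecodable bytes around the JSON are simply dropped.
    text = raw.decode("utf-8", errors="ignore")
    decoder = json.JSONDecoder()
    pos = text.find('"cells"')
    while pos != -1:
        # Assumption: the notebook object opens with the nearest "{" before "cells".
        start = text.rfind("{", 0, pos)
        if start != -1:
            try:
                # raw_decode parses one JSON value and ignores whatever follows it.
                obj, _ = decoder.raw_decode(text, start)
            except json.JSONDecodeError:
                obj = None
            if isinstance(obj, dict) and "cells" in obj and "nbformat" in obj:
                return obj
        pos = text.find('"cells"', pos + 1)
    raise ValueError("no notebook JSON found")
```

The returned dictionary can be written back out with json.dump to a file ending in .ipynb.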
The "delete" functionality now sends the file to OS trash rather than permanently deleting it, see this PR: https://github.com/jupyter/notebook/pull/1968. So you can just open your Trash (wherever that is on your system) and restore it.
I think the easiest way (until developers handle this issue) to retrieve your IPython history is to write it all into an empty file.
You need to check by the date you created your last script. Obviously, it is going to be in the last part of your IPython history.
To write your IPython history into a file:
%history -g -f anyfilename
On Linux:
I made the same mistake and finally found the deleted file in the trash:
/home/$USER/.local/share/Trash/files
If you deleted it through the OS (rm file.ipynb), then you can probably get it back from the .ipynb_checkpoints/ directory next to the notebook. However, if you deleted it from the browser menu option, it is gone (by design!).
See discussion here: https://github.com/jupyter/notebook/issues/405
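Recovering from a checkpoint could look like the sketch below. It assumes the usual Jupyter layout, in which the checkpoint for name.ipynb is saved as .ipynb_checkpoints/name-checkpoint.ipynb in the notebook's own directory; restore_checkpoint is a hypothetical helper name:

```python
import shutil
from pathlib import Path


def restore_checkpoint(notebook_path: str) -> Path:
    """Copy the last checkpoint of a deleted notebook back to `notebook_path`."""
    target = Path(notebook_path)
    # Assumed layout: <dir>/.ipynb_checkpoints/<stem>-checkpoint.ipynb
    checkpoint = target.parent / ".ipynb_checkpoints" / f"{target.stem}-checkpoint.ipynb"
    if not checkpoint.is_file():
        raise FileNotFoundError(f"no checkpoint at {checkpoint}")
    # Copy rather than move, so the checkpoint itself stays available.
    shutil.copy2(checkpoint, target)
    return target
```

The restored copy only contains the state at the last checkpoint, so later unsaved edits are not recovered.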
If you use PyCharm, you can do the following.
Open the Local History view.
Select the version you want to roll back to.
On the context menu of the selection, choose Revert.
Worked for me!
Source: here
For the unlucky ones like me who delete some files on JuliaBox (Jupyter for Julia), there is a solution. I successfully recovered all my deleted files.
Browsers store cache information about the pages you visit. You have to find your browser's cache folder (in Ubuntu with Chrome it was ~/.cache/google-chrome/Default/Cache) and grep for some text from your notebook in the binaries. Then cut out the text part of the file that corresponds to your ipynb.
https://groups.google.com/forum/#!searchin/julia-box/delete%7Csort:relevance/julia-box/Rt9LG9RldrU/3s_vVSrivJEJ
If you're using windows, it sends it to the recycle bin, thankfully. Clearly, it's a good idea to make checkpoints.
As long as your Kernel is active, the code of each executed cell is stored in input history list. This will come in handy when you accidentally deleted a cell and want to retrieve its content.
_ih[-10:]  # code of the 10 most recently run cells (even if those cells are deleted now)
If you are running JupyterLab on Linux like me, here is what I did: I opened a command prompt and went to my trash folder.
Trash directories on linux are typically
/home/$USER/.local/share/Trash
or
If you deleted something as root (e.g. deleted a file using Nautilus invoked via gksu), it is at
/root/.local/share/Trash
I ended up changing directories to /home/$USER/.local/share/Trash/files and my deleted notebook was there! Depending on how you access your backend, you could also try /home/jupyter/.local/share/Trash/
P.S.
If you are having issues changing directories from Trash to files due to permissions, don't forget to become root:
sudo -i
then after sudo -i, go up with:
cd ..
and then
cd home/jupyter/.local/share/Trash
cd files
Best of luck,
Sadly, my file was neither in the checkpoints directory nor in Chromium's cache. Fortunately, I had an ext4-formatted file system and was able to recover my file using extundelete:
Figure out the drive your deleted file was stored on:
df /your/deleted/file/directory/
Switch to a folder located on another drive that you have write access to:
cd /your/alternate/location/
It is preferable to run extundelete on an unmounted partition. Thus, if your deleted file wasn't stored on the same drive as your operating system, it's recommended that you unmount the partition of the deleted file (though you may want to ensure extundelete is already installed before proceeding):
sudo umount /dev/sdax
where sdax is the partition returned by your df command earlier.
Use extundelete to restore your file:
sudo extundelete --restore-file /your/deleted/file/directory/deleted.file /dev/sdax
If successful, your recovered file will be located at:
/your/alternate/location/your/deleted/file/directory/deleted.file
I had the very problem and I ended up solving it this way. It might be the case for some of the folks.

Git Archive but first put all the files inside a folder then start archiving

As the title suggests, I want to know if there is a single git command that first puts my whole project into one folder (not including .gitignored files) and then archives that folder, so that the ignored files are left out of the archive.
This would be useful for me because I am working on a WordPress plugin with multiple releases. Some references.
I want all the files (minus the .gitignored files) moved to a folder first, and then that folder archived
It is possible in one command provided you define an alias but this isn't git-related:
you can:
clone your repo elsewhere (that way you don't get any ignored or private file)
move your files as you see fit in that local clone
archive (tar cpvf yourArchive.tar yourFolder)
But git archive alone won't help you move those files, which is why I would recommend a script with custom bash commands (not git commands).
You don't really need to copy / clone the repo anywhere.
Make sure you committed all your changes.
Process the files any way you want.
Run tar -cvjf dist/archive-name.tbz2 --transform='s,^,archive-name/,' $(git ls-tree --full-tree -r --name-only --full-name HEAD)
Run git reset --hard to restore without any of the changes you made in step #2.
Hints:
The --transform='s,^,archive-name/,' is so your files will be extracted to archive-name/...; you can remove it if you don't need that.
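The effect of the --transform option can also be reproduced with Python's tarfile module. This is only a sketch under assumptions: archive_with_prefix is a hypothetical helper, and the list of tracked files is passed in by the caller (it could come from the git ls-tree command shown above):

```python
import tarfile
from pathlib import Path


def archive_with_prefix(repo_dir: str, tracked_files: list[str],
                        archive_path: str, prefix: str) -> None:
    """Write the given files to a .tbz2 archive, each stored under `prefix`/."""
    root = Path(repo_dir)
    with tarfile.open(archive_path, "w:bz2") as tar:
        for name in tracked_files:
            # arcname plays the role of tar's --transform='s,^,prefix/,'.
            tar.add(str(root / name), arcname=f"{prefix}/{name}")
```

Because only the listed files are added, anything ignored by Git stays out of the archive as long as the list contains tracked files only.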

Resources