Jupyter notebook module not found - jupyter-notebook

I forked a GitHub repo and added a module in a new branch.
Now I want to show my changes by adding a Jupyter notebook.
When I run Jupyter, though, the newly created module cannot be imported / found.
I am wondering what I am doing wrong.
Thanks in advance for any advice,
cheers,
Michael

Generally, there are two options. Either you keep the code in a .py file and import it via the import statement as usual, or you copy the contents into cells in the notebook itself and run those cells to make the classes available. The latter approach is either a workaround or a good way to present everything you have done nicely on the same page. But generally, if the paths to your import files are correctly specified, imports should work as usual.
I would start by checking whether the files are actually all in the folders where they are supposed to be, and make sure that all required files (containing the import contents) are present on the same branch on which you also want to store and run your notebook.
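If the new module lives in a folder the notebook does not see by default, appending that folder to sys.path in a first cell usually fixes it. A self-contained sketch (the src/ folder and mymodule name are made up for illustration; point the path at your own repo layout):

```python
import pathlib
import sys

# Made-up layout for illustration: the new module sits in a src/ folder.
src_dir = pathlib.Path("src").resolve()
src_dir.mkdir(exist_ok=True)
(src_dir / "mymodule.py").write_text("def greet():\n    return 'hello'\n")

# Make the folder importable from the notebook, then import as usual.
if str(src_dir) not in sys.path:
    sys.path.append(str(src_dir))

import mymodule
print(mymodule.greet())  # prints "hello"
```

In a real repo you would skip the file-writing lines and just append the existing folder; also double-check that Jupyter was started on the branch that actually contains the module.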

Related

Google Colab does not import modules

I was trying to find a way to install modules permanently. I came across this post, which teaches how to install packages on Google Drive, mount the drive, and then use "sys.path.append" to tell Python where to look for the new packages.
This method works as expected when a module is imported directly in the notebook itself.
However, when I tried to run a project that I already had by executing its .py file (using "!python myCode.py"), the path of the modules installed on Google Drive is not appended for that script.
In short, with the approach in the link above you can only import the packages when you code directly in the notebook itself; it did not work when I tried to use it with my .py files, i.e. with "!python myCode.py".
Any suggestions on how to solve this problem? Does anyone have the same problem as well?
Thanks,
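One possible explanation, offered as a suggestion rather than a confirmed fix: sys.path.append only changes the notebook's own interpreter, while "!python myCode.py" starts a fresh Python process that does not inherit it. Passing the package directory through the PYTHONPATH environment variable reaches the child process too. A self-contained sketch (the pkgs/ directory and mymodule are stand-ins for your Google Drive install location and your real script):

```python
import os
import pathlib
import subprocess
import sys

# Stand-ins for illustration; on Colab this might be
# "/content/drive/MyDrive/python_packages" and your actual myCode.py.
pkg_dir = pathlib.Path("pkgs").resolve()
pkg_dir.mkdir(exist_ok=True)
(pkg_dir / "mymodule.py").write_text("VALUE = 42\n")
pathlib.Path("myCode.py").write_text("import mymodule\nprint(mymodule.VALUE)\n")

# Put the package directory on PYTHONPATH so the child interpreter sees it;
# a sys.path.append in the notebook would not carry over to the subprocess.
env = os.environ.copy()
env["PYTHONPATH"] = str(pkg_dir) + os.pathsep + env.get("PYTHONPATH", "")

result = subprocess.run([sys.executable, "myCode.py"],
                        env=env, capture_output=True, text=True, check=True)
print(result.stdout.strip())  # prints "42"
```

In a notebook cell the same idea can be written inline, e.g. !PYTHONPATH=/content/drive/MyDrive/python_packages python myCode.py.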

GoLand search functionality not indexing all files in project

I'm using GoLand version 2021.1 on a Mac, and recently I've noticed that when searching for files, symbols, or types (using Command+Shift+O), it doesn't search across my entire project. For example, if I have two files named A.txt in two different directories, the file search will only show one of them.
Has anyone encountered this?
It's a known issue; please see IDEA-266391. You can download the Toolbox App and install the nightly build of GoLand.
As another workaround, you can invalidate caches via File | Invalidate Caches.

Unable to use correct file paths in R/RStudio

Disclaimer: I am very new here.
I am trying to learn R via RStudio through a tutorial, and very early on I have encountered an extremely frustrating issue: when I try to use the read.table function, the program consistently resolves my files (written as "~/Desktop/R/FILENAME") to the path "C:/Users/Chris/Documents/Desktop/R/FILENAME". Note that the program is treating my Desktop folder as if it were inside my Documents folder, which is preventing me from reading any files. I have already set and re-set my working directory multiple times and even re-downloaded R and RStudio, and I still encounter this error.
When I enter the entire file path instead of using the "~" shortcut, the program is successfully able to access the files, but I don't want to have to type out the full file path every single time I need to access a file.
Does anyone know how to fix this issue? Is there any further internal issue with how my computer is viewing the desktop in relation to my other files?
I've attached a pic.
Best,
Chris L.
The ~ tells R to look in your default directory, which on Windows is your Documents folder; this is why you are getting this error. You can change the default directory in the RStudio settings or in your R profile. It just depends on how you want to set up your project. For example:
Put all the files in the working directory (getwd() will tell you the working directory for the project). Then you can call the files with just the filename, and you will get tab completion (awesome!). You can change the working directory with setwd(), but remember to use the full path, not just ~/XX. This might be the easiest for you if you want to minimise typing.
If you use a lot of scripts, or work on multiple computers or cross-platform, the above solution isn't quite as good. In this situation, you can keep all your files in a base directory, and then in your script use the file.path function to construct the paths:
base_dir <- "C:/Users/Chris/Desktop/R"  # no trailing slash; file.path adds the separator
read.table(file.path(base_dir, "FILENAME"))
I actually keep the base_dir assignment as a code snippet in RStudio, so I can easily insert it into scripts and know explicitly what is going on, as opposed to configuring it in RStudio or my R profile. There is a conditional in the code snippet which detects the platform and assigns the directory correctly.
When R reports "cannot open the connection", it means one of two things:
The file does not exist at that location. You can verify whether the file is there by pasting the full path echoed back in the error message into Windows File Explorer. Sometimes the error is as simple as an extra subdirectory. (This seems to be the problem with your current code: the Windows Desktop is never nested in Documents.)
If the file exists at the location, then R does not have permission to access the folder. This requires changing Windows folder permissions to grant R read and write permission to the folder.
On Windows, if you launch RStudio from the folder you consider the "project workspace home", then all path references can use the dot as "relative to the workspace home", e.g. "./data/inputfile.csv".

Modifying jupyter notebook in init code

Is it possible to modify the contents of a notebook in the notebook startup code? I want to run some init code and add "header" cells to every notebook on a machine based on the code, for instance grab the hash of the current head from a local git repo, or pull a file from S3 to the local file system.
I can put a bunch of scripts, either .py or .ipy in the ~/.ipython/profile_default/startup/ directory and I'd like to modify the notebook that is currently being opened using those scripts (or some other scripts if that's possible).
According to the docs the shell has already been setup when those scripts run, so I'm thinking there should be some way of accessing, at a minimum, the local path of the notebook that was opened. I could then use nbformat (github) to modify the contents.
Alternatively, I could use NotebookApp or ContentsManager to modify the running notebook, but I'm not exactly sure how to do that, and the notebook docs are pretty light on the actual API for those classes. This might not be possible, as the init code is executed in the kernel, which does not know what the front end is: the kernel could be connected to a console rather than a notebook, or to both a notebook and a console.
So
can I access the filename of the current notebook in a startup script?
should I rather be looking to modify the notebook cells through NotebookApp, FileContentsManager or some other internal class?
Related: there is an open issue for template files (https://github.com/jupyter/notebook/issues/332), but this is not what I'm looking for; the template files are static, whereas I need to modify the notebook based on the result of a computation.
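For the cell-editing half of the question (separately from discovering the notebook's own path, which remains the open part), an .ipynb file is plain JSON, so a header cell can be prepended with the standard library alone; nbformat offers a nicer API for the same operation. A sketch assuming the path is already known (example.ipynb and the GIT_HEAD value are made up for illustration):

```python
import json

nb_path = "example.ipynb"  # assumed known; finding it is the open question

# Create a minimal valid notebook for illustration.
nb = {"nbformat": 4, "nbformat_minor": 5, "metadata": {}, "cells": []}
with open(nb_path, "w") as f:
    json.dump(nb, f)

# Prepend a "header" code cell carrying a value computed at startup time.
header_cell = {
    "cell_type": "code",
    "metadata": {},
    "execution_count": None,
    "outputs": [],
    "source": ["GIT_HEAD = 'abc123'  # made-up stand-in for the real hash\n"],
}
with open(nb_path) as f:
    nb = json.load(f)
nb["cells"].insert(0, header_cell)
with open(nb_path, "w") as f:
    json.dump(nb, f)

with open(nb_path) as f:
    first = json.load(f)["cells"][0]["source"][0]
print(first.strip())
```

Note this only works when the file is edited while the front end does not have it open; editing a notebook the browser is currently displaying risks the change being overwritten on its next save.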

Set an .Rmd in a package to write files to the current project working directory

I have a .Rmd which I use to report on data quality in a number of different R projects. It then splits the data to remove subsets with missing data, and interpolates missing results where appropriate, writing the output via a write.csv command to a file path of the form "./Cleansed_data/".
To make an example:
open RStudio
go to the RHS 'Project' menu, and select and make a new project wherever you'd like
go to the LHS 'new script' drop-down and select 'new .Rmd'
change the output to PDF and hit OK
in the last R chunk, include write.csv(mtcars, file = "mtcars.csv")
hit the 'Knit PDF' button, save the report as "writeFile.Rmd" in your project working directory, and let it run
Previously I moved this .Rmd from place to place, however now I would like to built it into an internal package. I have included it (as the documentation indicates to) into inst/rmd within the package directory.
In order to do this, build or open any package you have access to:
add the file to inst/rmd (create it if this doesn't exist)
rebuild the package
I then rebuild the package and open a new project. I load my new package and attempt to run the document via the render command, using system.file to locate the .Rmd, like so:
rmarkdown::render(input = system.file("rmd/writeFile.Rmd", package = "MyPackage"),
                  output_file = "writeFile.pdf", output_dir = "./Cars/")
This will render the report from the package build into the folder given by output_dir; however, there are a number of pitfalls here. First, if I omit the output_dir argument, the report will render into the package library, usually located in the R installation's library folder on the C: drive. This is, however, fixable.
What I can't get around is that when the .Rmd hits write.csv(), the .Rmd is (I believe) being rendered in the package environment, whose working directory is the package library folder, not the current project directory.
The Questions
How can I inform the template in the package what the current working directory of the RStudio project is? I'm vaguely aware there is an rstudioapi package, but I have nearly no understanding of what it does, or whether it would provide a solution.
If this is either outright impossible or just potentially a very bad idea, how can I modify the workflow to retrieve a number of R object outputs into the environment or the working directory on the call to the report, without having to modify the report for each different project? Further, why specifically is this approach such a bad plan?
In order to close this off:
I have selected to keep the .Rmd included in the package. The .Rmd files need to move and be versioned with the package, as the package holds the functions they use to run.
In order to meet my requirements, I style the documents to grab the working directory via the rstudioapi package, in the form:
write.csv(mtcars, file = file.path(rstudioapi::getActiveProject(), "mtcars.csv"))
Having tested @CL's answer, this also runs and is not dependent on RStudio as an IDE; however, I know that these documents will
Always be accessed via the rstudio IDE
Always be accessed from within a specific project
I fear (though have not tested) that artificially booting the file into a different working directory could have other impacts. Potentially this could affect things like child documents I might want to include later, or other code that needs to be relative to the file path of the package installation, not the project. In this way I think (if I interpreted Yihui correctly) the R doc is still the centre of its own universe. It just writes its data into another one :)