Error while importing pandas in R via reticulate - r

I am using R and I want to use a function I wrote in Python which needs to import pandas. Hence, I use the following code in R:
library(reticulate)
reticulate::py_install("pandas", force = TRUE)
which runs with no issues. Also, I already installed pandas in Python. Nevertheless, when I run the script which imports pandas:
source("script_with_pandas.py")
I get the following error:
Error in py_run_file_impl(file, local, convert) :
ImportError: C extension: No module named 'pandas._libs.interval' not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --force' to build the C extensions first.
Any idea how to solve this?

Try
reticulate::source_python("script_with_pandas.py")
But I'm pretty sure this is an environment issue. If you're using RStudio v1.4 or later, go to Tools > Global Options > Python and check which interpreter is selected; that may be the problem. Beyond that, it should only be a matter of installing pandas into the environment that reticulate actually uses.
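To check the environment theory from the Python side, a small helper like the one below can be added to the script that R runs. This is only a sketch: describe_module is a made-up name, not part of reticulate or pandas. It reports which interpreter is executing and where a module would be loaded from, without importing the module.

```python
import importlib.util
import sys


def describe_module(name):
    """Report the running interpreter and where `name` would be imported from.

    `describe_module` is a hypothetical helper for illustration only.
    """
    # find_spec locates a top-level module without executing its code.
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
    }
```

Calling describe_module("pandas") from the session started by reticulate and comparing the interpreter path with the Python where pandas was installed should show whether the two environments differ. On the R side, reticulate::py_config() prints similar information.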

Related

PackageNotFoundError: bokeh after compiling with pyinstaller

I have a python project that imports bokeh, and I use pip install bokeh to install bokeh 3.0.3 version, and everything works fine. However, after I compile this project using pyinstaller, the executable file crashes at launch with the following error:
importlib.metadata.PackageNotFoundError: bokeh
Also the way I'm importing bokeh functionalities into my project is like this:
from bokeh.io import output_file, show
I have searched around, but haven't been able to find any useful clue on how to solve this issue. I would appreciate any help on this.
Python version: 3.8.13
Pyinstaller version: 5.7.0
I have also tried using conda to install bokeh, but it does not make any difference.
You may find this discussion useful for your case:
https://github.com/pyinstaller/pyinstaller/discussions/6033
In your case you need to pass the option --copy-metadata bokeh when running PyInstaller, or use the copy_metadata() function in your spec file.
The GitHub discussion goes into more detail.
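To confirm that missing metadata is really the cause, a check along these lines can be run both in the normal environment and inside the built executable. It is a minimal sketch using only the standard library; distribution_version is a hypothetical helper name.

```python
from importlib import metadata


def distribution_version(name):
    """Return the installed version string, or None if no metadata is found.

    `distribution_version` is a hypothetical helper for illustration only.
    """
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        # The same exception the frozen app raises for bokeh.
        return None
```

If distribution_version("bokeh") returns a version in the development environment but None inside the frozen app, the package metadata was presumably not bundled, which is what --copy-metadata is meant to address.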

Accessing Python functions from a large Git repository in R

My company has a large Git repository that is actively being developed (call it really-big-repo). I would like to use the reticulate package to call some of those python functions in R. Apologies in advance as I still try to get my arms around both python and git.
Single Python files can be called using reticulate::source_python("flights.py") (see here). However, the Python script I would like to source imports modules from other parts of the repository. For example, the file that I would like to source, great_python_functions.py, looks like this:
import datetime
import json
import re
import requests
from bs4 import BeautifulSoup
from SomeRepoDirectory import utils
from SomeRepoDirectory.entity.models import Entity, EntityAlias, EntityBase, Subsidiary
import SomeRepoDirectory.entity.wikipedia
from SomeRepoDirectory.io.es import es_h
...
To further complicate it, the repo is pretty large and this file is just a small part of it. I'm not sure it is wise to load ALL of the repo's functions into my R environment.
And one more bonus question: I would really like to access the functions that are on a branch of the repo, not master. I've cloned the repository to a different directory than my R project using git clone git@github.com:my-company-name/really-big-repo.git
Here are some steps you can try, but I have to say this is going to be complicated; learning Python might be easier :p
Since you have already cloned the repository:
cd ./cloned_repo
conda activate your_virtual_env
git checkout feature/branch
python setup.py develop # this will install the pkg in virtual env, you can also use install instead of develop
Now in R, use the virtual env in which you installed the repo. In my example I am using a conda env, so you can use reticulate::use_condaenv('your_virtual_env'), and then you should be able to use those functions.
Also, in my experience, intermingling Python and R has caused a lot of pain in production development, especially with package management, so I would advise some caution.
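If installing the repository into the environment is not an option, a different approach (not the one described in this answer) is to put the checkout on Python's import path before importing. The sketch below assumes the package directory sits directly under the repository root; import_from_checkout and the paths in the usage note are hypothetical.

```python
import importlib
import sys
from pathlib import Path


def import_from_checkout(repo_root, module_name):
    """Import `module_name` from a local checkout without installing it.

    Hypothetical helper: assumes the package lives directly under `repo_root`.
    """
    root = str(Path(repo_root).resolve())
    if root not in sys.path:
        # Put the checkout first so it takes precedence over other copies.
        sys.path.insert(0, root)
    # Make sure the import system notices the newly added directory.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

For example, import_from_checkout("/path/to/really-big-repo", "SomeRepoDirectory.utils") would load only that module and whatever it imports. Whichever branch is checked out in the clone is the code that gets imported. Third-party dependencies such as requests or bs4 still have to be installed in the environment separately.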

Need to restart runtime before import an installed package in Colab

I am trying to install and use an existing python package in Google Colab. For this, I download the code from Github in Colab and install the package, but when trying to import the installed package, I get a ModuleNotFoundError: No module named 'gem' Error.
However, if I restart the runtime and run the importing cell again, then no error appears.
I am wondering why I need to restart the runtime after installing the package and before importing.
Any clever response will be much appreciated.
My code is:
[1] !wget --show-progress --continue -O /content/gem.zip https://github.com/palash1992/GEM/archive/master.zip
[2] !unzip gem.zip
# Installing Dependencies
[3] ! pip install keras==2.0.2
[4] %cd GEM-master
!sudo python3 setup.py install
%cd -
[5] from gem.utils import graph_util, plot_util
And the error that I get is:
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-5-af270a37878a> in <module>()
1 import matplotlib.pyplot as plt
2
----> 3 from gem.utils import graph_util, plot_util
4 from gem.evaluation import visualize_embedding as viz
5 from gem.evaluation import evaluate_graph_reconstruction as gr
ModuleNotFoundError: No module named 'gem'
---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.
To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------
However, if I restart the runtime using os.kill(os.getpid(), 9) after installing the package and before importing it, then the above error does not appear.
It seems that anything other than a simple !pip install is not picked up by Colab's module registry until the runtime restarts. Likely, Colab has a fairly naive way of keeping track of available modules. You also have to restart the runtime if you want to import a different version of a previously installed package.
Probably they just have a script that appends the metadata for pip-installed packages to a list-like object during runtime, and imports just search from the top of the list (which is why the restart is required for different versions of packages).
However, when a new runtime is started, the list-like registry is initialized and populated by searching the relevant directories.
To force a restart:
try:
    from gem.utils import graph_util, plot_util
except (ImportError, KeyError, ModuleNotFoundError):
    ## code to install gem
    print('Stopping RUNTIME. Colaboratory will restart automatically. Please run again.')
    exit()
Based on multiple answers to Google Colab - How to 'restart runtime' using python code or command line interface?.
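Before forcing a restart, it may be worth retrying the import once after clearing Python's import caches. The sketch below does not depend on the guesses above about Colab's internals; ensure_importable and its install callback are hypothetical names, and the callback stands in for whatever install commands are needed.

```python
import importlib


def ensure_importable(module_name, install):
    """Try to import; on failure run `install()` and retry once.

    Returns the module, or None if a restart still seems necessary.
    `install` is a hypothetical zero-argument callback that performs the install.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        install()
        # Ask the import system to rescan directories it has already cached.
        importlib.invalidate_caches()
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Still not importable: fall back to restarting the runtime.
        return None
```

The retry may not be enough, for example if the install registers a new path entry that is only read when the interpreter starts. In that case the function returns None and the restart shown above is still required.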

AttributeError: type object 'pandas._libs.tslib._TSObject' has no attribute '__reduce_cython__'

I have created a tkinter app that uses a convolutional neural network to identify images. I am trying to compile the .py file with pyinstaller, but I am receiving this error:
AttributeError: type object 'pandas._libs.tslib._TSObject' has no attribute '__reduce_cython__'
I have also attached a screenshot of the error.
Just restart the kernel if you are using Jupyter Notebook. It worked for me. Also make sure you are working in the correct Python environment.
This problem happened to me when I tried to build an app from an unfinished project for testing purposes.
I had pandas imported, but I didn't use it anywhere in my script.
Removing the import line, or actually calling pandas, was the solution.
You need to have Cython too:
pip install cython or conda install cython
pyinstaller --onefile --hidden-import pandas._libs.tslibs.timedeltas program.py
I solved it using
pyinstaller --onefile --hidden-import pandas._libs.tslibs.timedeltas myScript.py
and it is working now
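When more than one hidden import turns out to be needed, the command line can be assembled in a small build script. This is just a sketch: pyinstaller_args is a hypothetical helper, and the only PyInstaller options it uses are the --onefile and --hidden-import flags shown above.

```python
def pyinstaller_args(script, hidden_imports, onefile=True):
    """Build the argument list for a PyInstaller run.

    `pyinstaller_args` is a hypothetical helper for illustration only.
    """
    args = [script]
    if onefile:
        args.append("--onefile")
    for name in hidden_imports:
        # One --hidden-import flag per module that the analysis misses.
        args.extend(["--hidden-import", name])
    return args
```

The result can then be passed to the command, for example with subprocess.run(["pyinstaller", *pyinstaller_args("myScript.py", ["pandas._libs.tslibs.timedeltas"])], check=True).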
The other solutions did not work for me, but this did:
add import pandas at the very top of conftest.py
add ignore:numpy\.ufunc size changed.+:RuntimeWarning to the filterwarnings option of pytest.ini
upgrade freezegun to the latest version

How to resolve No or missing module errors when converting to an executable using pyinstaller -Python

I am attempting to convert a python file to an executable file. Easy enough right?
I used pyinstaller on a simple program that doesn't import anything. It worked like a charm. Then, I tried it with another dummy program with imported modules, (PyQt4, sys, matplotlib) that my actual program would have. Here I encountered problems.
This error appeared when I ran the application in the 'dist' folder pyinstaller created.
Fatal Python error: Py_Initialize: unable to load the file system codec
ImportError: No module named 'encodings'
I found another site with a possible solution to this problem, but his situation wasn't exactly the same: http://code.activestate.com/lists/python-dev/118463/
This led me to try Qt Designer, which I had downloaded earlier. Perhaps if I could convert the .ui file it produced into a .py file, I would be fine. I could use his solution and all would be well.
That's when I got this error:
File "C:\Anaconda3\Lib\site-packages\PyQt4\uic\pyuic.py", line 26, in <module>
from PyQt4 import QtCore
ImportError: No module named 'PyQt4'
I should also probably mention that all the modules I have are through Anaconda 3
I thought installing pyqt in a conda... project? Would fix the problem. It didn't. To be honest I don't entirely know what those are for.
Now I'm entertaining the idea of just using the c++ files that QT designer makes instead of converting them and importing python to tell the gui what to do.
What do you guys think would solve the errors above?
Short solution / workaround:
In your Python file, import the missing module explicitly. In your case: import encodings.
Proper solution:
Importing every module explicitly means you might end up importing many modules and submodules. Instead, tell PyInstaller where to find the modules (e.g. using command-line flags).
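To get a checklist of the modules a script depends on, the source can be scanned statically with the standard ast module. This is a rough sketch, and top_level_imports is a hypothetical helper; it only sees ordinary import statements, so modules loaded dynamically (often the ones PyInstaller misses) will not show up.

```python
import ast


def top_level_imports(source):
    """Return the sorted top-level module names imported by `source`.

    `top_level_imports` is a hypothetical helper for illustration only.
    """
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped; they are local files.
            names.add(node.module.split(".")[0])
    return sorted(names)
```

Each name in the result can then be checked in the build environment and, if it goes missing from the executable, either imported explicitly or passed to PyInstaller as a hidden import.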
