Cannot run nbconvert command in Jupyter Notebook - jupyter-notebook

I am trying to run some code in Jupyter Notebook and then use nbconvert to export the generated notebooks to PDF, but it does not work.
For reference, my code looks like this (although I don't think the problem is directly related to my code):
# Import the necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
from nbconvert import PDFExporter
from nbformat import v4 as nbf

# Read the Excel file using Pandas
excel_file = pd.ExcelFile('survey_responses.xlsx')

# Use a loop to iterate over the sheets and generate pie charts and PDF files for each one
for sheet_name in excel_file.sheet_names:
    # Read the sheet using Pandas
    df = excel_file.parse(sheet_name)
    # Create a new Jupyter notebook for the current sheet
    notebook = nbf.new_notebook()
    # Add a text cell to the Jupyter notebook
    text = 'The following pie charts show the results of the survey for Class {}.'.format(sheet_name)
    notebook['cells'].append(nbf.new_markdown_cell(text))
    # Use the subplots function to create a figure with multiple subplots
    fig, ax = plt.subplots(nrows=len(df.columns), ncols=1, figsize=(8, 6))
    # Use a loop to iterate over the columns and generate a pie chart for each one
    for i, question in enumerate(df.columns[1:], start=1):
        responses = df[question].value_counts()
        # Add a code cell to the Jupyter notebook
        code = 'ax[{}].pie(responses, labels=responses.index)\nax[{}].set_title("{}")'.format(i, i, question)
        notebook['cells'].append(nbf.new_code_cell(code))
    # Use nbconvert to convert the Jupyter notebook to a PDF file
    exporter = PDFExporter()
    pdf, _ = exporter.from_notebook_node(notebook)
    with open('{}.pdf'.format(sheet_name), 'wb') as f:
        f.write(pdf)
Jupyter Notebook shows a pop-up window titled "Package Installation" with the following content:
The required file tex\latex\pgf\basiclayer\pgf.sty is missing. It is a part of the following package: pgf. The package will be installed from.................
I clicked Install, and then it showed:
[I 14:00:51.966 NotebookApp] Kernel started: 4529aa41-ab04-45c9-ab04-a723aafffe41, name: python3
[IPKernelApp] CRITICAL | x failed: xelatex notebook.tex -quiet
Sorry, but C:\Users\EconUser\anaconda3\Library\miktex\texmfs\install\miktex\bin\xelatex.exe did not succeed.
The log file hopefully contains the information to get MiKTeX going again:
C:/Users/EconUser/AppData/Local/MiKTeX/2.9/miktex/log/xelatex.log
You may want to visit the MiKTeX project page, if you need help.
[I 14:02:52.109 NotebookApp] Saving file at /(Step By Step) Export PDF with texts, codes, and multiple plots.ipynb
I searched the web and followed the method in this guide: https://github.com/microsoft/vscode-jupyter/issues/10910, but it did not work either.
I also tried to install pandoc and MiKTeX again from within Jupyter Notebook, at the beginning of the code:
!pip install --upgrade pip
!pip install pandoc
!pip install MikTex
it shows:
Requirement already satisfied: pip in c:\users\econuser\anaconda3\lib\site-packages (22.3.1)
Requirement already satisfied: pandoc in c:\users\econuser\anaconda3\lib\site-packages (2.3)
Requirement already satisfied: ply in c:\users\econuser\anaconda3\lib\site-packages (from pandoc) (3.11)
Requirement already satisfied: plumbum in c:\users\econuser\anaconda3\lib\site-packages (from pandoc) (1.8.0)
Requirement already satisfied: pywin32 in c:\users\econuser\anaconda3\lib\site-packages (from plumbum->pandoc) (227)
ERROR: Could not find a version that satisfies the requirement MikTex (from versions: none)
ERROR: No matching distribution found for MikTex
I have no idea what is going wrong. Is MiKTeX outdated, or is something else the cause?
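The log above shows nbconvert calling xelatex.exe, and the pip output shows that no distribution named MikTex exists on the package index. As general background (not something this thread confirms as the fix): MiKTeX is a standalone TeX distribution rather than a Python package, and the pandoc package installed by pip appears to be a Python wrapper rather than the pandoc executable. A minimal sketch for checking which external tools the notebook kernel can actually see on PATH follows; the helper name find_pdf_toolchain is hypothetical.

```python
import shutil


def find_pdf_toolchain(tools=("xelatex", "pandoc")):
    """Map each external tool name to its resolved path, or None if it is not on PATH.

    The tool names are assumptions based on the xelatex call in the log above.
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_pdf_toolchain().items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

If either entry is reported as not found, the executable is missing from the environment the kernel runs in, regardless of what pip reports.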

Related

Issue with staplr package in R: set_fields returns the error 'Error: All unnamed arguments must be length 1'

I have been trying to generate a pdf file in R using the staplr package. However I have been running into issues whilst trying to run the example code. I keep getting an error (All unnamed arguments must be length 1) when trying to use the set_fields function.
library(staplr)
pdfFile = system.file('testForm.pdf',package = 'staplr')
# And do
fields = get_fields(pdfFile)
# You'll get a list of fields that the pdf contains
# along with some additional information about the fields.
# You make modifications in any of the fields by
fields$TextField1$value = 'this is text'
set_fields(pdfFile, 'newFile.pdf', fields)
# This will create a filled pdf file
The get_fields function seems to work fine: when I use it in R, I can see that the fields variable is updated with the new information. The issue only occurs when the set_fields function is applied.
I installed java using the code provided and installed rJava (install.packages("rJava")):
sudo apt update -y
sudo apt install -y openjdk-8-jdk openjdk-8-jre
sudo R CMD javareconf
I first installed the package in R using:
install.packages('staplr', dependencies = TRUE)
I uninstalled this version after it produced this error and tried:
devtools::install_github("pridiltal/staplr")
However the same issue keeps arising.
As part of my troubleshooting I tried to install pdftools using:
sudo apt-get install libpoppler-cpp-dev
I'm at a loss as to what to try next so any insight would be appreciated. Thanks!
The set_fields function should output the pdf with the updated fields. As I am using the example code, I assumed there was no issue with its structure and that the problem is instead related to how the dependent packages were installed.

Bokeh server is not recognized

I am trying to run a simple app from Kevin Jolly's Hands-On Data Visualization with Bokeh (Packt).
# Import the required packages
from bokeh.layouts import widgetbox
from bokeh.models import Slider
from bokeh.io import curdoc
# Create a slider widget
slider_widget = Slider(start=0, end=100, step=10, title='Single Slider')
# Create a layout for the widget
slider_layout = widgetbox(slider_widget)
# Add the slider widget to the application
curdoc().add_root(slider_layout)
Then I tried to start the Bokeh server:
...\Python_Scripts\Sublime> bokeh serve --show bokeh.py
bokeh : The term 'bokeh' is not recognized as the name of a cmdlet, function, script file, or operable program.
bokeh info
Python version : 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)]
IPython version : 7.8.0
Tornado version : 6.0.3
Bokeh version : 1.3.4
BokehJS static path : C:\Users\k S\Anaconda3\lib\site-packages\bokeh\server\static
node.js version : (not installed)
npm version : (not installed)
A previous post about the same problem did not provide a working solution. Please help.
First I would strongly suggest renaming your file to something other than bokeh.py. Due to the way Python itself works, this can sometimes result in Python trying to load the wrong module.
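The shadowing effect described above can be reproduced in isolation. The sketch below is an illustration under stated assumptions, not a diagnosis of this particular setup: it uses the standard-library module colorsys as a stand-in for bokeh, a temporary directory as a stand-in for the folder that holds the script, and a hypothetical helper name.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def local_file_shadows(module_name):
    """Return True if a same-named .py file placed first on sys.path is imported
    instead of the installed module."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a script such as bokeh.py sitting in the working directory.
        (Path(tmp) / f"{module_name}.py").write_text("SHADOW = True\n")

        saved = sys.modules.pop(module_name, None)  # force a fresh import
        sys.path.insert(0, tmp)  # a script's own directory is searched first
        importlib.invalidate_caches()
        try:
            loaded = importlib.import_module(module_name)
            return bool(getattr(loaded, "SHADOW", False))
        finally:
            # Undo every change so the real module is importable again.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
            if saved is not None:
                sys.modules[module_name] = saved


if __name__ == "__main__":
    print(local_file_shadows("colorsys"))
```

The local file wins because the import system walks sys.path in order and stops at the first match.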
It's exceedingly strange that bokeh info could work but bokeh serve would not, since they are subcommands of literally the same program file. If renaming the script does not help, then you can always invoke the server using the Python -m command line option:
python -m bokeh serve --show app.py
If this does not work it can mean one thing only: the python executable you are running is a different Python environment than the one that you installed Bokeh into.
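One way to check the environment mismatch described above is to ask the interpreter itself where it lives and where it would load Bokeh from. This is a minimal sketch; the helper name describe_environment is hypothetical, and it only reports what the running interpreter sees.

```python
import importlib.util
import sys


def describe_environment(package="bokeh"):
    """Report the running interpreter and where it would import `package` from."""
    spec = importlib.util.find_spec(package)  # None if the package is not installed here
    return {
        "executable": sys.executable,
        "package": package,
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it with the same python executable used for python -m bokeh. A location of None means that interpreter has no Bokeh installed, and a location that points at your own script rather than site-packages suggests the file-name clash mentioned earlier.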

Reading csv files from microsoft Azure using R

I have recently started working with Databricks and Azure.
I have Microsoft Azure Storage Explorer. I ran a jar program on Databricks
which outputs many csv files, visible in Azure Storage Explorer, in the path
..../myfolder/subfolder/output/old/p/
What I usually do is go to the folder p, download all the csv files
to my local drive by right-clicking the p folder and clicking download,
and then read these csv files in R to do the analysis.
My issue is that sometimes my runs generate more than 10000 csv files,
and downloading them to the local drive takes a lot of time.
I wondered if there is a tutorial or R package that would let me read
the csv files from the path above without downloading them. For example,
is there any way I can set
..../myfolder/subfolder/output/old/p/
as my working directory and process all the files in the same way I do now?
EDIT:
the full url to the path looks something like this:
https://temp.blob.core.windows.net/myfolder/subfolder/output/old/p/
According to the official Azure Databricks document CSV Files, you can read a csv file directly in R inside an Azure Databricks notebook, as shown by the R example in the section Read CSV files notebook example.
Alternatively, I used the R package reticulate and the Python package azure-storage-blob to read a csv file directly from an Azure Blob Storage blob URL with a SAS token.
Here are my steps.
1. I created an R notebook in the Azure Databricks workspace.
2. Install the R package reticulate with install.packages("reticulate").
3. Install the Python package azure-storage-blob with the code below.
%sh
pip install azure-storage-blob
4. Run a Python script that generates a container-level SAS token and uses it to build a list of blob URLs with the SAS token; see the code below.
# Note: BaseBlobService and BlobPermissions belong to the legacy
# azure-storage-blob SDK (2.x); newer releases use a different API.
library(reticulate)
py_run_string("
from azure.storage.blob.baseblobservice import BaseBlobService
from azure.storage.blob import BlobPermissions
from datetime import datetime, timedelta

account_name = '<your storage account name>'
account_key = '<your storage account key>'
container_name = '<your container name>'

blob_service = BaseBlobService(
    account_name=account_name,
    account_key=account_key
)

# Container-level SAS token, read-only, valid for one hour
sas_token = blob_service.generate_container_shared_access_signature(container_name, permission=BlobPermissions.READ, expiry=datetime.utcnow() + timedelta(hours=1))
blob_names = blob_service.list_blob_names(container_name, prefix = 'myfolder/')
blob_urls_with_sas = ['https://'+account_name+'.blob.core.windows.net/'+container_name+'/'+blob_name+'?'+sas_token for blob_name in blob_names]
")
blob_urls_with_sas <- py$blob_urls_with_sas
5. Now I can read a csv file from a blob URL with the SAS token in several ways in R, for example:
5.1. df <- read.csv(blob_urls_with_sas[[1]])
5.2. Using R package data.table
install.packages("data.table")
library(data.table)
df <- fread(blob_urls_with_sas[[1]])
5.3. Using R package readr
install.packages("readr")
library(readr)
df <- read_csv(blob_urls_with_sas[[1]])
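The URL construction inside the py_run_string call can also be isolated as a small pure function, which makes it easy to check without touching Azure. This is a sketch under assumptions: the function name build_blob_urls is hypothetical, the account, container, and token values in the usage example are placeholders, and the percent-encoding of blob names is an addition that the original list comprehension does not perform.

```python
from urllib.parse import quote


def build_blob_urls(account_name, container_name, blob_names, sas_token):
    """Build one SAS-authenticated URL per blob name.

    Mirrors the string concatenation used in the answer; blob names are
    percent-encoded (keeping '/') so names with spaces still form valid URLs.
    """
    base = f"https://{account_name}.blob.core.windows.net/{container_name}"
    return [f"{base}/{quote(name, safe='/')}?{sas_token}" for name in blob_names]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    urls = build_blob_urls("myaccount", "mycontainer", ["myfolder/part 1.csv"], "sv=placeholder")
    print(urls[0])
```

Each resulting URL can then be passed to any of the three readers listed above, one file at a time or in a loop over the whole list.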
Note: for reticulate library, please refer to the RStudio article Calling Python from R.
Hope it helps.

Darkflow without GPU on Jupyter-Notebook - Simple Code Required

I am unable to set up and run a simple darkflow program. In fact, I can't even import the darkflow library:
from darkflow.net.build import TFNet
==> ModuleNotFoundError: No module named 'darkflow'
My goal is to run the following program:
from darkflow.net.build import TFNet
import cv2

# Model configuration, weights, and detection threshold
options = {"model": "cfg/yolo.cfg", "load": "bin/yolo.weights", "threshold": 0.1}
tfnet = TFNet(options)

# Run detection on a single test image
imgcv = cv2.imread("./test/dog.jpg")
result = tfnet.return_predict(imgcv)
print(result)
Please suggest steps to configure darkflow in Jupyter Notebook (with no GPU) so that I can run the above code.
Fixed by creating the .ipynb file inside the darkflow directory (downloaded from GitHub) and executing the following from the notebook:
!python3 setup.py build_ext --inplace
!pip install -e .
!pip install .

Export a notebook (.ipynb) as a .py file upon saving

I'm currently working with Jupyter (IPython) Notebook. I would like to put my notebook under version control.
That's why, when I save and checkpoint a notebook (.ipynb file), I would like the changes to also be saved and synchronized in the corresponding Python script (.py file) in the same folder.
Does it have something to do with the version of Jupyter I am using? Or do I have to edit a config_file?
Thanks
You need to create jupyter_notebook_config.py in your config directory.
If it doesn't exist, run the following command from your home directory:
jupyter notebook --generate-config
Then paste the following code into this file:
import os
from subprocess import check_call

c = get_config()

def post_save(model, os_path, contents_manager):
    """Post-save hook for converting notebooks to .py and .html files."""
    if model['type'] != 'notebook':
        return  # only do this for notebooks
    d, fname = os.path.split(os_path)
    # 'jupyter nbconvert' replaces the deprecated 'ipython nbconvert' entry point
    check_call(['jupyter', 'nbconvert', '--to', 'script', fname], cwd=d)
    check_call(['jupyter', 'nbconvert', '--to', 'html', fname], cwd=d)

c.FileContentsManager.post_save_hook = post_save
You will then need to restart your jupyter server and there you go!
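To see what the hook actually runs for a given notebook, the command construction can be pulled out into a helper and inspected without starting a server. This is a sketch only: the helper name nbconvert_command and the example path are hypothetical, and it assumes the jupyter nbconvert command (the current name of the nbconvert entry point) is on PATH when the command is eventually executed.

```python
import os


def nbconvert_command(os_path, fmt):
    """Return (argv, cwd) for converting the notebook at os_path to fmt.

    Uses os.path.split so that nbconvert runs inside the notebook's own folder
    and writes its output next to the .ipynb file.
    """
    directory, fname = os.path.split(os_path)
    return ["jupyter", "nbconvert", "--to", fmt, fname], directory


if __name__ == "__main__":
    # Hypothetical notebook path, for illustration only.
    argv, cwd = nbconvert_command(os.path.join("projects", "analysis.ipynb"), "script")
    print(argv, cwd)
```

Passing the returned argv and cwd to subprocess.check_call performs the same conversion that the hook triggers on each save.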
