Convert pandas.io.formats.style.Styler to image file - css

I am building a conditionally formatted table in Python with pandas and seaborn, which gives me a pandas.io.formats.style.Styler object. I need to export it as an image file.
I cannot use imgkit because I am unable to install wkhtmltopdf.
import pandas as pd
import seaborn as sns
iris = pd.read_excel('iris.xlsx')
iris.head()
cm = sns.diverging_palette(240, 10, sep=20, as_cmap=True)
sample = iris.style.background_gradient(cmap=cm)
sample
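One possible workaround, assuming matplotlib is available (seaborn already depends on it), is to skip the HTML route and redraw the table with matplotlib. This does not export the Styler object itself; it only approximates the per-column gradient that background_gradient applies by default. The helper name table_to_png and the figure sizing are illustrative assumptions, not part of pandas or seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import Normalize


def table_to_png(df, cmap, path, dpi=200):
    """Hypothetical helper: draw df as a colour-graded table and save it."""
    n_rows, n_cols = df.shape
    # Start with white RGBA for every cell (used for non-numeric columns).
    colours = np.ones((n_rows, n_cols, 4))
    for j, col in enumerate(df.columns):
        if pd.api.types.is_numeric_dtype(df[col]):
            values = df[col].to_numpy(dtype=float)
            # Scale each numeric column to [0, 1] on its own min/max.
            norm = Normalize(vmin=np.nanmin(values), vmax=np.nanmax(values))
            colours[:, j, :] = cmap(norm(values))

    fig, ax = plt.subplots(figsize=(1.5 * n_cols, 0.3 * (n_rows + 1)))
    ax.axis("off")
    ax.table(
        cellText=df.round(2).astype(str).to_numpy().tolist(),
        cellColours=[[tuple(c) for c in row] for row in colours],
        colLabels=[str(c) for c in df.columns],
        loc="center",
    )
    fig.savefig(path, dpi=dpi, bbox_inches="tight")
    plt.close(fig)
```

With the code from the question, this could be called as table_to_png(iris.head(), cm, 'iris.png'), where cm is the seaborn colormap created above.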

Related

Seaborn code Anscombe’s quartet does not work

The Seaborn code does not work.
I use JupyterLite to run seaborn Python code. First, I import seaborn in the following way:
import piplite
await piplite.install('seaborn')
import matplotlib.pyplot as plt
import seaborn as sn
%matplotlib inline
But when I run seaborn code like the following, it shows many errors that I do not understand yet:
link of the code
the problem that I face
But when I run this code in Google Colab, it works fine.
google colab
The issue is fetching the example dataset, as I pointed out in my comments.
The problem step is associated with:
# Load the example dataset for Anscombe's quartet
df = sns.load_dataset("anscombe")
You need to replace the line df = sns.load_dataset("anscombe") with the following:
url = 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/anscombe.csv' # based on [Data repository for seaborn examples](https://github.com/mwaskom/seaborn-data)
from pyodide.http import open_url
import pandas
df = pandas.read_csv(open_url(url))
That's based on use of open_url() from pyodide.http, see here for more examples.
Alternative with pyfetch and assigning the string obtained
If you've seen pyfetch around, it also works as a replacement for the sns.load_dataset() line. This is based on John Hanley's post, which uses pyfetch to get the CSV data. The code is commented further:
# GET text at URL via pyfetch based on John Hanley's https://www.jhanley.com/blog/pyscript-loading-python-code-in-the-browser/
url = 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/anscombe.csv' # based on [Data repository for seaborn examples](https://github.com/mwaskom/seaborn-data)
from pyodide.http import pyfetch
response = await pyfetch(url)
content = (await response.bytes()).decode('utf-8')
# READ in string to dataframe based on [farmOS + JupyterLite: Import a CSV of Animals](https://gist.github.com/symbioquine/7641a2ab258726347ec937e8ea02a167)
import io
import pandas
df = pandas.read_csv(io.StringIO(content))

How to import a large CSV file (approx. 400 MB) in Jupyter Notebook using Python?

I already tried this code:
import pandas as pd
a="C://concatenated_csv.csv"
df=pd.read_csv(a, header=None, sep='\t', iterator=True, chunksize=2000000)
print (df)
but I get the following error:
pandas.io.parsers.TextFileReader object at 0x07CF1310
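That output is not an error. When chunksize (or iterator=True) is passed, pd.read_csv returns a TextFileReader instead of a DataFrame, and print shows only its repr. A minimal sketch of consuming it, assuming the combined data fits in memory; the helper name read_csv_in_chunks and the default chunk size are illustrative assumptions:

```python
import pandas as pd


def read_csv_in_chunks(path, chunksize=100_000, **read_csv_kwargs):
    """Hypothetical helper: read a CSV chunk by chunk and combine the pieces."""
    # With chunksize set, read_csv yields one DataFrame per chunk.
    reader = pd.read_csv(path, chunksize=chunksize, **read_csv_kwargs)
    chunks = [chunk for chunk in reader]
    # Stitch the chunks back together with a fresh 0..n-1 index.
    return pd.concat(chunks, ignore_index=True)
```

For the file in the question, this could be called as read_csv_in_chunks("C://concatenated_csv.csv", header=None, sep='\t'). If the full table does not fit in memory, each chunk can instead be filtered or aggregated inside the loop before anything is kept.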

Update chart with ipywidgets

I'm using seaborn on jupyter notebook and would like a slider to update a chart. My code is as follows:
from ipywidgets import interact, interactive, fixed, interact_manual
import numpy as np
import seaborn as sns
from IPython.display import clear_output
def f(var):
    print(var)
    clear_output(wait=True)
    sns.distplot(list(np.random.normal(1,var,1000)))
interact(f, var=10);
Problem: every time I move the slider, the graph is duplicated. How do I update the chart instead?
Seaborn plots should be handled like regular matplotlib plots, so you need to call plt.show() to display them, as explained in this answer for example.
Combined with the %matplotlib inline magic command, this works fine for me:
%matplotlib inline
from ipywidgets import interact
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
def f(var):
    sns.distplot(np.random.normal(1, var, 1000))
    plt.show()
interact(f, var = (1,10))
Another solution would be to update the data of the plot instead of redrawing a new one, as explained here: https://stackoverflow.com/a/4098938/2699660

Expected BOF record for XLRD when first line is redundant

I ran into a problem when I tried to use xlrd to import an .xls file and create a dataframe in Python.
Here is my file format:
xls file format
When I run:
import os
import pandas as pd
import xlrd
for filename in os.listdir("."):
    if filename.startswith("report_1"):
        df = pd.read_excel(filename)
It shows "XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'Report g'".
I am pretty sure nothing is wrong with xlrd (version 1.0.0), because the dataframe can be created once I remove the first row.
Is there any way to load the file in its original format?
Try the following, which accounts for a header line:
df = pd.read_excel(filename, header=0)

Jupyter notebook: for-loop output not showing

When I run my Jupyter notebook with Python 2.7 and try to print the items of a list using a for loop, the print statement produces no output after I import the following packages:
import sys
import os
from hachoir_core.cmd_line import unicodeFilename
from hachoir_metadata import extractMetadata
from hachoir_parser import createParser
from hachoir_core.i18n import getTerminalCharset
from hachoir_core.tools import makePrintable
import pandas as pd
Example code:
items = [1, 3, 0, 4, 1]
for item in items:
    print (item)
The output is blank.
When I run the exact same code before the imports, the output does show.
It looks like the hachoir imports are the problem: whenever I import anything containing hachoir, the output stops showing.
Reposting as an answer: The hachoir_metadata module appears to do something odd with stdout which breaks IPython: Bug report.
As described in that link, you need to add the following code before importing hachoir_metadata:
from hachoir_core import config
config.unicode_stdout = False
