Seaborn Anscombe's quartet code does not work - jupyter-notebook

The Seaborn code does not work.
I use JupyterLite to execute Seaborn Python code. First, I import Seaborn in the following way:
import piplite
await piplite.install('seaborn')
import matplotlib.pyplot as plt
import seaborn as sn
%matplotlib inline
But when I insert Seaborn code like the following, it shows many errors that I do not understand yet:
But when I insert this code in Google Colab, it works nicely.

The issue is getting the example dataset, as I point out in my comments.
The problematic step is:
# Load the example dataset for Anscombe's quartet
df = sns.load_dataset("anscombe")
You need to replace the line df = sns.load_dataset("anscombe") with the following:
url = 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/anscombe.csv' # based on [Data repository for seaborn examples](https://github.com/mwaskom/seaborn-data)
from pyodide.http import open_url
import pandas
df = pandas.read_csv(open_url(url))
That's based on the use of open_url() from pyodide.http; see the Pyodide documentation for more examples.
Alternative with pyfetch and assigning the string obtained
If you've seen pyfetch around, this also works as a replacement for the sns.load_dataset() line. It is based on John Hanley's post, which uses pyfetch to get the CSV data. The code is commented further:
# GET text at URL via pyfetch based on John Hanley's https://www.jhanley.com/blog/pyscript-loading-python-code-in-the-browser/
url = 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/anscombe.csv' # based on [Data repository for seaborn examples](https://github.com/mwaskom/seaborn-data)
from pyodide.http import pyfetch
response = await pyfetch(url)
content = (await response.bytes()).decode('utf-8')
# READ in string to dataframe based on [farmOS + JupyterLite: Import a CSV of Animals](https://gist.github.com/symbioquine/7641a2ab258726347ec937e8ea02a167)
import io
import pandas
df = pandas.read_csv(io.StringIO(content))

Related

Does Hovertool/tooltip work with pandas df or only with ColumnDataSource

Pretty new to Bokeh. Plotting a barplot (after importing pandas_bokeh) works well.
But... I want to change the hover tooltips.
Question: should the hover tooltip work with a pandas df in Bokeh, or must a ColumnDataSource be used?
Thanks.
One option in pandas_bokeh to modify the HoverTool is to pass a custom string to hovertool_string.
import pandas as pd
import pandas_bokeh
from bokeh.plotting import output_notebook
output_notebook()
df = pd.DataFrame({'a':[1,2], 'b':[3,4]})
df.plot_bokeh.bar(hovertool_string=r"""At index #{__x__values}: a is #{a} and b is #{b}""")
default output
modified tooltip
To see a more complex example, check the second example in the line plot documentation.
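To address the ColumnDataSource half of the question: in plain Bokeh a ColumnDataSource can be built directly from a pandas DataFrame, and tooltip fields then refer to its column names with @. A minimal sketch (the columns a and b simply mirror the toy frame above):

```python
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
source = ColumnDataSource(df)  # DataFrame columns become the source's fields

p = figure()
p.scatter(x='a', y='b', source=source)

# @a and @b look up the DataFrame columns; $index is the point's row number
p.add_tools(HoverTool(tooltips=[('index', '$index'), ('a', '@a'), ('b', '@b')]))
```

So a DataFrame is enough as the starting point; wrapping it in a ColumnDataSource is just the conventional way to make the columns addressable from the tooltip template.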
Comment
Because your question is very open, I am not sure whether the answer is satisfying. Please provide a minimal working example and some example data in the future.

floris.tools.power_rose.load() requires class instantiation

I am writing a script to load a PowerRose object from a file I pickled previously using floris.tools.power_rose.PowerRose.save(). The script looks like this:
# General modules
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# FLORIS-specific modules
import floris.tools as wfct
import floris.tools.power_rose as pr
power_rose = pr.PowerRose(name, df_power, df_turbine_power_no_wake, df_turbine_power_baseline)
power_rose.load(filename = "PowerRose_All.p")
However, as is clear from the last two lines, I have to instantiate the PowerRose class in order to load a PowerRose instance from a pickled PowerRose, which seems to me to be a causality problem. The only solution I can think of would be to create DataFrames of the same size as "PowerRose_All.p", filled with zeros, to use in the instantiation.
Yes, you need to instantiate a PowerRose object before you can use the load method. However, you do not need to supply DataFrames to do this. It can be accomplished with:
import floris.tools.power_rose as pr
power_rose = pr.PowerRose()
power_rose.load(filename="PowerRose_All.p")
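The pattern behind this, an instance method that repopulates self from a pickle so that a bare instantiation must come first, can be sketched with nothing but the standard library (the Rose class below is a stand-in for illustration, not the FLORIS implementation):

```python
import pickle


class Rose:
    def __init__(self, name=None, df=None):
        # defaulted arguments allow an "empty" instance to be created
        self.name = name
        self.df = df

    def save(self, filename):
        with open(filename, 'wb') as f:
            pickle.dump((self.name, self.df), f)

    def load(self, filename):
        # repopulates an existing instance rather than constructing a new one
        with open(filename, 'rb') as f:
            self.name, self.df = pickle.load(f)


# save a fully populated object...
Rose('demo', [1, 2, 3]).save('rose.p')

# ...then load it back into a bare instance; no dummy data is needed
rose = Rose()
rose.load('rose.p')
```

Because load is an instance method that overwrites the attributes, whatever (if anything) was passed to the constructor is irrelevant once load runs.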

Convert pandas.io.formats.style.Styler to image file

I am creating a conditionally formatted table in Python using pandas and seaborn, which gives me a pandas.io.formats.style.Styler object. I need to export this as an image file.
I cannot use imgkit, as I am unable to install wkhtmltopdf.
import pandas as pd
import seaborn as sns
iris = pd.read_excel('iris.xlsx')
iris.head()
cm = sns.diverging_palette(240, 10, sep=20, as_cmap=True)
sample = iris.style.background_gradient(cmap=cm)
sample

How to get data from the live table using web scraping?

I am trying to set up a live table by downloading the data directly from a website through Python. I believe I am following all the steps to the letter, but I am still not able to get the data from the table.
I have referred to many web pages and blogs to try to correct the issue, but was unsuccessful, so I would like the Stack Overflow community's help here.
The following is the website, and there is only one table on the page from which I am trying to get the data:
https://etfdb.com/themes/smart-beta-etfs/#complete-list__esg&sort_name=assets_under_management&sort_order=desc&page=1
The data in the table is partially available for free and the rest is paid, so I guess that may be the problem, although I would assume I should still be able to download the free data. Since this is my first time trying this, and I am a beginner at Python, I may be wrong. All help is appreciated.
The code is as follows:
import pandas as pd
import html5lib
import lxml
from bs4 import BeautifulSoup
import requests

site = 'https://etfdb.com/themes/smart-beta-etfs/#complete-list&sort_name=assets_under_management&sort_order=desc&page=1'
page1 = requests.get(site)  # the original passed proxies=proxy_support, but proxy_support was never defined
page1.status_code
page1.text

soup = BeautifulSoup(page1.text, 'html.parser')
print(soup.prettify())

table = soup.find_all("div", class_="fixed-table-body")
table
When I run the table command, it gives me no data; the result is completely empty even though there is a table on the website. All help will be really appreciated.
The page makes another request for this info, which returns JSON you can parse:
import requests
r = requests.get('https://etfdb.com/data_set/?tm=77630&cond=&no_null_sort=&count_by_id=&sort=assets_under_management&order=desc&limit=25&offset=0').json()
Some of the keys (those for the output columns Symbol and ETF Name, i.e. the keys symbol and name) have HTML values, so you can use bs4 to handle those and extract the final desired result; the other key-value pairs are straightforward.
For example, if you loop over each row in the JSON:
for row in r['rows']:
    print(row)
    break
You get rows for parsing, of which two items need bs4, like this:
Python:
import requests
from bs4 import BeautifulSoup as bs
import pandas as pd

r = requests.get('https://etfdb.com/data_set/?tm=77630&cond=&no_null_sort=&count_by_id=&sort=assets_under_management&order=desc&limit=25&offset=0').json()
results = []

for row in r['rows']:
    soup = bs(row['symbol'], 'lxml')
    symbol = soup.select_one('.caps').text
    soup = bs(row['name'], 'lxml')
    etf_name = soup.select_one('a').text
    esg_score = row['esg_quality_score']
    esg_quality_score_pctl_peer = row['esg_quality_score_pctl_peer']
    esg_quality_score_pctl_global = row['esg_quality_score_pctl_global']
    esg_weighted_avg_carbon_inten = row['esg_weighted_avg_carbon_inten']
    esg_sustainable_impact_pct = row['esg_sustainable_impact_pct']
    row = [symbol, etf_name, esg_score, esg_quality_score_pctl_peer, esg_quality_score_pctl_global,
           esg_weighted_avg_carbon_inten, esg_sustainable_impact_pct]
    results.append(row)

headers = ['Symbol', 'ETF Name', 'ESG Score', 'ESG Score Peer Percentile (%)', 'ESG Score Global Percentile (%)',
           'Carbon Intensity (Tons of CO2e / $M Sales)', 'Sustainable Impact Solutions (%)']
df = pd.DataFrame(results, columns=headers)
print(df)
I would use a pandas DataFrame to fetch the table, and you can then export it to CSV:
import pandas as pd

tables = pd.read_html("https://etfdb.com/themes/smart-beta-etfs/#complete-list&sort_name=assets_under_management&sort_order=desc&page=1")
table = tables[0][:-1]  # drop the trailing footer row
print(table)
table.to_csv('table.csv')  # you can find the CSV file in the project folder after the run
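As a note on what read_html returns: it parses every table element it finds in the page into its own DataFrame and hands them back as a list, which is why the code above indexes tables[0]. A tiny offline illustration (the toy table is made up):

```python
from io import StringIO
import pandas as pd

html = """
<table>
  <tr><th>Symbol</th><th>AUM</th></tr>
  <tr><td>ABC</td><td>100</td></tr>
  <tr><td>XYZ</td><td>250</td></tr>
</table>
"""

# read_html returns one DataFrame per <table> in the document;
# the <th> row becomes the column header
tables = pd.read_html(StringIO(html))
df = tables[0]
print(df)
```

This is also why slicing like tables[0][:-1] is sometimes needed: any footer or summary row inside the table element ends up as an ordinary data row.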

jupyter-notebook output for loop not showing

When I run my Jupyter notebook with Python 2.7 and try to print items of a list using a for loop, it just won't output the print statements after importing the following packages:
import sys
import os
from hachoir_core.cmd_line import unicodeFilename
from hachoir_metadata import extractMetadata
from hachoir_parser import createParser
from hachoir_core.i18n import getTerminalCharset
from hachoir_core.tools import makePrintable
import pandas as pd
Example code:
items = [1, 3, 0, 4, 1]
for item in items:
    print(item)
The output is blank.
When I use the exact same code before the imports, it does show.
It looks like the hachoir imports are the problem; whenever I import anything containing them, the output stops showing.
Reposting as an answer: the hachoir_metadata module appears to do something odd with stdout, which breaks IPython (see the bug report).
As described in that link, you need to add the following code before importing hachoir_metadata:
from hachoir_core import config
config.unicode_stdout = False
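To see why imports alone can make print go silent, here is a stdlib-only sketch of what a library that replaces sys.stdout effectively does (hachoir's actual mechanics differ; this only illustrates the failure mode and the restore fix):

```python
import io
import sys

real_stdout = sys.stdout

# an import-time side effect like this redirects all subsequent print calls
sys.stdout = io.StringIO()
print("you will not see this")

captured = sys.stdout.getvalue()

# restoring the original stream brings output back
sys.stdout = real_stdout
print("visible again")
```

The config.unicode_stdout = False setting above prevents hachoir from installing its stdout wrapper in the first place, which is why it must run before the hachoir_metadata import.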
