Display qgrid within the ipywidget tab - jupyter-notebook

I'm trying to use qgrid to visualize dataframes together with some widgets. I have trouble displaying a grid inside a tab that is not active when the page loads.
import pandas as pd
import ipywidgets as widgets
import numpy as np
import qgrid
from IPython.display import display
out1 = widgets.Output()
out2 = widgets.Output()
data1 = pd.DataFrame(np.random.normal(size=10))
data2 = pd.DataFrame(np.random.normal(size=10))
# Render one grid into each Output widget
with out1:
    display(qgrid.show_grid(data1))
with out2:
    display(qgrid.show_grid(data2))
# Put the two Output widgets into a Tab container
tab = widgets.Tab(children=[out1, out2])
tab.set_title(0, 'First')
tab.set_title(1, 'Second')
display(tab)
The code above produces a working grid in the first tab, but the second tab is empty.
More surprising, opening "Inspect" in Chrome fixes the second tab but hides the output of the first.
Any suggestions?

Related

caption multiple figures in 1 Rmarkdown chunk

I want to caption all figures generated in a Python chunk in R Markdown. Currently I only get one caption, because fig.cap can be used only once in the chunk header. How can I caption each figure?
The code below plots two different dataframes.
```{python Test1-plot, fig.cap="The shear stress evolution"}
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = pd.read_table("../gb_cyc_csv/T1_gb_cyc.csv",sep=',',decimal='.',low_memory=False,header=0)
X = data[data.keys()[3]]
X = np.array(X)
TS = data[data.keys()[9]]
TS = np.array(TS)
plt.plot(X,TS)
plt.show()
data = pd.read_table("../gb_cyc_csv/T3_gb_cyc.csv",sep=',',decimal='.',low_memory=False,header=0)
X = data[data.keys()[3]]
X = np.array(X)
N = data[data.keys()[15]]
N = np.array(N)
plt.plot(X,N)
plt.show()
```

Streamlit multiselect line chart

I am trying to build charts with multiselect options in a Streamlit app. So far I have only managed a chart driven by a single selectbox (no multiple choices).
This is the code that works with a single selection:
df = pd.DataFrame(px.data.gapminder())
def plot():
    clist = df['country'].unique()
    country = st.selectbox('Select country', clist)
    st.header('You selected: ' + country)
    fig = px.line(df[df['country'] == country], x="year", y="gdpPercap", title=country)
    st.plotly_chart(fig)
But when I replace st.selectbox with st.multiselect, the plot does not work.
You can do it like this:
import pandas as pd
import streamlit as st
import plotly.express as px
import plotly.graph_objects as go
def plot():
    df = pd.DataFrame(px.data.gapminder())
    clist = df["country"].unique().tolist()
    countries = st.multiselect("Select country", clist)
    st.header("You selected: {}".format(", ".join(countries)))
    # Map each selected country to its own sub-dataframe
    dfs = {country: df[df["country"] == country] for country in countries}
    fig = go.Figure()
    # Add one line per selected country
    for country, country_df in dfs.items():
        fig.add_trace(go.Scatter(x=country_df["year"], y=country_df["gdpPercap"], name=country))
    st.plotly_chart(fig)
plot()
This uses st.multiselect and then add_trace to add each selected country in turn. The dict dfs maps each country to its sub-dataframe for easy access.
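As a possible alternative to the per-country dict, the list returned by st.multiselect can be used to filter the frame in one step with isin. The sketch below shows only that filtering step on a tiny made-up table (the column names follow gapminder, the values and the helper name select_countries are invented for illustration), so it runs without Streamlit:

```python
import pandas as pd


def select_countries(df, countries):
    """Return only the rows whose 'country' is in the selected list."""
    return df[df["country"].isin(countries)]


# Tiny made-up stand-in for the gapminder data (values are illustrative only).
sample = pd.DataFrame(
    {
        "country": ["A", "A", "B", "B", "C", "C"],
        "year": [2000, 2005, 2000, 2005, 2000, 2005],
        "gdpPercap": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    }
)

# In the app, this list would come from st.multiselect.
selected = select_countries(sample, ["A", "C"])
print(selected)

# The filtered frame could then be drawn with one line per country, e.g.:
#   fig = px.line(selected, x="year", y="gdpPercap", color="country")
#   st.plotly_chart(fig)
```

An empty selection simply yields an empty frame, so the app may want to skip plotting in that case.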

Loading a dataframe from a URL for further analysis (mean, median, etc.) in Python pandas does not work properly

I am trying to fetch data from a URL (GitHub) for further analysis (mean, average, percentage, ratio, etc.), but my code is not working properly.
You can use the below code:
import requests
import pandas as pd
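res = requests.get('https://raw.githubusercontent.com/justmarkham/DAT8/master/data/u.user')
# Split the text into lines, and each line into '|'-separated fields
df = pd.DataFrame([a.split('|') for a in res.text.split('\n')])
# The first row holds the column names
df.columns = df.values[0]
# Drop the header row and the empty last row left by the trailing newline
df.drop([df.index[0],df.index[df.shape[0]-1]],inplace=True)
print(df['age'].astype(int).mean())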
Output:
34.05196182396607
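A shorter route may be to let pandas parse the pipe-delimited text directly with read_csv and sep="|". The sketch below uses a few made-up rows in the same layout as the file, so it runs offline; passing the raw GitHub URL string to read_csv in place of the StringIO object should work the same way.

```python
import io

import pandas as pd

# Made-up rows in the same pipe-delimited layout as the u.user file.
sample_text = (
    "user_id|age|gender|occupation|zip_code\n"
    "1|20|M|technician|85711\n"
    "2|30|F|other|94043\n"
    "3|40|M|writer|32067\n"
)

# sep="|" tells pandas how the fields are delimited; zip codes are kept
# as strings so they are not converted to numbers.
users = pd.read_csv(io.StringIO(sample_text), sep="|", dtype={"zip_code": str})

print(users["age"].mean())    # 30.0
print(users["age"].median())  # 30.0
```

With the header parsed by read_csv, the age column is already numeric, so no astype(int) call is needed before computing statistics.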

Python: As date strings get passed into dictionary - values jumbled

I pull date data from an Excel file on my computer. The dates are listed as strings such as "10/1/10" and stored in the array dData, and the numerical version of each date is stored in nData, e.g. 734046. So dData[0] returns "10/1/10" and nData[0] returns 734046.
However, when I pass in 10/1/10, the code in bold returns 735536, which is not the matching key-value pair. The pairs should be organized chronologically.
import numpy as np
import pandas as pd
import xlrd
import csv
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime
import time
import random
import statistics
import numpy
from numpy.random import normal
from scipy import stats
dData = [] #Date in string format - Month/Day/Year
pData = [] #Date in float format - Value.Decimals
nData = [] #Data in Dates in int - Formatted Date Data for plotting in Matpl
def loadData(dates, prices, numDates):
    dateDictionary = {} # empty dictionary that will contain string dates to number dates
    numDateToPrice = {} # empty dictionary that will contain number dates to string dates
    nestedDictionary = {} # empty dictionary that will contain a nested dictionary str date : {numbertodate: price}
    with open('/Users/dvalentin/Code/IndividualResearch/CrudeOilFuturesAll.csv', 'rU') as csvfile: #This is where I pull data from an excel file on my comp
        reader = csv.reader(csvfile, delimiter=',')
        for row in reader:
            dates.append(row[0])
            numDates.append(row[1])
            prices.append(row[2])
    **for x in dates:
        for x in numDates:
            dateDictionary[x] = y
    print dateDictionary**
    for x in numDates[x]:
        for y in prices[y]:
            numDateToPrice[x] = y
    plt.plot_date(x=numDates, y=prices, fmt="r-")
    plt.plot()
    plt.title("Crude Oil Futures")
    plt.ylabel("Closing Price")
    plt.grid(True)
    plt.show()
import pandas as pd
dates = ['10/1/10', '10/2/10', '11/3/10', '1/4/11']
prices = [12, 15, 13, 18]
df = pd.DataFrame({'dates': dates, 'prices': prices})
# Parse the month/day/year strings and use them as the index
df = df.set_index(pd.to_datetime(df['dates'], format='%m/%d/%y'))
df = df.drop('dates', axis=1)
print(df.loc['20101002'])             # a single day
print(df.loc['20101001':'20101002'])  # a range of days
print(df.loc['2010'])                 # a whole year
print(df.loc['2010-10'])              # a single month
This seems to be a better way to organize your data than working with the numerical code for each date. A DatetimeIndex is much easier to select and slice data with than a set of dictionaries, and you can still use it for plotting and style the axis however you want. More info on datetime indices: http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DatetimeIndex.html. Hope this helps!
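If the dictionaries are still wanted, the jumbled values in the question appear to come from the nested loops: the inner loop runs to completion for every outer item, so each key ends up holding the last value. A minimal sketch of pairing the lists element by element with zip instead, using made-up rows (the prices and the snake_case names are invented for illustration):

```python
# Made-up rows standing in for the CSV columns:
# string date, numeric date, closing price.
dates = ["10/1/10", "10/2/10", "10/3/10"]
num_dates = [734046, 734047, 734048]
prices = [81.58, 82.30, 81.70]

# Nested loops: for every date, the inner loop overwrites the entry
# with each numeric date in turn, so only the LAST one survives.
jumbled = {}
for d in dates:
    for n in num_dates:
        jumbled[d] = n

# zip walks the lists in parallel, pairing items by position.
date_to_num = dict(zip(dates, num_dates))
num_to_price = dict(zip(num_dates, prices))
nested = {d: {n: p} for d, n, p in zip(dates, num_dates, prices)}

print(jumbled["10/1/10"])      # 734048 (last value, not the matching one)
print(date_to_num["10/1/10"])  # 734046
print(nested["10/2/10"])       # {734047: 82.3}
```

Dictionaries keep insertion order in Python 3.7 and later, so the pairs stay chronological as long as the rows in the file are.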

Does setDataFormatForType() work correctly for Dates in XLConnect?

I recently tried all sorts of formatting arguments on the function
setDataFormatForType(wb, type=XLC$DATA_TYPE.DATETIME, format="d/m/yy")
for example format="d/m/yy" as shown above, besides numerous others.
This then is followed up by
setStyleAction(wb, XLC$"STYLE_ACTION.DATA_FORMAT_ONLY")
and then I write a worksheet and save the workbook.
No form of format tweaking seems to work.
As soon as I change any format in the setDataFormatForType command, the numeric time value shows up in the date columns of the Excel workbook that I save later on,
i.e. 41584 for Nov. 6th, 2013.
If I do not interfere with any data formats, the standard (POSIX) format gets saved, but in the resulting Excel file it has a custom "XLConnect format" assigned to it, so it is displayed "wrong" :-(
That is, it uses American notation (month first, then day), whereas I want European notation (day first, then month).
If anyone has some experience with setting these DataFormats (especially 'dates') in XLConnect, then sharing some thoughts or wisdom would be highly appreciated.
Thanks, Walter
There's a new style action XLC$"STYLE_ACTION.DATATYPE" in the XLConnect version available from github at https://github.com/miraisolutions/xlconnect. The "datatype" style action can be used to style cells of a specific type using a specific cell style which can be set using setCellStyleForType. See the following example:
require(XLConnect)
wb = loadWorkbook("test.xlsx", create = TRUE)
setStyleAction(wb, XLC$"STYLE_ACTION.DATATYPE")
cs = createCellStyle(wb, name = "mystyle")
setDataFormat(cs, format = "d/m/yy")
setCellStyleForType(wb, style = cs, type = XLC$"DATA_TYPE.DATETIME")
data = data.frame(A = 1:10, B = Sys.time() + 1:10)
createSheet(wb, "data")
writeWorksheet(wb, data = data, sheet = "data")
saveWorkbook(wb)
You do need to have a named region called "Dates". I saved a copy of the template2.xlsx file with such a region. The only thing that worked for me was to write it out with the format.Date function:
Dates=seq(from=as.Date("2001-01-01"), to=as.Date("2013-01-01"), by=365)
file.copy(system.file("demoFiles/template2.xlsx",
                      package = "XLConnect"),
          "dataformat.xlsx", overwrite = TRUE)
wb <- loadWorkbook("dataformat.xlsx")
setDataFormatForType(wb, type = XLC$"DATA_TYPE.DATETIME",
                     format = "dd/mm/yyyy")
setStyleAction(wb, XLC$"STYLE_ACTION.DATA_FORMAT_ONLY")
createName(wb, name = "Dates", formula = "mtcars!$A$1")
writeNamedRegion(wb, format(Dates, "%d.%m.%Y"), name = "Dates")
saveWorkbook(wb)
