Jupyter Notebook and Lab cannot render this markdown snippet from Google Colab

This is how it looks when rendered on Google Colab:
The actual script:
# Notation
Here is a summary of some of the notation you will encounter.
|General <img width=70/> <br /> Notation <img width=70/> | Description<img width=350/>| Python (if applicable) |
|: ------------|: ------------------------------------------------------------||
| $a$ | scalar, non bold ||
| $\mathbf{a}$ | vector, bold ||
| **Regression** | | | |
| $\mathbf{x}$ | Training Example feature values (in this lab - Size (1000 sqft)) | `x_train` |
| $\mathbf{y}$ | Training Example targets (in this lab Price (1000s of dollars)). | `y_train`
| $x^{(i)}$, $y^{(i)}$ | $i_{th}$Training Example | `x_i`, `y_i`|
| m | Number of training examples | `m`|
| $w$ | parameter: weight, | `w` |
| $b$ | parameter: bias | `b` |
| $f_{w,b}(x^{(i)})$ | The result of the model evaluation at $x^{(i)}$ parameterized by $w,b$: $f_{w,b}(x^{(i)}) = wx^{(i)}+b$ | `f_wb` |
However, it doesn't render correctly in my local Jupyter Notebook/Lab. I installed these extensions in Jupyter Lab,
but it still won't render and looks something like this:

Try this below, as it is close:
# Notation
Here is a summary of some of the notation you will encounter.
| General <br> Notation <br /> | Description | Python (if applicable) |
| :-: | :----: | :- |
| $$a$$ | scalar, non bold ||
| $$\mathbf{a}$$ | vector, bold ||
| **Regression** | | | |
| $$\mathbf{x}$$ | Training Example feature values (in this lab - Size (1000 sqft)) | `x_train` |
| $$\mathbf{y}$$ | Training Example targets (in this lab Price (1000s of dollars)). | `y_train`
| $$x^{(i)}, y^{(i)}$$ | $$i_{th}$$Training Example | `x_i`, `y_i`|
| m | Number of training examples | `m`|
| $$w$$ | parameter: weight, | `w` |
| $$b$$ | parameter: bias | `b` |
| $$f_{w,b}(x^{(i)})$$ | The result of the model evaluation at $$x^{(i)}$$ parameterized by $$w,b$$: $$f_{w,b}(x^{(i)}) = wx^{(i)}+b$$ | `f_wb` |
This yields the following in the classic notebook, in a session launched from here:
Most of the issue is explained here: it seems you need to use double dollar signs when embedding LaTeX in a table. For all but the first few rows, I simply used find-and-replace to double the dollar signs and pasted the result in. (I later realized I needed to hand-edit the $$x^{(i)}, y^{(i)}$$ line.) I did the first few rows by hand, trying to understand how they matched and attempting to control the alignment.
I cannot say what is going on with the alignment. According to this, there is a way to left-align the first column, but even using that code, combining it with the LaTeX kept messing up the table.
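The find-and-replace step described above can be sketched in Python. This is only an illustration, not part of the original answer: the helper name `double_dollars` and the regular expression are assumptions, and it only handles simple inline `$...$` spans that sit on one line.

```python
import re

# Matches a single-dollar inline math span such as $a$, but not $$a$$.
# (Hypothetical pattern chosen for this sketch; not taken from the answer.)
_INLINE_MATH = re.compile(r"(?<!\$)\$([^$\n]+?)\$(?!\$)")


def double_dollars(markdown: str) -> str:
    """Rewrite every $...$ span as $$...$$, leaving existing $$...$$ alone."""
    return _INLINE_MATH.sub(r"$$\1$$", markdown)


row = r"| $\mathbf{a}$ | vector, bold ||"
print(double_dollars(row))
```

Rows that contain two separate spans in one cell, like the `$x^{(i)}$, $y^{(i)}$` row, come out as two doubled spans and may still need the hand edit mentioned above.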

Related

Markdown table in Jupyter notebook not working

I know how to set up a table in a Jupyter notebook; I even looked it up online and imitated an example. However, it is not working. Can anyone tell me what is wrong with my notebook? Is there anything I should change when constructing a markdown table?
Tables | Are | Cool |
| ------------- |:-------------:| -----:|
| col 3 is | right-aligned | $1600 |
| col 2 is | centered | $12 |
| zebra stripes | are neat | $1 |
Two things:
the example code is missing the first | character
dollar signs need to be escaped with a backslash (\) since MathJax is enabled
Try this:
|Tables | Are | Cool |
| ------------- |:-------------:| ------:|
| col 3 is | right-aligned | \$1600 |
| col 2 is | centered | \$12 |
| zebra stripes | are neat | \$1 |
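Both fixes can also be applied mechanically. The following is a minimal Python sketch under stated assumptions: `fix_table` is a hypothetical helper name, and it escapes every unescaped dollar sign, so it is only suitable for tables where `$` is meant as a currency symbol rather than a math delimiter.

```python
import re


def fix_table(markdown: str) -> str:
    """Add a missing leading pipe and escape bare dollar signs, line by line."""
    fixed = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("|"):
            stripped = "| " + stripped  # fix 1: every row starts with a pipe
        # fix 2: put a backslash before any $ that is not already escaped
        fixed.append(re.sub(r"(?<!\\)\$", r"\\$", stripped))
    return "\n".join(fixed)


broken = "Tables | Are | Cool |\n| col 3 is | right-aligned | $1600 |"
print(fix_table(broken))
```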

rmarkdown - prevent indentation in list inside a table

When rendering tables such as this one (using RStudio + knitr), there is unwanted indentation (see red zone in the image). How can I avoid such indentation?
I imagine there is some CSS involved, but if there was a way to even prevent rmarkdown from "considering" this as a list, it could simplify matters. This is needed for an R package, so heavy hacks are not really an option, but I'll gladly receive all suggestions. Thx.
The (grid) table:
+------------------------+------------------------------------+
| Variable | Stats / Values |
+========================+====================================+
| SomeVar1 | mean (sd) : 1500000.5 (288675.28)\ |
| [numeric] | min < med < max :\ |
| | 1000001 < 1500000.5 < 2e+06\ |
| | IQR (CV) : 499999.5 (0.19) |
+------------------------+------------------------------------+
| SomeVar2 | 1. AAAAAA\ |
| [factor] | 2. BBBBBB\ |
| | 3. CCCCCC\ |
| | 4. DDDDDD\ |
| | 5. EEEEEE\ |
| | 6. FFFFFF\ |
| | 7. GGGGGG\ |
| | 8. HHHHHH\ |
| | 9. IIIIII\ |
| | 10. JJJJJJ\ |
| | [ 102917 others ] |
+------------------------+------------------------------------+
The rendered html table:

Addition of calculated field in rpivotTable

I want to create a calculated field to use with the rpivotTable package, similar to the calculated-field functionality in Excel.
For instance, consider the following table:
+--------------+--------+---------+-------------+-----------------+
| Manufacturer | Vendor | Shipper | Total Units | Defective Units |
+--------------+--------+---------+-------------+-----------------+
| A | P | X | 173247 | 34649 |
| A | P | Y | 451598 | 225799 |
| A | P | Z | 759695 | 463414 |
| A | Q | X | 358040 | 225565 |
| A | Q | Y | 102068 | 36744 |
| A | Q | Z | 994961 | 228841 |
| A | R | X | 454672 | 231883 |
| A | R | Y | 275994 | 124197 |
| A | R | Z | 691100 | 165864 |
| B | P | X | 755594 | 302238 |
| . | . | . | . | . |
| . | . | . | . | . |
+--------------+--------+---------+-------------+-----------------+
(my actual table has many more columns, both dimensions and measures, time, etc. and I need to define multiple such "calculated columns")
If I want to calculate defect rate (which would be Defective Units/Total Units) and I want to aggregate by either of the first three columns, I'm not able to.
I tried assignment by reference (:=), but that still didn't seem to work and summed up defect rates (i.e., sum(Defective_Units/Total_Units)), instead of sum(Defective_Units)/sum(Total_Units):
myData[, Defect.Rate := Defective_Units / Total_Units]
This ended up giving me defect rates greater than 1. Is there anywhere I can declare a calculated field, which is just a formula evaluated post aggregation?
You're lucky - the creator of pivottable.js foresaw cases like yours (and mine, earlier today) by implementing an aggregator called "Sum over Sum" and a few more, likewise, cf. https://github.com/nicolaskruchten/pivottable/blob/master/src/pivot.coffee#L111 and https://github.com/nicolaskruchten/pivottable/blob/master/src/pivot.coffee#L169.
So we'll use "Sum over Sum" as parameter "aggregatorName", and the columns whose quotient we want in the "vals" parameter.
Here's a meaningless usage example from the mtcars data for reproducibility:
library(rpivotTable)
data(mtcars)
# "Sum over Sum" divides sum(mpg) by sum(disp) within each cell
rpivotTable(mtcars, rows = "gear", cols = c("cyl", "carb"),
            aggregatorName = "Sum over Sum",
            vals = c("mpg", "disp"),
            width = "100%", height = "400px")
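To see why the order of aggregation matters, here is a small arithmetic sketch in Python (not R, and not part of the original answer) using the first three rows of the question's table. The variable names are assumptions made for illustration.

```python
# (total_units, defective_units) for Manufacturer A, Vendor P, from the question's table
rows = [(173247, 34649), (451598, 225799), (759695, 463414)]

# Summing per-row rates: what a plain sum over a precomputed Defect.Rate column does
sum_of_ratios = sum(defective / total for total, defective in rows)

# Dividing the summed columns: the quantity a "Sum over Sum" style aggregation targets
ratio_of_sums = sum(d for _, d in rows) / sum(t for t, _ in rows)

print(f"sum of ratios: {sum_of_ratios:.4f}")
print(f"ratio of sums: {ratio_of_sums:.4f}")
```

The first figure exceeds 1, which matches the symptom described in the question, while the second stays between 0 and 1 as a defect rate should.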

Combine DataFrame rows into a new column

I am wondering if there is a simple way to achieve this in Julia besides iterating over the rows in a for-loop.
I have a table with two columns that looks like this:
| Name | Interest |
|------|----------|
| AJ | Football |
| CJ | Running |
| AJ | Running |
| CC | Baseball |
| CC | Football |
| KD | Cricket |
...
I'd like to create a table where each Name in the first column is matched with a combined Interest column, as follows:
| Name | Interest |
|------|----------------------|
| AJ | Football, Running |
| CJ | Running |
| CC | Baseball, Football |
| KD | Cricket |
...
How do I achieve this?
UPDATE: OK, so after trying a few things including print_joint and grpby, I realized that the easiest way to do this would be the by() function. I'm 99% there.
by(myTable, :Name, df->DataFrame(Interest = string(df[:Interest])))
This gives me my :Interest column as "UTF8String[\"Running\"]", and I can't figure out which method I should use instead of string() (or where to typecast) to get the desired ASCIIString output.
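The transformation being asked for is a group-then-join: collect the interests per name, then join the strings with a separator instead of stringifying the whole vector. The sketch below shows that logic in Python rather than Julia, purely to illustrate the idea; `combine_interests` is a hypothetical name. The analogous step in Julia would presumably replace string() with a join over the grouped column, but that is an assumption not checked against the DataFrames version used in the question.

```python
# Sample rows copied from the question's first table
rows = [("AJ", "Football"), ("CJ", "Running"), ("AJ", "Running"),
        ("CC", "Baseball"), ("CC", "Football"), ("KD", "Cricket")]


def combine_interests(pairs):
    """Group interests by name (first-seen order) and join them with ', '."""
    grouped = {}
    for name, interest in pairs:
        grouped.setdefault(name, []).append(interest)
    # Joining the list of strings avoids the vector-style repr seen in the question
    return {name: ", ".join(interests) for name, interests in grouped.items()}


for name, interests in combine_interests(rows).items():
    print(f"| {name} | {interests} |")
```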

R: graphing upper and lower bounds with ggplot2

I have a dataset with three variables: one continuous independent variable, one continuous dependent variable, and a binary variable that categorizes how the measurements were taken. Using ggplot, I know that I can make a scatter plot with the points colored by category:
g <- ggplot(dataset, aes(independent, dependent))
g + geom_point(aes(color=catagory))
However, I want to know if there is a way to make a graph with a vertical line going up from points of category 0 and a vertical line going down from points of category 1. It would look something like this:
- | | |
| | | |
| | | |
| | | |
- | | o |
| | | | |
| | o | | |
| | o | | | |
- | | | o | o
| | | | |
| o | | |
| | | |
+----|-----|-----|-----|-----|
The reason for wanting a plot like this is that one category represents an upper bound (the points with lines going downwards) and one represents a lower bound (the points with lines going upwards). Having these lines would make it easy to visualize the area which is between these bounds, and whether a function plotted on top could accurately represent the data:
- | | |
| | | |
| | | |
| | | |
- | | o | _____
| | | |_|__/
| | o |_/| |
| | o |__/| | |
- | | /| o | o
| _|_|/ | |
| / o | | |
|/ | | |
+----|-----|-----|-----|-----|
If there is any way to do this using ggplot or any other graphing library for R, I would love to know how. However, if it isn't possible, I'd be open to hearing other ways to represent this data. Simply distinguishing the categories by color doesn't do enough to emphasize their upper/lower bound nature for my purposes.
The following could work for you, I hope I understood the problem well.
First, generating some random data for the dataframe, as no sample data was provided. The random numbers will make the plot ugly, I hope it will look better with real data:
dataset <- data.frame(
  independent = runif(100),
  dependent = runif(100),
  catagory = floor(runif(100) * 2))
Next, find the upper or lower part of the plot (=min/max of values) based on "catagory" for every case:
# Segment end point: top of the plot for catagory 0, bottom for catagory 1
dataset$end <- ifelse(dataset$catagory == 0, max(dataset$dependent), min(dataset$dependent))
Now, we can plot data with geom_segment().
library(ggplot2)
# Draw each point plus a vertical segment from the point to its `end` value
g <- ggplot(dataset, aes(independent, dependent, color = factor(catagory)))
g + geom_point() + geom_segment(aes(xend = independent, yend = end))
Note that I also added + theme_bw() + opts(legend.position = "none") to the plot, as it looked very strange with random data. In current ggplot2 releases opts() has been removed; theme(legend.position = "none") is the equivalent.
