How do I pretty print or visualise an object of class 'CoreNLP_pb2.ParseTree' in Python/Jupyter Notebook?

I'm using Stanza's CoreNLP client in a Jupyter notebook to do constituency parsing on a string. The final output is an object of class 'CoreNLP_pb2.ParseTree'.
>>> print(type(result))
<class 'CoreNLP_pb2.ParseTree'>
How can I print this in a readable way? When I call print(result) directly, there is no output.
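For context, the object comes from the CoreNLP client roughly like this (the annotator list and client options here are only illustrative, not my exact setup):
from stanza.server import CoreNLPClient

text = "Chris Manning is a nice person."
with CoreNLPClient(annotators=["tokenize", "ssplit", "pos", "parse"], timeout=30000, memory="4G") as client:
    doc = client.annotate(text)
    # each sentence carries its constituency parse as a CoreNLP_pb2.ParseTree
    result = doc.sentence[0].parseTree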

You can convert the CoreNLP_pb2.ParseTree into an nltk.tree.Tree and then call pretty_print() to display the parse tree in a readable way.
from nltk.tree import Tree
def convert_parse_tree_to_nltk_tree(parse_tree):
    # Leaves become plain strings; internal nodes recurse over their children.
    return Tree(parse_tree.value, [convert_parse_tree_to_nltk_tree(child) for child in parse_tree.child]) if parse_tree.child else parse_tree.value
convert_parse_tree_to_nltk_tree(result).pretty_print()
The result is as follows:
                   ROOT
                     |
                     S
      _______________|________________
      |            VP                |
      |         ____|_____           |
     NP         |       NP           |
  ____|____     |   _____|______     |
 NNP     NNP   VBZ DT  JJ     NN     .
  |       |     |   |   |      |     |
Chris  Manning  is  a  nice  person  .


Jupyter Notebook and Lab cannot render this markdown snippet from Google Colab

This is how it looks when rendered on Google Colab.
The actual script:
# Notation
Here is a summary of some of the notation you will encounter.
|General <img width=70/> <br /> Notation <img width=70/> | Description<img width=350/>| Python (if applicable) |
|: ------------|: ------------------------------------------------------------||
| $a$ | scalar, non bold ||
| $\mathbf{a}$ | vector, bold ||
| **Regression** | | | |
| $\mathbf{x}$ | Training Example feature values (in this lab - Size (1000 sqft)) | `x_train` |
| $\mathbf{y}$ | Training Example targets (in this lab Price (1000s of dollars)). | `y_train`
| $x^{(i)}$, $y^{(i)}$ | $i_{th}$Training Example | `x_i`, `y_i`|
| m | Number of training examples | `m`|
| $w$ | parameter: weight, | `w` |
| $b$ | parameter: bias | `b` |
| $f_{w,b}(x^{(i)})$ | The result of the model evaluation at $x^{(i)}$ parameterized by $w,b$: $f_{w,b}(x^{(i)}) = wx^{(i)}+b$ | `f_wb` |
However, in my local Jupyter Notebook/Lab it doesn't render correctly. I installed these extensions in Jupyter Lab, but it still won't render and looks something like this:
Try this below, as it is close:
# Notation
Here is a summary of some of the notation you will encounter.
| General <br> Notation <br /> | Description | Python (if applicable) |
| :-: | :----: | :- |
| $$a$$ | scalar, non bold ||
| $$\mathbf{a}$$ | vector, bold ||
| **Regression** | | | |
| $$\mathbf{x}$$ | Training Example feature values (in this lab - Size (1000 sqft)) | `x_train` |
| $$\mathbf{y}$$ | Training Example targets (in this lab Price (1000s of dollars)). | `y_train` |
| $$x^{(i)}, y^{(i)}$$ | $$i_{th}$$ Training Example | `x_i`, `y_i`|
| m | Number of training examples | `m`|
| $$w$$ | parameter: weight, | `w` |
| $$b$$ | parameter: bias | `b` |
| $$f_{w,b}(x^{(i)})$$ | The result of the model evaluation at $$x^{(i)}$$ parameterized by $$w,b$$: $$f_{w,b}(x^{(i)}) = wx^{(i)}+b$$ | `f_wb` |
This is what it yields in a classic notebook in a session launched here.
Most of the issue is explained here; it seems you need to use double dollar signs when embedding LaTeX in a table. So for all but the first few rows, I simply did a find-and-replace to double the dollar signs and pasted the result in. (I later realized I needed to hand-edit the $$x^{(i)}, y^{(i)}$$ line.) The first few rows I did by hand, trying to understand how they matched and attempting to control the alignment.
I cannot say what is going on with the alignment. According to here, and even using that code, there is supposedly a way to left-align the first column; incorporating that together with the LaTeX kept messing up the table, though.

Parse data in Kusto

I am trying to parse the data below in Kusto and need help.
[[ObjectCount][LinkCount][DurationInUs]]
[ChangeEnumeration][[88][9][346194]]
[ModifyTargetInLive][[3][6][595903]]
I need a generic implementation without any hardcoding.
Ideally, you'd be able to change the component that produces the source data in that format to use a standard format (e.g. CSV, JSON, etc.) instead.
The following could work, but you should consider it very inefficient:
let T = datatable(s:string)
[
    '[[ObjectCount][LinkCount][DurationInUs]]',
    '[ChangeEnumeration][[88][9][346194]]',
    '[ModifyTargetInLive][[3][6][595903]]',
];
let keys = toscalar(
    T
    | where s startswith "[["
    | take 1
    | project extract_all(@'\[([^\[\]]+)\]', s)
);
T
| where s !startswith "[["
| project values = extract_all(@'\[([^\[\]]+)\]', s)
| mv-apply with_itemindex = i keys on (
    extend Category = tostring(values[0]), p = pack(tostring(keys[i]), values[i + 1])
    | summarize b = make_bag(p) by Category
)
| project-away values
| evaluate bag_unpack(b)
--->
| Category | ObjectCount | LinkCount | DurationInUs |
|--------------------|-------------|-----------|--------------|
| ChangeEnumeration | 88 | 9 | 346194 |
| ModifyTargetInLive | 3 | 6 | 595903 |
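For reference, the piece doing the heavy lifting above is extract_all(), which returns every match of the regex capture group as a dynamic array. A minimal standalone check of that behavior (the column name parts is only illustrative):
print s = '[ChangeEnumeration][[88][9][346194]]'
| project parts = extract_all(@'\[([^\[\]]+)\]', s)
// parts --> ["ChangeEnumeration", "88", "9", "346194"]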

R apply script output in different formats for similar inputs

I'm using a nested apply call to get a list of p-values from cor.test between every pair of columns from two tables.
The otud data.frame is 90x11 (90 rows, 11 columns, i.e. dim(otud) gives 90 11) and will be used with different data.frames.
bc and hel are both 90x2 data.frames, so each call gives me 2*11 = 22 p-values:
bc_plist<-apply(bc, 2, function(x) { apply(otud, 2, function(y) { if (cor.test(x,y,method="spearman", exact=FALSE)$p.value<0.05){cor.test(x,y,method="spearman", exact=FALSE)$p.value}}) })
hel_plist<-apply(hel, 2, function(x) { apply(otud, 2, function(y) { if (cor.test(x,y,method="spearman", exact=FALSE)$p.value<0.05){cor.test(x,y,method="spearman", exact=FALSE)$p.value}}) })
For bc I get an output with dim = NULL: a nested list indexed by bc column name and then OTU name, with the p-value inside (a format I have always got from these scripts and am happy with).
But for hel I get an output with dim(hel_plist) of 11 2, i.e. an 11x2 table with the p-values written inside.
Shortened examples of the output:
hel_plist
+--------+--------------+--------------+
| | axis1 | axis2 |
+--------+--------------+--------------+
| Otu037 | 1.126362e-18 | 0.01158251 |
| Otu005 | 3.017458e-2 | NULL |
| Otu068 | 0.00476002 | NULL |
| Otu070 | 1.27646e-15 | 5.252419e-07 |
+--------+--------------+--------------+
bc_plist
$axis1
$axis1$Otu037
[1] 1.247717e-06
$axis1$Otu005
[1] 1.990313e-05
$axis1$Otu068
[1] 5.664597e-07
Why is it like that, when the input formats are all the same? (Shortened examples below.)
bc
+-------+-----------+-----------+
| group | axis1 | axis2 |
+-------+-----------+-----------+
| 1B041 | 0.125219 | 0.246319 |
| 1B060 | -0.022412 | -0.030227 |
| 1B197 | -0.088005 | -0.305351 |
| 1B222 | -0.119624 | -0.144123 |
| 1B227 | -0.148946 | -0.061741 |
+-------+-----------+-----------+
hel
+-------+---------------+---------------+
| group | axis1 | axis2 |
+-------+---------------+---------------+
| 1B041 | -0.0667782322 | -0.1660606406 |
| 1B060 | 0.0214470932 | -0.0611351008 |
| 1B197 | 0.1761876858 | 0.0927570627 |
| 1B222 | 0.0681058251 | 0.0549292399 |
| 1B227 | 0.0516864361 | 0.0774155225 |
| 1B235 | 0.1205676221 | 0.0181712761 |
+-------+---------------+---------------+
How could I force my scripts to always produce "flat" outputs, as in the case of bc?
OK, the different outputs are caused by the NULL results from the conditional function in the bc_plist case. If I modify the code to replace possible NULLs with NAs, I get 2D tables in every case.
So, to keep things consistent:
bc_nmds_plist<-apply(bc_nmds, 2, function(x) { apply(stoma_otud, 2, function(y) { if (cor.test(x,y,method="spearman", exact=FALSE)$p.value<0.05){cor.test(x,y,method="spearman", exact=FALSE)$p.value}else NA}) })
And I get a 2D table out for bc_nmds_plist too.
So I guess this can be called solved, as I now have a piece of code that produces predictable output on any correct input.
If anyone has an idea how to force the output to conform to the previous bc_plist format instead, I would still be interested, as I actually prefer that form:
$axis1
$axis1$Otu037
[1] 1.247717e-06
$axis1$Otu005
[1] 1.990313e-05
$axis1$Otu068
[1] 5.664597e-07
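For what it's worth, one way to always get the nested-list format, regardless of how many NULLs occur, would be to use lapply over the columns, since lapply never simplifies its result. A rough sketch, untested against the original data (the helper name flat_plist and the column selection are only illustrative):
flat_plist <- function(xs, ys) {
  lapply(as.data.frame(xs), function(x) {
    res <- lapply(as.data.frame(ys), function(y) {
      p <- cor.test(x, y, method = "spearman", exact = FALSE)$p.value
      if (p < 0.05) p   # evaluates to NULL when the p-value is not significant
    })
    Filter(Negate(is.null), res)   # drop the non-significant (NULL) entries
  })
}
bc_plist  <- flat_plist(bc[c("axis1", "axis2")],  otud)
hel_plist <- flat_plist(hel[c("axis1", "axis2")], otud)
Unlike apply, lapply always returns a list, so both results would come out in the bc_plist shape (with the non-significant pairs dropped rather than kept as NULLs).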

Combine DataFrame rows into a new column

I am wondering if there is a simple way to achieve this in Julia besides iterating over the rows in a for-loop.
I have a table with two columns that looks like this:
| Name | Interest |
|------|----------|
| AJ | Football |
| CJ | Running |
| AJ | Running |
| CC | Baseball |
| CC | Football |
| KD | Cricket |
...
I'd like to create a table where each Name in first column is matched with a combined Interest column as follows:
| Name | Interest |
|------|----------------------|
| AJ | Football, Running |
| CJ | Running |
| CC | Baseball, Football |
| KD | Cricket |
...
How do I achieve this?
UPDATE: OK, so after trying a few things, including print_joint and grpby, I realized that the easiest way to do this would be the by() function. I'm 99% there.
by(myTable, :Name, df->DataFrame(Interest = string(df[:Interest])))
This gives me my :Interest column as "UTF8String[\"Running\"]", and I can't figure out which method I should use instead of string() (or where to typecast) to get the desired ASCIIString output.
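In case it helps, the missing piece is joining each group's values rather than calling string() on the whole column. A sketch, assuming a reasonably recent DataFrames.jl (the by() call in the update is from an older API):
using DataFrames

df = DataFrame(Name = ["AJ", "CJ", "AJ", "CC", "CC", "KD"],
               Interest = ["Football", "Running", "Running", "Baseball", "Football", "Cricket"])

# group by Name and join each group's interests into one comma-separated string
combined = combine(groupby(df, :Name), :Interest => (v -> join(v, ", ")) => :Interest)
With the older by() API from the update, the equivalent would presumably be by(myTable, :Name, df -> DataFrame(Interest = join(df[:Interest], ", "))).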

Drools - Decision tables without constraints

I need to create a rule with no constraints in a decision table.
i.e.:
rule ...
when
    $p : Person()
then
    $p.setCity("none");
end
I tried this:
| 1 | RuleTable example |
| 2 | CONDITION | ACTION |
| 3 | p:Person() | |
| 4 | name | p.setCity("$param"); |
| 5 | description | config person |
| 6 | | none |
But when I run the application, it throws this exception:
person cannot be resolved
Exception in thread "main" java.lang.IllegalArgumentException: No se puede parsear base de conocimiento. [The knowledge base cannot be parsed.]
Probably it fails because you have no real condition in your table.
Try putting $param == $param as the condition.
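As a rough sketch of what that would generate (the rule name and the cell value are only illustrative): a data cell of, say, 1 under a $param == $param condition template expands to an always-true constraint instead of an empty pattern:
rule "example decision table row"
when
    p : Person( 1 == 1 )   // always-true constraint produced by "$param == $param" with cell value 1
then
    p.setCity("none");
end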
Use a condition as shown in the picture. It will generate DRL like this:
rule "XYZ"
when
doc:Document()
then
doc.setX("Y");
end
