How to ignore annotation characters on SyntaxNet? - syntaxnet

I want to ignore annotation characters when parsing text with SyntaxNet.
For example, in the case below, I want to ignore the <X> and </X> annotation tags.
<PERSON>Michael Jordan</PERSON> is a professor at <LOC>Berkeley</LOC>.
I expect the following output.
_ <PERSON> _ ...
1 Michael _ ...
2 Jordan _ ...
_ </PERSON> _ ...
3 is _ ...
...
Does SyntaxNet have a feature like this?

No, SyntaxNet has no specific feature for handling XML tags. However, you can easily preprocess your data in Python with something like:
import xml.etree.ElementTree as ET
tree = ET.fromstring(
    "<DOC><PERSON>Michael Jordan</PERSON> is a "
    "professor at <LOC>Berkeley</LOC>.</DOC>")
# encoding='unicode' returns a str on Python 3; method='text' drops the tags
notags = ET.tostring(tree, encoding='unicode', method='text')
print(notags)
See also Python strip XML tags from document.
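If the tag positions need to be kept so they can be re-inserted into the parser output afterwards, one option is to strip the tags yourself and record where they were. The following is only a sketch under stated assumptions: tags are simple, attribute-free markers like <PERSON> and </PERSON>, and split_tags is a hypothetical helper name, not part of SyntaxNet.

```python
import re

# Assumption: annotation tags are simple, attribute-free markers such as <PERSON> or </LOC>.
TAG = re.compile(r"</?[A-Za-z]+>")

def split_tags(text):
    """Return (plain_text, tags); tags is a list of (offset_in_plain_text, tag)."""
    plain_parts, tags = [], []
    pos = 0      # position in the annotated input
    offset = 0   # position in the tag-free output
    for match in TAG.finditer(text):
        chunk = text[pos:match.start()]
        plain_parts.append(chunk)
        offset += len(chunk)
        tags.append((offset, match.group()))
        pos = match.end()
    plain_parts.append(text[pos:])
    return "".join(plain_parts), tags

plain, tags = split_tags(
    "<PERSON>Michael Jordan</PERSON> is a professor at <LOC>Berkeley</LOC>.")
print(plain)
print(tags)
```

The plain text can then be fed to the parser, and the recorded offsets used to place the tag rows back between the numbered tokens.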

Related

How to replace the single quote ( ' ) into double quotes ( " ) in Robot Framework?

I have a list whose items are appended in a loop. I want to use this list as JSON. The problem is that the items of the list are printed with single quotes, so the result is not valid JSON.
Get Order Items
    [Tags]    Get Order Items
    Set Headers    ${HEADER CUSTOMER}
    GET    ${ORDER GET BY CODE ENDPOINT}/${ORDER CODE}
    Integer    response status    200
    ${NUMBER OF ITEMS}    Output    $.number_of_items
    ${NUMBER OF ITEMS}    Evaluate    ${NUMBER OF ITEMS} + 1
    ${ORDER ITEMS}    Create List
    :FOR    ${i}    IN RANGE    1    ${NUMBER OF ITEMS}
    \    Append To List    ${ORDER ITEMS}    ${ORDER CODE}${i}
    Set Global Variable    ${ORDER ITEMS}
Actual result: ['N19072596HB1', 'N19072596HB2', 'N19072596HB3', 'N19072596HB4', 'N19072596HB5']
Expected result: ["N19072596HB1", "N19072596HB2", "N19072596HB3", "N19072596HB4", "N19072596HB5"]
This: ['N19072596HB1', 'N19072596HB2', 'N19072596HB3', 'N19072596HB4', 'N19072596HB5'] is Python's string representation of a list, which uses single quotes.
Since your end goal is to use the double-quoted version as JSON, the best bet is to let Python's json library convert it for you instead of replacing single quotes with double quotes:
${list as string}=    Evaluate    json.dumps($ORDER_ITEMS)    json
(note that the variable is not surrounded by curly brackets, so you're passing the actual variable object to the method, not its string value)
Why not use a simple string replacement?
Because you don't want to deal with cases where a list member may have a quote, single or double - like ['N19072596HB1', "N1907'2596HB2", 'N19072596HB1', 'N19072"596HB2'].
A trivial string replace will fail miserably, while the json library is designed to handle these cases as expected.
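The difference can be seen in plain Python. This is a small sketch, not part of the original test suite; the item values are taken from the example above and is_valid_json is a hypothetical helper.

```python
import json

# List members containing a single or a double quote, as in the example above.
items = ["N19072596HB1", "N1907'2596HB2", 'N19072"596HB2']

naive = str(items).replace("'", '"')   # plain string replacement on the repr
proper = json.dumps(items)             # real JSON serialization

def is_valid_json(text):
    """Hypothetical helper: True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False

print(is_valid_json(naive))
print(is_valid_json(proper))
print(json.loads(proper) == items)
```

The replaced string no longer parses because the embedded quotes end up unescaped, while the json.dumps output round-trips back to the original list.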

I cannot transfer the list/dictionary into my test library in robot framework

I want to pass a list as a parameter to my library keyword:
ModifyDefaultValue
    ${DataJson}    ModifyDefaultValue    ${DataJson}    @{vargs}
The @{vargs} list combines strings and a list:
    @{vargs}    Create List    NO=1227003021    requestType=0    destination=@{destinations}
In my library:
def ModifyDefaultValue(self, dictOri, *vargs):
    '''
    *vargs: list type; the format is: var1=value1, var2=value2
    '''
    logger.info("SmartComLibrary ModifyDefaultValue()", also_console=True)
    for i in range(len(vargs)):
        logger.info("\t----Type: %s" % str(vargs[i].split("=")[1].__class__))
The type is always:
20160630 22:11:07.501 : INFO : ----Type: <type 'unicode'>
But I want "destination" to be a list.
Create List will create a list of 3 strings no matter what you put after destination= below.
Create List    NO=1227003021    requestType=0    destination=@{destinations}
It looks like you are trying to implement keyword arguments by hand. Python and Robot Framework both support them, so there is no need to parse and split on '='. Change your keyword to accept keyword arguments; then, instead of building a list, you build a dictionary.
def ModifyDefaultValue(self, dictOri, **kwargs):
    logger.info("SmartComLibrary ModifyDefaultValue()", also_console=True)
    for k, v in kwargs.items():
        logger.info("\t----Type: %s: %s" % (k, type(v)))
In your test:
${destinations}    Create List    a    b    c
&{kwargs}    Create Dictionary    NO=1227003021    requestType=0    destination=${destinations}
ModifyDefaultValue    asdf    &{kwargs}    # note the & here
Output:
20160630 12:12:41.923 : INFO : ----Type: requestType: <type 'unicode'>
20160630 12:12:41.923 : INFO : ----Type: destination: <type 'list'>
20160630 12:12:41.923 : INFO : ----Type: NO: <type 'unicode'>
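The same effect can be reproduced in plain Python, outside Robot Framework. This is a minimal sketch with hypothetical function names (from_strings, from_kwargs); it runs on Python 3, where the string type prints as str rather than unicode.

```python
def from_strings(*vargs):
    """Each argument is a 'name=value' string, so every split value is a string."""
    result = {}
    for arg in vargs:
        name, value = arg.split("=", 1)
        result[name] = type(value).__name__
    return result

def from_kwargs(**kwargs):
    """Keyword arguments keep the type of the object that was passed in."""
    return {name: type(value).__name__ for name, value in kwargs.items()}

destinations = ["a", "b", "c"]
print(from_strings("NO=1227003021", "destination=%s" % destinations))
print(from_kwargs(NO="1227003021", destination=destinations))
```

In the first call the list has already been flattened into text before the keyword sees it; in the second the list object itself arrives.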
Alternatively, you could also have ModifyDefaultValue take a dict as the second argument.
def ModifyDefaultValue(self, dictOri, args):
    logger.info("SmartComLibrary ModifyDefaultValue()", also_console=True)
    for k, v in args.items():
        logger.info("\t----Type: %s: %s" % (k, type(v)))
In your data:
${destinations}    Create List    a    b    c
&{args}    Create Dictionary    NO=1227003021    requestType=0    destination=${destinations}
ModifyDefaultValue    asdf    ${args}    # note the $ here
See also:
http://robotframework.org/robotframework/latest/RobotFrameworkUserGuide.html#free-keyword-arguments
http://robotframework.org/robotframework/latest/RobotFrameworkUserGuide.html#free-keyword-arguments-kwargs

Erlang: How to create a function that returns a string containing the date in YYMMDD format?

I am trying to learn Erlang and I am working on the practice problems Erlang has on the site. One of them is:
Write the function time:swedish_date() which returns a string containing the date in swedish YYMMDD format:
time:swedish_date()
"080901"
My function:
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
[YYYY,MM,DD] = tuple_to_list(date()),
string:substr((integer_to_list(YYYY, 3,4)++pad_string(integer_to_list(MM))++pad_string(integer_to_list(DD)).
pad_string(String) ->
if
length(String) == 1 -> '0' ++ String;
true -> String
end.
I'm getting the following errors when compiled.
demo.erl:6: syntax error before: '.'
demo.erl:2: function swedish_date/0 undefined
demo.erl:9: Warning: function pad_string/1 is unused
error
How do I fix this?
After fixing your compilation errors, you're still facing runtime errors. Since you're trying to learn Erlang, it's instructive to look at your approach and see if it can be improved, and fix those runtime errors along the way.
First let's look at swedish_date/0:
swedish_date() ->
[YYYY,MM,DD] = tuple_to_list(date()),
Why convert the tuple to a list? Since you use the list elements individually and never use the list as a whole, the conversion serves no purpose. You can instead just pattern-match the returned tuple:
{YYYY,MM,DD} = date(),
Next, you're calling string:substr/1, which doesn't exist:
string:substr((integer_to_list(YYYY,3,4) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD))).
The string:substr/2,3 functions both take a starting position, and the 3-arity version also takes a length. You don't need either, and can avoid string:substr entirely and instead just return the assembled string:
integer_to_list(YYYY,3,4) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
Whoops, this is still not right: there is no such function integer_to_list/3, so just replace that first call with integer_to_list/1:
integer_to_list(YYYY) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
Next, let's look at pad_string/1:
pad_string(String) ->
if
length(String) == 1 -> '0' ++ String;
true -> String
end.
There's a runtime error here because '0' is an atom and you're attempting to append String, which is a list, to it. The error looks like this:
** exception error: bad argument
in operator ++/2
called as '0' ++ "8"
Instead of just fixing that directly, let's consider what pad_string/1 does: it adds a leading 0 character if the string is a single digit. Instead of using if to check for this condition (if isn't used that often in Erlang code), use pattern matching:
pad_string([D]) ->
[$0,D];
pad_string(S) ->
S.
The first clause matches a single-element list, and returns a new list with the element D preceded with $0, which is the character constant for the character 0. The second clause matches all other arguments and just returns whatever is passed in.
Here's the full version with all changes:
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
{YYYY,MM,DD} = date(),
integer_to_list(YYYY) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
pad_string([D]) ->
[$0,D];
pad_string(S) ->
S.
But a simpler approach would be to use the io_lib:format/2 function to just format the desired string directly:
swedish_date() ->
io_lib:format("~w~2..0w~2..0w", tuple_to_list(date())).
First, note that we're back to calling tuple_to_list(date()). This is because the second argument for io_lib:format/2 must be a list. Its first argument is a format string, which in our case says to expect three arguments, formatting each as an Erlang term, and formatting the 2nd and 3rd arguments with a width of 2 and 0-padded.
But there's still one more step to address, because if we run the io_lib:format/2 version we get:
1> demo:swedish_date().
["2015",["0",56],"29"]
Whoa, what's that? It's simply a deep list, where each element of the list is itself a list. To get the format we want, we can flatten that list:
swedish_date() ->
lists:flatten(io_lib:format("~w~2..0w~2..0w", tuple_to_list(date()))).
Executing this version gives us what we want:
2> demo:swedish_date().
"20150829"
Find the final full version of the code below.
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
lists:flatten(io_lib:format("~w~2..0w~2..0w", tuple_to_list(date()))).
UPDATE: @Pascal comments that the year should be printed as 2 digits rather than 4. We can achieve this by passing the date list through a list comprehension:
swedish_date() ->
DateVals = [D rem 100 || D <- tuple_to_list(date())],
lists:flatten(io_lib:format("~w~2..0w~2..0w", DateVals)).
This applies the rem remainder operator to each of the list elements returned by tuple_to_list(date()). The operation is needless for month and day but I think it's cleaner than extracting the year and processing it individually. The result:
3> demo:swedish_date().
"150829"
There are a few issues here:
You are missing a parenthesis at the end of line 6.
You are trying to call integer_to_list/3 when Erlang only defines integer_to_list/1,2.
With both fixed, the string:substr wrapper dropped (there is no string:substr/1), and the atom '0' replaced by the string "0", this compiles and runs:
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
    [YYYY,MM,DD] = tuple_to_list(date()),
    integer_to_list(YYYY) ++
    pad_string(integer_to_list(MM)) ++
    pad_string(integer_to_list(DD)).
pad_string(String) ->
    if
        length(String) == 1 -> "0" ++ String;
        true -> String
    end.
In addition to the parenthesis error on line 6, you also have an error on line 10, where you use the form '0' instead of "0", so you define an atom rather than a string.
I understand you are doing this for educational purposes, but I encourage you to dig into the Erlang libraries; it is something you will have to do anyway. For a common problem like this, functions that help already exist:
swedish_date() ->
{YYYY,MM,DD} = date(), % not useful to transform into list
lists:flatten(io_lib:format("~2.10.0B~2.10.0B~2.10.0B",[YYYY rem 100,MM,DD])).
% ~X.Y.ZB means: format the integer in base Y, with a field width of X characters, using Z as the padding character
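For readers who want to check the arithmetic outside Erlang, the same formatting rule (year modulo 100, each field zero-padded to two digits) can be sketched in Python. This is only an illustration of the rule, not a replacement for the Erlang solutions above; the function name mirrors the exercise and takes the date as a parameter so the result is reproducible.

```python
import datetime

def swedish_date(d):
    """Format a date as YYMMDD: two-digit year, zero-padded month and day."""
    return "{:02d}{:02d}{:02d}".format(d.year % 100, d.month, d.day)

print(swedish_date(datetime.date(2015, 8, 29)))
print(swedish_date(datetime.date(2008, 9, 1)))
```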

Anagrams of a word in Ada Programming

How can I get the anagrams of a word in Ada? For example:
I have a string 'one'. How can it be jumbled into 'neo' or 'eon', etc.?
example code:
with Ada.Text_IO; use Ada.Text_IO;
procedure Main is
WordText : String (1 .. 80);
Last : Natural;
begin
Put_Line("Enter Text: ");
Get_Line (WordText, Last);
-- example: I entered 'one'
-- it must be shuffle text per character
-- then it will print shuffled text: 'neo' or 'eno' or 'oen' etc.
Put_Line ("Text Shuffle: " &WordText (1 .. Last));
end Main;
Implement one of the jumble algorithms described here and here. For example,
For each word in a dictionary, sort its constituent letters, preserving duplicates. Use an instance of Ada.Containers.Generic_Array_Sort to sort the letters.
Create a hash map; enter the sorted string as the map's key; add the original word to the set of words that are permutatively equivalent, and use the set as the map's value. Use an instance of Ada.Containers.Ordered_Sets to hold the set of words. Use an instance of Ada.Containers.Hashed_Maps for the map.
For a given string, sort its letters, look up the mapped set and print the words it contains.
A complete example is seen here.
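The three steps can be sketched compactly in Python to show the data flow before writing the Ada version. This is only an illustration under assumptions: the word list is made up, and build_index and anagrams are hypothetical names. A dict of sets stands in for Ada.Containers.Hashed_Maps holding Ada.Containers.Ordered_Sets.

```python
from collections import defaultdict

def build_index(words):
    """Map each word's sorted letters to the set of words sharing those letters."""
    index = defaultdict(set)
    for word in words:
        key = "".join(sorted(word))   # sorting preserves duplicate letters
        index[key].add(word)
    return index

def anagrams(index, text):
    """Sort the letters of text, look up the mapped set, return it in order."""
    return sorted(index.get("".join(sorted(text)), set()))

# Hypothetical dictionary, for illustration only.
index = build_index(["one", "neo", "eon", "two", "tow"])
print(anagrams(index, "noe"))
```

Note that this finds dictionary words that are anagrams of the input; it does not generate arbitrary shuffles of the letters.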

Pulling out a column in all records in a Sqlite table into a concatenated string in Haskell with Persist

I'm trying to learn Haskell, specifically Snap, Blaze HTML5 and Persist. I would like to take every row in a table, select a single column from it, and then concatenate the values into a single string.
I've previously worked with C#'s LINQ quite extensively and under Entity Framework I could do it like this:
String.Join(", ", dbContext.People.Select(p => p.Name));
This would compile down to SELECT Name FROM People, with C# then concatenating those rows into a string with ", " in between.
To try and get the concatenation part right, I put this together, which seems to work:
intercalate ", " $ map show [1..10]
(it counts 1-10 and concatenates the items with ", " in between)
However, I can't get this to work with Database.Persist.Sqlite. I'm not sure I quite understand the syntax here in Haskell. To contact the DB and retrieve the rows, I have to call: (as far as I understand)
runSqlite "TestDB" $ selectList ([] :: [Filter Person]) [] 0 0
The problem is that I'm not sure how to get the list out of runSqlite. runSqlite doesn't return the type I'm after, so I can't use the return value of runSqlite. How would I do this?
Thank you for reading.
To clarify:
Snap requires that I define a function to return the HTML I wish to send back to the client making the HTTP request. This means that:
page = runSqlite "TestDB" $ do
    -- pull data from the DB
is a no-go, as I can't return the data via the runSqlite call, and as far as I know I can't have a variable in the page function that is set within the runSqlite do block. All the examples I can find just write to IO inside the runSqlite do block, which is not what I need here.
The type of runSqlite is:
runSqlite :: (MonadBaseControl IO m, MonadIO m) => Text -> SqlPersistT (NoLoggingT (ResourceT m)) a -> m a
And the type of selectList is:
[Filter val] -> [SelectOpt val] -> m [Entity val]
So you can use do notation to extract the result. Given that type, selectList takes only the filter list and the option list, so the trailing 0 0 is dropped:
runSqlite "TestDB" $ do
    myData <- selectList ([] :: [Filter Person]) []
    -- Now do stuff with myData
The <- binds the result of the monadic action (here, the list of entities) to myData. I would suggest going through this chapter to get an idea of how Persistent is used. Note that the chapters in the book assume a basic understanding of Haskell.
The issue is that I want to use the selectList outside of runSqlite as
I need to pass the concatenated string to a Blaze HTML5 tag builder:
body $ do p (concatenated list...)
For this case, just define a function that does your intended task:
myLogic :: [SqlColumnData] -> String -- Note that SqlColumnData is hypothetical
myLogic xs = undefined
And then call it from your main function:
main = runSqlite "TestDB" $ do
    myData <- selectList ([] :: [Filter Person]) []
    let string = myLogic myData
    -- do any other remaining stuff
It hadn't clicked that if I didn't use a do block with runSqlite, the result of the last call in the statement was the return value of the statement - this makes total sense.
https://gist.github.com/egonSchiele/5400694
In this example (not mine) the readPosts function does exactly what I'm after and cleared up some Haskell syntax confusion.
Thank you for your help @Sibi.
