In Python, how do I remove extra empty elements from a two-dimensional array? - python-3.4

For example:
list = [['2', '2', '', ''], ['3', '3', '', ''], ['4', '4', '', '']]
and I want:
newlist = [['2','2'],['3','3'],['4','4']]
Is there a compact list-comprehension way to achieve this?
For a 1D array we have [x for x in strings if x]; is there anything similar for this case?

I think you mean that you wish to remove empty elements from your 2D array. If that is the case, then:
old_list = [['2', '2', '', ''], ['3', '3', '', ''], ['4', '4', '', '']]
new_list = [[instance for instance in sublist if len(instance)>0] for sublist in old_list]
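If you also wish to remove elements containing only whitespace (spaces etc.), then you may do something like the following. Note that ''.isspace() is False, so a test based on isspace() alone would keep the empty strings; strip() handles both cases:
old_list = [['2', '2', '', ''], ['3', '3', '', ''], ['4', '4', '', '']]
new_list = [[instance for instance in sublist if instance.strip()] for sublist in old_list]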

list = [[x for x in y if x != ''] for y in list]

You can achieve this using filter. Also, unrelated: list is the name of a built-in type (not a reserved word, but shadowing it breaks later calls to list()), so it is best not to use it as a variable name and to choose something more meaningful. I have simply renamed it to original_list, since the list() call below would not work otherwise.
original_list = [['2', '2', '', ''], ['3', '3', '', ''], ['4', '4', '', '']]
new_list = []
for sub_list in original_list:
    new_sub_list = list(filter(None, sub_list))
    new_list.append(new_sub_list)
print(new_list)
Or in short
new_list2 = [ list(filter(None, sub_list)) for sub_list in original_list ]
print(new_list2)

Use this list comprehension:
list = [[x for x in a if x] for a in list]
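
The answers above differ slightly in what they drop. As a minimal sketch (the sample data with whitespace-only entries and the helper names drop_empty and drop_blank are assumptions added for illustration, not part of the question), the following compares a truthiness filter with a strip-based filter:

```python
# Hypothetical sample: the question's list with whitespace-only cells added.
rows = [['2', '2', '', ' '], ['3', '3', '', '\t'], ['4', '4', '', '']]


def drop_empty(table):
    """Drop only empty strings (falsy cells) from each row."""
    return [[cell for cell in row if cell] for row in table]


def drop_blank(table):
    """Drop empty strings and whitespace-only strings from each row."""
    return [[cell for cell in row if cell.strip()] for row in table]


print(drop_empty(rows))  # whitespace-only cells are kept
print(drop_blank(rows))  # only the non-blank cells remain
```

Which variant is appropriate depends on whether a cell such as ' ' should count as data.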


spaCy Example object format for SpanCategorizer

I am having an issue with SpanCategorizer that I believe is due to my Example object format and possibly its initialization.
Can someone provide a very simple Example object with the correct format? Just an example with two docs and two labels will do.
I do not understand what the prediction and the reference should look like. A gold standard is mentioned in the spaCy documentation, but it looks out of date because the line reference = parse_gold_doc(my_data) doesn't work. Thanks so much for your help!
Here is the code I am using to annotate the docs:
```
phrase_matches = phrase_matcher(doc)
# Initialize an empty SpanGroup for each label
for label in labels:
    doc.spans[label] = []
# Detect phrase matches, label the spans, and fill the SpanGroups for each doc
for match_id, start, end in phrase_matches:
    match_label = nlp.vocab.strings[match_id]
    span = doc[start:end]
    span = Span(doc, start, end, label=match_label)
    # Add the span to the SpanGroup that belongs to its label
    doc.spans[match_label].append(span)
```
However, spaCy is not recognizing my labels.
If you want/need to create Example objects directly, the easiest way to do so is to use the function Example.from_dict, which takes a predicted doc and a dict. predicted in this context is a Doc with partial annotations, representing data from previous components. For many use-cases, it can just be a "clean" doc created with nlp.make_doc(text):
from spacy.training import Example
from spacy.lang.en import English
nlp = English()
text = "I like London and Berlin"
span_dict = {"spans": {"my_spans": [(7, 13, "LOC"), (18, 24, "LOC"), (7, 24, "DOUBLE_LOC")]}}
predicted = nlp.make_doc(text)
eg = Example.from_dict(predicted, span_dict)
This function takes the annotations from the dict and uses them to define the gold standard that is now stored in the Example object eg.
If you print this object (using spaCy >= 3.4.2), you'll see the internal representation of those gold-standard annotations:
{'doc_annotation': {'cats': {}, 'entities': ['O', 'O', 'O', 'O', 'O'], 'spans': {'my_spans': [(7, 13, 'LOC', ''), (18, 24, 'LOC', ''), (7, 24, 'DOUBLE_LOC', '')]}, 'links': {}}, 'token_annotation': {'ORTH': ['I', 'like', 'London', 'and', 'Berlin'], 'SPACY': [True, True, True, True, False], 'TAG': ['', '', '', '', ''], 'LEMMA': ['', '', '', '', ''], 'POS': ['', '', '', '', ''], 'MORPH': ['', '', '', '', ''], 'HEAD': [0, 1, 2, 3, 4], 'DEP': ['', '', '', '', ''], 'SENT_START': [1, 0, 0, 0, 0]}}
PS: the parse_gold_doc function in the docs is just a placeholder/dummy function. We'll clarify that in the docs to avoid confusion!

Mutating a Julia dictionary through aliases

Let's say I have a dictionary like so:
my_object = Dict{Symbol, Any}(
    :foo => Dict{Symbol, Any}(
        :foo_items => ["item_a", "item_b", "item_c"],
        :bar => Dict{Symbol, Any}(
            :bar_name => ["name_a", "name_b", "name_c"],
            :type_before => ["Float32", "Float64", "String"],
            :type_after => ["Int32", "Int64", "Int8"]
        )
    )
)
And I want to convert these arrays, each with different functions, such as making them vectors of Symbol rather than String. I could mutate this dictionary directly, like this:
# Need to check these keys are present
if haskey(my_object, :foo)
    if haskey(my_object[:foo], :foo_items)
        my_object[:foo][:foo_items] = Symbol.(my_object[:foo][:foo_items])
    ...
end
This however quickly becomes very tedious, with lots of duplication, and is therefore error-prone. I was hoping to use aliasing to make this a bit simpler and more readable, especially because containers like Dict are passed by reference:
if haskey(my_object, :foo)
    foo = my_object[:foo]
    if haskey(foo, :foo_items)
        foo_items = foo[:foo_items]
        foo_items = Symbol.(foo_items)
    ...
end
This, however, does not work: my_object remains unchanged. That is strange, because === implies that the memory addresses are the same up until the actual change is made:
julia> foo = my_object[:foo];
julia> foo === my_object[:foo]
true
julia> foo_items = foo[:foo_items];
julia> foo_items === my_object[:foo][:foo_items]
true
Is this a case of copy-on-write? Why can't I mutate the dictionary this way? And what can I do instead if I want to mutate elements of a nested dictionary in a simpler way?
I would do it recursively
function conversion!(dict, keyset)
    for (k, v) in dict
        if v isa Dict
            conversion!(v, keyset)
        else
            if k in keyset
                dict[k] = converted(Val(k), v)
            end
        end
    end
end
converted(::Val{:foo_items}, value) = Symbol.(value)
# converted(::Val{:bar_name}, value) = ...
my_object = Dict{Symbol, Any}(
    :foo => Dict{Symbol, Any}(
        :foo_items => ["item_a", "item_b", "item_c"],
        :bar => Dict{Symbol, Any}(
            :bar_name => ["name_a", "name_b", "name_c"],
            :type_before => ["Float32", "Float64", "String"],
            :type_after => ["Int32", "Int64", "Int8"]
        )
    )
)
toconvert = Set([:foo_items])#, :bar_name, :type_before, :type_after])
@show my_object
conversion!(my_object, toconvert)
@show my_object
my_object = Dict{Symbol, Any}(:foo => Dict{Symbol, Any}(:foo_items => ["item_a", "item_b", "item_c"], :bar => Dict{Symbol, Any}(:type_before => ["Float32", "Float64", "String"], :bar_name => ["name_a", "name_b", "name_c"], :type_after => ["Int32", "Int64", "Int8"])))
my_object = Dict{Symbol, Any}(:foo => Dict{Symbol, Any}(:foo_items => [:item_a, :item_b, :item_c], :bar => Dict{Symbol, Any}(:type_before => ["Float32", "Float64", "String"], :bar_name => ["name_a", "name_b", "name_c"], :type_after => ["Int32", "Int64", "Int8"])))
I feel this code may also be more type-stable.
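As for why the aliased version in the question leaves my_object unchanged: this is not copy-on-write. Assigning to a plain name (foo_items = Symbol.(foo_items)) rebinds that local name to a new array; it does not touch the entry stored in the dictionary. Assigning through the container, for example foo[:foo_items] = Symbol.(foo[:foo_items]), does update it. The sketch below shows the same distinction in Python, used here only as an analogy (the Python names and the upper-casing transformation are assumptions for illustration, not a translation of the Julia code):

```python
# Python analogy for rebinding a name versus assigning through a container.
my_object = {"foo": {"foo_items": ["item_a", "item_b", "item_c"]}}

foo = my_object["foo"]        # alias: the same dict object
foo_items = foo["foo_items"]  # alias: the same list object

# Rebinding the local name builds a new list; the nested dict is untouched.
foo_items = [s.upper() for s in foo_items]
unchanged = my_object["foo"]["foo_items"]  # still the original strings

# Assigning through the aliased container updates the nested dict.
foo["foo_items"] = [s.upper() for s in foo["foo_items"]]
changed = my_object["foo"]["foo_items"]

print(unchanged)
print(changed)
```

So aliasing the intermediate dictionaries works fine for shortening the code, as long as the final write goes through a key or index rather than a bare variable name.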

Is there a way to find if a sequence of two chars are found in a list only if they are consecutive?

I am currently working in Elm. An example would be:
(Sequence ('a') ('b')) ('c') ['a', 'b', 'c', 'd']. In this example, I only test whether the elements 'a', 'b', 'c' are members of the list. If they are, I partition it and obtain (['a','b','c'],['d']).
I encountered problems in the following case:
(Sequence ('a') ('b')) ('c') ['a', 'b', 'c', 'a']
obtaining the result:
(['a','b','c','a'],[])
My question is: what condition should I add so that the elements 'a' and 'b' must be consecutive, avoiding the case where they are matched individually?
This answer assumes that if you have Sequence 'a' 'b' 'c' and test it against the list ['a', 'b', 'c', 'a'], you want to receive the result (['a', 'b', 'c'], ['a']) (as asked in this comment).
In pseudo-code:
Split the list into two, list1 and list2. list1 should have the same length as your sequence. Elm provides List.take and List.drop for that
Convert your sequence into a list list_sequence with a helper function
Test if list1 and list_sequence are equal
If they are, return the tuple (list1, list2)
And here is the actual Elm code:
https://ellie-app.com/bjBLns4dKkra1
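The pseudo-code steps above can also be sketched in Python for readers who cannot open the link. This is only an illustration under stated assumptions: the function name split_on_prefix is hypothetical, the sequence is passed as a plain list, and slicing stands in for Elm's List.take and List.drop. It is not the code behind the Ellie link.

```python
def split_on_prefix(sequence, items):
    """Return (matched, rest) if items starts with sequence, else None."""
    n = len(sequence)
    # Slicing plays the role of List.take n and List.drop n.
    head, tail = items[:n], items[n:]
    if head == list(sequence):
        return head, tail
    return None


print(split_on_prefix(['a', 'b', 'c'], ['a', 'b', 'c', 'a']))
print(split_on_prefix(['a', 'b', 'c'], ['a', 'c', 'b', 'a']))
```

Because the comparison is made on the whole prefix at once, 'a' and 'b' only match when they appear consecutively and in order.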
Here is some code that tests if a sequence of elements occurs in a list:
module Main exposing (main)
import Html exposing (Html, text)
containsSeq : List a -> List a -> Bool
containsSeq seq list =
    let
        helper remainingSeq remainingList savedSeq savedList =
            case remainingSeq of
                [] ->
                    True
                x :: xs ->
                    case remainingList of
                        [] ->
                            False
                        y :: ys ->
                            if x == y then
                                helper xs ys (savedSeq ++ [ x ]) (savedList ++ [ y ])
                            else
                                case savedList of
                                    [] ->
                                        helper (savedSeq ++ remainingSeq) ys [] []
                                    y2 :: y2s ->
                                        helper (savedSeq ++ remainingSeq) (y2s ++ remainingList) [] []
    in
    helper seq list [] []
main =
    text <| Debug.toString <| containsSeq [ 'a', 'b', 'c' ] [ 'a', 'b', 'a', 'b', 'c', 'd' ]
This only checks whether the sequence appears, and the elements must be of a type that can be compared with ==.
Here is the above function altered to return a partitioning of the old list as a 3 elements Tuple with (elementsBefore, sequence, elementsAfter). The result is wrapped in a Maybe so that if the sequence is not found, it returns Nothing.
module Main exposing (main)
import Html exposing (Html, text)
partitionBySeq : List a -> List a -> Maybe ( List a, List a, List a )
partitionBySeq seq list =
    let
        helper remainingSeq remainingList savedSeq savedCurrentList savedOldList =
            case remainingSeq of
                [] ->
                    Just ( savedOldList, seq, remainingList )
                x :: xs ->
                    case remainingList of
                        [] ->
                            Nothing
                        y :: ys ->
                            if x == y then
                                helper xs ys (savedSeq ++ [ x ]) (savedCurrentList ++ [ y ]) savedOldList
                            else
                                case savedCurrentList of
                                    [] ->
                                        helper (savedSeq ++ remainingSeq) ys [] [] (savedOldList ++ [ y ])
                                    y2 :: y2s ->
                                        -- y2 is the element that leaves the partial match; y stays in remainingList
                                        helper (savedSeq ++ remainingSeq) (y2s ++ remainingList) [] [] (savedOldList ++ [ y2 ])
    in
    helper seq list [] [] []
main =
    text <| Debug.toString <| partitionBySeq [ 'a', 'b', 'c' ] [ 'a', 'b', 'a', 'b', 'c', 'd' ]
Of course, if you are only dealing with characters, you might as well convert the list into a String using String.fromList and use String.contains "abc" "ababcd" for the first version and String.split "abc" "ababcd" to implement the second one.

How to input a list of integers in python 3.6.5?

1) L = list(input("Enter some values:"))
print(L)
**Input:** 54321
**Output:** ['5', '4', '3', '2', '1']
2) L = list(eval(input("Enter some values:")))
print(L)
**Error:** TypeError: 'int' object is not iterable
Try this:
L = list(str(input("Enter some values:")))
print(L)
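Note that list(...) over a string always yields one-character strings, and that eval("54321") produces the single integer 54321, which is why list() raises the TypeError shown in the question. If the goal is a list of integers, a possible sketch follows. The helper names parse_digits and parse_numbers are hypothetical, and the two accepted input formats (a run of digits, or whitespace- or comma-separated numbers) are assumptions, since the question does not say how the values are typed in:

```python
def parse_digits(text):
    """Turn a run of digits such as '54321' into [5, 4, 3, 2, 1]."""
    return [int(ch) for ch in text.strip()]


def parse_numbers(text):
    """Turn whitespace- or comma-separated numbers into a list of ints."""
    return [int(token) for token in text.replace(",", " ").split()]


print(parse_digits("54321"))
print(parse_numbers("5 4, 32"))
```

In an interactive script either helper could be called as parse_digits(input("Enter some values:")). Both raise ValueError on non-numeric input, which is usually preferable to passing user input to eval.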

How to functionally convert a nested hash to a list of records?

Let's say I have a nested hash describing money quantities:
my %money = (coins => {'50c' => 4}, notes => {'10' => 1, '20' => 5});
My desired format is a record list:
my @money = [
    (:type('coins'), :subtype('50c'), value => 4),
    (:type('notes'), :subtype('10'), value => 1),
    (:type('notes'), :subtype('20'), value => 5),
];
The most obvious answer is loops:
my @money;
for %money.kv -> $type, %sub-records {
    for %sub-records.kv -> $subtype, $value {
        @money.push: (:$type, :$subtype, :$value);
    }
}
But I'm allergic to separating a variable from the code that populates it. Next, I tried to create the variable with functional transformations on the input hash:
%money.kv.map: -> $k1, %hsh2 { :type($k1) X, %hsh2.kv.map(->$k2, $v2 {:subtype($k2), :$v2, :value($v2)}) }
But I didn't get the nesting right. I want a list of flat lists. Plus, the above is a mess to read.
The compromise is the gather/take construct which lets me construct a list by iteration without any temporary/uninitialized junk in the main scope:
my @money = gather for %money.kv -> $type, %sub-records {
    for %sub-records.kv -> $subtype, $value {
        take (:$type, :$subtype, :$value);
    }
};
But I'm curious, what is the right way to get this right with just list transformations like map, X or Z, and flat? ("key1", "key2", and "value" are fine field names, since an algorithm shouldn't be domain specific.)
Edit: I should mention that in Perl 6, gather/take is the most readable solution (best for code that's not write-only). I'm still curious about the pure functional solution.
my @money = %money.map:
  -> ( :key($type), :value(%records) ) {
    slip
      :$type xx *
      Z
      ( 'subtype' X=> %records.keys )
      Z
      ( 'value' X=> %records.values )
  }
You could do .kv.map: -> $type, %records {…}
-> ( :key($type), :value(%records) ) {…} destructures a Pair object
:$type creates a type => $type Pair
:$type xx * repeats :$type infinitely (Z stops when any of its inputs stops)
('subtype' X=> %records.keys) creates a list of Pairs
(Note that .keys and .values are in the same order if you don't modify the Hash between the calls)
Z zips two lists
slip causes the elements of the sequence to slip into the outer sequence
(flat would flatten too much)
If you wanted them to be sorted:
my @money = %money.sort.map: # 'coins' sorts before 'notes'
  -> ( :key($type), :value(%records) ) {
    # sort by the numeric part of the key
    my @sorted = %records.sort( +*.key.match(/^\d+/) );
    slip
      :$type xx *
      Z
      ( 'subtype' X=> @sorted».key )
      Z
      ( 'value' X=> @sorted».value )
  }
You could do .sort».kv.map: -> ($type, %records) {…}
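For comparison only, the same nested-hash-to-records transformation can be sketched in Python with a single nested comprehension. This is an illustration, not a translation of the Raku code above: the dict-based record layout and the variable names money and records are assumptions.

```python
# Nested mapping equivalent to the %money hash in the question.
money = {"coins": {"50c": 4}, "notes": {"10": 1, "20": 5}}

# One record per (type, subtype) pair, built without an intermediate variable.
records = [
    {"type": kind, "subtype": subtype, "value": value}
    for kind, sub_records in money.items()
    for subtype, value in sub_records.items()
]

print(records)
```

The two for clauses play the role of the outer map and the inner zip, and the flat result needs no explicit slip because a comprehension with two for clauses already produces a single-level list.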
