Extracting dict keys from values - dictionary

I am still learning about python and I face some trouble extracting data from a dict. I need to create a loop which check each values and extract the keys. So for this code I need to find the nice students. I am stuck at line 3 #blank.
How do i go about this?
Thanks in advance
class = {"James":"naughty", "Lisa":"nice", "Bryan":"nice"}
for student in class:
if #blank:
print("Hello, "+student+" students!")
else:
print("odd")

Uses dictionary methods "keys(), values(), items()":
def get_students_by_criteria(student_class, criteria):
students = []
for candidate, value in student_class.items():
if value == criteria:
students.append(candidate)
return students
my_class = {"James":"naughty", "Lisa":"nice", "Bryan":"nice"}
print(get_students_by_criteria(my_class, "nice"))
Warning to the word "class" it is a keyword reserved for python programming oriented object

Related

Neo4j - Project graph from Neo4j with GraphDataScience

So I have a graph of nodes -- "Papers" and relationships -- "Citations".
Nodes have properties: "x", a list with 0/1 entries corresponding to whether a word is present in the paper or not, and "y" an integer label (one of the classes from 0-6).
I want to project the graph from Neo4j using GraphDataScience.
I've been using this documentation and I indeed managed to project the nodes and vertices of the graph:
Code
from graphdatascience import GraphDataScience
AURA_CONNECTION_URI = "neo4j+s://xxxx.databases.neo4j.io"
AURA_USERNAME = "neo4j"
AURA_PASSWORD = "my_code:)"
# Client instantiation
gds = GraphDataScience(
AURA_CONNECTION_URI,
auth=(AURA_USERNAME, AURA_PASSWORD),
aura_ds=True
)
#Shorthand projection --works
shorthand_graph, result = gds.graph.project(
"short-example-graph",
["Paper"],
["Citation"]
)
When I do print(result) it shows
nodeProjection {'Paper': {'label': 'Paper', 'properties': {}}}
relationshipProjection {'Citation': {'orientation': 'NATURAL', 'aggre...
graphName short-example-graph
nodeCount 2708
relationshipCount 10556
projectMillis 34
Name: 0, dtype: object
However, no properties of the nodes are projected. I then use the extended syntax as described in the documentation:
# Project a graph using the extended syntax
extended_form_graph, result = gds.graph.project(
"Long-form-example-graph",
{'Paper': {properties: "x"}},
"Citation"
)
print(result)
#Errors
I get the error:
NameError: name 'properties' is not defined
I tried various variations of this, with or without " ", but none have worked so far (also documentation is very confusing because one of the docs always uses " " and in another place I did not see " ").
Also, note that all my properties are integers in the Neo4j db (in AuraDS), as I used to have the error that String properties are not supported.
Some clarification on the correct way of projecting node features (aka properties) would be very useful.
thank you,
Dina
The keys in the Python dictionaries that you use with the GraphDataScience library should be enclosed in quotation marks. This is different from Cypher syntax, where map keys are not enclosed with quotation marks.
This should work for you.
extended_form_graph, result = gds.graph.project(
"Long-form-example-graph",
{'Paper': {"properties": "x"}},
"Citation"
)
Best wishes,
Nathan

BertModel transformers outputs string instead of tensor

I'm following this tutorial that codes a sentiment analysis classifier using BERT with the huggingface library and I'm having a very odd behavior. When trying the BERT model with a sample text I get a string instead of the hidden state. This is the code I'm using:
import transformers
from transformers import BertModel, BertTokenizer
print(transformers.__version__)
PRE_TRAINED_MODEL_NAME = 'bert-base-cased'
PATH_OF_CACHE = "/home/mwon/data-mwon/paperChega/src_classificador/data/hugingface"
tokenizer = BertTokenizer.from_pretrained(PRE_TRAINED_MODEL_NAME,cache_dir = PATH_OF_CACHE)
sample_txt = 'When was I last outside? I am stuck at home for 2 weeks.'
encoding_sample = tokenizer.encode_plus(
sample_txt,
max_length=32,
add_special_tokens=True, # Add '[CLS]' and '[SEP]'
return_token_type_ids=False,
padding=True,
truncation = True,
return_attention_mask=True,
return_tensors='pt', # Return PyTorch tensors
)
bert_model = BertModel.from_pretrained(PRE_TRAINED_MODEL_NAME,cache_dir = PATH_OF_CACHE)
last_hidden_state, pooled_output = bert_model(
encoding_sample['input_ids'],
encoding_sample['attention_mask']
)
print([last_hidden_state,pooled_output])
that outputs:
4.0.0
['last_hidden_state', 'pooler_output']
While the answer from Aakash provides a solution to the problem, it does not explain the issue. Since one of the 3.X releases of the transformers library, the models do not return tuples anymore but specific output objects:
o = bert_model(
encoding_sample['input_ids'],
encoding_sample['attention_mask']
)
print(type(o))
print(o.keys())
Output:
transformers.modeling_outputs.BaseModelOutputWithPoolingAndCrossAttentions
odict_keys(['last_hidden_state', 'pooler_output'])
You can return to the previous behavior by adding return_dict=False to get a tuple:
o = bert_model(
encoding_sample['input_ids'],
encoding_sample['attention_mask'],
return_dict=False
)
print(type(o))
Output:
<class 'tuple'>
I do not recommend that, because it is now unambiguous to select a specific part of the output without turning to the documentation as shown in the example below:
o = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'], return_dict=False, output_attentions=True, output_hidden_states=True)
print('I am a tuple with {} elements. You do not know what each element presents without checking the documentation'.format(len(o)))
o = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'], output_attentions=True, output_hidden_states=True)
print('I am a cool object and you can acces my elements with o.last_hidden_state, o["last_hidden_state"] or even o[0]. My keys are; {} '.format(o.keys()))
Output:
I am a tuple with 4 elements. You do not know what each element presents without checking the documentation
I am a cool object and you can acces my elements with o.last_hidden_state, o["last_hidden_state"] or even o[0]. My keys are; odict_keys(['last_hidden_state', 'pooler_output', 'hidden_states', 'attentions'])
I faced the same issue while learning how to implement Bert. I noticed that using
last_hidden_state, pooled_output = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'])
is the issue. Use:
outputs = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'])
and extract the last_hidden state using
output[0]
You can refer to the documentation here which tells you what is returned by the BertModel

Merging Value of one dictionary as key of another

I have 2 Dictionaries:
StatePops={'AL':4887871, 'AK':737438, 'AZ':7278717, 'AR':3013825}
StateNames={'AL':'Alabama', 'AK':'Alaska', 'AZ':'Arizona', 'AR':'Arkansas'}
I am trying to merge so the Value of StateNames is the Key for StatePops.
Ex.
{'Alabama': 4887871, 'Alaska': 737438, ...
I also have to display the name of states w/ population over 4million.
Any help is appreciated!!!
You have not specified in what programming language you want this problem to be solved.
Nonetheless, here is a solution in Python.
state_pops = {
'AL': 4887871,
'AK': 737438,
'AZ':7278717,
'AR':3013825
}
state_names = {
'AL':'Alabama',
'AK':'Alaska',
'AZ':'Arizona',
'AR':'Arkansas'
}
states = dict([([state_names[k],state_pops[k]]) for k in state_pops])
final = {k:v for k, v in states.items() if v > 4000000}
print(states)
print(final)
First, you can merge two dictionaries with the predefined dict python function in the states variable as such. Here, k is an iterator and it is used as index for state_names and state_pops.
Then, store the filtered dictionary in final where the states.items() is used to access the keys and values in states and type-cast it as a string with the str function.
There may be more simpler solutions but this is as far as I can optimize the problem.
Hope this helps.
Dictionary Keys cannot be changed in Python. You need to either add the modified key-value pair to the dictionary and then delete the old key, or you can create a new dictionary. I'd opt for the second option, i.e., creating a new dictionary.
myDict = {}
for i in StatePops:
myDict.update({StateNames[i] : StatePops[i]})
This outputs myDict as
{'Alabama': 4887871, 'Alaska': 737438, 'Arizona': 7278717, 'Arkansas': 3013825}

Update dictionary key inside list using map function -Python

I have a dictionary of phone numbers where number is Key and country is value. I want to update the key and add country code based on value country. I tried to use the map function for this:
print('**Exmaple: Update phone book to add Country code using map function** ')
user=[{'952-201-3787':'US'},{'952-201-5984':'US'},{'9871299':'BD'},{'01632 960513':'UK'}]
#A function that takes a dictionary as arg, not list. List is the outer part
def add_Country_Code(aDict):
for k,v in aDict.items():
if(v == 'US'):
aDict[( '1+'+k)]=aDict.pop(k)
if(v == 'UK'):
aDict[( '044+'+k)]=aDict.pop(k)
if (v == 'BD'):
aDict[('001+'+k)] =aDict.pop(k)
return aDict
new_user=list(map(add_Country_Code,user))
print(new_user)
This works partially when I run, output below :
[{'1+952-201-3787': 'US'}, {'1+1+1+952-201-5984': 'US'}, {'001+9871299': 'BD'}, {'044+01632 960513': 'UK'}]
Notice the 2nd US number has 2 additional 1s'. What is causing that?How to fix? Thanks a lot.
Issue
You are mutating a dict while iterating it. Don't do this. The Pythonic convention would be:
Make a new_dict = {}
While iterating the input a_dict, assign new items to new_dict.
Return the new_dict
IOW, create new things, rather than change old things - likely the source of your woes.
Some notes
Use lowercase with underscores when defining variable names (see PEP 8).
Lookup values rather than change the input dict, e.g. a_dict[k] vs. a_dict.pop(k)
Indent the correct number of spaces (see PEP 8)

What's wrong with my filter query to figure out if a key is a member of a list(db.key) property?

I'm having trouble retrieving a filtered list from google app engine datastore (using python for server side). My data entity is defined as the following
class Course_Table(db.Model):
course_name = db.StringProperty(required=True, indexed=True)
....
head_tags_1=db.ListProperty(db.Key)
So the head_tags_1 property is a list of keys (which are the keys to a different entity called Headings_1).
I'm in the Handler below to spin through my Course_Table entity to filter the courses that have a particular Headings_1 key as a member of the head_tags_1 property. However, it doesn't seem like it is retrieving anything when I know there is data there to fulfill the request since it never displays the logs below when I go back to iterate through the results of my query (below). Any ideas of what I'm doing wrong?
def get(self,level_num,h_key):
path = []
if level_num == "1":
q = Course_Table.all().filter("head_tags_1 =", h_key)
for each in q:
logging.info('going through courses with this heading name')
logging.info("course name filtered is %s ", each.course_name)
MANY MANY THANK YOUS
I assume h_key is key of headings_1, since head_tags_1 is a list, I believe what you need is IN operator. https://developers.google.com/appengine/docs/python/datastore/queries
Note: your indentation inside the for loop does not seem correct.
My bad apparently '=' for list is already check membership. Using = to check membership is working for me, can you make sure h_key is really a datastore key class?
Here is my example, the first get produces result, where the 2nd one is not
import webapp2 from google.appengine.ext import db
class Greeting(db.Model):
author = db.StringProperty()
x = db.ListProperty(db.Key)
class C(db.Model): name = db.StringProperty()
class MainPage(webapp2.RequestHandler):
def get(self):
ckey = db.Key.from_path('C', 'abc')
dkey = db.Key.from_path('C', 'def')
ekey = db.Key.from_path('C', 'ghi')
Greeting(author='xxx', x=[ckey, dkey]).put()
x = Greeting.all().filter('x =',ckey).get()
self.response.write(x and x.author or 'None')
x = Greeting.all().filter('x =',ekey).get()
self.response.write(x and x.author or 'None')
app = webapp2.WSGIApplication([('/', MainPage)],
debug=True)

Resources