How do I check if an object exists in the Pony ORM cache? - python-3.6

I am doing some data processing and then inserting everything in one go. The tables have composite keys, which I use to check whether a record for that id already exists; if it does, the record should be updated instead of created.
During processing the same id can appear in the sample data multiple times. The lookup finds no matching record for that id in the db, so my code tries to create the record every time.
Is there any way I can check the cache for a composite key match?

You can use the get method of an entity class to find an object by a combination of primary key values.
The get method searches both the db_session cache and the database, and returns None if no object is found:
from pony import orm
db = orm.Database('sqlite', ':memory:')
class Point(db.Entity):
    x = orm.Required(int)
    y = orm.Required(int)
    description = orm.Optional(str)
    orm.PrimaryKey(x, y)
db.generate_mapping(create_tables=True)
points = [(1, 2, 'foo'), (3, 4, 'bar'), (1, 2, 'baz'), (5, 6, 'qux')]
with orm.db_session:
    for a, b, s in points:
        point = Point.get(x=a, y=b)
        if point is None:
            point = Point(x=a, y=b, description=s)
The get method also works with secondary composite keys.
Also you can search for an entity instance using square brackets and catch an exception:
with orm.db_session:
    try:
        point = Point[10, 20]
    except orm.ObjectNotFound:
        point = Point(x=10, y=20)
Update: clarification about IdentityMap pattern usage
PonyORM uses the IdentityMap design pattern. That means that all objects of the same class loaded or created inside a db_session are indexed by their primary key values. Multiple objects of the same class with the same primary key cannot exist inside the same db_session. If the get method is called several times with the same primary key value, it returns the same instance:
with orm.db_session:
    a = Point.get(x=10, y=20)
    b = Point.get(x=10, y=20)
    assert a is b # the same object!
The same applies when you create an object and then attempt to get an object with the same primary key: Pony returns the object that you just created:
with orm.db_session:
    a = Point.get(x=10, y=20)
    if a is None:
        a = Point(x=10, y=20)
    # later, in the same db_session:
    b = Point.get(x=10, y=20)
    assert a is b # the same object, not saved yet
On the other hand, if you attempt to create two different objects with the same primary key, you will receive an error:
with orm.db_session:
    a = Point(x=10, y=20)
    b = Point(x=10, y=20)
Traceback (most recent call last):
...
pony.orm.core.CacheIndexError: Cannot create Point: instance with primary key 10, 20 already exists
Also, if you attempt to create an object that already exists in the database without checking for it first, you will get an error:
with orm.db_session:
    a = Point(x=30, y=40)
with orm.db_session:
    b = Point(x=30, y=40)
Traceback (most recent call last):
...
pony.orm.core.TransactionIntegrityError: Object Point[30, 40] cannot be stored in the database. IntegrityError: UNIQUE constraint failed: Point.x, Point.y
In order to avoid this error, check for the object's existence using get before creating a new one.
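To make the identity map idea concrete without needing a database, here is a minimal plain-Python sketch. It is not Pony's implementation: the IdentityMap class, its create method and the upsert_all helper are hypothetical names invented for illustration. It shows how a cache keyed by the composite primary key lets the duplicate (1, 2) row from the sample data update the earlier object instead of creating a second one.

```python
class IdentityMap:
    """Toy session cache (hypothetical): at most one record per composite key."""

    def __init__(self):
        self._by_key = {}  # (x, y) -> record

    def get(self, x, y):
        # Look up by composite key; returns None when nothing is cached.
        return self._by_key.get((x, y))

    def create(self, x, y, description=""):
        key = (x, y)
        if key in self._by_key:
            raise KeyError(f"instance with primary key {key} already exists")
        record = {"x": x, "y": y, "description": description}
        self._by_key[key] = record
        return record


def upsert_all(cache, rows):
    """Create a record for each new key, update the cached one otherwise."""
    for x, y, description in rows:
        record = cache.get(x, y)
        if record is None:
            cache.create(x, y, description)
        else:
            record["description"] = description  # update instead of create
    return cache


points = [(1, 2, "foo"), (3, 4, "bar"), (1, 2, "baz"), (5, 6, "qux")]
cache = upsert_all(IdentityMap(), points)
```

With Pony itself the same loop would call Point.get and then either create the object or assign to its attributes, as in the answer above.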

Related

Vertex in Python Gremlin not updating

Using python gremlin on Neptune workbench, I have two functions:
The first adds a Vertex with a set of properties, and returns a reference to the traversal operation
The second adds to that traversal operation.
For some reason, the first function's operations are getting persisted to the DB, but the second function's operations are not. Why is this?
Here are the two functions:
def add_v(v_type, name):
    tmp_id = get_id(f"{v_type}-{name}")
    result = g.addV(v_type).property('id', tmp_id).property('name', name)
    result.iterate()
    return result
def process_records(features):
    for i in features:
        v_type = i[0]
        name = i[1]
        v = add_v(v_type, name)
        if len(i) > 2:
            %debug
            props = i[2]
            for r in props:
                v.property(r[0], r[1]).iterate()
Your add_v function has already iterated the traversal. If you want to return the traversal from add_v in a way that lets you add more steps to it, remove the iterate() call from add_v.
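As a rough illustration of why the order matters, here is a toy model in plain Python. It is not the gremlin_python API: ToyTraversal is a hypothetical class, and the assumption baked into it is that a traversal is a lazy builder whose recorded steps are submitted once, on the first iterate(), so steps added afterwards never reach the store.

```python
class ToyTraversal:
    """Hypothetical stand-in for a lazy traversal; NOT the gremlin_python API."""

    def __init__(self, store):
        self._store = store   # plays the role of the database
        self._steps = []      # steps recorded but not yet executed
        self._done = False

    def property(self, key, value):
        # Only records the step; nothing is sent yet.
        self._steps.append((key, value))
        return self

    def iterate(self):
        # Assumption: steps are submitted once; a finished traversal is not re-run.
        if not self._done:
            self._store.append(dict(self._steps))
            self._done = True
        return self


db_early, db_late = [], []

# Iterating early: the step added afterwards never reaches the store.
t = ToyTraversal(db_early).property("name", "a")
t.iterate()
t.property("color", "red").iterate()

# Iterating once, after every step has been added.
t = ToyTraversal(db_late).property("name", "a")
t.property("color", "red")
t.iterate()
```

In the toy model db_early only ever sees the name property, while db_late receives both, which mirrors the symptom described in the question.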

Java 8 Map merge VS compute, essential difference?

It seems both the merge and compute Map methods were created to reduce if("~key exists here~") checks when putting.
My problem is: adding a [key, value] pair to a map when I know nothing about it: whether the key exists in the map, whether it exists but has a value, whether value == null, or whether key == null.
words.forEach(word ->
    map.compute(word, (w, prev) -> prev != null ? prev + 1 : 1)
);
words.forEach(word ->
    map.merge(word, 1, (prev, one) -> prev + one)
);
Is the only difference that 1 is moved from the BiFunction to a parameter?
Which is better to use? Does either merge or compute suggest that the key/value already exists?
And what is the essential difference between their use cases?
The documentation of Map#compute(K, BiFunction) says:
Attempts to compute a mapping for the specified key and its current mapped value (or null if there is no current mapping). For example, to either create or append a String msg to a value mapping:
map.compute(key, (k, v) -> (v == null) ? msg : v.concat(msg))
(Method merge() is often simpler to use for such purposes.)
If the remapping function returns null, the mapping is removed (or remains absent if initially absent). If the remapping function itself throws an (unchecked) exception, the exception is rethrown, and the current mapping is left unchanged.
The remapping function should not modify this map during computation.
And the documentation of Map#merge(K, V, BiFunction) says:
If the specified key is not already associated with a value or is associated with null, associates it with the given non-null value. Otherwise, replaces the associated value with the results of the given remapping function, or removes if the result is null. This method may be of use when combining multiple mapped values for a key. For example, to either create or append a String msg to a value mapping:
map.merge(key, msg, String::concat)
If the remapping function returns null, the mapping is removed. If the remapping function itself throws an (unchecked) exception, the exception is rethrown, and the current mapping is left unchanged.
The remapping function should not modify this map during computation.
The important differences are:
For compute(K, BiFunction<? super K, ? super V, ? extends V>):
The BiFunction is always invoked.
The BiFunction accepts the given key and the current value, if any, as arguments and returns a new value.
Meant for taking the key and current value (if any), performing an arbitrary computation, and returning the result. The computation may be a reduction operation (i.e. merge) but it doesn't have to be.
For merge(K, V, BiFunction<? super V, ? super V, ? extends V>):
The BiFunction is invoked only if the given key is already associated with a non-null value.
The BiFunction accepts the current value and the given value as arguments and returns a new value. Unlike with compute, the BiFunction is not given the key.
Meant for taking two values and reducing them into a single value.
If the mapping function, as in your case, depends only on the current mapped value, then you can use either. But I would prefer:
compute if you can guarantee that a value for the given key exists. In this case the extra value parameter taken by the merge method is not needed.
merge if it is possible that no value for the given key exists. In this case merge has the advantage that null does NOT have to be handled by the mapping function.
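The difference in when the function is invoked can be sketched in Python by emulating both methods on a dict. The compute and merge helpers below are hypothetical re-implementations written from the documentation quoted above, not the Java source, and None stands in for Java's null.

```python
def compute(mapping, key, remap):
    """Like Map.compute: remap(key, old_or_None) is always called."""
    new_value = remap(key, mapping.get(key))
    if new_value is None:
        mapping.pop(key, None)  # a None result removes the mapping
    else:
        mapping[key] = new_value
    return new_value


def merge(mapping, key, value, remap):
    """Like Map.merge: remap(old, value) is called only for an existing non-None value."""
    old = mapping.get(key)
    new_value = value if old is None else remap(old, value)
    if new_value is None:
        mapping.pop(key, None)
    else:
        mapping[key] = new_value
    return new_value


words = ["a", "b", "a"]
calls = {"compute": 0, "merge": 0}


def count_compute(word, prev):
    calls["compute"] += 1
    return 1 if prev is None else prev + 1  # must handle the missing case itself


def count_merge(prev, one):
    calls["merge"] += 1
    return prev + one  # never sees None


by_compute, by_merge = {}, {}
for word in words:
    compute(by_compute, word, count_compute)
    merge(by_merge, word, 1, count_merge)
```

Both maps end up with the same counts, but the compute function runs for every word while the merge function runs only for the repeated "a".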

Runtime error:dictionary changed size during iteration

I iterate through the items of a dictionary, "var_dict".
While iterating in a for loop, I need to update the dictionary.
I understand that this is not possible and that it triggers the runtime error I experienced.
My question is: do I need to create a different dictionary to store the data? As it is now, I am trying to use the same dictionary with different keys.
I know the problem comes from iterating through the keys and values of a dictionary while attempting to change it. I want to know whether the best option in this case is to create a separate dictionary.
for k, v in var_dict.items():
    match = str(match)
    match = match.strip("[]")
    match = match.strip("''")
    result = [index for index, value in enumerate(v) if match in value]
    result = str(result)
    result = result.strip("[]")
    result = result.strip("'")
    #====> IF I print(var_dict), at this point I have no error *********
    if result == "0":
        #It means a match between interface on RP PSE2 model was found; Interface position is on PSE2 architecture
        print (f'PSE-2 Line cards:{v} Interfaces on PSE2:{entry} Interface PortID:{port_id}')
        port_id = int(port_id)
        print(port_id)
        if port_id >= 19:
            #print(f'interface:{entry} portID={port_id} CPU_POS={port_cpu_pos} REPLICATION=YES')
            if_info = [entry,'PSE2=YES',port_id,port_cpu_pos,'REPLICATION=YES']
            var_dict['IF_PSE2'].append(if_info)
            #===> *** This is the point that if i attempt to print var_dict, I get the Error during olist(): dictionary changed size during iteration
        else:
            #print(f'interface:{entry},portID={port_id} CPU_POS={port_cpu_pos} REPLICATION=NO')
            if_info = [entry,'PSE2=YES',port_id,port_cpu_pos,'REPLICATION=NO']
            var_dict['IF_PSE2'].append(if_info)
    else:
        #it means the interface is on single PSE. No replication is applicable. Just check threshold between incoming and outgoing rate.
        if_info = [entry,'PSE2=NO',int(port_id),port_cpu_pos,'REPLICATION=NO']
        var_dict['IF_PSE1'].append(if_info)
I made a shallow copy, which let me iterate over the copy while making modifications to the original dictionary. Problem solved. Thanks.
(...)
temp_var_dict = var_dict.copy()
for k, v in temp_var_dict.items():
    (...)
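For anyone who wants to reproduce this in isolation, here is a small self-contained sketch. The dictionary contents and the "_seen" key suffix are made up for illustration. It shows the error when keys are added while iterating over the live items view, and then the same loop running over a snapshot.

```python
var_dict = {"eth0": ["up"], "eth1": ["down"]}

# Adding a key while iterating over the live view raises RuntimeError.
error = None
try:
    for name, state in var_dict.items():
        var_dict[name + "_seen"] = state
except RuntimeError as exc:
    error = str(exc)

# Iterating over a snapshot leaves the original dictionary free to grow.
var_dict = {"eth0": ["up"], "eth1": ["down"]}
for name, state in list(var_dict.items()):
    var_dict[name + "_seen"] = state
```

Both list(var_dict.items()) and var_dict.copy() are shallow: the snapshot has its own set of keys, but the values are the same objects as in the original, so appending to one of the lists is visible through both.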

the relation between add key/value and mapassign method in go1.10.3

I am reading the source code of map in go1.10.3. It seems there is a corresponding runtime function for each map operation, such as:
makemap(t *maptype, hint int, h *hmap) *hmap ==> m = make(map[xx]yy)
mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) ==> m["key"]
but I can't find the corresponding function for the operation that adds a key/value, as below:
m["xx"] = "yy"
There is a function called mapassign that has some similarity with this operation.
mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer
This will add a new key to the map but, as we can see, the input arguments include no value. Another question: when the map already has this key, it may update the key.
if !alg.equal(key, k) {
    continue
}
// already have a mapping for key. Update it.
if t.needkeyupdate {//why??
    typedmemmove(t.key, k, key)
}
Since the two keys are equal, why should the key be updated?
Summary:
1. What is the relation between the add key/value operation and the mapassign function?
2. Why might the key need to be updated in mapassign, since the inserted key and the existing key are equal?
In the operation m[k] = v, mapassign returns the address of the value slot for k, and the caller copies the value v to that address.
The comments in the function needkeyupdate explain why some types need key updates: for floating point and complex keys, +0 and -0 are equal but are different values; for strings, the new key might have a smaller backing store.
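The floating point case is easy to see outside of Go. The snippet below uses Python only to show that +0.0 and -0.0 compare equal, hash the same, and yet are distinguishable values. Note the contrast, which is the point of the example: a Python dict keeps the key that was stored first, whereas the needkeyupdate branch quoted above overwrites the stored key with the new one.

```python
import math

pos_zero, neg_zero = 0.0, -0.0

equal = pos_zero == neg_zero                  # the two zeros compare equal
same_hash = hash(pos_zero) == hash(neg_zero)  # so they select the same map entry
signs = (math.copysign(1.0, pos_zero),        # yet the sign bit tells them apart
         math.copysign(1.0, neg_zero))

# A Python dict keeps the first stored key and only replaces the value.
d = {pos_zero: "first"}
d[neg_zero] = "second"
stored_key_sign = math.copysign(1.0, next(iter(d)))
```

Whether the old or the new key survives an assignment is therefore observable for such types, which is why the runtime has to make a deliberate choice.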

Converting a map into another map using the java 8 stream API

Say I have the following map:
Map<Member, List<Message>> messages = ... //constructed somehow
I would like to use the java 8 stream api in order to obtain a:
SortedMap<Message, Member> latestMessages = ...
Where the comparator passed into the SortedMap/TreeMap would be based on the message sendDate field.
Furthermore, of the list of sent messages, I would select the latest message which would become the key to the sorted map.
How can I achieve that?
edit 1:
Comparator<Message> bySendDate = Comparator.comparing(Message::getSendDate);
SortedMap<Message, Member> latestMessages = third.entrySet().stream()
    .collect(Collectors.toMap(e -> e.getValue().stream().max(bySendDate).get(), Map.Entry::getKey, (x, y) -> {
        throw new AssertionError();
    }, () -> new TreeMap(bySendDate.thenComparing(Comparator.comparing(Message::getId)))));
I get the following compilation error:
The method collect(Collector<? super T,A,R>) in the type Stream<T> is not applicable for the arguments (Collector<Map.Entry<Member,List<Message>>,?,TreeMap>)
Let’s break this into two parts.
First, transform Map<Member, List<Message>> messages into a Map<Message, Member> latestMessages by reducing the messages for a particular communication partner (Member) to the latest:
Map<Message, Member> latestMessages0 = messages.entrySet().stream()
    .collect(Collectors.toMap(
        e -> e.getValue().stream().max(Comparator.comparing(Message::getSendDate)).get(),
        Map.Entry::getKey));
Here, the resulting map isn’t sorted but each mapping will contain the latest message shared with that participant.
Second, if you want to have the resulting map sorted by sendDate, you have to add a secondary sort criterion to avoid losing Messages which happen to have the same date. Assuming that you have a Long ID that is unique, adding this ID as the secondary sort criterion for messages with the same date would be sufficient:
Comparator<Message> bySendDate=Comparator.comparing(Message::getSendDate);
SortedMap<Message, Member> latestMessages = messages.entrySet().stream()
    .collect(Collectors.toMap(
        e -> e.getValue().stream().max(bySendDate).get(),
        Map.Entry::getKey, (x,y) -> {throw new AssertionError();},
        ()->new TreeMap<>(bySendDate.thenComparing(Comparator.comparing(Message::getId)))));
Since sorting by the unique IDs should solve any ambiguity, I provided a merge function which will unconditionally throw, as calling it should never be required.
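The same two-step idea (reduce each list to its latest message, then order by date with the unique id as a tie-breaker) can be sketched in Python. This is only an illustration of the transformation, not a translation of the collector: the Message dataclass, its field names and the latest_messages function are assumptions made up for the example, and an insertion-ordered dict stands in for the TreeMap.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Message:  # hypothetical stand-in for the Message class in the question
    id: int
    send_date: date
    text: str


def latest_messages(messages_by_member):
    """Map each member's latest message to that member, ordered by (send_date, id)."""
    # Step 1: reduce every member's list to the message with the latest date.
    latest = {
        max(msgs, key=lambda m: m.send_date): member
        for member, msgs in messages_by_member.items()
    }
    # Step 2: order by date, using the unique id to break ties between equal dates.
    ordered = sorted(latest.items(), key=lambda item: (item[0].send_date, item[0].id))
    return dict(ordered)  # dicts preserve insertion order


inbox = {
    "alice": [Message(1, date(2020, 1, 1), "hi"), Message(2, date(2020, 3, 1), "bye")],
    "bob": [Message(3, date(2020, 3, 1), "yo")],
    "carol": [Message(4, date(2020, 2, 1), "hey")],
}
result = latest_messages(inbox)
```

Here the messages from alice and bob share a date, and the id decides their order, so neither entry is lost. Unlike the Java version, this sketch does not raise if two members map to an equal message; the later one would silently win.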
