Incremental select using id from SQLite in Twisted

I am trying to select data from a table in SQLite, one row only per call to the function, and I want the row to increment on each call (self.count is initialized elsewhere, and 'line' is irrelevant here). I am using an adbapi connection pool in Twisted to connect to the DB. Here is the code I have tried:
def queryBTData4(self, line):
    self.count = self.count + 1
    uuId = self.count
    query = "SELECT co2_data, patient_Id FROM btdata4 WHERE uid=:uid", {"uid": uuId}
    d = self.dbpool.runQuery(query)
    return d
This code works if I just hard-code uid=1 or any other number in the query (I used autoincrement for uid when I created the DB), but when I try to assign a value to uid (i.e. self.count via uuId) it reports that the operator has to be string or unicode. (I have tried both, but it does not seem to help.) However, I know that the query statement above works just fine in a previous program where I use a cursor and the execute command, and I cannot see why it does not work here. I have tried all sorts of combinations and searched for a solution, but have not found anything yet that works. (I have also tried the statement with brackets and in other forms.)
Thanks for any help or advice.
Here is the entire code:
from twisted.internet import protocol, reactor
from twisted.protocols import basic
from twisted.enterprise import adbapi
import sqlite3, time

class ServerProtocol(basic.LineReceiver):
    def __init__(self):
        self.conn = sqlite3.connect('biomed2.db', check_same_thread=False)
        self.dbpool = adbapi.ConnectionPool("sqlite3", 'biomed2.db', check_same_thread=False)

    def connectionMade(self):
        self.sendLine("conn made")
        factory = protocol.ClientFactory()
        factory.protocol = ClientProtocol
        factory.originator = self
        reactor.connectTCP('localhost', 1234, factory)

    def lineReceived(self, line):
        self._received = line
        self.insertBTData4(self._received)
        self.sendLine("line recvd")

    def forwardLine(self, recipient):
        recipient.sendLine(self._received)

    def insertBTData4(self, data):
        print "data in insert is", data
        chx = data
        PID = 2
        device_ID = 5
        query = "INSERT INTO btdata4(co2_data, patient_Id, sensor_Id) VALUES ('%s','%s','%s')" % (chx, PID, device_ID)
        dF = self.dbpool.runQuery(query)
        return dF

class ClientProtocol(basic.LineReceiver):
    def __init__(self):
        self.conn = sqlite3.connect('biomed2.db', check_same_thread=False)
        self.dbpool = adbapi.ConnectionPool("sqlite3", 'biomed2.db', check_same_thread=False)
        self.count = 0

    def connectionMade(self):
        print "server-client made connection with client"
        self.factory.originator.forwardLine(self)
        #self.transport.loseConnection()

    def lineReceived(self, line):
        d = self.queryBTData4(self)
        d.addCallbacks(self.sendData, self.printError)

    def queryBTData4(self, line):
        self.count = self.count + 1
        uuId = self.count
        query = ("SELECT co2_data, patient_Id FROM btdata4 WHERE uid=:uid", {"uid": uuId})
        dF = self.dbpool.runQuery(query)
        return dF

    def sendData(self, line):
        data = str(line)
        self.sendLine(data)

    def printError(self, error):
        print "Got Error: %r" % error
        error.printTraceback()

def main():
    factory = protocol.ServerFactory()
    factory.protocol = ServerProtocol
    reactor.listenTCP(4321, factory)
    reactor.run()

if __name__ == '__main__':
    main()
The DB is created in another program, thus:
import sqlite3, time, string
conn = sqlite3.connect('biomed2.db')
c = conn.cursor()
c.execute('''CREATE TABLE btdata4
(uid INTEGER PRIMARY KEY, co2_data integer, patient_Id integer, sensor_Id integer)''')
The main program takes data in on the server socket and inserts it into the DB. On the client side, data is read from the DB one row at a time and sent to an external server. The program can also forward data from the server side to the client side if required, but I am not doing so at the moment.
In queryBTData4(), every time the function is called the count increments, and I assign that value to uuId, which I then pass to the query. I have had this query statement working in a program where I do not use the adbapi, but it does not seem to work here. I hope this is clear enough, but if not please let me know and I will try again.
EDIT:
I have modified the program to take one row from the DB at a time (see queryBTData4() below), but have come across another problem.
def queryBTData4(self, line):
    self.count = self.count + 1
    xuId = self.count
    #xuId = 10
    return self.dbpool.runQuery("SELECT co2_data FROM btdata4 WHERE uid = ?", xuId)
    #return self.dbpool.runQuery("SELECT co2_data FROM btdata4 WHERE uid = 10")
When the count gets to 10 I get an error (posted below) which states: "Incorrect number of bindings supplied. The current statement uses 1, and there are 2 supplied."
I have tried setting xuId to 10 (see the commented-out line xuId=10), but I still get the same error. However, if I switch the return statements (to the commented-out return) I do indeed get the correct row, with no error. I have tried converting xuId to unicode, but that makes no difference; I still get the same error. Basically, if I set uid in the return statement to 10 or more (commented-out return) it works, but if I set uid to xuId (i.e. uid=?,xuId) in the first return, it only works while xuId is below 10. The API documentation, as far as I can tell, gives no clue as to why this occurs. (I have also disabled the insert into the DB to rule that out, and checked the SQLite variable limit, which is 999.)
Here are the errors I am getting when using the first return statement.
Got Error: <twisted.python.failure.Failure <class 'sqlite3.ProgrammingError'>>
Traceback (most recent call last):
  File "c:\python26\lib\threading.py", line 504, in __bootstrap
    self.__bootstrap_inner()
  File "c:\python26\lib\threading.py", line 532, in __bootstrap_inner
    self.run()
  File "c:\python26\lib\threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
--- <exception caught here> ---
  File "c:\python26\lib\site-packages\twisted\python\threadpool.py", line 207, in _worker
    result = context.call(ctx, function, *args, **kwargs)
  File "c:\python26\lib\site-packages\twisted\python\context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "c:\python26\lib\site-packages\twisted\python\context.py", line 81, in callWithContext
    return func(*args, **kw)
  File "c:\python26\lib\site-packages\twisted\enterprise\adbapi.py", line 448, in _runInteraction
    result = interaction(trans, *args, **kw)
  File "c:\python26\lib\site-packages\twisted\enterprise\adbapi.py", line 462, in _runQuery
    trans.execute(*args, **kw)
sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current statement uses 1, and there are 2 supplied.
Thanks.

Consider the API documentation for runQuery. Next, consider the difference between these three function calls:
c = a, b
f(a, b)
f((a, b))
f(c)
Finally, don't paraphrase error messages. Always quote them verbatim. Copy/paste whenever possible; make a note when you've manually transcribed them.
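To spell the hint out against the code in the question: runQuery takes the SQL string and its bindings as separate arguments, which it forwards to cursor.execute(). A minimal sketch of the corrected call, using the question's own table and a positional placeholder:

# inside queryBTData4: SQL and bindings are separate arguments, not one tuple
d = self.dbpool.runQuery(
    "SELECT co2_data, patient_Id FROM btdata4 WHERE uid = ?",
    (uuId,),  # a one-element tuple; a bare string here would be treated
              # as a sequence of single-character bindings
)
return d

This also matches the EDIT's quoted error, if xuId in fact arrives as a string: runQuery("... uid = ?", xuId) passes xuId itself as the bindings object, and sqlite3 treats a string as a sequence of bindings, so "10" supplies two bindings for one placeholder while any one-character value happens to work.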

Related

Flask Testing with SQLite DB, deleting not working

In order to test my Flask application, I set up a local SQLite DB with create_engine. To delete, I have my route call TablesEtl().delete_record_by_id().
class TablesEtl:
    def __init__(self) -> None:
        self.engine = create_engine(get_database_engine_uri(), connect_args={"check_same_thread": False})

    def get_session(self) -> Session:
        session = Session(self.engine, autoflush=False)
        return session

    def delete_record_by_id(self, record_id: int, table: Base) -> None:
        session = self.get_session()
        session.query(table).filter(table.id == record_id).delete()
        session.commit()
        session.close()

    def get_record_by_id(self, record_id: int, table: Base) -> dict:
        session = self.get_session()
        query: Base = Query(table, session=session).filter(table.id == record_id).first()
        session.close()
        if query is None:
            return {}
        info = {}
        counting = 0
        for field in query.__table__.columns.keys():
            info[field.replace("_", "-")] = getattr(query, field)
            counting += 1
        return info
The test_delete works by creating a record, deleting it through this route, and then requesting, via the "information" route, information about the record I have just created. It should return a 404, since the record no longer exists.
The problem is that it returns 200, with the correct information about the record I have just created.
This didn't occur when I ran the tests with pytest alone, but now I'm running with pytest-xdist, and I think the second request is coming in before the record has actually been deleted. I also noticed that a db-name.db-journal file was created side by side with my original db-name.db.
def test_not_finding_model_info_after_delete(self) -> None:
    model_id = create_model_on_client(self.client, self.authorization_header, "Model Name")
    self.client.delete(API_DELETE_MODEL, headers=self.authorization_header, json={"model-id": model_id})
    response = get_model(self.client, self.authorization_header, model_id)
    self.assertEqual(404, response.status_code)
When I run it with some time.sleep it works, but I don't want to depend on a sleep for it to work. When I run pytest a second time with the --lf option it also works, but I believe that is because fewer calls are being made to the DB. When I later check all the records that were asked to be deleted against the DB, they are in fact deleted. I believe it has something to do with this .db-journal.
Can someone shed some light on it? Would another DB be better for this problem?
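One test-side pattern that avoids depending on a fixed sleep is to poll the route with a bounded timeout. A minimal sketch (wait_for_status is a hypothetical helper; the usage mirrors the test above):

import time

def wait_for_status(fetch, expected, timeout=5.0, interval=0.1):
    # Call fetch() repeatedly until it returns the expected status code
    # or the timeout elapses; the wait is bounded but not fixed.
    deadline = time.monotonic() + timeout
    while True:
        response = fetch()
        if response.status_code == expected or time.monotonic() >= deadline:
            return response
        time.sleep(interval)

# in the test:
# response = wait_for_status(
#     lambda: get_model(self.client, self.authorization_header, model_id),
#     expected=404,
# )
# self.assertEqual(404, response.status_code)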

Pull list xcoms in TaskGroups not working

My Airflow code has the below PythonOperator callable, where I am creating a list and pushing it to XComs:
keys = []
values = []

def attribute_count_check(e_run_id, **context):
    job_run_id = int(e_run_id)
    da = "select count (distinct row_num) from dds_metadata.dds_temp_att_table where run_id ={}".format(job_run_id)
    cursor.execute(da)
    res = cursor.fetchall()
    view_res = [x for res in res for x in res]
    count_of_sql = view_res[0]
    print(count_of_sql)
    if count_of_sql < 1:
        print("deleting of cluster")
        return 'delete_cluster'
    else:
        print("triggering attr_check")
        num_attributes_per_task = num_attr  # job_config
        diff = math.ceil(count_of_sql / num_attributes_per_task)
        instance = int(diff)
        n = num_attributes_per_task
        global values
        global keys
        for r in range(1, instance + 1):
            #a = r
            keys.append(r)
            lower_ranges = (n * (r - 1)) + 1
            upper_range = (n * (r - 1)) + n
            b = (lower_ranges, upper_range)
            values.append(b)
        task_instance = context['task_instance']
        task_instance.xcom_push(key="di_keys", value=keys)
        task_instance.xcom_push(key="di_values", value=values)
The XComs pushed by the job are shown in a screenshot (omitted here).
Now I am trying to fetch the values from XComs to create clusters dynamically, with the code below:
with TaskGroup('dataproc_create_cluster', prefix_group_id=False) as dataproc_create_clusters:
    for i in zip('{{ ti.xcom_pull(key="di_keys")}}', '{{ ti.xcom_pull(key="di_values")}}'):
        dynmaic_create_cluster = DataprocCreateClusterOperator(
            task_id="create_cluster_{}".format(list(eval(str(i)))[0]),
            project_id='{0}'.format(PROJECT),
            cluster_config=CLUSTER_GENERATOR_CONFIG,
            region='{0}'.format(REGION),
            cluster_name="dataproc-cluster-{}-sit".format(str(i[0])),
        )
But I am getting the below error:
Broken DAG: [/opt/airflow/dags/Cluster_config.py] Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/models/baseoperator.py", line 547, in __init__
    validate_key(task_id)
  File "/usr/local/lib/python3.6/site-packages/airflow/utils/helpers.py", line 56, in validate_key
    "dots and underscores exclusively".format(k=k)
airflow.exceptions.AirflowException: The key (create_cluster_{) has to be made of alphanumeric characters, dashes, dots and underscores exclusively
So I changed the task_id as below:
task_id="create_cluster_"+re.sub(r'\W+', '', str(list(eval(str(i)))[0])),
After which I got the below error:
airflow.exceptions.DuplicateTaskIdFound: Task id 'create_cluster_' has already been added to the DAG
This made me think that the value in XComs was being parsed one literal at a time, so I used render_template_as_native_obj=True.
But I am still getting the duplicate task id error.
Regarding the jinja2 templating outside of templated fields
First, you can only use jinja2 templating in templated fields. Simply said, there are two processes: one is parsing the DAG (which happens first), the other is executing the tasks. At the moment your DAG is parsed, no tasks have run yet, so there is no TaskInstance available, and thus no XCom pull available either. With templated fields, however, you can use jinja2 templating, because the values of those fields are computed at the moment your task executes. At that point, the TaskInstance, and therefore the XCom pull, is available.
For example, in a PythonOperator you can use the following templated fields:
template_fields: Sequence[str] = ('templates_dict', 'op_args', 'op_kwargs')
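For instance, an XCom pull placed in op_args is rendered only when the task runs (an illustrative sketch; the task and callable names are made up):

def show_keys(keys):
    # Rendered at execution time; arrives as a string unless the DAG is
    # created with render_template_as_native_obj=True, which keeps the list.
    print(keys)

show = PythonOperator(
    task_id="show_keys",
    python_callable=show_keys,
    op_args=["{{ ti.xcom_pull(key='di_keys') }}"],  # op_args is a templated field
)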
Changing the number of tasks based on a result of a task.
Second, you cannot change the number of tasks a DAG contains based on the output of one of its tasks; Airflow simply does not support this. There is one exception: mapped tasks. There is a nice example in the docs, copied here:
@task
def make_list():
    # This can also be from an API call, checking a database -- almost anything you like, as long as the
    # resulting list/dictionary can be stored in the current XCom backend.
    return [1, 2, {"a": "b"}, "str"]

@task
def consumer(arg):
    print(list(arg))

with DAG(dag_id="dynamic-map", start_date=datetime(2022, 4, 2)) as dag:
    consumer.expand(arg=make_list())
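Applied to the question, a sketch along the same lines could map the cluster creation over the pushed keys (this assumes Airflow 2.3+ dynamic task mapping; cluster_names is a hypothetical helper, while PROJECT, REGION and CLUSTER_GENERATOR_CONFIG are the question's own names):

@task
def cluster_names(ti=None):
    # Runs at execution time, so the XCom pushed by attribute_count_check
    # is available here.
    keys = ti.xcom_pull(key="di_keys") or []
    return ["dataproc-cluster-{}-sit".format(k) for k in keys]

DataprocCreateClusterOperator.partial(
    task_id="create_cluster",  # one mapped task; Airflow indexes each expansion
    project_id=PROJECT,
    region=REGION,
    cluster_config=CLUSTER_GENERATOR_CONFIG,
).expand(cluster_name=cluster_names())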

How do I turn a file's contents into a dictionary?

I have a function that I want to open .dat files with, to extract data from them, but the problem is I don't know how to turn that data back into a dictionary to store in a variable. Currently, the data in the files is stored like this: {"x":0,"y":1} (it takes up only one line of the file and has the normal structure of a dictionary).
Below is just the function where I open the .dat file and try to extract stuff from it.
def openData():
    file = fd.askopenfile(filetypes=[("Data", ".dat"), ("All Files", ".*")])
    filepath = file.name
    if file is None:
        return
    with open(filepath, "r") as f:
        contents = dict(f.read())
        print(contents["x"])  # let's say there is a key called "x" in that dictionary
This is the error that I get from it (and not because the key "x" is missing from the dict, trust me):
Exception in Tkinter callback
Traceback (most recent call last):
  File "...\AppData\Local\Programs\Python\Python39\lib\tkinter\__init__.py", line 1892, in __call__
    return self.func(*args)
  File "...\PycharmProjects\[this project]\main.py", line 204, in openData
    contents = dict(f.read())
ValueError: dictionary update sequence element #0 has length 1; 2 is required

Process finished with exit code 0
Update: I tried using json and it worked, thanks to @match.
def openData():
    file = fd.askopenfile(filetypes=[("Data", ".dat"), ("All Files", ".*")])
    if file is None:  # check before touching file.name, in case the dialog was cancelled
        return
    filepath = file.name
    with open(filepath, "r") as f:
        contents = dict(json.load(f))
        print(contents["x"])
You need to parse the data to get a data structure from a string. Fortunately, Python provides a function for safely parsing Python data structures: ast.literal_eval(). E.g.:
import ast
...
with open("/path/to/file", "r") as data:
    dictionary = ast.literal_eval(data.read())
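The practical difference from the json.load() approach in the update is the accepted syntax: ast.literal_eval() parses Python literals, while json requires strict JSON. A small comparison (the input string is hypothetical):

import ast
import json

s = "{'x': 0, 'y': 1}"       # single quotes: a valid Python literal, not valid JSON
print(ast.literal_eval(s))   # -> {'x': 0, 'y': 1}
# json.loads(s)              # would raise json.JSONDecodeError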

Rename Dexterity object (id) after copy

It's simple to choose the object ID at creation time with INameChooser.
But we also want to be able to choose the object ID after a clone (and avoid copy_of in object ID).
We tried several different solutions:

  subscribers on events:
    OFS.interfaces.IObjectClonedEvent
    zope.lifecycleevent.interfaces.IObjectAddedEvent
    ...
  a manage_afterClone method on the content class
Every time, we get a traceback because we changed the ID "too soon". For example, when using the Plone API:
File "/Users/laurent/.buildout/eggs/plone.api-2.0.0a1-py3.7.egg/plone/api/content.py", line 256, in copy
return target[new_id]
File "/Users/laurent/.buildout/eggs/plone.folder-3.0.3-py3.7.egg/plone/folder/ordered.py", line 241, in __getitem__
raise KeyError(key)
KeyError: 'copy_of_87c7f9b7e7924d039b832d3796e7b5a3'
Or, with a copy / paste in the Plone instance:
Module plone.app.content.browser.contents.paste, line 42, in __call__
Module OFS.CopySupport, line 317, in manage_pasteObjects
Module OFS.CopySupport, line 229, in _pasteObjects
Module plone.folder.ordered, line 73, in _getOb
AttributeError: 'copy_of_a7ed3d678f2643bc990309cde61a6bc5'
It's logical, because the ID is stored before the event notifications / the manage_afterClone call, for later use.
Even defining a _get_id on containers cannot work to define the ID, because we don't have the object to take attributes from (and generate the ID).
But then, how can we achieve this in a clean way?
Please tell me there is a better solution than redefining _pasteObjects (OFS.CopySupport) !
Thank you for your input!
So unfortunately not...
But you can access the original object from within _get_id.
For example:
from OFS.CopySupport import _cb_decode
from plone import api

...

def _get_id(self, id_):
    # copy_or_move => 0 means copy, 1 means move
    copy_or_move, path_segments = _cb_decode(self.REQUEST['__cp'])
    source_path = '/'.join(path_segments[0])  # implement a for loop for more than one copied obj
    app = self.getPhysicalRoot()
    # access to the source obj
    source_obj = app.restrictedTraverse(source_path)
    # Do whatever you need - probably using INameChooser
    ....
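The elided part could, for example, ask the name chooser for a fresh id based on the source object (a sketch under that assumption, not tested here):

from zope.container.interfaces import INameChooser

# let Plone's name chooser derive a new id from the source object's title
return INameChooser(self).chooseName(source_obj.Title(), source_obj)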
To have a canonical way of patching this I use collective.monkeypatcher. Once installed, add the following in ZCML:
<include package="collective.monkeypatcher" />

<monkey:patch
    class="OFS.CopySupport.CopyContainer"
    original="_get_id"
    replacement="patches.patched_get_id"
    />
Where patches.py is your module containing the new method patched_get_id, which replaces _get_id.
I'm sorry I don't have better news for you, but this is how I solved a similar requirement.
This code (the patched _get_id) appends a counter to the end of an id if that id is already taken:
import re

def patched_get_id(self, id_):
    match = re.match(r'^(.*)-([\d]+)$', id_)
    if match:
        id_ = match.group(1)
        number = int(match.group(2))
    else:
        number = 1
    new_id = id_
    while new_id in self.objectIds():
        new_id = '{}-{}'.format(id_, number)
        number += 1
    return new_id
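For example, assuming the patch is active: pasting an object with id document into a folder that already contains document and document-1 should yield document-2, rather than the default copy_of_document.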

python3 correct argv usage

Here is the code I have for reading a text file and storing it as a dictionary:
from sys import argv

def data(file):
    d = {}
    for line in file:
        if line.strip() != '':
            key, value = line.split(":")
            if key == 'RootObject':
                continue
            if key == 'Object':
                obj = value.strip()
                d[obj] = {}
            else:
                d[obj][key] = value.strip()
    return d

file = open(argv[1])
planets = data(file)
print(planets)
My question is: did I implement argv correctly, so that any user can run the program by just typing solardictionary.py random.txt at the command line? I tried running this, but I keep getting an IndexError, and I'm not sure whether something is wrong with my argv implementation.
If you import sys rather than from sys import argv, you need to type file = open(sys.argv[1], 'r') in order to access the array, because it is contained within the sys module; with your from sys import argv, open(argv[1]) works as written. The IndexError itself means argv has no element at index 1, i.e. the script was run without a filename argument.
http://docs.python.org/3.1/library/sys.html#module-sys
You may also be interested in opening the file within a try/except block.
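As a concrete guard against the IndexError, the calling code at the bottom of the script might become something like this (a sketch; the usage message and error handling are illustrative):

from sys import argv, exit

if len(argv) < 2:
    # no filename was given on the command line -> avoid the IndexError
    exit("usage: solardictionary.py <datafile>")

try:
    with open(argv[1]) as file:  # the with-block closes the file automatically
        planets = data(file)
except OSError as err:           # e.g. the file does not exist
    exit("cannot open {}: {}".format(argv[1], err))

print(planets)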
