urequests micropython problem (multiple POST requests to google forms)

urequests micropython problem (multiple POST requests to google forms) - http

I'm trying to send data to Google Forms directly (without and external service like IFTTT) using an esp8266 with micropython. I've already used IFTTT but at this point is not useful for me, i need a sampling rate of more or equal to 100 Hz and as you know this exceeds the IFTTT's usage limit. I've tried making a RAM buffer, but i got a error saying that the buffer exceded the RAM size (4 MB) so that's why im trying to do directly.
After trying some time i got it partially. I say "partially" because i have to do a random get-request after the post-request; i don't know why it works, but it works (in this way i can send data to Google Forms every second approximately, or maybe less). I guess the problem is that the esp8266 can't close the connection with Google Forms and it gets stuck when it tries to do a new post-request, if this were the problem, i don't know how to fix it in another way, any suggestions? The complete code is here:
ssid = 'my_network'
password = 'my_password'
import urequests
def do_connect():
import network
sta_if = network.WLAN(network.STA_IF)
if not sta_if.isconnected():
print('connecting to network...')
sta_if.active(True)
sta_if.connect(ssid, password)
while not sta_if.isconnected():
pass
print('network config:', sta_if.ifconfig())
def main():
do_connect()
print ("CONNECTED")
url = 'url_of_my_google_form'
form_data = 'entry.61639300=example' #have to change the entry
user_agent = {'Content-Type': 'application/x-www-form-urlencoded'}
while True:
response = urequests.post(url, data=form_data, headers=user_agent)
print ("DATA HAVE BEEN SENT")
response.close
print("TRYING TO SEND ANOTHER ONE...")
response = urequests.get("http://micropython.org/ks/test.html") #<------ RANDOM URL, I DON'T KNOW WHY THIS CODE WORKS CORRECTLY IN THIS WAY
print("RANDOM GET:")
print(response.text)
response.close
if __name__ == '__main__':
main()
Thank you for your time guys. Also i've tried with this code before but it DOESN'T WORK. Without the random get-request, it gets stuck after one or two times of posting:
while True:
response = urequests.post(url, data=form_data, headers=user_agent)
print ("DATA HAVE BEEN SENT")
response.close
print("TRYING TO SEND ANOTHER ONE...")

Shouldn't it be response.close() (with brackets)?.. 🤔,
Without brackets you access a (non existing) property close of the object response instead of calling the method close(), and do not really close the connection. This could lead to memory overflow.

Related

Watson Conversation always returning first Variation text

We are using Watson Conversation from Python. Our dialog has responses with variation texts, but we always receive the first one variation -that is the problem. The dialog does work well when you run it from Bluemix Converation Tooling.
def wd_conv_send_message(sTexto):
# Replace with the context obtained from the initial request
context = {}
workspace_id = conv_workspaceid
response = conversation.message(
workspace_id=workspace_id,
message_input={'text': sTexto},
context=context
)
# print(json.dumps(response, indent=2))
print(response['output']['text'][0])

Change:
response = conversation.message(
workspace_id=workspace_id,
message_input={'text': sTexto},
context=context
)
to:
response = conversation.message(
workspace_id=workspace_id,
message_input={'text': sTexto},
context=context
)
context = response['context']
Conversation is stateless. So you need to send back the context you received or it won't know where to continue on from.

It turned out to be a somewhat erratic behaviour from Watson Conversation side, combined with debugging: if you run/debug from Pycharm -either setting Sequential or Random- you get only the very first Variation several times (five or more). But if you run from Python interpreter command line, it seems to work fine. So, I guess -just speculative- it has to do with some timing issue when running from Pycharm.

Can't figure how phone number reveal works

I am pretty new to web-scraping and recently I am trying to automatically scrap phone number for pages like this. I am not supposed to use Selenium/headless url browser libraries and I am trying to find the a way to actually request the phone number using let say a web service or any other possible solution that could give me the phone number hopefully directly without having to go through the actual button press by selenium.
I totally understand that it may not even be possible to automatically reveal the phone number in one shut as it is meant not be accessible by nosy newbie web-scraper like me; but I still like to raise the question for my information to get detailed answer from an expert point of view.
If I search the "Reveal" button DOM element, it shows some tags which I have never seen before. I have two main questions which I believe could be helpful for newbies like me.
1) Given a set of unknown tags/attribues (ie. data-q and data-reveal in the blow button), how is one able to find out which scripts in the page are actually using them?
2) I googled the button element's tag like: data-q and data-reveal the only relevant I could find was this which for some reason I don't have access two even-if I use proxy.
Any clue particularly on the first question is much appreciate it.
Regards,
Below is the href-button code
Reveal

Ok, according to your demand there are several steps before you finally get a solution.
1st step : open your own browser and enter your target page(https://www.gumtree.com/p/vans/2015-ford-transit-custom-2.2tdci-290-l1-h1/1190345514)
2nd step : (Assume you are using Chrome as your favorite browser) Press Ctrl+Shift+I to open the console, and then select 'Network' tag in the console.
3rd step : Press the 'Reveal' button on that page, watch the console carefully, catch the http request which is sent immediately when you press the 'Reveal' button. You can see the request contains a long string of number in Query String Parameters, actually it is a timestamp.
4th step : Also you can see there is a part named 'Request Headers' in that http request, and you should copy the values of referer , user-agent , x-gumtree-token.
5th step : Try to construct your request (I am a fan of Python, So I am going to show you my example code in Python)
import time
import requests
import json
headers = {
'referer': 'please enter the value you just copied from that specific request',
'user-agent': 'please enter the value you just copied from that specific request',
'x-gumtree-token': 'please enter the value you just copied from that specific request'
}
url = 'https://www.gumtree.com/ajax/account/seller/reveal/number/1190345514?_='
current_time = time.time()
current_time = str(current_time)
current_time = current_time.split('.')[0] + current_time.split('.')[1] + '0'
url += current_time
response = requests.get(url=url,headers=headers)
response_result = json.loads(response.content)
phone_number = response_result['data']

Memory leak while sending response from rebus handler

I saw a very strange behavior in my rebus handler which is self hosted in exe. Right after sending response using bus.send method it adds up some memory consumed by process. I tried to look up object graph using memory profile and found that rebus is holding response message in serialized format somewhere.
Object graph was showing below hierarchy to the root.
System.Message --> CachedBodyMessage --> stream
Give me some pointers if anybody is aware of this thing.

I understand that a memory leak is a grave concern, but my belief is that it is unlikely that Rebus should contain a memory leak.
This belief is rooted in the fact that I have been running Windows Service-hosted Rebus endpoints in production for 1,5 years now, and several of them (e.g. the timeout managers) have sometimes been running for several months without being restarted.
I'd like to be absolutely bulletproof sure though, so I'm willing to investigate the issue you're reporting.
You're mentioning "CachedBodyMessage" - judging by the names of fields inside System.Messaging.Message, it sounds like it's something within MSMQ. To try to reproduce your issue, I coded the following test:
[Test, Ignore("Only works in RELEASE mode because otherwise object references are held on to for the duration of the method")]
public void DoesNotLeakMessages()
{
// arrange
const string inputQueueName = "test.leak.input";
var queue = new MsmqMessageQueue(inputQueueName);
disposables.Add(queue);
var body = Encoding.UTF8.GetBytes(new string('*', 32768));
var message = new TransportMessageToSend
{
Headers = new Dictionary<string, object> { { Headers.MessageId, "msg-1" } },
Body = body
};
var weakMessageRef = new WeakReference(message);
var weakBodyRef = new WeakReference(body);
// act
queue.Send(inputQueueName, message, new NoTransaction());
message = null;
body = null;
GC.Collect();
GC.WaitForPendingFinalizers();
// assert
Assert.That(weakMessageRef.IsAlive, Is.False, "Expected the message to have been collected");
Assert.That(weakBodyRef.IsAlive, Is.False, "Expected the body bytes to have been collected");
}
which verifies that the sent transport message is collected as it should (will only do this in RELEASE mode though, because of the way DEBUG mode holds on to object references within scope)
I'll try and run the TimePrinter sample now and leave it running for a while to see if I can reproduce the issue. If you stumble upon more information about e.g. exactly which objects are leaking, it would be very helpful.
Thanks again for taking the time to report your worries to me :)
Followup:
I've modified the TimePrinter sample so that it sends 50 msg/s and includes a 64 KB random string payload with each message, and I've tracked the memory usage for almost four hours now. As you can see, it does not look like memory is being leaked.
I'll leave it running the rest of the day, just to be sure.
Maybe you can tell me some more about why you suspected there was a memory leak in the first place?
Update:
As you can see from the trace, it has now been running for 7 hours and thus more than 1,200,000 messages containing more than 70 GB of data has been sent and consumed by the same process. If cached message bodies were leaking, I am pretty sure that we would have been able to see something rising on the graph.

How do you Identify the request that a QNetworkReply finished signal is emitted in response to when you are doing multiple requests In QtNetwork?

I have a project will load a HTTP page, parse it, and then open other pages based on the data it received from the first page.
Since Qt's QNetworkAccessManager works asyncronusly, it seems I should be able to load more than one page at a time by continuing to make HTTP requests, and then taking care of the response would happen in the order the replies come back and would be handled by the even loop.
I'm a having a few problems figuring out how to do this though:
First, I read somewhere on stackoverflow that you should use only one QNetworkAccess manager. I do not know if that is true.
The problem is that I'm connecting to the finished slot on the single QNetworkAccess manager. If I do more than one request at a time, I don't know what request the finished signal is in response to. I don't know if there is a way to inspect the QNetworkReply object that is passed from the signal to know what reply it is in response to? Or should I actually be using a different QNetworkAccessManager for each request?
Here is an example of how I'm chaining stuff together right now. But I know this won't work when I'm doing more than one request at at time:
from PyQt4 import QtCore,QtGui,QtNetwork
class Example(QtCore.QObject):
def __init__(self):
super().__init__()
self.QNetworkAccessManager_1 = QtNetwork.QNetworkAccessManager()
self.QNetworkCookieJar_1 = QtNetwork.QNetworkCookieJar()
self.QNetworkAccessManager_1.setCookieJar(self.QNetworkCookieJar_1)
self.app = QtGui.QApplication([])
def start_request(self):
QUrl_1 = QtCore.QUrl('https://erikbandersen.com/')
QNetworkRequest_1 = QtNetwork.QNetworkRequest(QUrl_1)
#
self.QNetworkAccessManager_1.finished.connect(self.someurl_finshed)
self.QNetworkAccessManager_1.get(QNetworkRequest_1)
def someurl_finshed(self, NetworkReply):
# I do this so that this function won't get called for a diffent request
# But it will only work if I'm doing one request at a time
self.QNetworkAccessManager_1.finished.disconnect(self.someurl_finshed)
page = bytes(NetworkReply.readAll())
# Do something with it
print(page)
QUrl_1 = QtCore.QUrl('https://erikbandersen.com/ipv6/')
QNetworkRequest_1 = QtNetwork.QNetworkRequest(QUrl_1)
#
self.QNetworkAccessManager_1.finished.connect(self.someurl2_finshed)
self.QNetworkAccessManager_1.get(QNetworkRequest_1)
def someurl2_finshed(self, NetworkReply):
page = bytes(NetworkReply.readAll())
# Do something with it
print(page)
kls = Example()
kls.start_request()

I am not familiar to PyQt but from general Qt programming point of view
Using only one QNetworkAccessManager is right design choice
finished signal provides QNetworkReply*, with that we can identify corresponding request using request().
I hope this will solve your problem with one manager and multiple requests.
This is a C++ example doing the same.

Seeking not working in HTML5 audio tag

I have a lighttpd server running locally. If I load a static file on the server (through an html5 audio tag), it plays and seeks fine.
However, seeking doesn't work when running a dev server (web.py/CherryPy) or if I return the bytes via a defined action url instead of as a static file. It won't load the duration either.
According to the "HTTP byte range requests" section in this Opera Page it's something to do with support for byte range requests/partial content responses. The content is treated as streaming instead.
What I don't understand is:
If the browser has the whole file downloaded surely it can display the duration, and surely it can seek.
What I need to do on the web server to enable byte range requests (for non-static urls).
Any advice would be most gratefully received.

Here's some web.py code to get you started (just happened to need this as well and ran into your question):
## experimental partial content support
## perhaps this shouldn't be enabled by default
range = web.ctx.env.get('HTTP_RANGE')
if range is None:
return result
total = len(result)
_, r = range.split("=")
partial_start, partial_end = r.split("-")
start = int(partial_start)
if not partial_end:
end = total-1
else:
end = int(partial_end)
chunksize = (end-start)+1
web.ctx.status = "206 Partial Content"
web.header("Content-Range", "bytes %d-%d/%d" % (start, end, total))
web.header("Accept-Ranges", "bytes")
web.header("Content-Length", chunksize)
return result[start:end+1]

Google tells me you have to use the staticFilter for byte ranges to work in CherryPy - but that is for static files only. Luckily this posting also includes pointers on how to do it for non-static data :-)