Flask request.stream.read() is stopping when uploading file using SocketIO - flask-socketio

I am running a Flask production server with flask-socketio and eventlet, and when I submit a form containing a file upload, Flask fails to read the entire request. This happens once the file grows beyond a few KB (around 50 KB or more). The file I am trying to upload is a 60 KB .txt file with one word per line; smaller files of 1-2 KB work as expected.
from flask import redirect

def get_file(request):
    if 'uploadFile' not in request.files:
        return redirect(request.url)
    return request.files['uploadFile']
Having done some tests, I was able to determine that the code does not get past checking request.files. For example, if I tried to print(request.files), the code would not move on from there; it would just hang.
I understand that Flask's built-in methods may not be the most efficient, so I found the streaming-form-data library, which can assist with loading large files, and implemented it as a replacement:
from streaming_form_data import StreamingFormDataParser
from streaming_form_data.targets import FileTarget

def get_file(request):
    parser = StreamingFormDataParser(headers=request.headers)
    parser.register('file', FileTarget('/temp/file.txt'))
    while True:
        chunk = request.stream.read(8192)
        if not chunk:
            break
        parser.data_received(chunk)  # add read bytes to file
        socketio.sleep(1)
The file would still not finish uploading; however, I was able to determine that it stops on chunk = request.stream.read(8192). This wouldn't happen straight away; it would usually stop around the 4th or 5th iteration. I tested with and without socketio.sleep(1), as I understand this can be needed for tasks that require more time.
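For illustration only, here is a bounded variant of the loop (a sketch, assuming the Content-Length header is present, with the same imports and file path as above); it stops after consuming exactly that many bytes instead of waiting for an empty read:

def get_file_bounded(request):
    # Sketch only: same streaming parser as above, but the loop is bounded
    # by Content-Length so no read is ever issued past the end of the body.
    parser = StreamingFormDataParser(headers=request.headers)
    # The registered name must match the form field name ('uploadFile' here).
    parser.register('uploadFile', FileTarget('/temp/file.txt'))

    remaining = int(request.headers['Content-Length'])
    while remaining > 0:
        chunk = request.stream.read(min(8192, remaining))
        if not chunk:
            break
        parser.data_received(chunk)
        remaining -= len(chunk)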
I did some more testing using the Flask dev server and found that if I use app.run(threaded=True), the upload completes as expected and the code continues. However, I haven't been able to get it to work using socketio.run() and eventlet. Here is an example of my main.py:
import eventlet
eventlet.monkey_patch()

from app import app, socketio

if __name__ == "__main__":
    # app.run(threaded=True)  # This will work and finish loading the file
    socketio.run(app)  # does not work
During the while loop that reads the stream I do not receive any errors; it simply never continues on to the next chunk = request.stream.read(8192).

I found the solution I needed for this, and it turns out it really wasn't down to Flask. In my POST I just needed to include chunking: true, and it seems to be working fine from there.
E.g.:
$.ajax({
    type: 'POST',
    chunking: true,
    data: form_data,
    contentType: false,
});
As a test, I reverted the changes I made above (though these changes are probably for the better anyway) and confirmed it worked.

Related

Failing to upload JSON file through Chrome to Firebase Database

This is really frustrating. I have a 104 MB JSON file that I want to upload to my Firebase database through the web front end, but after a random period of time (I've timed it, it's not constant, anywhere from 2 to 20 seconds) I get the error:
There was a problem contacting the server. Try uploading your file again.
So I do try again, and it just keeps failing. I've uploaded files nearly this big before, and since the limit for stored data in the Realtime Database is 1 GB, I'm not even close to that. Why does it keep failing to upload?
This is the error I get in chrome dev tools:
Failed to load resource: net::ERR_CONNECTION_ABORTED
https://project.firebaseio.com/.upload?auth=eyJhbGciOiJIUzI1NiIsInR5cCI6…Q3NiwiYWRtaW4iOnRydWUsInYiOjB9.CihvjvLSlx43nOBynAJeyibkBRtygeRlG4Yo1t3jKVA
Failed to load resource: net::ERR_CONNECTION_ABORTED
If I click on the link that shows up in the error, it's a page with the words POST request required.
Turns out the answer is to ignore the web importer entirely and use firebase-import. It worked perfectly the first time and only took a minute to upload the whole JSON. It also has merging capabilities.
Using firebase-import as the accepted answer suggested, I get the error:
Error: WRITE_TOO_BIG: Data to write exceeds the maximum size that can be modified with a single request.
However, with the firebase-cli I was successful in deleting my entire database:
firebase database:remove /
It seems to automatically traverse down your database tree to find chunks that are under the size limit, and then it issues multiple delete requests. It takes some time, but it definitely works.
You can also import via a json file:
firebase database:set / data.json
I'm unsure if firebase database:set supports merging.
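Another rough workaround, if the importers keep hitting size limits, is to split the JSON yourself and write each top-level key through the Firebase REST API so that every individual request stays small. This is only a sketch: the database URL and auth token below are placeholders, and the requests package is assumed to be installed.

import json
import requests

# Placeholders -- substitute your own database URL and auth token.
DB_URL = "https://project.firebaseio.com"
AUTH = "<database-secret-or-token>"

with open("data.json") as f:
    data = json.load(f)

# Write one top-level key per request so no single write is too large.
# (If a single key is still over the limit, you would need to recurse deeper.)
for key, value in data.items():
    resp = requests.put(
        "{}/{}.json".format(DB_URL, key),
        params={"auth": AUTH},
        json=value,
    )
    resp.raise_for_status()
    print("uploaded", key)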

How to capture stdout from a console application whose life cycle is very long by CreatePipe?

I tried to write a test application to capture the text from the stdout of a 3rd-party console application.
I studied many articles on using the CreatePipe API and have indeed obtained the text, but only after the console application had finished running.
I tried making the console application keep printing something for more than 60 seconds, and the ReadFile function didn't return at all during those 60 seconds.
For the same purpose, I tried popen and fread, and everything went fine except the black console window created by popen.
ReadFileEx and overlapped I/O might seem able to solve this problem, but they actually can't.
ReadFileEx requires the file handle to be created with overlapped support, and that is impossible here because the handle is created by the 3rd-party console application. It won't be under our control unless we develop the console application ourselves.
So is there any way to use CreatePipe to capture stdout from a 3rd-party console application whose life cycle is very long?
Thanks in advance!
I finally figured out that the problem is that the 3rd-party console application "MAC.EXE" doesn't invoke "fflush" after each progress output.
I manually added the fflush call in the source code of mac.exe and the problem was resolved.
So a new question is:
If the child process never calls fflush and seldom prints while running, how can the content be read correctly?
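For what it's worth, the same limitation is visible from a plain Python reader: the parent only ever sees what the child has actually flushed, and it cannot force a flush from the outside. A small sketch (mac.exe stands in for the long-running child):

import subprocess

# Launch the long-running console application with its stdout redirected
# to a pipe. When stdout is a pipe, the C runtime usually switches from
# line buffering to full buffering, which is why output arrives late
# unless the child calls fflush itself.
proc = subprocess.Popen(
    ["mac.exe"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    bufsize=0,  # unbuffered on our side of the pipe
)

# Read whatever the child has flushed so far; if the child never flushes,
# this loop will still only see data when its stdio buffer fills or when
# the process exits.
while True:
    chunk = proc.stdout.read(1)
    if not chunk:
        break
    print(chunk.decode(errors="replace"), end="")

proc.wait()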

Meteor: execute calls from client console "everywere"

Meteor is said to automagically (in most cases) figure out what code to run on the client and what code to run on the server so you could theoretically just write all your code in one .js file.
I would like to be able to write code in my browser console and have it executed pretty much as if I had put the code in a file on my server.
For example, in my browser console:
[20:08:19.397] Pages = new Meteor.Collection("pages");
[20:08:30.612] Pages.insert({name:"bro"});
[20:08:30.614] "sGmRrQfezZMXuPfW8"
[20:08:30.618] insert failed: Method not found
Meteor says "method not found" because I need to do new Meteor.Collection("pages"); on the server.
But is there a workaround for this, whether using the above-mentioned automagic or by explicitly saying in my browser console "run the following line of code on the server!"?
Well it doesn't "automagically" figure it out - you have to very explicitly do one of two things:
Separate the code into client and server directories.
Wrap the code in a Meteor.isClient or Meteor.isServer section.
Otherwise, any code you write will execute in both environments. However, any code input by the user on the client will only be executed on the client. Meteor has been specifically designed to protect this boundary.
You can call a method on the server from the client, but again the server cannot be tricked into executing client-defined functions.
In your specific example, you can always define the collection only on the client like so:
Pages = new Meteor.Collection(null);
That will allow you to freely manipulate the collection data on the client, but it will not involve the server (nothing will be stored in the db).

Split SQLite file into chunks for appcfg.py

I have a 750 MB sql3 file that I want to load with appcfg.py, a program that can restore App Engine data. It's taking forever to load. Is there a way I could split it into smaller, totally separate chunks to be loaded independently?
I don't need to run queries across the data, or maintain any other kind of relationship. I just need to copy a list of the records to my appengine app.
Elaboration:
I'm trying to restore a 750 MB sql3 file I got from
appcfg.py download_data --appl=myapp --url=https://myapp.appspot.com/remote_path --file=backup.sql3
Now, I'm trying to restore the file with
appcfg.py upload_data --appl=restoreapp --url=https://restoreapp.appspot.com/remote_api --file=backup.sql3
I also set some parameters tweaking the default limits.
This prints out some initial logging information, repeating the parameters, etc. Then nothing happens for about 45 minutes, except that Python uses about 50% CPU for the duration. Then, finally, it starts to upload to App Engine.
From there, it seems to work. But, if there's an error in the transmission, I have to wait the 45 minutes again, even after specifying the progress database. That's why I'm looking for a way to split up the file, or something.
FWIW, both the original app and the restore app use the Java SDK.
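One generic way to split the rows round-robin into smaller SQLite files is sketched below. Whether appcfg.py will accept the resulting pieces depends on the bulkloader's file format, which is not verified here, so treat it as a starting point only.

import sqlite3

SOURCE = "backup.sql3"   # the downloaded file
PIECES = 8               # number of smaller files to produce

src = sqlite3.connect(SOURCE)
# Copy the schema of every user table into each destination file.
schemas = src.execute(
    "SELECT name, sql FROM sqlite_master "
    "WHERE type='table' AND name NOT LIKE 'sqlite_%'"
).fetchall()

outs = []
for i in range(PIECES):
    dst = sqlite3.connect("backup_part%02d.sql3" % i)
    for _, create_sql in schemas:
        dst.execute(create_sql)
    outs.append(dst)

# Distribute rows round-robin across the destination files.
for name, _ in schemas:
    for rownum, row in enumerate(src.execute("SELECT * FROM %s" % name)):
        placeholders = ",".join("?" * len(row))
        outs[rownum % PIECES].execute(
            "INSERT INTO %s VALUES (%s)" % (name, placeholders), row
        )

for dst in outs:
    dst.commit()
    dst.close()
src.close()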

ActiveMQ 5.2.0 + REST + HTTP POST = java.lang.OutOfMemoryError

First off, I am a newbie when it comes to JMS & ActiveMQ.
I have been looking into a messaging solution to serve as middleware for a message producer that will insert XML messages into a queue via HTTP POST. The producer is an existing system written in C++ that cannot be modified (so Java and the C++ API are out).
Using the "demo" examples and some trial and error, I have cobbled together a working example of what I want to do (on a windows box).
The web.xml I configured in a test directory under "webapps" specifies that the HTTP POST messages received from the producer are to be handled by the MessageServlet.
I added a line for the test app in "activemq.xml" ('ow' is the test app dir).
I created a test script to "insert" messages into the queue which works well.
The problem I am running into is that as I continue to insert messages via REST/HTTP POST, the memory consumption and thread count used by ActiveMQ continue to rise (this happens whether I have timely consumers or slow/non-existent consumers).
When memory consumption reaches around 250 MB and the thread count exceeds 5000 (as shown in Windows Task Manager), ActiveMQ crashes and I see this in the log:
Exception in thread "ActiveMQ Transport Initiator: vm://localhost#3564" java.lang.OutOfMemoryError: unable to create new native thread
It is as if Jetty is spawning a new thread to handle each HTTP POST and the thread never dies.
I did look at this page:
http://activemq.apache.org/javalangoutofmemory.html
and tried it, but that didn't fix the problem (although I didn't fully understand the implications of the change either).
Does anyone have any ideas?
Thanks!
Bruce Loth
PS - I included the "test message producer" Python script below for what it's worth. I created batches of 100 messages and continued to run the script manually from the command line while watching ActiveMQ's memory consumption and thread count in Task Manager.
def foo():
    import httplib, urllib
    body = "<?xml version='1.0' encoding='UTF-8'?>\n \
            <ROOT>\n \
            [snip: xml deleted to save space]
            </ROOT>"
    headers = {"content-type": "text/xml",
               "content-length": str(len(body))}
    conn = httplib.HTTPConnection("127.0.0.1:8161")
    conn.request("POST", "/ow/message/RDRCP_Inbox?type=queue", body, headers)
    response = conn.getresponse()
    print response.status, response.reason
    data = response.read()
    conn.close()
## end method definition

## Begin test code
count = 0
while count < 100:  # Test with batches of 100 msgs
    count += 1
    foo()
The error is not directly caused by ActiveMQ but by the Java Runtime. Take a look here:
http://activemq.apache.org/javalangoutofmemory.html
It explains how you can increase the memory available to the Java heap. There is also interesting material about why this happens and what you might do to prevent it. ActiveMQ is pretty good but needs some customizing here and there in the config files.
You may want to add the following to the URL's query string:
JMSDeliveryMode=persistent
Otherwise, by definition (read "by default"), the messages would be kept in AMQ's memory.
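Applied to the test script above, that is a one-line change to the request path:

# Same POST as in the test script, with persistent delivery requested:
conn.request("POST",
             "/ow/message/RDRCP_Inbox?type=queue&JMSDeliveryMode=persistent",
             body, headers)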
