Corrupted file when sending it via POST with reqwest

I'm trying to send a POST request to my server that includes a file. I can do that with curl with no problems, following this example: https://gokapi.readthedocs.io/en/latest/advanced.html#interacting-with-the-api, but I can't get it to work with Rust.
When I try to implement the request in Rust, the file arrives corrupted, as if it had been sent the wrong way. I tried to get it working with this code:
use reqwest::blocking::multipart;
use reqwest::header::ACCEPT;

fn upload(file: &String) -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::blocking::Client::new();
    let form = multipart::Form::new()
        .file("file", file)
        .unwrap()
        .text("allowedDownloads", "0")
        .text("expiryDays", "2")
        .text("password", "");
    let res = client
        .post("http://myserver.com/api/files/add")
        .header(ACCEPT, "application/json")
        .header("apikey", "secret")
        .header("Accept-Encoding", "gzip, deflate, br")
        .multipart(form)
        .send();
    let response_json = json::parse(&res.unwrap().text().unwrap()).unwrap();
    let id = &response_json["FileInfo"]["Id"];
    print!("http://myserver.com/downloadFile?id={}", id);
    Ok(())
}
but the server receives a bad file; 7-Zip reports an error when I try to open it.
I tried doing the same thing in Python and got it working in three lines:
import requests
files = {'file': ("1398608 Fractal Dreamers - Gardens Under a Spring Sky.osz", open("1398608 Fractal Dreamers - Gardens Under a Spring Sky.osz", "rb"), "application/octet-stream")}
request = requests.post("http://myserver/api/files/add", files=files, headers={'apikey': 'api'})
The file uploaded from the Python script works flawlessly, while the Rust one doesn't.
Any help is appreciated, as I'm still a beginner with Rust.
I also tried the approach from "Sending attachment with reqwest", but I get:
{"Result":"error","ErrorMessage":"multipart: NextPart: bufio: buffer full"}
EDIT: the issue looks like it's related to the filename containing some unusual characters. The test file was "1398608 Fractal Dreamers - Gardens Under a Spring Sky.osz", but renaming it to "a.osz" made the issue disappear. I have no clue why or how that happens.
The content of the zip is:
"Fractal Dreamers - Gardens Under a Spring Sky ([Crz]xz1z1z) [Vernal].osu"
"audio.mp3"
I get the error with the full name, but "1398608 Fractal Dreamers - Gardens Under a Spring Sky.zip" works as well. What's the issue with .osz?
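A workaround worth trying (a sketch I'm adding, not from the original thread): build the multipart part by hand so you control the filename and MIME type yourself, instead of letting reqwest derive them from the .osz path. The URL, field names, and headers below mirror the question; the explicit filename and MIME type are assumptions.

use reqwest::blocking::multipart;
use reqwest::header::ACCEPT;

fn upload_manual(path: &str) -> Result<(), Box<dyn std::error::Error>> {
    // Read the file ourselves and attach it with an explicit, ASCII-safe
    // filename and MIME type instead of letting reqwest guess from ".osz".
    let bytes = std::fs::read(path)?;
    let part = multipart::Part::bytes(bytes)
        .file_name("upload.osz") // hypothetical safe name; adjust as needed
        .mime_str("application/octet-stream")?;
    let form = multipart::Form::new()
        .part("file", part)
        .text("allowedDownloads", "0")
        .text("expiryDays", "2")
        .text("password", "");
    let res = reqwest::blocking::Client::new()
        .post("http://myserver.com/api/files/add")
        .header(ACCEPT, "application/json")
        .header("apikey", "secret")
        .multipart(form)
        .send()?;
    println!("{}", res.text()?);
    Ok(())
}

If the upload succeeds with the explicit name and type, then the guessed Content-Type or the long filename in the part's Content-Disposition header is the likely trigger.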

Related

Why is Rust's std::thread::sleep allowing my HTTP response to return the correct body?

I am working on the beginning of the final chapter of The Rust Programming Language, which teaches how to write an HTTP server in Rust.
For some reason, the HTML file being sent does not display in the browser unless I have Rust wait before calling TcpStream::flush().
Here is the code:
use std::io::prelude::*;
use std::net::TcpListener;
use std::net::TcpStream;
use std::fs;
use std::thread::sleep;
use std::time::Duration;

fn main() {
    let listener = TcpListener::bind("127.0.0.1:7878").unwrap();
    for stream in listener.incoming() {
        let stream = stream.unwrap();
        handle_connection(stream);
    }
}

fn handle_connection(mut stream: TcpStream) {
    let mut buffer = [0; 1024];
    stream.read(&mut buffer).unwrap();
    let contents = fs::read_to_string("hello.html").unwrap();
    let response = format!(
        "HTTP/1.1 200 OK\r\nContent-Length: {}\r\n{}",
        contents.len(),
        contents
    );
    stream.write(response.as_bytes()).unwrap();
    // let i = stream.write(response.as_bytes()).unwrap();
    // println!("{} bytes written to the stream", i);
    // ^^ using this code instead will sometimes make it display properly
    sleep(Duration::from_secs(1));
    // ^^ uncommenting this will cause a blank page to load.
    stream.flush().unwrap();
}
I observe the same behavior in multiple browsers.
According to the Rust book, calling TcpStream::flush should ensure that the bytes finish writing to the stream. So why would I be unable to view the HTML file in the browser unless I sleep the thread before flushing?
I have done hard reloads and restarted the server with cargo run multiple times, and the behavior is the same. I have also printed the file contents to the terminal, and they are read fine under either condition (of course they are).
I wonder if this is a problem with my operating system. I'm on Windows 10.
It isn't really holding the project up as I can continue learning (and I'm not planning on putting an actual web project into production right now), but I would appreciate any insight anyone has on this issue. There must be something about Rust's handling of the stream or the environment that I am not understanding.
Thanks for your time!
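One detail worth checking (an observation added here, not part of the original thread): in the code above, the header block is terminated by a single \r\n, but HTTP requires a blank line, i.e. \r\n\r\n, between the last header and the body; the book's version of this example includes it. A sketch of the corrected format string:

// HTTP headers must be followed by an empty line before the body starts;
// note the doubled \r\n\r\n after the Content-Length header.
let response = format!(
    "HTTP/1.1 200 OK\r\nContent-Length: {}\r\n\r\n{}",
    contents.len(),
    contents
);

With only a single \r\n, the browser reads the body as a continuation of the headers and never finds the end of the header block, which can explain both the blank page and the timing-dependent behavior.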

Cycling through IP addresses in asynchronous web scraping

I am using relatively cookie-cutter code to asynchronously request the HTML of a few hundred URLs that I scraped with another piece of code. The code works perfectly.
Unfortunately, this is causing my IP to be blocked due to the high number of requests.
My thought is to write some code to grab some proxy IP addresses, place them in a list, and cycle through them randomly as the requests are sent. Assuming I have no problem creating this list, I am having trouble conceptualising how to splice the random rotation of these proxy IPs into my asynchronous request code. This is my code so far:
import asyncio
import aiohttp

async def download_file(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            content = await resp.read()
            return content

async def write_file(n, content):
    filename = f'sync_{n}.html'
    with open(filename, 'wb') as f:
        f.write(content)

async def scrape_task(n, url):
    content = await download_file(url)
    await write_file(n, content)

async def main():
    tasks = []
    for n, url in enumerate(open('links.txt').readlines()):
        tasks.append(scrape_task(n, url))
    await asyncio.wait(tasks)

if __name__ == '__main__':
    asyncio.run(main())
I am thinking that I need to put:
conn = aiohttp.TCPConnector(local_addr=(x, 0), loop=loop)
async with aiohttp.ClientSession(connector=conn) as session:
    ...
as the second and third lines of my code, where x is one of the random IP addresses from a list defined earlier. How would I go about doing this? I am unsure whether wrapping the whole thing in a plain synchronous loop would defeat the purpose of using asynchronous requests.
If there is a simpler solution to the problem of being blocked from a website for rapid-fire requests, that would be very helpful too. Please note I am very new to coding.
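A sketch of one way to do this (my addition; the PROXIES list is hypothetical): aiohttp accepts a proxy= argument per request, so you can pick a random proxy inside download_file without giving up concurrency. Note that local_addr in TCPConnector only binds one of your machine's own interface addresses, so it helps only if the machine itself has multiple IPs; for third-party proxies, proxy= is the relevant knob.

import asyncio
import random

import aiohttp

# Hypothetical pool of proxy URLs; fill in real, working proxies.
PROXIES = [
    'http://203.0.113.10:8080',
    'http://203.0.113.11:8080',
]

async def download_file(url):
    proxy = random.choice(PROXIES)  # pick a different proxy per request
    async with aiohttp.ClientSession() as session:
        async with session.get(url, proxy=proxy) as resp:
            return await resp.read()

The rest of the code can stay as it is; because the choice happens inside each task, the requests still run concurrently.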

urequests MicroPython problem (multiple POST requests to Google Forms)

I'm trying to send data to Google Forms directly (without an external service like IFTTT) using an ESP8266 with MicroPython. I've already used IFTTT, but at this point it is not useful for me: I need a sampling rate of 100 Hz or more, and as you know that exceeds IFTTT's usage limit. I've tried making a RAM buffer, but I got an error saying that the buffer exceeded the RAM size (4 MB), so that's why I'm trying to send directly.
After trying for some time I got it working partially. I say "partially" because I have to do a random GET request after the POST request; I don't know why it works, but it works (this way I can send data to Google Forms roughly every second, or maybe less). I guess the problem is that the ESP8266 can't close the connection with Google Forms and gets stuck when it tries to do a new POST request. If that is the problem, I don't know how else to fix it; any suggestions? The complete code is here:
ssid = 'my_network'
password = 'my_password'

import urequests

def do_connect():
    import network
    sta_if = network.WLAN(network.STA_IF)
    if not sta_if.isconnected():
        print('connecting to network...')
        sta_if.active(True)
        sta_if.connect(ssid, password)
        while not sta_if.isconnected():
            pass
    print('network config:', sta_if.ifconfig())

def main():
    do_connect()
    print("CONNECTED")
    url = 'url_of_my_google_form'
    form_data = 'entry.61639300=example'  # have to change the entry
    user_agent = {'Content-Type': 'application/x-www-form-urlencoded'}
    while True:
        response = urequests.post(url, data=form_data, headers=user_agent)
        print("DATA HAVE BEEN SENT")
        response.close
        print("TRYING TO SEND ANOTHER ONE...")
        response = urequests.get("http://micropython.org/ks/test.html")  # <----- RANDOM URL, I DON'T KNOW WHY THIS CODE WORKS CORRECTLY IN THIS WAY
        print("RANDOM GET:")
        print(response.text)
        response.close

if __name__ == '__main__':
    main()
Thank you for your time, guys. I also tried the following code before, but it DOESN'T WORK: without the random GET request, it gets stuck after one or two POSTs:
while True:
    response = urequests.post(url, data=form_data, headers=user_agent)
    print("DATA HAVE BEEN SENT")
    response.close
    print("TRYING TO SEND ANOTHER ONE...")
Shouldn't it be response.close() (with brackets)? 🤔
Without brackets you merely reference the close method of the response object instead of calling it, so the connection is never actually closed. This can eventually exhaust memory.
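A corrected version of the posted loop based on that observation (a sketch; url, form_data, and user_agent as defined above):

while True:
    response = urequests.post(url, data=form_data, headers=user_agent)
    print("DATA HAVE BEEN SENT")
    response.close()  # actually call close() so the socket is released
    print("TRYING TO SEND ANOTHER ONE...")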

http.Client rejects request with >unsupported protocol scheme ""< even if it's set

I'm trying to upload some videos to YouTube. Somewhere down the stack this comes down to an http.Client, and that part behaves strangely.
The request and everything around it is created inside the youtube package.
In the end my request fails with:
Error uploading video: Post https://www.googleapis.com/upload/youtube/v3/videos?alt=json&part=snippet%2Cstatus&uploadType=multipart: Post : unsupported protocol scheme ""
I debugged the library a bit and printed the contents of URL.Scheme. As a string the result is https, and as a []byte it is [104 116 116 112 115].
https://golang.org/src/net/http/transport.go line 288 is where the error is thrown.
https://godoc.org/google.golang.org/api/youtube/v3 is the library I use.
My code where I prepare/upload the video:
// create video struct which holds info about the video
video := &yt3.Video{
    //TODO: set all required video info
}

// create the insert call
insertCall := service.Videos.Insert("snippet,status", video)

// attach media data to the call
insertCall = insertCall.Media(tmp, googleapi.ChunkSize(1*1024*1024)) // 1 MB chunks

video, err = insertCall.Do()
if err != nil {
    log.Printf("Error uploading video: %v", err)
    return
    //return errgo.Notef(err, "Failed to upload to youtube")
}
So I have no idea why the scheme check fails.
OK, I figured it out. The problem was not the call to YouTube itself.
The library tried to refresh the token in the background, but there was something wrong with the TokenURL.
Ensuring it is a valid URL fixed the problem.
A nicer error message would have helped a lot, but well...
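For illustration (a sketch I'm adding, not the asker's code): with golang.org/x/oauth2, the refresh endpoint lives in the config's Endpoint.TokenURL, and an empty or malformed value there surfaces exactly like this once the background refresh fires. Assuming Google OAuth and the YouTube upload scope, a valid config would look roughly like:

package main

import (
    "golang.org/x/oauth2"
    "golang.org/x/oauth2/google"
)

func newYouTubeConfig() *oauth2.Config {
    return &oauth2.Config{
        ClientID:     "your-client-id",     // placeholder
        ClientSecret: "your-client-secret", // placeholder
        Scopes:       []string{"https://www.googleapis.com/auth/youtube.upload"},
        // google.Endpoint carries valid AuthURL/TokenURL values; an empty
        // TokenURL is what produces `unsupported protocol scheme ""` when
        // the library refreshes the token in the background.
        Endpoint: google.Endpoint,
    }
}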
This will probably apply to very, very few people who arrive here, but my problem was that a RoundTripper was overriding the Host field with an empty string.

Seeking not working in HTML5 audio tag

I have a lighttpd server running locally. If I load a static file on the server (through an HTML5 audio tag), it plays and seeks fine.
However, seeking doesn't work when running a dev server (web.py/CherryPy) or when I return the bytes via a defined action URL instead of as a static file. It won't load the duration either.
According to the "HTTP byte range requests" section of this Opera page, it has something to do with support for byte-range requests/partial-content responses; the content is treated as streaming instead.
What I don't understand is:
If the browser has the whole file downloaded, surely it can display the duration, and surely it can seek.
What I need to do on the web server to enable byte-range requests (for non-static URLs).
Any advice would be most gratefully received.
Here's some web.py code to get you started (I just happened to need this as well and ran into your question):
## experimental partial content support
## perhaps this shouldn't be enabled by default
range = web.ctx.env.get('HTTP_RANGE')
if range is None:
    return result

total = len(result)
_, r = range.split("=")
partial_start, partial_end = r.split("-")

start = int(partial_start)
if not partial_end:
    end = total - 1
else:
    end = int(partial_end)

chunksize = (end - start) + 1

web.ctx.status = "206 Partial Content"
web.header("Content-Range", "bytes %d-%d/%d" % (start, end, total))
web.header("Accept-Ranges", "bytes")
web.header("Content-Length", chunksize)
return result[start:end+1]
Google tells me you have to use the staticFilter for byte ranges to work in CherryPy, but that is for static files only. Luckily this posting also includes pointers on how to do it for non-static data :-)
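One limitation of the web.py snippet above (a note I'm adding): it assumes the Range header always has the form bytes=start-end, and it will crash on suffix ranges like bytes=-500, which browsers may send. A sketch of a more tolerant parser for single ranges:

def parse_range(header, total):
    """Parse a single-range Range header such as 'bytes=0-499',
    'bytes=500-' or 'bytes=-500'. Returns an inclusive (start, end)
    tuple, or None if the header can't be satisfied."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        return None  # multi-range requests are out of scope here
    first, _, last = spec.strip().partition("-")
    if first:
        start = int(first)
        end = int(last) if last else total - 1
    else:
        # suffix range: the last N bytes of the resource
        start = max(total - int(last), 0)
        end = total - 1
    if start > end or start >= total:
        return None
    return start, min(end, total - 1)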
