Unable to read body returned by HTTPoison - http

I have this:
HTTPoison.start
case HTTPoison.get("some url") do
{:ok, %HTTPoison.Response{status_code: 200, body: body}} ->
IO.puts(body)
Which produces an exception:
Generated escript grex_cli with MIX_ENV=dev
** (ArgumentError) argument error
(stdlib) :io.put_chars(#PID<0.49.0>, :unicode, [<<60, 33, 100, 111,..... 58, ...>>, 10])
(elixir) lib/kernel/cli.ex:76: anonymous fn/3 in Kernel.CLI.exec_fun/2
How can I read the body and print it?

IO.puts fails with that error because body here is not valid UTF-8 data. If you want to write non UTF-8 binary to stdout, you can use IO.binwrite instead of IO.puts:
IO.binwrite(body)

I got this error when I got compressed content.
So I unzipped it before printing:
IO.puts :zlib.gunzip(body)
You also can check if body is compressed before unzip: https://github.com/edgurgel/httpoison/issues/81#issuecomment-219920101

Related

Yoututbe scraping by colab

I need to scrape car type video from YouTube by some tags like this list in Google Colab :
Abarth
AC
Acura
Adam
Adler
AEC
Aero
Aixam
Albion
SO i have tried these two code to find the video tag ( for example tag='Peugeot') in google colab:
!pip install youtube-search-python
from youtubesearchpython import SearchVideos
search = SearchVideos("NoCopyrightSounds", offset = 1, mode = "json", max_results = 20)
print(search.result())
and
!pip install youtube-dl
!echo '' > ford_video_list.txt
!chmod 755 ford_video_list.txt
!youtube-dl --match-title 'ford' --add-metadata --write-thumbnail --list-thumbnails --mark-watched --write-info-json 'ford_video_description_json.txt' --write-description 'ford_video_description.txt' --cookies='Search-youtube-url-file.txt' --ignore-errors --skip-download --get-url -f bestvideo+bestaudio/best --default-search "ytsearch2000:" "Ford Festiva" >> ford_video_list.txt
!echo '*****End of test 1 ******'
But by trying this code it don't showing any result:
import urllib.request
from bs4 import BeautifulSoup
textToSearch = 'python tutorials'
query = urllib.parse.quote(textToSearch)
url = "https://www.youtube.com/results?search_query=" + query
response = urllib.request.urlopen(url)
html = response.read()
soup = BeautifulSoup(html, 'html.parser')
for vid in soup.findAll(attrs={'class':'yt-uix-tile-link'}):
if not vid['href'].startswith("https://googleads.g.doubleclick.net/"):
print('https://www.youtube.com' + vid['href'])
So, i guess the class name is not correct!, and i asked here for debugging it.
Update:
I have made one google colab page (shown below) to test those codes ( also the code of youtube-dl showing this error:
https://colab.research.google.com/drive/1bZQ68gLggTQHCG_5fQQJJTICHA4K3HJ3?usp=sharing
ERROR: Unable to download webpage: HTTP Error 429: Too Many Requests
(caused by <HTTPError 429: 'Too Many Requests'>); please report this
issue on https://yt-dl.org/bug . Make sure you are using the latest
version; see https://yt-dl.org/update on how to update. Be sure to
call youtube-dl with the --verbose flag and include its complete
output.
I understand that error made because:
The google don't like too many request form one IP Address.
So tried to add these tags(--rm-cache-dir --force-ipv4 --verbose) to youtube-dl command as you can see below ( based of these reffrences 1 2 3):
ERROR: Unable to download webpage: HTTP Error 429: Too Many Requests (caused by <HTTPError 429: 'Too Many Requests'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
File "/usr/local/lib/python3.6/dist-packages/youtube_dl/extractor/common.py", line 632, in _request_webpage
return self._downloader.urlopen(url_or_request)
File "/usr/local/lib/python3.6/dist-packages/youtube_dl/YoutubeDL.py", line 2238, in urlopen
return self._opener.open(req, timeout=self._socket_timeout)
File "/usr/lib/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/usr/lib/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python3.6/urllib/request.py", line 564, in error
result = self._call_chain(*args)
File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib/python3.6/urllib/request.py", line 756, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "/usr/lib/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/usr/lib/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python3.6/urllib/request.py", line 570, in error
return self._call_chain(*args)
File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib/python3.6/urllib/request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
Thanks.
It has been working by changing the '"ytsearch2000:" "Ford Festiva"` :
to ' "ytsearch50":"Ford Festiva" as you can see below:
!pip install youtube-dl
# !youtube-dl --default-search gvsearch5:how to develop for android --no-playlist --write-info-json --write-annotation --write-thumbnail --write-sub --skip-download
!youtube-dl --match-title 'ford' "ytsearch50":"Ford Festiva"+"peugeot 405" --write-info-json --write-annotation --write-thumbnail --write-sub -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4'
and the problem was because of : and " location mistaking!
the entire code for scraping the video of some car type video from google could be seen here:
https://colab.research.google.com/github/CAR-Driving/yoloOnGoogleColab/blob/master/database_creating/Yoututbe_scraping_by_colab.ipynb

why the Python logging module throwing Attribute error?

I am working with the logging module for the first time. I was able to write the program as per my requirement. The logic written inside the try and except is also working almost successfully, and logs are getting generated in the log file. But for some reason I am seeing "AttributeError" on my IDE console for all the "logging.info" and "logging.exception". So, to cross verify I commented out all those location and this time my code ran without any error, but nothing was getting logged in the log file. Which was quite obvious. Below is the whole program
import logging
from logging.handlers import TimedRotatingFileHandler
logger = logging.handlers.TimedRotatingFileHandler('amitesh.log', when='midnight', interval=1)
logger.suffix = '%y_%m_%d.log'
# create a logging format
LOGGING_MSG_FORMAT = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger.setFormatter(LOGGING_MSG_FORMAT)
LOGGING_DATE_FORMAT = '%d-%b-%y %H:%M:%S'
# create basic configuration
logging.basicConfig(level=logging.INFO, format=LOGGING_MSG_FORMAT, datefmt=LOGGING_DATE_FORMAT)
root_logger = logging.getLogger('')
# add the handlers to the logger
root_logger.addHandler(logger)
while True:
print(" ")
print("This is a Logging demo")
print(" ")
logging.info("new request came")
print(" ")
try:
x = int(input("Enter the first number: "))
y = int(input("Enter the second number: "))
print(x / y)
except ZeroDivisionError as msg:
print("cannot divide with zero")
logging.exception(msg)
print(" ")
except ValueError as msg:
print("enter only integer value")
logging.exception(msg)
print(" ")
logging.info("executed successfully")
print(" ")
Below is the error message from my IDE console:
return self._fmt.find(self.asctime_search) >= 0
AttributeError: 'Formatter' object has no attribute 'find'
Call stack:
File "/Users/amitesh/PycharmProjects/Automation/Databases/DB_Conn.py", line 68, in <module>
logging.info("new request came")
Message: 'new request came'
Arguments: ()
I have gone through the internet in last 2 days without any luck.
Please help me figure out my mistake(s).
added more error:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/logging/__init__.py", line 388, in usesTime
return self._fmt.find(self.asctime_search) >= 0
AttributeError: 'Formatter' object has no attribute 'find'
Call stack:
File "/Users/amitesh/PycharmProjects/Automation/Databases/DB_Conn.py", line 68, in <module>
logging.info("new request came")
Message: 'new request came'
Arguments: ()
Thank you.
Removed format=LOGGING_MSG_FORMAT, LOGGING_MSG_FORMAT and defined the values inside the basicConfig with "format" param directly as the parameter "format" takes a String not the type Formatter.

UnicodeWarning While Using Dictionaries

I've been playing with Python 2.7 for a while, and I'm now tryin to make my own Encryption/Decryption algorithm.
I'm trying to make it support non-Ascii characters.
So this is a part of the dictionnary :
... u'\xe6': '1101100', 'i': '0001000', u'\xea': '1100001', 'm': '0001100', u'\xee': '1100111', 'q': '0010000', 'u': '0010100', u'\xf6': '1110010', 'y': '0011000', '}': '1001111'}
But when I try to convert, by exemple, "é" into binairy, doing
base64 = encrypt[i]
where as encrypt is the name of the dic and i = u"é"
I get this error :
Warning (from warnings module):
File "D:\DeskTop 2\Programs\Projects\4.py", line 174
base64 = encrypt[i]
UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
Traceback (most recent call last):
File "D:\DeskTop 2\Programs\Projects\4.py", line 197, in
main()
File "D:\DeskTop 2\Programs\Projects\4.py", line 196, in main
decryption(key, encrypt, decrypt)
File "D:\DeskTop 2\Programs\Projects\4.py", line 174, in decryption
base64 = encrypt[i]
KeyError: '\xf1'
Also, I did start with
# -*- coding: utf-8-*-
Alright, sorry for the useless post.
I found the fix. Basically, I did :
for i in user_input:
base64 = encrypt[i]
but i would be like \0xe
I added
j = i.decode("latin-1")
so j = u"\0xe"
And now it works :D

pexpect python throw error

Although this is my first attempt at using pexpect, the python3 script using pexpect is pretty simple; yet it fails.
#!/usr/bin/env python3
import sys
import pexpect
SSH_NEWKEY = r'Are you sure you want to continue connecting \(yes/no\)\?'
child = pexpect.spawn("ssh -i /user/aws/key.pem ec2-user#xxx.xxx.xxx.xxx date")
i = child.expect( [ pexpect.TIMEOUT, SSH_NEWKEY )
if i == 1:
child.sendline('yes')
print(child.before)
The SSH_NEWKEY is the only response I'm expecting, but the example showed a list containing pexpect.TIMEOUT in it so I used it.
$ ./test.py
Traceback (most recent call last):
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 144, in read_nonblocking
s = os.read(self.child_fd, size)
OSError: [Errno 5] Input/output error
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.4/site-packages/pexpect/expect.py", line 97, in expect_loop
incoming = spawn.read_nonblocking(spawn.maxread, timeout)
File "/usr/local/lib/python3.4/site-packages/pexpect/pty_spawn.py", line 455, in read_nonblocking
return super(spawn, self).read_nonblocking(size)
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 149, in read_nonblocking
raise EOF('End Of File (EOF). Exception style platform.')
pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./min.py", line 15, in <module>
i = child.expect( [ pexpect.TIMEOUT, SSH_NEWKEY ] )
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 315, in expect
timeout, searchwindowsize, async)
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 339, in expect_list
return exp.expect_loop(timeout)
File "/usr/local/lib/python3.4/site-packages/pexpect/expect.py", line 102, in expect_loop
return self.eof(e)
File "/usr/local/lib/python3.4/site-packages/pexpect/expect.py", line 49, in eof
raise EOF(msg)
pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.
<pexpect.pty_spawn.spawn object at 0x7f70ea4fbcf8>
command: /usr/bin/ssh
args: ['/usr/bin/ssh', '-i', '/user/aws/key.pem', 'ec2-user#xxx.xxx.xxx.xxx', 'date']
searcher: None
buffer (last 100 chars): b''
before (last 100 chars): b'Fri May 6 13:50:18 EDT 2016\r\n'
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: 0
flag_eof: True
pid: 31293
child_fd: 5
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
What am I missing?
CentOS 6.4
python 3.4.3
An EOF error is being raised during your expect call. This means that the response received does not match SSH_NEWKEY, and reaches end of file within the timeout period. To catch this exception, you should change your except line to read:
i = child.expect( [ pexpect.TIMEOUT, SSH_NEWKEY, pexpect.EOF)
You can then make your if more robust:
if i == 1:
child.sendline('yes')
elif i == 0:
print "Timeout"
elif i == 2:
print "EOF"
print(child.before)
This doesn't solve the reason behind why you are on receiving a response with the expected string - it's hard to know without looking at more code but it's likely because you have the response slightly wrong. If you manually type in the SSH string, you should be able to see the response you can expect, and enter this response into your code.
You can also print child.before after your expect call, or print child.read() instead of your expect call to see what is being sent back as a response.

Determine difference in http request between Python2 and Python3

I am attempting to use Python3 to send metrics to Hosted Graphite. The examples given on the site are Python2, and I have successfully ported the TCP and UDP examples to Python3 (despite my inexperience, and have submitted the examples so the docs may be updated), however I have been unable to get the HTTP method to work.
The Python2 example looks like this:
import urllib2, base64
url = "https://hostedgraphite.com/api/v1/sink"
api_key = "YOUR-API-KEY"
request = urllib2.Request(url, "foo 1.2")
request.add_header("Authorization", "Basic %s" % base64.encodestring(api_key).strip())
result = urllib2.urlopen(request)
This works successfully, returning a HTTP 200.
So far I have ported this much to Python3, and while I was (finally) able to get it to make a valid HTTP request (i.e. no syntax errors), the request fails, returning HTTP 400
import urllib.request, base64
url = "https://hostedgraphite.com/api/v1/sink"
api_key = b'YOUR-API-KEY'
metric = "testing.python3.http 1".encode('utf-8')
request = urllib.request.Request(url, metric)
request.add_header("Authorization", "Basic %s" % base64.encodestring(api_key).strip())
result = urllib.request.urlopen(request)
The full result is:
>>> result = urllib.request.urlopen(request)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/Cellar/python3/3.3.1/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 160, in urlopen
return opener.open(url, data, timeout)
File "/usr/local/Cellar/python3/3.3.1/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 479, in open
response = meth(req, response)
File "/usr/local/Cellar/python3/3.3.1/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 591, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/local/Cellar/python3/3.3.1/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 517, in error
return self._call_chain(*args)
File "/usr/local/Cellar/python3/3.3.1/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 451, in _call_chain
result = func(*args)
File "/usr/local/Cellar/python3/3.3.1/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py", line 599, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
Is it obvious what I am doing wrong? Are there any suggestions on how I might capture and compare what the successful (python2) and failing (python3) requests are actually sending?
Don't mix Unicode strings and bytes:
>>> "abc %s" % b"def"
"abc b'def'"
You could construct the header as follows:
from base64 import b64encode
headers = {'Authorization': b'Basic ' + b64encode(api_key)}
A quick way to see the request is to change the host in the url to localhost:8888 and run before making the request:
$ nc -l 8888
You could also use wireshark to see the requests.

Resources