Capture File Stream Data in Argparse - python-3.6

I'd like to capture the help output from argparse in a string:
import argparse
pp = argparse.ArgumentParser(description="foo")
pp.add_argument('bundle_dir', help='Directory containing bundles',
                default='default_val')
pp.print_help()  # Want this to go to a string
For its part argparse lets you pass a file handle to print_help:
def print_help(self, file=None):
    if file is None:
        file = _sys.stdout
    self._print_message(self.format_help(), file)
Is there any way to create an object that will act like a file but capture the help data so I can use it as a string?
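One approach (a sketch, not necessarily the only option): because print_help writes to any file-like object, an io.StringIO can capture the output, and argparse also exposes format_help(), which returns the help text directly:

import argparse
import io

pp = argparse.ArgumentParser(description="foo")
pp.add_argument('bundle_dir', help='Directory containing bundles',
                default='default_val')

# Option 1: format_help() returns the help text as a string.
help_text = pp.format_help()

# Option 2: print_help() accepts any file-like object, so a StringIO captures it.
buf = io.StringIO()
pp.print_help(file=buf)
help_text = buf.getvalue()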

Related

Extract Hyperlink from a spool pdf file in Python

I am getting my form data from the frontend and reading it using FastAPI as shown below:
@app.post("/file_upload")
async def upload_file(pdf: UploadFile = File(...)):
    print("Content =", pdf.content_type, pdf.filename, pdf.spool_max_size)
    return {"filename": "Success"}
Now what I need to do is extract hyperlinks from these spooled files with the help of pypdfextractor, as shown below:
import pdfx
from os.path import exists
from config import availableUris

def getHrefsFromPDF(pdfPath: str) -> dict:
    if not exists(pdfPath):
        raise FileNotFoundError("PDF File not Found")
    pdf = pdfx.PDFx(pdfPath)
    return pdf.get_references_as_dict().get('url', [])
But I am not sure how to convert the spooled file (received from FastAPI) into a format that pdfx can read.
Additionally, I tried to study the bytes that come out of the file. When I do this:
data = await pdf.read()
the type of data shows as bytes. When I try to convert it using the str function, it gives an encoded string which is total gibberish to me; I also tried to decode it using "utf-8", which throws a UnicodeDecodeError.
FastAPI gives you a SpooledTemporaryFile. You may be able to use that file object directly if there is some API in pdfx which will work on a file object rather than a str representing a path (!). Otherwise, make a new temporary file on disk and work with that:
from tempfile import TemporaryDirectory
from pathlib import Path
import pdfx

@app.post("/file_upload")
async def upload_file(pdf: UploadFile = File(...)):
    # Write the upload into temporary storage so it can be re-read from disk.
    with TemporaryDirectory() as d:
        tmpf = Path(d) / "pdf.pdf"
        with tmpf.open("wb") as f:
            f.write(pdf.file.read())  # pdf.file is the underlying SpooledTemporaryFile
        p = pdfx.PDFx(str(tmpf))
        ...
It may be that pdfx.PDFx will take a Path object. I'll update this answer if so. I've kept the read-write loop synchronous for ease, but you can make it asynchronous if there is a reason to do so.
Note that it would be better to find a way of doing this with the SpooledTemporaryFile.
As to your data showing as bytes: well, PDFs are (basically) binary files, so bytes are exactly what you should expect; they will not decode cleanly as UTF-8 text.
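Putting the pieces together: once the upload has been written out to a temporary path, the question's own getHrefsFromPDF helper can simply be pointed at that path. A sketch only, assuming the imports and helper shown above:

@app.post("/file_upload")
async def upload_file(pdf: UploadFile = File(...)):
    with TemporaryDirectory() as d:
        tmpf = Path(d) / "upload.pdf"
        pdf.file.seek(0)                   # rewind the SpooledTemporaryFile, just in case
        tmpf.write_bytes(pdf.file.read())  # dump the upload to disk
        urls = getHrefsFromPDF(str(tmpf))  # helper from the question
    return {"filename": pdf.filename, "urls": urls}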

Using Mainframe Datasets in Python 3.6 within Anaconda Spyder

I am trying to read and write Mainframe dataset data in Python 3.6. I am using Anaconda's Spyder (version 3.2.4), and I am using zosftplib in order to access mainframe features. Below is the code snippet:
import zosftplib
Myzftp = zosftplib.Zftp("ip address-mainframe","username","password")
mf_file = open("mainframe ps file-name", 'r+')
ffa = mf_file.read(16);
print ("Read record is :", ffa)
mf_file.close()
The mainframe PS file contains one record with the data 0010021023457893, but the output I am getting in the Spyder kernel is spaces. I also tried using ftplib, but it didn't work there either. I believe some conversion is required, as it is not a text file that I am reading. Does anyone have any suggestions on this? Thanks.
The expected result after reading and printing the file is 0010021023457893.
The zosftplib package will provide you FTP access to your dataset on z/OS, meaning you can download it, but you have to open it locally. Also, you need to be aware of the encoding differences between your local machine and the z/OS environment, so you should specify the sbdataconn argument to provide codepage translation. I was able to do what you want with code like this:
import zosftplib

Myzftp = zosftplib.Zftp('mainframe_ip',
                        'mainframe_userid',
                        'mainframe_password',
                        timeout=500.0,
                        sbdataconn='(ibm-1147,iso8859-1)')

Myzftp.download_text('mainframe_dataset_name', '/tmp/local_filename.txt')
mf_file = open('/tmp/local_filename.txt', 'r+')
ffa = mf_file.read(16)
print("Read record is :", ffa)
mf_file.close()
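Since download_text leaves an ordinary text file on the local machine, the read can also be written with a context manager so the file is closed automatically (a small variation on the snippet above):

with open('/tmp/local_filename.txt', 'r') as mf_file:
    ffa = mf_file.read(16)
    print("Read record is :", ffa)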

Save a copy of a notebook from within the notebook itself

I would like to save a copy of a notebook (or rename it) from within a cell of the notebook.
Preferably without too much JavaScript. Actually, I guess something of this form should work:
from IPython.display import display_html
display_html("<script>Jupyter....???...()</script>")
Here is a solution using only Python. The notebook_path function comes from P. Toccaceli's answer to How do I get the current IPython Notebook name.
from notebook import notebookapp
import urllib.request
import json
import os
import ipykernel
from shutil import copy2

def notebook_path():
    """Return the absolute path of the notebook, or None if it cannot be determined.

    NOTE: works only when the security is token-based or there is also no password.
    """
    connection_file = os.path.basename(ipykernel.get_connection_file())
    kernel_id = connection_file.split('-', 1)[1].split('.')[0]
    for srv in notebookapp.list_running_servers():
        try:
            if srv['token'] == '' and not srv['password']:  # No token and no password, ahem...
                req = urllib.request.urlopen(srv['url'] + 'api/sessions')
            else:
                req = urllib.request.urlopen(srv['url'] + 'api/sessions?token=' + srv['token'])
            sessions = json.load(req)
            for sess in sessions:
                if sess['kernel']['id'] == kernel_id:
                    return os.path.join(srv['notebook_dir'], sess['notebook']['path'])
        except:
            pass  # There may be stale entries in the runtime directory
    return None

def copy_current_nb(new_name):
    nb = notebook_path()
    if nb:
        new_path = os.path.join(os.path.dirname(nb), new_name + '.ipynb')
        copy2(nb, new_path)
    else:
        print("Current notebook path cannot be determined.")
Then, simply use copy_current_nb('Save1') to create a copy named Save1.ipynb in the same directory.
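If you want to rename the notebook rather than copy it (the other half of the question), the same notebook_path helper can be reused. A sketch only (rename_current_nb is a hypothetical addition; the file is moved on disk, but the open browser tab will keep showing the old name until you reopen it):

def rename_current_nb(new_name):
    nb = notebook_path()
    if nb:
        new_path = os.path.join(os.path.dirname(nb), new_name + '.ipynb')
        os.replace(nb, new_path)  # move the .ipynb file on disk
    else:
        print("Current notebook path cannot be determined.")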

ipython notebook: custom cells and execute hook

I would like to override what happens when run is pressed for certain cells in ipython notebook.
For example, I would like to be able to write an SQL query directly in a cell and define a function that processes it.
It seems it should be possible to do this with ipython-notebook extensions. Does anyone know of a similar extension? Or an easy way to do this directly from IPython?
Ideally this would involve defining a custom cell type, but I would be happy to use special tags to separate the usual python code from, say, a custom SQL query cell.
I once had a similar desire and ended up with the following solution:
from sqlalchemy import create_engine
import pandas as pd
from IPython.core.magic import register_cell_magic
from IPython import get_ipython

con = create_engine(DB_URL)

@register_cell_magic
def sql(line, cell):
    cell = cell.format(**globals())
    if line.strip() != '-':
        res = pd.read_sql(cell, con)
        if line.strip() != '':
            get_ipython().user_ns[line.strip()] = res
        return res
    else:
        con.execute(cell)

# Remove the plain function so only the registered %%sql magic remains.
del sql
You can now write in other cells:
%%sql outputvar
select * from whatever where ...
For example:
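Here is a minimal end-to-end sketch, using a throwaway file-based SQLite database purely for illustration:

# Setup cell (run before the answer's cell): create a toy table to query.
import pandas as pd
from sqlalchemy import create_engine

DB_URL = 'sqlite:///demo.db'   # throwaway local file, illustration only
pd.DataFrame({'id': [1, 2], 'name': ['alice', 'bob']}).to_sql(
    'users', create_engine(DB_URL), index=False, if_exists='replace')

Then run the answer's cell to register the magic, and in a further cell:

%%sql people
select * from users where id = 2

people now holds the matching row as a pandas DataFrame, and the same DataFrame is displayed as the cell output.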

How to overcome Python 3.4 NameError: name 'basestring' is not defined

I've got a file called hello.txt in the local directory alongside test.py, which contains this Python 3.4 code:
import easywebdav
webdav = easywebdav.connect('192.168.1.6', username='myUser', password='myPasswd', protocol='http', port=80)
srcDir = "myDir"
webdav.mkdir(srcDir)
webdav.upload("hello.txt", srcDir)
When I run this I get this:
Traceback (most recent call last):
  File "./test.py", line 196, in <module>
    webdav.upload("hello.txt", srcDir)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/easywebdav/client.py", line 153, in upload
    if isinstance(local_path_or_fileobj, basestring):
NameError: name 'basestring' is not defined
Googling this results in several hits, all of which point to the same fix, which (quoting, in case the pages move in future) is to include the following "right after import types":
try:
    unicode = unicode
except NameError:
    # 'unicode' is undefined, must be Python 3
    str = str
    unicode = str
    bytes = bytes
    basestring = (str, bytes)
else:
    # 'unicode' exists, must be Python 2
    str = str
    unicode = unicode
    bytes = str
    basestring = basestring
I wasn't using import types, but including it or not doesn't appear to make a difference in PyDev; I get an error either way. The line which causes an error is:
unicode = unicode
which is flagged as 'undefined variable'.
OK, my Python knowledge falters at this point, and I've looked for similar posts on this site without finding one specific enough to basestring for me to follow. I know I need to define basestring, but I don't know how to. Would anyone be charitable enough to point me in the right direction?
You can change easywebdav's client.py file like the top two changes in this commit: https://github.com/hhaderer/easywebdav/commit/983ced508751788434c97b43586a68101eaee67b
The changes consist of replacing basestring with str in client.py.
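In practice the change is just to the isinstance check that appears in the traceback; roughly (paraphrasing the linked commit rather than quoting it):

# easywebdav/client.py, in upload(), before:
if isinstance(local_path_or_fileobj, basestring):
# after -- Python 3 has no basestring; str covers the path-given-as-a-string case:
if isinstance(local_path_or_fileobj, str):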
I came up with an elegant pattern that does not require modifying any source files. Note that it can be extended to other modules, to keep all the 'hacks' in one place:
# py3ports.py
import easywebdav.client
easywebdav.basestring = str
easywebdav.client.basestring = str
# mylib.py
from py3ports import easywebdav
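A blunter variant of the same idea is to put the missing name into builtins, so that any module's failed name lookup falls back to it (a sketch; this affects the whole interpreter, so the targeted per-module patch above is usually the better choice):

# py3ports_builtins.py -- hypothetical alternative to py3ports.py
import builtins

builtins.basestring = str   # any module referencing the Python 2 name now resolves it

import easywebdav           # imported after the patch, as with mylib.py above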
