I'm quite new to Scrapy (I started using it a week ago). I use the -o option on the command line to generate an output file, and I'd like that file to be encrypted. I believe I need to write a custom feed exporter (rather than an item exporter, because I need to encrypt the whole file, not each item separately), but I honestly have no idea how. I'm somewhat lost; I've read the Scrapy docs, but they aren't very clear or detailed on this.
Note: I also use custom settings like 'FEED_EXPORT_FIELDS' that I'd like to keep working.
For anyone trying to solve this problem in the future: I created a custom pipeline. I instantiate an exporter in open_spider, and in close_spider I call exporter.finish_exporting(); after that call, all the scraped items have been written to the file. Before returning, I read all the data from the file, truncate the file, encrypt the data, and write it back. Then I just close the file.
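Roughly, the pipeline looks like this (a sketch assuming CSV output; Fernet from the cryptography package stands in for whatever cipher you prefer, and FEED_ENCRYPTION_KEY is a made-up setting name):

from cryptography.fernet import Fernet
from scrapy.exporters import CsvItemExporter

class EncryptedFileExportPipeline:
    def open_spider(self, spider):
        # w+b so the file can be read back and rewritten in close_spider
        self.file = open("items.csv", "w+b")
        self.exporter = CsvItemExporter(self.file)
        # keep FEED_EXPORT_FIELDS working (column selection and order)
        fields = spider.settings.getlist("FEED_EXPORT_FIELDS")
        if fields:
            self.exporter.fields_to_export = fields
        self.exporter.start_exporting()

    def process_item(self, item, spider):
        self.exporter.export_item(item)
        return item

    def close_spider(self, spider):
        self.exporter.finish_exporting()
        # all scraped items are in the file now: read, truncate, encrypt, rewrite
        self.file.seek(0)
        plaintext = self.file.read()
        self.file.seek(0)
        self.file.truncate()
        key = spider.settings.get("FEED_ENCRYPTION_KEY")  # hypothetical setting
        self.file.write(Fernet(key).encrypt(plaintext))
        self.file.close()

Enable it through ITEM_PIPELINES in settings.py; since the pipeline owns the output file, you no longer pass -o on the command line.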
I am working on getting data as tables from Quickbase using the Requests library (Python). I found somebody doing it using the URL of a report, but they added two parameters to the URL, like this:
&dlta=xs%xx&ridlist=xxxx
Can anybody please tell me what these two parameters are? I searched the internet but found nothing related to them.
I've been using Quickbase for over ten years and haven't seen documentation for either of these parameters. I have noticed that ridlist seems to be used by Quickbase's grid-edit view of reports (I suspect it's an ID for a server-side cached list of record IDs to display, especially when using a report's type-ahead search before choosing to grid edit), and that dlta is used by the "Download report as CSV" button.
The example you're following may simply have copied and pasted a link generated by Quickbase as a hack to get a CSV response instead of XML. I recommend following the Quickbase HTTP API reference instead. If you don't want an XML response, Quickbase also has a JSON RESTful API, which may be easier to work with.
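For example, running a report through the JSON API with Requests looks roughly like this (a sketch; the realm, user token, and table/report IDs below are placeholders):

import requests

headers = {
    "QB-Realm-Hostname": "yourrealm.quickbase.com",
    "Authorization": "QB-USER-TOKEN your_user_token_here",
}

# run an existing report: POST /v1/reports/{reportId}/run?tableId={tableId}
resp = requests.post(
    "https://api.quickbase.com/v1/reports/7/run",
    params={"tableId": "bqxxxxxxx"},
    headers=headers,
)
resp.raise_for_status()
for record in resp.json()["data"]:
    print(record)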
I need very simple text-file logging. I'll only append lines to the file, never change existing ones or delete them. If it were an XML file it would be easier to bind to grids for viewing, but the question stands for both text and XML files, since both live in the file system.
On a web server there will be file locking while appending log entries, and maybe also while reading them. So this method has to be thread-safe: multiple instances can write to the file at the same moment.
I know there are third-party tools like Serilog etc., but I want to know:
How can I append (not change) lines to a text file (or XML file) without worrying about file locks?
If I read the XML file into a DataSet, add a new row to it, and save it back as XML, I would lose entries made by other instances in the meantime.
If I open a text file with a StreamWriter and append a line to it, other instances will get a lock error.
When I fetch the list of logs for the admin panel, the file will be locked again and instances won't be able to append logs.
Any ideas?
After long research hours and experiments I found that NLog is the best option for me. Most importantly, the people who use it are very happy with it. I created a small example page that writes a log entry every time it is called, and tested it with a multithreaded application that calls that sample page again and again. It was fast enough that I couldn't watch the thread counters tick, and no problems have come up so far.
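For reference, the gist of what I tested looks something like this (a sketch, not my exact config; file name and layout are placeholders, and concurrentWrites is the File-target setting that coordinates appends from multiple processes):

<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <targets>
    <!-- one shared file; concurrentWrites handles multi-process appends -->
    <target name="logfile" xsi:type="File"
            fileName="logs/app.log"
            concurrentWrites="true"
            keepFileOpen="false" />
  </targets>
  <rules>
    <logger name="*" minlevel="Info" writeTo="logfile" />
  </rules>
</nlog>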
So, I'll stick to NLog.
Best.
I can't find the documentation anywhere for how to actually use the Conduit API. I'm able to create a task using some really odd methods, but once I've created the ticket, I can't find any documentation on how to upload a file anywhere.
I tried looking at:
https://secure.phabricator.com/conduit/method/maniphest.createtask/
and I get confused about how this actually works. What actually is this?
I think you need to upload the file separately through the file.upload Conduit method, then use an {Fnnn} reference in the task or comment text to link to it. I presume that when file.upload says it returns a GUID it means a PHID, so you'll also need to call file.info to get the id to use in place of nnn in the reference text.
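Untested, but with Requests the flow would look roughly like this (a sketch going from the Conduit method pages; host, token, and the task id are placeholders):

import base64
import requests

HOST = "https://phabricator.example.com/api/"
TOKEN = "api-xxxxxxxxxxxxxxxxxxxxxxxxxxxx"

# 1. upload the file; Conduit expects base64-encoded content
with open("screenshot.png", "rb") as f:
    data_b64 = base64.b64encode(f.read()).decode("ascii")
r = requests.post(HOST + "file.upload", data={
    "api.token": TOKEN,
    "name": "screenshot.png",
    "data_base64": data_b64,
})
phid = r.json()["result"]

# 2. look up the numeric id to build the {Fnnn} reference
r = requests.post(HOST + "file.info", data={"api.token": TOKEN, "phid": phid})
file_id = r.json()["result"]["id"]

# 3. mention {Fnnn} in a comment so the file gets linked to the task
requests.post(HOST + "maniphest.update", data={
    "api.token": TOKEN,
    "id": 123,  # your task's id (placeholder)
    "comments": "Attached: {F%s}" % file_id,
})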
I've got a bunch of audio files (let's say ogg or mp3) with metadata.
I wish to read their metadata into R so as to create a data.frame with:
file name
file location
file artist
file album
etc
Any way you know of for doing that?
You take an existing mp3 or ogg client, look at what library it uses, and then write a binding for said library to R, using the existing client as a guide on that side, and something like Rcpp as a guide on the other side to show you how to connect C/C++ libraries to R.
No magic bullet.
A cheaper and less reliable way is to use a command-line tool that does what you want and to write little helper functions that use system() to run that tool over each file, re-reading the output in R. Not pretty, not reliable, but possibly less challenging.
Possible, yes; easy, no.
You "could" use a combination of readChar and/or readBin on the file and parse out the contents. This would depend heavily, though, on parsing the frame tags from the raw bytes of the ID3v2 tag (and mind you, it would change if it were a version 1 tag). It would certainly be a lot of work to implement a straight-R solution. Take this Python code for example: it's very clean, straightforward Python, but there's a lot of branching and parsing.
You can use exiftool with the system() command available in R. Optionally, you can write a regexp to pull out the fields you need... If I were you, I'd stick with Dirk's advice (as usual) =)!
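To illustrate the shell-out idea (sketched in Python here; in R you'd wrap the same command in system(..., intern = TRUE)), exiftool's -j flag emits JSON, which spares you the regexp:

import json
import subprocess

def audio_metadata(path):
    # exiftool -j prints a JSON array with one object per input file
    out = subprocess.run(
        ["exiftool", "-j", "-Artist", "-Album", "-Title", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)[0]

meta = audio_metadata("song.mp3")
print(meta.get("Artist"), meta.get("Album"), meta.get("Title"))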
Out here in 2021, I wanted to do this, so I did the following:
1. Create a new playlist while in 'Songs' view.
2. Select all songs and drag them to the new playlist. Highlight that playlist.
3. File > Library > Export Playlist. Mine defaulted to saving as .txt; if yours doesn't, choose it.
4. Open Excel to save it as csv, or use read.delim() in R, since the txt file is tab-separated.
5. Import into R.
I have an unusual environment in a project: many files that are each independent, standalone scripts. All of the code required by a script must live in that one file, and I can't reference outside files with includes etc.
There is a common function that does authorization in all of these files; it is the last function in each file. If this function changes at all (as it does now and then), it has to be changed in all the files, and there are plenty of them.
Initially I was thinking of keeping the authorization function in a separate file and running a batch process that produced the final files by combining the auth file with each of the others. However, this is extremely cumbersome when debugging, because the auth function needs to be in the main file for that purpose. So I'd always be testing and debugging in the folder with the combined files and then have to copy changes back to the uncombined ones.
Can anyone think of a way to solve this problem, i.e. to maintain an identical fragment of code in multiple files?
I'm not sure what you mean by "the auth function needs to be in the main file for this purpose", but a typical Unix solution might be to use make(1) and cpp(1) here.
Not sure what environment/editor you're using, but one thing you can do is use pre-build events. Create a start tag and end tag that define the import region, and then in the pre-build event copy the common code in between the tags and then compile:
//$start-tag-common-auth
..... code here .....
//$end-tag-common-auth
In your pre-build event, just find those tags, replace everything between them with the imported code, and then finish compiling.
VS supports pre- and post-build events, which can call external processes (like batch files or scripts) but do not directly interact with the environment.
Instead of keeping the authentication code in a separate file, designate one of your existing scripts as the primary or master script, and use that one to edit/debug/work on the authentication code. Then add a build/batch process like the one you're describing that copies the authentication code from the master script into all of the other scripts.
That way you can still debug and work with the master script at any time, you don't have to worry about one more file, and your build/deploy process keeps everything in sync.
You can use a technique like the one @Priyank Bolia suggested to make it easy to find/replace the required bit of code.
An ugly way I can think of:
have the original code in all the files, and surround it with markers like:
///To be replaced automatically by the build process to the latest code
String str = "my code copy that can be old";
///Marker end.
This code block can then be replaced automatically by the build process from a single common code file.
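For what it's worth, a small sketch of the build step both marker-based answers describe: splice the current auth code in between the markers in every script (the marker text is taken from the example above; the glob pattern and file names are placeholders):

import glob
import re

START = "///To be replaced automatically by the build process to the latest code"
END = "///Marker end."

with open("auth_common.txt") as f:
    auth_code = f.read().rstrip("\n")

# match everything between the markers, including newlines
pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), flags=re.DOTALL)
replacement = START + "\n" + auth_code + "\n" + END

for path in glob.glob("scripts/*.js"):
    with open(path) as f:
        text = f.read()
    # use a lambda so backslashes in the auth code aren't treated as regex escapes
    new_text = pattern.sub(lambda m: replacement, text)
    if new_text != text:
        with open(path, "w") as f:
            f.write(new_text)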