My requirement is to encode data and send it across the network via HTTP, but I am stuck trying to choose the best encoding technique. Which of the three above (Base64, yEnc, Uuencode) is best? Is there anything better?
The criteria for "best" should be small size and fast encoding/decoding.
yEnc has less overhead but has other problems, mentioned here: http://en.wikipedia.org/wiki/YEnc#Criticisms
What is "best" depends on your criteria and on how you plan to send the data over the network. Are you using a web service, email, or some other means? You need to provide more details.
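To put a number on the "small size" criterion, here is a minimal sketch that measures the size overhead of Base64 and uuencode using only the Python standard library. The helper names and the 3000-byte random sample are assumptions for illustration; yEnc is left out because the standard library has no codec for it.

```python
import base64
import binascii
import os


def base64_size(data: bytes) -> int:
    """Length of the standard Base64 encoding (padded, no line breaks)."""
    return len(base64.b64encode(data))


def uuencode_size(data: bytes) -> int:
    """Length of the uuencoded body lines (without 'begin'/'end' framing)."""
    total = 0
    # binascii.b2a_uu accepts at most 45 input bytes per call (one output line).
    for i in range(0, len(data), 45):
        total += len(binascii.b2a_uu(data[i:i + 45]))
    return total


payload = os.urandom(3000)  # arbitrary binary sample, not real data
for name, size in [("base64", base64_size(payload)),
                   ("uuencode", uuencode_size(payload))]:
    print(f"{name}: {size} bytes, overhead {size / len(payload) - 1:.1%}")
```

Both encodings turn 3 input bytes into 4 output characters; uuencode adds a length character and a newline per line on top of that.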
Edit:
Since you are uploading data via HTTP, you do not need to use any of Base64, yEnc or Uuencode. Just use the standard HTTP file upload facility that is built into both browsers and web servers. See this question as a reference:
How does HTTP file upload work?
Also this reference:
http://www.hanselman.com/blog/ABackToBasicsCaseStudyImplementingHTTPFileUploadWithASPNETMVCIncludingTestsAndMocks.aspx
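As a rough illustration of why no text encoding is needed, a standard HTTP file upload sends the raw bytes inside a multipart/form-data body. The sketch below builds such a body by hand; the function name, field name and boundary handling are assumptions for illustration, and in practice a browser or HTTP library does this for you.

```python
import uuid
from typing import Optional, Tuple


def build_multipart(field: str, filename: str, data: bytes,
                    boundary: Optional[str] = None) -> Tuple[bytes, str]:
    """Return (body, content_type) for a single-file multipart/form-data upload."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    # The file bytes go into the body unchanged: no Base64, yEnc or uuencode step.
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


body, content_type = build_multipart("file", "sample.bin", bytes(range(256)))
print(content_type, len(body))
```

The only overhead is the part headers and boundary lines, which stay the same size however large the file is.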
I am starting work on a project where I need to stream Twitter data using PowerTrack/GNIP. To be honest, I am very inexperienced when it comes to networks, and I have no knowledge of data streams (HTTP) or how they work.
Are there any resources out there that go through all of this in simple terms? I would love to be able to map Data streaming process in my head before I start looking at APIs etc.
Thanks
Take a look at the following two resources, which give a good overview of video streaming. Video streaming probably has more background material available and should help you understand the concepts:
https://developer.apple.com/library/ios/documentation/NetworkingInternet/Conceptual/StreamingMediaGuide/Introduction/Introduction.html
http://www.jwplayer.com/blog/what-is-video-streaming/
In very simple terms, streaming breaks a large file or live stream into chunks, and sends those chunks one after another to a client (e.g. browser). The client can generally request a start point for content which is not a live stream. In the background this generally works by the client sending requests for each individual chunk (rather than just one request with multiple responses).
The advantage of the multiple request approach is that you know the client is actually still interested (e.g. the user has not browsed to another page), and for video and audio the client can dynamically request different bandwidth files depending on the current network connection - see: http://en.wikipedia.org/wiki/Adaptive_bitrate_streaming
Twitter does have a streaming page as well, but you have probably already seen this:
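The "one request per chunk" idea can be sketched as follows. This assumes the server honours HTTP Range requests; many video players request separate segment files instead, so treat the function name and the sizes as illustrative only.

```python
from typing import Iterator


def chunk_ranges(total_size: int, chunk_size: int) -> Iterator[str]:
    """Yield the Range header value for each successive chunk request."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_size, chunk_size):
        # Range end positions are inclusive, hence the -1.
        end = min(start + chunk_size, total_size) - 1
        yield f"bytes={start}-{end}"


# A client would send one GET per value, e.g. with header "Range: bytes=0-3".
for header_value in chunk_ranges(total_size=10, chunk_size=4):
    print(header_value)
```

Because each chunk is its own request, the client can stop asking at any time or switch to a different bitrate between chunks.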
https://dev.twitter.com/streaming/overview
I am new to screen scraping. When I use a proxy server and track the HTTP transactions, my POST data is revealed to me. So my questions are:
1) Will it be stored on the server side, or is it revealed only on the client side?
2) Is there an option for encrypting the POST data in screen scraping?
3) Is it advisable to use screen scraping for banking applications?
I am using the screen-scraper tool, which I downloaded from
http://www.screen-scraper.com/download/choose_version.php. (Enterprise version)
Thanks in advance.
My experience with scraping is that if you aren't doing anything super complex (like logging into a secure website like an online banking website, etc.) then Python has some great libraries that will help you out a lot.
To answer your questions:
1) You may need to be more clear, but this really depends on your server/client architecture.
2) Yes, though not through the libraries themselves. urllib and urllib2 (built-in Python libraries) URL-encode the data before you make a POST; they do not encrypt it. If you POST to an https:// URL, the connection itself is encrypted, and for most applications this will suffice.
3) I actually have done scraping on online banking sites! I'm not exactly familiar with that tool, but I would recommend using something a little different than a scraper. Selenium, which is a "web-driver", allows you to simulate the use of a browser, meaning anything that the browser does in the background in order to validate the session is automatically taken care of. The main problem I ran into while trying to scrape the banking site was the loss of important session data.
Selenium - https://pypi.python.org/pypi/selenium
Other libraries you may find useful are: urllib, urllib2, and Mechanize
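For point 2, a minimal sketch of preparing a form POST that will travel encrypted, using Python 3's urllib (the successor to the urllib/urllib2 modules named above). The helper name, URL and field names are assumptions for illustration. Note that urlencode only formats the fields; the encryption comes from the https:// scheme.

```python
from urllib.parse import urlencode, urlsplit
from urllib.request import Request


def build_https_post(url: str, fields: dict) -> Request:
    """Build a form POST, refusing any URL that is not HTTPS."""
    if urlsplit(url).scheme != "https":
        raise ValueError("refusing to POST form data over a non-HTTPS URL")
    body = urlencode(fields).encode("ascii")  # URL-encoding, not encryption
    return Request(
        url,
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_https_post("https://example.com/login",
                           {"user": "alice", "pin": "1234"})
# urllib.request.urlopen(request) would send it; left out so the sketch runs offline.
print(request.get_method(), request.full_url)
```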
I hope I was somewhat helpful!
I've used screen-scraper to scrape banking sites before. It will impact the site just like your browser--if the site uses encryption the connection from screen-scraper to the site will be too.
If you have a client page sending data to screen-scraper, you probably should encrypt that. I generally just make the connection via SSH.
1) What do you mean by server side? Your proxy server or screen-scraper software? Any of them can read/store your information.
2) If you are connecting through HTTPS then your software should warn you about a malicious proxy server: https://security.stackexchange.com/questions/8145/does-https-prevent-man-in-the-middle-attacks-by-proxy-server
3) I don't think they have a logger which they can read. But if you are concerned, you can try to write your own scraper. There are some APIs which let you read HTML easily with jQuery syntax:
https://pypi.python.org/pypi/pyquery or XPath: http://net.tutsplus.com/tutorials/javascript-ajax/web-scraping-with-node-js/
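On point 2, if you write your own client in Python, the "warning" takes the form of a failed TLS handshake, provided certificate checks are left on. A minimal sketch using only the standard library; the function name is an assumption for illustration.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Return a client TLS context with certificate and hostname checks enabled."""
    # create_default_context() already enables both checks; do not switch them off
    # to "make the proxy work", or an intercepting proxy goes unnoticed.
    return ssl.create_default_context()


context = strict_tls_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Passing this context to urllib.request.urlopen(url, context=context) makes the connection fail if a proxy presents a certificate that your system does not trust.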
So I'm trying to figure out what capabilities InterSystems provides for sending data to an XDS repository, specifically when using the basic Ensemble package (NO HSF). Assume it's not the repository InterSystems delivers, but an external XDS repository.
Is there a built-in way to send a large blob and wrap the ebRIM around that blob?
As you can see at http://www.intersystemsbenelux.com/media/media_manager/pdf/1398.pdf, Ensemble does not natively support ebRIM, but it does support XML and XML schemas.
Maybe you could assemble an XML document and use that to wrap your blob content.
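To show the general idea of wrapping a blob in XML, here is a language-neutral sketch written in Python rather than Ensemble's own tooling. The element names (DocumentEnvelope, MimeType, Content) are hypothetical and are not ebRIM; a real XDS submission would have to follow the ebRIM schema.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_blob(blob: bytes, mime_type: str, doc_id: str) -> bytes:
    """Wrap binary content in a small XML envelope (hypothetical element names)."""
    root = ET.Element("DocumentEnvelope", {"id": doc_id})
    ET.SubElement(root, "MimeType").text = mime_type
    content = ET.SubElement(root, "Content", {"encoding": "base64"})
    # XML cannot carry arbitrary bytes, so the blob is Base64-encoded as text.
    content.text = base64.b64encode(blob).decode("ascii")
    return ET.tostring(root, encoding="utf-8")


def unwrap_blob(xml_bytes: bytes) -> bytes:
    """Recover the original bytes from the envelope built by wrap_blob."""
    root = ET.fromstring(xml_bytes)
    return base64.b64decode(root.findtext("Content"))


envelope = wrap_blob(b"%PDF-1.4 sample bytes", "application/pdf", "doc-001")
print(envelope.decode("utf-8"))
```

Embedding the blob inline costs the usual Base64 size increase of about one third.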
You can send that over whatever protocol your XDS system provides (xDBC, SOAP, file system etc). Take a look at the items listed on sections "Ensemble Interoperability" and "Ensemble Adapter and Gateway Guides" of http://docs.intersystems.com/ens20122/csp/docbook/DocBook.UI.Page.cls for a full list of connectivity options.
Regards,
There is the HealthShare Foundation product, which has XDS connectivity.
See this good answer on Google Groups: https://groups.google.com/forum/m/?fromgroups#!topic/Ensemble-in-Healthcare/h7R300H68KQ
Or see the HealthShare part of their website:
HSF (HealthShare Foundation) provides XDS.b connectivity for query and retrieve, and also the Provide and Register operation.
Ok, so I re-read your question and have an answer for you. I think what you are trying to say is that you have Ensemble, not HSF, and you still want to be able to send documents (XDS provide and Register).
I did some testing with the open-source integration engine Mirth and stumbled across an example channel of theirs, which does a Provide and Register with straight-up SOAP calls to the endpoint.
Basically, build the required SOAP envelope accordingly, then send a PDF or other document to the repository using MTOM.
This is how HealthShare earns its money: it encapsulates all that manual construction of objects that need to be sent to endpoints.
Anyway, a screenshot of the Mirth channel destination may give you an understanding:
http://www.integrationrequired.com/wp-content/uploads/2013/02/Capture.PNG
I have data files, mostly very large text files, and they are spread over several systems.
Now I want to perform various operations such as sorting and searching across these systems.
So far I have used Parallel Python to access files on the different systems and perform operations like search, but I want to provide a web interface so that clients can send requests through it rather than writing programs.
I was wondering whether to use HTTP or FTP requests to fetch and process the requests.
I apologise if the question sounds ridiculous, since I am still figuring things out myself.
That pretty much depends on what you're trying to do, IMHO. If you're fetching the files and sorting/searching/whatever them client-side, FTP would be the appropriate protocol (since you're just transferring files). On the other hand, if the files are being processed (sorted/searched/etc.) server-side, an HTTP POST request would be more appropriate.
So, judging from your post and what I think you're trying to do, I'd go with an HTTP POST request.
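A minimal sketch of the server-side option: the client POSTs a search term and the server answers with the matches, so the large files never leave the server. The function names, the "term" form field and the in-memory corpus are assumptions for illustration; a real service would read the files from disk and call this from its web framework's POST handler.

```python
import json
from typing import Dict, List
from urllib.parse import parse_qs


def search_corpus(corpus: Dict[str, str], term: str) -> Dict[str, List[int]]:
    """Map each file name to the 1-based line numbers that contain the term."""
    hits = {}
    for name, text in corpus.items():
        lines = [n for n, line in enumerate(text.splitlines(), start=1)
                 if term in line]
        if lines:
            hits[name] = lines
    return hits


def handle_search_post(body: bytes, corpus: Dict[str, str]) -> bytes:
    """Process a form-encoded POST body such as b'term=foo' into a JSON reply."""
    form = parse_qs(body.decode("utf-8"))
    term = form.get("term", [""])[0]
    if not term:
        return json.dumps({"error": "missing 'term'"}).encode("utf-8")
    reply = {"term": term, "hits": search_corpus(corpus, term)}
    return json.dumps(reply).encode("utf-8")


sample = {"a.txt": "foo\nbar\nfoo bar", "b.txt": "baz"}  # stand-in for real files
print(handle_search_post(b"term=foo", sample).decode("utf-8"))
```

Only the term and the result travel over the network, which is the main reason to prefer this over fetching whole files via FTP.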
I am doing file transfers, but the FileReference API doesn't support file chunking. Has anyone done this before? For example, I would like to be able to upload a 1 GB file from an AIR client to a custom PHP/Java/etc. service.
It seems that all you should have to do is use the upload() routine. The PHP or Java service should be doing the chunking.
// AIR JavaScript: runtime classes are reached through the air.* aliases (AIRAliases.js)
var myHugeFile = new air.File('myHugeLocal.file');
myHugeFile.upload(new air.URLRequest("http://your.website.com/uploadchunker.php"));
There is a much more elaborate example of using FileReference in the Adobe learning area here:
http://www.adobe.com/devnet/air/flex/articles/uploading_air_app_to_server.html
Three options jump out on this:
1) Use an FTP service that supports resumable transfers, assuming Flash supports this as well. Maybe not an option if you want to communicate with a custom service of your own.
2) Leverage HTTP's partial-content header support. This is only applicable if AIR allows access to the appropriate HTTP headers (Content-Range and Content-Length). This is what BITS does. Probably a bit harder to implement.
3) Hand-roll your own TCP or UDP protocol exchange. Not for the faint of heart. I'd look in the OSS space before going this route.
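The header-based option can be sketched like this, in Python rather than AIR for brevity. It assumes a custom service that accepts a Content-Range header on each upload request, which is something you would have to implement on the server side; the function name and chunk size are illustrative.

```python
from typing import Dict, Iterator, Tuple


def upload_chunks(data: bytes, chunk_size: int,
                  start_at: int = 0) -> Iterator[Tuple[Dict[str, str], bytes]]:
    """Yield (headers, chunk) pairs, one per upload request.

    start_at lets a client resume after the last byte the server confirmed.
    """
    total = len(data)
    for start in range(start_at, total, chunk_size):
        chunk = data[start:start + chunk_size]
        end = start + len(chunk) - 1  # Content-Range end is inclusive
        headers = {
            "Content-Range": f"bytes {start}-{end}/{total}",
            "Content-Length": str(len(chunk)),
        }
        yield headers, chunk


for headers, chunk in upload_chunks(b"abcdefghij", chunk_size=4):
    print(headers["Content-Range"], chunk)
```

For a 1 GB file you would read each chunk from disk instead of slicing one bytes object, so the whole file never has to sit in memory.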
I think FileReference does chunk; at least that is what I have observed. Using a tool like Fiddler, you can watch it in action. If you analyze the outgoing headers of a FileReference upload, they are chunked.
If resumes are what you're after, I cannot say how you would go about that with FileReference. I have uploaded small files in generic POSTs, but that requires the Flash/AIR client to load all bytes into the app. In AIR that may or may not crash Flash with a 1 GB file (depends on your system, I guess).