Facebook-like image scraping - web-scraping

I'm trying to implement an image scraper feature that works similarly to Facebook's when you post a link. I don't care about the actual UI part. I just want to pass a URL to a script and have it return the URLs of all the images on the page.
It's really easy to put together something that only works some of the time - this, for example - but I want something that works reasonably well.
I'm capable of writing this kind of thing myself, but it of course would be dumb for me to do that if there's already something written that's available for free.
Does anyone know of such a tool that exists? I don't care what language it uses as long as it will run on *nix.

You can start with Python and Scrapy: http://doc.scrapy.org/en/latest/index.html
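For a sense of how small the core of this is, here is a minimal sketch that uses only the Python standard library rather than Scrapy. The names `ImageCollector`, `extract_image_urls` and `scrape_image_urls` are made up for this example. It only sees `<img>` tags present in the static HTML, so images added by JavaScript or referenced from CSS are missed, which is roughly the "works some of the time" limitation the question describes.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ImageCollector(HTMLParser):
    """Collects the src of every <img> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Turn relative paths such as "/a.png" into absolute URLs.
            absolute = urljoin(self.base_url, src)
            if absolute not in self.image_urls:
                self.image_urls.append(absolute)


def extract_image_urls(html, base_url):
    """Return the de-duplicated image URLs found in an HTML string."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.image_urls


def scrape_image_urls(page_url):
    """Download a page and return the image URLs it references."""
    with urlopen(page_url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_image_urls(html, page_url)
```

Calling `scrape_image_urls("http://example.com/")` would fetch the page over the network; `extract_image_urls` can be tried offline on any HTML string.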

Related

How to configure SlickUpload 6 AJAX to NOT require a file

I've been trying to figure out how to make the current SlickUpload 6.1.7 play nice in a form that does not require someone to include files. We want it to be optional. This form is for people to contact us, and we want to give them the option to include attachments, just not require it. It works if you upload a file; however, the form will not submit if you do not upload any file.
We are trying to use the Ajax version with Memory stream, but the 'AspNetAjaxCs-VS2010' sample code is not helping much. The documentation is sparse, and the samples have little consistency between them, making it hard (at least for me) to understand how exactly it all works. The fact that they got bought out, and that the new owner pretty much ignores people unless they are paying for it, is not cool.
Any ideas?
You could use a div to detect a drop inside your page. That way you could activate it only when the user really needs it, or whenever the user actually selects something for uploading, like a trigger.
I'm not sure I understand your question, but if I do, it sounds like you already have a form that you want to use SlickUpload with to upload files only IF there are files to be uploaded. Are you using the CustomUploadStreamProvider form?

Does anybody know of a Wordpress plugin that produces a PDF locally?

The function of the plugin is to print the page contents as a PDF.
I can find a bunch that connect to an external service in a popup to generate the PDF, but I can't seem to find one that produces the PDF on the server where the PDF needs to be generated.
The reason I don't want to use these external services is branding, and most of them have advertising. If it weren't for this, I wouldn't mind using these plugins either.
I'm OK with paying money if the plugin is good enough.
P.S. On an unrelated note, Wordpress' plugin search sucks :( I can't filter by version number, compatibility, etc.
I've found this: http://www.investintech.com/resources/blog/archives/167-top-10-handy-wordpress-pdf-plugins.html
It should help you find what you need.
I already tried Print Friendly, and I think it could work in your situation.

Grab Xbox live friends list from bungie

Hey all, I'm trying to grab a friends list from Bungie's LiveFriends.aspx page:
https://www.bungie.net/Stats/LiveFriends.aspx
and display it in a desktop application (VB or something).
How would I be able to do this? Does it have anything to do with ASP? Are there any tutorials that can show me how to grab and display information?
If you're really interested in consuming information from Xbox Live, you can apply for the XBL Community Developer program for free here: http://www.xbox.com/en-US/community/developer/
There you'll be provided with API access that will be quicker and more reliable than parsing data from the Bungie site.
You'll need to fetch the data ("scrape" it) through something like a WebRequest. That will give you the raw HTML or whatever it outputs.
I'm sure, without even looking, that it uses some kind of login as well, which you will have to support. I would guess that involves making a request with the credentials to some page and extracting the cookie it returns, which you will then have to send with every later request. Cookies travel as HTTP headers.
The first thing you'll have to do is examine the HTML returned and determine how to process it to get the information you want. I would use Chrome and its excellent developer tools for this, or another browser like Opera or Firefox with similar capabilities. This will also work for figuring out how to handle the session cookie.
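The login-then-fetch flow described above can be sketched as follows. This is only an illustration in Python (in .NET the same steps map onto a WebRequest with a CookieContainer). The function names, the login URL and the form field names `username` and `password` are all assumptions; the real login form will almost certainly use different fields and may need extra hidden values, which is what the browser's developer tools would reveal.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Build an opener that stores cookies and resends them automatically."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, username, password):
    """Create a POST request carrying form-encoded credentials.

    The field names below are hypothetical placeholders.
    """
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(login_url, data=form.encode("ascii"))


def fetch_after_login(opener, login_url, page_url, username, password):
    """Log in, then fetch a page using the session cookie set by the login."""
    # The cookie returned by the login response is kept in the opener's jar.
    opener.open(build_login_request(login_url, username, password)).close()
    # The same opener sends that cookie back as a header on this request.
    with opener.open(page_url) as response:
        return response.read().decode("utf-8", errors="replace")
```

The returned HTML still has to be parsed to pull out the friends list; that part depends entirely on the markup the page actually produces.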
Maybe 360voice can help? Haven't looked at the API enough to know if it has what you need.
http://360voice.gamerdna.com/forum/topic.asp?TOPIC_ID=3

Writing web forms filler/submitter with QT C++

I'm scratching my head on how to accomplish the following task: I need to write a simple web forms filler/submitter with Qt C++ that does the following:
1) Loads page url
2) Fills in form fields
3) Submits the form
Sounds easy, but I'm a web developer and can't find a way to make Qt accomplish the task. I only managed to load a URL with a QWebView object using WebKit, and have no idea what to do next: how to fill in fields and submit forms. Any hints, tutorials, videos? I appreciate this.
The QWebElement class does all the work; just reading through the class documentation gave me a full idea of how to accomplish my task. Thanks to everyone for the suggestions.
The best solution would be to write the logic in JavaScript that does what you want and then inject it into the page using QWebFrame::evaluateJavaScript() after it finishes loading.
There's also another way to do this, involving the document tree traversal API that's been available in QtWebKit since 4.6: QWebElement. You'd basically process the form pretty much the same as you would in JavaScript, except that here the API is different and more limited. It's C++ though, and might be a little bit faster. I guess this approach might be less attractive for you, given you're a web developer and probably already know JavaScript.
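As a rough illustration of the JavaScript-injection approach, the sketch below builds the script text that would be handed to QWebFrame::evaluateJavaScript() once the page has finished loading. It is written in Python purely to show the shape of the script; the helper name `build_form_filler_script`, the form selector and the field names are hypothetical, and the same string could just as well be assembled in C++.

```python
import json


def build_form_filler_script(form_selector, field_values, submit=True):
    """Return JavaScript that fills named fields of one form and optionally submits it."""
    lines = [
        "(function () {",
        # json.dumps yields a properly quoted and escaped JavaScript string literal.
        "  var form = document.querySelector(%s);" % json.dumps(form_selector),
        "  if (!form) { return false; }",
    ]
    for name, value in field_values.items():
        lines.append(
            "  form.elements[%s].value = %s;" % (json.dumps(name), json.dumps(value))
        )
    if submit:
        lines.append("  form.submit();")
    lines.append("  return true;")
    lines.append("})();")
    return "\n".join(lines)
```

The generated script assumes every named field exists in the form; a missing field would raise a JavaScript error in the page, so a real version would probably check for that.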

how to design a game web app?

I know VB.NET, but have had no experience at all with web programming. I need to make a web app that can run in a browser where there is a board game and pieces that you can move around. Can someone help me get started? Are there any examples in ASP.NET?
I need something like this:
http://www.hallofbrightcarvings.com/game/grid
I don't know what language this is built in, but I would much prefer VB.NET. I would like the pieces to be pictures instead of text. Please help me get started.
I have a very basic example of moving pieces around a grid, written in JavaScript.
You can see it in action here, and if you take a look at the source you can see it's done mostly with jQuery. Feel free to take a prod around. I haven't updated that version in a long time, but hopefully you might find it useful.
I think ASP.NET can do very little for you, given what you described. What you need is either Flash or JavaScript skills.
Let's decompose this. You need two things if you want to make the whole thing yourself:
Client side: Flash, Silverlight, Java
Server side: PHP, ASP.NET, Java
Since you know VB.NET and want to work with ASP.NET, I recommend using Silverlight.
How complex can this be?
Depends on what you want to build. If you want to build a Mafia Wars-style game, then you'll need to work on the user interface, and it'll be very hard. The server side will also be important, as you need to handle registration and the relations between different players.
If you make your question more specific, you could get better answers.
The example you cited above is fully client-side, which means the code all runs in the browser and the server doesn't do anything to enable the grid. So if you did a "Save As" of that page on your computer, you could run it offline.
You should use the view source functionality of your browser on the page you cited, and look at how it's built. It's done using HTML, CSS and JavaScript. Use w3schools to get yourself started on those three topics.
If you really need to code it using vb.net, I don't know of any way that allows drag-and-drop for web forms. I'd be interested to know though. Ajax and .net drag-and-drop should be keywords for you to look into.
To do this on the web, you'd probably want to divide the project into two components: Client-side and server-side.
On the server side, you'll want to use a language like PHP, Python or ASP.NET. ASP.NET lets you write the server code in VB.NET, so that would be a good choice for you to minimize the number of new things you need to learn.
Client-side is going to be the big hurdle. There are basically two different approaches to take here:
HTML+CSS+JavaScript, using HTTP callbacks (i.e., AJAX) to communicate with the server.
Flash using Flex (I think HTTP calls is probably the easiest way to talk to your server here as well.)
For a game like that, I would think that Flash is probably the best way to go. It will be easier to do graphics and sounds, and it'll run the same in every browser that has Flash support.
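Whichever client technology is chosen, the server-side half of that split usually comes down to holding the board state and validating each move the client sends. Here is a minimal sketch of that idea, in Python only for illustration (the same structure carries over to VB.NET); the `Board` class and its method names are invented for this example, and the rules are reduced to "stay on the board, don't land on an occupied square".

```python
import json


class Board:
    """Server-side state for a grid of squares with movable pieces."""

    def __init__(self, columns, rows):
        self.columns = columns
        self.rows = rows
        self.pieces = {}  # piece id -> (column, row)

    def place(self, piece_id, column, row):
        """Put a new piece on an empty square."""
        self._check_square(column, row)
        self.pieces[piece_id] = (column, row)

    def move(self, piece_id, column, row):
        """Move an existing piece, rejecting moves the client should not make."""
        if piece_id not in self.pieces:
            raise KeyError(piece_id)
        self._check_square(column, row)
        self.pieces[piece_id] = (column, row)

    def _check_square(self, column, row):
        if not (0 <= column < self.columns and 0 <= row < self.rows):
            raise ValueError("square is off the board")
        if (column, row) in self.pieces.values():
            raise ValueError("square is occupied")

    def to_json(self):
        """Serialize positions so a client can redraw the board after a callback."""
        return json.dumps({pid: list(pos) for pid, pos in sorted(self.pieces.items())})
```

A client would send the piece and target square in an HTTP call, and the server would reply with either an error or the output of `to_json()`.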

Resources