Obtaining reddit data [closed] - web-scraping

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
I am interested in obtaining data from different reddit subreddits. Does anyone know whether reddit (or a third party) offers an API, similar to Twitter's, for crawling all the pages?

Yes, reddit has an API that can be used for a variety of purposes such as data collection, automatic commenting bots, or even to assist in subreddit moderation.
There are a few places to discover information on reddit's API:
github reddit wiki -- provides the overview and rules for using reddit's API (follow the rules)
automatically generated API docs -- provides information on the requests needed to access most of the API endpoints
/r/redditdev -- the reddit community dedicated to answering questions both about reddit's source code and about reddit's API
If there is a particular programming language you are already familiar with, you should check out the existing set of API wrappers for various languages. Despite my bias (I am the package maintainer) I am quite certain PRAW, for python, has support for the largest number of reddit API features.

Note that if you are only reading data, and not interested in posting back to reddit, you can get quite a bit of data from the json feeds associated with each subreddit. With this method, you don't need to worry about an API at all -- you simply request the relevant json file and parse it in your language of choice.
Here's an example URL that will return a json object containing the top posts from the Justrolledintotheshop subreddit:
https://www.reddit.com/r/Justrolledintotheshop/top.json
In place of top, you can use hot, new, or controversial. When using top, you can add ?t=day to the end of the url to request the top posts for the day. Valid values for t are hour, day, week, month, year, or all.
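The URL rules above can be sketched as a small builder. This is only an illustration: buildListingUrl and the two constant arrays are hypothetical names, not part of any reddit library, and the sketch assumes that t is only meaningful for top, as the answer describes.

```javascript
// Sort orders and time windows named in the answer above.
const LISTING_SORTS = ["hot", "new", "top", "controversial"];
const LISTING_WINDOWS = ["hour", "day", "week", "month", "year", "all"];

// Hypothetical helper: builds the .json feed URL for a subreddit.
function buildListingUrl(subreddit, sort = "hot", timeWindow = null) {
  if (!LISTING_SORTS.includes(sort)) {
    throw new Error("unknown sort: " + sort);
  }
  let url =
    "https://www.reddit.com/r/" + encodeURIComponent(subreddit) + "/" + sort + ".json";
  if (timeWindow !== null) {
    // Assumption of this sketch: only "top" accepts the t parameter.
    if (sort !== "top") {
      throw new Error("this sketch only adds t to the top listing");
    }
    if (!LISTING_WINDOWS.includes(timeWindow)) {
      throw new Error("unknown time window: " + timeWindow);
    }
    url += "?t=" + timeWindow;
  }
  return url;
}

// Usage: top posts of the day for the subreddit from the example URL.
console.log(buildListingUrl("Justrolledintotheshop", "top", "day"));
```

Validating the sort and time window up front keeps typos from silently producing a URL that reddit would answer with a different listing or an error.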

You can also fetch and parse JSON data from reddit with ajax/javascript.
Reddit has CORS enabled for GET requests.
Here is an example that fetches the latest posts from r/videos in JSON format:
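// Fetch the r/videos listing as JSON (reddit enables CORS for GET requests).
var xhr = new XMLHttpRequest();
xhr.open("GET", "https://www.reddit.com/r/videos/.json", true);
// Register the handler before sending the request.
xhr.onreadystatechange = function () {
  // readyState 4 (DONE) means the complete response body has arrived;
  // parsing earlier would hand JSON.parse a partial string.
  if (this.readyState === 4 && this.status === 200) {
    console.log(JSON.parse(this.responseText));
  }
};
xhr.send(null);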
https://developer.mozilla.org/fr/docs/Web/API/XMLHttpRequest
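Once the response is parsed, the posts sit in the listing's data.children array, each wrapped in an object with its own data field. A minimal sketch of pulling out a few fields follows; extractPosts is a hypothetical helper and sampleListing is hand-written placeholder data in the shape of a listing response, not real reddit output.

```javascript
// Hypothetical helper: reduces a parsed listing to title, score, and link.
function extractPosts(listing) {
  const children = (listing && listing.data && listing.data.children) || [];
  return children.map(function (child) {
    return {
      title: child.data.title,
      score: child.data.score,
      // permalink is a path relative to the reddit site root.
      link: "https://www.reddit.com" + child.data.permalink,
    };
  });
}

// Hand-written sample in the shape of a listing (placeholder values).
const sampleListing = {
  kind: "Listing",
  data: {
    children: [
      { kind: "t3", data: { title: "First video", score: 10, permalink: "/r/videos/comments/abc/first_video/" } },
      { kind: "t3", data: { title: "Second video", score: 7, permalink: "/r/videos/comments/def/second_video/" } },
    ],
  },
};

console.log(extractPosts(sampleListing));
```

In the XMLHttpRequest handler, the same helper could be applied to the result of JSON.parse instead of logging the raw object.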
To go deeper, check out this question:
Change youtube video ID without page reloading

Related

Is there a way to post new ARTICLE using Linkedin API?

As the title says: is it possible to create an article using the current Linkedin API?
I am not interested in sharing an article but in posting a new one. I found some questions about this from last year and, based on the answers, it wasn't possible back then. Is that still the case? I couldn't find this in the Linkedin documentation.
An additional question: did the character limit for a post (creating a post via the API is possible) change recently? It is currently set to 1300 characters, but I am pretty sure that a few weeks ago we were able to create longer posts.

Google Analytics - What is "Vitaly rules google ☆*:.。...)ノʕ•̫͡•ʔᶘ ᵒᴥᵒᶅ(=^. .^=)oO"? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 6 years ago.
I was checking my Google Analytics Realtime Overview and I found the following.
Is it harmful to my site? Should I take any precautions?
UPDATE December 13th 2016: The person behind this is back again, this time sending spam as a language with a similar message:
Vitaly rules google ☆*:。゜゚・ヽ(^ᴗ^)ノ・゜゚。:*☆
¯_(ツ)_/¯(ಠ益ಠ)(ಥ‿ಥ)(ʘ‿ʘ)ლ(ಠ_ಠლ)( ͡° ͜ʖ ͡°)ヽ(゚Д゚)ノʕ•̫͡•ʔᶘ ᵒᴥᵒᶅ(=^ ^=)oO
The method in the original answer was only for keyword spam. For a more comprehensive solution, the following guide will help prevent this and any other type of spam in Google Analytics.
Ultimate Guide to Getting Rid of the Spam and Other Junk Traffic in Google Analytics
This is spam. Vitaly is the name behind some of the latest referrer spam hitting Google Analytics. This time, however, it uses a different method, based on keywords.
This type of spam never accesses your site, so you don't have to worry about your site's security. The only thing you should do is stop it with filters in GA to keep your stats clean.
Go to Admin tab in Google Analytics
Select the View you want to filter > Filter > New Filter
In Filter Type choose Custom Filter > Exclude Filter
Field: Campaign Term
Filter Pattern: Enter Vitaly rules google
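Before saving the filter, it can help to check the pattern against a few campaign terms. The sketch below assumes the filter pattern is treated as a regular expression with partial, case-insensitive matching; isSpamTerm and the sample terms are hypothetical and only illustrate the idea, they are not Google Analytics code.

```javascript
// The pattern entered in the "Filter Pattern" field above.
// Assumption: partial match, case-insensitive.
const spamPattern = /Vitaly rules google/i;

// Hypothetical helper: true when a campaign term would be excluded.
function isSpamTerm(term) {
  return spamPattern.test(term);
}

// Sample terms: the spam string from the question and an ordinary keyword.
const sampleTerms = [
  "Vitaly rules google ☆*:。゜゚・ヽ(^ᴗ^)ノ・゜゚。:*☆",
  "drupal views tutorial",
];

sampleTerms.forEach(function (term) {
  console.log(isSpamTerm(term) ? "excluded" : "kept", "-", term);
});
```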
Here you can find more information about this specific issue:
https://www.ohow.co/secret-%C9%A2oogle-com-trump-spam-google-analytics/
Related answers https://stackoverflow.com/a/28354319/3197362
If you need to know how to get rid of this garbage, it is as simple as creating a filter in Google Analytics. Many people have mentioned editing the htaccess file, but I haven't found a way to do it like that yet.
Just create a custom exclusion filter. If you need the exact specifics, check out this post on how to remove vitaly rules google

Is there a way to download all the questions and answers from stack overflow?

I'm interested in looking at website usage, question types, and answers on stack overflow. Is there a way to download all of the content?
I've considered web scraping with beautiful soup or similar as an option, but thought that there are so many expert users the information might be readily available through an API.
Yes, as you guessed, there's a JSON API; see https://blog.stackoverflow.com/2012/09/stack-exchange-api-v2-1/
For example, to get all the questions: https://api.stackexchange.com/docs/questions
It requires a programmatic HTTP client and a JSON parser. It's quite simple with perl, python or ruby.
Another solution, proposed by fvu in the comments, is to parse a full snapshot of any stackexchange site.
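As a rough sketch of the API route, the questions endpoint is paged, and each response carries an items array plus a has_more flag. The helper names below (buildQuestionsUrl, collectQuestions, fetchJson) are hypothetical, and the sketch assumes API version 2.3 and an injected function that fetches a URL and returns the parsed JSON.

```javascript
// Hypothetical helper: URL for one page of Stack Overflow questions.
function buildQuestionsUrl(page, pageSize = 100) {
  const params = new URLSearchParams({
    page: String(page),
    pagesize: String(pageSize),
    order: "desc",
    sort: "creation",
    site: "stackoverflow",
  });
  return "https://api.stackexchange.com/2.3/questions?" + params.toString();
}

// Hypothetical helper: walks pages until has_more is false or maxPages is hit.
// fetchJson is supplied by the caller and must resolve to the parsed body.
async function collectQuestions(maxPages, fetchJson) {
  const questions = [];
  for (let page = 1; page <= maxPages; page++) {
    const body = await fetchJson(buildQuestionsUrl(page));
    questions.push(...body.items);
    if (!body.has_more) {
      break;
    }
  }
  return questions;
}

// Usage (performs real requests, so it is left commented out):
// collectQuestions(2, (url) => fetch(url).then((res) => res.json()))
//   .then((qs) => console.log(qs.length, "questions"));
```

The API enforces a request quota, so for a copy of every question and answer the full snapshot mentioned above is likely the more practical route; the paged approach suits smaller samples.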

Google places API does this violate the TOC? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
I want to have a little search box where the user can search for a place (using the API). When they select a place, e.g. "statue of liberty, new york", I want to take them to a detail page, i.e. mysite/ID/statue-of-liberty-new-york, and then let them do some things on that page.
The only data I want from Google on the detail page is the place's ID and, of course, the name and address of the place. That's it; after that I want to do my own stuff and attach my own data to the places.
I'm a bit annoyed that it's hard to understand what is acceptable; they should have expressed this TOC in layman's terms.
Here's some relevant info from their TOC:
a) No Unauthorized Copying, Modification, Creation of Derivative Works, or Display of the Content. You must not copy, translate, modify, or create a derivative work (including creating or contributing to a database) of, or publicly display any Content or any part thereof except as explicitly permitted under these Terms. For example, the following are prohibited: (i) creating server-side modification of map tiles; (ii) stitching multiple static map images together to display a map that is larger than permitted in the Maps APIs Documentation; (iii) creating mailing lists or telemarketing lists based on the Content; or (iv) exporting, writing, or saving the Content to a third party's location-based platform or service.
(b) No Pre-Fetching, Caching, or Storage of Content. You must not pre-fetch, cache, or store any Content, except that you may store: (i) limited amounts of Content for the purpose of improving the performance of your Maps API Implementation if you do so temporarily (and in no event for more than 30 calendar days), securely, and in a manner that does not permit use of the Content outside of the Service; and (ii) any content identifier or key that the Maps APIs Documentation specifically permits you to store. For example, you must not use the Content to create an independent database of "places" or other local listings information.
You'll find the policies relevant to your question here: https://developers.google.com/places/policies
You may use the search box, and you may also show the results on a page without showing a Google map.
It's still not clear whether you use any type of map inside your application, but if you do, you may not use any data delivered by the Places service on that map (e.g. you may not use the latLng of the place result to create a marker on this map).
You must show the Google logo (because you show data, the name and address, received from the service).
When the response contains html_attributions for the place, you must also show these html_attributions.

How to create a small form that pull data out of a database in Drupal [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 7 years ago.
I created a content type for some quizzes on my site, and now I'd like to create a basic form (only available to admins) to pull some stats on them.
The fields used for a quiz are name, start date, end date and correct answer. Each of these fields should be a searchable criterion in the form, and the form should return a list of quizzes. There should also be a relationship with the user table to display a list of those who answered the quiz.
Later I'm gonna need an option to extract the data in excel, but let's focus on the form first.
I'm using Drupal 6; please take into consideration that I'm still pretty new to Drupal.
How can I do this?
I imagine you are using CCK for the 'quiz' content type?
If you are, then the best way to 'mash' this data up without getting overly complex is to use Views. You can think of Views as an interactive SQL query builder.
You can create pages, blocks or even RSS feeds from the output of Views.
The Forena module seems like a valid alternative to consider. For more details about Forena, 2 types of documentation are available:
Community documentation.
Documentation that comes with Forena, which you can access right after you install and enable the module. Check out the demo site for an online example of the current:
Forena documentation - use the link 'Reporting documentation' or visit relative link /reports/help.
Forena samples - use the link 'Reporting samples' or visit relative link /reports/samples (these samples are fully functional, so make sure to experiment a bit with it, such as the drill downs available on the SVG Graph sample).
The newest 7.x-4.x version also includes an amazing (I think) UI for creating your reports (the WYSIWYG report editor) and for creating your SQL queries (the Query Builder).
Be aware: I'm a co-maintainer of Forena.
