How to get text only from Wikimedia API? And how to edit CSS? - css

First: how can I style this rendered page with CSS?
http://en.wikipedia.org/w/index.php?title=Albert_Einstein&action=render
Second: I have a link, http://en.wikipedia.org/w/api.php?action=query&prop=extracts&format=json&exintro=&titles=Stack%20Overflow
Result:
{"query":{"pages":{"21721040":{"pageid":21721040,"ns":0,"title":"Stack Overflow","extract":"<p><b>Stack Overflow</b> is a website, the flagship site of the Stack Exchange Network, created in 2008 by Jeff Atwood and Joel Spolsky, as a more open alternative to earlier Q&A sites such as Experts Exchange. The name for the website was chosen by voting in April 2008 by readers of <i>Coding Horror</i>, Atwood's popular programming blog.</p>\n<p>It features questions and answers on a wide range of topics in computer programming. The website serves as a platform for users to ask and answer questions, and, through membership and active participation, to vote questions and answers up or down and edit questions and answers in a fashion similar to a wiki or digg. Users of Stack Overflow can earn reputation points and \"badges\"; for example, a person is awarded 10 reputation points for receiving an \"up\" vote on an answer given to a question, and can receive badges for their valued contributions, which represents a kind of gamification of the traditional Q&A site or forum. All user-generated content is licensed under a Creative Commons Attribute-ShareAlike license.</p>\n<p>As of August 2013<sup class=\"plainlinks noprint asof-tag update\" style=\"display:none;\">[update]</sup>, Stack Overflow has over 1,900,000 registered users and more than 5,500,000 questions. Based on the type of tags assigned to questions, the top eight most discussed topics on the site are: C#, Java, PHP, JavaScript, Android, jQuery, C++ and Python.</p>\n<p></p>"}}}}
How can I get the text without this prefix and the other extra characters?
{"query":{"pages":{"21721040":{"pageid":21721040,"ns":0,"title":"Stack Overflow","extract":"

What you're getting is a JSON object. I'm not sure how you're trying to use the data, but you should parse that JSON and take only the part that matters to you: the extract property.
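As a rough sketch of that advice in Python (the helper names `TextOnly` and `extract_text` are made up for illustration, and the sample string is a shortened stand-in for the response in the question), you could parse the JSON, pick the extract of the first page, and drop the HTML tags:

```python
import json
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text nodes and ignores every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def extract_text(raw_json):
    """Return the plain text of the first page's extract."""
    data = json.loads(raw_json)
    pages = data["query"]["pages"]
    # The key is the page id, which is not known in advance,
    # so take the first (here: only) page in the mapping.
    page = next(iter(pages.values()))
    parser = TextOnly()
    parser.feed(page["extract"])
    parser.close()
    return "".join(parser.parts).strip()


# Shortened, hypothetical stand-in for the API response in the question.
sample = '{"query":{"pages":{"21721040":{"pageid":21721040,"ns":0,"title":"Stack Overflow","extract":"<p><b>Stack Overflow</b> is a website.</p>\\n<p></p>"}}}}'
plain_text = extract_text(sample)
print(plain_text)
```

The extracts module also documents an explaintext parameter that asks the API for plain text instead of limited HTML, which may make the tag-stripping step unnecessary.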


Is web scraping allowed? [closed]

I'm working on a project that requires certain statistics from another website, and I've created an HTML scraper that gets this data every 15 minutes, automatically. However, I stopped the bot now, as in their terms of use, they mention they do not allow it.
I really want to respect this, and especially if there's a law prohibiting me from taking this data, but I've been contacting them through email several times without a single answer, so now I've come to the conclusion that I'll simply grab the data, if it is legal.
On certain forums I've read that it IS legal, but I would much rather get a more "precise" answer here on StackOverflow.
And let's say that this is in fact not illegal, would they have any software to spot my bot making several connections every 15 minutes?
Also, when talking about taking their data, we're talking about a single number for each "team", and I will convert this number into our own number.
I'll quote the answer by Pablo Hoffman (Scrapinghub co-founder) to "What is the legality of web scraping?", which I found on another site:
First things first: I am not a lawyer and these comments are solely based on my experience working at Scrapinghub, please seek legal assistance accordingly.
Here are a few things to consider when scraping public data from websites (note that the following addresses only US law):
As long as they don't crawl at a disruptive rate, scrapers do not breach any contract (in the form of terms of use) or commit a crime (as defined in the Computer Fraud and Abuse Act).
A website's user agreement is not enforceable as a browsewrap agreement because companies do not provide sufficient notice of the terms to site visitors.
Scrapers access website data as a visitor, and by following paths similar to a search engine. This can be done without registering as a user (and explicitly accepting any terms).
In Nguyen v. Barnes & Noble, Inc. the courts ruled that simply placing a link to a terms of use at the bottom of a webpage is not sufficient to "give rise to constructive notice." In other words, there is nothing on a public page that would imply that merely accessing the information is subject to any contractual terms. Scrapers give neither explicit nor implicit assent to any agreement, and therefore breach no contract.
Social networks, for example, assign the value of becoming a user (based on call-to-action on public page), as the ability to: i) Gain access to full profiles, ii) Identify common friends/connections, iii) Get introduced to others, and iv) Contact members directly. As long as scrapers make no attempt to perform any of these actions, they do not gain "unauthorized access" to their services and thus do not violate the CFAA.
A thorough evaluation of the legal issues involved can be seen here: http://www.bna.com/legal-issues-raised-by-the-use-of-web-crawling-and-scraping-tools-for-analytics-purposes
There should be a robots.txt file in the root folder of that site.
It specifies the paths that scrapers are forbidden to crawl and those that are allowed (with acceptable crawl delays specified).
If that file doesn't exist, anything is allowed, and you bear no responsibility for the website owner's failure to provide that info.
Also, here you can find some explanation of the robots exclusion standard.
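A minimal sketch of checking those rules from Python with the standard library's urllib.robotparser follows. The robots.txt content, the bot name my-stats-bot, and the example.com URLs are all hypothetical; a real bot would read the file from the site's root instead of a string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real one lives at <site root>/robots.txt.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask whether a given user agent may fetch a given URL.
allowed_stats = parser.can_fetch("my-stats-bot", "https://example.com/stats/teams")
allowed_private = parser.can_fetch("my-stats-bot", "https://example.com/private/data")

# Requested pause between requests in seconds, or None if not specified.
delay = parser.crawl_delay("my-stats-bot")

print(allowed_stats, allowed_private, delay)
```

For a live site, RobotFileParser also offers set_url() and read() to download the file itself; the string-based version above just keeps the example self-contained.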

Has question answering (QA) been used with Freebase as a knowledge base?

Has there been any prior work done on question answering machines using Freebase as a knowledge base? I searched for this on the web but couldn't get anything substantial. Does anyone know of any work around this area where the input is an unstructured question and the QA engine leverages Freebase to provide answers?
Take a look at the papers on Question Answering in our Mendeley group to see how people are using Freebase data to do question answering. There's a paper in there that covers the IBM Watson project that Tom mentions.
I also made a little question answering demo on FreebaseApps.com that you can try here:
http://answers.freebaseapps.com/?q=what+is+the+population+of+paris
This doesn't sound like a programming question, but IBM's Jeopardy-playing Watson reportedly used Freebase (among many other sources of information) and TrueKnowledge in the UK uses it as one input.

Programming podcast recommendations [duplicate]

Possible Duplicate:
What good technology podcasts are out there?
Travelling an hour each way to work in the car I have taken to juicing my ears and brain by listening to podcasts. Currently this includes .NET Rocks, The Java Posse and (for the few I could find) Hacker Medley. Could anyone recommend any others?
My background is .NET and Java, and I write some Android apps in my spare time. Any recommendations along these lines would be very cool.
You might want to check out Hanselminutes from Scott Hanselman.
Even though they're no longer recording new episodes, a pretty good one was Stack Overflow's own podcast.
I also like FLOSS Weekly.
You can check out IT Conversations Networks. They have a variety of podcasts listed.

What's the best way to facilitate collaboration within a software org? [closed]

We have an org of around 300 people, and certain people are very good at sharing articles, tips, blogs, etc., but it usually happens within sub-teams (5-15 people). What's the best way to scale this up to foster a culture of collaboration across a larger set of folks?
Post to a central wiki instead of emailing links?
Reward contributors and encourage bottom-up, organic collaboration?
"Force" collaboration top down?
You have to create a culture in which sharing is rewarded.
Post to central WIKI instead of email links.
Reward contributors and encourage bottom up organic collaboration
"Force" collaboration top down. By "force" you mean reward and encourage.
You must do all of this. And more.
You must teach collaboration.
You must ensure that all managers value and reward collaboration.
You must measure collaboration.
Even then, you'll probably have to do even more.
Good answer by S.Lott.
I'd add: You need to make sure people can easily find things when they need them. That's partly cultural - do people think to look at the wiki, and do they know where to look. It's also about the wiki's structure & quality:
Is it easy to navigate & search?
Is it kept up to date?
How does it mesh with other documentation (eg javadoc)?
From my experience, forced = hated. So you have to make people want to use it, ie make it useful. A central Wiki sounds like the best solution, but it's hard to say. You might want to look into MediaWiki, Traq, or Sharepoint Services (not to be confused with Office Sharepoint).
Your organization may find it encouraging if you post a list of the top contributors, editors, or visitors to the site. But that depends on how your org perceives competition.
I wouldn't suggest a central wiki for collaboration (aside from internal specific stuff). But for sharing information found online you should encourage people to use one of the many existing systems for this. Google Reader has a really nice sharing and commenting mechanism. Delicious would also be a good fit for what you want.
There's no reason to try to create a walled garden inside your organization for content that is being created outside of it. The system you create will not be as good as the ones that already exist and that will kill adoption.

What are some good computer science resources for a blind programmer?

I'm a totally blind individual who would like to learn more of the theory aspect of computer science. I've had an intro data structures class and the general intro programming but would like to learn more on things such as software design, advanced data structures, and compiler design. I want to do this as a self study course not as part of college classes.
Unfortunately there aren't many textbooks on computer science available from Recordings for the Blind and Dyslexic, where I normally get my textbooks. I would appreciate any electronic resources, preferably free, that could help me get more of a computer science education, rather than the newest language or platform that a lot of programming sites appear to focus on.
You might find the Experiences of a Blind Computer Scientist a good read.
MIT's Open Courseware would be a good resource for you with the amount of videos/audio they have.
Really though, for the core computer science topics I find it pretty hard to beat some of the better textbooks out there. Some offer digital versions of their book with purchase and some don't. For those that don't, I would just purchase the book and then download an equivalent digital e-book via a torrent site. Since you already own the book I don't think this would be a major problem.
UC Berkeley has a couple of computer science courses online for free as mp3 and video files (including an RSS feed for each course). And if reading PDF files isn't an issue you could check out O'Reilly's Safari.
The textbook for Structure and Interpretation of Computer Programs appears to be accessible. Software Engineering Radio is a good podcast that I listen to, but recently it has focused a lot on model-driven development and UML, which doesn't interest me.
The UC Berkeley lectures are of varying quality; as with all college classes, it depends on the professor. I've found I can follow along with the cs162 lectures fine but not so much with cs61b. Part of this is because of the professor and part is probably because 61b is more math heavy, since it's a data structures class. Unfortunately the RSS feeds are useless since the file names are meaningless. I used my podcatcher to download the entire lecture series, then used the converting capability of foobar2000 to rename the files with their track number so I could listen to them in order.
I've used Safari at work before and it is accessible, although too expensive for me to get a yearly subscription.
Open Courseware appears to have a lot of good stuff. Unfortunately I don't use iTunes, so instead of downloading each mp3 file individually I used the Firefox extension DownThemAll! with a custom filter to grab all the mp3 files at once from the specific course I wanted.
Another series of books that looks useful is the data structures books by Bruno R. Preiss, several of which are available online at
http://www.brpreiss.com/books/opus5/
Some of the equations are represented as graphics but I can often tell what the general idea is by context.
I wonder would the Structure and Interpretation of Computer Programs video lectures by Hal Abelson and Gerald Jay Sussman be of any use?
If the audio content is enough on its own without the video, they are an excellent digital resource.
The podcast "software engineering radio" is excellent. Though not CS courseware, it is the most academic and intellectually stimulating podcast I have found about software development and computer science.
http://www.se-radio.net/
Personally, I am just blown away by the questioner. I mean, the challenge of programming alone is too much for most people, but doing it without the primary sense used in the task is amazing to me. What is ironic, though, is that I bet that even with this challenge the questioner is still FAR more adept at most CS tasks than the people I work with day to day. Just saying.
I'm also a totally blind programmer, currently working for Microsoft. The most valuable resource for technical books is Safari (safari.oreilly.com). You can read thousands of computer science texts there. If you're in the USA, you can also get many of those titles for free from BookShare (www.bookshare.org). In both cases graphical images will be an issue, but there's no easy solution for that. Most good books have enough descriptive text that one can manage without the diagrams.
I too am a new blind programmer! I only lost my vision 5 years ago. Anyway, I have been programming in Visual Basic 2008 throughout the past year. It turned out to be more accessible than I had at first suspected.
I start a Java class next semester and the required text is a free online text! It is posted below.
Introduction to Programming Using Java, Fifth Edition
http://math.hws.edu/javanotes/
Can some of you seasoned blind programmers share with us any blogs or websites where other blind programmers can be found?
Check out this Stack Overflow question about podcasts.
A language called Quorum is a lot like Python but optimized across a few more syntactic details, and the corresponding development environment is designed with the blind in mind. https://quorumlanguage.com/ This might fit especially well with the use case where most students are using Python.
A 2016 blog about CS education (actually a response to a blog post) points to:
The program-l discussion board for blind programmers at https://www.freelists.org/list/program-l
The EPIQ conference for blind and other programmers interested in Quorum: https://quorumlanguage.com/epiq.html
Also, see other ideas in a similar question on another SO site: https://cseducators.stackexchange.com/questions/3441/teaching-a-blind-high-school-student
