For-Profit Dictionary/Synonyms/Antonyms Dataset - dictionary

So, I'm going to be making an application, but for one of the features to work, I'll need to be able to look up Definitions, Antonyms, and Synonyms for words.
The kicker here is that I'll need one that can be used for-profit, as I plan to make money from the application.
Any idea where I can find a dataset that matches my needs?

WordNet may be close to what you are looking for, and it can be used commercially.
http://wordnet.princeton.edu/
and the wiki article:
http://en.wikipedia.org/wiki/WordNet

You can use WordsAPI and buy its data set. According to the description on their website:
Purchase of the Words API data set entitles you to use the data as much as you want, for as long as you want...The only things you cannot do with the data are resell it, or use it in a service that competes directly with WordsAPI.
Also, you can download sample data (10% of the data set) for free.

How to only display traffic from particular sources in a Google Analytics dashboard widget

I can't believe this is as difficult as I'm finding it to be, so I must be missing something obvious!
I want to track data from two particular ads, one on TheSixFifty.com and one in the Mountain View Voice email newsletter. I've gotten as far as identifying these two sources in a table:
https://imgur.com/a/ljbeonT
I want to only display those two sources, so I thought a filter would be the way to do that, set up like so:
https://imgur.com/p7lBxnk
But that results in this sad, sad empty table:
https://imgur.com/hOzdOdu
Please tell me what I am doing wrong! Does "containing" not mean what I think it does? Help!
You're right - it is something simple!
Your filter contains an AND statement, so it will only show data where the source contains BOTH mv-voice.com and TheSixFifty.com.
Your filter should look like:
Only show Source Matching RegEx:
(TheSixFifty|mv-voice)\.com
Here's a great intro to regular expressions from Robbin Steif's guide; they'll be incredibly useful for any analysis.
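To see why the AND filter comes back empty while the regex does not, here is a minimal sketch using Python's `re` module. It is not Google Analytics' own matching engine, and the source values are made up for illustration, so treat it as a way to test the pattern rather than a reproduction of the dashboard.

```python
import re

# Hypothetical source values, as they might appear in the Source column.
sources = ["TheSixFifty.com", "mv-voice.com", "google", "facebook.com"]

def matches_both(source):
    """Two 'containing' filters combined with AND: both must be present."""
    return "mv-voice.com" in source and "TheSixFifty.com" in source

# The regex from the answer: either name, followed by a literal ".com".
pattern = re.compile(r"(TheSixFifty|mv-voice)\.com")

and_result = [s for s in sources if matches_both(s)]
regex_result = [s for s in sources if pattern.search(s)]

print(and_result)    # []
print(regex_result)  # ['TheSixFifty.com', 'mv-voice.com']
```

No single source value contains both names, so the AND version can never match; the alternation matches a row as soon as either name is present.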

GraceNote - generate playlist with music of a given country

I would like to use GraceNote to generate playlists which contain songs likely to appeal to, or at least be known to, residents of a given country, e.g. Japan, Korea, Turkey, Brazil, France ...
They don't necessarily have to be in the local language, as I don't think that I can do that with GraceNote (can I?), but local artists would be nice. Is there any way, for instance, to query and generate a playlist using artist origin?
I realize that something like Gangnam Style might be known in most countries ;-) and that playlist generation is inexact when used this way, but I would be happy with a 70 or 80% "I know that song" reaction.
Can it be done? If so, how? #cweichen, can you help?
It seems likely you are referring to the Rhythm API. As you can probably see from the function definition, you cannot create a playlist using 'ARTIST_ORIGIN'.
The closest thing I can think of is creating a playlist (aka radio station) using a popular song in the given country as a seed.
You may try configuring the 'focus_similarity' value to get a wider variety of songs. This is just a suggestion, and I am not sure it will get you what you're looking for.
*Pygn currently does not support 'focus_similarity' configuration but it should not be too difficult to add yourself.

Scrape all google search result for a specific name

I think this question has been answered here before, but I could not find the topic. I am a newbie in web scraping. I have to develop a script that will take all the Google search results for a specific name. It will then grab the related data for that name, and if more than one is found, the data will be grouped according to their names.
All I know is that Google has some kind of restriction on scraping. They provide a Custom Search API. I have not used that API yet, but I am hoping to get all the result links for a query from it. However, I do not understand what the ideal process would be for scraping the information from those links. Any tutorial link or suggestion is very much appreciated.
You should have said a bit more about what you have been doing; it does not sound like you even tried to solve it yourself.
Anyway, if you are still on it:
You can scrape Google in two ways: one is allowed, one is not.
a) Use their API, you can get around 2k results a day.
You can up it to around 3k a day for 2000 USD/year. You can up it more by getting in contact with them directly.
You will not be able to get accurate ranking positions from this method. If you only need a low number of requests and are mainly interested in getting some websites for a keyword, this is the choice.
Starting point would be here: https://code.google.com/apis/console/
b) You can scrape the real search results
That's the only way to get the true ranking positions, for SEO purposes or to track website positions. It also lets you get a large number of results, if done right.
You can Google for code; the most advanced free (PHP) code I know is at http://scraping.compunect.com
However, there are other projects and code snippets.
You can start off at 300-500 requests per day, and this can be multiplied by using multiple IPs. Look at the linked article if you want to go that route; it explains it in more detail and is quite accurate.
That said, if you choose route b) you break Google's terms, so either do not accept them or make sure you are not detected. If Google detects you, your script will be banned by IP/captcha. Not getting detected should be a priority.
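For route a), a rough sketch of how the paged requests could be built in Python. It assumes the Custom Search JSON API endpoint and its `key`, `cx`, `q`, `start` and `num` query parameters; the API key and search engine ID are placeholders you would create in the console, and `build_search_urls` is a made-up helper name. The code only builds URLs and does not send anything.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def build_search_urls(query, api_key, engine_id, total=30, page_size=10):
    """Return one request URL per page of results for `query`."""
    urls = []
    # `start` is 1-based, so pages begin at 1, 11, 21, ...
    for start in range(1, total + 1, page_size):
        params = {
            "key": api_key,    # placeholder API key
            "cx": engine_id,   # placeholder search engine ID
            "q": query,
            "start": start,
            "num": min(page_size, total - start + 1),
        }
        urls.append(f"{API_ENDPOINT}?{urlencode(params)}")
    return urls

urls = build_search_urls("John Smith", "YOUR_API_KEY", "YOUR_ENGINE_ID", total=25)
for url in urls:
    print(url)
```

As far as I know, each JSON response carries its results in an `items` list whose entries have a `link` field, so fetching these URLs and collecting `link` values would give you the result links to process further. Check the API documentation for the per-query result cap before relying on large `total` values.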

Website Layout Statistics

I have a client who has suggested laying out a long list of categories in a custom order. The order is to be decided by them based on product items they sell the most etc.
I tend to disagree and feel that people browsing the internet prefer to search lists of categories that are in alphabetical order or sorted by something they can use as a reference, such as a date.
I would like to know others' thoughts on this, and it would be appreciated if anyone could point me in the direction of any open source surveys that have been taken in this area.
Thanks
Ben
What a silly stance to take regarding a simple customer request. Allow for both orderings, and other ones too. There is no survey that will demonstrate that the client is wrong as they are - by definition - correct.
Code that allows for different orderings has greater utility anyway, and real user data will be able to show them which - if either - should be the default.
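As a minimal sketch of what "allow for both orderings" could look like, here is one way to do it in Python. The category names, the `units_sold` field and the ordering names are all invented for illustration; the client's custom order could be any other sort key.

```python
# Hypothetical category data: name plus units sold.
categories = [
    {"name": "Shoes", "units_sold": 120},
    {"name": "Accessories", "units_sold": 45},
    {"name": "Jackets", "units_sold": 300},
]

# Each ordering is a sort key plus a flag for descending order.
ORDERINGS = {
    "alphabetical": (lambda c: c["name"].lower(), False),
    "best_selling": (lambda c: c["units_sold"], True),
}

def order_categories(items, ordering="alphabetical"):
    """Return a new list sorted by the named ordering."""
    key, reverse = ORDERINGS[ordering]
    return sorted(items, key=key, reverse=reverse)

alphabetical = [c["name"] for c in order_categories(categories)]
best_selling = [c["name"] for c in order_categories(categories, "best_selling")]

print(alphabetical)  # ['Accessories', 'Jackets', 'Shoes']
print(best_selling)  # ['Jackets', 'Shoes', 'Accessories']
```

Adding another ordering is one more entry in the dictionary, and switching the default is a one-word change once real usage data is in.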

Using Yahoo! Pipes

Have you used pipes.yahoo.com to quickly and easily do... anything? I've recently created a quick mashup of StackOverflow tags (via rss) so that I can browse through new questions in fields I like to follow.
This has been around for some time, but I've just recently revisited it and I'm completely impressed with its ease of use. It's almost to the point where I could set up a pipe and then give a client privileges to go in and edit feed sources... and I didn't have to write more than a few lines of code.
So, what other practical uses can you think of for pipes?
It's nice for aggregating feeds, yes, but the other handy thing to do is filtering the feeds. A while back, I created a feed for Digg (before Digg fell into the Fark pit of despair). I didn't care about the overwhelming Apple and Ubuntu news, so I filtered those keywords out of Technology, which I then combined with Science and World & Business feeds.
Anyway, you can do a lot more than just combine things. If you wanted to be smart about it, you could set up per-subfeed and whole-feed filters to give granular or over-arching filtering abilities as the news changes and you get bored with one topic or another.
The one thing I have really used Y! Pipes for (rather than just playing around with it) is to clean up item titles, merge and finally de-dupe the feeds I got from querying multiple blog search engines with the same search term. This is something I’ve done in several very different contexts, e.g. for my own ego surfing, in another case for the planet site set up by some conference’s organisers to keep an eye on their conference’s buzz, etc. Highly recommended.
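For anyone who wants to see the clean-up, merge and de-dupe steps outside of Pipes, here is a rough Python sketch, with the keyword filtering from the Digg example thrown in. It is not how Pipes works internally: feed items are assumed to be plain dictionaries with `title` and `link` keys, duplicates are detected by link only, and the function names and sample items are made up.

```python
import re

def clean_title(title):
    """Collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", title).strip()

def merge_feeds(*feeds, blocked_keywords=()):
    """Merge feeds, drop items with a blocked keyword in the title, de-dupe by link."""
    blocked = [k.lower() for k in blocked_keywords]
    seen_links = set()
    merged = []
    for feed in feeds:
        for item in feed:
            title = clean_title(item["title"])
            if any(k in title.lower() for k in blocked):
                continue  # filtered out by keyword
            if item["link"] in seen_links:
                continue  # same article already seen in another feed
            seen_links.add(item["link"])
            merged.append({"title": title, "link": item["link"]})
    return merged

# Made-up items standing in for two parsed feeds.
feed_a = [
    {"title": "  New Ubuntu   release ", "link": "http://example.com/ubuntu"},
    {"title": "Mars rover update", "link": "http://example.com/mars"},
]
feed_b = [
    {"title": "Mars rover  update", "link": "http://example.com/mars"},
    {"title": "Markets rally", "link": "http://example.com/markets"},
]

result = merge_feeds(feed_a, feed_b, blocked_keywords=["ubuntu", "apple"])
print([item["title"] for item in result])
# ['Mars rover update', 'Markets rally']
```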
You can do tons of things with pipes. For example for sites like digg or reddit, you can make one to bypass the site and go directly to the linked article (rewriting the RSS).
I also like to filter webcomics' feeds to keep just the comics, and then mix them all into a single feed.
I've taken the liberty of copying your pipe and rearranging it a bit so that it's easier to add and remove tags:
Yahoo Pipe: StackOverflow Merge Tags
Tags are now listed in a string builder, so to add a tag you just have to hit the + button on the string builder and type in the tag preceded by a slash.
Well, pipes are real fast and useful.
Other effective uses might be:
1) combine many feeds into one, then sort, filter and translate it.
2) geocode your favorite feeds and browse the items on an interactive map.
3) power widgets/badges on your web site.
4) grab the output of any Pipes as RSS, JSON, KML, and other formats.
This is by no means a comprehensive list.
One of my favorite things to do with Yahoo! Pipes is to aggregate multiple craigslist feeds into a single feed. You can make a feed out of any category or search criteria on craigslist. I live in a university town and am always on the lookout for tickets to sporting events, for example. I have a half-dozen craigslist searches all being combined into a single feed via Yahoo! Pipes. This works a lot better for me than simply monitoring the entire "Tickets" category; it filters out most of the tickets I am not interested in. Yes, this is another aggregating feeds example, but the craigslist usage is quite valuable with the ability to aggregate feeds that are themselves based upon searches.
I've used Pipes to translate blogs into English. I would have liked to use it to fetch the full text for blogs which only provide a summary of the content in the feed, but unfortunately they don't provide any input which fetches the content from a parameterizable source :-(.
Just stumbled on this while looking for ways to connect Excel to Pipes. A bit necromancer-ish, but here goes.
One thing I've done is take an HTML page (science data) which has links to tons of CSV files for a bunch of Army Corps measurement stations. Each station has a big table of datafiles, all organized individually by month and year. I use YQL to parse out and organize the links to the individual CSV files in a way that Pipes can read them. Then, I use that as input into a Pipe, which has a user input for "Station" and "Date."
Using this, I can go to the Pipes page, type in those values and get the values only for a specific station and date, rather than have to find the station on a website, find the year and month in a big table, click the link, open the CSV file, and find the values for a day within that month's worth of data. I can even change the pipe to specify the hour, and the parameter, and then get a single value returned.
Now, I wish I could figure out how to program Excel so that I can use "=yahoo_function(station, datetime)" to place that value automatically into a cell given the values of other columns!
