Is there anything out there that extracts information from unstructured text (news articles, books, etc.)? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
I have been trying to find a program that can extract information from unstructured text (news articles, books, etc.).
My eventual goal is to create a program that can take regular sentences and cache them in a database, much like Google does, but without all the duplicate information.
Let's take the NLTK example: "At eight o'clock on Thursday morning Arthur didn't feel very good."
The things that I would want extracted would be:
time: 8:00am
date: Thursday
person: Arthur
action: didn't feel good
Is there a program that can do this?
I have tried using NLTK, but I can't seem to find any good way to extract this information.

This problem is called fine-grained entity recognition. No, there are no tools (except for research work) that can add such semantics.
To start with, you can recognise Person and Time entities with appropriate models using an entity recogniser.
You can recognise the actions from sentence parsing, as suggested by @Junuxx.
Also give Wikify a try.
Thank you.
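As a rough illustration of what "recognise Person and Time, then take the rest as the action" could look like, here is a minimal rule-based sketch in plain Python. It is not NLTK and not a trained entity recogniser: the function name `extract_facts`, the lookup tables, and the heuristics (a capitalised word that is not sentence-initial is a person; everything after it is the action) are assumptions made for this one example sentence, and they will fail on most real text.

```python
import re

# Hypothetical lookup tables for this toy example only.
NUMBER_WORDS = {
    word: value
    for value, word in enumerate(
        "one two three four five six seven eight nine ten eleven twelve".split(),
        start=1,
    )
}
WEEKDAYS = {"Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"}


def extract_facts(sentence):
    """Pull time, date, person and action out of one simple sentence."""
    facts = {}

    # Time: "<number word> o'clock", with am/pm guessed from the part of day.
    clock = re.search(r"\b(\w+) o'clock\b", sentence, re.IGNORECASE)
    if clock and clock.group(1).lower() in NUMBER_WORDS:
        hour = NUMBER_WORDS[clock.group(1).lower()]
        part = re.search(r"\b(morning|afternoon|evening|night)\b",
                         sentence, re.IGNORECASE)
        suffix = ""
        if part:
            suffix = "am" if part.group(1).lower() == "morning" else "pm"
        facts["time"] = f"{hour}:00{suffix}"

    # Date and person: scan capitalised words from left to right.
    for word in re.finditer(r"\b[A-Z][a-z]+\b", sentence):
        if word.group() in WEEKDAYS:
            facts.setdefault("date", word.group())
        elif word.start() > 0 and "person" not in facts:
            # Crude assumption: first capitalised, non-initial word is a name,
            # and the remainder of the sentence is the action.
            facts["person"] = word.group()
            facts["action"] = sentence[word.end():].strip(" .")
    return facts


facts = extract_facts(
    "At eight o'clock on Thursday morning Arthur didn't feel very good."
)
print(facts)
```

A real pipeline would replace the regular expressions with a trained entity recogniser and a sentence parser, as the answer suggests; the sketch only shows the shape of the output being asked for.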

Related

Determining if certain courses are taken together frequently in a term. R analysis [closed]

Closed 16 days ago.
This post was edited and submitted for review 16 days ago and failed to reopen the post:
Original close reason(s) were not resolved
Assuming a table that gives student IDs, the courses they took (like English 101, Math 101, etc.), and the term each course was taken in (Fall 2022, Spring 2023, etc.), what is the best way to find out whether students tended to take certain courses together in the same semester?
What's the general term for this analysis?
Possibly similar to "R: Which products are bought together more frequently?", but that doesn't give the general term for this type of analysis.
This type of analysis is called "association rule mining". Some R packages that you can check out are "arules" and "arulesViz", which can be used to perform association rule mining on this type of data.
This Medium article is pretty helpful in explaining the concept.
Hope this helps! :)
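To make the idea concrete, here is a minimal sketch of the pair-counting step behind association rule mining, written in plain Python rather than with arules. The function name `course_pair_rules`, the row layout (student ID, term, course), and the sample rows are assumptions for illustration. Each (student, term) combination is treated as one "basket", and for every pair of courses the sketch reports support, confidence in both directions, and lift.

```python
from collections import Counter
from itertools import combinations


def course_pair_rules(enrollments, min_support=0.0):
    """Score course pairs taken together by the same student in the same term."""
    # One basket per (student, term): the set of courses taken that term.
    baskets = {}
    for student, term, course in enrollments:
        baskets.setdefault((student, term), set()).add(course)

    n = len(baskets)
    single_counts = Counter()
    pair_counts = Counter()
    for courses in baskets.values():
        single_counts.update(courses)
        # Sort so each pair is always counted under the same key.
        pair_counts.update(combinations(sorted(courses), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both courses
        if support < min_support:
            continue
        rules.append({
            "pair": (a, b),
            "support": support,
            "confidence_a_to_b": count / single_counts[a],
            "confidence_b_to_a": count / single_counts[b],
            # Lift above 1 means the pair co-occurs more than chance predicts.
            "lift": support / ((single_counts[a] / n) * (single_counts[b] / n)),
        })
    return sorted(rules, key=lambda r: r["lift"], reverse=True)


# Made-up sample rows: (student ID, term, course).
enrollments = [
    (1, "Fall 2022", "English 101"), (1, "Fall 2022", "Math 101"),
    (2, "Fall 2022", "English 101"), (2, "Fall 2022", "Math 101"),
    (3, "Fall 2022", "English 101"),
    (3, "Spring 2023", "Math 101"),
]
for rule in course_pair_rules(enrollments):
    print(rule)
```

This only covers pairs; packages such as arules also handle larger itemsets and prune the search efficiently, which matters once the course catalogue is large.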

Anything about How to Design Components mentioned in How to Design Programs? [closed]

Closed 3 years ago.
I've just started reading How to Design Programs (2nd edition) on htdp.org.
Several notes in this book mention a next volume called How to Design Components (e.g. the 3rd note in Part One); however, I can't find anything about this second volume on Google.
I'm wondering why it is so hard to find any information about the latter volume. Has it been finished? If not, how can I get information about the book?
The first author provides more information on his website:
We have decided to provide the draft of "How to Design Classes" (pdf)
on an "as is" basis for now. You are free to download and print it.

Data science project example [closed]

Closed 5 years ago.
Does anybody know of a GitHub repo with a complete, well-organized data science project? Preferably in Python. My hobby projects often get messy, with a mix of Python code and notebooks. A worked-out project is the best way to learn some new tricks.
Data Science is regarded a bit differently by different people, so you might consider focusing on what exactly you wish to learn.
But take a look at these:
https://github.com/bulutyazilim/awesome-datascience
https://www.kaggle.com/
The first one contains lots of relevant sources of information. The second is originally a competition site with a variety of ML problems, but it also contains past competitions (and datasets). They added a cool feature called "kernels", which are just code files that people publish and that you can learn from.

Any free database of English-Spanish words? [closed]

Closed 5 years ago.
I want to make a vocabulary trainer and I was thinking about the best way to do it. First I searched for translation APIs to use, to avoid having to build my own dictionary, but I found that most of them are paid, and the free ones have limitations.
So I think the best way is to make my own dictionary, which would also allow me to work offline, but I wonder if there is any free database of English-Spanish words so that I can avoid starting from scratch.
Do you know of any?
Thanks a lot!
You could try http://www.omegawiki.org/ as they claim this:
The aim of our project is to create a dictionary of all words of all languages, including lexical, terminological and ontological information. Our data is available in a relational database, as a result it is possible to use the data for many purposes.

Free source for yellow pages data? [closed]

Closed 7 years ago.
Is there a free source with basic yellow pages data (name, address, phone number)? I don't mind if it's out of date. I couldn't find anything with Google. To clarify, I'm looking for a data dump; I know I can just go to yellow pages.com or whatever for regular queries. As a last resort I'll probably scrape it.
This sort of data tends to be very expensive, so you're unlikely to find anyone offering a free directory. If someone does, it will probably be horribly out of date or have many duplicates.
In my previous job the company was looking at business directories - the main stumbling block was the cost of good, clean data.
Yeah, I'd recommend something like Yellabot.com: get the GOLD version if you can, and automate scraping the data. I don't know of anywhere that is going to give that data out for free, but if you're willing to pay for it, I'm sure there are companies that would sell the whole shebang for tens of thousands of dollars. If you do find it though, let me know, lol.
