Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I was trying to import the Freebase RDF dump into Google Refine but got an error. How can I extract topic names with their notable type from the 18 GB RDF dump to CSV or a similar format? Is there any GUI tool for this?
146 GB is too big for OpenRefine (formerly Google Refine) to handle. If there is a GUI tool that will do this out of the box, I'm not familiar with it, but since this is a programming Q&A site, I'll give a shell-programming solution. You don't need to know anything about Linux, but you do need to know how to use Unix shell commands (on Windows you could use Cygwin).
curl -L http://download.freebaseapps.com | gunzip | egrep 'notable_for|notable_type|rdfs:label'
will give you all the raw data that you need to assemble the solution. The lines with the key information look like this; if you just want labels/names, you'll need to substitute the labels for the subject/object IDs in the first and last columns.
ns:m.01nsxs2 ns:common.topic.notable_types ns:m.0kpv17.
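If you would rather finish that label-substitution step in R (which other questions on this page use) than in the shell, a rough sketch could look like the following. It assumes the egrep output above was saved to a file called notable.nt (a name chosen here just for illustration) and that each line is a whitespace-separated triple like the one shown.
# Rough sketch: join topic labels onto the notable-type triples.
# "notable.nt" and "notable_types.csv" are placeholder file names.
lines <- readLines("notable.nt")
parts <- strsplit(lines, "[ \t]+")
subj  <- vapply(parts, function(p) p[1], character(1))
pred  <- vapply(parts, function(p) p[2], character(1))
obj   <- vapply(parts, function(p)
                sub("\\s*\\.$", "", paste(p[-(1:2)], collapse = " ")),
                character(1))

# Build an ID -> label lookup from the rdfs:label triples
is_label <- pred == "rdfs:label"
labels   <- setNames(obj[is_label], subj[is_label])

# Substitute labels for the IDs in the notable-type triples and write a CSV
is_type <- pred == "ns:common.topic.notable_types"
out <- data.frame(topic        = labels[subj[is_type]],
                  notable_type = labels[obj[is_type]],
                  stringsAsFactors = FALSE)
write.csv(out, "notable_types.csv", row.names = FALSE)
Even filtered, a dump this size is too large to readLines() in one go, so in practice you would process it in chunks (or pre-split it with the shell first), but the join logic stays the same.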
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I have many Word document files in which I need to change a few letters that are common to all the files. I want to write a script to replace the text in all the files at once. I am using a Windows machine and R is installed. Please suggest another approach if you have one.
These are Microsoft Word document files stored in one folder. I have code that reads the data into an R list and finds and replaces the text, but it writes the file back out with changed formatting.
Please suggest a better way.
I think you are asking whether a package exists that can deal with MS Word files? Yes, it does. It is on CRAN: https://cran.r-project.org/web/packages/officer/index.html.
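A minimal sketch of what that could look like with officer, assuming all the .docx files sit in one folder and you want to replace a single string everywhere (the folder name and the old/new strings below are placeholders):
# Sketch: replace a string in every .docx file in a folder using officer.
# "docs", "oldtext" and "newtext" are placeholders.
library(officer)

files <- list.files("docs", pattern = "\\.docx$", full.names = TRUE)

for (f in files) {
  doc <- read_docx(f)
  doc <- body_replace_all_text(doc, old_value = "oldtext",
                               new_value = "newtext", fixed = TRUE)
  print(doc, target = f)  # overwrites in place; point target at a copy to keep the originals
}
Because officer edits the existing document and writes it back out rather than regenerating it, this should leave the rest of the formatting untouched, which seems to be the problem you ran into.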
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I am running into a bit of a dilemma and I was hoping to be pointed in the right direction.
I have a Git repository with two (self-explanatory) folders: scripts and data. I keep adding new data files to analyze in data, while in scripts I write R scripts to analyze those files.
I track changes in both folders, so I commit additions of new data files to data. This isn't really about tracking changes, though; I just want the scripts and the data to move together, since I work on at least two machines.
I feel like I am using Git improperly, as (with respect to the data folder) I basically use it as a syncing tool.
So my question: is it bad habit to use Git also for data?
I don't think you are doing something particularly awful. Perhaps you could keep data on its own branch and then use it as a submodule or subtree?
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I am writing R code with many different functions that I will eventually want to use together on different data sets.
As I keep building functions, it seems to be getting harder to keep track of everything in my script.
My question is: is it proper R practice to break the functions into separate R scripts, or should it all be in one massive script?
Thank you for your help. This is my first time trying to code something this large!
-B
Yes, you can store your functions in multiple R scripts.
If you need to call them, you can use source().
For example, say you have func1 and func2 saved in myfunc.R. To call them:
source('myfunc.R')
# other code
func1()
func2()
Whether this approach is recommended depends on your project's requirements.
Alternatively, you can consider packaging them as recommended by Richard.
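If the functions end up spread over many files, a common pattern is to source an entire folder of scripts at once; a small sketch (the folder name "functions" is just an example):
# Sketch: source every .R file in a "functions" folder (folder name is an example).
for (f in list.files("functions", pattern = "\\.R$", full.names = TRUE)) {
  source(f)
}
Keeping one function (or one related group of functions) per file makes them easier to find, and moving to a package later is then mostly a matter of copying those files into the package's R/ directory.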
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Is there a way to remove some unneeded locales to reduce the size of Qt Core?
You'll need to be more specific about what your application requires. Regardless, I'd recommend reading through this thread on the interest mailing list, as it has some interesting information regarding slimming Qt Core. In particular, you can reduce the size of ICU:
I'll leave it for others to pass comment on the standard configure options and size, but if you're really desperate for every last saving then removing the locales you don't need can save you 230 KB (on Linux 64-bit it reduces my default release build from 5.5 MB to 5.2 MB), but it's a manual process:
1. Download http://unicode.org/Public/cldr/24/core.zip and unzip it.
2. Run "../path/to/qt5/qtbase/util/local_database/cldr2qlocalexml.py core/common/main >> qlocale.xml"
3. Edit qlocale.xml to remove all the locales you don't need: remove only the individual locale entries and nothing else. I suggest you always keep C and en_US in addition to the locales you require.
4. Run "../path/to/qt5/qtbase/util/local_database/qlocalexml2cpp.py qlocale.xml ../path/to/qt5/qtbase/"
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 7 years ago.
I am writing a program that needs a list of English words as a source file for it to work. I realise that these source files are available for students writing games such as Hangman or crossword solvers, but I am having trouble locating such a source file and wonder if anyone knows how I can obtain one without slowly scraping websites and building up a dictionary manually.
What about /usr/share/dict/words on any Unix system? How many words are we talking about? Like OED-Unabridged?
For an English dictionary .txt file, you can use Custom Dictionary.
You can also generate a list with aspell or wordlist using your own settings.
Also take a look at:
http://wordlist.sourceforge.net/
English words only (about 350,000 words): http://www.math.sjsu.edu/~foster/dictionary.txt
Very late, but might be useful for others.
There's also WordNet. Its data file formats are well-documented.
I used it for building an embeddable dictionary library for iOS developers (www.lexicontext.com) and also in one of my apps.
For future searchers: you can use aspell to do the dictionary checks; it has bindings in Ruby and Python. It would make your job much simpler.