I would like to access the user edit history for Freebase, to see which pages were edited most often over the past couple of years and by whom. I have downloaded the Freebase data dump, but I can't find any indication that it contains edit history or timestamps. Is this data part of the dump, or is it somewhere else? If it is part of the dump, which subject or predicate IDs would I need to search for?
This data is not currently available in the data dumps. There are currently over two billion facts in Freebase, so publishing a full edit history would be a lot of data. There is, however, a data dump of the deleted triples; combining the two would tell you which topics were edited the most, but not who edited them.
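For what it's worth, a rough sketch of the "combine the dumps" idea: counting deletions per subject in the deleted-triples dump. The file name, tab-separated layout, and the subject being in the first column are all assumptions here, so check them against the actual dump format.

    # Count how often each topic appears in the deleted-triples dump.
    # Assumptions: gzipped TSV, subject mid in column 0.
    import collections
    import gzip

    counts = collections.Counter()
    with gzip.open("deleted_triples.tsv.gz", "rt") as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            counts[fields[0]] += 1

    for mid, n in counts.most_common(20):
        print(mid, n)

That would give you a rough "most edited" ranking, since every deletion implies at least one edit.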
I am currently working on an EHR-like project with the Firebase Realtime Database as my backend. I encrypt sensitive data such as patient information before pushing it to the database, but I also have a page where the doctor searches through patients, and it is of course a massive mistake to pull all patients, decrypt them, and then search through them in my own code.
I have found some solutions such as Acra's searchable encryption, but I'm not sure whether there's a best-practice approach for what I'm looking for. (P.S. I need substring search within records: for example, if a patient is named John, typing just "hn" should still find John. I also need to be able to search by a variety of fields: phone, email, name, etc.)
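Not an answer, but to make the question concrete: the scheme Acra and similar tools build on is a "blind index", where keyed hashes of substrings are stored next to the ciphertext and the search runs over hashes instead of plaintext. A minimal sketch, with hypothetical names and naive key handling (a real deployment needs proper key management):

    # Blind-index sketch: store HMACs of each record's n-grams alongside
    # the encrypted record, then search by hashing the typed fragment.
    import hashlib
    import hmac

    INDEX_KEY = b"server-side-secret"  # assumption: never shipped to clients

    def ngrams(text, n=2):
        text = text.lower()
        return {text[i:i + n] for i in range(len(text) - n + 1)}

    def blind_index(value):
        # One truncated HMAC per n-gram; store these with the ciphertext.
        return sorted(
            hmac.new(INDEX_KEY, g.encode(), hashlib.sha256).hexdigest()[:16]
            for g in ngrams(value))

    def search_token(fragment):
        # Hash the fragment the same way; match records whose stored
        # index contains this token (e.g. an array-contains query).
        return hmac.new(INDEX_KEY, fragment.lower().encode(),
                        hashlib.sha256).hexdigest()[:16]

With n=2, typing "hn" produces the same token as the "hn" bigram stored for "John", so the substring match works without decrypting anything.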
I have a Firestore data structure with a document where all my followers can see my recentPosts, by querying the collection for documents whose users field contains the querying user's name, just like below.
My question is how to share someone else's post with my followers. Currently I duplicate the shared post into my recentPosts and into my separate collection of post documents. But what if the author deletes the post and it has been shared by a million users? I would have to delete all the shared copies. Is there a better solution?
Given your choice of data model, having to delete the duplicated posts is pretty much the normal solution. I also don't see this as problematic, given that:
You've already written the duplicate post to all these followers to begin with, so the delete is just another write.
Deletes and other writes are relatively uncommon in most applications. If not, consider whether you should really be duplicating the data to all followers.
You could choose to implement this with a global list of deleted posts that each client then reads. But at that point you're making the code that reads the data more complex in order to save on writes, which is typically not the best trade-off when using NoSQL databases.
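To make the fan-out delete concrete, here is a minimal sketch using the Firestore Admin SDK for Python. It assumes each duplicated copy lives in a per-user recentPosts subcollection and stores the original's id in an originalPostId field (both names are hypothetical), so a collection group query can find every copy:

    # Delete every duplicated copy of a post, in batches.
    from google.cloud import firestore

    db = firestore.Client()

    def delete_shared_copies(original_post_id, batch_size=500):
        query = (db.collection_group("recentPosts")
                   .where("originalPostId", "==", original_post_id)
                   .limit(batch_size))
        while True:
            docs = list(query.stream())
            if not docs:
                break
            batch = db.batch()
            for doc in docs:
                batch.delete(doc.reference)
            batch.commit()

Running something like this from a trusted environment (e.g. a Cloud Function triggered by the original post's deletion) keeps the clients simple.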
I am developing an app which presents a feed of posts and allows users to vote on these posts.
I want to prevent users from voting multiple times on a single post. To do that, I want to store a list of the IDs of posts already voted on, so that I can check it each time the user tries to vote.
What's the most efficient way of storing these post IDs if there's a chance of the user voting on up to thousands of posts within a year?
SQLite, Core Data, a plist, or NSUserDefaults?
Since you would also like to know how many people voted (I think), I would save it to a server (using SQLite to store it there).
Saving this on a user device seems redundant.
If you do want to store it on the device, I would advise Core Data.
It is too much information for NSUserDefaults or plists; they just don't feel like a good fit for this volume of data. And Core Data is essentially a more convenient layer on top of SQLite (for Swift usage).
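To illustrate the server-side option: a votes table whose composite primary key makes double voting impossible, sketched here with Python's sqlite3 (table and column names are assumptions):

    import sqlite3

    conn = sqlite3.connect("votes.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS votes (
            user_id INTEGER NOT NULL,
            post_id INTEGER NOT NULL,
            PRIMARY KEY (user_id, post_id)
        )
    """)

    def try_vote(user_id, post_id):
        try:
            with conn:
                conn.execute(
                    "INSERT INTO votes (user_id, post_id) VALUES (?, ?)",
                    (user_id, post_id))
            return True   # first vote on this post
        except sqlite3.IntegrityError:
            return False  # already voted

Counting votes per post is then a simple GROUP BY, which also answers the "how many people voted" part.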
I have a Plone site that holds a lot of data, and I would like to query the database for usage statistics; e.g. how many calendars have more than one entry, how many blogs per group have entries after a given date, etc.
I want to run the script from the command line... something like so:
bin/instance [script name]
I've been googling for a while now but can't find out how to do this.
Also, can anybody provide some help on how to get user-specific information? Information like last login time and items created.
Thanks!
Eric
In general, you can query the portal_catalog to locate content by searching various indexes. See http://plone.org/documentation/manual/developer-manual/indexing-and-searching/querying-the-catalog and http://docs.zope.org/zope2/zope2book/SearchingZCatalog.html for an introduction to the catalog.
In some cases the built-in indexes will allow you to do the query you want. In other cases you may need to write some Python to narrow down the results after doing an initial catalog query.
If you put your querying code in a file called foo.py, you can run it via:
bin/instance run foo.py
Within foo.py, you can refer to the root of the database as 'app'. The catalog would then be found at app.site.portal_catalog, where 'site' is the id of your Plone site.
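A minimal example script, assuming 'Plone' is the id of your site; the portal_type and date used here are placeholders for whatever you actually want to count:

    # foo.py -- run with: bin/instance run foo.py
    from DateTime import DateTime

    site = app.Plone                 # 'app' is bound by bin/instance run
    catalog = site.portal_catalog

    # Example: count items of a given type created after a given date.
    results = catalog(portal_type="News Item",
                      created={"query": DateTime("2011/01/01"),
                               "range": "min"})
    print("Matching items: %d" % len(results))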
Finding information about users happens via a separate API (for the Pluggable Auth Service). I'd suggest asking a separate question about that.
I've searched through the site and haven't found a question/answer that quite answers my question; the closest one I found was: Syncing objects between two disparate systems best approach.
Anyway, to begin: because there is no RSS feed available, I'm screen scraping a web page. The scraper fetches the page, pulls out all of the information I'm interested in, and dumps it into a SQLite database, so that I can query the information at my leisure without repeatedly fetching from the website.
However, I'm also storing various metadata about that data in the SQLite DB, such as: whether I have looked at the data, whether the data is new or old, and a bookmark into a chunk of data (think of it as a collection of unrelated data, where the bookmark is just a pointer to where I am in processing/reading said data).
So my current problem is figuring out how to update the local SQLite database with new and/or changed data from the website in a way that is effective and straightforward.
Here's my current idea:
Download the page itself
Create a temporary table for the parsed data to go into
Do a comparison between the official and the temporary table and copy updates and/or new information to the official table
This process seems kind of complicated, because I would have to figure out how to determine whether the data in the temporary table is new, updated, or unchanged. So I am wondering whether there isn't a better approach, or whether anyone has any suggestions on how to architect/structure such a system?
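For reference, here is a minimal sketch of step 3 in Python/SQLite, assuming a natural key column item_id and a single payload column content (both hypothetical):

    import sqlite3

    def merge_staging(conn):
        with conn:
            # New rows: in staging but not yet in the official table.
            conn.execute("""
                INSERT INTO items (item_id, content)
                SELECT s.item_id, s.content FROM staging s
                WHERE NOT EXISTS
                    (SELECT 1 FROM items i WHERE i.item_id = s.item_id)
            """)
            # Changed rows: same key, different content.
            conn.execute("""
                UPDATE items
                SET content = (SELECT s.content FROM staging s
                               WHERE s.item_id = items.item_id)
                WHERE EXISTS (SELECT 1 FROM staging s
                              WHERE s.item_id = items.item_id
                                AND s.content <> items.content)
            """)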
Edit 1:
I'm not sure where to put the additional information, in a comment or as an edit, so I'm going to add it here.
This expands a bit on the bookmarking metadata: the data source can create new data or additions to its current data, so one reason I was considering the temporary-table idea was to be able to determine whether a data source that has been "bookmarked" has any new data.
Is it really important to determine whether the data in the temporary table is new, updated, or unchanged? Do you really need to keep a history of the changes?
NO: don't use the temporary table; just mark your old records as old (with a timestamp), don't do updates, and simply insert your new data.
YES: your idea seems correct to me, but it all depends on how much data you need to process each time; I don't think it is feasible with a large amount of data.
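To make the "NO" branch concrete, a sketch of the append-only approach: never update, just timestamp every insert, so the newest row per item_id is the current version (column names are hypothetical):

    import sqlite3
    import time

    conn = sqlite3.connect("scrape.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS items (
            item_id    TEXT NOT NULL,
            content    TEXT,
            fetched_at INTEGER NOT NULL
        )
    """)

    def insert_snapshot(rows):
        now = int(time.time())
        with conn:
            conn.executemany(
                "INSERT INTO items (item_id, content, fetched_at) "
                "VALUES (?, ?, ?)",
                [(item_id, content, now) for item_id, content in rows])

    # Current version of every item: the most recent fetch per item_id.
    CURRENT_SQL = """
        SELECT item_id, content FROM items
        WHERE fetched_at = (SELECT MAX(fetched_at) FROM items i2
                            WHERE i2.item_id = items.item_id)
    """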