Freebase 'alias' dataset dump - freebase

Does a dataset dump file for Freebase's aliases exist (e.g. for 'Sports' topic - sport teams' aliases, athletes' aliases and so on)?
I know I could get the aliases for a specific object via an HTTP request, but I just need a list of the aliases and do not want to issue 2.5 million (or more) HTTP requests.

All aliases are included in the quad & RDF dump files.
http://wiki.freebase.com/wiki/Data_dumps
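For what it's worth, here is a rough sketch of how the aliases could be pulled out of the RDF dump offline. It assumes the dump is a gzipped file with one tab-separated triple per line and that aliases sit under the common.topic.alias predicate; the function names and the dump path are made up for illustration.

```python
import gzip
from typing import Iterator, Optional, Tuple

# Assumption: aliases are stored under this predicate in the RDF dump.
ALIAS_PREDICATE = "<http://rdf.freebase.com/ns/common.topic.alias>"


def parse_alias_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (subject, alias literal) if the line is an alias triple, else None."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) < 3 or parts[1] != ALIAS_PREDICATE:
        return None
    return parts[0], parts[2]


def iter_aliases(dump_path: str) -> Iterator[Tuple[str, str]]:
    """Stream the gzipped dump line by line so it never has to fit in memory."""
    with gzip.open(dump_path, "rt", encoding="utf-8") as dump:
        for line in dump:
            parsed = parse_alias_line(line)
            if parsed is not None:
                yield parsed
```

Iterating over iter_aliases("path/to/dump.gz") (hypothetical path) yields one (subject, alias) pair at a time, which you could then restrict to the topics you care about.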

Related

How to get filtered list of files from SFTP server using SSHJ [duplicate]

I am using the SSHJ SFTP library to get a file list from an SFTP server.
The connection to the server is very slow and there are tens of thousands of files in the directory. Getting the file list often ends in various timeout or socket errors.
Is it possible to tell the client to retrieve the file list only for, e.g., ".zip" files, so that it has a positive impact on performance? Pseudo command: sftpClient.ls("*.zip")
I know there is a method List<RemoteResourceInfo> net.schmizz.sshj.sftp.SFTPClient.ls(String path, RemoteResourceFilter filter) which will filter the list, but from what I understand, the filtering happens only on the client side, i.e. the client still receives the whole file list and filters it only afterwards.
Is there any way to make the server return only the requested names? Does the SFTP protocol even support this?
Indeed, the SFTP protocol has no way to request a list of files matching some criteria. It does not matter which SFTP library you are using.
You would have to use another interface/API if you need a filtered list. If you have shell access, you might use the shell command ls *.zip.
Or build your own (REST?) API.
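To make the difference concrete, here is a small sketch (in Python rather than Java, and not tied to SSHJ). The first helper is what any client-side filter amounts to: the full list of names has already crossed the slow connection before it runs. The second only builds the kind of shell command mentioned above, for servers that also allow shell access. Both function names are hypothetical.

```python
import shlex
from fnmatch import fnmatchcase
from typing import Iterable, List


def filter_names(names: Iterable[str], pattern: str) -> List[str]:
    """Client-side filter: every name has already been transferred at this point."""
    return [name for name in names if fnmatchcase(name, pattern)]


def build_remote_ls(directory: str, suffix: str) -> str:
    """Shell command that lets the server do the matching (needs shell access)."""
    return f"cd -- {shlex.quote(directory)} && ls -1 -- *{shlex.quote(suffix)}"
```

The command string would be run over an SSH exec channel, not over SFTP, so only the matching names travel back.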

List available collections for database in ArangoDB using HTTP interface?

I am trying to use ArangoDB's HTTP interface to dump all collections belonging to a specific database.
I am able to view all available databases using the following command:
curl http://localhost:8529/_api/database
However, once I find a database name (for example, "test") I am unable to dump the collections belonging to this database. Ultimately, I would like to dump the collections for this database, and then all results within a chosen collection.
I have followed the documentation provided here: https://www.arangodb.com/docs/stable/http/general.html, however I am still unable to find the relevant documentation for this request.
You can get the list of all collections with
curl http://localhost:8529/_db/DBNAME/_api/collection
as is implied in this part of the documentation: https://www.arangodb.com/docs/stable/http/collection.html#address-of-a-collection
The rest of the interface works accordingly, e.g.
curl http://localhost:8529/_db/DBNAME/_api/collection/COLLNAME
to get information about a single collection (that is already included in the output of the first call).
You can find the complete Swagger documentation with just two clicks in the web interface.
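If you would rather script this than use curl, the same call can be sketched in Python. This assumes the server from the examples above (localhost:8529) without authentication, and it assumes the response body carries a "result" list whose entries have "name" and "isSystem" fields; the helper names are mine.

```python
import json
from typing import List
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8529"  # same host and port as the curl examples


def collections_url(db_name: str, base_url: str = BASE_URL) -> str:
    """URL that lists the collections of one database."""
    return f"{base_url}/_db/{quote(db_name, safe='')}/_api/collection"


def collection_names(payload: dict, include_system: bool = False) -> List[str]:
    """Pull collection names out of a decoded response body.

    Assumption: the body has a "result" list whose items carry "name" and "isSystem".
    """
    return [
        item["name"]
        for item in payload.get("result", [])
        if include_system or not item.get("isSystem", False)
    ]


def list_collections(db_name: str) -> List[str]:
    """Fetch and decode the collection list (no authentication is sent)."""
    with urlopen(collections_url(db_name)) as response:
        return collection_names(json.load(response))
```

Calling list_collections("test") would then return the non-system collection names of the "test" database, provided the server is reachable.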

What is the best means of securely delivering minion specific files using Salt?

I have a number of files that I need to transfer to specific minion hosts in a secure manner, and I would like to automate that process using Salt. However, I am having trouble figuring out the best means of implementing a host restricted transfer.
The Salt fileserver works great for non-host-specific transfers. However, some of the files that I need to transfer are customer specific, so I need to ensure that they are only accessible from specific hosts. Presumably Pillar would be the ideal candidate for minion-specific restrictions, but I am having trouble figuring out a means of specifying file transfers using pillar as the source.
As far as I can tell Pillar only supports SLS based dictionary data, not file transfers. I’ve tried various combinations of file.managed state specifications with paths constructed using various convolutions (including salt://_pillar/xxx), but thus far I have not been able to access anything other than token data defined within an SLS file.
Any suggestions for how to do this? I am assuming that secure file transfers should be a common enough need that there should be a standard means of doing it, as opposed to writing a custom function.
The answer depends on what exactly you're trying to secure. If only a part of the files involved are "sensitive" (for example, passwords in configuration files), you probably want to use a template that pulls the sensitive parts in from pillar:
# /srv/salt/app/files/app.conf.jinja
[global]
user = {{ salt['pillar.get']("app:user") }}
password = {{ salt['pillar.get']("app:password") }}
# ...and so on
For this case you don't need to care if the template itself is accessible to minions.
If the entire file(s) involved are sensitive, then I think you want to set up the file_tree external pillar, and use file.managed with the contents_pillar option. That's not something I've worked with, so I don't have a good example.
Solution Synopsis: Using PILLAR.FILE_TREE
A: On your master, set up a directory from which you will serve the private files (e.g. /srv/salt/private).
B: Beneath that create a “hosts” subdirectory, and then beneath that create a directory for each of the hosts that will have private files.
/srv/salt/private/hosts/hostA
/srv/salt/private/hosts/hostB
… where hostA and hostB are the ids of the target minions.
See the docs if you want to use node-groups instead of host ids.
C: Beneath the host dirs, include any files you want to transfer via pillar.
echo 'I am Foo!' > /srv/salt/private/hosts/hostA/testme
D: In your master config file (e.g: /etc/salt/master), include the following stanza:
ext_pillar:
  - file_tree:
      root_dir: /srv/salt/private
      follow_dir_links: False
      keep_newline: True
      debug: True
E: Create a salt state file to handle the transfer.
cat > /srv/salt/files/base/foo.sls << END
/tmp/pt_test:
  file.managed:
    - contents_pillar: testme
END
F: Run pillar refresh, and then run your state command:
salt hostA state.apply foo
Following the last step, hostA should have a file named /tmp/pt_test that contains the text “I am Foo!”.
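To illustrate why this keeps the files host-restricted, here is a toy model of the lookup. It is not Salt's implementation, only a sketch of the idea: each minion id is mapped to the files in its own directory under hosts/, and the file name becomes the pillar key. The function name and the newline handling are assumptions for illustration.

```python
from pathlib import Path
from typing import Dict


def host_pillar(root_dir: str, minion_id: str, keep_newline: bool = True) -> Dict[str, str]:
    """Map each file directly under <root_dir>/hosts/<minion_id> to a pillar-style key."""
    host_dir = Path(root_dir) / "hosts" / minion_id
    pillar: Dict[str, str] = {}
    if not host_dir.is_dir():
        return pillar  # a minion without a directory gets no private data
    for path in sorted(host_dir.iterdir()):
        if path.is_file():
            text = path.read_text(encoding="utf-8")
            # Illustrative stand-in for the keep_newline option
            pillar[path.name] = text if keep_newline else text.rstrip("\n")
    return pillar
```

With the layout from steps A to C, host_pillar("/srv/salt/private", "hostA") would contain the key testme, while hostB would see nothing from hostA's directory.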

How to read nginx logs in Symfony admin?

I want to read nginx access log files in the admin and filter the data by some parameters (URL, date, etc.). Are there any bundles that do this?

Simulating CMIS Atom API doesn't load the information properly

I was asked to simulate a CMIS Atom API for my company's content management using our API, but I'm stuck on what seems to be something simple. I'm trying to load the CMIS TCK, but for some reason the values from the responses don't make it into the next request, so I think I'm missing something.
The first request I get is to getRepositories
/cmisatom/getRepositories
Then I get the request to get a specific repository
/cmisatom/getRepositories?repositoryId=c9ad76c6-d121-4a32-bb14-e5d43bf91ee6
Which kinda tells me that the data from the first request was parsed properly.
The third request is where things get weird. I get the request for the id
/cmisatom/c9ad76c6-d121-4a32-bb14-e5d43bf91ee6/id?id=&filter=&includeAllowableActions=&includeACL=&includePolicyIds=&includeRelationships=&renditionFilter=
but no information about the id, the filter, or anything else was filled in. I'm matching the responses to an Alfresco CMIS Atom instance that I have running locally, so the response is identical except for the jsession. Can you share any guidance on this?
The steps go as follows.
The service document is the first thing to fetch; your example refers to it as "/cmisatom/getRepositories". It lists all repositories and their data. It also includes the repository URL templates such as OBJECT_BY_ID, TYPE_BY_ID, etc. That means that for navigation, listing folders, and so on, your link "/cmisatom/getRepositories?repositoryId=c9ad76c6-d121-4a32-bb14-e5d43bf91ee6" is not used.
The third link you're referring to looks like the URL template OBJECT_BY_ID. Here you have to provide the object id and populate the other parameters before you make a request.
The object id for the first such request is again a value that you obtain from the service document. This value is called the ROOT FOLDER ID.
Use the root folder id to fill in the object-by-id URL template and get the root folder details; from there you get the children and proceed further.
You can also look at the Apache Chemistry InMemory repository - https://chemistry.apache.org/java/developing/repositories/dev-repositories-inmemory.html - it is an open source implementation which can help you dig deeper.
And this is the spec: http://docs.oasis-open.org/cmis/CMIS/v1.1/CMIS-v1.1.html
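To show what "populating the template" means in practice, here is a minimal sketch of how a client might fill an object-by-id URL template. It assumes the template uses {name} placeholders, as in the example string below; the host, repository id, and root folder id are invented values, and the helper name is mine.

```python
import re
from typing import Mapping
from urllib.parse import quote

_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def fill_template(template: str, values: Mapping[str, str]) -> str:
    """Replace each {name} placeholder with its URL-encoded value (empty if not given)."""
    return _PLACEHOLDER.sub(
        lambda match: quote(str(values.get(match.group(1), "")), safe=""),
        template,
    )


# Hypothetical template and root folder id, as they might appear in a service document
template = (
    "http://localhost/cmisatom/repo1/id"
    "?id={id}&filter={filter}&includeAllowableActions={includeAllowableActions}"
)
url = fill_template(template, {"id": "root-folder-1", "includeAllowableActions": "true"})
```

If a placeholder name in the template does not match what the client expects, its value never gets substituted, so a request arriving with every parameter empty is worth comparing against the template your service document actually returns.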
