Find out whether the content of a PDF is indexed or not - alfresco

How can I find out whether the content of a file is indexed or not in
Alfresco?
I am using Alfresco 5.0.d.

You can hit the Solr web app directly. First, figure out what your node's sys:node-dbid is. You can get that from the node browser in the admin console.
Suppose it is 6834, for example. You can then go to:
https://localhost:8443/solr4/alfresco/select?q=DBID:6834&wt=json
Which will return:
{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"DBID:6834","wt":"json"
    }
  },
  "response":{
    "numFound":1,
    "start":0,"docs":[{
      "id":"_DEFAULT_!800000000000000b!8000000000001ab2",
      "_version_":0,
      "DBID":6834
    }]
  }
}
That is the response when the node has been indexed. If it has not been indexed, numFound will be 0.
This assumes that the client certificate Solr and Alfresco use to identify each other has been added to your browser.
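The same check can be scripted. The following is only a minimal sketch, assuming the response has the shape of the sample above; the helper names buildDbidQueryUrl and isNodeIndexed are hypothetical, and the actual HTTPS request is left out because it depends on how the client certificate is supplied.

```typescript
// Subset of the Solr select response used here (shape taken from the sample above).
interface SolrSelectResponse {
  response: {
    numFound: number;
    docs: Array<{ DBID: number }>;
  };
}

// Hypothetical helper: builds the select URL for one sys:node-dbid value.
// The colon in "DBID:6834" is percent-encoded, which Solr decodes back.
function buildDbidQueryUrl(coreUrl: string, dbid: number): string {
  const query = encodeURIComponent(`DBID:${dbid}`);
  return `${coreUrl}/select?q=${query}&wt=json`;
}

// Hypothetical helper: true when the response body reports a document for this DBID.
function isNodeIndexed(responseBody: string, dbid: number): boolean {
  const parsed = JSON.parse(responseBody) as SolrSelectResponse;
  return (
    parsed.response.numFound > 0 &&
    parsed.response.docs.some((doc) => doc.DBID === dbid)
  );
}
```

Calling buildDbidQueryUrl("https://localhost:8443/solr4/alfresco", 6834) gives the URL to request; the returned body can then be passed to isNodeIndexed.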

Related

Security rules for nested subdocuments with firebase

I am trying to create a text editor app using firebase that allows users to create documents, but they can also nest a new document inside an existing document (when editing a document, they would be able to click on a button that would add a new document in the database and insert a link in the editor that redirects towards this page):
A user would be able to share a document with other users, but then they should have access to all the nested documents as well. So now I am wondering how to write the security rules to do that.
I think the best way to structure the realtime database would be to store all documents at the root, and then add a parentDocument or path property to each document:
{
  "documents": {
    "doc-1": {
      "title":"Lorem ipsum",
      "content": "...",
      "path":"/",
      "owner":"user-1",
      "canAccess":{
        "user-3":true
      }
    },
    "doc-2": {
      "title":"Dolor sit",
      "content": "...",
      "path":"/doc-1/",
      "owner": "user-1",
      "canAccess": {
        "user-2":true
      }
    }
  },
  "users": {
    "user-1": { ... },
    "user-2": { ... },
    "user-3": { ... }
  }
}
In the example above:
doc-2 is nested inside doc-1
user-1 can access both doc-1 and doc-2
user-2 can access doc-2 only
user-3 can access both doc-1 and doc-2
But now I do not know how to manage the security rules, because to check if a user has access to a specific document, I guess it would need to go through each of its parents (using its path or parentDocument prop). Perhaps I could also specify the canAccess prop on each document, but then I would have to update each nested document whenever a parent's canAccess prop is updated...
Any help would be greatly appreciated

In the Firebase Realtime Database model, permission automatically cascades downwards. This means that once you grant a user (read or write) permission on a specific path, they can also access all data under that path. You cannot revoke that permission at a lower level.
So your requirement actually matches really nicely with this model, and I'd recommend just trying to implement it and reporting back if you run into problems.
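For reference, the inherited-access behaviour the question asks for can be written down as a plain function. This is only a sketch of the intended semantics, not Firebase security rules: it assumes the flat structure from the question, assumes that path lists every ancestor ID separated by slashes (for example "/doc-1/doc-5/"), and uses hypothetical names (DocRecord, ancestorIds, userCanAccess).

```typescript
// One entry under "documents", as in the question's example.
interface DocRecord {
  path: string; // "/" for a root document, "/doc-1/" for a child of doc-1
  owner: string;
  canAccess?: Record<string, boolean>;
}

// Splits "/doc-1/doc-5/" into ["doc-1", "doc-5"]; "/" yields an empty list.
function ancestorIds(path: string): string[] {
  return path.split("/").filter((segment) => segment.length > 0);
}

// A user may open a document if they own, or are listed on, the document
// itself or any of its ancestors.
function userCanAccess(
  documents: Record<string, DocRecord>,
  docId: string,
  userId: string
): boolean {
  const doc = documents[docId];
  if (doc === undefined) return false;
  const chain = [...ancestorIds(doc.path), docId];
  return chain.some((id) => {
    const candidate = documents[id];
    if (candidate === undefined) return false;
    return candidate.owner === userId || candidate.canAccess?.[userId] === true;
  });
}
```

With the example data, user-3 (listed on doc-1) reaches doc-2 as well, while user-2 (listed on doc-2 only) does not reach doc-1.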

How to store keywords in firebase firestore

My application uses keywords extensively. Everything is tagged with keywords, so whenever a user wants to search for data or add data, I have to show keywords in an autocomplete box.
As of now I am storing keywords in a separate collection, as below:
export interface IKeyword {
  Id:string;
  Name:string;
  CreatedBy:IUserMin;
  CreatedOn:firestore.Timestamp;
}
export interface IUserMin {
  UserId:string;
  DisplayName:string;
}
export interface IKeywordMin {
  Id:string;
  Name:string;
}
My main document holds an array of keywords:
export interface MainDocument{
  Field1:string;
  Field2:string;
  // ........
  // other fields
  // ........
  Keywords:IKeywordMin[];
}
The problem is that autocomplete reads data frequently, so my document read quota is used up very fast.
Is there a way to implement this without increasing reads for keywords? The keywords are not the real data we need to get.
Below is my query to get the main documents:
query = query.where("Keywords", "array-contains-any", keywords)
I use the query below to get keywords for the autocomplete text box:
query = query.orderBy("Name").startAt(searchTerm).endAt(searchTerm+ '\uf8ff').limit(20)
This query runs many times while the user types in the autocomplete search, which causes more document reads.

Does this answer your question?
https://fireship.io/lessons/typeahead-autocomplete-with-firestore/
Though the recommended solution is to use a third-party tool:
https://firebase.google.com/docs/firestore/solutions/search
To reduce document reads:
One solution that comes to mind, though I'm not sure it suits your use case, is Firestore's caching feature. By default, the Firestore client always tries to reach the server to get the latest changes to your documents, and falls back to the data cached on the client device only if it cannot reach the server. You can take advantage of this by reading from the cache first and reaching the server only when you want to. For web applications this feature is disabled by default; the following page explains how to enable it and describes the feature in more detail:
https://firebase.google.com/docs/firestore/manage-data/enable-offline

I found a solution and thought I would share it here.
Create a new collection named typeaheads in the format below:
export interface ITypeAHead {
  Prefix:string;
  CollectionName:string;
  FieldName:string;
  MatchingValues:ILookupItem[];
}
export interface ILookupItem {
  Key:string;
  Value:string;
}
Depending on the minimum number of letters required before searching, store either the first 2 or the first 3 letters in Prefix, and search based on the prefix, collection and field. You will most probably end up with 2 or 3 document reads for one search.
Hope this helps someone else.
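A minimal sketch of the client-side part of this idea, under a few assumptions that are not in the answer above: prefixes are stored lower-cased, a typeahead document is fetched once per prefix, and further keystrokes are filtered locally instead of triggering new reads. The helper names prefixFor and filterMatches are hypothetical.

```typescript
interface ILookupItem {
  Key: string;
  Value: string;
}

interface ITypeAHead {
  Prefix: string;
  CollectionName: string;
  FieldName: string;
  MatchingValues: ILookupItem[];
}

// Returns the prefix to look up, or null while the user has typed too few letters.
// Lower-casing is an assumption about how Prefix values were stored.
function prefixFor(searchTerm: string, prefixLength: number): string | null {
  const normalized = searchTerm.trim().toLowerCase();
  return normalized.length >= prefixLength
    ? normalized.slice(0, prefixLength)
    : null;
}

// Narrows an already-fetched typeahead document to the values that start
// with what the user has typed so far. No further document reads are needed.
function filterMatches(
  doc: ITypeAHead,
  searchTerm: string,
  limit: number = 20
): ILookupItem[] {
  const needle = searchTerm.trim().toLowerCase();
  return doc.MatchingValues
    .filter((item) => item.Value.toLowerCase().startsWith(needle))
    .slice(0, limit);
}
```

The document itself would be fetched with a query on Prefix, CollectionName and FieldName; only that fetch counts as a read, however many characters the user types afterwards.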

Opening Alfresco document from Solr4 Search result

I'm using Alfresco 5.1 community Edition with Solr4 configured as Search Service and Transaction queries configured as Hybrid (Solr & DB)
When I do a search in Solr GUI from the below URL
Solr Query GUI: https://localhost:8443/solr4/#/alfresco/query
I get the search results in the below format with some ID & other info.
Solr Search Result (Results JSON truncated for readability)
{
  "responseHeader": {
    "status": 0,
    "QTime": 25,
    "params": {
      "q": "testing",
      "defType": "dismax",
      "qt": "",
      "indent": "true",
      "wt": "json",
      "_": "1476349027637"
    }
  },
  ...
    "docs": [
      {
        "id": "_DEFAULT_!8000000000000040!80000000000008e3",
        "_version_": 0,
        "DBID": 2275
      },
      {
        "id": "_DEFAULT_!8000000000000072!8000000000000902",
        "_version_": 0,
        "DBID": 2306
      },
      {
        "id": "_DEFAULT_!8000000000000040!80000000000008ea",
        "_version_": 0,
        "DBID": 2282
      },
      {
        "id": "_DEFAULT_!800000000000000b!80000000000008ef",
        "_version_": 0,
        "DBID": 2287
      },
      {
        "id": "_DEFAULT_!8000000000000071!80000000000008f0",
        "_version_": 0,
        "DBID": 2288
      },
      {
        "id": "_DEFAULT_!8000000000000025!80000000000008eb",
        "_version_": 0,
        "DBID": 2283
      }
    ]
  },
  "processedDenies": false
}
I'm trying to build a UI in which these search results are displayed and a user can click through to retrieve the respective document from Alfresco. Below is the Alfresco API I use to retrieve content from Alfresco.
Alfresco API URL to open a document: http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom/content?id=
A sample Alfresco document ID looks like the one shown below. I don't get such an ID returned in the Solr4 search results.
Sample document ID:
7edf97f4-43cf-4fe5-8099-85608776d159
Questions:
1) What is the ID returned by Solr4?
2) How do I get the relevant Alfresco document ID from the search result, so that I can retrieve the document?
EDIT:
Some background about my requirement to use Solr directly
Alfresco will be used by internal users (typically business content administrators on the intranet) to create documents based on some templates. We have a customer-facing front-end web app which will have a Search section. When users perform a search with some keywords (typically a full-text search), we would invoke the Solr API to search the content of the documents created by the business admins, and the results would be displayed on the front end of the web app. When a user clicks on a search result, the document content would be retrieved from Alfresco and displayed in the front-end web app.
Thanks in advance.

It would be much easier to implement it as an Alfresco Web Script.
With Web Scripts, you can build your own RESTful interface
using light-weight scripting technologies such as JavaScript and
Freemarker.
From a web script you can access the search root object:
search - org.alfresco.repo.jscript.Search -
Root object providing access to the various Alfresco search interfaces
such as FTS-Alfresco, Lucene, XPath, and Saved Search results
Your REST web script may be available to every user but run as admin:
<webscript>
  <shortname>My Rest Query</shortname>
  <url>/api/my/query</url>
  <format default="json">argument</format>
  <authentication runas="admin">guest</authentication>
  <transaction allow="readonly">required</transaction>
</webscript>
There are many tutorials...

1) The ID returned by Solr is probably the ID of the indexed document in Solr. You can't use it with Alfresco.
2) It seems that Solr returns the DBID of the nodes. DBID is the property sys:node-dbid from aspect sys:referenceable defined in the file systemModel.xml and which refers to the database id of the node.
You can build an Alfresco repo webscript which takes this DBID as parameter and returns the document.
But as imagine said, you'd better directly ask Alfresco to execute your Solr query. It would return a list of documents with all the metadata you need, including the download URL of each document.

Adding a partial answer to your 2nd question because locating this info was hard and took quite some time. (2. How do I get the relevant Alfresco document ID to be able to retrieve the same from the search result ?)
To find the document associated with that DBID, you can use the following search syntax:
Go to Admin Tools -> Node Browser
Change query type to lucene
Enter the following search term: #sys\:node-dbid:THE_DBID_YOU_WANT_TO_FIND
For example, looking at our local solr4 error report:
{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"ERROR*"}},
  "response":{"numFound":2,"start":0,"docs":[
    {
      "id":"_DEFAULT_!800000000000008c!8000000000002289",
      "_version_":0,
      "DBID":4499},
    ...
To find that document, search for: #sys\:node-dbid:4499
You can add quotes around the numeric DBID - it works with and without them.
The '#' and the first backslash '\' (escaping the first colon) are REQUIRED - the query breaks if these are removed and an error will be logged in catalina.out.
The second colon MUST NOT include a backslash escape - it is NOT an error (nothing in the log) but no result will be found.
If necessary change the search scope from workspace://SpacesStore to archive://SpacesStore to locate docs that have been deleted.
You can join the DBIDs with OR as shown below to find them all at once (at least those in the same store):
#sys\:node-dbid:1234 OR #sys\:node-dbid:2345 OR #sys\:node-dbid:...
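When there are many DBIDs, the joined search term can be generated from the Solr result instead of being typed by hand. A minimal sketch, assuming the docs entries have the shape shown in the Solr responses above; the function names extractDbids and buildDbidLuceneQuery are hypothetical, and the output simply follows the syntax described in this answer.

```typescript
// Subset of a Solr result document (shape taken from the responses above).
interface SolrDoc {
  id: string;
  DBID: number;
}

// Collects the DBID of every returned document, dropping duplicates.
function extractDbids(docs: SolrDoc[]): number[] {
  return Array.from(new Set(docs.map((doc) => doc.DBID)));
}

// Builds "#sys\:node-dbid:1234 OR #sys\:node-dbid:2345 ...".
// Only the first colon is escaped with a backslash, as described above.
function buildDbidLuceneQuery(dbids: number[]): string {
  return dbids.map((dbid) => `#sys\\:node-dbid:${dbid}`).join(" OR ");
}
```

The resulting string can be pasted into the Node Browser with the query type set to lucene.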

Key vault values from deployment, and linked templates parameters

I have a template to create a key vault and a secret within it. I also have a Service Fabric template that requires 3 things from the key vault: the vault URI, the certificate URL, and the certificate thumbprint.
If I create the key vault and secret with PowerShell, it is easy to copy these 3 things manually from the output and paste them into the parameters of the Service Fabric template. However, because this cert has the same life cycle as the Service Fabric cluster, what I am hoping to do is link from the key vault template to the Service Fabric template, so that when I deploy the key vault and secret I can pass the 3 values on as parameters. (The secret, by the way, is a key that has been base64-encoded to a string. I could have this as a secret in yet another key vault...)
So I have two questions.
How do I retrieve the 3 values in the ARM template? PowerShell outputs them as 'ResourceId' of the key vault, 'Id' of the secret, and 'Version' of the secret. My attempt:
"sourceVaultValue": {
"value": "resourceId('Microsoft.KeyVault/vaults/', parameters('keyVaultName')"
},
"certificateThumbprint": {
"value": "[listKeys(resourceId('secrets', parameters('secretName')), '2015-06-01')"
},
"certificateUrlValue": { "value": "[concat('https://', parameters('keyVaultName'), '.vault.azure.net:443/secrets/', parameters('secretName'), resourceId('secrets', parameters('secretName')))]"
But the certificateUrlValue is incorrect. You can see I tried with and without listKeys, but neither seemed to work... (The thumbprint is within the certUrl itself)
If I were to get the correct values, I would like to pass them as parameters to the next template. However, the template in question has quite a few more parameters than the 3 I want to pass. So is it possible to have a parametersLink element to link to the parameter file, as well as a parameters element for just those 3? Or is there an intended way of doing this?
Cheers

Ok, try this when you get back to the keyboard...
1) for the uri, you can use an output like:
"secretUri": {
"type": "string",
"value": "[reference(resourceId('Microsoft.KeyVault/vaults/secrets', parameters('keyVaultName'), parameters('secretName'))).secretUri]"
}
For #2, you cannot mix and match the link and some values; it's one or the other.
A couple of thoughts on how you could do this (it depends a bit on how you want to structure the rest of your deployment)...
Instead of nesting the SF template, deploy both in the same template, since they have the same lifecycle.
Instead of nesting the SF template, nest the KV template and reference the outputs of that deployment in the SF template.
Aside from that I can't think of anything elegant. Since you want to pass "dynamic" params to a nested deployment, really the only way to do that is to dynamically write the param file behind the link or pass all the params into the deployment resource.
HTH - LMK if it doesn't...

Can't reference a secret with a dynamic ID!
The obvious problems with this way of doing things are:
Someone needs to type the cleartext password, which means it needs to be known to anyone who provisions the environment.
How do I feed it into an automated environment deployment? If I store the password in a parameter… ?
"variables": {
  "tenantPassword": {
    "reference": {
      "keyVault": {
        "ID": "[concat(subscription().id,'/resourceGroups/',parameters('keyVaultResourceGroup'),'/providers/Microsoft.KeyVault/vaults/', parameters('VaultName'))]"
      },
      "secretName": "tenantPassword"
    }
  }
},

Meteor utilities:avatar data

I'd like to use the utilities:avatar package, but I'm having some major reservations.
The docs tell me that I should publish my user data, like this:
Meteor.publish("otherUserData", function() {
  var data = Meteor.users.find({}, {
    fields : {
      "services.twitter.profile_image_url_https" : true,
      "services.facebook.id" : true,
      "services.google.picture" : true,
      "services.github.username" : true,
      "services.instagram.profile_picture" : true
    }
  });
  return data;
});
If I understand Meteor's publish/subscribe mechanism correctly, this would push these fields for the entire user database to every client! Clearly, this is not a sustainable solution. Equally clearly, however, either I am doing something wrong, or I am understanding something wrong.
Also: This unscalable solution works fine in a browser, but no avatar icons are visible when the app is deployed to a mobile device, for some reason.
Any ideas?

Separate the issue of which fields to publish from which users you want to publish data on.
Presumably you want to show avatars for other users that the current user is interacting with. You need to decide what query to use in
Meteor.users.find(query,{fields: {...}});
so that you narrow down the list from all users to just pertinent ones.
In my app I end up using reywood:publish-composite to publish the users that are related to the current user via an intermediate collection.
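As an illustration of separating "which fields" from "which users", the two arguments to Meteor.users.find can be built by a small helper. This is only a sketch: it assumes the IDs of the pertinent users are already known (how they are determined depends on the app), and the names AVATAR_FIELDS and avatarFindArgs are hypothetical.

```typescript
// Avatar-related fields, copied from the publication in the question.
const AVATAR_FIELDS = {
  "services.twitter.profile_image_url_https": true,
  "services.facebook.id": true,
  "services.google.picture": true,
  "services.github.username": true,
  "services.instagram.profile_picture": true,
} as const;

interface AvatarFindArgs {
  selector: { _id: { $in: string[] } };
  options: { fields: typeof AVATAR_FIELDS };
}

// Restricts the cursor to the given users instead of the whole collection.
function avatarFindArgs(relevantUserIds: string[]): AvatarFindArgs {
  return {
    selector: { _id: { $in: relevantUserIds } },
    options: { fields: AVATAR_FIELDS },
  };
}
```

Inside the publish function, the result would be used as Meteor.users.find(args.selector, args.options), so only the listed users' avatar fields are sent to the client.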

The unscalability of utilities:avatar seems, as far as I can tell, to be a real issue, and there isn't much to be done about it except to remove utilities:avatar and rewrite the avatar URL-fetching code by hand.
As for the avatars not appearing on mobile devices, the answer was simply that we needed to grant permission to access remote URLs in mobile-config.js, like this:
App.accessRule("http://*");
App.accessRule("https://*");
