Analysing data within Alfresco

I am new to Alfresco and I want to analyse the data present in an Alfresco system (number of files, total size, etc.) using C#. I am writing code that follows these steps:
1. Get the login ticket
2. Get the list of sites inside Alfresco
3. For each site, get the list of document libraries
4. For each document library, get all of its contents
5. For each file, get its info.
Is there a better approach than this?

Steps 1, 2, 3 & 4 are already available in AAAR (Alfresco Audit Analysis and Reporting).
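If you want to script it yourself in C# rather than use AAAR, here is a minimal sketch of steps 1 and 2. It assumes the stock Alfresco REST webscripts (/alfresco/service/api/login and /alfresco/service/api/sites) available in Alfresco 4.x and later; the host, credentials, and exact JSON shapes are assumptions you should verify against your own server.

using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

class AlfrescoInventory
{
    static async Task Main()
    {
        // Assumed local install; adjust host and credentials.
        const string baseUrl = "http://localhost:8080/alfresco/service";
        using var http = new HttpClient();

        // Step 1: get a login ticket (JSON variant of the login webscript).
        string loginJson = await http.GetStringAsync(
            baseUrl + "/api/login?u=admin&pw=admin&format=json");
        string ticket = JsonDocument.Parse(loginJson)
            .RootElement.GetProperty("data").GetProperty("ticket").GetString();

        // Step 2: list the sites, authenticating with the ticket.
        string sitesJson = await http.GetStringAsync(
            baseUrl + "/api/sites?alf_ticket=" + ticket);
        foreach (JsonElement site in JsonDocument.Parse(sitesJson).RootElement.EnumerateArray())
            Console.WriteLine(site.GetProperty("shortName").GetString());

        // Steps 3-5: for each site, walk the documentLibrary children with the
        // same ticket, or query over CMIS, and aggregate file counts and sizes.
    }
}

For steps 3-5, a CMIS client such as DotCMIS may be the more convenient route, since CMIS exposes cmis:contentStreamLength directly for size aggregation.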

Related

Can I use PowerBI to access SharePoint files, and R to write those files to a local directory (without opening them)?

I have a couple of large .xlsb files in 2FA-protected SharePoint. They refresh periodically, and I'd like to automate the process of pulling them across to a local directory. I can already do this in PowerBI by polling the folder list, filtering to the folder/files that I want, importing them, and using an R script to write them out as .rds (it doesn't need to be .rds - any compressed format would do). Here's the code:
let
    #"~ Query ~" = "",
    // Address for the SP folder
    SPAddress = "https://....sharepoint.com/sites/...",
    // Poll the content
    Source15 = SharePoint.Files(SPAddress, [ApiVersion = 15]),
    // ... some code to filter the content list down to the 2 .xlsb files I'm
    // interested in - they're listed as nested 'binary' items under column
    // 'Content' within table 'xlsbList'
    // R export within an arbitrary 'add column' instruction
    ExportRDS = Table.AddColumn(xlsbList, "Export", each R.Execute(
        "saveRDS(dataset, file = ""C:/Users/current.user/Desktop/XLSBs/" & [Label] & ".rds"")",
        [dataset = Excel.Workbook([Content])[Data]{0}]))
However, the files are so large that my login times out before the refresh can complete. I've tried using R's file.copy command instead of saveRDS, to pick up the files as binaries (so PowerBI never has to import them):
R.Execute("file.copy(dataset, ""C:/Users/current.user/Desktop/XLSBs/"")", [dataset=[Content]])
with dataset=[Content] instead of dataset=Excel.Workbook([Content])[Data]{0} (which gives a different error, and in any event would hit the same runtime issues as before), but it tells me "The Parameter 'dataset' isn't a Table". Is there a way to reference what PowerBI sees as binary objects from within nested R (or Python) code, so that I can copy them to a local directory without PowerBI importing them as data?
Unfortunately I don't have permissions to set the SharePoint site up for direct access from R/Python, or I'd leave PowerBI out entirely.
Thanks in advance for your help.

Alfresco 3.3 - Selecting document based on category

We are stuck on Alfresco 3.3 for some time. A simple scenario:
Created a folder structure in a Site.
Created a couple of documents in a folder under the site.
Created categories.
Applied a couple of categories to the documents.
Now I need to fetch those documents based on categories....
Tried:
1. The CMIS way. - Not possible.
2. What other way?
Please suggest.
You can do a Lucene search, like this:
PATH:"/cm:categoryRoot/cm:generalclassifiable/cm:Languages/cm:German//member"
You can run that search from the node browser to try it out. You can also run it from server-side JavaScript, like this:
var results = search.luceneSearch('PATH:"/cm:categoryRoot/cm:generalclassifiable/cm:Languages/cm:German//member"');
print(results.length);

Push external multimedia file into package at Tridion publish time

When we publish a page/dynamic component from Tridion, is it possible to add an external multimedia file (e.g. a JPG image) to the currently executing/rendering package at publish time, so that the final transport package contains this binary file along with the originally published content?
Is this achievable by customizing the Tridion renderer/resolver? If yes, please provide some inputs.
Note: The binary content that needs to be pushed into the package at publish time is not present as a multimedia component in Tridion; it lives in a file location outside the Tridion CMS. Instead, we have a stub multimedia component, holding a dummy image, used inside the published component/page; we plan to replace the stub image with the original image at publish (rendering/resolving) time.
Since we have a huge bulk of binary content stored in a DAM tool, we don't want that data to be recreated as multimedia components in Tridion. Instead, we want to query the DAM tool and attach the result to the Tridion package with some logical reference (planning to maintain a one-to-one mapping between the stub multimedia component's TCM ID and the original content in a mapping DB).
Please let us know if there is any solution for attaching external binary content to the package at publish time.
The best - and easiest - way is to use the mechanism provided by Tridion out-of-the-box for this. Create a new multimedia component, select "External" in the resource type drop-down, and type the URL of the object. As long as you can address it with a URL, it will work exactly as you want (the item will be added to the package and sent to the delivery server).
If this is not good enough for you, then yes, you can add it to the package yourself. I've done this in the past with code somewhat like this:
FileInfo file = /* weird logic to get a FileInfo object from the external system */;
Item item = package.GetItem("My original Item");
item.SetAsStream(file.OpenRead());
This replaced the content of my original component with the actual file I wanted. This will work for you IF the original component is also a multimedia component. If it's not, just create a new item with your own name, etc. If possible, do use the out-of-the-box process instead.
PS: FileInfo Class.
As Nuno suggested, the best way is to use a multimedia component with the 'External' resource type. You may not need to create these manually; you can automate it using the Core Service or API programs.
Another way I have used before is to create a zip file at run time and add it to the package with the following code. Hope it may help.
// 'zip' is assumed to be a zip archive object (e.g. DotNetZip's ZipFile) built earlier
using (MemoryStream ms = new MemoryStream())
{
    zip.Save(ms);
    ms.Position = 0; // rewind so AddBinary reads the archive from the start
    downloadAllInOneURL = String.Format("ZipAsset{0}.zip", uniqueZipID);
    downloadAllInOneURL = m_Engine.PublishingContext.RenderedItem
        .AddBinary(ms, downloadAllInOneURL, "", "application/zip").Url;
    downloadAllInOneSize = getSize(ms.Length);
}

Best practice for DB & file search with Lucene.Net in an ASP.NET web application

I have a site where I need to develop site search functionality. The data may reside in database tables or in .aspx pages as static text. I searched Google and found that Lucene.Net may be appropriate for the site search functionality, but I have never used Lucene.Net, so I don't know how to create a Lucene.Net index file. I want to develop two utilities for my site:
1) one to create & update the index file, reading data from the database tables and the physical .aspx files;
2) one to search single or multiple keywords against the index file.
I found a bit of a code snippet which I just do not understand:
string indexFileLocation = @"C:\Index";
string stopWordsLocation = @"C:\Stopwords.txt";
var directory = FSDirectory.Open(new DirectoryInfo(indexFileLocation));
Analyzer analyzer = new StandardAnalyzer(
    Lucene.Net.Util.Version.LUCENE_29, new FileInfo(stopWordsLocation));
What is Lucene.Net.Util.Version.LUCENE_29? What is stopWordsLocation? How does the data need to be stored in Stopwords.txt?
I have no idea how to develop the above two utilities, so please guide me on how to search my DB as well as my .aspx files with Lucene.Net. I would be glad if someone could discuss it here with a bit of sample code. Thanks.
Lucene.Net.Util.Version.LUCENE_29 just indicates the Lucene version you are using; you should always use the most up-to-date version in new code. It is there for backward compatibility, in case you upgrade your Lucene to a version that changes the StandardAnalyzer but you don't want to re-index all your data.
stopWordsLocation is the location of a file containing your stop words - words you don't want to index, e.g. it, he, she, the, or, and.
It's a regular text file; each line contains one stop word, with lines separated by line breaks.
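For instance, a minimal Stopwords.txt (these particular words are just an illustration) could look like:
a
and
he
it
or
the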
http://lucene.apache.org/core/old_versioned_docs/versions/3_0_1/api/all/org/apache/lucene/analysis/WordlistLoader.html#getWordSet(java.io.Reader)
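To make the two utilities concrete, here is a minimal sketch, assuming Lucene.Net 3.0.x and the LUCENE_29 compatibility version from the snippet above. The field names ("id", "body"), the index path, and the sample query are illustrative assumptions; gathering the text from your database tables and .aspx files is left as a stub.

using System;
using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using LuceneVersion = Lucene.Net.Util.Version;

class SiteSearch
{
    static void Main()
    {
        var directory = FSDirectory.Open(new DirectoryInfo(@"C:\Index"));
        var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_29);

        // Utility 1: create/update the index. Each searchable unit (a DB row,
        // or the text extracted from an .aspx file) becomes one Document.
        using (var writer = new IndexWriter(directory, analyzer,
            true, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var doc = new Document();
            doc.Add(new Field("id", "products/42",
                Field.Store.YES, Field.Index.NOT_ANALYZED));
            doc.Add(new Field("body", "text pulled from a DB row or an aspx page",
                Field.Store.YES, Field.Index.ANALYZED));
            writer.AddDocument(doc);
            writer.Optimize();
        }

        // Utility 2: search the index for one or more keywords.
        using (var searcher = new IndexSearcher(directory, true))
        {
            var parser = new QueryParser(LuceneVersion.LUCENE_29, "body", analyzer);
            var hits = searcher.Search(parser.Parse("keyword1 OR keyword2"), 10);
            foreach (var scoreDoc in hits.ScoreDocs)
            {
                var hit = searcher.Doc(scoreDoc.Doc);
                Console.WriteLine("{0} (score {1})", hit.Get("id"), scoreDoc.Score);
            }
        }
    }
}

In a real utility you would loop over your DB rows and crawled .aspx text instead of the single hard-coded document, and recreate or append to the index as your content changes.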

Drupal - Get image from URL and import it into node

I'm writing a module for Drupal and am trying to create a node from my module. Everything is fine; I only have one problem, with creating an image. The image exists on a different server, so I want to grab it and insert it. I installed the module http://drupal.org/project/filefield_sources, which has a remote option, but I searched the module code and could not find the function it uses for this process. The module works very nicely from the interface, but how do I make it do the job from code? Which function should I call, and what parameters should I pass?
I'm on Drupal 6.
Hopefully you're using Drupal 7...
The system_retrieve_file() function will download a file from a remote source, copy it from temp to a specified destination, and optionally save it to the file_managed table if you want it to be managed.
$managed = TRUE; // Whether or not to create a Drupal file record
$path = system_retrieve_file($url, 'public://my_files/', $managed);
If you want to get the file object immediately after you've done this, the following is the only way I've found so far:
$file = file_load(db_query('SELECT MAX(fid) FROM {file_managed}')->fetchField());
Get the fid using $path->fid - no need for the MySQL query.
