aikau - implement export search results functionality - alfresco-share

I have recently started developing with Aikau in Alfresco Share.
I want to add functionality that lets me export search results to a CSV file.
For that, I can change the back-end repository web script to return CSV data.
At the Alfresco Share end, I was able to show the export link by adding a new widget to FCTSRCH_TOP_MENU_BAR. I used alfresco/renderers/PropertyLink to display this link. The part I am missing is how to invoke the search web script with the additional parameter format=csv and, along with that, pass all the query parameters that were used to retrieve the results.
That is where I am stuck. If I use ALF_CRUD_GET_ALL as the publishTopic and provide the URL there, it invokes the sample web script (which I created to return a sample CSV response) and returns the response. However, the CSV does not come back as a downloadable response. So I am stuck on how to achieve export-to-CSV functionality for search results.
It would be great if any of you could help me here and provide your guidance/suggestions.

This blog post provides an example of how you can customize the search page in Share. Although it specifically addresses changing the search queries, the basic extension approach is more or less the same, in that you will want to change the data that is used to send the XHR request. I think the major difference here is that you may need to make more in-depth updates to the service, in particular with regard to the switch statement that is used to build the advanced search query object.
If you have extended or replaced the default search REST API then I would expect that you will need to call the same URL, but if you have provided an entirely new REST API to return the CSV data then you'll also need to change the URL that is used by the service.
In terms of providing a link for downloading the content, we have previously implemented something in the DragAndDropModelCreationService (see the generateDownload function), but this only works in Chrome because of security limitations around generating files for download on the client.
Your best bet may be to temporarily store the CSV content on the repository in a hidden location and then use the standard download links to allow it to be downloaded. This would be more complex but would provide better cross-browser support. Something similar is done for the "Download as ZIP" action.
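For what it's worth, here is a minimal browser-side sketch of that generateDownload idea, assuming the CSV text is already available on the client (the function name and default file name below are my own, not part of Aikau):

// Sketch only: trigger a client-side download of CSV text already held in memory.
// Works in current Chrome/Firefox; older IE needs msSaveBlob or a server round-trip.
function downloadCsv(csvText: string, fileName: string = "search-results.csv"): void {
  const blob = new Blob([csvText], { type: "text/csv;charset=utf-8" });
  const url = URL.createObjectURL(blob);
  const link = document.createElement("a");
  link.href = url;
  link.download = fileName;          // hint to save rather than navigate
  document.body.appendChild(link);   // Firefox needs the element in the DOM
  link.click();
  document.body.removeChild(link);
  URL.revokeObjectURL(url);          // release the object URL afterwards
}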

OK, with the extra information provided I would do the following...
The process of adding widgets to the search page is quite well detailed here (although you're not adding a view, you can follow the same approach to add a new PropertyLink after the widget with the id "FCTSRCH_RESULTS_COUNT_LABEL").
The approach I would take would be to include an additional custom service on the page that subscribes to the "ALF_RETRIEVE_DOCUMENTS_REQUEST_SUCCESS" topic (which is published when a search completes). It should save the search response in a variable in preparation for the user clicking on the PropertyLink.
This custom service should also subscribe to a topic published by the PropertyLink (called, say, "DOWNLOAD_CSV"). The custom service could then generate a file download using the approach described in my previous answer, using the CSV data that will have been provided in the payload. As I said though, this may only work in some browsers for security reasons.
If your custom search WebScript were able to store the CSV data as a node on the Repository, then you could just provide the NodeRef of the CSV data in the search response, and the PropertyLink could simply publish the "ALF_DOWNLOAD" topic for the DocumentService to handle the download.
Trying to generate a file to download on the client side is going to be an issue for most browsers I think.
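To make the wiring concrete, here is a rough TypeScript-flavoured sketch of the idea (not a real AMD/Dojo Aikau module; the widget id, label and service class are assumptions, while the topic names come from the answer above):

// Page-model fragment: a PropertyLink placed after FCTSRCH_RESULTS_COUNT_LABEL
// that publishes a custom topic when clicked.
const exportCsvLink = {
  id: "CUSTOM_EXPORT_CSV_LINK",                 // assumed id
  name: "alfresco/renderers/PropertyLink",
  config: {
    currentItem: { label: "Export to CSV" },    // assumed label property
    propertyToRender: "label",
    publishTopic: "DOWNLOAD_CSV",
    publishPayload: {}
  }
};

// Custom service (sketch): cache the latest search response, then act on the
// DOWNLOAD_CSV topic, either by building CSV text for a client-side download
// or by publishing ALF_DOWNLOAD with the NodeRef of a CSV node created by the
// repository web script.
type Subscribe = (topic: string, handler: (payload: any) => void) => void;

class ExportCsvService {
  private lastSearchResponse: any = null;

  constructor(subscribe: Subscribe) {
    subscribe("ALF_RETRIEVE_DOCUMENTS_REQUEST_SUCCESS", payload => {
      this.lastSearchResponse = payload;        // published when a search completes
    });
    subscribe("DOWNLOAD_CSV", () => this.onDownloadCsv());
  }

  private onDownloadCsv(): void {
    if (!this.lastSearchResponse) {
      return;                                   // nothing to export yet
    }
    // Option 1: convert this.lastSearchResponse into CSV text and trigger a
    //           client-side download (see the earlier Blob sketch).
    // Option 2: publish "ALF_DOWNLOAD" with the NodeRef returned by the search
    //           so the DocumentService handles the download.
  }
}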

Related

What is "dlta" and "ridlist" parameters used in Requests library (Python)

I am working on getting data as tables from QuickBase using the Requests library (Python). I found somebody doing it using the URL of the report, but they added two parameters to the URL like this:
&dlta=xs%xx&ridlist=xxxx.
Can anybody please tell me what those two parameters are? I searched the internet but found nothing related to them.
I've been using Quickbase for over ten years and haven't seen documentation for either of these parameters. I have noticed that ridList seems to be used by Quickbase's grid edit view of reports (I suspect it's an ID for a server-side cached list of record IDs to display especially when using the type-ahead search of a report before choosing to grid edit) and dlta is used in the "Download report as CSV" button.
The example you're following may have simply copied and pasted a link generated by Quickbase as a hack to get a CSV instead of an XML response. I recommend following the Quickbase HTTP API Reference instead. If you don't want an XML response, Quickbase also has a JSON RESTful API which may be easier to work with.
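As a rough illustration of going through the documented JSON API rather than a scraped report URL, here is a TypeScript sketch; the realm, user token, table ID and field IDs are placeholders, not values taken from the question:

// Sketch: query records via Quickbase's JSON RESTful API instead of relying on
// undocumented report-URL parameters such as dlta/ridlist.
async function queryQuickbase(): Promise<void> {
  const response = await fetch("https://api.quickbase.com/v1/records/query", {
    method: "POST",
    headers: {
      "QB-Realm-Hostname": "yourrealm.quickbase.com",   // placeholder realm
      "Authorization": "QB-USER-TOKEN your_user_token", // placeholder token
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      from: "your_table_id",   // placeholder table (dbid)
      select: [3, 6, 7],       // placeholder field IDs
      where: "{3.GT.0}"        // example filter: record ID greater than 0
    })
  });
  const data = await response.json();
  console.log(data);           // records come back as JSON, ready to tabulate
}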

Import.io - Can it replace Kimonolabs

I currently use Kimonolabs for scraping data from websites that share the same goal. To make it easy, let's say these websites are online shops selling stuff (actually they are job websites with online application options, but technically they look a lot like webshops).
This works great. For each website a scraper API is created that goes through the available advanced search page to crawl all product URLs. Let's call this API the 'URL list'. Then a 'product API' is created for the product detail page that scrapes all the necessary elements, e.g. the title, product text and specs like the brand, category, etc. The product API is set to crawl daily using all the URLs gathered in the 'URL list'.
The gathered information for all products is then fetched from the Kimonolabs JSON endpoint by our own service.
However, Kimonolabs will shut down its service at the end of February 2016 :-(. So, I'm looking for an easy alternative. I've been looking at import.io, but I'm wondering:
Does it support automatic updates (letting the API scrape hourly/daily/etc)?
Does it support fetching all product URLs from a paginated advanced search page?
I'm tinkering around with the service. Basically, it seems to extract data via the same easy process as Kimonolabs. It's just unclear to me whether paginating through the URLs needed for the product API, and automatically keeping the data up to date, are supported.
Are there any import.io users here who can advise whether import.io is a useful alternative for this? Maybe even give some pointers in the right direction?
Look into Portia. It's an open source visual scraping tool that works like Kimono.
Portia is also available as a service and it fulfills the requirements you have for import.io:
automatic updates, by scheduling periodic jobs to crawl the pages you want, keeping your data up-to-date.
navigation through pagination links, based on URL patterns that you can define.
Full disclosure: I work at Scrapinghub, the lead maintainer of Portia.
Maybe you want to give Extracty a try. It's a free web scraping tool that allows you to create endpoints that extract any information and return it in JSON. It can easily handle paginated searches.
If you know a bit of JS you can write CasperJS endpoints and integrate any logic that you need to extract your data. It has a similar goal to Kimonolabs and can solve the same problems (if not more, since it's programmable).
If Extracty does not meet your needs, you can check out these other market players that aim for similar goals:
Import.io (as you already mentioned)
Mozenda
Cloudscrape
TrooclickAPI
FiveFilters
Disclaimer: I am a co-founder of the company behind Extracty.
I'm not especially fond of Import.io, but it seems to me that it allows pagination through bulk input URLs. Read here.
So far there has not been much progress in getting a whole website through the API:
Chain more than one API/Dataset: It is currently not possible to fully automate the extraction of a whole website with the Chain API.
For example, if I want data that is found within category pages or paginated lists, I first have to create a list of URLs, run Bulk Extract, save the result as an import data set, and then chain it to another Extractor. Once it has been set up, I would like to be able to do this in one click, more automatically.
P.S. If you are somewhat familiar with JS you might find this useful.
Regarding automatic updates:
This is a beta feature right now. I'm testing it myself after migrating from Kimonolabs... You can enable it for your own APIs by appending &bulkSchedule=1 to your API URL. You will then see a "Schedule" tab. In the "Configure" tab select "Bulk Extract" and add your URLs; after this, the scheduler will run daily or weekly.

A problem with detecting data

The task is to download the table with the names of bookmakers and the odds (here).
I cannot find the part of the source code which corresponds to this data. I tried to use the Chrome extension SelectorGadget, unsuccessfully.
Similarly, when I want to open matches (matches) I run into the same problem. Thank you for any advice.
The data is not in the HTML, it is dynamically loaded via JavaScript.
From the Terms of Service:
Without prior authorisation in writing from the Provider, Visitors are not authorised to copy, modify, tamper with, distribute, transmit, display, reproduce, transfer, upload, download or otherwise use or alter any of the content of the Website.
Therefore, do not expect us to assist you with breaching their terms of use.

Web API Help Samples - C#

ASP.NET Web API has an easy-to-install NuGet help page with a sample generator. It's easy to get it to generate and display sample requests, but not so easy, it seems, to get it to display sample responses (httpsampleresponses) so that when developers look at the help page they see examples of generated responses: not static, typed-in responses, but actually generated ones. I've seen it done before on another project, but I'm still having trouble figuring out how to do it. MSDN's YAO has a good blog but it's just not getting me all the way to success for some reason.
From what I've seen work live, and based on what there is to read about it online, the key is definitely in getting the HelpPageConfig file right in terms of the config.SetSampleResponses() setup. I've found the configuration file that sets the parameters for the SetSampleResponses() method, but still, nothing I try is working. It was suggested to me that I should create a custom type and use extension methods, but getting that to correspond to and display what I need hasn't happened yet. I can get it to compile without errors, but it still doesn't show the generated response sample on the page. It was easy with the SetSampleForType piece to get a section to show up in the requests section, but it's the response part that has given me trouble.
Has anyone out there done this with the SetSampleResponses() successfully and is there any kind of trick you can clearly define for getting it to work? Do you have any tips on setting up a specific generic type and making that work?
I'm thinking this must be something really simple and I'm just not clicking to make it happen....
Thanks for any potential info...
The SetSampleResponse extension on HelpPageConfig is for statically defining samples for your action.
config.SetSampleResponse("\"Hello World!\"", new MediaTypeHeaderValue("application/json"), "Values", "Get", "id");
If you are looking for a generated sample for a particular type, have you tried using the SetSampleObjects extension? It allows you to set sample objects for different types, and the same object is used in all cases where that particular type is returned from an action.
config.SetSampleObjects(new Dictionary<Type, object>
{
    { typeof(string), "Hello World!" }
});
Could you share more specific (code) details of how you are using the SetSampleResponse extension?

MS Word mailmerge like functionality to allow export to Word document from ASP.Net Web application

I have a requirement where I need to allow users to upload a Word document with placeholders for certain fields that can be found in the database. This will be their template. The placeholders might be prefixed with ## or something, for example:
Dear ##Title ##Lastname
They can then grab a record and hit "export to Word document". This will then let them choose the template. They select the template and click continue. I will then get the template and replace the ##Title with the title field in the database for the selected record. I am not sure where to start or what components I need to do this.
From my initial investigation it seems that I can do this with the new Open XML standard for Office 2007. So perhaps I should read in the template and save all its contents to a database table somewhere. Then when the user wants to export, I get the contents again, do a search and replace for the ## placeholders, and link them properly. Then I save the document to the output stream, which will bring up the save dialog in their browser.
I am using ASP.Net MVC and am in a hosted environment. I was also contemplating dynamically creating a new View type and new views when the user uploads a template, but I'm not sure that approach will work.
Is this a good approach?
What tools should I be looking at?
Any other suggestions?
This is similar to an approach we took for inserting data into Word documents and then returning them to the user. We opened the .docx file (it's a zip file, so easy to extract), extracted the document (document.xml, in the word folder), did the replace, and then put the document back into the .docx file and returned it to the user.
One issue we hit was that Word inserted tags in strange places, especially around things like spelling/grammar errors, so we needed to be careful when we did the search/replace.
We decided not to store the fields from the document in a database, to allow the documents to be easily updated.
We used the DotNetZip component for opening the .docx files.
Something we also did was to combine several documents into a single large document to save on the number of downloads. If I remember correctly, we used the Open XML toolkit to do this merging. The website also has loads of other information that may be of use.
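The answer above was implemented in C# with DotNetZip; purely to illustrate the same unzip, replace, re-zip idea in a compact form, here is a hedged Node/TypeScript sketch using the jszip package (the ## placeholder convention and document.xml path follow the thread, everything else is an assumption):

// Sketch: open a .docx template, replace ##Placeholders inside word/document.xml,
// and write the merged document back out. Beware of placeholders that Word has
// split across runs (e.g. because of spell-check tags) before relying on this.
import { promises as fs } from "fs";
import JSZip from "jszip";

async function mergeTemplate(templatePath: string, outputPath: string,
                             values: Record<string, string>): Promise<void> {
  const zip = await JSZip.loadAsync(await fs.readFile(templatePath));
  let xml = await zip.file("word/document.xml")!.async("string");
  for (const [field, value] of Object.entries(values)) {
    xml = xml.split("##" + field).join(value);   // simple global replace
  }
  zip.file("word/document.xml", xml);
  await fs.writeFile(outputPath, await zip.generateAsync({ type: "nodebuffer" }));
}

// Example: mergeTemplate("letter.docx", "merged.docx", { Title: "Mr", Lastname: "Smith" });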
Check out Scott Guthrie's blog post about the new view engine, code-named "Razor", coming out shortly from Microsoft. In the comments there is talk about it being usable in mail merge scenarios like the one you describe, with ASP.NET MVC views.
