How to store a Map in a Solr document?

I am trying to add a HashMap to a Solr document. I have gone through the Solr documentation but couldn't find how to do exactly this.
Solr supports lists, but I need a Map stored in the document.
Example:
{
"id": "123456",
"name": "Foo bar",
"social":{
"facebook":"facebookid",
"twitter":"twitterid"
}
}

The easiest way is to flatten the structure before indexing it, as that makes queries, facets and most other features easier to use (you won't need block joins, parent/child support, etc.).
Index the content as:
{
"id": "123456",
"name": "Foo bar",
"social_facebook": "facebookid",
"social_twitter": "twitterid"
}
... and recreate the map in your query layer if necessary (it usually isn't); how you do that depends on the language and library you're using.
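The flatten-then-rebuild idea can be sketched in a few lines. This is only an illustration in Python, not a Solr client: the helper names `flatten` and `unflatten` and the `_` separator are assumptions, and it handles only one level of nesting, as in the example document.

```python
def flatten(doc, sep="_"):
    """Flatten one level of nested dicts into prefixed top-level keys."""
    flat = {}
    for key, value in doc.items():
        if isinstance(value, dict):
            for sub_key, sub_value in value.items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


def unflatten(flat, prefixes, sep="_"):
    """Rebuild nested dicts for the given prefixes (e.g. ["social"])."""
    doc = {}
    for key, value in flat.items():
        for prefix in prefixes:
            if key.startswith(prefix + sep):
                sub_key = key[len(prefix) + len(sep):]
                doc.setdefault(prefix, {})[sub_key] = value
                break
        else:
            # No prefix matched: keep the field at the top level.
            doc[key] = value
    return doc


doc = {
    "id": "123456",
    "name": "Foo bar",
    "social": {"facebook": "facebookid", "twitter": "twitterid"},
}
flat = flatten(doc)                     # what would be sent to the index
restored = unflatten(flat, ["social"])  # what the query layer could rebuild
```

Passing the prefixes explicitly avoids guessing which underscores are separators and which are part of an ordinary field name.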

Related

JSONPath concat arrays and joining

I am trying to use a JSONPath expression to extract 2 arrays and concatenate them together to form one long array, ideally with unique values only. However, no syntax I use in the online playground produces results. Ultimately I want to use this in a Jenkins job, so it has to work with the Java implementation.
Here is my json
{
"commits": [
{
"author": {
"name": "Christian",
"email": "christian#company.com"
},
"added": [
"deleteme/main.tf"
],
"modified": [
],
"removed": [
"deleteme.txt"
]
}
]
}
And here is my expression (which I am trying to evaluate here):
$.append($.commits[*].added[*], $.commits[*].removed[*])
Which I would love to see as
["deleteme/main.tf", "deleteme.txt"]
Or better still:
"deleteme/main.tf, deleteme.txt"
because ultimately I need a comma-separated set of values.
Oh yeah: bonus points for unique results only.
JSON Path is a query language. Some tools have used it to enable transformation functionality, but this is not something that JSON Path itself generally supports.
Disclaimer and Open Specification Plug
The online tool you linked is not "official." There is no "official" tooling because there is no specification. Pretty much everyone has been building implementations from a 15-year-old blog post and adding their own features.
But there's good news: we're building a spec!
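Since JSON Path only selects values, the concatenation, de-duplication and joining have to happen in the host language after extraction. The sketch below shows that post-processing in Python with plain dictionary access rather than a JSONPath library; the function name `changed_paths` is an assumption, and a Jenkins job would need the equivalent in Groovy or Java.

```python
import json

payload = json.loads("""
{
  "commits": [
    {
      "author": {"name": "Christian", "email": "christian#company.com"},
      "added": ["deleteme/main.tf"],
      "modified": [],
      "removed": ["deleteme.txt"]
    }
  ]
}
""")


def changed_paths(data, fields=("added", "removed")):
    """Collect paths from the given fields of every commit, unique, in order."""
    seen = {}
    for commit in data.get("commits", []):
        for field in fields:
            for path in commit.get(field, []):
                # dict keys keep insertion order, so duplicates are dropped
                # without reordering the result.
                seen.setdefault(path, None)
    return list(seen)


paths = changed_paths(payload)
csv = ", ".join(paths)
```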

Is there a way to reduce time complexity on the frontend when using Drupal json-api includes?

I'm currently working with output from the Drupal json-api module and have noticed that its structure forces O(n^2) work on the front end: front-end developers have to reformat the JSON they are given so that an attachment can be in the same object as the entity it belongs to.
Example
So let's say I'm listing a bunch of categories with their thumbnails to be used on the front end. What a json output would normally look like for that is something like:
Normal category json structure
[
{
"uid":123,
"category_name":"cars",
"slug":"cars",
"thumbnail":"example.com/cars.jpg"
},
{
"uid":124,
"category_name":"sports",
"slug":"sports",
"thumbnail":"example.com/sports.jpg"
}
]
With Drupal, it seems that thumbnails live in their own includes, separate from the data, which creates an O(n^2) lookup. For example:
I make a get request using this endpoint:
example.com/jsonapi/taxonomy_term/genre?fields[taxonomy_term--genre]=name,path,field_genre_image,vid&include=field_genre_image
The structure of the data returned from the drupal json api module is going to be similar to this:
pseudo code for better readability
{
"data":[
{
"uid":123,
"category_name":"cars",
"slug":"cars",
"relationships":{
"thumbnail":{
"id":123
}
}
},
{
"uid":124,
"category_name":"sports",
"slug":"sports",
"relationships":{
"thumbnail":{
"id":124
}
}
}
],
"included":[
{
"type":"file",
"id":123,
"path":"example.com/cars.jpg"
},
{
"type":"file",
"id":124,
"path":"example.com/sports.jpg"
}
]
}
The problem with the Drupal output is that I have to loop through the data and, inside that loop, loop through the includes to attach each thumbnail to its category, which is O(n^2) on the frontend.
Is there a way for the frontend to request a category using the drupal json module to contain the thumbnail in the category like the normal json output above without having to restructure the json api on the frontend?
Please note I am not a drupal developer so the terminology I might use will be off.
JSON:API outputs a list of entities and includes a second list of related entities (which can have different types). Each entity has a UUID, so looking one up can be O(log n), or even O(1), if you index the included entities by UUID.
So you need only one loop over the included entities to store and index them (for example in something like SQLite), or simply to build a single map whose keys are UUIDs and whose values are the entity objects.
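A minimal sketch of that lookup-map approach, using the simplified pseudo-structure from the question rather than the real JSON:API field layout. The function name `attach_thumbnails` is an assumption, and a frontend would write the same two loops in JavaScript.

```python
def attach_thumbnails(response):
    """Merge each category with its thumbnail in O(n + m) instead of O(n * m)."""
    # One pass over "included": build an id -> entity lookup table.
    included_by_id = {item["id"]: item for item in response.get("included", [])}

    categories = []
    # One pass over "data": each thumbnail lookup is a constant-time dict access.
    for entity in response.get("data", []):
        thumb_id = entity["relationships"]["thumbnail"]["id"]
        thumb = included_by_id.get(thumb_id)
        categories.append({
            "uid": entity["uid"],
            "category_name": entity["category_name"],
            "slug": entity["slug"],
            "thumbnail": thumb["path"] if thumb else None,
        })
    return categories


response = {
    "data": [
        {"uid": 123, "category_name": "cars", "slug": "cars",
         "relationships": {"thumbnail": {"id": 123}}},
        {"uid": 124, "category_name": "sports", "slug": "sports",
         "relationships": {"thumbnail": {"id": 124}}},
    ],
    "included": [
        {"type": "file", "id": 123, "path": "example.com/cars.jpg"},
        {"type": "file", "id": 124, "path": "example.com/sports.jpg"},
    ],
}
categories = attach_thumbnails(response)
```

If included entities of different types can share an id, keying the map by a (type, id) pair would be the safer choice.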

Create document set in Sharepoint with Graph API in a subfolder

I already implemented the creation of a document set at library root level. For this I used the following link: Is it possible to create a project documentset using graph API?
I follow these steps:
1- Retrieve the document library's Drive Id:
GET https://graph.microsoft.com/v1.0/sites/${siteId}/lists/${listId}?$expand=drive
2- Create the folder:
POST https://graph.microsoft.com/v1.0/drives/${library.drive.id}/root/children
The body of the request is the following
{
"name": "${folderName}",
"folder": {}
}
3- Get the folder's SharePoint item id:
GET https://graph.microsoft.com/v1.0/sites/${siteId}/drives/${library.drive.id}/items/${folder.id}?expand=sharepointids
4- Update the item in the Document Library so that it updates to the desired Document Set:
PATCH https://graph.microsoft.com/v1.0/sites/${siteId}/lists/${listId}/items/${sharepointIds.listItemId}
I do send the following body to the patch request:
{
"contentType": {
"id": "content-type-id-of-the-document-set"
},
"fields": {}
}
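The four steps above can be summarised as a small request builder. This is a sketch only: it performs no HTTP calls or authentication, the function name `document_set_requests` is an assumption, and in practice `drive_id`, `folder_id` and `list_item_id` come from the responses of steps 1 to 3. The optional `parent_item_id` branch is not from the steps above. It assumes Graph's driveItem children endpoint for creating the folder under an existing parent, and whether the content-type PATCH then behaves the same for a nested folder is untested here.

```python
GRAPH = "https://graph.microsoft.com/v1.0"


def document_set_requests(site_id, list_id, drive_id, folder_name,
                          folder_id, list_item_id, content_type_id,
                          parent_item_id=None):
    """Return the four calls as (method, url, body) tuples, in order."""
    if parent_item_id is None:
        # As in step 2: create the folder at the library root.
        children_url = f"{GRAPH}/drives/{drive_id}/root/children"
    else:
        # Assumption: create the folder under an existing parent item.
        children_url = f"{GRAPH}/drives/{drive_id}/items/{parent_item_id}/children"

    return [
        # 1. Retrieve the document library's drive id.
        ("GET", f"{GRAPH}/sites/{site_id}/lists/{list_id}?$expand=drive", None),
        # 2. Create the folder.
        ("POST", children_url, {"name": folder_name, "folder": {}}),
        # 3. Get the folder's SharePoint item id (query string as in step 3).
        ("GET",
         f"{GRAPH}/sites/{site_id}/drives/{drive_id}/items/{folder_id}"
         "?expand=sharepointids",
         None),
        # 4. Switch the list item's content type to the Document Set.
        ("PATCH",
         f"{GRAPH}/sites/{site_id}/lists/{list_id}/items/{list_item_id}",
         {"contentType": {"id": content_type_id}, "fields": {}}),
    ]


# All ids below are placeholders for illustration.
requests_at_root = document_set_requests(
    "SITE", "LIST", "DRIVE", "billing", "FOLDER", "ITEM", "CTYPE")
requests_nested = document_set_requests(
    "SITE", "LIST", "DRIVE", "billing", "FOLDER", "ITEM", "CTYPE",
    parent_item_id="PARENT")
```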
I'm now looking for a way to create a document set in a specific folder in SharePoint.
For example, I want to create the following folder structure.
The documents folder is at the library root, and I want to create a document set named billing.
documents
|_ billing
|_ 2021
|_11
|_01
|_ document1.pdf
|_ document2.pdf
|_ document3.pdf
|_02
...
|_03
...
|_04
|_10
...
thanks
I'm doing something similar but I'm a little behind you, haven't yet created the Document Set (almost there!), but may I respectfully challenge your approach?
I'm not sure it's a good idea to mix Folders and Document Sets, mainly because a Folder breaks the metadata flow, you could achieve the same results just using Document Sets.
I am assuming that your 'day' in the data structure above is your Document Set (containing document1.pdf, etc.) You might want to consider creating a 'Billing' Document Set and either specifically add a Date field to the Document Set metadata or, perhaps better still, just use the standard Created On metadata and then create Views suitably filtered/grouped/sorted on that date.
This way you can also create filtered views for 'Client' or 'Invoice' or 'Financial Year' or whatever.
As soon as your documents live in a folder, you can no longer filter, sort or group the document library based on metadata.
FURTHER INFORMATION
I am personally structuring my Sales document library thus:
Name: Opportunity; Content Type: Document Set; Metadata: Client Name, Client Address, Client Contact
Name: Proposal; Content Type: Document; Metadata: Proposal ID, Version
Name: Quote; Content Type: Document; Metadata: Quote ID, Version
Etc...
This way the basic SharePoint view is a list of Opportunities (Document Sets), inside which are Proposals, Quotes etc., but I can also filter the view to just show Proposals (i.e. filter by Content Type), or search for a specific Proposal ID, or group by Client Name, then sort chronologically, or by Proposal ID etc.
I'm just saying that you get a lot more flexibility if you avoid using Folders entirely.
p.s. I've been researching for days now how to create Document Sets with graph, it never occurred to me that it might be a two-step process i.e. create the folder, then patch its content type. Many thanks for your post!!
Just re-read your post and my assumption that the 'day' would be your document set was incorrect. In this case, there would be no benefit having a Document Set containing Folders because the moment a Folder exists in the Document Set, metadata flow stops, and the only reason (well, the main reason*) to use Document Sets in preference to Folders is that metadata flow.
*Document Sets also allow you to automatically create a set of documents based on defined templates.

Using the R googledrive package to create a custom property

According to the documentation for drive_mkdir() in the googledrive package, the ... argument takes "Named parameters to pass along to the Drive API. Has the tidy dots semantics that come from using rlang::list2(). You can affect the metadata of the target file by specifying properties of the Files resource via .... Read the "Request body" section of the Drive API docs for the associated endpoint to learn about relevant parameters." The Google Drive API documentation describes custom properties as follows: "Custom file properties are key/value pairs used to store custom metadata for a file, such as tags, IDs from other data stores, information shared between workflow applications, and so on.
To add properties visible to all apps, use the properties field of files resource."
In the Google API Files Page, it lists it as
"properties": {
(key): string}
I've tried a number of different ways to pass a value, but nothing seems to work. Does anyone have an example of adding a custom property while creating a folder?
Here is one example I've tried that does not work:
GDriveTarget <- "https://drive.google.com/drive/u/1/folders/'etc'"
drID <- drive_get(GDriveTarget, verbose = TRUE)
gProps <-list2(properties = c("Region","Far Northeast"))
curFolder <- drive_mkdir(name="School Folders",
path=GDriveTarget,
gProps)
this results in:
Error: These parameters are unknown:
Backtrace:
googledrive::drive_mkdir(...)
googledrive::drive_create(...)
googledrive::request_generate(...)
gargle::request_develop(endpoint = ept, params = params)
gargle:::check_params(params, endpoint$parameters)
gargle:::stop_bad_params(unknown, reason = "unknown")
Removing "gProps" creates the desired folder, so I have the proper rights. I'm just unsure how to pass the parameters Google is expecting. When I use the "try this API" tool on Google Developer, what it is expecting is:
{
"properties": {
"Region": "Far Northeast"
}
}

Aggregating a list of http log paths in kibana

I have an nginx->fluentd->elasticsearch->kibana stack up and running. I'm trying to figure out whether I can build something like a "terms" panel, but on the path string from the logs. Using a terms panel directly on that field shows the most-used words from the paths; e.g. for Drupal it shows "node" as the most popular, which is quite useless without the actual node ID.
Is that something that is possible to do with elasticsearch?
Update: Here's a sample of my logs:
"path": "/node/123"
"path": "/node/456"
"path": "/user/create"
If I add a "terms" panel for "path" field, I get columns for "node", "user", "create", which make no statistical sense. What I need is a terms panel that aggregates on unique field values, not unique word parts of the field.
You need to configure Elasticsearch's mapping so that your "path" field is "not_analyzed". The default setting is "analyzed": by default, ES parses string fields and divides them into multiple tokens when possible, which is probably what happened in your case. See this related question.
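The effect of analysis can be illustrated without Elasticsearch. The sketch below counts the sample paths from the question twice, once split into tokens and once as whole values. The regular expression is only a rough stand-in for the default analyzer, not its actual implementation.

```python
import re
from collections import Counter

paths = ["/node/123", "/node/456", "/user/create"]

# "Analyzed": each value is split into word-like tokens before counting.
token_counts = Counter(
    token for path in paths for token in re.findall(r"\w+", path.lower())
)

# "Not analyzed": each value is counted as a single, unsplit term.
value_counts = Counter(paths)

top_token, top_token_hits = token_counts.most_common(1)[0]
```

With tokens, "node" comes out on top because it appears in two different paths. With whole values, each path is its own term, which is what the terms panel needs.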
As for how to configure Elasticsearch's mapping, I am still digging too, as I have a similar problem with multi-token strings I want to be able to sort on. It seems there is a put mapping API, or the possibility of using config files; see here.
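As a starting point, here is a sketch of what such a mapping body might look like, built as a plain Python dictionary. It assumes the older mapping syntax in which string fields take "index": "not_analyzed"; newer Elasticsearch releases replaced this with a separate keyword field type. The document type name `nginx_access` and the helper name are hypothetical, and nothing is sent to a server.

```python
import json


def not_analyzed_mapping(doc_type, field):
    """Build an index-creation body that keeps `field` as one unsplit term."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    # Legacy string mapping: store the value without analysis.
                    field: {"type": "string", "index": "not_analyzed"},
                }
            }
        }
    }


body = not_analyzed_mapping("nginx_access", "path")
print(json.dumps(body, indent=2))
```

A mapping change only affects how new documents are indexed, so already indexed log entries keep their tokenized terms until they are reindexed.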

Resources