If a module should output a file with binary data, what should its interface look like so that the file can be mapped correctly in other modules in a scenario?
The interface should contain the file data and the file name:
[
    {
        "name": "data",
        "type": "buffer",
        "label": "Data",
        "semantic": "file:data"
    },
    {
        "name": "fileName",
        "type": "text",
        "label": "File Name",
        "semantic": "file:name"
    }
]
You can take a look at the Dropbox example: https://www.integromat.com/app/dropbox/5/module/getFile#tab:interface
I have a Parquet file and I am trying to get its data into a table. One column contains JSON with multiple values. Can someone help me with how to do this in Kusto?
Pasting the JSON's schema:
{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "object",
    "properties": {
        "path": {
            "type": "string"
        },
        "partitionValues": {
            "type": "object",
            "properties": {
                "deviceId": {
                    "type": "string"
                },
                "date": {
                    "type": "string"
                }
            },
            "required": [
                "deviceId",
                "date"
            ]
        },
        "size": {
            "type": "integer"
        },
        "modificationTime": {
            "type": "integer"
        },
        "dataChange": {
            "type": "boolean"
        },
        "stats": {
            "type": "string"
        }
    },
    "required": [
        "path",
        "partitionValues",
        "size",
        "modificationTime",
        "dataChange",
        "stats"
    ]
}
If I understood correctly, your Parquet file contains a column with JSON of the specified schema. If you want to ingest it as-is, ingest it into a Kusto column of type "dynamic" and query it later. If you'd like to ingest just part of this JSON data (such as some inner fields), use an ingestion mapping and provide the appropriate JSON path.
Another option is to ingest as-is into a source table that has a retention policy with a SoftDeletePeriod of zero. Define an update policy with a KQL query, and the transformed data will be pushed into a target table.
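To make that concrete, here is a minimal KQL sketch; the source table RawFiles, its dynamic column record, the target table FilesParsed, and the function ExpandFiles() are all assumptions, not names from your setup:
// Option 1: keep the JSON column as dynamic and extract fields at query time.
RawFiles
| extend path = tostring(record.path),
         deviceId = tostring(record.partitionValues.deviceId),
         size = tolong(record.size),
         dataChange = tobool(record.dataChange)

// Option 2: an update policy that pushes transformed rows into a target
// table; ExpandFiles() would be a stored function wrapping the query above.
.alter table FilesParsed policy update
@'[{"IsEnabled": true, "Source": "RawFiles", "Query": "ExpandFiles()", "IsTransactional": true}]'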
I want to have access to the definitions in the schema in order to get the name of each definition. I am using Newtonsoft.Json v11.0.1.
I am building a C# converter for JSON Schema that builds a syntax tree and compiles it, in order to get a typed version of the object at runtime.
{
    "$id": "https://example.com/arrays.schema.json",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "description": "xml remarks",
    "type": "object",
    "properties": {
        "fruits": {
            "type": "array",
            "items": {
                "type": "object",
                "title": "fruit",
                "required": ["naam"],
                "properties": {
                    "naam": {
                        "type": "string",
                        "description": "The name of the fruit."
                    }
                }
            }
        },
        "vegetables": {
            "type": "array",
            "items": { "$ref": "#/definitions/veggie" }
        }
    },
    "definitions": {
        "veggie": {
            "type": "object",
            "required": [ "veggieName", "veggieLike" ],
            "properties": {
                "veggieName": {
                    "type": "string",
                    "description": "The name of the vegetable."
                },
                "veggieLike": {
                    "type": "boolean",
                    "description": "Do I like this vegetable?"
                }
            }
        }
    }
}
In the schema, a definition is created with the name veggie; it is used in the vegetables property via a reference.
The JSON schema contains the definitions on the root object, but not on the property element. On the property element there is nothing identifiable that points to the right definition.
How do I find the right definition for the property?
In general, in order to resolve a json-pointer (a $ref is a "uri-reference", and the part after the # is a "json-pointer"), you need to have access to the root of the json document.
So if you currently have a function that only gets an argument pointing to the "properties" section, you need to give that function a second argument that points to the root of the document.
(It gets more complicated when you're using a schema made up of more than one file; then you need a second argument that points to the roots of all the schema documents.)
It's one of the more difficult parts of writing software that interprets json-schema files, especially if your language/library doesn't have built-in support for json-pointers - you'll need to write the resolution yourself in that case.
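As an illustration of that traversal, here is a minimal sketch in Python rather than C# (the algorithm is the same in any language, and with Newtonsoft.Json you would do the equivalent walk over a JObject); the file name arrays.schema.json is an assumption:
import json

def resolve_pointer(root, ref):
    """Resolve a same-document $ref such as '#/definitions/veggie'."""
    if not ref.startswith('#'):
        raise ValueError('only same-document references are handled here')
    node = root
    for token in ref[1:].lstrip('/').split('/'):
        if token == '':
            continue  # a bare '#' points at the root itself
        # Per RFC 6901, '~1' encodes '/' and '~0' encodes '~'.
        token = token.replace('~1', '/').replace('~0', '~')
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node

with open('arrays.schema.json') as f:  # the schema from the question
    schema = json.load(f)

ref = schema['properties']['vegetables']['items']['$ref']
veggie = resolve_pointer(schema, ref)
print(veggie['required'])  # -> ['veggieName', 'veggieLike']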
I am working on a cognitive skillset in Azure Search, and I need to add Text Translator functionality to the project. Currently all of the text is translated correctly unless the character count is above the maximum value, in which case the API returns null.
I am currently using /document/content as my input for the text translator, but I've also tried using the output of the predefined Split Skill. When I pass the files through the Split Skill (by page) before the translator skill, the indexer breaks and all of the files fail to index (even the ones that don't need to be translated).
This code, taken from my skillset.json, translates all of the files below the character-limit cutoff:
{
    "#odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "description": "Our new translator custom skill",
    "uri": "[my-uri]",
    "batchSize": 1,
    "context": "/document",
    "inputs": [
        {
            "name": "text",
            "source": "/document/content"
        },
        {
            "name": "language",
            "source": "/document/languageCode"
        }
    ],
    "outputs": [
        {
            "name": "text",
            "targetName": "translatedText"
        }
    ]
}
This is my attempt to split the text by page before passing it through the translator API. This results in a 501 error:
{
    "#odata.type": "#Microsoft.Skills.Text.SplitSkill",
    "textSplitMode": "pages",
    "maximumPageLength": 4000,
    "inputs": [
        {
            "name": "text",
            "source": "/document/content"
        },
        {
            "name": "languageCode",
            "source": "/document/languageCode"
        }
    ],
    "outputs": [
        {
            "name": "textItems",
            "targetName": "pages"
        }
    ]
},
{
    "#odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "description": "Our new translator custom skill",
    "uri": "[my-uri]",
    "batchSize": 1,
    "context": "/document/pages/*",
    "inputs": [
        {
            "name": "text",
            "source": "/document/pages/*"
        },
        {
            "name": "language",
            "source": "/document/languageCode"
        }
    ],
    "outputs": [
        {
            "name": "text",
            "targetName": "translatedText"
        }
    ]
}
I use the exact same implementation for the Named Entity Recognition Skill (using document/pages/* as input), and it works fine. I'm not sure what the difference would be with the Text Translator skill.
I am trying to automate my project creation process, and as part of it I would like to create a new Jupyter notebook and populate it with some cells and content that I usually have in every notebook (e.g., imports, titles, etc.).
Is it possible to do this via Python?
You can do this using nbformat. Below is an example taken from Creating an IPython Notebook programmatically:
import nbformat as nbf

nb = nbf.v4.new_notebook()

text = """\
# My first automatic Jupyter Notebook
This is an auto-generated notebook."""

code = """\
%pylab inline
hist(normal(size=2000), bins=50);"""

nb['cells'] = [nbf.v4.new_markdown_cell(text),
               nbf.v4.new_code_cell(code)]

fname = 'test.ipynb'
with open(fname, 'w') as f:
    nbf.write(nb, f)
This is absolutely possible. Notebooks are just JSON files. This notebook, for example, is just:
{
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Header 1"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 2,
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2016-09-16T16:28:53.333738",
                    "start_time": "2016-09-16T16:28:53.330843"
                },
                "collapsed": false
            },
            "outputs": [],
            "source": [
                "def foo(bar):\n",
                "    # Standard functions I want to define.\n",
                "    pass"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## Header 2"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "collapsed": true
            },
            "outputs": [],
            "source": []
        }
    ],
    "metadata": {
        "kernelspec": {
            "display_name": "Python 2",
            "language": "python",
            "name": "python2"
        },
        "language_info": {
            "codemirror_mode": {
                "name": "ipython",
                "version": 2
            },
            "file_extension": ".py",
            "mimetype": "text/x-python",
            "name": "python",
            "nbconvert_exporter": "python",
            "pygments_lexer": "ipython2",
            "version": "2.7.10"
        },
        "toc": {
            "toc_cell": false,
            "toc_number_sections": true,
            "toc_threshold": 6,
            "toc_window_display": false
        }
    },
    "nbformat": 4,
    "nbformat_minor": 0
}
While messy, it's just a list of cell objects. I would probably create my template in an actual notebook and save it, rather than trying to generate the initial template by hand. If you want to add titles or other variables programmatically, you could always copy the raw notebook text from the *.ipynb file into a Python file and insert values using string formatting, as in the sketch below.
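A minimal sketch of that approach, assuming a hypothetical template.ipynb whose markdown cells contain a literal <<TITLE>> marker:
import json

# 'template.ipynb' is a notebook saved ahead of time with a <<TITLE>>
# placeholder embedded in its markdown cells.
with open('template.ipynb') as f:
    raw = f.read()

# Plain string replacement avoids clashing with the many literal braces
# in notebook JSON (str.format would trip over them).
filled = raw.replace('<<TITLE>>', 'My Project Analysis')

json.loads(filled)  # sanity-check that the result is still valid JSON
with open('my_project.ipynb', 'w') as f:
    f.write(filled)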
I've got a master ARM deployment file with these resources:
{
    "apiVersion": "2015-01-01",
    "name": "SharedServicePlanTemplate",
    "type": "Microsoft.Resources/deployments",
    "properties": {
        "templateLink": { "uri": "[concat(variables('templateBase'), 'serviceplan.template.json')]" },
        "parametersLink": { "uri": "[concat(variables('parametersBase'), 'serviceplan.shared.json')]" },
        "mode": "Incremental"
    }
},
{
    "name": "my_website",
    "type": "Microsoft.Web/sites",
    "location": "[resourceGroup().location]",
    "apiVersion": "2015-08-01",
    "dependsOn": [
        "[resourceId('Microsoft.Web/serverfarms', 'ServicePlanShared')]"
    ],
    "tags": {
        "[concat('hidden-related:', resourceGroup().id, '/providers/Microsoft.Web/serverfarms/', 'ServicePlanShared')]": "Resource",
        "displayName": "my_website"
    },
    "properties": {
        "name": "my_website",
        "serverFarmId": "[resourceId('Microsoft.Web/serverfarms', 'ServicePlanShared')]"
    }
}
When I try to deploy, I get the following error:
New-AzureRmResourceGroupDeployment : InvalidTemplate: Deployment template validation failed: 'The resource
'Microsoft.Web/serverfarms/ServicePlanShared' is not defined in the template.
I thought that was the whole reason for using the resourceId function, though. I can merge my serviceplan.template.json and the website resource into the same template file, but I'd rather not do that since I will have multiple websites using that plan, and I want to be able to deploy them separately.
Change your dependsOn property to:
"dependsOn" : ["SharedServicePlanTemplate"]
One gotcha with your nested approach: if the name of your service plan changes in the linked parameters file, the resource won't be found. Passing the plan name in as a parameter (whether you use the linked parameters file or pass it through) might be a better approach, as in the sketch below. A bit orthogonal, but something to think about.
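As a sketch of that alternative (the parameter name servicePlanName is an assumption, and the "..." elides the rest of the template):
"parameters": {
    "servicePlanName": {
        "type": "string",
        "defaultValue": "ServicePlanShared"
    }
},
...
{
    "name": "my_website",
    "type": "Microsoft.Web/sites",
    "dependsOn": [ "SharedServicePlanTemplate" ],
    "properties": {
        "name": "my_website",
        "serverFarmId": "[resourceId('Microsoft.Web/serverfarms', parameters('servicePlanName'))]"
    }
}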