I'm trying to use Athena to query some files that are in Ion format produced by the recently added Export To S3 feature of DynamoDB backups.
This is a blatantly stupid format which is basically the string $ion_1_0 followed by json. The unquoted $ion_1_0 string at the front makes the data invalid json.
I tried using the Ion Serde from here:
CREATE EXTERNAL TABLE mydb.mytable (
`myfields` string,
...
)
ROW FORMAT SERDE 'com.amazon.ionhiveserde.IonHiveSerDe'
LOCATION 's3:/.../dynamodb-export/AWSDynamoDB/01608775578817-a6944d97/data/'
TBLPROPERTIES ('has_encrypted_data'='true');
But got this:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: com.amazon.ionhiveserde.IonHiveSerDe
UPDATE
Actually the format is even a little worse than I thought. The field names are not quoted. So it's not quite valid json even after stripping the $ion prefix.
Ion is an open-source data format whose text encoding is a superset of JSON (it also has a binary encoding). Have you tried converting your Ion file(s) with AWS Glue? Ion is one of the supported input formats: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format.html
This QLDB workshop uses Ion in its example. You could explore the CloudFormation template (YAML), or deploy the workflow and dig into the crawler and job it creates, for some ideas: https://qldb-immersionday.workshop.aws/en/lab3/task3.html
Check out the Ion cookbook for some additional information: https://amzn.github.io/ion-docs/guides/cookbook.html
And the specs: https://amzn.github.io/ion-docs/docs/spec.html
I'm trying to send a quite simple JSON message from BizTalk.
{
"Value": 1
}
I set the type of the "Value" field to xs:int in my xml schema
<xs:element minOccurs="1" maxOccurs="1" name="Value" type="xs:int" />
But it keeps generating the wrong JSON message.
{
"Value": "1"
}
Not sure what I'm doing wrong here. Does anybody have some tips?
There can be multiple issues at work here.
Your XML payload is not going through an XML Disassembler or Assembler before reaching the JSON Encoder, so the Message Type isn't set correctly and the encoder does not use the schema to determine the correct data types.
You have another, earlier element in your schema with the same name but a different type. The encoder will use the type from the first one for all elements with the same name. This was a bug in BizTalk 2013 R2, and I think it was still there in BizTalk 2016. It could cause some strange errors, especially if you had a Record (complex type) and an Element (simple type) with the same name, e.g. FIX: JSON encoder unable to handle XML schema with the same name for record and one of its elements in CU 7
The BizTalk embedded JSON Encoder is able to recognize the schema of the XML it processes, so it should pick up the xs:int type from the schema and encode the value without quotes.
But sometimes (often, according to Sandro Pereira) this encoder throws the error:
Reason: Value cannot be null.
Parameter name: key
That was my case. I had to use a custom JSON Encoder that recognizes XML datatype somehow.
My solution was:
Use this build of the Newtonsoft.Json library, which uses the json:Type attribute as an indicator of the XML tag's data type.
Create a custom JSON Encoder that uses that library.
Create a custom Complex Type in a separate schema with targetNamespace = "http://james.newtonking.com/projects/json".
Note that the Type attribute needs to be Qualified.
Import the above schema into your target schema using the prefix json.
Use this type as the Data Structure Type and rename your parameter.
Map this with a logical existence check, to avoid emitting the attribute without a value, or use a custom XSLT transform.
Useful sites:
Adding name to attribute
I can think of workarounds to get this working; however, I'm interested in finding out whether there's a solution to this specific problem.
I've got a Go program which requires a JSON string argument:
go run main.go "{ \"field\" : \"value\" }"
No problems so far. However, am I able to run from the command line if one of the json values is another json string?
go run main.go "{ \"json-string\" : \"{\"nestedfield\" : \"nestedvalue\"}\" }"
It would seem that adding escape characters incorrectly matches up the opening and closing quotes. Am I misunderstanding how this is done, or is it (and this is the side I'm coming down on) simply not possible?
To reiterate, this is a question that has piqued my curiosity - I'm aware of alternative approaches - I'm hoping for input related to this specific problem.
Why don't you just put your JSON config in a file and pass the config file name to your application using the flag package?
Based on the feedback from wiredeye I went down the argument route instead. I've modified the program to run on:
go run main.go field:value field2:value json-string:"{\"nestedfield\":nestedvalue}"
I can then iterate over os.Args and get the nested JSON within my program. I'm not using flags directly because I don't know the number of inputs to the program, which would have required me to use duplicate flags (not supported) or parse the flag into a collection (doesn't seem to be supported).
Thanks wiredeye
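For what it's worth, the nested form in the original question does appear to be possible. The inner JSON just has to be escaped twice: once as a JSON string value, and once more for the shell. Below is a minimal sketch in Python (not Go) that builds such a command line; the field names come from the question, and everything else (variable names, the use of shlex for POSIX-style quoting) is an assumption for illustration.

```python
import json
import shlex

# Payload mirroring the field names used in the question.
inner = json.dumps({"nestedfield": "nestedvalue"})  # inner document as a JSON string
outer = json.dumps({"json-string": inner})          # json.dumps backslash-escapes the inner quotes

# shlex.quote wraps the argument so a POSIX shell passes it through unchanged.
command = "go run main.go " + shlex.quote(outer)
print(command)
```

If you prefer to stay inside double quotes in a POSIX shell, each inner quote would need to be written as \\\" (an escaped backslash followed by an escaped quote), which is why the attempt with a single level of \" ends the string early.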
I need to send a Highcharts options object to an ASP page so it can be written to a JSON flat file. These files are later passed to PhantomJS via highcharts-convert in order to create some PDFs.
The problem however is stringifying the objects. I keep getting this error:
Uncaught TypeError: Converting circular structure to JSON
when I try this:
$.post("myASP.asp", JSON.stringify(myChart.highcharts().options));
There is a sample POST string here http://docs.highcharts.com/#render-charts-on-the-server but I'm not sure how to achieve that with mine. When I paste their sample into my code for testing I get all kinds of unescaped double quote errors. Is that a typo on their part?
I would check whether there are circular references in the objects you are serializing. As far as I remember, those are not supported by the JSON serializer.
One example of this is an object with an array of children that refer back to the parent.
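To make the parent/child case concrete, here is a small sketch. It is written in Python rather than JavaScript, so it only illustrates the idea and is not the Highcharts object itself; the object names are made up. Python's json module rejects the cycle in much the same way JSON.stringify does, and serializing a copy without the back-reference works.

```python
import json

# Hypothetical objects: a child that points back at its parent creates a cycle.
parent = {"name": "chart", "children": []}
child = {"name": "series", "parent": parent}
parent["children"].append(child)

try:
    json.dumps(parent)
    error = None
except ValueError as exc:  # Python reports the cycle as a ValueError
    error = str(exc)

# One way out: serialize a copy that omits the back-reference.
safe = {
    "name": parent["name"],
    "children": [{"name": c["name"]} for c in parent["children"]],
}
text = json.dumps(safe)
print(error)
print(text)
```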
I think you can try the following:
{"infile":myChart.getSVG()}
This should get the svg representation of the chart
Hexadecimal entities are converted into strings while loading XML data into MarkLogic Server. Please suggest the right solution for this, without using an XSLT character map.
There is no way to do this automatically.
In XQuery, in order to have the hexadecimal form show up you need to write &# followed by x, like this:
"&#xA9;"
You could write a function that looks at the content you're loading in, finds all the hexadecimal entities already in there, and converts them to this format.
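As a rough idea of what such a function could look like, here is a sketch in Python rather than XQuery. It assumes the content has already been decoded to text and that every non-ASCII character should be written back as a hexadecimal character reference; the function name and sample string are made up for illustration.

```python
def to_hex_entities(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal character reference."""
    return "".join(
        ch if ord(ch) < 128 else "&#x{:X};".format(ord(ch))
        for ch in text
    )


sample = "Copyright \u00a9 2020"
converted = to_hex_entities(sample)
print(converted)
```

ASCII characters pass through untouched, so markup in the content is not affected by this step.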
I am working on a Windows application in C#. I want to read a CSV file from a directory and import it into a SQL Server database table. I can read and import the file successfully when its content is uniform, but I am unable to insert the data when its form varies. For example, my file's delimiter is a tab ('\t'), and after splitting out the individual fields I have a field that contains data like dcc
Name
----
xxx
xxx yyy
xx yy zz
and I retrieved data like xxx,yyy and xx,yy,zz, so the insertion becomes a problem.
How can I insert the data uniformly into a database table?
It's pretty easy.
Just read file line-by-line. Example on MSDN here:
How to: Read Text from a File
For each line use String.Split Method with your tab as delimiter. Method documentation and sample are here:
String.Split Method (Char[], StringSplitOptions)
Then insert your data.
If a CSV (or TSV) value contains a delimiter inside of it, then it should be surrounded by quotes. See the spec for more details: https://www.rfc-editor.org/rfc/rfc4180#page-3
So your input file is incorrectly formatted. If you can convince the input provider to fix this issue, that will be the best way to fix the problem. If not, other solutions may include:
visually inspecting and editing the file to fix errors, or
writing your parser program to have enough knowledge of your data expectations that it can correctly "guess" where the real delimiters are.
If I'm understanding you correctly, the problem is that your code is splitting on spaces instead of on tabs. Given you have read in the lines from the file, all you need to do is:
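// "data.tsv" is a placeholder path; substitute your own file.
string[] fileLines = System.IO.File.ReadAllLines("data.tsv");
foreach (string line in fileLines)
{
    // Split on tabs only, so spaces inside a field are preserved.
    string[] lineParts = line.Split('\t');
}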
and then do whatever you want with each lineParts. The \t is the tab character.
If you're also asking about writing the lines to a database, you can just read in tab-delimited files with the Import and Export Wizard (assuming you're using SQL Server Management Studio, but I'm sure there are comparable ways to import using other database management software).
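To illustrate the whole round trip (split on tabs only, then insert), here is a minimal sketch in Python rather than C#. The sample data, table name, and column names are made up, and sqlite3 stands in for SQL Server only so the example is self-contained.

```python
import csv
import io
import sqlite3

# Hypothetical tab-delimited input: an id column, then a name that may contain spaces.
data = "1\txxx\n2\txxx yyy\n3\txx yy zz\n"

# Only the tab is treated as a delimiter, so spaces stay inside the field.
rows = list(csv.reader(io.StringIO(data), delimiter="\t"))

# sqlite3 is used here in place of SQL Server purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO people (id, name) VALUES (?, ?)", rows)
names = [r[0] for r in conn.execute("SELECT name FROM people ORDER BY id")]
print(names)
```

Each name arrives in the table as a single value, spaces included, which is the uniform result the question asks for.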