SQL Server PolyBase | Cosmos Document DB date conversion issue

I'm new to PolyBase. I have linked my SQL Server 2019 instance to a third party's Azure Cosmos DB and I am able to query data out of my collection. I get an error when I try to query date fields, though. In the documents the dates are defined as:
"created" : {
"$date" : 1579540834768
},
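For reference, that $date value is a Unix epoch timestamp in milliseconds. A quick T-SQL sanity check of the sample value (split into seconds and milliseconds, since DATEADD takes an int):
SELECT DATEADD(MILLISECOND, 1579540834768 % 1000,
       DATEADD(SECOND, 1579540834768 / 1000, CAST('1970-01-01' AS DATETIME2(3)))) AS created_utc;
-- ≈ 2020-01-20 17:20:34.768 UTC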
In my external table I have the column defined as:
[created] DATE,
I have tried to create the column as int and nvarchar(128), but the schema detection rejects it each time. (I have also tried to create a field created_date, but the schema detection disagrees that this is correct.)
When I run a query that returns any of the date fields, I get this error:
Msg 105082, Level 16, State 1, Line 8
105082;Generic ODBC error: [Microsoft][Support] (40460) Fractional data truncated while performing conversion. .
OLE DB provider "MSOLEDBSQL" for linked server "(null)" returned message "Unspecified error".
Msg 7421, Level 16, State 2, Line 8
Cannot fetch the rowset from OLE DB provider "MSOLEDBSQL" for linked server "(null)". .
This happens even if I exclude null values in my query, and even when filtering to specific records where the date is populated (validated using the Azure portal interface).
Is there something I should be doing to handle the integer date from the JSON records, or another type I can use to get my external table to work?

Found a solution. SQL Server recommends the wrong type for MongoDB dates in the schema. Using DATETIME2 resolved the issue. I found this on a PolyBase type-mapping page on MSDN.
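For reference, a minimal sketch of the corrected mapping (the table, collection, and data source names are placeholders, not from the original setup):
CREATE EXTERNAL TABLE dbo.MyCollection
(
    [_id] NVARCHAR(24),
    [created] DATETIME2(3)  -- $date is epoch milliseconds; maps to DATETIME2, not DATE
)
WITH (
    LOCATION = 'myDatabase.myCollection',  -- placeholder database.collection
    DATA_SOURCE = CosmosDbSource           -- placeholder external data source
);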

Related

Are Sybase datetime fields supported in PolyBase external tables?

I'm working with PolyBase on SQL Server 2022. I have some tables with type "datetime NULL" on Sybase ASE 16.
When declaring an external table, e.g.
CREATE EXTERNAL TABLE SYBASESCHEMA.SomeTable
(
    [SomeNiceTime] DATETIME NULL
)
WITH (LOCATION = N'SomeNiceDatabase.dbo.SomeTable', DATA_SOURCE = SYBASE_DS);
I receive an error message like the following one:
105083;The following columns in the user defined schema are incompatible with the external table schema for table 'SomeTable': 'SomeNiceTime' failed to be reflected with the error: 'The detected ODBC SQL_TYPE 11 is not supported for external generic tables.'
Does anyone know how this could be resolved?
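Given the DATETIME2 fix in the Cosmos question above, one hedged variant to try (untested against Sybase ASE) is declaring the column as DATETIME2:
CREATE EXTERNAL TABLE SYBASESCHEMA.SomeTable
(
    [SomeNiceTime] DATETIME2(3) NULL  -- guess: mirror the DATETIME2 mapping that fixed the Cosmos case
)
WITH (LOCATION = N'SomeNiceDatabase.dbo.SomeTable', DATA_SOURCE = SYBASE_DS);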

Upsert Cosmos item TTL using Azure Data Factory Copy Activity

I have a requirement to upsert data from a REST API to Cosmos DB and also to maintain an item-level TTL for a particular time interval.
I used an ADF Copy activity to copy the data, and for the TTL I added an additional custom column on the source side with a hardcoded value of 30.
I noticed that the time interval (in seconds) is being written as a string instead of an integer, so the copy fails with the error below.
Details
Failure happened on 'Sink' side. ErrorCode=UserErrorDocumentDBWriteError,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Documents failed to import due to invalid documents which violate some of Cosmos DB constraints: 1) Document size shouldn't exceeds 2MB; 2) Document's 'id' property must be string if any, and must not include the following charaters: '/', '', '?', '#'; 3) Document's 'ttl' property must not be non-digital type if any.,Source=Microsoft.DataTransfer.DocumentDbManagement,'
[Screenshot: ttl mapping from the custom column to the Cosmos DB sink]
When I use ttl1 instead of ttl, the copy succeeds but the value is stored as a string.
Any suggestions, please?
Yes, that's the issue with additional columns in the Copy activity. Even if you set the type to int, it will change to string at the source.
A possible workaround is to create a Cosmos DB trigger in an Azure Function and set the 'ttl' property there.

How to specify CosmosDb Synapse Link types when parquet type is incorrect?

I have a CosmosDb and a Synapse workspace linked. Everything almost works when using Synapse to create SQL views over the Cosmos data.
In Cosmos I have one data set with a property that is always a zero. I know it is actually a decimal because it is a price and future data is likely to contain decimal prices.
In Synapse I need to project this data into an SQL view where that column is correctly a decimal(19,4).
When I run an OPENROWSET query against the Cosmos data and attempt to specify the type for this property:
select *
from OPENROWSET(
    'CosmosDb',
    'account=myaccount;database=myDatabase;region=theRegion;key=xxxxxxxxxxxxxxx',
    [myCollection])
with (
    [salesPrice] float '$.salesPrice')
as testQuery
I get the error:
Column 'salesPrice' of type 'FLOAT' is not compatible with external data type 'Parquet physical type: INT64', please try with 'BIGINT'.
Obviously a BIGINT here is going to fail as soon as I get a true decimal price.
I think the parquet type is being inferred as INT64 because in Cosmos all the values for this column are zero. More generally, I guess it would be the same problem if the Cosmos property contained only integers.
How can I force the type of salesPrice to be a decimal or float?
(I don't want to get side tracked here on float vs decimal for monetary values, I understand the difference; this error happens either way)
UPDATE
This problem also manifests itself in another way, without specifying a schema with OPENROWSET.
In a new CosmosDb collection, insert a document such as:
{
    "myid" : 1,
    "price" : 0
}
If I wait a minute or so I can query this document from Synapse with:
select *
from OPENROWSET(
    'myCosmosDb',
    'account=myAccount;database=myDatabase;region=myRegion;key=xxxxxxxxxxxxxxxxxxx',
    [myCollection])
as testQuery;
and I get the expected results.
Now add a second document:
{
    "myid" : 1,
    "price" : 1.1
}
and re-run the query and I get the same error:
Column 'price' of type 'FLOAT' is not compatible with external data type 'Parquet physical type: INT64', please try with 'BIGINT'
Is there any way to work around or prevent these kinds of errors?
How about setting the document up so the values are stored as strings?
{
    "myid" : "1",
    "price" : "1.1"
}
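If the values are stored as strings like this, a sketch of reading them back with an explicit cast (same placeholder connection string as the question):
select try_cast(salesPrice as decimal(19,4)) as salesPrice
from OPENROWSET(
    'CosmosDb',
    'account=myaccount;database=myDatabase;region=theRegion;key=xxxxxxxxxxxxxxx',
    [myCollection])
with (
    [salesPrice] varchar(32) '$.salesPrice'  -- read as text to dodge the INT64 inference, then cast
) as testQuery;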

Error while executing sql query on database

I'm trying to read BLOB data from the messages table of the WhatsApp database called msgstore; I want to get the file names that were transferred. The BLOB data is stored in the thumb_image column.
I found this query here:
SELECT dbms_lob.substr(thumb_image)
FROM messages
WHERE _id = 3;
I also tried this query from here:
SELECT utl_raw.cast_to_varchar2(dbms_lob.substr(thumb_image))
FROM messages
WHERE _id = '3';
In both cases I keep getting an error that says:
Error while executing sql query on database 'msgstore': near "(" syntax error.
I don't understand what's wrong with the query.
How can I fix this?
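For what it's worth, dbms_lob and utl_raw are Oracle packages, while WhatsApp's msgstore is a SQLite database, which is why the parser fails near "(". A SQLite-native sketch (assuming thumb_image holds text; the column contents may well be binary):
SELECT substr(CAST(thumb_image AS TEXT), 1, 200)  -- SQLite built-ins; no Oracle packages
FROM messages
WHERE _id = 3;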

OLE DB provider for linked server returned data that does not match expected data length

I get an error querying a remote PostgreSQL server from my SQL Server 2017 Standard via a linked server.
This is the query:
SELECT CAST(test AS VARCHAR(MAX)) FROM OpenQuery(xxxx,
'SELECT corpo::TEXT as test From public.notification')
and this is the error message:
Msg 7347, Level 16, State 1, Line 57
OLE DB provider 'MSDASQL' for linked server 'xxx' returned data that does not match expected data length for
column '[MSDASQL].test'. The (maximum) expected data length is 1024, while the returned data length is 7774.
Even without the conversion, the error persists.
For the ODBC driver and linked server setup I followed this handy guide.
In my case, I was reading the data through a view. Apparently, the data size of one column was changed in the underlying table, but the view still reported the original smaller size of the column to the linked server. The solution was to open the view in SSMS and save it again.
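If the stale view lives on the SQL Server side, a sketch of the same refresh without the SSMS round trip (the view name here is a placeholder):
-- Re-derives the view's column metadata after the underlying table changed
EXEC sys.sp_refreshview N'dbo.MyNotificationView';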
Can you try this?
SELECT *
FROM OPENQUERY(xxxx, '
    SELECT TRIM(corpo) AS test
    FROM public.notification;
') AS oq
I prefer using OPENQUERY since it will send the exact query to the linked server for it to execute.
MySQL currently has problems with casting to the VARCHAR data type, so I use the TRIM() function to work around it.
