Find the start and end of a JSON string and ensure it is complete, from an incoming stream - json.net

I am receiving a stream of JSON strings from a hardware device. It works most of the time, but as the data gets more congested I run into situations like these:
JSON string received is not yet complete:
{ test: 'test1',
Multiple JSON strings received:
{ test: 'test1', valid: true }{ test: 'test2', valid: true }{ test:
In the examples above:
Case 1: I need to wait until the string is complete.
Case 2: I want to extract the two completed JSON strings and parse them separately.
Keep in mind that the above is for illustrative purposes only. In real life the string values might include { or }, the strings might use single or double quotes, and the JSON string is much longer and might contain multiple nested objects.
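One way to picture both cases is a small scanner that tracks brace depth and whether it is currently inside a quoted string. The sketch below is written in JavaScript purely for illustration (it is not Json.NET code), and extractCompleteObjects is a hypothetical helper name. It assumes every message is a top-level { ... } object and that backslash is the escape character inside strings.

```javascript
// Hypothetical helper: splits a growing text buffer into complete top-level
// objects and returns whatever is left over (an incomplete tail).
function extractCompleteObjects(buffer) {
  const complete = [];
  let depth = 0;        // current nesting level of { }
  let quote = null;     // quote character of the string we are inside, if any
  let escaped = false;  // true right after a backslash inside a string
  let start = -1;       // index where the current top-level object began

  for (let i = 0; i < buffer.length; i++) {
    const ch = buffer[i];
    if (quote !== null) {
      // Inside a string: braces do not count, only look for the closing quote.
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === quote) quote = null;
      continue;
    }
    if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === '{') {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === '}' && depth > 0) {
      depth--;
      if (depth === 0) {
        complete.push(buffer.slice(start, i + 1));
        start = -1;
      }
    }
  }
  // Anything from an unclosed "{" onward is kept for the next chunk.
  const rest = start === -1 ? '' : buffer.slice(start);
  return { complete, rest };
}

const { complete, rest } = extractCompleteObjects(
  "{ test: 'test1', valid: true }{ test: 'te}st2', valid: true }{ test:"
);
console.log(complete.length, JSON.stringify(rest));
```

Here complete holds the two finished objects, each of which can be handed to a parser on its own, and rest holds the incomplete tail, which would be prepended to the next chunk that arrives.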

findAll() returns empty with WHERE option

First question on StackOverflow, long time reader first time poster or whatever people say.
I'm developing a Discord bot in my free time using Discord.js, and I'm using Sequelize to interface with a local SQLite database. I can insert data into it just fine-- however, I can't seem to delete any of the records I add. Relevant piece of code is below, which I believe to be self-contradictory:
const query3 = await Towers.findAll({
attributes: ['channelID']
});
console.log(JSON.stringify(query3)); //returns the one Tower
console.log(query3[0].channelID === channel); //returns true(!)
const query2 = await Towers.findAll({
attributes: ['channelID'],
where: {channelID: channel}
});
console.log(JSON.stringify(query2)); //returns empty
//DELETE FROM Towers WHERE channelID = channel;
const query = await Towers.destroy({
where: {channelID: channel}
});
console.log(query); //returns 0, expected behavior given query2 returns empty
I'm attempting to delete a record from a table named Towers by passing a channel ID to it, which is expected to be unique. However, when I make any query on the database with a WHERE clause, the query returns an empty set-- even when, in this example, I sanity-checked and verified that the value I'm attempting to remove is present in the table. This occurs for both findAll() and findOne() as long as a WHERE clause is present.
(For posterity, I've double and triple checked that channelID was spelled correctly and with the correct capitalization in all instances.)
I'm happy to provide any more information if needed!
EDIT: As requested, the model definition...
const Towers = sequelize.define('Towers', {
serverID: {
type: Sequelize.INTEGER,
allowNull: false,
},
channelID: {
type: Sequelize.INTEGER,
unique: true,
allowNull: false,
},
pattern: Sequelize.STRING,
height: Sequelize.INTEGER,
delay: Sequelize.BOOLEAN,
});
channel in the snippet in the original post is defined as parseInt(interaction.options.getChannel('channel').id).
To anyone who happens to have the same issue I did, the answer is a doozy.
I wanted to store Discord server and channel IDs as integers, even though the API returns them as strings. As it turns out, Discord snowflakes are 64-bit integers and can exceed Number.MAX_SAFE_INTEGER (2^53 - 1), the largest integer that a JS number (a float64) can represent exactly. When I parsed the strings into integers to insert them into my table, the value changed from the intended number, and I was creating erroneous records.
In my case (with the actual numbers obfuscated) interaction.options.getChannel('channel').id returned "837512533934092340", while parseInt(interaction.options.getChannel('channel').id) returned 837512533934092300. The number I was adding to the table was somehow 40 less!
I'm not sure if this could be fixed by using BigInt, but since it's going into a different structure anyway, I just shrugged and changed the serverID and channelID types to Sequelize.STRING in the model definition and removed the parseInt calls. Works like a charm now.
Good opportunity to shake my fist at JS though.
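For anyone who wants to see the precision loss in isolation, here is a minimal sketch using the obfuscated ID from above. It only uses built-in JavaScript; whether a BigInt can be passed through Sequelize and SQLite is not something this sketch tests, so keeping the ID as a string remains the simpler route.

```javascript
const id = "837512533934092340";      // snowflake as a string, as the API returns it

const asNumber = parseInt(id, 10);    // rounded to the nearest representable float64
const asBigInt = BigInt(id);          // exact, arbitrary-precision integer

console.log(Number.isSafeInteger(asNumber)); // false: above Number.MAX_SAFE_INTEGER
console.log(String(asNumber) === id);        // false: the low digits were lost
console.log(asBigInt.toString() === id);     // true: round-trips exactly
```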

Spring cloud contract: how to verify an array list (Kotlin based project)

I would like to write a Groovy contract to verify an array list with string values.
Let's say I have an object:
data class MyDataObject(val messageList: List<String>)
My contract is the following:
package contracts
import org.springframework.cloud.contract.spec.Contract
Contract.make {
name("retrieve_list_of_objects")
description("""
given:
you want to have a list of MyObjects
when:
you get the list
then:
you have the list
""")
request {
method 'GET'
url '/10/my-objects'
headers {
contentType(applicationJson())
}
}
response {
status 200
body(
[
messageList: ["23412341324"]
]
)
headers {
contentType(applicationJson())
}
}
}
The problem is that the generated test is translated to:
assertThatJson(parsedJson).array("['messageList']").contains("23412341324").value();
and that results in:
com.jayway.jsonpath.PathNotFoundException: Expected to find an object with property ['messageList'] in path $ but found 'net.minidev.json.JSONArray'. This is not a json object according to the JsonProvider: 'com.jayway.jsonpath.spi.json.JsonSmartJsonProvider'.
The question is: how can I write my contract to create the following test:
assertThatJson(parsedJson).array("['messageList']").contains("23412341324");
I ran your snippet in my project and it generated a test that looks like this (I don't know why my generated test looks different from yours):
MockMvcRequestSpecification request = given()
.header("Accept", "application/json")
.body("{\"messageList\":[\"23412341324\"]}");
If I am reading your question right, you want the body to be a list of MyObjects, and not just one.
I think the problem is that you need to surround MyObject with one more set of square brackets, if indeed you want this to verify a list of MyObjects.
body(
[[
messageList: ["23412341324"]
]]
)
In General
Use SQUARE BRACKETS to make objects (yes, I know that in JSON square brackets are for arrays; it's weird, I didn't invent it).
You can surround field names with quotes or without, they both seem to work.
body([
stringField1: value(regex(".*")),
stringField2: value(regex(alphaNumeric())),
innerObject1: [
innerStringField1: "Hardcoded1",
innerIntegerField1: anyInteger()
]
])
Wait? How do I make JSON lists then if square brackets are for objects?
Double square brackets. Seriously.
body(
[[
stringFieldOfObjectInList: regex(".*")
]]
)
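For readers more used to JSON than to Groovy map literals, the difference between the single and double brackets can be pictured with plain JavaScript values. This is only an illustration of the JSON shapes the two bodies correspond to, not Spring Cloud Contract code, and the variable names are made up for the example.

```javascript
// Groovy [messageList: ["23412341324"]] is a map, i.e. a JSON object.
const singleObject = { messageList: ["23412341324"] };

// Groovy [[messageList: ["23412341324"]]] is a list holding one map,
// i.e. a JSON array with one object in it.
const listOfObjects = [{ messageList: ["23412341324"] }];

console.log(JSON.stringify(singleObject));  // {"messageList":["23412341324"]}
console.log(JSON.stringify(listOfObjects)); // [{"messageList":["23412341324"]}]
```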

Combine Multiple JSON Files Json.NET

I have an API that currently receives JSON calls that I push to files (800KB-1MB, one for each call), and I would like to have an hourly task that takes all of the JSON files from the last hour and combines them into a single file, to make it easier to run daily/monthly analytics on them.
Each file consists of a collection of data, in the format [ object {property: value, ... ]. Because of this, I cannot do simple concatenation, as the result would no longer be valid JSON (and adding a comma would turn the file into a collection of collections). I would like to keep the memory footprint as low as possible, so I was looking at the following example and just pushing each file to the stream (deserializing the file using JsonConvert.DeserializeObject(fileContent)); however, by doing this, I end up with a collection of collections as well. I have also tried using a JArray instead of JsonConvert, pushing to a list outside of the foreach, but that gives the same result. If I move the Serialize call outside the foreach, it does work; however, I am worried about holding 4-6GB worth of items in memory.
In summary, I'm ending up with [ [ object {property: value, ... ],... [ object {property: value, ... ]] where my desired output would be [ object {property: value (file1), ... object {property: value (fileN) ].
using (FileStream fs = File.Open(@"C:\Users\Public\Documents\combined.json", FileMode.CreateNew))
{
using (StreamWriter sw = new StreamWriter(fs))
{
using (JsonWriter jw = new JsonTextWriter(sw))
{
jw.Formatting = Formatting.None;
JArray list = new JArray();
JsonSerializer serializer = new JsonSerializer();
foreach (IListBlobItem blob in blobContainer.ListBlobs(prefix: "SharePointBlobs/"))
{
if (blob.GetType() == typeof(CloudBlockBlob))
{
var blockBlob = (CloudBlockBlob)blob;
var content = blockBlob.DownloadText();
var deserialized = JArray.Parse(content);
//deserialized = JsonConvert.DeserializeObject(content);
list.Merge(deserialized);
serializer.Serialize(jw, list);
}
else
{
Console.WriteLine("Non-Block-Blob: " + blob.StorageUri);
}
}
}
}
}
In this situation, to keep your processing and memory footprints low, I think I would just concatenate the files one after the other even though it results in technically invalid JSON. To deserialize the combined file later, you can take advantage of the SupportMultipleContent setting on the JsonTextReader class and process the object collections through a stream as if they were one whole collection. See this answer for an example of how to do this.
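If a single valid array is still wanted as output, a different approach from the one above is to write the outer brackets yourself and stream the items of one file at a time, so only one file's contents is in memory at any moment. The sketch below shows that idea in JavaScript rather than Json.NET; writeCombinedArray, files, and write are hypothetical names, with files standing in for the blob contents and write for whatever appends text to the output file.

```javascript
// Hypothetical helper: `files` is any iterable of JSON array texts,
// `write` is any function that appends a string to the output.
function writeCombinedArray(files, write) {
  let first = true;
  write('[');
  for (const text of files) {
    const items = JSON.parse(text);   // only this one file is held in memory
    if (!Array.isArray(items)) {
      throw new TypeError('expected each file to contain a JSON array');
    }
    for (const item of items) {
      if (!first) write(',');         // comma between items, never before the first
      write(JSON.stringify(item));
      first = false;
    }
  }
  write(']');
}

// Usage with in-memory strings standing in for the downloaded files:
const chunks = [];
writeCombinedArray(['[{"a":1}]', '[]', '[{"b":2},{"c":3}]'], (s) => chunks.push(s));
console.log(chunks.join('')); // [{"a":1},{"b":2},{"c":3}]
```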

Good way to replace invalid characters in firebase keys?

My use case is saving a user's info. When I try to save data to Firebase using the user's email address as a key, Firebase throws the following error:
Error: Invalid key e@e.ee (cannot contain .$[]#)
So, apparently, I cannot index user info by their email. What is the best practice to replace the .?
I've had success changing the . to a - but that won't cut it since some emails have -s in the address.
Currently, I'm using
var cleanEmail = email.replace('.','`');
but there are likely going to be conflicts down the line with this.
In the email address, replace the dot . with a comma ,. This pattern is best practice.
The comma , is not an allowable character in email addresses but it is allowable in a Firebase key. Symmetrically, the dot . is an allowable character in email addresses but it is not allowable in a Firebase key. So direct substitution will solve your problem. You can index email addresses without looping.
You also have another issue.
const cleanEmail = email.replace('.',','); // only replaces first dot
will only replace the first dot . But email addresses can have multiple dots. To replace all the dots, use a regular expression.
const cleanEmail = email.replace(/\./g, ','); // replaces all dots
Or alternatively, you could also use the split() - join() pattern to replace all dots.
const cleanEmail = email.split('.').join(','); // also replaces all dots
We've dealt with this issue many times and while on the surface it seems like using an email as a key is a simple solution, it leads to a lot of other issues: having to clean/parse the email so it can actually be used. What if the email changes?
We have found that changing the format of how the data is stored is a better path. Suppose you just need to store one thing, the user name.
john@somecompany.com: "John Smith"
changing it to
randomly_generated_node_name
    email: "john@somecompany.com"
    first: "John"
    last: "Smith"
The randomly_generated_node_name is a string that Firebase can generate via childByAutoId, or really any type of reference that is not tied directly to the data.
This offers a lot of flexibility: you can now change the person's last name, say if they get married, or change their email. You could add an 'index' child (0, 1, 2, etc.) that could be used for sorting. The data can be queried for any child data. All because the randomly_generated_node_name is a static reference to the variable child data within the node.
It also allows you to expand the data in the future without altering the existing data. Add address, favorite food, an index for sorting etc.
Edit: a Firebase query for email in ObjC:
//references all of the users ordered by email
FQuery *allUsers = [myUsersRef queryOrderedByChild:@"email"];
//ref the user with this email
FQuery *thisSpecificUser = [allUsers queryEqualToValue:@"john@somecompany.com"];
//load the user with this email
[thisSpecificUser observeEventType:FEventTypeChildAdded withBlock:^(FDataSnapshot *snapshot) {
//do something with this user
}];
I can think of two major ways to solve this issue:
Encode/Decode function
Because of the limited set of characters allowed in a Firebase key, a solution is to transform the key into a valid format (encode). Then have an inverse function (decode) to transform the encoded key back into the original key.
A general encode/decode function might be transforming the original key into bytes, then converting them to a hexadecimal representation. But the size of the key might be an issue.
Let's say you want to store users using the e-mail as key:
# path: /users/{email} is User;
/users/alice@email.com: {
name: "Alice",
email: "alice@email.com"
}
The example above doesn't work because of the dot in the path. So we use the encode function to transform the key into a valid format. alice@email.com in hexadecimal is 616c69636540656d61696c2e636f6d, then:
# path: /users/{hex(email)} is User;
/users/616c69636540656d61696c2e636f6d: {
name: "Alice",
email: "alice@email.com"
}
Any client can access that resource as long as they share the same hex function.
Edit: Base64 can also be used to encode/decode the key. It may be more compact than hexadecimal, but there are many different implementations. If clients don't share the exact same implementation, then they will not work properly.
Specialized functions (ex. that handles e-mails only) can also be used. But be sure to handle all the edge cases.
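A minimal sketch of the hexadecimal encode/decode pair described above, using the e-mail address from the example. The names encodeKey and decodeKey are made up for illustration, and the sketch assumes the key is encoded as UTF-8 bytes before being turned into hex.

```javascript
// Hypothetical helpers: hex-encode a key so it contains only 0-9 and a-f.
function encodeKey(key) {
  const bytes = new TextEncoder().encode(key);  // UTF-8 bytes of the key
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
}

// Inverse of encodeKey: turn the hex string back into the original key.
function decodeKey(hex) {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return new TextDecoder().decode(bytes);
}

const encoded = encodeKey('alice@email.com');
console.log(encoded);            // 616c69636540656d61696c2e636f6d
console.log(decodeKey(encoded)); // alice@email.com
```

The encoded key is twice as long as the original in bytes, which is the size concern mentioned above.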
Encode function with original key stored
Doing one way transformation of the key is a lot easier. So, instead of using a decode function, just store the original key in the database.
A good encode function for this case is the SHA-256 algorithm. It's a common algorithm with implementations in many platforms. And the chances of collisions are very slim.
The previous example with SHA-256 becomes like this:
# path: /users/{sha256(email)} is User;
/users/55bf4952e2308638427d0c28891b31b8cd3a88d1610b81f0a605da25fd9c351a: {
name: "Alice",
email: "alice@email.com"
}
Any client with the original key (the e-mail) can find this entry, because the encode function is known. And even if the key gets longer, the SHA-256 output always has the same size, so it is guaranteed to be a valid Firebase key.
I am using the following code for converting email to hash and then using the hash as key in firebase
import java.security.MessageDigest;
import java.util.Formatter;
public class HashingUtils {
public HashingUtils() {
}
//generate 256 bits hash using SHA-256
public String generateHashkeySHA_256(String email){
String result = null;
try {
MessageDigest digest = MessageDigest.getInstance("SHA-256");
byte[] hash = digest.digest(email.getBytes("UTF-8"));
return byteToHex(hash); // make it printable
}catch(Exception ex) {
ex.printStackTrace();
}
return result;
}
//generate 160bits hash using SHA-1
public String generateHashkeySHA_1(String email){
String result = null;
try {
MessageDigest digest = MessageDigest.getInstance("SHA-1");
byte[] hash = digest.digest(email.getBytes("UTF-8"));
return byteToHex(hash); // make it printable
}catch(Exception ex) {
ex.printStackTrace();
}
return result;
}
public String byteToHex(byte[] bytes) {
Formatter formatter = new Formatter();
for (byte b : bytes) {
formatter.format("%02x", b);
}
String hex = formatter.toString();
return hex;
}
}
code for adding the user to firebase
public void addUser(User user) {
Log.d(TAG, "addUser: ");
DatabaseReference userRef= database.getReference("User");
if(!TextUtils.isEmpty(user.getEmailId())){
String hashEmailId= hashingUtils.generateHashkeySHA_256(user.getEmailId());
Log.d(TAG, "addUser: hashEmailId"+hashEmailId);
userRef.child(hashEmailId).setValue(user);
}
else {
Log.d(TAG,"addUser: empty emailId");
}
}

Different data types in form and database and forward and backward conversion

I thought it'd be easy but, yeah... it wasn't. I already posted a question that went in the same direction, but formulated another question.
What I want to do
I have the collection songs, which has a time attribute (the playing time of the song). This attribute should be handled differently in the form validation and the backend validation!
I'd like to do it with what autoform (and simple-schema / collection2) offers me, if that's possible.
In the form, the time should be entered and validated as a string that fits the regex /^\d{1,2}:?[0-5][0-9]$/ (so either format "mm:ss" or "mmss").
In the database, it should be stored as a Number.
What I tried to do
1. The "formToDoc-way"
This is my javascript
// schema for collection
var schema = {
time: {
label: "Time (MM:SS)",
type: Number // !!!
},
// ...
};
SongsSchema = new SimpleSchema(schema);
Songs.attachSchema(SongsSchema);
// schema for form validation
schema.time.type = String // changing from Number to String!
schema.time.regEx = /^\d{1,2}:?[0-5][0-9]$/;
SongsSchemaForm = new SimpleSchema(schema);
And this is my template:
{{>quickForm
id="..."
type="insert"
collection="Songs"
schema="SongsSchemaForm"
}}
My desired workflow would be:
time is validated as a String using the schema
time is being converted to seconds (Number)
time is validated as a Number in the backend
song is stored
And the way back.
I first tried to use the hook formToDoc and converted the string into seconds (Number).
The Problem:
I found out that the form validation via the given schema (for the form) takes place AFTER the conversion in formToDoc, so the value is already a Number and validation as a String fails.
That is why I looked for another hook that fires after the form is validated. That's why I tried...
2. The "before.insert-way"
I used the hook before.insert and the way to the database worked!
AutoForm.hooks({
formCreateSong: {
before: {
insert: function (doc) {
// converting the doc.time to Number (seconds)
// ...
return doc;
}
},
docToForm: function (doc) {
// convert the doc.time (Number) back to a string (MM:SS)
// ...
return doc;
}
}
});
The Problem:
When I implemented an update form, docToForm was not called, so the update form showed the numerical value (in seconds).
Questions:
How can I handle the way back from the database to the form, i.e. the conversion from seconds to an MM:SS string?
Is there a better way to cope with this use case (different data types in the form validation and the backend validation)?
I am looking for a "meteor autoform" way of solving this.
Thank you a lot for reading, and hopefully for a good answer ;-)
I feel like the time should really be formatted inside the view and not inside the model. So here's the Schema for time I'd use:
...
function convertTimeToSeconds (timeString) {
    // accepts both "mm:ss" and "mmss", matching the regex below
    var digits = timeString.replace(':', '')
    return parseInt(digits.slice(0, -2), 10) * 60 + parseInt(digits.slice(-2), 10)
}
time: {
    type: Number,
    autoValue: function () {
        if (!/^\d{1,2}:?[0-5][0-9]$/.test(this.value)) return false
        return convertTimeToSeconds(this.value)
    }
}
...
This has a small disadvantage of course. You can't use the quickForm-helper anymore, but will have to use autoForm.
To then display the value I'd simply find the songs and then write a helper:
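Template.registerHelper('formatTime', function (seconds) {
    var secondsMod = seconds % 60
    // keep two digits for the seconds, e.g. 1:05 rather than 1:5
    var padded = secondsMod < 10 ? '0' + secondsMod : '' + secondsMod
    return [(seconds - secondsMod) / 60, padded].join(':')
})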
In your template:
{{ formatTime time }}
The easy answer is don't validate the string, validate the number that the string is converted into.
With simpleschema, all you do is create a custom validation. That custom validation is going to grab the string, turn it into a number, and then validate that number.
Then, when you pull it from the database, you'll have to take that number & convert it into a string. Now, simpleschema doesn't do this natively, but it's easy enough to do in your form.
Now, if you wanted to get fancy, here's what I'd recommend:
Add new schema fields:
SimpleSchema.extendOptions({
userValue: Match.Optional(Function),
dbValue: Match.Optional(Function),
});
Then, add a function to your time field (stored as Date field):
userValue: function () {
return moment(this.value).format('mm:ss');
},
dbValue: function () {
return timeToNumber(this.value);
}
Then, make a function that converts a timeString to a number (quick and dirty example, you'll have to add error checking):
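function timeToNumber(str) {
    str = str.replace(':', ''); // remove colon (replace returns a new string)
    var mins = +str.slice(0, -2); // everything before the last two digits
    var secs = +str.slice(-2); // the last two digits
    return mins * 60 + secs;
}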
Then, for real-time validation you can use schema.namedContext().validateOne. To update the db, just send timeToNumber(input.value).
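To make the two directions of the conversion concrete, here is a minimal standalone sketch in plain JavaScript. The helper names timeStringToSeconds and secondsToTimeString are hypothetical and not part of autoform or simple-schema; the pattern is the regex from the question, and where these helpers get hooked in (custom validation, hooks, template helpers) is left open.

```javascript
// Hypothetical helpers; the names are illustrative only.
const TIME_PATTERN = /^\d{1,2}:?[0-5][0-9]$/;   // "mm:ss" or "mmss", as in the question

// Form -> database: validate the string, then convert it to seconds.
function timeStringToSeconds(input) {
  if (!TIME_PATTERN.test(input)) {
    throw new Error('expected "mm:ss" or "mmss", got: ' + input);
  }
  const digits = input.replace(':', '');
  const minutes = parseInt(digits.slice(0, -2), 10);  // everything before the last two digits
  const seconds = parseInt(digits.slice(-2), 10);     // the last two digits
  return minutes * 60 + seconds;
}

// Database -> form: convert seconds back to an M:SS / MM:SS string.
function secondsToTimeString(total) {
  const seconds = total % 60;
  const minutes = (total - seconds) / 60;
  return minutes + ':' + String(seconds).padStart(2, '0');
}

console.log(timeStringToSeconds('3:05'));  // 185
console.log(timeStringToSeconds('1234'));  // 754
console.log(secondsToTimeString(185));     // 3:05
```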
