Validation before bulk insert in SQL Server - ASP.NET

I want to import data from a text file which contains around a few hundred thousand records.
I am using BULK INSERT to do it, like this:
BULK
INSERT vw_bulk_insert_test
FROM '\\server\c$\csvtext.txt'--\\server\SQLEXPRESS\csvtest.txt'
WITH
(FIRSTROW=2,
CHECK_CONSTRAINTS,
FIELDTERMINATOR = '~',
ROWTERMINATOR = '\n'
)
GO
But before the insert I want to validate the values of each column without using a cursor. For example, if the second row has values for all fields except the unit_number column, it should create an error-log entry stating that the unit_number value is missing.

Personally, I would bulk-insert into a temp table, and then do the validations/conversions from the temp table into the table where things will ultimately reside, using either straight T-SQL or stored procedures created for this purpose.
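As a rough sketch of that approach (the staging-table and error-log names and the extra column are illustrative assumptions, not from the original post), the missing-unit_number check becomes a set-based query, so no cursor is needed:

-- Staging table mirroring the file layout (names/types are assumptions)
CREATE TABLE #stage (unit_number VARCHAR(50), some_other_col VARCHAR(100));

BULK INSERT #stage
FROM '\\server\c$\csvtext.txt'
WITH (FIRSTROW = 2, FIELDTERMINATOR = '~', ROWTERMINATOR = '\n');

-- Log every row that fails validation
INSERT INTO error_log (error_message, some_other_col)
SELECT 'unit_number value is missing', some_other_col
FROM #stage
WHERE unit_number IS NULL OR unit_number = '';

-- Move only the rows that pass validation into the real table
INSERT INTO vw_bulk_insert_test (unit_number, some_other_col)
SELECT unit_number, some_other_col
FROM #stage
WHERE unit_number IS NOT NULL AND unit_number <> '';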

For reference, this is the full syntax:
BULK INSERT
[ database_name . [ schema_name ] . | schema_name . ] [ table_name | view_name ]
FROM 'data_file'
[ WITH
(
[ [ , ] BATCHSIZE = batch_size ]
[ [ , ] CHECK_CONSTRAINTS ]
[ [ , ] CODEPAGE = { 'ACP' | 'OEM' | 'RAW' | 'code_page' } ]
[ [ , ] DATAFILETYPE =
{ 'char' | 'native'| 'widechar' | 'widenative' } ]
[ [ , ] FIELDTERMINATOR = 'field_terminator' ]
[ [ , ] FIRSTROW = first_row ]
[ [ , ] FIRE_TRIGGERS ]
[ [ , ] FORMATFILE = 'format_file_path' ]
[ [ , ] KEEPIDENTITY ]
[ [ , ] KEEPNULLS ]
[ [ , ] KILOBYTES_PER_BATCH = kilobytes_per_batch ]
[ [ , ] LASTROW = last_row ]
[ [ , ] MAXERRORS = max_errors ]
[ [ , ] ORDER ( { column [ ASC | DESC ] } [ ,...n ] ) ]
[ [ , ] ROWS_PER_BATCH = rows_per_batch ]
[ [ , ] ROWTERMINATOR = 'row_terminator' ]
[ [ , ] TABLOCK ]
[ [ , ] ERRORFILE = 'file_name' ]
)]
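Note MAXERRORS and ERRORFILE in particular: together they let the load skip malformed rows and dump them to a file instead of failing the whole batch. Be aware that ERRORFILE only captures rows with formatting errors, not value-level checks such as the missing unit_number above. A sketch reusing the path from the question (the error-file path is an assumption):

BULK INSERT vw_bulk_insert_test
FROM '\\server\c$\csvtext.txt'
WITH
(FIRSTROW = 2,
CHECK_CONSTRAINTS,
FIELDTERMINATOR = '~',
ROWTERMINATOR = '\n',
MAXERRORS = 100, -- tolerate up to 100 rejected rows before aborting
ERRORFILE = '\\server\c$\csvtext_errors.log' -- rejected rows are written here
);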

Related

Project certain columns and carry one entry along in every row

I have a dataset which looks like the following:
{
  "metadata": "d_meta_v_1.5.9",
  "data": {
    "a": {
      "T": [ 1652167964645, 1652168781684, 1652168781720 ],
      "V": [ 1, 2, 3 ]
    },
    "b": {
      "T": [ 1652167961657, 1652168781720, 1652168781818 ],
      "V": [ 1, 3, 4 ]
    },
    "c": {
      "T": [ 1652167960194, 1652168787377 ],
      "V": [ 1, 3 ]
    }
  }
}
I want to select certain columns and also carry the metadata along at the end. Part of this is already working in my previous question here.
How can I get my desired output?
Metadata, Time, a, b
d_meta_v_1.5.9, <Time>, <value of a>, <value of b>
d_meta_v_1.5.9, <Time>, <value of a>, <value of b>
d_meta_v_1.5.9, <Time>, <value of a>, <value of b>
let requested_columns = dynamic(["a","b"]);
datatable(doc: dynamic)
[
  dynamic(
    {
      "metadata": "d_meta_v_1.5.9",
      "data": {
        "a": { "T": [ 1652167964645, 1652168781684, 1652168781720 ], "V": [ 1, 2, 3 ] },
        "b": { "T": [ 1652167961657, 1652168781720, 1652168781818 ], "V": [ 1, 3, 4 ] },
        "c": { "T": [ 1652167960194, 1652168787377 ], "V": [ 1, 3 ] }
      }
    }
  )
]
| project metadata = doc.metadata, data = doc.data
| mv-expand data = data
| extend key = tostring(bag_keys(data)[0])
| where key in (requested_columns)
| mv-expand T = data[key].T to typeof(long), V = data[key].V to typeof(long)
| evaluate pivot(key, take_any(V), metadata, T)
| order by T asc
metadata        T              a  b
d_meta_v_1.5.9  1652167961657     1
d_meta_v_1.5.9  1652167964645  1
d_meta_v_1.5.9  1652168781684  2
d_meta_v_1.5.9  1652168781720  3  3
d_meta_v_1.5.9  1652168781818     4

formatting a dataframe with coordinate points

This is a beginner question, but I have a set of coordinate points formatted like:
[ [ -75.526844, 39.655713 ], [ -75.526344, 39.656413 ], [ -75.522343, 39.660813 ], [ -75.518343, 39.663913 ], [ -75.514643, 39.668613 ], [ -75.511743, 39.674313 ], [ -75.509342, 39.685313 ], [ -75.509742, 39.686113 ], [ -75.509042, 39.694513 ], [ -75.507162, 39.696961 ], [ -75.504042, 39.698313 ], [ -75.496241, 39.701413 ], [ -75.491341, 39.711113 ], [ -75.488553, 39.714833 ], [ -75.485241, 39.715813 ], [ -75.483141, 39.715513 ], [ -75.481741, 39.714546 ], [ -75.478940, 39.713813 ], [ -75.477640, 39.715013 ], [ -75.476888, 39.718337 ], [ -75.477432, 39.720561 ], [ -75.477240, 39.724713 ], [ -75.475440, 39.728713 ], [ -75.475384, 39.731057 ], [ -75.474168, 39.735473 ], [ -75.469239, 39.743613 ], [ -75.466263, 39.750737 ], [ -75.466249, 39.750769 ], [ -75.463039, 39.758313 ], [ -75.463339, 39.761213 ]]
I want to make a dataframe that has one column for longitude and one for latitude for this data. How should I go about doing this?
Well, using the basics of R, you can make use of the following implementation. I read the coordinates in as a string; to read them from a file, you could use R's readChar().
text = "[
[ -75.526844, 39.655713 ], [ -75.526344, 39.656413 ],
[ -75.522343, 39.660813 ], [ -75.518343, 39.663913 ],
[ -75.514643, 39.668613 ], [ -75.511743, 39.674313 ],
[ -75.509342, 39.685313 ], [ -75.509742, 39.686113 ],
[ -75.509042, 39.694513 ], [ -75.507162, 39.696961 ],
[ -75.504042, 39.698313 ], [ -75.496241, 39.701413 ],
[ -75.491341, 39.711113 ], [ -75.488553, 39.714833 ],
[ -75.485241, 39.715813 ], [ -75.483141, 39.715513 ],
[ -75.481741, 39.714546 ], [ -75.478940, 39.713813 ],
[ -75.477640, 39.715013 ], [ -75.476888, 39.718337 ],
[ -75.477432, 39.720561 ], [ -75.477240, 39.724713 ],
[ -75.475440, 39.728713 ], [ -75.475384, 39.731057 ],
[ -75.474168, 39.735473 ], [ -75.469239, 39.743613 ],
[ -75.466263, 39.750737 ], [ -75.466249, 39.750769 ],
[ -75.463039, 39.758313 ], [ -75.463339, 39.761213 ]]"
library(stringr)
# flatten newlines, then split on ", " so each element holds one number (plus bracket noise)
s = str_split(gsub('\n', ' ', text), ', ')[[1]]
# strip the brackets and any leftover whitespace
s = gsub('\\[|\\]', '', s)
s = str_trim(s)
# values alternate longitude, latitude: fill a two-column matrix by row, converting to numeric
df = data.frame(matrix(as.numeric(s), ncol = 2, byrow = TRUE))
colnames(df) = c('longitude', 'latitude')
head(df)
Here's another possibility with tidyverse. I read the data in as a string, extract only the digits, ., and - characters, convert the values to numeric, and turn them into a single dataframe column. Next, I create an index, ind, that alternates between 1 and 2 (these will become the two columns), add a row number, pivot the data wide to get the two columns, and rename them.
text <- "[
[ -75.526844, 39.655713 ], [ -75.526344, 39.656413 ],
[ -75.522343, 39.660813 ], [ -75.518343, 39.663913 ],
[ -75.514643, 39.668613 ], [ -75.511743, 39.674313 ],
[ -75.509342, 39.685313 ], [ -75.509742, 39.686113 ],
[ -75.509042, 39.694513 ], [ -75.507162, 39.696961 ],
[ -75.504042, 39.698313 ], [ -75.496241, 39.701413 ],
[ -75.491341, 39.711113 ], [ -75.488553, 39.714833 ],
[ -75.485241, 39.715813 ], [ -75.483141, 39.715513 ],
[ -75.481741, 39.714546 ], [ -75.478940, 39.713813 ],
[ -75.477640, 39.715013 ], [ -75.476888, 39.718337 ],
[ -75.477432, 39.720561 ], [ -75.477240, 39.724713 ],
[ -75.475440, 39.728713 ], [ -75.475384, 39.731057 ],
[ -75.474168, 39.735473 ], [ -75.469239, 39.743613 ],
[ -75.466263, 39.750737 ], [ -75.466249, 39.750769 ],
[ -75.463039, 39.758313 ], [ -75.463339, 39.761213 ]]"
library(tidyverse)
data.frame(Column = as.numeric(str_extract_all(text, "[0-9.-]+")[[1]])) %>%
group_by(ind = rep(1:2, length.out = n())) %>%
mutate(rn = row_number()) %>%
ungroup %>%
pivot_wider(names_from = ind, values_from = Column) %>%
select(-rn) %>%
rename("longitude" = 1, "latitude" = 2)
Output
longitude latitude
1 -75.52684 39.65571
2 -75.52634 39.65641
3 -75.52234 39.66081
4 -75.51834 39.66391
5 -75.51464 39.66861
6 -75.51174 39.67431
7 -75.50934 39.68531
8 -75.50974 39.68611
9 -75.50904 39.69451
10 -75.50716 39.69696
11 -75.50404 39.69831
12 -75.49624 39.70141
13 -75.49134 39.71111
14 -75.48855 39.71483
15 -75.48524 39.71581
16 -75.48314 39.71551
17 -75.48174 39.71455
18 -75.47894 39.71381
19 -75.47764 39.71501
20 -75.47689 39.71834
21 -75.47743 39.72056
22 -75.47724 39.72471
23 -75.47544 39.72871
24 -75.47538 39.73106
25 -75.47417 39.73547
26 -75.46924 39.74361
27 -75.46626 39.75074
28 -75.46625 39.75077
29 -75.46304 39.75831
30 -75.46334 39.76121
If you have access to the GeoJSON, then it is a little easier to convert. For example, if the data is hosted at a URL, you could do something like the below. You could also convert the Excel file with the data into a .txt, then use that to bring in the data (e.g., geojsonsf::geojson_sf("~/Downloads/Alaska.txt")).
library(geojsonsf)
library(sf)
sf <- geojsonsf::geojson_sf("https://raw.githubusercontent.com/glynnbird/usstatesgeojson/master/california.geojson")
# Or if you have a local file, then you could put that here instead, e.g., geojsonsf::geojson_sf("~/Downloads/Alaska.geojson")
as.data.frame( sf::st_coordinates( sf ) ) %>%
select(1:2) %>%
rename("longitude" = 1, "latitude" = 2) %>%
head()
longitude latitude
1 -120.2485 33.99933
2 -120.2474 34.00191
3 -120.2387 34.00759
4 -120.2300 34.01014
5 -120.2213 34.01037
6 -120.2085 34.00565

How can I duplicate multiple times an existing object within a JSON array using jq?

I have the following JSON file:
{
  "actions": [
    {
      "values": "test",
      "features": [
        {
          "v1": 100,
          "v2": {
            "dates": [
              "2020-04-08 06:58:26",
              "2020-04-08 06:58:26"
            ]
          }
        }
      ]
    }
  ]
}
I would like to append the object within the "actions" array to the end of it n times, creating n+1 objects in total.
Expected output if n=2:
{
  "actions": [
    {
      "values": "test",
      "features": [
        {
          "v1": 100,
          "v2": {
            "dates": [
              "2020-04-08 06:58:26",
              "2020-04-08 06:58:26"
            ]
          }
        }
      ]
    },
    {
      "values": "test",
      "features": [
        {
          "v1": 100,
          "v2": {
            "dates": [
              "2020-04-08 06:58:26",
              "2020-04-08 06:58:26"
            ]
          }
        }
      ]
    },
    {
      "values": "test",
      "features": [
        {
          "v1": 100,
          "v2": {
            "dates": [
              "2020-04-08 06:58:26",
              "2020-04-08 06:58:26"
            ]
          }
        }
      ]
    }
  ]
}
I found this answer (How can I duplicate an existing object within a JSON array using jq?), however it only works for appending one element at the end.
You can use reduce together with range() to generate the indexes at which to insert the object.
jq --arg n 2 'reduce range(0, ($n|tonumber)) as $d (.; .actions[$d+1] += .actions[0] )' json
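An equivalent sketch (my variation, not from the linked answer) captures the document first and appends the copies in one go; note --argjson, so that $n arrives as a number:

# Append $n copies of the first "actions" element
jq --argjson n 2 '. as $doc | .actions += [range($n) | $doc.actions[0]]' json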

How to style a historical barchart in nvd3

I am using nvd3 to create a historical barchart and it's working fine. There are two things I would like to do with the bars.
style the bars with rounded edges instead of square.
colour the bars based on their values
For the second case, I tried a function like the following, and it doesn't work:
function(d,i) {
return d.y > 50? "red":"blue";
}
Updated:
This is the data that I am using. I am just trying to colour the bars based on the value, so if the data value is more than 50, the bar should be coloured red. As for the other question, I just want to style the bars with rounded edges.
data = [{
"values" : [
[ 1136005200000 , 17.0] , [ 1138683600000 , 12.0] , [ 1141102800000 , 12.0] , [ 1143781200000 , 14] ,
[ 1146369600000 , 20] , [ 1149048000000 , 21] , [ 1151640000000 , 17] , [ 1154318400000 , 34] , [ 1156996800000 , 10] ,
[ 1159588800000 , 8.0] , [ 1162270800000 , 38.0] , [ 1164862800000 , 38.0] , [ 1167541200000 , 35.0] ,
[ 1170219600000 , 55.0] , [ 1172638800000 , 35.0] , [ 1175313600000 , 26.0] , [ 1177905600000 , 26.0] ,
[ 1180584000000 , 26.0] , [ 1183176000000 , 25.0] , [ 1185854400000 , 25.0] , [ 1188532800000 , 25.0] ,
[ 1191124800000 , 29.0] , [ 1193803200000 , 29.0] , [ 1196398800000 , 29.0] , [ 1199077200000 , 52.0] ,
[ 1201755600000 , 22.0] , [ 1204261200000 , 22.0] , [ 1206936000000 , 22.0] , [ 1209528000000 , 22.0]
]
}];
For example, this code draws a different shade of blue depending on the x-value, and grey if the y-value is zero:
d3.selectAll("rect.nv-bar").style("fill", function(d,i){
var colorCode = "#1f77b4";
var date = new Date(d.x * 1000);
var hours = date.getUTCHours();
if(hours < 6 || hours >= 18){
colorCode = "#355880";
}
if(d.y === 0){
colorCode = "grey";
}
return colorCode;
});
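For the rounded edges, which the snippet above doesn't cover: nvd3 bars are plain SVG rects, so one possible sketch is to set a corner radius on the same selection. Note that rx/ry rounds all four corners of each bar, not just the top:

// Round the bar corners via the SVG corner-radius attributes
d3.selectAll("rect.nv-bar")
  .attr("rx", 4)  // horizontal corner radius, in pixels
  .attr("ry", 4); // vertical corner radius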

pulling the list of values from the list of keys

I have a record as
firstMap = [ name1:[ value1:10, value2:'name1', value3:150, value4:20 ],
name2:[ value1:10, value2:'name2', value3:150, value4:20 ] ]
I have a list where the values are name1, name2, etc.
I want to pull the entry for name1, as:
[ name1:[ value1:10, value2:'name1', value3:150, value4:20 ]
firstMap.subMap(["name1"]) did work for me, but I have a list, and by looping over the list I need to pull the values:
namesList.each{ record ->
newMap = firstMap.subMap(record)
}
I have tried subMap([offer]), subMap(["offer"]), subMap(["offer?.stringValue()"]), subMap(['offer']), etc., but none of them work for me.
You don't need subMap at all; that's only really useful when you want to grab a few keys at once, or if you need the original key in the result.
Try:
firstMap = [ name1:[ value1:10, value2:'name1', value3:150, value4:20 ],
name2:[ value1:10, value2:'name2', value3:150, value4:20 ] ]
def namesList = [ 'name1', 'name2' ]
namesList.each { name ->
println firstMap[ name ]
}
Or if you need a Map result with the original query key:
namesList.each { name ->
println firstMap.subMap( [ name ] )
}
Or indeed:
namesList.each { name ->
println( [ (name):firstMap[ name ] ] )
}
This would give you the same result (i.e. create a new map with the key name and the value from my first example).
