bulk insertion in MS SQL from a text file - asp.net

I have a text file that contains around 21 lac entries and I want to insert all these entries into a table. Initially I have created one function in c# that read line by line and insert into table but it takes too much time. Please suggest an efficient way to insert these bulk data and that file is containing TAB(4 spaces) as delimiter.
And that text file also containing some duplicate entries and I don't want to insert those entries.

Load all of your data into a DataTable object and then use SqlBulkCopy to bulk insert them:
DataTable dtData = new DataTable("Data");
// load your data here
using (SqlConnection dbConn = new SqlConnection("db conn string"))
{
dbConn.Open();
using (SqlTransaction dbTrans = dbConn.BeginTransaction())
{
try
{
using (SqlBulkCopy dbBulkCopy = new SqlBulkCopy(dbConn, SqlBulkCopyOptions.Default, dbTrans))
{
dbBulkCopy.DestinationTableName = "intended SQL table name";
dbBulkCopy.WriteToServer(dtData );
}
dbTrans.Commit();
}
catch
{
dbTrans.Rollback();
throw;
}
}
dbConn.Close();
}
I've included the example to wrap this into a SqlTransaction so there will be a full rollback if there's a failure along the way. To get you started, here's a good CodeProject article on loading the delimited data into a DataSet object.
Sanitizing the data before loading
OK, here's how I think your data looks:
CC_FIPS FULL_NAME_ND
AN Xixerella
AN Vila
AN Sornas
AN Soldeu
AN Sispony
... (cut down for brevity)
In this instance you want to create your DataTable like this:
DataTable dtData = new DataTable("Data");
dtData.Columns.Add("CC_FIPS");
dtData.Columns.Add("FULL_NAME_ND");
Then you want to iterate each row (assuming your tab delimited data is separated row-by-row by carriage returns) and check whether this data already exists in the DataTable using the .Select method and if there is a match (i'm checking for BOTH values, it's up to you whether you want to do something else) then don't add it thereby preventing duplicates.
using (FileStream fs = new FileStream("path to your file", FileMode.Open, FileAccess.Read))
{
int rowIndex = 0;
using (StreamReader sr = new StreamReader(fs))
{
string line = string.Empty;
while (!sr.EndOfStream)
{
line = sr.ReadLine();
// use a row index to skip the header row as you don't want to insert CC_FIPS and FULL_NAME_ND
if (rowIndex > 0)
{
// split your data up into a 2-d array tab delimited
string[] parts = line.Split('\t');
// now check whether this data has already been added to the datatable
DataRow[] rows = dtData.Select("CC_FIPS = '" + parts[0] + "' and FULL_NAME_ND = '" + parts[1] + "'");
if (rows.Length == 0)
{
// if there're no rows, then the data doesn't exist so add it
DataRow nr = dtData.NewRow();
nr["CC_FIPS"] = parts[0];
nr["FULL_NAME_ND"] = parts[1];
dtData.Rows.Add(nr);
}
}
rowIndex++;
}
}
}
At the end of this you should have a sanitized DataTable that you can bulk insert. Please note that this code isn't tested, but it's a best guess as to how you should do it. There are many ways this can be done, and probably a lot better than this method (specifically LINQ) - but it's a starting point.

Related

We found a problem with some content in "example.xlsx" - using ClosedXML library

I'm using ClosedXML library to generate a simple Excel file with 2 worksheets.
I keep getting error message whenever i try to open the file saying
"We found a problem with some content in "example.xlsx". Do you want us to try to recover as much as we can. if you trust source of this workbook, click Yes"
If i click Yes, it displays the data as expected, i don't see any
problems with it. Also if i generate only 1 worksheet this error does
not appear.
This is what my stored procedure returns, first result set is populated in sheet1 and second result set is populated in sheet2, which works as expected.
Workbook data
Here is the method i am using, it returns 2 result sets and populates both result sets in 2 different worksheets:
[HttpPost]
[ValidateAntiForgeryToken]
public ActionResult POAReport(POAReportVM model)
{
POAReportVM poaReportVM = reportService.GetPOAReport(model);
using (var workbook = new XLWorkbook())
{
IXLWorksheet worksheet1 = workbook.Worksheets.Add("ProductOrderAccuracy");
worksheet1.Cell("A1").Value = "DATE";
worksheet1.Cell("B1").Value = "ORDER";
worksheet1.Cell("C1").Value = "";
var countsheet1 = 2;
for (int i = 0; i < poaReportVM.productOrderAccuracyList.Count; i++)
{
worksheet1.Cell(countsheet1, 1).Value = poaReportVM.productOrderAccuracyList[i].CompletedDate.ToString();
worksheet1.Cell(countsheet1, 2).Value = poaReportVM.productOrderAccuracyList[i].WebOrderID.ToString();
worksheet1.Cell(countsheet1, 3).Value = poaReportVM.productOrderAccuracyList[i].CompletedIndicator;
countsheet1++;
}
IXLWorksheet worksheet2 = workbook.Worksheets.Add("Summary");
worksheet2.Cell("A1").Value = "Total Orders Sampled";
worksheet2.Cell("B1").Value = "Passed";
worksheet2.Cell("C1").Value = "% Passed";
worksheet2.Cell(2, 1).Value = poaReportVM.summaryVM.TotalOrdersSampled.ToString();
worksheet2.Cell(2, 2).Value = poaReportVM.summaryVM.Passed.ToString();
worksheet2.Cell(2, 3).Value = poaReportVM.summaryVM.PassedPercentage.ToString();
//save file to memory stream and return it as byte array
using (var ms = new MemoryStream())
{
workbook.SaveAs(ms);
ms.Position = 0;
var content = ms.ToArray();
return File(content, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
}
}
}
I had similar problem. For me the cause was as mentioned in this answer:
I had this issue when I was using EPPlus to customise an existing template. For me the issue was in the template itself as it contained invalid references to lookup tables. I found this in Formula -> Name Manager.
You may find other solutions there.
Apologies as my reputation is too low to add a comment.

.Net Core Npgsql Return Single Row

I was just wondering if it's possible to return a single row using Npgsql in a .Net Core Script.
I've been doing with the Read Method like bellow
NpgsqlCommand command = new NpgsqlCommand (
" select col01, col02, col03 from table ",
dbconnection
);
var result = command.ExecuteReader();
while(result.Read()) {
var varCol01 = result["col01"];
var varCol02 = result["col02"];
}
Which seems a bit excessive for a single row because I would have to exit the while manually.
Each time you call NpgsqlDataReader.Read, you're reading an additional row - so your code sample doesn't seem to be a good fit for a single row scenario...
If you know you're only getting a single row, why not do something along the lines of:
var reader = command.ExecuteReader();
reader.Read(); // You can do a Debug.Assert to make sure the result of this is true
var (col1, col2) = (result["col01"], result["col02"]);

Retrieve Cellset Value in SSAS\MDX

Im writing SSAS MDX queries involving more than 2 axis' to retrieve a value. Using ADOMD.NET, I can get the returned cellset and determine the value by using
lblTotalGrossSales.Text = CellSet.Cells(0).Value
Is there a way I can get the CellSet's Cell(0) Value in my MDX query, instead of relying on the data returning to ADOMD.NET?
thanks!
Edit 1: - Based on Daryl's comment, here's some elaboration on what Im doing. My current query is using several axis', which is:
SELECT {[Term Date].[Date Calcs].[MTD]} ON 0,
{[Sale Date].[YQMD].[DAY].&[20121115]} ON 1,
{[Customer].[ID].[All].[A612Q4-35]} ON 2,
{[Measures].[Loss]} ON 3
FROM OUR_CUBE
If I run that query in Management Studio, I am told Results cannot be displayed for cellsets with more than two axes - which makes sense since.. you know.. there's more than 2 axes. However, if I use ADOMD.NET to run this query in-line, and read the returning value into an ADOMD.NET cellset, I can check the value at cell "0", giving me my value... which as I understand it (im a total noob at cubes) is the value sitting where all these values intersect.
So to answer your question Daryl, what I'd love to have is the ability to have the value here returned to me, not have to read in a cell set into the calling application. Why you may ask? Well.. ultimately I'd love to have one query that performs several multi-axis queries to return the values. Again.. Im VERY new to cubes and MDX, so it's possible Im going at this all wrong (Im a .NET developer by trade).
Simplify your query to return two axis;
SELECT {[Measures].[Loss]} ON 0, {[Term Date].[Date Calcs].[MTD] * [Sale Date].[YQMD].[DAY].&[20121115] * [Customer].[ID].[All].[A612Q4-35]} ON 1 FROM OUR_CUBE
and then try the following to access the cellset;
string connectionString = "Data Source=localhost;Catalog=AdventureWorksDW2012";
//Create a new string builder to store the results
System.Text.StringBuilder result = new System.Text.StringBuilder();
AdomdConnection conn = new AdomdConnection(connectionString);
//Connect to the local serverusing (AdomdConnection conn = new AdomdConnection("Data Source=localhost;"))
{
conn.Open();
//Create a command, using this connection
AdomdCommand cmd = conn.CreateCommand();
cmd.CommandText = #"SELECT { [Measures].[Unit Price] } ON COLUMNS , {[Product].[Color].[Color].MEMBERS-[Product].[Color].[]} * [Product].[Model Name].[Model Name]ON ROWS FROM [Adventure Works] ;";
//Execute the query, returning a cellset
CellSet cs = cmd.ExecuteCellSet();
//Output the column captions from the first axis//Note that this procedure assumes a single member exists per column.
result.Append("\t\t\t");
TupleCollection tuplesOnColumns = cs.Axes[0].Set.Tuples;
foreach (Microsoft.AnalysisServices.AdomdClient.Tuple column in tuplesOnColumns)
{
result.Append(column.Members[0].Caption + "\t");
}
result.AppendLine();
//Output the row captions from the second axis and cell data//Note that this procedure assumes a two-dimensional cellset
TupleCollection tuplesOnRows = cs.Axes[1].Set.Tuples;
for (int row = 0; row < tuplesOnRows.Count; row++)
{
for (int members = 0; members < tuplesOnRows[row].Members.Count; members++ )
{
result.Append(tuplesOnRows[row].Members[members].Caption + "\t");
}
for (int col = 0; col < tuplesOnColumns.Count; col++)
{
result.Append(cs.Cells[col, row].FormattedValue + "\t");
}
result.AppendLine();
}
conn.Close();
TextBox1.Text = result.ToString();
} // using connection
Source : Retrieving Data Using the CellSet
This is fine upto select on columns and on Rows. It will be helpful analyze how to traverse sub select queries from main query.

ado.net query removes file exention when adding string filename to the database

I am using the code below for saving uploaded picture and makigna thumbnail but it saves a filename without the extension to the database, therefore, I get broken links. How can I stop a strongly typed dataset and dataadapter to stop removing the file extension? my nvarchar field has nvarchar(max) so problem is not string length.
I realized my problem was the maxsize in the dataset column, not sql statement parameter, so I fixed it. You may vote to close on this question.
hasTableAdapters.has_actorTableAdapter adp1 = new hasTableAdapters.has_actorTableAdapter();
if (Convert.ToInt16(adp1.UsernameExists(username.Text)) == 0)
{
adp1.Register(username.Text, password.Text,
ishairdresser.Checked, city.Text, address.Text);
string originalfilename = Server.MapPath(" ") + "\\pictures\\" + actorimage.PostedFile.FileName;
string originalrelative = "\\pictures\\" + actorimage.FileName;
actorimage.SaveAs(originalfilename);
string thumbfilename = Server.MapPath(" ") + "\\pictures\\t_" + actorimage.PostedFile.FileName;
string thumbrelative = "\\pictures\\t_" + actorimage.FileName;
Bitmap original = new Bitmap(originalfilename);
Bitmap thumb=(Bitmap)original.GetThumbnailImage(100, 100,
new System.Drawing.Image.GetThumbnailImageAbort(ThumbnailCallback),
IntPtr.Zero);
thumb=(Bitmap)original.Clone(
new Rectangle(new Point(original.Width/2,original.Height/2), new Size(100,100)),
System.Drawing.Imaging.PixelFormat.DontCare);
/*
bmpImage.Clone(cropArea,bmpImage.PixelFormat);
*/
thumb.Save(thumbfilename);
adp1.UpdatePicture(originalrelative, thumbrelative, username.Text);
LoginActor();
Response.Redirect("Default.aspx");
}
}
Looks like the problem is you are using HttpPostedFile.FileName property, which returns fully-qualified file name on the client. So, this code string originalfilename = Server.MapPath(" ") + "\\pictures\\" + actorimage.PostedFile.FileName; generates something like this:
c:\inetpub\pictures\c:\Users\Username\Pictures\image1.jpg
Use FileUpload.FileName property everywhere and you will probably get what you want.
Use this to get Image or file extension :
string Extension = System.IO.Path.GetExtension(FileUpload.FileName);

How do I reference a DataRow element by name in a DataSet retrieved from SQL?

So I am building some XML using a XmlWriter and a DataSet but when it comes time to loop through each DataRow in the DataSet I can't figure out how do reference like "userid" and such that come back from the stored procedure. In page code I see them doing it as Eval("userid") or whatever which I am using the same stored procedure, but I am using it in an ASHX now... see the 'WHAT GOES HERE??' in the code below...
DataSet getData;
getData = SqlHelper.ExecuteDataset(ConfigurationManager.ConnectionStrings["connstr"].ConnectionString, CommandType.StoredProcedure, "Course_NewReportGet_Get_Sav", objPAra)
//COUNT NUMBER OF RESULTS FOR COUNT ATTRIBUTE (must add!)
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.IndentChars = (" ");
using(XmlWriter writer = XmlWriter.Create("data.xml", settings))
{
writer.WriteStartElement("changes");
writer.WriteAttributeString("clientname", foundCompany.CompanyName);
writer.WriteAttributeString("clientid", foundCompany.Abbreviation);
//writer... INSERT COUNT ATTRIBUTE
foreach(DataRow dr in getData.Tables)
{
writer.WriteStartElement("change");
writer.WriteStartElement("user");
writer.WriteAttributeString("userid", dr... WHAT GOES HERE??;
}
writer.WriteEndElement();
}
First off, your foreach is wrong. You need to loop over getData.Tables[0].Rows. If there's more than 1 table, you need to loop over getData.Tables.
But, the answer is it's an indexed property. So, dr["userId"] will get you the value.
foreach (DataRow dr in getData.Tables[0].Rows) {
/ * blah, blah */
writer.WriteAttributeString("userId", dr["userId"]);
}
It is pretty straightforward. An integer column in your datarow (dr) would be accessed with:
int someVal = (int)dr["ColumnName"];
Your inner foreach is the issue.
Should be
if(getData.Tables.Count > 0){
foreach(DataRow dr in getData.Tables[0].Rows){
writer.WriteAttributeString("userid", dr["UserID"]);
}
}
You're going to want to add an iteration level - you're currently iterating over Tables in your foreach(DataRow dr in getData.Tables), the DataRow objects are one level below.
Try using the DataRow.Item Property as shown below
dr["userid"]

Resources