LinqToExcel blank rows - linq-to-excel

I am using LinqToExcel to easily import Excel data into SQL Server.
var fileName = ConfigurationManager.AppSettings["ExcelFileLocation"];
var excel = new ExcelQueryFactory(fileName);
var employees = excel.Worksheet<Employee>().ToList();
Everything works fine, with one problem: the fields map exactly to the database table columns, and in the database they are NOT NULL.
That said, if you look at this screenshot of the Excel file, the rows below row 3 look blank. They contain no spaces, yet somehow LinqToExcel reads them as well, and of course Entity Framework throws an exception saying the field cannot be null.
I need to select all the blank rows below row 3, down to row 8980 or so, and delete them.
Only then will LinqToExcel stop trying to import the blank rows.
Any idea how to solve the problem?
Thanks.

You can add a condition to the LINQ statement so the empty rows are not included.
var employees = excel.Worksheet<Employee>().Where(x => x.VS2012 != null).ToList();
And if checking for not null does not work, then you can check for an empty string
var employees = excel.Worksheet<Employee>().Where(x => x.VS2012 != "").ToList();
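If some files hit one case and some the other, the two checks can be combined into a single query. A minimal sketch, assuming the same Employee mapping and VS2012 property as above:
var employees = excel.Worksheet<Employee>()
                     .Where(x => x.VS2012 != null && x.VS2012 != "")
                     .ToList();
Rows that are blank in that column are then skipped before Entity Framework ever sees them.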

but somehow LinqToExcel reads them as well
This is a quirk of Excel. It remembers how many rows and columns were used when the sheet was at its largest size. You can see this by pressing Ctrl+End, which selects the cell at the last row and column ever used.
Office support describes how to reset the last cell: Locate and reset the last cell on a worksheet
Basically, you delete excess rows and columns, clear formatting, save the workbook and reopen it.
This work-around could be useful if you have Excel-files waiting to be imported and no time to deploy your fixed Linq2Excel client.

Related

SQLite : discrepancy when importing CSV file into table

I have a table with last column of integer type and I am trying to import data from a csv file into this table.
Some records have no value for this last column. When importing, I see that the last record is handled differently from the others: for every other such record the column is populated with the empty string '', but for the last record it is populated with NULL.
Why is this happening?
Perhaps including a newline character after the last line would solve the problem, but that's not an option here. What else can I do to address this?
Assuming every record with a missing third-column value, other than the last row in the file, ended up with an empty string, you could just run the following update:
UPDATE yourTable
SET col3 = ''
WHERE col3 IS NULL;
This would work, assuming that at most only the final row would match the above update. The best thing to do here would probably be to fix your CSV file and put a CRLF onto the final line.
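If you want to confirm that assumption before touching the data, a quick count of the NULL rows is enough. A minimal sketch, assuming Microsoft.Data.Sqlite and the table and column names used above (the database file name is a placeholder):
using Microsoft.Data.Sqlite;

using (var conn = new SqliteConnection("Data Source=yourDatabase.db"))
{
    conn.Open();
    // Count the rows the update would touch; you expect exactly 1 (the final record).
    var check = new SqliteCommand("SELECT COUNT(*) FROM yourTable WHERE col3 IS NULL;", conn);
    var nullCount = (long)check.ExecuteScalar();
    if (nullCount == 1)
    {
        var fix = new SqliteCommand("UPDATE yourTable SET col3 = '' WHERE col3 IS NULL;", conn);
        fix.ExecuteNonQuery();
    }
}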

Scala SQLite Invalid Query for DELETE

I have problem when trying to add a row and then delete one row from a table. What I have now is:
def addItem(item: Item) = {
  val query = items.filter(_.name === item.name)
  items += (item.name, item.timestamp)
  if (query.list.length > 10) {
    // this is the line that throws the SlickException below
    query.sortBy(_.timestamp).take(1).delete
  }
}
It is supposed to keep the 10 latest items in the database by removing the oldest whenever there are more than 10.
But I get an error saying
SlickException : Invalid query for DELETE statement: A single source table is required, found List((s2,Comprehension))
I do have another table in the database but this should have nothing to do with that, there is not even a relation between the two tables.
Do you have any ideas what might be wrong? Or is there another way of keeping only the last 10 values in the DB? The timestamp is a java.sql.Timestamp and I'm using the Slick library with SQLite in Scala. Also, the class Item just holds a string and a timestamp.
Thanks! Any help is appreciated!
Ok, it seems that delete doesn't work with take for some reason, as was mentioned in the comments. There doesn't seem to be any way other than take to select the first element of the query as a query; .first, for example, returns the actual row data rather than a query for the first element, so delete can't be applied to it.
What I did to get it working was a sort of workaround:
val oldestTime = query.sortBy(_.timestamp).first._2 // selects the timestamp of the oldest element
query.filter(_.timestamp === oldestTime).delete     // deletes all rows with that timestamp
I hope this helps someone someday.

SqliteDataReader and duplicate column names when using LEFT JOIN

Is there documentation or a specification for SQLite3 that describes what is supposed to happen in the following case?
Take this query:
var cmd = new SqliteCommand("SELECT Items.*, Files.* FROM Items LEFT JOIN Files ON Files.strColName = Items.strColName");
Both Items and Files have a column named "strColName". If a matching entry exists in Files, it is joined to the result; if not, the Files columns are NULL.
Let's assume I always need the value of strColName, no matter if it is coming from Items or from Files. If I execute a reader:
var reader = cmd.ExecuteReader();
If there is a match in Files, reader["strColName"] will obviously contain the correct result because the value is set and it is the same in both tables. But if there wasn't a match in Files, will the NULL value of Files overwrite the non-NULL value of Items?
I'm really looking for some specification that defines how a Sqlite3 implementation has to deal with this case, so that I can trust either result.
SQLite has no problem returning multiple columns labelled with the same name.
However, the columns will always be returned in exactly the same order they are written in the SELECT statement.
So, when you are searching for "strColName", you will find the first one, from Items.
It is recommended to use explicit column names instead of * so that the order is clear, so that you can access values by their column index if needed, and so that incompatible changes in the table structure are detected.
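Aliasing the duplicate columns removes the ambiguity entirely. A minimal sketch, assuming an open SQLite connection named connection and the same Items/Files schema:
var cmd = new SqliteCommand(
    "SELECT Items.strColName AS ItemsColName, Files.strColName AS FilesColName " +
    "FROM Items LEFT JOIN Files ON Files.strColName = Items.strColName", connection);
using (var reader = cmd.ExecuteReader())
{
    while (reader.Read())
    {
        var fromItems = reader["ItemsColName"];   // always populated from Items
        var fromFiles = reader["FilesColName"];   // DBNull.Value when there is no match in Files
    }
}
With explicit aliases each value has exactly one name, regardless of the column order.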

Empty fields extracted from excel

This is my problem:
I'm reading data from an Excel file in a .NET MVC app. What I'm doing is reading all the data from the Excel file and then looping over each record, inserting the data it contains into my business model.
Everything works perfectly. However, I've found that one field sometimes returns an empty string when retrieved from the Excel file. Curiously, this field can contain either a simple string or a string that will be treated as an array (it can include '|' characters to build the array). In some Excel files the field comes back empty when the '|' character is present, in others when it isn't, and the behaviour is consistent throughout each file.
There are other fields that can receive the separator and always work fine. The only difference between them is that the working ones are pure strings, while the failing one is a string of numbers, possibly separated by '|'.
I've tried changing the separator character (I tried '#' with the same results) and explicitly formatting the cells as text, without any success.
This is the method that extracts data from the excel
private DataSet queryData(OleDbConnection objConn) {
    string strConString = "SELECT * FROM [Hoja1$] WHERE NUMACCION <> ''";
    OleDbCommand objCmdSelect = new OleDbCommand(strConString, objConn);
    OleDbDataAdapter objAdapter1 = new OleDbDataAdapter();
    objAdapter1.SelectCommand = objCmdSelect;
    DataSet objDataset = new DataSet();
    objAdapter1.Fill(objDataset, "ExcelData");
    return objDataset;
}
I first check the fields from the excel with:
fieldsDictionary.Add("Hours", table.Columns["HOURS"].Ordinal);
And later, when looping through the DataSet I extract data with:
string hourString = row.ItemArray[fieldsDictionary["Hours"]].ToString();
This hourString is empty for some records. In some Excel files it's empty when the record contains '|', in others when it doesn't. I haven't yet found a file where it comes back empty for both kinds of record.
I'm quite confused by this. I'm fairly sure it is related to the numerical nature of the field data, but I can't understand why forcing the cells in the Excel file to be "text" doesn't fix it.
Any help will be more than welcome.
Ok. I finally solved this.
It seems that Excel can't treat a whole column as a single data type if it contains data of different kinds. This happens even if you force the cell format to text in the workbook: when you query the data, the field is assigned a type based on the first records it reads. That is why different files emptied different kinds of records; files starting with plain text emptied the numeric values, and vice versa.
I've found a solution to this just changing the connection string to Excel.
This was my original connection string
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=pathToFile;Extended Properties="Excel 8.0;HDR=Yes;"
And this the one that fixes the problem
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=pathToFile;Extended Properties="Excel 8.0;HDR=Yes;IMEX=1"
The IMEX=1 parameter tells the driver to treat all mixed-data columns as plain text. This won't work for you if you need to edit the Excel file, as the parameter also opens it in read-only mode. However, it was perfect for my situation.
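Put together with the queryData method above, the fix is only a change to the connection string. A minimal sketch, assuming pathToFile holds the path to the workbook:
string connString =
    "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + pathToFile +
    ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1\"";
using (var objConn = new OleDbConnection(connString))
{
    objConn.Open();
    // Same query as before; mixed columns such as the hours field now come back as text.
    DataSet data = queryData(objConn);
}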

how to compare spreadsheet rows and columns with database table rows and columns

I have a spreadsheet on my local machine with two columns (employee number and salary). I need to update the employee table with these values. Mismatched rows have to be displayed in a table in the browser.
I'm uploading the Excel sheet using a file upload control in Visual Studio .NET. When the button is clicked, I need the unmatched rows.
I think we can achieve this by using DataSets to bring in the database values and compare them with the sheet, but what is the best way to compare?
Thanks
This is how we do it:
First, you should have at least a 'comparison column' in each of your datasets (i.e. the Excel sheet and the database table).
You will create 2 objects to hold the data for the excel sheet and the table records.
You will then populate them.
Next comes a little magic: you have to choose which of the two is your primary dataset.
What next? Loop through the items.
Pseudo:
DataSet ds1 = .....;
DataSet ds2 = .....;
foreach record(r) in ds1 Table
    foreach record(s) in ds2 Table
        if record r = record s
            store this as matched record and break
        else
            store this as mismatched record (may need some more logic here)
    endloop
endloop
Unfortunately I do not have a copy/paste function/method for this, but I can provide guidance.
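As a rough sketch only (the dbDataSet and excelDataSet variables and the EmployeeNumber/Salary column names are assumptions to adapt to your schema), the comparison could be done with LINQ over the two DataTables:
using System.Data;
using System.Linq;

var dbRows = dbDataSet.Tables[0].AsEnumerable();       // rows loaded from the employee table
var sheetRows = excelDataSet.Tables[0].AsEnumerable(); // rows read from the uploaded sheet

// A sheet row is unmatched when no database row has the same employee number and salary.
var unmatched = sheetRows.Where(s =>
        !dbRows.Any(d =>
            d.Field<string>("EmployeeNumber") == s.Field<string>("EmployeeNumber") &&
            d.Field<decimal>("Salary") == s.Field<decimal>("Salary")))
    .ToList();
// Bind the unmatched rows (e.g. via CopyToDataTable()) to a grid to show them in the browser.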
Hope this gives you a starting point.
