I have a page that allows users to upload an Excel file and inserts its data into SQL Server. I have a small issue: one column in the Excel file holds values such as "001", "029", "236". When they are inserted into SQL Server, the leading zeros are dropped, so the data becomes "1", "29", "236". The data type of the column in SQL Server is varchar(10). How can I solve this?
Excel seems to automatically convert cell values to numbers. Try prefixing the cell contents with a single quote in the Excel sheet prior to processing. Eg '001. If you can't trust the users to do that, use a string formatting routine to left pad the numbers with zeroes.
Something must be converting the data in the excel cell from a string to an integer. How are you performing the insert?
If a user enters 001 into Excel, it will be converted to the number 1.
If the user enters '001 into Excel, it will be saved in the cell as text.
If the cell is pre-formatted with the number format "@", then when the user enters 001 into the cell it will be entered as the text "001". The "@" number format tells Excel that the cell is a text cell and any entry (whether it looks like a number, date, time, fraction, etc...) should simply be placed in the cell as is, as text.
Can you tell your users to pre-format this column with "@"? This is generally the most reliable way to handle this since the user does not have to remember to enter '001.
Maybe setting up the datatype "Text" for an Excel cell will help.
Excel is probably the culprit here. Try converting your file to CSV and see how it comes out. If the leading zeros are gone in the new CSV file, Excel is the problem.
Excel always does this, and it's a nuisance. There are three workarounds I know of:
BEFORE entering the data in any cell in Excel format the cell as text (you can do a whole column if needed.) This only works if you control the spreadsheets and users, which is basically never :-).
Assume you'll get a mix of numbers and/or text in the Excel data, and fix it in Excel before import: add a column to the spreadsheet and use the TEXT() function to convert the number to text, as in =TEXT(A2, "000"); fill down. Also assumes you can edit the worksheet.
Assume you have to fix the numbers upon insert in your code. Depending on how you are loading the data, that could happen in T-SQL or in your other code. In TSQL this expression works to pad with zeros to a width of 3 characters: right( '000' + cast ( 2 as varchar(3) ), 3 )
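If the fix has to happen in application code rather than in T-SQL, the same padding idea might look like the sketch below. It is in Python purely for illustration; `pad_code` and the width of 3 are hypothetical names and values, and it assumes the spreadsheet reader may hand back a numeric cell as a float such as 29.0.

```python
def pad_code(value, width=3):
    """Return value as text, left-padded with zeros to `width` characters."""
    # Assumption: numeric Excel cells may arrive as floats (29.0) instead of ints.
    if isinstance(value, float) and value.is_integer():
        value = int(value)
    # Convert to text first, then pad on the left; longer values are left unchanged.
    return str(value).strip().rjust(width, "0")


# A mix of numeric and text cells, as might come out of one Excel column.
codes = [pad_code(v) for v in (1, 29.0, "236", "007")]
```

Padding only restores zeros up to a known width, so this works when every code in the column is supposed to have the same length.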
I am trying to import a .csv file and match its records against the database. However, the database records have leading zeros. This is a character field, and the amount of data is fairly large.
Here the length of the field in the database is x(15).
The problem I am facing is that the .csv file contains data like AB123456789, whereas the database field has "00000AB123456789".
I am importing the .csv into a character variable.
Could someone please let me know what I should do to get the leading zeros using a Progress query?
Thank you.
You need to FILL() the input string with "0" in order to pad it to a specific length. You can do that with code similar to this:
define variable inputText as character no-undo format "x(15)".
define variable n as integer no-undo.

input from "input.csv".
repeat:
  import inputText.
  n = 15 - length( inputText ).
  if n > 0 then
    inputText = fill( "0", n ) + inputText.
  display inputText.
end.
input close.
Substitute your actual field name for inputText and use whatever mechanism you are actually using for importing the CSV data.
FYI - the "length of the field in the database" is NOT "x(15)". That is a display formatting string. The data dictionary has a default format string that was created when the schema was defined but it has absolutely no impact on what is actually stored in the database. ALL Progress data is stored as variable length. It is not padded to fit the display format and, in fact, it can be "overstuffed" and it is very, very common for applications to do so. This is a source of great frustration to SQL reporting tools that think the display format is some sort of length limit. It is not.
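For comparison, if the CSV were pre-processed outside Progress before matching, the same pad-to-15 step could be sketched as follows. This is Python for illustration only; `pad_keys`, `FIELD_WIDTH`, and the sample rows are hypothetical, and it assumes the key is the first column of each row.

```python
import csv
import io

FIELD_WIDTH = 15  # assumed width of the key as stored in the database


def pad_keys(csv_text, width=FIELD_WIDTH):
    """Read the first column of each CSV row and left-pad it with zeros."""
    reader = csv.reader(io.StringIO(csv_text))
    # rjust leaves values that are already `width` or longer untouched.
    return [row[0].strip().rjust(width, "0") for row in reader if row]


# Hypothetical sample standing in for the real input file.
sample = "AB123456789\nXY42\n"
keys = pad_keys(sample)
```

As with the ABL loop, values that already meet or exceed the width pass through unchanged, so the step is safe to run on mixed data.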
I have a issue with data formats of Excel and SQL.
I have a column in SQL which is of datatype DECIMAL(18,0), and when I paste the SQL result into Excel, the last 3 digits of each value get replaced by 0.
Example:
In SQL the result set has a column called session id and has decimal numbers like
119,597,417,242,309,670
329,621,151,415,350,454
134,460,940,261,658,890
but when I paste it in Excel the numbers look like:
I tried changing the format in EXCEL to paste as text however, the whole format of the result set gets distorted (and only the first column gets pasted properly without the 0's)
I can't keep casting all columns in SQL from decimal to int as there are way too many columns.
Can you please guide me as to what I can do?
Numeric fields in Excel are limited to 15 digits precision.
In SQL Assistant under Tools / Options / Data Format you can ask to have large Decimal (and BIGINT) fields displayed as text for just this sort of copy / paste. Or you can tell SQL Assistant to Save As or Export to Excel format.
For other tools you can explicitly FORMAT and CAST the data to VARCHAR in your SELECT so it is retrieved as text.
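To make the 15-digit limit concrete, here is a rough Python sketch of what happens to an 18-digit value when it is treated as a number, next to the same value kept as text. The function name `as_excel_number` is hypothetical, and the model (keep the first 15 digits, zero the rest) is an approximation of Excel's behavior for long integers, not its actual implementation.

```python
EXCEL_MAX_DIGITS = 15  # significant digits Excel keeps for numeric cells


def as_excel_number(digits):
    """Approximate how a long integer looks after Excel stores it as a number."""
    digits = digits.replace(",", "")
    if len(digits) <= EXCEL_MAX_DIGITS:
        return digits
    # Digits beyond the 15th are replaced by zeros.
    return digits[:EXCEL_MAX_DIGITS] + "0" * (len(digits) - EXCEL_MAX_DIGITS)


session_id = "119,597,417,242,309,670"       # value from the question
shown = as_excel_number(session_id)           # numeric cell: trailing digits lost
kept = session_id.replace(",", "")            # text cell: every digit preserved
```

This is why retrieving or exporting the column as text avoids the problem: the digits never pass through a numeric cell.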
Several things you can do. I'll list 4.
Pick whatever suits you best.
First paste in a text editor (like Notepad), search/replace there, and paste that.
Set the datarange where you're going to paste to "text", and then paste. After that you can search/replace, and change to the correct format.
Change the regional settings of Windows to match the data that you have.
You can generate formulas from your SQL query, instead of floating point numbers. So generate a text like =5/10 instead of 0.5 or 0,5. Excel will pick it up correctly regardless of your regional settings.
I need to read data from more than 4 different Excel files that have different cell formatting but the same data. How can I change the cell format and then read the data using PHPExcel?
If you're storing a numeric value that's longer than a 32-bit signed integer can handle (such as 435546567567345) then treat it as a string using
$objPHPExcel->getActiveSheet()
    ->setCellValueExplicit(
        'A1',
        '435546567567345',
        PHPExcel_Cell_DataType::TYPE_STRING
    );
EDIT
If you're reading this value from an Excel worksheet, and it is actually a number value rather than a string containing a numeric value, then it is likely being treated as a float by MS Excel, so there may well be some loss of precision already (unless the file was created using a 64-bit version of MS Excel), even before PHPExcel reads it. If it is a number created using a 64-bit version of MS Excel, then you'll need a 64-bit version of PHP to read it without loss of precision.
Try reading the raw, unformatted value using getValue() and then doing a var_dump() to see what datatype it actually is; or try using getDataType() to see what the value was being stored as in the Excel file
I'm importing an .xls file using the following connection string:
If _
SetDBConnect( _
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & filepath & _
";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1""", True) Then
This has been working well for parsing through several Excel files that I've come across. However, with this particular file, when I SELECT * into a DataTable, there is a whole column of data, Item Description, missing from the DataTable. Why?
Here are some things that may set this particular workbook apart from the others that I've been working with:
The workbook has a freeze pane consisting of the first 24 rows (however, all of these rows appear in the DataTable)
There is some weird cell highlighting going on throughout the workbook
That's pretty much it. I can't see anything that would make the Item Description column not import correctly. Its data is comprised of all Strings that really have no special characters apart from &. Additionally, each data entry in this column is a maximum of 20 characters. What is happening? Is there any other way I can get all of the data? Keep in mind I have to use the original file and I cannot alter it, as I want this to ultimately be an automated process.
Thanks!
Some initial thoughts/questions: Is the missing column the very first column? What happens if you remove the space within "Item Description"? Stupid question, but does that column have a column header?
-- EDIT 1 --
If you delete that column, does the problem move to another column (the new index 4), or is the file complete? My reason for asking: is the problem specific to the data in that column/header, or is the problem more general, on index 4?
-- EDIT 2 --
Ok, so since we know it's that column, we know it's either the header, or the rows. Let's concentrate on rows for now. Start with that ampersand; dump it, and see what happens. Next, work with the first 50% of rows. Does deleting that subset affect anything? What about the latter 50% of rows? If one of those subsets changes the result, you ought to be able to narrow it down to an individual row (hopefully not plural) by halving your selection each time.
My guess is that you're going to find a unicode character or something else funky in one of the cells. Maybe there's a formula or, as you mentioned, some of that "weird cell highlighting."
It's been years since I worked with excel access, but I recall some problems with excel grouping content into some areas that would act as table inside each sheet. Try copy/paste the content from the problematic sheet to a new workbook and connect to that workbook. If this works you may be able to investigate a bit further about areas.
I am trying to convert data from Act 2000 to a MySQL database. I have successfully imported the DBF files into individual MySQL tables. However I am having issues with the *.BLB file, which seems to be a non-standard memo file.
The DBF files identify themselves as dBASE III Plus, no memo format. There is a single *.BLB, which is a memo file shared by multiple DBFs for BLOB data.
If you read this document: http://cicorp.com/act/sdk/ACT6-SDK-ChapterA.htm#_Toc483994053
You can see that the REGARDING column is a 6 character one. The description is: This 6-byte field is supplied by the system and contains a reference to a field in the Binary Large Object (BLOB) Database.
Now upon opening the *.BLB I can see that the block size is 64 bytes. All the blocks of text are NULL padded out to that size.
Where I am stumbling is trying to convert the values stored in the REGARDING column to blocks location in the BLB file. My assumption is that 6 character field is an offset.
For example, one value for REGARDING is, (ignoring the square brackets): [ ",J$]
In my Googling, I found this: http://ulisse.elettra.trieste.it/services/doc/dbase/DBFstruct.htm#C1.5
It explains that in memo fields (in normal DBF files at least) the space value is ignored (i.e. it's padding out the column).
Therefore if I'm correct (again, square brackets) [",J$] should be the offset in my BLB file. Luckily I've still got access to the original ACT2000 software, so I can compare the full text in the program / MySQL and BLB file.
Using my example value, I know that the DB row with REGARDING value of [ ",J$] corresponds to a 1024 byte offset (or 16 blocks, assuming my guess of a 64 byte sized block).
I've tried reading some Python code for open source projects that read DBF files - but I'm in over my head.
I think what I need to do is unpack the characters to binary, but am not sure.
How can I find the 64-block based spot to read from based on what's found in the DBF files?
EDIT by Jerry Dodge
I've attempted to reverse-engineer the strings in this field to hexadecimal values, and then to an integer value using StrToInt64, but the result still does not match up with the blob file. I've also tried multiplying this integer value by 64 and not multiplying, but the result keeps winding up outside of the size of the blob file, not actually finding any data.
For example, a value of ___/BD (_ = space) translates to $2f4244 hexadecimal, which in turn translates to the integer value of 3097156, but does not correspond with any relevant portion of data in the blob file, even when multiplied or divided by 64.
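The arithmetic described in this edit can be reproduced with the short Python sketch below, which may help when trying other interpretations of the field. The function name `regarding_to_int` is hypothetical, and reading the characters as a big-endian integer is only the interpretation attempted above, which reportedly does not match the blob file.

```python
BLOCK_SIZE = 64  # block size observed in the .BLB file, per the question


def regarding_to_int(field):
    """Strip space padding and read the remaining ASCII bytes as a big-endian integer."""
    raw = field.strip(" ").encode("ascii")
    return int.from_bytes(raw, "big")


value = regarding_to_int("   /BD")   # the ___/BD example, with real spaces
as_hex = hex(value)                   # same value in hexadecimal
byte_offset = value * BLOCK_SIZE      # offset if the value were a block number
```

Swapping `"big"` for `"little"` in `int.from_bytes` is a quick way to test the opposite byte order, though nothing in the thread confirms either one.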
According to the SDK you linked, the following happens as I understand:
There is a TYPE field (right behind REGARDING) that encodes what REGARDING is used for (see the second table of the linked chapter). So I'd assume that if type=6 (meeting not held) the REGARDING is either irrelevant or only contains a meeting ID reference from some other table. On that line of thought I would only expect REGARDING to be a BLB offset if type=101 (or possibly 100). I'd also not abandon the thought that in these relevant cases TYPE might be a concatenation of BLB file index and offset (because there is a mention that each file must not be longer than 30K chars and I really expect to be able to store much more data even in one table).