U SQL string column size is >128 KB - u-sql

I have an input file that contains one column which size is more than 128kb. I follow workaround guides to write a custom extractor, extract that as a byte array, and now I am not sure how to proceed with processing of that column.
What should i do when i extract that as a byte[] ? I try to write a custom outputter also but and then in outputter class to convert that to string but it throws me the same error message
String size 828267 exceeds the maximum allowed size of 131072
. Can someone explain me steps after extracting that column as byte[], how to proceed?

This is limitation of String Data Type. You can vote here to be considered to improvement :)
There is a workaround which involves converting and handling it as a byte[]. Check this post.

Related

C# BinaryReader/Writer equivalent in JAVA

I have a stream (hooked to an azure blob) which contains strings and integers. The same stream is consumed by a .net process also.
In C# the writing and reading is done through the type specific methods of BinaryWriter and BinaryReader classes e,g., BinaryWriter.Write("path1;path2") and BinaryReader.ReadString().
In Java, I couldn't find the relevant libraries to achieve the same. Most of the InputStream methods are capable of reading the whole line of the string.
If there are such libraries in Java, please share with me.
Most of the InputStream methods are capable of reading the whole line of the string.
None of the InputStream methods is capable of doing that.
What you're looking for is DataInputStreamand DataOutputStream.
If you are trying to read in data generated from BinaryWriter in C# you are going to have to mess with this on the bit level. The data you actually want is prefixed with an integer to show the length of the data. You can read about how the prefix is generated here:
C# BinaryWriter length prefix - UTF7 encoding
It's worth mentioning that from what I tested the length is written backwards. In my case the first two bytes of the file were 0xA0 0x54 convert this to binary to get 10100000 01010100. The first byte here starts with a 1 so it is not the last byte. The second byte starts with a 0 however so it is the last (or in this case first byte) for the length. So the resulting length prefix is 1010100 (taken from the last byte removing the indicator that it is the last byte) Then all previous bytes 0100000 which gives us the result of 10101000100000 or 10784 bytes. The file I was dealing with was 10786 bytes so with the two byte prefix indicating the length this is correct.

Oracle CLOB and NCLOB require even number of bytes

Asp.Net 4, C#, Oracle 11g
Hi, I'm trying to save the content of an html file to an Oracle CLOB column. The html file is uploaded to the server through an asp:Upload button. It is working fine most of the time, the problem is that sometimes the stream at the FileContent property of the button, has an odd number of bytes, and the Write method of the clob column throws an exception, stating that it requires an even number of bytes
How can I solve this problem??? Is there anything I can do to make my html files have an even number of bytes??? Html files are encoded as UTF8, and changing the encoding do modify the number of bytes, but they are still an even number
Thanks in advance
Edit: For now, I'm just increasing the size of the buffer by 1 in the case of the stream length being an odd number, then write the stream, in its own length, to the buffer, thus defaulting the last byte of the buffer. Please advice of any potential errors in doing it this way:
var buffer = new byte[(stream.Length % 2 > 0? stream.Length + 1: stream.Length)];
stream.Read(buffer, 0, (int)stream.Length);
clob.Write(buffer, 0, buffer.Length);
Thanks again
Edit: the previous solution didn't work. The new approach consists of passing the stream to string, add a space at the end of the string, and then convert to stream again. It's working fine until now. Sorry I can't post the code... it's just that I couldn't figured out how to overcome this policy of 4 spaces for code in StackOverflow
I'm assuming that you are using the System.Data.OracleClient classes (as opposed to Oracle's ODP.NET).
The OracleLob class has no method for writing a string, which I would expect for handling CLOBs. Instead, the documentation says:
The .NET Framework Data Provider for Oracle handles all CLOB and NCLOB
data as Unicode. Therefore, when accessing CLOB and NCLOB data types,
you are always dealing with the number of bytes, where each character
is 2 bytes. For example, if a string of text containing three
characters is saved as an NCLOB on an Oracle server where the
character set is 4 bytes per character, and you perform a Write
operation, you specify the length of the string as 6 bytes, although
it is stored as 12 bytes on the server.
In this context, Unicode means the UTF-16 encoding which requires 2 bytes for most characters and 4 or 6 bytes for characters in the supplementary planes.
So if you have a string, you have to convert it to UTF-16 first:
byte[] utf16Bytes = Encoding.Unicode.GetBytes(str);
clob.Write(utf16Bytes, 0, utf16Bytes.Lenght);
Or you can use a StreamWriter to achieve the same:
OracleLob clob = ...
using (StreamWriter writer = new StreamWriter(clob, Encoding.Unicode))
{
writer.Write(str);
}
If your data is in a UTF-8 encoded byte array, then you have to convert it to UTF-16:
byte[] utf8Data = ...
byte[] utf16Data = Encoding.Convert(Encoding.UTF8, Encoding.Unicode, utf8Data);
clob.Write(utf16Data, 0, utf16Data.Length);

when to use writeUTF() and writeUTFBytes() in ByteArray of AS3

I am trying to create a file format for myself, so i was forming the header for my file. To write a known length string into a ByteArray, which method should i use, writeUTF() or writeUTFBytes().
From the Flex 3 language ref, it tells me that writeUTF() prepends the length of the string and throws a RangeError whereas writeUTFBytes() does not.
Any suggessions would be appreciated.
The only difference between the two is that writeUTFBytes() doesn't prepend the message with the length of the string (The RangeError is because 65535 is the highest number you can store in 16 bits)
Where you'd use one over the other depends on what you're doing. For example, I use writeUTFBytes() when copying a XML object over to be compressed. In this case, I don't care about the length of the string, and it'd introduce something extra to the code.
writeUTF() can be useful if you're writing a streaming/network server, where as you prefix the message length to the message, you know how many bytes to stream on the other end before the end of the message. e.g., I have 200 bytes worth of message. I read the length (16-bit integer), which tells me the message is 100 bytes. I read in 100 bytes and I know it's a complete message. Everything after is another message. If the message length said the message was 300 bytes, then I'd know I'd have to wait a bit before I have the full message.
I think i have found the solution myself. It came to me when i was coding to read back the data. The corresponding functions to read from a bytearray readUTF() and readUTFBytes(length:uint) requires the length to be passed to it.
So if you know the length of the string that you are gonna write, you can use writeUTFBytes() and use readUTFBytes() with that size. Else you can use readUTF(), letting as3 write the size of the data which can be read back without any need to know the length of the string while using readUTF().
Hope this might be useful to some one as well.

in asp.net how to convert the binary file to a file in a string format and display it in a page

I wish to know how to convert the byte array file to a file in a string format and display it in a webpage. Can anyone tell me how can i perform this?
Unsure of what your actual aim is, a good place to start investigating would be the System.Text.Encoding class
http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx

In a BoundColumn in a DataGrid, how do I format a byte[] column as a string?

I'm passing a datebase reader object to a DataGrid and it sees one of my columns as type byte[] but I happen to known that it should always be a printable string. How can I force the .NET DateBinding system to do that conversion? The only place I can see to put anything is in BoundColumn.DataFormatString but I can't find any indication how to do what I need with that.
Edit: I known how to convert a byte[] to a string in general but don't know how make the BoundColumn do it.
Because in this case I can edit the query string, I hacked passed it by using PADR(column,0) as column in the SELECT. I'm still interested in what to do if I couldn't modify the query.
You can use System.Text.Encoding.UTF8.GetString(byte[]) to get the string (make sure you use the correct encoding where UTF8 is present - there is ASCII, UTF7, UTF8, Unicode, and UTF32).

Resources