FSO OpenTextFile with French characters - asp-classic

Using ASP's FileSystemObject (FSO), I'm trying to read a .txt file with OpenTextFile that contains French characters (accented letters such as é and à). Those characters come out garbled.
I tried setting the format parameter to TristateTrue to open the file as Unicode, but to no avail.
I've been reading about using the ADO Stream object instead, but I hoped there would be a way with FSO. Does anyone have any ideas?

Most likely the file is saved in UTF-8 encoding. The FileSystemObject does not handle UTF-8.
Either have the file saved as Unicode (UTF-16) or use the ADODB.Stream object. ADODB.Stream has a LoadFromFile method and does support UTF-8:
Dim s
Dim stream : Set stream = CreateObject("ADODB.Stream")
stream.Open
stream.Type = 2           ' adTypeText (the default)
stream.Charset = "UTF-8"  ' decode the file content as UTF-8
stream.LoadFromFile Server.MapPath("yourfile.txt")
s = stream.ReadAll
stream.Close
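Alternatively, if re-saving the file is an option, save it as Unicode (UTF-16 LE, which is what Notepad calls "Unicode") and FSO can read it directly. A minimal sketch, assuming yourfile.txt has been re-saved that way:
Const ForReading = 1
Const TristateTrue = -1  ' open the file as Unicode (UTF-16)
Dim fso, ts, s
Set fso = CreateObject("Scripting.FileSystemObject")
Set ts = fso.OpenTextFile(Server.MapPath("yourfile.txt"), ForReading, False, TristateTrue)
s = ts.ReadAll
ts.Close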

Related

RQDA does not read imported UTF-8 characters in UTF-8 encoded .txt

I'm trying to import a database of texts for analysis with RQDA. The database consists of Word-to-text converted files with UTF-8 encoding. RQDA is supposed to read UTF-8; however, UTF-8 characters like ą, č, ę, ė, į, š, ų, ū are not recognized after import into RQDA.
I'm using the "write.FileList" function for import. Its documentation states that
"The file content will be converted to UTF-8 character before being written to *.rqda. The original content can be in any suitable encoding, so you can inspect the content correctly; in other words, the better practice is to use the corresponding encoding (you can get a hint from the localeToCharset function) to save the imported files."
write.FileList(FileList, encoding = .rqda$encoding, con = .rqda$qdacon)
addFilesFromDir("C:\\output", pattern = "*.txt$")
write.FileList imports the database of texts into RQDA, but the UTF-8 characters are not recognized.
It shows this warning:
"In rsqlite_fetch(res@ptr, n = n) :
Don't need to call dbFetch() for statements, only for queries"

Base64 encode a ZIP file using Classic ASP and VBScript

I have a zip file, which contains one CSV file.
I need to Base64 encode this zip file to send to eBay (using their API).
I used this website: http://www.opinionatedgeek.com/DotNet/Tools/Base64Encode/ which works nicely: I upload my zip file and it returns a Base64-encoded string that eBay likes.
I need to do what this website does, but using Classic ASP and VBScript.
I already have a Base64 encode function, from here: http://www.motobit.com/tips/detpg_base64encode/ so I don't need a script for that. That function takes a parameter, so I need to turn my zip file into a string (I think) to pass into it.
I have tried using ADODB.Stream and the LoadFromFile method, but the string it returns, after Base64 encoding, doesn't match the one from the Opinionated Geek website and isn't accepted by eBay.
This is what I've tried:
Dim objStream, strFileText
Set objStream = Server.CreateObject("ADODB.Stream")
objStream.Type = 1  ' adTypeBinary
objStream.Open
objStream.LoadFromFile Server.MapPath("myzipfile.zip")
strFileText = Base64Encode(objStream.Read)
Response.Write strFileText
objStream.Close
Set objStream = Nothing
Can anyone help?
Thanks!
This is now solved...
I was missing the BinaryToString function between the stream output and the base64 encode.
Now I use:
strFileText = Base64Encode(BinaryToString(objStream.Read))
Where the new function is...
Function BinaryToString(Binary)
    ' Walk the byte array and append each byte as a character
    Dim I, S
    For I = 1 To LenB(Binary)
        S = S & Chr(AscB(MidB(Binary, I, 1)))
    Next
    BinaryToString = S
End Function
The output from this now matches the output from the Opinionated Geek tool.
Thanks to ulluoink for pointing me in the right direction!
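For reference, the whole flow then looks like this - a sketch assuming both Base64Encode (from the motobit page) and the BinaryToString function above are defined on the same page:
Dim objStream, strFileText
Set objStream = Server.CreateObject("ADODB.Stream")
objStream.Type = 1  ' adTypeBinary
objStream.Open
objStream.LoadFromFile Server.MapPath("myzipfile.zip")
strFileText = Base64Encode(BinaryToString(objStream.Read))
objStream.Close
Set objStream = Nothing
Response.Write strFileText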

Fix Special Characters in String

I've got a program that in a nutshell reads values from a SQL database and writes them to a tab-delimited text file.
The issue is that some of the values in the database contain special characters (TM, dash, ellipsis, etc.). When written to the text file, the formatting is lost and they come across as junk ("™", "–", etc.).
When a value is viewed in the immediate window, before it is written to the txt file, everything looks fine. My guess is that this is an encoding issue, but I'm not sure how to proceed, where to look, or what to look for.
Is this an ASCII or UTF-8 problem, and how do I correct it before the text is written to the file?
Here's how I build the text file (where feedStr is a StringBuilder)
objReader = New StreamWriter(filePath)
objReader.Write(feedStr)
objReader.Close()
The default encoding for StreamWriter is UTF-8 (with no byte order mark). Your result file is fine; the question is what you open it with afterwards. If you open it in a UTF-8 capable text editor, the characters should look the way you want.
You can also write the text file in another encoding, for example ISO-8859-1 (Latin-1):
objReader = New StreamWriter(filePath, false, Encoding.GetEncoding("iso-8859-1"))

How to add encoding information to the response stream in ASP.NET?

I have following piece of code:
public void ProcessRequest (HttpContext context)
{
    context.Response.ContentType = "text/rtf; charset=UTF-8";
    context.Response.Charset = "UTF-8";
    context.Response.ContentEncoding = System.Text.Encoding.UTF8;
    context.Response.AddHeader("Content-disposition", "attachment;filename=lista_obecnosci.csv");
    context.Response.Write("ąęćżźńółĄŚŻŹĆŃŁÓĘ");
}
When I try to open the generated CSV file, I get the following behavior:
In Notepad2 - everything is fine.
In Word - the conversion wizard opens and asks to convert the text. It suggests UTF-8, which is more or less OK.
In Excel - I get a real mess. None of the Polish characters display correctly.
I wanted to write those special encoding-information characters in front of my string, i.e.
context.Response.Write((char)0xef);
context.Response.Write((char)0xbb);
context.Response.Write((char)0xbf);
but that does no good: the response stream treats those as normal characters and converts them to something different.
I'd appreciate help on this one.
I ran into the same problem, and this was my solution:
context.Response.BinaryWrite(System.Text.Encoding.UTF8.GetPreamble());
context.Response.Write("ąęćżźńółĄŚŻŹĆŃŁÓĘ");
What you call "encoding-information" is actually a BOM. I suspect each of those "characters" is getting encoded separately. To write the BOM manually, you have to write it as three bytes, not three characters. I'm not familiar with the .NET I/O classes, but there should be a method available to you that takes a byte or byte[] parameter and writes them directly to the file.
By the way, the UTF-8 BOM is optional; in fact, its use is discouraged by the Unicode Consortium. If you don't have a specific reason for using it, save yourself some hassle and leave it out.
EDIT: I just remembered you can also write the actual BOM character, '\uFEFF', and let the encoder handle it:
context.Response.Write('\uFEFF');
I think the problem is with Excel, based on "Microsoft Excel mangles Diacritics in .csv files". To prove this, copy your sample output string ąęćżźńółĄŚŻŹĆŃŁÓĘ, paste it into a test file using your favorite editor, save it as a UTF-8 encoded .csv file, then open it in Excel and see the same issues.
The answer from Alan Moore, translated to VB (the BOM character U+FEFF is invisible inside a string literal, so ChrW makes it explicit):
Context.Response.Write(ChrW(&HFEFF))

Read UTF-8 XML with MSXML 4.0

I have a problem with classic ASP / VBScript trying to read a UTF-8 encoded XML file with MSXML. The file is encoded correctly; I can verify that with all my other tools.
Constructed XML example:
<?xml version="1.0" encoding="UTF-8"?>
<itshop>
<Product Name="Backup gewünscht" />
</itshop>
If I try to do this in ASP...
Const FOR_READING = 1 ' FSO iomode constant, not predefined in VBScript
Set fso = Server.CreateObject("Scripting.FileSystemObject")
Set ts = fso.OpenTextFile("input.xml", FOR_READING)
XML = ts.ReadAll
ts.Close
Set ts = nothing
Set fso = Nothing
Set myXML = Server.CreateObject("Msxml2.DOMDocument.4.0")
myXML.loadXML(XML)
Set DocElement = myXML.documentElement
Set ProductNodes = DocElement.selectNodes("//Product")
Response.Write ProductNodes(0).getAttribute("Name")
' ...
... and Name contains special characters (German umlauts, to be specific), the bytes of the umlaut's two-byte code get re-encoded, so I end up with two totally crappy nonsense characters. What should be "ü" becomes "Ã¼" - FOUR bytes in my output, not two (correct UTF-8) or one (ISO-8859-#).
What am I doing wrong? Why is MSXML thinking that the input is ISO-8859-# so that it tries to convert it to UTF-8?
Set ts = fso.OpenTextFile("input.xml", FOR_READING, False, True)
The last parameter is the "Unicode" flag.
OpenTextFile() has the following signature:
object.OpenTextFile(filename[, iomode[, create[, format]]])
where "format" is defined as
Optional. One of three Tristate values used to indicate the format of
the opened file. If omitted, the file
is opened as ASCII.
And Tristate is defined as:
TristateUseDefault -2 Opens the file using the system default.
TristateTrue -1 Opens the file as Unicode.
TristateFalse 0 Opens the file as ASCII.
And -1 happens to be the numerical value of True.
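You can confirm that equivalence in VBScript itself:
Response.Write CInt(True)  ' prints -1, the same value as TristateTrue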
Anyway, better is:
Set myXML = Server.CreateObject("Msxml2.DOMDocument.4.0")
myXML.load("input.xml")
Why use a TextStream object to read a file that MSXML can read perfectly well on its own?
The TextStream object also has no notion of the actual file encoding. The docs say "Unicode", but there is more than one way of encoding Unicode. The load() method of the MSXML object can deal with all of them.
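A minimal sketch of that approach (assuming input.xml sits next to the page, hence Server.MapPath; async = False makes the load synchronous):
Dim myXML, ProductNodes
Set myXML = Server.CreateObject("Msxml2.DOMDocument.4.0")
myXML.async = False
If myXML.load(Server.MapPath("input.xml")) Then
    Set ProductNodes = myXML.documentElement.selectNodes("//Product")
    Response.Write ProductNodes(0).getAttribute("Name")
Else
    Response.Write myXML.parseError.reason
End If
The parser reads the encoding from the XML declaration itself, so the UTF-8 umlauts come through intact.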
