Restrict user input to characters in IBM System i 00280 code page - asp-classic

We need to restrict user input in a classic ASP web site to the characters allowed by the 00280 code page of IBM System i.
Is there a way to do it in a sane way besides having a (JavaScript|VBScript) function checking every character of an input string against a string of allowed characters?
A basic classic ASP function I thought of:
Function CheckInput(text, replacement)
    Dim output : output = ""
    Dim haystack : haystack = "abcd.. " ' Insert here the allowed characters.
    Dim i : i = 0
    For i = 1 To Len(text)
        Dim needle : needle = Mid(text, i, 1)
        If InStr(haystack, needle) = 0 Then
            needle = replacement
        End If
        output = output & needle
    Next
    CheckInput = output
End Function
Would a RegExp be overkill in my function?

The short answer to your first question is: No. As for your second question, RegExp might not help you here, because not every browser's RegExp implementation will support the characters you need to test, and neither does the VBScript version of RegExp.
Even the code approach you are proposing would need some very careful thought. To place the set of characters you want to support in a string literal, the codepage in which you save the ASP file would need to cover all the characters needed. Alternatively, you would need to use ChrW to build a string containing those characters from their code points.
One slightly simpler approach would be to use JavaScript and have the page charset and codepage set to UTF-8. This would allow you to create a string literal containing any set of characters.

Since it is generally not considered secure to rely on browser validation, you should consider changing your IBM i (formerly OS/400) application interface to accept UCS-2 data, and perform any necessary validation and conversion at the server side.
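The server-side check suggested above can be sketched as follows, in Python rather than classic ASP. This is only an illustration under stated assumptions: the function name `restrict_to_encoding` is hypothetical, and `latin-1` is used purely as a stand-in codec, because a codec for code page 00280 is not assumed to be available in the runtime. The idea is to keep a character only if the target encoding can represent it.

```python
def restrict_to_encoding(text, encoding, replacement="?"):
    """Replace every character that `encoding` cannot represent.

    `encoding` is whatever codec name matches the target code page;
    "latin-1" in the usage note is only a stand-in, not code page 00280.
    """
    out = []
    for ch in text:
        try:
            ch.encode(encoding)      # raises if the character is not covered
            out.append(ch)
        except UnicodeEncodeError:
            out.append(replacement)
    return "".join(out)
```

For example, `restrict_to_encoding("caffè €5", "latin-1")` keeps the accented letter and replaces the euro sign. Control characters that the codec can encode still pass through, so a real check would probably need an additional allow-list.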

Related

Loop through all characters in XML file and replace certain characters

After finally getting my XmlReader to work correctly on a project at work, I am now getting certain parsing errors when trying to create new Reader objects for certain XML files. For instance, this one that keeps occurring is an error trying to parse a hyphen (-). This slightly baffles me because I manually go in and replace that character with something else (like an underscore), and it reads fine - even when there are hyphens elsewhere in the document that are not changed.
So, unless there is an explanation that fixes this (maybe some XmlReaderSettings? I have yet to use any, so I don't know what they are capable of), what is the best syntax/method to cycle through every character and replace the ones that will not parse correctly?
This program will run automatically once per day on a daily-added XML and length of run-time is not an issue.
Edit: Error Message:
System.Xml.XmlException: An error occurred while parsing EntityName. Line 2896, position 89.
Code:
FN = Path.GetFileName(file1).ToString()
xmlFile = XmlReader.Create(Path.Combine(My.Settings.Local_Meter_Path, FN), New XmlReaderSettings())
ds.ReadXml(xmlFile)
Dim dt As DataTable = ds.Tables(13)
Dim filecreatedate As String = IO.File.GetLastWriteTime(file1)
If the problem occurs at ONLY ONE HYPHEN in the entire file, even though the file contains more hyphens, the problem may be related to one of the following:
1) The hyphen is not really a hyphen but a control character, or it is accompanied by a hidden control character.
2) The line has other interesting things in it, like an ampersand ("&"), which may cause problems in strings. An "error occurred while parsing EntityName" message usually points to an unescaped "&" rather than to a hyphen. Are you sure the problem is the hyphen?
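If the culprit does turn out to be a bare ampersand, one possible pre-processing step is to escape every "&" that does not already start an entity or character reference before handing the text to the XML parser. The sketch below is in Python, not VB.NET, and the function name `escape_bare_ampersands` is hypothetical. It treats anything that looks like `&name;`, `&#123;` or `&#x1F;` as already escaped.

```python
import re

# "&" not followed by something that looks like an entity or character reference.
_BARE_AMPERSAND = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#\d+|#x[0-9A-Fa-f]+);)")

def escape_bare_ampersands(xml_text):
    """Turn stray '&' characters into '&amp;' and leave existing references alone."""
    return _BARE_AMPERSAND.sub("&amp;", xml_text)
```

This is a repair for input that is not well-formed XML to begin with, so fixing the program that produces the file is the better long-term option.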

VB.NET : FromBase64String Error

I am getting following error when I am trying to use Convert.FromBase64String
"The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or a non-white space character among the padding characters."
Dim payloadBytes = Convert.FromBase64String(payloadBase64)
This happens when the phone field of my Facebook registration form (http://developers.facebook.com/docs/plugins/registration/) has a dash in it. The encoded string is posted to another page, and the error occurs when I try to decode it there. Basically, I am trying to extract data from a Facebook Signed Request.
The issue is that the dash is not a valid character in the Base64String. Here is a quote from MSDN:
The base-64 digits in ascending order from zero are the uppercase characters "A" to "Z", the lowercase characters "a" to "z", the numerals "0" to "9", and the symbols "+" and "/". The valueless character, "=", is used for trailing padding.
You can either take the dash out (which might not be what you want) or you need to figure out what format the data is truly coming in as since it doesn't seem like it is Base-64 string data.
http://msdn.microsoft.com/en-us/library/system.convert.frombase64string.aspx
The issue is that the Facebook Signed Request is using a modified Base-64 request for URL that changes a few things. Here is a quote on what it does:
For this reason, a modified Base64 for URL variant exists, where no padding '=' will be used, and the '+' and '/' characters of standard Base64 are respectively replaced by '-' and '_', so that using URL encoders/decoders are no longer necessary and have no impact on the length of the encoded value, leaving the same encoded form intact for use in relational databases, web forms, and object identifiers in general.
I believe you could solve your problem by simply replacing each dash with a plus and each underscore with a forward slash. Once any missing '=' padding is restored, you should then be able to decode it from Base-64.
Here is the link to the Facebook developers page that indicates that the value you are trying to decode is base64url encoded:
http://developers.facebook.com/docs/authentication/signed_request/
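The character substitution and padding described above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the signed request has the form `signature.payload` with a JSON payload. It does not verify the signature, which real code should do before trusting the data.

```python
import base64
import json

def decode_base64url(data):
    """Decode base64url text ('-' and '_' alphabet, padding possibly stripped)."""
    padded = data + "=" * (-len(data) % 4)   # restore the '=' padding
    return base64.urlsafe_b64decode(padded)

def decode_signed_request_payload(signed_request):
    """Return the JSON payload of a 'signature.payload' string (signature NOT checked)."""
    _signature, payload = signed_request.split(".", 1)
    return json.loads(decode_base64url(payload))
```

`base64.urlsafe_b64decode` performs the same '-' to '+' and '_' to '/' mapping internally, so only the padding has to be restored by hand.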
Are you trying to encode it to Base64 OR decode something encoded to Base64? From the looks of it, you should use Convert.ToBase64String.
- is definitely not a character that will appear in Base64 encoded string.
Are you sure that you are getting a valid Base64 encoded string from Facebook Signed Request ?
I also had the same problem with the same task.
I was using an iframe and everything was working fine. Then, on the same page, I replaced the iframe code with the XFBML registration code, and when I checked it again it was giving the following error:
"The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or a non-white space character among the padding characters."
I spent a lot of time trying to fix this issue, but the problem was still there. In the end I thought about the temporary internet files and cleared them, and when I tested again it was working fine.
You can also try this solution.
For VB ASP.NET you can do this to extract and decode the payload from the Facebook signed request. I hope this helps, because I'm not familiar with VB and ASP.NET and spent a lot of time figuring out why I was getting the same error you were.
<%@ Page Language="vb" %>
<%
    Dim strSignedRequest As String
    strSignedRequest = Request("signed_request")
    If String.IsNullOrEmpty(strSignedRequest) = False Then
        ' The signed request has the form "signature.payload"
        Dim arrayRequest As Array
        arrayRequest = Split(strSignedRequest, ".")
        Dim strPayload As String
        strPayload = arrayRequest(1)
        ' Map the base64url alphabet back to standard Base64
        strPayload = Replace(strPayload, "-", "+")
        strPayload = Replace(strPayload, "_", "/")
        ' padding, FromBase64String() will barf if the string is the wrong length so we need to pad it with =
        strPayload = strPayload.PadRight(strPayload.Length + (4 - strPayload.Length Mod 4) Mod 4, "="C)
        Dim bytSignedRequest As Byte()
        bytSignedRequest = Convert.FromBase64String(strPayload)
        Dim strJson As String
        strJson = Encoding.UTF8.GetString(bytSignedRequest)
        'Response.Write("encoded: " & strPayload)
        Response.Write("decoded: " & strJson)
    End If
%>

regex to parse csv

I'm looking for a regex that will parse a line at a time from a CSV file. Basically, what string.readline() does, but allowing line breaks if they are within double quotes.
Or is there an easier way to do this?
Using regex to parse CSV is fine for simple applications in well-controlled CSV data, but there are often so many gotchas, such as escaping for embedded quotes and commas in quoted strings, etc. This often makes regex tricky and risky for this task.
I recommend a well-tested CSV module for your purpose.
--Edit:-- See this excellent article, Stop Rolling Your Own CSV Parser!
The FileHelpers library is pretty good for this purpose.
http://www.filehelpers.net/
Rather than relying on error-prone regular expressions, oversimplified "split" logic or 3rd party components, use the .NET framework's built-in functionality:
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser("C:\MyFile.csv")
    Reader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
    ' Must be True so that delimiters and line breaks inside double quotes stay within one field
    Reader.HasFieldsEnclosedInQuotes = True
    Reader.SetDelimiters(",")
    Dim currentRow As String()
    While Not Reader.EndOfData
        Try
            currentRow = Reader.ReadFields()
            Dim currentField As String
            For Each currentField In currentRow
                MsgBox(currentField)
            Next
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MsgBox("Line " & ex.Message &
                " is not valid and will be skipped.")
        End Try
    End While
End Using
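For comparison, the same "use a tested CSV module" advice can be sketched in Python with the standard `csv` module. The helper name `parse_csv` is hypothetical. The point it illustrates is that a proper parser keeps a line break inside double quotes as part of the field instead of ending the record.

```python
import csv
import io

def parse_csv(text):
    """Parse CSV text into a list of rows, honouring quoted line breaks."""
    # newline="" leaves line endings untouched so the csv module can interpret them.
    return list(csv.reader(io.StringIO(text, newline="")))
```

When reading from a file instead of a string, the `csv` documentation likewise recommends opening the file with `newline=""`.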

ASP Readline non-standard Line Endings

I'm using the ASP Classic ReadLine() function of the File System Object.
All has been working great until someone made their import file on a Mac in TextEdit.
The line endings aren't the same, and ReadLine() reads in the entire file, not just 1 line at a time.
Is there a standard way of handling this? Some sort of page directive, or setting on the File System Object?
I guess that I could read in the entire file, and split on vbLF, then for each item, replace vbCR with "", then process the lines, one at a time, but that seems a bit kludgy.
I have searched all over for a solution to this issue, but the solutions are all along the lines of "don't save the file with Mac[sic] line endings."
Anyone have a better way of dealing with this problem?
There is no way to change the behaviour of ReadLine; it will only recognize CRLF as a line terminator. Hence the only simple solution is the one you have already described.
Edit
Actually there is another library that ought to be available out of the box on an ASP server that might offer some help. That is the ADODB library.
The ADODB.Stream object has a LineSeparator property that can be assigned 10 or 13 to override the default CRLF it would normally use. The documentation is patchy because it doesn't describe how this can be used with ReadText. You can get the ReadText method to return the next line from the stream by passing -2 as its parameter.
Take a look at this example:-
Dim sLine
Dim oStreamIn : Set oStreamIn = CreateObject("ADODB.Stream")
oStreamIn.Type = 2 ' Text
oStreamIn.Open
oStreamIn.CharSet = "Windows-1252"
oStreamIn.LoadFromFile "C:\temp\test.txt"
oStreamIn.LineSeparator = 10 ' Linefeed
Do Until oStreamIn.EOS
    sLine = oStreamIn.ReadText(-2)
    ' Do stuff with sLine
Loop
oStreamIn.Close
Note that by default the CharSet is Unicode, so you will need to assign the correct CharSet used by the file if it's not Unicode. I use the word "Unicode" in the sense that the documentation does, which actually means UTF-16. One advantage here is that ADODB.Stream can handle UTF-8, unlike the Scripting library.
BTW, I thought Macs used a CR for line endings? It's the Unix file format that uses LFs, isn't it?
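The "read the whole file and normalise the line endings" fallback from the question can be sketched as follows, in Python rather than VBScript. The function name `split_lines_any_ending` is hypothetical. It treats CRLF, a lone CR and a lone LF all as line terminators, so it does not matter which platform produced the file.

```python
def split_lines_any_ending(text):
    """Split text into lines, accepting CRLF, CR or LF as the terminator."""
    # Collapse CRLF first so it is not counted as two separate line breaks.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = normalized.split("\n")
    if lines and lines[-1] == "":
        lines.pop()   # drop the empty piece after a trailing terminator
    return lines
```

The order of the two replacements matters: converting a lone CR before CRLF would turn every Windows line ending into two line breaks.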

Character Support Issue - How to Translate Higher ASCII Characters to Lower ASCII Characters

So I have an ASP.Net (vb.net) application. It has a textbox and the user is pasting text from Microsoft Word into it. So things like the long dash (charcode 150) are coming through as input. Other examples would be the smart quotes or accented characters. In my app I'm encoding them in xml and passing that to the database as an xml parameter to a sql stored procedure. It gets inserted in the database just as the user entered it.
The problem is the app that reads this data doesn't like these characters. So I need to translate them into the lower ascii (7bit I think) character set. How do I do that? How do I determine what encoding they are in so I can do something like the following. And would just requesting the ASCII equivalent translate them intelligently or do I have to write some code for that?
Also maybe it might be easier to solve this problem in the web page to begin with. When you copy the selection of characters from Word it puts several formats in the clipboard. The straight text one is the one I want. Is there a way to have the html textbox get that text when the user pastes into it? Do I have to set the encoding of the web page somehow?
System.Text.Encoding.ASCII.GetString(System.Text.Encoding.GetEncoding(1251).GetBytes(text))
Code from the app that encodes the input into xml:
Protected Function RequestStringItem( _
        ByVal strName As System.String) As System.String
    Dim strValue As System.String
    strValue = Me.Request.Item(strName)
    If Not (strValue Is Nothing) Then
        RequestStringItem = strValue.Trim()
    Else
        RequestStringItem = ""
    End If
End Function
' I get the input from the textboxes into an array like this
m_arrInsertDesc(intIndex) = RequestStringItem("txtInsertDesc" & strValue)
m_arrInsertFolder(intIndex) = RequestInt32Item("cboInsertFolder" & strValue)
' create xml file for inserts
strmInsertList = New System.IO.MemoryStream()
wrtInsertList = New System.Xml.XmlTextWriter(strmInsertList, System.Text.Encoding.Unicode)
' start document and add root element
wrtInsertList.WriteStartDocument()
wrtInsertList.WriteStartElement("Root")
' cycle through inserts
For intIndex = 0 To m_intInsertCount - 1
    ' if there is an insert description
    If m_arrInsertDesc(intIndex).Length > 0 Then
        ' if the insert description is of the appropriate length
        If m_arrInsertDesc(intIndex).Length <= 96 Then
            ' add element to xml
            wrtInsertList.WriteStartElement("Insert")
            wrtInsertList.WriteAttributeString("insertdesc", m_arrInsertDesc(intIndex))
            wrtInsertList.WriteAttributeString("insertfolder", m_arrInsertFolder(intIndex).ToString())
            wrtInsertList.WriteEndElement()
        ' if insert description is too long
        Else
            m_strError = "ERROR: INSERT DESCRIPTION TOO LONG"
            Exit Function
        End If
    End If
Next
' close root element and document
wrtInsertList.WriteEndElement()
wrtInsertList.WriteEndDocument()
wrtInsertList.Close()
' when I add the xml as a parameter to the stored procedure I do this
cmdAddRequest.Parameters.Add("@insert_list", OdbcType.NText).Value = System.Text.Encoding.Unicode.GetString(strmInsertList.ToArray())
How big is the range of these input characters? 256? (each char fits into a single byte). If that's true, it wouldn't be hard to implement a 256 value lookup table. I haven't toyed with BASIC in years, but basically you'd DIM an array of 256 bytes and fill in the array with translated values, i.e. the 'a'th byte would get 'a' (since it's OK as is) but the 150'th byte would get a hyphen.
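The lookup-table idea above can be sketched like this in Python. It is only an illustration: the table `PUNCTUATION_TO_ASCII` and the function `to_ascii` are hypothetical names, the table covers just a few common Word characters, and stripping accents via Unicode decomposition is an extra step that the answer does not mention.

```python
import unicodedata

# Hypothetical lookup table: typographic characters Word tends to insert.
PUNCTUATION_TO_ASCII = {
    0x2018: "'", 0x2019: "'",    # curly single quotes
    0x201C: '"', 0x201D: '"',    # curly double quotes
    0x2013: "-", 0x2014: "-",    # en dash and em dash
    0x2026: "...",               # ellipsis
}

def to_ascii(text):
    """Map typographic punctuation to ASCII and strip accents; anything else becomes '?'."""
    text = text.translate(PUNCTUATION_TO_ASCII)
    decomposed = unicodedata.normalize("NFKD", text)   # e.g. "é" -> "e" + combining accent
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.encode("ascii", "replace").decode("ascii")
```

Characters with no entry in the table and no ASCII base letter, such as the euro sign, come out as "?" and would need their own table entries.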
This seems to work for converting the long dash to a short dash and smart quotes to regular quotes, given that my HTML pages have the following content type. But it converts all the accented characters to question marks, which is not what the text version of the clipboard has. So I'm closer; I just think I have the target encoding wrong.
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
System.Text.Encoding.ASCII.GetString(System.Text.Encoding.GetEncoding("iso-8859-1").GetBytes(m_arrFolderDesc(intIndex)))
Edit: Found the correct target encoding for my purposes which is 1252.
System.Text.Encoding.GetEncoding(1252).GetString(System.Text.Encoding.GetEncoding("iso-8859-1").GetBytes(m_arrFolderDesc(intIndex)))
If you convert to a non-Unicode character set, you will lose some characters in the process. If the legacy app reading the data doesn't need to do any string transformations, you might want to consider using UTF-7 and converting it back once it gets back into the Unicode world. This will preserve all special characters.
