Replacing SQL collation-specific characters

Regardless of the SQL database collation being used, is there any way to replace special characters when they are displayed in the interface? At the very least, is there a way to do that for the "Turkish I" so often discussed here :-) I want to eliminate the small dotless 'i' (ı).

What about a simple String.Replace for the characters you don't want?
You can either keep the original data in the database and do this when the page renders or you can do this before saving the data to the database.
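A minimal sketch of the replace-on-display idea, written in Python rather than .NET purely for illustration. The helper name fold_turkish_i is made up, and mapping the dotted capital İ as well is an assumption that goes beyond what the question asks for.

```python
# Hypothetical helper: swap the Turkish-specific letters for plain ASCII ones
# just before the text is shown. The stored data stays untouched.
TURKISH_I_MAP = str.maketrans({
    "\u0131": "i",  # ı, LATIN SMALL LETTER DOTLESS I
    "\u0130": "I",  # İ, LATIN CAPITAL LETTER I WITH DOT ABOVE (assumed extra)
})


def fold_turkish_i(text: str) -> str:
    """Return text with the Turkish i variants replaced by ASCII i/I."""
    return text.translate(TURKISH_I_MAP)


shown = fold_turkish_i("\u0131rmak ve \u0130stanbul")
print(shown)
```

Doing this at render time keeps the original spelling in the database, so the replacement can be changed or dropped later without touching the data.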


Decimal/Date Sorting With Entity Framework Core and SQLite

Using Entity Framework Core (Code First) and SQLite, Guid is stored as binary but Decimal and Date fields are stored as text with Microsoft's provider.
I can understand they might not want the imprecision of DOUBLE for currency amounts and thus use text.
What happens if I need to sort? Is Entity Framework Core smart enough to make sorting work as expected (but slower because it needs to parse everything!), or will it sort alphabetically instead of sorting by number? I don't want it to return 100 before 2.
I'll have to do things like "give me the latest order" so what's the best approach for that? I want to make sure it's going to work.
Would I be better off switching to the System.Data.SQLite provider so that dates are stored in UNIX format (Microsoft's provider does not support this)? And would I then have to do the parsing back and forth myself, or could the provider take care of it automatically?
I am still learning myself, but I am aware that you can create a custom collation and assign it to your column. The collation can be assigned to the table column itself, or only to a particular view or query, using standard SQLite SQL syntax and the COLLATE keyword.
This is not a complete example/tutorial, but for starters visit the docs. Also see this Stack Overflow answer. These are just hints, but they give you a consistent way to do this. Remember that SQLite is an in-process DB engine, so it should still be reasonably efficient and let you work with the database in a normal fashion without constantly injecting custom logic between queries. Once the custom collation is defined and properly registered, it should be fairly seamless, with perhaps the only extra requirement being to append e.g. COLLATE customDecimal to the ORDER BY clauses.
The custom collation function converts the two string values to an appropriate numeric type and returns the result of comparing them. It's very similar to the native .NET IComparer interface and Comparison delegate.
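Here is a rough sketch of that idea using Python's built-in sqlite3 module instead of Entity Framework Core, only because it is short and self-contained. The table, column and function names are made up; the collation name customDecimal is the one used above. The .NET providers have their own way of registering a collation, which this does not show.

```python
import sqlite3
from decimal import Decimal


def decimal_collation(a: str, b: str) -> int:
    """Compare two TEXT values as decimals; return -1, 0 or 1."""
    x, y = Decimal(a), Decimal(b)
    return (x > y) - (x < y)


conn = sqlite3.connect(":memory:")
# Register the comparison function under the name used in COLLATE clauses.
conn.create_collation("customDecimal", decimal_collation)

conn.execute("CREATE TABLE orders (amount TEXT)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [("100",), ("2",), ("15.5",)],
)

# Default text ordering compares character by character.
plain = [r[0] for r in conn.execute(
    "SELECT amount FROM orders ORDER BY amount")]

# The custom collation orders the same strings by numeric value.
numeric = [r[0] for r in conn.execute(
    "SELECT amount FROM orders ORDER BY amount COLLATE customDecimal")]

print(plain)    # ['100', '15.5', '2']
print(numeric)  # ['2', '15.5', '100']
```

The first query shows the problem from the question (100 sorts before 2), and the second shows the same data sorted numerically once the collation is applied.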

Writing escaped or unescaped HTML into SQLite database?

I scrape HTML pages and write certain values into an SQLite database.
My question is: should the values I insert into the database be HTML-escaped or unescaped? What is best practice?
Right now, for example, one value looks like this in my DB (note the escaped ampersand):
The database itself does not care.
It is your choice whether you escape the values before writing them to the DB, or after reading them from the DB.
However, you might need to apply different escaping algorithms in different contexts (URL, HTML, XML, JSON, CSV, etc.), and if you write HTML code to an .html file, you need no escaping at all.
So it would be a bad idea to force the values in the DB into one specific escaping; store them unescaped and escape on output for whichever context you are writing to.
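To illustrate why a single stored escaping does not fit every context, here is a small Python sketch that takes one raw value and escapes it three different ways at output time. The sample string is made up.

```python
import html
import json
from urllib.parse import quote

# The value as it would be stored in the database: raw, unescaped.
raw = "Tom & Jerry <3"

as_html = html.escape(raw)  # for embedding in an HTML page
as_url = quote(raw)         # for embedding in a URL path or query value
as_json = json.dumps(raw)   # for embedding in a JSON document

print(as_html)  # Tom &amp; Jerry &lt;3
print(as_url)   # Tom%20%26%20Jerry%20%3C3
print(as_json)  # "Tom & Jerry <3"
```

If the HTML-escaped form had been stored instead, the URL and JSON outputs would carry a stray &amp; unless you unescaped it again first.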

Encoder.HtmlEncode encodes Farsi characters

I want to use the Microsoft AntiXss library for my project. When I use the Microsoft.Security.Application.Encoder.HtmlEncode(str) function to safely show some value in my web page, it encodes Farsi characters which I consider to be safe. For instance, it converts لیست to &#1604;&#1740;&#1587;&#1578;. Am I using the wrong function? How should I be able to print the user input in my page safely?
I'm currently using it like this:
I think I messed up! The Razor view encodes values unless you use @Html.Raw, right? Well, I encoded the string and then Razor encoded it again. So in the end it just got encoded twice, hence the weird-looking characters (Unicode values)!
If your encoding supports Farsi (let's assume it's Unicode by default), it's almost always safe to use Farsi in ASP.NET MVC without any additional effort.
First of all, escape-on-input is just wrong: you've taken some input and applied a transformation that has nothing to do with that data. It's generally wrong to encode your data immediately after you receive it from the user. You should store the data in its raw form in your database and encode it only when you display it to the user, according to the vulnerabilities of the system it is being displayed in. For example, the 'dangerous' HTML characters are not 'dangerous' for SQL or Android, and that's one of the main reasons why you shouldn't encode the data when you store it on the server. One more reason: when you HTML-encode a string, it can end up with 6-7 times more characters, which can be a problem if the server limits string length. When you store the data in SQL Server, you should escape, validate and sanitize it only for SQL Server and guard only against its vulnerabilities (like SQL injection).
Now, for ASP.NET MVC and Razor you don't need to HTML-encode your strings yourself, because that is done by default unless you use Html.Raw(), which you should generally avoid (or HTML-encode the value yourself when you do use it). Also, if you double-encode your data you'll end up with corrupted output :)
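The double-encoding effect is easy to reproduce outside of Razor. This Python sketch is only an analogy: the xmlcharrefreplace step stands in for an encoder that turns non-ASCII characters into numeric entities, and html.escape stands in for the view engine's automatic encoding. Neither is the AntiXss or Razor implementation.

```python
import html

farsi = "\u0644\u06cc\u0633\u062a"  # the word from the question

# Step 1: an encoder that writes non-ASCII characters as numeric entities.
entity_form = farsi.encode("ascii", "xmlcharrefreplace").decode("ascii")

# Step 2: the view layer encodes whatever it is given, including the '&'
# characters produced in step 1.
double_encoded = html.escape(entity_form)

# Encoding only once, at output time, leaves the Farsi text alone.
single_encoded = html.escape(farsi)

print(entity_form)     # &#1604;&#1740;&#1587;&#1578;
print(double_encoded)  # &amp;#1604;&amp;#1740;&amp;#1587;&amp;#1578;
print(single_encoded == farsi)  # True
```

A browser renders the double-encoded string as the literal entity text rather than the Farsi word, which matches the weird-looking output described above.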
I hope this helps clear things up.

Replacing an apostrophe to prevent a SQL error

I have a web-form with a Name field which I want to be able to accept single apostrophes, such as in the name O'Leary, but when trying to push this record to the SQL 2005 server, I get an error. My question is not this. It's that when I attempt to insert the record into the db using this statement...
Dim acctName As String = Replace(txtName.Text, "'", "''")
I get O''Leary in the database instead of O'Leary. I thought SQL was supposed to treat these doubled single apostrophes as one apostrophe?
You'd be better off using parameterized queries. These will automatically handle the single quotes, and protect you better from SQL Injection.
Inserting the double single quotes (did I say that right?) is a way of escaping the data. It should work, but it's not a best practice.
See this article for a much fuller answer:
What I'm proposing is step 3.
Edit - I should read the question better
If you're already using parameterized queries, or a stored procedure, and you're setting the value of acctName to the value of a parameter, then you do not need to escape the quotes yourself. That's handled automatically.
It's also handled by several tools, including the Microsoft Patterns and Practices Database library. That has several commands where you can pass in a statement and an array of objects that are used as parameter values, and that handles the escaping as well.
If either of those are the case, you can completely eliminate the line of code where you're replacing the values.
Depends how you're INSERTing the data into the database.
If you're using dynamic SQL and building the SQL string yourself, you are responsible for doubling the quotes yourself. But if you're using a parameterized query (as you should be, and probably are) then the engine will take care of that for you and, if you double the quotes yourself, you'll get doubled quotes in the database.
Note: if you started with dynamic SQL and later switched to parameterized queries, this issue would suddenly appear at the time you made the change.
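The three cases can be reproduced with a few lines of Python. SQLite is used here only as a stand-in for SQL Server 2005, and the table and column names are made up; the behaviour of doubled quotes inside a string literal versus a parameter value is the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT)")

name = "O'Leary"
doubled = name.replace("'", "''")  # what the Replace(...) line produces

# 1. Parameterized query, value passed as-is: stored correctly.
conn.execute("INSERT INTO accounts (name) VALUES (?)", (name,))

# 2. Parameterized query with manual doubling: the doubled quotes are data,
#    so they end up in the table.
conn.execute("INSERT INTO accounts (name) VALUES (?)", (doubled,))

# 3. Dynamic SQL: here the doubling is required, and the parser turns ''
#    back into a single apostrophe. Shown for comparison only.
conn.execute("INSERT INTO accounts (name) VALUES ('" + doubled + "')")

stored = [r[0] for r in conn.execute(
    "SELECT name FROM accounts ORDER BY rowid")]
print(stored)  # ["O'Leary", "O''Leary", "O'Leary"]
```

Case 2 is the symptom from the question: the escaping was applied by hand on top of a mechanism that already handles it.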
Off-the-cuff, without knowing too much detail I'd recommend checking the SET QUOTED_IDENTIFIER setting on the SQL Server. More information can be found here. Let me know if this helps.
It depends heavily on what query you actually submit. If you submit '' as a value, then that is what will be saved. You do need to double the ' when it sits inside a SQL string literal, but for other reasons (mainly security, but of course also syntax validity).
Please submit the code that you use to submit the query.

What is causing the corruption of text fields with ¿ characters?

We have a very strange problem in our application: all of a sudden we started noticing upside-down question marks being saved along with other text typed into the fields on the screen. These upside-down question marks were not originally entered by the users and it is unclear where they come from. We are using Oracle 10g with ASP.NET.
Here is an example of the issue: "140, 141) ¿ 16-Oct-07". If anyone has seen this before and found a way to fix it, please let me know how.
This sounds like a character encoding issue. Please check what encoding your database (and its tables) is set to, and what encoding the objects or strings passing data into the database use. If there is a mismatch (DB in ANSI, app in UTF-8), these sorts of issues can appear.
Greg, you should check the NLS_CHARACTERSET setting, not NLS_NCHAR_CHARACTERSET. And I bet it's WE8ISO8859P1 or something similar and not Unicode. The problem occurs when the submitted data is in Unicode, probably UTF-8, and Oracle tries to map the characters to the WE8ISO8859P1 character set. It does fine for most of them but fails for characters that have no equivalent in that set, such as high-numbered characters like 140.
So yes, I have seen the same issue in our application, and in our case it was caused by special quote marks (“example”, ‘example’) that were copied from MS Word. Word automatically converts straight quotes to these typographic quotes. The solution was to convert the database to UTF-8.
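A small Python sketch of the effect, assuming the database character set behaves like ISO-8859-1. The function name is made up, and substituting ¿ for every unmappable character is a simulation of the symptom described above, not Oracle's actual conversion code.

```python
def simulate_lossy_store(text: str, charset: str = "iso-8859-1",
                         placeholder: str = "\u00bf") -> str:
    """Replace every character the target charset cannot represent."""
    out = []
    for ch in text:
        try:
            ch.encode(charset)       # representable: keep it
            out.append(ch)
        except UnicodeEncodeError:   # not representable: substitute
            out.append(placeholder)
    return "".join(out)


# Text pasted from Word, with typographic quotes (U+201C and U+201D).
pasted = "\u201cexample\u201d 16-Oct-07"

stored_latin1 = simulate_lossy_store(pasted)
stored_utf8 = simulate_lossy_store(pasted, charset="utf-8")

print(stored_latin1)          # ¿example¿ 16-Oct-07
print(stored_utf8 == pasted)  # True
```

With a UTF-8 target nothing is lost, which is consistent with converting the database to UTF-8 having fixed the problem.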
If your users are copying from MS Word, you can turn the feature off. It's part of the AutoCorrect/AutoFormat functionality. If you uncheck the replace options for quotes and apostrophes you should be OK. Be sure to turn off the replacements in both AutoFormat and AutoFormat As You Type.
