Convert characters to html equivalent using .net - asp.net

I have a text document that is a roster of licensees. I am looping through this document to create an HTML table of this data, and I've come across names with non-standard characters.
This is one of them:
Aimeé
I tried running all the inputs through the following function, but when it comes across the above character, it doesn't replace it:
Function ReplaceBadCharacters(ByVal input As String) As String
Return input.Replace(Chr(233), "&eacute;")
End Function
How can I replace each character with the html equivalent?
EDIT
When I debug the above function, it shows the input as Aime[] and not Aimeé.
In Chrome it looks like this: Aime�

You don't need to do that.
As long as your page is encoded as UTF-8, the characters will work fine.
However, you do need to call Server.HtmlEncode to escape HTML special characters.
(Unless you're printing the strings in a <%: %> block or a Razor @ block, which escapes them for you.)
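
To illustrate why the name can show up as Aime� and what HTML escaping actually changes, here is a minimal Python sketch (Python rather than VB.NET, purely for illustration). It assumes the roster file was saved as Windows-1252 and is being read as UTF-8, which is a common cause of this symptom but is not confirmed by the question; `raw` is a made-up sample value.

```python
import html

# Hypothetical sample: the name as it would be stored in a Windows-1252 file.
raw = b"Aime\xe9"

# Reading those bytes as UTF-8 fails on 0xE9 and yields U+FFFD, the replacement character.
wrong = raw.decode("utf-8", errors="replace")

# Reading them with the encoding they were written in recovers the name.
right = raw.decode("cp1252")

# Escaping only touches HTML special characters; the accented letter is left alone.
cell = html.escape(right + " <Jr.> & Co")
```

If this assumption holds, the fix belongs where the file is read (choosing the right encoding), not in a character-by-character replacement afterwards.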

é is not an ASCII character, but it is part of Latin-1 and Unicode. If you put it directly into the HTML and the page is served with the matching encoding, it will render correctly (just like how it shows up correctly in the browser when you look at this page).
If you do want to replace all instances of it, use the entity &eacute; instead:
input.Replace("é", "&eacute;")
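
Replacing one character at a time does not scale to a whole roster. As a rough sketch of the general idea, the Python function below converts every non-ASCII character to an HTML entity, using the named entities from the standard `html.entities` table and falling back to a numeric reference. The function name `to_entities` is made up for this example, and it deliberately does not escape `<`, `>` or `&`.

```python
from html.entities import codepoint2name

def to_entities(text: str) -> str:
    """Replace each non-ASCII character with an HTML entity (hypothetical helper)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                          # plain ASCII stays as-is
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")   # named entity, e.g. &eacute;
        else:
            out.append(f"&#{cp};")                  # numeric fallback
    return "".join(out)
```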

Related

How to scrape transliterated or font rendered text from a html page

I want to scrape https://777codes.com/newtestament/gen1.html and fetch all the Hebrew sentences.
However, some letters in the words are rendered by the stylesheet and font files, so the data fetched by scraping the HTML directly is incomplete.
For example when I use Beautiful Soup and fetch the contents of the first "stl_01 stl_21" class div I get "ייתꢀראꢁראꢁ" when I should be getting "בראשית"
I think I need to build a character map and match and replace the missing letters. How do I convert the scraped string into something I can use, such as UTF-8 text or Unicode code points, so that I can then look up the missing or substituted characters and replace them with their correct values?
Or is there a simpler way to get "בראשית" instead of "ייתꢀראꢁראꢁ" when scraping the first "stl_01 stl_21" class div
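
The character-map idea from the question could be sketched in Python roughly as follows. This is only an outline under assumptions: the helper names are invented, and the actual mapping from the font's substitute characters to Hebrew letters is not known here, so it would have to be built by hand by comparing the scraped text with what the page renders. The example call below uses a placeholder private-use character, not the site's real data.

```python
def code_points(text: str) -> list[str]:
    """List the Unicode code points in a scraped string, for inspection."""
    return [f"U+{ord(ch):04X}" for ch in text]

def apply_char_map(text: str, char_map: dict[str, str]) -> str:
    """Replace each substitute character with its intended letter."""
    return text.translate(str.maketrans(char_map))

# Placeholder mapping (hypothetical): U+E000 stands in for a font-substituted glyph.
example_map = {"\ue000": "\u05d1"}
fixed = apply_char_map("\ue000x", example_map)
```

Printing `code_points(...)` for a scraped string shows which characters fall outside the Hebrew block (U+0590 to U+05FF) and therefore need an entry in the map.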

What do these characters mean? (ANSI code)

This is the code I'm running with Node:
const m = `[38;5;1;48;5;16m TEST`
console.log(m)
output:
It changes the text color.
As you can see, the character before the `[` (ESC) is a special character I don't understand; it isn't shown by the browser. How does it work?
Is there any alternative for ESC?
As @puucee already mentions, these are terminal control characters. I find it surprising that the code says ESC[, because that is not an escape sequence in normal Node. I suspect your IDE is displaying the real escape character as ESC. Octal escapes such as \033 are not allowed in template literals or strict-mode code, but hexadecimal escapes are, so your string should usually look like this:
console.log('\x1b[38;5;1;48;5;16m TEST \x1b[0m')
These are terminal control characters. They are often used, for example, for coloring the output. Some are non-printable. The backticks ` in your JavaScript example delimit a template literal.
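
To show how the sequence is put together, here is a small Python sketch (chosen only for illustration; the same bytes work from Node). In the sequence from the question, `38;5;N` selects foreground color N from the 256-color palette, `48;5;N` selects the background, and `0` resets all attributes. The function name `colorize` is made up for this example.

```python
ESC = "\x1b"  # the escape character, code 27; "\033" is the same character in Python

def colorize(text: str, fg: int, bg: int) -> str:
    """Wrap text in 256-color SGR sequences and reset afterwards."""
    start = f"{ESC}[38;5;{fg};48;5;{bg}m"  # set foreground and background
    reset = f"{ESC}[0m"                    # restore the terminal defaults
    return f"{start}{text}{reset}"

# Same colors as in the question: foreground 1, background 16.
print(colorize(" TEST ", 1, 16))
```

Whether the colors actually appear depends on the terminal; one that does not understand these sequences will print them literally or ignore them.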

paste0 regular and italicized text in R

I need to concatenate two strings within an R object: one is just regular text; the other is italicized. So, I tried a lot of combinations, e.g.
paste0(" This is Regular", italic( This is Italics))
The desired result should be:
This is Regular This is Italics
Any idea on how to do it?
Thanks!
In plot labels, you can use expressions (see mathematical annotation):
plot(1,xlab=expression("This is regular"~italic("this is italic")))
To provide a string that an HTML parser will render in italics, wrap the text in <i> and </i>. For example: "This is plain text, but <i>this is in Italics</i>.".
However, most HTML processors will assume that you want your text to appear as-is and will escape their input by default. This means that the special meanings of certain characters, including < and >, will be "turned off". You need to tell the processor not to do this. How you do that will depend on context. I can't tell you that because you haven't given me context.
Are you, for example, writing to a raw HTML file? (You need do nothing.) Are you writing to a Markdown file? If so, how? In plain text or in a rendered chunk? Are you writing a caption to a graphic? (Waldi has suggested a solution.) Etc.
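
As a small illustration of what "escaping by default" does to such a string, here is a Python sketch using the standard `html.escape` function (Python is used only for illustration; R tools that escape their input behave analogously).

```python
import html

label = "This is plain text, but <i>this is in Italics</i>."

# After escaping, the tags are shown literally instead of being interpreted.
escaped = html.escape(label)
print(escaped)
```

The escaped string displays the characters `<i>` on the page rather than switching to italics, which is why the processor has to be told to leave the markup alone.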

gsub en dash in R/Shiny

I am attempting to provide some links in a shiny app table that then get parsed via a text input. I get a url from a csv, render it in a data table as a link like so:
paste0("<a href='",link_1,"'>Link</a>")
Then the user copies the link and pastes it into a text input, which is used in a function. Somewhere in the above process, the en dashes in the links get jumbled and turned into this character string: %C2%96, which prevents my function from running correctly.
My workaround for this was to use gsub on the url to change %C2%96 back to an en dash, but I have not been able to get it to work.
I have tried the following, with varying results:
gsub("%C2%96", "\u2013", url) - this turns the string into – instead of an en dash. This does it both in the console and the Shiny session.
However, if I run paste(url_first_half, "\u2013", url_second_half) it returns a correctly encoded url.
gsub("%C2%96", "–", url) - directly copying an en dash has the same effect as above, turning the string into –.
I have my server code saved with UTF-8 encoding. How can I correctly return an en dash via gsub?
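
One plausible reading of %C2%96, offered as an assumption rather than a diagnosis: it is the UTF-8 percent-encoding of U+0096, and byte 0x96 is the en dash in Windows-1252. That pattern appears when a Windows-1252 file is read as Latin-1. The Python sketch below (Python only for illustration, with a made-up example URL and a hypothetical helper name) shows the relationship and a repair that keeps the URL percent-encoded.

```python
from urllib.parse import quote, unquote

BROKEN = "%C2%96"

# Percent-decoding as UTF-8 gives U+0096, a C1 control character.
decoded = unquote(BROKEN)

# The same byte value interpreted as Windows-1252 is the en dash, U+2013.
as_cp1252 = decoded.encode("latin-1").decode("cp1252")

def repair_en_dash(url: str) -> str:
    """Swap the mis-encoded sequence for a correctly encoded en dash."""
    return url.replace(BROKEN, quote("\u2013"))

fixed = repair_en_dash("https://example.com/a%C2%96b")
```

If this is what happened, reading the CSV with the Windows-1252 encoding in the first place would avoid the need for any replacement.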

How to avoid implicit mailto link in Restructured Text?

I'm new to reStructuredText and am trying to write a document that refers to a project with an "at" sign in the name, something like "Foo@BAR". When I convert the .rst file into HTML using the docutils "rst2html" tool, this is converted into a "mailto" link. If I use double backticks for verbatim rendering, it is turned into monospace text. How can I get it to be rendered in the normal text font, and not converted into a link?
You can use character escaping to include an @ within a word. In reStructuredText the escape character is \, so try using Foo\@BAR in your document.
