Specifying pronunciations with user dictionaries (Nuance Vocalizer Expressive TTS 5.4)

I am trying to correct the pronunciation of a word using a user dictionary called userdct_eng.dct, which will later be converted to a .dat file using Python.
My problem is that I don't know how to modify the pronunciation of an input word that is enclosed in double quotes (").
This is the example content of the dictionary:
[Header]
Name=userdct_eng.dct
Description=userdct_eng
Language=ENG
Content=EDCT_CONTENT_BROAD_NARROWS
Representation=EDCT_REPR_SZZ_STRING
[Data]
you // #'jEs#
"you" // #'jEs#
With the first entry I am trying to make the word you be pronounced as yes. This entry ( you // #'jEs# ) works.
With the second entry I am trying to make the word "you" (including the double quotes) be pronounced as yes, but this entry ( "you" // #'jEs# ) doesn't work: the voice still pronounces it as you.
My question is: how do I deal with a word in double quotation marks?
Thanks.

SOLVED by putting a backslash (\) before each double quote (").
Example:
\"you\" // #'jEs#
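Since the question mentions a Python conversion step, here is a minimal sketch of generating such entries with the quotes escaped. The helper name format_entry is hypothetical and not part of any Nuance tooling; the only rule it relies on is the one reported above, a backslash before each double quote.

```python
def format_entry(word: str, phonemes: str) -> str:
    """Build one [Data] line, escaping double quotes in the word with a backslash.

    Hypothetical helper: only the backslash-before-quote rule comes from the post.
    """
    escaped = word.replace('"', '\\"')
    return f"{escaped} // {phonemes}"


entry_plain = format_entry("you", "#'jEs#")
entry_quoted = format_entry('"you"', "#'jEs#")

print(entry_plain)
print(entry_quoted)
```

The second printed line has the same form as the working entry shown in the example above.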


How can I remove emoji from text using XQuery?

I have $text = "Hello 😀😃😄 💜 🙏🏻 🦦üäö$"
I want to remove just the emoji from the text using XQuery. How can I do that?
Expected result: "Hello üäö$"
I tried:
replace($text, '\p{IsEmoticons}+', '')
but it didn't work; it removed only the smileys.
Result now: "Hello 💜 🙏🏻 🦦üäö$"
Thanks in advance :)
I outlined the approach in my answer to the original question, which I updated based on your comment asking about how to strip out 💜.
Quoting from that expanded answer:
The "Emoticons" block doesn't contain all characters commonly associated with "emoji." For example, 💜 (Purple Heart, U+1F49C), according to a site like https://www.compart.com/en/unicode/U+1F49C that lets you look up Unicode character information, is from:
Miscellaneous Symbols and Pictographs, U+1F300 - U+1F5FF
This block is not available in XPath or XQuery processors, since it is listed neither in the XML Schema 1.0 spec linked above nor in "Unicode block names for use in XSD regular expressions", a list of blocks that XPath and XQuery processors conforming to XML Schema 1.1 are required to support.
For characters from blocks not available in XPath or XQuery, you can manually construct character classes. For example, given the purple heart character above, we can match it as follows:
replace("Purple 💜 heart", "[🌀-🗿]", "")
This returns the expected result:
Purple  heart
This approach can be applied to 🙏🏻, 🦦, or any other character:
Locate the character's Unicode block.
Craft your regular expression with the block name (if available in XPath) or character class.
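For comparison, the same character-class technique can be sketched in Python, whose re module has no \p{Is...} block escapes, so every block is written as an explicit range. The three ranges below are an assumption: they cover only the characters in the sample string and are not a complete emoji definition.

```python
import re

# Assumed ranges, read off the Unicode block charts; extend as needed.
EMOJI_BLOCKS = (
    "\U0001F300-\U0001F5FF"  # Miscellaneous Symbols and Pictographs
    "\U0001F600-\U0001F64F"  # Emoticons
    "\U0001F900-\U0001F9FF"  # Supplemental Symbols and Pictographs
)
EMOJI_PATTERN = re.compile(f"[{EMOJI_BLOCKS}]")


def strip_emoji(text: str) -> str:
    """Remove every character that falls inside one of the listed blocks."""
    return EMOJI_PATTERN.sub("", text)


# The sample string from the question, written with escapes to avoid encoding issues.
sample = (
    "Hello \U0001F600\U0001F603\U0001F604 \U0001F49C "
    "\U0001F64F\U0001F3FB \U0001F9A6\u00FC\u00E4\u00F6$"
)
cleaned = strip_emoji(sample)
print(cleaned)
```

The spaces that separated the emoji are not part of any listed block, so they remain in the output.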
Alternatively, rather than locating the blocks of characters you want to strip out, you could identify the blocks of characters you want to preserve. For example, given the example string in the original post, perhaps the goal is to preserve only those characters in the "Basic Latin" block. To do so, we can match characters NOT in this block via the \P Category Escape:
xquery version "3.1";
let $text := "Hello 😀😃😄 💜 🙏🏻 🦦üäö$"
return
replace($text, "\P{IsBasicLatin}", "")
This query returns:
Hello $
Notice that this has stripped out the characters with diacritics, which perhaps isn't desired. These characters with diacritics belong to the Latin-1 Supplement block. To preserve characters from both the Basic Latin and Latin-1 Supplement blocks, we'd need to adjust the query as follows:
xquery version "3.1";
let $text := "Hello 😀😃😄 💜 🙏🏻 🦦üäö$"
return
replace($text, "[^\p{IsBasicLatin}\p{IsLatin-1Supplement}]", "")
... which returns:
Hello üäö$
This now preserves the characters with diacritics.
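The "keep only these blocks" variant can be sketched the same way in Python. This assumes the two blocks are given as code point ranges (Basic Latin is U+0000 to U+007F, Latin-1 Supplement is U+0080 to U+00FF), since Python's re module does not know block names.

```python
import re

# Negated class: anything outside Basic Latin + Latin-1 Supplement.
# The two blocks are adjacent, so one contiguous range covers both.
NOT_LATIN = re.compile(r"[^\u0000-\u00FF]")


def keep_latin(text: str) -> str:
    """Drop every character outside U+0000..U+00FF."""
    return NOT_LATIN.sub("", text)


# The sample string from the question, written with escapes.
sample = (
    "Hello \U0001F600\U0001F603\U0001F604 \U0001F49C "
    "\U0001F64F\U0001F3FB \U0001F9A6\u00FC\u00E4\u00F6$"
)
kept = keep_latin(sample)
print(kept)
```

A keep-list like this is blunt: it also removes any non-emoji character outside the two blocks, which may or may not be what you want.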
To be precise about the characters you preserve or remove, you need to consult the Unicode blocks and charts.

How to escape quotechar in opencsv CSVReader. Default quotechar is (") double quote

The data I am passing in through a CSV file contains text such as: 1 Micron Filter Cartridge 10"-(DNL). I want to escape the " in the text.
Create a CSVReader with an RFC4180Parser; that will accept the data you listed above.
The default CSVReader uses a CSVParser, and for that you need to put the escape character (\ unless you set a different one) before the quote, to let the CSVParser know the quote is just another character in the string and not to be acted upon.
Hope that helps.
Scott :)
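The two escaping conventions mentioned above (RFC 4180 style, where a quote inside a quoted field is doubled, and backslash style) can be illustrated with Python's standard csv module. This is only an analogy: it does not use or show opencsv's API, and the sample lines are made up around the value from the question.

```python
import csv

# RFC 4180 convention: a quote inside a quoted field is written twice.
rfc4180_line = '1,"1 Micron Filter Cartridge 10""-(DNL)"'
# Backslash convention: the quote is preceded by an escape character.
backslash_line = '1,"1 Micron Filter Cartridge 10\\"-(DNL)"'

rfc4180_row = next(csv.reader([rfc4180_line]))
backslash_row = next(
    csv.reader([backslash_line], escapechar="\\", doublequote=False)
)

print(rfc4180_row)
print(backslash_row)
```

Both readers yield the same field value with a single literal quote in it; what differs is how the quote has to be written in the file.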

Escaping backslash (\) in string or paths in R

Windows copies paths with backslashes (\), which R does not accept. So I wanted to write a function that would convert \ to /. For example:
chartr0 <- function(foo) chartr('\','\\/',foo)
Then use chartr0 as...
source(chartr0('E:\RStuff\test.r'))
But chartr0 is not working. I guess I am unable to escape /, and escaping / may be important on many other occasions.
Also, is it possible to avoid calling chartr0 every time and instead convert all paths automatically, for example by creating an environment in R that calls chartr0, or through some temporary setting such as options?
From R 4.0.0 you can use r"(...)" to write a path as a raw string constant, which avoids the need for escaping:
r"(E:\RStuff\test.r)"
# [1] "E:\\RStuff\\test.r"
There is a new syntax for specifying raw character constants similar to the one used in C++: r"(...)" with ... any character sequence not containing the sequence )". This makes it easier to write strings that contain backslashes or both single and double quotes. For more details see ?Quotes.
Your fundamental problem is that R will signal an error condition as soon as it sees a single back-slash before any character other than a few lower-case letters, backslashes themselves, quotes or some conventions for entering octal, hex or Unicode sequences. That is because the interpreter sees the back-slash as a message to "escape" the usual translation of characters and do something else. If you want a single back-slash in your character element you need to type 2 backslashes. That will create one backslash:
nchar("\\")
#[1] 1
The "Character vectors" section of An Introduction to R says:
"Character strings are entered using either matching double (") or single (') quotes, but are printed using double quotes (or sometimes without quotes). They use C-style escape sequences, using \ as the escape character, so \\ is entered and printed as \\, and inside double quotes " is entered as \". Other useful escape sequences are \n, newline, \t, tab and \b, backspace; see ?Quotes for a full list."
?Quotes
chartr0 <- function(foo) chartr('\\','/',foo)
chartr0('E:\\RStuff\\test.r')
You cannot write E:\Rxxxx, because R treats \R as an escape sequence (and an invalid one).
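The same two ideas (double the backslash in an ordinary literal, or use a raw literal) exist in other languages with C-style escapes. A minimal Python sketch of the corrected chartr0 approach, offered for comparison only; to_forward_slashes is a hypothetical name and nothing here is R code:

```python
def to_forward_slashes(path: str) -> str:
    """Replace every backslash with a forward slash."""
    # "\\" is a one-character string: a single backslash.
    return path.replace("\\", "/")


raw_path = r"E:\RStuff\test.r"       # raw literal: backslashes taken as-is
escaped_path = "E:\\RStuff\\test.r"  # ordinary literal: each backslash doubled

print(to_forward_slashes(raw_path))
```

Either way of writing the literal produces the same string in memory; the escaping only affects how it is typed in source code.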
The problem is that every single forward slash and backslash in your code is escaped incorrectly, resulting in either an invalid string or the wrong string being used. You need to read up on which characters need to be escaped and how. Take a look at the list of escape sequences in the link below. Anything not listed there (such as the forward slash) is treated literally and does not require any escaping.
http://cran.r-project.org/doc/manuals/R-lang.html#Literal-constants

Why does ConfigurationManager.AppSettings convert "\n" to "\\n"?

I have an AppSetting in web.config.
<add key="key" value="\n|\r"/>
When I read it with ConfigurationManager.AppSettings["key"], it gives "\\n|\\r".
Why?
In the debugger, because the backslash is a special character used for things like tabs (\t) and line endings (\n), it has to be escaped with another backslash. Hence any text that contains an actual \ will be displayed as \\. If you print it to a file or use it in any other way, you will find your string contains only the one \.
This isn't ConfigurationManager doing anything.
The backslash escaping syntax is only recognized inside string literals by the C# compiler. Since your string is being read from an XML file at runtime, you need to use XML-compatible escaping (character references) in order to include those characters in your string. Thus, your app settings entry should look like the following:
<add key="key" value="&#10;|&#13;"/>
Because 10 and 13 are the decimal character codes for line feed and carriage return, respectively (&#xA; and &#xD; in hex).
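The difference between the two attribute values can be checked with any XML parser. The sketch below uses Python's xml.etree.ElementTree as a stand-in for the .NET configuration reader, on the assumption that both follow standard XML attribute handling; it is not the ConfigurationManager API.

```python
import xml.etree.ElementTree as ET

# Backslash sequences mean nothing to XML: the value is 5 literal characters.
literal = ET.fromstring(r'<add key="key" value="\n|\r"/>').get("value")
# Character references are decoded by the parser into LF and CR.
entities = ET.fromstring('<add key="key" value="&#10;|&#13;"/>').get("value")

# repr() escapes backslashes for display, much like the debugger view.
print(len(literal), repr(literal))
print(len(entities), repr(entities))
```

The first value has length 5 and is displayed with doubled backslashes, while the second has length 3 and holds real line feed and carriage return characters.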
Like cjk said, the extra slash is being inserted by the debugger to indicate that it is seeing a literal slash and not an escape sequence.
I solved the same problem with a string replacement.
Not beautiful, but it works!
ConfigurationManager.AppSettings["Key"].Replace("\\n", "\n")
string str = "\n";// means \n
string str1 = @"\n";// means \\n
From the AppSettings, it seems that when you extract the key's value, it behaves as if it were wrapped in @ internally. Escape sequences are processed by the compiler, not at runtime.

Does QString::fromUtf8 automatically reverse a Hebrew string?

I am having a problem where a Hebrew string is being displayed in reverse. I use QTableWidget to display some info, and here the string appears correctly using:
CString hebrewStr; hebrewStr.ToUTF8();
QString s = QString::fromUtf8( hebrewStr );
In another part of my program this same string is displayed on the screen, but not using Qt, and there it is shown in reverse:
CString hebrewStr;
hebrewStr.ToUTF8();
I have debugged, and hebrewStr.ToUTF8() produces exactly the same Unicode string in both cases, but the string is only displayed correctly in the QTableWidget. So I am wondering whether Qt automatically reverses a given Hebrew string (since Hebrew is a right-to-left language). Thanks!
Yes, in this case QString generates the full Unicode (wchar_t) string from the UTF-8 encoded string. If you would like to do a similar thing in MFC, you should use CStringW and decode the string.
Use MultiByteToWideChar for the UTF-8 to CStringW conversion.
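As a language-neutral illustration of the decoding step, here is a small Python sketch (not Qt or MFC code). It shows that decoding UTF-8 returns the characters in their stored, logical order and that Hebrew letters carry a right-to-left property. It rests on the assumption that any visual difference comes from whether the display layer applies bidirectional layout to the decoded text, which the thread itself does not confirm.

```python
import unicodedata

# A four-letter Hebrew word: shin, lamed, vav, final mem.
hebrew = "\u05E9\u05DC\u05D5\u05DD"

utf8_bytes = hebrew.encode("utf-8")   # what a ToUTF8()-style call would hand over
decoded = utf8_bytes.decode("utf-8")  # what a fromUtf8()-style call would rebuild

same_order = decoded == hebrew
bidi_classes = [unicodedata.bidirectional(ch) for ch in decoded]

print(same_order, len(utf8_bytes), bidi_classes)
```

The round trip leaves the character order unchanged, and each letter reports bidirectional class R (right-to-left).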
Connected question in StackOverflow.
