Chinese text not working in terminal - sqlite

I'm trying to paste Chinese text into Terminal, but I just get lots of numbers instead. If I paste as soon as Terminal loads, the paste works that one time, but not again. The text is UTF-8 Unicode.
I don't think it's the font, as the text works fine in TextEdit; the only place I get the problem is in Terminal, but I need Terminal to build my SQLite database.
What would be the best thing to do?
Thanks

Open the Terminal Inspector and make sure the Character Set Encoding is set to Unicode (UTF-8), and check the "Wide glyphs for Japanese/Chinese/etc." setting.

The best thing to do would probably be to write the data into an SQL file and execute it with sqlite3 mydatabase.db < mychinesetextfile.sql.
It's not pretty, on the whole; but it'll work.
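For illustration, a minimal mychinesetextfile.sql might look like the following (the table and column names are made up; save the file as UTF-8 so the Chinese text survives):
-- Hypothetical table; adjust the schema to your own database.
CREATE TABLE IF NOT EXISTS phrases (id INTEGER PRIMARY KEY, text TEXT);
INSERT INTO phrases (text) VALUES ('你好，世界');
INSERT INTO phrases (text) VALUES ('中文测试');
This sidesteps the paste problem entirely, since the bytes go straight from the file into SQLite without passing through the terminal's input handling.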

Related

RStudio - some non-standard special characters in the R Script change by themselves

I occasionally work with data frames where unorthodox special characters are used that look identical to standard characters in RStudio's built-in viewer. I refer to these characters in my scripts, but sometimes when I open the file, these characters have been changed to standard keyboard characters within the script.
For example, in my script, ’ changes to a standard apostrophe ' and – changes to a standard hyphen -.
These scripts are ones I have to run regularly, so having to manually correct this each time is a chore. I also haven't worked out what it is that triggers RStudio to make these changes. I've tried closing and reopening to test if that's the trigger, and the characters have remained correct. It only seems to happen after I've turned off my computer.
Does anyone know of a workaround for this and/or what is causing this? TIA
EDIT: the reason I need these characters preserved is that I export to a CSV, which is UTF-8 encoded.
I've found a workaround, although I welcome any feedback on any drawbacks to this.
If you have already written your code (including the special characters):
Click File > Save with Encoding... > Show all encodings > unicodeFFFE
Now when you reopen the file:
Click File > Reopen with Encoding... > Show all encodings > unicodeFFFE
If you haven't already written your code, it should just be a case of saving your file from the start with the unicodeFFFE encoding (instructions above) before you write the code and then using the reopen with encoding option whenever you open the file.
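As a script-level complement (a sketch, not part of the original workaround): unicodeFFFE is Windows' name for big-endian UTF-16, and since the end goal is a UTF-8 CSV, you can also be explicit about the encoding when writing, so the characters survive regardless of how the editor displays them. The data frame here is hypothetical:
# Hypothetical data containing the problem characters, written as \u escapes
# so the script itself stays pure ASCII.
df <- data.frame(note = c("curly quote: \u2019", "en dash: \u2013"))
# Write the CSV as UTF-8 explicitly rather than relying on the locale default.
write.csv(df, "notes.csv", fileEncoding = "UTF-8", row.names = FALSE)
Using \u escapes in the script is itself a workaround: the file then contains only ASCII characters, so there is nothing for a re-encoding step to corrupt.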

Using write.csv() function in R but it doesn't actually save it to my C:

I'm a newbie and have searched Stack, the internet, everywhere I can think of.. But I cannot figure out why when I use write.csv() in R it doesn't actually save it as a csv file on my computer. All I want is to get a .csv file of my work from RStudio to Tableau and I've spent a week trying to figure it out. Many of the answers I have read use too much coding "lingo" and I cannot translate it because I'm just a beginner. Would be so so thankful for any help.
Here is the code I'm using:
""write.csv(daily_steps2,"C:\daily_steps2.csv", row.names = TRUE)""
I put the double quotes around the code because it seems like that's what I'm supposed to do here? IDK, but I don't have those when I run the function. There is no error when I run this, it just doesn't show up as a .csv on my computer. It runs but actually does nothing. Thank you so much for any help.
In my opinion, the simplest way would be to save the file to the same folder that RStudio is running in and use the RStudio GUI. It should be write.csv(daily_steps, "./daily_steps.csv"), and then on the tab in the bottom right of RStudio you can select Files, and it should be there. Then you can use the graphical user interface to move it to your desktop, much as you would in MS Word.
A quick fix is to use double backslashes or forward slashes in Windows paths. (Also, since row.names = TRUE is the default, there is no need to specify it.)
write.csv(daily_steps2, "C:\\daily_steps2.csv")
write.csv(daily_steps2, "C:/daily_steps2.csv")
However, consider the OS-agnostic file.path(), which avoids issues with folder separators in file paths: forward slash (used on Unix-like systems such as macOS and Linux) versus backslash (used on Windows).
write.csv(daily_steps2, file.path("C:", "daily_steps2.csv"))
Another benefit of this functional form of path construction is that you can pass dynamic file or folder names without paste().
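For example (a sketch; the directory and file-name variables are made up for illustration):
# Hypothetical dynamic folder and file names, combined without paste().
out_dir <- "C:"
out_file <- "daily_steps2.csv"
write.csv(daily_steps2, file.path(out_dir, out_file))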

Track the exact place of a not encoded character in an R script file

This is more of a tip question that can save a lot of time in many cases. I have a script.R file which I try to save, and I get the error:
Not all of the characters in ~/folder/script.R could be encoded using ASCII. To save using a different encoding, choose "File | Save with Encoding..." from the main menu.
I had been working on this file for months, and today I was editing my code like crazy and got this error for the first time, so obviously I inserted a character today that cannot be encoded.
My question is: can I track down this specific character and find exactly where in the document it is?
There are about 1000 lines in my code, so searching manually is almost impossible.
Use tools::showNonASCIIfile() to spot the non-ASCII characters.
Let me suggest two slight improvements to this.
Process:
Save your file using a different encoding (e.g. UTF-8).
Set a variable f to the path of that file, something like f <- "yourpath\\yourfile.R".
Then use tools::showNonASCIIfile(f) to display the faulty characters, as sketched below.
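A minimal sketch of the two steps together (the path is hypothetical):
# Hypothetical path to the script after re-saving it as UTF-8.
f <- "C:\\myproject\\script.R"
# Prints each line containing non-ASCII characters, with the offending
# bytes shown as hex escapes, e.g.  42: title <- "caf<e9>"
tools::showNonASCIIfile(f)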
Something to check:
I have a Markdown file which I run to output to Word document (not important).
Some of the packages I load on initialisation mask previously loaded functions. I have found that the warning messages sometimes contain non-ASCII characters, and this seems to have caused this error for me: some fault put all that output at the end of the file, and I had to delete it anyway!
Check whether the characters are coming from warnings!
Cheers
Expanding the accepted answer with this answer to another question: to check for offending characters in the script currently open in RStudio, you can use this:
tools::showNonASCIIfile(rstudioapi::getSourceEditorContext()$path)

R exporting text issue

I have a problem that might be a bit unique, but I think that answering it could help with other questions about encoding too.
In order to expand my R skills, I tried to write a function to manage the VCF files from Android phones. Everything went OK until I tried to upload the file to the phone: an error appeared saying that the first line starts with something other than a normal VCF version 3 header. But when I checked the file on the PC, it appeared to be fine, without the characters my phone complained about. So I asked about it, and someone here said it is the byte order mark (BOM) and that I should use a hex editor to see it. And it was there, even though it couldn't be seen in the text editors of Windows and Linux.
Thus, I tried to solve the problem using the fileEncoding argument in R. The code I use to write the file is:
write.table(cons2, file = paste(filename, ".vcf", sep = ""), row.names = FALSE, col.names = FALSE, quote = FALSE, fileEncoding = "")
I passed ASCII as the argument, UTF-8, etc., but no luck. ASCII seems to delete some of the characters, and UTF-8 makes these characters visible in the text file.
I would appreciate if someone could provide a solution to this.
PS: I know that modifying the file in a hex editor solves the problem, but I want the solution in the R code.
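One way to handle this purely in R (a sketch, not an answer from the thread): after writing, read the file back as raw bytes and strip a leading UTF-8 BOM if one is present. It assumes the file name is built as in the question:
# Strip a UTF-8 BOM (the bytes EF BB BF) from the start of the written file.
path <- paste(filename, ".vcf", sep = "")
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
if (length(bytes) >= 3 && identical(bytes[1:3], as.raw(c(0xEF, 0xBB, 0xBF)))) {
  writeBin(bytes[-(1:3)], path)
}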

Unix vs. Windows rendering of characters

I have a text file that display differently when opening it in FreeBSD vs. Windows.
On FreeBSD:
An·lisis e InvestigaciÛn
On Windows:
Análisis e Investigación
The Windows representation is obviously right. Any ideas on how to get that result on FreeBSD?
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
The problem is that it's not ASCII but UTF-8. You have to use another editor which detects the encoding correctly, or convert the file to something your editor on FreeBSD understands.
This is not pure ASCII; it's UTF-8. Try a FreeBSD editor with UTF-8 support, or change your locale.
From the way the characters are being displayed, I would say that file is UTF-8 encoded Unicode. Windows is recognising this and displaying the 'á' and 'ó' characters correctly, while FreeBSD is assuming ISO-8859-1, which results in each of these characters being displayed as two separate characters (since the UTF-8 encoding uses two bytes for them).
You'll have to tell FreeBSD that it is a UTF-8 file, somehow.
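In practice that usually means either running in a UTF-8 locale or converting the file. A sketch (the file name is hypothetical; iconv is available on FreeBSD):
# Option 1: make the environment UTF-8 aware before opening the file.
export LC_ALL=en_US.UTF-8
# Option 2: convert the file to the encoding your tools expect.
iconv -f UTF-8 -t ISO-8859-1 file.csv > file-latin1.csv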
How is the file encoded? I would try re-encoding the file as UTF-16.
After doing a bit more digging: if I (1) open the CSV file in Excel on Mac and export it as a CSV file, and (2) then open it in TextMate, copy the text, and save it again, it works.
The result of file file.csv is:
UTF-8 Unicode English text, with very long lines
The original is:
Non-ISO extended-ASCII English text, with very long lines
This workaround isn't really suitable, though, as this process is supposed to be automated. Thanks for the help so far.
It doesn't matter which operating system you're using when you open the file. What matters is the application you use to open it. On Windows you're probably using Notepad, which automatically identifies the encoding as UTF-8.
The app you're using on FreeBSD obviously isn't doing that. Maybe it just can't read UTF-8 and you need to use a different app. Or maybe you just have to tell it which encoding to use. Automatic detection of character encodings is far from universal (and much farther from perfect).