R doesn't recognize Latin7 characters

I have a really strange problem. I am using a Lithuanian keyboard, but R doesn't recognize letters such as į, š, č.
For example, when I write:
žodis <- "žibutė"
in the R console I see
þodis <- "þibutë".
I have R on several computers, and all work fine except this one. Can you help me with this issue? Is there a function to tell R that I'm using a Lithuanian keyboard? My computer's operating system is Windows 10 and my R version is 3.3.2.
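The garbling above is consistent with bytes from a Baltic code page (Windows-1257, close to Latin7) being displayed through a Western European one (Windows-1252). The sketch below reproduces that effect with base R's iconv(). The encoding names "CP1257" and "CP1252" are assumptions about what the local iconv build accepts, and the Sys.setlocale() call shown in the comment is only a possible remedy to try on Windows, not a confirmed fix.

```r
# Bytes of "žibutė" as a Baltic code page (Windows-1257) would store them
raw_bytes <- as.raw(c(0xFE, 0x69, 0x62, 0x75, 0x74, 0xEB))
x <- rawToChar(raw_bytes)

# Same bytes, two interpretations
as_baltic  <- iconv(x, from = "CP1257", to = "UTF-8")  # intended reading
as_western <- iconv(x, from = "CP1252", to = "UTF-8")  # garbled reading

print(as_baltic)
print(as_western)

# Show which character-type locale the current session uses
print(Sys.getlocale("LC_CTYPE"))

# Possible remedy on Windows (assumption, not verified on the asker's machine):
# Sys.setlocale("LC_CTYPE", "Lithuanian")
```

If the session locale reported by Sys.getlocale() is not a Lithuanian or Baltic one on the affected computer, that difference from the working machines would be the first thing to compare.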

Related

double ~ in R interactive terminal in VSCode

I am trying to use R in VSCode. I installed the R extension and followed the specified steps, but I keep running into an annoying issue: when I use Alt+126 to type the character '~' in the script, everything is fine, but when I run the line, the character appears twice in the R interactive terminal. Likewise, if I use Alt+126 directly in the R interactive terminal, '~~' appears instead of '~'. I have no idea why this is happening. I tried uninstalling and reinstalling both VSCode and R and resetting the PC, but nothing changed.
I hope somebody knows how to solve this, thanks in advance!

missing R Data Editor window with RStudio on Mac

I am learning R with RStudio on Mac. When trying the following code:
mydata <- data.frame(age=numeric(0),
gender=character(0), weight=numeric(0))
mydata <- edit(mydata)
if I use the R GUI on Mac, it works fine.
[Screenshot: R data editor popup from R on Mac]
But if I run the same code from RStudio on the same Mac, no window appears and RStudio hangs.
Can anybody help?
RStudio doesn't support the edit() function. Instead, you can use a package such as 'editData' (https://cran.r-project.org/web/packages/editData/README.html).
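When no interactive editor is available, the same empty data frame can also be filled in code. This is a minimal sketch using only base R; the row values are hypothetical and only illustrate the column types from the question.

```r
# Empty data frame with the same columns as in the question
mydata <- data.frame(age = numeric(0),
                     gender = character(0),
                     weight = numeric(0),
                     stringsAsFactors = FALSE)

# Hypothetical rows to append instead of typing them into edit()
new_rows <- data.frame(age = c(25, 31),
                       gender = c("f", "m"),
                       weight = c(60.5, 82),
                       stringsAsFactors = FALSE)

mydata <- rbind(mydata, new_rows)
print(mydata)
```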

Bengali conjuncts not rendering in ggplot

library(ggplot2)
ggplot(data=NULL,aes(x=1,y=1))+
geom_text(size=10,label="ক্ত", family="Kohinoor Bangla")
On my machine, the Bengali conjunct cluster "ক্ত" is rendered as its constituents plus a virama:
I have tried several different fonts to no avail. Is there a trick to making conjuncts render correctly?
EDIT:
Explicitly using the Unicode escapes still does not render correctly:
This renders correctly for me:
print(stringi::stri_enc_toutf8("\u0995\u09cd\u09a4"))
This still gives me the exact same result as before
ggplot(data=NULL,aes(x=1,y=1))+
geom_text(size=10,label="\u0995\u09cd\u09a4", family="Kohinoor Bangla")
Why is there a difference between the console output and ggplot output?
I'm not familiar with the Bengali language, but if you look up the Unicode code points for the text that you want to render, you can simply use those in geom_text():
# According to unicode code chart, these are some Bengali characters
# U+099x4
# U+09Ex3
ggplot(data=NULL,aes(x=1,y=1))+
# Substitute 'U+' by '\u', leave the 'x' out
geom_text(size = 10, label = "\u0994\u09E3")
Substitute the unicode characters as you see fit.
Hope that helped!
EDIT: I tried your last piece of code, which gave me a warning about the font not being installed. So I ran it without the family = "Kohinoor Bangla":
ggplot(data=NULL,aes(x=1,y=1))+
geom_text(size=10,label="\u0995\u09cd\u09a4")
Which gave me the following output:
From a visual comparison with the character that you posted, it looks quite similar. Next, I ran the same piece of code on my work computer, which gave me the following output:
The difference between work and home is that work runs Linux while home runs Windows; work has R 3.4.4, home has R 3.5.3. Both use RStudio and ggplot2 3.2.0. I can't update R at work because of backwards compatibility issues, so I can't check whether the version of R might be the problem. However, you could check whether your version of R is older than 3.5.3 and see if updating resolves the problem. Otherwise, I would guess it is a platform issue.
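To rule out the string itself as the cause, the code points can be inspected with base R. The sketch below only confirms that the label consists of KA, the virama, and TA in that order; if that holds, the remaining difference between console and plot would lie in how the graphics device shapes text, which is an assumption this snippet does not test.

```r
# The conjunct from the question, written with Unicode escapes
conjunct <- "\u0995\u09cd\u09a4"

# Decompose into code points and format them as U+XXXX
cps <- utf8ToInt(conjunct)
print(sprintf("U+%04X", cps))

# Number of code points in the string (not the number of rendered glyphs)
print(nchar(conjunct, type = "chars"))
```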

Converting accents to ASCII in R

I'm trying to convert special characters to ASCII in R. I tried using Hadley's advice in this question:
stringi::stri_trans_general('Jos\xe9', 'latin-ascii')
But I get "Jos�". I'm using stringi v1.1.1.
I'm running a Mac. My friends who are running Windows machines seem to get the desired result of "Jose".
Any idea what is going on?
The default encoding on Windows is different from the typical default encoding on other operating systems (UTF-8). x = 'Jos\xe9' means something in Latin1, but not in UTF-8. So, on Linux or OS X you need to tell R what the encoding is:
x <- 'Jos\xe9'
Encoding(x) <- 'latin1'
stringi::stri_trans_general(x, 'Latin-ASCII')
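The effect of declaring the encoding can be checked with base R alone. This is a small sketch, assuming a session where the native encoding is UTF-8 (as on a typical Mac or Linux setup); the variable names before and y are introduced here for illustration.

```r
x <- "Jos\xe9"

# The byte 0xE9 on its own is not a valid UTF-8 sequence
before <- validUTF8(x)
print(before)

# Declare that the bytes are Latin-1, then convert to UTF-8
Encoding(x) <- "latin1"
y <- enc2utf8(x)

print(validUTF8(y))
print(utf8ToInt(y))  # code points of J, o, s, and e with acute accent
```

Once the string is valid UTF-8, transliteration functions have a well-defined character to work with instead of a stray byte.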

Read.dta not working on Mac OS X

A project that normally works on my Windows 7 office machine now gives errors on my Mac OS X laptop when I run it in RStudio. The part where it fails is
library(foreign)
basis <- read.dta("myfile.dta")
Error in factor(rval[[v]], levels = tt[[ll[v]]], labels = names(tt[[ll[v]]])) :
invalid 'labels'; length 4 should be 1 or 3
R and RStudio are on the newest versions, and I have already run update.packages(). As I'm a beginner with R itself, I have no idea what to try next.
Could this somehow be related to OS X encoding? The Stata file contains German umlauts (that is, non-ASCII characters).
Use the memisc package instead, which is supposed to be more flexible. From the docs (found here) we have:
The importer mechanism is more flexible and extensible than read.spss
and read.dta of package "foreign", as most of the parsing of the file
headers is done in R.
So back to the problem. First, load the following:
library(lattice)
library(MASS)
library(memisc)
and then use the call:
as.data.frame(as.data.set(Stata.file("filename.dta")))
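Since the error is raised inside the factor() call that read.dta makes when converting labelled values, another option worth trying is to skip that conversion with convert.factors = FALSE. The sketch below is self-contained: it writes a small hypothetical file to a temporary path and reads it back, so the column names and values are illustrative and not taken from the original project.

```r
library(foreign)

# Hypothetical stand-in for "myfile.dta"
tmp <- tempfile(fileext = ".dta")
example_data <- data.frame(
  antwort = factor(c("ja", "nein", "vielleicht")),
  wert = c(1.5, 2.5, 3.5)
)
write.dta(example_data, tmp)

# Read without turning labelled values into factors,
# which avoids the factor() call that fails in the question
basis <- read.dta(tmp, convert.factors = FALSE)
str(basis)

unlink(tmp)
```

With this setting the labelled column comes back as numeric codes, so the labels would have to be reattached separately if they are needed.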
