Skimr - can't seem to produce the histograms - r

I came across this seemingly new package, skimr, which looks pretty nifty, and was trying it out, and it looks like I'm missing some package installation. skim() works fine except that it doesn't print the histograms it is supposed to print for numeric variables. I am merely trying the examples given in the documentation.
Link to skimr documentation here - https://github.com/ropenscilabs/skimr#skimr
this is the code I'm using
devtools::install_github("hadley/colformat")
devtools::install_github("ropenscilabs/skimr")
library(skimr)
a<-skim(mtcars)
dim(a)
View(a)
Instead of histograms being printed, I see some ASCII/Unicode replacement characters.

A workaround for the problem above is to set the locale of the R session to Chinese and to set the font of the R console to NSimSun.
temp <- tempfile()
cat("font = NSimSun\n", file = temp, append = TRUE)
loadRconsole(file = temp)
Sys.setlocale( locale='Chinese' )
library(skimr)
(a <- skim(mtcars))
View(a)
In RStudio this solution works only partially: histograms generated by skim() can be visualized only with View(), after setting the locale of R to Chinese.
Sys.setlocale( locale='Chinese' )
library(skimr)
a <- skim(mtcars)
View(a)
Hope this can help you.

Related

exporting plotted variable shows blank image

I am doing Java and R integration using JRI.
Please find the script below:
String path = "C:\\Users\\hrpatel\\Desktop\\CSVs\\DataNVOCT.csv";
rengine.eval("library(tseries)");
rengine.eval(String.format("mydata <- read.csv('%s')",path.replace('\\', '/')));
String exportFilePath= "C:\\Users\\hrpatel\\Desktop\\CSVs\\arima3.jpg";
rengine.eval("Y <- NewVisits");
rengine.eval("t <- Day.Index");
rengine.eval("summary(Y)");
rengine.eval("adf.test(Y, alternative='stationary')");
rengine.eval("adf.test(Y, alternative='stationary', k=0)");
rengine.eval("acf(Y)");
rengine.eval("pacf(Y)");
rengine.eval("mydata.arima101 <- arima(Y,order=c(1,0,1))");
rengine.eval("mydata.pred1 <- predict(mydata.arima101, n.ahead=1000)");
rengine.eval(String.format("jpeg('%s')",exportFilePath.replace('\\', '/')));
rengine.eval("plot(t,Y)");
rengine.eval("lines(mydata.pred1$pred, col='blue',size=10)");
rengine.eval("lines(mydata.pred1$pred+1*mydata.pred1$se, col='red')");
rengine.eval("lines(mydata.pred1$pred-1*mydata.pred1$se, col='red')");
rengine.eval("dev.off()");
In the codebase above, when I try plot(t,Y) or plot(Y), it exports a blank image, while plot(mydata) works fine.
One more thing: when I run the same code directly in R it creates the image (using JRI it shows a blank image).
I have spent a day trying to solve this but haven't found a solution.
Please suggest any alternatives you may have.
Thanks in advance.
If I understand correctly, you have a data set named mydata that has two columns, NewVisits and Day.Index. In that case you need to change:
rengine.eval("Y <- NewVisits");
to
rengine.eval("Y <- mydata$NewVisits");
and
rengine.eval("t <- Day.Index");
to
rengine.eval("t <- mydata$Day.Index");
This also explains why plot(mydata) works for you: R recognizes that object.
If this isn't the solution, then I can't see where you are reading NewVisits and Day.Index from.
By the way, I strongly recommend plotting with the ggplot2 package.
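Here is a minimal sketch of that ggplot2 route, assuming mydata has been read with read.csv() as in the question and really contains Day.Index and NewVisits columns; the output file name is just an example:
library(ggplot2)
# build the plot from the data frame itself rather than from loose vectors
p <- ggplot(mydata, aes(x = Day.Index, y = NewVisits)) +
  geom_point() +
  labs(x = "Day", y = "New visits")
# ggsave() writes the file itself, so no jpeg()/dev.off() pair is needed
ggsave("arima3.jpg", plot = p, width = 8, height = 5)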

export all the content of r script into pdf

I would like to export all the content of an R script into a PDF. Is that possible?
I used these commands to export, but all I got was the graphics:
pdf(file = "example.pdf")
dev.off()
Thank you!
setwd("C:/Users/Prat/Desktop/c")
> dir()
[1] "script.R"
> knitr::stitch('script.r')
output file: script.tex
My folder does not contain a script.pdf, only a script.tex and a folder with the figures as PDFs.
You can do this with the knitr package. Here's a workflow:
Save your script as a file (e.g., myscript.r)
Then run knitr::stitch('myscript.r')
The resulting PDF will be saved locally as myscript.pdf. You can use browseURL('myscript.pdf') to view it.
You can generate an HTML file instead by using
knitr::stitch_rhtml('filename.r')
A .tex file is not easy to read directly, but an HTML file can be viewed in any browser.
For everyone who is looking for an easy and fast solution, I would propose using the function capture.output (https://www.rdocumentation.org/packages/utils/versions/3.6.2/topics/capture.output) from utils.
One only needs to 1) capture whatever command one wants to run and assign the result to a variable, and 2) print that variable. Images can be printed along the way, as you can see. The example on the page linked above does not use markdown.
Here is my example with R Markdown (this is really all one needs):
```{r, echo = F}
# fake data-set
x = rnorm(50, mean = 3.3, sd=1)
y = rnorm(50, mean = 3.1, sd=0.9)
z = rnorm(50, mean = 3.2, sd=1.1)
# create dataframe
df <- data.frame(x, y, z)
# adding a graphic
plot(df$x, df$y)
# create a model as example
linearMod <- lm(y ~ x + z, data=df)
# all one needs to capture the output!!:
bla <- capture.output(summary(linearMod))
print(bla)
```
Remark: if you also want the commands themselves printed, that is easy as well. Just replace "echo = F" with "warning = F", or remove the chunk options altogether if you also want any warnings printed, in case there are any.
I was having the same issue, but I realized I was working in R 4.1 and had ignored the warning that knitr was built under R 4.2. Even after updating my R version I was still only getting a .tex file, but when I read the .log file I found the error "sh: pdflatex: command not found".
I used this suggestion with success:
Have you installed a LaTeX distribution in your system? For rmarkdown,
tinytex is recommended, you would need to install the R package and
then the TinyTex distribution.
install.packages('tinytex')
tinytex::install_tinytex()
Make sure you not only install the package but also run that second command, tinytex::install_tinytex(); I made that mistake too before finally getting a PDF file.
Here is the link to the site where I found this method.
https://community.rstudio.com/t/knitting-error-pdflatex-command-not-found/139965/3
Please use the code below (you will need to modify it according to your dataset/data-frame name).
library(gridExtra)
library(datasets)
setwd("D:\\Downloads\\R Work\\")
data("mtcars") # Write your dataframe name that you want to print in pdf
pdf("data_in_pdf.pdf", height = 11, width = 8.5)
grid.table(mtcars)
dev.off()
Thanks.

Exporting result from kml package in R

I'm using the kml package in R to cluster my data, and in the end I need a CSV file with a column containing the cluster number for each id. The data has many missing values, so I can't use the kmeans function without deleting observations, but kml handles that nicely. My problem is that I use choice() to export the results and all I get is a graphical window, but no output files. Here is my code:
setwd("/Volumes/NATASHKA/api/R files")
statadata <-read.dta("Data_wide_withdemogr_auris_for_kml_negative.dta")
mydata <- data.frame(statadata)
cldDQ <- cld(mydata)
kml(cldDQ,c(2:6),20,toPlot="none")
plotAllCriterion(cldDQ)
par(mar = rep(2, 4))
X11(type = "Xlib")
choice(cldDQ, typeGraph = "bmp")
What do I do wrong?
I had the same problem and solved it this way:
first, choose the desired partition with the arrow keys;
second, select it by pressing "space";
then press "Enter", and you will find all the output files in your working directory (check getwd()).
Good luck.
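If you would rather skip the interactive window, a rough sketch along the following lines may work; it assumes the getClusters() accessor from the kml package and an id column in mydata (both assumptions), and the 3-cluster partition is just an example:
library(kml)
# after kml(cldDQ, c(2:6), 20, toPlot = "none") as in the question,
# pull the assignments for, e.g., the 3-cluster partition
clusters <- getClusters(cldDQ, 3)
# attach them to the ids and write the CSV the question asks for
out <- data.frame(id = mydata$id, cluster = clusters)
write.csv(out, "clusters.csv", row.names = FALSE)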

Is it possible to export from reporttools?

I am using tableNominal{reporttools} to produce frequency tables. The way I understand it, tableNominal() produces LaTeX code which has to be copied and pasted into a text file and then saved as .tex. But is it possible to simply export the resulting table, as can be done with print(xtable(table), file="path/outfile.tex")?
You may be able to use either latex or latexTranslate from the "Hmisc" package for this purpose. If you have the necessary program infrastructure the output gets sent to your TeX engine. (You may be able to improve the level of our answers by adding specific examples.)
Looks like that function does not return a character vector, so you need to use a strategy to capture the output from cat(). Using the example in the help page:
capture.output(TN <- tableNominal(vars = vars, weights = weights, group = group,
                                  cap = "Table of nominal variables.", lab = "tab: nominal"),
               file = "outfile.tex")

Read SPSS file into R

I am trying to learn R and want to bring in an SPSS file, which I can open in SPSS.
I have tried using read.spss from foreign and spss.get from Hmisc. Both error messages are the same.
Here is my code:
## install.packages("Hmisc")
library(foreign)
## change the working directory
getwd()
setwd('C:/Documents and Settings/BTIBERT/Desktop/')
## load in the file
## ?read.spss
asq <- read.spss('ASQ2010.sav', to.data.frame=T)
And the resulting error:
Error in read.spss("ASQ2010.sav", to.data.frame = T) : error
reading system-file header In addition: Warning message: In
read.spss("ASQ2010.sav", to.data.frame = T) : ASQ2010.sav: position
0: character `\000' (
Also, I tried saving out the SPSS file as a SPSS 7 .sav file (was previously using SPSS 18).
Warning messages: 1: In read.spss("ASQ2010_test.sav", to.data.frame =
T) : ASQ2010_test.sav: Unrecognized record type 7, subtype 14
encountered in system file 2: In read.spss("ASQ2010_test.sav",
to.data.frame = T) : ASQ2010_test.sav: Unrecognized record type 7,
subtype 18 encountered in system file
I had a similar issue and solved it following a hint in read.spss help.
Using package memisc instead, you can import a portable SPSS file like this:
data <- as.data.set(spss.portable.file("filename.por"))
Similarly, for .sav files:
data <- as.data.set(spss.system.file('filename.sav'))
although in this case I seem to miss some string values, while the portable import works seamlessly. The help page for spss.portable.file claims:
The importer mechanism is more flexible and extensible than read.spss and read.dta of package "foreign", as most of the parsing of the file headers is done in R. They are also adapted to load efficiently large data sets. Most importantly, importer objects support the labels, missing.values, and descriptions, provided by this package.
read.spss seems to be a little outdated, so I used the memisc package instead. To get this to work:
install.packages("memisc")
library(memisc)
data <- as.data.set(spss.system.file('yourfile.sav'))
You may also try this:
setwd("C:/Users/rest of your path")
library(haven)
data <- read_sav("data.sav")
and if you want to read all files from one folder:
temp <- list.files(pattern = "*.sav")
read.all <- sapply(temp, read_sav)
I know this post is old, but I also had problems loading a Qualtrics SPSS file into R. R's read.spss code came from PSPP a long time ago, and hasn't been updated in a while. (And Hmisc's code uses read.spss(), too, so no luck there.)
The good news is that PSPP 0.6.1 should read the files fine, as long as you specify a "String Width" of "Short - 255 (SPSS 12.0 and earlier)" on the "Download Data" page in Qualtrics. Read it into PSPP, save a new copy, and you should be in business. Awkward, but free.
You can read an SPSS file in R using the solutions above or the one you are currently using. Just make sure that the command is given a file it can actually read. I had the same error, and the problem was that the file could not be accessed. Make sure the file path is correct, the file is accessible, and it is in the correct format.
library(foreign)
asq <- read.spss('ASQ2010.sav', to.data.frame=TRUE)
As far as the warning message is concerned, it does not affect the data. Record type 7 is used to store features of newer SPSS versions so that older SPSS software can still read the data; it does not affect the data itself. I have seen this numerous times and no data was lost.
You can also read about this at http://r.789695.n4.nabble.com/read-spss-warning-message-Unrecognized-record-type-7-subtype-18-encountered-in-system-file-td3000775.html#a3007945
It looks like the R read.spss implementation is incomplete or broken. R 2.10.1 does better than R 2.8.1, however. It appears that R gets upset about custom attributes in a .sav file even with 2.10.1 (the latest I have). R also may not understand the character encoding field in the file, and in particular it probably does not work with SPSS Unicode files.
You might try opening the file in SPSS, deleting any custom attributes, and resaving the file.
You can see whether there are custom attributes with the SPSS command
display attributes.
If so, delete them (see VARIABLE ATTRIBUTE and DATAFILE ATTRIBUTE commands), and try again.
HTH,
Jon Peck
If you have access to SPSS, save the file as .csv and then import it with read.csv or read.table. I can't recall any problem with importing .sav files; so far it has worked like a charm with both read.spss and spss.get. I reckon that spss.get will not give different results, since it depends on foreign::read.spss.
Can you provide some info on your SPSS/R/Hmisc/foreign versions?
Another solution not mentioned here is to read SPSS data into R via ODBC. You need:
the IBM SPSS Statistics Data File Driver (the standalone driver is enough);
the RODBC package in R to import the data.
See the example here. However, I have to admit that there can be problems with very big data files.
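For completeness, here is a hedged sketch of that RODBC route; the DSN name "spss_files" and the table name "ASQ2010" are placeholders I made up, and the details depend on how the IBM SPSS driver is configured on your machine:
library(RODBC)
# "spss_files" is a hypothetical DSN set up for the SPSS Statistics Data File Driver
con <- odbcConnect("spss_files")
sqlTables(con)                    # see what the driver actually exposes
asq <- sqlFetch(con, "ASQ2010")   # hypothetical table name
odbcClose(con)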
For me it works well using memisc!
install.packages("memisc")
library(memisc)
Daten.Februar <-as.data.set(spss.system.file("NPS_Februar_15_Daten.sav"))
names(Daten.Februar)
I agree with @SDahm that the haven package would be the way to go. I myself struggled a bit with string values when starting to use it, so I thought I'd share my approach here, too.
The "semantics" vignette has some useful information on this topic.
library(tidyverse)
library(haven)
# Some interesting information in here
vignette('semantics')
# Get data from the SPSS file
df_raw <- read_sav(path_to_file)
# convert labelled columns to factors using their value labels
df <- map_df(.x = df_raw, .f = function(x) {
  if (class(x) == 'labelled') as_factor(x)
  else x})
# use the variable labels stored in the raw data as column names
colnames(df) <- map(.x = df_raw, .f = function(x) {attr(x, 'label')})
There is no such problem with the packages you are using. The only requirement for reading an SPSS file this way is to put the file into PORTABLE format: SPSS files have the *.sav extension, and you need to export your SPSS file as a portable file with the *.por extension.
There is more info at http://www.statmethods.net/input/importingdata.html
In my case this warning was accompanied by the appearance of a new variable before the first column of my data with values -100, 2, 2, 2, ..., a shift in the correspondence between labels and values, and the deletion of the last variable. A solution that worked was (using SPSS) to create a new dummy variable in the last column of the file, fill it with random values, and execute the following code
(filename is the path to the .sav file; in my case the original SPSS file had 62 columns, thus 63 with the additional dummy variable):
library(memisc)
data <- as.data.set(spss.system.file(filename))
copyofdata <- data
# shift the variable names one position to the left to undo the spurious first column
for (i in 2:63) {
  names(data)[i] <- names(copyofdata)[i - 1]
}
data[[1]] <- NULL
newcopyofdata <- data
# shift the value labels so they match the corrected variables
for (i in 2:62) {
  labels(data[[i]]) <- labels(newcopyofdata[[i - 1]])
}
labels(data[[1]]) <- NULL
Hope the above code will help someone else.
Turn Unicode off in SPSS:
open SPSS without any data loaded and run the command below in your syntax editor.
SET UNICODE OFF.
Then open the data set and resave it to remove the Unicode encoding.
read.spss('yourdata.sav', to.data.frame=T) works correctly after that.
I just came across an SPSS file that I couldn't open using haven, foreign, or memisc, but readspss::read.por did the trick for me:
download.file("http://www.tcd.ie/Political_Science/elections/IMSgeneral92.zip",
"IMSgeneral92.zip")
unzip("IMSgeneral92.zip", exdir = "IMSgeneral92")
# rio, haven, foreign, memisc pkgs don't work on this file! But readspss does:
if(!require(readspss)) remotes::install_git("https://github.com/JanMarvin/readspss.git")
ims92 <- readspss::read.por("IMSgeneral92/IMS_Nov7 92.por", convert.factors = FALSE)
Nice! Thanks, @JanMarvin!
1)
I've found the program Stat/Transfer useful for importing SPSS and Stata files into R.
It resolves the issue you mention by converting the SPSS file to an R dataset. It is also very useful for subsetting super large datasets into smaller portions consumable by R. It is not free, but it is a very useful tool for working with datasets from different programs, especially if you don't have access to those programs.
2)
The memisc package also has an SPSS import function worth trying.
