Print data specifics like the PSPP variable view in R

I have a .sav file. I want to print out the data specifics in R the way the PSPP variable view shows them.
I succeeded in printing the type of the data, but not the other specifics, e.g. width, label, value labels, ...
I am using the following commands to read the data:
library(foreign)
library(memisc)
data <- read.spss("Database.sav", use.value.labels = FALSE,
max.value.labels = 100)
x = do.call(rbind,data)

Please check the following commands for reading variable and value labels from SPSS into R. I hope this works:
library(foreign)
## Read SPSS data
data<-read.spss("Database.sav",use.value.labels=FALSE,to.data.frame=FALSE)
data_frame<-as.data.frame(data)
dim(data_frame)
# Variable Labels...
variable_labels <- attr(data, "variable.labels")
variable_labels
# Value Labels...
value_labels<-attr(data,"label.table")
value_labels
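To get closer to the PSPP variable view, the attributes above can be combined into one table with a row per variable. This is only a sketch: the small list below is a made-up stand-in for what read.spss(..., to.data.frame = FALSE) returns, and the column names of variable_view are my own choice. Width and the other display settings are not covered here.

```r
# Hypothetical stand-in for the list returned by
# read.spss("Database.sav", use.value.labels = FALSE, to.data.frame = FALSE)
data <- list(sex = c(1, 2, 1), age = c(34, 51, 27))
attr(data, "variable.labels") <- c(sex = "Respondent sex", age = "Age in years")
attr(data, "label.table") <- list(sex = c(Male = 1, Female = 2), age = NULL)

variable_labels <- attr(data, "variable.labels")
label_table <- attr(data, "label.table")

# One row per variable: name, storage type, variable label, value labels
variable_view <- data.frame(
  name = names(data),
  type = vapply(data, function(v) class(v)[1], character(1)),
  label = unname(variable_labels[names(data)]),
  value_labels = vapply(names(data), function(n) {
    lt <- label_table[[n]]
    # Variables without value labels get an empty string
    if (is.null(lt)) "" else paste(lt, names(lt), sep = " = ", collapse = "; ")
  }, character(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)
variable_view
```

With a real file, replace the stand-in list with the object returned by read.spss and keep the rest.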

Related

How to format data to automate table production?

I would be very grateful for any guidance on how to use the xltabr package to automatically format tables in R, please:
https://github.com/moj-analytical-services/xltabr
In SPSS, for example, I would apply the relevant weight and then run a cross tab on the raw data, e.g. var1*var2.
How would you go about doing this in R so that the package recognises it and produces the table?
Much appreciated.
First, create or read in the data frame you want to use:
library(foreign)
dat <- read.spss("mydataframe.sav", to.data.frame = TRUE)
Then put it in the format you want. For a cross tab as in your example:
library(reshape2)
ct <- reshape2::dcast(dat, variable1 ~ variable2, fun.aggregate = length)
# Depending on the data you want, you can change the fun.aggregate function (e.g. sum or mean).
Then use the xltabr package to prepare the Excel file by creating a workbook:
wb <- xltabr::auto_crosstab_to_wb(ct)
Finally, save it as an .xlsx file:
library(openxlsx)
openxlsx::saveWorkbook(wb, file = "crosstable.xlsx", overwrite = TRUE)
I hope this helps
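Since the question mentions applying a weight before running the cross tab, here is a minimal base R sketch of a weighted cross tab using xtabs. The columns var1 and var2 follow the question; the weight column name and all values are made up for illustration, and whether xltabr accepts this shape directly is not something I have checked.

```r
# Hypothetical data: two categorical variables plus a survey weight
dat <- data.frame(
  var1 = c("a", "a", "b", "b", "b"),
  var2 = c("x", "y", "x", "x", "y"),
  weight = c(1.5, 0.5, 2.0, 1.0, 1.0)
)

# The left-hand side of the formula is summed within each var1 x var2 cell,
# which gives weighted counts instead of raw counts
weighted_ct <- xtabs(weight ~ var1 + var2, data = dat)

# Convert the table to a plain data frame (var1 levels become row names)
ct <- as.data.frame.matrix(weighted_ct)
ct
```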

CSV imported data table is not possible to use for histogram plot

I have created my own data set named Kwality.csv in Excel, and when I execute the code below I cannot get a histogram for the data. It throws this error:
Error in hist.default(mydata) : 'x' must be numeric
library(data.table)
mydata = fread("Kwality.csv", header = FALSE)
View(mydata)
hist(mydata)
I tried to reproduce your workflow and exported an xlsx file to a csv file (using export to a comma-separated file).
First, check which characters are used as the field separator and as the decimal mark. In my case, the field separator is the semicolon (;) and the decimal mark is the comma (,).
Then choose the column to use for the histogram with the [[ ]] operator. The data table itself is not a valid argument for the hist function.
Taking this into consideration, you can execute your code:
library(data.table)
# load csv generated by NORMSINV(RAND()) in Excel
mydata = fread("check.csv", header = FALSE, sep = ";", dec = ",")
mydata
# hist(mydata)
# Error in hist.default(mydata) : 'x' must be numeric
# passing the whole data table does not work
# access by column, e.g. third column - OK
hist(mydata[[3]])
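If you are unsure whether the separators were read correctly, checking the column types before plotting helps. The sketch below uses base R's read.csv2 (which defaults to sep = ";" and dec = ",") on a made-up inline string instead of a file, so the values are purely illustrative.

```r
# Hypothetical csv content with ";" as field separator and "," as decimal mark
csv_text <- "0,12;1,50;-0,33\n0,80;-1,20;0,05\n1,10;0,40;-0,70\n"

mydata <- read.csv2(text = csv_text, header = FALSE)

# Every column should be numeric if the separators were parsed correctly
col_is_numeric <- sapply(mydata, is.numeric)
col_is_numeric

# hist() needs a numeric vector, so pass one column rather than the whole table.
# plot = FALSE only computes the bins; drop it to draw the histogram.
h <- hist(mydata[[3]], plot = FALSE)
```

A column that shows up as character usually means the decimal mark or the field separator was guessed wrongly.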

An error in histogram from a one-column data set in R

Background:
I have one column of Data consisting of 400 rows stored in a csv file. The data can be easily imported into your RStudio using the R code below.
Question:
I'm wondering how to get a histogram of this Data. Specifically, after I import this Data into RStudio and run hist(Data), I get the following error message:
Error in hist.default(D) : 'x' must be numeric
P.S. I initially created the data using Initial = rbeta(400, 2, 3); final = sample(c(Initial,0.5,0.6), size = 400, prob = c(rep(.98/400,400),.005,.015), replace = T).
Here is my small R code:
id <- "0B5V8AyEFBTmXcURlQ0tzNjBEVFU"
Data <- read.csv(paste0("https://docs.google.com/uc?id=",id,"&export=download"))
hist(Data) ## HERE I get the error
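read.csv uses header = TRUE by default, so unless you set header = FALSE, the first value in your file becomes the column name. hist() also needs a numeric vector rather than a data frame, so pass the column itself. So you need: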
Data <- read.csv("https://docs.google.com/uc?id=0B5V8AyEFBTmXcURlQ0tzNjBEVFU&export=download",
header = FALSE)
hist(Data$V1)
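The effect of the header argument is easy to see on a small inline example. The four values below are made up; with the real file the same comparison applies.

```r
# Hypothetical one-column csv content without a header row
csv_text <- "0.41\n0.22\n0.67\n0.35\n"

# Default header = TRUE: the first value is consumed as the column name
with_header <- read.csv(text = csv_text)

# header = FALSE: all values are kept and the column is named V1
no_header <- read.csv(text = csv_text, header = FALSE)

nrow(with_header)  # one row fewer than the file has lines
nrow(no_header)    # all rows present
is.numeric(no_header$V1)
```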

What is the best way to import spss file in R with value labels?

I have an SPSS file which contains variables and value labels. I saw the foreign package with its read.spss function:
data <- read.spss("2017.sav", to.data.frame = TRUE, use.value.labels = TRUE)
If I use use.value.labels = TRUE, all string variables change to factor variables, and I don't want that because they are not all factors.
I found one solution, but I don't know if it is the best way to do it:
1. First read the SPSS file with the previous statement.
2. Select which variables are not factors and change them to strings with:
cols <- c("x", "ab")
data[cols] <- lapply(data[cols], as.character)
If I don't use use.value.labels = TRUE, I will not have value labels and I cannot export the file correctly.
You can also use the memisc package:
library(memisc)
sav <- spss.system.file("file.sav")
df <- as.data.set(sav)
My company regularly deals with SAV files, and we extract the metadata separately. With the foreign package, you can get the metadata out in a few different ways (after you have loaded the file):
data.label.table <- attr(sav, "label.table")
missings <- attr(sav, "missings")
The other bits require various lapply and sapply calls to get them out. The script I have is quite long, so I will not share it here. If you read the data in with read.spss(sav, to.data.frame = TRUE) you can get:
VariableLabels <- unname(attr(sav, "variable.labels"))
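Building on the label.table attribute, one way to avoid turning every variable into a factor is to read with use.value.labels = FALSE and then apply the labels only to the columns you choose. This is a rough sketch: the data frame below is a made-up stand-in for the read.spss result, and the column names region and score are hypothetical.

```r
# Hypothetical stand-in for
# read.spss("2017.sav", to.data.frame = TRUE, use.value.labels = FALSE)
data <- data.frame(region = c(1, 2, 2, 1), score = c(10, 12, 9, 15))
attr(data, "label.table") <- list(region = c(North = 1, South = 2))

# Take the label table out once, before modifying any columns
label_table <- attr(data, "label.table")

# Only these columns are converted to labelled factors
label_cols <- c("region")

for (col in label_cols) {
  lt <- label_table[[col]]
  # Codes become factor levels, label names become the level labels
  data[[col]] <- factor(data[[col]], levels = lt, labels = names(lt))
}

str(data)
```

Columns not listed in label_cols keep their original type.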
I don't know why, but I can't install the "foreign" package.
Here is what I did instead to import a dataset from SPSS into R (through Excel):
1. Open your data in SPSS.
2. Export the dataset from SPSS to Excel, making sure to choose the "Save value labels where defined instead of data values" option at the very bottom.
3. Open R.
4. Import the dataset from Excel.
You now have a dataset in R with value labels.
Use the haven package:
library(haven)
data <- read_sav("2017.sav")
The labels are shown in the RStudio viewer.

Print ncvTest to csv file

In R I run ncvTest to check for heteroscedasticity, but I can't seem to print the result to a csv file. This is what I have done:
ncvt<-ncvTest(pol_reg)
outss<-file(paste0("hetero_test.csv"))
write.csv(ncvt,outss)
I get the following error message:
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) :
cannot coerce class "chisqTest" to a data.frame
What am I supposed to do in order to save the result to a csv file? The result of ncvt looks as follows:
Non-constant Variance Score Test
Variance formula: ~ fitted.values
Chisquare = 75514.06 Df = 1 p = 0
You can pull the components out of your ncvt list and make a data frame from them to write to a csv file:
library(car)
ncvt <- ncvTest(pol_reg)
ds_ncvt <- data.frame(ncvt$formula.name, ncvt$ChiSquare, ncvt$Df, ncvt$p, ncvt$test)
write.csv(ds_ncvt, "hetero_test.csv")
Thanks, the following also works
ncvt<-ncvTest(pol_reg)
ds_ncvt <- as.matrix(ncvt)
outss<-file(paste0("hetero_test.csv"))
write.csv(ds_ncvt,outss)
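To try the data frame approach without fitting a model, the sketch below uses a hand-built list as a stand-in for the ncvTest result. The component names follow the answer above and the numbers come from the printed output in the question; the stand-in itself, the "Variance" value for formula.name, and the column names of ds_ncvt are assumptions for illustration.

```r
# Hypothetical stand-in for ncvTest(pol_reg); a real result comes from the car package
ncvt <- structure(
  list(
    formula.name = "Variance",
    ChiSquare = 75514.06,
    Df = 1,
    p = 0,
    test = "Non-constant Variance Score Test"
  ),
  class = "chisqTest"
)

# Pick the scalar components and give the columns readable names
ds_ncvt <- data.frame(
  test = ncvt$test,
  chisquare = ncvt$ChiSquare,
  df = ncvt$Df,
  p = ncvt$p
)

# Write to a temporary file here; use "hetero_test.csv" in practice
out_file <- tempfile(fileext = ".csv")
write.csv(ds_ncvt, out_file, row.names = FALSE)

# Read the file back to confirm what was written
written <- read.csv(out_file)
written
```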

Resources