How to open SAS files using Excel? - r

I have a set of SAS data sets that I want to open in Excel or R. I don't have SAS installed, so I can't use its export option. Is there a converter from SAS7BDAT to Excel?
Thanks

I help develop the Colectica for Excel addin, which opens SAS data files in Excel. No SAS software or ODBC configurations are required. The addin directly reads the SAS file and then inserts the data and metadata into your worksheet.
Imports SAS .sas7bdat data and column names
Imports SAS .sas7bcat formats and value labels when available
The Excel addin is downloadable from http://www.colectica.com/software/colecticaforexcel
Documentation is available in the user manual.

You could use the SAS Add-In for Microsoft Office to open the SAS dataset in Excel. I'm not sure whether it is free, though.
http://support.sas.com/software/products/addin/
As Reese suggested, you can use the SAS Universal Viewer, which is free.
Here is the link:
https://support.sas.com/downloads/browse.htm?fil=&cat=74
Or you can download SAS University Edition, which is also free. It is more than just a viewer: you can write and execute programs in it.
http://www.sas.com/en_us/software/university-edition/download-software.html

Here is a quick-and-dirty Python snippet to convert an .xpt file to .csv:
import os
import pandas as pd
FILE_PATH = "(directory containing file)"
FILE = "ABC"  # filename itself (without suffix)
# Note: you might need to pass the name of the index column (in quotes) instead of None here
# os.path.join avoids a broken path when FILE_PATH has no trailing separator
df = pd.read_sas(os.path.join(FILE_PATH, FILE + '.XPT'), index=None)
df.to_csv(os.path.join(FILE_PATH, FILE + '.csv'))
Hopefully this helps someone.
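Since the question is about .sas7bdat files specifically, here is a minimal sketch of the same idea for that format. It assumes pandas is installed (plus openpyxl if you want .xlsx output); the function names output_path and convert_sas and the latin-1 default encoding are illustrative assumptions, not part of the answer above.

```python
from pathlib import Path


def output_path(sas_file, suffix=".csv"):
    """Return the path of the converted file, next to the SAS file."""
    return Path(sas_file).with_suffix(suffix)


def convert_sas(sas_file, suffix=".csv", encoding="latin-1"):
    """Convert a .sas7bdat or .xpt file to .csv or .xlsx (hypothetical helper).

    The encoding is an assumption: without one, pandas returns text
    columns as raw bytes, so adjust it to match your data.
    """
    import pandas as pd  # third-party, imported here so output_path works without it

    target = output_path(sas_file, suffix)
    # pandas infers the SAS format (xport or sas7bdat) from the file extension
    df = pd.read_sas(sas_file, encoding=encoding)
    if suffix == ".xlsx":
        df.to_excel(target, index=False)  # needs an Excel writer such as openpyxl
    else:
        df.to_csv(target, index=False)
    return target
```

Calling convert_sas("ABC.sas7bdat", ".xlsx") would then write ABC.xlsx into the same directory as the input file.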

I had the same need and, after some research, found a nice and easy way using R and the latest free version of RStudio (as of June 2020). RStudio can open various file formats and shows you the R script it ran behind the scenes. You can use that script as a starting point: open the .sas7bdat file first, then do the conversion step.
To import the file the "visual" way in RStudio: Environment tab -> Import Dataset -> From SAS...
It will ask you to install the haven library. After the installation you will see a preview of the data in the file, along with the R script that was run, which will look like this:
library(haven)
aux <- read_sas("//PATH_ON_YOUR_MACHINE_TO_FILE/actual_file.sas7bdat", NULL)
View(aux)
Note the NULL there: it is the catalog_file argument of read_sas, and NULL means that no .sas7bcat catalog file is supplied.
We still need to convert the data to a .csv file to finish the job. To do so, add the following below the lines above:
write.csv(aux, "actual_file.csv")
This writes the desired .csv file to the current working directory (give a full path if you want it next to the original SAS file). If you want ";" as the separator instead of ",", use write.csv2(aux, "actual_file.csv"), which also uses "," as the decimal mark. In either case strings are enclosed in double quotes, so it should be fine.

Related

Is there a way to accelerate formatted table writing from R to excel?

I have a data frame with 174603 rows and 178 columns, which I'm exporting to Excel using openxlsx::saveWorkbook (I use this package to get formatted cells, with colors, header styles and so on). But the process is extremely slow (depending on the memory available on the machine, it can take from 7 to 17 minutes!) and I need a way to reduce this significantly (it doesn't need to be seconds, but anything below 5 minutes would be OK).
I've already searched other questions, but they all seem to focus either on importing into R (I have no problem with this) or on writing non-formatted files from R (using write.csv and similar options).
Apparently I can't use the xlsx package because of the settings on my computer (an industrial computer; check the comments on this question).
Any suggestions regarding packages or other functionalities inside this package to make this run faster would be highly appreciated.
This question is a bit old, but I had the same problem and came up with a solution worth mentioning.
There is a package called writexl that exports a data frame to Excel using the C library libxlsxwriter. You can export to Excel with the following code:
library(writexl)
writexl::write_xlsx(df, "Excel.xlsx", format_headers = TRUE)
The format_headers parameter only makes the titles centered and bold, but I edited the C code in the source of the writexl library, which is maintained by rOpenSci on GitHub.
You can download or clone it. Inside the src folder, edit the write_xlsx.c file.
For example, in the part that sets the header format
//how to format headers (bold + center)
lxw_format * title = workbook_add_format(workbook);
format_set_bold(title);
format_set_align(title, LXW_ALIGN_CENTER);
you can add these lines to give the header a background color
format_set_pattern (title, LXW_PATTERN_SOLID);
format_set_bg_color(title, 0x8DC4E4);
There is a lot more formatting you can do; see the libxlsxwriter documentation.
When you have finished editing that file, and assuming the source code is in a folder called writexl, you can build and install the edited package with
shell("R CMD build writexl")
install.packages("writexl_1.2.tar.gz", repos = NULL)
Exporting again with the first chunk of code will generate the formatted Excel file, faster than any other library I know of.
Hope this helps.
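For comparison, the same header formatting (bold, centered, solid background color) can be sketched without patching C code by using the Python XlsxWriter package, which offers equivalent format options. This is only a sketch under assumptions: it requires XlsxWriter to be installed, the helper names rgb_to_hex and write_formatted are hypothetical, and no timing claim is made for your data.

```python
def rgb_to_hex(rgb):
    """Convert an integer color such as 0x8DC4E4 to the '#RRGGBB' form."""
    return f"#{rgb:06X}"


def write_formatted(path, headers, rows, header_rgb=0x8DC4E4):
    """Write rows to an .xlsx file with a bold, centered, colored header row."""
    import xlsxwriter  # third-party: pip install XlsxWriter

    # constant_memory flushes each row once the next one starts, which keeps
    # memory use low for large tables; rows must then be written in order.
    workbook = xlsxwriter.Workbook(path, {"constant_memory": True})
    sheet = workbook.add_worksheet()
    header_format = workbook.add_format(
        {"bold": True, "align": "center", "bg_color": rgb_to_hex(header_rgb)}
    )
    sheet.write_row(0, 0, headers, header_format)
    for row_number, row in enumerate(rows, start=1):
        sheet.write_row(row_number, 0, row)
    workbook.close()
```

A call such as write_formatted("Excel.xlsx", ["id", "value"], [[1, 2.5], [2, 3.5]]) would produce a two-row sheet with the styled header.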
Have you tried
write.table(GroupsAlldata, file = 'Groupsalldata.txt')
to get the data in txt format?
Then, in Excel, you can simply use 'Text to Columns' to put your data into a table.
Good luck.

Fetch data from an open excel sheet into R?

I am wondering whether it is possible to read an Excel file that is currently open and capture the things I test manually into R.
I have an Excel file open (in Windows). In Excel, I have connected to an SSAS cube, and I do some manipulations using PivotTable Fields (like changing columns, rows, and filters) to understand the data. I would like to import some of the results I see in Excel into R to create a report (that is, without manually copying and pasting the results into R or saving Excel sheets to read them later). Is this possible in R?
UPDATE
I was able to find an answer. Thanks to awesome package created by Andri Signorell.
library(DescTools)
fxls<-GetCurrXL()
tttt<-XLGetRange(header=TRUE)
Copy the values you are interested in (from a single spreadsheet at a time) to the clipboard.
Then
dat = read.table('clipboard', header = TRUE, sep = "\t")
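The clipboard approach works because Excel places copied cells on the clipboard as tab-separated text, one line per row. A minimal Python sketch of the same parsing step follows; how the text is fetched from the clipboard is left out, and the function name parse_clipboard_table is an assumption for illustration.

```python
import csv
import io


def parse_clipboard_table(text):
    """Parse tab-separated text (as copied from Excel) into a list of dicts.

    The first line is treated as the header row, like header = TRUE in R.
    All values are returned as strings.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)
```

If pandas is available, its read_clipboard function covers both the fetching and the parsing in one call.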
You can save the final excel spreadsheet as a csv file (comma separated).
Then use read.csv("filename") in R and go from there. Alternatively, you can use read.table("filename",sep=",") which is the more general version of read.csv(). For tab separated files, use sep="\t" and so forth.
This blog post may also be useful: http://www.r-bloggers.com/a-million-ways-to-connect-r-and-excel/
In the R console, you can type
?read.table
for more information on the arguments and uses of this function. You can just repeat the same call in R after Excel sheet changes have been saved.

Turn off scientific notation before writing to file

My data.frame uses scientific notation for parsed values, e.g. 3.007530e+07.
I do like to use it in R; however, for this analysis I have to transfer my data to CSV and open it in Excel (German version), which cannot handle this notation.
My df looks something like this:
df <- c(6.402000e+05,9.312903e+05,1.007800e+06,1.142000e+06,1.298500e+06,1.511700e+06,1.749000e+06,1.869357e+06)
I tried changing my global options, e.g. options(scipen=999), but that does not work for me because it then causes problems with my fread function.
Therefore, my question:
How can I change the notation in a data.frame before using write.csv()?
I appreciate your replies!
As an alternative to altering the R format (since you want to keep scientific notation in R), could you change how Excel imports your file?
For example, naming your csv file with a non-standard extension to trigger the manual importing process (import wizard), instead of automatically opening the file in the wrong format?
I tried a simple test with a csv formatted file of numbers in scientific notation, saved with a ".sci" filename. My version of Excel launched the wizard, then imported the file and handled the scientific notation correctly [MS Excel Starter 2010, English version].
Edit: I found the reference to using an unrecognized file extension to trigger Excel's import wizard: http://excelribbon.tips.net/T012201_Avoiding_Scientific_Notation_on_File_Imports.html
[The article suggests using .DAT, which I wouldn't use for an ASCII file, but I wanted to give credit where it's due for the idea.]
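The other way around is to fix the notation on the writing side: convert each number to fixed-point text before it goes into the file. Here is a minimal Python sketch of that idea, using ";" as the field separator and "," as the decimal mark, which is what a German Excel typically expects. The function names and the default of two decimals are assumptions for illustration.

```python
import csv


def plain_number(x, decimals=2, decimal_mark=","):
    """Format a number in fixed-point notation, never as 3.00753e+07."""
    text = f"{x:.{decimals}f}"
    return text.replace(".", decimal_mark)


def write_plain_csv(rows, stream, delimiter=";"):
    """Write rows to a text stream, formatting every float with plain_number."""
    writer = csv.writer(stream, delimiter=delimiter, lineterminator="\n")
    for row in rows:
        writer.writerow(
            [plain_number(value) if isinstance(value, float) else value for value in row]
        )
```

For example, plain_number(3.007530e+07) gives "30075300,00", and the global number formatting of the session is left untouched.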

Speed up read.dbf in R (problems with importing large dbf file)

I have a dataset given in .dbf format and need to import it into R.
I haven't worked with this format before, so I have no idea how to export a dbf file with multiple tables into a different format.
A simple read.dbf has been running for hours with still no result.
I looked into speeding up R in general, but I don't think that's the issue; I think the problem lies in reading the large dbf file itself (it weighs ~1.5 GB), i.e. the command itself must be inefficient. However, I don't know any other way to deal with this dataset format.
Is there any other option to import the dbf file?
P.S. (NOT AN R ISSUE) The source of the dbf file uses Visual FoxPro but can't export to another format. I've installed FoxPro, but since I've never used it before, I don't know how to export the data properly. I tried a simple "Export to type=XLS" command, but ran into an encoding problem: most of the variables are in Russian Cyrillic and can't be decoded by Excel. In addition, the dbf file contains multiple tables that should be merged into one big table, but I don't know how to export those tables separately to xls or csv, how to export them as a whole, or how to merge them, as I'm completely new to dbf files (though I have looked through the basic descriptions already).
Any help will be highly appreciated. I'm not sure whether I can provide a sample dataset, as there are many columns when I look at the dbf in FoxPro, and those columns must be merged with other tables from the same dbf file, which I have no idea how to do. (Sorry for the mess.)
You can export from Visual FoxPro in many formats using the COPY TO command via the Command Window, as per the VFP help file.
For example:
use mydbf in 0
select mydbf
copy to myfile.xls type xl5
copy to myfile.csv type delimited
If you're having language-related issues, you can add an 'as codepage' clause to the end of those. For example:
copy to myfile.csv type delimited as codepage 1251
If you are not familiar with VFP I would try to get the raw data out like that, and into a platform that you are familiar with, before attempting merges etc.
To export them in a loop you could use the following in a .PRG file (amending the two path variables at the top to reflect your own setup).
Close All
Clear All
Clear
lcDBFDir = "c:\temp\" && -- Where the DBF files are.
lcOutDir = "c:\temp\export\" && -- Where you want your exported files to go.
lcDBFDir = Addbs(lcDBFDir) && -- In case you forgot the backslash.
lcOutDir = Addbs(lcOutDir)
* -- Get the filenames into an array.
lnFiles = ADir(laFiles, Addbs(lcDBFDir) + "*.DBF")
* -- Process them.
For x = 1 to lnFiles
    lcThisDBF = lcDBFDir + laFiles[x, 1]
    Use (lcThisDBF) In 0 Alias currentfile
    Select currentfile
    Copy To (lcOutDir + Juststem(lcThisDBF) + ".csv") type csv
    Use in Select("Currentfile") && -- Close it.
EndFor
Close All
... and run it from the Command Window: Do myprg.prg, or whatever you named the file.
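If working inside VFP is not an option, the file can also be streamed record by record outside of it, so that the 1.5 GB never has to fit in memory at once. Below is a minimal Python sketch based on the standard dBase header layout. It is written under loud assumptions: cp1251 is assumed for the Cyrillic text, every field is decoded as plain text (so memo fields and VFP binary types such as integer, double or datetime will come out wrong), and the function names are hypothetical.

```python
import csv
import struct


def iter_dbf_records(stream, encoding="cp1251"):
    """Yield one dict of strings per non-deleted record of a .dbf stream."""
    # Header: record count at byte 4, header length at 8, record length at 10.
    n_records, header_len, record_len = struct.unpack("<xxxxIHH20x", stream.read(32))
    fields = []
    while True:
        first = stream.read(1)
        if first in (b"\r", b""):  # 0x0D ends the list of field descriptors
            break
        descriptor = first + stream.read(31)
        name = descriptor[:11].split(b"\x00", 1)[0].decode("ascii")
        fields.append((name, descriptor[16]))  # byte 16 holds the field length
    stream.seek(header_len)  # skips any extra header bytes VFP adds
    for _ in range(n_records):
        record = stream.read(record_len)
        if len(record) < record_len:
            break
        if record[:1] == b"*":  # leading '*' marks a deleted record
            continue
        row, position = {}, 1
        for name, length in fields:
            raw = record[position:position + length]
            row[name] = raw.decode(encoding, errors="replace").strip()
            position += length
        yield row


def dbf_to_csv(dbf_path, csv_path, encoding="cp1251"):
    """Stream a .dbf file into a UTF-8 CSV file, one record at a time."""
    with open(dbf_path, "rb") as source, \
            open(csv_path, "w", newline="", encoding="utf-8") as target:
        writer = None
        for row in iter_dbf_records(source, encoding):
            if writer is None:
                writer = csv.DictWriter(target, fieldnames=list(row))
                writer.writeheader()
            writer.writerow(row)
```

The resulting CSV is UTF-8, so it can be read with an explicit encoding in R or merged with the other exported tables there.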

Can anyone help me write a R data frame as a SAS data set?

In R, I have used the write.foreign() function from the foreign library in order to write a data frame as a SAS data set.
write.foreign(df = test.df, datafile = 'test.sas7bdat', codefile = 'test.txt', package = "SAS")
The SAS data file is written, but when I try to open it in SAS Viewer 9.1 (Windows XP), I receive the following message - "SAS Data set file format is not supported".
Note: I am generally unfamiliar with SAS, so if an answer exists that would have been known by a regular SAS user, please excuse my ignorance.
write.foreign with option package="SAS" actually writes out a comma-delimited text file and then creates a script file with SAS statements to read it in. You have to run SAS and submit the script to turn the text file into a SAS dataset. Your call should look more like
write.foreign(df=test.df, datafile="test.csv", codefile="test.sas", package="SAS")
Note the different extension. Also, write.foreign writes factor variables as numeric variables with a format controlling their appearance, i.e. the R definition of a factor. If you just want the character representation, you'll have to convert the factors via as.character before exporting.
I'm not much of a SAS user either, but I've used write.xport() before and it's worked fine. My crude understanding is that there are two types of SAS files, internal ones and XPORT files. The XPORT ones are the ones that are more compatible across different versions, architectures, etc.
This is an edit to Hong Ooi's answer.
In R:
library(foreign)
write.foreign(df=test.df, datafile="test.csv", codefile="test.sas", package="SAS")
In SAS:
Upload both test.csv and test.sas files. Open test.sas. You may have to edit the test.sas code that is output from the write.foreign function. What worked for me is updating the INFILE line to include the library / location:
"/home/kristenmae0/test.csv"
You can do it easily with SAS: just try SAS/IML (proc iml) or IMLPlus (the object-oriented version) with SAS/IML Studio.
See this:
http://support.sas.com/documentation/cdl/en/imlsstat/63827/HTML/default/viewer.htm#imlsstat_statr_sect004.htm
or download SAS/IML Studio for free :
http://www.sas.com/apps/demosdownloads/92_SDL_sysdep.jsp?packageID=000721
This release of SAS/IML Studio provides the capability to interface with the R language.

Resources