Turn off scientific notation before writing to file - r

My data.frame shows numbers in scientific notation after parsing files, e.g. 3.007530e+07.
I like using this notation in R. However, for this analysis I have to export my data to CSV and open it in Excel (German version), which cannot handle it.
My df looks something like this:
df <- c(6.402000e+05,9.312903e+05,1.007800e+06,1.142000e+06,1.298500e+06,1.511700e+06,1.749000e+06,1.869357e+06)
I tried changing my global options with options(scipen=999), but that does not work for me because it causes problems with my fread function.
Therefore, my question:
How can I change the notation in a data.frame before using write.csv()?
I appreciate your replies!

As an alternative to altering the R format (since you want to keep scientific notation in R), could you change how Excel imports your file?
For example, naming your csv file with a non-standard extension to trigger the manual importing process (import wizard), instead of automatically opening the file in the wrong format?
I tried a simple test with a csv formatted file of numbers in scientific notation, saved with a ".sci" filename. My version of Excel launched the wizard, then imported the file and handled the scientific notation correctly [MS Excel Starter 2010, English version].
Edit: I found the reference to using an unrecognized file extension to trigger Excel's import wizard: http://excelribbon.tips.net/T012201_Avoiding_Scientific_Notation_on_File_Imports.html
[The article suggests using .DAT, which I wouldn't use for an ASCII file, but I wanted to give credit where it's due for the idea.]
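If changing the Excel import is not an option, the file itself can be written in fixed notation. Below is a minimal sketch in Python (not the asker's R setup). It assumes two decimals are enough and that German Excel expects ";" as the field separator and "," as the decimal mark. The helper name to_plain is hypothetical. In R, format(x, scientific = FALSE) plays the same role of turning the numbers into plain strings before writing.

```python
import csv
import io


def to_plain(x, decimals=2, decimal_sep=","):
    """Format a float in fixed (non-scientific) notation with a chosen decimal mark."""
    return f"{x:.{decimals}f}".replace(".", decimal_sep)


# A few of the values from the question, plus the 3.007530e+07 example.
values = [6.402000e+05, 9.312903e+05, 1.007800e+06, 3.007530e+07]

# Write to an in-memory buffer; swap in open("out.csv", "w", newline="") for a real file.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
writer.writerow(["value"])
for v in values:
    writer.writerow([to_plain(v)])

csv_text = buffer.getvalue()
print(csv_text)
```

Because the numbers are converted to strings only at export time, the data inside the analysis session keeps its usual numeric type and display.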

Related

R: write_csv without messing up the language

I am working with a dataset that contains data in multiple languages.
Is there a way to export my work as a CSV file and have R maintain the use of characters in a foreign language instead of replacing them with gibberish English symbols?
Update for anyone who reaches this by Google:
It looks like R only pretends to screw up foreign languages. When you use write_csv, it actually does create a .csv that uses the correct foreign characters.
However, you'll only see them if you open them in Notepad. If you open it in Excel, Excel will screw it up, and if you open it with read_csv, R will screw it up (but will still export it correctly when you use write_csv again).
write_excel_csv() from the readr package seems to work without messing up the foreign characters when opening with Excel.
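What write_excel_csv() adds is a UTF-8 byte order mark at the start of the file, which Excel uses to recognise the encoding. The same idea can be sketched in Python with the "utf-8-sig" codec. The file name and sample rows below are made up for illustration.

```python
import csv
import os
import tempfile

# Hypothetical multilingual sample data.
rows = [["city", "greeting"], ["München", "Grüß Gott"], ["東京", "こんにちは"]]

path = os.path.join(tempfile.mkdtemp(), "multilingual.csv")

# "utf-8-sig" prepends the UTF-8 byte order mark (bytes EF BB BF) when writing.
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    csv.writer(f).writerows(rows)

# Inspect the raw bytes to confirm the mark is there.
with open(path, "rb") as f:
    first_bytes = f.read(3)

# Reading with the same codec strips the mark again.
with open(path, encoding="utf-8-sig", newline="") as f:
    round_trip = list(csv.reader(f))

print(first_bytes, round_trip)
```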

Converting a (probable) ENVI file to decimals using R or excel

I got the output file from a spectrometer which is supposed to be a series of decimals numbers. The file looks like this:
™pQH1JHxþFH$ÏFH÷~EHa×BHäBHßdBH.#H²Ï=HL=HŒÚ<Hê‰:H­P:Hoõ9H¢Ž6Hº7H¨Y5H ?1H½¶.Hø²0HøŽ2H8æ.H.î,HŒt/H&1H͸0Hí.Hvî,H$ª+HµX+HCý*H·W+H!º+HP+HfØ(Hû'H†Ù'H|U(HQ`)Hn*H
})H'Hó%HÂ%H¶¨&H&H|•&H\
I have been reading a lot without finding a solution. My silly question is: is this an ENVI file, an ASCII file, or something else? How can I see the numbers I need to use? I tried some online converters without success.
The starting point would be to get these numbers so I can write R code to make graphs. Thanks a lot for your time.
This looks like you opened the binary file of the mass spectrometer. Almost all vendors keep their format a secret, so the only way to read it is to export it to an open format. Most vendors supply some kind of data analysis software, which often includes export functions. The most common open data formats are mzXML and mzML.
For converting, have a look at the msconvert program from ProteoWizard.
Once you have converted the data, XCMS is one R package you can start with.
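Before giving up on the raw file, one quick check is whether it is simply a flat array of binary floats (ENVI-style data files are typically raw binary with a separate text header). A hedged Python sketch follows. It assumes little-endian 32-bit floats with no header bytes, which is a guess about the file and not something the post confirms. The helper decode_float32 is hypothetical and the sample bytes are synthetic.

```python
import struct


def decode_float32(raw, byte_order="<"):
    """Interpret raw bytes as consecutive 32-bit floats ('<' little-endian, '>' big-endian).

    Trailing bytes that do not fill a whole 4-byte value are ignored.
    """
    count = len(raw) // 4
    return list(struct.unpack(f"{byte_order}{count}f", raw[: count * 4]))


# Synthetic stand-in for the spectrometer file; with a real file use:
#   with open("spectrum.bin", "rb") as f: raw = f.read()
sample = struct.pack("<4f", 214470.0, 205000.5, 203500.25, 199000.0)

print(decode_float32(sample))
```

If the decoded numbers look like a smooth spectrum, the assumption was probably right; if they look random or contain huge exponents, try the other byte order or fall back to the vendor export route.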

How to convert .dat + .sps to .sav on command line

I get a lot of datasets that arrive as .dat files with syntax files for converting to SPSS (.sps). I'm an R user, so I need to convert the .dat file into a .sav that R can read.
In the past, I've used PSPP to do this manually. (I can't afford SPSS!) But I'd MUCH prefer a programmatic solution.
I thought pspp-convert would do the trick, but there's something I'm not understanding about how that works in terms of inputting the syntax file:
My files are:
data.dat
data.sps (which correctly points to data.dat)
I tried
pspp-convert data.sps data.sav
But get
`data.sps' is not a system or portable file.
Makes sense since the input is supposed to be a portable file. Am I trying to do something beyond the scope of this CLI?
Generally speaking, there MUST be some way to apply an SPS file to a DAT file to get a SAV file (or any other portable file) back, right?
From an SPSS Statistics point of view, a .dat file extension most often means the data is in a fixed ASCII text format. You would need the accompanying codebook to tell you what variables to read and in what formats. The SPSS Statistics command syntax file (.sps) does this for you. But this file is simply the list of SPSS Statistics commands used to read the ASCII data. It is not a data file itself.
Elsewhere you've referenced these files as "portable files". An SPSS Statistics portable file (.por) is a very special case of an ASCII file; structured to be read and written by SPSS Statistics. In any case, if your preferred tool takes an SPSS Statistics portable file (.por), these *.dat files likely aren't it.
Assuming these *.dat files are fixed ASCII text files, you'll need to discern how the information therein is stored and then use a likely tool for reading ASCII text.
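As a rough illustration of reading fixed ASCII text once the column positions are known, here is a minimal Python sketch. The layout (names and positions) and the sample lines are hypothetical. In practice they would come from the DATA LIST section of the .sps file or from the codebook.

```python
import csv
import io

# Hypothetical layout: (variable name, start offset, end offset), 0-based, end exclusive.
LAYOUT = [("id", 0, 4), ("age", 4, 7), ("income", 7, 13)]


def parse_fixed_width(line, layout):
    """Slice one fixed-width line into a dict of stripped string fields."""
    return {name: line[start:end].strip() for name, start, end in layout}


# Synthetic stand-in for the contents of data.dat.
sample_lines = ["0001 34 52000", "0002 58104500"]

records = [parse_fixed_width(line, LAYOUT) for line in sample_lines]

# Write the parsed records out as CSV, which R reads directly.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=[name for name, _, _ in LAYOUT], lineterminator="\n")
writer.writeheader()
writer.writerows(records)
print(buffer.getvalue())
```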

How to open SAS files using Excel?

I have a set of SAS data sets that I want to open using Excel or R. I don't have SAS installed, so I can't use its export option. Is there a converter from SAS7BDAT to Excel?
Thanks
I help develop the Colectica for Excel addin, which opens SAS data files in Excel. No SAS software or ODBC configurations are required. The addin directly reads the SAS file and then inserts the data and metadata into your worksheet.
Imports SAS .sas7bdat data and column names
Imports SAS .sas7bcat formats and value labels when available
The Excel addin is downloadable from http://www.colectica.com/software/colecticaforexcel
Documentation is available in the user manual.
You could use the SAS Add-In for Microsoft Office to open the SAS dataset in Excel. I'm not sure whether it is free, though.
http://support.sas.com/software/products/addin/
As Reese suggested, you can use the SAS Universal Viewer, which is free.
Here is the link:
https://support.sas.com/downloads/browse.htm?fil=&cat=74
Or you can download SAS University Edition, which is also free. It is more than just a viewer: you can write and execute programs in it.
http://www.sas.com/en_us/software/university-edition/download-software.html
Here is a quick-and-dirty Python snippet to convert a .xpt file to .csv:
import os
import pandas as pd
FILE_PATH = "(directory containing file)"
FILE = "ABC"  # filename itself (without suffix)
# Note: to use a column as the index, pass its name (in quotes) instead of None here
# os.path.join adds the path separator that plain string concatenation would miss
df = pd.read_sas(os.path.join(FILE_PATH, FILE + ".XPT"), index=None)
df.to_csv(os.path.join(FILE_PATH, FILE + ".csv"))
Hopefully this helps someone.
I came across the same need, and after some research I found a nice and easy way with R and the latest free version of RStudio (as of June 2020). With it you can open various file formats, and RStudio generates the R script it ran behind the scenes. You can use that script as a starting point to open the .sas7bdat file and then do the conversion step.
Steps to import the file the "visual" way in RStudio: Environment tab -> Import Dataset -> From SAS...
It will ask you to install the haven library. After the installation you will see a tab with a preview of the data in the file, together with the R script that was run, which will look like this:
library(haven)
aux <- read_sas("//PATH_ON_YOUR_MACHINE_TO_FILE/actual_file.sas7bdat", NULL)
View(aux)
Note the NULL: it is the catalog_file argument of read_sas(), so here it means that no separate format catalog (.sas7bcat) is supplied.
To finish the job we also need to convert the data to a .csv file. For this, simply add the following below the lines above:
write.csv(aux, "actual_file.csv")
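This writes actual_file.csv to the current working directory (pass a full path, or call setwd() first, if you want it next to the original SAS file). If you want ";" as the separator instead of ",", use write.csv2(aux, "actual_file.csv"), which also writes a decimal comma. Strings are enclosed in double quotes either way, so embedded separators should be fine.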

Can SAS still read or create the combination of {.dat fixed-column ascii data file, .sas syntax file}, or is it obsolete?

In the past I have used the excellent SAScii package in R to read in this type of data: {.dat fixed-column data file + the corresponding .sas "syntax" file}. I want to be quite precise about that because there is no end of ambiguity surrounding phrases like "SAS file". These .dat files contain only integers, and the .sas files specify both the way to parse the columns and the way the integers represent the values in the actual data (this feature is sometimes called the "codebook".) I have found very good data in that format (i.e. in the form of the pair of files {.dat, .sas}) from places like Minnesota Population Center's IPUMS https://usa.ipums.org/usa/, and built up a lot of tools to analyze it using R and SAScii.
Now I have access to SAS itself, but I would still like to re-use some of my tools and techniques. However, I can find no reference in SAS to data like that {fixed-column data in .dat, syntax file in .sas}. Has that format been entirely superseded within SAS (perhaps by the SAS7BDAT format)? Or was the {.dat, .sas} format perhaps never used within SAS? The reason I ask is that, now that I have access to SAS and so much data in SAS7BDAT format, I would like to be able to export some of it in {.dat, .sas} format for use with my own tools.
Thanks very much, and cheers - Ed
I don't think this is something built into SAS. You could, however, write such a program pretty easily.
First off, Chris Hemedinger has written something that basically does this. It creates datalines rather than a .dat file, but that shouldn't be too hard to change if you know .NET, or you could modify the R module to accept datalines. That is discussed and available here. The title of the post is "Turn your data set into a data step program". This is roughly equivalent to the SQL Server task that creates "Create Table" code out of a table. It only works in Enterprise Guide, although you should be able to do roughly the same thing in a standalone .NET program.
Second, you can easily write something like this in Base SAS. Creating the datalines is easy, and there are numerous ways to write out to a file.
For a CSV, for example, you can do this.
ods csv file="c:\temp\mydata.csv";
proc print data=mydata;
run;
ods csv close;
If you're going to write a flat file, you might as well make the input/output .sas first - after all it can be almost the same code. You can query dictionary.columns to generate the code, both the input and output code. Create a table with the variable names, lengths, and formats for each variable, then process it in a data step advancing the start variable by the length of each variable (so it moves to the next position after the last one finished). If you need formats for your R project, then proc format cntlout=<datasetname> will generate a dataset that contains those formatted value translations, and you can write that out in whatever format you need as well.
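The "advance the start position by each variable's length" step can be sketched as follows. This is an illustrative Python sketch, not SAS code and not the approach from the linked post. The column names, widths, and helper names are hypothetical, and the generated INPUT line uses plain column input for numeric variables only.

```python
def build_layout(columns):
    """Turn (name, width) pairs into (name, start, end) with 1-based inclusive columns."""
    layout = []
    start = 1
    for name, width in columns:
        layout.append((name, start, start + width - 1))
        start += width  # next variable begins right after this one ends
    return layout


def write_fixed_width(rows, columns):
    """Render each row dict as one right-aligned fixed-width line."""
    lines = []
    for row in rows:
        fields = []
        for name, width in columns:
            text = str(row[name])
            if len(text) > width:
                raise ValueError(f"{name}={text!r} does not fit in {width} columns")
            fields.append(text.rjust(width))
        lines.append("".join(fields))
    return lines


def input_statement(layout):
    """Emit a column-input style line describing where each variable sits."""
    specs = " ".join(f"{name} {start}-{end}" for name, start, end in layout)
    return f"input {specs};"


# Hypothetical variables and widths, as they might come from dictionary.columns.
columns = [("id", 4), ("age", 3), ("income", 6)]
rows = [{"id": 1, "age": 34, "income": 52000}, {"id": 2, "age": 58, "income": 104500}]

layout = build_layout(columns)
lines = write_fixed_width(rows, columns)
print("\n".join(lines))
print(input_statement(layout))
```

Generating the data lines and the layout from the same column list keeps the two files consistent, which is the point of building the input and output code together.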
