Data frame into matrix in R

I need to import data from a csv file and later do computations with it as with a usual matrix in R. The data in the csv file contains only numbers, except for the variable names in the header. I used the following commands:
XX <- read.table("C:/Users/.../myfile.csv", header = TRUE)
and got something resembling a matrix, with numbers separated by commas. Then:
X <- as.matrix(sapply(XX, as.numeric))
which gave me just a column vector with strange numbers. What am I doing wrong? Thanks for your help!
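One likely cause, assuming the file really is comma-separated: read.table() splits on whitespace by default, so each row is read as a single string and the data frame ends up with one column. The sketch below uses made-up data passed through the text argument instead of a file (csv_text and its values are hypothetical) and compares that with read.csv(), which splits on commas.

```r
# Hypothetical stand-in for the contents of myfile.csv
csv_text <- "a,b,c\n1,2,3\n4,5,6\n"

# read.table() with its default separator (whitespace): everything lands in one column
bad <- read.table(text = csv_text, header = TRUE)
ncol(bad)

# read.csv() uses sep = "," by default, so the three columns are kept apart
XX <- read.csv(text = csv_text, header = TRUE)

# With all columns numeric, as.matrix() gives a numeric matrix directly
X <- as.matrix(XX)
dim(X)
```

If that matches your file, passing sep = "," to read.table() should have the same effect as switching to read.csv().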

Related

Data import from excel to R (all columns are character classed)

I'm new to R and really need some help with an assignment I have for school.
I've created an xls file containing returns for companies as decimals, i.e. 0.023 (a 2.3% return).
The data is in 3 columns, with some negative values. The titles for each column are in the first row. No row names are present, just 130 observations of returns with the company names (column names) at the top. All the cells are formatted as General.
I converted the xls file to csv on my Mac, so the file type became CSV UTF-8 (comma delimited).
To create a dataset in R, I imported the csv using the read.table command:
read.table("filename.csv", header = TRUE, sep = ";", row.names = NULL)
The dataset looks good, with all the individual numbers in the right place, but when I try
sapply(dataset, class)
all columns are character. I've tried as.numeric and it says: list object cannot be coerced to type 'double'
The issue comes from the fact that your dataset uses commas as the decimal separator, and R cannot interpret these values as numeric (by default it expects a dot as the decimal separator).
There are two ways to avoid this:
You import as you did and then convert your data frame:
dataset <- apply(apply(dataset, 2, gsub, pattern = ",", replacement = "."), 2, as.numeric)
You import the dataset directly, interpreting commas as the decimal separator, with read.csv2 (part of base R, so no extra package is needed):
read.csv2("filename.csv", fill = TRUE, header = TRUE)
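As a small check of the second approach, here is a sketch with made-up values passed through the text argument instead of a file (the column names and numbers are hypothetical). read.csv2() assumes ";" as the field separator and "," as the decimal mark, so the columns should come back numeric.

```r
# Hypothetical file contents: semicolon-separated, comma as decimal mark
csv_text <- "CompanyA;CompanyB;CompanyC\n0,023;-0,010;0,005\n0,011;0,004;-0,002\n"

# read.csv2() defaults to sep = ";" and dec = ","
dataset <- read.csv2(text = csv_text, header = TRUE)
sapply(dataset, class)

# read.table() does the same once dec = "," is set explicitly
dataset2 <- read.table(text = csv_text, header = TRUE, sep = ";", dec = ",")
```

Either call avoids the character columns, so no conversion step is needed afterwards.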

writing single column to .csv in R

Hi folks: I'm trying to write a vector of length 100 to a single-column .csv in R. Each time I try, I get two columns in the csv file: the first with the index numbers from the vector, the second with the contents of my vector. For example:
MyPath<-("~/rstudioshared/Data/HW3")
Files<-dir(MyPath)
write.csv(Files,"Names.csv",row.names = FALSE)
If I convert the vector to a data frame and then check its dimensions,
Files<-data.frame(Files)
dim(Files)
I get 100 rows by 1 column, and the column contains the names of the files in my directory folder. This is what I want.
Then I write the csv. When I open it outside of R or read it back in and look at it, I get a 100 X 2 DF where the first column contains the index numbers and the second column has the names of my files.
Why does this happen?
How do I write just the single column of data to the .csv?
Thanks!
Row names are written by write.csv() by default (and by default, a data frame with n rows will have row names 1,...,n). You can see this by looking at e.g.:
dat <- data.frame(mevar=rnorm(10))
# then compare what gets written by:
write.csv(dat, "outname1.csv")
# versus:
rownames(dat) <- letters[1:10]
write.csv(dat, "outname2.csv")
Just use write.csv(dat, "outname.csv", row.names=FALSE) and the row names won't show up.
And a suggestion: it might be easier/cleaner to just write the vector directly to a text file with writeLines(your_vector, "your_outfile.txt") (you can still use read.csv() to read it back in if you prefer using that :p).
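Both options can be tried without touching real files. This is a minimal sketch: the file names in the vector are made up, and tempfile() is used so nothing is written to the working directory.

```r
# Hypothetical vector standing in for the output of dir(MyPath)
files <- c("a.txt", "b.txt", "c.txt")

csv_path <- tempfile(fileext = ".csv")
txt_path <- tempfile(fileext = ".txt")

# Option 1: write.csv() without row names gives a single column
write.csv(data.frame(Files = files), csv_path, row.names = FALSE)
back <- read.csv(csv_path, stringsAsFactors = FALSE)
dim(back)

# Option 2: writeLines() writes one element per line, with no header
writeLines(files, txt_path)
lines_back <- readLines(txt_path)
```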

Importing matrix csv data into R - how to convert into dataframe

I have a set of csv data that is saved in matrix format (the attached image is an example of the matrix).
I would like to load the data into R and have it stored as a data frame with x$Year,x$Death,x$ASMR. How would I be able to do that?
Thanks!
CS
I think you're just looking for read.csv() and then change the colnames. I am assuming your file is separated by commas.
x <- read.csv('matrix.csv', sep = ',', header = TRUE)
colnames(x) <- c('Year', 'Death', 'ASMR')

How do I import data from a .csv file into R without repeating the values of the first column into all the other ones?

I want to import data into R from a .csv file.
So far I have done the following:
#Clear environment
rm(list=ls())
#Read my data into R
myData <- read.csv("C:/Users/.../flow.csv", header=TRUE)
#Convert from list to array
myData <- array(as.numeric(unlist(myData)), dim=c(264,3))
#Create vectors with specific values of interest: qMax, qMin
qMax <- myData[,2]
qMin <- myData[,3]
#Transform vectors into matrices
qMax <- matrix(qMax,nrow = 12, ncol = round((length(qMax)/12)))
qMin <- matrix(qMin,nrow = 12, ncol = round((length(qMin)/12)))
After importing the data using read.csv, I have a list. I then proceed to transform this list into an array with 264 lines of data spread through 3 columns. Here I have my first problem.
I know that each column of my list brings a different set of data; the values are not the same. However, after I check to see what I imported, it seems that only the first column is imported correctly, but then it repeats itself for columns one and two.
Here's an image for better explanation:
The matrix has the right layout, yet wrong data. Columns 2 and 3 should have different values from each other and from column 1.
How do I correct that? I have checked the source and the original document has all the correct values.
Also, assuming I will eventually get rid of this mistake, will the proceeding lines of code from the block "#Transform vectors into matrices" deliver a 12 x 22 matrix? The first six elements of both qMax and qMin are NA and I wish to keep it this way in the matrix. Will R perform that with these lines of code or will I need to change it?
Thank you.
Edit: As suggested by akrun, here are the results for str(myData) and for dput(droplevels(head(myData)))
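On the second part of the question, here is a small sketch with made-up numbers (the vector and data frame below are hypothetical, not the asker's data). matrix() fills column by column and keeps NA values in place, so a vector of length 264 should give a 12 x 22 matrix whose first six entries in column 1 stay NA. For the conversion step, as.matrix() on an all-numeric data frame keeps the columns apart without going through unlist().

```r
# Hypothetical stand-in for one imported column: 264 values, first six missing
qMax <- c(rep(NA, 6), seq_len(258))

# matrix() fills column-wise; NA values are kept where they are
qMaxMat <- matrix(qMax, nrow = 12, ncol = length(qMax) / 12)
dim(qMaxMat)
qMaxMat[1:7, 1]

# Hypothetical all-numeric data frame converted directly to a matrix
myData <- data.frame(month = 1:4,
                     qMax  = c(NA, 2.5, 3.1, 4.0),
                     qMin  = c(NA, 0.5, 0.7, 0.9))
m <- as.matrix(myData)
```

This does not explain the repeated columns by itself; checking str(myData) right after read.csv() shows whether the columns were imported as numbers in the first place.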

read multiple csv files with the same column headings and find the mean

Is it possible to read multiple csv (Excel) files into R? All of the csv files have the same 4 columns: the first is character, the second and third are numeric, and the fourth is integer. I want to combine the data in each numeric column and find the mean.
I can get the csv files into R with
data <- list.files(directory)
myFiles <- paste(directory,data[id],sep="/")
I am unable to get the numbers from the individual columns, add them, and find the mean.
I am completely new to R and any advice is appreciated.
Here is a simple method:
Prep: Generate dummy data (you already have this):
dummy <- data.frame(names = rep("a", 4), a = 1:4, b = 5:8)
write.csv(dummy, file = "data01.csv", row.names = FALSE)
write.csv(dummy, file = "data02.csv", row.names = FALSE)
write.csv(dummy, file = "data03.csv", row.names = FALSE)
Step 0: Load the file names (just like you are doing):
data <- dir(getwd(), pattern = "\\.csv$")
Step 1: Read and combine:
DF <- do.call(rbind, lapply(data, function(fn) read.csv(file = fn, header = TRUE)))
DF
Step 2: Find the mean of the appropriate columns:
apply(DF[, 2:3], 2, mean)
Hope that helps!!
EDIT: If you are having trouble with file path, try ?file.path.
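The same steps can be put together into one script that does not depend on the working directory. This is a sketch under a few assumptions: the files live in a temporary directory created just for the demo, tmp_dir and csv_files are made-up names, and colMeans() is swapped in for apply(..., 2, mean), which gives the same result for numeric columns.

```r
# Temporary directory so the demo leaves the working directory alone
tmp_dir <- tempfile("csvdemo")
dir.create(tmp_dir)

# Three dummy files with identical column headings
dummy <- data.frame(names = rep("a", 4), a = 1:4, b = 5:8)
for (i in 1:3) {
  write.csv(dummy, file.path(tmp_dir, sprintf("data%02d.csv", i)), row.names = FALSE)
}

# full.names = TRUE returns complete paths, so no paste() is needed
csv_files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file and stack the rows into one data frame
DF <- do.call(rbind, lapply(csv_files, read.csv))

# Mean of each numeric column across all files
col_means <- colMeans(DF[, c("a", "b")])
col_means
```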
