This question already has answers here:
How can I read a matrix from a txt file in R?
(2 answers)
Closed 7 years ago.
I have a very big file like this (no separator between characters):
1234
3456
2345
I want to read it into R as a matrix and get this:
1 2 3 4
3 4 5 6
2 3 4 5
This question is similar to this one: read in matrix into r without delimination, but I am looking for a better way. I do not want to hard-code the number of columns. I want the number of columns to be a variable in the code, and the solution should support big files.
How about:
library(readr)
my_file <- "big_file.txt"
my_matrix <- as.matrix(read_fwf(my_file, fwf_widths(rep(1,nchar(readLines(my_file, n=1))))))
nchar(readLines(my_file, n=1)) reads the first line and counts the number of characters. That count is the number of times rep() repeats the width 1 when building the fwf_widths, so the number of columns never has to be typed in.
This assumes that all your numbers are single-digit integers between 0 and 9.
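The same idea (take the column count from the first line) can also be sketched in base R. This is only an illustration under the same single-digit assumption: the temporary file and the names tmp, n_cols and m are made up for the example, and readLines() pulls the whole file into memory, so it may not suit very large files.

```r
# Hypothetical sample file standing in for the real one
tmp <- tempfile(fileext = ".txt")
writeLines(c("1234", "3456", "2345"), tmp)

lines <- readLines(tmp)

# Number of columns comes from the first line, not from a hard-coded value
n_cols <- nchar(lines[1])

# Split every line into single characters and convert them to integers
digits <- as.integer(unlist(strsplit(lines, "")))

# unlist() returns the digits line by line, so fill the matrix by row
m <- matrix(digits, ncol = n_cols, byrow = TRUE)
m
```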
This question already has answers here:
Calculating cumulative sum for each row
(6 answers)
Closed 5 months ago.
I am new to coding so this is probably very basic. I was wondering how to add two successive values in a column of a csv along its entire length. Say this was my data:
[1] 2
[2] 3
[3] 4
[4] 5
I want to make a vector which contains 2+3, 3+4 and 4+5 (but obviously my real data set is much larger).
Thanks a lot!
I think you need cumsum():
If this is your vector:
vector <- 2:5
[1] 2 3 4 5
You would need to:
cumsum(vector)
[1] 2 5 9 14
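Note that cumsum() returns running totals. If what is wanted is the sum of each pair of neighbours (2+3, 3+4, 4+5), one possible sketch shifts the vector against itself. The names v and pair_sums are only illustrative.

```r
v <- 2:5

# Drop the last element from one copy and the first element from the other,
# then add them element-wise to get the sum of each adjacent pair
pair_sums <- head(v, -1) + tail(v, -1)
pair_sums
```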
This question already has an answer here:
Write a data frame to csv file without column header in R [duplicate]
(1 answer)
Closed 4 years ago.
Hi, I am trying to remove a variable name and display just the values alone. When doing so, I am getting a space above the first value.
Even when I tried to save it as a csv, that too had a space in the place of the variable names. How can I avoid the space? Can anyone help?
I just want to display from the first value onward, without any space in the place of the variable names.
Sample code:
unname(data.frame(cars$speed))
write.csv(unname(data.frame(cars$speed)), 'C:/Users/pdrajama/Downloads/Personal/R/cars.csv')
Console output:
unname(data.frame(cars$speed))
1 4
2 4
3 7
4 7
5 8
6 9
7 10
8 10
You can try this:
write.table(data.frame(cars$speed),"test.txt",col.names=F,quote=F,sep=",")
Setting col.names=F in write.table omits the header line, and sep="," mimics the write.csv output.
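If the row numbers are not wanted in the file either, row.names can be switched off as well. A minimal sketch, writing to a temporary file (the names out and first_lines are made up for the example) and reading the first lines back to check the result:

```r
# Temporary output path used only for this illustration
out <- tempfile(fileext = ".csv")

# No header line, no row names, comma as separator
write.table(data.frame(cars$speed), out,
            col.names = FALSE, row.names = FALSE,
            quote = FALSE, sep = ",")

# The file now starts directly with the first value
first_lines <- readLines(out, n = 3)
first_lines
```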
This question already has answers here:
Split comma-separated strings in a column into separate rows
(6 answers)
Closed 6 years ago.
I've got a dataframe like this:
The first column is numeric, and the second column is a comma separated list (character)
id numbers
1 2,4,5
2 1,4,6
3 NA
4 NA
5 5,1,2
I want to, in essence, "melt" the dataframe, similar to what the reshape package does, so that the output is a dataframe which looks like this:
id numbers
1 2
1 4
1 5
2 1
2 4
2 6
3 NA
4 NA
5 5
5 1
5 2
However, with the reshape2 package each number would have to be in its own column, which takes up too much storage space if there are many numbers. That is why I have opted to store the numbers as a comma-separated list, but melt no longer works with this setup.
Can you recommend the most efficient way to achieve the transformation from the input dataframe to output dataframe?
The way I would do it is to create a data.frame for each row and store them in a list, where df is your initial data.frame.
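l = list()
for (j in seq_len(nrow(df))){
  # strsplit() (not split()) breaks the comma-separated string into its pieces
  l[[j]] = data.frame(id = df$id[[j]],
                      numbers = strsplit(as.character(df$numbers[[j]]), ',')[[1]])
}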
Afterwards, you can stack all list elements into a single data.frame using plyr::ldply with the 'data.frame' option.
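As a rough end-to-end illustration of the list-then-stack idea, here is a sketch that uses base R's do.call(rbind, ...) in place of plyr::ldply, so that it runs without extra packages. The sample df is rebuilt from the question, and the name long is made up for the example.

```r
# Sample data rebuilt from the question
df <- data.frame(id = 1:5,
                 numbers = c("2,4,5", "1,4,6", NA, NA, "5,1,2"),
                 stringsAsFactors = FALSE)

# One small data.frame per input row; id is recycled across the split pieces
l <- lapply(seq_len(nrow(df)), function(j) {
  data.frame(id = df$id[j],
             numbers = strsplit(df$numbers[j], ",")[[1]],
             stringsAsFactors = FALSE)
})

# Stack the pieces into a single long data.frame (base R stand-in for ldply)
long <- do.call(rbind, l)

# The split pieces are character strings; convert them back to integers
long$numbers <- as.integer(long$numbers)
long
```

Rows whose numbers entry is NA survive as a single row with NA, because strsplit() returns NA for an NA input.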
This question already has answers here:
Quick question about subsetting via character-class
(3 answers)
Closed 8 years ago.
I have a vector called gas
gas <- c("Hydrogen","Methane")
I also have a data frame called df that looks like
df <- ID Hydrogen Methane
1 2 20
1 3 19
1 2 23
2 8 13
etc.
Normally, to use a variable in a data frame I would write df$Hydrogen, for example. What I want to know is whether I can also refer to Hydrogen by using the vector above, e.g.
df$gas[1]
#In other words I would like the following to be true:
df$gas[1] == df$Hydrogen
What syntax, if any, would I use to obtain this?
Thanks
If you want a specific gas, try:
df[,gas[1]]
For all gases:
df[gas]
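To make the difference concrete, here is a small sketch with a data frame rebuilt from the question's sample rows. The names h and both are only illustrative. The $ operator takes its argument literally, so df$gas looks for a column called "gas", whereas [[ ]] and [ ] evaluate the expression they are given.

```r
gas <- c("Hydrogen", "Methane")

# Sample data rebuilt from the question
df <- data.frame(ID = c(1, 1, 1, 2),
                 Hydrogen = c(2, 3, 2, 8),
                 Methane = c(20, 19, 23, 13))

# [[ ]] evaluates gas[1] to "Hydrogen" and returns that column as a vector
h <- df[[gas[1]]]

# Single brackets with a character vector return a data frame of those columns
both <- df[gas]

identical(h, df$Hydrogen)
```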
This question already has answers here:
Read a list of files with R, each file contains a list of float numbers. what's the proper way to do it?
(2 answers)
Closed 9 years ago.
I've got many text files named by year, i.e. yob1940.txt, yob1941.txt. Each file has 3 columns of data. I'm trying to import the data into R in a single data table, and add the year for each file in a 4th column.
Any help would be much appreciated.
Thanks
Something like this should work:
library(data.table)
# read each file and prepend the year parsed from its file name
rbindlist(lapply(list.files(pattern = "yob[0-9]+\\.txt"),
                 function(x) data.table(year = sub('.*?([0-9]+).*', '\\1', x),
                                        fread(x))))
Alternatively, assuming you have already read these files in as x1 and x2:
df.list<-list(x1,x2)
kk<-do.call(rbind,df.list)
year<-data.frame(rep(c(1940,1941),c(nrow(x1),nrow(x2))))
names(year)<-"year"
mydata<-data.frame(cbind(kk,year))
A worked example:
x1<-data.frame(x=c(1,3),y=c(2,3))
x2<-data.frame(x=c(3,3),y=c(2,2))
df.list<-list(x1,x2)
kk<-do.call(rbind,df.list)
year<-data.frame(rep(c(1940,1941),c(nrow(x1),nrow(x2))))
names(year)<-"year"
mydata<-data.frame(cbind(kk,year))
mydata
x y year
1 1 2 1940
2 3 3 1940
3 3 2 1941
4 3 2 1941
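The same rbind idea can be extended to any number of files by taking the year from each file name. The following base R sketch rests on assumptions that are not in the question: the files are taken to be comma-separated with no header row, the demo files hold made-up numbers in a temporary folder, and read_one and all_years are hypothetical names.

```r
# Build two small demo files in a temporary folder (made-up contents)
demo_dir <- tempfile("yob_demo")
dir.create(demo_dir)
writeLines(c("1,2,3", "4,5,6"), file.path(demo_dir, "yob1940.txt"))
writeLines("7,8,9", file.path(demo_dir, "yob1941.txt"))

files <- list.files(demo_dir, pattern = "^yob[0-9]+\\.txt$", full.names = TRUE)

# Read one file and add the year parsed from its name as a 4th column
read_one <- function(path) {
  d <- read.csv(path, header = FALSE)
  d$year <- as.integer(sub("^yob([0-9]+)\\.txt$", "\\1", basename(path)))
  d
}

# Stack all files into one data frame
all_years <- do.call(rbind, lapply(files, read_one))
all_years
```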