I want to read an xls file into R and select specific columns.
For example, I only want columns 1 to 10 and rows 5 to 700. I think you can do this with xlsx, but I can't use that library on the network that I am using.
Is there another package that I can use? And how would I go about selecting the columns and rows that I want?
You can try this:
library(xlsx)
read.xlsx("my_path\\my_file.xlsx", "sheet_name", rowIndex = 5:700, colIndex = 1:10)
Since you are unable to load the xlsx package, you might want to consider base R and use read.csv. For this, save your Excel file as a csv. Instructions for how to do this are easy to find on the web. Note that csv files can still be opened in Excel.
These are the steps to read only the 2nd and 3rd columns and the 2nd and 3rd data rows.
hd <- read.csv('a.csv', header = FALSE, nrows = 1, as.is = TRUE)  # first read the header line only
removeCols <- c('NULL', NA, NA)  # 'NULL' drops a column, NA keeps it
df <- read.csv('a.csv', skip = 2, header = FALSE, colClasses = removeCols)  # skip = 2 skips the header line and the first data row
colnames(df) <- unlist(hd[is.na(removeCols)])  # reuse the header names of the kept columns
df
two three
1 5 8
2 6 9
This is the example data I used.
a <- data.frame(one=1:3, two=4:6, three=7:9)
write.csv(a, 'a.csv', row.names = FALSE)
read.csv('a.csv')
one two three
1 1 4 7
2 2 5 8
3 3 6 9
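For the original request (columns 1 to 10, rows 5 to 700), the same skip/colClasses idea can be combined with nrows. The following is only a minimal sketch on a small made-up file: the temporary file, the 20 x 12 matrix, and the names first_row, last_row and keep are assumptions for illustration, and rows are counted as lines of a csv that has no header line.

```r
# Hypothetical stand-in for the exported csv: 20 lines, 12 columns, no header.
tmp <- tempfile(fileext = ".csv")
m <- matrix(1:240, nrow = 20, ncol = 12)
write.table(m, tmp, sep = ",", row.names = FALSE, col.names = FALSE)

first_row <- 5   # first line of the file to keep
last_row  <- 8   # last line of the file to keep

# "NULL" drops a column, NA lets read.csv guess its type.
keep <- rep("NULL", 12)
keep[1:3] <- NA  # keep columns 1 to 3 only

sub <- read.csv(tmp, header = FALSE,
                skip = first_row - 1,               # lines before the first wanted row
                nrows = last_row - first_row + 1,   # number of rows to read
                colClasses = keep)
print(sub)
```

For the real file this would become skip = 4, nrows = 696 and keep[1:10] <- NA, assuming "row 5" means the fifth line of the file and that the number of columns is known in advance.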
I have data in text format whose structure is as follows:
ATCTTTGAT*TTAGGGGGAAAAATTCTACGC*TTACTGGACTATGCT
.........T.....,,,,,,,,,.......T,,,,,,.........
......A..*............,,,,,,,,.A........T......
....*..................,,,T...............
...*.....................*...........
...................*.....
I have been trying to import it into R using the read.table() command, but when I do, the output has an altered structure like this:
V1
1 ATCTTTGAT*TTAGGGGGAAAAATTCTACGC*TTACTGGACTATGCT
2 .........T.....,,,,,,,,,.......T,,,,,,.........
3 ......A..*............,,,,,,,,.A........T......
4 ....*..................,,,T...............
5 ...*.....................*...........
6 ...................*.....
For some reason, R is shifting the rows with fewer characters to the right. How can I load my data into R without altering the structure present in the original text file?
Try this :)
read.table(file, sep = "\n")
result:
V1
1 ATCTTTGAT*TTAGGGGGAAAAATTCTACGC*TTACTGGACTATGCT
2 .........T.....,,,,,,,,,.......T,,,,,,.........
3 ......A..*............,,,,,,,,.A........T......
4 ....*..................,,,T...............
5 ...*.....................*...........
6 ...................*.....
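It may also help to know that the shift to the right is most likely only a display effect: print() right-justifies character columns of a data frame by default, and the stored strings are unchanged. Here is a small sketch of how one could check this, using readLines() instead of read.table(). The shortened sequences and the temporary file are made up for illustration.

```r
# Hypothetical shortened version of the data, written to a temporary file.
txt <- c("ATCTTTGAT*TTAGG",
         ".........T.....",
         "....*.....")
tmp <- tempfile(fileext = ".txt")
writeLines(txt, tmp)

# readLines() returns one string per line of the file, with nothing altered.
lines <- readLines(tmp)
nchar(lines)   # line lengths, to confirm that no padding was added

# The same data in a data frame, printed left-aligned instead of right-aligned.
df <- data.frame(V1 = lines, stringsAsFactors = FALSE)
print(df, right = FALSE)

# Or print the raw lines exactly as they appear in the file.
writeLines(lines)
```

If the goal is only to keep the lines as text, the character vector from readLines() may be all that is needed.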
Hi, I have the following data frame:
> b = data.frame(c(1,2),c(3,4))
> colnames(b) <- c("100.X0","100.00")
> b
100.X0 100.00
1 1 3
2 2 4
I would like to save this as a csv file with headers as strings. When I use write.csv the result ends up being:
100.X0 100
1 3
2 4
It turns 100.00 into 100. How do I keep the header as 100.00?
I think the problem might be the way you read the csv file. Certain programs (e.g., Excel) guess the type of each cell and convert it.
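One way to check this is to look at the raw text of the file rather than opening it in a spreadsheet program. This is a minimal sketch: the temporary file and the names header_line and b2 are made up for illustration, and row.names = FALSE is added only to keep the header line simple.

```r
b <- data.frame(c(1, 2), c(3, 4))
colnames(b) <- c("100.X0", "100.00")

tmp <- tempfile(fileext = ".csv")
write.csv(b, tmp, row.names = FALSE)

# Inspect the first line of the file as plain text.
header_line <- readLines(tmp, n = 1)
print(header_line)

# Reading it back in R: check.names = FALSE stops R from rewriting the names.
b2 <- read.csv(tmp, check.names = FALSE)
names(b2)
```

If the header still reads 100.00 here, the file itself is fine and the conversion happens in the program used to view it.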
Use write.xls from package dataframes2xls instead:
> library(dataframes2xls)
> write.xls(b, "test.csv")
I am reading in parameter estimates from some results files that I would like to compare side by side in a table. But I can't get the data frame into the structure that I want (Parameter name, Values(file1), Values(file2)).
When I read in the files I get a wide dataframe with each parameter in a separate column that I would like to transform to "long" format using melt. But that gives only one column with values. Any idea on how to get several value columns without using a for loop?
paraA <- c(1,2)
paraB <- c(6,8)
paraC <- c(11,9)
Source <- c("File1","File2")
parameters <- data.frame(paraA,paraB,paraC,Source)
wrong_table <- melt(parameters, by="Source")
You can use melt in combination with cast to get what you want. This is in fact the intended pattern of use, which is why the functions have the names they do:
library(reshape2)
m <- melt(parameters, id.vars = "Source")
dcast(m, variable ~ Source)
# variable File1 File2
# 1 paraA 1 2
# 2 paraB 6 8
# 3 paraC 11 9
Converting @alexis's comment to an answer, transpose (t()) pretty much does what you want:
setNames(data.frame(t(parameters[1:3])), parameters[, "Source"])
# File1 File2
# paraA 1 2
# paraB 6 8
# paraC 11 9
I've used setNames above to conveniently rename the resulting data.frame in one step.
I have a membership vector created with other software, and I am stuck on how to read it into R so that I can use igraph's modularity function to calculate the modularity of this community division.
Can someone help me with how to read the vector into R so that modularity(g, membership) can run?
I tried using membership <- read.table(file), but the result could not be used with modularity(g, membership).
Thanks,
Song
read.table creates a data frame; you need to convert that to a simple numeric vector. Alternatively, you can use scan(). You might need to adjust the following to your data format.
library(igraph)
G <- graph.full(3) + graph.ring(3) + graph.full(3)
contents <- '1 1 1 2 2 2 3 3 3'
memb <- scan(textConnection(contents))
# Read 9 items
modularity(G, memb)
# [1] 0.6666667
Instead of textConnection(contents), just pass your file name to scan().
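If you would rather keep read.table, the conversion to a plain vector mentioned above could look roughly like this. It is only a sketch: the three-line layout of contents is a made-up example, and it assumes the membership values should be taken row by row.

```r
# Hypothetical file contents: membership ids spread over several lines.
contents <- "1 1 1\n2 2 2\n3 3 3"
df <- read.table(text = contents)

# t() turns the data frame into a matrix and transposes it, so that
# as.vector() walks through the original rows one after another.
memb <- as.vector(t(df))
memb
```

For a file with one value per line, df[[1]] would already be a plain vector.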