I have a dataframe like
All_DATA
ID Name Age
1 xyz 10
2 pqr 20
5 abc 15
6 pqr 19
8 xyz 10
9 pqr 12
10 abc 20
54 abc 41
Right now I have code that subsets the data by Name and writes each subset to a separate Excel file, but now I want them in the same Excel file on different sheets.
Here is the code that writes them to separate Excel files:
library("xlsx")
library("openxlsx")
All_DATA = read.xlsx("D:/test.xlsx")
data.list=list()
for(i in unique(All_DATA$Name)){
data.list[[i]] = subset(All_DATA,Name==i)
write.xlsx( data.list[[i]],file=paste0("D:/Admin/",i,".xlsx"),row.names=F)
}
Is there any way to generate a single Excel file with the data on multiple sheets?
Thanks
Domnick
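You can use write.xlsx from the xlsx package inside the loop with append = TRUE, so that each subset is added as a new sheet of the same file:
xlsx::write.xlsx(data.list[[i]], file="file.xlsx", sheetName=paste0("Sheet_",i), append=TRUE, row.names = F)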
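As an alternative sketch, assuming the openxlsx package is installed: its write.xlsx() accepts a named list of data frames and writes each element to its own sheet, so no loop is needed. The output path below is a temporary file chosen for illustration, and the data frame is rebuilt from the sample in the question.

```r
# Sample data from the question
All_DATA <- data.frame(
  ID   = c(1, 2, 5, 6, 8, 9, 10, 54),
  Name = c("xyz", "pqr", "abc", "pqr", "xyz", "pqr", "abc", "abc"),
  Age  = c(10, 20, 15, 19, 10, 12, 20, 41)
)

# One data frame per Name; the list names become the sheet names
data.list <- split(All_DATA, All_DATA$Name)

# Only attempt the write if openxlsx is available
if (requireNamespace("openxlsx", quietly = TRUE)) {
  out_file <- file.path(tempdir(), "all_names.xlsx")  # illustrative path
  openxlsx::write.xlsx(data.list, file = out_file)
  print(openxlsx::getSheetNames(out_file))
}
```

split() orders the list by the sorted values of Name, so the sheets appear in alphabetical order.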
I have a bunch of large CSVs that all contain exactly the same columns, and I need to combine them into a single CSV, basically appending the rows of each data frame beneath the previous one. Like this:
Table 1
Prop_ID State Pasture Soy Corn
1 WI 20 45 75
2 MI 10 80 122
Table 2
Prop_ID State Pasture Soy Corn
3 MN 152 0 15
4 IL 0 10 99
Output table
Prop_ID State Pasture Soy Corn
1 WI 20 45 75
2 MI 10 80 122
3 MN 152 0 15
4 IL 0 10 99
I have more than 2 tables to do this with, any help would be appreciated. Thanks
A possible solution, in base R:
rbind(df1, df2)
#> Prop_ID State Pasture Soy Corn
#> 1 1 WI 20 45 75
#> 2 2 MI 10 80 122
#> 3 3 MN 152 0 15
#> 4 4 IL 0 10 99
Or using dplyr:
dplyr::bind_rows(df1, df2)
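Since the question mentions more than two tables, here is a minimal base R sketch of the same idea applied to a list of data frames. df1 and df2 are rebuilt from the question; df3 is a made-up third table added only for illustration.

```r
df1 <- data.frame(Prop_ID = 1:2, State = c("WI", "MI"),
                  Pasture = c(20, 10), Soy = c(45, 80), Corn = c(75, 122))
df2 <- data.frame(Prop_ID = 3:4, State = c("MN", "IL"),
                  Pasture = c(152, 0), Soy = c(0, 10), Corn = c(15, 99))
# Hypothetical third table, not part of the original question
df3 <- data.frame(Prop_ID = 5:6, State = c("IA", "OH"),
                  Pasture = c(5, 30), Soy = c(60, 20), Corn = c(110, 40))

# Put any number of tables in a list and row-bind them in one call
tables <- list(df1, df2, df3)
combined <- do.call(rbind, tables)
print(combined)
```

dplyr::bind_rows(tables) accepts the same list if dplyr is preferred.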
Assuming all the csv files are in a single directory, and that these are the only files in that directory, this solution, using data.table, should work.
library(data.table)
setwd('<directory with your csv files>')
files <- list.files(pattern = '.+\\.csv$')
result <- rbindlist(lapply(files, fread))
list.files(...) returns a vector containing the file names in a given directory, based on a pattern. Here we ask for only files containing .csv at the end.
fread(...) is a very fast file reader for data.table. We apply this function to each file name ( lapply(files, fread) ) to generate a list containing the contents of each csv. Then we use rbindlist(...) to combine them row-wise.
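The same list-then-bind pattern can be sketched in base R without data.table. This is an illustration under assumptions: the two CSV files are created in a temporary directory from the question's sample tables, and the file names are made up.

```r
# Create two example CSV files in a temporary directory
csv_dir <- tempfile("csv_demo_")
dir.create(csv_dir)
write.csv(data.frame(Prop_ID = 1:2, State = c("WI", "MI"),
                     Pasture = c(20, 10), Soy = c(45, 80), Corn = c(75, 122)),
          file.path(csv_dir, "table1.csv"), row.names = FALSE)
write.csv(data.frame(Prop_ID = 3:4, State = c("MN", "IL"),
                     Pasture = c(152, 0), Soy = c(0, 10), Corn = c(15, 99)),
          file.path(csv_dir, "table2.csv"), row.names = FALSE)

# List the files, read each one, and stack the results row-wise
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, read.csv))

# Write the combined table outside the input directory
out_file <- tempfile(fileext = ".csv")
write.csv(combined, out_file, row.names = FALSE)
print(combined)
```

Passing full.names = TRUE to list.files() avoids the need for setwd(). For large files, fread() and rbindlist() as shown above will generally be faster.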
I have been having this issue where after importing data from a csv file using the commands
> mydata = read.table(file.choose(), header = TRUE)
> attach(mydata)
> mydata
my data appears as
Tail_Length.Wing_Length.Gender
1 180,278,1
2 186,277,1
3 206,308,1
4 184,290,1
5 177,273,1
6 177,284,1
7 176,267,1
8 200,281,1
9 191,287,1
10 193,271,1
11 212,302,1
12 181,254,1
13 195,297,1
14 187,281,1
15 190,284,1
16 185,282,1
17 195,285,1
18 183,276,1
What can I do so that each variable gets its own column, and so that the column titles are not joined by periods?
Please help soon, thank you.
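The output above is consistent with a comma-separated file being read with read.table(), whose default separator is whitespace: each line is then read as one field, and the commas in the header are converted to periods when R makes the name syntactically valid. A minimal sketch, assuming the file really is comma-separated (the temporary file below is built from the first rows shown in the question):

```r
# Recreate a small comma-separated file like the one described
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Tail_Length,Wing_Length,Gender",
             "180,278,1",
             "186,277,1",
             "206,308,1"), csv_path)

# Default whitespace separator: everything lands in a single column
wrong <- read.table(csv_path, header = TRUE)
print(names(wrong))

# Comma separator: three separate columns with clean names
mydata <- read.csv(csv_path)
# Equivalent: read.table(csv_path, header = TRUE, sep = ",")
str(mydata)
```

With the real data, read.csv(file.choose()) would replace the temporary path.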
So I have multiple sheets in one Excel file (e.g. tabs named s1, s2, s3).
I want to create a function that reads them in, adds a column to each sheet containing its tab name repeated on every row, and then combines them into a single data frame.
#####Step1 read in multiple tabs of sheets in a function
s1<-data.frame(ID=c(132,453,644))
s2<-data.frame(ID=c(1332,4532,6443))
s3<-data.frame(ID=c(432,643,747))
> s1
ID
1 132
2 453
3 644
> s2
ID
1 1332
2 4532
3 6443
> s3
ID
1 432
2 643
3 747
######Result Step 2
s1$tabname<-c(rep('s1',nrow(s1)))
> s1
ID tabname
1 132 s1
2 453 s1
3 644 s1
s2$tabname<-c(rep('s2',nrow(s2)))
> s2
ID tabname
1 1332 s2
2 4532 s2
3 6443 s2
s3$tabname<-c(rep('s3',nrow(s3)))
> s3
ID tabname
1 432 s3
2 643 s3
3 747 s3
####My ultimate goal
ultimate<-rbind(s1,s2,s3)
> ultimate
ID tabname
1 132 s1
2 453 s1
3 644 s1
4 1332 s2
5 4532 s2
6 6443 s2
7 432 s3
8 643 s3
9 747 s3
####I'm stuck on step 2, adding the column according to the tab names, and had to hard-code step 3 as well. My code is below; can someone give me a hint?
#
library("readxl")
Import<-function(Ref){
Excel.Ref<-read_xlsx("The Excel Sheet I Have.xlsx", sheet = Ref)
for (Ref in 1:length(Ref)){
Excel.Ref<-cbind(Excel.Ref,
Excel.Tab<- data.frame (Tab_name =rep(Ref,nrow(Excel.Ref))))
}
return(Excel.Ref)
print(Ref)
}
d<-c('s1','s2','s3')
Obs<-apply(d<-as.matrix(d), 1, function(x)do.call(Import, as.list(x)))
I got your desired result using a combination of readxl and the tidyverse. I created a file in my environment called test_file.xlsx containing the sheets.
library(dplyr) # provides bind_rows(), mutate() and %>%
##First: get all sheet names
sheets_to_read <- readxl::excel_sheets("test_file.xlsx")
##Second: read all sheets, add tabname, and then bind rows
x <- bind_rows(lapply(1:length(sheets_to_read),
function(i)readxl::read_excel("test_file.xlsx",
sheet = sheets_to_read[i]) %>%
mutate(tabname = sheets_to_read[i])))
x
Here is a simple dplyr solution:
Import <- function(xlsxfile,col_names){
# get names of sheets in input xlsxfile
sheets <- readxl::excel_sheets(xlsxfile)
# sheets as list
l <- lapply(sheets, readxl::read_xlsx, path=xlsxfile, col_names=col_names)
# sheets as data.frame
dplyr::bind_rows(l,.id="tabname")
}
Import(xlsxfile,col_names="ID")
# A tibble: 9 x 2
tabname ID
<chr> <dbl>
2 1 132
3 1 453
4 1 644
6 2 1332
7 2 4532
8 2 6443
10 3 432
11 3 643
12 3 747
Use the col_names argument of read_xlsx() to specify the column names of the data read from each sheet.
Use bind_rows() to bind a list of data.frames into a single data.frame. The .id argument names a new column that records which data.frame each row came from (here the position in the list, because the list is unnamed).
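To get the tab names themselves in the new column rather than list positions, the list needs names. The sketch below shows the idea in base R on the s1, s2 and s3 data frames from the question, so it runs without an Excel file; with readxl, the named list would presumably come from reading each sheet returned by excel_sheets() and naming the result with those sheet names.

```r
# Stand-ins for the three sheets, taken from the question
s1 <- data.frame(ID = c(132, 453, 644))
s2 <- data.frame(ID = c(1332, 4532, 6443))
s3 <- data.frame(ID = c(432, 643, 747))

# A named list: the names play the role of the tab names
sheets <- list(s1 = s1, s2 = s2, s3 = s3)

# Add the tab name to every row of each sheet
tagged <- Map(function(df, nm) {
  df$tabname <- nm
  df
}, sheets, names(sheets))

# Stack the sheets and reset the row names
ultimate <- do.call(rbind, tagged)
rownames(ultimate) <- NULL
print(ultimate)
```

dplyr::bind_rows(sheets, .id = "tabname") gives the same labels in one call, because for a named list the .id column is filled with the list names.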
I am trying to create boxplots across 3 folders, each of which has 10 files in it. I have tried to iterate over the folders, but I am having trouble boxplotting each of the files.
So, here is my code
fileList = list.files(path=path,pattern="\\.teTestResult",full.names=T)
myfiles = lapply(fileList, read.delim,sep=";")
variablelist = c("BRFS3", "CPFE3", "CYRE3", "EMBR3", "ITUB4", "LREN3", "PETR4", "TIMP3", "TOTS3", "VALE5")
The variablelist holds the names of the 10 files.
I am stuck for now.
This is an example of boxplotting one header in the BRFS3 file, but I need 5 more iterations of this:
boxplot(Return~gama, data=BRFS3)
Thank you so much for your help
gama theta detectionsLimit NSMOOTH NREF NOBS sma lma PosTrades NegTrades Acc AvgWin AvgLoss Return
10.0 1.00 3 10 50 10 15 33 11 2 0.846154 0.0451529 -0.019449800 1.54942
1.0 1.00 5 5 80 40 15 33 7 2 0.777778 0.0676022 -0.008395400 1.54916
1.0 1.00 1 20 80 40 15 33 6 1 0.857143 0.0673241 -0.017465300 1.44823
The above is a sample of the data file for the variable BRFS3. I tried to format it as a proper table, but I don't know how to show a table on Stack Overflow. Anyway, I hope the table helps you understand the problem.
Much thanks..
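One way such a loop could look, as a rough sketch only: the data frames below are simulated stand-ins for the real files (the values are made up), only three of the ten names are used, and the list is named with variablelist so that each plot can be titled after its file. The second loop over columns is a guess at what "5 more iterations" means, using two columns from the sample table.

```r
set.seed(1)
variablelist <- c("BRFS3", "CPFE3", "CYRE3")

# Simulated stand-ins for the data read with lapply(fileList, read.delim, sep = ";")
myfiles <- lapply(variablelist, function(nm) {
  data.frame(gama   = rep(c(1, 10), each = 10),
             theta  = rep(c(0.5, 1), times = 10),
             Return = rnorm(20, mean = 1.5, sd = 0.1))
})
names(myfiles) <- variablelist

# Columns to group by (assumed; extend with the other headers as needed)
group_cols <- c("gama", "theta")

# One boxplot per file and per grouping column, collected in a single PDF
pdf_path <- file.path(tempdir(), "boxplots.pdf")
pdf(pdf_path)
for (nm in names(myfiles)) {
  for (g in group_cols) {
    boxplot(reformulate(g, response = "Return"), data = myfiles[[nm]],
            main = paste(nm, "- Return by", g))
  }
}
dev.off()
```

reformulate() builds the formula Return ~ gama (or Return ~ theta) from the column name, which avoids writing each boxplot call by hand.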
I am new to R and am looking for code to manipulate hundreds of files that I have at hand. They are .txt files with a few rows of unwanted text, followed by columns of data, looking something like this:
XXXXX
XXXXX
XXXXX
Col1 Col2 Col3 Col4 Col5
1 36 37 35 36
2 34 34 36 37
.
.
1500 34 35 36 35
I wrote some code (below) to extract selected rows of columns 1 and 5 of an individual .txt file, and would like to loop over all the files that I have.
data <- read.table(paste("/Users/tan/Desktop/test/01.txt"), skip =264, nrows = 932)
selcol<-c("V1", "V5")
write.table(data[selcol], file="/Users/tan/Desktop/test/01ed.txt", sep="\t")
With the above code, the .txt file now looks like this:
Col1 Col5
300 34
.
.
700 34
If possible, I would like to combine all the Col5 of the .txt files with one of Column 1 (which is the same for all txt files), so that it looks something like this:
Col1 Col5a Col5b Col5c Col5d ...
300 34 34 36 37
.
.
700 34 34 36 37
Thank you!
Tan
Alright - I think I hit on all your questions here, but let me know if I missed something. The general process that we will go through here is:
1. Identify all of the files that we want to read in and process in our working directory
2. Use lapply to iterate over each of those file names to create a single list object that contains all of the data
3. Select your columns of interest
4. Merge them together by the common column
For the purposes of the example, consider I have four files named file1.txt through file4.txt that all look like this:
x y y2
1 1 2.44281173 -2.32777987
2 2 -0.32999022 -0.60991623
3 3 0.74954561 0.03761497
4 4 -0.44374491 -1.65062852
5 5 0.79140012 0.40717932
6 6 -0.38517329 -0.64859906
7 7 0.92959219 -1.27056731
8 8 0.47004041 2.52418636
9 9 -0.73437337 0.47071120
10 10 0.48385902 1.37193941
##1. identify files to read in
filesToProcess <- dir(pattern = "file.*\\.txt$")
> filesToProcess
[1] "file1.txt" "file2.txt" "file3.txt" "file4.txt"
##2. Iterate over each of those file names with lapply
listOfFiles <- lapply(filesToProcess, function(x) read.table(x, header = TRUE))
##3. Select columns x and y2 from each of the objects in our list
listOfFiles <- lapply(listOfFiles, function(z) z[c("x", "y2")])
##NOTE: you can combine steps 2 and 3 by passing in the colClasses parameter to read.table.
#That code would be:
listOfFiles <- lapply(filesToProcess, function(x) read.table(x, header = TRUE
, colClasses = c("integer","NULL","numeric")))
##4. Merge all of the objects in the list together with Reduce.
# x is the common columns to join on
out <- Reduce(function(x,y) {merge(x,y, by = "x")}, listOfFiles)
#clean up the column names
colnames(out) <- c("x", sub("\\.txt", "", filesToProcess))
Results in the following:
> out
x file1 file2 file3 file4
1 1 -2.32777987 -0.671934857 -2.32777987 -0.671934857
2 2 -0.60991623 -0.822505224 -0.60991623 -0.822505224
3 3 0.03761497 0.049694686 0.03761497 0.049694686
4 4 -1.65062852 -1.173863215 -1.65062852 -1.173863215
5 5 0.40717932 1.189763270 0.40717932 1.189763270
6 6 -0.64859906 0.610462808 -0.64859906 0.610462808
7 7 -1.27056731 0.928107752 -1.27056731 0.928107752
8 8 2.52418636 -0.856625895 2.52418636 -0.856625895
9 9 0.47071120 -1.290480033 0.47071120 -1.290480033
10 10 1.37193941 -0.235659079 1.37193941 -0.235659079
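For a version that can be run end to end, here is a self-contained sketch of the same four steps. It is an illustration under assumptions: the three input files are generated in a temporary directory with made-up values, and the value column is renamed before merging (rather than afterwards) so that merge() never has to disambiguate repeated column names.

```r
# Create three small example files (hypothetical data)
demo_dir <- tempfile("merge_demo_")
dir.create(demo_dir)
for (i in 1:3) {
  df <- data.frame(x = 1:5, y = i * 10 + 1:5, y2 = i * 100 + 1:5)
  write.table(df, file.path(demo_dir, sprintf("file%d.txt", i)), row.names = FALSE)
}

# 1. Identify the files to read in
files <- list.files(demo_dir, pattern = "^file.*\\.txt$", full.names = TRUE)

# 2. and 3. Read each file, keep x and y2, and name y2 after its file
tables <- lapply(files, function(f) {
  d <- read.table(f, header = TRUE)[c("x", "y2")]
  names(d)[2] <- sub("\\.txt$", "", basename(f))
  d
})

# 4. Merge all tables on the common column x
out <- Reduce(function(a, b) merge(a, b, by = "x"), tables)
print(out)
```

For the files in the question, the skip and nrows arguments of read.table() from the original code would be added to the read step.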