Append data to different sheets in an Excel file in R

I have a dataframe like
All_DATA
ID Name Age
1 xyz 10
2 pqr 20
5 abc 15
6 pqr 19
8 xyz 10
9 pqr 12
10 abc 20
54 abc 41
Right now I have code that subsets the data by Name and writes each subset to a separate Excel file, but now I want the subsets in a single Excel file on different sheets.
Here is the code that writes them to separate Excel files:
library("xlsx")
library("openxlsx")
All_DATA = read.xlsx("D:/test.xlsx")
data.list=list()
for(i in unique(All_DATA$Name)){
data.list[[i]] = subset(All_DATA, Name == i)  # the column is called Name, not NAME
write.xlsx(data.list[[i]], file = paste0("D:/Admin/", i, ".xlsx"), row.names = FALSE)
}
Is there any way to generate a single Excel file with the data on multiple sheets?
Thanks
Domnick

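With the xlsx package you can write each subset to its own sheet of a single file by passing append = TRUE inside the loop (the sheet name should not carry the .xlsx suffix):
xlsx::write.xlsx(data.list[[i]], file = "file.xlsx", sheetName = paste0("Sheet_", i), row.names = FALSE, append = TRUE)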
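If openxlsx is the package actually in use (the question loads it last, so its write.xlsx masks the one from xlsx), a named list of data frames can be written in a single call, one sheet per list element. This is a minimal sketch, assuming openxlsx is installed; the output file name all_names.xlsx is hypothetical and the data frame is re-created inline from the question.

```r
# Sample data copied from the question
All_DATA <- data.frame(
  ID   = c(1, 2, 5, 6, 8, 9, 10, 54),
  Name = c("xyz", "pqr", "abc", "pqr", "xyz", "pqr", "abc", "abc"),
  Age  = c(10, 20, 15, 19, 10, 12, 20, 41)
)

# One data frame per Name; the list names become the sheet names
sheets <- split(All_DATA, All_DATA$Name)

# Hypothetical output path, placed in a temporary directory for this sketch
out_file <- file.path(tempdir(), "all_names.xlsx")

# Write only if openxlsx is available
if (requireNamespace("openxlsx", quietly = TRUE)) {
  openxlsx::write.xlsx(sheets, file = out_file)
}
```

split() replaces the explicit loop and subset() call, so there is no per-iteration file handling to get wrong.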

Related

R combine multiple data frames based on column names into single table [duplicate]

I have a bunch of large CSVs that all contain exactly the same columns, and I need to combine them into a single CSV, essentially appending the rows of each data frame underneath the previous one. Like this:
Table 1
Prop_ID State Pasture Soy Corn
1 WI 20 45 75
2 MI 10 80 122
Table 2
Prop_ID State Pasture Soy Corn
3 MN 152 0 15
4 IL 0 10 99
Output table
Prop_ID State Pasture Soy Corn
1 WI 20 45 75
2 MI 10 80 122
3 MN 152 0 15
4 IL 0 10 99
I have more than 2 tables to do this with, any help would be appreciated. Thanks
A possible solution, in base R:
rbind(df1, df2)
#> Prop_ID State Pasture Soy Corn
#> 1 1 WI 20 45 75
#> 2 2 MI 10 80 122
#> 3 3 MN 152 0 15
#> 4 4 IL 0 10 99
Or using dplyr:
dplyr::bind_rows(df1, df2)
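Since the question mentions more than two tables, the same base R call extends to any number of data frames by collecting them in a list and using do.call(). A small sketch; df1 and df2 are rebuilt from the question, while df3 is a made-up third table added purely for illustration.

```r
df1 <- data.frame(Prop_ID = 1:2, State = c("WI", "MI"),
                  Pasture = c(20, 10), Soy = c(45, 80), Corn = c(75, 122))
df2 <- data.frame(Prop_ID = 3:4, State = c("MN", "IL"),
                  Pasture = c(152, 0), Soy = c(0, 10), Corn = c(15, 99))
# Hypothetical third table, not from the question
df3 <- data.frame(Prop_ID = 5L, State = "IA",
                  Pasture = 7, Soy = 30, Corn = 60)

# Put every table in a list, then bind all of them row-wise in one call
tables <- list(df1, df2, df3)
combined <- do.call(rbind, tables)
```

rbind() matches data frame columns by name, so this relies on every table having the same column names, as the question states.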
Assuming all the csv files are in a single directory, and that these are the only files in that directory, this solution, using data.table, should work.
library(data.table)
setwd('<directory with your csv files>')
files <- list.files(pattern = '.+\\.csv$')
result <- rbindlist(lapply(files, fread))
list.files(...) returns a vector containing the file names in a given directory, based on a pattern. Here we ask for only files containing .csv at the end.
fread(...) is a very fast file reader for data.table. We apply this function to each file name ( lapply(files, fread) ) to generate a list containing the contents of each csv. Then we use rbindlist(...) to combine them row-wise.
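For comparison, the same read-then-bind pattern can be sketched in base R without data.table. The directory and file names below are hypothetical; the sketch first writes two small CSVs to a temporary folder so that it runs on its own.

```r
# Hypothetical demo directory holding the CSV files
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)

write.csv(data.frame(Prop_ID = 1:2, State = c("WI", "MI"),
                     Pasture = c(20, 10), Soy = c(45, 80), Corn = c(75, 122)),
          file.path(csv_dir, "table1.csv"), row.names = FALSE)
write.csv(data.frame(Prop_ID = 3:4, State = c("MN", "IL"),
                     Pasture = c(152, 0), Soy = c(0, 10), Corn = c(15, 99)),
          file.path(csv_dir, "table2.csv"), row.names = FALSE)

# full.names = TRUE returns complete paths, so no setwd() is needed
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file into a data frame, then stack them row-wise
result <- do.call(rbind, lapply(files, read.csv))

# Write the combined table outside the input directory
write.csv(result, file.path(tempdir(), "combined.csv"), row.names = FALSE)
```

For very large files, fread() and rbindlist() from the answer above will generally be faster than read.csv() and rbind().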

How do I get my .csv data file to be aligned properly in R?

I have been having this issue where after importing data from a csv file using the commands
> mydata = read.table(file.choose(), header = TRUE)
> attach(mydata)
> mydata
my data appears as
Tail_Length.Wing_Length.Gender
1 180,278,1
2 186,277,1
3 206,308,1
4 184,290,1
5 177,273,1
6 177,284,1
7 176,267,1
8 200,281,1
9 191,287,1
10 193,271,1
11 212,302,1
12 181,254,1
13 195,297,1
14 187,281,1
15 190,284,1
16 185,282,1
17 195,285,1
18 183,276,1
What can I do to have each variable in its own column, and to fix the titles so that they do not have periods in between?
Please help soon, thank you.
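The output above is what read.table() produces when it splits on whitespace (its default) while the file is comma-separated: each line becomes a single field, and the commas in the header are turned into periods when R makes a valid column name. A sketch of the likely fix, using a small temporary file with the first rows of the question's data in place of file.choose():

```r
# Re-create a few lines of the file in a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Tail_Length,Wing_Length,Gender",
             "180,278,1",
             "186,277,1",
             "206,308,1"), csv_path)

# Default separator is whitespace, so every line is read as one field
one_column <- read.table(csv_path, header = TRUE)

# read.csv() uses a comma separator and header = TRUE by default
mydata <- read.csv(csv_path)

# Passing the separator to read.table() explicitly also works
mydata2 <- read.table(csv_path, header = TRUE, sep = ",")
```

With the columns split correctly, they can be reached as mydata$Tail_Length and so on, without attach().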

loop read the sheets and add the sheets as another column in R

So I have multiple sheets in one Excel file (e.g. tabs named s1, s2, s3).
I want to create a function that reads them in, adds a column to each sheet holding its tab name repeated on every row, and then combines them into one data frame.
#####Step1 read in multiple tabs of sheets in a function
s1<-data.frame(ID=c(132,453,644))
s2<-data.frame(ID=c(1332,4532,6443))
s3<-data.frame(ID=c(432,643,747))
> s1
ID
1 132
2 453
3 644
> s2
ID
1 1332
2 4532
3 6443
> s3
ID
1 432
2 643
3 747
######Result Step 2
s1$tabname<-c(rep('s1',nrow(s1)))
> s1
ID tabname
1 132 s1
2 453 s1
3 644 s1
s2$tabname<-c(rep('s2',nrow(s2)))
> s2
ID tabname
1 1332 s2
2 4532 s2
3 6443 s2
s3$tabname<-c(rep('s3',nrow(s3)))
> s3
ID tabname
1 432 s3
2 643 s3
3 747 s3
####My ultimate goal
ultimate<-rbind(s1,s2,s3)
> ultimate
ID tabname
1 132 s1
2 453 s1
3 644 s1
4 1332 s2
5 4532 s2
6 6443 s2
7 432 s3
8 643 s3
9 747 s3
####I'm stuck on step 2, adding the column according to the tab names, and had to hard-code step 3 as well. My code is below; can someone give me a hint?
#
library("readxl")
Import<-function(Ref){
Excel.Ref<-read_xlsx("The Excel Sheet I Have.xlsx", sheet = Ref)
for (Ref in 1:length(Ref)){
Excel.Ref<-cbind(Excel.Ref,
Excel.Tab<- data.frame (Tab_name =rep(Ref,nrow(Excel.Ref))))
}
return(Excel.Ref)
print(Ref)
}
d<-c('s1','s2','s3')
Obs<-apply(d<-as.matrix(d), 1, function(x)do.call(Import, as.list(x)))
I managed your desired result using a combination of readxl and tidyverse. I created a file in my environment called test_file.xlsx with the sheets.
library(dplyr)  # provides bind_rows(), mutate() and %>%
##First: get all sheet names
sheets_to_read <- readxl::excel_sheets("test_file.xlsx")
##Second: read all sheets, add tabname, and then bind rows
x <- bind_rows(lapply(seq_along(sheets_to_read),
function(i) readxl::read_excel("test_file.xlsx",
sheet = sheets_to_read[i]) %>%
mutate(tabname = sheets_to_read[i])))
x
Here is a simple dplyr solution:
Import <- function(xlsxfile,col_names){
# get names of sheets in input xlsxfile
sheets <- readxl::excel_sheets(xlsxfile)
# sheets as list
l <- lapply(sheets, readxl::read_xlsx, path=xlsxfile, col_names=col_names)
# sheets as data.frame
dplyr::bind_rows(l,.id="tabname")
}
Import(xlsxfile,col_names="ID")
# A tibble: 9 x 2
tabname ID
<chr> <dbl>
2 1 132
3 1 453
4 1 644
6 2 1332
7 2 4532
8 2 6443
10 3 432
11 3 643
12 3 747
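Use the col_names argument of read_xlsx() to set the name of the data column(s) read from each sheet.
Use bind_rows() to bind a list of data.frames into a single data.frame, keeping track of the originating data.frame in the new column whose name is given by the .id argument. Because the list l is unnamed here, tabname holds the position of each sheet (1, 2, 3); naming the list first, e.g. names(l) <- sheets, would put the sheet names there instead.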
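The tagging step itself can also be sketched in base R, which makes it easy to see where the tab names come from. This uses the in-memory s1, s2 and s3 from the question in place of sheets read from a file; the named list plays the role of the workbook.

```r
s1 <- data.frame(ID = c(132, 453, 644))
s2 <- data.frame(ID = c(1332, 4532, 6443))
s3 <- data.frame(ID = c(432, 643, 747))

# A named list stands in for the sheets of the Excel file
sheets <- list(s1 = s1, s2 = s2, s3 = s3)

# Add a tabname column to each data frame, filled with its list name
tagged <- Map(function(df, nm) {
  df$tabname <- nm
  df
}, sheets, names(sheets))

# Stack everything and drop the row names that rbind derives from the list
ultimate <- do.call(rbind, tagged)
rownames(ultimate) <- NULL
```

With a real workbook, the named list would be built by reading each sheet returned by readxl::excel_sheets() and naming the result with those sheet names.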

Create boxplot over lists from the first 6 header of the list

I am trying to create boxplots over 3 folders. Each folder has 10 files in it. I have tried to iterate over the folders, but I have trouble boxplotting each of the files.
So, here is my code:
fileList = list.files(path=path,pattern="\\.teTestResult",full.names=T)
myfiles = lapply(fileList, read.delim,sep=";")
variablelist = c("BRFS3", "CPFE3", "CYRE3", "EMBR3", "ITUB4", "LREN3", "PETR4", "TIMP3", "TOTS3", "VALE5")
The variablelist holds the names of the 10 files.
I am stuck for now.
This is an example of boxplotting 1 header in the BRFS3 file, but I need 5 more iterations like this:
boxplot(Return~gama, data=BRFS3)
Thank you so much for your help
gama theta detectionsLimit NSMOOTH NREF NOBS sma lma PosTrades NegTrades Acc AvgWin AvgLoss Return
10.0 1.00 3 10 50 10 15 33 11 2 0.846154 0.0451529 -0.019449800 1.54942
1.0 1.00 5 5 80 40 15 33 7 2 0.777778 0.0676022 -0.008395400 1.54916
1.0 1.00 1 20 80 40 15 33 6 1 0.857143 0.0673241 -0.017465300 1.44823
The above is a sample of the data file for the variable BRFS3. I have tried to format it as a readable table, but I don't know how to show a table on Stack Overflow. Anyway, I hope the table helps you understand the problem.
Much thanks.
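One way to read the request is: for every file, draw Return against each of the first six headers. A rough sketch of that double loop follows, under several assumptions: the list of data frames is named after the files (in the real case via names(myfiles) <- variablelist, which only holds if fileList and variablelist are in the same order), and a tiny stand-in for BRFS3 is built from the three sample rows above.

```r
# Stand-in for one parsed file, using the sample rows from the question
BRFS3 <- data.frame(
  gama = c(10, 1, 1), theta = c(1, 1, 1), detectionsLimit = c(3, 5, 1),
  NSMOOTH = c(10, 5, 20), NREF = c(50, 80, 80), NOBS = c(10, 40, 40),
  Return = c(1.54942, 1.54916, 1.44823)
)
myfiles <- list(BRFS3 = BRFS3)

# The first six column names are the grouping variables
headers <- names(BRFS3)[1:6]

pdf(file = NULL)  # discard the plots in this sketch; remove to draw on screen
stats <- list()
for (stock in names(myfiles)) {
  for (h in headers) {
    f <- reformulate(h, response = "Return")  # builds e.g. Return ~ gama
    stats[[paste(stock, h)]] <- boxplot(f, data = myfiles[[stock]],
                                        main = paste(stock, "Return by", h))
  }
}
invisible(dev.off())
```

reformulate() builds the formula from the column name as a string, which avoids writing six nearly identical boxplot() calls per file.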

Manipulating multiple files in R

I am new to R and am looking for code to manipulate hundreds of files that I have at hand. They are .txt files with a few rows of unwanted text, followed by columns of data, looking something like this:
XXXXX
XXXXX
XXXXX
Col1 Col2 Col3 Col4 Col5
1 36 37 35 36
2 34 34 36 37
.
.
1500 34 35 36 35
I wrote some code (below) to extract selected rows of columns 1 and 5 of an individual .txt file, and would like to loop over all the files that I have.
data <- read.table(paste("/Users/tan/Desktop/test/01.txt"), skip =264, nrows = 932)
selcol<-c("V1", "V5")
write.table(data[selcol], file="/Users/tan/Desktop/test/01ed.txt", sep="\t")
With the above code, the .txt file now looks like this:
Col1 Col5
300 34
.
.
700 34
If possible, I would like to combine all the Col5 of the .txt files with one of Column 1 (which is the same for all txt files), so that it looks something like this:
Col1 Col5a Col5b Col5c Col5d ...
300 34 34 36 37
.
.
700 34 34 36 37
Thank you!
Tan
Alright - I think I hit on all your questions here, but let me know if I missed something. The general process that we will go through here is:
1. Identify all of the files that we want to read in and process in our working directory
2. Use lapply to iterate over each of those file names to create a single list object that contains all of the data
3. Select your columns of interest
4. Merge them together by the common column
For the purposes of the example, consider I have four files named file1.txt through file4.txt that all look like this:
x y y2
1 1 2.44281173 -2.32777987
2 2 -0.32999022 -0.60991623
3 3 0.74954561 0.03761497
4 4 -0.44374491 -1.65062852
5 5 0.79140012 0.40717932
6 6 -0.38517329 -0.64859906
7 7 0.92959219 -1.27056731
8 8 0.47004041 2.52418636
9 9 -0.73437337 0.47071120
10 10 0.48385902 1.37193941
##1. identify files to read in
filesToProcess <- dir(pattern = "file.*\\.txt$")
> filesToProcess
[1] "file1.txt" "file2.txt" "file3.txt" "file4.txt"
##2. Iterate over each of those file names with lapply
listOfFiles <- lapply(filesToProcess, function(x) read.table(x, header = TRUE))
##3. Select columns x and y2 from each of the objects in our list
listOfFiles <- lapply(listOfFiles, function(z) z[c("x", "y2")])
##NOTE: you can combine steps 2 and 3 by passing in the colClasses parameter to read.table.
#That code would be:
listOfFiles <- lapply(filesToProcess, function(x) read.table(x, header = TRUE
, colClasses = c("integer","NULL","numeric")))
##4. Merge all of the objects in the list together with Reduce.
# x is the common columns to join on
out <- Reduce(function(x,y) {merge(x,y, by = "x")}, listOfFiles)
#clean up the column names
colnames(out) <- c("x", sub("\\.txt", "", filesToProcess))
Results in the following:
> out
x file1 file2 file3 file4
1 1 -2.32777987 -0.671934857 -2.32777987 -0.671934857
2 2 -0.60991623 -0.822505224 -0.60991623 -0.822505224
3 3 0.03761497 0.049694686 0.03761497 0.049694686
4 4 -1.65062852 -1.173863215 -1.65062852 -1.173863215
5 5 0.40717932 1.189763270 0.40717932 1.189763270
6 6 -0.64859906 0.610462808 -0.64859906 0.610462808
7 7 -1.27056731 0.928107752 -1.27056731 0.928107752
8 8 2.52418636 -0.856625895 2.52418636 -0.856625895
9 9 0.47071120 -1.290480033 0.47071120 -1.290480033
10 10 1.37193941 -0.235659079 1.37193941 -0.235659079
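The same four steps can be adapted to the layout described in the question (junk lines first, then Col1 to Col5). The sketch below is self-contained and makes assumptions that would need adjusting: it generates three small demo files with three junk lines each, the directory name is hypothetical, and skip = 3 is used where the real files would need their own skip and nrows values.

```r
# Hypothetical demo directory with three small files in the question's layout
demo_dir <- file.path(tempdir(), "txt_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (k in 1:3) {
  lines <- c(rep("XXXXX", 3),
             "Col1 Col2 Col3 Col4 Col5",
             paste(1:4, 34, 35, 36, 30 + k + 0:3))
  writeLines(lines, file.path(demo_dir, sprintf("%02d.txt", k)))
}

# 1. Identify the files
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# 2. and 3. Read each file past the junk lines and keep Col1 and Col5
read_one <- function(f) read.table(f, skip = 3, header = TRUE)[c("Col1", "Col5")]
pieces <- lapply(files, read_one)

# Rename Col5 after its file so the merged columns stay distinguishable
for (i in seq_along(pieces)) {
  tag <- tools::file_path_sans_ext(basename(files[i]))
  names(pieces[[i]])[2] <- paste0("Col5_", tag)
}

# 4. Merge everything on the shared Col1
combined <- Reduce(function(a, b) merge(a, b, by = "Col1"), pieces)
```

Renaming before the merge, rather than after, avoids the automatic .x and .y suffixes that merge() adds when column names clash.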
