crawl excel data automatically based on contents - julia

I want to get data from many excel files with the same format like this:
What I want is to output the ID data from column B to a CSV file. I have many files like this and for each file, the number of columns may not be the same but the ID data will always be in the B column.
Is there a package in Julia that can crawl data in this format? If not, what method should I use?

You can use the XLSX package.
If the file in your screenshot is called JAKE.xlsx and the data shown is in a sheet called DataSheet:
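using XLSX
data = XLSX.readtable("JAKE.xlsx", "DataSheet")
# `data[1]` is a vector of vectors, each holding the data for one column,
# so `data[1][2]` corresponds to column B's data.
# (In XLSX.jl 0.8 and later, readtable returns a DataTable; there this would likely be `data.data[2]`.)
data[1][2]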
This should give you access to a vector with the data you need. After getting the IDs into a vector, you can use the CSV package to create an output file.
If you add a sample xlsx file to your post it might be possible to give you a more complete answer.

Related

How to read macro enabled excel files in R?

I have two Excel files that contain macros; their extensions are .xlsb and .xlsm. I want to read these files into R and reproduce in R exactly what Excel does with the data inputs in these files. What is the way to go about it?
For example: if the excel file calculates house prices in sheet 2 based on data input in sheet 1, how can the same results for house price calculation be obtained in R?
You might take a look at the R package RDCOMClient:
https://github.com/omegahat/RDCOMClient
A worked example is shown here:
https://www.r-bloggers.com/2021/07/rdcomclient-read-and-write-excel-and-call-vba-macro-in-r/

How to put data frame in R including count of complete cases in separate files

I'm new to R. I have a directory containing Excel files, and I need to build a data frame that summarizes the number of complete cases in each file. How can I do this? I tried the following code but it doesn't work. I appreciate your support.
Always begin with the steps required. You will need to do the following:
Read in your data
Clean up your data
Since you do not have any code shown, I will provide you with pseudo code.
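library(readxl)
# `path` is the path to one Excel file; read_excel() handles both .xls and .xlsx
df <- read_excel(path)
# complete.cases() returns a logical vector, so use it to count or to subset rows
n_complete <- sum(complete.cases(df))
df <- df[complete.cases(df), ]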
You'll want to do that for all of your files. You can use lapply once you are more advanced, and loop over your list.files() list of excel files.
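The loop over list.files() mentioned above could be sketched roughly as follows. This is only an illustration: the function name `summarise_complete_cases` and its arguments `dir_path`, `pattern`, and `reader` are hypothetical, and it uses vapply (a type-checked sibling of lapply) so that each file yields exactly one integer count. The reader defaults to readxl::read_excel but can be swapped for any function that takes a file path and returns a data frame.

```r
# Sketch: count complete cases in every matching file of a directory.
# `summarise_complete_cases`, `dir_path`, `pattern`, and `reader` are
# hypothetical names chosen for this example.
summarise_complete_cases <- function(dir_path,
                                     pattern = "\\.xlsx?$",
                                     reader = readxl::read_excel) {
  # Full paths are needed so the reader can open files outside the working directory
  files <- list.files(dir_path, pattern = pattern, full.names = TRUE)

  # One count per file: rows with no missing value in any column
  counts <- vapply(files, function(f) {
    sum(complete.cases(reader(f)))
  }, integer(1))

  data.frame(file = basename(files),
             complete = unname(counts),
             stringsAsFactors = FALSE)
}
```

Calling `summarise_complete_cases("path/to/folder")` would then return one row per Excel file, with the file name and its number of complete rows.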

Saving R dataframe from script

This should be very simple, but I cannot figure it out. Say I create a generic data frame with some data in a loop; let's call it df.
Now I want to assign a specific name to it and save it to a specific destination. I generate two character variables, filename and file_destination, and try to use the following in the script:
assign(filename, df)
save(filename, file = file_destination)
Of course, this saves just a string containing the name to the file, not the actual data.
How do I save the data frame created via assign(filename, df)?
Try save(list=filename,file=file_destination). Also, use better names for your variables. filename for an object which is not a file name is very odd.
Post this as an answer so that other people can find it easily.
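A minimal sketch of the save(list = ...) suggestion, for illustration only. The names `obj_name` and `out_file` are hypothetical stand-ins for the question's `filename` and `file_destination`, and the sample data frame is made up. The `list` argument of save() takes object names as character strings, which is why it works with a name built at run time.

```r
# Sketch: save a data frame under a name chosen at run time.
# `obj_name` and `out_file` are hypothetical stand-ins for the question's
# `filename` and `file_destination`; the data is made up.
df <- data.frame(id = 1:3, value = c("a", "b", "c"))
obj_name <- "results_2024"
out_file <- file.path(tempdir(), "results.RData")

assign(obj_name, df)                    # bind the data frame to the chosen name
save(list = obj_name, file = out_file)  # `list` takes object names as strings

# Load into a fresh environment to confirm what was stored
restored <- new.env()
loaded <- load(out_file, envir = restored)
```

Here `loaded` holds the names of the objects that were restored, and `restored[[obj_name]]` is the data frame itself.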

read a selected column from multiple csv files and combine into one large file in R

Hi,
My task is to read selected columns from over 100 identically formatted .csv files in a folder and cbind them into one large file using R. I have attached a screenshot of a sample data file to this question.
This is the code I'm using:
filenames <- list.files(path="G:\\2014-02-04")
mydata <- do.call("cbind",lapply(filenames,read.csv,skip=12))
My problem is that the first column is the same in every .csv file, so my code creates a big file with duplicated first columns. How can I create one big file with just a single column A (no duplicates)? I would also like to name the second column read from each .csv file using the value of cell B7, which is the specific timestamp of each .csv file.
Can someone help me on this?
Thanks.

How do I merge the headers from one csv file with another csv file in R?

What I'm trying to ask is: how would I use the headers from one csv file as the headers for another csv file? It would be kind of like a merge, except the first csv file is JUST headers and the second csv file has JUST data.
Something as simple as this will work
dn <- read.csv("d-names.txt")
dd <- read.csv("d-data.txt",header=FALSE)
names(dd)<-names(dn)
Just assign the names from one data.frame to the other. Just make sure the files have exactly the same number of columns.
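The column-count caveat can be checked explicitly before assigning the names. The following is a sketch under assumptions: the two temporary files and their contents are made up for illustration, and `check.names = FALSE` is an optional addition that keeps the header text exactly as written instead of letting read.csv adjust it into syntactic names.

```r
# Sketch: apply headers from a header-only CSV to a data-only CSV.
# The temporary files and their contents are made up for illustration.
header_file <- tempfile(fileext = ".csv")
data_file <- tempfile(fileext = ".csv")
writeLines("id,name,score", header_file)
writeLines(c("1,ann,9.5", "2,bob,7.0"), data_file)

# A header-only file reads as a zero-row data frame that still carries the names
dn <- read.csv(header_file, check.names = FALSE)
dd <- read.csv(data_file, header = FALSE)

# Fail early if the two files do not have the same number of columns
stopifnot(ncol(dn) == ncol(dd))
names(dd) <- names(dn)
```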
