Apply the same subset to multiple datasets in R

I am pretty new to R, and I am working with several datasets that contain the same kind of data, each from a different day.
For my analysis I only need some specific columns from each dataset, so I created a new dataset containing only those columns (I do not want to overwrite or delete the original dataset). I am using the following code to do this:
subset01012018 <- dataset01012018[, c(1, 2, 3, 4, 10, 11, 14, 15, 16)]
Now I want to apply the same to all the datasets. How could I do something like this? Could I do this with a for loop? Or do I need an apply function?
Hope someone can help me!
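A for loop would work, but collecting the data frames in a named list and using lapply is usually tidier. Here is a minimal sketch, assuming the datasets already exist in the workspace under names like dataset01012018. The two small data frames and the helper make_day are hypothetical stand-ins, and the results are kept in a list rather than as separate subset... objects.

```r
# Hypothetical stand-ins for the real daily datasets (16 columns each).
make_day <- function() as.data.frame(matrix(seq_len(32), nrow = 2, ncol = 16))
dataset01012018 <- make_day()
dataset02012018 <- make_day()

# Column positions to keep, taken from the question.
keep <- c(1, 2, 3, 4, 10, 11, 14, 15, 16)

# Collect every object named "dataset" + 8 digits into one named list.
datasets <- mget(ls(pattern = "^dataset[0-9]{8}$"))

# Apply the same column selection to each data frame; the originals are untouched.
subsets <- lapply(datasets, function(df) df[, keep])

# Rename the list elements to mirror the naming used in the question.
names(subsets) <- sub("^dataset", "subset", names(subsets))
```

Each subset is then available as, for example, subsets$subset01012018 or subsets[["subset01012018"]].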

Related

Comparing two lists in R

Hi, I have two nearly identical data sets, but each has some values the other doesn't, and I'm trying to compare them in R. I want to create a list of the observations that are not shared between the two data sets, but I'm struggling with how to do this. I'm relatively new to R.
You could try the arsenal package:
install.packages("arsenal")
library(arsenal)
captureVariable <- summary(arsenal::comparedf(list1,list2))
captureVariable[["diffs.byvar.table"]]
There are some other helpful outputs that will be captured by captureVariable if that particular table doesn't suit your needs.
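If installing a package is not an option, the unshared rows can also be found with base R. This is only a sketch, assuming both data frames have the same columns in the same order. The data frames a and b and the helper row_key are made up for illustration.

```r
# Hypothetical example data: rows with id 2 and 3 are shared.
a <- data.frame(id = c(1, 2, 3), value = c("x", "y", "z"))
b <- data.frame(id = c(2, 3, 4), value = c("y", "z", "w"))

# Build one string key per row by pasting all columns together.
# Assumes "|" does not occur inside the values themselves.
row_key <- function(df) do.call(paste, c(df, sep = "|"))

# Rows present in one data frame but not in the other.
only_in_a <- a[!(row_key(a) %in% row_key(b)), ]
only_in_b <- b[!(row_key(b) %in% row_key(a)), ]

# Both sets of unshared observations in a single list.
not_shared <- list(only_in_a = only_in_a, only_in_b = only_in_b)
```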

Working on the Titanic dataset and using Sm.binning to categorize continuous variables into categorical

Is there a way to automate the process of categorizing, like the function cut does, when using sm.binning? https://imgur.com/a/jy637
Basically, I want a new variable that automatically puts each value into the right category, without writing a bunch of nested ifelse statements.
Thanks!!
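One possible route is cut itself: it accepts a vector of break points, so if the cut points found by the binning step are available as a numeric vector, they can be passed straight in. A small sketch follows. The ages and the cut points below are hypothetical values, not output from the binning package.

```r
# Hypothetical continuous variable (e.g. passenger age).
age <- c(5, 17, 30, 64, 80)

# Hypothetical cut points, standing in for whatever the binning step produced.
cuts <- c(16, 32, 60)

# Pad with -Inf and Inf so every value falls into some interval.
# Intervals are right-closed by default: (-Inf,16], (16,32], (32,60], (60,Inf].
age_bin <- cut(age, breaks = c(-Inf, cuts, Inf))

# Count how many observations landed in each category.
table(age_bin)
```

The result is a factor, so it can be added to the data frame as a new column with no ifelse chain.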

R pivot table with multiple column levels

I would be grateful if anyone could tell me how to create a pivot table in R like the one in Python pandas, with a selected aggregation function and more than one level in the columns.
I would like to get something in R like this in Python:
Iris.pivot_table(index='Sepal.Length',columns=['Sepal.Width','Species'],values='Petal.Length',aggfunc=sum)
I know there is the pivottabler package, but its default method of rendering to HTML is too slow for somewhat larger tables.
I have also found the ftable function from the stats package, but it is only for contingency tables, in which I can't specify my own aggregation function.
Thank you.
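Base R can get close to this with tapply, which takes any aggregation function. Below is a rough sketch on the built-in iris data. Combining the two column variables with interaction is one way to emulate the column levels, not an exact equivalent of the pandas output.

```r
# Rows: Sepal.Length. Columns: every observed Sepal.Width/Species combination.
# Any function can replace sum (mean, median, a custom function, ...).
pivot <- with(iris, tapply(
  Petal.Length,
  list(Sepal.Length, interaction(Sepal.Width, Species, drop = TRUE)),
  sum
))

# Empty combinations are NA; column names look like "3.5.setosa".
pivot[1:3, 1:4]

# For sums specifically, xtabs() with a left-hand side adds up that variable
# per cell, and ftable() prints the result with two header levels.
wide <- ftable(
  xtabs(Petal.Length ~ Sepal.Length + Sepal.Width + Species, data = iris),
  row.vars = "Sepal.Length"
)
```

Note that the xtabs version shows 0 rather than NA for empty cells and only supports summation.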

Combining longitudinal data in separate CSVs into one with year field?

I am very familiar with Excel but new to R. I have several years' worth of data across multiple spreadsheets:
data1996.csv
data1997.csv
...
data2013.csv
Each csv is about 500,000 rows by 1700 columns.
I want to manipulate this data in D3 and plan to remove columns that are not essential to calculation. My goal is to create a slider with years that will create a corresponding visualization. I want to know what the easiest way is to aggregate these massive datasets. I suppose it could be done manually, but this would prove cumbersome and inefficient.
I am very new to R, but if there is another means to aggregate the data into a single CSV, that would work fine as well.
Any help and suggestions are appreciated.
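In R this can be done by reading each file, adding a year column taken from the file name, and stacking the results. The sketch below uses three tiny CSVs written to a temporary folder as hypothetical stand-ins for data1996.csv and the rest. The column names id and value and the helper read_year are made up.

```r
# Create small stand-in files so the example runs anywhere.
tmp <- tempfile("csvdemo")
dir.create(tmp)
for (yr in 1996:1998) {
  write.csv(data.frame(id = 1:2, value = yr + c(0.1, 0.2)),
            file.path(tmp, sprintf("data%d.csv", yr)), row.names = FALSE)
}

# Find every file named data<4 digits>.csv.
files <- list.files(tmp, pattern = "^data[0-9]{4}\\.csv$", full.names = TRUE)

# Read one file and tag its rows with the year from the file name.
read_year <- function(path) {
  df <- read.csv(path)
  df$year <- as.integer(sub("^data([0-9]{4})\\.csv$", "\\1", basename(path)))
  df
}

# Stack all years into one data frame and write a single CSV.
combined <- do.call(rbind, lapply(files, read_year))
write.csv(combined, file.path(tmp, "combined.csv"), row.names = FALSE)
```

With files of 500,000 rows by 1700 columns, it would probably help to drop the unneeded columns inside read_year before stacking, so that only the essential columns are held in memory.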

How to use princomp() or prcomp() functions in R with large datasets, without transposing the data?

I have just started learning PCA and I wish to use it on a huge microarray dataset with more than 400,000 rows. My columns are samples and my rows are genes/loci. I went through some tutorials on PCA and came across princomp() and prcomp(), among others.
From what I have read, in order to plot the "samples" in the biplot I need to have them in the rows and the genes/loci in the columns, so I would have to transpose my data before using it for PCA.
However, since there are more than 400,000 rows, I am not able to transpose them into columns, because the number of columns is limited. So my question is: is there any way to perform a PCA on my data without transposing it, using these R functions? If not, can anyone suggest another way to do this?
Why do you hate to transpose your data? It's easy!
If you read your data into R (for example as the matrix microarray.data), you can transpose it with a single command:
transposed.microarray.data<-t(microarray.data)
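The transposed matrix can go straight into prcomp. Here is a small sketch with a random matrix standing in for the real data (1000 hypothetical genes by 6 hypothetical samples). An R matrix does not have a spreadsheet-style column limit, so the same calls should in principle work with far more genes, memory permitting.

```r
# Hypothetical stand-in for the microarray matrix: genes in rows, samples in columns.
set.seed(1)
microarray.data <- matrix(
  rnorm(1000 * 6), nrow = 1000,
  dimnames = list(paste0("gene", 1:1000), paste0("sample", 1:6))
)

# Transpose so that samples become rows, then run the PCA.
pca <- prcomp(t(microarray.data), center = TRUE)

# One row of scores per sample; these are the points to plot.
scores <- pca$x
plot(scores[, 1], scores[, 2], xlab = "PC1", ylab = "PC2")
```

prcomp is used here rather than princomp because princomp requires more observations than variables, which does not hold when there are far more genes than samples.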
