stacking multiband rasters separately - r

I have a list of multiband raster files (each raster has several bands), all in one folder. I only want to take some of the bands from one file in this folder. How can I do it?
Thanks in advance
Here is the list of my files:
[1] "./2014069_33UXP_04_05_L8_sr_band2.tif" "./2014069_33UXP_04_05_L8_sr_band3.tif" "./2014069_33UXP_04_05_L8_sr_band4.tif"
[4] "./2014069_33UXP_04_05_L8_sr_band5.tif" "./2014069_33UXP_04_05_L8_sr_band6.tif" "./2014069_33UXP_04_05_L8_sr_band7.tif"
[7] "./2014085_33UXP_04_05_L8_sr_band2.tif" "./2014085_33UXP_04_05_L8_sr_band3.tif" "./2014085_33UXP_04_05_L8_sr_band4.tif"
[10] "./2014085_33UXP_04_05_L8_sr_band5.tif" "./2014085_33UXP_04_05_L8_sr_band6.tif" "./2014085_33UXP_04_05_L8_sr_band7.tif"
[13] "./2014092_33UXP_04_05_L8_sr_band2.tif" "./2014092_33UXP_04_05_L8_sr_band3.tif" "./2014092_33UXP_04_05_L8_sr_band4.tif"
[16] "./2014092_33UXP_04_05_L8_sr_band5.tif" "./2014092_33UXP_04_05_L8_sr_band6.tif" "./2014092_33UXP_04_05_L8_sr_band7.tif"
[19] "./2014108_33UXP_04_05_L8_sr_band2.tif" "./2014108_33UXP_04_05_L8_sr_band3.tif" "./2014108_33UXP_04_05_L8_sr_band4.tif"
[22] "./2014108_33UXP_04_05_L8_sr_band5.tif" "./2014108_33UXP_04_05_L8_sr_band6.tif" "./2014108_33UXP_04_05_L8_sr_band7.tif"
[25] "./2014117_33UXP_04_05_L8_sr_band2.tif" "./2014117_33UXP_04_05_L8_sr_band3.tif" "./2014117_33UXP_04_05_L8_sr_band4.tif"
For example, I want to take bands 2, 3, 4 and 8 from one file.
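One possible approach, assuming the terra package is installed and that the file really does hold several bands: rast() takes a lyrs argument that reads only the requested layers. The 8-band demo file and its name below are made up so the sketch can run on its own. If each .tif in the listing actually holds a single band, as the file names suggest, selecting by file name would be the alternative.

```r
library(terra)

# Build a hypothetical 8-band GeoTIFF so the example is self-contained;
# in practice f would be the path to one of your own multiband files.
r <- rast(nrows = 10, ncols = 10, nlyrs = 8)
values(r) <- seq_len(ncell(r) * nlyr(r))
f <- file.path(tempdir(), "multiband_demo.tif")
writeRaster(r, f, overwrite = TRUE)

# Read only bands 2, 3, 4 and 8 from that single file
sel <- rast(f, lyrs = c(2, 3, 4, 8))
nlyr(sel)
```

The result is a SpatRaster holding just the selected layers, which can then be processed or written out separately.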

Related

List.files based on numbers

I am trying to create a list of files on which I want to run a function. I created a pattern which matches 35 files which I want to use.
mypattern <- paste0("NBS_NLoans_since2009_", seq(1, 35),".xls")
[1] "NBS_NLoans_since2009_1.xls" "NBS_NLoans_since2009_2.xls" "NBS_NLoans_since2009_3.xls" "NBS_NLoans_since2009_4.xls"
[5] "NBS_NLoans_since2009_5.xls" "NBS_NLoans_since2009_6.xls" "NBS_NLoans_since2009_7.xls" "NBS_NLoans_since2009_8.xls"
[9] "NBS_NLoans_since2009_9.xls" "NBS_NLoans_since2009_10.xls" "NBS_NLoans_since2009_11.xls" "NBS_NLoans_since2009_12.xls"
[13] "NBS_NLoans_since2009_13.xls" "NBS_NLoans_since2009_14.xls" "NBS_NLoans_since2009_15.xls" "NBS_NLoans_since2009_16.xls"
[17] "NBS_NLoans_since2009_17.xls" "NBS_NLoans_since2009_18.xls" "NBS_NLoans_since2009_19.xls" "NBS_NLoans_since2009_20.xls"
[21] "NBS_NLoans_since2009_21.xls" "NBS_NLoans_since2009_22.xls" "NBS_NLoans_since2009_23.xls" "NBS_NLoans_since2009_24.xls"
[25] "NBS_NLoans_since2009_25.xls" "NBS_NLoans_since2009_26.xls" "NBS_NLoans_since2009_27.xls" "NBS_NLoans_since2009_28.xls"
[29] "NBS_NLoans_since2009_29.xls" "NBS_NLoans_since2009_30.xls" "NBS_NLoans_since2009_31.xls" "NBS_NLoans_since2009_32.xls"
[33] "NBS_NLoans_since2009_33.xls" "NBS_NLoans_since2009_34.xls" "NBS_NLoans_since2009_35.xls"
Then I used the pattern to get those files from my directory. I got only one file. I have tried different patterns but either I got one file or more than 35 files. Thanks for any suggestion.
list.files(pattern = mypattern)
[1] "NBS_NLoans_since2009_1.xls"
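A possible explanation, offered as a sketch rather than a definitive answer: list.files() expects pattern to be a single regular expression, and the result above suggests that only the first element of the 35-element vector was used. Two ways around this are to write one regular expression that covers 1 to 35, or to keep the exact names and check which of them exist. The demo directory and its 40 dummy files are made up for illustration.

```r
# Hypothetical demo directory with 40 files, so there is something to exclude
demo_dir <- file.path(tempdir(), "nbs_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir,
                                paste0("NBS_NLoans_since2009_", 1:40, ".xls"))))

# Option 1: a single regular expression matching the numbers 1..35
rx <- "^NBS_NLoans_since2009_([1-9]|[12][0-9]|3[0-5])\\.xls$"
by_regex <- list.files(demo_dir, pattern = rx)

# Option 2: build the exact names and keep those that exist on disk
wanted <- paste0("NBS_NLoans_since2009_", seq(1, 35), ".xls")
by_name <- wanted[file.exists(file.path(demo_dir, wanted))]

length(by_regex)
length(by_name)
```

Option 2 also keeps the files in numeric order, because the names are generated from seq(1, 35) rather than returned alphabetically by list.files().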

list files pattern select date

Hello, I have a set of daily meteo data. Using the expression:
f <- list.files(getwd(), include.dirs=TRUE, recursive=TRUE, pattern= "PREC")
I select only the precipitation files.
How can I select only the files for one month, for example January? A file dated 20170103 (yyyymmdd) should match, i.e. any file named yyyy01dd.
The files are named like this: "PREC_20010120.grd".
Try pattern='PREC_\\d{4}01\\d[2].*'.
PREC_ literally
\\d{4} four digits
01 "01" literally
\\d{2} two digits
.* any character repeatedly
Thank you, but I retrieved only 35 items instead of 31 days * 10 years. What's wrong?
[1] "20100102/PREC_20100102.tif" "20100112/PREC_20100112.tif"
[3] "20100122/PREC_20100122.tif" "20110102/PREC_20110102.tif"
[5] "20110112/PREC_20110112.tif" "20110122/PREC_20110122.tif"
[7] "20120102/PREC_20120102.tif" "20120112/PREC_20120112.tif"
[9] "20120122/PREC_20120122.tif" "20130102/PREC_20130102.tif"
[11] "20130112/PREC_20130112.tif" "20130122/PREC_20130122.tif"
[13] "20140102/PREC_20140102.tif" "20140112/PREC_20140112.tif"
[15] "20140122/PREC_20140122.tif" "20150102/PREC_20150102.tif"
[17] "20150112/PREC_20150112.tif" "20150122/PREC_20150122.tif"
[19] "20160102/PREC_20160102.tif" "20160112/PREC_20160112.tif"
[21] "20160122/PREC_20160122.tif" "20170102/PREC_20170102.tif"
[23] "20170112/PREC_20170112.tif" "20170122/PREC_20170122.tif"
[25] "20180102/PREC_20180102.tif" "20180112/PREC_20180112.tif"
[27] "20180122/PREC_20180122.tif" "20190102/PREC_20190102.tif"
[29] "20190112/PREC_20190112.tif" "20190122/PREC_20190122.tif"
[31] "20200102/PREC_20200102.tif" "20200112/PREC_20200112.tif"
[33] "20200122/PREC_20200122.tif" "20210102/PREC_20210102.tif"
[35] "20210112/PREC_20210112.tif" "20210122/PREC_20210122.tif"
Resolved with:
f <- list.files(getwd(), include.dirs=TRUE, recursive=TRUE, pattern='PREC_\\d{4}01.*')
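A likely reason for the short result: in the suggested pattern the day part was written as \\d[2], which means "one digit followed by a literal 2", so only days 02, 12 and 22 match. The form described in the explanation, \\d{2}, means "two digits". The file names below are made up to show the difference with grepl().

```r
# Hypothetical file names: four January days and one February day
files <- c("PREC_20100101.tif", "PREC_20100102.tif", "PREC_20100112.tif",
           "PREC_20100131.tif", "PREC_20100201.tif")

# \\d[2]: a digit followed by the character "2"
bracket <- grepl("PREC_\\d{4}01\\d[2].*", files)

# \\d{2}: any two digits
braces <- grepl("PREC_\\d{4}01\\d{2}.*", files)

data.frame(files, bracket, braces)
```

The pattern in the accepted fix, 'PREC_\\d{4}01.*', drops the day part altogether, which gives the same matches as the \\d{2} version for names of this shape.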

My dataset is huge and I don't know how to make figures with the data as it is

I have RNAseq data from a time-course experiment (6 time points) involving tens of thousands of genes.
I have used filter() from the tidyverse to find genes that fit certain criteria (reference genes for qPCR), but I don't know how to turn this data into a figure easily. Right now, I'd have to change the format of the dataset completely, which would take so long as to be impractical.
The goal is just to have a graph for each gene that shows the change in expression over time for each condition (different leaf pairs and droughted/well-watered). I have done this for some genes in Excel but would like a quicker way to do it.
The dataset is set out like this:
[1] "gene.id" "LP1.2.02:00.WW" "LP1.2.02:00.WW_1" "LP1.2.02:00.WW_2"
[5] "LP1.2.06:00.WW" "LP1.2.06:00.WW_1" "LP1.2.06:00.WW_2" "LP1.2.10:00.WW"
[9] "LP1.2.10:00.WW_1" "LP1.2.10:00.WW_2" "LP1.2.14:00.WW" "LP1.2.14:00.WW_1"
[13] "LP1.2.14:00.WW_2" "LP1.2.18:00.WW" "LP1.2.18:00.WW_1" "LP1.2.18:00.WW_2"
[17] "LP1.2.22:00.WW" "LP1.2.22:00.WW_1" "LP1.2.22:00.WW_2" "LP3.4.5.02:00.WW"
[21] "LP3.4.5.02:00.WW_1" "LP3.4.5.02:00.WW_2" "LP3.4.5.06:00.WW" "LP3.4.5.06:00.WW_1"
[25] "LP3.4.5.06:00.WW_2" "LP3.4.5.10:00.WW" "LP3.4.5.10:00.WW_1" "LP3.4.5.10:00.WW_2"
[29] "LP3.4.5.14:00.WW" "LP3.4.5.14:00.WW_1" "LP3.4.5.14:00.WW_2" "LP3.4.5.18:00.WW"
[33] "LP3.4.5.18:00.WW_1" "LP3.4.5.18:00.WW_2" "LP3.4.5.22:00.WW" "LP3.4.5.22:00.WW_1"
[37] "LP3.4.5.22:00.WW_2" "LP1.2.02:00.Drought" "LP1.2.02:00.Drought_1" "LP1.2.02:00.Drought_2"
[41] "LP1.2.06:00.Drought" "LP1.2.06:00.Drought_1" "LP1.2.06:00.Drought_2" "LP1.2.10:00.Drought"
[45] "LP1.2.10:00.Drought_1" "LP1.2.10:00.Drought_2" "LP1.2.14:00.Drought" "LP1.2.14:00.Drought_1"
[49] "LP1.2.14:00.Drought_2" "LP1.2.18:00.Drought" "LP1.2.18:00.Drought_1" "LP1.2.18:00.Drought_2"
[53] "LP1.2.22:00.Drought" "LP1.2.22:00.Drought_1" "LP1.2.22:00.Drought_2" "LP3.4.5.02:00.Drought"
[57] "LP3.4.5.02:00.Drought_1" "LP3.4.5.02:00.Drought_2" "LP3.4.5.06:00.Drought" "LP3.4.5.06:00.Drought_1"
[61] "LP3.4.5.06:00.Drought_2" "LP3.4.5.10:00.Drought" "LP3.4.5.10:00.Drought_1" "LP3.4.5.10:00.Drought_2"
[65] "LP3.4.5.14:00.Drought" "LP3.4.5.14:00.Drought_1" "LP3.4.5.14:00.Drought_2" "LP3.4.5.18:00.Drought"
[69] "LP3.4.5.18:00.Drought_1" "LP3.4.5.18:00.Drought_2" "LP3.4.5.22:00.Drought." "LP3.4.5.22:00.Drought"
[73] "LP3.4.5.22:00.Drought_1" "X74" "LP1.2.02:00.WW.mean" "LP1.2.06:00.WW.mean"
[77] "LP1.2.10:00.WW.mean" "LP1.2.14:00.WW.mean" "LP1.2.18:00.WW.mean" "LP1.2.22:00.WW.mean"
[81] "LP1.2.02:00.drought.mean" "LP1.2.06:00.drought.mean" "LP1.2.10:00.drought.mean" "LP1.2.14:00.drought.mean"
[85] "LP1.2.18:00.drought.mean" "LP1.2.22:00.drought.mean" "LP3.4.5.02:00.WW.mean" "LP3.4.5.06:00.WW.mean"
[89] "LP3.4.5.10:00.WW.mean" "LP3.4.5.14:00.WW.mean" "LP3.4.5.18:00.WW.mean" "LP3.4.5.22:00.WW.mean"
[93] "LP3.4.5.02:00.drought.mean" "LP3.4.5.06:00.drought.mean" "LP3.4.5.10:00.drought.mean" "LP3.4.5.14:00.drought.mean"
[97] "LP3.4.5.18:00.drought.mean" "LP3.4.5.22:00.drought.mean"
It's a lot of headings, and as you can see from the titles, they contain the time, leaf pairs and condition. So, I'm not sure how to translate this into an x~y graph.
I've had several thoughts including trying to divide conditions into different subsets (LP1.2. WW/ LP.1.2.D/LP3.4.5.WW/LP.3.4.5.D) and making a subset for Time (02:00, 06:00, etc.) and trying to make a graph for that.
#make subset for the time points
Time <- c("02:00", "06:00", "10:00", "14:00", "18:00", "22:00")
#make subsets for each condition (LP1.2. WW/ LP.1.2.D/LP3.4.5.WW/LP.3.4.5.D)
LP1.2.WW.mean <- as.matrix(KG_graph_data[c( "LP1.2.02:00.WW.mean",
"LP1.2.06:00.WW.mean",
"LP1.2.10:00.WW.mean",
"LP1.2.14:00.WW.mean",
"LP1.2.18:00.WW.mean",
"LP1.2.22:00.WW.mean",
"gene.id")])
LP.1.2.D.mean <-
as.matrix(KG_graph_data[c("LP1.2.02:00.drought.mean",
"LP1.2.06:00.drought.mean",
"LP1.2.10:00.drought.mean",
"LP1.2.14:00.drought.mean",
"LP1.2.18:00.drought.mean",
"LP1.2.22:00.drought.mean",
"gene.id")])
LP345.WW.mean <- as.matrix((KG_graph_data[c("LP3.4.5.02:00.WW.mean",
"LP3.4.5.06:00.WW.mean",
"LP3.4.5.10:00.WW.mean",
"LP3.4.5.14:00.WW.mean",
"LP3.4.5.18:00.WW.mean",
"LP3.4.5.22:00.WW.mean",
"gene.id")]))
LP345.D.mean <-
as.matrix(KG_graph_data[c("LP3.4.5.02:00.drought.mean",
"LP3.4.5.06:00.drought.mean",
"LP3.4.5.10:00.drought.mean",
"LP3.4.5.14:00.drought.mean",
"LP3.4.5.18:00.drought.mean",
"LP3.4.5.22:00.drought.mean",
"gene.id")])
I tried extracting a particular gene from each matrix so that I could plot a graph from it, but this only worked when I used a single matrix, and even then the resulting table contained no data.
Total_KgGene007565 <- subset(LP1.2.WW.mean, "gene.id"=="KgGene007565",
LP.1.2.D.mean, "gene.id"=="KgGene007565",
LP345.WW.mean, "gene.id"=="KgGene007565",
LP345.D.mean, "gene.id"="KgGene007565")
I am not sure how to proceed from here or if this was the wrong way to approach this.
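One way this could be approached, sketched under assumptions: reshape the .mean columns into long format with tidyr::pivot_longer(), splitting each column name into leaf pair, time and condition, then plot with ggplot2. The toy data frame, the gene names geneA and geneB, and the new column names (leaf_pair, time, condition, expression) are all made up for illustration. The regular expression assumes the .mean columns follow the naming shown in the question. As a side note, "gene.id"=="KgGene007565" compares two literal strings and is always FALSE, which would explain the empty table.

```r
library(tidyr)
library(ggplot2)

# Toy stand-in for KG_graph_data: two made-up genes and the 24 .mean columns
times <- c("02:00", "06:00", "10:00", "14:00", "18:00", "22:00")
grid <- expand.grid(lp = c("LP1.2", "LP3.4.5"), time = times,
                    cond = c("WW", "drought"), stringsAsFactors = FALSE)
col_names <- paste(grid$lp, grid$time, grid$cond, "mean", sep = ".")
set.seed(1)
toy <- data.frame(gene.id = c("geneA", "geneB"),
                  matrix(runif(2 * length(col_names)), nrow = 2),
                  check.names = FALSE)
names(toy)[-1] <- col_names

# Wide to long: one row per gene, leaf pair, time point and condition
long <- pivot_longer(
  toy,
  cols = ends_with(".mean"),
  names_to = c("leaf_pair", "time", "condition"),
  names_pattern = "^(LP[0-9.]+)\\.([0-9]{2}:[0-9]{2})\\.(WW|drought)\\.mean$",
  values_to = "expression"
)

# Expression over time for one gene, one panel per leaf pair
p <- ggplot(long[long$gene.id == "geneA", ],
            aes(x = time, y = expression,
                colour = condition, group = condition)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ leaf_pair) +
  labs(title = "geneA")
print(p)
```

Once the data is in long format, the same plotting code can be run for any gene by changing the filter, or looped over a vector of gene ids.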

How to turn rvest output into table

Brand new to R, so I'll try my best to explain this.
I've been playing with data scraping using the "rvest" package. In this example, I'm scraping US state populations from a table on Wikipedia. The code I used is:
library(rvest)
statepop = read_html("https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population")
forecasthtml = html_nodes(statepop, "td")
forecasttext = html_text(forecasthtml)
forecasttext
The resulting output was as follows:
[2] "7000100000000000000♠1"
[3] " California"
[4] "39,250,017"
[5] "37,254,503"
[6] "7001530000000000000♠53"
[7] "738,581"
[8] "702,905"
[9] "12.15%"
[10] "7000200000000000000♠2"
[11] "7000200000000000000♠2"
[12] " Texas"
[13] "27,862,596"
[14] "25,146,105"
[15] "7001360000000000000♠36"
[16] "763,031"
[17] "698,487"
[18] "8.62%"
How can I turn these strings of text into a table that is set up similar to the way it is presented on the original Wikipedia page (with columns, rows, etc)?
Try using rvest's html_table function.
Note that there are five tables on the page, so you will need to specify which table you would like to parse.
library(rvest)
statepop = read_html("https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population")
#find all of the tables on the page
tables<-html_nodes(statepop, "table")
#convert the first table into a data frame
#(tables[[1]] extracts the node itself; tables[1] would return a list holding one data frame)
table1<-html_table(tables[[1]])
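To try the idea without hitting Wikipedia, here is a small offline sketch. The inline HTML is made up and only imitates the shape of the population table. It also shows one way to turn the comma-formatted numbers into numeric values, which html_table() leaves as text.

```r
library(rvest)

# Small inline page standing in for the Wikipedia article (made-up markup)
page <- read_html('
  <table>
    <tr><th>State</th><th>Population</th></tr>
    <tr><td>California</td><td>39,250,017</td></tr>
    <tr><td>Texas</td><td>27,862,596</td></tr>
  </table>')

tables <- html_nodes(page, "table")
pop <- html_table(tables[[1]])   # [[1]] yields a single data frame

# Thousands separators keep the column as text, so strip them before converting
pop$Population <- as.numeric(gsub(",", "", pop$Population))
pop
```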

R sort list of files numerically [duplicate]

I have a list of files that I need to sort numerically, such that I can import them in order
my code is:
bed = '/files/coverage_v2'
beds=list.files(path=bed, pattern='ctcf.motif.minus[0-9]+.bed.IGTB950.bed')
for(b in beds){
print(b)
read.table(b)
}
> [1] "ctcf.motif.minus1.bed.IGTB950.bed" "ctcf.motif.minus10.bed.IGTB950.bed"
[3] "ctcf.motif.minus100.bed.IGTB950.bed" "ctcf.motif.minus101.bed.IGTB950.bed"
[5] "ctcf.motif.minus102.bed.IGTB950.bed" "ctcf.motif.minus103.bed.IGTB950.bed"
[7] "ctcf.motif.minus104.bed.IGTB950.bed" "ctcf.motif.minus105.bed.IGTB950.bed"
[9] "ctcf.motif.minus106.bed.IGTB950.bed" "ctcf.motif.minus107.bed.IGTB950.bed"
[11] "ctcf.motif.minus108.bed.IGTB950.bed" "ctcf.motif.minus109.bed.IGTB950.bed"
[13] "ctcf.motif.minus11.bed.IGTB950.bed" "ctcf.motif.minus110.bed.IGTB950.bed"
[15] "ctcf.motif.minus111.bed.IGTB950.bed" "ctcf.motif.minus112.bed.IGTB950.bed"
[17] "ctcf.motif.minus113.bed.IGTB950.bed" "ctcf.motif.minus114.bed.IGTB950.bed"
[19] "ctcf.motif.minus115.bed.IGTB950.bed" "ctcf.motif.minus116.bed.IGTB950.bed"
[21] "ctcf.motif.minus117.bed.IGTB950.bed" "ctcf.motif.minus118.bed.IGTB950.bed"
[23] "ctcf.motif.minus119.bed.IGTB950.bed" "ctcf.motif.minus12.bed.IGTB950.bed"
[25] "ctcf.motif.minus120.bed.IGTB950.bed" "ctcf.motif.minus121.bed.IGTB950.bed"
[27] "ctcf.motif.minus122.bed.IGTB950.bed" "ctcf.motif.minus123.bed.IGTB950.bed"
[29] "ctcf.motif.minus124.bed.IGTB950.bed" "ctcf.motif.minus125.bed.IGTB950.bed"
[31] "ctcf.motif.minus126.bed.IGTB950.bed" "ctcf.motif.minus127.bed.IGTB950.bed"
[33] "ctcf.motif.minus128.bed.IGTB950.bed" "ctcf.motif.minus129.bed.IGTB950.bed"
[35] "ctcf.motif.minus13.bed.IGTB950.bed" "ctcf.motif.minus130.bed.IGTB950.bed"
[37] "ctcf.motif.minus131.bed.IGTB950.bed" "ctcf.motif.minus132.bed.IGTB950.bed"
[39] "ctcf.motif.minus133.bed.IGTB950.bed" "ctcf.motif.minus134.bed.IGTB950.bed"
But what I really want is for it to be sorted numerically:
> "ctcf.motif.minus1.bed.IGTB950.bed"
"ctcf.motif.minus10.bed.IGTB950.bed"
"ctcf.motif.minus11.bed.IGTB950.bed"
"ctcf.motif.minus12.bed.IGTB950.bed"
"ctcf.motif.minus13.bed.IGTB950.bed"
"ctcf.motif.minus100.bed.IGTB950.bed"
"ctcf.motif.minus101.bed.IGTB950.bed"
etc, so that it will be imported numerically.
Thanks in advance!!
You could try mixedsort from gtools
library(gtools)
beds1 <- mixedsort(beds)
head(beds1)
#[1]"ctcf.motif.minus1.bed.IGTB950.bed" "ctcf.motif.minus10.bed.IGTB950.bed"
#[3]"ctcf.motif.minus11.bed.IGTB950.bed" "ctcf.motif.minus12.bed.IGTB950.bed"
#[5]"ctcf.motif.minus13.bed.IGTB950.bed" "ctcf.motif.minus100.bed.IGTB950.bed"
Or using regex (assuming that the order depends on the numbers after 'minus' and before 'bed'):
beds[order(as.numeric(gsub('\\D+|\\.bed.*', '', beds)))]
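A base-R variant of the same idea, as a sketch: capture the number between 'minus' and '.bed' explicitly with sub() and order on it. The sample vector is made up (it includes a minus2 file that does not appear in the listing above) to make the numeric ordering visible.

```r
# Hypothetical sample, in the alphabetical order list.files() would return
beds <- c("ctcf.motif.minus1.bed.IGTB950.bed",
          "ctcf.motif.minus10.bed.IGTB950.bed",
          "ctcf.motif.minus100.bed.IGTB950.bed",
          "ctcf.motif.minus11.bed.IGTB950.bed",
          "ctcf.motif.minus2.bed.IGTB950.bed")

# Pull out the digits that follow "minus" and precede the first ".bed"
idx <- as.integer(sub("^.*minus([0-9]+)\\.bed.*$", "\\1", beds))

# Reorder the file names by that number
beds_sorted <- beds[order(idx)]
beds_sorted
```

When the files are not in the working directory, passing full.names = TRUE to list.files() returns paths that read.table() can open directly; the same sub() call still works because it only looks at the part after 'minus'.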
