List.files based on numbers - r

I am trying to create a list of files on which I want to run a function. I created a pattern which matches 35 files which I want to use.
mypattern <- paste0("NBS_NLoans_since2009_", seq(1, 35),".xls")
[1] "NBS_NLoans_since2009_1.xls" "NBS_NLoans_since2009_2.xls" "NBS_NLoans_since2009_3.xls" "NBS_NLoans_since2009_4.xls"
[5] "NBS_NLoans_since2009_5.xls" "NBS_NLoans_since2009_6.xls" "NBS_NLoans_since2009_7.xls" "NBS_NLoans_since2009_8.xls"
[9] "NBS_NLoans_since2009_9.xls" "NBS_NLoans_since2009_10.xls" "NBS_NLoans_since2009_11.xls" "NBS_NLoans_since2009_12.xls"
[13] "NBS_NLoans_since2009_13.xls" "NBS_NLoans_since2009_14.xls" "NBS_NLoans_since2009_15.xls" "NBS_NLoans_since2009_16.xls"
[17] "NBS_NLoans_since2009_17.xls" "NBS_NLoans_since2009_18.xls" "NBS_NLoans_since2009_19.xls" "NBS_NLoans_since2009_20.xls"
[21] "NBS_NLoans_since2009_21.xls" "NBS_NLoans_since2009_22.xls" "NBS_NLoans_since2009_23.xls" "NBS_NLoans_since2009_24.xls"
[25] "NBS_NLoans_since2009_25.xls" "NBS_NLoans_since2009_26.xls" "NBS_NLoans_since2009_27.xls" "NBS_NLoans_since2009_28.xls"
[29] "NBS_NLoans_since2009_29.xls" "NBS_NLoans_since2009_30.xls" "NBS_NLoans_since2009_31.xls" "NBS_NLoans_since2009_32.xls"
[33] "NBS_NLoans_since2009_33.xls" "NBS_NLoans_since2009_34.xls" "NBS_NLoans_since2009_35.xls"
Then I used the pattern to get those files from my directory. I got only one file. I have tried different patterns but either I got one file or more than 35 files. Thanks for any suggestion.
list.files(pattern = mypattern)
[1] "NBS_NLoans_since2009_1.xls"
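A possible explanation, offered as a sketch rather than a confirmed diagnosis: the pattern argument of list.files expects a single regular expression, so a vector of 35 names is apparently reduced to its first element. The example below uses a hypothetical temporary directory (demo_dir) with 40 empty files to show two ways of getting exactly the 35 wanted names.

```r
# Hypothetical setup: a temporary directory holding 40 empty files.
demo_dir <- tempfile("nbs_demo")
dir.create(demo_dir)
invisible(file.create(
  file.path(demo_dir, paste0("NBS_NLoans_since2009_", 1:40, ".xls"))
))

wanted <- paste0("NBS_NLoans_since2009_", 1:35, ".xls")

# Option 1: list everything, then keep only the exact names we want.
found <- intersect(wanted, list.files(demo_dir))

# Option 2: collapse the numbers into one anchored regular expression,
# e.g. "^NBS_NLoans_since2009_(1|2|...|35)\\.xls$".
one_regex <- paste0("^NBS_NLoans_since2009_(",
                    paste(1:35, collapse = "|"),
                    ")\\.xls$")
found_regex <- list.files(demo_dir, pattern = one_regex)

length(found)
length(found_regex)
```

Option 1 keeps the files in numeric order (the order of wanted), while list.files returns them sorted alphabetically.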

Related

list files pattern select date

Hello, I have a set of daily meteorological data. Using the expression:
f <- list.files(getwd(), include.dirs=TRUE, recursive=TRUE, pattern= "PREC")
I select only the precipitation files.
I wonder how to select only the files for a given month, for example January: a file dated 20170103 (yyyymmdd), that is, every file named yyyy01dd.
The files are named like this: "PREC_20010120.grd".
Try pattern='PREC_\\d{4}01\\d[2].*'.
PREC_ literally
\\d{4} four digits
01 "01" literally
\\d{2} two digits
.* any character repeatedly
Thank you, but I retrieved only 35 items instead of 31 days * 10 years. What's wrong?
[1] "20100102/PREC_20100102.tif" "20100112/PREC_20100112.tif"
[3] "20100122/PREC_20100122.tif" "20110102/PREC_20110102.tif"
[5] "20110112/PREC_20110112.tif" "20110122/PREC_20110122.tif"
[7] "20120102/PREC_20120102.tif" "20120112/PREC_20120112.tif"
[9] "20120122/PREC_20120122.tif" "20130102/PREC_20130102.tif"
[11] "20130112/PREC_20130112.tif" "20130122/PREC_20130122.tif"
[13] "20140102/PREC_20140102.tif" "20140112/PREC_20140112.tif"
[15] "20140122/PREC_20140122.tif" "20150102/PREC_20150102.tif"
[17] "20150112/PREC_20150112.tif" "20150122/PREC_20150122.tif"
[19] "20160102/PREC_20160102.tif" "20160112/PREC_20160112.tif"
[21] "20160122/PREC_20160122.tif" "20170102/PREC_20170102.tif"
[23] "20170112/PREC_20170112.tif" "20170122/PREC_20170122.tif"
[25] "20180102/PREC_20180102.tif" "20180112/PREC_20180112.tif"
[27] "20180122/PREC_20180122.tif" "20190102/PREC_20190102.tif"
[29] "20190112/PREC_20190112.tif" "20190122/PREC_20190122.tif"
[31] "20200102/PREC_20200102.tif" "20200112/PREC_20200112.tif"
[33] "20200122/PREC_20200122.tif" "20210102/PREC_20210102.tif"
[35] "20210112/PREC_20210112.tif" "20210122/PREC_20210122.tif"
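The suggested pattern contains \\d[2] where \\d{2} was intended. \\d[2] matches one digit followed by a literal 2, which is why only days 02, 12 and 22 were returned. Resolved with: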
f <- list.files(getwd(), include.dirs=TRUE, recursive=TRUE, pattern='PREC_\\d{4}01.*')
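The difference between the two quantifier spellings in this thread can be checked without touching the file system. The sketch below applies both patterns to a small, made-up vector of file names (the names are illustrative, following the PREC_yyyymmdd scheme from the question).

```r
# Made-up file names following the PREC_yyyymmdd naming scheme.
files <- c("PREC_20100102.tif", "PREC_20100115.tif", "PREC_20100131.tif",
           "PREC_20100201.tif", "PREC_20110112.tif")

# "\\d[2]" is one digit followed by a literal "2" (a one-character class).
day_ending_in_2 <- grepl("PREC_\\d{4}01\\d[2]", files)

# "\\d{2}" is any two digits, i.e. any day of the month.
any_january_day <- grepl("PREC_\\d{4}01\\d{2}", files)

files[day_ending_in_2]   # only the January days 02 and 12
files[any_january_day]   # every January file, but not the February one
```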

stacking multiband rasters separately

I have a list of multiband raster files (each raster has several bands), all in one folder. I want to take only some bands of one file in this folder. How can I do it?
Thanks in advance.
Here is the list of my files:
[1] "./2014069_33UXP_04_05_L8_sr_band2.tif" "./2014069_33UXP_04_05_L8_sr_band3.tif" "./2014069_33UXP_04_05_L8_sr_band4.tif"
[4] "./2014069_33UXP_04_05_L8_sr_band5.tif" "./2014069_33UXP_04_05_L8_sr_band6.tif" "./2014069_33UXP_04_05_L8_sr_band7.tif"
[7] "./2014085_33UXP_04_05_L8_sr_band2.tif" "./2014085_33UXP_04_05_L8_sr_band3.tif" "./2014085_33UXP_04_05_L8_sr_band4.tif"
[10] "./2014085_33UXP_04_05_L8_sr_band5.tif" "./2014085_33UXP_04_05_L8_sr_band6.tif" "./2014085_33UXP_04_05_L8_sr_band7.tif"
[13] "./2014092_33UXP_04_05_L8_sr_band2.tif" "./2014092_33UXP_04_05_L8_sr_band3.tif" "./2014092_33UXP_04_05_L8_sr_band4.tif"
[16] "./2014092_33UXP_04_05_L8_sr_band5.tif" "./2014092_33UXP_04_05_L8_sr_band6.tif" "./2014092_33UXP_04_05_L8_sr_band7.tif"
[19] "./2014108_33UXP_04_05_L8_sr_band2.tif" "./2014108_33UXP_04_05_L8_sr_band3.tif" "./2014108_33UXP_04_05_L8_sr_band4.tif"
[22] "./2014108_33UXP_04_05_L8_sr_band5.tif" "./2014108_33UXP_04_05_L8_sr_band6.tif" "./2014108_33UXP_04_05_L8_sr_band7.tif"
[25] "./2014117_33UXP_04_05_L8_sr_band2.tif" "./2014117_33UXP_04_05_L8_sr_band3.tif" "./2014117_33UXP_04_05_L8_sr_band4.tif"
For example, I want to take bands 2, 3, 4 and 8 from one file.
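The listing above appears to hold one file per band rather than true multiband files, so one possible approach (a sketch under that assumption) is to pick the wanted bands of one scene by file name. The scene ID and band numbers below are illustrative; band 8 does not occur in the listing, so it is simply not matched.

```r
# A few names copied from the listing above.
files <- c("./2014069_33UXP_04_05_L8_sr_band2.tif",
           "./2014069_33UXP_04_05_L8_sr_band3.tif",
           "./2014069_33UXP_04_05_L8_sr_band4.tif",
           "./2014069_33UXP_04_05_L8_sr_band5.tif",
           "./2014085_33UXP_04_05_L8_sr_band2.tif")

scene <- "2014069"      # acquisition to pick (illustrative choice)
bands <- c(2, 3, 4, 8)  # wanted bands; band 8 has no file in the listing

# Builds "^2014069_.*_band(2|3|4|8)\\.tif$" and matches it against the
# file names with their directory part removed.
band_regex <- paste0("^", scene, "_.*_band(",
                     paste(bands, collapse = "|"), ")\\.tif$")
selected <- files[grepl(band_regex, basename(files))]

selected
# The selected paths could then be handed to a stacking function such as
# raster::stack(selected); that step is not run here.
```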

Issue with encoding of Cyrillic character strings

I have some Cyrillic strings in my data frame that I can't manage to read accurately.
This is how the dataframe looks after I load the csv:
unique(transactions$orders)
[1] "ÌÈÏÑ-ÏÏ30Å" "ÈÍÒ-ÏÏ30Å" "ÊÈÁÑ-ÏÏ30Å" "ÊÈÁÑ-ÏÏ50Å" "ÌÈÏÑ-ÏÏ50Å" "ÊÈÁÑ-ÏÏ53Å" "ÈÍÒ-ÏÏ53Å"
[8] "ÌÈÏÑ-ÏÏ53Å" "ÈÍÒ-ÏÏ30" "ÊÈÁÑ-ÏÏ30" "ÌÈÏÑ-ÏÏ30" "ÌÈÏÑ-ÏÏ50" "ÈÍÒ-ÏÏ10" "ÊÈÁÑ-ÏÏ50"
[15] "ÈÍÒ-ÏÏ40" "ÊÈÁÑ-ÏÏ53" "ÈÍÒ-ÏÏ53" "ÌÈÏÑ-ÏÏ53" "ÊÈÁÑ-ÏÏ10" "ÈÍÒ-ÏÏ30Ï" "ÊÈÁÑ-ÏÏ50Ï"
[22] "ÌÈÏÑ-ÏÏ30Ï" "ÊÈÁÑ-ÏÏ30Ï" "ÌÈÏÑ-ÏÏ50Ï" "ÈÍÒ-ÏÏ50" "ÌÈÏÑ-ÏÏ10" "ÊÈÁÑ-ÏÏ53Ï" "ÈÍÒ-ÏÏ53Ï"
Any ideas how I can fix this?
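The garbled strings look like Windows-1251 (CP1251) bytes that were decoded as Latin-1. That is an assumption based on the output above, not something the question confirms. Under that assumption, reinterpreting the bytes as CP1251 and converting them to UTF-8 should recover the text. The sketch below rebuilds the bytes of the first garbled value and converts them with iconv; the encoding name "CP1251" depends on the platform's iconv.

```r
# Bytes of "МИПС-ПП30" in Windows-1251; decoded as Latin-1 they display
# as the garbled form seen in the output above.
cp1251_bytes <- as.raw(c(0xCC, 0xC8, 0xCF, 0xD1, 0x2D, 0xCF, 0xCF, 0x33, 0x30))
garbled <- rawToChar(cp1251_bytes)

# Reinterpret the same bytes as CP1251 and convert them to UTF-8.
fixed <- iconv(garbled, from = "CP1251", to = "UTF-8")

fixed
```

If the assumption holds, declaring the encoding while loading the file, for example read.csv("transactions.csv", fileEncoding = "CP1251") with a hypothetical file name, may avoid the problem at the source.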

Case insensitive sort of vector of string in R

I have the following vector:
mylist <- c("MBT.LN.ID", "ISA51VG.LN.ID", "R848.LN.ID", "sHz.LN.ID", "FK565.LN.ID",
"bCD.LN.ID", "MALP2s.LN.ID", "ADX.LN.ID", "AddaVax.LN.ID", "FCA.LN.ID",
"Pam3CSK4.LN.ID", "D35.LN.ID", "ALM.LN.ID", "K3.LN.ID", "K3SPG.LN.ID",
"MPLA.LN.ID", "DMXAA.LN.ID", "cGAMP.LN.ID", "Poly_IC.LN.ID",
"cdiGMP.LN.ID")
I'd like to sort them alphabetically in a case-insensitive manner.
The expected output is this:
[1] "AddaVax.LN.ID" "ADX.LN.ID" "ALM.LN.ID" "bCD.LN.ID" "cdiGMP.LN.ID" "cGAMP.LN.ID"
[7] "D35.LN.ID" "DMXAA.LN.ID" "FCA.LN.ID" "FK565.LN.ID" "ISA51VG.LN.ID" "K3.LN.ID"
[13] "K3SPG.LN.ID" "MALP2s.LN.ID" "MBT.LN.ID" "MPLA.LN.ID" "Pam3CSK4.LN.ID" "Poly_IC.LN.ID"
[19] "R848.LN.ID" "sHz.LN.ID"
I tried this but failed (using R 3.2.0 alpha):
> sort(mylist)
[1] "ADX.LN.ID" "ALM.LN.ID" "AddaVax.LN.ID" "D35.LN.ID"
[5] "DMXAA.LN.ID" "FCA.LN.ID" "FK565.LN.ID" "ISA51VG.LN.ID"
[9] "K3.LN.ID" "K3SPG.LN.ID" "MALP2s.LN.ID" "MBT.LN.ID"
[13] "MPLA.LN.ID" "Pam3CSK4.LN.ID" "Poly_IC.LN.ID" "R848.LN.ID"
[17] "bCD.LN.ID" "cGAMP.LN.ID" "cdiGMP.LN.ID" "sHz.LN.ID"
Try
mylist[order(tolower(mylist))]
As noted by @Pascal, this is documented in help(Comparison): sort is locale-specific. One option is switching your locale (the collation order is controlled by the LC_COLLATE category, for example Sys.setlocale("LC_COLLATE", "us"); locale names are platform-dependent), but that could be inconvenient. Another option is gtools::mixedsort, which could also be useful because your strings also contain numbers.
library(gtools)
mixedsort(mylist)
# [1] "AddaVax.LN.ID" "ADX.LN.ID" "ALM.LN.ID" "bCD.LN.ID" "cdiGMP.LN.ID" "cGAMP.LN.ID" "D35.LN.ID" "DMXAA.LN.ID" "FCA.LN.ID" "FK565.LN.ID"
# [11] "ISA51VG.LN.ID" "K3.LN.ID" "K3SPG.LN.ID" "MALP2s.LN.ID" "MBT.LN.ID" "MPLA.LN.ID" "Pam3CSK4.LN.ID" "Poly_IC.LN.ID" "R848.LN.ID" "sHz.LN.ID"
> library(searchable)
> sort(ignore.case(mylist))
[1] "AddaVax.LN.ID" "ADX.LN.ID" "ALM.LN.ID" "bCD.LN.ID" "cdiGMP.LN.ID"
[6] "cGAMP.LN.ID" "D35.LN.ID" "DMXAA.LN.ID" "FCA.LN.ID" "FK565.LN.ID"
[11] "ISA51VG.LN.ID" "K3.LN.ID" "K3SPG.LN.ID" "MALP2s.LN.ID" "MBT.LN.ID"
[16] "MPLA.LN.ID" "Pam3CSK4.LN.ID" "Poly_IC.LN.ID" "R848.LN.ID" "sHz.LN.ID"
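The order(tolower(...)) suggestion above can be checked on a small sample that does not depend on the session locale. In this sketch, sort(method = "radix") stands in for the failing case because it always uses C-locale (byte-wise) ordering; the four IDs are taken from the question.

```r
ids <- c("sHz.LN.ID", "ADX.LN.ID", "bCD.LN.ID", "AddaVax.LN.ID")

# Byte-wise (C-locale) ordering puts every uppercase letter before lowercase,
# so "ADX" sorts ahead of "AddaVax".
c_locale_sorted <- sort(ids, method = "radix")

# Ordering by a lowercased copy ignores case but returns the original strings.
case_insensitive <- ids[order(tolower(ids))]

c_locale_sorted
case_insensitive
```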

How to write a list to file as one row and without quotes in R

I am trying to write a list to file as one row and without quotes in R.
Content of the list is:
[1] "X4775495036_J" "X4775495036_F" "X5147722015_F" "X5067554009_F"
[5] "X5067554063_B" "X4954590047_A" "X5067554063_G" "X5067554009_L"
[9] "X5147722015_D" "X5511045011_D" "X5067554063_A" "X4805447025_F"
[13] "X5455362015_K" "X4805447025_L" "X5147722015_B" "X5067554009_G"
[17] "X5147722014_K" "X5067554063_H" "X5147722009_G" "X5067554008_H"
[21] "X5067554054_H" "X4805447016_K" "X5147722014_E" "X4954590051_K"
[25] "X5067554008_E" "X5147722015_H" "X5147722009_H" "X5067554063_D"
[29] "X5147722015_A" "X5511045022_E" "X5067554054_I" "X5067554063_J"
[33] "X5067554007_F" "X4775495036_E" "X4775495036_H" "X4805447025_H"
[37] "X5067554009_I" "X4805447025_K" "X4954590051_C" "X4805447025_E"
[41] "X5067554063_E" "X5147722009_J" "X5067554054_C" "X5067554054_G"
[45] "X4805447016_I" "X5455362015_B" "X5067554009_H" "X5147722014_A"
[49] "X4775495036_I" "X5067554063_L" "X5455362015_J" "X4954590047_J"
[53] "X5067554009_A" "X4954590051_D" "X5455362015_I" "X5511045011_E"
[57] "X5147722014_F"
I want something like this (all elements in one row):
X4775495036_J X4775495036_F X5147722015_F X5067554009_F ...
I have tried write.table and write, but with no result.
Note that you don't have a list, you have a character vector.
cat(your_vector, "\n", file="your_file.txt")
The "\n" is an optional newline at the end.
You could use the ncolumns argument of write:
n <- LETTERS[1:10] # create example values
write(n, "letters.txt", ncolumns=length(n))
Or you could concatenate your names before:
nc <- paste0(n, collapse=" ")
write(nc, "letters.txt")
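To confirm that the result really is a single unquoted row, the file can be read back. This sketch uses a temporary file and three IDs from the question; writeLines is used here as one more way to write the collapsed string and is not taken from the answers above.

```r
ids <- c("X4775495036_J", "X4775495036_F", "X5147722015_F")
out_file <- tempfile(fileext = ".txt")

# Collapse the vector into one space-separated string and write it as one line.
writeLines(paste(ids, collapse = " "), out_file)

# Reading the file back should give exactly one line, with no quotes.
written <- readLines(out_file)
written
```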

Resources