Make a repeating alpha-numeric list - r

I want to make a list like this:
"A001:A048", "B001:B048", ..., "Z001:Z048", "AA001:AA048", "BB001:BB048", ...
I looked at this thread, but couldn't figure out how to adapt it for my repeating letters.
Thanks for the help.

c( sprintf("%s001:%s048", LETTERS,LETTERS),
sprintf("%s%s001:%s%s048", LETTERS,LETTERS,LETTERS, LETTERS) )
Here is an example using "indexed substitution" (my term) with sprintf:
outer(LETTERS, 1:26, FUN=sprintf, fmt="%1$s%1$s%2$03d:%1$s%1$s%2$03d")
# [,1] [,2] [,3] [,4] [,5]
[1,] "AA001:AA001" "AA002:AA002" "AA003:AA003" "AA004:AA004" "AA005:AA005"
[2,] "BB001:BB001" "BB002:BB002" "BB003:BB003" "BB004:BB004" "BB005:BB005"
[3,] "CC001:CC001" "CC002:CC002" "CC003:CC003" "CC004:CC004" "CC005:CC005"
[4,] "DD001:DD001" "DD002:DD002" "DD003:DD003" "DD004:DD004" "DD005:DD005"
[5,] "EE001:EE001" "EE002:EE002" "EE003:EE003" "EE004:EE004" "EE005:EE005"
snipped a couple of pages of output
And one further shot with the A:AF 1:48 combo:
outer( c(LETTERS,paste("A",LETTERS[1:6],sep="")),
       1:48,
       FUN=sprintf,
       fmt="%1$s%2$03d")
#-----------------------------------
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] "A001" "A002" "A003" "A004" "A005" "A006" "A007" "A008" "A009" "A010"
[2,] "B001" "B002" "B003" "B004" "B005" "B006" "B007" "B008" "B009"
snipped
[,41] [,42] [,43] [,44] [,45] [,46] [,47] [,48]
snipped
[31,] "AE041" "AE042" "AE043" "AE044" "AE045" "AE046" "AE047" "AE048"
[32,] "AF041" "AF042" "AF043" "AF044" "AF045" "AF046" "AF047" "AF048"

I think this is what you want, even though your question isn't clear. I use sprintf because it makes padding with leading zeros easier.
prefix <- c(LETTERS,paste("A",LETTERS[1:6],sep=""))
out <- sapply(prefix, function(x) sprintf("%s%03d",x,1:48))
as.vector(out) # if you want a vector instead
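For the exact strings listed in the question, the numbered-argument form shown above can be combined with strrep. This is only a sketch: it assumes the doubled prefixes are meant to run AA, BB, ..., ZZ, and the names prefix and ranges are mine.

```r
# Single letters A..Z followed by doubled letters AA..ZZ
# (strrep repeats each letter the given number of times).
prefix <- c(LETTERS, strrep(LETTERS, 2))

# %1$s refers to the first argument both times, so the prefix is passed only once.
ranges <- sprintf("%1$s001:%1$s048", prefix)

head(ranges, 3)
ranges[27:28]
```

Because sprintf is vectorised over its arguments, no loop or outer call is needed when only the prefix varies.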

Related

Is there a function for finding the anti-image correlation matrix in R? I can find it via Excel but not in R

Is there a function for finding the anti-image correlation matrix in R? I can find it via Excel but not in R.
I tried searching for it but couldn't find anything.
Are you trying to calculate the Kaiser-Meyer-Olkin (KMO) index? There is a KMO function in the psych package.
If you review the source code by simply typing psych::KMO, you can see that the function actually calculates the anti-image correlation matrix and returns it in the value ImCov. See here about why it's called ImCov and not ImCor.
Therefore, you can either calculate the KMO index directly or access the anti-image correlation matrix:
library(psych)
KMO(Thurstone)$ImCov
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 0.26191905 -0.13247459 -0.098997433 0.01696386 -0.007869000 -0.01530553 -0.02683189 -0.02883750 -0.015467511
[2,] -0.13247459 0.25042987 -0.093156627 -0.02997555 -0.013462396 -0.03933474 -0.01452050 -0.03244578 0.024857482
[3,] -0.09899743 -0.09315663 0.325362301 -0.02885198 0.001412136 -0.01360381 0.01550812 -0.06027647 -0.009499454
[4,] 0.01696386 -0.02997555 -0.028851979 0.44683097 -0.201926848 -0.14992727 -0.02064537 0.01507317 -0.051260798
[5,] -0.00786900 -0.01346240 0.001412136 -0.20192685 0.479035204 -0.09880388 -0.03070355 -0.01220736 -0.070899784
[6,] -0.01530553 -0.03933474 -0.013603811 -0.14992727 -0.098803879 0.57375353 0.02729260 -0.00322785 -0.014775752
[7,] -0.02683189 -0.01452050 0.015508115 -0.02064537 -0.030703553 0.02729260 0.52259809 -0.16114288 -0.222865605
[8,] -0.02883750 -0.03244578 -0.060276472 0.01507317 -0.012207363 -0.00322785 -0.16114288 0.54993095 -0.065094221
[9,] -0.01546751 0.02485748 -0.009499454 -0.05126080 -0.070899784 -0.01477575 -0.22286561 -0.06509422 0.571067253
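If you would rather not depend on psych, the same quantities can be sketched in base R from the inverse of the correlation matrix. This assumes the usual textbook definitions; the mtcars columns are only an illustration, and the variable names are mine. As far as I can tell from the source, the covariance form below is what KMO stores in ImCov.

```r
# Assumption: any invertible correlation matrix works; mtcars is just an example.
R <- cor(mtcars[, c("mpg", "disp", "hp", "wt")])
R_inv <- solve(R)                     # inverse of the correlation matrix

# Anti-image covariance: scale the inverse by 1 / diag(R_inv) on both sides.
D <- diag(1 / diag(R_inv))
anti_image_cov <- D %*% R_inv %*% D

# Anti-image correlation: rescale the inverse to a unit diagonal.
# Its off-diagonal entries are the negated partial correlations.
anti_image_cor <- cov2cor(R_inv)
```

Some packages (SPSS, for example) put the per-variable sampling adequacy on the diagonal of the anti-image correlation matrix instead of 1, so compare only the off-diagonal entries when checking against other software.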

Scraping data from pdf file in R

I need to extract tables from a pdf. Here's the link
https://www.acea.be/uploads/statistic_documents/ACEA_Report_Vehicles_in_use-Europe_2018.pdf
I want first table from this pdf.
Here is my code
Sys.setenv(JAVA_HOME='C:\\Program Files\\Java\\jre1.8.0_201') # for 64-bit version
# install.packages("devtools")
library(tabulizer)
library(tabulizerjars)
library(tidyverse)
tab <- extract_tables("https://www.acea.be/uploads/statistic_documents/ACEA_Report_Vehicles_in_use-Europe_2018.pdf")
tab[[1]]
head(tab[[1]])
But in the output, the columns for the years 2012, 2013, 2015 and 2016 are getting appended into one column.
I want the table as it appears in the PDF file.
Output of my code:
[,1] [,2] [,3]
[1,] "Croatia" "1,445,0001,433,5631,458,1491,489,3381,540,2603.4" ""
[2,] "Czech Republic" "4,698,8004,787,8494,893,5625,115,3165,368,6605.0" ""
[3,] "Denmark" "2,225,1642,265,3492,320,9822,391,7552,477,4783.6" ""
[4,] "Estonia" "602,133628,562652,949676,592703,1513.9" ""
[5,] "Finland" "2,560,1902,575,9512,595,8672,612,9222,629,4320.6" ""
[6,] "France" "31,600,00031,650,00031,799,00031,915,49331,999,9530.3" ""
Here is an alternative solution:
library(RDCOMClient)
########################################################################
#### Step 1 : We define the paths of the PDF and of the Word output ####
########################################################################
path_PDF <- "C:\\ACEA_Report_Vehicles_in_use-Europe_2018.pdf"
path_Word <- "C:\\temp.docx"
####################################################################
#### Step 2 : We use the OCR of Word to convert the PDF to Word ####
####################################################################
wordApp <- COMCreate("Word.Application")
wordApp[["Visible"]] <- TRUE
wordApp[["DisplayAlerts"]] <- FALSE
doc <- wordApp[["Documents"]]$Open(normalizePath(path_PDF),
                                   ConfirmConversions = FALSE)
doc$SaveAs2(path_Word)
###############################################################
#### Step 3 : We extract the tables from the Word document ####
###############################################################
nb_Table <- doc$tables()$count()
list_Table <- list()
for(l in 1 : nb_Table)
{
  nb_Row <- doc$tables(l)$Rows()$Count()
  nb_Col <- doc$tables(l)$Columns()$Count()
  mat_Temp <- matrix(NA, nrow = nb_Row, ncol = nb_Col)
  for(i in 1 : nb_Row)
  {
    for(j in 1 : nb_Col)
    {
      # Merged or missing cells raise an error; record them as NA instead
      mat_Temp[i, j] <- tryCatch(doc$tables(l)$cell(i, j)$range()$text(), error = function(e) NA)
    }
  }
  list_Table[[l]] <- mat_Temp
}
list_Table[[1]]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] "Austria\r\a" "4,584,202\r\a" "4,641,308\r\a" "4,694,921\r\a" "4,748,048\r\a" "4,821,557\r\a" "1.5\r\a"
[2,] "Belgium\r\a" "5,392,909\r\a" "5,439,295\r\a" "5,511,080\r\a" "5,587,415\r\a" "5,669,764\r\a" "1.5\r\a"
[3,] "Croatia\r\a" "1,445,000\r\a" "1,433,563\r\a" "1,458,149\r\a" "1,489,338\r\a" "1,540,260\r\a" "3.4\r\a"
[4,] "Czech Republic\r\a" "4,698,800\r\a" "4,787,849\r\a" "4,893,562\r\a" "5,115,316\r\a" "5,368,660\r\a" "5.0\r\a"
[5,] "Denmark\r\a" "2,225,164\r\a" "2,265,349\r\a" "2,320,982\r\a" "2,391,755\r\a" "2,477,478\r\a" "3.6\r\a"
[6,] "Estonia\r\a" "602,133\r\a" "628,562\r\a" "652,949\r\a" "676,592\r\a" "703,151\r\a" "3.9\r\a"
[7,] "Finland\r\a" "2,560,190\r\a" "2,575,951\r\a" "2,595,867\r\a" "2,612,922\r\a" "2,629,432\r\a" "0.6\r\a"
[8,] "France\r\a" "31,600,000\r\a" "31,650,000\r\a" "31,799,000\r\a" "31,915,493\r\a" "31,999,953\r\a" "0.3\r\a"
[9,] "Germany\r\a" "43,431,124\r\a" "43,851,230\r\a" "44,403,124\r\a" "45,071,209\r\a" "45,803,560\r\a" "1.6\r\a"
[10,] "Greece\r\a" "5,138,745\r\a" "5,109,435\r\a" "5,102,203\r\a" "5,104,908\r\a" "5,126,024\r\a" "0.4\r\a"
[11,] "Hungary\r\a" "2,978,745\r\a" "3,035,764\r\a" "3,101,752\r\a" "3,192,132\r\a" "3,308,495\r\a" "3.6\r\a"
[12,] "Ireland\r\a" "1,882,550\r\a" "1,910,165\r\a" "1,943,868\r\a" "1,985,130\r\a" "2,026,977\r\a" "2.1\r\a"
[13,] "Italy\r\a" "37,078,274\r\a" "36,962,934\r\a" "37,080,753\r\a" "37,351,233\r\a" "37,876,138\r\a" "1.4\r\a"
[14,] "Latvia\r\a" "618,000\r\a" "634,214\r\a" "657,487\r\a" "677,561\r\a" "663,091\r\a" "-2.1\r\a"
[15,] "Lithuania\r\a" "1,753,000\r\a" "1,837,661\r\a" "1,113,445\r\a" "1,153,859\r\a" "1,190,146\r\a" "3.1\r\a"
[16,] "Luxembourg\r\a" "344,951\r\a" "355,358\r\a" "362,879\r\a" "372,538\r\a" "380,860\r\a" "2.2\r\a"
[17,] "Netherlands\r\a" "8,142,000\r\a" "8,154,000\r\a" "8,192,570\r\a" "8,336,414\r\a" "8,439,318\r\a" "1.2\r\a"
[18,] "Poland\r\a" "18,744,412\r\a" "19,389,446\r\a" "20,003,863\r\a" "20,723,423\r\a" "21,675,388\r\a" "4.6\r\a"
[19,] "Portugal\r\a" "4,497,000\r\a" "4,480,000\r\a" "4,496,000\r\a" "4,538,000\r\a" "4,600,000\r\a" "1.4\r\a"
[20,] "Romania\r\a" "4,485,148\r\a" "4,693,651\r\a" "4,905,630\r\a" "5,153,182\r\a" "5,470,578\r\a" "6.2\r\a"
[21,] "Slovakia\r\a" "1,826,393\r\a" "1,882,577\r\a" "1,952,002\r\a" "2,037,772\r\a" "2,124,972\r\a" "4.3\r\a"
[22,] "Slovenia\r\a" "1,080,001\r\a" "1,085,347\r\a" "1,096,920\r\a" "1,116,006\r\a" "1,143,218\r\a" "2.4\r\a"
[23,] "Spain\r\a" "22,247,528\r\a" "22,024,538\r\a" "22,029,512\r\a" "22,355,549\r\a" "22,876,247\r\a" "2.3\r\a"
[24,] "Sweden\r\a" "4,447,165\r\a" "4,495,473\r\a" "4,585,519\r\a" "4,669,063\r\a" "4,768,060\r\a" "2.1\r\a"
[25,] "United Kingdom\r\a" "31,481,823\r\a" "31,917,885\r\a" "32,612,782\r\a" "33,542,448\r\a" "34,378,386\r\a" "2.5\r\a"
[26,] "EUROPEAN UNION\r\a" "243,285,257\r\a" "245,241,555\r\a" "247,566,819\r\a" "251,917,306\r\a" "257,061,713\r\a" "2.0\r\a"
[27,] "Norway\r\a" "2,433,147\r\a" "2,487,254\r\a" "2,539,513\r\a" "2,592,324\r\a" "2,639,245\r\a" "1.8\r\a"
[28,] "Switzerland\r\a" "4,300,036\r\a" "4,366,895\r\a" "4,430,375\r\a" "4,503,865\r\a" "4,571,994\r\a" "1.5\r\a"
[29,] "EFTA\r\a" "6,733,183\r\a" "6,854,149\r\a" "6,969,888\r\a" "7,096,189\r\a" "7,211,239\r\a" "1.6\r\a"
[30,] "Russia\r\a" "38,482,000\r\a" "39,322,526\r\a" "40,844,535\r\a" "40,859,866\r\a" "41,614,430\r\a" "1.8\r\a"
[31,] "Turkey\r\a" "8,648,875\r\a" "9,283,923\r\a" "9,857,915\r\a" "10,589,337\r\a" "11,317,998\r\a" "6.9\r\a"
[32,] "Ukraine\r\a" "9,910,004\r\a" "9,958,943\r\a" "9,581,401\r\a" "9,602,581\r\a" "9,679,279\r\a" "0.8\r\a"
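Every cell still carries the "\r\a" end-of-cell marker from Word, and the numbers are strings with thousands separators. A minimal cleanup sketch follows; the two-row matrix raw is a made-up stand-in for list_Table[[1]], and the names clean, values and result are mine.

```r
# Two-row stand-in for list_Table[[1]] (the real matrix comes from the Word extraction).
raw <- matrix(c("Austria\r\a", "4,584,202\r\a", "1.5\r\a",
                "Belgium\r\a", "5,392,909\r\a", "1.5\r\a"),
              nrow = 2, byrow = TRUE)

# Strip the carriage-return and bell characters appended to every cell.
# Assigning into clean[] keeps the matrix dimensions.
clean <- raw
clean[] <- gsub("[\r\a]", "", raw)

# Drop the thousands separators and convert every column except the first.
values <- apply(clean[, -1, drop = FALSE], 2,
                function(x) as.numeric(gsub(",", "", x)))

result <- data.frame(country = clean[, 1], values)
```

On the real table you would apply the same steps to list_Table[[1]] and then set the column names yourself, since this extraction does not return a header row.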

apply function from R to julia

I am a total newbie in the Julia world and I am trying to call the Julia mapslices function from R. However, I have the following issue:
library(XRJulia)
japply=JuliaFunction(juliaEval("function(a) return(mapslices(sum,a,[1])) end"))
a=array(runif(16),c(4,4))
juliaGet(japply(juliaSend(a)))
# [,1] [,2] [,3] [,4]
#[1,] 1.083545 2.426658 2.310691 1.44339
#But
a=array(runif(32),c(4,4,2))
juliaGet(japply(juliaSend(a)))
# Error in checkSlotAssignment(object, name, value) :
# ‘.Data’ is not a slot in class “array”
What am I doing wrong? Thank you
You could also try my package JuliaCall, which embeds Julia in R. The usage is quite similar to XRJulia in this case. A multi-dimensional array in Julia is converted to a multi-dimensional array in R automatically.
library(JuliaCall)
julia_setup()
japply=julia_eval("function(a) return(mapslices(sum,a,[1])) end")
a=array(runif(16),c(4,4))
japply(a)
# [,1] [,2] [,3] [,4]
#[1,] 1.083545 2.426658 2.310691 1.44339
a=array(runif(32),c(4,4,2))
japply(a)
#, , 1
#
# [,1] [,2] [,3] [,4]
#[1,] 3.119738 3.116167 2.299303 1.96874
#
#, , 2
#
# [,1] [,2] [,3] [,4]
#[1,] 1.578722 1.280093 0.6427822 2.786489
The main difference between XRJulia and JuliaCall is that XRJulia connects to Julia from R, while JuliaCall embeds Julia in R. JuliaCall has a performance advantage over XRJulia when you need to transfer large vectors or matrices between R and Julia, but it does more work when starting up Julia (especially the first time).

Assigning variable names within a function in R

I am currently working on a dataset in R which is assigned to the global environment by a function of i. Due to the nature of my work I am unable to disclose the dataset, so let's use an example.
DATA
[,1] [,2] [,3] [,4] [,5]
[1,] 32320 27442 29275 45921 162306
[2,] 38506 29326 33290 45641 175386
[3,] 42805 30974 33797 47110 198358
[4,] 42107 34690 47224 62893 272305
[5,] 54448 39739 58548 69470 316550
[6,] 53358 48463 63793 79180 372685
Here DATA(i) is a function, and the above is its output for a certain i.
I want to assign variable names based on i, such as:-
names(i)<-c(a(i),b(i),c(i),d(i),e(i))
For argument's sake, let's say that the value of names for this specific i is
c("a","b","c","d","e")
I hope that it will produce the following:-
a b c d e
[1,] 32320 27442 29275 45921 162306
[2,] 38506 29326 33290 45641 175386
[3,] 42805 30974 33797 47110 198358
[4,] 42107 34690 47224 62893 272305
[5,] 54448 39739 58548 69470 316550
[6,] 53358 48463 63793 79180 372685
This is the code I currently use:-
VarName<-function(i){
colnames(DATA(i))<<-names(i)
}
However, this produces an error message when I run it: "Error in colnames(DATA(i)) <- names(i)) :
target of assignment expands to non-language object", which doesn't seem right given my input. Is there another way to do this?
Sorry for the basic question. I'm fairly new to programming.
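One possible way around this, sketched under assumptions: a replacement call such as colnames(x) <- value needs a variable name on its left-hand side, not the result of a function call, so the matrix can be stored in a local variable first and returned. The functions make_data and make_names are hypothetical stand-ins for DATA(i) and names(i), since the real ones are not shown.

```r
# Hypothetical stand-ins for the poster's DATA(i) and names(i) functions.
make_data <- function(i) matrix(seq_len(30) * i, nrow = 6, ncol = 5)
make_names <- function(i) c("a", "b", "c", "d", "e")

named_data <- function(i) {
  result <- make_data(i)             # store the matrix in a local variable first
  colnames(result) <- make_names(i)  # the left-hand side is now a plain variable
  result                             # return the named matrix instead of using <<-
}

out <- named_data(1)
```

Returning the matrix and assigning it at the call site (out <- named_data(i)) avoids writing to the global environment from inside the function.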

Switch Element Type from Character to Numeric?

I have a 3D matrix of numbers, but R somehow treats the numeric data as character. The files I load are numeric vectors, but once I put them into the 3D matrix, all the numbers show up as "character", like this:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] "3.79" "3.79" "2.33" "2.33" "2.79" "2.79"
[2,] "3.79" "3.79" "2.33" "2.33" "2.79" "2.79"
[3,] "3.02" "3.02" "4.94" "4.94" "4.33" "4.33"
[4,] "3.02" "3.02" "4.94" "4.94" "4.33" "4.33"
[5,] "4.25" "4.25" "4.06" "4.06" "4.98" "4.98"
[6,] "4.25" "4.25" "4.06" "4.06" "4.98" "4.98"
[7,] "4.25" "4.25" "4.06" "4.06" "4.98" "4.98"
[8,] "2.07" "2.07" "2.09" "2.09" "2.92" "2.92"
But before I put them into the 3D matrix, the data look like this:
[39965] 3.68230769 3.68230769 3.68230769 2.96454545
[39969] 2.96454545 3.93600000 3.93600000 3.93600000
[39973] 3.67769231 3.67769231 3.67769231 5.12750000
[39977] 5.12750000 5.12750000 3.05083333 3.05083333
[39981] 3.05083333 1.94166667 1.94166667 1.69000000
[39985] 1.69000000 1.69000000 2.01769231 2.01769231
[39989] 2.01769231 3.05692308 3.05692308 3.05692308
[39993] 3.72916667 3.72916667 3.72916667 2.65454545
[39997] 2.65454545 2.45583333 2.45583333 2.45583333
Here is my code:
for (i in 1:length(precipitation)) {
  precip <- read.csv(precipitation[i])
  precip[is.na(precip)] <- 0
  precip2 <- precip[,-1]
  precip3 <- as.vector(unlist(precip2))
  prep_data[,,i] <- matrix(precip3, ncol=200, nrow=200)
}
Is it possible to add some code to fix this problem, so that all my 3D matrix elements are numeric, not "character"?
Use as.numeric to convert something to numeric. In general, as.class converts to that class (numeric, character, factor, Date, data.frame, matrix, and many many more).
You can coerce input data to a particular class with the colClasses argument of read.csv. The code below, which can be substituted for the read.csv call in your code, will generate warnings if it encounters non-numeric entries, but the good data are guaranteed to be numeric:
precip <- read.csv(precipitation[i], colClasses="numeric" )
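If the array has already been filled with character data, it can also be converted in place. A small sketch follows, assuming prep_data is a character array; the values here are made up. Note that calling as.numeric on an array drops its dimensions, whereas changing the storage mode keeps them.

```r
# A small character array standing in for prep_data (hypothetical values).
prep_data <- array(c("3.79", "3.02", "4.25", "2.07"), dim = c(2, 2, 1))

# as.numeric(prep_data) would return a plain vector without dimensions;
# changing the storage mode converts the elements and keeps the dim attribute.
storage.mode(prep_data) <- "numeric"

str(prep_data)
```

It is also worth checking how prep_data was created: a single character element anywhere in an array makes the whole array character, because all elements must share one type.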
