Why does Rshiny tableOutput create an additional column? - r

Hi all I'm relatively new to Rshiny and got some unexpected behavior in a simple example of tableOutput.
I'm trying to display a simple table in an Rshiny application but notice there is an additional column created that I did not specify. I'm a little baffled as reproducing the code in 'normal' R does not show this additional column.
My RShiny code:
library(shiny)
ui <- fluidPage(
titlePanel("Test"),
tableOutput("Table1")
)
server <- function(input, output) {
output$Table1 <- renderTable({
matrix1 = matrix(c(2, 3, 5, 8, 13), ncol =1, nrow = 5)
rownames(matrix1) = c('Min', 'value1', 'Max', 'value2', 'Standard deviation')
colnames(matrix1) = c('values')
as.table(matrix1)
})
}
shinyApp(ui = ui, server = server)
This results in the following app where the 'Var2' column is the one I did not expect:
RShiny view
I expected the output to be similar of the 'normal' R one: R table
Created by this R code:
matrix1 = matrix(c(2, 3, 5, 8, 13), ncol =1, nrow = 5)
rownames(matrix1) = c('Min', 'value1', 'Max', 'value2', 'Standard deviation')
colnames(matrix1) = c('values')
table1 = as.table(matrix1)
Can someone help me understand from where the additional column comes from in the RShiny view and how to get rid of it?
(Ps: I'm new to posting, so please let me know in case I'm unclear/missed something)

It is a table object, instead wrapped around the matrix, which is really not needed, instead it can be just the matrix object or converted to data.frame
server <- function(input, output) {
output$Table1 <- renderTable({
matrix1 = matrix(c(2, 3, 5, 8, 13), ncol =1, nrow = 5)
rownames(matrix1) = c('Min', 'value1', 'Max', 'value2', 'Standard deviation')
colnames(matrix1) = c('values')
matrix1
}, rownames = TRUE)
}
shinyApp(ui = ui, server = server)
-output
When we add as.table, it converts to table class and it gets converted to data.frame during the process and this results in the change of format
> as.table(matrix1)
values
Min 2
value1 3
Max 5
value2 8
Standard deviation 13
> as.data.frame(as.table(matrix1))
Var1 Var2 Freq
1 Min values 2
2 value1 values 3
3 Max values 5
4 value2 values 8
5 Standard deviation values 13
> str(as.table(matrix1))
'table' num [1:5, 1] 2 3 5 8 13
- attr(*, "dimnames")=List of 2
..$ : chr [1:5] "Min" "value1" "Max" "value2" ...
..$ : chr "values"
If we check the source code of renderTable, there is a line which does the as.data.frame and thus it changes the format to three column because there are different methods for as.data.frame depending on the class of the object
> shiny::renderTable
...
classNames <- paste0("table shiny-table", paste0(" table-",
names(format)[format], collapse = ""), paste0(" spacing-",
spacing))
data <- as.data.frame(data)
...
Also, check methods for as.data.frame
> methods(as.data.frame)
[1] as.data.frame.aovproj* as.data.frame.array as.data.frame.AsIs as.data.frame.character
[5] as.data.frame.complex as.data.frame.data.frame as.data.frame.data.table* as.data.frame.Date
[9] as.data.frame.default as.data.frame.descr* as.data.frame.difftime as.data.frame.EventHistory.frame*
[13] as.data.frame.factor as.data.frame.ftable* as.data.frame.function* as.data.frame.grouped_df*
[17] as.data.frame.groupedData* as.data.frame.idf* as.data.frame.integer as.data.frame.ITime*
[21] as.data.frame.list as.data.frame.logical as.data.frame.logLik* as.data.frame.mapped_discrete*
[25] as.data.frame.matrix as.data.frame.model.matrix as.data.frame.noquote as.data.frame.numeric
[29] as.data.frame.numeric_version as.data.frame.ordered as.data.frame.POSIXct as.data.frame.POSIXlt
[33] as.data.frame.raw as.data.frame.resamples* as.data.frame.shingle* as.data.frame.Surv*
[37] as.data.frame.Surv2* as.data.frame.table ##### as.data.frame.tbl_df* as.data.frame.timeDate*
[41] as.data.frame.ts as.data.frame.vctrs_sclr* as.data.frame.vctrs_vctr* as.data.frame.vector
[45] as.data.frame.xyVector*
see '?methods' for accessing help and source code
The source code of as.data.frame.table does create three columns
> as.data.frame.table
function (x, row.names = NULL, ..., responseName = "Freq", stringsAsFactors = TRUE,
sep = "", base = list(LETTERS))
{
ex <- quote(data.frame(do.call("expand.grid", c(dimnames(provideDimnames(x,
sep = sep, base = base)), KEEP.OUT.ATTRS = FALSE, stringsAsFactors = stringsAsFactors)),
Freq = c(x), row.names = row.names))
names(ex)[3L] <- responseName
eval(ex)
}

Related

Error in R code for Reddit sentiment analysis

I am trying to so a Reddit sentiment analysis in R. I cannot get past an error and I am trying to troubleshoot the problem.
# Getting Reddit Data
links<- find_thread_urls(
keywords = "Ghostbusters",
sort_by = "top",
subreddit = NA,
period = "all"
)
# function to iterate through all posts
funct = function(i){
content = get_thread_content()(links$URL[i])
com = iconv(content$comment, to = 'utf-8')
clov = get_nrc_sentiment(com)
x1 = 100*colSums(clov)/sum(clov)
return(cbind(links[i,], t(x1) ))
}
# list of all the links
ls = 1:nrow(links)
# loop through all the links and bind to a data frame
res = do.call("rbind", lapply(ls, funct))
When I run this code, I get this error:
Error in lapply(urls, parse_thread_url) :
argument "urls" is missing, with no default
What am I missing here?
There were few typos in the code,
links = find_thread_urls(
keywords = "Ghostbusters",
sort_by = "top",
subreddit = NA,
period = "all"
)
x0 = vector()
funct = function(i){
content = get_thread_content(links$url[i])
com = iconv(content$comment, to = 'utf-8')
clov = get_nrc_sentiment(com)
x1 = 100*colSums(clov)/sum(clov)
x1 = cbind(x0, t(x1))
return(x1)
}
ls = 1:nrow(links)
res = lapply(ls, funct)
res = do.call(rbind, res)
anger anticipation disgust fear joy sadness surprise trust negative positive
[1,] 7.276119 8.955224 6.156716 9.514925 6.156716 9.328358 3.731343 9.701493 18.28358 20.89552
[2,] 5.586592 11.731844 4.469274 6.145251 12.290503 8.379888 5.586592 14.525140 10.61453 20.67039
[3,] 7.333333 9.238095 5.238095 8.952381 8.095238 6.571429 4.952381 11.523810 16.95238 21.14286
[4,] 8.641975 6.790123 8.024691 7.098765 8.333333 8.024691 5.555556 11.111111 16.35802 20.06173

Why is my R code only returning 1 output?

Hi I am new to using R and I am trying to create a version of Fermis Piano Tuner problem using it. however is it only returning 1 estimate when I need it to return 2 upon running the code. anyone know what I am doing wrong here?
{
Population<-(2695598)
PPH<-(2)
HWTP<-(0.04)
TPYPP<-(2)
TPTPD<-(6)
WPY<-(261)
((Population/PPH)*HWTP*TPYPP/(TPTPD*WPY))
Population2<-(2679080)
PPH2<-(2.48)
HWTP2<-(0.04)
TPYPP2<-(2)
TPTPD2<-(6)
WPY2<-(261)
{((Population/PPH)*HWTP*TPYPP/(TPTPD*WPY))}
{((Population2/PPH2)*HWTP2*TPYPP2/(TPTPD2*WPY2))}
}
How about changing last 2 lines into
c((Population/PPH)*HWTP*TPYPP/(TPTPD*WPY),
(Population2/PPH2)*HWTP2*TPYPP2/(TPTPD2*WPY2))
First, I suggest that you create a function
fermis_piano_tuner <-
function(Population,PPH,HWTP,TPYPP,TPTPD,WPY){
((Population/PPH)*HWTP*TPYPP/(TPTPD*WPY))
}
Than, you can pass the arguments to get your results
fermis_piano_tuner(
Population = 2695598,
PPH = 2,
HWTP = 0.04,
TPYPP = 2,
TPTPD = 6,
WPY = 261
)
[1] 68.85308
And if you want to pass multiple arguments you can use pmap
library(purrr)
pmap(
.l = tibble::tibble(
Population = c(2695598,2679080),
PPH = c(2,2.48),
HWTP = c(0.04,0.04),
TPYPP = c(2,2),
TPTPD= c(6,6),
WPY = c(261,261)
),
.f = fermis_piano_tuner
)
[[1]]
[1] 68.85308
[[2]]
[1] 55.18642

Difficulty in downloading TCGA data

I am trying to download the TCGA data but I am getting this error:
Error in summarizeMaf(maf = maf, anno = clinicalData, chatty =
verbose): Tumor_Sample_Barcode column not found in provided clinical
data. Rename column containing sample names to Tumor_Sample_Barcode if
necessary.
This is my code:
library("TCGAbiolinks")
library("tidyverse")
library(maftools)
query <- GDCquery( project = "TCGA-LIHC",
data.category = "Clinical",
file.type = "xml",
legacy = FALSE)
GDCdownload(query,directory = ".")
clinical <- GDCprepare_clinic(query, clinical.info = "patient",directory = ".")
#getting the survival time of event data
survival_data <- as_tibble(clinical[,c("days_to_last_followup","days_to_death","vital_status","bcr_patient_barcode","patient_id")])
survival_data <- filter(survival_data,!is.na(days_to_last_followup)|!is.na(days_to_death)) #not both NA
survival_data <- filter(survival_data,!is.na(days_to_last_followup)|days_to_last_followup>0 &is.na(days_to_death)|days_to_death > 0 ) #ensuring positive values
survival_data <- survival_data[!duplicated(survival_data$patient_id),] #ensuring no duplicates
dim(survival_data) #should be 371
maf <- GDCquery_Maf("LIHC", pipelines = "muse")
#maf <- GDCquery_Maf("LIHC", pipelines = "somaticsniper")
#clin <- GDCquery_clinic("TCGA-LIHC","clinical")
#print(clin )
laml = read.maf(
maf,
clinicalData = clinical,
removeDuplicatedVariants = TRUE,
useAll = TRUE,
gisticAllLesionsFile = NULL,
gisticAmpGenesFile = NULL,
gisticDelGenesFile = NULL,
gisticScoresFile = NULL,
cnLevel = "all",
cnTable = NULL,
isTCGA = TRUE,
vc_nonSyn = NULL,
verbose = TRUE
)
You should have: a) loaded with library(maftools) and b) included what was printed out before that error message:
-Validating
-Silent variants: 18306
-Summarizing
--Possible FLAGS among top ten genes:
TTN
MUC16
OBSCN
FLG
-Processing clinical data
Available fields in provided annotations..
[1] "bcr_patient_barcode" "additional_studies"
[3] "tissue_source_site" "patient_id"
# snipped remaining 78 column names
Notice that the first column is not named "Tumor_Sample_Barcode", so you need to follow the helpful error message directions and rename the appropriate column which appears to be the first one:
ns. After doing so I get:
-Validating
-Silent variants: 18306
-Summarizing
--Possible FLAGS among top ten genes:
TTN
MUC16
OBSCN
FLG
-Processing clinical data
-Finished in 1.911s elapsed (2.470s cpu)

Error: target of assignment expands to non-language object

I am trying to dynamically clean up some column names for a large number of tables and I get the above error.
I have a gut feeling that I should be using quo but I have no idea on how to do that.
Any ideas?
The apply_alias applies a set of business rules to clean the names.
apply_alias <- function(l){
which(l=="Geography")
l[which(l=="Geography")] <- "GEO"
toupper(l)
}
The cleanup_column_names_tbl applies the alias_function to a list of table
cleanup_column_names_tbl <- function(PID){
for(p in PID){
names(get(paste0("tbl_",p))) <- apply_alias(names(get(paste0("tbl_",p))))
}
}
cleanup_column_names_tbl("14100287")
When I try to run it i get the following error message:
> cleanup_column_names_tbl("14100287")
Error in names(get(paste0("tbl_", p))) <- apply_alias(names(get(paste0("tbl_", :
target of assignment expands to non-language object
Sample data:
> dput(tbl_14100287[1,])
structure(list(V1 = 0L, REF_DATE = "1976-01", GEO = "Canada",
DGUID = "2016A000011124", `Labour force characteristics` = "Population",
Sex = "Both sexes", `Age group` = "15 years and over", Statistics = "Estimate",
`Data type` = "Seasonally adjusted", UOM = "Persons", UOM_ID = 249L,
SCALAR_FACTOR = "thousands", SCALAR_ID = 3L, VECTOR = "v2062809",
COORDINATE = "1.1.1.1.1.1", VALUE = 16852.4, STATUS = "",
SYMBOL = NA, TERMINATED = NA, DECIMALS = 1L), class = c("data.table",
"data.frame"), row.names = c(NA, -1L), .internal.selfref = <pointer: 0x000002123cf21ef0>)
You cannot assign a value to get since there is no function get<-. The right way of doing it would be something like the following.
apply_alias <- function(l){
l[which(l == "Geography")] <- "GEO"
toupper(l)
}
cleanup_column_names_tbl <- function(PID, envir = .GlobalEnv){
pid_full <- paste0("tbl_", PID)
res <- lapply(pid_full, function(p){
nms <- apply_alias(names(get(p)))
DF <- get(p)
names(DF) <- nms
DF
})
names(res) <- pid_full
list2env(res, envir = envir)
invisible(NULL)
}
cleanup_column_names_tbl("14100287")
names(tbl_14100287)
# [1] "V1" "REF_DATE"
# [3] "GEO" "DGUID"
# [5] "LABOUR FORCE CHARACTERISTICS" "SEX"
# [7] "AGE GROUP" "STATISTICS"
# [9] "DATA TYPE" "UOM"
#[11] "UOM_ID" "SCALAR_FACTOR"
#[13] "SCALAR_ID" "VECTOR"
#[15] "COORDINATE" "VALUE"
#[17] "STATUS" "SYMBOL"
#[19] "TERMINATED" "DECIMALS"
My solution:
Creating and expression and evaluating it. It is short but somehow does not feel like it is the proper way of doing things since I am breaking away from the functional paradigm of R.
cleanup_column_names_tbl <- function(PID){
for(p in PID){
expr1 <- paste0("names(", paste0("tbl_",p), ") <- apply_alias(names(", paste0("tbl_",p),"))")
eval(rlang::parse_expr(expr1))
}
}
Edit:
A slightly different way that:
avoids using strings
a little more flexible since it allows the use of strings and symbol
library(rlang)
test_df <- data.frame(a=1:10,b=1:10)
test_df2 <- data.frame(a=1:10,b=1:10)
fix_names <- function(df){
x <- ensym(df)
expr1 <- expr(names(!!x) <- toupper(names(!!x)))
eval(expr1, envir = parent.env(environment()))
# expr1
}
fix_names(test_df)
fix_names("test_df2")
names(test_df)
#> [1] "A" "B"
names(test_df2)
#> [1] "A" "B"

Loop error: Error in `[.data.frame`(company, , i) : undefined columns selected

I have 2 data.frames:
The first includes Companies ("n" columns)
The second a benchmark (1 column)
Both have the same length of rows
After the calculations, each column (in both of them) has "k" levels more.
Eg. if the first data.frame had 3 Companies in each column {a,b,c} and 3 levels then it will be like {a.1, b.1, c.1, a.2,..., c.3} and the second will be {bench.1,bench.2,bench.3}
I'm trying to create a table that has the companies as rows and the regression coefficients between the two data.frames as columns.
Here's what I have tried so far:
a1 <- data.frame(); b1 <- data.frame(); rs <- data.frame()
# company: the company data.frame (levels included)
# benchmark: the benchmark data.frame (levels included)
# levels: the levels
k <- ncol(company)/levels
l <- 1 - k
for (j in 1:levels) {
l <- l + k
k <- k + k
for (i in l:k) {
mod <- lm(company[,i] ~ benchmark[,j])
a1[i,j] <- mod$coefficients[1]
b1[i,j] <- mod$coefficients[2]
rs[i,j] <- summary(mod)$adj.r.squared
}
}
table <- data.frame('Alpha_Coef' = a1, 'Beta_Coef' = b1, 'Adj.R_Squared' = rs)
I would appreciate any kind of help.
Thank you very much
Edit:
The Company Dataframe (4 levels)
> dput(wsf[1:30, seq(1, 356, 32)])
structure(list(W1free = c(-0.0412000679217938, 0.0880096344938139,
0.050722366154044, -0.0946016947369898, 0.0473490608961103, -0.0316948337480497,
-0.000178721497686476, 0.0330043255435735, -0.0130213571012855,
-0.0210134019518695, 0.00515539322334187, 0.0313594382649759,
-0.0636728947738869, 0.069783976440601, -0.0233500288324757,
0.000310577992801808, -0.0106422243645882, -0.00844960399798073,
0.0207508743372746, -0.00655907030852679, -0.00917878754640728,
0.0422985993411094, -0.0583530401655691, 0.00316447724460396,
0.0548253425520056, -0.0303784828505676, -0.0178226463347567,
0.0141621070101102, 0.0040525928183183, 0.00756095153768141),
W1free.32 = c(-0.0303584128041137, 0.0781259056778406, 0.0543601664896256,
-0.0838929634247089, 0.0168995513463754, -0.00116320770298013,
-0.00661142266491474, 0.00316786967673743, 0.014865018488982,
0.00115744890492633, -0.0133139018524208, 0.00357666761370058,
-0.00984757332103789, 0.0121936464190736, -0.0116509859273921,
0.0214700671630899, -9.18974370460937e-05, -0.0291337664969924,
0.0183613008541499, -0.00623650470163079, 0.00188765595333864,
0.0169715922156494, -0.0202622159786693, 0.00117256873232021,
0.0188193851765861, -0.0330732214410425, 0.0276667753578936,
-0.00912446778879232, -0.00645760419877771, 0.0229548860537124
), W1free.64 = c(-0.041320747786529, 0.0940721256003125,
0.0546722155282992, -0.0776194922127773, -0.0188218918393148,
0.0369018784168118, -0.0159710004093249, 0.00270145290316121,
0.00593858603645259, -0.0018398928749183, -0.0203943374367289,
0.0409079997097826, -0.0302107159471338, 0.00687409495325281,
-0.00278673048827845, 0.00579509928589942, 0.00450071408544868,
-0.0167040789608537, 0.0183985148340536, -0.0140021183403113,
0.00302717840065282, 0.010014802876346, -0.0108024225605952,
0.00664477277427707, -0.00998679780654633, 0.0093455115960892,
-0.0157862445056117, 0.0278399667684328, -0.0124817099428715,
-0.00574730232120169), W2free.7 = c(-0.723798800294376, -1.69462545197809,
-0.322808715747857, 0.983076466907199, 0.601543663890055,
0.00738341380613653, -0.17043040850117, -0.130795012258449,
0.0877012527311328, 0.144654492689526, -0.0233580207307919,
-0.130528770674957, -0.0811443177437124, 0.00455323978774625,
0.0322004951423808, 0.041028915171238, 0.0397928561027241,
-0.0146377819487017, -0.0564390552961818, -0.0157108172121778,
0.0418289577687561, 0.0407167980322358, -0.000920052532455742,
-0.0455497658600707, -0.0588733856819542, 0.00103356528290383,
0.072999787419253, 0.0566294953344014, -0.0139270074576876,
-0.051740788996522), W2free.39 = c(-0.719321984671057, -1.67269117285615,
-0.320838224757322, 0.99401218132634, 0.631713043453369,
0.00311609762638836, -0.215156520118741, -0.148974422352992,
0.0967557896976805, 0.120695957831705, -0.0371356083199943,
-0.0698672528876954, -0.0239853298540248, -0.00317615704745786,
0.0069410289690482, 0.00582109106839021, 0.00118481377532628,
-0.000737420044277918, -0.00796548329789375, -0.000991745340389675,
0.00911571004844975, 0.00143748164424383, -0.00218700317993713,
0.00467672676232776, 0.0132823977786716, 0.00346746468015986,
-0.0284915884601326, -0.0246832031983181, 0.0100296772438092,
0.0157902406702257), W2free.71 = c(-0.743286786555513, -1.68048071636107,
-0.304449505870913, 1.00901538641735, 0.626394565701536,
-0.00615227460862292, -0.212468284990251, -0.146055149093144,
0.0968104699798898, 0.129746894309015, -0.0313266854247715,
-0.0756053816865943, -0.0334202489014552, -0.0195998665482904,
0.010111846958308, 0.0392121845384003, 0.00544740230543033,
-0.0252594118720365, -0.0189333126817274, -0.0200581449364624,
0.00172381708912141, 0.0377707777006468, 0.0253024954648705,
0.00375902559668281, -0.000786565490537995, -0.0183404009358504,
-0.0144607525884234, 0.00635615971857548, -0.01995715314952,
-0.0339339332556635), W3free.14 = c(-0.541971799096092, -0.547254597852541,
-0.351417129791281, -0.079614566169948, 0.190159527265416,
0.369077378264343, 0.344217272166032, 0.307260788133197,
0.335851696193595, 0.389852108956907, 0.508410193658837,
0.454576753894716, 0.0246283521212561, -0.468731535325678,
-0.911314207864435, -1.10081031169091, -0.693894242109856,
-0.176018831349174, 0.268013221596553, 0.584156224372314,
0.483437264811605, 0.31842908645562, 0.211725648202932, 0.0860050283532339,
0.0374128558225933, -0.0175973788383141, -0.104572345794254,
-0.139452927246105, -0.102203406030287, -0.0486154328959697
), W3free.46 = c(-0.551809713896474, -0.554226067631237,
-0.344691113458163, -0.0662397128588583, 0.210840558688842,
0.392862015453828, 0.361004880017924, 0.315708806714992,
0.335952251181492, 0.38343966622271, 0.499384154321493, 0.451678337070302,
0.0251530591544403, -0.472351170164682, -0.920102359468484,
-1.12201190642854, -0.722117102849035, -0.19757656345329,
0.257496857320272, 0.585660993514737, 0.489353812024943,
0.318486723025294, 0.201872000937277, 0.0775858599501224,
0.0511079098302739, 0.0245387675955284, -0.0362827260167453,
-0.0723029499363708, -0.0746172353011349, -0.0670522978529944
), W3free.78 = c(-0.54111517653805, -0.544975921014698, -0.340169990337836,
-0.0694109571903436, 0.196448997132187, 0.3710738269895,
0.339483189903327, 0.305610659199469, 0.344043131600595,
0.405907239122663, 0.527335328093289, 0.469685973096099,
0.02498782328932, -0.485994428238778, -0.940087328347237,
-1.13732999528212, -0.724247605957125, -0.192167012739451,
0.266786191889965, 0.598018937656833, 0.497551375682964,
0.324775953859068, 0.203400777243459, 0.0638344266267895,
0.0231419137529647, -0.0116987397715524, -0.0687128256203292,
-0.0832420106844102, -0.056424343917657, -0.0291186092992814
), W4free.21 = c(0.118918655781396, 0.108718779015937, 0.102775253233938,
0.0828524472480944, 0.0410407234124081, -0.012381397981727,
-0.0746511451766081, -0.126967081904642, -0.138260372435415,
-0.142146232655087, -0.144033266870221, -0.113074511945608,
-0.0670368640581458, -0.0159506388411364, 0.0576398958821921,
0.114505138680487, 0.138987380604024, 0.17298154560344, 0.190728780916805,
0.172497528928508, 0.148714427830889, 0.111056896498362,
0.0322016534964464, -0.0411902802201208, -0.108714452830289,
-0.193403780805043, -0.242468575602981, -0.2518526201175,
-0.226528938042806, -0.16536045623284), W4free.53 = c(0.110582824081414,
0.0990631039125744, 0.092803742919437, 0.07531563115484,
0.0364487613311845, -0.0148493771602927, -0.0766616219694969,
-0.130115230718347, -0.140678702662368, -0.141590690581832,
-0.140856314457181, -0.108167295871846, -0.0604430195571222,
-0.00767211515738517, 0.0691273756544755, 0.129924506626998,
0.154897007967789, 0.186742438470891, 0.201270731261766,
0.177042350952871, 0.147222730892732, 0.105238351658075,
0.0205834624120472, -0.0576989254447123, -0.127161196169879,
-0.214118057849679, -0.26440281808828, -0.269229187033526,
-0.237685035633724, -0.172411380276773), W4free.85 = c(0.118750904966437,
0.105876306066672, 0.0980390856567921, 0.0763519039332738,
0.0332059062273466, -0.0198714852858678, -0.0829673978857073,
-0.13673208596903, -0.147663483183383, -0.150600183457387,
-0.15128240500669, -0.117726475726204, -0.0690026777475602,
-0.0158895662765118, 0.0616902413279015, 0.122399296935057,
0.147735045409867, 0.181929151536837, 0.19882362481352, 0.177391946124788,
0.150509412599221, 0.110259155225688, 0.0273385567351872,
-0.0487018627269797, -0.116082830918324, -0.200781736711009,
-0.248333357539446, -0.252999585674256, -0.222777574439045,
-0.157452460819487)), .Names = c("W1free", "W1free.32", "W1free.64",
"W2free.7", "W2free.39", "W2free.71", "W3free.14", "W3free.46",
"W3free.78", "W4free.21", "W4free.53", "W4free.85"), row.names = c(NA,
30L), class = "data.frame")
Benchmark Dataframe (4 levels):
> dput(wbf[1:30,])
structure(list(W1free = c(-0.0432455343158943, 0.0874909119474779,
0.0571001565877175, -0.0884507864841058, 0.0172050650848323,
-0.00224255555827861, 0.000722631485183195, 0.00993908327736091,
-0.0117418265042027, 0.00534988466357996, -0.000940017698511891,
0.00546041849046642, -0.0103733510325452, 0.00956961911965048,
-0.0113459297172276, 0.0133958729726377, -0.0029416252299541,
-0.011414032468681, 0.0141883947414388, -0.00822524736978799,
0.00316136152084385, 0.00521921925859321, -0.0159379140722575,
0.00678918411451175, 0.0148541387078586, -0.0129222646391021,
-0.00765917374935741, 0.0110142405769407, -0.00458116020357859,
0.0168472964297673), W2free = c(-0.740522709681838, -1.67769972639025,
-0.307890697506331, 1.00015354619176, 0.626337664518445, 0.00505640584038886,
-0.205521408512004, -0.150946308038834, 0.0857142318047878, 0.123609501408063,
-0.0266672430532372, -0.0735840422655566, -0.0316973358168455,
-0.00275393824752691, 0.00718734780106011, 0.00446732335756168,
0.000838919259401042, 0.00297041882635259, 6.04838862344908e-05,
-0.00405610663093049, -0.00221649131469466, 0.00381412534328539,
0.0066934836198532, -0.000657836733240365, -0.00444029910039748,
0.00143349288299151, 0.00265511863053606, -0.000743159102980881,
-0.00661664414444181, -0.00754557691213225), W3free = c(-0.537883232834564,
-0.543185462048805, -0.341173228804443, -0.0719862903967931,
0.195766331288493, 0.373205978054011, 0.344376017839598, 0.312014362375209,
0.346039880023554, 0.403777644728145, 0.52334517757098, 0.46282661064603,
0.0212821638525336, -0.487411032478064, -0.942703893546479, -1.13740960834218,
-0.725250237305827, -0.192667529664807, 0.268680928939938, 0.598063378890107,
0.499012500582589, 0.328774353296679, 0.21316016955454, 0.0826700443418289,
0.0440645204568091, 0.00181643263159242, -0.0732510545397724,
-0.107882422379997, -0.0915052281639699, -0.0597510688882035),
W4free = c(0.120565311954663, 0.110601962226259, 0.105408918589852,
0.0865592864118751, 0.0450732522189251, -0.00895997768433414,
-0.0730317587513421, -0.127472031073358, -0.139664402331508,
-0.144382004669276, -0.14768736817263, -0.118078134288994,
-0.0729438783923505, -0.0220437686110972, 0.0529380577249222,
0.111858330376543, 0.137386196550584, 0.172903566190241,
0.192861994141253, 0.176582979701844, 0.1548325160191, 0.118996625676438,
0.040367522055528, -0.0335387828973377, -0.101185748094811,
-0.187175302200655, -0.238647127560718, -0.250387805121693,
-0.227657278667077, -0.16931414087988)), .Names = c("W1free",
"W2free", "W3free", "W4free"), row.names = c(NA, 30L), class = "data.frame")
Consider a vectorized approach using Map where you pass in each dataframe's column names to a user defined function, reghandle, to run regression and return model results.
However, to use Map, lengths must match. Hence benchmark colnames are replicated by the k levels and sorted to align. At end, the function's returned list is iteratively cast to dataframe rows, then are binded together in a do.call(rbind, ...):
reghandle <- function(x, y){
mod <- lm(company[[x]] ~ benchmark[[y]])
return(list(Alpha_coef = unname(mod$coefficients[1]),
Beta_coef = unname(mod$coefficients[2]),
Adj.R_Squared = unname(summary(mod)$adj.r.squared)))
}
k <- ncol(company)/levels
benchmarknames <- sort(rep(names(benchmark), k))
benchmarknames
# [1] "W1free" "W1free" "W1free" "W2free" "W2free" "W2free" "W3free" "W3free"
# [9] "W3free" "W4free" "W4free" "W4free"
tablelist <- Map(reghandle, names(company), benchmarknames)
table <- do.call(rbind, lapply(tablelist, data.frame))
table
# Alpha_coef Beta_coef Adj.R_Squared
# W1free -0.0003729282 1.1575992 0.6540945
# W1free.32 0.0003352187 0.9155197 0.8303173
# W1free.64 -0.0002986047 0.9099407 0.6990809
# W2free.7 -0.0003745362 0.9973343 0.9925469
# W2free.39 0.0002856617 0.9957494 0.9991361
# W2free.71 -0.0011951258 1.0032802 0.9986368
# W3free.14 -0.0028038826 0.9759708 0.9991466
# W3free.46 0.0023389397 0.9871707 0.9987852
# W3free.78 0.0009924749 0.9972702 0.9992721
# W4free.21 -0.0012718476 0.9931372 0.9990408
# W4free.53 -0.0036123789 1.0318762 0.9914355
# W4free.85 -0.0029080074 1.0139958 0.9964788

Resources