Facing difficulties in regression with mxnet in R - r

I was following this tutorial http://mxnet.io/tutorials/r/fiveMinutesNeuralNetwork.html#regression
Everything worked as expected until I changed:
fc1 <- mx.symbol.FullyConnected(data, num_hidden=1)
to
fc1 <- mx.symbol.FullyConnected(data, num_hidden=2)
After that change, training fails. Among the many error messages, this one seems the most relevant:
Error in exec$update.arg.arrays(arg.arrays, match.name, skip.null) :
[20:22:59] src/ndarray/ndarray.cc:239: Check failed: from.shape() == to->shape()
shape mismatchfrom.shape = (20,) to.shape=(20,2)
How do I diagnose this problem?
Here is the output of sessionInfo():
R version 3.3.3 RC (2017-02-27 r72279)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mlbench_2.1-1 mxnet_0.9.5
loaded via a namespace (and not attached):
[1] igraph_1.0.1 Rcpp_0.12.10 rstudioapi_0.6 magrittr_1.5 munsell_0.4.3 colorspace_1.3-2
[7] viridisLite_0.2.0 R6_2.2.0 brew_1.0-6 stringr_1.2.0 plyr_1.8.4 dplyr_0.5.0
[13] visNetwork_1.0.3 Rook_1.1-1 tools_3.3.3 grid_3.3.3 gtable_0.2.0 DBI_0.6
[19] influenceR_0.1.0 DiagrammeR_0.9.0 htmltools_0.3.5 lazyeval_0.2.0 digest_0.6.12 assertthat_0.1
[25] tibble_1.2 gridExtra_2.2.1 RColorBrewer_1.1-2 ggplot2_2.2.1 codetools_0.2-8 htmlwidgets_0.8
[31] viridis_0.4.0 rgexf_0.15.3 stringi_1.1.3 scales_0.4.1 XML_3.98-1.6 jsonlite_1.3

Right below the fc1 <- mx.symbol.FullyConnected(data, num_hidden=1) line, the tutorial feeds fc1 into a linear regression output: lro <- mx.symbol.LinearRegressionOutput(fc1).
LinearRegressionOutput computes the L2 loss between its input symbol and the labels provided to it, so the two must have the same shape. The tutorial data has one label per example, and an input with two values per example breaks it. The message I get is slightly different from yours, probably because of a difference in versions:
Error in symbol$infer.shape(list(...)) :
Error in operator linearregressionoutput5: Shape inconsistent, Provided=(20,), inferred shape=(20,2)
How to fix this depends on what exactly you want to achieve. If you are solving a classification task and want probabilities for both classes, use Softmax:
fc1 <- mx.symbol.FullyConnected(data, num_hidden=2)
lro <- mx.symbol.SoftmaxOutput(fc1)
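The shape rule behind both error messages can be illustrated without mxnet. The following is a minimal base R sketch, not mxnet code: l2_loss is a hypothetical helper that refuses to compare predictions and labels of different shapes, in the same spirit as the check that fails in the logs above.

```r
# Hypothetical helper: squared-error (L2) loss that insists on equal shapes.
l2_loss <- function(pred, label) {
  pred <- as.matrix(pred)
  label <- as.matrix(label)
  if (!identical(dim(pred), dim(label))) {
    stop(sprintf("shape mismatch: pred = (%s), label = (%s)",
                 paste(dim(pred), collapse = ","),
                 paste(dim(label), collapse = ",")))
  }
  mean((pred - label)^2)
}

set.seed(1)
label  <- rnorm(20)                     # one target per example, batch of 20
pred_1 <- matrix(rnorm(20), ncol = 1)   # like num_hidden = 1: 20 x 1
pred_2 <- matrix(rnorm(40), ncol = 2)   # like num_hidden = 2: 20 x 2

ok  <- l2_loss(pred_1, label)           # shapes agree, a single number
err <- tryCatch(l2_loss(pred_2, label), # shapes differ, the error is captured
                error = function(e) conditionMessage(e))
print(err)
```

With two outputs per example, a regression loss only makes sense if the labels also carry two values per example; otherwise the output layer has to change, as in the Softmax example above.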

Related

Calling mclapply from Rscript

When I run a script that uses mclapply() with Rscript myFuction.R --json=config.json, the mclapply() call fails with the message
all scheduled cores encountered errors in user code
However, when I run the same code within RStudio it runs fine. I'm developing and testing in RStudio on the RStudio AWS AMI, and I execute Rscript from the terminal of the same AWS machine, so the environments should be the same between RStudio and the terminal.
Does anyone know of extra arguments I might need to pass to mclapply(), or environment settings I need to define, when running mclapply() from Rscript?
I've tried changing all the mclapply() arguments without success.
Here is my environment.
R version 3.5.1 (2018-07-02)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Amazon Linux 2
Matrix products: default
BLAS/LAPACK: /anaconda3/envs/r35p27/lib/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] rlist_0.4.6.1 stringi_1.4.3
[3] stringr_1.3.1 seqinr_3.4-5
[5] rBLAST_0.99.2 ShortRead_1.40.0
[7] GenomicAlignments_1.18.1 SummarizedExperiment_1.12.0
[9] DelayedArray_0.8.0 matrixStats_0.54.0
[11] Biobase_2.42.0 Rsamtools_1.34.0
[13] GenomicRanges_1.34.0 GenomeInfoDb_1.18.1
[15] Biostrings_2.50.2 XVector_0.22.0
[17] IRanges_2.16.0 S4Vectors_0.20.1
[19] BiocParallel_1.16.6 BiocGenerics_0.28.0
[21] aws.s3_0.3.12 jsonlite_1.5
[23] configr_0.3.3 optparse_1.6.2
loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 RColorBrewer_1.1-2 compiler_3.5.1
[4] base64enc_0.1-3 bitops_1.0-6 tools_3.5.1
[7] zlibbioc_1.28.0 digest_0.6.20 lattice_0.20-35
[10] Matrix_1.2-14 yaml_2.2.0 GenomeInfoDbData_1.2.1
[13] hwriter_1.3.2 httr_1.3.1 xml2_1.2.0
[16] ade4_1.7-13 grid_3.5.1 getopt_1.20.2
[19] data.table_1.12.2 glue_1.3.1 R6_2.2.2
[22] latticeExtra_0.6-28 magrittr_1.5 MASS_7.3-51.4
[25] aws.signature_0.5.0 ini_0.3.1 RCurl_1.95-4.12
[28] RcppTOML_0.1.6
Although this is only reported as a "Warning message", mclapply() stops without producing results:
Warning messages:
1: In mclapply(1:dim(this.split)[1], BLAST.loop, mc.preschedule = TRUE) :
all scheduled cores encountered errors in user code
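The warning above hides the underlying error from each worker. One way to surface it, sketched here under assumptions, is to wrap the worker function so that every failure is returned as a value. flaky_step is a hypothetical stand-in for BLAST.loop, and capture_errors is an illustrative helper, not part of the parallel package.

```r
library(parallel)

# Hypothetical stand-in for the real per-row work (BLAST.loop in the question).
flaky_step <- function(i) {
  if (i %% 2 == 0) stop("row ", i, " failed: input file not found")
  i * 10
}

# Wrap a worker so that an error becomes a returned value instead of being
# collapsed into the generic mclapply() warning.
capture_errors <- function(f) {
  function(i) {
    tryCatch(
      list(ok = TRUE, value = f(i)),
      error = function(e) list(ok = FALSE, value = conditionMessage(e))
    )
  }
}

# Forking is not available on Windows, so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
results <- mclapply(1:4, capture_errors(flaky_step), mc.cores = n_cores)

# Print the real error message of every failed element.
failed <- Filter(function(r) !r$ok, results)
for (r in failed) message(r$value)
```

Running the wrapped version under Rscript should show the actual messages, which often point to something that differs from the RStudio session, such as the working directory, the library paths, or a package that is not attached.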

'filter' from dplyr package causes error message

I run queries against a ClickHouse database (connected to Rserver through the RClickhouse package). The queries run smoothly unless I use the filter function, which produces an error message about a wrong object type.
Important detail: this problem appears only in an RStudio project that is located in a common server folder. The same code works without errors in a similar project that shares a parent folder (/users/boris/) with R
> a <- con %>% tbl("test_sample") %>% select(domain) %>% collect()
> show(a)
# A tibble: 140,000 x 1
domain
* <chr>
1 allforchildren.ru
> a <- con %>% tbl("test_sample") %>% filter(domain == "wildberries.ru") %>% collect()
Error in storage.mode(x) <- "double" :
(list) object cannot be coerced to type 'double'
Any idea why the filter function causes this error?
P.S. session info
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.7.0
LAPACK: /usr/lib/lapack/liblapack.so.3.7.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] urltools_1.7.1 dplyr_0.7.6 getPass_0.2-2
[4] DBI_1.0.0 RClickhouse_0.4.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.18 dbplyr_1.2.2 crayon_1.3.4 assertthat_0.2.0
[5] R6_2.2.2 magrittr_1.5 pillar_1.3.0 rlang_0.2.1
[9] rstudioapi_0.7 bindrcpp_0.2.2 tools_3.5.1 glue_1.3.0
[13] triebeard_0.3.0 purrr_0.2.5 compiler_3.5.1 yaml_2.2.0
[17] pkgconfig_2.0.1 bindr_0.1.1 tidyselect_0.2.4 tibble_1.4.2
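One thing worth checking, offered as a guess rather than a confirmed diagnosis: the line storage.mode(x) <- "double" in the error is what stats::filter does to its input, so in the failing project the bare name filter may be resolving to stats::filter instead of dplyr::filter. A minimal check:

```r
# Which package does the bare name `filter` resolve to in this session?
filter_home <- environmentName(environment(filter))
print(filter_home)

# If this prints "stats" rather than "dplyr", qualifying the call avoids the
# ambiguity (left commented out because it needs the database connection):
# a <- con %>% tbl("test_sample") %>%
#   dplyr::filter(domain == "wildberries.ru") %>% collect()
```

If the two projects give different answers, the difference probably lies in what each project loads at startup, for example a project-level .Rprofile or the order in which packages are attached.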

Shinydashboard: 'restoreInput' is not an exported object from 'namespace:shiny'

I am using RStudio to create a new Shiny app. I copied and pasted the code sample from https://rdrr.io/cran/shinydashboard/man/renderValueBox.html into app.R and received this error:
Warning: Error in : 'restoreInput' is not an exported object from 'namespace:shiny'
Stack trace (innermost first):
45: getExportedValue
44: ::
43: dashboardSidebar
42: inherits
41: tagAssert
40: dashboardPage
1: runApp
Error : 'restoreInput' is not an exported object from 'namespace:shiny'
Here is my sessionInfo() output:
R version 3.3.3 (2017-03-06)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 25 (Workstation Edition)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] shiny_0.13.0 shinydashboard_0.6.0 dplyr_0.5.0 readr_1.0.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.3 digest_0.6.9 assertthat_0.1 mime_0.4 grid_3.3.3
[6] plyr_1.8.4 R6_2.1.2 jsonlite_0.9.19 xtable_1.8-0 gtable_0.2.0
[11] DBI_0.5-1 magrittr_1.5 scales_0.4.1 ggplot2_2.2.1 lazyeval_0.2.0
[16] tools_3.3.3 munsell_0.4.3 httpuv_1.3.3 colorspace_1.3-2 htmltools_0.3.5
[21] tibble_1.2
UPDATE:
1) I tried the sample from a different machine and produced the same error.
2) I also took a shinydashboard skeleton app and reproduced this error.
I then ran update.packages(), and the newer package versions do not have this issue.
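The stack trace suggests that dashboardSidebar calls shiny::restoreInput, which the installed shiny 0.13.0 does not export. A small sketch for checking this kind of mismatch before updating follows; exports_symbol is a hypothetical helper built from base R functions, demonstrated on the stats package so it runs without shiny installed.

```r
# Returns TRUE when `pkg` is installed and exports `symbol`.
exports_symbol <- function(pkg, symbol) {
  requireNamespace(pkg, quietly = TRUE) &&
    symbol %in% getNamespaceExports(pkg)
}

# Demonstrated on a base package so the snippet runs anywhere.
has_median  <- exports_symbol("stats", "median")
has_restore <- exports_symbol("stats", "restoreInput")
print(c(has_median, has_restore))

# For the app in the question, the equivalent checks would be:
# packageVersion("shiny")
# exports_symbol("shiny", "restoreInput")
```

If the second check returns FALSE for shiny, updating shiny together with shinydashboard, as in the update above, is the straightforward fix.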

DESeq2 - Invalid class “GRangesList” object

I've just updated my DESeq2 package from version 1.4.5 to version 1.6.3 and my scripts are no longer working. Specifically, I get the following error when generating a DESeqDataSet object with the function DESeqDataSetFromMatrix:
Error in validObject(.Object) :
invalid class “GRangesList” object: number of rows in DataTable 'mcols(x)' must match length of 'x'
To replicate this error one may either use the example shown in the DESeq2 vignette:
library("pasilla")
library("Biobase")
data("pasillaGenes")
countData <- counts(pasillaGenes)
colData <- pData(pasillaGenes)[,c("condition","type")]
dds <- DESeqDataSetFromMatrix(countData = countData,
colData = colData,
design = ~ condition)
or the example shown in the DESeq2 Reference Manual:
countData <- matrix(1:4,ncol=2)
colData <- data.frame(condition=factor(c("a","b")))
dds <- DESeqDataSetFromMatrix(countData, colData, formula(~ condition))
Thanks in advance for your help,
Alessia
sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] DESeq2_1.6.3 RcppArmadillo_0.4.400.0 Rcpp_0.11.2
[4] GenomicRanges_1.16.4 GenomeInfoDb_1.0.2 IRanges_1.22.10
[7] S4Vectors_0.4.0 BiocGenerics_0.12.1 BiocInstaller_1.16.1
loaded via a namespace (and not attached):
[1] acepack_1.3-3.3 annotate_1.42.1 AnnotationDbi_1.26.0
[4] BatchJobs_1.3 BBmisc_1.7 Biobase_2.24.0
[7] BiocParallel_0.6.1 brew_1.0-6 checkmate_1.4
[10] cluster_1.15.3 codetools_0.2-9 colorspace_1.2-4
[13] DBI_0.3.0 digest_0.6.4 fail_1.2
[16] foreach_1.4.2 foreign_0.8-61 Formula_1.1-2
[19] genefilter_1.46.1 geneplotter_1.42.0 ggplot2_1.0.0
[22] grid_3.1.0 gtable_0.1.2 Hmisc_3.14-5
[25] iterators_1.0.7 lattice_0.20-29 latticeExtra_0.6-26
[28] locfit_1.5-9.1 MASS_7.3-34 munsell_0.4.2
[31] nnet_7.3-8 plyr_1.8.1 proto_0.3-10
[34] RColorBrewer_1.0-5 reshape2_1.4 rpart_4.1-8
[37] RSQLite_0.11.4 scales_0.2.4 sendmailR_1.1-2
[40] splines_3.1.0 stringr_0.6.2 survival_2.37-7
[43] tools_3.1.0 XML_3.98-1.1 xtable_1.7-4
[46] XVector_0.4.0
source("http://bioconductor.org/biocLite.R")
Bioconductor version 3.0 (BiocInstaller 1.16.1), ?biocLite for help

R RODBC sqlSave crashing/disconnecting when too many columns supplied to existing table

I've found a situation where R hangs because sqlSave does not handle the case where there are more columns being inserted than are present in the table.
Does anyone have any insight into how I can resolve this behaviour? I had the idea of retrieving the columns already in the table first, but this would be a significant performance drain and add significant complexity to the code.
To reproduce the problem (this requires a local DSN/database connection), first create the table:
channel <- odbcConnect("mydb")
odbcClearError(channel)
sqlSave(channel, dat=data.frame(a=1:3,b=letters[1:3]),
tablename="R_update_test",
# rownames=FALSE,
append=TRUE)
odbcClose(channel)
Then append a data frame with one more column than the table has:
channel <- odbcConnect("mydb")
odbcClearError(channel)
sqlSave(channel, dat=data.frame(a=1:3,b=letters[1:3],c=letters[1:3]),
tablename="R_update_test",
rownames=FALSE,
append=TRUE)
odbcClose(channel)
The sessionInfo() output is included below; the target database is MySQL 5.6.
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=C
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=C LC_IDENTIFICATION=C
attached base packages:
[1] splines grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.8.10 Hmisc_3.13-0 Formula_1.1-1 survival_2.37-4 caret_5.17-7 reshape2_1.2.2 plyr_1.8
[8] lattice_0.20-24 foreach_1.4.1 cluster_1.14.4 RODBC_1.3-9 Nemo_1.0 testthat_0.7.1 devtools_1.4
loaded via a namespace (and not attached):
[1] MASS_7.3-29 RColorBrewer_1.0-5 RCurl_1.95-4.1 codetools_0.2-8 colorspace_1.2-4 dichromat_2.0-0 digest_0.6.3
[8] evaluate_0.5.1 ggplot2_0.9.3.1 gtable_0.1.2 httr_0.2 iterators_1.0.6 labeling_0.2 memoise_0.1
[15] munsell_0.4.2 parallel_3.0.2 proto_0.3-10 scales_0.2.3 stringr_0.6.2 tools_3.0.2 whisker_0.3-2
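The column check considered in the question does not have to hit the database on every insert. As a rough sketch, assuming the table's column names are fetched once (for example with RODBC's sqlColumns) and cached, the comparison itself is a single setdiff. extra_columns and the cached table_cols vector are hypothetical names used for illustration only.

```r
# Columns in `dat` that the target table does not have.
# `table_cols` is assumed to come from a one-off metadata query such as
#   RODBC::sqlColumns(channel, "R_update_test")$COLUMN_NAME
extra_columns <- function(dat, table_cols) {
  setdiff(names(dat), table_cols)
}

# Hypothetical cached column list for R_update_test.
table_cols <- c("a", "b")

dat_ok  <- data.frame(a = 1:3, b = letters[1:3])
dat_bad <- data.frame(a = 1:3, b = letters[1:3], c = letters[1:3])

print(extra_columns(dat_ok, table_cols))   # nothing extra, safe to append
print(extra_columns(dat_bad, table_cols))  # the offending column name
```

Calling sqlSave only when extra_columns returns an empty vector avoids the hang, and raising an error that names the extra columns otherwise makes the failure easy to trace. Column name case may need normalising, depending on how the driver reports it.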
