Converting an S4 object into a dataframe in R


I have an S4 object named 'res', which I obtained using the R package RDAVIDWebService. I can't find a way to convert this object into a data frame in R.
I tried as.data.frame(res), but it throws this error:
> as.data.frame(res)
Error in as.data.frame.default(res) :
cannot coerce class ‘structure("DAVIDFunctionalAnnotationTable", package = "RDAVIDWebService")’ to a data.frame
The structure of the object looks like this:
> str(res)
Formal class 'DAVIDFunctionalAnnotationTable' [package "RDAVIDWebService"] with 4 slots
..@ Genes :'data.frame': 3011 obs. of 3 variables:
Formal class 'DAVIDGenes' [package "RDAVIDWebService"] with 5 slots
.. .. ..@ .Data :List of 3
.. .. .. ..$ : chr [1:3011] "22574630" "3544383" "3544385" "3544382" ...
.. .. .. ..$ : chr [1:3011] "1,2-Dihydroxy-3-keto-5-methylthiopentene dioxygenase,
putative(LPMP_204190)" "10 kDa heat shock protein(Tc00.1047053508209.100)" "10 kDa heat shock
protein(Tc00.1047053508209.120)" "10 kDa heat shock protein(Tc00.1047053508209.90)" ...
.. .. .. ..$ : Factor w/ 11 levels "Leishmania braziliensis MHOM/BR/75/M2904",..: 6 10 10 10
10 10 10 2 6 6 ...
.. .. ..@ names : chr [1:3] "ID" "Name" "Species"
.. .. ..@ row.names: chr [1:3011] "1" "2" "3" "4" ...
.. .. ..@ .S3Class : chr "data.frame"
.. .. ..@ type : chr "Gene List Report"
..@ Dictionary:List of 10
.. ..$ COG_ONTOLOGY :'data.frame': 18 obs. of 2 variables:
.. .. ..$ ID : chr [1:18] "Translation, ribosomal structure and biogenesis" "Lipid
metabolism" "Cell division and chromosome partitioning" "General function prediction only" ...
.. .. ..$ Term: chr [1:18] "" "" "" "" ...
.. ..$ GOTERM_BP_DIRECT:'data.frame': 215 obs. of 2 variables:
.. .. ..$ ID : chr [1:215] "GO:0006457" "GO:0051603" "GO:0008152" "GO:0006412" ...
.. .. ..$ Term: chr [1:215] "protein folding" "proteolysis involved in cellular protein
catabolic process" "metabolic process" "translation" ...
.. ..$ GOTERM_CC_DIRECT:'data.frame': 84 obs. of 2 variables:
.. .. ..$ ID : chr [1:84] "GO:0005737" "GO:0016021" "GO:0005634" "GO:0005839" ...
.. .. ..$ Term: chr [1:84] "cytoplasm" "integral component of membrane" "nucleus" "proteasome
core complex" ...
.. ..$ GOTERM_MF_DIRECT:'data.frame': 222 obs. of 2 variables:
.. .. ..$ ID : chr [1:222] "GO:0010309" "GO:0018580" "GO:0051213" "GO:0004298" ...
.. .. ..$ Term: chr [1:222] "acireductone dioxygenase [iron(II)-requiring] activity"
"nitronate monooxygenase activity" "dioxygenase activity" "threonine-type endopeptidase
activity" ...
.. ..$ INTERPRO :'data.frame': 695 obs. of 2 variables:
.. .. ..$ ID : chr [1:695] "IPR004313" "IPR011051" "IPR014710" "IPR011032" ...
.. .. ..$ Term: chr [1:695] "Acireductone dioxygenase ARD family" "RmlC-like cupin domain"
"RmlC-like jelly roll fold" "GroES-like" ...
.. ..$ KEGG_PATHWAY :'data.frame': 363 obs. of 2 variables:
.. .. ..$ ID : chr [1:363] "ldo00071" "ldo00280" "ldo01100" "lmi00280" ...
.. .. ..$ Term: chr [1:363] "Fatty acid degradation" "Valine, leucine and isoleucine
degradation" "Metabolic pathways" "Valine, leucine and isoleucine degradation" ...
.. ..$ PIR_SUPERFAMILY :'data.frame': 44 obs. of 2 variables:
.. .. ..$ ID : chr [1:44] "PIRSF000868" "PIRSF002144" "PIRSF002134" "PIRSF002122" ...
.. .. ..$ Term: chr [1:44] "14-3-3 protein" "ribosomal protein, S19p/S19a/S15e/organellar S19
types" "ribosomal protein, S13p/S13a/S18e/organellar S13 types" "ribosomal protein,
S7p/S7a/S5e/organellar S7 types" ...
.. ..$ SMART :'data.frame': 90 obs. of 2 variables:
.. .. ..$ ID : chr [1:90] "SM00883" "SM00101" "SM01386" "SM01387" ...
.. .. ..$ Term: chr [1:90] "SM00883" "14_3_3" "SM01386" "SM01387" ...
.. ..$ UP_KEYWORDS :'data.frame': 116 obs. of 2 variables:
.. .. ..$ ID : chr [1:116] "Coiled coil" "Complete proteome" "Dioxygenase" "Oxidoreductase"
...
.. .. ..$ Term: chr [1:116] "" "" "" "" ...
.. ..$ UP_SEQ_FEATURE :'data.frame': 13 obs. of 2 variables:
.. .. ..$ ID : chr [1:13] "chain:60S ribosomal protein L18" "chain:Probable eukaryotic
initiation factor 4A" "domain:Helicase ATP-binding" "domain:Helicase C-terminal" ...
.. .. ..$ Term: chr [1:13] "" "" "" "" ...
..@ Membership:List of 10
.. ..$ COG_ONTOLOGY : logi [1:3011, 1:18] FALSE FALSE FALSE FALSE FALSE FALSE ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : chr [1:18] "Translation, ribosomal structure and biogenesis" "Lipid metabolism"
"Cell division and chromosome partitioning" "General function prediction only" ...
.. ..$ GOTERM_BP_DIRECT: logi [1:3011, 1:215] FALSE TRUE TRUE TRUE TRUE TRUE ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : chr [1:215] "GO:0006457" "GO:0051603" "GO:0008152" "GO:0006412" ...
.. ..$ GOTERM_CC_DIRECT: logi [1:3011, 1:84] FALSE TRUE TRUE TRUE TRUE TRUE ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : chr [1:84] "GO:0005737" "GO:0016021" "GO:0005634" "GO:0005839" ...
.. ..$ GOTERM_MF_DIRECT: logi [1:3011, 1:222] TRUE FALSE FALSE FALSE FALSE FALSE ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : chr [1:222] "GO:0010309" "GO:0018580" "GO:0051213" "GO:0004298" ...
.. ..$ INTERPRO : logi [1:3011, 1:695] TRUE FALSE FALSE FALSE FALSE FALSE ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : chr [1:695] "IPR004313" "IPR011051" "IPR014710" "IPR011032" ...
.. ..$ KEGG_PATHWAY : logi [1:3011, 1:363] FALSE FALSE FALSE FALSE FALSE FALSE ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : chr [1:363] "ldo00071" "ldo00280" "ldo01100" "lmi00280" ...
.. ..$ PIR_SUPERFAMILY : logi [1:3011, 1:44] FALSE FALSE FALSE FALSE FALSE FALSE ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : chr [1:44] "PIRSF000868" "PIRSF002144" "PIRSF002134" "PIRSF002122" ...
.. ..$ SMART : logi [1:3011, 1:90] FALSE TRUE TRUE TRUE TRUE TRUE ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : chr [1:90] "SM00883" "SM00101" "SM01386" "SM01387" ...
.. ..$ UP_KEYWORDS : logi [1:3011, 1:116] TRUE FALSE FALSE FALSE FALSE FALSE ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : chr [1:116] "Coiled coil" "Complete proteome" "Dioxygenase" "Oxidoreductase"
...
.. ..$ UP_SEQ_FEATURE : logi [1:3011, 1:13] FALSE FALSE FALSE FALSE FALSE FALSE ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : chr [1:13] "chain:60S ribosomal protein L18" "chain:Probable eukaryotic
initiation factor 4A" "domain:Helicase ATP-binding" "domain:Helicase C-terminal" ...
..@ type : chr "Functional Annotation Table"
Also, is there a general way to convert any S4 object into a data frame regardless of the data inside it? This matters because the S4 objects I fetch with this package can hold a varying number of lists, variables, and character vectors in each of the 4 slots (i.e. @Genes, @Dictionary, @Membership and @type).

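Maybe the following functions can help.
# S3 method, so that as.data.frame(res) dispatches here for this S4 class
as.data.frame.DAVIDFunctionalAnnotationTable <- function(x, ...){
  Genes <- x@Genes
  y <- Genes@.Data
  names(y) <- Genes@names
  # .Data is a plain list of columns, so build the data.frame explicitly
  as.data.frame(y, stringsAsFactors = FALSE)
}
extractS4_Dictionary <- function(x) x@Dictionary
extractS4_Membership <- function(x) x@Membership
extractS4_type <- function(x) x@type
The call
as.data.frame(res)
will coerce the Genes slot of res to a data.frame.
The other functions extract the remaining slots of the S4 object.
The following function returns the genes that belong to a given annotation category.
membership <- function(x, which){
  y <- as.data.frame(x)
  # Membership is a list of logical matrices (genes x terms), one per category
  memb <- extractS4_Membership(x)[[which]]
  # keep the genes annotated with at least one term of that category
  i <- rowSums(memb) > 0
  y[i, , drop = FALSE]
}
# example usage
membership(res, "COG_ONTOLOGY")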
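On the more general part of the question (any S4 object, whatever is inside it): there is probably no single conversion that works for every class, but the slots can always be pulled out into a named list and then handled one by one. Below is a minimal sketch of that idea. The class ToyAnnotationTable, the object toy and the helper s4_to_list are made up for illustration and are not part of RDAVIDWebService; only isS4(), slotNames() and slot() from the methods package are relied on.

```r
library(methods)

# Toy class standing in for the real one (hypothetical, for illustration only)
setClass("ToyAnnotationTable",
         representation(Genes = "data.frame", Membership = "list", type = "character"))

toy <- new("ToyAnnotationTable",
  Genes = data.frame(ID = c("1", "2", "3"), Name = c("a", "b", "c"),
                     stringsAsFactors = FALSE),
  Membership = list(COG = matrix(c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE), nrow = 3,
                                 dimnames = list(NULL, c("term1", "term2")))),
  type = "Toy Table")

# Hypothetical helper: copy every slot of an S4 object into a named list
s4_to_list <- function(x) {
  stopifnot(isS4(x))
  nms <- slotNames(x)
  setNames(lapply(nms, function(n) slot(x, n)), nms)
}

parts <- s4_to_list(toy)
names(parts)

# Rows of the gene table that belong to at least one term of the COG category
in_cog <- rowSums(parts$Membership$COG) > 0
cog_genes <- parts$Genes[in_cog, , drop = FALSE]
cog_genes
```

Each element of parts can then be turned into a data frame only where that makes sense (a data.frame slot already is one, a logical matrix can go through as.data.frame, a single string cannot meaningfully).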

Related

Plotly in R: How to reference and extract figure values?

I want to know how I can access, extract, and reference values from a plotly figure in R.
Consider, for example, the Sankey diagram from plotly's own site of which there is an abbreviated version here:
library(plotly)
fig <- plot_ly(
type = "sankey",
node = list(
label = c("A1", "A2", "B1", "B2", "C1", "C2"),
color = c("blue", "blue", "blue", "blue", "blue", "blue"),
line = list()
),
link = list(
source = c(0,1,0,2,3,3),
target = c(2,3,3,4,4,5),
value = c(8,4,2,8,4,2)
)
)
fig
If I do View(fig) in RStudio, a new tab opens titled '.' (I don't know why it isn't titled 'fig'). In this tab I can go to x > visdat > 'string of letters and numbers that is a function?' > attrs > node > x.
Here all the x coordinates for the Sankey nodes appear.
I want to access these values so I can use them elsewhere. How do I do this? If I click on the right side of the RStudio tab to copy the code to the console, I get:
environment(.[["x"]][["visdat"]][["484c3ec36899"]])[["attrs"]][["node"]][["x"]]
which obviously doesn't work, as there is no object named '.'.
In this case I have tried fig$x$visdat$`484c3ec36899`(), but I can't do fig$x$visdat$`484c3ec36899`()$attr, and I don't know what else to try.
So, how can I access any value from a plotly object? Any documentation referencing this topic would also be helpful.
Thanks.

You can find the documentation of the data structure of plotly in R here: https://plotly.com/r/figure-structure/
To check the data structure you can use str(fig):
List of 8
$ x :List of 6
..$ visdat :List of 1
.. ..$ a3b8795a4:function ()
..$ cur_data: chr "a3b8795a4"
..$ attrs :List of 1
.. ..$ a3b8795a4:List of 6
.. .. ..$ node :List of 3
.. .. .. ..$ label: chr [1:6] "A1" "A2" "B1" "B2" ...
.. .. .. ..$ color: chr [1:6] "blue" "blue" "blue" "blue" ...
.. .. .. ..$ line : list()
.. .. ..$ link :List of 3
.. .. .. ..$ source: num [1:6] 0 1 0 2 3 3
.. .. .. ..$ target: num [1:6] 2 3 3 4 4 5
.. .. .. ..$ value : num [1:6] 8 4 2 8 4 2
.. .. ..$ alpha_stroke: num 1
.. .. ..$ sizes : num [1:2] 10 100
.. .. ..$ spans : num [1:2] 1 20
.. .. ..$ type : chr "sankey"
..$ layout :List of 3
.. ..$ width : NULL
.. ..$ height: NULL
.. ..$ margin:List of 4
.. .. ..$ b: num 40
.. .. ..$ l: num 60
.. .. ..$ t: num 25
.. .. ..$ r: num 10
..$ source : chr "A"
..$ config :List of 1
.. ..$ showSendToCloud: logi FALSE
..- attr(*, "TOJSON_FUNC")=function (x, ...)
$ width : NULL
$ height : NULL
$ sizingPolicy :List of 6
..$ defaultWidth : chr "100%"
..$ defaultHeight: num 400
..$ padding : NULL
..$ viewer :List of 6
.. ..$ defaultWidth : NULL
.. ..$ defaultHeight: NULL
.. ..$ padding : NULL
.. ..$ fill : logi TRUE
.. ..$ suppress : logi FALSE
.. ..$ paneHeight : NULL
..$ browser :List of 5
.. ..$ defaultWidth : NULL
.. ..$ defaultHeight: NULL
.. ..$ padding : NULL
.. ..$ fill : logi TRUE
.. ..$ external : logi FALSE
..$ knitr :List of 3
.. ..$ defaultWidth : NULL
.. ..$ defaultHeight: NULL
.. ..$ figure : logi TRUE
$ dependencies :List of 5
..$ :List of 10
.. ..$ name : chr "typedarray"
.. ..$ version : chr "0.1"
.. ..$ src :List of 1
.. .. ..$ file: chr "htmlwidgets/lib/typedarray"
.. ..$ meta : NULL
.. ..$ script : chr "typedarray.min.js"
.. ..$ stylesheet: NULL
.. ..$ head : NULL
.. ..$ attachment: NULL
.. ..$ package : chr "plotly"
.. ..$ all_files : logi FALSE
.. ..- attr(*, "class")= chr "html_dependency"
..$ :List of 10
.. ..$ name : chr "jquery"
.. ..$ version : chr "1.11.3"
.. ..$ src :List of 1
.. .. ..$ file: chr "lib/jquery"
.. ..$ meta : NULL
.. ..$ script : chr "jquery.min.js"
.. ..$ stylesheet: NULL
.. ..$ head : NULL
.. ..$ attachment: NULL
.. ..$ package : chr "crosstalk"
.. ..$ all_files : logi TRUE
.. ..- attr(*, "class")= chr "html_dependency"
..$ :List of 10
.. ..$ name : chr "crosstalk"
.. ..$ version : chr "1.1.0.1"
.. ..$ src :List of 1
.. .. ..$ file: chr "www"
.. ..$ meta : NULL
.. ..$ script : chr "js/crosstalk.min.js"
.. ..$ stylesheet: chr "css/crosstalk.css"
.. ..$ head : NULL
.. ..$ attachment: NULL
.. ..$ package : chr "crosstalk"
.. ..$ all_files : logi TRUE
.. ..- attr(*, "class")= chr "html_dependency"
..$ :List of 10
.. ..$ name : chr "plotly-htmlwidgets-css"
.. ..$ version : chr "1.52.2"
.. ..$ src :List of 1
.. .. ..$ file: chr "htmlwidgets/lib/plotlyjs"
.. ..$ meta : NULL
.. ..$ script : NULL
.. ..$ stylesheet: chr "plotly-htmlwidgets.css"
.. ..$ head : NULL
.. ..$ attachment: NULL
.. ..$ package : chr "plotly"
.. ..$ all_files : logi FALSE
.. ..- attr(*, "class")= chr "html_dependency"
..$ :List of 10
.. ..$ name : chr "plotly-main"
.. ..$ version : chr "1.52.2"
.. ..$ src :List of 1
.. .. ..$ file: chr "htmlwidgets/lib/plotlyjs"
.. ..$ meta : NULL
.. ..$ script : chr "plotly-latest.min.js"
.. ..$ stylesheet: NULL
.. ..$ head : NULL
.. ..$ attachment: NULL
.. ..$ package : chr "plotly"
.. ..$ all_files : logi FALSE
.. ..- attr(*, "class")= chr "html_dependency"
$ elementId : NULL
$ preRenderHook:function (p, registerFrames = TRUE)
$ jsHooks : list()
- attr(*, "class")= chr [1:2] "plotly" "htmlwidget"
- attr(*, "package")= chr "plotly"
You could extract the coordinates with:
unlist(fig$x$attrs)
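If you want individual values rather than one flattened vector, you can walk the nested list by name. The sketch below uses a plain list, mock_fig, that only mimics the nesting shown in the str() output above so that it runs without plotly; the assumption is that the real fig can be indexed the same way. Looking the trace id up through cur_data avoids hard-coding the random string.

```r
# Mock with the same nesting as the str(fig) output above (assumption: the
# real plotly object can be indexed identically)
mock_fig <- list(x = list(
  cur_data = "a3b8795a4",
  attrs = list(a3b8795a4 = list(
    node = list(label = c("A1", "A2", "B1", "B2", "C1", "C2")),
    link = list(source = c(0, 1, 0, 2, 3, 3),
                target = c(2, 3, 3, 4, 4, 5),
                value  = c(8, 4, 2, 8, 4, 2)),
    type = "sankey"))))

# Look up the trace by the id stored in cur_data instead of typing it by hand
trace_id <- mock_fig$x$cur_data
trace <- mock_fig$x$attrs[[trace_id]]

# The link attributes are equal-length vectors, so they fit in a data frame
links <- as.data.frame(trace$link)

# source/target are 0-based node indices; R vectors are 1-based
links$source_label <- trace$node$label[links$source + 1]
links$target_label <- trace$node$label[links$target + 1]
links
```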

R: Loaded tweets structure is untidy when str()

Unlike on my colleague's computer, when I load the tweets in R and inspect the structure with str(), the data appears in a messy way with a lot of dots rather than organized as a table, even though our code is identical. I can't work out what the problem is; we have the same packages installed and the same R version.
library(rtweet)
library(ggplot2)
library(dplyr)
library(tibble)
library(tidytext)
library(stringr)
library(stringi)
library(igraph)
library(ggraph)
library(readr)
library(lubridate)
library(zoo)
appname <- ""
key <- ""
secret <- ""
twitter_token <- create_token( app = "", consumer_key = "", consumer_secret = "", access_token = "", access_secret = "")
tweets <- search_tweets(q = "#water + #climatechange", n = 10000, lang = "en", include_rts = FALSE)
str(tweets)
.. ..$ media :'data.frame': 1 obs. of 11 variables:
.. .. ..$ id : num 1.57e+18
.. .. ..$ id_str : chr "1573815153484759040"
.. .. ..$ indices :List of 1
.. .. .. ..$ :'data.frame': 1 obs. of 2 variables:
.. .. .. .. ..$ start: int 241
.. .. .. .. ..$ end : int 264
.. .. .. ..- attr(*, "class")= chr "AsIs"
.. .. ..$ media_url : chr "http://pbs.twimg.com/media/FddQiy2WAAAl59Q.jpg"
.. .. ..$ media_url_https: chr "https://pbs.twimg.com/media/FddQiy2WAAAl59Q.jpg"
.. .. ..$ url : chr "https
.. .. ..$ display_url : chr "pic.twitter.com/iFJTkF1S9S"
.. .. ..$ expanded_url : chr "https://twitter.com/TreeBanker/status/1573815156768968706/photo/1"
.. .. ..$ type : chr "photo"
.. .. ..$ sizes :List of 1
.. .. .. ..$ :'data.frame': 4 obs. of 4 variables:
.. .. .. .. ..$ w : int [1:4] 1096 680 150 1096
.. .. .. .. ..$ h : int [1:4] 733 455 150 733
.. .. .. .. ..$ resize: chr [1:4] "fit" "fit" "crop" "fit"
.. .. .. .. ..$ type : chr [1:4] "large" "small" "thumb" "medium"
.. .. ..$ ext_alt_text : logi NA
..$ :List of 5
.. ..$ media :'data.frame': 1 obs. of 11 variables:
.. .. ..$ id : num 1.57e+18
.. .. ..$ id_str : chr "1573815153484759040"
.. .. ..$ indices :List of 1
.. .. .. ..$ :'data.frame': 1 obs. of 2 variables:

Get data to be usable

I have been trying to get the data from this link into a usable form:
url <- "https://www.sec.gov/Archives/edgar/data/1061165/0001567619-21-010580.txt"
It should contain the same information as this link:
https://www.sec.gov/Archives/edgar/data/1061165/000156761921010580/xslForm13F_X01/form13fInfoTable.xml
I have been able to download the file as a .txt, but I cannot extract the data.
Thanks

The file appears to contain two embedded XML documents. We can extract each of them into a list with this code:
txt <- readLines("https://www.sec.gov/Archives/edgar/data/1061165/0001567619-21-010580.txt")
grep("</?XML>", txt)
# [1] 46 101 109 719
txt[grep("</?XML>", txt)]
# [1] "<XML>" "</XML>" "<XML>" "</XML>"
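As a side note, the line positions do not have to be read off by eye. Here is a minimal base-R sketch that pairs each opening tag with its closing tag and collects the lines in between. The txt vector below is a toy stand-in for the result of readLines(), and the anchored patterns assume the tags sit alone on their lines, as the output above suggests.

```r
# Toy stand-in for readLines(url); the real file is much longer
txt <- c("-----BEGIN-----", "<XML>", "<a>", "<b>1</b>", "</a>", "</XML>",
         "<DOCUMENT>", "<XML>", "<c>2</c>", "</XML>")

starts <- grep("^<XML>$", txt)
ends   <- grep("^</XML>$", txt)

# Sanity check: every opening tag has a closing tag after it
stopifnot(length(starts) == length(ends), all(starts < ends))

# Collapse the lines strictly between each <XML> ... </XML> pair
blocks <- Map(function(s, e) {
  paste(txt[seq(s + 1, length.out = e - s - 1)], collapse = "")
}, starts, ends)

blocks
```

Each element of blocks is then a single string that can be handed to an XML parser.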
A brief inspection of the file is what prompted that grep. Its output suggests that one XML document starts and stops, and then a second one starts and stops. Staying within those boundaries, we can extract most of the data with
library(xml2)
first <- as_list(read_xml(paste(txt[47:100], collapse = "")))
str(first)
# List of 1
# $ edgarSubmission:List of 2
# ..$ headerData:List of 2
# .. ..$ submissionType:List of 1
# .. .. ..$ : chr "13F-HR"
# .. ..$ filerInfo :List of 4
# .. .. ..$ liveTestFlag :List of 1
# .. .. .. ..$ : chr "LIVE"
# .. .. ..$ flags :List of 3
# .. .. .. ..$ confirmingCopyFlag :List of 1
# .. .. .. .. ..$ : chr "false"
# .. .. .. ..$ returnCopyFlag :List of 1
# .. .. .. .. ..$ : chr "true"
# .. .. .. ..$ overrideInternetFlag:List of 1
# .. .. .. .. ..$ : chr "false"
# .. .. ..$ filer :List of 1
# .. .. .. ..$ credentials:List of 2
# .. .. .. .. ..$ cik:List of 1
# .. .. .. .. .. ..$ : chr "0001061165"
# .. .. .. .. ..$ ccc:List of 1
# .. .. .. .. .. ..$ : chr "XXXXXXXX"
# .. .. ..$ periodOfReport:List of 1
# .. .. .. ..$ : chr "03-31-2021"
# ..$ formData :List of 3
and the second batch:
second <- as_list(read_xml(paste(txt[110:718], collapse = "")))
str(second)
# List of 1
# $ informationTable:List of 38
# ..$ infoTable:List of 7
# .. ..$ nameOfIssuer :List of 1
# .. .. ..$ : chr "ADOBE SYSTEMS INCORPORATED"
# .. ..$ titleOfClass :List of 1
# .. .. ..$ : chr "COM"
# .. ..$ cusip :List of 1
# .. .. ..$ : chr "00724F101"
# .. ..$ value :List of 1
# .. .. ..$ : chr "1246613"
# .. ..$ shrsOrPrnAmt :List of 2
# .. .. ..$ sshPrnamt :List of 1
# .. .. .. ..$ : chr "2622406"
# .. .. ..$ sshPrnamtType:List of 1
# .. .. .. ..$ : chr "SH"
# .. ..$ investmentDiscretion:List of 1
# .. .. ..$ : chr "SOLE"
# .. ..$ votingAuthority :List of 3
# .. .. ..$ Sole :List of 1
# .. .. .. ..$ : chr "2622406"
# .. .. ..$ Shared:List of 1
# .. .. .. ..$ : chr "0"
# .. .. ..$ None :List of 1
# .. .. .. ..$ : chr "0"
# ..$ infoTable:List of 7
I'm not certain offhand how to extract the front matter; I hope this is a good enough start.
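To get from the nested list to something tabular, one option is to flatten each infoTable entry with unlist() and bind the results. This is only a sketch: the second object below is a small mock shaped like the str(second) output above (the "EXAMPLE CORP" entry is made up), and it assumes every entry has the same set of fields. Entries with optional extra fields would need their columns aligned first.

```r
# Hypothetical helper that builds one mock entry with a few of the real fields
entry <- function(name, value, shares) list(
  nameOfIssuer = list(name),
  value        = list(value),
  shrsOrPrnAmt = list(sshPrnamt = list(shares), sshPrnamtType = list("SH"))
)

# Mock of as_list(read_xml(...)); the second entry is invented for illustration
second <- list(informationTable = list(
  infoTable = entry("ADOBE SYSTEMS INCORPORATED", "1246613", "2622406"),
  infoTable = entry("EXAMPLE CORP", "100", "5")
))

# unlist() flattens one entry into a named character vector, one name per leaf
rows <- lapply(second$informationTable, function(e) {
  as.data.frame(as.list(unlist(e)), stringsAsFactors = FALSE)
})

# Drop the repeated "infoTable" names before binding the one-row data frames
holdings <- do.call(rbind, unname(rows))
rownames(holdings) <- NULL

# Everything arrives as text, so convert numeric columns explicitly
holdings$value <- as.numeric(holdings$value)
holdings
```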

Using rvest to login into webpage with pop up sign in

I would like to log in to a webpage that uses a pop-up sign-in window. This article logs into Stack Overflow, a webpage that has a visible login form. How can I use rvest to log in to websites that don't have visible login forms? For example, the Washington Post's website has a sign-in box at the top right of the page. Once clicked, a form appears where you can sign in.
library(rvest)
url <- 'https://www.rotary.org/myrotary/en'
url2 <- 'https://stackoverflow.com/users/login?ssrc=head&returnurl=http%3a%2f%2fstackoverflow.com%2f'
url3 <- 'https://www.washingtonpost.com/?noredirect=on'
If I get the structure of the forms on StackOverflow's login page,
pg_session <- html_session(url2)
html_form(pg_session) %>% str
List of 2
$ :List of 5
..$ name : chr "search"
..$ method : chr "GET"
..$ url : chr "/search"
..$ enctype: chr "form"
..$ fields :List of 2
.. ..$ q :List of 7
.. .. ..$ name : chr "q"
.. .. ..$ type : chr "text"
.. .. ..$ value : chr ""
.. .. ..$ checked : NULL
.. .. ..$ disabled: NULL
.. .. ..$ readonly: NULL
.. .. ..$ required: logi FALSE
.. .. ..- attr(*, "class")= chr "input"
.. ..$ <unnamed>:List of 7
.. .. ..$ name : chr "<unnamed>"
.. .. ..$ type : chr "submit"
.. .. ..$ value : NULL
.. .. ..$ checked : NULL
.. .. ..$ disabled: NULL
.. .. ..$ readonly: NULL
.. .. ..$ required: logi FALSE
.. .. ..- attr(*, "class")= chr "button"
.. ..- attr(*, "class")= chr "fields"
..- attr(*, "class")= chr "form"
$ :List of 5
..$ name : chr "login-form"
..$ method : chr "POST"
..$ url : chr "/users/login?ssrc=head&returnurl=http%3a%2f%2fstackoverflow.com%2f"
..$ enctype: chr "form"
..$ fields :List of 7
.. ..$ fkey :List of 7
.. .. ..$ name : chr "fkey"
.. .. ..$ type : chr "hidden"
.. .. ..$ value : chr "d5f8c65b7d92b368b4b58e43e59fd9d82cb4436bac4a6d430771d50b85e771aa"
.. .. ..$ checked : NULL
.. .. ..$ disabled: NULL
.. .. ..$ readonly: NULL
.. .. ..$ required: logi FALSE
.. .. ..- attr(*, "class")= chr "input"
.. ..$ ssrc :List of 7
.. .. ..$ name : chr "ssrc"
.. .. ..$ type : chr "hidden"
.. .. ..$ value : chr "head"
.. .. ..$ checked : NULL
.. .. ..$ disabled: NULL
.. .. ..$ readonly: NULL
.. .. ..$ required: logi FALSE
.. .. ..- attr(*, "class")= chr "input"
.. ..$ email :List of 7
.. .. ..$ name : chr "email"
.. .. ..$ type : chr "email"
.. .. ..$ value : NULL
.. .. ..$ checked : NULL
.. .. ..$ disabled: NULL
.. .. ..$ readonly: NULL
.. .. ..$ required: logi FALSE
.. .. ..- attr(*, "class")= chr "input"
.. ..$ password :List of 7
.. .. ..$ name : chr "password"
.. .. ..$ type : chr "password"
.. .. ..$ value : NULL
.. .. ..$ checked : NULL
.. .. ..$ disabled: NULL
.. .. ..$ readonly: NULL
.. .. ..$ required: logi FALSE
.. .. ..- attr(*, "class")= chr "input"
.. ..$ submit-button:List of 7
.. .. ..$ name : chr "submit-button"
.. .. ..$ type : NULL
.. .. ..$ value : NULL
.. .. ..$ checked : NULL
.. .. ..$ disabled: NULL
.. .. ..$ readonly: NULL
.. .. ..$ required: logi FALSE
.. .. ..- attr(*, "class")= chr "button"
.. ..$ oauth_version:List of 7
.. .. ..$ name : chr "oauth_version"
.. .. ..$ type : chr "hidden"
.. .. ..$ value : NULL
.. .. ..$ checked : NULL
.. .. ..$ disabled: NULL
.. .. ..$ readonly: NULL
.. .. ..$ required: logi FALSE
.. .. ..- attr(*, "class")= chr "input"
.. ..$ oauth_server :List of 7
.. .. ..$ name : chr "oauth_server"
.. .. ..$ type : chr "hidden"
.. .. ..$ value : NULL
.. .. ..$ checked : NULL
.. .. ..$ disabled: NULL
.. .. ..$ readonly: NULL
.. .. ..$ required: logi FALSE
.. .. ..- attr(*, "class")= chr "input"
.. ..- attr(*, "class")= chr "fields"
..- attr(*, "class")= chr "form"
then I can clearly locate where to fill in my email and password. However, I can't find the equivalent fields in the structure of the forms on the Washington Post's home page, which makes it difficult to call the form I need.
pg_session <- html_session(url3)
html_form(pg_session) %>% str
List of 1
$ :List of 5
..$ name : chr "search-form"
..$ method : chr "GET"
..$ url : chr "//www.washingtonpost.com/newssearch/"
..$ enctype: chr "form"
..$ fields :List of 1
.. ..$ query:List of 7
.. .. ..$ name : chr "query"
.. .. ..$ type : chr "text"
.. .. ..$ value : NULL
.. .. ..$ checked : NULL
.. .. ..$ disabled: NULL
.. .. ..$ readonly: NULL
.. .. ..$ required: logi FALSE
.. .. ..- attr(*, "class")= chr "input"
.. ..- attr(*, "class")= chr "fields"
..- attr(*, "class")= chr "form"
My particular case is to log in to this site; however, the Washington Post's pop-up login seems similar enough that the same procedure should apply. How can I call these pop-up logins?
*I am not too familiar with HTML, so if there are any better terms to use or ways to phrase it, feel free to correct me.

How to find out which index is out of bounds in object in R

Although I understand OOP, I've only just encountered it in R.
I am using a package from Bioconductor to churn through some genomic data.
The object it creates is called readCounts, and typing its name at the console gives the following.
QDNAseqReadCounts (storageMode: lockedEnvironment)
assayData: 206391 features, 1 samples
element names: counts
protocolData: none
phenoData
sampleNames: SLX-10457.FastSeqA.BloodDMets_11AF_-AHMMH.s_1.r_1.fq.gz
varLabels: name total.reads used.reads expected.variance
varMetadata: labelDescription
featureData
featureNames: 1:825001-840000 1:840001-855000 ... 22:51165001-51180000 (168063 total)
fvarLabels: chromosome start ... use (9 total)
fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:
I am trying to plot readcounts on a simple xy graph as follows:
plot(readCounts, logTransform=TRUE, ylim=c(-1000, binSize * 15))
However when I do so I get the following error:
Error in sort.int(x, partial = unique(c(lo, hi))) :
index 180 outside bounds
with the traceback() showing:
6: sort.int(x, partial = unique(c(lo, hi)))
5: FUN(newX[, i], ...)
4: apply(copynumber, 2, sdFUN, na.rm = TRUE)
3: .local(x, y, ...)
2: plot(readCounts, logTransform = TRUE, ylim = c(-1000, binSize *
15))
1: plot(readCounts, logTransform = TRUE, ylim = c(-1000, binSize *
15))
Having googled the error, I thought it might be a missing-values problem, so I tried na.omit(readCounts), but I got the same error again, this time reporting index 207 as out of bounds.
I have tried to inspect the data but I can't find anything wrong at row 207, although I'm not sure which slot this refers to. I really don't know how to debug this. I'm happy to give more information about what I'm trying to do, but I don't know how to determine what the problem is from this error in an R object.
When I do str(readCounts) I get:
Formal class 'QDNAseqReadCounts' [package "QDNAseq"] with 7 slots
..@ assayData :<environment: 0x13a99ed90>
..@ phenoData :Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
.. .. ..@ varMetadata :'data.frame': 4 obs. of 1 variable:
.. .. .. ..$ labelDescription: chr [1:4] NA NA NA NA
.. .. ..@ data :'data.frame': 1 obs. of 4 variables:
.. .. .. ..$ name : chr "SLX-10457.FastSeqA.BloodDMets_11AF_-AHMMH.s_1.r_1.fq.gz"
.. .. .. ..$ total.reads : num 0
.. .. .. ..$ used.reads : num 0
.. .. .. ..$ expected.variance: num Inf
.. .. ..@ dimLabels : chr [1:2] "sampleNames" "sampleColumns"
.. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. ..@ .Data:List of 1
.. .. .. .. .. ..$ : int [1:3] 1 1 0
..@ featureData :Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
.. .. ..@ varMetadata :'data.frame': 9 obs. of 1 variable:
.. .. .. ..$ labelDescription: chr [1:9] "Chromosome name" "Base pair start position" "Base pair end position" "Percentage of non-N nucleotides (of full bin size)" ...
.. .. ..@ data :'data.frame': 168063 obs. of 9 variables:
.. .. .. ..$ chromosome : chr [1:168063] "1" "1" "1" "1" ...
.. .. .. ..$ start : num [1:168063] 825001 840001 855001 870001 885001 ...
.. .. .. ..$ end : num [1:168063] 840000 855000 870000 885000 900000 915000 930000 945000 960000 975000 ...
.. .. .. ..$ bases : num [1:168063] 100 100 100 100 100 100 100 100 100 100 ...
.. .. .. ..$ gc : num [1:168063] 48 61.8 65.1 65.5 62.6 ...
.. .. .. ..$ mappability: num [1:168063] 58.6 91.5 94.1 93.2 93.9 ...
.. .. .. ..$ blacklist : num [1:168063] 0.727 0 0 0 0 ...
.. .. .. ..$ residual : num [1:168063] -0.0627 0.05036 0.09384 0.00541 -0.00588 ...
.. .. .. ..$ use : logi [1:168063] TRUE TRUE TRUE TRUE TRUE TRUE ...
.. .. .. ..- attr(*, "na.action")=Class 'omit' Named int [1:38328] 1 2 3 4 5 6 7 8 9 10 ...
.. .. .. .. .. ..- attr(*, "names")= chr [1:38328] "1:1-15000" "1:15001-30000" "1:30001-45000" "1:45001-60000" ...
.. .. ..@ dimLabels : chr [1:2] "featureNames" "featureColumns"
.. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. ..@ .Data:List of 1
.. .. .. .. .. ..$ : int [1:3] 1 1 0
..@ experimentData :Formal class 'MIAME' [package "Biobase"] with 13 slots
.. .. ..@ name : chr ""
.. .. ..@ lab : chr ""
.. .. ..@ contact : chr ""
.. .. ..@ title : chr ""
.. .. ..@ abstract : chr ""
.. .. ..@ url : chr ""
.. .. ..@ pubMedIds : chr ""
.. .. ..@ samples : list()
.. .. ..@ hybridizations : list()
.. .. ..@ normControls : list()
.. .. ..@ preprocessing : list()
.. .. ..@ other : list()
.. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. ..@ .Data:List of 2
.. .. .. .. .. ..$ : int [1:3] 1 0 0
.. .. .. .. .. ..$ : int [1:3] 1 1 0
..@ annotation : chr(0)
..@ protocolData :Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
.. .. ..@ varMetadata :'data.frame': 0 obs. of 1 variable:
.. .. .. ..$ labelDescription: chr(0)
.. .. ..@ data :'data.frame': 1 obs. of 0 variables
.. .. ..@ dimLabels : chr [1:2] "sampleNames" "sampleColumns"
.. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. ..@ .Data:List of 1
.. .. .. .. .. ..$ : int [1:3] 1 1 0
..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. ..@ .Data:List of 4
.. .. .. ..$ : int [1:3] 3 1 2
.. .. .. ..$ : int [1:3] 2 26 0
.. .. .. ..$ : int [1:3] 1 3 0
.. .. .. ..$ : int [1:3] 1 2 4
