None of the keys entered are valid keys (R)

I am trying to learn how to work with microarrays for differential expression analysis. While trying to add some annotation, I cannot find the keytype that matches my keys:
select(hugene10sttranscriptcluster.db,
keys = my_keys,
columns = c("GENENAME", "SYMBOL"),
keytype = "PROBEID")
-------------------------------------------------------
Error in .testForValidKeys(x, keys, keytype, fks) :
None of the keys entered are valid keys for 'PROBEID'. Please use the keys method to see a listing of valid arguments.
The keys are:
my_keys
---------------------------------------------------------------------
[1] "16650045" "16650047" "16650049" "16650051" "16650053" "16650055" "16650057" "16650059"
I tried every keytype listed by keytypes(hugene10sttranscriptcluster.db) without success, for example:
"16650045" %in% keys(hugene10sttranscriptcluster.db, "GENEID")
------------------------------------------------------------------
[1] FALSE
Is there any documentation or alternative where I can find it? I have been looking through the ArrayExpress documentation, but it did not help. I am also unsure: is it possible that I need a different package than hugene10sttranscriptcluster.db?

Indeed, the problem was the package. If anyone runs into the same issue, look up the annotation of the microarray in the documentation (pd.hugene.2.0.st in my case), then install and use the matching package (hugene20sttranscriptcluster.db).


Use pre-trained model vocabulary in an appropriate way with allennlp

When using a Hugging Face pre-trained model, I pass a tokenizer and an indexer for my TextField in the DatasetReader, and I also want to use the same tokenizer and indexer in my model. What is the appropriate way to do this in AllenNLP? (Using the config file?)
Here is my code; I think this is a bad solution. Please give me some suggestions.
In my DatasetReader:
self._tokenizer = PretrainedTransformerTokenizer("microsoft/DialoGPT-small",tokenizer_kwargs={'cls_token': '[CLS]',
'sep_token': '[SEP]',
'bos_token':'[BOS]'})
self._tokenindexer = {"tokens": PretrainedTransformerIndexer("microsoft/DialoGPT-small",
tokenizer_kwargs={'cls_token': '[CLS]',
'sep_token': '[SEP]',
'bos_token':'[BOS]'})}
In my Model:
self.tokenizer = GPT2Tokenizer.from_pretrained("microsoft/DialoGPT-small")
num_added_tokens = self.tokenizer.add_special_tokens({'bos_token':'[BOS]','sep_token': '[SEP]','cls_token':'[CLS]'})
self.emb_dim = len(self.tokenizer)
self.embeded_layer = self.encoder.resize_token_embeddings(self.emb_dim)
I have created two tokenizers, one for the DatasetReader and one for the model, and both have the same vocabulary and special tokens. But when I add the three special tokens in the same order, they end up with different indices. So I switched the order in the model's code to get the same indices (stupid but effective).
Is there a way to pass the tokenizer or vocab from the DatasetReader to the Model?
What is the appropriate way to solve this problem in AllenNLP?
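To make the index mismatch concrete, here is a toy sketch in plain Python. It is not the Hugging Face or AllenNLP API; `add_special_tokens` and the two-word base vocabulary are hypothetical. It assumes only that each new token receives the next free index in the order it is processed. Under that assumption, if the two code paths process the same three tokens in a different internal order (not verified here), the indices diverge:

```python
def add_special_tokens(vocab, tokens):
    """Toy model: give each unseen token the next free index, in order."""
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab

# Hypothetical base vocabulary shared by both sides.
base = {"hello": 0, "world": 1}

# Same three tokens, processed in two different orders.
reader_vocab = add_special_tokens(dict(base), ["[CLS]", "[SEP]", "[BOS]"])
model_vocab = add_special_tokens(dict(base), ["[BOS]", "[SEP]", "[CLS]"])

print(reader_vocab["[CLS]"], model_vocab["[CLS]"])  # 2 4
```

This is why reordering the tokens on one side can make the indices line up again; building the tokenizer once and reusing the same object would avoid the dependence on order.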

{getPost() does not retrieve reactions' component} & {"reactions" and "likes" with the same logical value return neither error nor warning msg}

[Win 10; R 3.4.3; RStudio 1.1.383; Rfacebook 0.6.15]
Hi!
I would like to ask two questions about Rfacebook's getPost function:
1. Even though I have tried all possible combinations of the logical values for the arguments "comments", "reactions" and "likes", the best result I have got so far is a list of 3 components for each post ("post", "comments", and "likes"), that is, without the "reactions" component. Yet according to the rdocumentation page for getPost, "getPost returns a list with up to four components: post, likes, comments, and reactions".
2. Besides the (somewhat strange) fact that, according to the same documentation, the argument "reactions" should be FALSE (the default) in order to retrieve information on the total reactions to the post(s), I noticed a seemingly odd result: if I set "reactions" and "likes" to the same value, both TRUE or both FALSE, R returns neither an error nor a warning message. I find this odd because, in the function's own definition, likes = !reactions.
Here is the code:
#packageVersion("Rfacebook")
#[1] ‘0.6.15’
## temporary access token
fb_oauth <- "user access token"
qtd <- 5000
#pag_loop$id[1]
#[1] "242862559586_10156144461009587"
# arguments with default value (reactions = F, likes = T, comments = T)
x <- getPost(pag_loop$id[1], token = fb_oauth, n = qtd)
str(x)
# retrieves a list of 3: posts, likes, comments
Can someone please explain to me why I don't get the reactions component?
Best,
Luana
This is caused by the new version of the Facebook API. It worked fine up to v2.10; from v2.11 onward it no longer works properly.
I also cannot capture the reactions, and the user's name is null. I am on Windows 10 with R 3.4.2. Could it be the R version? If you manage to resolve this issue, please send me the answer by email.

R: passing a variable to library, ls and ?

I am trying to pass a variable to install.packages, library, ls and ?
Passing a variable to install.packages works fine, but I get an error for the others.
name1 <- as.character("dplyr")
install.packages(name1)
library(name1)
ls(name1)
?name1
I would be grateful for your help.
One of the three issues, library(name1), can be resolved with the option character.only = TRUE:
library(name1, character.only = TRUE)
To list all the objects in the library with the name stored in name1, try
ls(paste0("package:",name1))
or
ls(getNamespace(name1))
(see here for a discussion on the difference between these two commands, including further options to show hidden objects).
Concerning the third point, ?, I have no solution to offer other than using help(name1) instead, as also suggested by @PierreLafortune.

Creating graph in titan from data in csv - example wiki.Vote gives error

I am new to Titan. I installed Titan and successfully ran the GraphOfTheGods example, including the queries given. Next I tried bulk loading a CSV file to create a graph, following the steps in Powers of Ten - Part I: http://thinkaurelius.com/2014/05/29/powers-of-ten-part-i/
I am getting an error when loading wiki-Vote.txt:
gremlin> g = TitanFactory.open("/tmp/1m")
Backend shorthand unknown: /tmp/1m
I tried:
g = TitanFactory.open('conf/titan-berkeleydb-es.properties')
but I get an error at the next step in load-1m.groovy:
==>titangraph[berkeleyje:/titan-0.5.4-hadoop2/conf/../db/berkeley]
No signature of method: groovy.lang.MissingMethodException.makeKey() is applicable for argument types: () values: [] Possible solutions: every(), any()
Any hints on what to do next? I am using Groovy for the first time. What kind of Groovy expertise is needed for working with Gremlin?
That blog post is meant for Titan 0.4.x. The API shifted when Titan went to 0.5.x. The same principles discussed in the posts generally apply to data loading but the syntax is different in places. The intention is to update those posts in some form when Titan 1.0 comes out with full support of TinkerPop3. Until then, you will need to convert those code examples to the revised API.
For example, an easy way to create a berkeleydb database is with:
g = TitanFactory.build()
.set("storage.backend", "berkeleyje")
.set("storage.directory", "/tmp/1m")
.open();
Please see the docs here. Then most of the schema creation code (which is the biggest change) is now described here and here.
After much experimenting today, I finally figured it out. A lot of changes were needed:
Use makePropertyKey() instead of makeKey(), and makeEdgeLabel() instead of makeLabel()
Use cardinality(Cardinality.SINGLE) instead of unique()
Building the index is quite a bit more complicated. Use the management system instead of the graph both to make the keys and labels, as well as build the index (see https://groups.google.com/forum/#!topic/aureliusgraphs/lGA3Ye4RI5E)
For posterity, here's the modified script that should work (as of 0.5.4):
g = TitanFactory.build().set("storage.backend", "berkeleyje").set("storage.directory", "/tmp/1m").open()
m = g.getManagementSystem()
k = m.makePropertyKey('userId').dataType(String.class).cardinality(Cardinality.SINGLE).make()
m.buildIndex('byId', Vertex.class).addKey(k).buildCompositeIndex()
m.makeEdgeLabel('votesFor').make()
m.commit()
getOrCreate = { id ->
    def p = g.V('userId', id)
    if (p.hasNext()) {
        p.next()
    } else {
        g.addVertex([userId:id])
    }
}
new File('wiki-Vote.txt').eachLine {
    if (!it.startsWith("#")) {
        (fromVertex, toVertex) = it.split('\t').collect(getOrCreate)
        fromVertex.addEdge('votesFor', toVertex)
    }
}
g.commit()

RCurl - Boolean Options

These Curl docs: http://curl.haxx.se/docs/manpage.html#-d list many boolean options.
How do I specify these options in a postForm call in RCurl? For example, how do I specify the --sslv3 flag?
I tried
postForm(url, .opts = list(sslv3=TRUE))
but received this warning:
Warning message:
In mapCurlOptNames(names(.els), asNames = TRUE) :
Unrecognized CURL options: sslv3
Thanks in advance.
SOLUTION
Through some trial and error, I found that this works:
options(RCurlOptions = list(sslversion=3))
postForm(url)
If anyone could clarify how to translate the Curl options to the RCurl options, it would be appreciated!
"Curl" can refer to a few different things (see http://daniel.haxx.se/docs/curl-vs-libcurl.html). The problem here is that you are looking at what the curl command-line tool does, when what you actually want to know is how the libcurl library implements it.
RCurl uses the libcurl library, which is accessed through an API. The "symbols" used in the API are listed at http://curl.haxx.se/libcurl/c/symbols-in-versions.html. We can compare them to the options listed by RCurl:
library(RCurl)
cInfo <- getURL("http://curl.haxx.se/libcurl/c/symbols-in-versions.html")
cInfo <- unlist(strsplit(cInfo, "\n"))
cInfo <- cInfo[grep("CURLOPT_", cInfo)]
cInfo <- gsub("([^[\\s]]*)\\s.*", "\\1", cInfo)
cInfo <- gsub("CURLOPT_", "", cInfo)
cInfo <- tolower(gsub("_", ".", cInfo))
listCurlOptions()[!listCurlOptions()%in%cInfo]
From the above we can see that all RCurl options are derived from libcurl API symbols: the CURLOPT_ prefix is removed, each _ is replaced by ., and the letters are converted to lower case.
The question then arises as to what types the symbols represent. I usually just look at the PHP documentation to find out. http://php.net/manual/en/function.curl-setopt.php lists
CURLOPT_SSLVERSION The SSL version (2 or 3) to use. By default PHP will try to determine this itself, although in some cases this must be set manually.
as an integer type, expecting the value 2 or 3.
Alternatively you can look at the curl_easy_setopt manual page http://curl.haxx.se/libcurl/c/curl_easy_setopt.html.
CURLOPT_SSLVERSION
Pass a long as parameter to control what version of SSL/TLS to attempt to use. The available options are:
CURL_SSLVERSION_DEFAULT
The default action. This will attempt to figure out the remote SSL protocol version, i.e. either SSLv3 or TLSv1 (but not SSLv2, which became disabled by default with 7.18.1).
CURL_SSLVERSION_TLSv1
Force TLSv1
CURL_SSLVERSION_SSLv2
Force SSLv2
CURL_SSLVERSION_SSLv3
Force SSLv3
It says we would need to pass a long with value CURL_SSLVERSION_SSLv3 to stipulate sslv3.
What is the value of CURL_SSLVERSION_SSLv3? We can examine RCurl:::SSLVERSION_SSLv3
> c(RCurl:::SSLVERSION_DEFAULT, RCurl:::SSLVERSION_TLSv1, RCurl:::SSLVERSION_SSLv2, RCurl:::SSLVERSION_SSLv3)
[1] 0 1 2 3
>
So in fact the permissible values for sslversion are 0, 1, 2 or 3.
The confusion in this case arose because the curl command-line program, which presumably uses the libcurl API underneath, exposes this setting as a boolean flag.
So the correct way in this case to use this option would be:
postForm(url, .opts = list(sslversion = 3))
or
postForm(url, .opts = list(sslv = 3))
You can use the shorter sslv because .opts is passed to mapCurlOptNames, which uses pmatch to find sslversion.
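As a rough illustration of that kind of partial matching, here is a small Python sketch. It is not RCurl's code; `partial_match` is a hypothetical helper that only mimics the spirit of R's pmatch (an exact match wins, otherwise a prefix must identify exactly one option), and the option list is a small illustrative subset:

```python
def partial_match(name, options):
    """Return the exact match, else the single option starting with `name`, else None."""
    if name in options:
        return name
    candidates = [opt for opt in options if opt.startswith(name)]
    return candidates[0] if len(candidates) == 1 else None

# Small illustrative subset of option names, not the full list.
options = ["ssl.verifypeer", "ssl.verifyhost", "sslversion", "sslcert"]

print(partial_match("sslv", options))  # sslversion
print(partial_match("ssl", options))   # None, because the prefix is ambiguous
```

So a prefix works only when it picks out a single option; spelling out sslversion in full is the safer choice.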
To be fair to the author of RCurl, this is all explained in http://www.omegahat.org/RCurl/philosophy.html, which is also located in /RCurl/inst/doc/philosophy.html. An excerpt reads:
Each of these and what it controls is described in the libcurl
man(ual) page for curl_easy_setopt and that is the authoritative
documentation. Anything we provide here is merely repetition or
additional explanation.
The names of the options require a slight explanation. These
correspond to symbolic names in the C code of libcurl. For example,
the option url in R corresponds to CURLOPT_URL in C. Firstly,
uppercase letters are annoying to type and read, so we have mapped
them to lower case letters in R. We have also removed the prefix
"CURLOPT_" since we know the context in which they option names are
being used. And lastly, any option names that have a _ (after we have
removed the CURLOPT_ prefix) are changed to replace the '_' with a '.'
so we can type them in R without having to quote them. For example,
combining these three rules, "CURLOPT_URL" becomes url and
CURLOPT_NETRC_FILE becomes netrc.file. That is the mapping scheme.
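The three rules in the excerpt can be written as a short function. This is a Python sketch for illustration only, not part of RCurl; `curlopt_to_rcurl` is a hypothetical name:

```python
def curlopt_to_rcurl(symbol: str) -> str:
    """Map a libcurl symbol such as CURLOPT_NETRC_FILE to the RCurl-style option name."""
    prefix = "CURLOPT_"
    if not symbol.startswith(prefix):
        raise ValueError(f"not a CURLOPT_ symbol: {symbol}")
    # Drop the prefix, replace '_' with '.', and lower-case the result.
    return symbol[len(prefix):].replace("_", ".").lower()

print(curlopt_to_rcurl("CURLOPT_URL"))         # url
print(curlopt_to_rcurl("CURLOPT_NETRC_FILE"))  # netrc.file
print(curlopt_to_rcurl("CURLOPT_SSLVERSION"))  # sslversion
```

This also shows why sslv3 is not recognized: no libcurl symbol maps to that name, since the setting lives under CURLOPT_SSLVERSION.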
Try this (after reviewing the examples on ?curlOptions, to which ?postForm refers):
myOpts = curlOptions(sslv3 = TRUE)
postForm(url, .opts = myOpts)
I admit, though, that I thought your code should work. You may also need to post your version numbers. There is also a curlSetOpt that might be more "assertive".
