Fill column in Julia - r

I am trying to fill a column in Julia with values from another matrix. In R, it would look like this:
for(id in 1:y){
countries[id,1] <- x[id, countries1[id]]
}
However, when I try to convert the left side of the equal sign to Julia like so:
countries[:1]
I get an error which says:
"ERROR: MethodError: Cannot `convert` an object of type Int64 to an
object of type Array{Int64,2}
This may have arisen from a call to the constructor Array{Int64,2} .
(...), since type constructors fall back to convert methods."
I don't think my Julia conversion is correct to start with, since I am leaving off id. How can I convert the R code to Julia effectively?

It is not clear how the OP intends to convert the stated R code to Julia code. However, given that it involves countries[:1], we can make an educated guess about the error:
countries[:1] should be countries[:, 1].
countries[:, 1] returns the first column in the matrix countries.
countries[:1] resolves to countries[1] and returns the integer countries[1,1]. This is because a leading colon quotes whatever follows it: :<name> yields a symbol, but quoting a numeric literal yields the literal itself, so :1 is simply the integer 1.
The latter point explains the error message:
ERROR: MethodError: Cannot convert an object of type Int64 to an
object of type Array{Int64,2}
The OP expected countries[:1] to return an array (Array{Int64,2}), when in fact it returns an integer (Int64).
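For the loop itself, a minimal Julia sketch of the R code might look like the following. The values of x, countries1, and countries are made-up stand-ins, since the OP's data is not shown; only the indexing pattern is the point.

```julia
# Hypothetical stand-ins for the OP's data (not from the question).
x = [10 20 30;
     40 50 60;
     70 80 90]
countries1 = [2, 3, 1]          # column of x to pick for each row
y = size(x, 1)
countries = zeros(Int, y, 1)    # y-by-1 matrix to fill

# Julia equivalent of: countries[id,1] <- x[id, countries1[id]]
for id in 1:y
    countries[id, 1] = x[id, countries1[id]]
end

first_column = countries[:, 1]  # the whole first column, as a vector
colon_one = :1                  # quoting a literal gives the literal itself
single_entry = countries[:1]    # same as countries[1], a single Int
```

With the comma, countries[:, 1] selects a column; without it, countries[:1] is ordinary linear indexing with the integer 1.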

Related

Problems obtaining the correct object class. R

I created a small function to process a dataframe to be able to use the function:
preprocessCore::normalize.quantiles()
Since normalize.quantiles() only accepts a matrix object, and I need to rearrange my data, I created a small function that takes a specific column (variable) in a specific data frame and does the following:
normal<-function(boco,df){
df_p1<-subset(df,df$Plate==1)
df_p2<-subset(df,df$Plate==2)
mat<-cbind(df_p1$boco,df_p2$boco)
norm<-preprocessCore::normalize.quantiles(mat)
df_1<-data.frame(var_1=c(norm[,1],norm[,2]),well=c(df_p1$well,df_p2$well))
return(df_1)
}
However, "mat" should be a matrix, but it seems that cbind() does not do its job, since I get the following error:
normal(antitrombina_FI,Six_Plex_IID)
Error in preprocessCore::normalize.quantiles(mat) :
Matrix expected in normalize.quantiles
So, it is clear that the cbind() is not creating a matrix. I don't understand why this is happening.
Most likely you are binding two NULL objects together, yielding NULL, which is not a matrix. If your df objects are data.frame, then df_p1$boco is interpreted as "extract the variable named boco", not "extract the variable whose name is the value of an object having the symbol boco". I suspect that your data does not contain a variable literally named "boco", so df_p1$boco is evaluated as NULL.
If you want to extract the column that is given as the value to the formal argument boco in function normal() then you should use [[, not $:
normal<-function(boco,df){
df_p1<-subset(df,df$Plate==1)
df_p2<-subset(df,df$Plate==2)
mat<-cbind(df_p1[[boco]],df_p2[[boco]])
norm<-preprocessCore::normalize.quantiles(mat)
df_1<-data.frame(var_1=c(norm[,1],norm[,2]),well=c(df_p1$well,df_p2$well))
return(df_1)
}
Thanks for your help bcarlsen. However I have found some errors:
First, I believe you need to introduce quotes in
mat<-cbind(df_p1[["boco"]],df_p2[["boco"]])
If I run this script outside of a function, it works perfectly:
df_p1<-subset(Six_Plex_IID,Six_Plex_IID$Plate==1)
df_p2<-subset(Six_Plex_IID,Six_Plex_IID$Plate==2)
mat<-cbind(df_p1[["antitrombina_FI"]],df_p2[["antitrombina_FI"]])
norm<-preprocessCore::normalize.quantiles(mat)
However, if I now put this inside a function and try to run it as a function:
normal<-function(boco,df){
df_p1<-subset(df,df$Plate==1)
df_p2<-subset(df,df$Plate==2)
mat<-cbind(df_p1[["boco"]],df_p2[["boco"]])
norm<-preprocessCore::normalize.quantiles(mat)
df_1<-data.frame(var_1=c(norm[,1],norm[,2]),well=c(df_p1$well,df_p2$well))
return(df_1)
}
normal(antitrombina_FI,Six_Plex_IID)
I get the same error message:
Error in preprocessCore::normalize.quantiles(mat) :
Matrix expected in normalize.quantiles
I'm completely clueless about why this is happening: why do I get a matrix outside the function but not inside it?
Thanks

Julia: How to delete a column whose name starts with a number in a data frame

I have a data frame in which a column name starts with a number. I want to delete the column with the following code, but I get an error:
delete!(features, [:3SsnPorchH])
UndefVarError: SsnPorchH not defined
Your problem is that :3SsnPorchH is not correctly parsed as a symbol, but as follows:
julia> :(:3SsnPorchH)
:($(QuoteNode(3)) * SsnPorchH)
When a symbol cannot be correctly parsed, it most often works to put the "name" into parentheses:
julia> :(3SsnPorchH)
:(3SsnPorchH)
Another thing you could do is using the Symbol constructor directly:
julia> Symbol("3SsnPorchH")
Symbol("3SsnPorchH")
(But I'm not sure if that's a good idea -- maybe you lose interning then.)
That being said, it's probably a good idea to give columns a name which is a valid Julia identifier. This gives you construction using DataFrame with keyword arguments, and allows for certain macros to identify variables with columns. You'll just have an easier time.
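As a quick check of the points above, here is a small sketch that needs no data frame package. It assumes a Julia 1.x parser; the only name taken from the question is the column name 3SsnPorchH. It suggests that the Symbol constructor is the safer route, that symbols built this way appear to be interned like literal ones, and that the parenthesized form should be checked before relying on it, because here it parses as an implicit multiplication rather than a symbol.

```julia
# Build the symbol from a string; this works for any column name.
name = Symbol("3SsnPorchH")

# A symbol built from a string is identical (===) to the literal one.
same_object = Symbol("abc") === :abc

# Caveat: the parenthesized form parses as 3 * SsnPorchH, an expression.
parenthesized = :(3SsnPorchH)
is_expr = parenthesized isa Expr
```

The resulting name can then be passed wherever the data frame API expects a column symbol.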

Printing values inside for loop gives different results if you print by index or by value

I'm trying to understand why I get different results when I print the values of a vector, depending on whether I access them by index or directly by value:
ws <- c(as.Date('2016-01-01'))
for (w in ws) {
print(w)
}
# prints 16801
for (idx in 1:length(ws)) {
print(ws[idx])
}
# prints 2016-01-01
I'm not sure why the first time, when I access the value using in I get the wrong value.
How can I print the values directly accessing via in (not using indexes)? And also why is it happening?
Probably because ws is actually stored as a number; the "date" you see is just for printing purposes, and the data is tracked as days since a starting point (1970-01-01).
When you create a loop variable w in ws, R is probably coercing ws to a plain vector, so w has lost all "date" attributes, and all you get is the underlying number.
Indeed, in ?Control we see that for seq:
An expression evaluating to a vector (including a list and an
expression) or to a pairlist or NULL. A factor value will be coerced
to a character vector.
And as.vector(ws) will return ws to its underlying numeric state.
In your second example, ws has remained unchanged, so printing elements from it will continue to be formatted as dates.

Julia FITSio: FITS table with Float64 and ASCIIString

I am new to Julia, I hope my question is not too trivial.
I try to create a FITS binary table that includes various columns of Float64 and one column of ASCIIString. As explained in the FITSIO.jl documentation, the input to the write() function should be "a dictionary with ASCIIString keys (giving the column names) and Array values (giving data to write to each column)".
However, it seems that a dictionary cannot hold mixed types, and I get the following error:
data=Dict{"col1"=>[1.0,2.0,3.0], "col2"=>[4.0,5.0,6.0],"col3"=>["toto","tata","titi"]}
LoadError: TypeError: Dict: in parameter, expected Type{T}, got Pair{ASCIIString,Array{Float64,1}} while loading In[408], in expression starting on line 1
Does anyone know how to create a FITS table including columns of mixed types, in particular Float64 and ASCIIString?
It should be possible, since I can read such a table with the same FITSIO.jl library without problems, but the limited examples in the documentation do not illustrate such a case.
Thank you!
Change the braces to parentheses and you'll create the dictionary you intend.
data=Dict("col1"=>[1.,2.,3.], "col2"=>[4.,5.,6.], "col3"=>["toto","tata","titi"])
You are essentially calling the constructor of the Dict type using a sequence of pairs.
Extra info:
Braces are something else entirely. They specify that the dictionary keys and values should be of (or converted to, if possible) a specific type, e.g.
julia> Dict{String,Array{Float64,1}}("a"=>[1.,2.,3.], "b"=>[4.,5.,6.])
Dict{String,Array{Float64,1}} with 2 entries:
"b" => [4.0,5.0,6.0]
"a" => [1.0,2.0,3.0]
julia> Dict{String,Array{Float64,1}}("a"=>[1.,2.,3.], "b"=>['a','b','c'])
Dict{String,Array{Float64,1}} with 2 entries:
"b" => [97.0,98.0,99.0]
"a" => [1.0,2.0,3.0]
julia> Dict{String,Array{Float64,1}}("a"=>[1.,2.,3.], "b"=>["a","b","c"])
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Float64
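For the mixed-type table in the question, a minimal sketch might look like this. It assumes a Julia version where String has replaced ASCIIString (0.5 and later); whether FITSIO.jl's write() accepts the resulting dictionary depends on the installed version and is not verified here.

```julia
# Mixed column types: let Julia infer a common value type...
data = Dict("col1" => [1.0, 2.0, 3.0],
            "col2" => [4.0, 5.0, 6.0],
            "col3" => ["toto", "tata", "titi"])

# ...or state the types explicitly so that any value is accepted.
data_any = Dict{String,Any}("col1" => [1.0, 2.0, 3.0],
                            "col3" => ["toto", "tata", "titi"])
```

Either way, each column keeps its own element type; only the dictionary's declared value type is widened.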

Convert string argument to regular expression

Trying to get into Julia after learning python, and I'm stumbling over some seemingly easy things. I'd like to have a function that takes strings as arguments, but uses one of those arguments as a regular expression to go searching for something. So:
function patterncount(string::ASCIIString, kmer::ASCIIString)
numpatterns = eachmatch(kmer, string, true)
count(numpatterns)
end
There are a couple of problems with this. First, eachmatch expects a Regex object as the first argument and I can't seem to figure out how to convert a string. In python I'd do r"{0}".format(kmer) - is there something similar?
Second, I clearly don't understand how the count function works (from the docs):
count(p, itr) → Integer
Count the number of elements in itr for which predicate p returns true.
But I can't seem to figure out what the predicate is for just counting how many things are in an iterator. I can make a simple counter loop, but I figure that has to be built in. I just can't find it (tried the docs, tried searching SO... no luck).
Edit: I also tried numpatterns = eachmatch(r"$kmer", string, true) - no go.
To convert a string to a regex, call the Regex function on the string.
Typically, to get the length of an iterator you can use the length function. However, in this case that won't really work. The eachmatch function returns an object of type Base.RegexMatchIterator, which doesn't have a length method. So, you can use count, as you thought. The first argument (the predicate) should be a one-argument function that returns true or false depending on whether you would like to count a particular item in your iterator. In this case that function can simply be the anonymous function x->true, because we want to count every x in the RegexMatchIterator.
So, given that info, I would write your function like this:
patterncount(s::ASCIIString, kmer::ASCIIString) =
count(x->true, eachmatch(Regex(kmer), s, true))
EDIT: I also changed the name of the first argument to s instead of string, because string is a Julia function. Nothing terrible would have happened if we had left that argument name the same in this example, but it is usually good practice not to give variables the same name as a built-in function.
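On more recent Julia versions the same idea might be written as follows. This is a sketch assuming Julia 1.x, where ASCIIString no longer exists and overlap is a keyword argument of eachmatch rather than a positional one; the test strings are made up for illustration.

```julia
# Count regex matches of `kmer` in `s`, including overlapping ones.
# Assumes Julia 1.x: `overlap` is a keyword argument of eachmatch.
patterncount(s::AbstractString, kmer::AbstractString) =
    count(_ -> true, eachmatch(Regex(kmer), s; overlap=true))

# Hypothetical input: "ATA" occurs at positions 1, 3, and 5.
n_overlap = patterncount("ATATATA", "ATA")

# Without overlap, only the matches at positions 1 and 5 are found.
n_plain = count(_ -> true, eachmatch(r"ATA", "ATATATA"))
```

Note that Regex(kmer) treats kmer as a pattern, so characters such as . or * in it keep their regex meaning.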
