Julia DataFrames countmap() - julia

Edit: countmap() is in StatsBase.jl and is working as expected. I'm not sure how to mark this as answered or whether I should delete it.
using DataFrames
using CSV
train = DataFrame(CSV.File("training.csv"))
countmap(train.column)
UndefVarError: countmap not defined
Stacktrace:
[1] top-level scope
@ In[40]:1
[2] eval
@ .\boot.jl:360 [inlined]
[3] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base .\loading.jl:1094
I get that error when running the code. Was countmap deprecated at some point and I just cannot find any announcement about it, or am I missing something?

Are you looking for countmap from the StatsBase.jl package?

You can use the following pattern to get a DataFrame instead of a Dict as a result:
combine(groupby(train, :column), nrow)
(in many cases it will be much faster than countmap)
Also instead of:
DataFrame(CSV.File("training.csv"))
it is more efficient to write:
CSV.read("training.csv", DataFrame)

Related

How can I use the package LaTeXStrings in Jupyter?

I am using Jupyter with the Julia Pro kernel, and I just want to move from a Julia script to a notebook. (I launch Jupyter via Anaconda.)
I use the command
using LaTeXStrings
to load that package in the .jl file. I use it so that typing \alpha in the code automatically produces the Greek symbol α.
But when I do the same in Jupyter, I just can't get the α.
I get this error as output from that code chunk:
syntax: "\" is not a unary operator
Stacktrace:
[1] top-level scope at In[18]:1
[2] include_string(::Function, ::Module, ::String, ::String) at .\loading.jl:1091
[3] execute_code(::String, ::String) at C:\Users\jparedesm\.julia\packages\IJulia\rWZ9e\src\execute_request.jl:27
[4] execute_request(::ZMQ.Socket, ::IJulia.Msg) at C:\Users\jparedesm\.julia\packages\IJulia\rWZ9e\src\execute_request.jl:86
[5] #invokelatest#1 at .\essentials.jl:710 [inlined]
[6] invokelatest at .\essentials.jl:709 [inlined]
[7] eventloop(::ZMQ.Socket) at C:\Users\jparedesm\.julia\packages\IJulia\rWZ9e\src\eventloop.jl:8
[8] (::IJulia.var"#15#18")() at .\task.jl:356
Does anyone know how I can get the Greek alphabet in Jupyter? It would be very helpful!
If I understand you correctly, you just want to use the Greek symbol alpha α. For that you don't need the package LaTeXStrings. You can just type \alpha and hit the TAB key. Then \alpha should automagically change to α.
LaTeXStrings is used to type LaTeX equations in string literals, like in L"1 + \alpha^2".

How does R differentiate between the two filter functions, one in the dplyr package and the other for linear filtering of time series?

I wanted to filter a data set based on some conditions. When I looked at the help for the filter function, the result was:
filter {stats} R Documentation
Linear Filtering on a Time Series
Description
Applies linear filtering to a univariate time series or to each series separately of a multivariate time series.
After searching the web I found the filter function I needed, i.e. the one from the dplyr package. How can R have two functions with the same name? What am I missing here?
At the moment the R interpreter would dispatch a call to filter to the dplyr environment, at least if the class of the object were among the available methods:
methods(filter)
[1] filter.data.frame* filter.default* filter.sf* filter.tbl_cube* filter.tbl_df* filter.tbl_lazy*
[7] filter.ts*
As you can see there is a ts method, so if the object were of that class, the interpreter would instead deliver the x values to it. However, it appears that the authors of dplyr have blocked that mechanism and instead put in a warning function. You would need to use:
getFromNamespace('filter', 'stats')
function (x, filter, method = c("convolution", "recursive"),
sides = 2L, circular = FALSE, init = NULL)
{ <omitting rest of function body> }
# same result also obtained with:
stats::filter
R functions are contained in namespaces, so a full designation of a function would be: namespace_name::function_name. There is a hierarchy of namespace containers (actually "environments" in R terminology) arranged along a search path (which will vary depending on the order in which packages and their dependencies have been loaded). The :: infix operator can be used to specify a namespace or package name that is further up the search path than might be found in the context of the calling function. The function search displays the names of the currently attached packages and other environments. See ?search. Here's mine at the moment (which is a rather bloated one because I answer a lot of questions and don't usually start with a clean system):
> search()
[1] ".GlobalEnv" "package:kernlab" "package:mice" "package:plotrix"
[5] "package:survey" "package:Matrix" "package:grid" "package:DHARMa"
[9] "package:eha" "train" "package:SPARQL" "package:RCurl"
[13] "package:XML" "package:rnaturalearthdata" "package:rnaturalearth" "package:sf"
[17] "package:plotly" "package:rms" "package:SparseM" "package:Hmisc"
[21] "package:Formula" "package:survival" "package:lattice" "package:remotes"
[25] "package:forcats" "package:stringr" "package:dplyr" "package:purrr"
[29] "package:readr" "package:tidyr" "package:tibble" "package:ggplot2"
[33] "package:tidyverse" "tools:rstudio" "package:stats" "package:graphics"
[37] "package:grDevices" "package:utils" "package:datasets" "package:methods"
[41] "Autoloads"
At the moment I can find instances of 3 versions of filter using the help system:
?filter
# brings this up in the help panel
Help on topic 'filter' was found in the following packages:
Return rows with matching conditions
(in package dplyr in library /home/david/R/x86_64-pc-linux-gnu-library/3.5.1)
Linear Filtering on a Time Series
(in package stats in library /usr/lib/R/library)
Objects exported from other packages
(in package plotly in library /home/david/R/x86_64-pc-linux-gnu-library/3.5.1)
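The namespace lookup described above can be tried with base R alone. The following is a minimal sketch, assuming a session with only the default packages attached; the vector x and the three-point moving-average weights are made up for illustration.

```r
# Illustrative data (hypothetical, not from the question).
x <- c(1, 2, 3, 4, 5)

# Call the time-series filter explicitly through its namespace, so the
# result does not depend on which packages are attached or in what order.
ma <- stats::filter(x, rep(1 / 3, 3), sides = 2)
print(ma)

# getFromNamespace() retrieves the same function object as `::`.
f <- getFromNamespace("filter", "stats")
identical(f, stats::filter)

# The function's enclosing environment is the namespace it was defined in.
environmentName(environment(stats::filter))

# find() reports every attached environment in which the name is visible.
find("filter")
```

If dplyr is attached, find("filter") would be expected to list package:dplyr ahead of package:stats, which is why an unqualified filter() call reaches the dplyr version first.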

How Can I Get a List of Every Command in an R Package?

I want to see a list of the available commands in an R package, ideally printed to the console. In RStudio, I can type the name of a package followed by two colons (e.g. ggplot2::) and RStudio's GUI will pop up a list of available commands. Is this such a list? Even so, I can't get that to output to the console, and it doesn't work in vanilla R. Any alternatives?
> require(ggplot2)
Then
> ls("package:ggplot2")
[1] "%+%" "aes"
[3] "aes_" "aes_all"
[5] "aes_auto" "aes_q"
[7] "aes_string" "alpha"
[9] "annotate" "annotation_custom"
[etc]
You can also use ls() with a position on the search list, e.g.
> ls(pos=2)
Get the search list with search().
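This gets all the functions in a particular package. Here are all the functions in tidyr:
library(tidyr)  # the package must be attached for "package:tidyr" to exist
objs <- mget(ls("package:tidyr"), inherits = TRUE)
funs <- Filter(is.function, objs)  # keep only the objects that are functions
names(funs)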
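The same idea can be sketched with packages that ship with R, so it runs without installing anything. This assumes the default session where stats is attached; the variable names are only illustrative.

```r
# Names of everything exported by an attached package
# (stats is attached by default).
exports <- ls("package:stats")

# Keep only the names that are bound to functions.
env <- as.environment("package:stats")
is_fun <- vapply(
  exports,
  function(nm) is.function(get(nm, envir = env)),
  logical(1)
)
stats_funs <- exports[is_fun]
head(stats_funs)

# For a package that is installed but not attached,
# query its namespace instead of the search list.
tools_exports <- sort(getNamespaceExports("tools"))
head(tools_exports)
```

getNamespaceExports() loads the namespace if needed but does not attach the package, so the search list stays unchanged.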

How to call a specific S4 method in R

I was working in the mirt package in R and noticed that I couldn't use mirt:: or mirt::: to call the coef or residuals functions. From what I can tell this is an S3 versus S4 difference (magic fingers & hand waving).
Which brings me to the question: how do you call a specific R function within its package when it's coded in S4?
After
> library(mirt)
Loading required package: stats4
Loading required package: lattice
I see
> methods(coef)
[1] coef,ANY-method coef,DiscreteClass-method
[3] coef,MixedClass-method coef,mle-method
[5] coef,MultipleGroupClass-method coef,SingleGroupClass-method
[7] coef,summary.mle-method coef.aov*
[9] coef.Arima* coef.default*
[11] coef.listof* coef.nls*
see '?methods' for accessing help and source code
I guess you have an instance of one of the classes, e.g., 'DiscreteClass'. You can select the method with
selectMethod("coef", signature="DiscreteClass")
or maybe more naturally
selectMethod("coef", class(obj))
where obj is an instance of the object you're interested in. But you shouldn't have to call a specific method; this should be taken care of. What's the problem you're actually experiencing?
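A small self-contained sketch of selectMethod, using a toy S4 class rather than mirt. The class Person and the generic describe are hypothetical names invented for this illustration.

```r
library(methods)

# Hypothetical S4 class and generic, defined only for this example.
setClass("Person", representation(name = "character"))
setGeneric("describe", function(obj, ...) standardGeneric("describe"))
setMethod("describe", "Person", function(obj, ...) {
  paste("Person:", obj@name)
})

p <- new("Person", name = "Ada")

# Retrieve the method for a given signature without dispatching on an object.
m <- selectMethod("describe", signature = "Person")

# Or derive the signature from an existing instance.
m2 <- selectMethod("describe", class(p))

# Both are ordinary functions that can be called directly.
m(p)
m2(p)
```

In normal use describe(p) would dispatch to the same method on its own, which is the point made above.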

Change search() locations without unloading packages

I am using code which depends on two packages that conflict. I would like to give one priority for only a short period of time and my plan is to just move it up to the front of search(). However, I can't just unload and reload. I tried that and it causes other problems, and running library on an already loaded package does not work.
Here is an example (the real use case involves non-CRAN packages):
library(ggplot2)
library(MASS)
> search()
[1] ".GlobalEnv" "package:MASS" "package:ggplot2"
[4] "package:stats" "package:graphics" "package:grDevices"
[7] "package:utils" "package:datasets" "package:methods"
[10] "Autoloads" "package:base"
How can I now move package:ggplot2 ahead of package:MASS without detaching/unloading ggplot2?
EDIT
Inside the function I need to call, say function1, there is an expression that makes further calls. I cannot edit those calls to append ::.
e.g.
unchangeable <- function1("abc") ~ function2("def")
Suppose mainFun is the one I want to call. I can do
mainFun(unchangeable)
but I cannot specify
mainFun::unchangeable
It is indeed possible to edit unchangeable by manipulating formula objects. But that is not ideal, and I need a more general solution for objects of other types.
EDIT2:
Here is an example, which shows a similar problem.
library(mgcv)
library(gam)
y <- rnorm(100)
x <- rnorm(100)
thisformula <- y ~ s(x)
gamgam <- gam(thisformula)
# s <- mgcv::s
mgcvgam <- mgcv::gam(thisformula)
This gives me the error
Error: $ operator is invalid for atomic vectors
Uncommenting the line s <- mgcv::s solves the problem in this case. But in my more general case it doesn't, and in any case it seems like a hack. How can I have all functions that are called within mgcv::gam first be looked up in mgcv?
You can refer to the function in the specific package using ::. For example, ggplot2::labs will always refer to that function in ggplot2, even if it is masked by some package loaded later.
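The masking behaviour can be sketched without any extra packages by defining a global function that shadows one from stats. The masking function here is hypothetical and stands in for whatever a later-attached package would provide.

```r
# A hypothetical user-defined function that masks stats::filter,
# just as a package attached later would.
filter <- function(x, ...) "masked version"

x <- 1:5

# The unqualified name now resolves to the masking definition...
unqualified <- filter(x)

# ...while the qualified name still reaches the stats version
# (here a running sum of each value and its predecessor).
qualified <- stats::filter(x, c(1, 1), sides = 1)

print(unqualified)
print(qualified)

# Clean up so the global definition does not linger.
rm(filter)
```

This only helps for calls you can edit yourself, which is the limitation the question raises.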
