What is select_at() for?

I understand how to use dplyr::select_if() and dplyr::mutate_at(). But I don't understand what dplyr::select_at() provides that a basic select() doesn't provide.
As far as I understand, the verb_at() functions allow you to utilize the select helper functions (like matches() and starts_with()). But select() already uses the select helpers--so why would you use select_at() instead of just select()?

The primary benefit of select_at() (as opposed to the vanilla select()) is that it provides a .funs parameter, so you can apply a function, e.g. toupper(), to rename columns as you select them.
This makes a ton of sense for something like rename_at(). Providing similar functionality with select_at() makes sense from a tidyverse-style "everything works the same" perspective.

Related

Why are underscores not recommended for variable names in Julia?

I read in the Julia doc page https://docs.julialang.org/en/v1/manual/variables/#:~:text=Variable%20names%20must%20begin%20with,Sm%20math%20symbols)%20are%20allowed. :
"Word separation can be indicated by underscores ('_'), but use of underscores is discouraged unless the name would be hard to read otherwise"
My question is whether there are any reasons to discourage the use of underscores. Thanks.
I don't think underscores are really discouraged in user code and for internal variables. It is mostly for being consistent with the style in Base Julia, which follows this, mostly. And consistency is good, right?
But if you create a package or module, then the interface normally consists of types and functions. Type names follow a strong convention that they should be CapitalCase. User-facing functions are normally lowercase without _, because they are supposed to be simple and brief and should express a single well-defined concept. A bit like the Unix philosophy: every function should do one thing, and do it well.
A convention discouraging composite and long identifier names encourages you to create simple functions. If your function needs a name with underscores, it's possibly a sign that you should break it into multiple functions.
But in your own code, use whatever convention suits you.
I’m no expert on Julia, but the line you quote is located under the header “Stylistic Conventions” and I would presume that’s basically it.
There is an additional section about naming conventions in the docs, under Style Guide.
There is a line in there that says:
“Underscores are also used to indicate a combination of concepts”.
So if you decided to use a lot of underscores in your function names, the next programmer to work on your code might think you are “combining concepts”.

Difference between internal procedures and functions in Progress 4GL?

Both internal procedures and functions accept parameters and produce output. So what is the point of using internal procedures instead of functions?
A user-defined function is used when you want to perform some calculation and return a single value. In this respect it is the same as a built-in ABL function, like the SUBSTRING or EXP functions. Putting this calculation code in a FUNCTION block instead of inline in your code allows you to put it in one place and reference it multiple times without code duplication.
An internal procedure is also an encapsulated piece of code that does some work, but it is more general-purpose. While a function must return a single value, an internal procedure may or may not have input parameters or output parameters.
https://docs.progress.com/category/openedge-archives
Also, the parameters and return type of functions (like those of methods) are checked at compile time, which removes some potential problems at run time later.
The question acknowledges that both functions and internal procedures allow OUTPUT parameters and asks "what is the use" of internal procedures instead of functions.
To me, this implies that the poster is contemplating always using functions and deprecating internal procedures and is asking: "what would I lose if I do that?"
Two things spring to mind:
Sort of the opposite of Jean-Christophe Cardot's point: you would lose some automatic type conversions and syntactic flexibility about the parameter lists. Some people see that flexibility in a negative light. Others see it as a positive.
You need to "forward declare" your functions or use dynamic invocations. With an internal procedure you can RUN it without providing a declaration earlier in the code.
If you tend to think that strict type checking is useful then these are probably not benefits that you think of as being lost. If you prefer more flexible behaviors, then you may regret choosing functions rather than internal procedures.

Ignoring certain types with respect to = in OCaml

I'm in a situation where I'm modifying an existing compiler written in OCaml. I've added locations to the AST of the compiled language, but it has caused a bunch of bugs, because equality checks that previously succeeded now fail when otherwise identical ASTs have different locations attached.
In particular, I'm seeing List.mem return false when it should return true, since it relies on equality.
I'm wondering, is there a way for me to specify that = should always return true for any two values of my location type?
It would be a ton of work to refactor the entire compiler to use a custom equality everywhere, particularly since many polymorphic functions rely on being able to use = on any type.
There's no existing OCaml mechanism to do what you want.
You can use ppx to write OCaml syntax extensions, and (as I understand it) the behavior can depend on types. So there's some chance you could get things working that way. But it wouldn't be as straightforward as what you're asking for. I suspect you would need to explicitly handle = and any standard functions (like List.mem) that use = implicitly. (Note that I have no experience with ppx.)
I found a description of PPX here: http://ocamllabs.io/doc/ppx.html
Many experienced OCaml programmers avoid the use of built-in polymorphic equality because its behavior is often surprising. So it might be worth converting to a custom comparison function after all.
What an annoying problem to have.
If you are desperate and willing to write a little C code you can change the representation of locations to Custom_tag blocks, which allow customising the behaviour of some of the polymorphic operations. It's a nasty solution, and I suggest you look hard for a better approach before resorting to this one.
One possibility is that most of the compiler does not use locations at all. If so, you might be able to get away with replacing every location in the AST with the same dummy location. That should allow equality to behave as if locations were not there at all. This is rather hacky, and may not be possible if passes later in the compiler make any use of location info.
The 'clean' solution is to define a sane equality operation for ASTs (or to derive one using ppx) and to change the code to use that. As you say, this would be a lot more work.

Convention for combining GET parameters with AND?

I'm designing an API and I want to allow my users to combine a GET parameter with AND operators. What's the best way to do this?
Specifically I have a group_by parameter that gets passed to a Mongo backend. I want to allow users to group by multiple variables.
I can think of two ways:
?group_by=alpha&group_by=beta
or:
?group_by=alpha,beta
Is either one to be preferred? I've consulted a few API design references but no-one seems to have a view on this.
There is no strict preference. The advantage to the first approach is that many frameworks will turn group_by into an array or similar structure for you, whereas in the second approach you need to parse out the values yourself. The second approach is also less verbose, which may be relevant if your query string is particularly large.
With the first approach, you may also want to test that the repeated values always arrive in your framework in the order the client sent them. Some frameworks have a bug where that doesn't happen.
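To make the trade-off concrete, here is a minimal Python sketch using only the standard library's urllib.parse. The helper name group_by_fields is hypothetical, and accepting both styles at once is an assumption made for illustration, not something the question requires. It shows that repeated keys come back as a list for free, while the comma-separated form needs one extra split.

```python
from urllib.parse import parse_qs


def group_by_fields(query):
    """Return the group_by values from a query string.

    Hypothetical helper: accepts both ?group_by=a&group_by=b
    and ?group_by=a,b (an assumption for this sketch).
    """
    # parse_qs does not strip a leading "?", so remove it first.
    parsed = parse_qs(query.lstrip("?"))
    fields = []
    # Repeated keys are already collected into a list, in the order given.
    for raw in parsed.get("group_by", []):
        # The comma-separated form has to be split by hand.
        fields.extend(part for part in raw.split(",") if part)
    return fields


repeated = group_by_fields("?group_by=alpha&group_by=beta")
comma = group_by_fields("?group_by=alpha,beta")
```

Both calls yield the same list of field names, so the backend code that builds the Mongo grouping does not need to care which style the client used.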

Use data.frame in custom function?

Functions that work with data.frames often let the user provide a dataset, so that its columns can be referred to in a straightforward way. E.g.:
lm(mpg~cyl+gear,data=mtcars)
Instead of using mtcars$cyl in the formula, we can simply use cyl. How can I implement such behavior in custom built functions?
There are several different techniques for this, described in "Standard nonstandard evaluation rules".
