I wrote a function in which I use several superassignments. This enables me, when I source this function, to automatically create new global variables ready for later use.
I got a comment on my code that briefly stated that using superassignments was a terrible idea and that this person would not consider the code trustworthy as long as I did not change that.
I am not trained in programming and I naively did not consider bad side effects of using superassignments.
When am I putting my code at risk by using them?
I'm writing a source-code-level optimizer for a language (LSL) whose server-side compiler is non-optimizing. The optimizer is working and performing well in some fields like constant folding, but to make it stronger, I wanted to build a CFG, to enable/simplify jump optimization, dead code removal, and later benefit from SSA.
Now, since it's source-level, I can't afford to convert the code to something like 3AC, otherwise I risk creating a version that is actually less optimal than the original.
So my question is: What would I include in the CFG? The logical answer would be to store AST nodes that represent whole statements. The problem I'm facing is that expressions may call user functions, and these functions may alter global variables, which would affect the SSA form, which is why I wanted to include the functions as part of the same graph, rather than different graphs.
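To make the idea of "CFG nodes that hold whole-statement AST nodes" concrete, here is a minimal sketch. It is written in Python only for illustration; `Block`, `reachable`, and the string stand-ins for AST nodes are hypothetical names, not part of any existing LSL tooling. It shows a block-level graph and the reachability walk that dead code removal would rely on.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Block:
    """A CFG node: a straight-line run of whole statements (hypothetical layout)."""
    name: str
    stmts: List[str] = field(default_factory=list)   # stand-ins for statement AST nodes
    succs: List["Block"] = field(default_factory=list)


def reachable(entry: Block) -> Set[str]:
    """Return the names of all blocks reachable from the entry block."""
    seen: Set[str] = set()
    stack = [entry]
    while stack:
        block = stack.pop()
        if block.name in seen:
            continue
        seen.add(block.name)
        stack.extend(block.succs)
    return seen


# A tiny function body: the unconditional jump skips the "dead" block.
entry = Block("entry", ["integer x = 1;", "jump done;"])
dead = Block("dead", ["x = 2;"])
done = Block("done", ["@done;", "return;"])
entry.succs.append(done)
dead.succs.append(done)

live = reachable(entry)   # blocks not in this set are candidates for removal
```

Because each block keeps the original statement nodes untouched, the source can be re-emitted as written once unreachable blocks are dropped. How calls into user functions are represented (extra edges, or a per-call summary of the globals they modify) is left open here.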
In a functional programming book, the author lists the following as side effects:
Modifying a variable
Modifying a data structure in place
Setting a field on an object
Throwing an exception or halting with an error
Printing to the console or reading user input
Reading from or writing to a file
Drawing on the screen
I am just wondering how it is possible to write a purely functional program that reads from or writes to a file, if those are side effects. If it is possible, what is the common approach in the functional world to achieve this?
Properly answering this question would likely require an entire book (not too long). The point here is that functional programming is meant to separate the description / representation of logic from its actual runtime interpretation. Your functional code just represents (doesn't run) the effects of your program as values, giving you back some kind of abstract syntax tree that describes your computation. A different part of your code (usually called the interpreter) will take those values and lazily run the actual effects. That part is not functional.
How is it possible to write a pure functional program that is useful in any way? It is not possible. A pure functional program only heats up the CPU. It needs an impure part (the interpreter) that actually writes to disk or to the network. There are several important advantages in doing it that way. The pure functional part is easy to test (testing pure functions is easy), and the referentially transparent nature of pure functions makes it easy to reason about your code locally, making the development process as a whole less buggy and more productive. It also offers elegant ways to deal with traditionally obfuscated defensive code.
So what is the common approach in the functional world to achieve side effects? As said, representing them as values, and then writing the code that interprets those values. A really good explanation of the whole process can be found in this blog post series.
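As a rough illustration of "effects as values plus an interpreter", here is a minimal sketch in Python. The names `ReadLine`, `WriteLine`, `Done`, `greet`, and `run` are made up for this example and do not come from any library; real effect systems are considerably richer.

```python
from dataclasses import dataclass
from typing import Callable, Union


# Effect descriptions: plain immutable values. Building them performs no I/O.
@dataclass(frozen=True)
class ReadLine:
    then: Callable[[str], "Program"]   # continuation that receives the line read


@dataclass(frozen=True)
class WriteLine:
    text: str
    then: "Program"                    # what to do after writing


@dataclass(frozen=True)
class Done:
    pass


Program = Union[ReadLine, WriteLine, Done]


def greet() -> Program:
    """Pure: returns a description of the interaction, nothing is executed."""
    return WriteLine(
        "Name?",
        ReadLine(lambda name: WriteLine(f"Hello, {name}!", Done())),
    )


def run(program: Program, read=input, write=print) -> None:
    """Impure interpreter: the only place where effects actually happen."""
    while not isinstance(program, Done):
        if isinstance(program, WriteLine):
            write(program.text)
            program = program.then
        else:  # ReadLine
            program = program.then(read())
```

Calling `run(greet())` talks to the real console. In a test you can pass fake `read` and `write` functions instead, which is exactly the testability advantage described above.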
For the sake of brevity, let me (over)simplify and make the long story short:
To deal with "side effects" in purely functional programming, you (programmers) write pure functions from the input to the output, and the system causes the side effects by applying those pure functions to the "real world".
For example, to read an integer x and write x+1, you (roughly speaking) write a function f(x) = x+1, and the system applies it to the real input and outputs its return value.
For another example, instead of raising an exception as a side effect, your pure function returns a special value representing the exception. Various "monads" such as IO in Haskell generalize these ideas, that is, represent side effects by pure functions (actual implementations are more complicated, of course).
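Both examples can be sketched in a few lines of Python. This is only an approximation of the idea, not how Haskell's IO works; `parse_and_increment` and `main` are hypothetical names, and the tagged tuple stands in for a proper result type.

```python
from typing import Tuple, Union

Result = Tuple[str, Union[int, str]]


def parse_and_increment(line: str) -> Result:
    """Pure from the caller's point of view: the same input always gives the
    same output, and a failure is returned as a value instead of being raised."""
    try:
        n = int(line)
    except ValueError:
        return ("error", f"not an integer: {line!r}")
    return ("ok", n + 1)


def main() -> None:
    """The 'system' side: applies the pure function to the real world."""
    tag, payload = parse_and_increment(input())
    print(payload if tag == "ok" else f"error: {payload}")
```

Only `main` touches the console; everything worth testing lives in `parse_and_increment`, which can be checked without any input or output at all.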
I have recently been reading the SBCL User Manual and started wondering about the title question.
Some Lisps, for example Clojure, strongly discourage side effects so that code is easier to parallelize. Common Lisp allows side effects, and so I was wondering whether the fact that a given function is 'dirty' or 'clean' affects its compilation.
For example, the "let optimizations" section of the CMUCL compiler manual shows how, in many cases, using 'let' to bind a new variable is more efficient than modifying one with 'setq'. I guess I'm asking if something similar is done for function calls.
I have read the relevant sections of the SBCL manual and pored over the questions on Stack Overflow, but could not find an answer to this.
Not faster. Sometimes actually slower.
According to Stas Boukarev from SBCL-devel,
SBCL doesn't even know that a function has no side effects, so, no.
Besides, most of the time having side effects is the most optimal
I am aware that destructive functions such as nreverse tend to be faster than their nondestructive counterparts (in this case, reverse). They also come with many drawbacks. As Peter Seibel put it:
Each recycling function is a loaded gun pointed footward.
I need to know whether the training data passed in the neuralnet call is randomized inside the routine, or whether the routine uses the data in the order given. I really need this information for a project I am working on, and I have not been able to figure it out by looking at the source.
Look into the code - that's one of the most important advantages of FOSS: you can actually check what it is doing (neuralnet is pure R, so you don't even need to fear digging into FORTRAN or C code, and you can use debug to step through the code with example data to get an overview).
Moreover, if necessary, you can even introduce e.g. a new parameter that allows you to switch off randomization if needed.
Possibly the package maintainer (see maintainer("neuralnet")) would be willing to help you as well, and would be able to answer much faster than just about anyone else here on SE.
I'm developing a major upgrade to an R package, and as part of the changes I want to start using S3 methods so I can use the generic plot, summary and print functions. But I'm not sure I understand why and when to use generic functions in general.
For example, I currently have a function called logLikSSM, which computes the log-likelihood of a state space model. Instead of using this function, I could write a function logLik.SSM or something like that, as there is a generic function logLik in R. The benefit of this would be that logLik is shorter to write than logLikSSM, but is there really any other point in this?
Similarly, there is a generic function called simulate in the stats package, so in theory I could use that instead of simulateSSM. But the description of the simulate function says that it is used to "Simulate Responses", whereas my function actually simulates the hidden states, so it doesn't really fit that description. So in this case I probably shouldn't use the generic function, right?
I apologize if this question is too vague for here.
The advantages of creating methods for generics from the core of R include:
Ease of Use. Users of your package already familiar with those generics will have less to remember making it easier to use your package. They might even be able to do a certain amount without reading the documentation. If you come up with your own names then they must discover and remember new names which is an added cognitive burden.
Leverage Existing Functionality. Also any other functions that make use of generics you create methods for can then automatically use yours as well; otherwise, they would have to be changed. For example, AIC uses logLik.
A disadvantage is that the generic involves an extra level of dispatch, and if logLik is in the inner loop of an optimization there could be an impact (although possibly not a material one). In that case you could compare the performance of calling the generic with that of calling the method directly, and use the latter if it makes a significant difference.
If your function has a completely different purpose from the generic in the core of R, a method might be more confusing than helpful, so in that case you might keep your own function name rather than create a method.
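The "leverage existing functionality" point is not specific to R. As a rough analogue (not R code, and not how stats::AIC is implemented), here is a Python sketch using functools.singledispatch; `log_lik`, `SSM`, and `aic` are hypothetical names chosen to mirror logLik, the SSM class, and AIC.

```python
from dataclasses import dataclass
from functools import singledispatch


@singledispatch
def log_lik(model) -> float:
    """Generic: dispatches on the class of its first argument."""
    raise NotImplementedError(f"no log_lik method for {type(model).__name__}")


@dataclass
class SSM:
    """Stand-in for a fitted state space model (hypothetical fields)."""
    loglik: float
    n_params: int


@log_lik.register
def _(model: SSM) -> float:
    # The method for SSM objects, analogous to writing logLik.SSM in R.
    return model.loglik


def aic(model, k: float = 2.0) -> float:
    """Written once against the generic; works for any class with a method."""
    return -2.0 * log_lik(model) + k * model.n_params


score = aic(SSM(loglik=-10.0, n_params=3))
```

`aic` never mentions `SSM`, yet it works for it as soon as the method is registered. That is the benefit beyond a shorter name: existing code built on the generic picks up your class for free.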
You might want to read the zoo Design manual (see link to zoo Design under Vignettes near the bottom of that page) which discusses the design ideas that went into the zoo package. These include the idea being discussed here.
EDIT: Added disadvantages.
Good question.
I'll split your Question into two parts; here's the first one:
[i]s there really any other point in [making functions generic]?
Well, this pattern is usually invoked when the developer doesn't know the object class of every object he/she expects a user to pass in to the method under consideration.
Because of this uncertainty, this design pattern (which is called overloading in many other languages) is invoked; it requires R to evaluate the object's class and then dispatch the object to the appropriate method for that class.
The second part of your Question: [i]n this case I shouldn't use [the generic function] right?
To try to give you an answer useful beyond the detail of your Question, consider what happens to the original method when you call setGeneric, passing that method in.
The original function body is replaced with code that performs a top-level dispatch based on the type of the object passed in. The original body slides down one level and becomes the default method that the top-level (generic) function dispatches to.
showMethods() will let you see all of those methods which are called by the newly created dispatch function (generic function).
And now for one huge disadvantage:
Ease of MISUse:
Users of your package already familiar with those generics might do a certain amount without reading the documentation.
And therein lies the fallacy that components, reusable objects, services, etc are an easy panacea for all software challenges.
And why the overwhelming majority of software is buggy, bloated, and operates inconsistently with little hope of tech support being able to diagnose your problem.
There WAS a reason for static linking and small executables back in the day. But this generation of code now, get paid now, debug later if ever, before the layoffs/IPO come, has no memory of the days when code actually worked very reliably and installation/integration didn't require $200/hr Big 4 consultants or hackers who spend a week trying to get some "simple" open source product installed and productively running.
But if you want to continue the tradition of writing ever shorter function/method names, be my guest.