Does R include an efficient implementation of sets? - r

Is there an efficient implementation of the set data structure in R?
In C++ I would use an std::set (which is implemented using red-black trees), in Python a set (which is implemented using hash tables), but I am not sure what I should use in R.
I have found this link, which describes some set operations, like union() and intersection(), that you can perform on vectors. So, I guess that since vectors are involved, the complexities would not be logarithmic, as you could have using the data structures mentioned above.
Fun fact, note how in this case the name of the language does not help, searching "r set" one finds many results concerning $\mathbb{R}$, and not the programming language :D

Related

R define a multidimensional object to behave like a vector

I would like to define a slightly more general version of a complex number in R. This should be a vector that has more than one component, accessible in a similar manner to using Re() and Im() for complex numbers. Is there a way to do this using S3/S4 classes?
I have read through the OO field guide among other resources, but most solutions seem focused around the use of lists as fundamental building objects. However, I need vectors for use in data.frames and matrices. I was hoping to use complex numbers as a template, but they seem to be implemented largely in C. At this point, I don't even know where to start.

Rules-of-thumb doc for mathematical programming in R?

Does there exist a simple, cheatsheet-like document which compiles the best practices for mathematical computing in R? Does anyone have a short list of their best-practices? E.g., it would include items like:
For large numerical vectors x, instead of computing x^2, one should compute x*x. This speeds up calculations.
To solve a system $Ax = b$, never solve $A^{-1}$ and left-multiply $b$. Lower order algorithms exist (e.g., Gaussian elimination)
I did find a nice numerical analysis cheatsheet here. But I'm looking for something quicker, dirtier, and more specific to R.
#Dirk Eddelbeuttel has posted a bunch of stuff on "high performance computing with R". He's also a regular so will probably come along and grab some well-deserved reputation points. While you are waiting you can read some of his stuff here:
http://dirk.eddelbuettel.com/papers/ismNov2009introHPCwithR.pdf
There is an archive of the r-devel mailing list where discussions about numerical analysis issues relating to R performance occur. I will often put its URL in the Google advanced search page domain slot when I want to see what might have been said in the past: https://stat.ethz.ch/pipermail/r-devel/

How to describe a MATLAB function using mathematical notation?

Basically I have created two MATLAB functions which involve some basic signal processing and I need to describe how these functions work in a written report. It specifically requires me to describe the algorithms using mathematical notation.
Maths really isn't my strong point at all, in fact I'm quite surprised I've even been able to develop the functions in the first place. I'm quite worried about the situation at the moment, it's the last section of writing I need to complete but it is crucially important.
What I want to know is whether I'm going to have to grab a book and teach myself mathematical notation in a very short space of time or is there possibly an easier/quicker way to learn? (Yes I know reading a book should be simple enough, but maths + short time frame = major headache + stress)
I've searched through some threads on here already but I really don't know where to start!
Although your question is rather vague, and I have no idea what sorts of algorithms you have coded that you are trying to describe in equation form, here are a few pointers that may help:
Check the MATLAB documentation: If you are using built-in MATLAB functions, they will sometimes give an equation in the documentation that describes what they are doing internally. Some examples are the functions CONV, CORRCOEF, and FFT. If the function is rather complicated, it may not have an equation but instead have links to some papers describing the algorithm, which may themselves have equations for the algorithm. An example is the function HILBERT (which you can also find equations for on Wikipedia).
Find some lists of common mathematical symbols: Some standard symbols used to represent common mathematical operations can be found here.
Look at some sample pseudocode to see how it's done: For algorithms you yourself have coded up, you'll have to write them out in equation or pseudocode form. A paper that I've used often in my work is Templates for the Solution of Linear Systems, and it has some examples of pseudocode that may be helpful to you. I would suggest first looking at the list of symbols used in that paper (on page iv) to see some typical notations used to represent various mathematical operations. You can then look at some of the examples of pseudocode throughout the rest of the document, such as in the box on page 8.
I suggest that you learn a little bit of LaTeX and investigate Matlab's publish feature. You only need to learn enough LaTeX to write mathematical expressions. Then you have to write Matlab comments in your source file in LaTeX, but only for the bits you want to look like high-quality maths. Finally, open the Matlab editor on your .m file, and select File | Publish.
See Very Quick Intro to LaTeX and check your Matlab documentation for publish.
In addition to the answers already here, I would strongly advise using words in addition to forumlae in your report to describe the maths that you are presenting.
If I were marking a student's report and they explained the concepts of what they were doing correctly, but had poor or incorrect mathematical notation to back it up: this would lose them some marks, but would hopefully not impede my understanding of the hard work they've put in.
If they had poor/wrong maths, with no explanation of what they meant to say, this could jeapordise my understanding of their entire project and cost them a passing grade.
The reason you haven't found any useful threads is because most of the time, people are trying to turn maths into algorithms, not vice versa!
Starting from an arbitrary algorithm, sometimes pseudo-code, along with suitable comments, is the clearest (and possibly only) representation.

values from different fields in matlab

does anybody familiar with a way that I could implement a matrix with values from a field (not the real or complex number, but lets say Z mod p). so I could perform all the operation of matlab on the matrix (with the values of the chosen field)
Ariel
I suspect that you will want to use Matlab's object-oriented capabilities so that you can define both the fundamental representation of the elements of your field, and the basic operations on them. There's a reasonably good, if elementary, example of implementing polynomials using Matlab's OO features in the product documentation. That might be a good place for you to start.

Applicative programming and common lisp types

I've just started learning Common Lisp--and rapidly falling in love with it--and I've just moved onto the type system. I seem to be developing a particular fondness for applicative programming.
As I understand it, in CL strings and lists are both sequences, but there don't seem to be any standard functions for mapping over a sequence, only lists. I can see why they would be supplied for lists, what with them being the fundamental datatype and all, but why was it not designed to work with sequences? As they are a more general type, it would seem more useful to target applicative functions at them rather than lists. Or am I completely misunderstandimatifying how it works?
Edit:
What I was feeling particularly confused about was the way that sequences -- the abstraction -- and lists -- an implementation -- seem to be muddled up in CL. The consensus seems to be that this is for historical reasons; lisp has been around so long that you can pretty much map out the development of software engineering practices through its functions and macros; which functions apply to sequences and which to lists seems arbitrary at first glance because CL has a mixture of pre-sequence-abstraction functions that operate only on lists, and functions that do the same thing in a more general way on sequences. As someone who is just learning CL at the moment, I think it would be useful if authors introduced sequences first as the cleaner abstraction, and then bought in lists as the most fundamental implementation of that abstraction. Lists would still be needed as syntax of course, but by the time it is necessary to state this explicitly many readers would have worked this out by themselves, which would be quite an ego boost when starting out.
Why, there are a lot of functions working on sequences. Mapping over a sequence is done with MAP or MAP-INTO.
Look at the sequences section of the CLHS to find out more.
There is also a quick reference that is nicely organized.
Well, you are generally correct. Most functions do indeed focus on lists (mapcar, find, count, remove, append etc.) For a few of these there are equivalent functions for sequences (concatenate, some and every come to mind), and some, where the list-equivalent is outdated (eg. nth for lists only vs. elt for all sequences). Some functions simply work on sequences (length, for example).
CL is a bit of a mess. It's a big language, as in huge. Over 700 functions, AFAIK. And it's old. Some of these functions are deprecated by convention, and others are rarely, if ever, used.
Yes, it would be more useful to have mapping functions be methods, that applied as intended on all sequences. CL was simply not built that way. If it were to be built again today, I'm sure this would be considered, and it would look very different.
That said, you are not left completely in the cold. The loop macro works on sequences, as does iterate (a separate looping macro, which i happen to like more). This will get you far. For most practical purposes you will be using lists, and this won't be more than a pragmatic problem. If you do happen to lack a mapping function for vectors (or sequences in general), who's to stop you from writing it?

Resources