R: Eye-tracking packages for SMI's BeGaze users

Currently, we are working on an eye-tracking study and we are not quite satisfied with the analysis options of SMI's BeGaze. Hence, I would like to ask whether you know a good way to extract data from BeGaze so that it can be processed by a handy R package that still works under R 3.6.1 (unlike ETRAN).
It would be great to add AOIs manually, draw heat maps, and analyze saccades, fixation times, and ratios between AOIs.
We came across eyetrackingR, but we are still struggling with extracting BeGaze's data in a processable way.
Any help, tutorial, hint, etc. is much appreciated.
David
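To make the AOI part of the question concrete, here is a minimal base-R sketch of assigning fixations to manually defined rectangular AOIs and computing dwell times and a ratio between two AOIs. The column names (x, y, duration), the AOI coordinates, and the helper assign_aoi are all made up for illustration; a real BeGaze export will use its own column names and would need to be mapped onto something like this first.

```r
# Hypothetical fixation table; a real BeGaze export uses its own column names.
fixations <- data.frame(
  x        = c(120, 480, 510, 130, 900),   # fixation centre in pixels
  y        = c(100, 300, 320, 110, 700),
  duration = c(200, 350, 150, 250, 100)    # milliseconds
)

# Manually defined rectangular AOIs (made-up pixel coordinates).
aois <- data.frame(
  name = c("logo", "text"),
  xmin = c(100, 450), xmax = c(200, 600),
  ymin = c(50, 250),  ymax = c(150, 400),
  stringsAsFactors = FALSE
)

# Label each fixation with the first AOI that contains it, else "none".
# Looping in reverse lets earlier AOIs overwrite later ones on overlap.
assign_aoi <- function(x, y, aois) {
  label <- rep("none", length(x))
  for (i in rev(seq_len(nrow(aois)))) {
    inside <- x >= aois$xmin[i] & x <= aois$xmax[i] &
              y >= aois$ymin[i] & y <= aois$ymax[i]
    label[inside] <- aois$name[i]
  }
  label
}

fixations$aoi <- assign_aoi(fixations$x, fixations$y, aois)

# Total fixation time per AOI and the ratio between two AOIs.
dwell <- tapply(fixations$duration, fixations$aoi, sum)
ratio <- as.numeric(dwell["logo"] / dwell["text"])
```

Rectangles keep the containment test to four comparisons; polygonal AOIs would need a point-in-polygon test instead.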

Related

Analysing vocal similarity of little owls using warbleR in R

I am struggling a bit with an analysis I need to do. I have collected data consisting of little owl calls that were recorded along transects. I want to analyse these recordings for similarity, in order to see which recorded calls are from the same owls and which are from different owls. In that way I can make an estimate of the size of the population at my study area.
I have done a bit of research and it seems that the package warbleR is suitable for this. However, I am far from an R expert and am struggling a bit with how to go about this. Do any of you have experience with these types of analyses and maybe have example scripts? It seems to me that I could use the function cross_correlation and maybe run a PCA; however, in the warbleR vignette I looked at they only do this for different types of calls and not for the same call type from different individuals, so I am not sure if it would work.
To be able to run analyses with warbleR you need to input the data using the "selection_table" format. Take a look at the example data "lbh_selec_table" to get a sense of the format:
library(warbleR)
data(lbh_selec_table)
head(lbh_selec_table)
The whole point of these objects is to tell R the time location in your sound files (in seconds) of the signals you want to analyze. Take a look at this link for more details on this object structure and how to import it into R.
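As a rough illustration of that structure, the sketch below builds a plain data frame with the core columns found in lbh_selec_table (sound.files, selec, start, end). The file names and times are invented, and the data frame is not yet a warbleR object; it would still need converting and checking against your actual recordings.

```r
# Hypothetical file names and times; the column names follow lbh_selec_table.
owl_selections <- data.frame(
  sound.files = c("transect1.wav", "transect1.wav", "transect2.wav"),
  selec       = c(1, 2, 1),            # selection number within each file
  start       = c(0.52, 3.10, 1.25),   # call onset in seconds
  end         = c(1.08, 3.71, 1.90),   # call offset in seconds
  stringsAsFactors = FALSE
)

# Basic sanity checks before handing the table to any analysis:
# every call must have positive duration and a unique file/selection pair.
owl_selections$duration <- owl_selections$end - owl_selections$start
all_positive <- all(owl_selections$duration > 0)
all_unique   <- !any(duplicated(owl_selections[, c("sound.files", "selec")]))
```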

R Package to Analyse Eye Tracking data

I was wondering if anyone out there has found a nice package for R to analyse eye-tracking data?
I came across eyetrackR but as far as I can tell there is no English support documentation available:
http://read.psych.uni-potsdam.de/pmr2/index.php?option=com_content&view=article&id=43:eyetrackr&catid=13:r-playground&Itemid=15
I will move onto another freeware that handles eye-tracking data if I need to but was really hoping there would be something accessible in R.
Ideas?
Cheers.
It would help if you could explain which kind of analyses you are intending to do. There are many different approaches depending on the research question and the research field. Many approaches involve the detection of fixations and saccades as a first step. An R package that can be used for fixation detection is called saccades and is available on CRAN. See also the Github page of the package for examples and screenshots.
A new eye-tracking analysis package for R (eyetrackingR) was recently released. It provides a variety of methods that handle data preparation/cleaning, visualization, and analysis.
Here's a list of several dozen instances of researcher contributed code (FOSS) for post-acquisition summarization and analysis of eye-movement data. You may be able to find something to suit your needs there.
List is provided in case anyone stumbling across this thread may find it useful.
https://github.com/davebraze/FDBeye/wiki/Researcher-Contributed-Eye-Tracking-Tools

Basic analysis of large CSV with FF package in R

I have been messing around with R for the last year and now want to get a little deeper. I want to learn more about the ff and big data packages because I have had trouble getting through some of the documentation.
I like to learn by doing, so let's say I have a huge CSV called data.csv and it's 300 MB. It has 5 headers: Url, PR, tweets, likes, age. I want to deduplicate the list based on URLs. Then I want to plot PR and likes on a scatter plot to see if there is any correlation. How would I go about doing that basic analysis?
I always get confused with the chunking of the big data processes and how you can't load everything in at once.
What are some common problems you have run into using the ff package or big data?
Is there another package that works better?
Basically any information to get started using a lot of data in R would be useful.
Thanks!
Nico
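For what it's worth, the chunking idea can be sketched in base R without ff at all: read the file a block of rows at a time from an open connection and keep only the first row seen for each Url. This is only a sketch under assumptions: the function name dedupe_csv_by_url and the tiny temporary file are made up for illustration, the header is assumed to be plain unquoted text, and the set of URLs seen so far is assumed to fit in memory.

```r
# Read a CSV in chunks, keeping only the first row seen for each Url.
dedupe_csv_by_url <- function(path, chunk_rows = 10000) {
  con <- file(path, open = "r")
  on.exit(close(con))
  header <- strsplit(readLines(con, n = 1), ",")[[1]]
  seen <- character(0)
  kept <- list()
  repeat {
    chunk <- tryCatch(
      read.csv(con, header = FALSE, col.names = header,
               nrows = chunk_rows, stringsAsFactors = FALSE),
      error = function(e) NULL   # read.csv errors once no lines remain
    )
    if (is.null(chunk) || nrow(chunk) == 0) break
    is_new <- !duplicated(chunk$Url) & !(chunk$Url %in% seen)
    seen <- c(seen, chunk$Url[is_new])
    kept[[length(kept) + 1]] <- chunk[is_new, ]
    if (nrow(chunk) < chunk_rows) break   # short chunk means end of file
  }
  result <- do.call(rbind, kept)
  rownames(result) <- NULL
  result
}

# Tiny stand-in for data.csv so the sketch can be run as-is.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Url,PR,tweets,likes,age",
             "a.com,3,10,5,1",
             "b.com,4,20,9,2",
             "a.com,3,11,6,1",
             "c.com,6,30,20,3",
             "b.com,4,21,8,2"), tmp)

unique_rows <- dedupe_csv_by_url(tmp, chunk_rows = 2)
pr_likes_cor <- cor(unique_rows$PR, unique_rows$likes)
if (interactive()) plot(unique_rows$PR, unique_rows$likes, xlab = "PR", ylab = "likes")
```

A 300 MB file will usually fit in memory on a current machine, so a plain read.csv followed by !duplicated(data$Url) may be all that is needed; the chunked version only matters once the file outgrows RAM.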

How did you experience the transition from SPSS to R? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
The discussion in this question is the direct cause for me asking this question. The more general reason is the fact that I often have to explain R use to people that are only familiar with SPSS. I know most of the basics of SPSS, as we still use it in the basic statistics course. But as I'm more of an R guy, it's difficult to know how SPSS users experience the first meeting with R.
I know there is the book R for SAS and SPSS users and that already contains some information. Yet, I would like to know what the more difficult parts are when you switch from SPSS to R.
Or in other words: if you had to explain R in one day to SPSS users, which topics would you focus on? This is not a hypothetical question by the way (yeah, I know, it's not because one gets paid for it that it always makes sense...).
Firstly, data manipulation has been the most challenging thing to learn coming from SPSS/SAS to R. I've found, personally, that getting the data in the right shape for an analysis is usually much more difficult than the analysis itself. Secondly, a true understanding of how to deal with categorical values through the use of factors. Lastly, summary statistics and descriptives can sometimes be challenging to get in a format that is transmutable to PPT or Excel which are what (my) clients generally expect/demand for reporting.
I would focus on:
1 Data manipulation
Understanding data structures. Import/Export. Then in-depth training on the use of packages like plyr and reshape, with a particular focus on how to effectively use cast with formulas and melt with ids. How to apply numerical functions within a data.frame using ddply.
2 Factoring Data
In general, an explanation of how to handle recoding with epicalc or a user-defined function. Also an explanation of the significance of factors, levels, and labels.
3 Descriptives
Take a few minutes to introduce xtabs(), table(), and prop.table(), and using cast() from reshape to create columnar tables of data that are more reasonably exported to Excel.
Graphics are optional; if you've done a good job of the above they should be able to get the data they need to create graphs in whatever software they are most comfortable with.
4 Graphics
If you've done a good job teaching the data manipulation, getting data into the shape needed for graphing should be pretty straightforward (or at least reproducible) at this point. ggplot2 is complicated and requires a day just by itself to be played with. But it is possible to give a quick overview of it. Alternatively, base graphics are simple to understand and the help is much more clear on what things do and how the syntax works.
Note: I left out statistical analysis. However, an overview of lm() and perhaps anova(), or cor() would be helpful as a start point. But this should be explained at the same time as data manipulation.
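To give a feel for points 2 and 3 and the closing note, here is a small base-R sketch that recodes a numeric code into a labelled factor (roughly what value labels do in SPSS), builds count and proportion tables, and fits a simple model. The survey data frame and its values are invented for illustration, and the sketch sticks to base functions rather than plyr or reshape.

```r
# Made-up survey data; gender is stored as a numeric code, as it often is in SPSS.
survey <- data.frame(
  gender = c(1, 2, 2, 1, 2, 1),
  group  = c("a", "a", "b", "b", "b", "a"),
  score  = c(10, 12, 15, 11, 16, 9)
)

# Factors: attach labels to the codes.
survey$gender <- factor(survey$gender, levels = c(1, 2),
                        labels = c("male", "female"))

# Descriptives: counts, row proportions, and summed scores per cell.
counts     <- table(survey$gender, survey$group)
row_props  <- prop.table(counts, margin = 1)
score_sums <- xtabs(score ~ gender + group, data = survey)

# A first model: does score differ between groups?
fit <- lm(score ~ group, data = survey)
fit_anova <- anova(fit)
```

Printing counts, row_props, or fit_anova at the console shows the kind of output an SPSS user would otherwise get from Crosstabs or the ANOVA dialog.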
Although I "wrote the book" on R to SPSS migration, that was aimed at programmers and most SPSS users that I know prefer to "point-and-click" instead. A graphical user interface like Deducer (or R Commander) can help them feel at home while teaching them how R programming code works if they want to see it. Deducer's Plot Builder also does a nice job letting you create complex plots easily, and if you want to learn ggplot2 code, it will show you that as well. Ian did a great job with it!
However, while the SPSS graphical user interface covers 98% of what SPSS can do, Deducer covers perhaps 1% of what R can do. That's probably still 75% of what your average researcher needs, but R is so broad that to get the most out of it people will need to learn to program. The free version of my book, "R for SAS and SPSS Users" is only 80 pages & covers the areas of programming that I think are most likely to confuse beginners. It's at http://r4stats.com.
Just recently I've had a student who was somewhat versed in statistics and did some analysis beforehand in SPSS. I then showed him how to do the exact same thing in R. We went through the code and plotting, explaining and debating each line. He realized how easy and convenient it is to do it in R. Thus, the R community grew by 1. :)
The biggest issue that the researchers I've dealt with have is the lack of point-and-click GUI. While there are a number of efforts out there in the R community, none of them have reached the ease-of-use/power level that SPSS has.
Since coding is second nature to R users, sometimes we forget that the majority of users of statistical software can't program (and would avoid it like the plague), even though they may have a strong practical understanding of statistics.
If I had one day to bring an SPSS user into R, I'd start them on Deducer. Deducer is an R GUI project (Self promotion note: I'm the author) that should feel very familiar to a user coming from SPSS. As they find themselves needing more advanced functions, they will naturally move to the command line to fulfill their needs.

R and SPSS difference

I will be analysing a vast amount of network traffic related data shortly, and will pre-process the data in order to analyse it. I have found that R and SPSS are among the most popular tools for statistical analysis. I will also be generating quite a lot of graphs and charts. Therefore, I was wondering what the basic difference between these two software packages is.
I am not asking which one is better, but just wanted to know what the differences are in terms of workflow between the two (besides the fact that SPSS has a GUI). I will be mostly working with scripts in either case anyway so I wanted to know about the other differences.
Here is something that I posted to the R-help mailing list a while back, but I think that it gives a good high level overview of the general difference in R and SPSS:
When talking about user friendliness of computer software I like the analogy of cars vs. buses:
Buses are very easy to use, you just need to know which bus to get on, where to get on, and where to get off (and you need to pay your fare). Cars on the other hand require much more work, you need to have some type of map or directions (even if the map is in your head), you need to put gas in every now and then, you need to know the rules of the road (have some type of driver's licence). The big advantage of the car is that it can take you a bunch of places that the bus does not go and it is quicker for some trips that would require transferring between buses.
Using this analogy programs like SPSS are buses, easy to use for the standard things, but very frustrating if you want to do something that is not already preprogrammed.
R is a 4-wheel drive SUV (though environmentally friendly) with a bike on the back, a kayak on top, good walking and running shoes in the passenger seat, and mountain climbing and spelunking gear in the back.
R can take you anywhere you want to go if you take time to learn how to use the equipment, but that is going to take longer than learning where the bus stops are in SPSS.
There are GUIs for R that make it a bit easier to use, but also limit the functionality that can be used that easily. SPSS does have scripting which takes it beyond being a mere bus, but the general philosophy of SPSS steers people towards the GUI rather than the scripts.
I work at a company that uses SPSS for the majority of our data analysis, and for a variety of reasons - I have started trying to use R for more and more of my own analysis. Some of the biggest differences I have run into include:
Output of tables - SPSS has basic tables, general tables, custom tables, etc. that are all output to that nifty data viewer or whatever they call it. These can relatively easily be transported to Word documents or Excel sheets for further analysis / presentation. The equivalent function in R involves learning LaTeX or using odfWeave or LyX or something of that nature.
Labeling of data - SPSS does a pretty good job with the variable labels and value labels. I haven't found a robust solution for R to accomplish this same task.
You mention that you are going to be scripting most of your work, and personally I find SPSS's scripting syntax absolutely horrendous, to the point that I've stopped working with SPSS whenever possible. R syntax seems much more logical and follows programming standards more closely AND there is a very active community to rely on should you run into trouble (SO for instance). I haven't found a good SPSS community to ask questions of when I run into problems.
Others have pointed out some of the big differences in terms of cost and functionality of the programs. If you have to collaborate with others, their comfort level with SPSS or R should be a factor, as you don't want to be the only one in your group that can work on or edit a script that you wrote in the future.
If you are going to be learning R, this post on the stats exchange website has a bunch of great resources for learning R: https://stats.stackexchange.com/questions/138/resources-for-learning-r
The initial workflow for SPSS involves justifying writing a big fat cheque. R is freely available.
R has a single language for 'scripting', but don't think of it like that, R is really a programming language with great data manipulation, statistics, and graphics functionality built in. SPSS has 'Syntax', 'Scripts' and is also scriptable in Python.
Another biggie is that SPSS squeezes its data into a spreadsheety table structure. Dealing with other data structures is probably very hard, but comes naturally to R. I wouldn't know where to start handling network graph type data in SPSS, but there's a package to do it for R.
Also with R you can integrate your workflow with your reporting by using Sweave - you write a document with embedded bits of R code that generate plots or tables, run the file through the system and out comes the report as a PDF. Great for when you want to do a weekly report, or you do a body of work and then the boss gives you an updated data set. Re-run, read it over, it's done.
But you know, your call...
Well, are you a decent programmer? If you are, then it's worthwhile to learn R. You can do more with your data, both in terms of manipulation and statistical modeling, than you can with SPSS, and your graphs will likely be better too. On the other hand, if you've never really programmed before, or find the idea of spending several months becoming a programmer intimidating, you'll probably get more value out of SPSS. The level of stuff that you can do with R without diving into its power as a full-fledged programming language probably doesn't justify the effort.
There's another option -- collaborate. Do you know someone you can work with on your project (you don't say whether it's academic or industry, but either way...), who knows R well?
There's an interesting (and reasonably fair) comparison between a number of stats tools here
http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/
I work with both in a company and can say the following:
If you have a large team of different people (not all data scientists), SPSS is useful because it is relatively easy to understand. For example, if users are going to run a model to get an output (sales estimates, etc.), SPSS is clear and easy to use.
That said, I find R better in almost every other sense:
R is faster (although, sometimes debatable)
As stated previously, the syntax in SPSS is awful (I can't stress this enough). On the other hand, R can be painful to learn, but there are tons of resources online and in the end it pays off much more because of the different things you can do.
Again, like everyone else says, the sky is the limit with R. Tons of packages, resources and more importantly: independence to do as you please. In my organization we have some very high level functions that get a lot done. The hard part is creating them once, but then they perform complicated tasks that SPSS would tangle in a never ending web of canvas. This is especially true for things like loops.
It is often overlooked, but R also has plenty of features to cooperate between teams (github integration with RStudio, and easy package building with devtools).
Actually, if everyone in your organization knows R, all you need is to maintain a basic package on github to share everything. This of course is not the norm, which is why I think SPSS, although a worse product, still has a market.
I have no data for it, but from my experience I can tell you one thing:
SPSS is a lot slower than R. (And with a lot, I really mean a lot)
The magnitude of the difference is probably as big as the one between C++ and R.
For example, I never have to wait longer than a couple of seconds in R. Using SPSS and similar data, I had calculations that took longer than 10 minutes.
As an unrelated side note: In my eyes, in the recent discussion on the speed of R, this point was somehow overlooked (i.e., the comparison with SPSS). Furthermore, I am astonished how this discussion popped up for a while and silently disappeared again.
There are some great responses above, but I will try to provide my 2 cents. My department completely relies on SPSS for our work, but in recent months, I have been making a conscious effort to learn R; in part, for some of the reasons itemized above (speed, vast data structures, available packages, etc.)
That said, here are a few things I have picked up along the way:
Unless you have some experience programming, I think creating summary tables in CTABLES destroys any available option in R. To date, I am unaware of a package that can replicate what can be created using Custom Tables.
SPSS does appear to be slower when scripting, and yes, SPSS syntax is terrible. That said, I have found that scripts in SPSS can always be improved by using the EXECUTE command sparingly.
SPSS and R can interface with each other, although it appears that it's one way (only when using R inside of SPSS, not the other way around). That said, I have found this to be of little use other than if I want to use ggplot2 or for some other advanced data management techniques. (I despise SPSS macros).
I have long felt that "reporting" work created in SPSS is far inferior to other solutions. As mentioned above, if you can leverage LaTeX and Sweave, you will be very happy with your efficient workflows.
I have been able to do some advanced analysis by leveraging OMS in SPSS. Almost everything can be routed to a new dataset, but I have found that most SPSS users don't use this functionality. Also, when looking at examples in R, it just feels "easier" than using OMS.
In short, I find myself using SPSS when I can't figure it out quickly in R, but I sincerely have every intention of getting away from SPSS and using R entirely at some point in the near future.
SPSS provides a GUI to easily integrate existing R programs or develop new ones. For more info, see the SPSS Community on IBM Developer Works.
@Henrik, I did the same task you have mentioned (C++ and R) in SPSS. And it turned out that SPSS is faster compared to R on this one. In my case SPSS is approx. 7 times faster. I am surprised about it.
Here is the code I used in SPSS.
data list free
/x (f8.3).
begin data
1
end data.
comp n = 1e6.
comp t1 = $time.
loop #rep = 1 to 10.
comp x = 1.
loop #i=1 to n.
comp x = 1/(1+x).
end loop.
end loop.
comp t2 = $time.
comp elipsed = t2 - t1.
form elipsed (f8.2).
exe.
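For anyone who wants to time the same thing on the R side, below is a direct transcription of the SPSS loop above into R. Whether it matches the R code Henrik originally benchmarked is an assumption, and the function name iterate_recurrence is made up for illustration.

```r
# Same computation as the SPSS syntax: 10 repetitions of n iterations
# of x <- 1 / (1 + x), restarting from x = 1 each repetition.
iterate_recurrence <- function(n, reps = 10) {
  x <- 1
  for (r in seq_len(reps)) {
    x <- 1
    for (i in seq_len(n)) {
      x <- 1 / (1 + x)
    }
  }
  x
}

# system.time() reports user, system, and elapsed seconds for the call.
timing <- system.time(result <- iterate_recurrence(1e6))
elapsed_seconds <- timing[["elapsed"]]
```

The recurrence converges to (sqrt(5) - 1) / 2, so result can double as a quick correctness check. Timings will differ by machine and R version, so compare the two programs on the same hardware.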
Check out this video on why it is good to combine SPSS and R:
http://bluemixanalytics.wordpress.com/2014/08/29/7-good-reasons-to-combine-ibm-spss-analytics-and-r/
If you have a compatible copy of R installed, you can connect to it from IBM SPSS Modeler and carry out model building and model scoring using custom R algorithms that can be deployed in IBM SPSS Modeler. You must also have a copy of IBM SPSS Modeler - Essentials for R installed. IBM SPSS Modeler - Essentials for R provides you with tools you need to start developing custom R applications for use with IBM SPSS Modeler.
The truth is: both packages are useful if you do data analysis professionally. Sure, R / RStudio has more statistical methods implemented than SPSS. But SPSS is much easier to use and gives more information per button click. Therefore, it is faster to use whenever a particular analysis is implemented in both R and SPSS.
In the modern age, neither CPU nor memory is the most valuable resource. Researcher's time is the most valuable resource. Also, tables in SPSS are more visually pleasing, in my opinion.
In summary, R and SPSS complement each other well.
