What is currently the best workflow for statistical analysis and report writing? [closed] - r

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
The question "Workflow for statistical analysis and report writing" had a lot of good answers, but as pointed out, they are outdated.
I mostly work on scripts that will probably never be re-run after a paper has been published. Are packages worth the trouble in cases where I don't need to redistribute the code to the world for easy access? What about the organization of data? How can makefiles be used?

I think if you use the basics laid out by Josh Reichs in that post you provided, making sure that you create a directory to save everything in, then you are good to go.
My added step for the modern world would be to produce a markdown report in one of the available formats:
R Markdown - which you can run right out of RStudio
R Notebooks - which you can run right out of RStudio
Jupyter Notebooks - which you can run out of Anaconda or Jupyter with some easy tweaking
The beauty of these three report systems is that you get to integrate the thought process, code, data, graphs and visualizations in a single spot.
So if, as you say, no one will ever re-run your code, they will at least be able to read it, which helps allay suspicions. And if they do choose to repeat your process, they can just follow your logic in a duplicate document (especially easy with the notebooks).
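The idea of keeping everything in one project directory and ending with a single report can be sketched roughly as follows. This is only a minimal Python illustration (the Jupyter route mentioned above), not the workflow from the linked post: the stage names, the file names `data.csv` and `report.md`, and the toy data are all hypothetical.

```python
import csv
import statistics
import tempfile
from pathlib import Path


def load(path):
    """Read the raw CSV into a list of dict rows."""
    with open(path, newline="") as fh:
        return list(csv.DictReader(fh))


def clean(rows):
    """Drop rows with a missing value and convert the rest to float."""
    return [float(r["value"]) for r in rows if r["value"].strip()]


def analyse(values):
    """Compute the summary statistics the report will quote."""
    return {"n": len(values), "mean": statistics.mean(values)}


def write_report(summary, path):
    """Write a small markdown report next to the data."""
    lines = [
        "# Analysis report",
        "",
        f"- observations: {summary['n']}",
        f"- mean: {summary['mean']:.2f}",
    ]
    Path(path).write_text("\n".join(lines) + "\n")


def run(project_dir):
    """Run every stage in order so the whole analysis is one call."""
    project_dir = Path(project_dir)
    summary = analyse(clean(load(project_dir / "data.csv")))
    write_report(summary, project_dir / "report.md")
    return summary


if __name__ == "__main__":
    # Hypothetical toy data, written to a throwaway project directory.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "data.csv").write_text("id,value\n1,1.0\n2,\n3,2.0\n4,6.0\n")
        print(run(tmp))
        print(Path(tmp, "report.md").read_text())
```

Because each stage is a plain function, a reader who never re-runs the analysis can still follow it top to bottom, and one who does only needs to call `run` on a copy of the directory.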
As for using packages, that is a more complex question. If the packages are well orchestrated and save you a ton of time cleaning, sorting and structuring data, USE THEM! Time is money. If the things you are using them for are simple, straightforward, just as easy to program yourself and recognizable by those who would jury your paper, it probably does not matter either way.
The one place where I feel it matters is complex processes that are difficult (read that as easy to do wrong yourself) and have been implemented, tested and vetted by prior researchers.
Using those packages garners credibility and makes it easier for peers to accept your methods at face value. But if you are on the cutting edge, you should feel free to slice away. Maybe make a package of your own!

Related

What are the limitations of Shiny apps compared to another web programming language like Ruby on Rails? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I am using Shiny to create interactive graphs on a website, but it doesn't seem to have support for things like comment threads, or database storage. Are you supposed to somehow use Shiny within another language?
This question was downvoted, and I hope I won't lose scarce rep points by answering it. I can't speak for the Shiny development team, and I'm only a novice Shinyapps developer, but ...
It seems to me that Shiny aims to make it easy for R programmers to build small to medium-sized, self-contained, web-based, graphic-centric interactive data-analysis displays, without adding an unreasonable amount of code to what they wrote to do their actual work, i.e. the analysis. This is a fairly common requirement for researchers and practitioners (as opposed to full-time professional developers) coming from the R heritage and culture (stats and data science). Shiny achieves this aim pretty well!
You can find out more about the kinds of problems that Shiny aims to solve by going to the source. Note that it says "Turn your analyses into interactive web applications", not "Build a full-service website with interactive chat and a backing store". It sounds as if you want something different in scale and kind, and you may be wasting time by trying to shoehorn your requirement into the Shiny problem/solution space. I've occasionally hammered nails into wood using a pair of pliers because my toolbox was at the bottom of the ladder, but that didn't make it the right thing to do!

Modularization of PL/SQL packages [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Currently I am doing a restructuring project mainly on the Oracle PL/SQL packages in our company. It involves working on many of the core packages of our company. We never had documentation for the back end work done so far and the intention of this project is to create a new set of APIs based on the current logic in a structured way along with avoiding all unwanted logic that currently exists in the system.
We are also making a new module currently for the main business of the organization that would work based on these newly created back-end APIs.
As I started off this project, I found out that most of the wrapper APIs had more than 8000 lines of code. I managed to convert this code into many single APIs and invoked them from the wrapper API.
This activity in itself has been a time-consuming process but I was able to cut down the number of lines of code to just 900 in the wrapper API by calling independent APIs for each business functionality.
I would like to know from you experts if this mode of modularizing the code is good and worth the time invested in it as I am not sure if it would have many performance benefits.
But from a code readability perspective, this is definitely helping and now I am able to understand the 8000 lines of code much better after restructuring and I am sure the other developers in my organization too will understand.
Please let me know if I am doing the right thing, and if it has advantages apart from readability, please do mention them. Sorry for the long explanation.
And is it okay to have more than 1000 lines of code in a wrapper API?
Easy to debug
Easy to update
Easy to Modify/maintain
Less change proneness due to low coupling.
Increases reuse if the modules are made generic
Can identify unused code easily
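The low-coupling points above can be illustrated with a thin wrapper that does nothing but sequence calls to small, independent routines. The question is about PL/SQL; this is a language-neutral sketch written in Python, and every function name and business rule in it is hypothetical.

```python
def validate_order(order):
    """Reject orders that cannot be processed."""
    if order["quantity"] <= 0:
        raise ValueError("quantity must be positive")
    return order


def price_order(order, unit_price):
    """Return the order subtotal; knows nothing about validation or tax."""
    return order["quantity"] * unit_price


def apply_tax(amount, rate):
    """Add tax to an amount, rounded to two decimal places."""
    return round(amount * (1 + rate), 2)


def process_order(order, unit_price, tax_rate):
    """Wrapper API: only sequences the independent steps."""
    validate_order(order)
    subtotal = price_order(order, unit_price)
    return apply_tax(subtotal, tax_rate)


if __name__ == "__main__":
    print(process_order({"quantity": 3}, 10.0, 0.25))
```

Each routine can be tested, replaced or reused without touching the others, so a change to the tax rule, for example, stays inside `apply_tax` and the wrapper remains short.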

Simple installed tool for digital Scrum Board [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
I am looking for a basic and simple-to-install digital version of a Scrum board.
I do prefer physical index cards, but in this case logistics makes it hard. Thus, I need to have it on the computer.
No real need to share data between several clients. To us it is enough if it runs on one single machine.
Just need basic functionality. A drag-drop board and a sprint burndown would do fine.
Due to regulatory constraints I cannot use an online SaaS; I must keep the data local.
Time is short, so simple install and ready-to-go.
Does not need to be free, but of course price is interesting.
I have not had this set of constraints earlier, so I am unfamiliar.
I have done some research and have some general experience. For example, VersionOne, Mingle and Hansoft seem to have a good reputation. Can anyone comment on how those fit the above list? Does anyone have other recommendations?
This thread is a bit old now, but I am leaving my find in the hope of helping others searching the same topic.
If you are looking for a simple tool for developers to collaborate on a Scrum project, http://trello.com/ is very simple and intuitive. Absolutely no clutter and easily lets a small team manage their cards.
I would have a look at Atlassian Jira with the GreenHopper plugin - it has a nice dashboard.
http://www.atlassian.com/software/greenhopper/
Have a look at Mingle from ThoughtWorks. A really great tool.
Free download/install for 1 year / 5 users.
Excel (or OpenOffice) spreadsheet? Why do you need a special tool for this?
I had a similar decision to make a year ago and went for Version One Team Edition - which is free.
http://www.versionone.com/Product/Compare_Editions.asp
It's easy to deploy the SQL database wherever you want it - so locally in your case.
Our team found using the software easy and intuitive.
The free version (up to 10 users) has ample features - the sprints/stories/tasks are easy to setup and view. The burndown chart is good.
All in all, I've no regrets with choosing Version One - it's easy to install, easy to use and free.

What's a good example of really clean and clear [R] code, for pedagogical purposes? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I'm working with a small team of analysts and statisticians on what will be a medium-sized body of R code. They're smart people, but they're not trained or experienced as programmers, per se. (I am.) They've written some R code, but for our project to be expandable, efficient, and maintainable, it needs to become well-structured, and rather more piratical. One of the better ways to learn to be a better programmer is to study elegant existing code. Can anyone suggest some open source examples of R code (on CRAN or wherever) that you think are particularly clear, literate, and good examples? Functional is good, S3 objects are OK, deep magic is bad.
My two favorite packages can both be browsed on R-Forge and are very well documented (although they may be too big for an introduction):
The caret homepage and source code.
The zelig homepage and source code.
I think that the Google style guide does a great job of capturing the style of the Core team, although Hadley has his own style guide which can be read if you're looking at his packages. You can browse Hadley's packages on Github (and his homepage is full of useful content), in particular:
plyr
ggplot2
reshape
This article on the R-Wiki is also a good read for seeing ways to optimize code.
Not strictly related, but make sure you get them used to using Source Control (perforce, subversion, git, rcs, etc) as quickly as possible. That reduces the learning pains.

Software Design Description Practise [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
How many people actually write an SDD document before writing a single line of code?
How do you handle large CSCI's?
What standard do you use for SDD content?
What tailoring have you done?
I certainly have. Historically and on recent projects.
Years ago I worked in organisations where templates were everything.
Then I worked other places where the templates were looser or non-existent or didn't fit the projects I was working on.
Now the content of the software design is pretty much governed by what I need to describe to get the idea across to the audience.
"Before writing a single line of code" there wouldn't be a lot of detail. The documents I produce before I start coding are meant to get the idea of what we need to build across to the affected teams and senior management, so they introduce high-level architecture, functionality, technologies, risks and scope. Those last two are really important. The rest is to show other teams where you need to interface with them and to leave managers with a lingering notion that cool stuff is happening.
Most big software companies have their own practices. For example, Motorola has detailed documentation for every aspect of the software development process. There are standard templates for each type of document. Having strict standards makes it possible to maintain a huge number of documents effectively and integrate them with different tools. Each document obtains a tracking number from a special document-tracking system. They even have a system (the last time I saw it, it was at an early stage of development) for automatic requirements tracking: you can say which line of code relates to a given requirement or design guideline.
I would suppose that most people who write SDD documents and use terminology like CSCI have to be using a specific software development methodology and most likely are working for some serious government customer. They usually tend to take their preparations quite seriously and the documents are ready and approved before any development starts.
In an Agile process the development and the design document could be developed in parallel. It means that there will be plenty of refactoring to be done but it usually delivers very good results in the end.
In more formal processes (like RUP) a SAD document is mostly created during the elaboration/prototyping phase based on the team research.
