Since initially running package.skeleton to create a package, I have added several S3 classes. Each of these classes has 5-10 methods. I've discovered the wonderful prompt command for creating .Rd files from a function loaded into memory, but is it possible to have R automagically create a single help file that documents multiple functions? I'm thinking of something like an enhanced version of prompt to which you would pass a list of functions, and which would create a single .Rd file, adding only the information each additional function contributes.
For instance, if I have a generic called duration, and classes with methods duration.bond(market, ...), duration.account(market, time, ...), and duration.portfolio(market, ...), I would like prompt to create a help file with a \usage section containing each \method{} and an \arguments{} section containing market, time, and \dots.
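Roughly, the combined file I'm after would look like this (a hand-written Rd sketch, not actual prompt output; the \item descriptions are placeholders):

    \name{duration}
    \alias{duration}
    \alias{duration.bond}
    \alias{duration.account}
    \alias{duration.portfolio}
    \title{Duration}
    \usage{
    duration(market, \dots)
    \method{duration}{bond}(market, \dots)
    \method{duration}{account}(market, time, \dots)
    \method{duration}{portfolio}(market, \dots)
    }
    \arguments{
      \item{market}{...}
      \item{time}{...}
      \item{\dots}{...}
    }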
Any hope here? Copying and pasting is getting very tiring!
For completeness, here is what I chose to do: pick the method that has the most arguments, run prompt on that, and then add the other methods to the same help file by hand.
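In code, that is just (duration.account having the most arguments in this example):

    prompt(duration.account, filename = "duration.Rd")

followed by hand-editing duration.Rd to add \method{} lines and any extra arguments for the other classes.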
The other alternative would have been to use Rd2roxygen to convert everything that was already in .Rd back to Roxygen and then use Roxygen for the whole project. This will likely be what I do in the next release.
You could roll-your-own by reading in a template help file (with readLines), then editing it to suit each particular case (judicious use of paste and gsub), then writing the result back out to file (via writeLines).
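A minimal sketch of that idea (the template path and the @TOKEN@ placeholders are invented for illustration):

    # Read a hand-written Rd template containing placeholder tokens
    rd <- readLines("templates/generic.Rd.in")
    # Fill in the tokens for this particular generic
    rd <- gsub("@GENERIC@", "duration", rd, fixed = TRUE)
    rd <- gsub("@ARGS@", "market, time, \\dots", rd, fixed = TRUE)
    # Write the finished help file into the package's man/ directory
    writeLines(rd, "man/duration.Rd")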
I am working on files used to monitor health plan data, and each type of report comes in the same template. I have created macros to automate the oversight of the files (finding errors, gaps in data, logical improbabilities, etc.). Now that this is in place and the code has been cleaned up, I am trying to automate applying those macros to the files. For instance, for one type of report I have 10 client files to review. Currently I go through the painstaking process of opening each file, dropping in the macro, applying it to the file, removing the macro (so that the clients can't take my code), and saving the resulting file. I repeat that process for each client, and then do a similar process for many other reports. I know there has to be a better way, and I am wondering if anyone has experience in this field and might be able to point me in the right direction. I use RStudio for another process we automated and I believe I could utilize it here as well; I just need a jumping-off point.
Manual intervention will still be needed to review the results of the macros, but I am hoping to eliminate the unnecessary manual touchpoints.
I really appreciate any advice / knowledge you can share.
Unless the macro you've written contains some very specific functions that don't have Python equivalents, I'd recommend simply abandoning VBA and manipulating your Excel sheets in Python via xlwings or openpyxl. If your data is very "database-like" in that the top row is simply column headers and every additional row contains nothing but data aligned to those headers, you can also use pandas to process the data as well.
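For instance, a database-like sheet can be pulled straight into a DataFrame and checked (the file and column names here are made up):

    import pandas as pd

    # The top row becomes the column headers
    df = pd.read_excel("client_report.xlsx")

    # Flag gaps in the data, e.g. rows with a missing member ID
    print(df[df["member_id"].isna()])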
If you do need access to functions built directly into Excel that don't have Python equivalents, you can use win32com to communicate with Excel from Python. This library essentially drives Excel via its COM interface. You can then either use the COM libraries directly to execute an equivalent of your VBA from within Python, or, if you prefer to stick with VBA, you can paste your VBA code into your Python script and inject it into the workbook like in this example. From there, you can also remove your VBA code from the Excel sheet as shown in this example.
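As a rough sketch of that inject-run-remove cycle (the macro body, file path, and names are illustrative, and Excel must have "Trust access to the VBA project object model" enabled in the Trust Center):

    import win32com.client

    VBA_CODE = '''Sub CheckReport()
        ' ... your macro body here ...
    End Sub'''

    excel = win32com.client.Dispatch("Excel.Application")
    excel.Visible = False
    wb = excel.Workbooks.Open(r"C:\reports\client1.xlsx")

    # Inject the macro as a new standard module (1 = vbext_ct_StdModule)
    module = wb.VBProject.VBComponents.Add(1)
    module.CodeModule.AddFromString(VBA_CODE)

    # Run the macro, then remove the module so clients never see the code
    excel.Run("CheckReport")
    wb.VBProject.VBComponents.Remove(module)

    wb.Save()
    wb.Close()
    excel.Quit()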
A pure VB solution would involve making essentially these same calls to inject and subsequently remove the VBComponent in an environment outside your Excel workbook.
You may find it vastly easier to solve problems like this with a popular scripting language like Python, since the support community around it is much larger than VBA's. VBA tends to be unpopular among developers, so its support community also tends to be small. Large support communities also mean well-maintained and highly convenient libraries such as the aforementioned xlwings and openpyxl.
I found the following line in Hadley Wickham's book about R packages:
While you’re free to arrange functions into files as you wish, the two extremes are bad: don’t put all functions into one file and don’t put each function into its own separate file (See here).
But why? These seem to be the two options that make the most sense to me, and keeping just one file is especially appealing.
From a user's perspective:
When I want to know how something works in a package, I often go to the respective GitHub page and look at the code. Having functions organised across different files makes this a lot harder, and I regularly end up cloning a repository just so I can search the contents of all the files (e.g. via grep -rnw '/path/to/somewhere/' -e 'function <-').
From a developer's perspective:
I also don't really see the upside for developing a package. Browsing through a big file doesn't seem much harder than browsing through a small one if you use the outline window in RStudio. I know about the Ctrl + . shortcut, but it still means I have to open a new file when working on a different function, whereas Ctrl + . could do basically the same job if I kept just one file.
Wouldn't it make more sense to keep all functions in one single file? I know different people like to organise their projects in different ways, and that is fine. I'm not asking for opinions here; rather, I would like to know if there are any real disadvantages to keeping everything in one file.
There is obviously no single answer to that. One obvious question is the size of your package: if it contains only those two neat functions and a plot command, why bother organizing it in any elaborate manner? Just put everything into one file and you are good to go.
If your project is large, say you are trying to overhaul the R graphics system and write lots and lots of functions (geoms and stats and ...), a lot of small files might be a better idea. But then again, having more files than there is room for tabs in RStudio might not be such a good idea either.
Another important question is whether you intend to develop alone or with hundreds of people on GitHub. You might prefer to review a change to a small file, as opposed to the one big file that was so easy to search back when you were working alone.
The people who invented Java enforced a one-file-per-class rule, and C# seems to have something similar. That does not mean those people are less clever than Hadley. It just means that your mileage may vary and that you are entitled to disagree with Hadley's opinion.
Why not put all files on your computer in the root directory?
Ultimately, if you rely on the file tree alone, you are back to handling everything as single entities.
Putting things that conceptually belong together into the same file is the logical continuation of putting things into directories/libraries.
If you write a library and define a function as well as some convenience wrappers around it, it makes sense to put them in one file.
Navigating the file tree is easier because you have fewer files, and navigating each file is easier because it doesn't contain all of the functions.
I finally arrived at a point in using R where my programs are no longer overgrown command-line scripts but real code. At this point, I think it no longer makes sense to keep all the functions used by the main code in the same source file. Now, if I understand correctly, the way to use a function myfunction, stored in the file hereliesfunction.r, from a script stored in the file myscript.r, is to add the line

source("hereliesfunction.r")

to myscript.r, before the point where myfunction is used.
1. Is this the right approach in R?
2. Do I need a different source command for each function used by my main code? I guess it works "recursively", i.e., I can put source commands in hereliesfunction.r to let myfunction use other functions.
3. What happens when I return from myfunction? Do these other functions remain in memory, ready to be accessed by the main code too, or are they destroyed just like any other object created by myfunction?
4. Finally, is there some guideline on whether to store all the functions used by a main code in the same directory as the main code, or not?
Once you source an R file, R runs all the commands in that file. If the file contains a function definition, the function is stored in the global environment and is at your disposal until you remove it or close the R session (so the answer to 3. is yes).
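For example, assuming hereliesfunction.r defines myfunction:

    source("hereliesfunction.r")  # runs the file; myfunction is now defined
    ls()                          # myfunction shows up in the global environment
    rm(myfunction)                # gone again until you re-source the file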
Your entire post is screaming R package. As @docendodiscimus has pointed out, you should invest some time in developing a package. Not only does it hold your code in one place and make it easy to maintain, it also offers a great platform for documenting your code (probably the most important part of code development/analysis) through help files and vignettes, and it offers easy version control through local and remote repositories (git, svn, ...).
[about sourcing] Is this the right approach in R?
Yes, but in the mid-term, consider building a package, as stated by @docendo discimus. devtools::create() and, if you use RStudio, Projects > New Package are your friends. Learning to build packages is made simple by Hadley's R packages book and was, personally, the best investment I ever made in R. Plus, documenting, writing tutorials/vignettes, and writing tests is always useful: it may seem time-consuming at first glance, but you will probably soon benefit hugely from it (better understanding of your code, realizing you can improve the package architecture, etc.).
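For instance, to scaffold a new package (the name is a placeholder):

    devtools::create("mypackage")  # creates DESCRIPTION, NAMESPACE and R/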
Do I need a different source command for each function used by my main code?
All functions, and more generally all code, in the sourced file will be executed in R, so the functions will be defined and available; you can check this with ls().
I guess it works "recursively", i.e., I can put source commands in hereliesfunction.r to let myfunction use other functions.
Yes
What happens when I return from myfunction? Do these other functions remain in memory, ready to be accessed by the main code too, or are they destroyed just like any other object created by myfunction?
I am not sure I understand, but this may be covered by the previous points.
Finally, is there some guideline on whether to store all the functions used by a main code in the same directory as the main code, or not?
You can store them wherever you want, as long as the path given to source is the right one. It is generally better practice, though, to store all your functions in the same directory (or in a subfolder, e.g. /code), so that you only change your working directory once (and if you use RStudio projects, you don't even need to bother: you just open the project). As a side effect, as long as everyone works in the same directory, relative paths will still work, so you can share the folder via Dropbox or similar, which eases collaboration.
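As a sketch, with a /code subfolder you can then load every helper in one go:

    # Source every .R file in the code/ subfolder of the project
    for (f in list.files("code", pattern = "\\.[Rr]$", full.names = TRUE)) {
      source(f)
    }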
Again, in the mid-term, or if many projects use the same source files, it's probably a good idea to write a package (for your own use, or to share on GitHub or CRAN or ...).
I am trying to establish some good practices and have recently moved to using git for version control.
One set of scripts I use is for producing measurement uncertainty estimates from laboratory data. The same scripts are used on different data files and produce a set of files and graphs based on that data. The core scripts change infrequently.
Should I create a branch for each new data set? Is this efficient?
Should I use one set of scripts and just manually relocate output files to a separate location after each use?
There are a few different aspects here that should be touched on. I will try to provide my opinions/recommendations for each.
The core scripts change infrequently.
This sounds to me like you should make an R package of your own. If you have some core functions that aren't supposed to change, it would probably be best to package them together. Ideally, you design the functions so that the code behind each doesn't need to be modified and you just change an argument (or even begin exploring R's S3 or S4 classes).
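As an illustration of "change an argument instead of the code" (the function and argument names are invented):

    # One stable core function; behaviour is selected through arguments,
    # so the code itself never needs editing between data sets
    estimate_uncertainty <- function(data, method = c("gum", "bootstrap"),
                                     conf_level = 0.95) {
      method <- match.arg(method)
      # ... estimation code dispatching on `method` ...
    }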
For the custom scripting, you could provide a vignette for yourself demonstrating how you approach a data set. If you want to save each final script but don't want to store the scripts locally, I would store them in the inst/examples directory so you can call them again if you need to re-run an analysis.
Should I create a branch for each new data set? Is this efficient?
No, I would generally never recommend that someone put their data on GitHub, and it is not 'efficient' to create a new branch for each new data set either. The idea behind creating another branch is to add a new aspect/component to an existing project. Simply adding a dataset and modifying some scripts is, IMHO, a poor use of a branch.
What you should do with your data depends on its characteristics. Is the data large? Would it benefit from an RDBMS? You at least want it backed up on a local laboratory hard drive. Secondly, if you are academically minded, once you finish analyzing the data you should look into an online repository so that others can also analyze it. If the datasets are small and not sensitive, you could also put them in your package's data directory.
Should I use one set of scripts and just manually relocate output files to a separate location after each use?
No. With your core functions/scripts in a package, I would recommend you look into creating a wrapper for this part that provides an argument to specify the output path.
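A minimal sketch of such a wrapper (run_checks() stands in for your packaged core routines; all names are hypothetical):

    process_report <- function(data_file, output_dir) {
      results <- run_checks(read.csv(data_file))  # core routines from the package
      dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)
      write.csv(results, file.path(output_dir, "uncertainty_estimates.csv"),
                row.names = FALSE)
    }

    # e.g. process_report("lab_batch_07.csv", "output/batch_07")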
I hope these comments help you.
Is it necessary to give WordPress functions (in a template) unique names? I think that if I give them simple names, they may conflict with plugins (if the user installs any in the future). Is that true?
Sorry for the stupid question.
You should read the naming conventions here: http://code.tutsplus.com/articles/the-wordpress-coding-standards-naming-conventions-and-function-arguments--wp-31683
Function Names
As mentioned earlier, if classes are nouns that ideally represent a single idea or single purpose, then their methods should be the actions that they are able to take. As such, they should be verbs - they should indicate what action will be taken whenever they are called.
Furthermore, the arguments that a function accepts should also factor into its name. For example, if a function is responsible for opening a file, then its parameter should be a file name. Since our goal should be to make the code as easy as possible to read, it should read something like "have the local file manager read the file having the following file name."
Use lowercase letters in variable, action, and function names (never camelCase). Separate words with underscores. Don't abbreviate variable names unnecessarily; let the code be unambiguous and self-documenting.
Of course, there are always worse offenders: some developers resort to using single characters for their variable names (which is generally acceptable only within loops).
Just as the Coding Standards state: don't abbreviate variable names unnecessarily. Let the code be unambiguous and self-documenting.
Now, the truth is, code can only be unambiguous to a point.
Anyway, the bottom line is: lowercase your function names, avoid camel casing, separate words with underscores, be as specific as possible when naming your variables, and avoid duplicate names.
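Putting those rules together, the usual WordPress defence against clashes with plugin functions is a theme prefix plus a function_exists() guard ("mytheme" here is a placeholder):

    <?php
    // Lowercase, underscore-separated, unabbreviated, and prefixed so a
    // plugin defining a function with the same short name cannot collide.
    if ( ! function_exists( 'mytheme_get_featured_posts' ) ) {
        function mytheme_get_featured_posts( $count ) {
            return get_posts( array(
                'numberposts' => $count,
                'meta_key'    => 'featured',
            ) );
        }
    }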