Code reviews when re-formatting has also been done - code-cleanup

We are working on a massive code base which, over the years, has accumulated several different formatting styles. We are keen to clean this up and have a consistent style. Up until now we have followed the Boy Scout rule and tidied up as we worked in those areas; however, this presents a reviewing nightmare. For example, I have just had to review a 3500-line file in which around 40% of the lines had changed, but only 2 lines of code were actually added for the issue being addressed; all the other changes were from running the IDE auto-formatter tool.
I'm just wondering how other people/companies are dealing with this? I'm thinking it might be best to separate the 'doing work' and 'code cleanup' into 2 separate stories?
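To make the reviewing problem concrete, here is a minimal Python sketch of how the substantive lines could be picked out of a reformatted file. It is an illustration only, not a tool anyone in this thread uses: the function names are hypothetical, and it assumes the formatter only changed whitespace within lines and added or removed blank lines.

```python
import difflib


def normalize(line: str) -> str:
    """Remove all whitespace so spacing and indentation changes compare equal."""
    return "".join(line.split())


def substantive_changes(old_text: str, new_text: str) -> list[tuple[str, str]]:
    """Return ("-", line) / ("+", line) pairs that differ by more than whitespace."""
    # Blank lines are dropped first so blank-line churn does not show up.
    old_lines = [l for l in old_text.splitlines() if l.strip()]
    new_lines = [l for l in new_text.splitlines() if l.strip()]
    old_keys = [normalize(l) for l in old_lines]
    new_keys = [normalize(l) for l in new_lines]

    matcher = difflib.SequenceMatcher(a=old_keys, b=new_keys, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Report the original (un-normalized) lines so they stay readable.
        changes += [("-", l) for l in old_lines[i1:i2]]
        changes += [("+", l) for l in new_lines[j1:j2]]
    return changes
```

A formatter that re-wraps long lines or reorders code would still show up as changes here, and whitespace inside string literals is ignored as well, so treat this as a rough filter. If the code is in git, `git diff -w` gives a similar whitespace-insensitive view without any extra tooling.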

Related

Why not organise all functions in a package in one file? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I found the following line in Hadley Wickham's book about R packages:
While you’re free to arrange functions into files as you wish, the two extremes are bad: don’t put all functions into one file and don’t put each function into its own separate file (See here).
But why? This would seem to be the two options which make most sense to me. Especially keeping just one file seems appealing to me.
From a user's perspective:
When I want to know how something works in a package I often go to the respective GitHub page and look at the code. Having functions organised in different files makes this a lot harder and I regularly end up cloning a repository just so I can search the content of all files (e.g. via grep -rnw '/path/to/somewhere/' -e 'function <-').
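For what it's worth, the "search across all files" step can be scripted once the repository is cloned. The following is a rough Python sketch rather than anything from the book or from RStudio; `find_definitions` is a hypothetical helper, and it assumes definitions are written as `name <- function(...)` or `name = function(...)` at the start of a line in files ending in `.R`.

```python
import re
from pathlib import Path

# Matches lines that start with `name <- function` or `name = function`.
DEFINITION = re.compile(r"^\s*([A-Za-z.][A-Za-z0-9._]*)\s*(?:<-|=)\s*function\b")


def find_definitions(package_dir: str) -> dict[str, tuple[str, int]]:
    """Map each function name to the (file, line number) where it is defined."""
    index = {}
    for path in sorted(Path(package_dir).rglob("*.R")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            match = DEFINITION.match(line)
            if match:
                index[match.group(1)] = (str(path), number)
    return index
```

With an index like this, the number of files matters much less for finding a function, which is part of why the choice comes down to other factors.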
From a developer's perspective:
I also don't really see the upside for developing a package. Browsing through a big file doesn't seem much harder than browsing through a small one if you employ the outline window in RStudio. I know about the Ctrl + . shortcut, but it still means I have to open a new file when working on a different function, while Ctrl + . could basically do the same job if I keep just one file.
Wouldn't it make more sense to keep all functions in one single file? I know different people like to organise their projects in different ways and that is fine. I'm not asking for opinions here. Rather, I would like to know if there are any real disadvantages of keeping everything in one file.
There is obviously not one single answer to that. One obvious question is the size of your package: if it contains only two neat functions and a plot command, why bother organizing it in any complicated manner? Just hack it into one file and you are good to go.
If your project is large, say you are trying to overthrow the R graphics system and are writing lots and lots of functions (geoms and stats and ...), a lot of small files might be a better idea. But then, having more files than there is room for tabs in RStudio might not be such a good idea either.
Another important question is whether you intend to develop alone or with hundreds of people on GitHub. You might prefer to accept a change in a small file as opposed to "the one big file" that was so easy to search back when you were alone.
The people who invented Java originally had a "one file per class" rule going, and C# seems to have something similar. That does not mean that those people are less clever than Hadley. It just means that your mileage may vary and you have the right to disagree with Hadley's opinions.
Why not put all files on your computer in the root directory?
Ultimately if you use a file tree you are back to using everything as single entities.
Putting things that conceptually belong together into the same file is the logical continuation of putting things into directories/libraries.
If you write a library and define a function as well as some convenience wrappers around it, it makes sense to put them in one file.
Navigating the file tree is made easier as you have fewer files and navigating the files is easier as you don't have all functions in the same file.

How to combine two SCORM 2004 modules

Suppose I have two SCORM 2004 modules - instruction.zip and test.zip. The first contains instructional web pages and the second contains an interactive quiz. Each package was authored separately. I want to combine them to create a single course of study in which students read the web pages and then test their knowledge. (I will leave to one side the significant issues of sequencing and navigation.)
What is the recommended way of combining the two? I have tried (i) merging the two (did not work due to differences in file structure and dependencies) and (ii) adding test.zip to instruction.zip as a complete package and adding links (issues with reporting of test results).
I realise that most people author their courses using Captivate or other software to produce a single integrated package. For reasons that need not be discussed here, that is not an option in my case: the test assets will be developed separately and need to be combined with the instructional assets.
Grant,
I've got a packager on my site https://cybercussion.com which may be able to help you. If there are any advanced features you're using, though, I haven't built out support for them yet. There is a 30 day trial for it.
You'd just need to expand the content into something like:
Multi-SCO/
    SCO Title 1/ [all SCO 1 files]
    SCO Title 2/ [all SCO 2 files]
You can also do this by hand by merging the imsmanifest organization markup together, which is an option if you're comfortable with XML. You'll just need to manage the organization and resource elements. You also may have DTD/XSDs as part of both packages.
Manually zipping this yourself could result in an error when importing into the LMS. Some platforms expect the imsmanifest.xml to be in the root of the zip, and if it's inside a folder it could error. So watch out for that.
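To illustrate the zipping pitfall, here is a small Python sketch using only the standard library. It is not part of the packager mentioned above; `zip_package` and `manifest_at_root` are hypothetical helpers, and the sketch assumes the merged imsmanifest.xml already sits directly inside the folder being zipped.

```python
import zipfile
from pathlib import Path


def zip_package(source_dir: str, zip_path: str) -> None:
    """Zip the contents of source_dir, not the folder itself.

    zip_path should live outside source_dir, otherwise the archive
    would try to include itself.
    """
    root = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir so the manifest ends up at the root.
                archive.write(path, path.relative_to(root).as_posix())


def manifest_at_root(zip_path: str) -> bool:
    """Return True if imsmanifest.xml is at the top level of the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return "imsmanifest.xml" in archive.namelist()
```

Running `manifest_at_root` on the finished archive before uploading is a cheap way to catch the "manifest inside a folder" case.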
We have some great SCORM 2004 samples on our site that may serve as a guide for you as far as sequencing and navigation goes. Check out the golf samples here
If you have any questions, please let us know!
Joe Donnelly
support#scorm.com

Is there a plugin or any way to automatically graph wiki content pages?

Having a DokuWiki with some content (regular to small in size and depth), I would like to automatically generate a GraphViz or Freeplane graph, or any form of easy-to-grasp visualisation of my content.
Why? Because the wiki tends to become less and less effective when it comes to searching and organizing its content. As a user I have no good way to get a sharp idea of the wiki structure, which is why, more and more often, topics are not written and found where they are supposed to be.
How to generate graphical sitemap of large website is what I found so far, but because my wiki is not that big, it would be quicker for me to just make a graph manually. And because the main topics are not updated or extended that often (like 10 extensions a month, tops), it would not be that hard to keep it up to date manually.
However, I would like to avoid manual tasks, at least in the future.
So is there a plugin or any other good way to graph the contents?
1. starting on the landing page, following the internal wiki links
2. using the namespace sitemap
Either one would be nice; option 1 interests me a bit more, because it reflects the paths a user could take when just calling the wiki start page. I am grateful for any help, thanks.
I wrote a simple tool to do just that, the graph can then be analyzed in Gephi. Have a look at this blogpost: http://www.splitbrain.org/blog/2010-08/02-graphing_dokuwiki_help_needed

Sweave/ODFWeave and tracking code chunks

I am getting started with the reproducible research tools in R, and I'm pretty excited about the prospects. Sweave/Knitr/Markdown, all that stuff is great. I use RStudio, and they have done a great job of integrating those tools, and I hear that StatET does a nice job putting all that together as well.
I don't write academic papers in LaTeX, and all the people I work with use Word, so I am very interested in an effective workflow to use ODFWeave to make documents.
My usual process is:
1. Develop the code chunks in my IDE (RStudio, in my case)
2. Go back and insert these into an ODT document and fill in the surrounding text.
3. Run ODFWeave
My problem is that I get confused in tracking code chunks and putting them into the ODF document. Keeping the ODF document in sync as I create the code is annoying, so I'd rather wait and insert the code chunks by name.
So finally, here are my questions:
What are people's suggestions for tracking code chunks or on how to optimize this workflow?
Can anyone recommend tools or tips for keeping track of the code chunks you write?
Being a software geek and a data nerd, I naturally imagine a piece of software doing this for me. Like I'd have a database of code chunks, and when writing the ODF document I'd be able to click on a chunk to insert it into my ODF file.
Has anyone created this sort of thing?
When you check the number of items tagged odfweave on SO, you will notice that it is rarely used compared to Sweave and knit-offs. I do not fully understand why it did not take off, possibly because table generation is such a nuisance (at least that's what I remember from my attempts).
Since many customers insist on Word-Documents, we are using two alternatives currently:
Create html, e.g. with RStudio/knitr/rmd, and read it with Word. This is not really a good workflow: to get a reasonable document you need a lot of manual post-processing, but it works more or less.
You can also go via RDCOM. I don't remember what the state of the art is here, because we have totally given up using it since the licensing conditions were not transparent to us.
Use pandoc. This approach produces documents that do not need manual post-processing in MS Word, but the range of features for creating a nice layout (cross-linked images, figure numbering) is limited; it might be that we are not yet good enough at using pandoc to its full extent.

Documentation Generation - What boxes should I aim to tick?

I'm looking at requiring my team to document their code more thoroughly for some major upcoming projects and to make life a little less painful, I am steering towards XML documentation generators such as Sandcastle, Doxygen or Box Live Documenter.
What are the key considerations I should keep in mind when evaluating the best option and what experiences have led you to a particular decision?
For me the key considerations would be:
Fully automated: Can it be set up in such a way so that pretty much no outside work is required to create or edit the documentation.
Fully styled: Can the documentation be fully styled so that it looks great in a wiki or pdf after it’s generated. I should be able to change colors, font sizes, layouts, etc.
Good Filtering: Can I select only the items I want to be generated. I should be able to filter the namespaces, file types, classes, etc.
Customization: Can I include headers, footers, custom elements, etc.
I found Doxygen could do all of this. Our workflow is as follows:
1. Developer makes a change to the code
2. They update the documentation tags right above the code they just changed
3. We click a generate button
Doxygen will then extract all the XML documentation from the code, filter it to only include the classes and methods we want, and apply the CSS styling we’ve pre-made for it. Our end result is an internal wiki that looks the way we want, and doesn’t require editing.
Extra: We have all our projects in various git repositories. We pull all these down to one root folder and generate the docs from this root folder.
Would be interested to know how others are automating even further?
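To give an idea of what sits behind the generate button, here is a minimal Python sketch that writes a Doxygen configuration covering the points above (one root folder, filtering, custom CSS). It is not our real configuration: the project name, paths and patterns are made-up placeholders, and `build_doxyfile` is a hypothetical helper. The configuration tags themselves are standard Doxygen options.

```python
def build_doxyfile(
    root: str,
    output_dir: str,
    file_patterns: list[str],
    exclude_patterns: list[str],
    stylesheet: str,
) -> str:
    """Return the text of a small Doxyfile for one root folder of checked-out repos."""
    settings = {
        "PROJECT_NAME": '"Internal API docs"',  # placeholder name
        "INPUT": root,                           # the folder all repos are pulled into
        "RECURSIVE": "YES",                      # descend into every repository
        "FILE_PATTERNS": " ".join(file_patterns),
        "EXCLUDE_PATTERNS": " ".join(exclude_patterns),
        "OUTPUT_DIRECTORY": output_dir,
        "GENERATE_HTML": "YES",
        "GENERATE_LATEX": "NO",
        "HTML_EXTRA_STYLESHEET": stylesheet,     # the pre-made CSS
    }
    return "\n".join(f"{key} = {value}" for key, value in settings.items()) + "\n"
```

Writing the returned text to a file named `Doxyfile` and running `doxygen Doxyfile` would then produce the HTML. Paths containing spaces need to be quoted in the file, which this sketch does not do.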
Who is paying for the documentation and why? (is the system stable enough, does it add enough value)
Who is going to read it, and why is she not using a more effective communication channel?
(If there is a good reason, it is mostly distance in time or place.)
Who is going to keep it up to date?
When are you going to destroy it? (Automatically if it hasn't been read or updated in the past three months?)
I mostly prefer better code to make my life less painful, over more documentation, but I like scenario & unit tests and a high level architecture description.
[edit] Documentation costs time and money to write and keep up to date. JavaDoc style documentation has a serious detrimental effect on the amount of code simultaneously visible and might be a good idea for the developers using the code, but not for those writing it.

Resources