Different local and remote organisation R Project and GitHub

Different local and remote organisation R Project and GitHub - r

I want to version control my R scripts so I've created an R project and a GitHub repo. My scripts are scattered through several directories within the same directory where the R project is.
I would like that my GitHub repository harbors only the scripts, independently of the folders they are locally stored in. However when I run the below command:
git add folder/file.R
git commit -m "my_message"
git push -u origin master
A directory named folder is created containing file.R but I'd like to just see file.R without the folder. Do you know how can I do this? Also, would it be good practice? My local folders are organized so each directory contains its own scripts and results, that's the reason the scripts are separated.
Thank you very much

is there a way to add the file.R without specifying the path?
Not using git add, no. The design constraint for git add is that it should store the file's name exactly as it appears, including the forward slashes, so if the file's name is folder/file.R, that's the file's name.
You have some options here though:
You can make a parallel directory where you put the files with the names you want them to have. Run git init in that directory, copy the folder/file.R file to file.R in that directory. Then cd ../gitdir or whatever is appropriate to get there, and git add file.R.
This method is probably the best because it's the simplest.
You can write your own programs using git hash-file -w and git update-index, which are two of Git's plumbing commands. A plumbing command, in Git, is basically a command that exists so that you can build user-facing commands: they're not meant to be run by humans but rather by other programs. So you write a program (in whatever language you like) that uses these plumbing programs to achieve whatever you want.
In particular, you can create or find a Git blob object holding the contents of file.R as read from anywhere you like, then use git update-index to create an index entry holding whatever path you like and referring to the blob object you created (or found) with git hash-object with the -w flag.
Since Git is a suite of tools, not a solution, you can come up with your own method. The tools in Git are made with particular approaches in mind, but they are flexible enough to be repurposed.

Related

Serve RMarkdown outputs without version controlling them

We frequently use RMarkdown based packages to create websites with R (bookdown, blogdown, distill...) and use github-pages to serve the html files via the url username.github.io/repo.
In this approach, the ouput (i.e. html / css) files are also version controlled, and are frequently included in commits by mistake (git commit -a). This is annoying since these files clutter the commit and often lead to fictitious files conflicts.
Ideally, the outputfiles would not be version controlled at all, since the binary files (images) additionally bloat the repo. So I'm looking for a solution where:
Git ignores the output files completely but provides an alternative (but comparable1) method to gh-pages to serve them
Git ignores the output files temporally and committing / pushing them to gh-pages is done in a separate, explicit command
1: The method should be command line based and provide a nice URL to access the website

You could have .html, .css etc. ignored in the main and all other branches but the branch, for example, the gh-page branch, where your github-page is built from.
Git does not support different .ignore files in different branches so you would have to set up a bash script that replaces the ignore file each time you checkout a new branch. See here for how to do that: https://gist.github.com/wizioo/c89847c7894ede628071
Maybe not the elegant solution you were hoping for but it should work.

If you have a python installation on your computer, you can use GitHub Pages Import, a tool designed specifically for this purpose.
You need a python installation since it has to be installed with pip, but once it's installed it integrates beautifully with into an R / RMarkdown workflow.
Once it's installed (pip install ghp-import), you can run ghp-import docs (assuming docs is where your RMarkdown outputs are stored).
There are a bunch of practical options that you can use, including -p to additionally push the changes to your remote after the commit.

You need to tell Git to ignore the folder the book gets built into.
So, for example, by default bookdown puts all the built files in a folder called "_book"
Just add the following line to the .gitignore file in your project.
_book
Then you can work on your book and build it and push changes without worrying about the site being updated.
To update the site, you want to create a gh-pages branch that is only used for the hosted content. Do that with these commands from in your book folder:
git checkout --orphan gh-pages
git rm -rf .
# create a hidden file .nojekyll
touch .nojekyll
git add .nojekyll
git commit -m"Initial commit"
git push origin gh-pages
Make sure (once you put your book content in that branch) that GitHub is set to use that branch for hosting, rather than the folder you were using before.
Then, you can switch back to your main branch with the following command:
git checkout master
Next, you will clone your gh-pages branch into your main project:
git clone -b gh-pages https://github.com/yourprojecturl.git book-output
Then, when you have a version of the built book (in the _book folder) ready to use as your live site, use the following commands to copy the content into the book-output folder and push that to the gh-pages branch where the live site is:
cd book-output
git rm -rf *
cp -r ../_book/* ./
git add --all *
git commit -m"Update the book"
git push -q origin gh-pages
You can continue to use this last set of commands whenever you have a version in _book that you're ready to push live.

Can a pre-commit Git hook zip a directory and add it to the repository?

I'm doing development on a Wordpress plugin. My development directory contains a lot of development-specific stuff (e.g. Grunt files, Sass files, the git repository itself, etc.).
Obviously, I don't want to distribute this folder containing all of those development files; people don't want a few MB of Grunt files when they download my Wordpress plugin.
Up until now, though, my "release" process has been cumbersome:
Commit the Git changes
Zip the entire folder
Open the zip file and delete the .git folder, grunt files, and all the other development-specific files
Release the new zip
I don't know the best way to accomplish this, but I'm very vaguely familiar with Git hooks, and I had this thought: could I set up a Git hook that would zip ONLY the needed production files into a ZIP file and store it with the repo? That way, every time I commit it would automatically create a new release ZIP.
Is that possible? If so, could someone point me in the right direction?
Oh also, I'm on Windows (・_・;). So I'm hoping that there's a way to do it on Windows.

I can't speak for Windows, but:
It's technically possible to do that sort of thing in a pre-commit hook.
Don't.
A pre-commit hook that modifies "what you will commit" is annoying (if nothing else, it violates the "rule of least astonishment", where your version control system simply stores the versions you tell it to store). Apart from that, storing large pre-compressed binaries interferes with git's attempt to save space in pack files, and will cause rapid repository bloat, poor performance, running out of memory, and so on. A ZIP-archive is a pre-compressed binary and hence will behave badly.
In general, a more reasonable "hook-y" way to handle releases is to set up a "release server" to which you push new releases, and have the push trigger the archive-generation. (There are ways to do this without a separate server / repository, and you can do it in a more pull-style fashion, but the push-style is easy to illustrate.)
[Edit: I had originally considered git archive but did not realize you could get it to exclude files conveniently, so wrote up the below instead. So, jthill's answer is better and should be one's first resort. I'll leave this in place as an alternative for some case where for some reason, git archive might not do.]
For instance, here's a server-side post-receive hook code fragment that checks whether a branch whose name matches release* has been pushed-to, and if so, invokes a shell function with the name of the branch (once for each such branch):
#! /bin/sh
NULL_SHA1=0000000000000000000000000000000000000000
scan()
{
local oldsha newsha fullref shortref
local optype
while read oldsha newsha fullref; do
case $oldsha,$newsha in
$NULL_SHA1,*) optype=create;;
*,$NULL_SHA1) optype=delete;;
*) optype=update;;
esac
case $fullref in
refs/heads/*)
reftype=branch
shortref=${fullref#refs/heads/}
;;
*)
reftype=other
shortref=fullref
;;
esac
case $optype,$reftype,$shortref in
create,branch,release*|update,branch,release*)
do_release $shortref;;
esac
done
}
scan
(much of the above is boilerplate, which I have stripped down to essentials). You would have to write the do_release function, which might resemble (totally untested):
do_release()
{
local tmpdir=/tmp/build.$$ # or use mktemp -d
# $tmpdir/index is git's index; $tmpdir/t is the work tree
trap "rm -rf $tmpdir; exit 1" 1 2 3 15
rm -rf $tmpdir
mkdir $tmpdir/t
GIT_INDEX_FILE=$tmpdir/index GIT_WORK_TREE=$tmpdir/t git checkout $1
# now clean out grunt files and make zip archive
(cd $workdir/t; rm -rf grunt; zip ../t.zip .)
# put completed zip archive in export location, name it
# based on the branch name
mv $workdir/t.zip /place/where/zip/files/live/$1.zip
# clean up temp dir now, and no longer need to clean up
# on signal related abort
rm -rf $tmpdir
trap - 1 2 3 15
}

There's actually a command for this, git archive.
git archive master -o wizzo-v1.13.0.zip
See the EXAMPLES section, you can select paths, add prefixes to them, define custom postprocessing by output extension, and some more minor tweaks.
Also see the ATTRIBUTES section: you can give files -- arbitrary patterns, really -- an export-ignore attribute to exclude them from archives.
It's got a bunch more handy-dandies, you can get archives from remote repos, expand arbitrary git log --pretty=format: placeholders, the git manpages are definitely worth whatever time you can invest in them.

Git Archive but first put all the files inside a folder then start archiving

As title suggest, I want to know if there is a single git command that put all my project in one folder first (not including .gitignored files) and then proceed archiving the folder— leaving ignored files not included when archiving which is nice.
This can be beneficial for me as I am working on WordPress plugin with multiple release. Some references.

I want all the files (minus the .gitignored files) move to a folder first then proceed archiving that folder
It is possible in one command provided you define an alias but this isn't git-related:
you can:
clone your repo elsewhere (that way you don't get any ignored or private file)
move your files as you see fit in that local clone
archive (tar cpvf yourArchive.tar yourFolder)
But git archive alone won't help you move those files, which is why I would recommend a script with custom bash commands (not git commands).

You don't really need to copy / clone the repo anywhere.
Make sure you committed all your changes.
Process the files any way you want.
Run tar -cvjf dist/archive-name.tbz2 --transform='s,^,archive-name/,' $(git ls-tree --full-tree -r --name-only --full-name HEAD)
run git reset --hard to restore without any of the changes you made in step #2.
Hints:
The --transform='s,^,archive-name/,' is so your files will be extracted toarchive-name/....`, you can remove it if you don't need that.

How do I create a directory with a file in it, in one step?

In the terminal, is there a way to create a directory with a file in it in one step?
Currently I do this in 2 steps:
1. mkdir foo
2. touch foo/bar.txt
Apparently, touch foo/bar.txt doesn't work.

With only standard unix tools, the most direct way to create a directory and a file in this directory is
mkdir foo && touch foo/bar.txt
Unix is built around the philosophy of simple, single-purpose tools with the shell as a glue to combine them. So to create a directory and a file, you instruct a shell to run the directory creation utility then the file creation utility.
I won't swear that there isn't some bizarre way of using a standard tool that lets you do it with a single command. (In fact, there is: unpack an archive — except that you'll need to provide that archive as a file, with predefined owner, date and other metadata, or else use another command to build an archive.) But whatever it is would be convoluted.

Directory for files generated by Make during installation from configure, make and make install

Just assume we are installing some libraries from its source distributed by the way GNU promoted. When using "./configure --prefix" to specify where to install.
(1) does Make generate the binaries under the current directory? Does Make install then copies them from the current directory (which is from where Make is run) to $prefix? If the answers are yes, I have two questions, each for different cases.
(2) when the current directory and $prefix are not the same, can I remove all the files generated by Make under current directories to save some space?
(3) when the current directory and $prefix are the same, will make install do nothing or copy the files to themselves? Can I just skip the make install step?

The answer to your first question is probably yes.
As for the rest, you may find a make clean which will tidy up the files created by the initial make.
I think the makefile will be able to handle the situation where current directory and $prefix are not the same, and do the right thing.
The current directory would not usually be the destination of files created by makefiles.
(of course it depends on how the makefile is written, so I can't give definite answers, but I've generally been impressed with the makefiles I've used)

You are absolutely right: make just creates files in current directory and make install copies it to the destination directories, based on $prefix and other variables maintained by configure script.
You can wipe out the whole directory you've ran the build at. It will not be used, because, well, that's what "install" means: you build in one directory and the the files are placed in the proper places of your system.
Usually install destination and the directory you build in differ. The hierarchy of the files being installed usually do not relate to directory hierarchy of the build system. Just install to the other dir: it's cheap to create just yet another directory.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex