Install packages and libraries in a local directory on a server - unix

I have access to a university server running Red Hat Enterprise Linux release 6.2.
My job is to test various scientific analysis programs.
I have no problem untarring and running them in my local directory, but most of them have lots of dependencies (Perl libraries, RE2, the GNU Scientific Library, glibc.i686), and whenever I try to install those dependencies I run into permission problems.
All of those packages require root to install.
Is there a way to install these various packages only in my local directory, without asking root to install them system-wide?

Yes, but the procedure varies per package. Many modern Unix packages come with a configure script that you run before building; it takes a --prefix option whose argument specifies where the package should be installed. A directory such as $HOME/pkg is a good choice. Other configuration/build/installation scripts have similar options.
When building a package with dependencies, make sure you have $HOME/pkg/bin in your PATH and $HOME/pkg/lib in your LD_LIBRARY_PATH, and pass -I$HOME/pkg/include and -L$HOME/pkg/lib to the compiler and linker, respectively. E.g., put the following in your shell startup file (.bashrc for Bash):
export PATH=$HOME/pkg/bin:$PATH
export CFLAGS=-I$HOME/pkg/include
export LDFLAGS=-L$HOME/pkg/lib
export LD_LIBRARY_PATH=$HOME/pkg/lib
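With that in place, building a typical configure-based package into $HOME/pkg looks like this (package name hypothetical):
# hypothetical package; the pattern is the same for most autoconf-style software
tar xzf somelib-1.0.tar.gz
cd somelib-1.0
./configure --prefix=$HOME/pkg
make
make install    # installs under $HOME/pkg; no root needed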
(You shouldn't have to install glibc, ever.)

Related

Compiling R from source: RStudio doesn't find the libraries if started directly

I have compiled R 4.1.0 from source against the Intel MKL.
I have put:
source /opt/intel/oneapi/mkl/latest/env/vars.sh intel64
in ~/.bashrc.
If I open a .R file with RStudio, no problem.
But if I open RStudio directly, it is unable to start R correctly, giving me the error:
/usr/lib/rstudio/bin/rsession: error while loading shared libraries: libmkl_gf_lp64.so.1: cannot open shared object file: No such file or directory
Why is that? Doesn't RStudio run ~/.bashrc when started directly?
I am running Fedora 34 Workstation.
When shared libraries are stored in "non-standard locations", we have to tell the dynamic linker about it. That is sometimes done in the calling script (often the case with bundled software, e.g. RStudio ships with a fair number of locally built shared libraries), but a more general solution is to tell ldconfig via its configuration.
Older systems used one line per directory in /etc/ld.so.conf. Newer systems generalize this (like many other configurations) with a directory containing small files with entries. So you can create a file named, say, /etc/ld.so.conf.d/local-mkl.conf and place the directory path in there. If you then run sudo ldconfig, all applications will know about it -- including R, and RStudio calling R.
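A minimal sketch, assuming MKL's libraries live under /opt/intel/oneapi/mkl/latest/lib/intel64 (check where libmkl_gf_lp64.so.1 actually is on your system):
# register the MKL library directory with the dynamic linker (path is an assumption)
echo "/opt/intel/oneapi/mkl/latest/lib/intel64" | sudo tee /etc/ld.so.conf.d/local-mkl.conf
sudo ldconfig    # rebuild the linker cache so rsession can find libmkl_gf_lp64.so.1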

Correct workflow? - Distributable environment including jupyter notebook

I am developing applications that use Jupyter notebook and ipywidgets for a GUI frontend to a backend codebase. I have run into issues distributing/installing packages in the normal way, such as:
unexpected differences between required library versions (e.g. pandas)
requirements.txt forcing an update to a more recent version of a library when the user maintains and uses their own codebase on an older version of that library.
I think pipenv might be able to solve this problem, but I want to check I have a correct usage before going too far down this path.
Requirements:
the user needs to be able to restart Jupyter Notebook in the same env multiple times, running the program from scratch, until a new version is available.
Users are all on Mac.
Any installation should not alter site-packages etc., and should have no effect on the Python setup a user currently has.
Workflow concept
Development:
Develop within a pipenv environment (I use Pycharm so this is relatively straightforward).
Include jupyter in the Pipfile's [packages] section (see the sketch after this list), even though jupyter is not imported anywhere in my source.
Use pipenv install new_package as and when new packages are required by my codebase, and maintain the Pipfile (respecting --dev for testing packages etc).
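For reference, a minimal Pipfile along these lines (package names and version pins are illustrative):
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
jupyter = "*"
ipywidgets = "*"
pandas = "==1.3.*"

[dev-packages]
pytest = "*"

[requires]
python_version = "3.9"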
User installation
Produce a zip file containing source code, setup.py etc plus Pipfile and Pipfile.lock.
User extracts the zip file to a known location on their machine.
In terminal, navigate to the unzipped folder location, and run pipenv install.
Use:
In terminal, navigate to the folder location and run pipenv shell to activate the env.
Run jupyter notebook from inside that shell (or skip the shell and use pipenv run jupyter notebook) to load the env and the notebook.
When finished, close out of notebook and run exit to close the env.
Uninstall env and upgrade to newer version
In terminal, navigate to the folder location, and run pipenv --rm.
Download new source zip and follow steps above.
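Taken together, the user-facing commands would look something like this (folder name hypothetical):
cd ~/apps/mytool-1.0          # wherever the zip was extracted
pipenv install                # create the env from Pipfile.lock
pipenv run jupyter notebook   # start the notebook inside the env
pipenv --rm                   # later: remove the env before installing a new version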
If I've understood correctly, this should ensure anyone can use the distribution in a tightly controlled environment, without making any alterations to their existing Python install? Have I overcomplicated things?

How to install R on a linux cluster?

I use a cluster (OS is Linux) which does not have R. I would like to install R in my personal folders so that I can just do
Rscript example.R arg1 arg2
How should I install R on this cluster knowing that I don't have admin rights?
How can I then manage the packages?
I'm not sure this is on-topic, but: all you really have to do is
download the R source tarball from CRAN; unpack it somewhere in your file space
create an r-build directory at the same level of the hierarchy (not technically necessary, but it's better practice to keep the source and build directories separate)
create an installation directory (say ~/r_install) somewhere sensible within your file space
cd to the source directory and run ./tools/rsync-recommended (to fetch the recommended packages)
cd to the build directory
../[srcdir]/configure --prefix=$HOME/r_install (use $HOME rather than ~, which may not be expanded here)
make (to build the binaries)
make install (to move everything where it belongs; not technically necessary, as you can run R from the build directory)
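Spelled out, with an illustrative version number, the sequence looks like:
# version number illustrative; pick the current one from CRAN
wget https://cran.r-project.org/src/base/R-4/R-4.1.0.tar.gz
tar xzf R-4.1.0.tar.gz
(cd R-4.1.0 && ./tools/rsync-recommended)   # fetch the recommended packages
mkdir r-build && cd r-build
../R-4.1.0/configure --prefix=$HOME/r_install
make && make install
export PATH=$HOME/r_install/bin:$PATH       # so Rscript is on your PATH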
Where this may get hairy is with all of the system requirements for R (LaTeX, Java, bzip2, etc.). It is theoretically possible to download all this stuff and install it in your own file space, but it gets sufficiently tedious that it will be easier to beg your sysadmin to install at least the dependencies for you ...
As @Hack-R points out, the basics of this answer are already present on Unix & Linux Stack Exchange, although my answer is a little more detailed ...

Is there any special functionality in R package "exec" or "tools" directories?

I'm trying to develop an R package that will include some previously compiled executable programs and their supporting libraries. (I know this is bad form, but it is for internal use).
My question: do the special exec and tools directories have any special functionality within R?
The documentation seems to be sparse. Here is what I've figured out so far:
From here
files contained in exec are marked as executable on install
subdirectories in exec are ignored
exec is rarely used (my survey of CRAN says tools is just as rarely used)
tools is around for configuration purposes?
Do these directories offer anything that I couldn't get from creating an inst/programs directory?
[R-exts] has this to say:
Subdirectory exec could contain additional executable scripts the package needs, typically scripts for interpreters such as the shell, Perl, or Tcl. This mechanism is currently used only by a very few packages. NB: only files (and not directories) under exec are installed (and those with names starting with a dot are ignored), and they are all marked as executable (mode 755, moderated by ‘umask’) on POSIX platforms. Note too that this is not suitable for executable programs since some platforms (including Windows) support multiple architectures using the same installed package directory.
It's quite possible the last note won't apply to you if it's only for internal use.
Nevertheless, I'd suggest avoiding abusing any existing convention that might not apply precisely to your situation, and instead use inst/tools or inst/bin.
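As a sketch of that suggestion (package and file names hypothetical): files placed under inst/ are copied into the installed package, so an executable shipped as inst/bin/mytool can be located at runtime with system.file():
# hypothetical layout: mypackage/inst/bin/mytool
R CMD INSTALL mypackage_0.1.tar.gz
Rscript -e 'system.file("bin", "mytool", package = "mypackage")'
# prints the installed path, e.g. <libpath>/mypackage/bin/mytool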
As far as I can tell, here is the functionality offered by the exec and tools directories.
exec
From R-exts by way of hadley:
Subdirectory exec could contain additional executable scripts the package needs, typically scripts for interpreters such as the shell, Perl, or Tcl. This mechanism is currently used only by a very few packages. NB: only files (and not directories) under exec are installed (and those with names starting with a dot are ignored), and they are all marked as executable (mode 755, moderated by ‘umask’) on POSIX platforms. Note too that this is not suitable for executable programs since some platforms (including Windows) support multiple architectures using the same installed package directory.
exec features I have figured out
On POSIX platforms (*nix, os x), the files within exec will be marked as executable.
No subdirectories of exec are included in the package, only files in exec root
(note: it could contain binary executables, but there is no architecture/platform handling)
tools
From R-exts:
Subdirectory tools is the preferred place for auxiliary files needed during configuration, and also for sources needed to re-create scripts (e.g. M4 files for autoconf).
tools features I have figured out
tools holds files used at package build time.
All files contained are copied recursively into the source *.tar.gz package (including subdirectories).
tools is not included in the final, installed form of the package; all its contents are dropped.

Compiling haskell module Network on win32/cygwin

I am trying to compile Network.HTTP (http://hackage.haskell.org/package/network) on win32/cygwin. However, it fails with the following message:
Setup.hs: Missing dependency on a foreign library:
* Missing (or bad) header file: HsNet.h
This problem can usually be solved by installing the system package that
provides this library (you may need the "-dev" version). If the library is
already installed but in a non-standard location then you can use the flags
--extra-include-dirs= and --extra-lib-dirs= to specify where it is.
If the header file does exist, it may contain errors that are caught by the C
compiler at the preprocessing stage. In this case you can re-run configure
with the verbosity flag -v3 to see the error messages.
Unfortunately it does not give more clues. HsNet.h includes sys/uio.h, which actually should not be included and should be configured out correctly.
Don't use Cygwin; instead follow Johan Tibell's way:
Installing MSYS
Install the latest Haskell Platform. Use the default settings.
Download version 1.0.11 of MSYS. You'll need the following files:
MSYS-1.0.11.exe
msysDTK-1.0.1.exe
msysCORE-1.0.11-bin.tar.gz
The files are all hosted on haskell.org as they're quite hard to find in the official MinGW/MSYS repo.
Run MSYS-1.0.11.exe followed by msysDTK-1.0.1.exe. The former asks you if you want to run a normalization step. You can skip that.
Unpack msysCORE-1.0.11-bin.tar.gz into C:\msys\1.0. Note that you can't do that using an MSYS shell, because you can't overwrite the files in use, so make a copy of C:\msys\1.0, unpack it there, and then rename the copy back to C:\msys\1.0.
Add C:\Program Files\Haskell Platform\VERSION\mingw\bin to your PATH. This is necessary if you ever want to build packages that use a configure script, like network, as configure scripts need access to a C compiler.
These steps are what Tibell uses to compile the network package for Windows, and I have used this myself successfully several times on most of the Haskell Platform releases.
It is possible to build network on win32/cygwin, and the above steps (by Jonke), though useful, may not be necessary.
While doing the configuration step, specify
runghc Setup.hs configure --configure-option="--build=mingw32"
so that the library is configured for mingw32; otherwise you will get link errors or "undefined reference" errors when you try to link against or use the network library.
This, combined with @Yogesh Sajanikar's answer, made it work for me (on win64/cygwin):
Make sure the gcc on your path is NOT the MinGW/Cygwin one, but the one bundled with GHC, e.g.
C:\ghc\ghc-6.12.1\mingw\bin\gcc.exe
(Run
export PATH="/cygdrive/.../ghc-7.8.2/mingw/bin:$PATH"
before running cabal install network in the Cygwin shell)
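Putting both answers together, a full session in the Cygwin shell might look like this (GHC path illustrative):
export PATH="/cygdrive/c/ghc/ghc-7.8.2/mingw/bin:$PATH"   # GHC's bundled gcc first
cd network-*                                              # unpacked source of the network package
runghc Setup.hs configure --configure-option="--build=mingw32"
runghc Setup.hs build
runghc Setup.hs install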
