R MapReduce library 'rmr2' shows a warning message when loaded

Why is the R MapReduce library 'rmr2' generating a warning message?
I have installed the 'rmr2' library to run MapReduce programs in R. But when
library(rmr2)
is specified in R, it generates the following warning message:
Please review your hadoop settings. See help(hadoop.settings)
Warning message:
S3 methods ‘gorder.default’, ‘gorder.factor’, ‘gorder.data.frame’, ‘gorder.matrix’, ‘gorder.raw’
were declared in NAMESPACE but not found
What could be the reason?

The main reason is that you didn't set the paths. Before running library(rmr2), you must set the following four paths to prevent this type of warning.
Sys.setenv(HADOOP_HOME="/home/hadoop/hadoop-1.1.2")  # Hadoop installation path
Sys.setenv(HADOOP_CMD="/home/hadoop/hadoop-1.1.2/bin/hadoop")  # hadoop executable
Sys.setenv(HADOOP_STREAMING="/home/hadoop/work/hadoop-1.1.2/contrib/streaming/hadoop-streaming-1.1.2.jar")  # streaming jar
Sys.setenv(JAVA_HOME="/usr/lib/jvm/java-1.6.0-openjdk-amd64")  # Java installation path
Then load library(rmr2) and library(rhdfs) to continue from there. All the best.
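A minimal sketch of the full sequence, assuming the example paths above match your installation (adjust them to your own Hadoop and Java locations):
# Assumed example paths; set the environment variables first
Sys.setenv(HADOOP_CMD = "/home/hadoop/hadoop-1.1.2/bin/hadoop")
Sys.setenv(HADOOP_STREAMING = "/home/hadoop/hadoop-1.1.2/contrib/streaming/hadoop-streaming-1.1.2.jar")
# Only then load the packages and initialise HDFS access
library(rhdfs)
library(rmr2)
hdfs.init()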

I think you didn't write the paths as they should be:
HADOOP_CMD='/usr/local/hadoop-2.7.2/bin/hadoop'
HADOOP_STREAMING='/usr/local/hadoop-2.7.2/share/hadoop/tools/lib/hadoop-streaming-2.7.2.jar'
HADOOP_HOME='/usr/local/hadoop-2.7.2'
The quotes ('') are very important; check whether you forgot them.
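In R these would be set with Sys.setenv(), quoted as strings; a minimal sketch using the install prefix and version numbers from this answer (substitute your own):
# The quotes around every path are required, since the values are character strings
Sys.setenv(HADOOP_CMD = "/usr/local/hadoop-2.7.2/bin/hadoop")
Sys.setenv(HADOOP_STREAMING = "/usr/local/hadoop-2.7.2/share/hadoop/tools/lib/hadoop-streaming-2.7.2.jar")
Sys.setenv(HADOOP_HOME = "/usr/local/hadoop-2.7.2")
library(rmr2)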

Related

Error with install.packages using renv|knit|rmarkdown

I'm updating the renv folder of a project in order to adjust the libraries, but it seems I'm having a permission problem. After running renv::init() and trying to install the remaining libraries manually with install.packages(), I always get the message
Error: failed to retrieve 'https://cran.rstudio.com/bin/windows/contrib/4.2/ipeadatar_0.1.6.zip' [error code 23]
1: curl: (23) Failure writing output to destination
2: curl: (23) Failure writing output to destination
Using .libPaths() I can see that the renv library was created in the "AppData" hidden folder
[1] "C:/Users/André Ferreira/AppData/Local/R/cache/R/renv/library/MacroBRA_Wrld-09789847/R-4.2/x86_64-w64-mingw32"
So, checking my permissions, I couldn't see anything wrong. Any thoughts about this problem? The thing is that when I open my .Rmd file and try to knit, I receive the same message "1: curl: (23) Failure writing output to destination", now from the package retrieval that rmarkdown triggers, so it may be a configuration/permission problem.
Adding "C:\rtools42\usr\bin" and "C:\Program Files\R\R-4.2.1\bin" in the environment variable didn't help.
As far as I could see, when opening an empty file from RStudio, I could use install.packages() without a problem.
Although this doesn't solve the problem directly, you can also instruct renv to use a different library path with something like:
# use a project-local library path
RENV_PATHS_LIBRARY = renv/library
in your project's .Renviron file. Depending on your environment, you might also consider placing the library path in an alternate location.
See https://rstudio.github.io/renv/articles/packages.html#r-cmd-build-and-the-project-library for more details.
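For example, a minimal sketch of putting that setting in place from R (the append-to-.Renviron step is illustrative, not something renv requires):
# Append the override to the project's .Renviron (created if missing), then restart R
cat("RENV_PATHS_LIBRARY = renv/library\n", file = ".Renviron", append = TRUE)
# After restarting, renv should report the project-local library path
renv::paths$library()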

How to solve "bad restore file magic number" when trying to load data?

I tried to load data to my R working directory and receive this error:
Error: bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file ‘classize.RData’ has magic number 'RDX3'
Use of save versions prior to 2 is deprecated
I googled it and tried many options, unsuccessfully.
My RStudio version is 1.2.5033 (the error was happening before updating as well).
I created a new project and put the data file in the new directory.
The data file is "classize.RData"
I have another alternative, "classize.RDS", with the suggestion to use readRDS(file = "classize.RDS"). When using this command, I receive this error:
cannot read workspace version 3 written by R 3.6.1; need R 3.5.0 or newer
This is in the context of a statistics course at university and my teaching assistant is unable to help me out; without resolving this issue, I cannot move forward with the required exercises. So please, could you help me resolve this problem?
PS: all the students have access to the same data and it's only not working for me, so the file should not be corrupted.
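For reference, the two file types mentioned above are loaded with different functions; a minimal sketch, assuming both files sit in the working directory:
# .RData workspaces are restored with load(); the saved objects appear in the environment
load("classize.RData")
# .RDS files hold a single object and are read with readRDS()
classize <- readRDS("classize.RDS")
# Both error messages point at the R version: format-3 files (magic number RDX3)
# need R >= 3.5.0, so check which version is actually installed
R.version.string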

I am getting a pathing error that I do not understand regarding the "diskImageR" package

I am getting an error when trying to run the diskImageR package, specifically the IJMacro function, regarding an inability to locate ImageJ. That is what I think the error is saying, although I do not know for sure.
I already tried changing the path and following the PDF that accompanies the package, but I still get the same error.
IJMacro("newProject",imageJLoc ="C:\\Users\\user\\Desktop\\ImageJ")
[1] "Searching for application name or filepath: ImageJ"
Error in ij$runScript(paste(script, IJarguments)) :
The imageJ binaries have not been located. Re-initialise the imageJInterface object with the correct location for the imageJ binaries
In addition: Warning message:
In setFilePath(filePath) :
The ImageJ application could not be found in the common install location on your system
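One quick, purely illustrative check is to confirm that the folder passed as imageJLoc actually contains the ImageJ executable as R sees it (the executable name below is an assumption and depends on your ImageJ build):
ij_dir <- "C:\\Users\\user\\Desktop\\ImageJ"
# List what R finds in that folder and test for the expected binary
list.files(ij_dir)
file.exists(file.path(ij_dir, "ImageJ.exe"))  # assumed file name; adjust to your install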

R installed.packages() randomly stopped working on windows 7

The installed.packages() command in R lists your installed packages. Mine was working for almost a year, and then this command randomly started throwing an error. As this is a built-in command, I am not even sure how to "reinstall" it or otherwise address this. Any ideas how to fix the error and get the command working again?
> installed.packages()
Error in gzfile(file, mode) : cannot open the connection
In addition: Warning message:
In gzfile(file, mode) :
cannot open compressed file 'C:\Users\Mitch\AppData\Local\Temp\Rtmp6Dawpa/libloc_190_4464fd2b.rds', probable reason 'No such file or directory'
One suggestion on here involved these two in combination:
.libPaths()
installed.packages(lib.loc = 'my path')
This produced yet another error, shown below. It still looks like an issue with the installed-packages file, but how to address it is the question:
> installed.packages(lib.loc = 'C:/ProgramFilesCoders/R/R-3.3.2/library')
Error in gzfile(file, mode) : cannot open the connection
In addition: Warning message:
In gzfile(file, mode) :
cannot open compressed file 'C:\Users\Mitch\AppData\Local\Temp\Rtmp6Dawpa/libloc_190_4464fd2b.rds', probable reason 'No such file or directory'
>
That is odd.
What version of R are you running, standard R or Microsoft R? And did you recently update?
If you did recently update, perhaps your packages did not get copied over, hence the 'No such file or directory' statement.
If you haven't updated, I would install a newer version and see if it fixes the issue.
If you're uncertain, you can always use the updateR function to check whether you have the latest version and choose to install it or not.
library(installr)
updateR()
Good luck,
I think the issue lies in where the function is looking for the package information. installed.packages() takes a lib.loc argument.
From the official documentation:
lib.loc: character vector describing the location of R library trees to search through
It looks like the function is, for some reason, looking in AppData\Local\Temp, which is the download location and not the installed location.
Without looking at your R_HOME and .libPaths() it is difficult to nail down where the problem is; however, running .libPaths() should give you one or more paths, as shown in the example below. None of these should be temp locations.
>.libPaths()
[1] "C:/Users/UserName/Documents/R/win-library/3.4"
[2] "C:/Program Files/R/R-3.4.0/library"
If not, you can set the path with .libPaths("your path"), or pass the library path via installed.packages(lib.loc = 'your path') and try again.
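A short sketch of that check, using only base R:
# Inspect the library trees R knows about; none of them should be a Temp folder
.libPaths()
# Query a specific tree explicitly instead of relying on the default lookup
pkgs <- installed.packages(lib.loc = .libPaths()[1])
head(pkgs[, c("Package", "Version")])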
Sometimes the simplest, most obvious solution is what works:
I closed my RStudio environment saving it to .RData
I re-opened RStudio and tried the command again
it worked
Some good ideas were posted here before I thought to try the above. Here are the suggestions others included, in case the steps above do not work for anyone who runs into this problem in the future:
Use .libPaths() to find the proper path where the library is installed, then re-run the command with that path included, like so: installed.packages(lib.loc = 'your path')
Try debugging it with debug(installed.packages); the expectation is that something will likely turn up wrong in .readPkgDesc(lib, fields) while stepping through (see the sketch after this list). This was not tried yet, so you may encounter things not written up here when you do try it.
Try updating R, in case it is out of date, with these commands: library(installr) and updateR().
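A sketch of the debugging suggestion from the list above, using only base R:
# Step through installed.packages() to see which library path and temporary .rds file it touches
debug(installed.packages)
installed.packages()
# Turn the debugger off again when done
undebug(installed.packages)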

R is not connecting to HDFS

Why is R not connecting to Hadoop?
I am using R to connect to HDFS with the 'rhdfs' package. The 'rJava' package is installed and the 'rhdfs' package is loaded.
The HADOOP_CMD environment variable is set in R using:
Sys.setenv(HADOOP_CMD='/usr/local/hadoop/bin')
But when hdfs.init() function is given, the following error message is generated:
sh: 1: /usr/local/hadoop/bin: Permission denied
Error in .jnew("org/apache/hadoop/conf/Configuration") :
java.lang.ClassNotFoundException
In addition: Warning message:
running command '/usr/local/hadoop/bin classpath' had status 126
Also, the 'rmr2' library was loaded, and the following code was typed:
ints = to.dfs(1:100)
which generated the message given below:
sh: 1: /usr/local/hadoop/bin: Permission denied
The R-Hadoop packages are accessible only to the 'root' user and not 'hduser' (Hadoop user), since they were installed when R was run by the 'root' user.
Simply put, there are only two reasons for this type of problem:
1) a wrong path
2) missing privileges/permissions on that jar
Beyond that, also set the other system paths, such as those given below.
Sys.setenv(HADOOP_HOME="/home/hadoop/path")
Sys.setenv(HADOOP_CMD="/home/hadoop/path/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/home/hadoop/path/streaming-jar-file.jar")
Sys.setenv(JAVA_HOME="/home/hadoop/java/path")
Then load library(rmr2) and library(rhdfs); that error should not occur.
But your problem is a permission problem. So, as root, grant the needed privileges (755) to your user and then run the jar file; that error should no longer appear.
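As a quick sanity check before calling hdfs.init(), you can verify from R that HADOOP_CMD points to an executable file rather than a directory (the path below is the one from the question and may differ on your system):
cmd <- "/usr/local/hadoop/bin/hadoop"   # the executable itself, not the bin directory
file.exists(cmd)                        # TRUE if the file is there
file.access(cmd, mode = 1) == 0         # TRUE if the current user may execute it
Sys.setenv(HADOOP_CMD = cmd)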
Try it like this:
Sys.setenv(HADOOP_CMD='/usr/local/hadoop/bin/hadoop')
Sys.setenv(JAVA_HOME='/usr/lib/jvm/java-6-openjdk-amd64')
library(rhdfs)
hdfs.init()
Please give the correct HADOOP_CMD path, extended with /bin/hadoop; it must point to the hadoop executable itself, not just the bin directory.
