How to increase the heap size in WEKA? - out-of-memory

I am learning WEKA and want to increase the heap size.
I searched existing Q&A posts. Most answers say the heap size can be changed in RunWeka.ini, but I couldn't find RunWeka.ini or the maxheap setting.
My WEKA version is 3.8.4.

Welcome to SO. You have two options.
Option 1:
Open the notepad or text editor.
Write the following statement inside.
java -Xmx8000m -cp "weka.jar" weka.gui.GUIChooser
Here "-Xmx" sets the maximum heap size that Weka can use. A related
parameter is "-Xms", which sets the initial heap size that the JVM
allocates at startup. "8000m" means 8000 MB. You can also write
"4g" to specify the size in gigabytes.
Save the file from Notepad with a .bat extension, for
example "run.bat".
Copy the file and paste it into the directory where Weka is located.
Double-clicking the file you pasted should then start Weka with the larger heap.
Note: If you are using 32-bit Java, you cannot set a heap this large;
a 32-bit JVM is limited to less than 4 GB, and often much less in
practice. No problem on 64-bit.
Option 2:
Enter the folder where Weka is located.
Open the RunWeka.ini file. Find the line shown in the code below
and enter the value you want as shown.
javaOpts=%JAVA_OPTS% -Xmx8048m
Save and run Weka.
I saw this option on another site but have not tested it myself. I think it works.
I am using the first option and it works. Good luck.
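To make the "-Xmx" values from Option 1 concrete, here is a small Python sketch that converts a value such as "8000m" or "4g" into bytes and builds the same command line that run.bat contains. The function names heap_bytes and weka_command are made up for this illustration, and it assumes weka.jar sits in the current directory, as in Option 1.

```python
import re

# Multipliers for the size suffixes accepted by -Xmx / -Xms (powers of 1024).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def heap_bytes(value):
    """Convert a JVM heap size such as '8000m' or '4g' to a byte count."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError(f"not a heap size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.lower()]


def weka_command(max_heap, jar="weka.jar"):
    """Build the command line from Option 1 as an argument list."""
    heap_bytes(max_heap)  # validate the value before using it
    return ["java", f"-Xmx{max_heap}", "-cp", jar, "weka.gui.GUIChooser"]


if __name__ == "__main__":
    print(heap_bytes("8000m") / 1024 ** 3, "GiB")
    print(" ".join(weka_command("4g")))
```

The argument list returned by weka_command could be passed to subprocess.run if you would rather start Weka from Python than from a batch file.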


Loading a .RDS file in Python 3

I have a dataset in RDS format that I managed in RStudio, but I would like to open this in Python to do the analysis. Would it be possible to open this type of format into Python?
I already tried the following code:
pip install pyreadr
import pyreadr
result = pyreadr.read_r('/path/to/file.Rds')
However, I get the following error:
MemoryError: Unable to allocate 18.9 MiB for an array with shape (2483385,) and data type float64.
What can I do?
Pyreadr is a wrapper around the C library librdata, and librdata has a hardcoded limit on the size an R vector can have. The limit used to be very low in old versions, but it was increased. Your vector would fail in older versions but should work in a recent one, so please check that you are using the most recent version.
If that doesn't help, it may be a bug. If you can share the file, please submit an issue on GitHub.
Here are links to the old issues for librdata and pyreadr on GitHub (in theory solved by now):
https://github.com/WizardMac/librdata/issues/19
https://github.com/ofajardo/pyreadr/issues/3
EDIT:
The limit was permanently removed in pyreadr 0.3.0, so this should no longer be an issue.
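One way to check whether your installed pyreadr is at least the release mentioned above is to compare version numbers. This is only a sketch: the helper names version_tuple and pyreadr_is_recent are made up here, and the comparison only looks at the leading numeric parts of the version string.

```python
import re
from importlib import metadata

MIN_VERSION = (0, 3, 0)  # release named in the answer as removing the limit


def version_tuple(version):
    """Turn a version string such as '0.3.0' into a tuple of ints."""
    numbers = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        numbers.append(int(match.group()))
    return tuple(numbers)


def pyreadr_is_recent():
    """Return True or False, or None when pyreadr is not installed."""
    try:
        installed = metadata.version("pyreadr")
    except metadata.PackageNotFoundError:
        return None
    return version_tuple(installed) >= MIN_VERSION


if __name__ == "__main__":
    print(pyreadr_is_recent())
```

If this prints False, upgrading with pip install --upgrade pyreadr is the first thing to try.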
As far as I know, you could store the data in a pandas DataFrame, as described in the linked article.
A second option is covered in "How can I explicitly free memory in Python?": if your program reads a large input file, creates a few million objects, and uses a lot of memory, you need a way to tell Python that some of the data is no longer needed and can be freed.
The simple answer to this problem is:
Force the garbage collector to release unreferenced memory with gc.collect().
I hope this answers your query.
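A minimal sketch of what gc.collect() does, using made-up names (Node, make_cycles, release). Note that CPython already frees most objects as soon as their reference count drops to zero; an explicit collection mainly matters for objects that reference each other in a cycle, and freed memory is not necessarily handed back to the operating system.

```python
import gc


class Node:
    """Small object that can take part in a reference cycle."""

    def __init__(self):
        self.partner = None


def make_cycles(count):
    """Create `count` pairs of objects that reference each other."""
    pairs = []
    for _ in range(count):
        a, b = Node(), Node()
        a.partner, b.partner = b, a  # a <-> b form a cycle
        pairs.append(a)
    return pairs


def release(count=1000):
    """Build cyclic garbage, drop it, and return what gc.collect() reports."""
    gc.collect()  # start from a clean state
    data = make_cycles(count)
    del data  # drop the last outside reference to the pairs
    return gc.collect()  # number of unreachable objects found


if __name__ == "__main__":
    print("unreachable objects collected:", release())
```

In your own program the equivalent is to del (or otherwise drop) the references to the large objects first and then call gc.collect().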

How to Manage R Packages given Windows 255 file path limit, e.g. checkpoint and Rcpp?

So I was trying to install Rcpp using the checkpoint package (with a March 1st 2020 date).
Most of my packages were fine, but Rcpp specifically makes a lot of temporary directories that it then deletes, for example:
00LOCK-Rcpp/00new/Rcpp/include/Rcpp/generated/InternalFunctionWithStdFunction_call.h
This is 84 characters long and I believe some are longer.
Checkpoint creates numerous directories as well, for example with a custom library here:
"custom_library/.checkpoint/2020-03-01/lib/x86_64-w64-mingw32/3.6.0/"
This is 67 characters, of which 52 are only necessary when managing multiple checkpoint dates or versions.
This means that for a file path such as:
"C:/Users/USER/OneDrive - COMPANY/Documents/LargeDirectory/SubDirectory1/SubDirectory2/custom_library/.checkpoint/2020-03-01/lib/x86_64-w64-mingw32/3.6.0/Rcpp"
Assuming even temporary files can't exceed 255 characters, I definitely have fewer than 60 characters left for all of the Rcpp temporary objects.
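The arithmetic can be checked with a few lines of Python. This sketch uses the placeholder paths from this question, which are shorter than the real user and company names, so substitute your actual library path to get a meaningful number. The 255-character limit is the assumption made above, and the names remaining_budget and fits are made up for the illustration.

```python
LIMIT = 255  # path limit assumed in the question

# Placeholder library path from the question (real names are longer).
LIBRARY = (
    "C:/Users/USER/OneDrive - COMPANY/Documents/LargeDirectory/"
    "SubDirectory1/SubDirectory2/custom_library/.checkpoint/"
    "2020-03-01/lib/x86_64-w64-mingw32/3.6.0"
)
# Example temporary file created while Rcpp installs.
TEMP_FILE = (
    "00LOCK-Rcpp/00new/Rcpp/include/Rcpp/generated/"
    "InternalFunctionWithStdFunction_call.h"
)


def remaining_budget(library_path, limit=LIMIT):
    """Characters left below library_path, counting the joining slash."""
    return limit - len(library_path) - 1


def fits(library_path, relative_file, limit=LIMIT):
    """True when library_path/relative_file stays within the limit."""
    return len(relative_file) <= remaining_budget(library_path, limit)


if __name__ == "__main__":
    print("library path length:", len(LIBRARY))
    print("characters left:", remaining_budget(LIBRARY))
    print("example temp file length:", len(TEMP_FILE))
    print("fits:", fits(LIBRARY, TEMP_FILE))
```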
I tested with the following code:
library(checkpoint)
setwd("C:/Users/USERNAME/OneDrive - COMPANY/Documents/LargeDirectory/SubDirectory1/SubDirectory2/")
dir.create("custom_library")
checkpoint(as.Date("2020-03-01"),
           checkpointLocation = paste0("SubDirectory2", "/custom_library"))
# "y" was entered at the interactive prompt here
install.packages("Rcpp")
It fails with numerous "no file or directory found" errors, which I believe actually happen because 00LOCK-Rcpp/00new/Rcpp/include/Rcpp/ can't be created to receive the unzipped .h files. I was curious, so I ran the following:
setwd("~") # up to Documents
dir.create("Rcpptest")
.libPaths("Rcpptest")
install.packages("Rcpp")
And it installed fine.
Any ideas on how to make checkpoint either not create so many nested directories OR ignore the file_path 255 limit until the whole package installs?
For now, I will likely move the directory up a few levels but any insight into whether my guess is actually correct or if I'm missing something would be appreciated!
I believe you are correct -- this is, to the best of my knowledge, a limitation of the internal unzip implementation used by R, which is ultimately a limitation of the Windows APIs used by R. See https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file for some more discussion.
There are a couple options for mitigating the issue that may be worth trying.
Use utils::shortPathName() to construct a so-called Windows 'short path'. This will help trim longer path components and bring the full path down in size.
Create a junction to your project using Sys.junction() to a local path with a shorter length, and move to that directory. See ?Sys.junction for more information -- a junction is basically like a Windows shortcut, or a symlink to a directory.
In each case, you should hopefully be able to construct a path that is "identical" to your current project directory, but short enough that things can function as expected.

Where is convert in ImageMagick?

I just spent an infuriating day trying to make a gif out of a series of jpg files in R. I installed ImageMagick to run the following code:
system("convert -delay 40 *.png example_4.gif")
but I get the following error message:
Warning message:
running command 'convert -delay 40 *.png example_4.gif' had status 4
which looks like a path error. I've looked for convert in the ImageMagick download and can't see it anywhere. Does anyone know where it is?
Alternately, is there another easier method of making a gif from a series of jpegs in R that isn't ridiculously long?
Thanks
Three options:
1. Consider using the magick R package instead of using system().
2. Change your script from convert ... to magick convert ....
3. Re-install ImageMagick, and enable the "Install legacy utilities (e.g. convert)" option.
This change has been around since 7.0.1 (now up to 7.0.7), and is discussed in their porting guide, specifically in the section entitled "Command Changes".
Philosophically, I prefer to not install the legacy utilities, mostly because it can cause some confusion with command names. For instance, the non-ImageMagick convert.exe in windows tries to convert a filesystem ... probably not what you want to accidentally trigger (there is a very low chance that you could get the arguments right to actually make a change, but it's still not 0). The order of directories in your PATH will dictate which you are calling.
EDITs:
From comments, it seems like the difference between "static" and "dll" installers might disable the option to install legacy utilities such as convert.exe. So you can either switch to the "dll" to get the legacy option, or you are restricted to options 1 (magick R package) and 2 ("magick convert ...").
From further comments (thanks to fmw42 and MarkSetchell), it is clear that the old convert.exe and the current legacy mode of magick.exe convert are not the same as the currently recommended magick.exe (without "convert"); the first two are legacy and compatibility modes, but they do not accept all arguments currently supported by magick-alone. So the use of "convert" anywhere in the command should indicate use of v6, not the current v7. This answer is then merely a patch for continued use of the v6 mechanisms; one could argue a better solution would be to use magick.exe's v7 interface, completely removing the "convert" legacy mode.
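For scripting outside R, the same idea (prefer the v7 magick entry point, fall back to convert only if it is missing) can be sketched in Python. The function name gif_command is made up for this illustration, and the frames are expanded with glob in Python rather than relying on the shell to expand *.png.

```python
import glob
import shutil
import subprocess


def gif_command(frames, output, delay=40, which=shutil.which):
    """Build an ImageMagick command line for an animated GIF.

    Prefers the v7 `magick` executable and falls back to the legacy
    `convert` name only when `magick` is not found on PATH.
    """
    if which("magick"):
        prefix = ["magick"]
    elif which("convert"):
        # On Windows this name may resolve to the filesystem tool instead.
        prefix = ["convert"]
    else:
        raise FileNotFoundError("neither 'magick' nor 'convert' is on PATH")
    return prefix + ["-delay", str(delay)] + list(frames) + [output]


if __name__ == "__main__":
    frames = sorted(glob.glob("*.png"))
    if frames:  # only run when there is something to convert
        subprocess.run(gif_command(frames, "example_4.gif"), check=True)
```

The which parameter exists only so the lookup can be replaced in tests; in normal use the default shutil.which is what you want.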

How to change toolkit in Octave?

My Octave crashes when I execute the plot command. I found a solution in Assad Ebrahim's answer: he suggested switching the default toolkit to gnuplot, and making the change in the octaverc file if I want it to be permanent. However, I'm not clear on how to make the permanent change in octaverc. When I open my octaverc with Notepad++, it looks like this:
## System-wide startup file for Octave.
##
## This file should contain any commands that should be executed each
## time Octave starts for every user at this site.
EXEC_PATH (cstrcat (fullfile (OCTAVE_HOME, 'notepad++'), pathsep,
EXEC_PATH));
EXEC_PATH (cstrcat (fullfile (OCTAVE_HOME, 'bin'), pathsep, EXEC_PATH));
EDITOR (fullfile (OCTAVE_HOME, 'notepad++', 'notepad++.exe'));
What should I change and how?
First, the direct answer to your question is to append any command you want executed on startup to the end of the .octaverc file. So, to set a particular graphics toolkit you would add the line:
graphics_toolkit("gnuplot")
Or
graphics_toolkit("qt")
Or
graphics_toolkit("fltk")
For whichever toolkit you want.
Now, as pointed out by @Andy, if you are using Windows, you may be mistaking a long delay for a crash. Because of a Windows bug that is still not entirely resolved, Windows might need to create a font cache file on the first plot. This can take a long time. Once this is complete, most subsequent plots will be much faster. Some information about it can be found on the following bug report page:
https://savannah.gnu.org/bugs/?45458
EDIT: In the time since this answer was posted, the bug linked above has been largely resolved. Part of the installation process now updates the font-cache file. If you use a zip package rather than an executable installer, there is a batch file that should be run after extracting Octave to ensure that this is done. Details are available at:
http://wiki.octave.org/Octave_for_Microsoft_Windows
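If you would rather script the "append a line to the startup file" step than edit the file by hand, here is a small Python sketch. The function name ensure_toolkit_line is made up, and the check for an existing setting is deliberately crude: any mention of graphics_toolkit, even in a comment, counts as already configured.

```python
from pathlib import Path


def ensure_toolkit_line(startup_file, toolkit="gnuplot"):
    """Append graphics_toolkit("<toolkit>") unless the file already sets one.

    Returns True when the file was changed, False otherwise.
    """
    path = Path(startup_file)
    text = path.read_text() if path.exists() else ""
    if "graphics_toolkit" in text:
        return False  # leave an existing setting alone
    if text and not text.endswith("\n"):
        text += "\n"  # make sure the new command starts on its own line
    path.write_text(text + f'graphics_toolkit("{toolkit}")\n')
    return True
```

You would call it with the path to your own startup file, for example ensure_toolkit_line(Path.home() / ".octaverc"); that location is an assumption and depends on your installation.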

R2PPT crashes R; are there alternatives to R2PPT?

I am attempting to automate the insertion of JPEG images into Powerpoint. I have a macro done for that already, except using R would be infinitely better for my purposes.
The package R2PPT should do this, as I understand it. However, I cannot use it. For example, PPT.Open can be called in two different ways, with method = "rcom" or method = "RDCOMClient". With the latter, R always crashes and sends an error report to Windows. With the former, it tells me I need to install statconnDCOM, before giving the error:
Error in PPT.Open(x) : attempt to apply non-function.
I cannot install statconnDCOM freely, as I wouldn't call this work non-commercial. So if there isn't a way to get around this issue, are there at least some free alternatives to R2PPT so that I can save several hours of manual work with a simple R code? If there is a way for me to use R2PPT, that would be ideal.
Thanks!
Edit:
I'm using R version 2.15 and downloaded the most recent version of R2PPT. Powerpoint is 2007.
Do you have administrative privileges on this machine?
There is an issue with the RDCOMClient package: it needs permission to write the file rdcom.err in the root of drive C:. If you don't have privileges to write to C:, there is a rather cumbersome workaround:
Close R.
Create the "c:\temp" folder if it doesn't exist.
Locate the file rdcomclient.dll on your hard drive. It is usually placed in \R\library\RDCOMClient\libs\i386\ and in \R\library\RDCOMClient\libs\x64\ (patch the file that corresponds to your Windows version, 32-bit or 64-bit). It's recommended to make a backup copy of these files before patching.
Open rdcomclient.dll in a text editor (Notepad++, for example: http://notepad-plus-plus.org/).
Find the string c:\rdcom.err in the file. It occurs only once.
Switch to overwrite mode (usually by pressing the "Ins" key). It is very important that the new path has the same number of characters as the original one. Type C:\temp\e.rr in place of c:\rdcom.err.
Save the file.
Now everything should work fine.
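The manual edit above can also be scripted. The following Python sketch replaces the string with one of the same length and keeps a backup copy. It assumes the path is stored in the DLL as plain single-byte text exactly as written above, and the function name patch_same_length is made up for this illustration.

```python
import shutil
from pathlib import Path

OLD = rb"c:\rdcom.err"  # string to look for, as given in the steps above
NEW = rb"C:\temp\e.rr"  # replacement with the same number of characters


def patch_same_length(file_path, old=OLD, new=NEW):
    """Replace the single occurrence of `old` with `new` in a binary file."""
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the original")
    path = Path(file_path)
    data = path.read_bytes()
    found = data.count(old)
    if found != 1:
        raise ValueError(f"expected exactly one occurrence, found {found}")
    # Keep an untouched copy next to the original before writing.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_bytes(data.replace(old, new))
```

Close R before running it, and pass the full path to the rdcomclient.dll that matches your installation. If the function reports zero occurrences, the string is probably stored differently than assumed here and the manual steps are the safer route.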
Arguably not an answer, but have you looked at using Sweave/knitr to render your presentations in LaTeX using something like Beamer? (As discussed on slide 17 here.)
Wouldn't help any with getting JPGs into a PowerPoint, but would certainly make putting R-output (numerical or graphical) into a presentation much easier!
Edit: if you want to use knitr (which I recommend), here's another reference.
