I'm using PHPExcel 1.6.7 on WAMP.
I'm trying to load a big xlsm file of ~2000 KB (~2.0 MB).
At first, PHP complained about the time the script takes to load, so I increased that limit in php.ini; then it complained about the memory it consumes, so I increased that parameter in php.ini as well. I'm now at a maximum execution time of ~5 minutes and a memory limit of ~400 MB, and the file still cannot be loaded.
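For reference, the relevant settings in my php.ini now look like this:

max_execution_time = 300   ; ~5 minutes
memory_limit = 400M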
Is there any way to optimise the loading process meaningfully? Something like telling it not to load styles or pictures, or to load only the text?
(Do you know how ASP.NET loads Excel files? Would it be the same?)
Version 1.6.7 is a pretty old version of PHPExcel: the latest is 1.7.6, which offers options for caching cell data outside of PHP memory (either in an external cache such as memcache, WinCache or APC, or on disk) or in compressed form within PHP memory (which reduces overall memory usage). There are also options to load only the cell data rather than the formatting. All of this is fully described in the PHPExcel manual.
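As a rough sketch of what those 1.7.x options look like (the include path and file name below are placeholders; check the manual for the exact cache constants in your release):

require_once 'Classes/PHPExcel.php';

// cache cell data in gzip-compressed form in PHP memory instead of as raw objects
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_in_memory_gzip;
PHPExcel_Settings::setCacheStorageMethod($cacheMethod);

// use a reader directly so we can ask for cell values only (no styles or formatting)
$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objReader->setReadDataOnly(true);

$objPHPExcel = $objReader->load('bigfile.xlsm');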
Some additional techniques are also described in this thread.
Note that xlsm (Excel macro-enabled) files aren't officially supported by PHPExcel.
I have downloaded a large number of Sentinel-2 SAFE files using the R package 'sen2r', which has implemented a Google Cloud download method to retrieve products stored in the Long Term Archive. This has worked for me, but after checking the files I have found a fair number of empty files suffixed with _.gstmp, which according to this represent partially downloaded temporary files that are supposed to be resumed by gsutil. I have re-run the sen2r() command (with the server = "gcloud" setting) but it does not resume and correct the downloads, as the folders are already there. I would like to resume downloading just the _.gstmp files, since it took over a week to download all of the SAFE products and I don't want to start all over again. I'm guessing I can fix this by using gsutil directly, but I'm a bit out of my element: this is my first experience with Google Cloud, and I can't ask the sen2r author as they no longer have time to respond to issues on GitHub. If you have any tips for resuming these downloads manually using the gsutil command line, it would be much appreciated.
I have searched Stack Exchange as well as the sen2r manual and GitHub issues, and have not found any other reports of this problem.
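My untested guess so far would be to re-run the copy for an affected product and rely on gsutil's resumable downloads, something like the following (the bucket path and product name are placeholders, not the real layout):

gsutil -m cp -n -r \
  gs://gcp-public-data-sentinel-2/tiles/TILE/PRODUCT.SAFE \
  /path/to/local/SAFE/dir/

The idea being that -n (no-clobber) skips files that already completed, while files that exist locally only as _.gstmp temp files would be resumed rather than restarted. I haven't verified this behaviour, though.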
The environment is:
R: 3.6.1
readxl version: ‘1.3.1’
When I close the Excel program, read_excel takes a second or two, but when I have the file open in Excel, read_excel in R can take a few minutes.
I wonder why that is?
Some programs, like Excel, put access restrictions on files while the files are open. This prevents accidental conflicts from external changes to the same file while it is open.
I don't know specifically why it would slow other tools reading the file rather than blocking them outright. Maybe Excel is trying to monitor access to the file and compare it to the content it has loaded.
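If the slowdown is a problem in practice, one workaround (just a sketch; the path below is a placeholder) is to read from a temporary copy of the workbook, so R never touches the file Excel is holding open:

library(readxl)

path <- "myworkbook.xlsx"               # placeholder path
tmp  <- tempfile(fileext = ".xlsx")
file.copy(path, tmp, overwrite = TRUE)  # copy the possibly-locked file
dat  <- read_excel(tmp)                 # read the copy instead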
I'm loading an R rds file into Julia with
using RData
objs = load(rds, convert=true)
The original rds file is ~3GB. When I run the load call above, memory usage spikes to ~40GB.
Any ideas what's going on?
The rds files are actually compressed using gzip. Try unzipping your file and see how big it actually is (on Windows you could use 7-Zip for that). The compression ratio for a data frame can easily be around 80-90%, so your numbers look plausible.
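One quick way to check this from R (base functions only; the file names are placeholders):

obj <- readRDS("data.rds")              # placeholder file name
file.size("data.rds")                   # size on disk (gzip-compressed by default)
print(object.size(obj), units = "auto") # size once expanded in memory

# or write an uncompressed copy and compare the two files directly
saveRDS(obj, "data_uncompressed.rds", compress = FALSE)
file.size("data_uncompressed.rds")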
Is there a way to connect to a DBF and work on it without loading the file completely into memory?
I understand that the foreign package can be used to read DBFs, but this method loads the file into memory. This is problematic if the DBF in question is heavy in terms of file size. I'm aware of solutions that enable loading of heavy files into memory, but any solution that connects to the file and makes changes to it without loading it into memory is welcome.
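For reference, this is what I am doing at the moment, which pulls the whole table into a data frame (the file name is a placeholder):

library(foreign)
dat <- read.dbf("mytable.dbf")   # loads the entire DBF into memory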
The following message appears every time I attempt to use 'Gadfly', 'Bio', or several other packages (I'm using 'Bio' in the example):
julia> using Bio
INFO: Recompiling stale cache file C:\Users\CaitlinG\emacs251\.julia\lib\v0.5\Distributions.ji for module Distributions.
INFO: Recompiling stale cache file C:\Users\CaitlinG\emacs251\.julia\lib\v0.5\Bio.ji for module Bio.
Julia 0.5.1 (all packages updated)
Windows 10 (fully updated)
Emacs 25.1
This is inconvenient since I can only assume it is not a "typical" component of importing a package. Can the issue be resolved by deleting the .julia directory?
Thanks.
Moving my comment to an answer since that appears to have resolved the question:
Julia caches its precompiled output within the .julia/lib folder. If any of the files there are older than the original source, it'll recompile them. It seems like Julia was having trouble overwriting the cache for a few specific packages here, so it was persistently recompiling them. By deleting the lib folder, you clear out these caches. Julia will recompile all packages, but it should now write them with the correct permissions that will allow them to be overwritten in the future.
Deleting the entire .julia folder is a much more drastic step that risks losing edits you've made to packages, and you'd need to re-install all the packages you've added.
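For reference, clearing just that cache from within Julia might look something like this (a sketch; the path is derived from Pkg.dir(), so adjust it if your .julia lives somewhere non-standard, as in the paths above):

# remove the precompile cache; Julia rebuilds the .ji files on the next using
libdir = joinpath(dirname(Pkg.dir()), "lib", "v0.5")
rm(libdir; recursive=true, force=true)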
The messages about recompiling a stale cache file are not warnings, but rather for your information. It means that something changed on your system and the current cache file is no longer considered valid, so instead of providing you a potentially old cachefile, Julia automatically cleans up and recompiles the cache.