RStudio converted code to gibberish, froze, and saved as gibberish - r

I had a small but important R file that I have been working on for a few days.
I created and uploaded a list of about 1,000 IDs to SQL Server the other day, and today I was repeating the process with a different type of ID. I save the file frequently, and after adding a couple of lines and saving, I ran the sqlSave() statement to upload the new IDs.
RStudio promptly converted all of my code to gibberish and froze (see screenshot).
After letting it try to finish for several minutes, I closed RStudio and reopened it. It automatically reopened my untitled text files, where I had a little working code, but didn't open my main code file.
When I tried to open it, I was informed that the file is 55 MB and thus too large to open. Indeed, I confirmed that it really is 55 MB now, and when opening it in an external text editor I see the same gibberish as in this screenshot.
Is there any hope of recovering my code?
I suppose low memory must be to blame. The object and command I was executing at the time were not resource intensive; however, a few minutes before that I did retrieve an overly large data frame from SQL Server.

You overwrote your code with a binary representation of your objects with this line:
save.image('jive.R')
save.image() saves your R objects (the workspace), not your R script file. To save your script, just click File -> Save in the editor. To save your objects, write them to a separate file so they never overwrite your script.
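For example, a safer pattern (the file and object names here are only illustrative) keeps the workspace and the script separate:
# Save all workspace objects to an .RData file, never over your .R script
save.image(file = 'workspace.RData')
# Or save only the specific objects you need ('ids_df' is a hypothetical object)
save(ids_df, file = 'ids.RData')
# Restore them in a later session
load('workspace.RData')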

Related

Cannot upload speech dataset because "Failed"

So I am trying to upload a dataset to the Microsoft Cognitive Services Speech portal for custom models.
I have been doing this for about a year without issue; however, now I am getting "Failed" with the detail "Failed to upload data. Please check your data format and try to upload again." ... very useful.
So does anyone know what could be causing the issue, apart from the items below, which I have already checked?
Filesize is 1.3GB (zipped) / 1.8GB (unzipped) which is below the 2GB limit for "Max acoustic dataset file size for Data Import" as specified in https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-services-quotas-and-limits#model-customization
The Trans.txt file is a properly formatted 1.3 MB UTF-8-with-BOM text file with tab-separated filename / text values (see the example after this list), as specified in https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-custom-speech-test-and-train
All entries in the Trans.txt file are present in the directory
All files in the directory have an associated entry in the Trans.txt file
All files are WAV files in the specified format.
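For reference, each line of Trans.txt pairs a WAV filename with its transcript, separated by a single tab (the filenames and text below are made up):
speech0001.wav	this is the first utterance
speech0002.wav	this is the second utterance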
Basically all of the above has been working for a year with the only thing that really changes is the size of the zip file which is still below limits.
On the off-chance someone from MS sees this, the dataset ID is: 7a3f240c-5eb7-4942-8e0f-7efa1b808eee
Related feedback post: https://feedback.azure.com/forums/932041-azure-cognitive-services/suggestions/42375118-actionable-error-messaging-in-speech-portal
After contacting MS support, it appears something broke server-side related to the file size, even though we are within the limits. They are working on fixing it.

Why does everything in my RStudio workspace vanish every time I close it?

Every time I close and reopen RStudio, everything in the panes (all the data frames, functions, values, etc.) vanishes, and some very old objects that I deleted long ago reappear. I save the workspace when I close it, but this happens every time. Importing my large dataset and regenerating everything each time takes a lot of time. What can I do?
You can save your workspace and restore it under Tools -> Options -> General.
In addition you can also use:
save.image(file='Session.RData')
And load it later:
load('Session.RData')
However, generally speaking, many consider it bad practice to keep/save your environment/workspace, since reloading old objects can make your results harder to reproduce.
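If the main cost is re-importing one large dataset, a lighter-weight option (the object and file names below are just illustrative) is to save that single object rather than the whole workspace:
# Save one large object to its own file ('big_df' is a hypothetical data frame)
saveRDS(big_df, file = 'big_df.rds')
# Read it back in a later session
big_df <- readRDS('big_df.rds')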

Network timeouts when running lengthy shiny apps

I have a Shiny app which selects a subset of observations from a large data frame, then renders an R Markdown report for each observation in that subset, zipping all of these reports at the end and downloading the zip file.
When the subset is small (e.g. fewer than 10 reports), all works fine, but a network timeout occurs once rendering all of the reports in the background takes more than a certain amount of time (e.g. in some cases more than 100 reports need to be rendered).
I have tried editing the config file to set app_init_timeout = 3600 and app_idle_timeout = 3600, but this does not seem to affect the problem.
Any ideas?
I solved this problem by separating the report creation from the download. I used eventReactive to handle the report creation and the zipping of the files, and then made the downloadHandler conditional on the existence of the zip file, so that the download button only appeared once the file was ready.
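A minimal sketch of that pattern, assuming the reports are rendered with rmarkdown (the UI names and the rendering step are illustrative, not the original app's code):
library(shiny)
ui <- fluidPage(
  actionButton('build', 'Build reports'),
  uiOutput('download_ui')
)
server <- function(input, output, session) {
  # Build and zip the reports only when the button is pressed
  zip_path <- eventReactive(input$build, {
    out_dir <- tempfile('reports_')
    dir.create(out_dir)
    # ... render one R Markdown report per observation into out_dir, e.g.
    # rmarkdown::render('report.Rmd', output_file = ..., params = ...)
    zipfile <- file.path(tempdir(), 'reports.zip')
    # utils::zip needs a zip executable on the PATH; zip::zipr() is an alternative
    zip(zipfile, files = list.files(out_dir, full.names = TRUE))
    zipfile
  })
  # Show the download button only once the zip file actually exists
  output$download_ui <- renderUI({
    req(file.exists(zip_path()))
    downloadButton('download', 'Download reports')
  })
  output$download <- downloadHandler(
    filename = 'reports.zip',
    content = function(file) file.copy(zip_path(), file, overwrite = TRUE)
  )
}
shinyApp(ui, server)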

Submit a new script after all parallel jobs in R have completed

I have an R script that creates multiple scripts and submits them simultaneously to a computer cluster. After all of these scripts have completed and their output has been written to the respective folders, I would like to automatically launch another R script that works on those outputs.
I haven't been able to figure out whether there is a way to do this in R: the function 'wait' is not what I want, since the scripts are submitted as different jobs and each of them completes and writes its output file at a different time, but I actually want to run the subsequent script only after all of the outputs appear.
One way I thought of is to count the files that have been created and, if the correct number of output files is there, submit the next script. However, to do this I guess I would have to keep a script running that checks for the presence of the files every now and then, and I am not sure whether this is a good idea, since it probably takes a day or more before the first scripts complete.
Can you please help me find a solution?
Thank you very much for your help
-fra
I think you are looking at this the wrong way:
This is not an R problem at all; R just happens to be the client of your batch jobs.
This is an issue that the queue / batch processor on your cluster can address, for example via job dependencies.
Worst case, you could simply wait/sleep in a shell script (or an R script) until a 'final condition reached' file has been touched, as in the sketch below.
Inter-dependencies can be expressed with make too.
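A minimal polling sketch in R along those lines (the output file pattern, count, and follow-up script name are hypothetical):
# Block until all expected output files exist, then launch the next step
expected <- sprintf('results/output_%03d.csv', 1:100)
while (!all(file.exists(expected))) {
  Sys.sleep(600)  # re-check every 10 minutes
}
source('followup_analysis.R')  # or: system('Rscript followup_analysis.R')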

Does DB2 OS/390 BLOB support .docx files?

An ASP.NET app inserts a Microsoft Word 2007 .docx file into a row of a DB2 OS/390 BLOB table. A different VB.NET app retrieves the DB2 OS/390 BLOB data and launches Microsoft Word to open the .docx file, but Word then pops up a message that the data is corrupted. Word will let you repair the data so the file can be viewed, but that adds extra steps and users complain.
I've seen some examples where .docx can be converted to .doc, but they only talk about stripping out the text. Some of our .docx files have pictures in them.
Any ideas?
I see that this question is 10 months old. I hope it's not too late to be helpful.
Neither DB2 nor any other database that allows a "Blob" data type would know that the data came from a .docx file, or do anything that would cause Word to complain. The data is supposed to be an exact copy of whatever data you pass to it.
Similarly, the Word document does not "know" that it has been copied to a BLOB object and then back.
Therefore, the problem is almost certainly with your handling of the BLOB data, in one or both of your programs.
Please run your first program to copy the .docx file into the database, then run the second one to read it back out. Then use a byte-by-byte tool to compare the two files. One way to do this would be to open a command window and type:
fc /b Doc1.docx Doc2.docx
If you have access to some better compare tools, by all means use them... but make sure that it looks at EVERY BYTE, not just the printable characters.
Obviously, you ARE going to find differences, or else Microsoft Word wouldn't give you errors on the second file when the first one is just fine. Once you see what the differences are, hopefully you will understand what is going wrong and how to fix it.
I had a similar problem several years ago (I was storing graphics, but it's the same basic problem). It turns out that the document size was being affected - I would store 8005 bytes into the BLOB object, and when I read it back out I was getting 8192 bytes. NUL (0) bytes were being appended to the end of the data.
My solution at the time was to append an "X" to the end of the BLOB data when I wrote it to the database. Then, when I read it back, I would search for the very last "X" in the data and remove it, along with any data after it. That way, I could recover the original data. What I should have done was store the data length in the database along with the BLOB data. Then you could truncate the file to that size, eliminating the corruption.
If appended NUL bytes aren't your problem, then you'll need to do something else to fix the problem. But you don't have a clue until you know what changed. Something did.
