RDS table opened in RStudio. How do I find field Names? - r

As in the header, I have opened an RDS table in RStudio and need to know the field names within that table.
But I don't know the correct command or syntax to follow this:
UK_2001 <- readRDS("D:/Census_History/Postcodes/2001_05_MAY_AFPD.rds")
Any guidance would be gratefully received.
Thanks in advance.

You can display the structure of any R object using str(), which gives you the object's type (e.g. data.frame) along with its column names and column types.
str(UK_2001)
If you are just after the names of the columns, colnames(UK_2001) or names(UK_2001) will do.
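As a minimal sketch of the calls above: the small data frame and its column names here are hypothetical stand-ins for the real census table, written to a temporary .rds file so the example runs on its own.

```r
# Hypothetical stand-in for the real table; the columns are invented for illustration
tmp <- tempfile(fileext = ".rds")
saveRDS(
  data.frame(postcode = c("AB1 0AA", "AB1 0AB"), easting = c(385386L, 385177L)),
  tmp
)
UK_2001_demo <- readRDS(tmp)

# Column (field) names only
field_names <- names(UK_2001_demo)   # colnames(UK_2001_demo) gives the same result
print(field_names)

# Names plus types and a preview of the values
str(UK_2001_demo)
```

For a data frame, names() and colnames() return the same character vector, so either works.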

Related

How to solve the duplicated names of cores in a RWL file in R

When I read tree-ring width data from .rwl files (with the function "read.rwl" in the package 'dplR'), I get errors saying the files cannot be read because of duplicated core names. So I intend to read the data with the function 'read.delim' instead, convert it to a one-column format, and rename the cores. But I don't know how to do the conversion. Please give me some suggestions to resolve this problem. Thanks in advance.

Convert data dictionary from word to excel with R

I got the data dictionary from the data provider. It contains hundreds of vars spread across different Word files and looks like this:
In order to add this dictionary to my current dataset, I need to convert it to a certain format in Excel. For example, for the first var, "intarm_actual", I would like to create columns in a spreadsheet: the "variable" column holds the top-left word; the "label" column stores the content of "label" (for this var it is NA, but for the second var it should be "tpe_lab"); the "type" column stores "string (str2)"; the "value" column stores "4"; the "missing" column stores "46/102"; and the "tabulation" column stores 46 "", 14 "RO", 14 "RV", 14 "TO", 14 "TV". Ideally, it should look like this:
Could anyone who has done this before offer some suggestions? (I would appreciate any pointers: which package I should use, related posts or articles I should read, similar code I can learn from...) Can the R package "labelled" handle this type of task? Thanks a lot!
Update:
I used the package qdapTools to import one of the docx files; it looks like this:
How can I retrieve the required words and assign them to the right place in my spreadsheet? Thanks!
Update 2:
The issue has been solved in another way.
In case someone encounters a similar situation: 1) this type of codebook file is generated by Stata; 2) instead of reading this complex text file, the alternative solution is to use the "codebook" package in R to generate a new .csv codebook that contains all of this information and more.
Assuming that you do indeed have zero clue, I would recommend getting started with regular expressions in R. I often use the R package stringr to work with regular expressions, and you find the respective cheat sheet here. Regular expressions will allow you to, e.g., select the word following a ":".
I have never worked with Word documents in R, but I guess that there are packages out there that allow you to read Word documents into R. Just Google them. :) I am sure they also have good instructions on how to use them.
Another issue you might encounter is encoding. If the text is not read into R correctly, e.g. you see strange character combinations, encoding is most likely the source of the problem.
Once you have looked at these things and started working on your own code, you will be able to ask more precise questions.
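To illustrate the "word following a ':'" idea, here is a small sketch in base R. The input lines are hypothetical, loosely modelled on the codebook fields described in the question, and the pattern is only one possible choice. stringr::str_extract() with the same pattern should behave equivalently, but base R is used here to avoid the extra dependency.

```r
# Hypothetical codebook lines; the real file layout may differ
codebook_lines <- c("type: string (str2)", "label: tpe_lab", "unique values: 4")

# (?<=:\s) is a lookbehind: match only where a colon and one whitespace
# character come immediately before; \S+ then takes the next run of
# non-whitespace characters, i.e. the first word after the colon.
pattern <- "(?<=:\\s)\\S+"
first_words <- regmatches(codebook_lines, regexpr(pattern, codebook_lines, perl = TRUE))

print(first_words)
```

Note that regmatches() silently drops lines without a match, so the result can be shorter than the input when some lines contain no colon.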

RStudio - weird formatting to append new row to .csv file

I am trying to append a new row of data to an existing .csv file, but every time I do so, it isn't added as a new row. Instead, it is appended to the end of the last row, like so:
Also, the first column is supposed to show the date, but I'm not sure why it shows up as hashtags. In the column where it shows 33882020-09-24, the value should end at 3388, and everything after that should be in its respective column in the row below.
Here is what I have for my code. I've followed multiple forums on how to append to .csv files and did exactly what was shown, so I am at a loss.
Any suggestions would be greatly appreciated! Thank you in advance!!
As for the date, it is possible that RStudio doesn't recognise the value as a date because of the "-" separators (in R code, "-" is the subtraction operator). Try a different separator, for example "_", or write the date as 24sept2020.
For adding a new row, I found this:
How can I add a row to a data frame in R?
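Since the question's own code is not shown, the following is only a sketch of one common cause and fix, under the assumption that the existing file does not end with a newline (which would make an appended row run on from the last line, as in "33882020-09-24"). The file contents and column names are invented for illustration.

```r
# Hypothetical existing file whose last line has no trailing newline
csv_path <- tempfile(fileext = ".csv")
cat("date,count\n2020-09-23,3388", file = csv_path)

# If the last byte is not a newline (0x0A), add one before appending
bytes <- readBin(csv_path, what = "raw", n = file.size(csv_path))
if (length(bytes) > 0 && bytes[length(bytes)] != as.raw(10L)) {
  cat("\n", file = csv_path, append = TRUE)
}

# Append the new row without repeating the header
new_row <- data.frame(date = "2020-09-24", count = 3400)
write.table(new_row, file = csv_path, append = TRUE, sep = ",",
            col.names = FALSE, row.names = FALSE, quote = FALSE)

result <- read.csv(csv_path)
print(result)
```

write.table() is used rather than write.csv() because write.csv() does not honour append = TRUE.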

How do I get EXCEL to interpret character variable without scientific notation in R using fwrite?

I have a relatively simple issue: when writing out in R with fwrite from the data.table package, a character vector gets interpreted as scientific notation by Excel. You can run the following code to reproduce the data issue:
library(data.table)
#create example
samp = data.table(id = c("7E39", "7G32","5D99999"))
fwrite(samp,"test.csv",row.names = FALSE)
When you read this back into R, you get the values back with no problem if you have scientific notation disabled. My less code-capable colleagues work with the csv directly in Excel, and they see this:
They can attempt to change the variable to text, but Excel then interprets all the zeros. I want them to see the original "7E39" from the data table created. Any ideas how to avoid this issue?
PS: I'm working with millions of rows, so write.csv is not really an option.
EDIT:
One workaround I've found is to just create a mock variable with quotes:
samp = data.table(id = c("7E39", "7G32","5D99999"))[,id2:=shQuote(id)]
I prefer a tidyr solution (pun intended), as I hate unnecessary columns
EDIT2:
Following R2Evan's solution, I adapted it to data.table with the following (adding another, numerical column to see if any changes occurred):
#create example
samp = data.table(id = c("7E39", "7G32","5D99999"))[,second_var:=c(1,2,3)]
fwrite(samp[,id:=sprintf("=%s", shQuote(id))],
"foo.csv", row.names=FALSE)
It's a kludge, and dang-it for Excel to force this (I've dealt with it before).
write.csv(data.frame(id=sprintf("=%s", shQuote(c("7E39", "7G32","5D99999")))),
"foo.csv", row.names=FALSE)
This is forcing Excel to consider that column a formula, and interpret it as such. You'll see that in Excel, it is a literal formula that assigns a static string.
This is obviously not portable and prone to all sorts of problems, but that is Excel's way in this regard.
(BTW: I used write.csv here, but frankly it doesn't matter which function you use, as long as it passes the string through.)
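A small sketch of the same formula trick with the double quotes written out explicitly. This is a variation, not the code above: shQuote() chooses its quoting style by platform (double quotes under the Windows default, single quotes under the sh style), so spelling out '="%s"' avoids depending on where the script runs. Base R write.csv is used only to keep the example dependency-free.

```r
ids <- c("7E39", "7G32", "5D99999")

# Wrap each value as ="value" so Excel treats the cell as a formula
# that returns a literal string
excel_ids <- sprintf('="%s"', ids)

out_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = excel_ids), out_path, row.names = FALSE)

# Inspect the raw file, then confirm the wrapped values survive a round trip
print(readLines(out_path))
round_trip <- read.csv(out_path, stringsAsFactors = FALSE)$id
print(round_trip)
```

Any other consumer of the file will see the ="..." wrapper as part of the value, which is the portability cost mentioned above.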
Another option, but one that your consumers will need to do, not you.
If you export the file "as is", meaning the cell content is just "7E39", then an auto-import within Excel will always try to be smart about that cell's content. However, you can manually import the data.
Using Excel 2016 (32bit, on win10_64bit, if it matters):
1. Open Excel (first), have an (optionally empty) worksheet already open
2. On the ribbon: Data > Get External Data > From Text
3. Navigate to the appropriate file (CSV)
4. Select "Delimited" (file type), click Next, select "Comma" (and optionally deselect any others that may default to selected), Next
5. Click on the specific column(s) and set the "Default data format" to "Text" (this will need to be done for any/all columns where this is a problem). Multiple columns can be Shift-selected (for a range of columns), but not Ctrl-selected. Finish.
6. Choose the top-left cell to import/paste the data (or a new worksheet)
7. Select Properties..., and deselect "Save query definition". Without this step, the data is considered a query into an external data source, which may not be a problem but makes some things a little annoying. (For example, try to highlight all data and delete it ... Excel really wants to make sure you know what you're doing there.)
This method provides a portable solution. It "punishes" the Excel users, but anybody/anything else will still be able to consume the files directly without change. The biggest disadvantage with this method is that you won't know if somebody loads it incorrectly unless/until they get odd results when they try to use the data and some fields have been silently converted.

Get a list from R string that contains a csv

For one of my projects, I need to import the dataset (a csv file) outside of R and then assign it to an R variable from the Ruby side of the project (this is done with rinruby and already works).
In my R script I now need to create a list out of that csv data.
The variable contains an escaped string that holds the original csv.
data <- "\"\",\"futime\",\"fustat\",\"age\",\"resid.ds\",\"rx\",\"ecog.ps\"\n\"1\",59,1,72.3315,2,1,1\n\"2\",115,1,74.4932,2,1,1\n\"3\",156,1,66.4658,2,1,2\n\"4\",421,0,53.3644,2,2,1\n\"5\",431,1,50.3397,2,1,1\n\"6\",448,0,56.4301,1,1,2\n\"7\",464,1,56.937,2,2,2\n\"8\",475,1,59.8548,2,2,2\n\"9\",477,0,64.1753,2,1,1\n\"10\",563,1,55.1781,1,2,2\n\"11\",638,1,56.7562,1,1,2\n\"12\",744,0,50.1096,1,2,1\n\"13\",769,0,59.6301,2,2,2\n\"14\",770,0,57.0521,2,2,1\n\"15\",803,0,39.2712,1,1,1\n\"16\",855,0,43.1233,1,1,2\n\"17\",1040,0,38.8932,2,1,2\n\"18\",1106,0,44.6,1,1,1\n\"19\",1129,0,53.9068,1,2,1\n\"20\",1206,0,44.2055,2,2,1\n\"21\",1227,0,59.589,1,2,2\n\"22\",268,1,74.5041,2,1,2\n\"23\",329,1,43.137,2,1,1\n\"24\",353,1,63.2192,1,2,2\n\"25\",365,1,64.4247,2,2,1\n\"26\",377,0,58.3096,1,2,1"
And I would like to convert this to an R list.
So my approach is basically to call read.csv(data_as_string) but unfortunately the signature is read.csv(file_where_data_lies).
How can this be done?
Thanks so much!
As Therkel mentioned above, myfunc(file = textConnection(data)) did exactly what I was about to do. Thanks!
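A minimal sketch of that approach with read.csv, using a shortened three-row version of the string from the question. The text = argument of read.csv is shown as an alternative that does the same job; row.names = 1 is an illustrative choice that treats the unnamed first column as row names.

```r
# Shortened version of the escaped csv string from the question
data <- "\"\",\"futime\",\"fustat\"\n\"1\",59,1\n\"2\",115,1\n\"3\",156,1"

# Option 1: wrap the string in a connection so it can stand in for a file
df_conn <- read.csv(file = textConnection(data), row.names = 1)

# Option 2: pass the string directly through the text argument
df_text <- read.csv(text = data, row.names = 1)

# A data frame is already a list of columns; as.list() makes that explicit
data_list <- as.list(df_text)
str(data_list)
```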
