I have created a dataset that consists of 574 rows and 85 columns. The data type is a list. I want to export this data to CSV so I can perform some analysis. I tried converting the list to a data frame using the command dataFrame <- as.data.frame(Data). I also looked for other commands but was not able to convert the list to a data frame, or any other format. My goal is to export the data to a CSV file.
(Image: a preview of the dataset; a second image shows that the data type is a list of dimension 574 x 85.)
You can try the write.csv function on your list:
write.csv(list, "a.csv")
It will automatically save the file in your working directory.
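For example, a minimal sketch with a small made-up list (write.csv coerces a list to a data frame before writing, which works as long as every element has the same length):
# Hypothetical list standing in for the 574 x 85 data
my_list <- list(a = c(1, 2, 3), b = c("x", "y", "z"))
# Coerced to a data frame and written to the working directory
write.csv(my_list, "a.csv", row.names = FALSE)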
Could you share the format of your list? I'm not sure whether the answer below is useful for you or not.
If your list is as below
all_data_list = [[1,2,3],[1,4,5],[1,5,6],...]
you have to do:
import pandas as pd
df = pd.DataFrame(all_data_list)
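The same idea in R (a rough sketch, assuming all_data_list is an R list of equal-length row vectors) would be:
# Small made-up list of rows
all_data_list <- list(c(1, 2, 3), c(1, 4, 5), c(1, 5, 6))
# Bind the rows into a matrix, then coerce to a data frame
df <- as.data.frame(do.call(rbind, all_data_list))
write.csv(df, "a.csv", row.names = FALSE)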
In R, when a data frame is filtered, for example, are any changes made to the source data frame? What are best practices for preserving the original data frame?
Okay, so I do not understand exactly what you mean, but if you have a .csv file (for example "example.csv") in your working directory and you create an R object (example) from it, the original .csv file remains intact.
The example object, however, changes whenever you apply functions or filters to it. The easiest way to maintain an original data frame is to apply those functions to a differently named object (e.g. example2).
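For instance, a minimal sketch (the column name some_column is hypothetical):
example2 <- subset(example, some_column > 0)  # `example` itself is left untouched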
You may also save the result as another data frame, or write objects out for preservation:
library(dplyr)
mtcars1 <- mtcars %>%
  select(mpg, cyl, hp, vs)
# Save one object to a file
saveRDS(mtcars1, file = "my_data.rds")
# Restore the object
mtcars1 <- readRDS(file = "my_data.rds")
# Save multiple objects
save(mtcars, mtcars1, file = "multi_data.RData")
# Restore multiple objects again
load("multi_data.RData")
I am trying to figure out how to 'download' data into a nice CSV file so I can analyse it.
I am currently looking at WHO data here:
I am doing so by following the documentation, and I am getting output like this:
test_data <- jsonlite::parse_json(url("http://apps.who.int/gho/athena/api/GHO/WHS6_102.json?profile=simple"))
head(test_data)
This gives me a rather messy list of lists of lists.
It is not very easy to analyse and rather messy. How could I clean this up so that I keep only a couple of the columns returned by parse_json, say the dim fields like REGION, YEAR, and COUNTRY, together with the values from the Value column? I would like to turn this into a nice data frame / CSV file so I can more easily understand what is happening.
Can anyone give any advice?
jsonlite::fromJSON gives you the data in a better format, and the 3rd element of the resulting list is where the main data is.
url <- 'https://apps.who.int/gho/athena/api/GHO/WHS6_102.json?profile=simple'
tmp <- jsonlite::fromJSON(url)
data <- tmp[[3]]
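From there, a minimal sketch of keeping only a few columns and writing a CSV (the column names REGION, YEAR, COUNTRY, and Value are assumptions about the response, so check names(data) first; the output filename is just a choice):
# Keep only the columns of interest
keep <- c("REGION", "YEAR", "COUNTRY", "Value")
subset_data <- data[, keep]
# Write the result out for analysis
write.csv(subset_data, "who_whs6_102.csv", row.names = FALSE)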
I am importing multiple Excel workbooks, processing them, and appending them subsequently. I want to create a temporary data frame (tempfile?) that holds nothing in the beginning and, after each successive workbook is processed, append to it. How do I create such a temporary data frame in the beginning?
I am coming from Stata, where I use tempfile a lot. Is there a counterpart to Stata's tempfile in R?
As #James said, you do not need an empty data frame or tempfile; simply add newly processed data frames to the first data frame. Here is an example (based on CSV files, but the logic is the same):
list_of_files <- c('1.csv','2.csv',...)
pre_processor <- function(dataframe){
# do stuff
}
library(dplyr)
dataframe <- pre_processor(read.csv('1.csv')) %>%
  rbind(pre_processor(read.csv('2.csv'))) %>%
  ...
Now, if you have a lot of files or very complicated pre-processing, then you might have other questions (e.g. how to loop over the list of files, or how to write the right pre_processor function), but those should be separate questions, and we really need more specifics (example data, code so far, etc.).
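That said, a minimal sketch of looping over a vector of file names (reusing the placeholder pre_processor above; for Excel workbooks, read.csv could be swapped for readxl::read_excel):
library(dplyr)
list_of_files <- c('1.csv', '2.csv', '3.csv')
# Read and pre-process each file, then bind all rows into one data frame
processed <- lapply(list_of_files, function(f) pre_processor(read.csv(f)))
dataframe <- bind_rows(processed)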
I am creating a data frame from a CSV file. However, when I run the code, it does not recognize all of the objects in the file. It will recognize some of them, but not all.
smallsample <- data.frame(read.csv("SmallSample.csv",header = TRUE),smallsample$age,smallsample$income,smallsample$gender,smallsample$marital,smallsample$numkids,smallsample$risk)
smallsample
It won't recognize marital or numkids, despite the fact that those are the column names in the table in the .csv file.
When you use read.csv, the output is already a data frame.
You can simply use smallsample <- read.csv("SmallSample.csv").
Result using a dummy csv file:
  age income gender marital numkids      risk
1  32  34932 Female  Single       1 0.9611315
2  22  50535   Male  Single       0 0.7257541
3  40  42358   Male  Single       1 0.6879534
4  40  54648   Male  Single       3  0.568068
I created hundreds of data frames in R, and I want to export them to a local location. All the names of the data frames are stored in a vector:
name.vec <- c('df1','df2','df3','df4','df5','df6')
Each element of name.vec is the name of a data frame.
What I want to do is export those data frames as Excel files, but I do not want to do it the way below:
library("xlsx")
write.xlsx(df1,file ="df1.xlsx")
write.xlsx(df2,file ="df2.xlsx")
write.xlsx(df3,file ="df3.xlsx")
because with hundreds of data frames, it's tedious and error-prone.
I want something like the code below instead:
library('xlsx')
for (k in name.vec) {
write.xlsx(k,file=paste0(k,'.xlsx'))
}
but this does not work.
Does anyone know how to achieve this? Your time and knowledge would be deeply appreciated. Thanks in advance.
The first reason the for loop doesn't work is that the code is attempting to write a single name (e.g. the string 'df1') as the contents of each xlsx file, instead of the corresponding data frame. This is because name.vec stores the names of the data frames, not the data frames themselves. So to fix the for loop, you'd have to do something more like this:
df.list<-list(df1,df2,df3)
name.vec<-c('df1','df2','df3')
library('xlsx')
for (k in seq_along(df.list)) {
write.xlsx(df.list[[k]],file=paste0(name.vec[k],'.xlsx'))
}
However, if you prefer a more compact, functional style, here's another way:
sapply(1:length(df.list),
function(i) write.xlsx(df.list[[i]],file=paste0(name.vec[i],'.xlsx')))
The output is three .xlsx files, one for each data frame in df.list, named using the name vector.
It may also be worth switching at some point to the newer package for this: writexl.
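If the data frames only exist as names in name.vec (as in the original question), one sketch is to look the objects up by name with mget() and write them with writexl; this assumes each name in name.vec refers to a data frame in the global environment:
library(writexl)
# Collect the data frames by name from the global environment
df.list <- mget(name.vec, envir = .GlobalEnv)
# Write each one to its own .xlsx file, named after the object
for (k in seq_along(df.list)) {
  write_xlsx(df.list[[k]], path = paste0(name.vec[k], '.xlsx'))
}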