Why does jsonlite parse data into a list object? - r

I am trying to parse data from a web API with jsonlite, but for some reason the object it returns is a list.
The jsonlite package documentation says that the simplification process automatically converts JSON lists into more specific R classes, but in my case that doesn't happen.
It is as if simplifyVector, simplifyDataFrame and simplifyMatrix were disabled, even though each one is enabled by default.
What I would like is a data frame from which I can retrieve the $Name data (EAC, EFL, ELC, etc.).
I also tried the rjson library, with the same result.
Any idea what could be wrong?
Thank you.
Here is the code I use:
library(RCurl)     # provides getURL()
library(jsonlite)
raw <- getURL("https://www.cryptocompare.com/api/data/coinlist")
data <- fromJSON(txt = raw)
> class(data)
[1] "list"
> typeof(data)
[1] "list"
> str(data)
[...]
..$ EAC :List of 13
.. ..$ Id : chr "4437"
.. ..$ Url : chr "/coins/eac/overview"
.. ..$ ImageUrl : chr "/media/19690/eac.png"
.. ..$ Name : chr "EAC"
.. ..$ CoinName : chr "EarthCoin"
.. ..$ FullName : chr "EarthCoin (EAC)"
.. ..$ Algorithm : chr "Scrypt"
.. ..$ ProofType : chr "PoW"
.. ..$ FullyPremined : chr "0"
.. ..$ TotalCoinSupply : chr "13500000000"
.. ..$ PreMinedValue : chr "N/A"
.. ..$ TotalCoinsFreeFloat: chr "N/A"
.. ..$ SortOrder : chr "100"
..$ EFL :List of 13
.. ..$ Id : chr "4438"
.. ..$ Url : chr "/coins/efl/overview"
.. ..$ ImageUrl : chr "/media/19692/efl.png"
.. ..$ Name : chr "EFL"
.. ..$ CoinName : chr "E-Gulden"
.. ..$ FullName : chr "E-Gulden (EFL)"
.. ..$ Algorithm : chr "Scrypt"
.. ..$ ProofType : chr "PoW"
.. ..$ FullyPremined : chr "0"
.. ..$ TotalCoinSupply : chr "21000000 "
.. ..$ PreMinedValue : chr "N/A"
.. ..$ TotalCoinsFreeFloat: chr "N/A"
.. ..$ SortOrder : chr "101"
..$ ELC :List of 13
.. ..$ Id : chr "4439"
.. ..$ Url : chr "/coins/elc/overview"
.. ..$ ImageUrl : chr "/media/19694/elc.png"
.. ..$ Name : chr "ELC"
.. ..$ CoinName : chr "Elacoin"
.. ..$ FullName : chr "Elacoin (ELC)"
.. ..$ Algorithm : chr "Scrypt"
.. ..$ ProofType : chr "PoW"
.. ..$ FullyPremined : chr "0"
.. ..$ TotalCoinSupply : chr "75000000"
.. ..$ PreMinedValue : chr "N/A"
.. ..$ TotalCoinsFreeFloat: chr "N/A"
.. ..$ SortOrder : chr "102"
.. [list output truncated]
$ Type : int 100
NULL

You showed the lower end of the structure, but the answer to the question regarding why a dataframe was not returned is seen at the top of the structure:
# note: needed `require(RCurl)` to obtain getURL
> str(data)
List of 6
$ Response : chr "Success"
$ Message : chr "Coin list succesfully returned!"
$ BaseImageUrl: chr "https://www.cryptocompare.com"
$ BaseLinkUrl : chr "https://www.cryptocompare.com"
$ Data :List of 492
..$ BTC :List of 13
.. ..$ Id : chr "1182"
.. ..$ Url : chr "/coins/btc/overview"
.. ..$ ImageUrl : chr "/media/19633/btc.png"
.. ..$ Name : chr "BTC"
.. ..$ CoinName : chr "Bitcoin"
.. ..$ FullName : chr "Bitcoin (BTC)"
.. ..$ Algorithm : chr "SHA256"
# ------snipped the many, many pages of output that followed---------
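To illustrate why the shape of the JSON matters, here is a minimal sketch using two made-up JSON strings (not the actual API payload). As far as I can tell, jsonlite simplifies an array of records into a data frame, whereas an object keyed by name, which is how the $Data node above is laid out, comes back as a named list:

```r
library(jsonlite)

# Array of records: fromJSON() simplifies this into a data frame
json_array <- '[{"Id":"4437","Name":"EAC"},{"Id":"4438","Name":"EFL"}]'
from_array <- fromJSON(json_array)
class(from_array)

# Object keyed by coin symbol: this stays a named list of lists
json_object <- '{"EAC":{"Id":"4437","Name":"EAC"},"EFL":{"Id":"4438","Name":"EFL"}}'
from_object <- fromJSON(json_object)
class(from_object)
names(from_object)
```

So the simplify* arguments are not being ignored; there is simply no JSON array of records here for them to act on.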
Furthermore, the elements of the $Data node of that list have irregular lengths, so coercing to a data frame in one step might be difficult:
> table( sapply(data$Data, length))
12 13 14
2 478 12
After loading the plyr package, which provides a useful function to rbind similar but not identical data frames, I'm able to construct a useful starting point for further analysis:
require(plyr)
money <- do.call(rbind.fill, lapply( data$Data, data.frame, stringsAsFactors=FALSE))
str(money)
#------------
'data.frame': 492 obs. of 14 variables:
$ Id : chr "1182" "3808" "3807" "5038" ...
$ Url : chr "/coins/btc/overview" "/coins/ltc/overview" "/coins/dash/overview" "/coins/xmr/overview" ...
$ ImageUrl : chr "/media/19633/btc.png" "/media/19782/ltc.png" "/media/20626/dash.png" "/media/19969/xmr.png" ...
$ Name : chr "BTC" "LTC" "DASH" "XMR" ...
$ CoinName : chr "Bitcoin" "Litecoin" "DigitalCash" "Monero" ...
$ FullName : chr "Bitcoin (BTC)" "Litecoin (LTC)" "DigitalCash (DASH)" "Monero (XMR)" ...
$ Algorithm : chr "SHA256" "Scrypt" "X11" "CryptoNight" ...
$ ProofType : chr "PoW" "PoW" "PoW/PoS" "PoW" ...
$ FullyPremined : chr "0" "0" "0" "0" ...
$ TotalCoinSupply : chr "21000000" "84000000" "22000000" "0" ...
$ PreMinedValue : chr "N/A" "N/A" "N/A" "N/A" ...
$ TotalCoinsFreeFloat: chr "N/A" "N/A" "N/A" "N/A" ...
$ SortOrder : chr "1" "3" "4" "5" ...
$ TotalCoinsMined : chr NA NA NA NA ...
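If you would rather not load plyr, roughly the same result can be sketched in base R by padding every record to a common set of fields before binding. The records list below is a tiny made-up stand-in for data$Data (the XYZ coin and the TotalCoinsMined value are invented purely for illustration):

```r
# Toy stand-in for data$Data: named records whose fields do not all match
records <- list(
  BTC = list(Id = "1182", Name = "BTC", Algorithm = "SHA256"),
  LTC = list(Id = "3808", Name = "LTC", Algorithm = "Scrypt",
             TotalCoinsMined = "100"),
  XYZ = list(Id = "9999", Name = "XYZ")
)

# Union of all field names seen in any record
all_fields <- unique(unlist(lapply(records, names)))

# Pad each record to the full field set, filling gaps with NA
rows <- lapply(records, function(rec) {
  vals <- rec[all_fields]                  # absent fields come back as NULL
  names(vals) <- all_fields
  vals[vapply(vals, is.null, logical(1))] <- NA_character_
  as.data.frame(vals, stringsAsFactors = FALSE)
})

coins <- do.call(rbind, rows)
rownames(coins) <- names(records)          # index rows by coin symbol
coins
```

This is only a sketch of the padding idea; rbind.fill does the same job with less code on the real data.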
If you wanted to be able to access the rows by way of the abbreviations for those crypto-currencies, you could do:
rownames(money) <- names(data$Data)
Which now lets you do this:
> money[ "BTC", ]
Id Url ImageUrl Name CoinName
BTC 1182 /coins/btc/overview /media/19633/btc.png BTC Bitcoin
FullName Algorithm ProofType FullyPremined TotalCoinSupply
BTC Bitcoin (BTC) SHA256 PoW 0 21000000
PreMinedValue TotalCoinsFreeFloat SortOrder TotalCoinsMined
BTC N/A N/A 1 <NA>
Where before access would have been a bit more clunky:
> money[ money$Name=="BTC", ]

I am answering my own question because, as already said in the comment section, the returned object is already in its simplest form. jsonlite probably cannot create a data frame from nested lists.
The solution I found is to use unlist and data.frame like this:
> df <- data.frame(unlist(data))
> class(df)
[1] "data.frame"
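Note that data.frame(unlist(...)) gives a single column whose row names are the dotted paths into the list. A minimal sketch on a small made-up list shaped like the API response (the name parsed is hypothetical), showing two ways to get at the Name values:

```r
# Toy list shaped like the parsed API response
parsed <- list(
  Response = "Success",
  Data = list(
    EAC = list(Id = "4437", Name = "EAC"),
    EFL = list(Id = "4438", Name = "EFL")
  )
)

# Option 1: pull the Name field straight out of each record
coin_names <- vapply(parsed$Data, function(coin) coin$Name, character(1))

# Option 2: flatten everything, then pick rows whose path ends in ".Name"
flat <- data.frame(value = unlist(parsed), stringsAsFactors = FALSE)
name_rows <- grepl("\\.Name$", rownames(flat))
flat$value[name_rows]
```

Option 1 keeps the coin symbols as names on the result, which is usually easier to work with than filtering row names.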

Related

Plotly in R: How to reference and extract figure values?

I want to know how can I access, extract, and reference values from a plotly figure in R.
Consider, for example, the Sankey diagram from plotly's own site of which there is an abbreviated version here:
library(plotly)
fig <- plot_ly(
type = "sankey",
node = list(
label = c("A1", "A2", "B1", "B2", "C1", "C2"),
color = c("blue", "blue", "blue", "blue", "blue", "blue"),
line = list()
),
link = list(
source = c(0,1,0,2,3,3),
target = c(2,3,3,4,4,5),
value = c(8,4,2,8,4,2)
)
)
fig
If I do View(fig) in RStudio, a new tab opens titled "." (I don't know why this instead of 'fig'). In this tab I can go to x > visdat > 'string of letters and numbers that is a function?' > attrs > node > x (as shown below).
Here all the x coordinates for the Sankey nodes appear.
I want to access these values so I can use them somewhere else. How do I do this? If I click on the right side of the RStudio tab to copy the code to console I get:
environment(.[["x"]][["visdat"]][["484c3ec36899"]])[["attrs"]][["node"]][["x"]]
which obviously doesn't work as there is no object named '.'.
In this case I have tried fig$x$visdat$`484c3ec36899`() but I can't do fig$x$visdat$`484c3ec36899`()$attr, and I don't know what else to do.
So, how can I access any value from a plotly object? Any documentation referencing this topic would also be helpful.
Thanks.
You can find the documentation of the data structure of plotly in R here: https://plotly.com/r/figure-structure/
To check the data structure you can use str(fig):
List of 8
$ x :List of 6
..$ visdat :List of 1
.. ..$ a3b8795a4:function ()
..$ cur_data: chr "a3b8795a4"
..$ attrs :List of 1
.. ..$ a3b8795a4:List of 6
.. .. ..$ node :List of 3
.. .. .. ..$ label: chr [1:6] "A1" "A2" "B1" "B2" ...
.. .. .. ..$ color: chr [1:6] "blue" "blue" "blue" "blue" ...
.. .. .. ..$ line : list()
.. .. ..$ link :List of 3
.. .. .. ..$ source: num [1:6] 0 1 0 2 3 3
.. .. .. ..$ target: num [1:6] 2 3 3 4 4 5
.. .. .. ..$ value : num [1:6] 8 4 2 8 4 2
.. .. ..$ alpha_stroke: num 1
.. .. ..$ sizes : num [1:2] 10 100
.. .. ..$ spans : num [1:2] 1 20
.. .. ..$ type : chr "sankey"
..$ layout :List of 3
.. ..$ width : NULL
.. ..$ height: NULL
.. ..$ margin:List of 4
.. .. ..$ b: num 40
.. .. ..$ l: num 60
.. .. ..$ t: num 25
.. .. ..$ r: num 10
..$ source : chr "A"
..$ config :List of 1
.. ..$ showSendToCloud: logi FALSE
..- attr(*, "TOJSON_FUNC")=function (x, ...)
$ width : NULL
$ height : NULL
$ sizingPolicy :List of 6
..$ defaultWidth : chr "100%"
..$ defaultHeight: num 400
..$ padding : NULL
..$ viewer :List of 6
.. ..$ defaultWidth : NULL
.. ..$ defaultHeight: NULL
.. ..$ padding : NULL
.. ..$ fill : logi TRUE
.. ..$ suppress : logi FALSE
.. ..$ paneHeight : NULL
..$ browser :List of 5
.. ..$ defaultWidth : NULL
.. ..$ defaultHeight: NULL
.. ..$ padding : NULL
.. ..$ fill : logi TRUE
.. ..$ external : logi FALSE
..$ knitr :List of 3
.. ..$ defaultWidth : NULL
.. ..$ defaultHeight: NULL
.. ..$ figure : logi TRUE
$ dependencies :List of 5
..$ :List of 10
.. ..$ name : chr "typedarray"
.. ..$ version : chr "0.1"
.. ..$ src :List of 1
.. .. ..$ file: chr "htmlwidgets/lib/typedarray"
.. ..$ meta : NULL
.. ..$ script : chr "typedarray.min.js"
.. ..$ stylesheet: NULL
.. ..$ head : NULL
.. ..$ attachment: NULL
.. ..$ package : chr "plotly"
.. ..$ all_files : logi FALSE
.. ..- attr(*, "class")= chr "html_dependency"
..$ :List of 10
.. ..$ name : chr "jquery"
.. ..$ version : chr "1.11.3"
.. ..$ src :List of 1
.. .. ..$ file: chr "lib/jquery"
.. ..$ meta : NULL
.. ..$ script : chr "jquery.min.js"
.. ..$ stylesheet: NULL
.. ..$ head : NULL
.. ..$ attachment: NULL
.. ..$ package : chr "crosstalk"
.. ..$ all_files : logi TRUE
.. ..- attr(*, "class")= chr "html_dependency"
..$ :List of 10
.. ..$ name : chr "crosstalk"
.. ..$ version : chr "1.1.0.1"
.. ..$ src :List of 1
.. .. ..$ file: chr "www"
.. ..$ meta : NULL
.. ..$ script : chr "js/crosstalk.min.js"
.. ..$ stylesheet: chr "css/crosstalk.css"
.. ..$ head : NULL
.. ..$ attachment: NULL
.. ..$ package : chr "crosstalk"
.. ..$ all_files : logi TRUE
.. ..- attr(*, "class")= chr "html_dependency"
..$ :List of 10
.. ..$ name : chr "plotly-htmlwidgets-css"
.. ..$ version : chr "1.52.2"
.. ..$ src :List of 1
.. .. ..$ file: chr "htmlwidgets/lib/plotlyjs"
.. ..$ meta : NULL
.. ..$ script : NULL
.. ..$ stylesheet: chr "plotly-htmlwidgets.css"
.. ..$ head : NULL
.. ..$ attachment: NULL
.. ..$ package : chr "plotly"
.. ..$ all_files : logi FALSE
.. ..- attr(*, "class")= chr "html_dependency"
..$ :List of 10
.. ..$ name : chr "plotly-main"
.. ..$ version : chr "1.52.2"
.. ..$ src :List of 1
.. .. ..$ file: chr "htmlwidgets/lib/plotlyjs"
.. ..$ meta : NULL
.. ..$ script : chr "plotly-latest.min.js"
.. ..$ stylesheet: NULL
.. ..$ head : NULL
.. ..$ attachment: NULL
.. ..$ package : chr "plotly"
.. ..$ all_files : logi FALSE
.. ..- attr(*, "class")= chr "html_dependency"
$ elementId : NULL
$ preRenderHook:function (p, registerFrames = TRUE)
$ jsHooks : list()
- attr(*, "class")= chr [1:2] "plotly" "htmlwidget"
- attr(*, "package")= chr "plotly"
You could extract the coordinates with:
unlist(fig$x$attrs)
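For more targeted access, the str() output above suggests that the trace attributes sit under fig$x$attrs, keyed by the id stored in fig$x$cur_data (the id differs between sessions). A minimal sketch, using a plain list called fig_like (a hypothetical name) that only mimics that structure so it runs without plotly:

```r
# Plain list mimicking the relevant part of str(fig) shown above
fig_like <- list(
  x = list(
    cur_data = "a3b8795a4",
    attrs = list(
      a3b8795a4 = list(
        node = list(label = c("A1", "A2", "B1", "B2", "C1", "C2")),
        link = list(
          source = c(0, 1, 0, 2, 3, 3),
          target = c(2, 3, 3, 4, 4, 5),
          value  = c(8, 4, 2, 8, 4, 2)
        ),
        type = "sankey"
      )
    )
  )
)

# Look the trace up by its id instead of hard-coding the hash
trace_id <- fig_like$x$cur_data
trace <- fig_like$x$attrs[[trace_id]]

node_labels <- trace$node$label        # character vector of node labels
links <- as.data.frame(trace$link)     # one row per link
links
```

On a real figure the same indexing should work with fig in place of fig_like, assuming the structure matches the str() output above.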

R: Loaded tweets structure is untidy when str()

Unlike on my colleague's computer, when I load the tweets in R and inspect the structure with str(), the output appears in a messy way with a lot of dots rather than being organized as a table, even though the code is the same. I can't understand what the problem is; we have the same packages installed and the same R version.
library(rtweet)
library(ggplot2)
library(dplyr)
library(tibble)
library(tidytext)
library(stringr)
library(stringi)
library(igraph)
library(ggraph)
library(readr)
library(lubridate)
library(zoo)
appname <- ""
key <- ""
secret <- ""
twitter_token <- create_token( app = "", consumer_key = "", consumer_secret = "", access_token = "", access_secret = "")
tweets <- search_tweets(q = "#water + #climatechange", n = 10000, lang = "en", include_rts = FALSE)
str(tweets)
.. ..$ media :'data.frame': 1 obs. of 11 variables:
.. .. ..$ id : num 1.57e+18
.. .. ..$ id_str : chr "1573815153484759040"
.. .. ..$ indices :List of 1
.. .. .. ..$ :'data.frame': 1 obs. of 2 variables:
.. .. .. .. ..$ start: int 241
.. .. .. .. ..$ end : int 264
.. .. .. ..- attr(*, "class")= chr "AsIs"
.. .. ..$ media_url : chr "http://pbs.twimg.com/media/FddQiy2WAAAl59Q.jpg"
.. .. ..$ media_url_https: chr "https://pbs.twimg.com/media/FddQiy2WAAAl59Q.jpg"
.. .. ..$ url : chr "https
.. .. ..$ display_url : chr "pic.twitter.com/iFJTkF1S9S"
.. .. ..$ expanded_url : chr "https://twitter.com/TreeBanker/status/1573815156768968706/photo/1"
.. .. ..$ type : chr "photo"
.. .. ..$ sizes :List of 1
.. .. .. ..$ :'data.frame': 4 obs. of 4 variables:
.. .. .. .. ..$ w : int [1:4] 1096 680 150 1096
.. .. .. .. ..$ h : int [1:4] 733 455 150 733
.. .. .. .. ..$ resize: chr [1:4] "fit" "fit" "crop" "fit"
.. .. .. .. ..$ type : chr [1:4] "large" "small" "thumb" "medium"
.. .. ..$ ext_alt_text : logi NA
..$ :List of 5
.. ..$ media :'data.frame': 1 obs. of 11 variables:
.. .. ..$ id : num 1.57e+18
.. .. ..$ id_str : chr "1573815153484759040"
.. .. ..$ indices :List of 1
.. .. .. ..$ :'data.frame': 1 obs. of 2 variables:

R: Convert large list to data frame

I have a large list (of 10 elements) called res, as shown below. Please note that I only show 3 of the elements so the post isn't too long.
> str(res)
List of 10
$ :'data.frame': 1 obs. of 13 variables:
..$ id : chr "121040004071"
..$ province : chr "Castellón/Castelló"
..$ comunidadAutonoma: chr "Comunitat Valenciana"
..$ muni : chr "Segorbe"
..$ type : chr "portal"
..$ address : chr "A-23"
..$ geom : chr "POINT(-0.428888910999945 39.806487449)"
..$ lat : num 39.8
..$ lng : num -0.429
..$ portalNumber : chr "23"
..$ stateMsg : chr "Resultado exacto de la búsqueda"
..$ state : chr "1"
..$ countryCode : chr "011"
$ :'data.frame': 1 obs. of 13 variables:
..$ id : chr "121040004071"
..$ province : chr "Castellón/Castelló"
..$ comunidadAutonoma: chr "Comunitat Valenciana"
..$ muni : chr "Segorbe"
..$ type : chr "portal"
..$ address : chr "A-23"
..$ geom : chr "POINT(-0.428888910999945 39.806487449)"
..$ lat : num 39.8
..$ lng : num -0.429
..$ portalNumber : chr "23"
..$ stateMsg : chr "Resultado exacto de la búsqueda"
..$ state : chr "1"
..$ countryCode : chr "011"
$ :'data.frame': 1 obs. of 13 variables:
..$ id : chr "121040004071"
..$ province : chr "Castellón/Castelló"
..$ comunidadAutonoma: chr "Comunitat Valenciana"
..$ muni : chr "Segorbe"
..$ type : chr "portal"
..$ address : chr "A-23"
..$ geom : chr "POINT(-0.428888910999945 39.806487449)"
..$ lat : num 39.8
..$ lng : num -0.429
..$ portalNumber : chr "23"
..$ stateMsg : chr "Resultado exacto de la búsqueda"
..$ state : chr "1"
..$ countryCode : chr "011"
Each observation corresponds to a certain address in the city of Valencia, Spain. After geocoding my 10 addresses, I ended up with 13 variables for each address containing information about longitude, latitude, province, etc.
I would like to make it a data frame so that for every row we have the main $:'data.frame and the rest of ..$ x are the variables/columns.
Thanks for your help
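You can use the following functions (list_data stands for your list, res in the question).
Map, to make sure every element is a data frame:
list_data <- Map(as.data.frame, list_data)
rbindlist from the data.table package, to stack them into a single table:
library(data.table)
datarbind <- rbindlist(list_data)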
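If installing data.table is not an option, a base R sketch of the same idea is do.call(rbind, ...). The list below is a cut-down stand-in for res with only four of the 13 columns, and the second row is made up for illustration:

```r
# Cut-down stand-in for res: a list of one-row data frames
res <- list(
  data.frame(id = "121040004071", muni = "Segorbe",
             lat = 39.8, lng = -0.429, stringsAsFactors = FALSE),
  data.frame(id = "000000000001", muni = "Valencia",   # invented row
             lat = 39.47, lng = -0.376, stringsAsFactors = FALSE)
)

# Stack all elements row-wise into a single data frame
addresses <- do.call(rbind, res)
str(addresses)
```

This assumes every element has the same columns, as in the str(res) output above.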

How do I convert a JSON file to a data frame in R?

Link to data.
For my purposes, I downloaded the data from the above link and saved it as a JSON file.
json_convert <- do.call(rbind, lapply(paste(readLines("Myfile.json", warn=TRUE),
collapse=""),
jsonlite::fromJSON))
So far, I have managed to code the above. However, I am confused as to how I can convert this into a data frame. All help is appreciated.
Let's start by examining the data structure:
library(purrr)
library(tibble)
library(jsonlite)
my_json <- fromJSON("Myfile.json")
str(my_json)
List of 3
$ resource : chr "shotchartdetail"
$ parameters:List of 30
..$ LeagueID : chr "00"
..$ Season : chr "2017-18"
..$ SeasonType : chr "Regular Season"
..$ TeamID : int 1610612750
..$ PlayerID : int 0
..$ GameID : NULL
..$ Outcome : NULL
..$ Location : NULL
..$ Month : int 0
..$ SeasonSegment : NULL
..$ DateFrom : NULL
..$ DateTo : NULL
..$ OpponentTeamID: int 0
..$ VsConference : NULL
..$ VsDivision : NULL
..$ Position : NULL
..$ RookieYear : NULL
..$ GameSegment : NULL
..$ Period : int 0
..$ LastNGames : int 0
..$ ClutchTime : NULL
..$ AheadBehind : NULL
..$ PointDiff : NULL
..$ RangeType : int 0
..$ StartPeriod : int 1
..$ EndPeriod : int 10
..$ StartRange : int 0
..$ EndRange : int 28800
..$ ContextFilter : chr "SEASON_YEAR='2017-18'"
..$ ContextMeasure: chr "FGA"
$ resultSets:'data.frame': 2 obs. of 3 variables:
..$ name : chr [1:2] "Shot_Chart_Detail" "LeagueAverages"
..$ headers:List of 2
.. ..$ : chr [1:24] "GRID_TYPE" "GAME_ID" "GAME_EVENT_ID" "PLAYER_ID" ...
.. ..$ : chr [1:7] "GRID_TYPE" "SHOT_ZONE_BASIC" "SHOT_ZONE_AREA" "SHOT_ZONE_RANGE" ...
..$ rowSet :List of 2
.. ..$ : chr [1:7063, 1:24] "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" ...
.. ..$ : chr [1:20, 1:7] "League Averages" "League Averages" "League Averages" "League Averages" ...
Now you have to decide what it is that you want in your data frame.
I would assume that player statistics are in the first element of $rowSet (1:7063 = rows, 1:24 = columns) and the headers for those columns are in the first element of $resultSets$headers (1:24).
I'm sure there's a very elegant way to use the map functions in purrr. This isn't it, but it works:
my_list <- my_json %>%
flatten()
my_df <- my_list$rowSet[[1]] %>%
as.tibble() %>%
setNames(my_list$headers[[1]])
str(my_df)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 7063 obs. of 24 variables:
$ GRID_TYPE : chr "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" ...
$ GAME_ID : chr "0021700011" "0021700011" "0021700011" "0021700011" ...
$ GAME_EVENT_ID : chr "10" "12" "16" "21" ...
$ PLAYER_ID : chr "1626157" "202710" "202710" "201959" ...
$ PLAYER_NAME : chr "Karl-Anthony Towns" "Jimmy Butler" "Jimmy Butler" "Taj Gibson" ...
$ TEAM_ID : chr "1610612750" "1610612750" "1610612750" "1610612750" ...
$ TEAM_NAME : chr "Minnesota Timberwolves" "Minnesota Timberwolves" "Minnesota Timberwolves" "Minnesota Timberwolves" ...
$ PERIOD : chr "1" "1" "1" "1" ...
$ MINUTES_REMAINING : chr "11" "11" "10" "10" ...
$ SECONDS_REMAINING : chr "14" "9" "32" "21" ...
$ EVENT_TYPE : chr "Missed Shot" "Made Shot" "Missed Shot" "Missed Shot" ...
$ ACTION_TYPE : chr "Jump Shot" "Jump Shot" "Driving Reverse Layup Shot" "Jump Shot" ...
$ SHOT_TYPE : chr "2PT Field Goal" "3PT Field Goal" "2PT Field Goal" "3PT Field Goal" ...
$ SHOT_ZONE_BASIC : chr "Mid-Range" "Above the Break 3" "Restricted Area" "Left Corner 3" ...
$ SHOT_ZONE_AREA : chr "Left Side Center(LC)" "Right Side Center(RC)" "Center(C)" "Left Side(L)" ...
$ SHOT_ZONE_RANGE : chr "16-24 ft." "24+ ft." "Less Than 8 ft." "24+ ft." ...
$ SHOT_DISTANCE : chr "20" "25" "1" "22" ...
$ LOC_X : chr "-113" "199" "-11" "-225" ...
$ LOC_Y : chr "169" "152" "6" "16" ...
$ SHOT_ATTEMPTED_FLAG: chr "1" "1" "1" "1" ...
$ SHOT_MADE_FLAG : chr "0" "1" "0" "0" ...
$ GAME_DATE : chr "20171018" "20171018" "20171018" "20171018" ...
$ HTM : chr "SAS" "SAS" "SAS" "SAS" ...
$ VTM : chr "MIN" "MIN" "MIN" "MIN" ...
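Every column above comes out as character because rowSet is a character matrix. A minimal base R sketch of the same matrix-plus-headers step, followed by a type conversion, on a two-row, four-column excerpt of the values shown above (the names row_set and shots are hypothetical):

```r
# Small excerpt standing in for my_list$headers[[1]] and my_list$rowSet[[1]]
headers <- c("GRID_TYPE", "PLAYER_NAME", "SHOT_DISTANCE", "SHOT_MADE_FLAG")
row_set <- matrix(
  c("Shot Chart Detail", "Karl-Anthony Towns", "20", "0",
    "Shot Chart Detail", "Jimmy Butler",       "25", "1"),
  ncol = 4, byrow = TRUE
)

# Matrix -> data frame, then attach the header names
shots <- as.data.frame(row_set, stringsAsFactors = FALSE)
names(shots) <- headers

# Let R guess a better type for each column (numbers become integer/double)
shots[] <- lapply(shots, type.convert, as.is = TRUE)
str(shots)
```

Be careful with identifier columns such as GAME_ID: converting "0021700011" to a number would drop the leading zeros, so you may prefer to convert only selected columns.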

r: merge list of unnamed data sets

After importing data from a JSON stream, I have a list of 621 lists, each holding the same 22 variables.
List of 621
$ :List of 22
..$ _id : chr "55c79e711cbee48856a30886"
..$ number : num 1
..$ country : chr "Yemen"
..$ date : chr "2002-11-03T00:00:00.000Z"
..$ narrative : chr ""
..$ town : chr ""
..$ location : chr ""
..$ deaths : chr "6"
..$ deaths_min : chr "6"
..$ deaths_max : chr "6"
..$ civilians : chr "0"
..$ injuries : chr ""
..$ children : chr ""
..$ tweet_id : chr "278544689483890688"
..$ bureau_id : chr "YEM001"
..$ bij_summary_short: chr ""
..$ bij_link : chr ""
..$ target : chr ""
..$ lat : chr "15.47467"
..$ lon : chr "45.322755"
..$ articles : list()
..$ names : chr ""| __truncated__
$ :List of 22
..$ _id : chr "55c79e711cbee48856a30887"
..$ number : num 2
..$ country : chr "Pakistan"
..$ date : chr "2004-06-17T00:00:00.000Z"
..$ narrative : chr ""
..$ town : chr ""
..$ location : chr ""
..$ deaths : chr "6-8"
..$ deaths_min : chr "6"
..$ deaths_max : chr "8"
..$ civilians : chr "2"
..$ injuries : chr "1"
..$ children : chr "2"
..$ tweet_id : chr "278544750867533824"
..$ bureau_id : chr "B1"
..$ bij_summary_short: chr ""| __truncated__
..$ bij_link : chr ""
..$ target : chr ""
..$ lat : chr "32.30512565"
..$ lon : chr "69.57624435"
..$ articles : list()
..$ names : chr ""
...
How can I combine these lists into one data frame of 621 observations of 22 variables? Notice that all 621 lists are unnamed.
edit: Per request, here is how I got this data set:
library(rjson)
url <- 'http://api.dronestre.am/data'
document <- fromJSON(file=url, method='C')
str(document$strike)
Can you provide an example of how you generated the data? I did not test this answer, but the following should help. If you update the question to show how you obtained the data, I can try it out.
Update:
library(rjson)
library(data.table)
library(dplyr)
url <- 'http://api.dronestre.am/data'
document <- fromJSON(file=url, method='C')
is(document)
listdata<- document$strike
df<-do.call(rbind,listdata) %>% as.data.table
dim(df)
purrr has a useful transpose function which 'inverts' a list. The $articles element causes trouble: it appears always to be empty, and it scuppers you when you try to convert to a data.frame, so I've dropped it.
library(purrr)
df <- transpose(document$strike) %>%
t %>%
apply(FUN = unlist, MARGIN = 2)
df <- df[-21] %>% data.frame %>% tbl_df
df
Source: local data frame [621 x 21]
X_id number country date
(fctr) (dbl) (fctr) (fctr)
1 55c79e711cbee48856a30886 1 Yemen 2002-11-03T00:00:00.000Z
2 55c79e711cbee48856a30887 2 Pakistan 2004-06-17T00:00:00.000Z
3 55c79e711cbee48856a30888 3 Pakistan 2005-05-08T00:00:00.000Z
4 55c79e721cbee48856a30889 4 Pakistan 2005-11-05T00:00:00.000Z
5 55c79e721cbee48856a3088a 5 Pakistan 2005-12-01T00:00:00.000Z
6 55c79e721cbee48856a3088b 6 Pakistan 2006-01-06T00:00:00.000Z
7 55c79e721cbee48856a3088c 7 Pakistan 2006-01-13T00:00:00.000Z
8 55c79e721cbee48856a3088d 8 Pakistan 2006-10-30T00:00:00.000Z
9 55c79e721cbee48856a3088e 9 Pakistan 2007-01-16T00:00:00.000Z
10 55c79e721cbee48856a3088f 10 Pakistan 2007-04-27T00:00:00.000Z
.. ... ... ... ...
Variables not shown: narrative (fctr), town (fctr), location (fctr), deaths
(fctr), deaths_min (fctr), deaths_max (fctr), civilians (fctr), injuries
(fctr), children (fctr), tweet_id (fctr), bureau_id (fctr), bij_summary_short
(fctr), bij_link (fctr), target (fctr), lat (fctr), lon (fctr), names (fctr)
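For comparison, here is a base R sketch of the same idea (drop the empty articles element, turn each record into a one-row data frame, then bind). The strikes list is a two-record stand-in for document$strike with only a few of the 22 fields:

```r
# Two-record stand-in for document$strike, keeping only a few fields
strikes <- list(
  list(`_id` = "55c79e711cbee48856a30886", number = 1,
       country = "Yemen", deaths = "6", articles = list()),
  list(`_id` = "55c79e711cbee48856a30887", number = 2,
       country = "Pakistan", deaths = "6-8", articles = list())
)

rows <- lapply(strikes, function(s) {
  s$articles <- NULL                         # empty list cannot become a column
  as.data.frame(s, stringsAsFactors = FALSE) # "_id" is renamed to "X_id"
})

strike_df <- do.call(rbind, rows)
str(strike_df)
```

With stringsAsFactors = FALSE the text columns stay character rather than becoming factors as in the output above.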
