Extract attributes in XML using R

I'm trying to extract two attributes from the XML extract below (taken from a large XML file), namely 'nmRegime' and 'CalendarSystemT' (this is the date). Once extracted, those two records need to be saved as two columns of a data frame in R, along with the filename.
There are several 'event' nodes within one given XML file and there are nearly 100 individual XML files.
<Event tEV="FirA" clearEV="false" onEV="true" dateOriginEV="Calendar" nYrsFromStEV="" nDaysFromStEV="" tFaqEV="Blank" tAaqEV="Blank" aqStYrEV="0" aqEnYrEV="0" nmEV="Fire_Cool" categoryEV="CatUndef" tEvent="Doc" idSP="105" nmRegime="Wheat, Tilled, stubble cool burn" regimeInstance="1">
<notesEV></notesEV>
<dateEV CalendarSystemT="FixedLength">19710331</dateEV>
<FirA fracAfctFirA="0.6" fracGbfrToAtmsFirA="0.98" fracStlkToAtmsFirA="0.98" fracLeafToAtmsFirA="0.98" fracGbfrToGlitFirA="0.02" fracStlkToSlitFirA="0.02" fracLeafToLlitFirA="0.02" fracCortToCodrFirA="1.0" fracFirtToFidrFirA="1.0" fracDGlitToAtmsFirA="0.931" fracRGlitToAtmsFirA="0.931" fracDSlitToAtmsFirA="0.931" fracRSlitToAtmsFirA="0.931" fracDLlitToAtmsFirA="0.931" fracRLlitToAtmsFirA="0.931" fracDCodrToAtmsFirA="0.0" fracRCodrToAtmsFirA="0.0" fracDFidrToAtmsFirA="0.0" fracRFidrToAtmsFirA="0.0" fracDGlitToInrtFirA="0.019" fracRGlitToInrtFirA="0.019" fracDSlitToInrtFirA="0.019" fracRSlitToInrtFirA="0.019" fracDLlitToInrtFirA="0.019" fracRLlitToInrtFirA="0.019" fracDCodrToInrtFirA="0.0" fracRCodrToInrtFirA="0.0" fracDFidrToInrtFirA="0.0" fracRFidrToInrtFirA="0.0" fracSopmToAtmsFirA="" fracLrpmToAtmsFirA="" fracMrpmToAtmsFirA="" fracSommToAtmsFirA="" fracLrmmToAtmsFirA="" fracMrmmToAtmsFirA="" fracMicrToAtmsFirA="" fracSopmToInrtFirA="" fracLrpmToInrtFirA="" fracMrpmToInrtFirA="" fracSommToInrtFirA="" fracLrmmToInrtFirA="" fracMrmmToInrtFirA="" fracMicrToInrtFirA="" fracMnamNToAtmsFirA="" fracSAmmNToAtmsFirA="" fracSNtrNToAtmsFirA="" fracDAmmNToAtmsFirA="" fracDNtrNToAtmsFirA="" fixFirA="" phaFirA="" />
</Event>
I had some success extracting 'nmRegime' but no success with 'CalendarSystemT'. I used the code below for the data extraction.
The second question: is there a way to loop over the list of XML files and do this operation for each one?
library(xml2)
# xml holds one file, read in beforehand, e.g. xml <- read_xml("file.xml")
# get the Event records
recs <- xml_find_all(xml, "//Event")
#extract the names
labs <- trimws(xml_attr(recs, "nmRegime"))
names <- labs[!is.na(labs)]
# Extract the date
recs_t <- xml_find_all(xml, "//Event/dateEV")
time <- trimws(xml_attr(recs_t, "CalendarSystemT"))

The calendar date value is not an attribute: it is stored as the text content of the dateEV node and is accessed with xml_text(). The CalendarSystemT attribute only names the calendar system ("FixedLength").
Also note that if an Event node is missing a dateEV child, there will be problems aligning the "labs" with the "time". It is better to extract the date from each parent Event node rather than querying the entire document.
library(xml2)
library(dplyr)
xml<- read_xml('<Event tEV="FirA" clearEV="false" onEV="true" dateOriginEV="Calendar" nYrsFromStEV="" nDaysFromStEV="" tFaqEV="Blank" tAaqEV="Blank" aqStYrEV="0" aqEnYrEV="0" nmEV="Fire_Cool" categoryEV="CatUndef" tEvent="Doc" idSP="105" nmRegime="Wheat, Tilled, stubble cool burn" regimeInstance="1">
<notesEV></notesEV>
<dateEV CalendarSystemT="FixedLength">19710331</dateEV>
<FirA fracAfctFirA="0.6" fracGbfrToAtmsFirA="0.98" fracStlkToAtmsFirA="0.98" fracLeafToAtmsFirA="0.98" fracGbfrToGlitFirA="0.02" fracStlkToSlitFirA="0.02" fracLeafToLlitFirA="0.02" fracCortToCodrFirA="1.0" fracFirtToFidrFirA="1.0" fracDGlitToAtmsFirA="0.931" fracRGlitToAtmsFirA="0.931" fracDSlitToAtmsFirA="0.931" fracRSlitToAtmsFirA="0.931" fracDLlitToAtmsFirA="0.931" fracRLlitToAtmsFirA="0.931" fracDCodrToAtmsFirA="0.0" fracRCodrToAtmsFirA="0.0" fracDFidrToAtmsFirA="0.0" fracRFidrToAtmsFirA="0.0" fracDGlitToInrtFirA="0.019" fracRGlitToInrtFirA="0.019" fracDSlitToInrtFirA="0.019" fracRSlitToInrtFirA="0.019" fracDLlitToInrtFirA="0.019" fracRLlitToInrtFirA="0.019" fracDCodrToInrtFirA="0.0" fracRCodrToInrtFirA="0.0" fracDFidrToInrtFirA="0.0" fracRFidrToInrtFirA="0.0" fracSopmToAtmsFirA="" fracLrpmToAtmsFirA="" fracMrpmToAtmsFirA="" fracSommToAtmsFirA="" fracLrmmToAtmsFirA="" fracMrmmToAtmsFirA="" fracMicrToAtmsFirA="" fracSopmToInrtFirA="" fracLrpmToInrtFirA="" fracMrpmToInrtFirA="" fracSommToInrtFirA="" fracLrmmToInrtFirA="" fracMrmmToInrtFirA="" fracMicrToInrtFirA="" fracMnamNToAtmsFirA="" fracSAmmNToAtmsFirA="" fracSNtrNToAtmsFirA="" fracDAmmNToAtmsFirA="" fracDNtrNToAtmsFirA="" fixFirA="" phaFirA="" />
</Event>')
recs <- xml_find_all(xml, "//Event")
#extract the names
labs <- trimws(xml_attr(recs, "nmRegime"))
names <- labs[!is.na(labs)]
# Extract the date
time <- xml_find_first(recs, ".//dateEV") %>% xml_text() %>% trimws()
To answer your second question: yes, you can wrap the above script in a function and then use lapply to loop through your entire list of files.
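For example, a minimal sketch (the folder path and file pattern are placeholders; adjust them to your setup):
library(xml2)
library(dplyr)
extract_events <- function(file) {
  xml  <- read_xml(file)
  recs <- xml_find_all(xml, "//Event")
  if (length(recs) == 0) return(NULL)  # skip files without Event nodes
  data.frame(
    filename = basename(file),
    nmRegime = trimws(xml_attr(recs, "nmRegime")),
    date     = trimws(xml_text(xml_find_first(recs, ".//dateEV"))),
    stringsAsFactors = FALSE
  )
}
files  <- list.files("path/to/xml_folder", pattern = "\\.xml$", full.names = TRUE)
result <- bind_rows(lapply(files, extract_events))
Because the date is pulled from inside each Event, the rows stay aligned even when a dateEV is missing (it simply comes back as NA).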
See this question and answer for details: R XML - combining parent and child nodes(w same name) into data frame

Related

How do I import a file into R with extension .DUSMCPUB?

I’m trying to import the Mortality Multiple Cause Files from the National Center for Health Statistics, located at this link:
https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm#Downloadable
The files have an extension .DUSMCPUB (e.g., the file for 2020 is called "VS20MORT.DUSMCPUB_r20220105"). How do I import such a file? I'm not familiar with the extension.
I have tried to import with the following code, but it causes my R program to terminate. Can you please provide me with a suggestion on how to import these types of files?
VS20MORT <- read_delim("VS20MORT.DUSMCPUB_r20220105")
Thanks @Mel G for sharing this approach. When I tried to run it, I realized that the mortality file includes a few new variables as of 2020 (namely decedent's occupation and industry). Here's a slight variation that includes the new variables.
# Install and load necessary packages
# install.packages("sqldf") # Used to read in DUSMCPUB file
# install.packages("dplyr") # Used for tidy data management
library(sqldf)
library(dplyr)
# Increase memory limit to make space for the large file
# (memory.limit() is Windows-only, and deprecated as of R 4.2)
# memory.limit() # check the current limit
memory.limit(size=20000)
# Create dataframe containing variables for column width, name, and end position
columns <- data.frame(widths=c(19,1,40,2,1,1,2,2,1,4,1,2,2,2,2,1,1,1,16,4,1,1,1,
1,34,1,1,4,3,1,3,3,2,1,2,7,7,7,7,7,7,7,7,7,7,7,7,
7,7,7,7,7,7,7,7,36,2,1,5,5,5,5,5,5,5,5,5,5,5,5,5,
5,5,5,5,5,5,5,1,2,1,1,1,1,33,3,1,1,2,315,4,2,4,2))
columns$names <- c("blank1", # tape locations 1-19
"Resident_Status_US", # tape location 20
"blank2",
"Education_1989",
"Education_2003",
"Education_flag",
"Month_of_Death",
"blank3",
"Sex",
"DetailAge",
"Age_Substitution_Flag",
"Age_Recode_52",
"Age_Recode_27",
"Age_Recode_12",
"Infant_Age_Recode_22",
"Place_of_Death_and_Status",
"Marital_Status",
"Day_of_Week_of_Death",
"blank4",
"Current_Data_Year",
"Injury_at_Work",
"Manner_of_Death",
"Method_of_Disposition",
"Autopsy",
"blank5",
"Activity_Code",
"Place_of_Injury",
"ICD_Code_10",
"Cause_Recode_358",
"blank6",
"Cause_Recode_113",
"Infant_Cause_Recode_130",
"Cause_Recode_39",
"blank7",
"Number_Entity_Axis_Conditions",
"Condition_1EA", "Condition_2EA", "Condition_3EA", "Condition_4EA", "Condition_5EA",
"Condition_6EA", "Condition_7EA", "Condition_8EA", "Condition_9EA", "Condition_10EA",
"Condition_11EA", "Condition_12EA", "Condition_13EA", "Condition_14EA", "Condition_15EA",
"Condition_16EA", "Condition_17EA", "Condition_18EA", "Condition_19EA", "Condition_20EA",
"blank8",
"Number_Record_Axis_Conditions",
"blank9",
"Condition_1RA", "Condition_2RA", "Condition_3RA", "Condition_4RA", "Condition_5RA",
"Condition_6RA", "Condition_7RA", "Condition_8RA", "Condition_9RA", "Condition_10RA",
"Condition_11RA", "Condition_12RA", "Condition_13RA", "Condition_14RA", "Condition_15RA",
"Condition_16RA", "Condition_17RA", "Condition_18RA", "Condition_19RA", "Condition_20RA",
"blank10",
"Race",
"Bridged_Race_Flag",
"Race_Imputation_Flag",
"Race_Recode_3",
"Race_Recode_5",
"blank11",
"Hispanic_Origin",
"blank12",
"Hispanic_Origin_9_Race_Recode",
"Race_Recode_40",
"blank13",
"CensusOcc",
"Occ_26",
"CensusInd",
"Ind_23")
# Read in file using parameters from 'columns' dataframe
mort2020 <- read.fwf("VS20MORT.DUSMCPUB_r20220105", widths=columns$widths, stringsAsFactors=F)
# Attach column names to variables
colnames(mort2020) <- columns$names
# Remove blank variables
mort2020x <- mort2020 %>% dplyr::select(-starts_with("blank"))
Alternatively, it looks like the files are published for most years in a CSV format here: https://www.nber.org/research/data/mortality-data-vital-statistics-nchs-multiple-cause-death-data. 2020 isn’t up yet, but for other years, it can be much faster to read a CSV into R than to use read.fwf.
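For those years, a minimal sketch using data.table (the file name mort2019.csv is a placeholder for whichever NBER CSV you download):
library(data.table)
# fread() is typically much faster than read.fwf()/read.csv() on files this size
mort2019 <- fread("mort2019.csv")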
The data is in the form of a fixed-width file. The user's guide to the data from the National Center for Health Statistics contains the appropriate widths. The answer I present is a modified version of an answer from another forum, posted by @Hack-R:
https://opendata.stackexchange.com/questions/18375/how-can-one-interpret-the-nvss-mortality-multiple-cause-of-death-data-sets
map <- data.frame(widths=c(19, 1,40,2,1,1,2,2,1,1,1,1,1,1,2,2,2,2,1,1,1,16,4,1,1,1,1,34,1,1,4,
3,1,3,3,2,1,2,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
36,2,1,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,1,2,1,1,1,1,33,3,
1,1))
#Set column names
map$cn <- c("blank", # cols 1-19
"res_status", #20
"blank2", # 21-60
"ed_v89",#61-62
"ed_v03",#63
"ed_flag", #64
"death_month", #65-66
"blank3",
"sex",
"age_years",
"age_months",
"age_3",
"age_4",
"age_sub_flag",
"age_recode_52",
"age_recode_27",
"age_recode_12",
"infant_age_recode_22",
"place_of_death",
"marital_status",
"death_day",
"blank4",
"current_year",
"work_injury",
"death_manner",
"disposition",
"autopsy",
"blank5",
"activity_code",
"place_injured",
"icd_cause_of_death",
"cause_recode358",
"blank6",
"cause_recode113",
"infant_cause_recode130",
"cause_recode39",
"blank7",
"num_entity_axis",
"cond1","cond2","cond3","cond4","cond5","cond6","cond7","cond8","cond9","cond10",
"cond11","cond12","cond13","cond14","cond15","cond16","cond17","cond18","cond19",
"cond20",
"blank7",
"num_rec_axis_cond",
"blank8",
"acond1", "acond2", "acond3", "acond4", "acond5", "acond6", "acond7",
"acond8", "acond9", "acond10", "acond11", "acond12", "acond13", "acond14",
"acond15", "acond16", "acond17", "acond18", "acond19", "acond20",
"blank9",
"race",
"bridged_race_flag",
"race_imp_flag",
"race_recode3",
"race_recode5",
"blank10",
"hisp",
"blank11",
"hisp_recode")
# Import the file (read_fwf() comes from the readr package)
library(readr)
mort2020 <- read_fwf("./data/original/VS20MORT.DUSMCPUB_r20220105", fwf_widths(map$widths, map$cn))

xml to R dataframe different levels

I have this xml code:
<?xml version="1.0"?>
-<sct type="detail" version="0.1">
-<serviceLevelSegments timestamp="2017-10-25T11:45:44Z">
<Segment travelTimeMinutes="0.387" c-value="100" score="30" reference="85"
average="77" speed="95" code="447332188"/>
-<Segment travelTimeMinutes="0.597" c-value="100" score="30" reference="85"
average="77" speed="85" code="447332203">
<SubSegment speed="95" PkEnd="782.112" PkIni="781.9" offset="0,212"/>
<SubSegment speed="77" PkEnd="782.746" offset="635,846" pkIni="782.535"/>
</Segment>
</serviceLevelSegments>
</sct>
And I want to obtain a table with all attributes (of all the levels). It would be like this:
Value1 Value2 Value3 ...
timestamp
code
travelTimeMinutes
c-value
score
reference
average
speed (from Segment)
speed (from SubSegment)
PkEnd
PkIni
offset
If some cell is empty, I want it to contain NA.
I have only found ways to obtain a separate table for each level (serviceLevelSegments, Segment, or SubSegment separately). For example, to obtain a table for the SubSegment level I found this code:
library(XML)
arch = "file.xml" #code in xml
input <- xmlParse(arch)
nodes <- getNodeSet(input,"//SubSegment")
all_parameters <- sapply(nodes, xmlAttrs)
all_parameters
But I obtain a table with only these attributes: speed, offset, PkIni, PkEnd. And I need all the attributes.
Thanks in advance.
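A minimal sketch of one way to combine the levels into a single table, assuming the file.xml above (bind_rows() from dplyr fills attributes that are absent from a row with NA):
library(XML)
library(dplyr)
doc <- xmlParse("file.xml")
ts  <- xpathSApply(doc, "//serviceLevelSegments", xmlGetAttr, "timestamp")
# One row per SubSegment (or per Segment when it has none),
# carrying the Segment attributes and the parent timestamp along
rows <- lapply(getNodeSet(doc, "//Segment"), function(seg) {
  seg_df <- as.data.frame(as.list(xmlAttrs(seg)), stringsAsFactors = FALSE)
  subs   <- getNodeSet(seg, "./SubSegment")
  if (length(subs) == 0) return(cbind(timestamp = ts, seg_df))
  sub_df <- bind_rows(lapply(subs, function(s)
    as.data.frame(as.list(xmlAttrs(s)), stringsAsFactors = FALSE)))
  # prefix SubSegment columns so the two 'speed' attributes stay distinct
  names(sub_df) <- paste0("sub_", names(sub_df))
  cbind(timestamp = ts, seg_df, sub_df)
})
result <- bind_rows(rows)  # cells with no attribute become NA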

Nested loop not working to gather data from NOAA

I'm using the R package rnoaa (along with its required dependencies) to gather historical weather data. I wrote this nested loop to gather all the data sets, but I keep getting errors when I run it. It seems to run fine for a second.
The loop:
require('triebeard')
require('bindr')
require('colorspace')
require('mime')
require('curl')
require('openssl')
require('R6')
require('urltools')
require('httpcode')
require('stringr')
require('assertthat')
require('bindrcpp')
require('glue')
require('magrittr')
require('pkgconfig')
require('rlang')
require('Rcpp')
require('BH')
require('plogr')
require('purrr')
require('stringi')
require('tidyselect')
require('digest')
require('gtable')
require('plyr')
require('reshape2')
require('lazyeval')
require('RColorBrewer')
require('dichromat')
require('munsell')
require('labeling')
require('viridisLite')
require('data.table')
require('rjson')
require('httr')
require('crul')
require('lubridate')
require('dplyr')
require('tidyr')
require('ggplot2')
require('scales')
require('XML')
require('xml2')
require('jsonlite')
require('rappdirs')
require('gridExtra')
require('tibble')
require('isdparser')
require('geonames')
require('hoardr')
require('rnoaa')
install.packages('ncdf4')
install.packages("devtools")
library(devtools)
install_github("rnoaa", "ropensci")
library(rnoaa)
list <- buoys(dataset='wlevel')
lid <- data.frame(list$id)
foo <- for(range in 1990:2017){
  for(bid in lid){
    bid_range <- buoy(dataset = 'wlevel', buoyid = bid, year = range)
    bid.year.data <- data.frame(bid.year$data)
    write.csv(bid.year.data, file='cwind/bid_range.csv')
  }
}
The response:
Using c1990.nc
Using
Error: length(url) == 1 is not TRUE
It saves the first data set, but it does not apply the loop variables to the file name; it just names the file bid_range.csv.
This error message shows that there is no data for that station id in 1990. Because you were using a for loop, it stops as soon as it hits an error.
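In a plain for loop you would have to guard each call yourself, for example with tryCatch (a sketch of the idea only):
# Guard each download so one failure does not stop the loop
res <- tryCatch(
  buoy(dataset = 'wlevel', buoyid = bid, year = range),
  error = function(e) NULL  # record nothing and move on
)
The tidyverse approach below does the same bookkeeping more cleanly with safely().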
Here I introduce the use of tidyverse to download the NOAA buoy data. A lot of the following functions are from the purrr package, which is part of the tidyverse.
# Load packages
library(tidyverse)
library(rnoaa)
Step 1: Create a "grid" containing all combinations of id and year
The expand function from tidyr creates every combination of the supplied values.
data_list <- buoys(dataset = 'wlevel')
data_list2 <- data_list %>%
  select(id) %>%
  expand(id, year = 1990:2017)
Step 2: Create a "safe" version that does not break when there is no data, and make it suitable for map2
map2 loops through all the combinations of id and year via its .x and .y arguments, so we reorder the arguments of buoy to create buoy_modify. We also use the safely function to create a safe version of buoy_modify: when it hits an error, it stores the error message and moves on to the next combination rather than stopping.
# Modify the buoy function so buoyid and year come first, matching map2's .x and .y
buoy_modify <- function(buoyid, year, dataset, ...){
  buoy(dataset, buoyid = buoyid, year = year, ...)
}
# Create a safe version of buoy_modify
buoy_safe <- safely(buoy_modify)
Step 3: Apply the buoy_safe function
wlevel_data <- map2(data_list2$id, data_list2$year, buoy_safe, dataset = "wlevel")
# Assign name for the element in the list based on id and year
names(wlevel_data) <- paste(data_list2$id, data_list2$year, sep = "_")
After this step, all the data were downloaded in wlevel_data. Each element in wlevel_data has two parts. $result shows the data if the download is successful, otherwise, it shows NULL. $error shows NULL if the download is successful, otherwise, it shows the error message.
Step 4: Access the data
transpose can turn a list "inside out". So now wlevel_data2 has two elements: result and error. We can store these two and access the data.
# Turn the list "inside out"
wlevel_data2 <- transpose(wlevel_data)
# Get the error message
wlevel_error <- wlevel_data2$error
# Get the result
wlevel_result <- wlevel_data2$result
# Remove NULL element in wlevel_result
wlevel_result2 <- wlevel_result[!map_lgl(wlevel_result, is.null)]

xml to R dataframe, multiple layers of children

I was trying to convert an XML file into an R data frame using the XML package. I was able to get a data frame, but whenever there were grandchildren under a child, the values of the grandchildren were merged into one column.
Here is what the XML looks like:
<user>
<created-at type="datetime">2012-12-20T18:32:20+00:00</created-at>
<details></details>
<is-active type="boolean">true</is-active>
<last-login type="datetime">2017-06-22T16:52:11+01:00</last-login>
<time-zone>Pacific Time (US & Canada)</time-zone>
<updated-at type="datetime">2017-06-22T21:00:47+01:00</updated-at>
<is-verified type="boolean">true</is-verified>
<groups type="array">
<group>
<created-at type="datetime">2015-02-09T09:34:41+00:00</created-at>
<id type="integer">23215935</id>
<is-active type="boolean">true</is-active>
<name>Product Managers</name>
<updated-at type="datetime">2015-02-09T09:34:41+00:00</updated-at>
</group>
</groups>
</user>
The code I used were:
users_xml = xmlTreeParse("users.xml")
top_users = xmlRoot(users_xml)
users = xmlSApply(top_users, function(x) xmlSApply(x, xmlValue))
The result I got had all the elements listed fine, except it combined everything under "groups" into one column. Is there any way I can make each element under "group" a separate column in the final data frame?
I also tried
nodes=getNodeSet(top_users, "//groups[#group]")
and
nodes=getNodeSet(top_users, "//groups/group[#group]")
and
nodes=getNodeSet(top_users, "//.groups/group[#group]")
and switched "top_users" to "user_xml", but each time got error message:
Error: 1: Input is not proper UTF-8, indicate encoding !
Bytes: 0xC2 0x3C 0x2F 0x6E
Then tried
data.frame(t(xpathSApply(xmlRoot(xmlTreeParse("users.xml", useInternalNodes = T)),
"//user", function(y) xmlSApply(y, xmlValue))))
Which gave me the exact same thing as the first solution.
And finally, I tried
data.frame(t(xpathSApply(xmlRoot(xmlTreeParse("users.xml", useInternalNodes = T)),
"//user/groups/group", function(y) xmlSApply(y, xmlValue))))
Which did give me a dataframe but only with elements in "group", and there is no way I can map it back to the first table I got that has all elements in "user".
Consider column binding with xmlToDataFrame() on the user children and the groups children:
library(XML)
doc <- xmlParse("users.xml")
userdf <- xmlToDataFrame(nodes=getNodeSet(doc, "/user"))
groupdf <- xmlToDataFrame(nodes=getNodeSet(doc, "/user/groups/group"))
df <- transform(cbind(userdf, groupdf), groups = NULL) # REMOVE groups COL
df
# created.at details is.active last.login time.zone
# 1 2012-12-20T18:32:20+00:00 true 2017-06-22T16:52:11+01:00 Pacific Time (US & Canada)
# updated.at is.verified created.at.1 id is.active.1 name
# 1 2017-06-22T21:00:47+01:00 true 2015-02-09T09:34:41+00:00 23215935 true Product Managers
# updated.at.1
# 1 2015-02-09T09:34:41+00:00

Convert R JSON Twitter data to list

When using searchTwitteR, I converted the result to a data frame and then exported it to JSON. However, all the text is on one line, etc. (sample below). I need to separate it so that each tweet is its own record.
phish <- searchTwitteR('phish', n = 5, lang = 'en')
phishdf <- do.call("rbind", lapply(phish, as.data.frame))
exportJson <-toJSON(phishdf)
write(exportJson, file = "phishdf.json")
json_phishdf <- fromJSON(file="phishdf.json")
I tried converting to a list and am wondering if maybe converting to a data frame is a mistake.
However, for a list, I tried:
newlist['text']=phish[[1]]$getText()
But this will just give me the text for the first tweet. Is there a way to iterate over the entire data set, maybe in a for loop?
{"text":["#ilazer #abbijacobson I do feel compelled to say that I phind phish awphul... sorry, Abbi!","#phish This on-sale was an embarrassment. Something needs to change.","FS: Have 2 Tix To Phish In Chula Vista #Phish #facevaluetickets #phish #facevalue GO: https://t.co/dFdrpyaotp","RT #WKUPhiDelt: Come unwind from a busy week of class and kick off the weekend with a Phish Fry! 4:30-7:30 at the Phi Delt house. Cost is $\u2026","RT #phish: Tickets for Phish's July 15 & 16 shows at The Gorge go on sale in fifteen minutes at 1PM ET: https://t.co/tEKLNjI5u7 https://t.c\u2026"],
"favorited":[false,false,false,false,false],
"favoriteCount":[0,0,0,0,0],
"replyToSN":["rAlexandria","phish","NA","NA","NA"],
"created":[1456521159,1456521114,1456521022,1456521016,1456520988],
"truncated":[false,false,false,false,false],
"replyToSID":["703326502629277696","703304948990222337","NA","NA","NA"],
"id":["703326837720662016","703326646074343424","703326261045829632","703326236722991105","703326119328686080"],
"replyToUID":["26152867","14503997","NA","NA","NA"],"statusSource":["Mobile Web (M5)","Twitter for iPhone","CashorTrade - Face Value Tickets","Twitter for iPhone","Twitter for Android"],
"screenName":["rAlexandria","adamgelvan","CashorTrade","Kyle_Smith1087","timogrennell"],
"retweetCount":[0,0,0,2,5],
"isRetweet":[false,false,false,true,true],
"retweeted":[false,false,false,false,false],
"longitude":["NA","NA","NA","NA","NA"],
"latitude":["NA","NA","NA","NA","NA"]}
I followed your code and don't have the issue you're describing. Are you using library(twitteR) and library(jsonlite)?
Here is the code:
library(twitteR)
library(jsonlite)
phish <- searchTwitteR('phish', n = 5, lang = 'en')
phishdf <- do.call("rbind", lapply(phish, as.data.frame))
exportJson <-toJSON(phishdf)
write(exportJson, file = "./../phishdf.json")
## note the `txt` argument, as opposed to `file` used in the question
json_phishdf <- fromJSON(txt="./../phishdf.json")
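As for iterating over the whole result set, a minimal sketch that builds on the getText() call from the question (each element of phish is a twitteR status object):
# Pull the text of every tweet into a character vector
tweet_text <- sapply(phish, function(t) t$getText())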
