Import JSON dataset to R

I want to make a data frame from the following JSON sample:
{"gender": "M", "age": 68, "id": "e2127556f4f64592b11af22de27a7932", "became_member_on": "20180426", "income": 70000}
{"gender": null, "age": 118, "id": "8ec6ce2a7e7949b1bf142def7d0e0586", "became_member_on": "20170925", "income": null}
{"gender": null, "age": 118, "id": "68617ca6246f4fbc85e91a2a49552598", "became_member_on": "20171002", "income": null}
{"gender": "M", "age": 65, "id": "389bc3fa690240e798340f5a15918d5c", "became_member_on": "20180209", "income": 53000}
{"gender": null, "age": 118, "id": "8974fc5686fe429db53ddde067b88302", "became_member_on": "20161122", "income": null}
{"gender": null, "age": 118, "id": "c4863c7985cf408faee930f111475da3", "became_member_on": "20170824", "income": null}
{"gender": null, "age": 118, "id": "148adfcaa27d485b82f323aaaad036bd", "became_member_on": "20150919", "income": null}

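We can use stream_in() from jsonlite, which reads newline-delimited JSON (one object per line). Here str1 holds the sample as a string (it is defined at the end of this answer):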
out <- jsonlite::stream_in(textConnection(str1))
str(out)
#'data.frame': 7 obs. of 5 variables:
# $ gender : chr "M" NA NA "M" ...
# $ age : int 68 118 118 65 118 118 118
# $ id : chr "e2127556f4f64592b11af22de27a7932" "8ec6ce2a7e7949b1bf142def7d0e0586" "68617ca6246f4fbc85e91a2a49552598" "389bc3fa690240e798340f5a15918d5c" ...
# $ became_member_on: chr "20180426" "20170925" "20171002" "20180209" ...
# $ income : int 70000 NA NA 53000 NA NA NA
If we are reading from a file:
out <- jsonlite::stream_in(file('yourfile.json'))
Or with ndjson::stream_in:
out <- ndjson::stream_in('yourfile.json', 'tbl')
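As a self-contained illustration of the stream_in() approach, here is a minimal sketch with two shortened records. The names ndjson_text and members are made up for the example, and the date conversion at the end is an optional extra step, assuming you want became_member_on as a Date rather than a string.

```r
library(jsonlite)

# Two records in newline-delimited JSON; the second one has null fields.
ndjson_text <- c(
  '{"gender": "M", "age": 68, "became_member_on": "20180426", "income": 70000}',
  '{"gender": null, "age": 118, "became_member_on": "20170925", "income": null}'
)

# textConnection() lets stream_in() read the strings as if they were a file.
members <- stream_in(textConnection(ndjson_text), verbose = FALSE)

# JSON null values arrive as NA in the data frame.
# Optional: parse the yyyymmdd strings into Date values.
members$became_member_on <- as.Date(members$became_member_on, format = "%Y%m%d")

str(members)
```

The same conversion applies unchanged to the result of reading from a file connection.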
Data:
str1 <- '{"gender": "M", "age": 68, "id": "e2127556f4f64592b11af22de27a7932", "became_member_on": "20180426", "income": 70000}
{"gender": null, "age": 118, "id": "8ec6ce2a7e7949b1bf142def7d0e0586", "became_member_on": "20170925", "income": null}
{"gender": null, "age": 118, "id": "68617ca6246f4fbc85e91a2a49552598", "became_member_on": "20171002", "income": null}
{"gender": "M", "age": 65, "id": "389bc3fa690240e798340f5a15918d5c", "became_member_on": "20180209", "income": 53000}
{"gender": null, "age": 118, "id": "8974fc5686fe429db53ddde067b88302", "became_member_on": "20161122", "income": null}
{"gender": null, "age": 118, "id": "c4863c7985cf408faee930f111475da3", "became_member_on": "20170824", "income": null}
{"gender": null, "age": 118, "id": "148adfcaa27d485b82f323aaaad036bd", "became_member_on": "20150919", "income": null}'

Related

Extract values from JSON complex object

I have a JSON file, shown below, in which each timelineData entry holds a list of values for one date. I would like to extract those values into an R data frame like the one that follows. Can you kindly help me with how I should build this?
Output Dataframe
Jan-18 a 5
Jan-18 b 0
Jan-18 c 9
Jan-18 d 0
Jan-18 e 5
Jan-19 a 4
Jan-19 b 0
Jan-19 c 26
Jan-19 d 0
Jan-19 e 35
value_headers = ['a', 'b', 'c', 'd', 'e']
Input JSON content:
{
"default": {
"timelineData": [
{
"time": "1610928000",
"formattedTime": "Jan 18, 2021",
"formattedAxisTime": "Jan 18",
"value": [
5,
0,
9,
0,
5
],
"hasData": [
true,
false,
true,
false,
true
],
"formattedValue": [
"5",
"0",
"9",
"0",
"5"
]
},
{
"time": "1611014400",
"formattedTime": "Jan 19, 2021",
"formattedAxisTime": "Jan 19",
"value": [
4,
0,
26,
0,
35
],
"hasData": [
true,
false,
true,
false,
true
],
"formattedValue": [
"4",
"0",
"26",
"0",
"35"
]
}
],
"averages": [
5,
1,
34,
25,
25
]
}
}
With the tidyverse, it could look something like this:
library(jsonlite)
library(tidyverse)
json_dt <- fromJSON('{
"default": {
"timelineData": [
{
"time": "1610928000",
"formattedTime": "Jan 18, 2021",
"formattedAxisTime": "Jan 18",
"value": [
5,
0,
9,
0,
5
],
"hasData": [
true,
false,
true,
false,
true
],
"formattedValue": [
"5",
"0",
"9",
"0",
"5"
]
},
{
"time": "1611014400",
"formattedTime": "Jan 19, 2021",
"formattedAxisTime": "Jan 19",
"value": [
4,
0,
26,
0,
35
],
"hasData": [
true,
false,
true,
false,
true
],
"formattedValue": [
"4",
"0",
"26",
"0",
"35"
]
}
],
"averages": [
5,
1,
34,
25,
25
]
}
}')
tibble(
time = json_dt$default$timelineData$formattedTime,
value = json_dt$default$timelineData$formattedValue
) %>%
unnest(value) %>%
group_by(time) %>%
mutate(
letter = letters[1:n()],
value = as.integer(value),
time = str_replace(time, ",.*", ""),
time = str_replace(time, " ", "-")
)
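For comparison, here is a rough sketch of the same reshaping with only jsonlite and base R, using the value_headers from the question as labels. It uses a shortened copy of the JSON, and the names json_text, timeline and result are made up for the example. Passing simplifyVector = FALSE is a choice made here so that every entry stays a plain nested list.

```r
library(jsonlite)

# Shortened copy of the input: only the fields needed for the output.
json_text <- '{"default": {"timelineData": [
  {"formattedAxisTime": "Jan 18", "value": [5, 0, 9, 0, 5]},
  {"formattedAxisTime": "Jan 19", "value": [4, 0, 26, 0, 35]}
]}}'

value_headers <- c("a", "b", "c", "d", "e")

# Keep everything as nested lists so each entry has a predictable shape.
timeline <- fromJSON(json_text, simplifyVector = FALSE)$default$timelineData

# Build one small data frame per date, then stack them.
rows <- lapply(timeline, function(entry) {
  data.frame(
    time   = sub(" ", "-", entry$formattedAxisTime),
    header = value_headers,
    value  = as.integer(unlist(entry$value)),
    stringsAsFactors = FALSE
  )
})
result <- do.call(rbind, rows)

print(result)
```

This yields one row per date and header, in the long layout shown in the question.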

IIFE version lightningchart xydata

const {
createSampledDataGenerator
} = require('@arction/xydata')
Hi all, I'm trying to get this ECG example (https://www.arction.com/lightningchart-js-interactive-examples/edit/lcjs-example-0150-ecg.html) to work without Node.js or npm. I've seen that it is possible to use the IIFE version of the library. After switching to the IIFE version on my website, I get an error when running the command above, so I can't run it on my web server. How can I run it? Is there an IIFE version of xydata?
What script do I have to include, and how? Thanks
xydata can be run directly in the browser by using the .iife.js version of the xydata build. You can either download the @arction/xydata package from npm and use the xydata.iife.js file from there, or use a CDN such as UNPKG.
In your HTML head, add <script src="https://unpkg.com/@arction/xydata@1.2.1/dist/xydata.iife.js"></script> (replace the src URL if you are serving the file locally).
Then replace the require("@arction/xydata") with just xydata. The IIFE file adds the xydata variable to the global context.
See a full working example below. I had to cut the amount of data to fit the example here, as StackOverflow limits the length of an answer.
<!DOCTYPE html>
<html lang="en">
<head>
<script src="https://unpkg.com/@arction/lcjs@2.0.3/dist/lcjs.iife.js"></script>
<script src="https://unpkg.com/@arction/xydata@1.2.1/dist/xydata.iife.js"></script>
<title>Using chart in HTML page</title>
<meta charset="utf-8" />
<!-- Flexbox styling to have the chart and header fill the page.
Chart will take as much space as possible. -->
<style>
html,
body {
height: 100%;
margin: 0;
}
.box {
display: flex;
flex-flow: column;
height: 100%;
}
.box .row.header {
flex: 0 1 auto;
}
.box .row.content {
flex: 1 1 auto;
}
</style>
</head>
<body class="box">
<h1 class="row header">LightningChart<sup>®</sup> JS in HTML page</h1>
<!-- Create div to render the chart into-->
<div id="target" class="row content"></div>
<!--IIFE assembly (lcjs.iife.js) is a standalone JS file,
which does not need any build tools,
such as NPM, to be installed-->
<!--Script source must be defined in its own script tag-->
<script src="lcjs.iife.js"></script>
<!--Actual chart related script tag-->
<script>
// Replace the contents of this script tag if you want to test code from our examples:
// https://www.arction.com/lightningchart-js-interactive-examples/
// Extract required parts from LightningChartJS.
const {
lightningChart,
DataPatterns,
AxisScrollStrategies,
SolidLine,
SolidFill,
ColorHEX,
AutoCursorModes,
Themes
} = lcjs
// Import data-generators from 'xydata'-library.
const {
createSampledDataGenerator
} = xydata
// Create a XY Chart.
const chart = lightningChart().ChartXY({
// theme: Themes.dark
container: 'target'
}).setTitle('ECG')
// Add line series to visualize the data received
const series = chart.addLineSeries({
dataPattern: DataPatterns.horizontalProgressive
})
// Style the series
series
.setStrokeStyle(new SolidLine({
thickness: 2,
fillStyle: new SolidFill({
color: ColorHEX('#5aafc7')
})
}))
.setMouseInteractions(false)
chart.setAutoCursorMode(AutoCursorModes.disabled)
// Setup view nicely.
chart.getDefaultAxisY()
.setTitle('mV')
.setInterval(-1600, 1000)
.setScrollStrategy(AxisScrollStrategies.expansion)
chart.getDefaultAxisX()
.setTitle('milliseconds')
.setInterval(0, 2500)
.setScrollStrategy(AxisScrollStrategies.progressive)
// Points that are used to generate a continuous stream of data.
const point = [{
x: 2,
y: 81
},
{
x: 3,
y: 83
},
{
x: 4,
y: 88
},
{
x: 5,
y: 98
},
{
x: 6,
y: 92
},
{
x: 7,
y: 85
},
{
x: 8,
y: 73
},
{
x: 9,
y: 71
},
{
x: 10,
y: 70
},
{
x: 11,
y: 83
},
{
x: 12,
y: 73
},
{
x: 13,
y: 79
},
{
x: 14,
y: 84
},
{
x: 15,
y: 78
},
{
x: 16,
y: 67
},
{
x: 17,
y: 71
},
{
x: 18,
y: 76
},
{
x: 19,
y: 77
},
{
x: 20,
y: 64
},
{
x: 21,
y: 53
},
{
x: 22,
y: 0
},
{
x: 23,
y: 41
},
{
x: 24,
y: 51
},
{
x: 25,
y: 3
},
{
x: 26,
y: 31
},
{
x: 27,
y: 37
},
{
x: 28,
y: 35
},
{
x: 29,
y: 48
},
{
x: 30,
y: 40
},
{
x: 31,
y: 42
},
{
x: 32,
y: 42
},
{
x: 33,
y: 32
},
{
x: 34,
y: 21
},
{
x: 35,
y: 41
},
{
x: 36,
y: 48
},
{
x: 37,
y: 47
},
{
x: 38,
y: 45
},
{
x: 39,
y: 42
},
{
x: 40,
y: 28
},
{
x: 41,
y: 15
},
{
x: 42,
y: 1
},
{
x: 43,
y: -12
},
{
x: 44,
y: -4
},
{
x: 45,
y: 15
},
{
x: 46,
y: 23
},
{
x: 47,
y: 22
},
{
x: 48,
y: 40
},
{
x: 49,
y: 46
},
{
x: 50,
y: 49
},
{
x: 51,
y: 48
},
{
x: 52,
y: 43
},
{
x: 53,
y: 52
},
{
x: 54,
y: 49
},
{
x: 55,
y: 44
},
{
x: 56,
y: 41
},
{
x: 57,
y: 41
},
{
x: 58,
y: 45
},
{
x: 59,
y: 57
},
{
x: 60,
y: 67
},
{
x: 61,
y: 65
},
{
x: 62,
y: 58
},
{
x: 63,
y: 47
},
{
x: 64,
y: 34
},
{
x: 65,
y: 35
},
{
x: 66,
y: 23
},
{
x: 67,
y: 11
},
{
x: 68,
y: 7
},
{
x: 69,
y: 14
},
{
x: 70,
y: 23
},
{
x: 71,
y: 18
},
{
x: 72,
y: 31
},
{
x: 73,
y: 35
},
{
x: 74,
y: 44
},
{
x: 75,
y: 49
},
{
x: 76,
y: 34
},
{
x: 77,
y: 7
},
{
x: 78,
y: -3
},
{
x: 79,
y: -8
},
{
x: 80,
y: -11
},
{
x: 81,
y: -20
},
{
x: 82,
y: -28
},
{
x: 83,
y: -4
},
{
x: 84,
y: 15
},
{
x: 85,
y: 20
},
{
x: 86,
y: 26
},
{
x: 87,
y: 26
},
{
x: 88,
y: 24
},
{
x: 89,
y: 34
},
{
x: 90,
y: 35
},
{
x: 91,
y: 30
},
{
x: 92,
y: 22
},
{
x: 93,
y: 12
},
{
x: 94,
y: 15
},
{
x: 95,
y: 18
},
{
x: 96,
y: 24
},
{
x: 97,
y: 18
},
{
x: 98,
y: 26
},
{
x: 99,
y: 25
},
{
x: 100,
y: 13
},
{
x: 101,
y: 2
},
{
x: 102,
y: 1
},
{
x: 103,
y: -10
},
{
x: 104,
y: -10
},
{
x: 105,
y: -4
},
{
x: 106,
y: 8
},
{
x: 107,
y: 15
},
{
x: 108,
y: 15
},
{
x: 109,
y: 15
},
{
x: 110,
y: 15
},
{
x: 111,
y: 18
},
{
x: 112,
y: 19
},
{
x: 113,
y: 3
},
{
x: 114,
y: -12
},
{
x: 115,
y: -14
},
{
x: 116,
y: -10
},
{
x: 117,
y: -22
},
{
x: 118,
y: -24
},
{
x: 119,
y: -29
},
{
x: 120,
y: -21
},
{
x: 121,
y: -19
},
{
x: 122,
y: -26
},
{
x: 123,
y: -9
},
{
x: 124,
y: -10
},
{
x: 125,
y: -6
},
{
x: 126,
y: -8
},
{
x: 127,
y: -31
},
{
x: 128,
y: -52
},
{
x: 129,
y: -57
},
{
x: 130,
y: -40
},
{
x: 131,
y: -20
},
{
x: 132,
y: 7
},
{
x: 133,
y: 14
},
{
x: 134,
y: 10
},
{
x: 135,
y: 6
},
{
x: 136,
y: 12
},
{
x: 137,
y: -5
},
{
x: 138,
y: -2
},
{
x: 139,
y: 9
},
{
x: 140,
y: 23
},
{
x: 141,
y: 36
},
{
x: 142,
y: 52
},
{
x: 143,
y: 61
},
{
x: 144,
y: 56
},
{
x: 145,
y: 48
},
{
x: 146,
y: 48
},
{
x: 147,
y: 38
},
{
x: 148,
y: 29
},
{
x: 149,
y: 33
},
{
x: 150,
y: 20
},
{
x: 151,
y: 1
},
{
x: 152,
y: -7
},
{
x: 153,
y: -9
},
{
x: 154,
y: -4
},
{
x: 155,
y: -12
},
{
x: 156,
y: -3
},
{
x: 157,
y: 5
},
{
x: 158,
y: -3
},
{
x: 159,
y: 12
},
{
x: 160,
y: 6
},
{
x: 161,
y: -10
},
{
x: 162,
y: -2
},
{
x: 163,
y: 15
},
{
x: 164,
y: 17
},
{
x: 165,
y: 21
},
{
x: 166,
y: 22
},
{
x: 167,
y: 15
},
{
x: 168,
y: 16
},
{
x: 169,
y: 1
},
{
x: 170,
y: -2
},
{
x: 171,
y: -9
},
{
x: 172,
y: -16
},
{
x: 173,
y: -18
}
]
// Create a data generator to supply a continuous stream of data.
createSampledDataGenerator(point, 1, 10)
.setSamplingFrequency(1)
.setInputData(point)
.generate()
.setStreamBatchSize(48)
.setStreamInterval(50)
.setStreamRepeat(true)
.toStream()
.forEach(point => {
// Push the created points to the series.
series.add({
x: point.timestamp,
y: point.data.y
})
})
</script>
</body>
</html>

Convert R dataframe into tough JSON list of lists for d3.hierarchy model

Edit: I have cleaned up the question a bit and added a bounty. I will be AFK for a few days, but getting this resolved would be a huge help.
I would like to use d3 to create a d3.hierarchy tree model from basketball data. I essentially want to create a bracket structured as such:
...where the graph / model is a tree in which each node has exactly two children (except for the end / leaf nodes, of course). This is a textbook example of when you'd want to use d3.tree() and d3.hierarchy(), but d3.hierarchy requires JSON in a fairly specific format. In particular, for a bracket of 8 basketball teams in a tournament that goes 8 - 4 - 2 - 1, the JSON data needs to be formatted like this:
const playoffData = {
"name": "Rockets",
"round": 4,
"id": 15,
"children": [
{
"name": "Rockets",
"round": 3,
"id": 14,
"children": [
{
"name": "Rockets",
"round": 2,
"id": 9,
"children": [
{
"name": "Rockets",
"round": 1,
"id": 1
},
{
"name": "Timberwolves",
"round": 1,
"id": 8
}
]
},
{
"name": "Jazz",
"round": 2,
"id": 12,
"children": [
{
"name": "Jazz",
"round": 1,
"id": 4
},
{
"name": "Thunder",
"round": 1,
"id": 5
}
]
}
]
},
{
"name": "Warriors",
"round": 3,
"id": 13,
"children": [
{
"name": "Warriors",
"round": 2,
"id": 10,
"children": [
{
"name": "Warriors",
"round": 1,
"id": 2
},
{
"name": "Spurs",
"round": 1,
"id": 7
}
]
},
{
"name": "Pelicans",
"round": 2,
"id": 11,
"children": [
{
"name": "Pelicans",
"round": 1,
"id": 3
},
{
"name": "Trail Blazers",
"round": 1,
"id": 6
}
]
}
]
}
]
};
Note the nested nature of the JSON. The root node corresponds to the winner of the bracket, and the leaf nodes correspond to the teams in the first round of the bracket.
I have the following R dataframe of basketball data for the bracket:
> dput(mydata)
structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15), teamname = c("Rockets", "Warriors", "Trail Blazers",
"Jazz", "Thunder", "Pelicans", "Spurs", "Timberwolves", "Rockets",
"Warriors", "Pelicans", "Jazz", "Rockets", "Warriors", "Rockets"
), conference = c("West", "West", "West", "West", "West", "West",
"West", "West", "West", "West", "West", "West", "West", "West",
"West"), seeding = c(1, 2, 3, 4, 5, 6, 7, 8, NA, NA, NA, NA,
NA, NA, NA), round = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3,
3, 4), child1 = c(NA, NA, NA, NA, NA, NA, NA, NA, 1, 2, 3, 4,
9, 11, 13), child2 = c(NA, NA, NA, NA, NA, NA, NA, NA, 8, 7,
6, 5, 12, 10, 14), wins = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0), losses = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0), completed = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE
), winprobs = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA)), .Names = c("id", "teamname", "conference", "seeding",
"round", "child1", "child2", "wins", "losses", "completed", "winprobs"
), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 17L, 18L, 19L,
20L, 25L, 26L, 29L), class = "data.frame")
> mydata
id teamname conference seeding round child1 child2 wins losses completed winprobs
1 1 Rockets West 1 1 NA NA 0 0 FALSE NA
2 2 Warriors West 2 1 NA NA 0 0 FALSE NA
3 3 Trail Blazers West 3 1 NA NA 0 0 FALSE NA
4 4 Jazz West 4 1 NA NA 0 0 FALSE NA
5 5 Thunder West 5 1 NA NA 0 0 FALSE NA
6 6 Pelicans West 6 1 NA NA 0 0 FALSE NA
7 7 Spurs West 7 1 NA NA 0 0 FALSE NA
8 8 Timberwolves West 8 1 NA NA 0 0 FALSE NA
17 9 Rockets West NA 2 1 8 0 0 FALSE NA
18 10 Warriors West NA 2 2 7 0 0 FALSE NA
19 11 Pelicans West NA 2 3 6 0 0 FALSE NA
20 12 Jazz West NA 2 4 5 0 0 FALSE NA
25 13 Rockets West NA 3 9 12 0 0 FALSE NA
26 14 Warriors West NA 3 11 10 0 0 FALSE NA
29 15 Rockets West NA 4 13 14 0 0 FALSE NA
As you can see, my R data frame has one row for each node in the d3 graph. Notice the tree structure in particular, and the child1 and child2 helper columns that identify children: for the final round (id 15), the child nodes are the two nodes in the previous round (13 and 14). For id 13 (the semifinals), the children are nodes 9 and 12, and so on. The first 8 rows are the first round, so they are leaf nodes and have no children.
It's a bit long, but I wanted to include the whole JSON and R data frame to keep things clear. I would also like other data frame columns (wins, losses, winprobs) included in the JSON structure; for brevity, I did not show these in the JSON above.
A last note: while I work mainly in R, this is a d3 graph, so there is quite a bit of JavaScript coding that I must do for it. My opinion is that R is better for this type of data manipulation; however, since we're dealing with a nested JSON object, maybe JS is better. If there's an eas(ier) solution that uses JavaScript to map a 2D JSON version of the R data frame into the desired nested JSON, that would probably be sufficient as well.
Any help with this is appreciated! I promise to select a top answer once I return to award the bounty.
Here is a tidyverse solution.
We reformat your data and split the data.frame in 4 data.frames.
Then we join those, nesting the relevant columns at each step.
Finally we use toJSON to finish the job:
library(tidyverse)
my.split <- mydata %>%
gather(temp,children,child1,child2) %>%
select(-temp) %>%
select(name= teamname,round,id,children) %>% # change here to keep more columns
distinct %>%
split(.$round)
my.split[[1]] %>%
select(-children) %>%
right_join(my.split[[2]],by=c(id="children"),suffix=c("",".y")) %>%
nest(1:3) %>% # change here to keep more columns
setNames(names(my.split[[1]])) %>%
right_join(my.split[[3]],by=c(id="children"),suffix=c("",".y")) %>%
nest(1:4) %>% # change here to keep more columns
setNames(names(my.split[[1]])) %>%
right_join(my.split[[4]],by=c(id="children"),suffix=c("",".y")) %>%
nest(1:4) %>% # change here to keep more columns
setNames(names(my.split[[1]])) %>%
jsonlite::toJSON(pretty=TRUE)
output:
[
{
"name": "Rockets",
"round": 4,
"id": 15,
"children": [
{
"name": "Rockets",
"round": 3,
"id": 13,
"children": [
{
"name": "Rockets",
"round": 2,
"id": 9,
"children": [
{
"name": "Rockets",
"round": 1,
"id": 1
},
{
"name": "Timberwolves",
"round": 1,
"id": 8
}
]
},
{
"name": "Jazz",
"round": 2,
"id": 12,
"children": [
{
"name": "Jazz",
"round": 1,
"id": 4
},
{
"name": "Thunder",
"round": 1,
"id": 5
}
]
}
]
},
{
"name": "Warriors",
"round": 3,
"id": 14,
"children": [
{
"name": "Pelicans",
"round": 2,
"id": 11,
"children": [
{
"name": "Trail Blazers",
"round": 1,
"id": 3
},
{
"name": "Pelicans",
"round": 1,
"id": 6
}
]
},
{
"name": "Warriors",
"round": 2,
"id": 10,
"children": [
{
"name": "Warriors",
"round": 1,
"id": 2
},
{
"name": "Spurs",
"round": 1,
"id": 7
}
]
}
]
}
]
}
]
You can try this recursive function together with jsonlite::toJSON():
get_node <- function(df, id) {
node <- as.list(df[df$id == id, c("teamname", "round", "id")])
names(node) = c("name", "round", "id")
id1 <- df[df$id == id,]$child1
id2 <- df[df$id == id,]$child2
if (!is.na(id1) && !is.na(id2)) {
child1 <- get_node(df, id1)
child2 <- get_node(df, id2)
if (child1$name == node$name)
node$children <- list(child1, child2)
else if (child2$name == node$name)
node$children <- list(child2, child1)
else
stop("Input data is inconsistent!")
}
node
}
jsonlite::toJSON(get_node(mydata, 15), pretty = TRUE, auto_unbox = TRUE)
With your data I get the following JSON:
{
"name": "Rockets",
"round": 4,
"id": 15,
"children": [
{
"name": "Rockets",
"round": 3,
"id": 13,
"children": [
{
"name": "Rockets",
"round": 2,
"id": 9,
"children": [
{
"name": "Rockets",
"round": 1,
"id": 1
},
{
"name": "Timberwolves",
"round": 1,
"id": 8
}
]
},
{
"name": "Jazz",
"round": 2,
"id": 12,
"children": [
{
"name": "Jazz",
"round": 1,
"id": 4
},
{
"name": "Thunder",
"round": 1,
"id": 5
}
]
}
]
},
{
"name": "Warriors",
"round": 3,
"id": 14,
"children": [
{
"name": "Warriors",
"round": 2,
"id": 10,
"children": [
{
"name": "Warriors",
"round": 1,
"id": 2
},
{
"name": "Spurs",
"round": 1,
"id": 7
}
]
},
{
"name": "Pelicans",
"round": 2,
"id": 11,
"children": [
{
"name": "Pelicans",
"round": 1,
"id": 6
},
{
"name": "Trail Blazers",
"round": 1,
"id": 3
}
]
}
]
}
]
}
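The question also asks for extra columns (wins, losses, winprobs) in each node. One way to get there is to pass the column names into the recursion. The following is only a sketch on a three-node bracket: build_node() and its fields argument are hypothetical names introduced for illustration, and it keeps children in child1, child2 order instead of reordering them by team name.

```r
library(jsonlite)

# A three-node bracket: two first-round teams and the winner of their game.
bracket <- data.frame(
  id       = c(1, 8, 9),
  teamname = c("Rockets", "Timberwolves", "Rockets"),
  round    = c(1, 1, 2),
  child1   = c(NA, NA, 1),
  child2   = c(NA, NA, 8),
  wins     = c(0, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: `fields` lists the columns copied into every node.
build_node <- function(df, id, fields = c("teamname", "round", "id", "wins")) {
  row <- df[df$id == id, ]
  node <- as.list(row[fields])
  names(node)[names(node) == "teamname"] <- "name"
  child_ids <- c(row$child1, row$child2)
  # Leaf rows have NA children, so recursion stops there.
  if (!anyNA(child_ids)) {
    node$children <- lapply(child_ids, function(cid) build_node(df, cid, fields))
  }
  node
}

tree <- build_node(bracket, 9)
cat(toJSON(tree, pretty = TRUE, auto_unbox = TRUE))
```

Adding "losses" or "winprobs" to fields would carry those columns along as well, provided they exist in the data frame.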

Converting to JSON (key,value) pair using R

My data frame contains data as follows:
Tester W1 W2 W3 A P WD(%) TS(Hrs.) AT(Hrs.) SU(%)
a 60 40 102 202 150 100 120 120 100
b 30 38 46 114 150 76 135 120 100
c 25 30 52 107 150 71 120 120 100
By using the package jsonlite I have converted to json format:
{
"Tester": [ "a", "b", "c" ],
"W1": [ 60, 30, 25],
"W2": [ 40, 38, 30 ],
"W3": [ 102, 46, 52 ],
"A": [ 202, 114, 107 ],
"P": [ 150, 150, 150 ],
"WD...": [ 100, 76, 71 ],
"TS.Hrs..": [ 120, 135, 120 ],
"AT.Hrs..": [ 120, 120, 120 ],
"SU...": [ 100, 100, 100 ]
}
But my requirement is to get the JSON format like:
[ {
"Tester": "a",
"W1": 60,
"W2": 40,
"W3": 102,
"A": 202,
"P": 150,
"WD(%)": 100,
"TS(Hrs.)": 120,
"AT(Hrs.)": 120,
"SU(%)": 100
}]
Can someone please help me?
The output you're seeing is what jsonlite produces when the data set is a list rather than a data frame:
library(jsonlite)
toJSON(as.list(head(iris)))
{"Sepal.Length":[5.1,4.9,4.7,4.6,5,5.4],"Sepal.Width":[3.5,3,3.2,3.1,3.6,3.9],"Petal.Length":[1.4,1.4,1.3,1.5,1.4,1.7],"Petal.Width":[0.2,0.2,0.2,0.2,0.2,0.4],"Species":["setosa","setosa","setosa","setosa","setosa","setosa"]}
Make sure that your data set is indeed a data frame and you will see the expected output:
library(jsonlite)
toJSON(head(iris), pretty = TRUE)
[
{
"Sepal.Length": 5.1,
"Sepal.Width": 3.5,
"Petal.Length": 1.4,
"Petal.Width": 0.2,
"Species": "setosa"
},
{
"Sepal.Length": 4.9,
"Sepal.Width": 3,
"Petal.Length": 1.4,
"Petal.Width": 0.2,
"Species": "setosa"
},
{
"Sepal.Length": 4.7,
"Sepal.Width": 3.2,
"Petal.Length": 1.3,
"Petal.Width": 0.2,
"Species": "setosa"
},
{
"Sepal.Length": 4.6,
"Sepal.Width": 3.1,
"Petal.Length": 1.5,
"Petal.Width": 0.2,
"Species": "setosa"
},
{
"Sepal.Length": 5,
"Sepal.Width": 3.6,
"Petal.Length": 1.4,
"Petal.Width": 0.2,
"Species": "setosa"
},
{
"Sepal.Length": 5.4,
"Sepal.Width": 3.9,
"Petal.Length": 1.7,
"Petal.Width": 0.4,
"Species": "setosa"
}
]
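The mangled keys in the question ("WD...", "TS.Hrs..") most likely come from R's default name checking when the data frame was created. A small sketch of both points, assuming the frame is built by hand with check.names = FALSE (the testers data frame below is a two-row stand-in for the real data):

```r
library(jsonlite)

# check.names = FALSE keeps headers such as "WD(%)" as they are
# instead of rewriting them to syntactic names like "WD...".
testers <- data.frame(
  Tester  = c("a", "b"),
  W1      = c(60, 30),
  `WD(%)` = c(100, 76),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# A data frame is serialised row by row: one JSON object per tester.
by_row <- toJSON(testers, dataframe = "rows", pretty = TRUE)
cat(by_row)

# The same data as a plain list is serialised column by column.
by_column <- toJSON(as.list(testers))
cat(by_column)
```

If the data comes from a file, read.csv() and read.table() accept the same check.names = FALSE argument.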

Trying to use separate to split one column into more than 2 columns

I'm new to R and practicing using the Titanic data set from Kaggle. I am attempting to separate last name, first name, salutation, and extra information into separate columns so that I can try to categorize the age of the passengers - adult or child.
The following is sample data from the Train data set:
head(traindf,5)
# Source: local data frame [5 x 12]
#
# PassengerId Survived Pclass
# 1 1 0 3
# 2 2 1 1
# 3 3 1 3
# 4 4 1 1
# 5 5 0 3
# Variables not shown: Name (chr), Sex (fctr), Age (dbl), SibSp (int), Parch
# (int), Ticket (fctr), Fare (dbl), Cabin (fctr), Embarked (fctr)
The following is a sample that includes the Name:
select(traindf,Survived,Pclass,Name,Sex)
# Source: local data frame [891 x 4]
#
# Survived Pclass Name Sex
# 1 0 3 Braund, Mr. Owen Harris male
# 2 1 1 Cumings, Mrs. John Bradley (Florence Briggs Thayer) female
# 3 1 3 Heikkinen, Miss. Laina female
# 4 1 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female
# 5 0 3 Allen, Mr. William Henry male
# 6 0 3 Moran, Mr. James male
# 7 0 1 McCarthy, Mr. Timothy J male
# 8 0 3 Palsson, Master. Gosta Leonard male
# 9 1 3 Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) female
# 10 1 2 Nasser, Mrs. Nicholas (Adele Achem) female
I can use the following code to separate last name from the rest of the column:
require(tidyr) # for the separate() function
traindfnames <- traindf %>%
separate(Name, c("Lastname","Salutation"), sep = ",")
traindfnames
# Source: local data frame [891 x 13]
#
# PassengerId Survived Pclass Lastname
# 1 1 0 3 Braund
# 2 2 1 1 Cumings
# 3 3 1 3 Heikkinen
# 4 4 1 1 Futrelle
# 5 5 0 3 Allen
# 6 6 0 3 Moran
# 7 7 0 1 McCarthy
# 8 8 0 3 Palsson
# 9 9 1 3 Johnson
# 10 10 1 2 Nasser
# .. ... ... ... ...
# Variables not shown: Salutation (chr), Sex (fctr), Age (dbl), SibSp (int),
# Parch (int), Ticket (fctr), Fare (dbl), Cabin (fctr), Embarked (fctr)
However, when I try to add a field for First Name:
traindfnames <- traindf %>%
separate(Name, c("Lastname","Salutation","firstname"), sep =",,")
I get this error:
# Error: Values not split into 3 pieces at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 2
Am I using incorrect syntax, or is it not possible to get 3 fields from one column?
Having looked at this data, I think the easiest way to do it is using something like str_match() from package stringr. If you assume data$Name is in the form
"[Lastname], [Salutation]. [Firstname]"
the regular expression to match this is
str_match(data$Name, "([A-Za-z]*),\\s([A-Za-z]*)\\.\\s(.*)")
# [,1] [,2] [,3] [,4]
# [1,] "Braund, Mr. Owen Harris" "Braund" "Mr" "Owen Harris"
# [2,] "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Cumings" "Mrs" "John Bradley (Florence Briggs Thayer)"
# [3,] "Heikkinen, Miss. Laina" "Heikkinen" "Miss" "Laina"
# [4,] "Futrelle, Mrs. Jacques Heath (Lily May Peel)" "Futrelle" "Mrs" "Jacques Heath (Lily May Peel)"
# [5,] "Allen, Mr. William Henry" "Allen" "Mr" "William Henry"
# [6,] "Moran, Mr. James" "Moran" "Mr" "James"
So you need to add columns 2 to 4 above to your original data frame. I am not sure you can do this with separate actually. Writing
separate(data, Name, c("Lastname", "Salutation", "Firstname"), sep = "[,\\.]")
will try to split each entry by comma or dot, but it runs into a problem in the 514th entry that looks like "Rothschild, Mrs. Martin (Elizabeth L. Barrett)" (notice the second dot).
In short, the easiest way I can see of doing what you want is
data[c("Lastname", "Salutation", "Firstname")] <-
str_match(data$Name, "([A-Za-z]*),\\s([A-Za-z]*)\\.\\s(.*)")[, 2:4]
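If you would rather stay with separate(), its extra = "merge" argument may be enough: it stops splitting once all the target columns are filled, so a later dot stays inside the last column. This is a sketch on two names from the data (the passengers data frame is made up for the example), assuming every name follows the "Lastname, Salutation. Firstname" pattern.

```r
library(tidyr)

passengers <- data.frame(
  Name = c("Braund, Mr. Owen Harris",
           "Rothschild, Mrs. Martin (Elizabeth L. Barrett)"),
  stringsAsFactors = FALSE
)

# Split on a comma or dot followed by a space. With extra = "merge",
# splitting stops after three pieces, so "L. Barrett" is left intact.
split_names <- separate(
  passengers, Name,
  into  = c("Lastname", "Salutation", "Firstname"),
  sep   = "[,.] ",
  extra = "merge"
)

print(split_names)
```

Names that do not fit the pattern would still need the regular-expression approach or manual cleanup.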
