I am trying to join two json objects into a single json object in R using jsonlite.
The API that I am using needs a JSON object that has the column names of a data frame as the first element, followed by the numeric values of the rows. As a simple illustration, if I have the following:
set.seed(123)
df <- data.frame(A = rnorm(2), B = rnorm(2), C = rnorm(2))
Which I need to look like:
[["A", "B", "C"], [-0.5605,1.5587,0.1293],[-0.2302,0.0705,1.7151]]
But the following attempts fail at achieving the above:
c( jsonlite::toJSON( names(df) ), jsonlite::toJSON( df, "values" ))
paste0( jsonlite::toJSON( names(df) ), jsonlite::toJSON( df, "values" ))
This solution does not work, and I haven't found any other suggestions for how to achieve this.
Any ideas would be appreciated.
One option is to split the data frame by row (asplit with MARGIN = 1) into a list, concatenate (c) that list with the column names, and pass the result to toJSON:
library(jsonlite)
toJSON(c(list(names(df)), asplit(df, 1)))
#[["A","B","C"],[-0.5605,1.5587,0.1293],[-0.2302,0.0705,1.7151]]
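For reference, the two attempts in the question fail because c() and paste0() operate on JSON strings that have already been serialized, so the result is either two separate strings or one concatenated string, not a single array. A minimal sketch of the same idea as above, assuming a row-wise lapply in place of asplit (the names rows, payload and parsed are only illustrative), builds the list first and serializes it once:

```r
library(jsonlite)

set.seed(123)
df <- data.frame(A = rnorm(2), B = rnorm(2), C = rnorm(2))

# One unnamed numeric vector per row of the data frame
rows <- lapply(seq_len(nrow(df)), function(i) unlist(df[i, ], use.names = FALSE))

# Header first, then the rows; serialize the whole list in one call
payload <- toJSON(c(list(names(df)), rows))

# Parse it back without simplification to inspect the nesting
parsed <- fromJSON(payload, simplifyVector = FALSE)
```

Parsing the string back is only a sanity check that the header and the rows ended up as siblings in one outer array.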
I have a list of dataframes, like this:
a<-matrix(1,nrow = 10,ncol = 10)
a<-as.data.frame(a)
b<-matrix(1,nrow = 10,ncol = 10)
b<-as.data.frame(b)
c<-matrix(1,nrow = 10,ncol = 10)
c<-as.data.frame(c)
my_data<-list(a,b,c)
And I would like to add a column in a specific position (before the last column) but only for a specific data.frame (the first one, a). This is why I am using add_column from tibble package.
install.packages("tibble")
library("tibble")
my_column<-rep("new",10)
a<-add_column(a,my_column,.before = "V10")
This works perfectly on the single data.frame, but in my case the data.frames are all imported together into R. Thus, I would like to do something like this:
my_data<-add_column(my_data[[a]],my_column,.before = "V10")
But it is not working, as I get this error:
Error in .subset2(x, i, exact = exact) : invalid subscript type 'list'
Any advice? Thank you in advance.
Try this:
my_data[[1]] <- add_column(my_data[[1]], my_column,.before = "V10")
If you want to name dataframes in list:
names(my_data) <- c("a", "b", "c")
my_data$a <- add_column(my_data$a, my_column, .before = "V10")
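The error in the question comes from my_data[[a]]: there a is the data frame itself (a list), not a position or a name, so it cannot be used as a subscript. A minimal sketch, assuming the list is named as above and the new column is passed with an explicit name, shows that only the first element changes:

```r
library(tibble)

# Three identical 10 x 10 data frames with columns V1..V10
a <- as.data.frame(matrix(1, nrow = 10, ncol = 10))
b <- as.data.frame(matrix(1, nrow = 10, ncol = 10))
c <- as.data.frame(matrix(1, nrow = 10, ncol = 10))
my_data <- list(a = a, b = b, c = c)

my_column <- rep("new", 10)

# Index the list by name (or by position, my_data[[1]]) and assign back
my_data[["a"]] <- add_column(my_data[["a"]], my_column = my_column, .before = "V10")

names(my_data[["a"]])
```

The other two data frames in the list keep their original 10 columns.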
I am trying to output a data frame in R as a json file to use for highcharts plots that I am making outside R. This is what my output looks like :-
[{"name":"alpha","value":1},{"name":"bravo","value":2},{"name":"charlie","value":3}]
However, I want my output to look like this :-
[{name:"alpha",value:1},{name:"bravo",value:2},{name:"charlie",value:3}]
What should I do so that the names of my data frame (in this case name and value) are not put into quotes? If converting my data into a json file is not the best way, what else can/should I do?
library(tidyverse)
library(jsonlite)
data = tibble(name = c("alpha", "bravo", "charlie"),
value = c(1, 2, 3))
output = toJSON(data, dataframe="rows")
write(output, "output.txt")
One possible way is to use a regex that removes the quotes from the keys, i.e. the quoted words that appear directly before a colon:
json_string <- jsonlite::toJSON(data, dataframe="rows")
temp <- stringr::str_replace_all(json_string, '"(\\w+)"\\s*:', '\\1:')
cat(temp)
#[{name:"alpha",value:1},{name:"bravo",value:2},{name:"charlie",value:3}]
write(temp, "output.txt")
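Note that the result is a JavaScript-style object literal, not JSON any more, so a JSON parser will reject it. A small sketch of the same replacement, assuming base gsub instead of stringr and using jsonlite::validate only as a check:

```r
library(jsonlite)

data <- data.frame(name = c("alpha", "bravo", "charlie"),
                   value = c(1, 2, 3),
                   stringsAsFactors = FALSE)

json_string <- toJSON(data, dataframe = "rows")

# Drop the quotes around any word that is immediately followed by a colon
temp <- gsub('"(\\w+)"\\s*:', '\\1:', json_string)

# The original string is valid JSON, the unquoted version is not
valid_before <- as.logical(validate(json_string))
valid_after  <- as.logical(validate(temp))
```

This is fine when the text is pasted into JavaScript source, but a string value that itself contains a quoted word followed by a colon would also be rewritten by this pattern.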
Not sure how to do this inside toJSON, but you can use mgsub from the qdap package:
library(qdap)
# strip the quotes around each column name in the JSON string
sapply(names(data), function(name_i){
output <<- mgsub(paste0("\"", name_i, "\""), name_i, output)})
That gives you
output
[{name:"alpha",value:1},{name:"bravo",value:2},{name:"charlie",value:3}]
An alternative way -
library(tidyverse)
library(jsonlite)
data = tibble(name = c("alpha", "bravo", "charlie"), value = c(1, 2, 3))
output = toJSON(data)
output=gsub("\"(\\w*)\":", "\\1:", output,perl=TRUE)
print(output)
Output -
[{name:"alpha",value:1},{name:"bravo",value:2},{name:"charlie",value:3}]
My dataframe has a column named Code of the type char which goes like b,b1,b110-b139,b110,b1100,b1101,... (1602 entries)
I am trying to select all the entries that match the strings in a vector and all the ones that start with the same string.
So lets say I have the vector
Selection=c("b114","d2")
then I want all codes like b114, b1140, b1141, b1142, ... as well as d2, d200, d2000, d2001, d2002, d2003 etc.
What does work in principle is to create a new dataframe like this:
bTable <- TreeMapTable[substr(TreeMapTable$Code,1,4)=="b114"|substr(TreeMapTable$Code,1,2)=="d2",]
which gives me all the data I want, but requires me to manually type the condition for each entry, and I just want to give the script a vector with the strings.
I tried to do it like this:
SelectionL=nchar(Selection)
Beispieltable <- TreeMapTable[substr(TreeMapTable$Code,1,SelectionL)==Selection,]
but this somehow gives me only half of the required entries, and I confess I don't really know what it is doing. I know I could use a for loop, but from everything I have read so far, loops should be avoided and the problem should be solvable with vectors.
sample data
df <- data.frame( Code = c("b114", "b115", "b11456", "d2", "d12", "d200", "db114"),
stringsAsFactors = FALSE)
Selection=c("b114","d2")
answer
library( dplyr )
#create a regex pattern to filter on
pattern <- paste0( "^", Selection, collapse = "|" )
#keep only the rows where 'Code' starts with one of the entries in 'Selection'
df %>% filter( grepl( pattern, Code, perl = TRUE ) )
# Code
# 1 b114
# 2 b11456
# 3 d2
# 4 d200
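As an aside, the attempt in the question returns only part of the rows because substr() and == recycle the two-element vectors along the column, so each row is compared against just one of the prefixes. A possible base R alternative, sketched here with startsWith (which matches the prefix literally, so no regex escaping is needed), tests every prefix against every code:

```r
df <- data.frame(Code = c("b114", "b115", "b11456", "d2", "d12", "d200", "db114"),
                 stringsAsFactors = FALSE)
Selection <- c("b114", "d2")

# One logical vector per prefix, combined element-wise with OR
hits <- Reduce(`|`, lapply(Selection, function(prefix) startsWith(df$Code, prefix)))

result <- df[hits, , drop = FALSE]
result
```

The lapply here is still a loop over the (short) vector of prefixes, but each step is vectorized over the whole Code column.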
I have a vector of column names called tbl_colnames.
I would like to create a tibble with 0 rows and length(tbl_colnames) columns.
The best way I've found of doing this is...
tbl <- as_tibble(data.frame(matrix(nrow=0,ncol=length(tbl_colnames))))
and then I want to name the columns so...
colnames(tbl) <- tbl_colnames
My question: Is there a more elegant way of doing this?
something like tbl <- tibble(colnames=tbl_colnames)
my_tibble <- tibble(
var_name_1 = numeric(),
var_name_2 = numeric(),
var_name_3 = numeric(),
var_name_4 = numeric(),
var_name_5 = numeric()
)
Haven't tried, but I guess it works too if instead of initiating numeric vectors of length 0 you do it with other classes (for example, character()).
This SO question explains how to do it with other R libraries.
According to this tidyverse issue, this won't be a feature for tribbles.
Since you want to combine a list of tibbles, you can just assign NULL to the variable and then bind_rows it with the other tibbles.
res = NULL
for(i in tibbleList)
res = bind_rows(res,i)
However, a much more efficient way to do this is
bind_rows(tibbleList) # combine all tibbles in the list
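A small sketch of that call, assuming a made-up tibbleList with two tibbles that share the columns x and y:

```r
library(dplyr)

# Hypothetical input: a list of tibbles with the same columns
tibbleList <- list(
  tibble(x = 1:2, y = c("a", "b")),
  tibble(x = 3L,  y = "c")
)

# bind_rows accepts the list directly, so no empty starting tibble is needed
combined <- bind_rows(tibbleList)
combined
```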
For anyone still interested in an elegant way to create a 0-row tibble with column names given by a character vector tbl_colnames:
tbl_colnames %>% purrr::map_dfc(setNames, object = list(logical()))
or:
tbl_colnames %>% purrr::map_dfc(~tibble::tibble(!!.x := logical()))
or:
tbl_colnames %>% rlang::rep_named(list(logical())) %>% tibble::as_tibble()
This, of course, results in each column being of type logical.
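If logical columns are not what you want, one possible sketch (the col_types vector below is made up for illustration) builds each zero-length column with vector(), so the type can be chosen per column:

```r
library(tibble)

# Hypothetical specification: column name = storage mode
col_types <- c(one = "character", two = "numeric", three = "logical")

# vector(type, 0) creates an empty vector of the requested mode;
# lapply keeps the names, which become the column names
tbl <- as_tibble(lapply(col_types, function(type) vector(type, 0)))
tbl
```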
The following command will create a tibble with 0 rows and variables (columns) named with the contents of tbl_colnames:
tbl <- tibble::tibble(!!!tbl_colnames, .rows = 0)
You could abuse readr::read_csv, which allows reading from a string. You can control names and types, e.g.:
tbl_colnames <- c("one", "two", "three", "c4", "c5", "last")
read_csv("\n", col_names = tbl_colnames) # all character type
read_csv("\n", col_names = tbl_colnames, col_types = "lcniDT") # various types
I'm a bit late to the party, but for future readers:
as_tibble(matrix(nrow = 0, ncol = length(tbl_colnames)), .name_repair = ~ tbl_colnames)
.name_repair allows you to name your columns within the same function call.
I am trying to name a column and rename all items within the column of a dataset:
dataSet <- read.csv(url) %>%
rename("newColumn1" = V1) %>%
mutate(newColumn1 = recode(newColumn1, "oldEntryX" = "newEntryX") %>%
select(dataSet, newColumn1)
And I get this error:
Error in recode(newColumn1, oldEntryX = "newEntryX" :
object 'newColumn1' not found
What am I missing?
The code runs correctly up through the rename function and displays the renamed column correctly, but as soon as I include mutate it throws an error.
I have no problem sharing the real code but wanted to generalize it for the crowd.
source info was from https://archive.ics.uci.edu/ml/machine-learning-databases/mushroom/agaricus-lepiota.data
In the mutate step, you don't need quotes for column names on the lhs of =. Also, there are a couple of case mismatches.
Assuming the dataset is read correctly, we can do:
df1 %>%
rename(newColumn1 = V1, newColumn2 = V2) %>%
mutate(newColumn1 = recode(newColumn1, oldEntryX = "newEntryX"),
newColumn2 = recode(newColumn2, oldEntryY = "newEntryY"))
Based on the OP's code, there is also a missing closing quote in "newColumn1
data
set.seed(24)
df1 <- data.frame(V1 = sample(c("oldEntryX", "x", "y"), 10, replace = TRUE),
V2 = sample(c("oldEntryY", "x", "y"), 10, replace = TRUE), stringsAsFactors= FALSE)
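For a quick deterministic check of the pipeline above, here is a sketch with a fixed three-row data frame in place of the sampled one (the values are placeholders, as in the question):

```r
library(dplyr)

df1 <- data.frame(V1 = c("oldEntryX", "x", "y"),
                  V2 = c("oldEntryY", "x", "y"),
                  stringsAsFactors = FALSE)

out <- df1 %>%
  rename(newColumn1 = V1, newColumn2 = V2) %>%
  # recode only touches the values that are named; the rest pass through
  mutate(newColumn1 = recode(newColumn1, oldEntryX = "newEntryX"),
         newColumn2 = recode(newColumn2, oldEntryY = "newEntryY"))
out
```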
You can do this with a few simple base R commands:
How to read a CSV file
Syntax :- read.csv("filename.csv")
With this command, the first row is used as the header. If the file has no header row, write instead:
data <- read.csv("datafile.csv", header=FALSE)
How to rename the header/Column name:
names(data) <- c("Column1", "Column2", "Column3")
Now your headers are replaced by Column1, Column2 and Column3.
To change the data in Column1, assign a new vector of values to it:
data$Column1 <- c(write down set of values with which you want to replace)
To see the output, type:
data
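Put together, the steps might look like the following sketch. It assumes a tiny inline CSV (via the text argument of read.csv) instead of a file on disk, and the entries are placeholders; only the matching values in Column1 are replaced, rather than the whole column:

```r
# Inline stand-in for a header-less CSV file
csv_text <- "oldEntryX,1\nother,2\noldEntryX,3"

# header = FALSE keeps the first line as data; columns are named V1, V2
data <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)

# Rename the columns
names(data) <- c("Column1", "Column2")

# Replace one specific entry wherever it occurs in Column1
data$Column1[data$Column1 == "oldEntryX"] <- "newEntryX"

data
```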