I am trying to conduct a basic bibliometric analysis using biblioshiny. However, since I have data from both Scopus and WoS, I am finding it difficult to combine them. So far I have been able to import both datasets in R, and I have also already merged them, but I can't figure out how to use this combined data as input into the biblioshiny() app.
# Importing WoS and Scopus data individually
library(bibliometrix)
m1 <- convert2df("WOS.txt", dbsource = "wos", format = "plaintext")
m2 <- convert2df("scopus.csv", dbsource = "scopus", format = "csv")
# Merging them and removing duplicate documents
M <- mergeDbSources(m1, m2, remove.duplicated = TRUE)
# Creating the descriptive results
results <- biblioAnalysis(M, sep = ";")
I just need to know how to export the results in a format that biblioshiny accepts as input. Please help!
Put all of the WoS data files (in .txt format) into a zip file and upload that zip file into biblioshiny. That's all you have to do.
Use this command:
library(openxlsx)
write.xlsx(results, file = "mergedfile.xlsx")
It will save the results to a file named mergedfile.xlsx.
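If what you need to load into biblioshiny is the merged collection itself, a minimal sketch is below. It assumes biblioshiny's import/load menu can read a bibliographic data frame saved as .xlsx or .RData, and it exports the merged data frame M rather than the biblioAnalysis() results (the file names are only examples):
library(bibliometrix)
library(openxlsx)
# Export the merged bibliographic data frame M; the file names are examples only
write.xlsx(M, file = "merged_wos_scopus.xlsx")
save(M, file = "merged_wos_scopus.RData")
# Launch the app and load the exported file from its import/load menu
biblioshiny()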
I am trying to retrieve the R output from the environment to either Excel or a CSV file. I ran the following code to extract stock prices from Yahoo Finance and saved them in a new environment. writexl only works on a data frame or a list of data frames. Can I convert the environment's contents to data frames to export to Excel? Any suggestion is greatly appreciated.
library(quantmod)
EQUITY_L <- read.csv("D:/Thesis/Chapter/EQUITY_L.csv", stringsAsFactors = FALSE)
NSE_Symbols <- paste0(EQUITY_L$SYMBOL, ".NS")
# Download each symbol into a separate environment; try() skips symbols that fail
NSE_stocks <- new.env()
sapply(NSE_Symbols, function(x) try(getSymbols(x, env = NSE_stocks), silent = TRUE))
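One way to do this (a sketch, assuming each object stored by getSymbols() is an xts series; the output file name is just an example) is to convert every object in the environment to a data frame and write them as a single workbook, one sheet per symbol:
library(writexl)
# Turn each xts object in NSE_stocks into a data frame, with the dates as a regular column
stock_list <- eapply(NSE_stocks, function(x) data.frame(Date = index(x), coredata(x)))
# write_xlsx() accepts a named list of data frames and writes one sheet per element
write_xlsx(stock_list, path = "NSE_stocks.xlsx")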
I would be very grateful for any guidance on how to use the xltabr package to automatically format tables in R, please:
https://github.com/moj-analytical-services/xltabr
In SPSS, for example, I would apply the relevant weight and then run a crosstab on the raw data, e.g. var1*var2.
How would you go about doing this in R so that the package recognises it and produces the table?
Much appreciated.
First, you need to create or read in the data frame that you want to use.
library(foreign)
dat <- read.spss("mydataframe.sav", to.data.frame = TRUE)
Then you need to put it in the format you want. For your crosstable example, you can do this:
library(reshape2)
# "variable1" and "variable2" stand in for the columns you want to cross-tabulate
ct <- reshape2::dcast(dat, variable1 ~ variable2, fun.aggregate = length)
# depending on what you want in the cells, you can change the fun.aggregate function (e.g. sum or mean)
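For the weighted crosstab mentioned in the question, one option (a sketch, with var1, var2 and weight as placeholder column names) is to sum a weight column instead of counting rows:
# Weighted crosstab: sum the survey weights within each var1 x var2 cell
ct_weighted <- reshape2::dcast(dat, var1 ~ var2, value.var = "weight", fun.aggregate = sum)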
Then you can use the xltabr package to prepare the Excel file by creating a workbook:
wb <- xltabr::auto_crosstab_to_wb(ct)
Then you can save it as an .xlsx file:
library(openxlsx)
openxlsx::saveWorkbook(wb, file = "crosstable.xlsx", overwrite = TRUE)
I hope this helps
I just started using R and I have a question regarding cluster analysis in R.
I applied the agnes function to run a cluster analysis on my dataset, but I realized that the cluster results and the pltrees are different depending on whether I use the .txt file or the .csv file.
Maybe it is easier to explain my problem with the images:
My dataset in .txt format:
I used the following code to read the data into R:
data01 <- read.table("D:/CLUSTER_ANALYSIS/NumericData3_IN.txt", header = T)
and everything looks fine.
I applied the cluster analysis:
library(cluster)
complete1 <- agnes(data01, stand = FALSE, method = 'complete')
plot(complete1, which.plots = 2, main = 'Complete-Linkage')
And here is the pltree:
I did the same steps with the .csv file, which contains exactly the same dataset. Here is the dataset in .csv format:
Again, the cluster analysis for the .csv file:
data02 <- read.csv("D:/CLUSTER_ANALYSIS/NumericData3.csv", header = T)
complete2 <- agnes(data02, stand = FALSE, method = 'complete')
plot(complete2, which.plots=2, main='Complete-Linkage')
And the pltree is completely different.
So the decimal separator in the .txt file is a comma and in the .csv file it is a dot. Which of these results is correct? Is the decimal separator for a numeric dataset in R a comma or a dot?
From the R manual for read.table (and read.csv) you can see that the default decimal separator is a dot for both of the functions you used, so the comma-decimal .txt file was not parsed as numbers correctly. You can set the separator to whatever you like with the "dec" parameter, e.g.:
data01 <- read.table("D:/CLUSTER_ANALYSIS/NumericData3_IN.txt", header = T, dec=",")
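As a side note, read.delim2() and read.csv2() already default to dec = "," (read.csv2() also uses sep = ";"); a sketch, assuming the .txt file is tab-separated:
# read.delim2() defaults to dec = "," and tab-separated fields,
# so this only applies if the .txt file is actually tab-separated
data01 <- read.delim2("D:/CLUSTER_ANALYSIS/NumericData3_IN.txt", header = TRUE)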
I have a large, unorganized XML file that I need to search to determine whether certain ID numbers are in the file. I would like to use R to do this, but because of the format I am having trouble converting it to a data frame or even a list in order to extract it to a CSV. I figured I can search it easily once it is in CSV format. So I need help understanding how to convert and extract it properly, or how to search the document for values using R. Below is the code I have used to try to convert the document, but several errors occur with my various attempts.
## Method 1. I tried to convert to a data frame, but each column is not the same length.
require(XML)
require(plyr)
file<-"EJ.XML"
doc <- xmlParse(file,useInternalNodes = TRUE)
xL <- xmlToList(doc)
data <- ldply(xL, data.frame)
datanew <- read.table(data, header = FALSE, fill = TRUE)
## Method 2. I tried to convert it to a list; the file extracts, but it only lists 2 words from the file.
data<- xmlParse("EJ.XML")
print(data)
head(data)
xml_data<- xmlToList(data)
class(data)
topxml <- xmlRoot(data)
topxml <- xmlSApply(topxml,function(x) xmlSApply(x, xmlValue))
xml_df <- data.frame(t(topxml), row.names = NULL)
write.csv(xml_df, file = "MyData.csv",row.names=FALSE)
I am going to do some research on how to search within R as well, but I assume the file needs to be in a data frame or a list either way. Any help is appreciated! Attached is a screenshot of the data. I am interested in finding entity ID numbers that match a list I have in an Excel doc.
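If the goal is only to check whether certain ID numbers occur in the file, you do not necessarily need a data frame; an XPath query can pull the values directly. A sketch, assuming the IDs are stored in an attribute called "id" (the attribute name and the example IDs are placeholders, since the real structure of EJ.XML is not shown):
library(XML)
doc <- xmlParse("EJ.XML", useInternalNodes = TRUE)
# Collect every "id" attribute anywhere in the document
ids <- xpathSApply(doc, "//*[@id]", xmlGetAttr, "id")
# IDs you are looking for, e.g. read from your Excel list with readxl::read_excel()
wanted <- c("12345", "67890")
intersect(ids, wanted)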
I am using sapply(tk_choose.files) to produce an interactive window where I can choose which .csv files (multiple) to import. I then do some basic data manipulation so that the mean of one particular column can be plotted using ggplot.
So far my code looks something like this:
tfiles <- data.frame(sapply(sapply(tk_choose.files(caption = "Choose T files (hold CTRL to select multiple files)"),
                                   read.table, header = TRUE, sep = ","), c))
rfiles <- data.frame(sapply(sapply(tk_choose.files(caption = "Choose R files (hold CTRL to select multiple files)"),
                                   read.table, header = TRUE, sep = ","), c))
I have then calculated the mean of a particular column for both tfiles and rfiles so that I can plot 100 - tfiles - rfiles.
While this works fine for one set of data, I would now like to import more sets of data, preferably also using sapply(tk_choose.files). Essentially I need to get t/rfiles1, t/rfiles2, ... and repeat the data-manipulation process after that, so that I can plot multiple sets of data. I have no idea how to do this without copying and pasting my code!
Sorry if this is a stupid question, I am very new to R so I am really stuck, your help is greatly appreciated!
Assuming that the files in the working directory are as follows:
all.files<-list.files(pattern="\\.csv")
all.files
[1] "R01.csv" "R02.csv" "R03.csv" "R04.csv" "T01.csv" "T02.csv" "T03.csv" "T04.csv"
Suppose you want tfiles1 to be the merged data of T01 and T02, and tfiles2 to be the merged data of T03 and T04:
t.files <- grep("T", all.files, value = TRUE)  # files whose names contain "T"
t.files
[1] "T01.csv" "T02.csv" "T03.csv" "T04.csv"
t.list <- list(t.files[1:2], t.files[3:4])
library(plyr)
all.T <- lapply(t.list, function(x) ldply(x, read.csv))
for (i in seq_along(all.T)) assign(paste0("tfiles", i), all.T[[i]]) # this will produce tfiles1 and tfiles2 in your R environment
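If the end goal is the mean of one column per merged set, you can also work on the list all.T directly instead of the tfiles1, tfiles2, ... objects; a sketch, with "value" as a placeholder for the column of interest:
# Mean of the (placeholder) "value" column in each merged data frame
t.means <- sapply(all.T, function(d) mean(d$value, na.rm = TRUE))
t.means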