Trouble With Outputting all Elements to Console - r

I'm trying to use sink() to capture my console output to a text file. However, the console keeps truncating my printed output, even though I have set max.print to the maximum integer in R.
I have consulted various other Stack Overflow answers, but to no avail. Has anyone solved this issue?
This is a sample output, despite having changed max.print.
options(max.print = .Machine$integer.max)
> print(outputFile[1])
[[1]]
+ 1681/519133 vertices, named, from 71aeda5:
[1] p_8945206-t_25 p_24353782-t_0 p_5096967-t_0
[4] p_12728438-t_2 p_1914103-t_8 p_7949965-t_59
[7] p_5171435-t_4 p_6628106-t_7 p_2535537-t_0
[10] p_45026190-t_2 p_25504870-t_8 p_796238-t_1
[13] p_135998-t_13 p_20853906-t_1 p_17154085-t_0
[16] p_29505258-t_4 p_27269129-t_13 p_6793896-t_92
[19] p_5331193-t_1 p_11521441-t_2 p_34271996-t_2
[22] p_95594-t_0 p_16395989-t_0 p_582576-t_3
[25] p_9368888-t_1 p_697462-t_28 p_80124-t_72
[28] p_7595644-t_0 p_14372110-t_4 p_2083314-t_2
+ ... omitted several vertices
Additionally, I have tried indexing but it hasn't worked.

igraph-specific options like auto.print.lines should still affect the printing of your graph objects, even if they're contained in a list. Using a combination of auto.print.lines and max.print, I'm able to get graphs to print out in full:
library(purrr)
library(igraph)
# Using purrr to create a list of multiple large graphs
gs = map(1:5, ~ random.graph.game(200, 0.1))
options(max.print = .Machine$integer.max)
igraph_options(auto.print.lines = Inf)
print(gs)
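To tie this back to sink(), here is a minimal base-R sketch of the capture step. It uses a long integer vector as a stand-in for the list of graphs (an assumption, so that it runs without igraph) and a temporary file instead of a real output path. With igraph objects you would also set igraph_options(auto.print.lines = Inf) as above.

```r
# Stand-in data: longer than the default max.print, so it would normally be truncated
big <- seq_len(120000)

out_file <- tempfile(fileext = ".txt")            # hypothetical output location
old_opts <- options(max.print = .Machine$integer.max)

sink(out_file)   # redirect console output to the file
print(big)
sink()           # always restore the console afterwards

options(old_opts)                                 # put max.print back

captured <- readLines(out_file)
length(captured)                                  # number of lines written
```

If the file still ends with an "omitted" notice, the limit being hit is probably a package-level print option rather than max.print.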

Related

How to get the center and scale after using the scale function in R

It seems a silly question, but I have searched online and still have not found a sufficient answer.
My question is: suppose we have a matrix M and we apply the scale() function to it. How can we extract the center and scale of each column with a line of code? I know we can see the centers and scales when printing, but my matrix has lots of columns, so it is cumbersome to do it manually.
Any ideas? Many thanks!
You are looking for the attributes function:
set.seed(1)
mat = matrix(rnorm(1000),,10) # Suppose you have 10 columns
s = scale(mat) # scale your data
attributes(s)#This gives you the means and the standard deviations:
$`dim`
[1] 100 10
$`scaled:center`
[1] 0.1088873669 -0.0378080766 0.0296735350 0.0516018586 -0.0391342406 -0.0445193567 -0.1995797418
[8] 0.0002549694 0.0100772648 0.0040650015
$`scaled:scale`
[1] 0.8981994 0.9578791 1.0342655 0.9916751 1.1696122 0.9661804 1.0808358 1.0973012 1.0883612 1.0548091
These values can also be obtained as:
colMeans(mat)
[1] 0.1088873669 -0.0378080766 0.0296735350 0.0516018586 -0.0391342406 -0.0445193567 -0.1995797418
[8] 0.0002549694 0.0100772648 0.0040650015
sqrt(diag(var(mat)))
[1] 0.8981994 0.9578791 1.0342655 0.9916751 1.1696122 0.9661804 1.0808358 1.0973012 1.0883612 1.0548091
attributes(s) returns a list that you can subset however you want. Alternatively, you can extract each attribute directly with attr():
attr(s,"scaled:center")
[1] 0.1088873669 -0.0378080766 0.0296735350 0.0516018586 -0.0391342406 -0.0445193567 -0.1995797418
[8] 0.0002549694 0.0100772648 0.0040650015
attr(s,"scaled:scale")
[1] 0.8981994 0.9578791 1.0342655 0.9916751 1.1696122 0.9661804 1.0808358 1.0973012 1.0883612 1.0548091
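Once extracted, the two attributes are ordinary numeric vectors, so they can be reused. The sketch below shows two common uses: undoing the scaling, and applying the same centering and scaling to new data. The name new_mat is a hypothetical example matrix, not something from the question.

```r
set.seed(1)
mat <- matrix(rnorm(1000), ncol = 10)
s <- scale(mat)

ctr <- attr(s, "scaled:center")   # column means used by scale()
scl <- attr(s, "scaled:scale")    # column standard deviations used by scale()

# Undo the scaling: multiply each column by its sd, then add its mean back
restored <- sweep(sweep(s, 2, scl, `*`), 2, ctr, `+`)

# Apply the same transformation to new data (new_mat is hypothetical)
new_mat <- matrix(rnorm(50), ncol = 10)
new_scaled <- scale(new_mat, center = ctr, scale = scl)
```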

str_split on first and second occurrence of delimiter at different locations in character vector

I have a character list that has weather variables followed by "mean_#" where # is a number between 5 and 10. I want to subset the list to only have the weather variable names themselves. The mean weather variables look like this:
> mean_vars
[1] "dew_mean_10" "dew_mean_5" "dew_mean_6" "dew_mean_7"
[5] "dew_mean_8" "dew_mean_9" "humid_mean_10" "humid_mean_5"
[9] "humid_mean_6" "humid_mean_7" "humid_mean_8" "humid_mean_9"
[13] "rain_mean_10" "rain_mean_5" "rain_mean_6" "rain_mean_7"
[17] "rain_mean_8" "rain_mean_9" "soil_moist_mean_10" "soil_moist_mean_5"
[21] "soil_moist_mean_6" "soil_moist_mean_7" "soil_moist_mean_8" "soil_moist_mean_9"
[25] "soil_temp_mean_10" "soil_temp_mean_5" "soil_temp_mean_6" "soil_temp_mean_7"
[29] "soil_temp_mean_8" "soil_temp_mean_9" "solar_mean_10" "solar_mean_5"
[33] "solar_mean_6" "solar_mean_7" "solar_mean_8" "solar_mean_9"
[37] "temp_mean_10" "temp_mean_5" "temp_mean_6" "temp_mean_7"
[41] "temp_mean_8" "temp_mean_9" "wind_dir_mean_10" "wind_dir_mean_5"
[45] "wind_dir_mean_6" "wind_dir_mean_7" "wind_dir_mean_8" "wind_dir_mean_9"
[49] "wind_gust_mean_10" "wind_gust_mean_5" "wind_gust_mean_6" "wind_gust_mean_7"
[53] "wind_gust_mean_8" "wind_gust_mean_9" "wind_spd_mean_10" "wind_spd_mean_5"
[57] "wind_spd_mean_6" "wind_spd_mean_7" "wind_spd_mean_8" "wind_spd_mean_9"
And this is all I want at the end:
> var_names
"dew" "humid" "rain" "solar" "temp" "soil_moist" "soil_temp" "wind_dir" "wind_gust" "wind_spd"
I have figured out how to do it, but I feel my method is more convoluted than it needs to be because of my limited ability with regular expressions. I will also have to repeat this process 20 times, substituting "mean" with other words.
var_names <- unique(str_split_fixed(mean_vars, "_", n = 3)[c(1:18,31:42),1])
var_names <- unlist(c(var_names, unique(unite(as_tibble(str_split_fixed(mean_vars, "_", n = 3)[c(19:30,43:60), 1:2])))))
I've been trying to stay within the realm of the tidyverse packages as much as possible so I was using stringr::str_split_fixed.
If you have a solution using this same function that would be ideal as I could continue the same programming style, but I'm open to all suggestions.
Thanks.
Use sub and unique. This is shorter and has no package dependencies (or use unique(str_replace(mean_vars, "_mean.*", "")) with stringr):
unique(sub("_mean.*", "", mean_vars))
giving:
[1] "dew" "humid" "rain" "soil_moist" "soil_temp"
[6] "solar" "temp" "wind_dir" "wind_gust" "wind_spd"
If for some reason you really want to use str_split then:
rmMean <- function(x) paste(head(x, -2), collapse = "_")
unique(sapply(str_split(mean_vars, "_"), rmMean))
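Since the question mentions repeating this for about 20 other words in place of "mean", the sub approach can be wrapped in a small helper. This is only a sketch: the function name strip_stat_suffix and the statistic names "min" and "max" are assumptions for illustration, as the question does not list the other words.

```r
# Remove a trailing "_<stat>_<number>" from each name and keep the unique stems.
# `stats` holds the words that can appear in place of "mean" (hypothetical here).
strip_stat_suffix <- function(x, stats = c("mean", "min", "max")) {
  pattern <- paste0("_(", paste(stats, collapse = "|"), ")_[0-9]+$")
  unique(sub(pattern, "", x))
}

vars <- c("dew_mean_10", "soil_moist_mean_5", "soil_moist_max_7",
          "wind_dir_min_9", "dew_max_6")
strip_stat_suffix(vars)
```

Anchoring the pattern at the end of the string with $ keeps variable names that themselves contain underscores, such as soil_moist, intact.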
Note
mean_vars <- c("dew_mean_10", "dew_mean_5", "dew_mean_6", "dew_mean_7", "dew_mean_8",
"dew_mean_9", "humid_mean_10", "humid_mean_5", "humid_mean_6",
"humid_mean_7", "humid_mean_8", "humid_mean_9", "rain_mean_10",
"rain_mean_5", "rain_mean_6", "rain_mean_7", "rain_mean_8", "rain_mean_9",
"soil_moist_mean_10", "soil_moist_mean_5", "soil_moist_mean_6",
"soil_moist_mean_7", "soil_moist_mean_8", "soil_moist_mean_9",
"soil_temp_mean_10", "soil_temp_mean_5", "soil_temp_mean_6",
"soil_temp_mean_7", "soil_temp_mean_8", "soil_temp_mean_9", "solar_mean_10",
"solar_mean_5", "solar_mean_6", "solar_mean_7", "solar_mean_8",
"solar_mean_9", "temp_mean_10", "temp_mean_5", "temp_mean_6",
"temp_mean_7", "temp_mean_8", "temp_mean_9", "wind_dir_mean_10",
"wind_dir_mean_5", "wind_dir_mean_6", "wind_dir_mean_7", "wind_dir_mean_8",
"wind_dir_mean_9", "wind_gust_mean_10", "wind_gust_mean_5", "wind_gust_mean_6",
"wind_gust_mean_7", "wind_gust_mean_8", "wind_gust_mean_9", "wind_spd_mean_10",
"wind_spd_mean_5", "wind_spd_mean_6", "wind_spd_mean_7", "wind_spd_mean_8",
"wind_spd_mean_9")

R: Extract words from a website

I am attempting to extract all words that start with a particular phrase from a website. The website I am using is:
http://docs.ggplot2.org/current/
I want to extract all the words that start with "stat_". I should get 21 names like "stat_identity" in return. I have the following code:
stats <- readLines("http://docs.ggplot2.org/current/")
head(stats)
grep("stat_{1[a-z]", stats, value=TRUE)
I am returned every line containing the phrase "stat_". I just want to extract the "stat_" words. So I tried something else:
gsub("\b^stat_[a-z]+ ", "", stats)
I think the output I got was an empty string, " ", where a "stat_" phrase would be? So now I'm trying to think of ways to extract all the text and set everything that is not a "stat_" phrase to empty strings. Does anyone have any ideas on how to get my desired output?
rvest & stringr to the rescue:
library(xml2)
library(rvest)
library(stringr)
pg <- read_html("http://docs.ggplot2.org/current/")
unique(str_match_all(html_text(html_nodes(pg, "body")),
"(stat_[[:alnum:]_]+)")[[1]][,2])
## [1] "stat_bin" "stat_bin2dCount"
## [3] "stat_bindot" "stat_binhexBin"
## [5] "stat_boxplot" "stat_contour"
## [7] "stat_density" "stat_density2d"
## [9] "stat_ecdf" "stat_functionSuperimpose"
## [11] "stat_identity" "stat_qqCalculation"
## [13] "stat_quantile" "stat_smooth"
## [15] "stat_spokeConvert" "stat_sum"
## [17] "stat_summarySummarise" "stat_summary_hexApply"
## [19] "stat_summary2dApply" "stat_uniqueRemove"
## [21] "stat_ydensity" "stat_defaults"
Unless you need the links (then you can use other rvest functions), this removes all the markup for you and just gives you the text of the website.
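If you would rather stay with readLines and base R, the key is to extract the matches instead of filtering lines (grep) or replacing text (gsub). The sketch below uses regmatches with gregexpr on a few made-up lines that stand in for the downloaded page, so it runs offline; the sample lines are an assumption and not the real page content.

```r
# Made-up stand-in for the lines returned by readLines()
page_lines <- c(
  '<a href="stat_bin.html">stat_bin</a>',
  'no match on this line',
  '<li>stat_summary_hex and stat_identity</li>'
)

# gregexpr finds every match per line; regmatches pulls out the matched text
hits <- regmatches(page_lines,
                   gregexpr("\\bstat_[a-z0-9_]+", page_lines, perl = TRUE))
stat_names <- unique(unlist(hits))
stat_names
```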

R - get values from multiple variables in the environment

I have some variables in my current R environment:
ls()
[1] "clt.list" "commands.list" "dirs.list" "eq" "hurs.list" "mlist" "prec.list" "temp.list" "vars"
[10] "vars.list" "wind.list"
where each one of the variables "clt.list", "hurs.list", "prec.list", "temp.list" and "wind.list" is a (huge) list of strings.
For example:
clt.list[1:20]
[1] "clt_Amon_ACCESS1-0_historical_r1i1p1_185001-200512.nc" "clt_Amon_ACCESS1-3_historical_r1i1p1_185001-200512.nc"
[3] "clt_Amon_bcc-csm1-1_historical_r1i1p1_185001-201212.nc" "clt_Amon_bcc-csm1-1-m_historical_r1i1p1_185001-201212.nc"
[5] "clt_Amon_BNU-ESM_historical_r1i1p1_185001-200512.nc" "clt_Amon_CanESM2_historical_r1i1p1_185001-200512.nc"
[7] "clt_Amon_CCSM4_historical_r1i1p1_185001-200512.nc" "clt_Amon_CESM1-BGC_historical_r1i1p1_185001-200512.nc"
[9] "clt_Amon_CESM1-CAM5_historical_r1i1p1_185001-200512.nc" "clt_Amon_CESM1-CAM5-1-FV2_historical_r1i1p1_185001-200512.nc"
[11] "clt_Amon_CESM1-FASTCHEM_historical_r1i1p1_185001-200512.nc" "clt_Amon_CESM1-WACCM_historical_r1i1p1_185001-200512.nc"
[13] "clt_Amon_CMCC-CESM_historical_r1i1p1_190001-190412.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_190001-200512.nc"
[15] "clt_Amon_CMCC-CESM_historical_r1i1p1_190501-190912.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_191001-191412.nc"
[17] "clt_Amon_CMCC-CESM_historical_r1i1p1_191501-191912.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_192001-192412.nc"
[19] "clt_Amon_CMCC-CESM_historical_r1i1p1_192501-192912.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_193001-193412.nc"
What I need to do is extract the subset of the string that is between "Amon_" and "_historical".
I can do this for a single variable, as shown here:
levels(as.factor(sub(".*?Amon_(.*?)_historical.*", "\\1", clt.list[1:20])))
[1] "ACCESS1-0" "ACCESS1-3" "bcc-csm1-1" "bcc-csm1-1-m" "BNU-ESM" "CanESM2" "CCSM4"
[8] "CESM1-BGC" "CESM1-CAM5" "CESM1-CAM5-1-FV2" "CESM1-FASTCHEM" "CESM1-WACCM" "CMCC-CESM"
However, what I'd like to do is run the command above for all five variables at once. Instead of using just "clt.list" as the argument in the command above, I'd like to use all of the variables "clt.list", "hurs.list", "prec.list", "temp.list" and "wind.list" at once.
How can I do that?
Many thanks in advance!
You can put your operation into a function and then iterate over it:
get_my_substr <- function(vecname)
levels(as.factor(sub(".*?Amon_(.*?)_historical.*", "\\1", get(vecname))))
lapply(my_vecnames,get_my_substr)
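lapply acts like a loop. You can create your vector of names with the line below, run before the lapply call. The dot is escaped so that it matches a literal "." (an unescaped ".list$" would also match "mlist"). Note that this still picks up "commands.list", "dirs.list" and "vars.list", so filter the result or list the names explicitly if you only want the five data vectors:
my_vecnames <- ls(pattern="\\.list$")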
It is generally good practice to post a reproducible example in your question. Since none was provided here, I tested this approach with...
# example-maker
prestr <- "grr_Amon_"
posstr <- "_historical_zzz"
make_ex <- function()
replicate(
sample(10,1),
paste0(prestr,paste0(sample(LETTERS,sample(5,1)),collapse=""),posstr)
)
# make a couple examples
set.seed(1)
m01 <- make_ex()
m02 <- make_ex()
# test result
lapply(ls(pattern="^m[0-9][0-9]$"),get_my_substr)
One solution would be to create a vector containing the names of the variables that you want to extract the data from, for example:
var.names <- c("clt.list", "commands.list", "dirs.list")
Then to access the value of each variable from the name:
for (var.name in var.names) {
var.value <- as.list(environment())[[var.name]]
# Do something with var.value
}
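A variation on both answers is to collect the vectors into a named list with mget and apply the extraction once. The sketch below uses two short made-up vectors in place of the real (huge) ones; the file names in hurs.list are invented for illustration, and perl = TRUE is added only to make the lazy quantifiers explicit.

```r
# Small stand-ins for the real vectors
clt.list  <- c("clt_Amon_ACCESS1-0_historical_r1i1p1_185001-200512.nc",
               "clt_Amon_CanESM2_historical_r1i1p1_185001-200512.nc")
hurs.list <- c("hurs_Amon_CCSM4_historical_r1i1p1_185001-200512.nc",
               "hurs_Amon_CCSM4_historical_r1i1p1_185001-200512.nc")

vec_names <- c("clt.list", "hurs.list")

# mget() returns a named list of the objects; lapply() keeps those names
models <- lapply(mget(vec_names, envir = environment()), function(x) {
  sort(unique(sub(".*?Amon_(.*?)_historical.*", "\\1", x, perl = TRUE)))
})
models
```

Keeping the result as a named list means models$clt.list or models[["hurs.list"]] gives the model names for one variable without further calls to get.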

R: How to remove quotation marks in a vector of strings, but maintain vector format as to call each individual value?

I want to create a vector of names that act as variable names so I can then use them later on in a loop.
years=1950:2012
for(i in 1:length(years))
{
varname[i]=paste("mydata",years[i],sep="")
}
this gives:
> [1] "mydata1950" "mydata1951" "mydata1952" "mydata1953" "mydata1954" "mydata1955" "mydata1956" "mydata1957" "mydata1958"
[10] "mydata1959" "mydata1960" "mydata1961" "mydata1962" "mydata1963" "mydata1964" "mydata1965" "mydata1966" "mydata1967"
[19] "mydata1968" "mydata1969" "mydata1970" "mydata1971" "mydata1972" "mydata1973" "mydata1974" "mydata1975" "mydata1976"
[28] "mydata1977" "mydata1978" "mydata1979" "mydata1980" "mydata1981" "mydata1982" "mydata1983" "mydata1984" "mydata1985"
[37] "mydata1986" "mydata1987" "mydata1988" "mydata1989" "mydata1990" "mydata1991" "mydata1992" "mydata1993" "mydata1994"
[46] "mydata1995" "mydata1996" "mydata1997" "mydata1998" "mydata1999" "mydata2000" "mydata2001" "mydata2002" "mydata2003"
[55] "mydata2004" "mydata2005" "mydata2006" "mydata2007" "mydata2008" "mydata2009" "mydata2010" "mydata2011" "mydata2012"
All I want to do is remove the quotes and be able to call each value individually.
I want:
>[1] mydata1950 mydata1951 mydata1952 mydata1953, #etc...
stored as a variable such that
varname[1]
> mydata1950
varname[2]
> mydata1951
and so on.
I have played around with
cat(varname[i],"\n")
but this just prints values as one line and I can't call each individual string. And
gsub("'",'',varname)
but this doesn't seem to do anything.
Suggestions? Is this possible in R? Thank you.
There are no quotes in that character vector's values. Use:
cat(varname)
.... if you want to see the unquoted values. The R print mechanism is set to use quotes as a signal to your brain that distinct values are present. You can also use:
print(varname, quote=FALSE)
If there are that many named objects in your workspace, then you need desperately to learn to use lists. There are mechanisms for "promoting" character values to names, but this would be seen as a failure on your part to learn to use the language effectively:
var <- 2
> eval(as.name('var'))
[1] 2
> eval(parse(text="var"))
[1] 2
> get('var')
[1] 2
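As a sketch of the list-based alternative recommended above: build the names in one vectorized call and use them as names of a list, rather than as separate workspace variables. The data frames here are placeholders (an assumption), since the question does not show what each mydataYYYY object contains.

```r
years <- 1950:2012
varname <- paste0("mydata", years)   # vectorized; no loop or pre-allocation needed

# Placeholder content: one small data frame per year (hypothetical)
data_by_year <- setNames(
  lapply(years, function(y) data.frame(year = y, value = NA_real_)),
  varname
)

# Each element is reached by its name, held as an ordinary string
first <- data_by_year[[varname[1]]]
for (nm in varname[1:3]) {
  print(data_by_year[[nm]]$year)
}
```

The quotes shown when printing varname are only display; data_by_year[[varname[i]]] works with the string as it is, so there is nothing to strip.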

Resources