I have written an R function to write some SAS code (yes, I know, I am lazy), and it prints text as follows:
proc_sql_missing <- function(cols, data){
cat("\nproc sql;")
cat(paste0("\n\tselect\tPat_TNO,\n\t\t\t", paste(names(data)[cols], collapse = ",\n\t\t\t")))
cat("\n\tfrom", substitute(data))
cat(paste0("\n\twhere\t", paste(names(data)[cols], collapse = "= . OR\n\t\t\t"), "= .;"))
cat("\nquit;")
}
Which when called prints something similar to the following to the R output:
proc sql;
select Pat_TNO,
fuq_pa_enjoy_hate,
fuq_pa_bored_interest,
fuq_pa_like_dislike,
fuq_pa_pleasble_unpleasble,
fuq_pa_absorb_notabsorb,
fuq_pa_fun_notfun,
fuq_pa_energizing_tiring,
fuq_pa_depress_happy,
fuq_pa_pleasant_unpleast,
fuq_pa_good_bad,
fuq_pa_invigor_notinvigor,
fuq_pa_frustrate_notfrust,
fuq_pa_gratifying_notgrat,
fuq_pa_exhilarate_notexhil,
fuq_pa_stimulate_notstim,
fuq_pa_accom_notaccom,
fuq_pa_refresh_notrefresh,
fuq_pa_doing_notdoing
from followup
where fuq_pa_enjoy_hate= . OR
fuq_pa_bored_interest= . OR
fuq_pa_like_dislike= . OR
fuq_pa_pleasble_unpleasble= . OR
fuq_pa_absorb_notabsorb= . OR
fuq_pa_fun_notfun= . OR
fuq_pa_energizing_tiring= . OR
fuq_pa_depress_happy= . OR
fuq_pa_pleasant_unpleast= . OR
fuq_pa_good_bad= . OR
fuq_pa_invigor_notinvigor= . OR
fuq_pa_frustrate_notfrust= . OR
fuq_pa_gratifying_notgrat= . OR
fuq_pa_exhilarate_notexhil= . OR
fuq_pa_stimulate_notstim= . OR
fuq_pa_accom_notaccom= . OR
fuq_pa_refresh_notrefresh= . OR
fuq_pa_doing_notdoing= .;
quit;
Is there any way I can automatically copy this text to the clipboard so I can paste straight into SAS?
Use capture.output to capture the output:
output <- capture.output(proc_sql_missing(cols, data))
Then you can use clipr::write_clip(output), or writeClipboard(output) on Windows.
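As a minimal sketch of that approach: print_query below is a hypothetical stand-in for the function in the question, and the clipboard write is guarded so the snippet also runs on machines without clipr or without a usable clipboard.

```r
# Hypothetical stand-in for the question's function: it only prints with cat()
print_query <- function(cols) {
  cat("proc sql;\n")
  cat("\tselect ", paste(cols, collapse = ", "), "\n", sep = "")
  cat("quit;\n")
}

# capture.output() returns a character vector, one element per printed line
output <- capture.output(print_query(c("mpg", "cyl")))

# Copy to the clipboard only if clipr is installed and a clipboard is usable
if (requireNamespace("clipr", quietly = TRUE) && clipr::clipr_available()) {
  clipr::write_clip(output)
}
```

The function itself stays unchanged here, so it can still be used for plain printing.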
On Windows, you can specify file = "clipboard" as an argument to cat, so if you change your function to
proc_sql_missing <- function(cols, data) {
cat(
paste(
"\nproc sql;",
paste0("\n\tselect\tPat_TNO,\n\t\t\t",
paste(names(data)[cols], collapse = ",\n\t\t\t")),
paste("\n\tfrom", substitute(data)),
paste0("\n\twhere\t",
paste(names(data)[cols], collapse = "= . OR\n\t\t\t"),
"= .;"),
"\nquit;",
sep = ""),
file = "clipboard"
)
}
then when you run, say,
proc_sql_missing(1:3, mtcars)
your clipboard will contain
proc sql;
select Pat_TNO,
mpg,
cyl,
disp
from mtcars
where mpg= . OR
cyl= . OR
disp= .;
quit;
I have a data frame and I want to extract a specific string from one of its columns, which is split by a delimiter, but there are several conditions. I want to mutate a new column that contains only the COSVxxxx strings.
df:
ID
.
COSV50419740
.
.
.
rs375210814
.
rs114284775;COSV60321424
.
.
.
rs67376798;88974
rs1169783812
rs56386506;51676;COSV66451617
rs80358907;52202
.
.
.
482972
629301
COSV66463357
rs80358408;51066
rs80358420;51100;COSV66464432
desired df:
ID COSV.ID
. .
COSV50419740 COSV50419740
. .
. .
. .
rs375210814 rs375210814
. .
rs114284775;COSV60321424 COSV60321424
. .
. .
. .
rs67376798;88974 rs67376798;88974
rs1169783812 rs1169783812
rs56386506;51676;COSV66451617 COSV66451617
rs80358907;52202 rs80358907;52202
. .
. .
. .
482972 482972
629301 629301
COSV66463357 COSV66463357
rs80358408;51066 rs80358408;51066
rs80358420;51100;COSV66464432 COSV66464432
I want to keep the original string if there is no COSV annotation. My problem is that some rows contain one to four annotations separated by semicolons. I tried the cSplit function to separate them, but I have no idea how to collect the COSV strings into one column.
You could use sub here, e.g.
df$ID_new <- ifelse(grepl("\\bCOSV\\d+\\b", df$ID),
sub("^.*\\b(COSV\\d+)\\b.*$", "\\1", df$ID),
NA)
This option will assign the (last) COSV value, should it exist in the ID column, otherwise it will assign NA.
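If the goal is instead to keep the original ID when there is no COSV annotation, as in the desired output of the question, note that sub returns its input unchanged when the pattern does not match, so the ifelse wrapper can be dropped. A small sketch on a few of the sample values (the ids vector is made up from the question's data, and perl = TRUE is my addition so that \\b and \\d are handled by the PCRE engine):

```r
# A few IDs taken from the question's sample data
ids <- c(".", "COSV50419740", "rs114284775;COSV60321424",
         "rs67376798;88974", "rs56386506;51676;COSV66451617")

# Non-matching strings come back unchanged; matching ones are
# replaced by the captured COSV identifier
cosv_id <- sub("^.*\\b(COSV\\d+)\\b.*$", "\\1", ids, perl = TRUE)
```

Applied to the data frame, this would be df$COSV.ID <- sub(..., df$ID, perl = TRUE).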
I need to execute a Perl program as part of a larger R program.
The R code generates a series of output files with different extensions, for instance .out or .lis.
I have a Perl program that converts those files to CSV.
I've seen Perl commands executed from R, but nothing with this complexity.
@outfiles = glob( "*.lis" );
foreach $outfile ( @outfiles ) {
print $outfile, "\n";
$outfile =~ /(\S+)lis$/;
$csvfile = $1 . "lis.csv";
print $csvfile, "\n";
open( OUTFILE, "$outfile" ) || die "ERROR: Unable to open $outfile\n";
open( CSVFILE, ">$csvfile" ) || die "ERROR: Unable to open $csvfile\n";
$lineCnt = 0;
while ( $outline = <OUTFILE> ) {
chomp( $outline );
$lineCnt++;
$outline =~ s/^\s+//; # Remove whitespace at the beginning of the line
if ( $lineCnt == 1 ) {
$outline =~ s/,/\./g; # Replace all the commas with periods in the hdr line
}
$outline =~ s/\s+/,/g; # Replace remaining whitespace delimiters with a comma
print CSVFILE "$outline\n";
}
close( OUTFILE );
close( CSVFILE );
}
Is there any way I can integrate the Perl code into my R code? I could develop an R program that does the same. But I wouldn't know where to start to convert a .lis or .out file to .csv.
Call it by using R's system call:
my.seed <- as.numeric(try(system(" perl -e 'print int(rand(1000000))'", intern = TRUE))) #get random number :D
However, I must agree with @ikegami, there are better modules to handle CSV data.
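If you would rather not depend on Perl at all, the script's three substitutions translate fairly directly to base R. This is only a sketch that mirrors the Perl logic line by line; lis_to_csv is a name I made up, and it assumes the input files are whitespace-delimited text, as the Perl script does.

```r
# Convert one whitespace-delimited file to CSV, mirroring the Perl script:
# foo.lis is written to foo.lis.csv
lis_to_csv <- function(path) {
  lines <- readLines(path)
  # Remove whitespace at the beginning of each line
  lines <- sub("^\\s+", "", lines)
  # Replace commas with periods in the header line only
  if (length(lines) > 0) {
    lines[1] <- gsub(",", ".", lines[1], fixed = TRUE)
  }
  # Replace the remaining whitespace delimiters with a comma
  lines <- gsub("\\s+", ",", lines)
  csv_path <- paste0(path, ".csv")
  writeLines(lines, csv_path)
  invisible(csv_path)
}
```

To process every file in the working directory, something like for (f in Sys.glob("*.lis")) lis_to_csv(f) should do. If you prefer to keep the Perl script, saving it to a file (say convert.pl, a hypothetical name) and calling system2("perl", "convert.pl") from R is another option.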
I think I've tried every intuitive combination of these four commands, with and without colons, in my _vimrc:
syntax enable
syntax on
set filetype=r
set syntax=r
But when I open a script in gVim, it's all one solid color. Within a session, both ':set syntax=r' and ':set filetype=r' work fine, while the other two do nothing.
My full _vimrc is below:
set nocompatible
source $VIMRUNTIME/vimrc_example.vim
source $VIMRUNTIME/mswin.vim
behave mswin
set keymodel-=stopsel
set diffexpr=MyDiff()
function MyDiff()
let opt = '-a --binary '
if &diffopt =~ 'icase' | let opt = opt . '-i ' | endif
if &diffopt =~ 'iwhite' | let opt = opt . '-b ' | endif
let arg1 = v:fname_in
if arg1 =~ ' ' | let arg1 = '"' . arg1 . '"' | endif
let arg2 = v:fname_new
if arg2 =~ ' ' | let arg2 = '"' . arg2 . '"' | endif
let arg3 = v:fname_out
if arg3 =~ ' ' | let arg3 = '"' . arg3 . '"' | endif
let eq = ''
if $VIMRUNTIME =~ ' '
if &sh =~ '\<cmd'
let cmd = '""' . $VIMRUNTIME . '\diff"'
let eq = '"'
else
let cmd = substitute($VIMRUNTIME, ' ', '" ', '') . '\diff"'
endif
else
let cmd = $VIMRUNTIME . '\diff'
endif
silent execute '!' . cmd . ' ' . opt . arg1 . ' ' . arg2 . ' > ' . arg3 . eq
endfunction
filetype plugin indent on
" show existing tab with 4 spaces width
set tabstop=4
" when indenting with '>', use 4 spaces width
set shiftwidth=4
" On pressing tab, insert 4 spaces
set expandtab
"show line numbers
set number
"syntax highlighting for R
"syntax enable
syntax on
set filetype=r
"set syntax=r
colorscheme elflord
"see commands as they're being typed
set showcmd
"change the key combo for normal mode to 'jk'
inoremap jk <ESC>
"add line below cursor in insert mode
:autocmd InsertEnter * set cul
:autocmd InsertLeave * set nocul
This problem is harder for me than it might sound. I imported a GML file. I now have all of my rows with numbers followed by a ,. I can't figure out how to remove it and make the values numeric. I have tried as.numeric and gsub, but when I build my adjacency matrix I get this output:
[1,] . 1 . . 1 . . . . 1 . . . . . . 1 . . . . . . 1 . . . . . . . . . 1 . 1 . . . ......
[2,] 1 . . . . . . . . . . . . . . . . . . . . . . . . 1 . 1 . . . . . 1 . . . 1 . ......
I need the numbers in [1,] to be real numbers so I can attempt a loop, which I will come back for help with later!
This code doesn't work:
games[0] <- as.numeric(gsub("[^[:digit:]]","",games[0]))
I get this error:
Error in `[<-.igraph`(`*tmp*`, 0, value = numeric(0)) :
Logical or numeric value must be of length 1
Here is the code I have:
library(igraph)
games <- read.graph("football.gml", format="gml")
and I eventually need to be able to run this algorithm:
get.shortest.paths(games, 1, 155, weights = NULL ,output=c("vpath", "epath", "both"))
[1,] is a row with multiple values (one for each column), not a single string. The error does not come from gsub, which is vectorized over character vectors; it is raised by `[<-.igraph` when you assign back into the graph object, and the error message shows that the replacement value is numeric(0). Also note that "[^[:digit:]]" is a regular expression that matches any single non-digit character, so your call strips everything except digits. If your values are in an ordinary n x k matrix, you can loop over each value (or use an apply function) and apply gsub to each one. Here is an example in a loop; fixed = TRUE makes "." match a literal dot rather than any character:
for (i in seq_len(nrow(data))){
for (j in seq_len(ncol(data))){
data[i,j] <- gsub(".", "", data[i,j], fixed = TRUE)
}
}
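As an alternative to the explicit loop, here is a sketch that assumes the values sit in an ordinary character matrix (the matrix m is made up for illustration). gsub accepts the whole matrix at once, and assigning into m[] keeps the dimensions.

```r
# Made-up character matrix with a trailing comma on every value
m <- matrix(c("1,", "2,", "3,", "4,"), nrow = 2)

# gsub works on all elements at once; m[] <- keeps the 2 x 2 shape
m[] <- gsub(",", "", m, fixed = TRUE)

# Convert the cleaned strings to numbers while preserving the dimensions
mode(m) <- "numeric"
```

This only applies to plain matrices; it does not change how an igraph object is indexed.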
Maybe you could do something creative like this:
read.table(text='1 2 3 4 ,
5 6 7 8 ,
9 1 2 3 ,', sep=' ', na.strings=',')
And then drop the last column.
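A runnable version of that idea, including the final step of dropping the last column. The input text is the same toy data as above, written on one line per row; the variable names are mine.

```r
# Toy input: every row ends in a stray comma
raw <- "1 2 3 4 ,\n5 6 7 8 ,\n9 1 2 3 ,"

# The lone comma is read as NA, giving an all-NA fifth column
df <- read.table(text = raw, sep = " ", na.strings = ",")

# Drop the last column and convert to a numeric matrix
df <- df[, -ncol(df)]
m <- as.matrix(df)
```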
I have to extract the value of a variable that occurs multiple times in a file. For example, I have a text file abc.txt that contains a variable named result. Suppose the value of result is 2 in the first line, 55 in the third line, and 66 in the last line.
Then my desired output should be :
result:2,55,66
I am new to Unix, so I could not figure out how to do this. Please help.
The contents of text file can be as follows:
R$#$#%$W%^BHGF, result=2,
fsdfsdsgf
VSDF$TR$R,result=55
fsdf4r54
result=66
Try this :
using awk code :
awk -F'(,| |^)result=' '
/result=/{
gsub(",", "", $2)
v = $2
str = (str) ? str","v : v
}
END{print "result:"str}
' abc.txt
Using perl code :
perl -lane '
push @arr, $& if /\bresult=\K\d+/;
END{print "result:" . join ",", @arr}
' abc.txt
Output :
result:2,55,66