How to randomly shuffle the lines of a file in Python 3? - python-3.6

I'm a beginner Python 3 programmer and I have a file.txt in this format:
> 1.11слова1qwert
> 2.22Предложенный2fghjk
> 3.3текст434lkhh
> 4.798показанkjbhj
> 5.+_lкачества(7^#5
> 6.изучитьязыкQuestionsпрограммированияPython3
I want to write Python 3 code that shuffles the lines of this file.txt into a random order. For example:
> 4.798показанkjbhj
> 1.11слова1qwert
> 5.+_lкачества(7^#5
> 2.22Предложенный2fghjk
> 6.изучитьязыкQuestionsпрограммированияPython3
> 3.3текст434lkhh
Please help me: how do I write this code? Which Python method or function should I use?
P.S. Sorry for my bad English.

There is a random module that you can import.
Here are some hints:
First, read the lines of the file with .readlines().
Then shuffle them with the shuffle function of the random module.
Finally, write the lines back to "file.txt" with .writelines().
Try to work out the code yourself. If you have any questions, just ask.
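For reference, the hints above can be put together roughly as follows. This is only a minimal sketch: the function name shuffle_lines and its seed parameter are made up for illustration, and it assumes the file is UTF-8 encoded and small enough to read into memory.

```python
import random


def shuffle_lines(path, seed=None):
    """Shuffle the lines of the text file at `path` in place.

    `seed` is a hypothetical convenience parameter: pass a value to get a
    reproducible order, or leave it as None for a different order each run.
    """
    # Assumption: the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()

    # If the last line has no trailing newline, add one so that it does not
    # get glued to the following line once the order changes.
    if lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"

    # random.shuffle reorders the list in place; a local Random instance
    # keeps the optional seed from affecting the global random state.
    random.Random(seed).shuffle(lines)

    with open(path, "w", encoding="utf-8") as f:
        f.writelines(lines)
```

Calling shuffle_lines("file.txt") would then overwrite file.txt with the same lines in a random order.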

Related

What are ppearson and spearson in datamash, and how do you use them?

When I look at the documentation, there is no "correlation", but there is "ppearson" and "spearson". They are mentioned exactly once, as a "group-by statistical operation." But how exactly are they defined?
Also, when I try to use one, there is an error message, but I don't understand how to fix it. How do you use ppearson or spearson?
$ cat > foo.tsv
1^I2
2^I3
$ cat foo.tsv | datamash ppearson 1,2
datamash: operation ‘ppearson’ requires field pairs
EDIT: This documentation section says
GNU Datamash is designed to closely follow R project’s (https://www.r-project.org/) statistical functions. See the files/operators.R file for the R equivalent code for each of datamash’s operators. When building datamash from source code on your local computer, operators are compared to known results of the equivalent R functions.
Looking in R, I don't see an spearson:
> ?spearson
No documentation for ‘spearson’ in specified packages and libraries:
you could try ‘??spearson’

Strange error from data.table::fread using sed

I think this is an accurate title but feel free to change it if anyone thinks it can be worded better. I am running the following commands using data.table::fread.
fread("sed 's+0/0+0+g' R.test.txt > R.test.edit.txt")
fread("sed 's+0/1+1+g' R.test.edit.txt > R.test.edit2.txt")
fread("sed 's+1/1+2+g' R.test.edit2.txt > R.test.edit3.txt")
fread("sed 's+./.+0.01+g' R.test.edit3.txt > R.test.edit.final.txt")
After each line I get the following message
Warning messages:
1: In fread("sed 's+0/0+0+g' /R/R.test.small.txt > /R/R.test.edit.small.txt") :
File '/path/to/tmp/RtmpwqJu82/file7e7e250b96bf' has size 0. Returning a NULL data.table.
2: In fread("sed 's+0/1+1+g' /R/R.test.edit.small.txt > /R/R.test.edit2.small.txt") :
File '/path/to/tmp/RtmpwqJu82/file7e7e8456d82' has size 0. Returning a NULL data.table.
3: In fread("sed 's+1/1+2+g' /R/R.test.edit2.small.txt > /R/R.test.edit3.small.txt") :
File '/path/to/tmp/RtmpwqJu82/file7e7e3f96bc35' has size 0. Returning a NULL data.table.
4: In fread("sed 's+./.+0.01+g' /R/R.test.edit3.small.txt > /R/R.test.edit.final.small.txt") :
File '/path/to/tmp/RtmpwqJu82/file7e7e302a3cde' has size 0. Returning a NULL data.table.
So it is weird: fread creates all the files I need when I run it on my laptop, but gives that warning for each file. When I run the script on our cluster, it crashes with the following message.
> fread("sed 's+0/0+0+g' /R/R.test.txt > /R/R.test.edit.txt")
Error in fread("sed 's+0/0+0+g' /R/R.test.txt > /R/R.test.edit.txt") :
File is empty: /dev/shm/file38d161d613c
Execution halted
I think it has to do with the message I get when I run the script on my laptop. I think it is a user issue, but maybe it is a bug. I was wondering if anyone had any ideas? I thought of a workaround using the following:
start_time <- Sys.time()
print(start_time)
peakRAM(system(paste("sed 's+0/0+0+g' /R/R.test.txt > /R/R.test.edit.txt")),
system(paste("sed 's+0/1+1+g' /R/R.test.edit.txt > /R/R.test.edit2.txt")),
system(paste("sed 's+1/1+2+g' /R/R.test.edit2.txt > /R/R.test.edit3.txt")),
system(paste("sed 's+./.+0.01+g' /R/R.test.edit3.txt > /R/R.test.edit.final.txt")))
end_time <- Sys.time()
print(end_time)
And this works fine, so I don't think there is a problem with sed or anything like that. I am just wondering what I am doing wrong when I use fread.
Comments above are correct about what to do; I tried looking in the documentation for fread but didn't find anything helpful for you so I filed an issue to improve... thanks!
When you pass a terminal command to fread, it creates a tmp file for you automatically in the background. You can see the exact line here, stylized:
system(paste0('(', cmd, ') > ', tmpFile <- tempfile(tmpdir = tmpdir)))
Then fread is applied to that file. As mentioned, the file resulting from your command with > tmpFile appended has size 0.
If you actually want to keep those intermediate files (e.g. R.test.edit.txt), you have two options: (1) run the command yourself with system(), e.g. system("sed 's+0/0+0+g' R.test.txt > R.test.edit.txt"), and then run fread on the output file; or (2) [available on development version only for now; see Installation wiki] supply the tmpdir argument to fread and omit the > R.test.edit.txt part; fread will write the output for you.
If you don't actually care about the intermediate files, simply omit the > R.test.edit.txt part and fread should work as you were expecting, e.g.:
fread("sed 's+0/0+0+g' R.test.txt")

In R, how do I access information in a data set using a variable after the $?

I'm using Bioconductor to look at GO terms. I can use, for instance, GOBPANCESTOR$"GO:0060412" to get all the ancestral terms of 0060412. However, I need to loop through many possible terms, and I can't seem to get GOBPANCESTOR$ to accept a variable after the $.
> GOBPANCESTOR$"GO:0060412"
[1] "GO:0003007" "GO:0003205" "GO:0003206" "GO:0003231" "GO:0003279" "GO:0003281" "GO:0007275" "GO:0009653" "GO:0009887" "GO:0007507" "GO:0008150"
[12] "GO:0032501" "GO:0032502" "GO:0044699" "GO:0044707" "GO:0044767" "GO:0048513" "GO:0048731" "GO:0048856" "GO:0060411" "GO:0072358" "GO:0072359"
[23] "all"
But...
> mygoterm <- "GO:0060412"
> GOBPANCESTOR$mygoterm
NULL
Also tried using paste to no avail. I feel like I must be misunderstanding something integral about the way R works...
Thanks for your help!

R grep with 'AND' logic

I'm working with RJDBC on a server whose maintainers frequently update jar versions. Since RJDBC requires classpaths, this poses a problem when paths break. My situation is fortuitous in that the most current jars will always be in the same directory, but the version numbers will have changed.
I'm trying to use a simple grep function in R to isolate which jar I need based on a regex with some AND logic, however R makes this surprisingly difficult...
This question demonstrates how grep in R can use the | operator for OR logic, but I can't seem to find a similar operator for AND logic.
Here's an example:
## Let's say I have three jars in a directory
jars <- list.files('/the/dir')
> jars
[1] "hive-jdbc-1.1.0-cdh5.4.3-standalone.jar" "hive-jdbc-1.1.0-cdh5.4.3.jar" "jython-standalone-2.5.3.jar"
The jar I want is "hive-jdbc-1.1.0-cdh5.4.3-standalone.jar"—how can I use AND logic in grep to extract it?
## I know that OR logic is supported:
j <- jars[grep('hive-jdbc|standalone', jars)]
> j
[1] "hive-jdbc-1.1.0-cdh5.4.3-standalone.jar" "hive-jdbc-1.1.0-cdh5.4.3.jar" "jython-standalone-2.5.3.jar"
## Would AND logic look like the same format?
> jars[grep('hive-jdbc&standalone', jars)]
character(0)
Not all-that-surprisingly, that last piece doesn't work... I found a useful, yet non-comprehensive, link for grep in R, but it doesn't show an AND operator. Any thoughts?
You could try
grep('hive-jdbc.*standalone', jars) # 'hive-jdbc' followed by 'standalone'
or
grepl('hive-jdbc', jars) & grepl('standalone', jars) # 'hive-jdbc' AND 'standalone'

Passing parameters in .jcall

I have just started working with rJava to utilise a host of Java code in an R-based application. I've tried some simple "Hello world" type things, so I know the basic setup is working. I have several issues, but I am hoping they will be resolved if I can solve this basic problem using .jcall.
> cal = new(J("java/util/GregorianCalendar"))
> obj = new(J("au.gov.ips.dataarchive.TIndex"))
> obj$monthlyT(cal)
[1] 77
> .jcall(obj,"I","monthlyT",cal)
Error in .jcall(obj, "I", "monthlyT", cal) :
method monthlyT with signature (Ljava/util/GregorianCalendar;)I not found
To my understanding, the 3rd and 4th lines are equivalent and should produce the same result. Clearly I am doing something wrong. The 'monthlyT' method is defined in the java code as:
static public Integer monthlyT(Calendar month)
I am not a Java expert, so please let me know what other info about the Java objects I might need to provide to answer the question.
cal is a java.util.GregorianCalendar and not a java.util.Calendar. If you want to use the low-level .jcall interface (why?) then you need to do the casting yourself. So something like this:
.jcall(obj,"I","monthlyT",.jcast(cal, "java/util/Calendar" ))
