Fastest route to last key of dict in Tcl - dictionary

Let's say that I have a Tcl dictionary. I want the fastest way to get the last entered key of the dict (not the value).
Theoretically, I could use:lindex [ dict keys $dict ] end Does anyone know anything else, which is faster?

This is pretty fast:
% set data {a 1 b 2 c 3}
a 1 b 2 c 3
% proc p1 {} {dict get $::data [lindex [dict keys $::data] end]}
% time {p1} 100000
1.87782 microseconds per iteration
But you can shave off about a microsecond by remembering the last key:
oo::object create mydict
oo::objdefine mydict {
variable data last
method add args {
lappend data {*}$args
set last [lindex [dict keys $data] end]
}
method getLast {} {
dict get $data $last
}
}
% mydict add a 1 b 2 c 3
c
% time {mydict getLast} 100000
0.82731 microseconds per iteration
Documentation:
create (method of oo::class),
dict,
lappend,
lindex,
method (object configuration subcommand),
proc,
oo::objdefine (object definition command),
oo::object (class of objects),
set,
time,
variable (object slot subcommand),
{*} (syntax)

I'm seeing [lindex $data end-1] to be faster.
tclsh last.tcl
time_list : 1.0693949999999999 microseconds per iteration
time_dict : 279.470543 microseconds per iteration
last.tcl
set data [dict create]
for {set i 0} {$i < 10000} {incr i} {
dict set data "key_$i" $i
}
set time_dict [time {
set last_key [lindex [dict keys $data] end]
} 1000]
set time_list [time {
set last_key [lindex $data end-1]
} 1000]
puts "time_list : $time_list"
puts "time_dict : $time_dict"

Related

Expressing the double summation sequence in Raku

How to express the double variable double summation sequence in Perl 6?
For an example of double variable double summation sequence see this
It must be expressed as is, i.e. without mathematically reducing the double summation into a single summation. Thank you.
The X (cross operator) and the [+] (reduction metaoperator [ ] with additive operator +) make this surprisingly easy:
To represent1 the double summation ∑³x = 1 ∑⁵y = 1 2x + y , you can do the following:
[+] do for 1..3 X 1..5 -> ($x, $y) { 2 * $x + $y }
# for 1..3 X 1..5 # loop cross values
# -> ($x, $y) # plug into x/y
# { 2 * $x + $y } # calculate each iteration
# do # collect loop return vals
# [+] # sum them all
If you wanted to create a sub for this, you could write it as the following2
sub ΣΣ (
Int $aₒ, Int $aₙ, # to / from for the outer
Int $bₒ, Int $bₙ, # to / from for the inner
&f where .arity = 2  # 'where' clause guarantees only two params
) {
[+] do for $aₒ..$aₙ X $bₒ..$bₙ -> ($a, $b) { &f(a,b) }
}
say ΣΣ 1,3, 1,5, { 2 * $^x + $^y }
Or even simplify things more to
sub ΣΣ (
Iterable \a, # outer values
Iterable \b, # inner values
&f where .arity = 2) { # ensure only two parameters
[+] do f(|$_) for a X b
}
# All of the following are equivalent
say ΣΣ 1..3, 1..5, -> $x, $y { 2 * $x + $y }; # Anonymous block
say ΣΣ 1..3, 1..5, { 2 * $^x + $^y }; # Alphabetic args
say ΣΣ 1..3, 1..5, 2 * * + * ; # Overkill, but Whatever ;-)
Note that by typing it, we can ensure ranges are passed, but by typing it as Iterable rather than Range we can allow more interesting summation sequences, like, say, ΣΣ (1..∞).grep(*.is-prime)[^99], 1..10, { … } that would let us use sequence of the first 100 primes.
In fact, if we really wanted to, we could go overboard, and allow for an arbitrary depth summation operator, which is made easiest by moving the function to the left:
sub ΣΣ (
&function,
**#ranges where # slurp in the ranges
.all ~~ Iterable && # make sure they're Iterables
.elems == &function.arity # one per argument in the function
) {
[+] do function(|$_) for [X] #ranges;
};
Just like [+] sums up all the values of our f() function, [X] calculates the cross iteratively, e.g., [X] 0..1, 3..4, 5..6 first does 0..1 X 3..4 or (0,3),(0,4),(1,3),(1,4), and then does (0,3),(0,4),(1,3),(1,4) X 5..6, or (0,3,5),(0,4,5),(1,3,5),(1,4,5),(0,3,6),(0,4,6),(1,3,6),(1,4,6).
1. Sorry, SO doesn't let me do LaTeX, but you should get the idea. 2. Yes, I know that's a subscript letter O not a zero, subscript numbers aren't valid identifiers normally, but you can use Slang::Subscripts to enable them.

Recursive TCL proc without explicit return

I am writting a simple "proc" to calculate the factorial. I would like to understand why my function does not work without the return statement.
According to TCL docs, functions that are defined without explicit "return",
return the value of the last executed command in its body.
proc fac { n } {
if { $n == 1 } {
return 1
}
puts $n
set n [expr {$n - 1}]
return [expr {[fac $n ] * $n}]
}
puts [fac 5] # ans 24
When the "return" is removed, I get the following error message:
invalid command name "1"
while executing
"[expr {[fac $n ] * $n}] "
(procedure "fac" line 7)
invoked from within
I expected that without the explicit "return", the function should return 24 as well.
Your expectation is correct. But you have square brackets around expr procedure in the last line. It is:
[expr {[fac $n] * $n}]
This means for the interpreter: 1) execute expr procedure with given argument; 2) execute the result of expr procedure. Because of this, the interpreter tries to execute procedure 1 that doesn't exist and you receive an error.
To fix this error - just remove square brackets from the last line:
proc fac { n } {
if { $n == 1 } {
return 1
}
puts $n
set n [expr {$n - 1}]
expr {[fac $n ] * $n}
}

Perl's Hash of Hashes equivalent implementation for dict of dicts in Tcl

I have a very large file that contains data like below:
*1 RES L1 N1 0.32
*22 RES L2 N2 0.64
*100 CAP A1 B1 0.3
*200 CAP A2 B1 0.11
*11 IND K1 K2 0.002
*44 IND X1 Y1 0.00134
... and so on
For such files (let us assume the above data is in a file called "example.txt"), I can easily create a Hash of Hashes in Perl and pass these nested Hashes to otherr parts of my Perl program:
#!/usr/bin/perl
use strict;
use warnings;
open(FILE,"<", "example.txt") or die "Cannot open file:$!";
if (-f "example.txt") {
while(<FILE>) {
chomp;
if(/^\s*(\S+)\s+(RES|CAP|IND)\s+(\S+)\s+(\S+)\s+(\S+)\s*$/) {
$hoh{$1}{$2}{$3}{$4} = $5;
}
}
close FILE;
}
What is a similar way to create a Tcl Hash of Hashes (or rather Dictionary of Dictionaries)?
I tried a small piece of code setting the dict like below (not printing the full code here, to keep focus on the problem):
...
set dod [dict create]
if [regexp {^\s*(\S+)\s+(RES|CAP|IND)\s+(\S+)\s+(\S+)\s+(\S+)\s*$} $line all id type x y elemValue] {
dict set dod $id $type $x $y $elemValue
}
But that does not seem to work. I tested it like below:
foreach id [dict keys $dod] {
if [dict exists $dod "RES"] {
puts "RES KEY EXISTS"
} else {
puts "RES KEY NOT FOUND"
}
}
Thanks.
Your immediate problem is a stray slash in the beginning of the regular expression.
To answer the question: a multi-key dictionary is a "hash of hashes". Every key adds a new level of dictionaries.
dict set foo aa bb cc 1
sets the member {cc 1} in a dictionary which is the value of the member {bb ...} in the dictionary which is the value of the member {aa ...} in foo.
If you don't want a multi-level dictionary and still need to use several key values, you need to do:
dict set foo [list aa bb cc] 1
Also, I don't know how much is simplified away in your example, but the code to add an item could be better stated as:
if {[lindex $line 1] in {RES CAP IND}} {
dict set dod {*}$line
}
But if you want to check existence by e.g. "RES", you need to set it as the top-level key, which you don't in your example (the items in the first column become top-level keys). Initializing as above, the value of dod is
*1 {RES {L1 {N1 0.32}}} *22 {RES {L2 {N2 0.64}}} *100 {CAP {A1 {B1 0.3}}} *200 {CAP {A2 {B1 0.11}}} *11 {IND {K1 {K2 0.002}}} *44 {IND {X1 {Y1 0.00134}}}
so you do get a dictionary, but dict exists $dod RES is still necessarily false. By using
if {[lindex $line 1] in {RES CAP IND}} {
dict set dod {*}[lrange $line 1 end]
}
(i.e. all the items in the line after the first as keys, except the last which becomes the value) you get the dictionary
RES {L1 {N1 0.32} L2 {N2 0.64}} CAP {A1 {B1 0.3} A2 {B1 0.11}} IND {K1 {K2 0.002} X1 {Y1 0.00134}}
in which you can test for the existence of "RES".
Going back to the dict-of-dicts
*1 {RES {L1 {N1 0.32}}} *22 {RES {L2 {N2 0.64}}} *100 {CAP {A1 {B1 0.3}}} *200 {CAP {A2 {B1 0.11}}} *11 {IND {K1 {K2 0.002}}} *44 {IND {X1 {Y1 0.00134}}}
you can check for "RES" by examining each of the sub-dictionaries until you find one that has that key:
set found 0
dict for {key subdict} $dod {
if {[dict exists $subdict RES]} {
set found 1
break
}
}
Documentation:
dict
Not exactly same but somewhat similar:
set data "*1 RES L1 N1 0.32
*22 RES L2 N2 0.64
*100 CAP A1 B1 0.3
*200 CAP A2 B1 0.11
*11 IND K1 K2 0.002
*44 IND X1 Y1 0.00134
"
set pattern {\s*(\S+)\s+(RES|CAP|IND)\s+(\S+)\s+(\S+)\s+(\S+)?\s*$}
set result [regexp -all -line -inline -- $pattern $data]
if {[llength $result] == 0} {
puts "Not found"
exit 1
}
array set my_data {}
foreach {all ind_0 ind_1 ind_2 ind_3 ind_4} $result {
set my_data($ind_0)($ind_1)($ind_2)($ind_3) $ind_4
}
puts [parray my_data]
Sample output:
my_data(*1)(RES)(L1)(N1) = 0.32
my_data(*100)(CAP)(A1)(B1) = 0.3
my_data(*11)(IND)(K1)(K2) = 0.002
my_data(*200)(CAP)(A2)(B1) = 0.11
my_data(*22)(RES)(L2)(N2) = 0.64
my_data(*44)(IND)(X1)(Y1) = 0.00134

TCL recursively call procedure

I'm a beginner at TCL and while trying to build the GCD algorithm I ran into some problems I'd like some help with:
how can I call a proc inside a proc recursively like so
proc Stein_GCD { { u 0 } { v 0 } } {
if { $v == 0 } {
puts "$u\t\t$v\t\t$v"
}
if { [expr { $v % 2 } && { $u % 2 } ] == 0 } {
return [expr 2 * ${Stein_GCD 1 0} ]
}
}
set a [Stein_GCD 2 2 ]
puts $a
as you can see, I made the proc to evaluate GCD(the code does not make any sense because I'm trying to solve an example issue), and I'm trying to recursively call the proc again to continue evaluating(notice that I made an if statement that can understand the Stein_GCD 1 0 call, yet the tcl 8.6.6 online EDA emulator says:
can't read "Stein_GCD 1 0": no such variable
while executing
"expr 2 * ${Stein_GCD 1 0} "
(procedure "Stein_GCD" line 5)
invoked from within
"Stein_GCD 2 2 "
invoked from within
"set a [Stein_GCD 2 2 ]"
(file "main.tcl" line 7)
Can you tell me how to efficiently recursively call a proc, and where was my mistake?
will gladly provide more info in the case I did a bad job at explaining.
The error can't read "Stein_GCD 1 0": indicates that you are treating the data as a single string instead of separate arguments. The problem line:
return [expr 2 * ${Stein_GCD 1 0} ]
is not written correctly. ${Stean_GCD 1 0} is not a variable.
You should have:
return [expr 2 * [Stein_GCD 1 0] ]
You want the result from Stein_GCD 1 0, so the brackets should be used.

Tcl error : wrong # args: should be "set varName ?newValue?"

I tried to run the following Tcl script and got the error: wrong # args: should be "set varName ?newValue?"
What does this mean?
Note: The script includes terms specific to VMD program, such as mol and resid. Please disregard them.
#count water molecules between chain A and chain C or between #chain B and chain C
set input_file [open ./name_3_pdb_chain_renamed.dat r]
set data [read $input_file]
set data [split $data "\n"]
close $input_file
set chain_list [lindex $data 0]
cd 7_count_water
set outfile [open count_water3.dat w]
set chain_compare ""
set pdblen [llength $chain_list]
for {set i 0} {$i<$pdblen} {incr i} {
set pid [lindex [lindex $chain_list $i] 0]
set len [llength [lindex $chain_list $i]]
mol load pdb ../2_chain_rename/${pid}_chain_revised.pdb
mol modstyle 0 top NewCartoon
if {$len==4} {
set chain_compare [lappend chain_compare $pid]
}
set 11 [atomselect top all]
set mid [$11 molid]
mol delete $mid
}
set lll [llength $chain_compare]
for {set j 0} {$j< $lll} {incr j} {
set pid [lindex $chain_compare $j]
mol load pdb ../2_chain_rename/${pid}_chain_revised.pdb
set 11 [atomselect top "chain A and name CA"]
set res_len [llength [$11 get resid]]
set res_id [$11 get resid]
#residue length for chain C
set ag [atomselect top "chain C and name CA"]
set ag_len [llength [$ag get resid]]
set ag_id [$ag get resid]
#loop water between chain A and chain C
for {set k 0} {$k<$res_len} {incr k} {
set water_around_a [atomselect top "{resname HOH and {within 5.0 of {chain A and resid [lindex $res_id $k]} and {within 5.0 of chain C}}} "]
set water_around_a_resid [$water_around_a get resid]
set water_around_a_resname [$water_around_a get resname]
#loop antigen residues around water
for {set g 0} {$g < $ag_len} {incr g} {
set ag_around_water [atomselect top "{chain C and resid [lindex $ag_id $g] and {within 5.0 of {resname HOH and {within 5.0 of {chain A and resid [lindex $res_id $k]}}}}} "]
set ag_around_water resid [$ag_around_water get resid]
set ag_around_water_resname [$ag_around_water get resname]
puts $outfile "$pid [lindex $res_id $k] [lindex [$11 get resname] $k] $ag_around_water_resname A: $water_around_a_resname"
}
}
set b11 [atomselect top "chain B and name CA"]
set b_res_len [llength [$b11 get resid]]
set b_res_id [$b11 get resid]
#residue length for chain C
set ag [atomselect top "chain C and name CA"]
set ag_len [llength [$ag get resid]]
set ag_id [$ag get resid]
for {set k 0} {$k<$res_len} {incr k} {
set water_around_b [atomselect top "{resname HOH and {within 5.0 of {chain B and resid [lindex $b_res_id $k]} and {within 5.0 of chain C}}} "]
set water_around_b_resid [$water_around_b get resid]
set water_around_b_resname [$water_around_b get resname]
#loop antigen residues around water
for {set g 0} {$g < $ag_len} {incr g} {
set ag_around_water [atomselect top "{chain C and resid [lindex $ag_id $g] and {within 5.0 of {resname HOH and {within 5.0 of {chain B and resid [lindex $b_res_id $k]}}}}} "]
set ag_around_water resid [$ag_around_water get resid]
set ag_around_water_resname [$ag_around_water get resname]
puts $outfile "$pid [lindex $b_res_id $k] [lindex [$b11 get resname] $k] $ag_around_water_resname A: $water_around_b_resname"
}
}
}
close $outfile
cd ..
Thank you
That message:
wrong # args: should be "set varName ?newValue?"
is a standard error thrown when a built-in command gets the wrong number of arguments to evaluate. In this case, it's coming from the set command, and indicates that you've either said set on its own, or given more than two further arguments to it.
If you examine the stack trace (usually printed with the error message when using standard tclsh, though it's changeable with user code) then you'll get told where the problem happened. However, in this case we can look through and see that this line near the bottom of the script:
set ag_around_water resid [$ag_around_water get resid]
has what appears to be a space instead of an underscore in the variable name. Now, spaces are legal in variable names, but then the variable name needs to be quoted, and that can get a bit annoying. It's usually best to avoid using them like that. Without quoting, Tcl doesn't know that that's meant to be one word; the generic parsing layer decides there's really four words there (set, ag_around_water, resid and the complex [$ag_around_water get resid]) and tells set to deal with that, which it doesn't like.
Remember, Tcl's generic syntactic parsing happens first, before command arguments are interpreted semantically. Always.
The line set ag_around_water resid [$ag_around_water get resid] needs to be changed. You probably want ag_around_water_resid instead.

Resources