Emacs lisp list processing with spaces in list items - recursion

I have an emacs lisp list that says:
(setq states '(
Nebraska
NE
Nevada
NV
New Hampshire
NH
New Jersey)
)
I created a function that prints only the state names, not the abbreviations:
(defun names (los)
  "Get names from list of states"
  (when los
    (print (car los))
    (names (cdr (cdr los)))))
I call the function like: (names states) and get
Nebraska
Nevada
New
NH
Jersey
How do I tell my lisp function (or set up my list) so that the spaces in the words are not delimiters, only the newlines are delimiters?
Thanks

Elisp treats whitespace as a delimiter. To answer the immediate question, you can do as Patrick suggested in his comment and put the strings in quotes.
More broadly, you should consider using an association list or a property list, as they are designed for the task you have in mind. Read up on assoc and plist-get for more information. Examples:
The alist version would look like:
(setq states-alist '((NH "New Hampshire")
(NE "Nebraska")
(NV "Nevada")))
(cadr (assoc 'NH states-alist))
The plist version would look like:
(setq states-plist '(NH "New Hampshire"
NE "Nebraska"
NV "Nevada"))
(plist-get states-plist 'NH)

As Patrick Collins noted in a comment, you should turn the list items into strings by putting them inside double quotes:
(setq states '(
"Nebraska"
"NE"
"Nevada"
"NV"
"New Hampshire"
"NH"
"New Jersey")
)
If you really want to, you can keep them as symbols (as they are now) by escaping spaces with backslashes, e.g. New\ Hampshire. Whether to use symbols or strings depends on what you want to use them for.


Regex to match only semicolons not in parenthesis [duplicate]

I have the following string:
Almonds ; Roasted Peanuts (Peanuts; Canola Oil (Antioxidants (319; 320)); Salt); Cashews
I want to replace the semicolons that are not in parenthesis to commas. There can be any number of brackets and any number of semicolons within the brackets and the result should look like this:
Almonds , Roasted Peanuts (Peanuts; Canola Oil (Antioxidants (319; 320)); Salt), Cashews
This is my current code:
x <- "Almonds ; Roasted Peanuts (Peanuts; Canola Oil (Antioxidants (319; 320)); Salt); Cashews"
gsub(";(?![^(]*\\))",",",x,perl=TRUE)
[1] "Almonds , Roasted Peanuts (Peanuts, Canola Oil (Antioxidants (319; 320)); Salt), Cashews "
The problem I am facing is if there's a nested () inside a bigger bracket, the regex I have will replace the semicolon to comma.
Can I please get some help on regex that will solve the problem? Thank you in advance.
The pattern ;(?![^(]*\)) matches a semicolon and asserts that what is to the right is not a ) without a ( in between.
For a semicolon that is followed by a nested opening parenthesis, that assertion is true, so the ; is still matched even though it sits inside an outer pair of parentheses.
You could use a recursive pattern to match nested parenthesis to match what you don't want to change, and then use a SKIP FAIL approach.
Then you can match the semicolons and replace them with a comma.
[^;]*(\((?>[^()]+|(?1))*\))(*SKIP)(*F)|;
In parts, the pattern matches
[^;]* Match 0+ times any char except ;
( Capture group 1
\( Match the opening (
(?> Atomic group
[^()]+ Match 1+ times any char except ( and )
| Or
(?1) Recurse the whole first sub pattern (group 1)
)* Close the atomic group and optionally repeat
\) Match the closing )
) Close group 1
(*SKIP)(*F) Skip what is matched
| Or
; Match a semicolon
See a regex demo and an R demo.
x <- c("Almonds ; Roasted Peanuts (Peanuts; Canola Oil (Antioxidants (319; 320)); Salt); Cashews",
"Peanuts (32.5%); Macadamia Nuts (14%; PPPG(AHA)); Hazelnuts (9%); nuts(98%)")
gsub("[^;]*(\\((?>[^()]+|(?1))*\\))(*SKIP)(*F)|;",",",x,perl=TRUE)
Output
[1] "Almonds , Roasted Peanuts (Peanuts; Canola Oil (Antioxidants (319; 320)); Salt), Cashews"
[2] "Peanuts (32.5%), Macadamia Nuts (14%; PPPG(AHA)), Hazelnuts (9%), nuts(98%)"
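If a recursive regex is not available (Python's built-in re module, for instance, has neither (?1) recursion nor (*SKIP)(*F)), the same result can be had by counting nesting depth by hand. This is a different technique from the pattern above, shown only as a minimal sketch; replace_top_level is a made-up name, and it assumes the parentheses are reasonably balanced.

```python
def replace_top_level(text, old=";", new=","):
    """Replace `old` with `new` only where it is outside all parentheses."""
    depth = 0          # current parenthesis nesting level
    out = []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)   # tolerate a stray ")"
        # Only characters at depth 0 are candidates for replacement.
        out.append(new if ch == old and depth == 0 else ch)
    return "".join(out)


x = [
    "Almonds ; Roasted Peanuts (Peanuts; Canola Oil (Antioxidants (319; 320)); Salt); Cashews",
    "Peanuts (32.5%); Macadamia Nuts (14%; PPPG(AHA)); Hazelnuts (9%); nuts(98%)",
]
for s in x:
    print(replace_top_level(s))
```

Unlike the regex, this makes a single pass with no backtracking, but the two approaches may disagree on input with unbalanced parentheses.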

Is there a convenient way to replicate R's concept of 'named vectors' in Raku, possibly using Mixins?

Recent questions on StackOverflow pertaining to Mixins in Raku have piqued my interest as to whether Mixins can be applied to replicate features present in other programming languages.
For example, in the R-programming language, elements of a vector can be given a name (i.e. an attribute), which is very convenient for data analysis. For an excellent example see: "How to Name the Values in Your Vectors in R" by Andrie de Vries and Joris Meys, who illustrate this feature using R's built-in islands dataset. Below is a more prosaic example (code run in the R-REPL):
> #R-code
> x <- 1:4
> names(x) <- LETTERS[1:4]
> str(x)
Named int [1:4] 1 2 3 4
- attr(*, "names")= chr [1:4] "A" "B" "C" "D"
> x
A B C D
1 2 3 4
> x[1]
A
1
> sum(x)
[1] 10
Below I try to replicate R's 'named-vectors' using the same islands dataset used by de Vries and Meys. While the script below runs and (generally, see #3 below) produces the desired/expected output, I'm left with three main questions, at bottom:
#Raku-script below;
put "Read in data.";
my $islands_A = <11506,5500,16988,2968,16,184,23,280,84,73,25,43,21,82,3745,840,13,30,30,89,40,33,49,14,42,227,16,36,29,15,306,44,58,43,9390,32,13,29,6795,16,15,183,14,26,19,13,12,82>.split(","); #Area
my $islands_N = <<"Africa" "Antarctica" "Asia" "Australia" "Axel Heiberg" "Baffin" "Banks" "Borneo" "Britain" "Celebes" "Celon" "Cuba" "Devon" "Ellesmere" "Europe" "Greenland" "Hainan" "Hispaniola" "Hokkaido" "Honshu" "Iceland" "Ireland" "Java" "Kyushu" "Luzon" "Madagascar" "Melville" "Mindanao" "Moluccas" "New Britain" "New Guinea" "New Zealand (N)" "New Zealand (S)" "Newfoundland" "North America" "Novaya Zemlya" "Prince of Wales" "Sakhalin" "South America" "Southampton" "Spitsbergen" "Sumatra" "Taiwan" "Tasmania" "Tierra del Fuego" "Timor" "Vancouver" "Victoria">>; #Name
"----".say;
put "Count elements (Area): ", $islands_A.elems; #OUTPUT 48
put "Count elements (Name): ", $islands_N.elems; #OUTPUT 48
"----".say;
put "Create 'named vector' array (and output):\n";
my @islands;
my $i=0;
for (1..$islands_A.elems) {
    @islands[$i] := $islands_A[$i] but $islands_N[$i].Str;
    $i++;
};
say "All islands (returns Area): ", @islands; #OUTPUT: returns 48 areas (above)
say "All islands (returns Name): ", @islands>>.Str; #OUTPUT: returns 48 names (above)
say "Islands--slice (returns Area): ", @islands[0..3]; #OUTPUT: (11506 5500 16988 2968)
say "Islands--slice (returns Name): ", @islands[0..3]>>.Str; #OUTPUT: (Africa Antarctica Asia Australia)
say "Islands--first (returns Area): ", @islands[0]; #OUTPUT: 11506
say "Islands--first (returns Name): ", @islands[0]>>.Str; #OUTPUT: (Africa)
put "Islands--first (returns Name): ", @islands[0]; #OUTPUT: Africa
put "Islands--first (returns Name): ", @islands[0]>>.Str; #OUTPUT: Africa
1. Is there a simpler way to write the Mixin loop ...$islands_A[$i] but $islands_N[$i].Str;? Can the loop be obviated entirely?
2. Can a named-vector or nvec wrapper be written around put that will return (name)\n(value) in the same manner that R does, even for single elements? Might Raku's Pair method be useful here?
3. Related to #2 above, calling put on the single-element @islands[0] returns the name Africa, not the Area value 11506. [Note this doesn't happen with the call to say]. Is there any simple code that can be implemented to ensure that put always returns the (numeric) value or always returns the (Mixin) name for slices of any length?
Is there a simpler way?
Yes, using the zip metaoperator Z combined with infix but:
my @islands = $islands_A[] Z[but] $islands_N[];
Why don't you modify the array to change the format?
put calls .Str on the value it gets, say calls .gist
If you want put to output some specific text, make sure that the .Str method outputs that text.
I don't think you actually want put to output that format though. I think you want say to output that format.
That is because say is for humans to understand, and you want it nicer for humans.
When you have a question of “Can Raku do X” the answer is invariably yes; it's just a matter of how much work it would be, and whether you would still call it Raku at that point.
The question you really want to ask is how easy it is to do X.
I went and implemented something like what the link you provided describes.
Note that this was just a quick implementation that I created right before bed. So think of this as a first rough draft.
If I were actually going to do this for-real, I would probably throw this away and start over after spending days learning enough R to figure out what it is actually doing.
class NamedVec does Positional does Associative {
    has @.names is List;
    has @.nums is List handles <sum>;
    has %!kv is Map;
    class Partial {
        has $.name;
        has $.num;
    }
    submethod TWEAK {
        %!kv := %!kv.new: @!names Z=> @!nums;
    }
    method from-pairlist ( +@pairs ) {
        my @names;
        my @nums;
        for @pairs -> (:$key, :$value) {
            push @names, $key;
            push @nums, $value;
        }
        self.new: :@names, :@nums
    }
    method from-list ( +@list ){
        my @names;
        my @nums;
        for @list -> (:$name, :$num) {
            push @names, $name;
            push @nums, $num;
        }
        self.new: :@names, :@nums
    }
    method gist () {
        my @widths = @!names».chars Zmax @!nums».chars;
        sub infix:<fmt> ( $str, $width is copy ){
            $width -= $str.chars;
            my $l = $width div 2;
            my $r = $width - $l;
            (' ' x $l) ~ $str ~ (' ' x $r)
        }
        (@!names Zfmt @widths) ~ "\n" ~ (@!nums Zfmt @widths)
    }
    method R-str () {
        chomp qq :to/END/
        Named num [1:@!nums.elems()] @!nums[]
        - attr(*, "names")= chr [1:@!names.elems()] @!names.map(*.raku)
        END
    }
    method of () {}
    method AT-POS ( $i ){
        Partial.new: name => @!names[$i], num => @!nums[$i]
    }
    method AT-KEY ( $name ){
        Partial.new: :$name, num => %!kv{$name}
    }
}
multi sub postcircumfix:<{ }> (NamedVec:D $v, Str:D $name){
    $v.from-list: callsame
}
multi sub postcircumfix:<{ }> (NamedVec:D $v, List \l){
    $v.from-list: callsame
}
my $islands_A = <11506,5500,16988,2968,16,184,23,280,84,73,25,43,21,82,3745,840,13,30,30,89,40,33,49,14,42,227,16,36,29,15,306,44,58,43,9390,32,13,29,6795,16,15,183,14,26,19,13,12,82>.split(","); #Area
my $islands_N = <<"Africa" "Antarctica" "Asia" "Australia" "Axel Heiberg" "Baffin" "Banks" "Borneo" "Britain" "Celebes" "Celon" "Cuba" "Devon" "Ellesmere" "Europe" "Greenland" "Hainan" "Hispaniola" "Hokkaido" "Honshu" "Iceland" "Ireland" "Java" "Kyushu" "Luzon" "Madagascar" "Melville" "Mindanao" "Moluccas" "New Britain" "New Guinea" "New Zealand (N)" "New Zealand (S)" "Newfoundland" "North America" "Novaya Zemlya" "Prince of Wales" "Sakhalin" "South America" "Southampton" "Spitsbergen" "Sumatra" "Taiwan" "Tasmania" "Tierra del Fuego" "Timor" "Vancouver" "Victoria">>;
# either will work
#my $islands = NamedVec.from-pairlist( $islands_N[] Z=> $islands_A[] );
my $islands = NamedVec.new( names => $islands_N, nums => $islands_A );
put $islands.R-str;
say $islands<Asia Africa Antarctica>;
say $islands.sum;
A named vector essentially combines a vector with a map from names to integer positions and allows you to address elements by name. Naming a vector alters the behavior of the vector, not that of its elements. So in Raku we need to define a role for an array:
role Named does Associative {
    has $.names;
    has %!index;
    submethod TWEAK {
        my $i = 0;
        %!index = map { $_ => $i++ }, $!names.list;
    }
    method AT-KEY($key) {
        with %!index{$key} { return-rw self.AT-POS($_) }
        else { self.default }
    }
    method EXISTS-KEY($key) {
        %!index{$key}:exists;
    }
    method gist() {
        join "\n", $!names.join("\t"), map(*.gist, self).join("\t");
    }
}
multi sub postcircumfix:<[ ]>(Named:D \list, \index, Bool() :$named!) {
    my \slice = list[index];
    $named ?? slice but Named(list.names[index]) !! slice;
}
multi sub postcircumfix:<{ }>(Named:D \list, \names, Bool() :$named!) {
    my \slice = list{names};
    $named ?? slice but Named(names) !! slice;
}
Mixing in this role gives you most of the functionality of an R named vector:
my $named = [1, 2, 3] but Named<first second last>;
say $named; # OUTPUT: «first␉second␉last␤1␉2␉3␤»
say $named[0, 1]:named; # OUTPUT: «first␉second␤1␉2␤»
say $named<last> = Inf; # OUTPUT: «Inf␤»
say $named<end>:exists; # OUTPUT: «False␤»
say $named<last end>:named; # OUTPUT: «last␉end␤Inf␉(Any)␤»
As this is just a proof of concept, the Named role doesn't handle the naming of non-existing elements well. It also doesn't support modifying a slice of names. It probably does support creating a pun that can be mixed into more than one list.
Note that this implementation relies on the undocumented fact that the subscript operators are multis. If you want to put the role and operators in a separate file, you probably want to apply the is export trait to the operators.
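For readers who don't read Raku, the idea from the first sentence of this answer (a vector plus a map from names to integer positions) can be sketched in Python. NamedVector and everything in it are hypothetical names made up for illustration, not part of any library, and the sketch covers only lookup, assignment and a two-line display, not slicing.

```python
class NamedVector:
    """A list of values plus a name -> position index (illustrative only)."""

    def __init__(self, values, names):
        if len(values) != len(names):
            raise ValueError("values and names must have the same length")
        self.values = list(values)
        self.names = list(names)
        # Map each name to its integer position; in this sketch a repeated
        # name simply overwrites the earlier entry.
        self.index = {name: i for i, name in enumerate(self.names)}

    def __getitem__(self, key):
        # Strings are looked up by name, anything else is treated as a position.
        if isinstance(key, str):
            return self.values[self.index[key]]
        return self.values[key]

    def __setitem__(self, key, value):
        pos = self.index[key] if isinstance(key, str) else key
        self.values[pos] = value

    def __len__(self):
        return len(self.values)

    def __str__(self):
        # Names on the first line, values on the second, tab-separated.
        return "\t".join(self.names) + "\n" + "\t".join(map(str, self.values))


islands = NamedVector([11506, 5500, 16988], ["Africa", "Antarctica", "Asia"])
print(islands)
print(islands["Asia"], islands[0], sum(islands.values))
```

As in the Raku role, naming changes the behaviour of the container, not of its elements: the values stay plain integers, so sum and friends work on them unchanged.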
It might not be the most optimal way of doing it (or what you're specifically looking for) but as soon as I saw this particular problem's statement, the first thing that came to mind were Raku's allomorphs, which are types with two related values that are accessible separately depending on context.
my $areas = (11506,5500,16988,2968,16,184,23,280,84,73,25,43,21,82,3745,840,13,30,30,89,40,33,49,14,42,227,16,36,29,15,306,44,58,43,9390,32,13,29,6795,16,15,183,14,26,19,13,12,82);
my $names = <<"Africa" "Antarctica" "Asia" "Australia" "Axel Heiberg" "Baffin" "Banks" "Borneo" "Britain" "Celebes" "Celon" "Cuba" "Devon" "Ellesmere" "Europe" "Greenland" "Hainan" "Hispaniola" "Hokkaido" "Honshu" "Iceland" "Ireland" "Java" "Kyushu" "Luzon" "Madagascar" "Melville" "Mindanao" "Moluccas" "New Britain" "New Guinea" "New Zealand (N)" "New Zealand (S)" "Newfoundland" "North America" "Novaya Zemlya" "Prince of Wales" "Sakhalin" "South America" "Southampton" "Spitsbergen" "Sumatra" "Taiwan" "Tasmania" "Tierra del Fuego" "Timor" "Vancouver" "Victoria">>;
my @islands;
for (0..^$areas) -> \i {
    @islands[i] := IntStr.new($areas[i], $names[i]);
}
say "Areas: ", @islands>>.Int;
say "Names: ", @islands>>.Str;
say "Areas slice: ", (@islands>>.Int)[0..3];
say "Names slice: ", (@islands>>.Str)[0..3];
say "Areas first: ", (@islands>>.Int)[0];
say "Names first: ", (@islands>>.Str)[0];
I think I would just do something like this:
class MyRow {
    has Str $.island is rw;
    has Numeric $.area is rw;
    method Str {
        $!island;
    }
    method Numeric {
        +$!area;
    }
    # does Cool coercion of strings that look numeric
    submethod BUILD ( Numeric(Cool) :$!area, :$!island ) {
    };
}
class MyTable {
    has @.data;
    has MyRow @.rows is rw;
    has %!lookup;
    submethod TWEAK {
        @!rows = gather
            for @!data -> ( $island, $area ) {
                my $row = MyRow.new( :$island, :$area );
                %!lookup{ $island } = $row;
                take $row;
            }
    }
    method find_island( $island ) {
        return %!lookup{ $island };
    }
}
To set up a table:
my @raw = @island_names Z @island_areas;
my $table = MyTable.new( data => @raw );
Accessing the rows of the table by name:
my $row = $table.find_island('Africa');
say $row; # MyRow.new(island => "Africa", area => 11506)
Using the row element like a string gets you the name,
using it like a number gets you the area:
say ~$row; # Africa
say +$row; # 11506
One of the features here is that you can add more fields to your
rows, you're not constrained to just a value and a name.
The "find_island" method uses an internal %lookup hash to index
the rows by island name, but unlike a simple hash solution
there's no uniqueness constraint: if you have a duplicate island
name, "find_island" will locate the latest row in the set, but
the other row would still be there.
Caveat: I haven't thought much about how well this supports
dynamically adding more rows to the table.

pyparsing recursive grammar space separated list inside a comma separated list

Have the following string that I'd like to parse:
((K00134,K00150) K00927,K11389) (K00234,K00235)
each step is separated by a space and alternation is represented by a comma. I'm stuck in the first part of the string where there is a space inside the brackets. The desired output I'm looking for is:
[[['K00134', 'K00150'], 'K00927'], 'K11389'], ['K00234', 'K00235']
What I've got so far is a basic setup to do recursive parsing, but I'm stumped on how to code in a space separated list into the bracket expression
from pyparsing import Word, Literal, Combine, nums, \
Suppress, delimitedList, Group, Forward, ZeroOrMore
ortholog = Combine(Literal('K') + Word(nums, exact=5))
exp = Forward()
ortholog_group = Suppress('(') + Group(delimitedList(ortholog)) + Suppress(')')
atom = ortholog | ortholog_group | Group(Suppress('(') + exp + Suppress(')'))
exp <<= atom + ZeroOrMore(exp)
You are on the right track, but I think you only need one place where you include grouping with ()'s, not two.
import pyparsing as pp
LPAR,RPAR = map(pp.Suppress, "()")
ortholog = pp.Combine('K' + pp.Word(pp.nums, exact=5))
ortholog_group = pp.Forward()
ortholog_group <<= pp.Group(LPAR + pp.OneOrMore(ortholog_group | pp.delimitedList(ortholog)) + RPAR)
expr = pp.OneOrMore(ortholog_group)
tests = """\
((K00134,K00150) K00927,K11389) (K00234,K00235)
"""
expr.runTests(tests)
gives:
((K00134,K00150) K00927,K11389) (K00234,K00235)
[[['K00134', 'K00150'], 'K00927', 'K11389'], ['K00234', 'K00235']]
[0]:
[['K00134', 'K00150'], 'K00927', 'K11389']
[0]:
['K00134', 'K00150']
[1]:
K00927
[2]:
K11389
[1]:
['K00234', 'K00235']
This is not exactly what you said you were looking for:
you wanted: [[['K00134', 'K00150'], 'K00927'], 'K11389'], ['K00234', 'K00235']
I output : [[['K00134', 'K00150'], 'K00927', 'K11389'], ['K00234', 'K00235']]
I'm not sure why there is grouping in your desired output around the space-separated part (K00134,K00150) K00927. Is this your intention or a typo? If intentional, you'll need to rework the definition of ortholog_group, something that will do a delimited list of space-delimited groups in addition to the grouping at parens. The closest I could get was this:
[[[[['K00134', 'K00150']], 'K00927'], ['K11389']], [['K00234', 'K00235']]]
which required some shenanigans to group on spaces, but not group bare orthologs when grouped with other groups. Here is what it looked like:
ortholog_group <<= pp.Group(LPAR + pp.delimitedList(pp.Group(ortholog_group*(1,) & ortholog*(0,))) + RPAR) | pp.delimitedList(ortholog)
The & operator in combination with the repetition operators gives the space-delimited grouping (*(1,) is equivalent to OneOrMore, *(0,) to ZeroOrMore, but the same syntax also supports *(10,) for "10 or more", or *(3,5) for "at least 3 and no more than 5"). This too is not quite exactly what you asked for, but may get you closer if indeed you need to group the space-delimited bits.
But I must say that grouping on spaces is ambiguous - or at least confusing. Should "(A,B) C D" be [[A,B],C,D] or [[A,B],C],[D] or [[A,B],[C,D]]? I think, if possible, you should permit comma-delimited lists, and perhaps space-delimited also, but require the ()'s when items should be grouped.
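To make that last suggestion concrete (group only where there are ()'s, and treat commas and spaces alike as separators), here is a minimal hand-written sketch in plain Python with no pyparsing dependency. It is a different technique from the grammar above, and parse_groups and TOKEN are made-up names. It assumes every ortholog is K followed by five digits, and it silently ignores any other characters.

```python
import re

# Tokens: an opening paren, a closing paren, or an ortholog id such as K00134.
# Commas and spaces are separators and are simply skipped by findall.
TOKEN = re.compile(r"\(|\)|K\d{5}")


def parse_groups(text):
    """Return nested lists, grouping only where parentheses appear."""
    tokens = TOKEN.findall(text)
    pos = 0

    def parse_items(closing):
        # Collect items until the matching ")" (if closing) or end of input.
        nonlocal pos
        items = []
        while pos < len(tokens):
            tok = tokens[pos]
            pos += 1
            if tok == "(":
                items.append(parse_items(closing=True))
            elif tok == ")":
                if not closing:
                    raise ValueError("unbalanced ')'")
                return items
            else:
                items.append(tok)
        if closing:
            raise ValueError("missing ')'")
        return items

    return parse_items(closing=False)


result = parse_groups("((K00134,K00150) K00927,K11389) (K00234,K00235)")
print(result)
```

On the test string this yields the same nesting as the first pyparsing grammar above, so "(A,B) C D"-style input has only one reading: a new level appears only where the input has parentheses.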

Lower Case Certain Words R

I need to convert certain words to lower case. I am working with a list of movie titles, where prepositions and articles are normally lower case if they are not the first word in the title. If I have the vector:
movies = c('The Kings Of Summer', 'The Words', 'Out Of The Furnace', 'Me And Earl And The Dying Girl')
What I need is this:
movies_updated = c('The Kings of Summer', 'The Words', 'Out of the Furnace', 'Me and Earl and the Dying Girl')
Is there an elegant way to do this without using a long series of gsub(), as in:
movies_updated = gsub(' In ', ' in ', movies)
movies_updated = gsub(' In', ' in', movies_updated)
movies_updated = gsub(' Of ', ' of ', movies_updated)
movies_updated = gsub(' Of', ' of', movies_updated)
movies_updated = gsub(' The ', ' the ', movies_updated)
movies_updated = gsub(' The', ' the', movies_updated)
And so on.
In effect, it appears that you are interested in converting your text to title case. This can be easily achieved with use of the stringi package, as shown below:
>> stringi::stri_trans_totitle(c('The Kings of Summer', 'The Words', 'Out of the Furnace'))
[1] "The Kings Of Summer" "The Words" "Out Of The Furnace"
An alternative approach would be to use the toTitleCase function available in the tools package:
>> tools::toTitleCase(c('The Kings of Summer', 'The Words', 'Out of the Furnace'))
[1] "The Kings of Summer" "The Words" "Out of the Furnace"
Though I like @Konrad's answer for its succinctness, I'll offer an alternative that is more literal and manual.
movies = c('The Kings Of Summer', 'The Words', 'Out Of The Furnace',
'Me And Earl And The Dying Girl')
gr <- gregexpr("(?<!^)\\b(of|in|the)\\b", movies, ignore.case = TRUE, perl = TRUE)
mat <- regmatches(movies, gr)
regmatches(movies, gr) <- lapply(mat, tolower)
movies
# [1] "The Kings of Summer" "The Words"
# [3] "Out of the Furnace" "Me And Earl And the Dying Girl"
The tricks of the regular expression:
(?<!^) ensures we don't match a word at the beginning of a string. Without this, the first The of movies 1 and 2 will be down-cased.
\\b sets up word boundaries, so that the "in" in the middle of Dying will not match. This is slightly more robust than your use of spaces, since hyphens, commas, etc., are not spaces but do mark the beginning or end of a word.
(of|in|the) matches any one of of, in, or the. More patterns can be added with separating pipes |.
Once identified, it's as simple as replacing them with down-cased versions.
Another example of how to turn certain words to lower case with gsub (with a PCRE regex):
movies = c('The Kings Of Summer', 'The Words', 'Out Of The Furnace', 'Me And Earl And The Dying Girl')
gsub("(?!^)\\b(Of|In|The)\\b", "\\L\\1", movies, perl=TRUE)
See the R demo
Details:
(?!^) - not at the start of the string (it does not matter if we use a lookahead or lookbehind here since the pattern inside is a zero-width assertion)
\\b - find leading word boundary
(Of|In|The) - capture Of or In or The into Group 1
\\b - assure there is a trailing word boundary.
The replacement contains the lowercasing operator \L that turns all the chars in the first backreference value (the text captured into Group 1) to lower case.
Note that this can turn out to be a more flexible approach than using tools::toTitleCase. The part of that function's code that keeps specific words in lower case is:
## These should be lower case except at the beginning (and after :)
lpat <- "^(a|an|and|are|as|at|be|but|by|en|for|if|in|is|nor|not|of|on|or|per|so|the|to|v[.]?|via|vs[.]?|from|into|than|that|with)$"
If you only need to apply lowercasing and do not care about the other logic in the function, it might be enough to add these alternatives (do not use ^ and $ anchors) to the regex at the top of the post.
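For comparison, the same "lowercase these words unless they start the title" idea can be sketched in Python. Python's re has no \L replacement operator, so a function is used as the replacement instead. lower_minor_words and the short MINOR_WORDS list are assumptions made up for this example, not taken from tools::toTitleCase.

```python
import re

# Words to force to lower case when they are not the first word of the title.
MINOR_WORDS = ["of", "in", "the", "and"]

# (?<!^) : not at the very start of the string
# \b...\b: whole words only, so the "in" inside "Dying" is left alone
PATTERN = re.compile(r"(?<!^)\b(" + "|".join(MINOR_WORDS) + r")\b", re.IGNORECASE)


def lower_minor_words(title):
    """Lowercase the listed words everywhere except at the start of the title."""
    return PATTERN.sub(lambda m: m.group(0).lower(), title)


movies = ["The Kings Of Summer", "The Words", "Out Of The Furnace",
          "Me And Earl And The Dying Girl"]
movies_updated = [lower_minor_words(m) for m in movies]
print(movies_updated)
```

Adding more words is a matter of extending MINOR_WORDS; as with the R versions, only the start of the string is protected, not the first word after a colon.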

Is there a "quote words" operator in R? [duplicate]

Is there a "quote words" operator in R, analogous to qw in Perl? qw is a quoting operator that allows you to create a list of quoted items without having to quote each one individually.
Here is how you would do it without qw (i.e. using dozens of quotation marks and commas):
#!/bin/env perl
use strict;
use warnings;
my @NAM_founders = ("B97", "CML52", "CML69", "CML103", "CML228", "CML247",
"CML322", "CML333", "Hp301", "Il14H", "Ki3", "Ki11",
"M37W", "M162W", "Mo18W", "MS71", "NC350", "NC358",
"Oh7B", "P39", "Tx303", "Tzi8",
);
print(join(" ", @NAM_founders)); # Prints array, with elements separated by spaces
Here's doing the same thing, but with qw it is much cleaner:
#!/bin/env perl
use strict;
use warnings;
my @NAM_founders = qw(B97 CML52 CML69 CML103 CML228 CML247 CML277
CML322 CML333 Hp301 Il14H Ki3 Ki11 Ky21
M37W M162W Mo18W MS71 NC350 NC358 Oh43
Oh7B P39 Tx303 Tzi8
);
print(join(" ", @NAM_founders)); # Prints array, with elements separated by spaces
I have searched but not found anything.
Try using scan and a text connection:
qw=function(s){scan(textConnection(s),what="")}
NAM=qw("B97 CML52 CML69 CML103 CML228 CML247 CML277
CML322 CML333 Hp301 Il14H Ki3 Ki11 Ky21
M37W M162W Mo18W MS71 NC350 NC358 Oh43
Oh7B P39 Tx303 Tzi8")
This will always return a vector of strings even if the data in quotes is numeric:
> qw("1 2 3 4")
Read 4 items
[1] "1" "2" "3" "4"
I don't think you'll get much simpler, since space-separated bare words aren't valid syntax in R, even wrapped in curly brackets or parens. You've got to quote them.
For R, the closest thing that I can think of, or that I've found so far, is to create a single block of text and then break it up using strsplit, thus:
#!/bin/env Rscript
NAM_founders <- "B97 CML52 CML69 CML103 CML228 CML247 CML277
CML322 CML333 Hp301 Il14H Ki3 Ki11 Ky21
M37W M162W Mo18W MS71 NC350 NC358 Oh43
Oh7B P39 Tx303 Tzi8"
NAM_founders <- unlist(strsplit(NAM_founders,"[ \n]+"))
print(NAM_founders)
Which prints
[1] "B97" "CML52" "CML69" "CML103" "CML228" "CML247" "CML277" "CML322"
[9] "CML333" "Hp301" "Il14H" "Ki3" "Ki11" "Ky21" "M37W" "M162W"
[17] "Mo18W" "MS71" "NC350" "NC358" "Oh43" "Oh7B" "P39" "Tx303"
[25] "Tzi8"
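The "one block of text, split on whitespace" trick is not specific to R. As a point of comparison only, a minimal Python sketch looks like this; the helper name qw is borrowed from Perl and is not a Python built-in.

```python
def qw(text):
    """Split a block of text on any run of whitespace (spaces, tabs, newlines)."""
    return text.split()


NAM_founders = qw("""B97 CML52 CML69 CML103 CML228 CML247 CML277
                     CML322 CML333 Hp301 Il14H Ki3 Ki11 Ky21
                     M37W M162W Mo18W MS71 NC350 NC358 Oh43
                     Oh7B P39 Tx303 Tzi8""")
print(len(NAM_founders), NAM_founders[:3])
```

As with the scan version above, every element comes back as a string, even if it looks numeric.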
