Aligning matrices vertically as well as horizontally in LaTeX - math

I'm trying to accomplish this in LaTeX:
⎡a⎤ ⎡b … n⎤
⎢⁞⎢ ⎢⁞ ⋱ ⁞⎢
⎣x⎦ ⎣y … z⎦
[a … x]
I'm able to get a vector + a matrix on one line, but I'm not sure how to align the vector below so that it sits perfectly under the large matrix.
Here's a less-unicode text representation of the 'drawing' above:
[a] [ b c ]
[d] [ e f ]
[ g h ]
Note that the last line ([ g h ]) is a single-row matrix, separate from the 2x2 matrix above it.

\edit2
final answer:
\begin{align*}
\begin{vmatrix} 1 \\ 2 \end{vmatrix} &\begin{vmatrix} 1 & 2 & 3 \\ 3 & 4 & 5 \end{vmatrix} \\[6px]
&\begin{vmatrix} 2 & 3 & 4 \end{vmatrix}
\end{align*}
does exactly what you want, read below for more info about placement and such. The "&" sign is used to align in general. Forgot the first line had 2 matrices but now you have it :).
info on spacing and such
\begin{align*}
&\begin{pmatrix} 1 & 2 & 3 \ 3 & 4 & 5 \end{pmatrix} \[6px]
&\hspace{2px}\begin{pmatrix} 2 & 3 & 4 \end{pmatrix}
\end{align*}
would do the job. For some strange reason the align gave errors when leaving out the first "&" symbol and it gave a 2px offset. I figured you wanted some space between the two if not leave the [6px]. You can always use \hspace{amount of whitespace} to place your second matrix in the place you want. This can be given in pt's, px's (which i did) etc.
//edit
Hm I notice the \hspace{} is actually not needed, but can be used in case of pmatrix. What happens is that the pmatrix brackets give a biased image of the matrices. When using vmatrix like:
\begin{align*}
&\begin{vmatrix} 1 & 2 & 3 \\ 3 & 4 & 5 \end{vmatrix} \\[6px]
&\begin{vmatrix} 2 & 3 & 4 \end{vmatrix}
\end{align*}
It all goes well :). So basically, probably the easiest way to fix it is either use other brackets to make it look good or use the \hspace to align as you like it.

If all else fails, PGF/TikZ can do this. See this example.

Wrap the thing in \begin{align*} ... \end{align*} and use & as the alignment marker in your formulas.
Example:
\begin{align*}
\begin{pmatrix} ... vector here \end{pmatrix}
&\begin{pmatrix} ... first matrix here \end{pmatrix}\\
&\begin{pmatrix} ... second matrix here \end{pmatrix}
\end{align*}

Related

Extract two consecutive lines that have non-consecutive strings

I have a very large text file with 2 columns and more than 10 mio of lines.
Most lines have in column 2 a number that is the number of column 2 of the previous line +1. However, few thousands of lines behave differently (see example below).
Input file:
A 1
A 2
A 3
A 10
A 11
A 12
A 40
A 41
I would like to extract the pair of two lines that do not respect the +1 increment in column 2.
Desired output file:
A 3
A 10
A 12
A 40
Is there (preferentially) an awk command that allows to do that?
I tried several codes comparing column 2 of two consecutive lines but unfortunately I fail until now (see the code below).
awk 'FNR==1 {print; next} $2==p2+1 {print p $0; p=""; next} {p=$0 ORS; p2=$2}' input.txt > output.txt
Thanks for your help. Best,
Would you please try the following:
awk 'NR>1 {if ($2!=p2+1) print p ORS $0} {p=$0; p2=$2}' input.txt > output.txt
Output:
A 3
A 10
A 12
A 40
The variables names are similar to yours: p holds the previous line and
p2 holds the second column of the previous line.
The condition NR>1 suppresses to print on the 1st line.
if ($2!=p2+1) print p ORS $0 prints the pairs of two lines which
meet the condition.
The block {p=$0; p2=$2} preserves values of current line for the next iteration.
I like perl for the text processing that needs arithmetic.
$ perl -ane 'print and next if $.<3; print $p and print if $F[3]!=$fp+1; $fp=$F[3]; $p=$_' input.txt
| COLUMN 1 | COLUMN 2 |
| -------- | -------- |
| A | 3 |
| A | 10 |
| A | 12 |
| A | 40 |
This is using -a to autosplit into #F.
Prints first 2 lines: print and next if $.<3
On subsequent lines, prints previous line and current line if the 4th field isn't exactly one more than the prior 4th field: print $p and print if $F[3]!=$fp+1
Saves the 4th field as $fp and the entire line as $p: $fp=$F[3]; $p=$_
Assumptions:
columns are tab-delimited
the 1st column may contain white space (this isn't demonstrated in the sample provided by OP but it also hasn't been ruled out)
lines of interest must have the same value in the 1st column (ie, if the values in the 1st column differ then we don't bother with comparing the values in the 2nd column and instead proceed to the next input line)
if 3 consecutive lines meet the criteria, the 2nd/middle line is only printed once
Setup:
$ cat input.txt
A 1
A 2
A 3 # match
A 10 # match
A 11
A 12 # match
A 23 # match
A 40 # match
A 41
X to Z 101
X to Z 102 # match
X to Z 104 # match
X to Z 105
NOTE: comments only added here to highlight the lines that match the search criteria
One awk idea:
awk -F'\t' '
FNR==1 { prevline=$0 }
FNR>1 { if ($1 == prev1 && $2+0 != prev2+1) {
if (prevline) print prevline
print
prevline="" # make sure this line is not printed again if next line also meets criteria
}
else
prevline=$0
}
{ prev1=$1; prev2=$2 }
' input.txt
This generates:
A 3
A 10
A 12
A 23
A 40
X to Z 102
X to Z 104
This might work for you (GNU sed):
sed -nE 'N;h
s/.*\s+(.*)\n.*(\s.*)/echo "$((\1+1))\2"/e;/^(.*)\s\1$/!{x;p;x};x;D' file
Open a two line window throughout the length of the file.
Make a copy of the window and increment the 2nd column of the first line by one. If this amended value is equal to the 2nd column of the second line then print both unadulterated lines.
Delete the first line and repeat.
N.B. This may print the second of these lines twice if the following line meets the same criteria.

How to know if there is a different element in one array in Scilab?

My goal is to check if there are misplaced objects in one array.
for example the array is
2.
2.
2.
2.
2.
1.
3.
1.
3.
3.
3.
1.
3.
1.
1.
1.
1.
I want to know if the first 5 elements, 6 to 13 and 14-17 are the same.
The purpose of this is to identify the misplaced elements in a clustering solution.
I have tried for the first 5 elements
ISet=5
IVer=7
IVir=5
for i=1:ISet
if(isequal(FIRSTMIN(i,1,2),FIRSTMIN(i+1,1,2))==%f)
numMisp=numMisp+1
mprintf("Set misp: %i",numMisp)
end
end
For the next 6 to 13 elements
for i=ISet+1:IVer+ISet-1
if(isequal(FIRSTMIN(i,1,2),FIRSTMIN(i+1,1,2))==%f)
mprintf("%i %i Ver misp: %i\n",FIRSTMIN(i,1,2),FIRSTMIN(i+1,1,2),i)
numMisp=numMisp+1
end
end
For the next 14 to 17 elements
for i=IVer+ISet:IVer+IVir-1
if(isequal(FIRSTMIN(i,1,2),FIRSTMIN(i+1,1,2))==%f)
mprintf("%i %i Ver misp: %i\n",FIRSTMIN(i,1,2),FIRSTMIN(i+1,1,2),i)
numMisp=numMisp+1
mprintf("Vir misp: %i",i)
end
end
You can use unique for that purpose. For example the following test checks if the first five elements are the same
x=[2 2 2 2 2 1 3 1 3 3 3 1 3 1 1 1 1];
if length(unique(x(1:5))) == 1
//
end
You can do the the same for the other clusters by replacing 1:5 by 6:13 then 14:17.

Comparing consecutive rows within an file

I have two files with with 1s and 0s in each column, where the field separator is "," :
1,0,0,1,1,1,0,0,0,0,1,0,0,1,1,0,1,0
0,1,0,1,1,1,0,1,0,1,0,0,0,0,0,0,0,0
1,0,0,0,1,0,0,1,0,0,0,1,0,0,1,0,1,0
1,0,0,0,1,0,0,1,0,0,0,1,0,0,1,0,1,0
1,0,1,0,0,0,0,1,1,1,1,1,1,1,1,1,0,1
1,0,1,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0
1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0
1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0
1,1,1,0,0,0,1,1,1,0,0,0,1,1,1,0,0,0
1,1,1,0,0,0,1,1,1,0,0,0,1,1,1,0,0,0
Want I want to do is look at the file in pairs of rows, compare them, and if they are exactly the same output a 1. So for this example the rows 1 & 2 are different so they don't get a 1, rows 3 & 4 are exactly the same so they get a 1, and rows 5&6 differ by 1 column so they don't get a 1, and so on.
So the desired output could be something like :
1
1
1
Because here there are exactly 3 pairs (they are paired by the fact if they are consecutive) of rows that are exactly the same: rows 3&4, 7&8, and 9&10. The comparison should not reuse a row, so if you compare rows 1 & 2, you shouldn't then compare rows 2 & 3.
You can do this with awk like:
awk -F, '!(NR%2) {print $0==p} {p=$0}' data
0
1
0
1
1
where every line that's evenly divisible by two will print a 0 if the current line doesn't match the last value for p or a 1 if it matches.
If you truly only want the 1s, which is throwing away any information about which pairs matched, you could:
awk -F, '!(NR%2)&&$0==p {print 1} {p=$0}' data
1
1
1
Alternatively, you could output matching pair line numbers like:
awk -F, '!(NR%2)&&$0==p {print NR-1 "," NR} {p=$0}' data
3,4
7,8
9,10
Or just the counts of all matched pairs:
awk -F, '!(NR%2)&&$0==p {c++} {p=$0} END{ print c}' data
3
Another useful variant might be just to return the matching lines directly:
awk -F, '!(NR%2)&&$0==p {print} {p=$0}' data
1,0,0,0,1,0,0,1,0,0,0,1,0,0,1,0,1,0
1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0
1,1,1,0,0,0,1,1,1,0,0,0,1,1,1,0,0,0
I would use a shell script like this:
while read line
do
if test "$prevline" = "$line"
then
echo 1
fi
prevline=$line
done
I'm not 100% sure about your requirement to "not reuse a row", but I think that could be achieved by changing inner part of the loop to
if test "$prevline" = "$line"
then
echo 1
line="" # don't reuse a line
fi

Make a file counting instances in sets of 5

I have a file that looks like this:
1 rs531842 503939 61733 G A
1 rs10494103 35025 114771 C T
1 rs17038458 254490 21116837 G A
1 rs616378 525783 21127670 T C
1 rs3845293 432526 21199392 A C
2 rs16840461 233620 157112959 A G
2 rs1560628 224228 157113214 T C
2 rs17200880 269314 257145829 C T
2 rs10497165 35844 357156412 C T
2 rs7607531 624696 457156575 T C
...with column 1 stretching on to 22, and several thousand entries in total.
I want to create a file that lists bins of 5 million from column 4 which have data, separating by column 1.
Basically, all but column 1 and 4 can be discarded. A simple imput would look like this:
InputChr1:
61733
114771
21116837
21127670
21199392
InputChr2:
157112959
157113214
257145829
357156412
457156575
So, for the example above, I would want to get two files that look like this:
OutputChr1.txt
Start End Occurrences
1 5000000 2
20000001 25000000 3
OutputChr2.txt
Start End Occurrences
155000001 160000000 2
255000001 260000000 1
355000001 360000000 1
455000001 460000000 1
Any ideas? It seems like something that should be doable with lapply in R, but I can't get the for loops to work...
EDIT: Actually, I made this look much harder than it needed to be - basically, I want to split the original file by column 1, extract the data in column 4, and then count the instances in bins of 5 million.
(Apologies for slightly random tags, just trying to think of which tools might be best!)
Well, this happened to be very challenging. I couldn't find a way to use an unique awk command, though.
awk -v const=5000000 -v max=150
'{a[$1,int($4/const)]++; b[$1]}
END{for (i in b)
{for (j=0; j<max; j++)
print i, j*const +1, (j+1)*const, a[i,j]
}
}' file
And then to get only the results:
awk 'NF==4'
Explanation
-v const=5000000 -v max=150 give the variables. const is the 5 million value to split the results. max is the biggest number up to which we will look for info in the END block.
a[$1,int($4/const)]++ create an array with (1st field, 4th field) as index. Note the second is int($4/const) is to get from 23432 --> 0, 6000000 --> 1, etc. That is, to see in which block of values is every 4th column.
b[$1] keep track of the first columns that have been processed.
END{for (i in b) {for (j=0; j<max; j++) print j, j*const +1, (j+1)*const, a[i,j]}}' print the values.
awk 'NF==4' just print those lines that have 4 columns. This way it just outputs those cases in which there were matches.
In case you want to store the values into a new file, you can do
awk 'NF==4 {print > "OutputChr"$1".txt}'
Sample output
$ awk -v const=5000000 -v max=150 '{a[$1,int($4/const)]++; b[$1]} END{for (i in b) {for (j=0; j<max; j++) print i, j*const +1, (j+1)*const, a[i,j]}}' a | awk 'NF==4'
1 1 5000000 2
1 20000001 25000000 3
2 155000001 160000000 2
2 255000001 260000000 1
2 355000001 360000000 1
2 455000001 460000000 1
All in one
awk '{ v=int($4/const)
a[$1 FS v]++
min[$1]=min[$1]<v?min[$1]:v # get the Minimum of column $4 for group $1
max[$1]=max[$1]>v?max[$1]:v # get the Minimum of column $4 for group $1
}END{ for (i in min)
for (j=min[i];j<=max[i];j++) # set the for loop, and use the min and max value.
if (a[i FS j]!="") print j*const+1,(j+1)*const,a[i FS j] > "OutputChr" i ".txt" # if the data is exist, print to file "OutputChr" i ".txt"
}' const=5000000 file
result:
$ cat OutputChr1.txt
1 5000000 2
20000001 25000000 3
$ cat OutputChr2.txt
155000001 160000000 2
255000001 260000000 1
355000001 360000000 1
455000001 460000000 1

Classic ASP - Subtracting numbers but keep leading zeros

I was wondering if someone could help me out.
I have a number, it could be 004 or 010 ... I would like to subtract it by 1 and keep the leading zero.
Everytime i try to subtract, it always takes away the leading zero.
for instance
004 - 1
This always ends up as just
3
But i would like it to be
003
Having said that if i have 010 and i subtract 1 i would like it to be 009
Any help would be greatly appreciated.
Cheers,
Do your calculations with integers then change to a string to display them. then concatenate "0" or "00" where needed eg
c = a - b
displayvalue = cstr(c)
if c < 100 then
displayvalue = "0" & cstr(c)
end if
if c < 10 then
displayvalue = "00" & cstr(c)
end if
Continue doing the math as you are but include some padding 0's using the Right and String functions.
Dim a, b
a = 003
b = 1
Right(String(3,"0") + cstr(a-b), 3)

Resources