Get position of character in collating sequence - hex

I'm looking for a way to convert ASCII text into hex data; I mean the kind of hex value you get with
MOVE X'nn' TO MYVAR.
Then I need to use it like a number (COMP-3 I guess).
I tried to move a PIC X to a PIC S9(2)V COMP-3, but it does not work as I thought...
Further explanation, as my question was marked as unclear:
First of all, sorry: I wrote this question late at night, and now that I'm reading it again, yes, it's unclear.
Now, the real issue is that I want to use a char (let's say "A") as its hexadecimal numeric representation, to use it as an index for an internal table.
For example, in C it could be easy, making:
int mynum;
char mytext;
mynum = atoi(mytext);
then using mynum to access an array. So, in COBOL I have:
01 MY-TABLE.
05 MY-TABLE-ITEM PIC X OCCURS 1000.
01 MY-TEXT PIC X(100).
01 FILLER REDEFINES MY-TEXT.
05 MY-TEXT-X PIC X OCCURS 100.
Then, I want to iterate over MY-TEXT-X and transform each character into its hex code, storing it in a numeric variable (PIC 9(n)) that I can use to access MY-TABLE-ITEM, something like:
PERFORM VARYING I FROM 1 BY 1 UNTIL I > 100
PERFORM TRANSFORM-DATA
DISPLAY MY-TABLE-ITEM(MY-NUMBER)
END-PERFORM
As I said, I thought I could move a PIC X to a PIC S9(2)V COMP-3 so the numeric variable would get the value, but it's not working as I expected...
EDIT:
So I just found my compiler doesn't support intrinsic functions, so that does not help me...
EDIT - Added source code
So, here's the source I'm using, and also the output from the compiler and from the executions.
SOURCE:
IDENTIFICATION DIVISION.
PROGRAM-ID. likeatoi.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 the-char PIC X
VALUE "K".
01 the-result PIC 999.
01 the-other-result PACKED-DECIMAL PIC 9(8)
VALUE ZERO.
01 FILLER
REDEFINES the-other-result.
05 FILLER PIC X.
05 char-to-convert PIC X.
01 num pic 9(8).
PROCEDURE DIVISION.
MAINLINE.
* with intrinsic function
* MOVE FUNCTION ORD ( the-char )
* TO the-result.
DISPLAY
">"
the-char
"<"
">"
the-result
"<".
* Old School
MOVE the-char TO char-to-convert.
DISPLAY
">"
the-char
"<"
">"
the-other-result
"<".
MOVE the-other-result TO num.
DISPLAY num.
STOP RUN.
Now, here is a detail of everything I tried:
First, try to compile it with INTRINSIC FUNCTION ORD:
***** 1) 0384: E User-defined word expected instead of reserved word. (scan su
With this compilation, run program (Ignore error):
COBOL procedure error 211 at line 17 in ./ESCRITORIO/HEX/LIKEATOI.COB
(/home/dohitb/Escritorio/HEX/likeatoi.COB) compiled 17/03/05 20:37:29.
Comment FUNCTION part, then compile again:
Errors: 0, Warnings: 1, Lines: 37 for program LIKEATOI.
(Warning for displaying a COMP variable, it's OK)
Execute again (without the "num", and still with comp variable):
>A<> <
>A<>A<
Add "num" variable, change char to "K" and change COMP to PACKED-DECIMAL (in HEX: 4B)
>K<> <
>K<>K<
04900000
So, as I was saying, neither option is working. The most accurate right now is using PACKED-DECIMAL with REDEFINES to PIC 9, but with hex positions higher than "A" it gives a "9", so it's still not valid.
I think it could be a matter of local COLLATION.
FINAL EDIT
Now I made a variant of the original source code:
IDENTIFICATION DIVISION.
PROGRAM-ID. likeatoi.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 the-char PIC X
VALUE "K".
01 the-result PIC 999.
01 the-other-result BINARY PIC 9(4)
VALUE ZERO.
01 FILLER-1
REDEFINES the-other-result.
05 FILLER PIC X.
05 char-to-convert PIC X.
01 the-comp-result COMP PIC 9(4)
VALUE ZERO.
01 FILLER-2
REDEFINES the-comp-result.
05 FILLER PIC X.
05 char-to-convert PIC X.
01 the-packed-result PACKED-DECIMAL PIC 9(4)
VALUE ZERO.
01 FILLER-3
REDEFINES the-packed-result.
05 FILLER PIC X.
05 char-to-convert PIC X.
01 num PIC 9(8).
01 alfa PIC X(20)
VALUE 'ABCDEFGHIJabcdefghij'.
01 FILLER REDEFINES alfa.
05 char PIC X OCCURS 20.
01 w-index PIC 99 VALUE ZEROES.
PROCEDURE DIVISION.
MAINLINE.
PERFORM VARYING w-index FROM 1 BY 1 UNTIL w-index > 20
MOVE char(w-index) TO the-char
* Variations of "Old School" code
MOVE the-char TO char-to-convert OF FILLER-1
MOVE the-char TO char-to-convert OF FILLER-2
MOVE the-char TO char-to-convert OF FILLER-3
DISPLAY
">"
the-char
"<"
" with BINARY >"
the-other-result
"<"
MOVE the-other-result TO num
DISPLAY "Numeric value: " num
DISPLAY
">"
the-char
"<"
" with COMP >"
the-comp-result
"<"
MOVE the-comp-result TO num
DISPLAY "Numeric value: " num
DISPLAY
">"
the-char
"<"
" with PACKED >"
the-packed-result
"<"
MOVE the-packed-result TO num
DISPLAY "Numeric value: " num
END-PERFORM.
STOP RUN.
And, to my surprise, it's giving me this output:
>A< with BINARY >A<
Numeric value: 00000065
>A< with COMP >A<
Numeric value: 00000100
(and so on...) So now it looks like it's working... Could it be because in my first attempt I was working with 05-level variables?
Looks like it's done now!
Thanks for everything, Bill; you will appear in the greetings section of my project :)
One last detail.
If I do a "MOVE"
MOVE 'A' TO CHAR
and then do all the binary stuff, the results are different. Here is an example:
with VALUE, for "D" I get 68, but with MOVE I get 60...

You have been suffering from using an old compiler. It is to the COBOL 85 Standard, but does not have the intrinsic functions which were a 1989 Extension to the Standard.
Also, it has a non-Standard behaviour which I have not encountered before, which is difficult to explain fully (not having access to that compiler).
The point of using the > and < in the DISPLAY is so that you always know exactly how long each output field is. You know whether there is a blank, or some non-printable character. Your DISPLAY of fields defined as COMP and BINARY only shows one character, rather than the four numeric digits which would typically be held in two bytes of storage (like an INT, except with a limit of 9999).
Therefore I suggested the MOVE, where you then get the expected result when defined as BINARY and an... unexplained result when defined as COMP.
One explanation for the COMP result may be that COMPUTATIONAL fields are entirely down to the compiler implementor to define. So what is COMP on one system may not be the same type of field as COMP on another system (same with COMP-1, COMP-2, COMP-3, etc.). This is why the 1985 Standard introduced new names (for example BINARY and PACKED-DECIMAL) so that they would be portable across COBOL compilers.
If you are stuck with using that compiler, you are unfortunate. If you have the possibility of using another compiler, you can find, amongst other choices, the open-source GnuCOBOL (I am a moderator on the discussion area of the GnuCOBOL project at SourceForge.Net). Use a different compiler if you can.
Here's an example program which will work on modern COBOL compilers using both the intrinsic function ORD and a way it used to be done (and probably is still done). Note, if your COMP field is "little endian", swap the order of the FILLER and field under the REDEFINES.
IDENTIFICATION DIVISION.
PROGRAM-ID. likeatoi.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 the-char PIC X
VALUE "A".
01 the-result PIC 999.
01 the-other-result BINARY PIC 9(4)
VALUE ZERO.
01 FILLER
REDEFINES the-other-result.
05 FILLER PIC X.
05 char-to-convert PIC X.
PROCEDURE DIVISION.
* with intrinsic function
MOVE FUNCTION ORD ( the-char )
TO the-result
DISPLAY
">"
the-char
"<"
">"
the-result
"<"
* Old School
MOVE the-char TO char-to-convert
DISPLAY
">"
the-char
"<"
">"
the-other-result
"<"
STOP RUN
.
The ORD is easy; it is effectively the same as your atoi in C (assuming that gives you the position in the collating sequence).
The second, since COBOL traditionally can't have a one-byte binary, is a way, using REDEFINES, to get a character into the low-order part of a two-byte binary, so that the whole binary field represents the "numeric value" of the representation of that character.
The output from the above is:
>A<>066<
>A<>0065<
Note that ORD gives the position in the collating sequence (binary zero with ORD would return one) and the second is just giving the direct representation (binary zero would give zero).
To use either value you may want to "re-base" afterwards if you are only interested in printable characters.
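For readers more at home outside COBOL, the difference between the two results can be sketched in Python. This is only an illustration that assumes an ASCII collating sequence; the function names (ord_position, redefines_value, rebase) are made up for the example and are not part of any COBOL runtime.

```python
def ord_position(ch: str) -> int:
    """Mimic FUNCTION ORD: 1-based position in the collating sequence (ASCII assumed)."""
    return ord(ch) + 1


def redefines_value(ch: str, byteorder: str = "big") -> int:
    """Mimic the REDEFINES trick: a zero byte plus the character, read as a two-byte binary."""
    raw = b"\x00" + ch.encode("ascii")
    if byteorder == "little":
        # On a little-endian field the FILLER and the character swap places.
        raw = raw[::-1]
    return int.from_bytes(raw, byteorder)


def rebase(code: int, first_printable: str = " ") -> int:
    """Re-base a character code so the first printable character maps to table index 1."""
    return code - ord(first_printable) + 1


print(ord_position("A"))             # 66
print(redefines_value("A"))          # 65
print(rebase(redefines_value("A")))  # 34
```

Choosing the space character as the first printable one is an assumption; pick whatever base suits the table being indexed.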
Note, I'm confused that you have a compiler which supports in-line PERFORM but not intrinsic functions. If a USAGE of BINARY is rejected, use COMP instead.

Related

Conversion of RE to FSA without ε

I have a regular expression
10+(0+11)*1
How do I convert this regular expression to a finite state automaton?
There are algorithms used in the proof of equivalence between REs and NFAs on the one hand, and between NFAs and DFAs on the other. That is one option, and it does not require insight or understanding about what language the RE generates.
Another option is to try to understand the language first, and then write a DFA for the language from scratch.
The way I will show you involves the Myhill-Nerode theorem, which says a regular language has as many equivalence classes under the indistinguishability relation as there are states in a minimal DFA for that language. Two strings are indistinguishable with respect to the language if the same set of strings can be appended to them to get some string in the language. For instance, the strings a, ab and ba are distinguishable w.r.t. L(ab), since a can be followed by b, ab by the empty string only, and ba by nothing (the empty set) to get a string in L. This tells us a minimal DFA for ab requires at least three states.
For your language, L(10 + (0 + 11)*1), we observe:
the empty string is the first we look at. It needs a state - the DFA's initial state - and can be followed by any string in L to get a string in L. Call this state [e].
the string 0 can be followed by 1 to get a string in the language; this makes it different from the empty string, so a new state is required. Call this [0].
the string 1 can be followed by the empty string to get another string in the language; this makes it distinguishable from the empty string and the string 0. Call its state [1].
the string 00 can be followed by exactly the same strings as could follow 0 to get a string in the language. This means 00 does not need a new state; 00 will take our DFA to state [0].
the string 01 can be followed by the empty string, or by strings of the form 1(0 + 11)*1, to get a string in the language. This is new, so we need a new state. Call this [01].
the string 10 can be followed by the empty string only to get a string in L. This is new, so call its state [10].
the string 11 can be followed by exactly the same strings as could follow 0 to get a string in the language. 11 will take our DFA to state [0].
the string 010 can't ever lead to a string in the language; it must lead to a dead state in our minimal DFA. Call this [010].
the string 011 is indistinguishable from strings 0 and 11.
the strings 100 and 101 can't lead to a string in the language, so they must take our DFA to the dead state [010].
The states we found we needed are these: [e], [0], [1], [01], [10] and [010]. The transitions are not too hard to figure out:
[e] transitions to [0] and [1] on inputs 0 and 1, respectively
[0] transitions to [0] and [01] on inputs 0 and 1, respectively
[1] transitions to [10] and [0] on inputs 0 and 1, respectively
[01] transitions to [010] and [0] on inputs 0 and 1, respectively
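[10] transitions to [010] on inputs 0 or 1
[010] always transitions to itself
The accepting states are [1], [01] and [10], since the strings 1, 01 and 10 are themselves in the language.
You now have a minimal DFA for your language, as well as a proof of such minimality.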
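One way to sanity-check the construction is to simulate the DFA and compare it with the regular expression on every short binary string. The sketch below does that in Python. The transition table is copied from the list above. Taking [1], [01] and [10] as accepting states is my reading (the strings 1, 01 and 10 are themselves in the language), and the pattern 10|(0|11)*1 is the same expression written in Python's regex syntax.

```python
import re
from itertools import product

# state -> (next state on input "0", next state on input "1")
TRANSITIONS = {
    "e":   ("0", "1"),
    "0":   ("0", "01"),
    "1":   ("10", "0"),
    "01":  ("010", "0"),
    "10":  ("010", "010"),
    "010": ("010", "010"),  # dead state
}
# States reached by strings that are themselves in the language.
ACCEPTING = {"1", "01", "10"}

PATTERN = re.compile(r"10|(0|11)*1")


def dfa_accepts(s: str) -> bool:
    """Run the DFA from the initial state [e] over the string s."""
    state = "e"
    for symbol in s:
        state = TRANSITIONS[state][int(symbol)]
    return state in ACCEPTING


def regex_accepts(s: str) -> bool:
    """Check membership using the regular expression directly."""
    return PATTERN.fullmatch(s) is not None


def mismatches(max_len: int = 10) -> list:
    """Return every binary string up to max_len on which the DFA and the regex disagree."""
    return [
        "".join(bits)
        for n in range(max_len + 1)
        for bits in product("01", repeat=n)
        if dfa_accepts("".join(bits)) != regex_accepts("".join(bits))
    ]


print(mismatches())  # []
```

An empty list is evidence rather than proof; the proof is the Myhill-Nerode argument itself.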

GOTO/OF in HP Time-Shared BASIC

For the following code in HP Time-Shared BASIC, I am wondering about line 2270:
2180 INPUT X
2190 IF X=1 THEN 2210
2200 LET X=2
2210 LET X=X+1
2220 IF X=3 THEN 2260
2230 IF B>39 THEN 2260
[irrelevant code lines removed for clarity]
2260 X1=X1*(-1)
2270 GOTO X OF 2290,2540,2720
Based on other examples in this code base, GOTO [variable] OF [line1,line2,...] seems to be the equivalent of: if X == 1, GOTO line1; if X == 2, GOTO line2; etc.
I found the relevant Wikipedia bit "Calculated flow-control via the GOTO/OF and GOSUB/OF statements" but I'd like more clarity.
Can anyone confirm?
Thanks,
Caleb
Thankfully, the Wikipedia page has a link to all the original documentation:
http://www.bitsavers.org/pdf/hp/2000TSB/
This includes a full language reference:
http://www.bitsavers.org/pdf/hp/2000TSB/22687-90001_AccessBasic9-75.pdf
Which says this about GOTO/OF on page 11-40
GO TO numeric expression OF statement number list
...
When the second form of the GO TO statement is used, the numeric expression is evaluated and rounded to an integer "n". Control then is transferred to the "nth" statement number in the statement number list, where statement number list is one or more statement numbers separated by commas. If there is no statement number corresponding to the value of the numeric expression, the GO TO statement is ignored and the statement following the GO TO statement is executed.
That appears to confirm your guess.
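The quoted rule can be modelled in a few lines of Python. This is a rough sketch, not HP's implementation: the function name goto_of is invented, the manual does not say how halves are rounded (round-half-up is assumed here), and 2280 is a hypothetical "next statement" number, since the lines after 2270 are not shown in the question.

```python
import math


def goto_of(value: float, targets: list, next_line: int) -> int:
    """Return the line number that control moves to for GOTO value OF targets."""
    n = math.floor(value + 0.5)  # assumption: halves round up
    if 1 <= n <= len(targets):
        return targets[n - 1]    # the list is 1-based: n = 1 picks the first entry
    return next_line             # no matching entry: the GOTO is ignored


TARGETS = [2290, 2540, 2720]
print(goto_of(1, TARGETS, 2280))  # 2290
print(goto_of(2, TARGETS, 2280))  # 2540
print(goto_of(3, TARGETS, 2280))  # 2720
print(goto_of(4, TARGETS, 2280))  # 2280
```

In the listing above, lines 2190 to 2210 leave X at either 2 or 3 by the time line 2270 runs, so under this reading control goes to 2540 or 2720.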

Understanding a few iTerm2 default keycode mappings

What are the following used for in iTerm2? For example:
^2 through ^9 and ^- --> Send Hex Code 0x00 .. 0x7f ?
⌥↑ and ⌥↓ --> Send Hex Codes: 0x1b 0x1b 0x5b 0x41...
To put it all into one here are the ones in question in a nicely formatted way:
Short answer: For compatibility with old terminals. Real terminals that were dedicated pieces of hardware, not applications in a window system!
About the ^0...^9 ones: Terminals would typically use 7-bit ASCII codes, with character codes ranging from 0 to 127, inclusive. Codes 32-126 were used for letters, numbers and punctuation (as they still are in Unicode). 127 was usually the DELETE key (though sometimes that key would send code 8, "backspace", instead, leading to problems which persist in FAQ lists to this day). CTRL-A to CTRL-Z would generate the corresponding ASCII code minus 64, i.e. 1-26. The remaining codes could be a bit harder to generate. CTRL-@ usually gave you a 0, but sometimes that was on CTRL-space instead. The 27-31 range was usually at CTRL plus the "ASCII minus 64" position, so for example CTRL-] would give you 0x1D, but there were some terminals where those codes were mapped to CTRL-number instead, and those mappings in iTerm2 seem to be there to cater to people used to such terminals.
As for alt-arrow giving 1b 1b 5b 41, that is ESC ESC [ A . Now ESC [ A is a fairly common sequence for the up arrow alone, and prefixing it with an extra escape probably makes Emacs users happy, because it makes the Alt key work as a Meta key for them. I have not looked at the other multi-byte sequences, but I guess they have similar explanations.
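The "ASCII minus 64" rule and the escape sequence are easy to check with a little Python. This is just an illustration of the arithmetic; ctrl_code is a name made up for the example and has nothing to do with iTerm2's own configuration.

```python
def ctrl_code(key: str) -> int:
    """Control code for CTRL plus a key in the ASCII range '@' (0x40) to '_' (0x5F)."""
    code = ord(key.upper())
    if not 0x40 <= code <= 0x5F:
        raise ValueError(f"no 'ASCII minus 64' control code for {key!r}")
    return code - 0x40


print(ctrl_code("@"))       # 0
print(ctrl_code("a"))       # 1
print(ctrl_code("z"))       # 26
print(hex(ctrl_code("]")))  # 0x1d

# The alt-arrow bytes from the question, spelled out: ESC ESC [ A
sequence = bytes([0x1B, 0x1B, 0x5B, 0x41])
print(sequence == b"\x1b\x1b[A")  # True
```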

Selecting character code table in ESC/POS command

I need to print non-English characters on receipts using a thermal POS receipt printer. The Xprinter XP-58III thermal POS receipt printer supports generic ESC/POS commands.
As far as I know, this should be done by setting the character code table. In my case, the target code page is 21.
The ESC/POS command for setting the Code Page is 'ESC t n' (ASCII) or '1B 74 n' (Hex), where 'n' is page n of the character code table.
In the case of the hex form of the command: should I convert '21' to a hex value, or should I use this number without converting, i.e. '1B 74 21'?
Also, where should this command be added: right after the initialization code?
0x1B 0x40 0x1B 0x74 0x21
I use hex editor to add/edit ESC/POS codes inside the binary file.
EDIT: I solved the issue myself. In order to print any non-English characters on the POS receipt printer, we have to fulfill two conditions: 1) set the correct Code Page, and 2) set the corresponding encoding in the receipt file or POS software (same encoding as the Code Page). The correct Code Page for this POS printer model is 25 [WPC1257].
I solved the issue myself: the problem was that the wrong Code Page was set. The correct Code Page for this POS printer is 25 [WPC1257]. We also have to set the corresponding encoding in the receipt file (same encoding as the Code Page).
Page 21 would be "Thai Character Code 11". Here 21 is a decimal number, so in the binary file you need to write it as the hex byte 0x15. The command will then look like "0x1B 0x74 0x15".
Regarding the command position: in general, ESC/POS commands are executed in place and affect everything that follows. There should be no problem if you put it just after the initialization command. Just try it.
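If the bytes are being assembled by hand, a few lines of Python can confirm the decimal-to-hex step. This is a minimal sketch that only builds the byte string discussed above (ESC @ followed by ESC t n); the helper names are invented, and which page number a given printer expects still has to come from its manual.

```python
ESC = 0x1B


def init_printer() -> bytes:
    """ESC @ : the initialization command used in the question."""
    return bytes([ESC, 0x40])


def select_code_page(n: int) -> bytes:
    """ESC t n : select page n of the character code table (n is a single byte)."""
    if not 0 <= n <= 255:
        raise ValueError("code page number must fit in one byte")
    return bytes([ESC, 0x74, n])


command = init_printer() + select_code_page(21)
print(command.hex(" "))  # 1b 40 1b 74 15
```

Passing 25 instead of 21 gives a final byte of 0x19.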

How can I use SyncSort to convert data to unsigned packed format?

I have a requirement to convert numeric data (stored as character on input) to either packed signed or packed unsigned formats. I can convert to packed/signed using the "PD" format, but I'm having a difficult time getting unsigned packed data.
For instance, I need a ZD number like 14723 converted to:
042
173
Using PD, I get this (which is fine):
0173
042C
Any suggestions? We do not have COBOL at this shop and are relying on SyncSort to handle these data conversions. I'm not seeing a "PK" option in SyncSort, but I've missed things before!
So you don't want a packed-decimal, which always has a sign (even when F for unsigned) in the low-order half-byte. You want Binary Coded Decimal (BCD).
//STEP0100 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTOUT DD SYSOUT=*
//SYSIN DD *
  OPTION COPY
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(1,5,ZD,MUL,+10,TO=PD,LENGTH=4)),
        IFTHEN=(WHEN=INIT,BUILD=(1,3))
//SORTIN DD *
14723
Will give you, in vertical hex:
042
173
To use an existing BCD, look at field-type PD0.
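The byte-level effect of the trick (multiply by ten, pack, then drop the last byte) can be reproduced in Python to see why it works. This sketch only models the data layout and does not call SyncSort; the function names are invented for the example.

```python
def to_packed_decimal(digits: str, length: int, sign: str = "C") -> bytes:
    """Packed decimal: one digit per half-byte, sign in the low-order half-byte."""
    nibbles = (digits + sign).rjust(length * 2, "0")
    if len(nibbles) != length * 2:
        raise ValueError("value does not fit in the requested length")
    return bytes.fromhex(nibbles)


def to_bcd(digits: str) -> bytes:
    """Unsigned BCD: one digit per half-byte, no sign, zero-padded on the left."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("digits only")
    if len(digits) % 2:
        digits = "0" + digits
    return bytes.fromhex(digits)


def bcd_via_pd(digits: str) -> bytes:
    """Same steps as the control cards: multiply by 10, pack into 4 bytes, keep 3."""
    packed = to_packed_decimal(str(int(digits) * 10), 4)
    return packed[:3]  # the dropped byte holds the extra zero and the sign


print(to_packed_decimal("14723", 4).hex())  # 0014723c
print(to_bcd("14723").hex())                # 014723
print(bcd_via_pd("14723").hex())            # 014723
```

Multiplying by ten shifts every digit up by one half-byte, so the sign and the extra zero end up together in the final byte, which BUILD=(1,3) then discards.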

Resources