How to obtain an xmlBuf for use with xmlBufGetNodeContent/xmlBufNodeDump etc.? - libxml2

The description of xmlNodeDump() (which takes an xmlBufferPtr) states:
Since this is using xmlBuffer structures it is limited to 2GB and somehow deprecated, use xmlBufNodeDump() instead.
Fair enough, but since xmlBufNodeDump() (and e.g. xmlBufGetNodeContent()) takes an xmlBufPtr, my question is: how do I create such an xmlBuf buffer?
I can't find anything obvious.
And how to free it?
I mean, for xmlBuffer there is e.g. xmlBufferCreate().
Or is one supposed to obtain an xmlOutputBuffer via xmlAllocOutputBuffer() and use its xmlOutputBuffer::buffer attribute (of type xmlBuf*) for xmlBufNodeDump()?

For those who need an answer:
xmlBufferPtr buffer;
buffer = xmlBufferCreate();
xmlBufPtr buf;
buf = xmlBufFromBuffer(buffer);

Related

Resize HDF5 dataset in Julia

Is there a way to resize a chunked dataset in HDF5 using Julia's HDF5.jl? I didn't see anything in the documentation. Looking through the source, all I found was set_dims!(), but that cannot extend a dataset (only shrink it). Does HDF5.jl have the ability to enlarge an existing (chunked) dataset? This is a very important feature for me, and I would rather not have to call into another language.
The docs have a brief mention of extendible dimensions in hdf5.md excerpted below.
You can use extendible dimensions,
d = d_create(parent, name, dtype, (dims, max_dims), "chunk", (chunk_dims), [lcpl, dcpl, dapl])
set_dims!(d, new_dims)
where dims is a tuple of integers. For example
b = d_create(fid, "b", Int, ((1000,),(-1,)), "chunk", (100,)) #-1 is equivalent to typemax(Hsize)
set_dims!(b, (10000,))
b[1:10000] = [1:10000]
I believe I've got it figured out. The issue is that I forgot to give the dataspace a large enough max_dims. Doing that required digging into the lower-level API. The solution I found was:
dspace = HDF5.dataspace((6,20)::Dims, max_dims=(6,typemax(Int64)))
dtype = HDF5.datatype(Float64)
dset = HDF5.d_create(prt, "trajectory", dtype, dspace, "chunk", (6,10))
Once I created a dataset that can be resized appropriately, the set_dims! function resizes the dataset correctly.
I think I located a few minor issues with the API, which I had to work around or change in my local version. I will get in touch with the HDF5.jl owner regarding those. For those interested:
The constant H5S_UNLIMITED is of type Uint64, but the dataspace function will only accept tuples of Int64, hence why I used typemax(Int64) for my max_dims to imitate how H5S_UNLIMITED is derived.
The form of d_create which I used calls h5d_create incorrectly; it passes parent instead of checkvalid(parent).id (can be seen by comparison with other forms of d_create).

libxml2 XML_PARSE_HUGE option for xmlParseMemory

C++ on Centos 6.4, libxml2.x86_64 2.7.6-12.el6_4.1:
I'm trying to fix an old C++ program that occasionally gets XML parser errors on large XML files; it seems to need the XML_PARSE_HUGE option set. But I can't see any place to set it! The failing code uses the xmlParseMemory function, which has only 2 parameters: the char array to parse and its size.
Is there some way to set the XML_PARSE_HUGE option globally?
You have to switch to xmlReadMemory which has an options parameter. Simply convert calls like
xmlParseMemory(buffer, size);
to
xmlReadMemory(buffer, size, NULL, NULL, XML_PARSE_HUGE);
(I think xmlParseMemory predates the parser options and is only retained for backward compatibility. Also see this question.)

GnuPG 1.4 RSA: Where's the Padding?

In an effort to better understand RSA I've been fooling around with the source code for GnuPG 1.4, specifically the RSA implementation in the rsa.c file. As the title says, I can't figure out where the padding is happening.
So typically in RSA, padding is done right before the encryption and is taken off during the decryption. Encryption first starts around line 409 where we see
int
rsa_encrypt( int algo, MPI *resarr, MPI data, MPI *pkey )
{
    RSA_public_key pk;

    if( algo != 1 && algo != 2 )
        return G10ERR_PUBKEY_ALGO;
    pk.n = pkey[0];
    pk.e = pkey[1];
    resarr[0] = mpi_alloc( mpi_get_nlimbs( pk.n ) );
    public( resarr[0], data, &pk );
    return 0;
}
That seems easy: it hands data to the public() function, defined higher up around line 220. public() is responsible for the core computation, c = m^e mod n. It looks like this:
static void
public(MPI output, MPI input, RSA_public_key *pkey )
{
    if( output == input ) { /* powm doesn't like output and input the same */
        MPI x = mpi_alloc( mpi_get_nlimbs(input)*2 );
        mpi_powm( x, input, pkey->e, pkey->n );
        mpi_set(output, x);
        mpi_free(x);
    }
    else
        mpi_powm( output, input, pkey->e, pkey->n );
}
Wait a second...now it looks like public is passing the job of that calculation off to mpi_powm() located in the mpi-pow.c file. I'll spare you the details but that function gets really long.
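For orientation, the operation delegated to mpi_powm() is plain modular exponentiation with no padding step involved. The following is only a minimal sketch of that idea using 64-bit integers and a toy key; the function name modpow and the key values are illustrative assumptions, not GnuPG's multi-precision code.

```c
#include <assert.h>
#include <stdint.h>

/* Square-and-multiply: returns (base^exp) mod n.
 * Toy version (not GnuPG's MPI code): n must fit in 32 bits so that
 * every intermediate product fits in 64 bits. */
uint64_t modpow(uint64_t base, uint64_t exp, uint64_t n)
{
    assert(n != 0 && n <= UINT32_MAX);

    uint64_t result = 1 % n;
    base %= n;
    while (exp > 0) {
        if (exp & 1)                    /* current exponent bit is set */
            result = (result * base) % n;
        base = (base * base) % n;       /* square for the next bit */
        exp >>= 1;
    }
    return result;
}
```

With the hypothetical toy key n = 3233 (61 * 53), e = 17, d = 2753, modpow(65, 17, 3233) gives 2790, and modpow(2790, 2753, 3233) recovers 65. Nothing in this step adds or removes padding; it only transforms whatever integer it is handed.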
Somewhere in all of this some sort of PKCS#1 padding and unpadding (or something similar) is happening but I can't figure out where for the life of me. Can anyone help me see where the padding happens?
In an effort to better understand RSA I've been fooling around with the source code for GnuPG 1.4, specifically the RSA implementation in the rsa.c file.
Since you’re looking at the older (< 2.0) stuff anyway, and since it’s only for learning purposes, I would rather advise you to check out “ye olde rsaref.c from gnupg.org” where the padding is implemented in a pretty obvious way.
… some sort of PKCS#1…
To be exact, GnuPG uses PKCS #1 v1.5 (specified in RFC 4880).
Can anyone help me see where the padding happens?
Hmmm, let’s see if I can wrap that up somewhat logically. GnuPG pads according to PKCS #1 v1.5, so it just adds random padding to satisfy length requirements.
If you take a look at the cipher/pubkey.c file (which includes the rsa.h file in its head), you’ll notice a pubkey_table_s struct which defines a list of elements that define the key. For padding reasons, random bytes are appended to that list (better: after that struct). It’s done that way because those random bytes can easily be stripped by looking for the end of the list. Keeping a long story short, that’s where random.c probably starts to make a bit more sense to you. Now, all that stuff (and a whole lot more) is compiled into a lib called libcipher… which in itself is compiled to be used by functions that add the padding and handle the RSA stuff the way you expected it. In the end, the compiled executables use the functions libcipher provides to take care of the padding – depending on the individual need for padding.
So what you currently expect to find in 1 or 2, maybe 3 files is actually spread out across more than half a dozen files… which I don’t consider the best basis for your learning efforts. As said, for reference purposes, I’d go for the old rsaref.c they once started out with.
Not sure if this actually provides all the details you wanted to get, but it should give you a first good heads-up… hope it helps.
GPG 1.4 doesn't use any padding at all. It encrypts the raw session key.

Arduino program code in .data

It seems as though some of my functions are being placed into the .data section. This is for a library that has classes.
I've looked at the memory map as suggested here:
http://www.nongnu.org/avr-libc/user-manual/group_demo_project.html
I've also been using avr-size to see the size of the .data and .text sections.
Any ideas why the program code is getting placed in .data and not .text?
I think I figured out what was happening.
It only looked like code was going into the .data section. What was actually going in there were the char * from debug messages and thus taking up large fractions of space.
For example, I had a bunch of Serial.println("debug message that is a long string."); calls. An easy way around this for Serial.println is the F() macro, which keeps the string in flash instead of RAM (the .data section I was seeing).
Also, this link provides some good info on memory conservation of strings:
http://arduino.cc/en/Reference/PROGMEM

Unix write() function (libc)

I am making a C application in Unix that uses raw tty input.
I am calling write() to write characters to the display, but I want to manipulate the cursor:
ssize_t
write(int d, const void *buf, size_t nbytes);
I've noticed that if buf has the value 8 (I mean char tmp = 8, then passing &tmp), it will move the cursor/pointer backward on the screen.
I was wondering where I could find all the codes. For example, I want to move the cursor forward, but I cannot seem to find the code for it via Google.
Is there a page that lists all the codes for the write() function, please?
Thank you very much,
Jary
8 is just the ASCII code for backspace. You can type man ascii and look at all the values (the man page on my Ubuntu box has friendlier names for the values). If you want to do more complicated things, you may want to look at a library like ncurses.
You have just discovered that character code 8 is backspace (control-H).
You would probably be best off using the curses library to manage the screen. However, you can find out what control sequences curses knows about by using infocmp to decompile the terminfo entry for your terminal. The format isn't particularly easy to understand, but it is relatively comprehensive. The alternative is to find a manual for the terminal, which tends to be rather hard.
For instance, I'm using a color Xterm window; infocmp says:
# Reconstructed via infocmp from file: /usr/share/terminfo/78/xterm-color
xterm-color|nxterm|generic color xterm,
am, km, mir, msgr, xenl,
colors#8, cols#80, it#8, lines#24, ncv#, pairs#64,
acsc=``aaffggiijjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~,
bel=^G, bold=\E[1m, clear=\E[H\E[2J, cr=^M,
csr=\E[%i%p1%d;%p2%dr, cub=\E[%p1%dD, cub1=^H,
cud=\E[%p1%dB, cud1=^J, cuf=\E[%p1%dC, cuf1=\E[C,
cup=\E[%i%p1%d;%p2%dH, cuu=\E[%p1%dA, cuu1=\E[A,
dch=\E[%p1%dP, dch1=\E[P, dl=\E[%p1%dM, dl1=\E[M, ed=\E[J,
el=\E[K, enacs=\E)0, home=\E[H, ht=^I, hts=\EH, il=\E[%p1%dL,
il1=\E[L, ind=^J,
is2=\E[m\E[?7h\E[4l\E>\E7\E[r\E[?1;3;4;6l\E8, kbs=^H,
kcub1=\EOD, kcud1=\EOB, kcuf1=\EOC, kcuu1=\EOA,
kdch1=\E[3~, kf1=\E[11~, kf10=\E[21~, kf11=\E[23~,
kf12=\E[24~, kf13=\E[25~, kf14=\E[26~, kf15=\E[28~,
kf16=\E[29~, kf17=\E[31~, kf18=\E[32~, kf19=\E[33~,
kf2=\E[12~, kf20=\E[34~, kf3=\E[13~, kf4=\E[14~,
kf5=\E[15~, kf6=\E[17~, kf7=\E[18~, kf8=\E[19~, kf9=\E[20~,
kfnd=\E[1~, kich1=\E[2~, kmous=\E[M, knp=\E[6~, kpp=\E[5~,
kslt=\E[4~, meml=\El, memu=\Em, op=\E[m, rc=\E8, rev=\E[7m,
ri=\EM, rmacs=^O, rmcup=\E[2J\E[?47l\E8, rmir=\E[4l,
rmkx=\E[?1l\E>, rmso=\E[m, rmul=\E[m,
rs2=\E[m\E[?7h\E[4l\E>\E7\E[r\E[?1;3;4;6l\E8, sc=\E7,
setab=\E[4%p1%dm, setaf=\E[3%p1%dm, sgr0=\E[m, smacs=^N,
smcup=\E7\E[?47h, smir=\E[4h, smkx=\E[?1h\E=, smso=\E[7m,
smul=\E[4m, tbc=\E[3g, u6=\E[%i%d;%dR, u7=\E[6n,
u8=\E[?1;2c, u9=\E[c,
That contains information about box drawing characters, code sequences generated by function keys, various cursor movement sequences, and so on.
You can find out more about X/Open Curses (v4.2) in HTML. However, that is officially obsolete, superseded by X/Open Curses v7, which you can download for free in PDF.
If you're using write just so you have low-level cursor control, I think you are using the wrong tool for the job. There are command codes for many types of terminal. VT100 codes, for example, are sequences of the form "\x1b[...", but rather than sending raw codes, you'd be much better off using a library like ncurses.
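If raw write() is used anyway, moving the cursor forward comes down to sending the sequence shown as cuf=\E[%p1%dC in the infocmp listing above. The sketch below assumes a VT100/xterm-compatible terminal; the helper names format_cursor_forward and cursor_forward are made up for illustration, and a curses library remains the more portable choice.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Formats the "cursor forward n columns" sequence (ESC [ n C) into buf.
 * Returns the sequence length, or -1 if buf is too small.
 * Assumes a VT100/xterm-style terminal. */
int format_cursor_forward(char *buf, size_t cap, unsigned n)
{
    assert(buf != NULL);

    int len = snprintf(buf, cap, "\x1b[%uC", n);
    return (len < 0 || (size_t)len >= cap) ? -1 : len;
}

/* Sends the sequence to file descriptor fd with write().
 * Returns 0 on success, -1 on failure or short write. */
int cursor_forward(int fd, unsigned n)
{
    char seq[16];
    int len = format_cursor_forward(seq, sizeof seq, n);

    if (len < 0)
        return -1;
    return write(fd, seq, (size_t)len) == (ssize_t)len ? 0 : -1;
}
```

Calling cursor_forward(STDOUT_FILENO, 3) would then ask the terminal to move the cursor three columns to the right. Unlike the single byte 8 for backspace, this is a multi-byte sequence, so it has to be passed to write() as a buffer with its full length.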
