tech-userlevel archive


Re: A Library for Converting Data to and from C Structs for Lua



On Sat, 23 Nov 2013 11:46:19 -0200
Lourival Vieira Neto <lourival.neto%gmail.com@localhost> wrote:
> On Sat, Nov 23, 2013 at 1:22 AM, James K. Lowden
> <jklowden%schemamania.org@localhost> wrote:
> > On Mon, 18 Nov 2013 09:07:52 +0100
> > Marc Balmer <marc%msys.ch@localhost> wrote:
> >
> >> After discussion with lneto@ and others we realised that there are
> >> several such libraries around, and that I as well as lneto@ wrote
> >> one.  So we decided to merge our work
> >
> > How do you deal with the usual issues of alignment and endianism?
> 
> d = data.new{0xF0, 0xFF, 0x00} -- creates a new data object with 3 bytes.
> d:layout{
>   x = { __offset = 0, __length = 3 },
>   y = { __offset = 8, __length = 16, __endian = 'net' },
>   z = { __offset = 0, __step = 9 }
> }
> 
> d.x -- returns the 3 most significant bits from d (that is, 7)
> d.y -- returns 16 bits starting at bit 8, counting from the most
>       -- significant bit. In this case, these 2 bytes are converted
>       -- using ntohs(3), that is 0xFF00.
> d.z[1] -- returns the 9 most significant bits from d (that is, 0x1E1).

Hi Lourival, 

Thanks for your answer.  A few questions and observations, if I may.  

1.  What is the significance of the leading underscores?  
2.  I assume you mean d.x represents the three *least* significant bits.

I don't understand "step", not that it matters.  For purposes of
extracting/packing values in a buffer, offset and length are all you
need.  
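
To make that concrete, here is a minimal sketch in C of a field reader
that takes nothing but a bit offset and a bit length.  It assumes bits
are numbered from the most significant bit of the first byte, as the
quoted example appears to do.  The name get_bits is hypothetical and is
not taken from the library under discussion.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the library under discussion):
 * read `length` bits (1..32) starting `offset` bits into buf, where
 * bit 0 is the most significant bit of buf[0].
 * Returns 0 if the requested field does not fit in the buffer.
 */
static uint32_t
get_bits(const unsigned char *buf, size_t nbytes, size_t offset,
    unsigned length)
{
    uint32_t value = 0;
    size_t i;

    if (length == 0 || length > 32 || offset + length > nbytes * 8)
        return 0;

    for (i = offset; i < offset + length; i++) {
        /* Bit i lives in byte i / 8, at position 7 - (i % 8). */
        unsigned bit = (buf[i / 8] >> (7 - i % 8)) & 1u;
        value = (value << 1) | bit;
    }
    return value;
}
```

With the three bytes from the quoted example (0xF0, 0xFF, 0x00),
get_bits(d, 3, 0, 3) gives 7, get_bits(d, 3, 8, 16) gives 0xFF00, and
get_bits(d, 3, 0, 9) gives 0x1E1 under that numbering.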

Semantics require a type system for the bit patterns.  I guess "y" is
implied to be a 16-bit integer, since it has endianism, but its
signedness is unspecified.  I suggest you enumerate all types you will
support, and that that set encompass all types that a C compiler can
generate.  If you include an "ignore" type (cf. Perl's pack/unpack
functions), you can drop "offset" from your description, for which
you'll be glad eventually.  
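
As a rough illustration of the enumerated-types idea, the sketch below
decodes a buffer from a format string in which each letter fixes width,
byte order and signedness, and 'x' is the "ignore" type, so no explicit
offsets are needed.  The letters C, n, N and x are borrowed from Perl's
unpack.  The function name unpack and its signature are assumptions made
for illustration only, and only unsigned types are covered.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical format-driven unpacker; letters follow Perl's unpack:
 *   C  unsigned 8-bit
 *   n  unsigned 16-bit, big-endian
 *   N  unsigned 32-bit, big-endian
 *   x  skip one byte (the "ignore" type)
 * Decoded values are stored in out[] in order.  Returns the number of
 * values stored, or -1 on an unknown letter, a short buffer, or a
 * full out[].
 */
static int
unpack(const char *fmt, const unsigned char *buf, size_t len,
    uint32_t *out, size_t maxout)
{
    size_t pos = 0, n = 0;

    for (; *fmt != '\0'; fmt++) {
        size_t width, i;
        uint32_t v = 0;

        switch (*fmt) {
        case 'x': case 'C': width = 1; break;
        case 'n':           width = 2; break;
        case 'N':           width = 4; break;
        default:            return -1;
        }
        if (len - pos < width)
            return -1;

        /* Accumulate bytes most significant first (wire order). */
        for (i = 0; i < width; i++)
            v = (v << 8) | buf[pos + i];
        pos += width;

        if (*fmt == 'x')
            continue;           /* consumed but not reported */
        if (n == maxout)
            return -1;
        out[n++] = v;
    }
    return (int)n;
}
```

Because every field advances the position by its own width, the format
"CxnN" describes a one-byte value, one ignored byte, a 16-bit value and
a 32-bit value without a single offset being written down.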

For purposes of binary transfer, host endianism is unimportant; what
matters is the endianism of the wire format.  TCP/IP uses big-endian
format by definition.  ISTM that should be your default, too, else the
same code compiled on two different machines means two different
things.  
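
A small sketch of the difference, with hypothetical function names: the
first reader copies the bytes into a host integer, so its result depends
on the machine; the second builds the value from the bytes in wire
(big-endian) order, so it means the same thing everywhere.

```c
#include <stdint.h>
#include <string.h>

/* Host-order read: the result depends on the host's byte order. */
static uint16_t
read_host16(const unsigned char *p)
{
    uint16_t v;

    memcpy(&v, p, sizeof v);
    return v;
}

/* Wire-order (big-endian) read: the same result on any host. */
static uint16_t
read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}
```

For the bytes 0xFF, 0x00, read_be16 yields 0xFF00 on every machine,
while read_host16 yields 0xFF00 on a big-endian host and 0x00FF on a
little-endian one.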

A 2-byte integer starting at a 5-bit offset is weird for a
byte-addressable machine.  I don't see a need to support bitfields
unless you have an existing use case; bit arrays can always be
transmitted as character arrays, which after all is how they appear in
memory.  By "alignment" I was asking about padding and offsets in data
structures that the C language leaves up to the implementation.  
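
To show what I mean by padding, here is a sketch with a made-up struct
(struct rec and both helper names are hypothetical).  The compiler is
free to insert bytes between and after the members, so the in-memory
layout need not match the sum of the member sizes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: a 1-byte tag followed by a 32-bit value. */
struct rec {
    uint8_t  tag;
    uint32_t value;
};

/* Bytes of padding the compiler inserted between tag and value. */
static size_t
rec_gap(void)
{
    return offsetof(struct rec, value) - sizeof(uint8_t);
}

/* Total padding in the struct (interior plus trailing). */
static size_t
rec_padding(void)
{
    return sizeof(struct rec) - (sizeof(uint8_t) + sizeof(uint32_t));
}
```

On many common ABIs the gap is 3 bytes and sizeof(struct rec) is 8, but
the C standard does not promise either number, which is why a library
that maps buffers onto C structs has to take a position on it.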

> > (BTW, I'm curious what unstructured binary data might be.)
> 
> I don't think that Marc meant 'structured binary data' in opposition
> to 'unstructured binary data'. We both are working with structured
> data. The main difference is that I'm working on random accessing and
> he is working on encoding/decoding data at once (e.g., x, y, z =
> d:extract(fmt)).

Yes, "structured binary data" is overdressed for the party; it makes it
sound like more than it is.  It's just binary data (or, perhaps, binary
data structures).  

For your extract format ("fmt"), you might want to consider the gdb
x/fmt command because it encompasses everything you could need and is
the soul of brevity.  

As far as I can tell, by the way, you're reinventing part of ASN.1.
Nothing wrong with that, in and of itself; perhaps you can create
something more convenient to use.  But you might want to use it as a
reference for functionality, and be ready to explain why your library
should be used instead.  

HTH.  

--jkl

