ll.ul4on – Object serialization

This module provides functions for encoding and decoding a lightweight text-based format for serializing the object types supported by UL4.

It is extensible to allow encoding/decoding arbitrary instances (i.e. it is basically a reimplementation of pickle, but with string input/output instead of bytes and with an eye towards cross-platform support).

There are implementations for Python (this module), Java and JavaScript (as part of the UL4 packages for those languages).

Furthermore there’s an Oracle package that can be used for generating UL4ON encoded data.

Basic usage follows the API design of pickle, json, etc. and supports most builtin Python types:

>>> from ll import ul4on
>>> ul4on.dumps(None)
'n'
>>> ul4on.loads('n')
>>> ul4on.dumps(False)
'bF'
>>> ul4on.loads('bF')
False
>>> ul4on.dumps(42)
'i42'
>>> ul4on.loads('i42')
42
>>> ul4on.dumps(42.5)
'f42.5'
>>> ul4on.loads('f42.5')
42.5
>>> ul4on.dumps('foo')
"S'foo'"
>>> ul4on.loads("S'foo'")
'foo'

date, datetime and timedelta objects are supported too:

>>> import datetime
>>> ul4on.dumps(datetime.date.today())
'X i2014 i11 i3'
>>> ul4on.dumps(datetime.datetime.now())
'Z i2014 i11 i3 i18 i16 i45 i314157'
>>> ul4on.loads('X i2014 i11 i3')
datetime.date(2014, 11, 3)
>>> ul4on.loads('Z i2014 i11 i3 i18 i16 i45 i314157')
datetime.datetime(2014, 11, 3, 18, 16, 45, 314157)
>>> ul4on.dumps(datetime.timedelta(days=1))
'T i1 i0 i0'
>>> ul4on.loads('T i1 i0 i0')
datetime.timedelta(1)
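
The date and datetime dumps above consist of a type code followed by whitespace-separated integer fields. The following is a minimal sketch of reading that layout by hand, assuming the format is exactly what those examples show. ``toy_load_date`` is a hypothetical helper for illustration only and is not part of ll.ul4on; real code should call ul4on.loads().

```python
import datetime

def toy_load_date(dump):
    """Decode an 'X ...' (date) or 'Z ...' (datetime) dump like the ones shown above."""
    typecode, *fields = dump.split()
    values = []
    for field in fields:
        # In the examples every field is an integer written as "i<digits>".
        if not field.startswith("i"):
            raise ValueError(f"expected an integer field, got {field!r}")
        values.append(int(field[1:]))
    if typecode == "X":
        return datetime.date(*values)
    if typecode == "Z":
        return datetime.datetime(*values)
    raise ValueError(f"unsupported type code {typecode!r}")

print(toy_load_date("X i2014 i11 i3"))
```

The sketch deliberately ignores every other type code; it only shows how the fields line up with the constructor arguments of date and datetime.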

ll.ul4on also supports Color objects from ll.color:

>>> from ll import color
>>> ul4on.dumps(color.red)
'C i255 i0 i0 i255'
>>> ul4on.loads('C i255 i0 i0 i255')
Color(0xff, 0x00, 0x00)

Lists, dictionaries and sets are also supported:

>>> ul4on.dumps([1, 2, 3])
'L i1 i2 i3 ]'
>>> ul4on.loads('L i1 i2 i3 ]')
[1, 2, 3]
>>> ul4on.dumps(dict(one=1, two=2))
"D S'two' i2 S'one' i1 }"
>>> ul4on.loads("D S'two' i2 S'one' i1 }")
{'one': 1, 'two': 2}
>>> ul4on.dumps({1, 2, 3})
'Y i1 i2 i3 }'
>>> ul4on.loads('Y i1 i2 i3 }')
{1, 2, 3}

ll.ul4on can also handle recursive data structures:

>>> r = []
>>> r.append(r)
>>> ul4on.dumps(r)
'L ^0 ]'
>>> r2 = ul4on.loads('L ^0 ]')
>>> r2
[[...]]
>>> r2 is r2[0]
True
>>> r = {}
>>> r['recursive'] = r
>>> ul4on.dumps(r)
"D S'recursive' ^0 }"
>>> r2 = ul4on.loads("D S'recursive' ^0 }")
>>> r2
{'recursive': {...}}
>>> r2['recursive'] is r2
True
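
The ``^0`` in the dumps above is a backreference to an object that has already been written. The following toy encoder sketches one way such bookkeeping could work for integers and lists. It is an illustration under stated assumptions, not the implementation used by ll.ul4on: ``toy_dumps`` is a hypothetical name, it only numbers lists, and the real format may assign backreference indexes to other objects as well.

```python
def toy_dumps(obj):
    """Encode ints and (possibly recursive) lists in a UL4ON-like notation."""
    seen = {}   # maps id() of an already written list to its backreference index
    parts = []  # output tokens, joined with spaces at the end

    def encode(value):
        if type(value) is int:  # excludes bool, which is a subclass of int
            parts.append(f"i{value}")
        elif isinstance(value, list):
            key = id(value)
            if key in seen:
                # The list was written before: emit a backreference instead.
                parts.append(f"^{seen[key]}")
                return
            # Remember the list *before* encoding its items, so that a list
            # containing itself finds its own index.
            seen[key] = len(seen)
            parts.append("L")
            for item in value:
                encode(item)
            parts.append("]")
        else:
            raise TypeError(f"unsupported type: {type(value).__name__}")

    encode(obj)
    return " ".join(parts)

r = []
r.append(r)
print(toy_dumps(r))
```

Registering the list before descending into its items is the step that makes recursive structures terminate; registering it afterwards would recurse forever.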

UL4ON is extensible. It supports serializing arbitrary instances by registering the class with the UL4ON serialization machinery:

from ll import ul4on

@ul4on.register("com.example.person")
class Person:
        def __init__(self, firstname=None, lastname=None):
                self.firstname = firstname
                self.lastname = lastname

        def __repr__(self):
                return f"<Person firstname={self.firstname!r} lastname={self.lastname!r}>"

        def ul4ondump(self, encoder):
                encoder.dump(self.firstname)
                encoder.dump(self.lastname)

        def ul4onload(self, decoder):
                self.firstname = decoder.load()
                self.lastname = decoder.load()

jd = Person("John", "Doe")
output = ul4on.dumps(jd)
print("Dump:", output)
jd2 = ul4on.loads(output)
print("Loaded:", jd2)

This script outputs:

Dump: O S'com.example.person' S'John' S'Doe' )
Loaded: <Person firstname='John' lastname='Doe'>

It is also possible to pass a custom registry to load() and loads():

from ll import ul4on

class Person:
        ul4onname = "com.example.person"

        def __init__(self, firstname=None, lastname=None):
                self.firstname = firstname
                self.lastname = lastname

        def __repr__(self):
                return f"<Person firstname={self.firstname!r} lastname={self.lastname!r}>"

        def ul4ondump(self, encoder):
                encoder.dump(self.firstname)
                encoder.dump(self.lastname)

        def ul4onload(self, decoder):
                self.firstname = decoder.load()
                self.lastname = decoder.load()

jd = Person("John", "Doe")
output = ul4on.dumps(jd)
print("Dump:", output)
jd2 = ul4on.loads(output, {"com.example.person": Person})
print("Loaded:", jd2)

Any type name not found in the registry dict passed in will be looked up in the global registry.
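The lookup order described here (custom registry first, global registry as fallback) can be sketched with collections.ChainMap. This is only an illustration of the lookup rule: ``global_registry``, ``lookup`` and the example classes are hypothetical stand-ins, and the exception raised for an unknown name is an assumption, not the behavior of ll.ul4on.

```python
from collections import ChainMap

class Person:
    pass

class Address:
    pass

class PersonV2(Person):
    pass

# Hypothetical stand-in for the global registry filled by a register() decorator.
global_registry = {
    "com.example.person": Person,
    "com.example.address": Address,
}

def lookup(name, registry=None):
    """Return the factory for ``name``, preferring the custom registry."""
    chain = ChainMap(registry if registry is not None else {}, global_registry)
    try:
        return chain[name]
    except KeyError:
        # Assumption: the error type is chosen for this sketch only.
        raise LookupError(f"no class registered for type name {name!r}") from None

custom = {"com.example.person": PersonV2}
print(lookup("com.example.person", custom).__name__)   # found in the custom registry
print(lookup("com.example.address", custom).__name__)  # falls back to the global one
```

A custom registry is therefore a way to override individual type names for a single load without touching the globally registered classes.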

Note

If a class isn’t registered with the UL4ON serialization machinery, you have to set the class attribute ul4onname yourself for serialization to work.

In situations where an UL4ON API is updated frequently, it makes sense to be able to update the writing side and the reading side independently. To support this, Decoder has a method loadcontent(), a generator that reads the content of an object from the input and yields the items one by one. For our example class it could be used like this:

from ll import ul4on

class Person:
        ul4onname = "com.example.person"

        def __init__(self, firstname=None, lastname=None):
                self.firstname = firstname
                self.lastname = lastname

        def __repr__(self):
                return f"<Person firstname={self.firstname!r} lastname={self.lastname!r}>"

        def ul4ondump(self, encoder):
                encoder.dump(self.firstname)
                encoder.dump(self.lastname)

        def ul4onload(self, decoder):
                index = -1
                for (index, item) in enumerate(decoder.loadcontent()):
                        if index == 0:
                                self.firstname = item
                        elif index == 1:
                                self.lastname = item
                # Initialize attributes that were not loaded by ``loadcontent``
                if index < 1:
                        self.lastname = None
                        if index < 0:
                                self.firstname = None

output = """o s'com.example.person' s'John' )"""
j = ul4on.loads(output, {"com.example.person": Person})
print("Loaded:", j)

ll.ul4on.register(name)

This decorator can be used to register the decorated class with the ll.ul4on serialization machinery.

name must be a globally unique name for the class. To avoid name collisions Java’s class naming system should be used (i.e. an inverted domain name like com.example.foo.bar).

name will be stored in the class attribute ul4onname.

class ll.ul4on.Encoder(stream, indent=None)

Bases: object

An Encoder is used for serializing an object into an UL4ON dump.

It manages the internal state required for serialization, such as the bookkeeping for backreferences.

__init__(stream, indent=None)

Create an encoder for serializing objects to self.stream.

stream must provide a write() method.

dump(obj)

Serialize obj into the stream as an UL4ON formatted dump.

class ll.ul4on.Decoder(stream, registry=None)

Bases: object

A Decoder is used for deserializing an UL4ON dump.

It manages the internal state required for deserialization, such as the bookkeeping for backreferences.

__init__(stream, registry=None)

Create a decoder for deserializing objects from self.stream.

stream must provide a read() method.

registry is used as a “custom type registry”. It must map UL4ON type names to callables that create new empty instances of those types. Any type not found in registry will be looked up in the global registry (see register()).

load()

Deserialize the next object in the stream and return it.

loadcontent()

Load the content of an object until the “object terminator” is encountered.

This is a generator and might produce fewer or more items than expected. The caller must be able to handle both cases (e.g. by ignoring additional items or initializing missing items with a default value).

The iterator should always be exhausted by the caller; otherwise the stream will be left in an undefined state.
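
The requirements above (tolerate too few items, tolerate too many, always exhaust the iterator) can be wrapped in a small helper. This is a minimal sketch under stated assumptions: ``fill_fields`` is a hypothetical function, not part of ll.ul4on, and a plain iterator stands in for the generator returned by loadcontent().

```python
def fill_fields(obj, names, items, default=None):
    """Assign ``items`` to the attributes ``names`` of ``obj`` in order."""
    count = 0
    for count, item in enumerate(items, start=1):
        # Surplus items are still consumed, so the iterator is always
        # exhausted, but they are not stored anywhere.
        if count <= len(names):
            setattr(obj, names[count - 1], item)
    # Attributes for which no item was produced get the default value.
    for name in names[count:]:
        setattr(obj, name, default)
    return obj

class Person:
    pass

# Stand-in for ``decoder.loadcontent()``: an older dump with only one field.
p = fill_fields(Person(), ("firstname", "lastname"), iter(["John"]))
print(p.firstname, p.lastname)
```

Inside a ul4onload() method the same helper could be called with ``self`` and ``decoder.loadcontent()`` as arguments, which keeps the index bookkeeping out of each class.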

ll.ul4on.dumps(obj, indent=None)

Serialize obj as an UL4ON formatted string.

ll.ul4on.dump(obj, stream, indent=None)

Serialize obj as an UL4ON formatted dump and write it to stream.

stream must provide a write() method.

ll.ul4on.loadclob(clob, bufsize=1048576, registry=None)

Deserialize clob (which must be a cx_Oracle CLOB variable containing an UL4ON formatted object) to a Python object.

bufsize specifies the chunk size for reading the underlying CLOB object.

For the meaning of registry see Decoder.__init__().

ll.ul4on.loads(string, registry=None)

Deserialize string (which must be a string containing an UL4ON formatted object) to a Python object.

For the meaning of registry see Decoder.__init__().

ll.ul4on.load(stream, registry=None)

Deserialize stream (which must be a file-like object with a read() method containing an UL4ON formatted object) to a Python object.

For the meaning of registry see Decoder.__init__().