Exobiont Systems

⌂ Home

Ladder Types

In order to implement complex datastructures and algorithms, usually many layers of abstraction are built ontop of each other. Consequently higher-level data types are encoded into lower-level data types, forming a chain of embeddings from concept to `rock bottom' of byte streams. While a high-level type makes claims about the semantics of objects of that type, high-level types are ambiguous in regard to their concrete syntactical representation or memory layout. However for compositions to be type-safe, compatibility of concrete represenations must be ensured.

For example in the unix shell, many different tools & utilities coexist. Depending on the application domain, each of them will potentially make use of different representational forms for the same abstract concepts. E.g. for the concept 'natural number', many representations do exist, e.g. with variation over radices, endianness, digit encoding etc.

Intuitively, ladder types provide a way to distinguish between multiple concrete representations of the same abstract / conceptual type, by capturing the represented-as of layered data formats in the structure of type-terms. Formally, we introduce a new type constructor, called the ladder type, written T1 ~ T2, where T1 and T2 are types. The type-term T1 ~ T2 then expresses the abstract type of T1 being represented in terms of the concrete type T2, which can be read by "T1 represented as T2".

Example

The following type describes a colon-separated sequence of timepoints, each represented as unix-timestamp written as decimal number in big-endian, encoded as UTF-8 string.

                                 #  We start with the conceptual type,
                                 #  which is a sequence (indicated by `[...]`)
[   TimePoint                    #  of timepoints, each given as UNIX timestamp,
  ~ <TimeSince UnixEpoch>        #  which counts time from the `UnixEpoch`
  ~ Duration                     #  as a duration,
  ~ SI.Seconds                   #  given in seconds.
  ~ ℝ                            #  Durations of seconds are real numbers
  ~ <QuantizedLinear 0 1 1>      #  that we quantize with one integer
  ~ ℕ                            #  step for the range [0.0 .. 1.0), i.e. `floor`.
  ~ <PosInt 10 BigEndian>        #  Those integer numbers are represented in positional big-endian,
  ~ [   <Digit 10>               #  given as a string of decimal digits,
      ~ Char       ]             #  where each decimal digit is represented by a character.
]                                #  The sequence of timepoints is also a sequence of strings
~ <SepSeq Char ':'>              #  that we embed into a single string by the ':' separator.
~ [Char]                         #  Finally, this string of unicode characters
~ UTF-8                          #  is encoded by the UTF-8 scheme
~ [Byte]                         #  to reach rock-bottom of a byte sequences.

An object that fits the format described by this type could look like this:

1696093021:1696093039:1528324679:1539892301:1638141920:1688010253