The tree in-depth
The ASDF tree, being encoded in YAML, is built out of the basic structures common to most dynamic languages: mappings (dictionaries), sequences (lists), and scalars (strings, integers, floating-point numbers, booleans, etc.). All of this comes “for free” by using YAML.
Since these core data structures on their own are so flexible, the ASDF standard includes a number of schemas that define the structure of higher-level content. For instance, there is a schema that defines how n-dimensional array data should be described. These schemas are written in a language called YAML Schema, which is just a thin extension of JSON Schema, Draft 4. (Such extensions are allowed and even encouraged by the JSON Schema standard, which defines the $schema attribute as a place to specify which extension is being used.) ASDF Schemas contains an overview of how schemas are defined and used by ASDF. ASDF Standard Schema Definitions describes in detail all of the schemas provided by the ASDF Standard.
YAML subset
For reasons of portability, some features of YAML 1.1 are not permitted in an ASDF tree.
Restricted mapping keys
YAML itself places no restrictions on the object type used as a mapping key; floats, sequences, even mappings themselves can serve as a key. For example, the following is a perfectly valid YAML document:
%YAML 1.1
---
{foo: bar}:
3.14159: baz
[1, 2, 3]: qux
...
However, such a file may not be easily parsed in all languages. Python, for example, does not include a hashable mapping type, so the two major Python YAML libraries both fail to construct the object described by this document. Floating-point keys are described as “not recommended” in the YAML 1.1 spec because YAML does not specify an accuracy for floats.
For these reasons, mapping keys in ASDF trees are restricted to the following scalar types:
bool
int
str
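The key restriction above can be checked mechanically once a tree has been loaded. The following is a minimal sketch, assuming the tree is already available as nested Python dicts and lists; the helper name `find_invalid_keys` and the sample tree are hypothetical and not part of any ASDF library.

```python
# Key types permitted in an ASDF tree, per the list above.
ALLOWED_KEY_TYPES = (bool, int, str)


def find_invalid_keys(node, path=""):
    """Return (path, key) pairs for every mapping key of a disallowed type."""
    invalid = []
    if isinstance(node, dict):
        for key, value in node.items():
            if not isinstance(key, ALLOWED_KEY_TYPES):
                invalid.append((path, key))
            # Recurse into the value, extending the path with this key.
            invalid.extend(find_invalid_keys(value, f"{path}/{key}"))
    elif isinstance(node, (list, tuple)):
        for index, item in enumerate(node):
            invalid.extend(find_invalid_keys(item, f"{path}/{index}"))
    return invalid


# Hypothetical tree: a float key and a tuple key are both rejected.
tree = {
    "foo": {3.14159: "baz", 7: "ok"},
    "items": [{(1, 2, 3): "qux"}],
}
problems = find_invalid_keys(tree)
```

Here `problems` lists the float key under `/foo` and the tuple key under `/items/0`; a writer could refuse to serialize the tree when the list is non-empty.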
References
It is possible to directly reference other items within the same tree or within the tree of another ASDF file. This functionality is based on two IETF standards: JSON Pointer (IETF RFC 6901) and JSON Reference (Draft 3).
A reference is represented as a mapping (dictionary) with a single key/value pair. The key is always the special keyword $ref and the value is a URI. The URI may contain a fragment (the part following the # character) in JSON Pointer syntax that references a specific element within the external file. This is a /-delimited path where each element is a mapping key or an array index. If no fragment is present, the reference refers to the top of the tree.
Note
JSON Pointer is a very simple convention. The only wrinkle is that because the characters '~' (0x7E) and '/' (0x2F) have special meanings, '~' needs to be encoded as '~0' and '/' needs to be encoded as '~1' when these characters appear in a reference token.
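The escaping rule in this note is small enough to show directly. This is a sketch of the token encoding described in RFC 6901; the function names are illustrative only.

```python
def escape_token(token: str) -> str:
    """Encode one reference token for use in a JSON Pointer."""
    # '~' must be handled first; otherwise the '~' introduced by '~1'
    # would itself be re-encoded.
    return token.replace("~", "~0").replace("/", "~1")


def unescape_token(token: str) -> str:
    """Decode one reference token taken from a JSON Pointer."""
    # RFC 6901 prescribes the reverse order when decoding: '~1' first,
    # then '~0', so that '~01' decodes to '~1' rather than '/'.
    return token.replace("~1", "/").replace("~0", "~")


encoded = escape_token("a/b~c")      # 'a~1b~0c'
decoded = unescape_token(encoded)    # 'a/b~c'
```

The order of the two replacements matters in both directions, which is the only subtle point of the convention.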
When these references are resolved, this mapping should be treated as having the same logical content as the target of the URI, though the exact details of how this is performed depend on the implementation: a library may copy the target data into the source tree, or it may insert a proxy object that is lazily loaded at a later time.
For example, suppose we had a given ASDF file containing some shared reference data, available on a public webserver at the URI http://www.nowhere.com/reference.asdf:
wavelengths:
  - !core/ndarray
    source: 0
    shape: [256, 256]
    datatype: float
    byteorder: little
Another file may reference this data directly:
reference_data:
  $ref: "http://www.nowhere.com/reference.asdf#/wavelengths/0"
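To make the structure of such a reference concrete, the sketch below splits a $ref URI into the file part and the JSON Pointer tokens, using only the Python standard library. The helper name `split_reference` is hypothetical, and the sketch assumes the fragment contains no percent-encoded characters.

```python
from urllib.parse import urldefrag


def split_reference(uri):
    """Split a $ref URI into (file_uri, list_of_pointer_tokens)."""
    file_uri, fragment = urldefrag(uri)
    if fragment == "":
        # No fragment: the reference points at the top of the tree.
        return file_uri, []
    if not fragment.startswith("/"):
        raise ValueError(f"fragment is not a JSON Pointer: {fragment!r}")
    # Drop the empty string before the leading '/', then undo the
    # '~1' and '~0' escapes in each token.
    tokens = [
        raw.replace("~1", "/").replace("~0", "~")
        for raw in fragment.split("/")[1:]
    ]
    return file_uri, tokens


file_uri, tokens = split_reference(
    "http://www.nowhere.com/reference.asdf#/wavelengths/0"
)
```

For the URI shown, `file_uri` is the address of the external file and `tokens` is `["wavelengths", "0"]`: a mapping key followed by an array index.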
It is also possible to use references within the same file:
data: !core/ndarray
  source: 0
  shape: [256, 256]
  datatype: float
  byteorder: little
  mask:
    $ref: "#/my_mask"
my_mask: !core/ndarray
  source: 0
  shape: [256, 256]
  datatype: uint8
  byteorder: little
Reference resolution should be performed after the entire tree is read; forward references within the same file are therefore explicitly allowed.
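One way to honor this ordering is to load the whole tree first and only then walk it, replacing each same-file reference with its target. The following is a minimal sketch of the "copy the target into the tree" strategy, assuming the tree is held as nested Python dicts and lists. The function names are hypothetical, external URIs are left untouched, and references inside a resolved target are not followed.

```python
def resolve_pointer(root, fragment):
    """Look up a JSON Pointer fragment such as '#/my_mask' in root."""
    pointer = fragment[1:] if fragment.startswith("#") else fragment
    if pointer == "":
        return root  # empty pointer: the top of the tree
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {fragment!r}")
    node = root
    for raw in pointer.split("/")[1:]:
        token = raw.replace("~1", "/").replace("~0", "~")
        # Sequences are indexed by integer, mappings by string key.
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def resolve_local_refs(node, root):
    """Return a copy of node with same-file $ref mappings replaced."""
    if isinstance(node, dict):
        if set(node) == {"$ref"} and node["$ref"].startswith("#"):
            return resolve_pointer(root, node["$ref"])
        return {k: resolve_local_refs(v, root) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve_local_refs(v, root) for v in node]
    return node


# 'mask' appears before 'my_mask', so this is a forward reference.
tree = {
    "mask": {"$ref": "#/my_mask"},
    "my_mask": {"shape": [256, 256], "datatype": "uint8"},
}
resolved = resolve_local_refs(tree, tree)
```

Because the lookup happens against the fully loaded `root`, the position of the target relative to the reference does not matter.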
Note
The YAML 1.1 standard itself also provides a method for internal references called “anchors” and “aliases”. It does not, however, support external references. While ASDF does not explicitly disallow YAML anchors and aliases, since it explicitly supports all of YAML 1.1, their use is discouraged in favor of the more flexible JSON Pointer/JSON Reference standard described above.
Numeric literals
Integers represented as string literals in the ASDF tree must be no more than 64 bits wide. Because of the ndarray types in NumPy, this is further restricted to the range defined for signed 64-bit integers (int64), not unsigned 64-bit integers (uint64).
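The signed 64-bit range can be checked with plain integer arithmetic, since Python integers have arbitrary precision. This is an illustrative sketch; the helper name is hypothetical.

```python
# Bounds of a signed 64-bit integer (int64).
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def fits_in_int64(value: int) -> bool:
    """Return True if value can be written as an integer literal in the tree."""
    return INT64_MIN <= value <= INT64_MAX


ok = fits_in_int64(2**63 - 1)        # True: largest int64
too_big = fits_in_int64(2**64 - 1)   # False: fits uint64 but not int64
```

The second case is the one the restriction is about: a value that fits in an unsigned 64-bit integer but is outside the int64 range.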
Null values
YAML permits serialization of null values using the null literal:
some_key: null
Previous versions of the ASDF Standard were vague as to how nulls should be handled, and the Python reference implementation did not distinguish between keys with null values and keys that were missing altogether (in fact, it removed any keys assigned None from the tree on read or write). Beginning with ASDF Standard 1.6.0, ASDF implementations are required to preserve keys even if they are assigned null values. This requirement does not extend back to previous versions, and users of the Python implementation should be advised that the YAML portion of a pre-1.6.0 ASDF file containing null values may be modified in unexpected ways when read or written.
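The difference between the two behaviors is easy to see with plain Python dicts, where a YAML null corresponds to None. The sketch below is an illustration only: `strip_none` is a hypothetical helper that mimics the older key-dropping behavior described above, not code taken from the reference implementation.

```python
def strip_none(node):
    """Mimic the older behavior: drop every mapping key whose value is None."""
    if isinstance(node, dict):
        return {k: strip_none(v) for k, v in node.items() if v is not None}
    if isinstance(node, list):
        return [strip_none(v) for v in node]
    return node


tree = {"some_key": None, "other_key": 1}

# Behavior required from 1.6.0 on: the key survives with a null value.
preserved = dict(tree)

# Older behavior: the key disappears, so it is indistinguishable
# from a key that was never present.
stripped = strip_none(tree)
```

After stripping, `"some_key" in stripped` is False, which is exactly the loss of information that the 1.6.0 requirement prevents.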
Comments
It is quite common in FITS files to see comments that describe the purpose of the key/value pair. For example:
Bringing this convention over to ASDF, one could imagine:
It should be obvious from the examples that these kinds of comments, describing the global meaning of a key, are much less necessary in ASDF. Since ASDF is not limited to 8-character keywords, the keywords themselves can be much more descriptive. But more importantly, the schema for a given key/value pair describes its purpose in detail. (It would be quite straightforward to build a tool that, given an entry in a YAML tree, looks up the schema’s description associated with that entry.) Therefore, the use of comments to describe the global meaning of a value is strongly discouraged.
However, there still may be cases where a comment may be desired in ASDF, such as when a particular value is unusual or unexpected. The YAML standard includes a convention for comments, providing a handy way to include annotations in the ASDF file:
Unfortunately, most YAML parsers will simply throw these comments out and do not provide any mechanism to retain them, so reading in an ASDF file, making some changes, and writing it out will remove all comments. Even if the YAML parser could be improved or extended to retain comments, the YAML standard does not define which values the comments are associated with. In the above example, it is only by standard reading conventions that we assume the comment is associated with the content following it. If we were to move the content, where should the comment go?
To provide a mechanism to add user comments without swimming upstream against the YAML standard, we recommend a convention for associating comments with objects (mappings) by using the reserved key name //. In this case, the above example would be rewritten as:
ASDF parsers must not interpret or react programmatically to these comment values: they are for human reference only. No schema may use // as a meaningful key.
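As an illustration of this convention, the sketch below shows a tree carrying a // comment and a helper that gives program logic a view of the tree with the comments left out, so that nothing can react to them. The key names, the comment text, and the helper `without_comments` are all hypothetical; the original tree is left unmodified so the comments survive when it is written back.

```python
COMMENT_KEY = "//"  # reserved key name for human-readable comments


def without_comments(node):
    """Return a copy of node with every '//' entry omitted."""
    if isinstance(node, dict):
        return {
            k: without_comments(v)
            for k, v in node.items()
            if k != COMMENT_KEY
        }
    if isinstance(node, list):
        return [without_comments(v) for v in node]
    return node


# Hypothetical tree: the comment is attached to the 'exposure' mapping.
tree = {
    "exposure": {
        "//": "Unusually long exposure; see observing log.",
        "time": 3600.0,
    }
}
view = without_comments(tree)
```

Because the comment is an ordinary key/value pair inside the mapping it describes, it moves with that mapping and is retained by any YAML parser, unlike a `#` comment.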