YAML ordered data

Hello devs,

I need to record a scene hierarchy to a YAML file and then recreate it in Maya in exactly the same order. The real data is a more complex tree and the iteration is a recursive function, but here is a simplified version:

import yaml
dictionary = {
    "PARENT": {"B": {}, "A": {}, "C": {}, "D": {}}
    }

data = yaml.safe_load(yaml.dump(dictionary))

for key_parent, value_parent in data.iteritems():
    for key_child, value_child in value_parent.iteritems():
        print key_child

The result is A C B D. Why am I not getting B A C D? Is there a way to control this?

Also, if I print yaml.dump(dictionary) I will get:

PARENT:
  A: {}
  B: {}
  C: {}
  D: {}

which looks like the source dictionary sorted alphabetically. Why don’t I get sorted output during iteration?

In Python 2, dictionaries are arbitrarily ordered, so as soon as you define the dictionary, the B, A, C, D ordering is lost.
If you want to look up how dictionaries work, the generic term for them is “hash map” or “hash table”: https://en.wikipedia.org/wiki/Hash_table

Luckily, Python includes an OrderedDict class that you can use. There are some good examples of how to use it in the documentation here:
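For the hierarchy in the question, a minimal sketch might look like this. It is written in Python 3 syntax, and the names children and hierarchy are only illustrative:

```python
from collections import OrderedDict

# Add the children in the order they should appear in the scene.
children = OrderedDict()
for name in ["B", "A", "C", "D"]:
    children[name] = {}

hierarchy = OrderedDict([("PARENT", children)])

# Iteration follows insertion order, not hash order.
child_order = list(hierarchy["PARENT"].keys())
print(child_order)
```

Note that this only fixes the in-memory order; dumping to YAML and loading it back are separate steps that each need to preserve order as well.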


So parsing YAML in Python is the same as parsing a dictionary?
Should OrderedDict be used for generating the YAML as well, or would using it just for reading be enough?

P.S. I know what a hash table is and how to implement one, for example in VEX :)

I don’t know my way around YAML very well, but from what I can find, yes.
I found a StackOverflow question and a PyYAML PR which say that the YAML spec does not guarantee the order of keys in a mapping.

I found in the PyYAML PR that it automatically sorts the keys when you run .dump(), so the YAML file itself is already in a different order.
Then I read the SO accepted answer, and it explains how to create an extension that preserves order.

All that said, the fact that you are creating the dictionary as a plain dict in your example means that the keys are already out of the order you defined before you even dump them to YAML.

FWIW, Python 3.6+ (I think it’s .6, somewhere in there) does now order dictionaries by default. But no idea how YAML handles that.
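As a rough sketch of how PyYAML behaves with plain dicts on a newer interpreter: this assumes Python 3.7+ and PyYAML 5.1 or later, where yaml.dump accepts a sort_keys argument. The helper keys_in_text is hypothetical and only there to read the child names back out of the dumped text.

```python
import yaml  # PyYAML; sort_keys is assumed to be available (PyYAML 5.1+)

hierarchy = {"PARENT": {"B": {}, "A": {}, "C": {}, "D": {}}}

# By default the keys are written in sorted order.
sorted_text = yaml.dump(hierarchy, default_flow_style=False)
# With sort_keys=False the keys are written in insertion order.
ordered_text = yaml.dump(hierarchy, default_flow_style=False, sort_keys=False)


def keys_in_text(text):
    """Return the names of the indented child keys, in file order."""
    return [line.split(":")[0].strip()
            for line in text.splitlines() if line.startswith("  ")]


# safe_load builds a plain dict, which keeps the document order on 3.7+.
loaded_order = list(yaml.safe_load(ordered_text)["PARENT"])

print(keys_in_text(sorted_text))
print(keys_in_text(ordered_text))
print(loaded_order)
```

On Python 2, as in the original snippet, none of this applies, because the plain dict has already lost its order before PyYAML sees it.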

Yeah, 3.6 is when they started tracking insertion order by default (as a CPython implementation detail; the language guarantees it from 3.7). There are still some behavioral differences between dict and OrderedDict, but for simple things they are pretty much interchangeable.
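Two of those differences, sketched under the assumption of Python 3.7+ (the variable names are only illustrative):

```python
from collections import OrderedDict

first = {"B": 1, "A": 2}
second = {"A": 2, "B": 1}

# Plain dicts compare equal regardless of key order...
plain_equal = first == second
# ...while two OrderedDicts must also agree on the order.
ordered_equal = OrderedDict(first) == OrderedDict(second)

# OrderedDict can also reorder in place, which a plain dict cannot.
reordered = OrderedDict(first)
reordered.move_to_end("B")
keys_after_move = list(reordered)

print(plain_equal, ordered_equal, keys_after_move)
```

So if the order of a hierarchy matters when comparing two scenes, OrderedDict is still the safer choice.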

Thank you, guys, it was a bit more sophisticated than I imagined:

# Dump/load ordered dictionary

import yaml
from collections import OrderedDict


def iterate(data):
    for key_parent, value_parent in data.iteritems():
        for key_child, value_child in value_parent.iteritems():
            print key_child


def dump_ordered(dictionary):
    """
    Dump ordered dictionary
    """
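    # Represent OrderedDict as a regular YAML mapping; passing the items
    # as pairs keeps them in insertion order instead of sorting them.
    yaml.add_representer(
        OrderedDict,
        lambda dumper, data: dumper.represent_mapping(
            'tag:yaml.org,2002:map', data.items()))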

    return yaml.dump(dictionary)


def load_ordered(yaml_data):
    """
    Load yaml data to ordered dictionary
    """
    class OrderedLoader(yaml.SafeLoader):
        pass

    def construct_mapping(loader, node):
        loader.flatten_mapping(node)
        return OrderedDict(loader.construct_pairs(node))

    OrderedLoader.add_constructor(
        yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
        construct_mapping)

    return yaml.load(yaml_data, OrderedLoader)


# Build an ordered dictionary
dictionary = OrderedDict()
value = OrderedDict()
value['B'] = {}
value['A'] = {}
value['C'] = {}
value['D'] = {}
dictionary['PARENT'] = value

# Dump, load and print dictionary
yaml_data = dump_ordered(dictionary)
yaml_data = load_ordered(yaml_data)
iterate(yaml_data)
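
The real hierarchy was described as a deeper tree that is walked recursively. A sketch of how the two-level loop above might generalize, written in Python 3 syntax; the function walk and the sample names are hypothetical:

```python
from collections import OrderedDict


def walk(node, depth=0):
    """Yield (depth, name) pairs depth-first, in the stored key order."""
    for name, children in node.items():
        yield depth, name
        # Recurse into the child mapping; empty mappings yield nothing.
        for item in walk(children, depth + 1):
            yield item


tree = OrderedDict([
    ("PARENT", OrderedDict([
        ("B", OrderedDict([("B1", OrderedDict())])),
        ("A", OrderedDict()),
    ])),
])

visited = list(walk(tree))
for depth, name in visited:
    print("  " * depth + name)
```

The depth value could be used to parent each node under the last node seen one level up when rebuilding the scene.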