Split Dictionary Of Lists Into List Of Dictionaries

June 28, 2023 Post a Comment

What i need to do is to convert something like this {'key1': [1, 2, 3], 'key2': [4, 5, 6]} into [{'key1': 1, 'key2': 4}, {'key1': 2, 'key2': 5}, {'key1': 3, 'key2': 6}] The lengt

Solution 1:

Works for any number of keys

>>> map(dict, zip(*[[(k, v) for v in value] for k, value in d.items()]))
[{'key2': 4, 'key1': 1}, {'key2': 5, 'key1': 2}, {'key2': 6, 'key1': 3}]

For example:

d = {'key3': [7, 8, 9], 'key2': [4, 5, 6], 'key1': [1, 2, 3]}

>>> map(dict, zip(*[[(k, v) for v in value] for k, value in d.items()]))
[{'key3': 7, 'key2': 4, 'key1': 1}, {'key3': 8, 'key2': 5, 'key1': 2}, {'key3': 9, 'key2': 6, 'key1': 3}]

A general solution that works on any number of values or keys: (python2.6)

>>> from itertools import izip_longest
>>> d = {'key2': [3, 4, 5, 6], 'key1': [1, 2]}
>>> map(lambda a: dict(filter(None, a)), izip_longest(*[[(k, v) for v in value] for k, value in d.items()]))
[{'key2': 3, 'key1': 1}, {'key2': 4, 'key1': 2}, {'key2': 5}, {'key2': 6}]

And if you don't have python2.6:

>>> d = {'key2': [3, 4, 5, 6], 'key1': [1, 2]}
>>> map(lambda a: dict(filter(None, a)), map(None, *[[(k, v) for v in value] for k, value in d.items()]))
[{'key2': 3, 'key1': 1}, {'key2': 4, 'key1': 2}, {'key2': 5}, {'key2': 6}]

Solution 2:

Assuming the number of keys, and values per key, are both arbitrary and a priori unknown, it's simplest to get the result with for loops, of course:

  itit = thedict.iteritems()
  k, vs = next(itit)
  result = [{k: v} forvin vs]
  fork, vs in itit:
    ford, v in itertools.izip(result, vs):
      d[k] = v

It can be collapsed, but I'm dubious about the performance implications of doing so (if the data structures involved are so huge as to warrant performance optimization, building any extra auxiliary structure in memory beyond what's strictly required may turn out costly -- this simple approach of mine is being especially careful to avoid any such intermediate structures).

Edit: another alternative, particularly interesting if the overall data structures are huge but in some use cases you may only need "bits and pieces" of the "transformed" structure, is to build a class that provides the interface you require, but does so "on the fly", rather than in a "big bang", "once and for all" transformation (this might be especially helpful if the original structure can change and the transformed one needs to reflect the present state of the original, etc, etc).

Of course, for such a purpose it's very helpful to identify exactly what features of a "list of dictionaries" your downstream code would use. Suppose for example that all you need is actually "read-only" indexing (not changing, iterating, slicing, sorting, ...): X[x] must return a dictionary in which each key k maps to a value such that (caling O the original dictionary of lists) X[x][k] is O[k][x]. Then:

classWrap1(object):def__init__(self, O):
    self.O = O
  def__getitem__(self, x):
    return dict((k, vs[x]) for k, vs inself.O.iteritems())

If you don't in fact need the wrapped structure to track modifications to the original one, then __getitem__ might well also "cache" the dict it's returning:

classWrap2(object):
  def__init__(self, O):
    self.O = O
    self.cache = {}
  def__getitem__(self, x):
    r = self.cache.get(x)
    if r isNone:
      r = self.cache[x] = dict((k, vs[x]) for k, vs in self.O.iteritems())
    return r

Note that this approach may end up with some duplication in the cache, e.g., if O's lists have 7 items each, the cache at x==6 and x==-1 may end up with two equal dicts; if that's a problem you can, for example, normalize negative xs in __getitem__ by adding len(self.O) to them before proceeding.

If you also need iteration, as well as this simple indexing, that's not too hard: just add an __iter__ method, easily implemented e.g. as a simple generator...:

def__iter__(self, x):
    for i in xrange(len(self.O)):
      yield self[i]

And so forth, incrementally, if and as you need more and more of a list's functionality (at worst, once you have implemented this __iter__, you can build self.L = list(self) -- reverting to the "big bang" approach -- and, for any further request, punt to self.L... but you'll have to make a special metaclass if you want to take that approach for special methods as well, or use some subtler trick such as self.__class__ = list; self[:] = self.L followed by appropriate dels;-).

Solution 3:

If there are always two keys you can use:

[{'key1':a, 'key2':b} for (a,b) in zip(d['key1'], d['key2'])]

Solution 4:

>>> a = {'key1': [1, 2, 3], 'key2': [4, 5, 6]}
>>> [dict((key, a[key][i]) for key in a.keys()) for i inrange(len(a.values()[0]))]
[{'key2': 4, 'key1': 1}, {'key2': 5, 'key1': 2}, {'key2': 6, 'key1': 3}]

Solution 5:

d = {'key1': [1, 2, 3], 'key2': [4, 5, 6]}

keys = d.keys()
vals = zip(*[d[k] for k in keys])
l = [dict(zip(keys, v)) for v in vals]
print l

produces

[{'key2': 4, 'key1': 1}, {'key2': 5, 'key1': 2}, {'key2': 6, 'key1': 3}]

Python Development