Merging YAML files, overriding values in list elements

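I want to merge two YAML files that contain list elements: (A) and (B) are merged into a new file (C).

I want to override existing attribute values of the list entries in (A) if they are also defined in (B).

I want to add new attributes to list entries if they are defined in (B) but not in (A).

I also want to add list entries from (B) that do not exist in (A).

YAML file A: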

list:
  - id: 1
    name: "name-from-A"
  - id: 2
    name: "name-from-A"

YAML file B:

list:
  - id: 1
    name: "name-from-B"
  - id: 2
    title: "title-from-B"
  - id: 3
    name: "name-from-B"
    title: "title-from-B"

The merged YAML file (C) that I want to produce:

list:
  - id: 1
    name: "name-from-B"
  - id: 2
    name: "name-from-A"
    title: "title-from-B"
  - id: 3
    name: "name-from-B"
    title: "title-from-B"

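I need to do this in a Bash script, but I can require Python in the environment.

Is there a standalone YAML processor (like yq) that can do this?

How could I implement something like this in a Python script?

1 comment
What have you tried so far? Show us some code!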
python
bash
merge
yaml
sola
Posted 2019-10-22
3 answers
pymym213
Posted 2019-10-23
Accepted
0 upvotes

You can do this with the ruamel.yaml Python package.

If Python is already installed, run the following command in a terminal:

pip install ruamel.yaml

Python code adapted from here (tested, works fine):

import ruamel.yaml

yaml = ruamel.yaml.YAML()

# Load the YAML files. Entries from test1.yaml are merged onto test2.yaml,
# so values from test1.yaml win: it plays the role of (B), test2.yaml of (A).
with open('/test1.yaml') as fp:
    data = yaml.load(fp)
with open('/test2.yaml') as fp:
    data1 = yaml.load(fp)

# ids that were found in both files
merged = set()

# Update each entry of test2.yaml with the matching entry of test1.yaml
for i in data1['list']:
    for j in data['list']:
        if i['id'] == j['id']:
            i.update(j)
            merged.add(i['id'])

# Append the entries that exist only in test1.yaml
for j in data['list']:
    if j['id'] not in merged:
        data1['list'].append(j)

# Write the merged YAML to a new file
with open('/merged.yaml', 'w') as yaml_file:
    yaml.dump(data1, yaml_file)
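
As a side note, the nested loops above compare every entry with every other entry. A minimal sketch of the same merge that indexes entries by id instead, shown on plain Python lists rather than loaded YAML; the function name merge_by_id and its key parameter are made up for illustration and are not part of ruamel.yaml:

```python
def merge_by_id(base, override, key="id"):
    """Merge two lists of dicts; entries in `override` win on conflicts."""
    # Index shallow copies of the base entries by id
    # (dicts preserve insertion order, so the list order is kept)
    merged = {item[key]: dict(item) for item in base}
    for item in override:
        # Update the matching entry, or start a new one for an unseen id
        merged.setdefault(item[key], {}).update(item)
    return list(merged.values())


# Data from the question, as it would look after loading the YAML
list_a = [{"id": 1, "name": "name-from-A"}, {"id": 2, "name": "name-from-A"}]
list_b = [
    {"id": 1, "name": "name-from-B"},
    {"id": 2, "title": "title-from-B"},
    {"id": 3, "name": "name-from-B", "title": "title-from-B"},
]
result = merge_by_id(list_a, list_b)
```

Because the base entries are copied, list_a is left untouched. Note that plain dicts would drop any comments that ruamel.yaml preserves in its own node types.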
    
Cole Tierney
Posted 2019-10-23
0 upvotes

You can merge YAML files passed on the command line:

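import sys

import yaml


def merge_dict(m_list, s):
    # Update the entry with the same id, or append s as a new entry
    for m in m_list:
        if m['id'] == s['id']:
            # update(s) also works when keys are not strings
            m.update(s)
            return
    m_list.append(s)


merged_list = []
# Files are applied in command-line order, so later files win
for f in sys.argv[1:]:
    with open(f) as s:
        for source in yaml.safe_load(s)['list']:
            merge_dict(merged_list, source)

print(yaml.dump({'list': merged_list}), end='')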

Results:

list:
- id: 1
  name: name-from-B
- id: 2
  name: name-from-A
  title: title-from-B
- id: 3
  name: name-from-B
  title: title-from-B
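
The script applies its inputs in order, so the last file on the command line has the highest precedence. A rough sketch of that folding behaviour on in-memory lists, assuming you would rather not mutate the inputs; merge_lists and the sample data are hypothetical and only illustrate the idea:

```python
import copy
from functools import reduce


def merge_lists(merged, source, key="id"):
    """Return a new list in which entries from `source` override `merged`."""
    result = copy.deepcopy(merged)
    # Remember where each id sits in the result list
    positions = {item[key]: i for i, item in enumerate(result)}
    for item in source:
        if item[key] in positions:
            result[positions[item[key]]].update(copy.deepcopy(item))
        else:
            positions[item[key]] = len(result)
            result.append(copy.deepcopy(item))
    return result


# Three hypothetical inputs; later ones take precedence
sources = [
    [{"id": 1, "name": "first"}],
    [{"id": 1, "name": "second"}, {"id": 2, "name": "second"}],
    [{"id": 2, "title": "third"}],
]
combined = reduce(merge_lists, sources, [])
```

Swapping the order of the inputs changes which values survive, which is the same reason the argument order matters for the script above.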
    
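sola
Posted 2019-10-23
0 upvotes

Based on these answers (thanks, everyone), I created a solution that handles, in a fairly generic way, all the merge features I need at the moment (I need to use it on many different kinds of Kubernetes descriptors).

It is based on ruamel.yaml.

It handles multi-level lists, and it merges list elements not only by index but also by a proper item identifier.

It is more complex than I had hoped (it walks the YAML tree).

The script and its core methods: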

import ruamel.yaml
from ruamel.yaml.comments import CommentedMap, CommentedSeq
# Merges a node from B with its pair in A
# If the node exists in both A and B, it will merge
# all children in sync
# If the node only exists in A, it will do nothing.
# If the node only exists in B, it will add it to A and stops
# attrPath DOES NOT include attrName
def mergeAttribute(parentNodeA, nodeA, nodeB, attrName, attrPath):
    # If both are None, there is nothing to merge
    if (nodeA is None) and (nodeB is None):
        return
    # If NodeA is None but NodeB has value, we simply set it in A
    if (nodeA is None) and (parentNodeA is not None):
        parentNodeA[attrName] = nodeB
        return
    if attrPath == '':
        attrPath = attrName
    else:
        attrPath = attrPath + '.' + attrName
    if isinstance(nodeB, CommentedSeq):
        # The attribute is a list, we need to merge specially
        mergeList(nodeA, nodeB, attrPath)
    elif isinstance(nodeB, CommentedMap):
        # A simple object to be merged
        mergeObject(nodeA, nodeB, attrPath)
    else:
        # Primitive type, simply overwrites
        parentNodeA[attrName] = nodeB
# Lists object attributes and merges the attribute values if possible
def mergeObject(nodeA, nodeB, attrPath):
    for attrName in nodeB:
        # .get() yields None when the attribute is missing from A
        subNodeA = nodeA.get(attrName)
        subNodeB = nodeB[attrName]
        mergeAttribute(nodeA, subNodeA, subNodeB, attrName, attrPath)
# Merges two lists by properly identifying each item in both lists
# (using the merge-directives).
# If an item of listB is identified in listA, it will be merged onto the item
# of listA
def mergeList(listA, listB, attrPath):
    # Iterating the list from B
    for itemInB in listB:
        itemInA = findItemInList(listA, itemInB, attrPath)
        if itemInA is None:
            listA.append(itemInB)
            continue
        # Present in both, we need to merge them
        mergeObject(itemInA, itemInB, attrPath)
# Finds an item in the list by using the appropriate ID field defined for that
# attribute-path.
# If there is no id attribute defined for the list, it returns None
def findItemInList(listA, itemB, attrPath):
    if attrPath not in listsWithId:
        # No id field defined for the list, only "dumb" merging is possible
        return None
    # Finding out the name of the id attribute in the list items
    idAttrName = listsWithId[attrPath]
    idB = None
    if idAttrName is not None:
        idB = itemB[idAttrName]
    # Looking for the item by its ID
    for itemA in listA:
        idA = None
        if idAttrName is not None:
            idA = itemA[idAttrName]
        if idA == idB:
            return itemA
    return None
# ------------------------------------------------------------------------------
yaml = ruamel.yaml.YAML()
# Load the merge directives
with open('merge-directives.yaml') as fp:
    mergeDirectives = yaml.load(fp)
listsWithId = mergeDirectives['lists-with-id']
# Load the yaml files
with open('a.yaml') as fp:
    dataA = yaml.load(fp)
with open('b.yaml') as fp:
    dataB = yaml.load(fp)
mergeObject(dataA, dataB, '')
# Create a new file with the merged YAML
# (the file() builtin exists only in Python 2, so open() is used instead)
with open('c.yaml', 'w') as fp:
    yaml.dump(dataA, fp)

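Auxiliary configuration file (merge-directives.yaml) that tells the script how to identify elements in lists, including multi-level ones.

For the data structure in the original question, only the 'list: "id"' entry is needed, but I included a few other keys to demonstrate usage.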

# Lists that contain identifiable elements.
# Each sub-key is a property path denoting the list element in the YAML
# data structure.
# The value is the name of the attribute in the list element that
# identifies the list element so that pairing can be made.
lists-with-id:
  list: "id"
  list.sub-list: "id"
  a.listAttrShared: "name"
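
To make the role of these directives concrete, here is a condensed sketch of the path-keyed merge on plain dicts and lists. It is not the script above: LISTS_WITH_ID, find_match and merge are hypothetical names, and the sketch assumes the YAML has already been loaded into ordinary Python objects.

```python
# Hypothetical stand-in for the 'lists-with-id' section of merge-directives.yaml
LISTS_WITH_ID = {"list": "id", "list.sub-list": "id"}


def find_match(items, item, id_key):
    """Return the dict in `items` that shares item's id, or None."""
    if id_key is None or not isinstance(item, dict) or id_key not in item:
        return None
    for candidate in items:
        if isinstance(candidate, dict) and candidate.get(id_key) == item[id_key]:
            return candidate
    return None


def merge(a, b, path=""):
    """Merge b into a (in place) and return the result; b wins on conflicts."""
    if isinstance(a, dict) and isinstance(b, dict):
        for key, value in b.items():
            child = f"{path}.{key}" if path else key
            a[key] = merge(a[key], value, child) if key in a else value
        return a
    if isinstance(a, list) and isinstance(b, list):
        # List items share the path of their list (no index in the path)
        id_key = LISTS_WITH_ID.get(path)
        for item in b:
            match = find_match(a, item, id_key)
            if match is None:
                a.append(item)  # no pairing possible: plain append
            else:
                merge(match, item, path)
        return a
    return b  # scalars or mismatched types: the value from b replaces a


data_a = {"list": [{"id": 1, "name": "name-from-A",
                    "sub-list": [{"id": "s1", "name": "name-from-A"}],
                    "notes": ["comment 1"]}]}
data_b = {"list": [{"id": 1, "name": "name-from-B",
                    "sub-list": [{"id": "s1", "title": "title-from-B"},
                                 {"id": "s2", "name": "name-from-B"}],
                    "notes": ["comment 2"]}]}
merged = merge(data_a, data_b)
```

Lists without an entry in LISTS_WITH_ID (here the made-up "notes" list) are simply concatenated, which corresponds to the "dumb" merging mentioned in the script's comments.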

I haven't tested it extensively yet, but here are two test files that cover more than the original question does.

a.yaml:

attrShared: value-from-a
listAttrShared:
  - name: a1
  - name: a2
attrOfAOnly: value-from-a
list:
  - id: 1
    name: "name-from-A"
    sub-list:
      - id: s1
        name: "name-from-A"
        comments: "doesn't exist in B, so left untouched"
      - id: s2
        name: "name-from-A"
    sub-list-with-no-identification:
      - "comment 1"
      - "comment 2"
  - id: 2
    name: "name-from-A"

b.yaml:

attrShared: value-from-b
listAttrShared:
  - name: b1
  - name: b2
attrOfBOnly: value-from-b
list:
  - id: 1
    name: "name-from-B"
    sub-list:
      - id: s2
        name: "name-from-B"
        title: "title-from-B"
        comments: "overwrites name in A with name in B + adds title from B"
      - id: s3
        name: "name-from-B"
        comments: "only exists in B so added to A's list"
    sub-list-with-no-identification:
      - "comment 3"
      - "comment 4"
  - id: 2