
Search And Remove Element With Elementtree In Python

I have an XML document in which I want to search for some elements and, if they match some criteria, delete them. However, I cannot seem to be able to access the paren

Solution 1:

You can remove child elements with the corresponding remove method. To remove an element, you have to call its parent's remove method. Unfortunately, Element does not provide a reference to its parent, so it is up to you to keep track of parent/child relations (which speaks against your use of elem.findall()).

A proposed solution could look like this:

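root = elem.getroot()
# Iterate over a copy, because children are removed inside the loop
for child in list(root):
    # With a namespace, the tag looks like "{http://somens}prop"
    if child.tag != "prop":
        continue
    if True:  # TODO: do your check here!
        root.remove(child)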

PS: don't use prop.attrib.get(), use prop.get(), as explained here.
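Since Element keeps no reference to its parent, one common way to track the relation is to build a child-to-parent dictionary once and look parents up in it. The following is a minimal sketch under assumptions: the sample XML, the `name` attribute and the names `parent_map` and `to_remove` are invented for illustration and are not from the original question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, only for illustration
XML = """<root>
  <group>
    <prop type="json" name="a"/>
    <prop type="text" name="b"/>
  </group>
  <prop type="json" name="c"/>
</root>"""

root = ET.fromstring(XML)

# Map every child to its parent by walking the whole tree once
parent_map = {child: parent for parent in root.iter() for child in parent}

# Collect the matches first, then remove them, so the tree is not
# modified while it is still being searched
to_remove = [el for el in root.iter("prop") if el.get("type") == "json"]
for el in to_remove:
    parent_map[el].remove(el)

remaining = [el.get("name") for el in root.iter("prop")]
```

This works at any nesting depth, because the map covers every element below the root.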

Solution 2:

You could use XPath to select an Element's parent.

import json
from xml.etree import ElementTree

file = open('test.xml', "r")
elem = ElementTree.parse(file)

namespace = "{http://somens}"

props = elem.findall('.//{0}prop'.format(namespace))
for prop in props:
    type = prop.get('type', None)
    if type == 'json':
        value = json.loads(prop.attrib['value'])
        if value['name'] == 'Page1.Button1':
            # Get parent and remove this prop
            parent = prop.find("..")
            parent.remove(prop)

http://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax

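Except that if you try it, it doesn't work: ".." cannot reach above the element that find() was called on, so prop.find("..") returns None. See http://elmpowered.skawaii.net/?p=74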
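The limitation is easy to reproduce on a tiny document. This is a small sketch with a made-up, namespace-free sample; the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document
root = ET.fromstring("<root><group><prop/></group></root>")
prop = root.find(".//prop")

# ".." cannot climb above the element that find() was called on
parent_from_child = prop.find("..")

# Starting the search from an ancestor works, because the parent
# lies inside the searched subtree
parent_from_root = root.find(".//prop/..")
```

Here `parent_from_child` is None, while `parent_from_root` is the `group` element.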

So instead you have to:

import json
from xml.etree import ElementTree

file = open('test.xml', "r")
elem = ElementTree.parse(file)

namespace = "{http://somens}"
search = './/{0}prop'.format(namespace)

# Use xpath to get all parents of props
prop_parents = elem.findall(search + '/..')
for parent in prop_parents:
    # Still have to find and iterate through child props.
    # Search direct children only: remove() fails on deeper descendants.
    for prop in parent.findall('{0}prop'.format(namespace)):
        type = prop.get('type', None)
        if type == 'json':
            value = json.loads(prop.attrib['value'])
            if value['name'] == 'Page1.Button1':
                parent.remove(prop)

This takes two searches and a nested loop. The inner search runs only on elements known to contain props as direct children, but that may not mean much depending on your schema.

Solution 3:

I know this is an old thread but this kept popping up while I was trying to figure out a similar task. I did not like the accepted answer for two reasons:

1) It doesn't handle multiple nested levels of tags.

2) It will break if multiple XML tags at the same level are deleted one after another. Since each element is addressed by its index in Element._children, you shouldn't delete while iterating forward.
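The second point can be seen on a small example. This is a sketch with an invented three-child document; `make_root` is a hypothetical helper used only to build two identical trees.

```python
import xml.etree.ElementTree as ET

def make_root():
    # Hypothetical sample: three siblings that should all be removed
    return ET.fromstring("<root><prop/><prop/><prop/></root>")

# Removing while iterating forward: the remaining siblings shift left,
# so the loop steps over some of them
forward = make_root()
for child in forward:
    forward.remove(child)
left_after_forward = len(forward)

# Iterating over a reversed snapshot visits every child exactly once
backward = make_root()
for child in reversed(list(backward)):
    backward.remove(child)
left_after_reversed = len(backward)
```

After the forward loop at least one `prop` is still attached, while the reversed loop leaves the root empty.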

I think a better, more versatile solution is this:

import xml.etree.ElementTree as et
file = 'test.xml'
tree = et.parse(file)
root = tree.getroot()

def iterator(parents, nested=False):
    # Walk a reversed snapshot so removals don't shift unvisited children
    for child in reversed(list(parents)):
        if nested:
            if len(child) >= 1:
                # Pass nested on, so every level is visited
                iterator(child, nested=True)
        if True:  # Add your entire condition here
            parents.remove(child)

iterator(root, nested=True)

For the OP, this should work - but I don't have the data you're working with to test if it's perfect.

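import json
import xml.etree.ElementTree as et
file = 'test.xml'
tree = et.parse(file)
root = tree.getroot()

namespace = "{http://somens}"
prop_tag = '{0}prop'.format(namespace)

def iterator(parents, nested=False):
    # Walk a reversed snapshot so removals don't shift unvisited children
    for child in reversed(list(parents)):
        if nested:
            if len(child) >= 1:
                iterator(child, nested=True)
        # Test the child itself, and only if it is a prop element
        if child.tag == prop_tag and child.attrib.get('type') == 'json':
            value = json.loads(child.attrib['value'])
            if value['name'] == 'Page1.Button1':
                parents.remove(child)

# Start from the root so remove() is called on real parent elements
iterator(root, nested=True)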
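Since the original data is not available, the recursive approach can at least be tried on made-up input. The sketch below assumes a document layout that is purely hypothetical (the `page` and `section` elements and the second button name are invented), and the helper names `should_remove` and `prune` are not from the answer above.

```python
import json
import xml.etree.ElementTree as ET

NS = "{http://somens}"

# Hypothetical sample data; the real document's layout is not known
XML = """<root xmlns="http://somens">
  <page>
    <prop type="json" value='{"name": "Page1.Button1"}'/>
    <prop type="json" value='{"name": "Page1.Button2"}'/>
    <section>
      <prop type="json" value='{"name": "Page1.Button1"}'/>
    </section>
  </page>
</root>"""

def should_remove(el):
    # Only json-typed prop elements are candidates
    if el.tag != NS + "prop" or el.get("type") != "json":
        return False
    return json.loads(el.get("value"))["name"] == "Page1.Button1"

def prune(parent):
    # Reversed snapshot: removals cannot disturb the traversal
    for child in reversed(list(parent)):
        prune(child)  # handle deeper levels first
        if should_remove(child):
            parent.remove(child)

root = ET.fromstring(XML)
prune(root)
names = [json.loads(p.get("value"))["name"] for p in root.iter(NS + "prop")]
```

With this input, both `Page1.Button1` props are removed, including the nested one, and only `Page1.Button2` remains.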

Solution 4:

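Using the fact that every child must have a parent, I'm going to simplify @kitsu.eb's example. If you use the findall command to get both the children and the parents, their indices will be equivalent. Note that this relies on each parent holding exactly one matching prop; if a parent has several, findall returns that parent only once and the indices no longer line up.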

    import json
    from xml.etree import ElementTree

    file = open('test.xml', "r")
    elem = ElementTree.parse(file)

    namespace = "{http://somens}"
    search = './/{0}prop'.format(namespace)

    # Use xpath to get all parents of props
    prop_parents = elem.findall(search + '/..')

    props = elem.findall('.//{0}prop'.format(namespace))
    for prop in props:
        type = prop.attrib.get('type', None)
        if type == 'json':
            value = json.loads(prop.attrib['value'])
            if value['name'] == 'Page1.Button1':
                # use the index of the current child to find
                # its parent and remove the child
                prop_parents[props.index(prop)].remove(prop)
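Before relying on the index pairing, it may be worth checking it on a small sample. The sketch below uses an invented, namespace-free document in which one parent holds two props; `parent_of` is a hypothetical name for a child-to-parent dictionary that does not depend on list positions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one parent holds two props, another holds one
root = ET.fromstring("<root><a><prop/><prop/></a><b><prop/></b></root>")

props = root.findall(".//prop")
prop_parents = root.findall(".//prop/..")

# An explicit child-to-parent map does not depend on list positions
parent_of = {child: parent for parent in root.iter() for child in parent}

last_parent_tag = parent_of[props[2]].tag
```

In this sample the two lists have different lengths, so indexing one by positions of the other would pick the wrong parent or fail, whereas the dictionary lookup still finds `b` for the last prop.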

Solution 5:

A solution using the lxml module:

from lxml import etree

# xml_str is assumed to hold the XML document as a string
root = etree.fromstring(xml_str)
for e in root.findall('.//{http://some.name.space}node'):
    # Unlike xml.etree.ElementTree, lxml elements know their parent
    parent = e.getparent()
    if parent is not None:
        parent.remove(e)
