
Parsing XML Files in Python (Parse, Update, Write)

Views: 86 Date: 2022-08-03 16:58:22

Overview

This post covers parsing an XML file, appending new elements and writing the result out to XML, and updating the value of a node in the original XML file. It uses Python's xml.dom.minidom package; see the official xml.dom.minidom documentation for details. Everything below operates on the following customer.xml:

    <?xml version='1.0' encoding='utf-8' ?>
    <customers>
      <customer ID='C001'>
        <name>Acme Inc.</name>
        <phone>12345</phone>
        <comments>
          <![CDATA[Regular customer since 1995]]>
        </comments>
      </customer>
      <customer ID='C002'>
        <name>Star Wars Inc.</name>
        <phone>23456</phone>
        <comments>
          <![CDATA[A small but healthy company.]]>
        </comments>
      </customer>
    </customers>

CDATA: a section of an XML document whose content the parser does not parse as markup; it is passed through as literal character data.
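To make the CDATA note concrete, here is a minimal sketch. The XML string is a made-up fragment (not part of customer.xml) whose CDATA content contains characters that would otherwise be read as markup. It suggests that minidom keeps the CDATA section as its own node type and leaves the text inside it untouched.

```python
from xml.dom.minidom import parseString

# Hypothetical fragment: "<" and "&" inside CDATA are not treated as markup.
doc = parseString("<comments><![CDATA[profit < 10% & rising]]></comments>")

cdata = doc.documentElement.firstChild
is_cdata = cdata.nodeType == cdata.CDATA_SECTION_NODE

print(is_cdata)    # whether the child is a CDATA section node
print(cdata.data)  # the raw text stored in the CDATA section
```

Without the CDATA wrapper, the same text would have to be written with escaped entities (`&lt;`, `&amp;`) to remain well-formed.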

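Note: in this post, the two Chinese terms for "node" (結點 and 節點) are treated as the same concept and can be swapped anywhere in the text. I don't think the difference matters much; you can also put it down to my typing.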

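1. Parsing an XML file

When XML is parsed, all text is stored in text nodes, and each text node is a child of the element node that contains it. For example, in an element such as <year>2005</year>, the element node <year> has a child text node whose value is "2005"; "2005" is not the value of the <year> element itself. The most commonly used method is getElementsByTagName(): once you have the nodes, you walk further down according to the document structure.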
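The point about text nodes can be checked with a small sketch. The `<book>`/`<year>` fragment below is an illustrative assumption, not part of customer.xml; it shows that the element itself carries no value, while its child text node holds the string.

```python
from xml.dom.minidom import parseString

# Hypothetical fragment used only to illustrate text nodes.
doc = parseString("<book><year>2005</year></book>")
year = doc.getElementsByTagName("year")[0]

print(year.nodeName)         # the tag name of the element
print(year.nodeValue)        # an element node has no value of its own
print(year.firstChild.data)  # the text lives in the child text node
```

This is why the code in this post reads `childNodes[0].data` rather than asking the element for its value directly.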

I won't go further into the theory. With the XML file above and the code below, the approach should be clear. The code prints every node's name and content:

    # -*- coding: utf-8 -*-
    '''
    @Author  : LiuZhian
    @Time    : 2019/4/24 0024 9:19 AM
    @Comment :
    '''
    from xml.dom.minidom import parse


    def readXML():
        domTree = parse('./customer.xml')
        # Root element of the document
        rootNode = domTree.documentElement
        print(rootNode.nodeName)

        # All customers
        customers = rootNode.getElementsByTagName('customer')
        print('**** All customer information ****')
        for customer in customers:
            if customer.hasAttribute('ID'):
                print('ID:', customer.getAttribute('ID'))
                # name element
                name = customer.getElementsByTagName('name')[0]
                print(name.nodeName, ':', name.childNodes[0].data)
                # phone element
                phone = customer.getElementsByTagName('phone')[0]
                print(phone.nodeName, ':', phone.childNodes[0].data)
                # comments element: the CDATA section sits between whitespace
                # text nodes, so join all child nodes and strip the result
                comments = customer.getElementsByTagName('comments')[0]
                text = ''.join(node.data for node in comments.childNodes).strip()
                print(comments.nodeName, ':', text)


    if __name__ == '__main__':
        readXML()


2. Writing an XML file

I think writing falls into two cases:

Creating a brand-new XML file

Appending elements to an existing XML file

In both cases the way you create element nodes is similar: first create or obtain a DOM object, then create the new nodes from that DOM object.

In the first case you can create the DOM object with dom = minidom.Document(); in the second case you get it by parsing the existing XML file, for example dom = parse('./customer.xml').

When creating element and text nodes, you will typically write a four-step sequence like this:

① Create a new element node with createElement()

② Create a text node with createTextNode()

③ Attach the text node to the element node

④ Attach the element node to its parent element.
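The four steps above can be sketched for the first case, a brand-new document created with `minidom.Document()`. This is a minimal sketch: the element names and the values `C003` and `kavin` are illustrative, and the result is serialized to a string instead of a file so it can be inspected directly.

```python
from xml.dom import minidom

dom = minidom.Document()

# A brand-new document has no root yet, so create and attach one first.
root = dom.createElement("customers")
dom.appendChild(root)

customer = dom.createElement("customer")
customer.setAttribute("ID", "C003")

name = dom.createElement("name")          # step 1: create the element node
name_text = dom.createTextNode("kavin")   # step 2: create the text node
name.appendChild(name_text)               # step 3: attach text to element
customer.appendChild(name)                # step 4: attach element to parent
root.appendChild(customer)

xml_text = root.toxml()
print(xml_text)
```

Note that every node is created through the document object (`dom.createElement`, `dom.createTextNode`) and only becomes part of the tree once `appendChild` is called.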

Now I need to create a new customer node with the following content:

    <customer ID='C003'>
      <name>kavin</name>
      <phone>32467</phone>
      <comments>
        <![CDATA[A small but healthy company.]]>
      </comments>
    </customer>

The code is as follows:

    from xml.dom.minidom import parse


    def writeXML():
        domTree = parse('./customer.xml')
        # Root element of the document
        rootNode = domTree.documentElement

        # Create a new customer node
        customer_node = domTree.createElement('customer')
        customer_node.setAttribute('ID', 'C003')

        # Create the name node and set its text value
        name_node = domTree.createElement('name')
        name_text_value = domTree.createTextNode('kavin')
        name_node.appendChild(name_text_value)  # attach the text node to name_node
        customer_node.appendChild(name_node)

        # Create the phone node and set its text value
        phone_node = domTree.createElement('phone')
        phone_text_value = domTree.createTextNode('32467')
        phone_node.appendChild(phone_text_value)  # attach the text node to phone_node
        customer_node.appendChild(phone_node)

        # Create the comments node; its content is a CDATA section
        comments_node = domTree.createElement('comments')
        cdata_text_value = domTree.createCDATASection('A small but healthy company.')
        comments_node.appendChild(cdata_text_value)
        customer_node.appendChild(comments_node)

        rootNode.appendChild(customer_node)

        # Open the file as UTF-8 so the bytes match the declared encoding
        with open('added_customer.xml', 'w', encoding='utf-8') as f:
            # addindent sets the indentation; encoding sets the XML declaration
            domTree.writexml(f, addindent=' ', encoding='utf-8')


    if __name__ == '__main__':
        writeXML()
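To see what the `writexml()` arguments do, here is a small sketch that writes into an in-memory buffer rather than a file. The tiny document and the two-space indent are assumptions chosen for illustration; `newl` is the line terminator, which the code above leaves at its default of an empty string.

```python
import io
from xml.dom import minidom

# Hypothetical minimal document built only for this demonstration.
dom = minidom.Document()
root = dom.createElement("customers")
dom.appendChild(root)
customer = dom.createElement("customer")
customer.setAttribute("ID", "C003")
root.appendChild(customer)

buffer = io.StringIO()
# addindent: extra indentation per nesting level
# newl:      string written at the end of each line
# encoding:  only sets the encoding named in the XML declaration
dom.writexml(buffer, addindent="  ", newl="\n", encoding="utf-8")
output = buffer.getvalue()
print(output)
```

Because `encoding` here only affects the declaration, the file you write to should itself be opened with the same encoding. When the DOM comes from parsing an already indented file, the existing whitespace text nodes are kept, so adding indentation on top of them can produce extra blank space in the output.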


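3. Updating an XML file

To update XML, first find the relevant element node, then update the text node beneath it or its attribute value, and finally save the result to a file. I won't elaborate further; the comments in the code explain the approach: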

    from xml.dom.minidom import parse


    def updateXML():
        domTree = parse('./customer.xml')
        # Root element of the document
        rootNode = domTree.documentElement

        names = rootNode.getElementsByTagName('name')
        for name in names:
            if name.childNodes[0].data == 'Acme Inc.':
                # Get the parent node of the name node
                pn = name.parentNode
                # The parent's phone node, i.e. a sibling of name.
                # There may be a sibling-node method; I haven't tried it,
                # so feel free to look it up.
                phone = pn.getElementsByTagName('phone')[0]
                # Update the phone value (text node data should be a string)
                phone.childNodes[0].data = '99999'

        # Open the file as UTF-8 so the bytes match the declared encoding
        with open('updated_customer.xml', 'w', encoding='utf-8') as f:
            # addindent sets the indentation; encoding sets the XML declaration
            domTree.writexml(f, addindent=' ', encoding='utf-8')


    if __name__ == '__main__':
        updateXML()
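The code above only changes a text node. As a hedged sketch of the other two ideas mentioned, updating an attribute and reaching a sibling node, the example below uses a made-up one-customer document written without whitespace between elements. The new ID value `C009` is purely illustrative.

```python
from xml.dom.minidom import parseString

# Hypothetical document with no whitespace between the elements.
doc = parseString(
    "<customers><customer ID='C001'>"
    "<name>Acme Inc.</name><phone>12345</phone>"
    "</customer></customers>"
)

name = doc.getElementsByTagName("name")[0]

# With no whitespace text nodes in between, nextSibling is the phone element.
phone = name.nextSibling
phone.firstChild.data = "99999"

# Attribute values are updated through the owning element.
name.parentNode.setAttribute("ID", "C009")

result = doc.documentElement.toxml()
print(result)
```

In an indented file such as customer.xml, `nextSibling` of `<name>` would be a whitespace text node rather than `<phone>`, which is one reason looking the element up through the parent with `getElementsByTagName()` is the more robust choice.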


If anything here is wrong, corrections are welcome.

Supplement: reading the content of XML files in Python and modifying it

Without further ado, here is the code:

    import os
    import xml.etree.ElementTree as ET


    def changesku(inputpath):
        listdir = os.listdir(inputpath)
        for file in listdir:
            if file.endswith('xml'):
                file = os.path.join(inputpath, file)
                tree = ET.parse(file)
                root = tree.getroot()
                # The element to modify is inside <object>, so find <object> first
                for object1 in root.findall('object'):
                    # Find every <name> element that may need changing
                    for sku in object1.findall('name'):
                        if sku.text == '005':  # '005' is the original text
                            sku.text = '008'   # new text for the <name> tag
                            # Write back to the original XML file, otherwise
                            # the change is lost; encoding='utf-8' avoids
                            # garbled Chinese characters from the original XML
                            tree.write(file, encoding='utf-8')


    if __name__ == '__main__':
        # Absolute path of the folder that contains the XML files
        inputpath = 'D:easyhebing_xml'
        changesku(inputpath)
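The find-and-replace logic above can be tried without touching any files. This is a minimal sketch that parses a made-up XML string with `ET.fromstring`; the `<annotation>` root and the second `<object>` are assumptions for illustration, and only the `object`/`name` tags and the values `005` and `008` come from the code above.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory document; the tag names follow the code above.
root = ET.fromstring(
    "<annotation>"
    "<object><name>005</name></object>"
    "<object><name>123</name></object>"
    "</annotation>"
)

# Same traversal as changesku(): <object> first, then its <name> children.
for obj in root.findall("object"):
    for sku in obj.findall("name"):
        if sku.text == "005":
            sku.text = "008"

names = [sku.text for sku in root.iter("name")]
print(names)
```

Only the element whose text matched is changed; writing the tree back with `tree.write()` is still needed when the source is a file.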
