Iterate XPath elements to get individual elements instead of list
I'm parsing an XML document and reading the values of different elements using XPath. This currently works well for getting all matching elements as lists.
However, child elements are not always present for every parent (only for some), and I need to know which ones are missing, because I'm parsing the XML to build a dataframe to insert into a database.
So I want to iterate over the elements and grab the values I need one at a time. I'm not sure how to do this, as I currently get the full list on each iteration.
I'm extracting elements that are nested at different levels.
The XML I'm parsing is a Garmin TCX file. Short example:
<?xml version="1.0" encoding="UTF-8"?>
<TrainingCenterDatabase
xsi:schemaLocation="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2 http://www.garmin.com/xmlschemas/TrainingCenterDatabasev2.xsd"
xmlns:ns5="http://www.garmin.com/xmlschemas/ActivityGoals/v1"
xmlns:ns3="http://www.garmin.com/xmlschemas/ActivityExtension/v2"
xmlns:ns2="http://www.garmin.com/xmlschemas/UserProfile/v2"
xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ns4="http://www.garmin.com/xmlschemas/ProfileExtension/v1">
<Activities>
<Activity Sport="Running">
<Id>2018-10-10T14:10:10.000Z</Id>
<Lap StartTime="2018-10-10T14:10:10.000Z">
<TotalTimeSeconds>343.0</TotalTimeSeconds>
<DistanceMeters>1000.0</DistanceMeters>
<MaximumSpeed>3.694999933242798</MaximumSpeed>
<Calories>51</Calories>
<AverageHeartRateBpm>
<Value>136</Value>
</AverageHeartRateBpm>
<MaximumHeartRateBpm>
<Value>162</Value>
</MaximumHeartRateBpm>
<Intensity>Active</Intensity>
<TriggerMethod>Manual</TriggerMethod>
<Track>
<Trackpoint>
<Time>2018-10-10T14:10:10.000Z</Time>
<Position>
<LatitudeDegrees>52.17917550355196</LatitudeDegrees>
<LongitudeDegrees>6.532441098242998</LongitudeDegrees>
</Position>
<AltitudeMeters>-0.20000000298023224</AltitudeMeters>
<DistanceMeters>0.0</DistanceMeters>
<HeartRateBpm>
<Value>94</Value>
</HeartRateBpm>
<Extensions>
<ns3:TPX>
<ns3:Speed>0.04699999839067459</ns3:Speed>
<ns3:RunCadence>7</ns3:RunCadence>
</ns3:TPX>
</Extensions>
</Trackpoint>
<Trackpoint>
<Time>2018-10-10T14:10:11.000Z</Time>
<Position>
<LatitudeDegrees>52.17917634174228</LatitudeDegrees>
<LongitudeDegrees>6.532444199547172</LongitudeDegrees>
</Position>
<AltitudeMeters>0.0</AltitudeMeters>
<DistanceMeters>0.23000000417232513</DistanceMeters>
<HeartRateBpm>
<Value>95</Value>
</HeartRateBpm>
<Extensions>
<ns3:TPX>
<ns3:Speed>0.0</ns3:Speed>
<ns3:RunCadence>7</ns3:RunCadence>
</ns3:TPX>
</Extensions>
</Trackpoint>
<Trackpoint>
<Time>2018-10-10T14:10:12.000Z</Time>
<Position>
<LatitudeDegrees>52.17917206697166</LatitudeDegrees>
<LongitudeDegrees>6.532468926161528</LongitudeDegrees>
</Position>
<AltitudeMeters>0.0</AltitudeMeters>
<DistanceMeters>1.9700000286102295</DistanceMeters>
<Extensions>
<ns3:TPX>
<ns3:Speed>0.0</ns3:Speed>
<ns3:RunCadence>7</ns3:RunCadence>
</ns3:TPX>
</Extensions>
</Trackpoint>
<Trackpoint>
<Time>2018-10-10T14:10:13.000Z</Time>
<Position>
<LatitudeDegrees>52.17916024848819</LatitudeDegrees>
<LongitudeDegrees>6.5325202234089375</LongitudeDegrees>
</Position>
<AltitudeMeters>0.0</AltitudeMeters>
<DistanceMeters>5.679999828338623</DistanceMeters>
<HeartRateBpm>
<Value>96</Value>
</HeartRateBpm>
<Extensions>
<ns3:TPX>
<ns3:Speed>0.08399999886751175</ns3:Speed>
<ns3:RunCadence>7</ns3:RunCadence>
</ns3:TPX>
</Extensions>
</Trackpoint>
<Trackpoint>
<Time>2018-10-10T14:10:14.000Z</Time>
<Position>
<LatitudeDegrees>52.17914817854762</LatitudeDegrees>
<LongitudeDegrees>6.532532041892409</LongitudeDegrees>
</Position>
<AltitudeMeters>0.0</AltitudeMeters>
<DistanceMeters>7.150000095367432</DistanceMeters>
<HeartRateBpm>
<Value>98</Value>
</HeartRateBpm>
<Extensions>
<ns3:TPX>
<ns3:Speed>0.10300000011920929</ns3:Speed>
<ns3:RunCadence>10</ns3:RunCadence>
</ns3:TPX>
</Extensions>
</Trackpoint>
Working code that gives me all values in the file as lists:
from lxml import etree, objectify
from os import listdir
from os.path import isfile, join

def tcxParse(tcxFile):
    parser = etree.XMLParser(remove_blank_text=True)
    tree = etree.parse(tcxFile, parser)
    root = tree.getroot()
    ####
    # strip namespaces
    for elem in root.iter():
        # (1) skip comments and processing instructions, whose tag is not a string
        if not hasattr(elem.tag, 'find'): continue
        i = elem.tag.find('}')
        if i >= 0:
            elem.tag = elem.tag[i + 1:]
    objectify.deannotate(root, cleanup_namespaces=True)
    ####
    # check if we are dealing with .tcx or other format
    if tcxFile.lower().endswith('.tcx'):
        tcxParse.activity = tree.xpath('//*[@Sport]/@Sport')
        tcxParse.HR = list(map(int, tree.xpath('//Track/Trackpoint/HeartRateBpm/Value/text()')))
        tcxParse.Time = tree.xpath('//Time/text()')
        tcxParse.Speed = list(map(float, tree.xpath('//Track/Trackpoint/Extensions/TPX/Speed/text()')))
        tcxParse.Cadence = list(map(int, tree.xpath('//Track/Trackpoint/Extensions/TPX/RunCadence/text()')))
        tcxParse.Lat = list(map(float, tree.xpath('//Track/Trackpoint/Position/LatitudeDegrees/text()')))
        tcxParse.Lon = list(map(float, tree.xpath('//Track/Trackpoint/Position/LongitudeDegrees/text()')))
        tcxParse.Alt = list(map(float, tree.xpath('//Track/Trackpoint/AltitudeMeters/text()')))
        tcxParse.Distance = list(map(float, tree.xpath('//Track/Trackpoint/DistanceMeters/text()')))
I know I can use tree.iter() to iterate over the elements, but I'm not sure how to grab the values one at a time instead of the full list.
To be clear, the current output for tcxParse.HR, for instance, is:
94,95,96,98
But I need it to be:
94,95,nan,96,98
because HeartRateBpm is missing in the 3rd Trackpoint element.
python xml xpath
asked Nov 15 '18 at 12:15 by Chrisvdberge (edited Nov 15 '18 at 12:50)
What is your current output, and what is the desired output?
– Andersson
Nov 15 '18 at 12:21
added it to the question based on the snippet I included here. Hope it's clear this way
– Chrisvdberge
Nov 15 '18 at 12:56
I think you need to do this in Python. I don't know for sure, but it seems that with XPath 1.0 you won't be able to get nan for missing elements. You can check this answer (though, as I said, it probably won't work with an array result): stackoverflow.com/a/4490667/7128891
– Mike Kaskun
Nov 15 '18 at 13:26
1 Answer
As I understand it, you need to iterate over the <Trackpoint> elements in <Track>.
I suggest doing it like this:
trackpoints = [{
    'HR': tp.findtext('HeartRateBpm/Value'),
    'Time': tp.findtext('Time'),
    'Speed': tp.findtext('Extensions/TPX/Speed'),
    'Cadence': tp.findtext('Extensions/TPX/RunCadence'),
    'Lat': tp.findtext('Position/LatitudeDegrees'),
    'Lon': tp.findtext('Position/LongitudeDegrees'),
    'Alt': tp.findtext('AltitudeMeters'),
    'Distance': tp.findtext('DistanceMeters')
} for tp in tree.xpath('//Track/Trackpoint')]
findtext returns None when the path matches nothing, so a missing element shows up as None in its own row instead of being dropped from the list.
For the XML chunk in the question (with <HeartRateBpm> deleted from the second <Trackpoint>), trackpoints will contain the following list:
[{'HR': '94', 'Time': '2018-10-10T14:10:10.000Z', 'Speed': '0.04699999839067459', 'Cadence': '7', 'Lat': '52.17917550355196', 'Lon': '6.532441098242998', 'Alt': '-0.20000000298023224', 'Distance': '0.0'},
 {'HR': None, 'Time': '2018-10-10T14:10:11.000Z', 'Speed': '0.0', 'Cadence': '7', 'Lat': '52.17917634174228', 'Lon': '6.532444199547172', 'Alt': '0.0', 'Distance': '0.23000000417232513'}]
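If you want numbers and nan rather than strings and None, the same per-Trackpoint idea can be sketched as below. This is only an illustration under a few assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml, and a namespace map instead of stripping namespaces, so it runs on its own. The names NS, FIELDS and trackpoint_rows, the prefixes tcx and ext, and the shortened SAMPLE document are all made up for this example; only the namespace URIs and element names come from the question.

```python
import math
import xml.etree.ElementTree as ET

# Prefixes are arbitrary (hypothetical); the URIs are those declared in the TCX sample.
NS = {
    "tcx": "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2",
    "ext": "http://www.garmin.com/xmlschemas/ActivityExtension/v2",
}

# Hypothetical mapping: column name -> (path relative to a Trackpoint, converter).
# Numeric columns use float so that a missing value can be stored as nan.
FIELDS = {
    "Time": ("tcx:Time", str),
    "HR": ("tcx:HeartRateBpm/tcx:Value", float),
    "Speed": ("tcx:Extensions/ext:TPX/ext:Speed", float),
    "Distance": ("tcx:DistanceMeters", float),
}


def trackpoint_rows(root):
    """Return one dict per Trackpoint, with nan where an element is missing."""
    rows = []
    for tp in root.iter("{%s}Trackpoint" % NS["tcx"]):
        row = {}
        for name, (path, convert) in FIELDS.items():
            # findtext returns None when the relative path matches nothing
            text = tp.findtext(path, namespaces=NS)
            row[name] = convert(text) if text is not None else math.nan
        rows.append(row)
    return rows


# Shortened stand-in for the TCX file; the second Trackpoint has no HeartRateBpm.
SAMPLE = """<TrainingCenterDatabase
    xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"
    xmlns:ns3="http://www.garmin.com/xmlschemas/ActivityExtension/v2">
  <Activities><Activity Sport="Running"><Lap><Track>
    <Trackpoint>
      <Time>2018-10-10T14:10:11.000Z</Time>
      <DistanceMeters>0.23</DistanceMeters>
      <HeartRateBpm><Value>95</Value></HeartRateBpm>
      <Extensions><ns3:TPX><ns3:Speed>0.0</ns3:Speed></ns3:TPX></Extensions>
    </Trackpoint>
    <Trackpoint>
      <Time>2018-10-10T14:10:12.000Z</Time>
      <DistanceMeters>1.97</DistanceMeters>
      <Extensions><ns3:TPX><ns3:Speed>0.0</ns3:Speed></ns3:TPX></Extensions>
    </Trackpoint>
  </Track></Lap></Activity></Activities>
</TrainingCenterDatabase>"""

rows = trackpoint_rows(ET.fromstring(SAMPLE))
print([row["HR"] for row in rows])  # [95.0, nan]
```

Because every Trackpoint yields exactly one row, the columns stay aligned, and a list of dicts like rows should be usable as input for a dataframe. HR is converted with float rather than int here because nan is a float value.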
answered Nov 15 '18 at 14:13 by Mike Kaskun
Thx, that's what I needed indeed!
– Chrisvdberge
Nov 15 '18 at 18:58