Iterate XPath elements to get individual elements instead of list










I'm parsing an XML document and reading the values of different elements using XPath. This currently works well for getting all elements as lists.
However, child elements are not always present for every parent (only for some), and I need to know which ones are missing, because I'm parsing the XML to build a dataframe to insert into a database.
So I want to iterate over the elements and grab the values I need one at a time. I'm not sure how to do this, as I currently get the full list on each iteration.
The elements I'm extracting are nested at different levels.



The XML I'm parsing is a TCX file from Garmin. Short example:



 <?xml version="1.0" encoding="UTF-8"?>
<TrainingCenterDatabase
xsi:schemaLocation="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2 http://www.garmin.com/xmlschemas/TrainingCenterDatabasev2.xsd"
xmlns:ns5="http://www.garmin.com/xmlschemas/ActivityGoals/v1"
xmlns:ns3="http://www.garmin.com/xmlschemas/ActivityExtension/v2"
xmlns:ns2="http://www.garmin.com/xmlschemas/UserProfile/v2"
xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ns4="http://www.garmin.com/xmlschemas/ProfileExtension/v1">
<Activities>
<Activity Sport="Running">
<Id>2018-10-10T14:10:10.000Z</Id>
<Lap StartTime="2018-10-10T14:10:10.000Z">
<TotalTimeSeconds>343.0</TotalTimeSeconds>
<DistanceMeters>1000.0</DistanceMeters>
<MaximumSpeed>3.694999933242798</MaximumSpeed>
<Calories>51</Calories>
<AverageHeartRateBpm>
<Value>136</Value>
</AverageHeartRateBpm>
<MaximumHeartRateBpm>
<Value>162</Value>
</MaximumHeartRateBpm>
<Intensity>Active</Intensity>
<TriggerMethod>Manual</TriggerMethod>
<Track>
<Trackpoint>
<Time>2018-10-10T14:10:10.000Z</Time>
<Position>
<LatitudeDegrees>52.17917550355196</LatitudeDegrees>
<LongitudeDegrees>6.532441098242998</LongitudeDegrees>
</Position>
<AltitudeMeters>-0.20000000298023224</AltitudeMeters>
<DistanceMeters>0.0</DistanceMeters>
<HeartRateBpm>
<Value>94</Value>
</HeartRateBpm>
<Extensions>
<ns3:TPX>
<ns3:Speed>0.04699999839067459</ns3:Speed>
<ns3:RunCadence>7</ns3:RunCadence>
</ns3:TPX>
</Extensions>
</Trackpoint>
<Trackpoint>
<Time>2018-10-10T14:10:11.000Z</Time>
<Position>
<LatitudeDegrees>52.17917634174228</LatitudeDegrees>
<LongitudeDegrees>6.532444199547172</LongitudeDegrees>
</Position>
<AltitudeMeters>0.0</AltitudeMeters>
<DistanceMeters>0.23000000417232513</DistanceMeters>
<HeartRateBpm>
<Value>95</Value>
</HeartRateBpm>
<Extensions>
<ns3:TPX>
<ns3:Speed>0.0</ns3:Speed>
<ns3:RunCadence>7</ns3:RunCadence>
</ns3:TPX>
</Extensions>
</Trackpoint>
<Trackpoint>
<Time>2018-10-10T14:10:12.000Z</Time>
<Position>
<LatitudeDegrees>52.17917206697166</LatitudeDegrees>
<LongitudeDegrees>6.532468926161528</LongitudeDegrees>
</Position>
<AltitudeMeters>0.0</AltitudeMeters>
<DistanceMeters>1.9700000286102295</DistanceMeters>
<Extensions>
<ns3:TPX>
<ns3:Speed>0.0</ns3:Speed>
<ns3:RunCadence>7</ns3:RunCadence>
</ns3:TPX>
</Extensions>
</Trackpoint>
<Trackpoint>
<Time>2018-10-10T14:10:13.000Z</Time>
<Position>
<LatitudeDegrees>52.17916024848819</LatitudeDegrees>
<LongitudeDegrees>6.5325202234089375</LongitudeDegrees>
</Position>
<AltitudeMeters>0.0</AltitudeMeters>
<DistanceMeters>5.679999828338623</DistanceMeters>
<HeartRateBpm>
<Value>96</Value>
</HeartRateBpm>
<Extensions>
<ns3:TPX>
<ns3:Speed>0.08399999886751175</ns3:Speed>
<ns3:RunCadence>7</ns3:RunCadence>
</ns3:TPX>
</Extensions>
</Trackpoint>
<Trackpoint>
<Time>2018-10-10T14:10:14.000Z</Time>
<Position>
<LatitudeDegrees>52.17914817854762</LatitudeDegrees>
<LongitudeDegrees>6.532532041892409</LongitudeDegrees>
</Position>
<AltitudeMeters>0.0</AltitudeMeters>
<DistanceMeters>7.150000095367432</DistanceMeters>
<HeartRateBpm>
<Value>98</Value>
</HeartRateBpm>
<Extensions>
<ns3:TPX>
<ns3:Speed>0.10300000011920929</ns3:Speed>
<ns3:RunCadence>10</ns3:RunCadence>
</ns3:TPX>
</Extensions>
</Trackpoint>


Working code that gives me all values in the file as lists:



from lxml import etree, objectify
from os import listdir
from os.path import isfile, join

def tcxParse(tcxFile):
    parser = etree.XMLParser(remove_blank_text=True)
    tree = etree.parse(tcxFile, parser)
    root = tree.getroot()

    ####
    # strip namespaces
    for elem in root.getiterator():
        if not hasattr(elem.tag, 'find'): continue  # (1)
        i = elem.tag.find('}')
        if i >= 0:
            elem.tag = elem.tag[i + 1:]
    objectify.deannotate(root, cleanup_namespaces=True)
    ####
    # check if we are dealing with .tcx or other format
    if tcxFile.lower().endswith('.tcx'):
        tcxParse.activity = tree.xpath('//*[@Sport]/@Sport')
        tcxParse.HR = list(map(int, tree.xpath('//Track/Trackpoint/HeartRateBpm/Value/text()')))
        tcxParse.Time = tree.xpath('//Time/text()')
        tcxParse.Speed = list(map(float, tree.xpath('//Track/Trackpoint/Extensions/TPX/Speed/text()')))
        tcxParse.Cadence = list(map(int, tree.xpath('//Track/Trackpoint/Extensions/TPX/RunCadence/text()')))
        tcxParse.Lat = list(map(float, tree.xpath('//Track/Trackpoint/Position/LatitudeDegrees/text()')))
        tcxParse.Lon = list(map(float, tree.xpath('//Track/Trackpoint/Position/LongitudeDegrees/text()')))
        tcxParse.Alt = list(map(float, tree.xpath('//Track/Trackpoint/AltitudeMeters/text()')))
        tcxParse.Distance = list(map(float, tree.xpath('//Track/Trackpoint/DistanceMeters/text()')))


I know I can use tree.iter() to iterate over the elements, but I'm not sure how to grab the values one at a time instead of the full list.



To be clear, the current output for tcxParse.HR, for instance, would be:



94,95,96,98


But I need it to be



94,95,nan,96,98 


because HeartRateBpm is missing from the third Trackpoint element.










  • What is your current and what is desired output?

    – Andersson
    Nov 15 '18 at 12:21











  • added it to the question based on the snippet I included here. Hope it's clear this way

    – Chrisvdberge
    Nov 15 '18 at 12:56











  • I think you need to do this in Python. I don't know for sure, but it seems that with XPath 1.0 you won't be able to get nan for missing elements. You can check this answer (though, as I said, it probably won't work with an array result): stackoverflow.com/a/4490667/7128891

    – Mike Kaskun
    Nov 15 '18 at 13:26















python xml xpath






asked Nov 15 '18 at 12:15 by Chrisvdberge
edited Nov 15 '18 at 12:50 by Chrisvdberge












1 Answer
As I understand it, you need to iterate over the <Trackpoint> elements in <Track>.

I suggest doing it like this:



trackpoints = [
    {
        # findtext() returns None when the child element is missing
        'HR': tp.findtext('HeartRateBpm/Value'),
        'Time': tp.findtext('Time'),
        'Speed': tp.findtext('Extensions/TPX/Speed'),
        'Cadence': tp.findtext('Extensions/TPX/RunCadence'),
        'Lat': tp.findtext('Position/LatitudeDegrees'),
        'Lon': tp.findtext('Position/LongitudeDegrees'),
        'Alt': tp.findtext('AltitudeMeters'),
        'Distance': tp.findtext('DistanceMeters')
    }
    for tp in tree.xpath('//Track/Trackpoint')
]
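The same per-Trackpoint lookup can probably be done without stripping namespaces first, by passing a prefix map to findtext(). The sketch below is only an illustration under a few assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra dependencies, the prefixes tcx and ext are arbitrary labels made up here, and SAMPLE is a trimmed, hypothetical excerpt modelled on the file in the question.

```python
import xml.etree.ElementTree as ET

# Prefix names are arbitrary labels for this sketch; the URIs are the ones
# declared on the root element of the TCX file in the question.
NS = {
    "tcx": "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2",
    "ext": "http://www.garmin.com/xmlschemas/ActivityExtension/v2",
}

# Trimmed sample: the second Trackpoint has no HeartRateBpm element.
SAMPLE = """<TrainingCenterDatabase
    xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"
    xmlns:ns3="http://www.garmin.com/xmlschemas/ActivityExtension/v2">
  <Activities><Activity Sport="Running"><Lap><Track>
    <Trackpoint>
      <Time>2018-10-10T14:10:11.000Z</Time>
      <HeartRateBpm><Value>95</Value></HeartRateBpm>
      <Extensions><ns3:TPX><ns3:Speed>0.0</ns3:Speed></ns3:TPX></Extensions>
    </Trackpoint>
    <Trackpoint>
      <Time>2018-10-10T14:10:12.000Z</Time>
      <Extensions><ns3:TPX><ns3:Speed>0.0</ns3:Speed></ns3:TPX></Extensions>
    </Trackpoint>
  </Track></Lap></Activity></Activities>
</TrainingCenterDatabase>"""

root = ET.fromstring(SAMPLE)

# One dict per Trackpoint; a missing child yields None instead of being skipped.
rows = [
    {
        "Time": tp.findtext("tcx:Time", namespaces=NS),
        "HR": tp.findtext("tcx:HeartRateBpm/tcx:Value", namespaces=NS),
        "Speed": tp.findtext("tcx:Extensions/ext:TPX/ext:Speed", namespaces=NS),
    }
    for tp in root.iterfind(".//tcx:Track/tcx:Trackpoint", NS)
]
```

Because each lookup is relative to a single Trackpoint, every row keeps its position even when a child is absent, which is what the flat document-wide XPath queries lose.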


For the XML chunk in the question (with <HeartRateBpm> deleted from the second <Trackpoint>), trackpoints will contain a list like this:



[{'HR': '94', 'Time': '2018-10-10T14:10:10.000Z', 'Speed': '0.04699999839067459', 'Cadence': '7', 'Lat': '52.17917550355196', 'Lon': '6.532441098242998', 'Alt': '-0.20000000298023224', 'Distance': '0.0'},
 {'HR': None, 'Time': '2018-10-10T14:10:11.000Z', 'Speed': '0.0', 'Cadence': '7', 'Lat': '52.17917634174228', 'Lon': '6.532444199547172', 'Alt': '0.0', 'Distance': '0.23000000417232513'}]
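The values in that list are strings or None, while the question asks for numbers with nan in the gaps. One possible way to bridge that is sketched below. The helper name to_number is hypothetical, and the input is a hand-written list shaped like the one above, reduced to two keys. Floats are used throughout because an int cannot hold NaN.

```python
import math

def to_number(text):
    """Convert element text to a float; a missing element (None) becomes NaN."""
    return math.nan if text is None else float(text)

# Hand-written input shaped like the list above, reduced to two keys.
trackpoints = [
    {"HR": "94", "Speed": "0.04699999839067459"},
    {"HR": None, "Speed": "0.0"},
]

# One column per field; every column has one entry per Trackpoint.
hr = [to_number(tp["HR"]) for tp in trackpoints]
speed = [to_number(tp["Speed"]) for tp in trackpoints]
```

Columns built this way all have the same length, so they should line up row by row when loaded into a dataframe.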





  • Thx, that's what I needed indeed!

    – Chrisvdberge
    Nov 15 '18 at 18:58










answered Nov 15 '18 at 14:13 by Mike Kaskun











