Turn text list into json formatted list
I have a text file that is formatted like the following, with each hyphen representing a hierarchy for the list item.
category1 : 0120391123123
- subcategory : 0120391123123
-- subsubcategory : 019301948109
--- subsubsubcategory : 013904123908
---- subsubsubsubcategory : 019341823908
- subcategory2 : 0934810923801
-- subsubcategory2 : 09341829308123
category2: 1309183912309
- subcategory : 10293182094
...
How can I programmatically get a list like this into a json format like the following?
[
  {"category1":"0120391123123"},
  [
    {"subcategory":"0120391123123"},
    [
      {"subsubcategory":"019301948109"},
      [
        {"subsubsubcategory":"013904123908"},
        [
          {"subsubsubsubcategory":"019341823908"}
        ]
      ]
    ]
  ],
  [
    {"subcategory2":"0934810923801"},
    [
      {"subsubcategory2":"09341829308123"}
    ]
  ],
  [
    {"category2":"1309183912309"},
    [
      {"subcategory":"10293182094"}
    ]
  ]
]
python json list
Have you got any Python code yet that just reads the file and attempts to parse it?
– cricket_007
Nov 14 '18 at 4:11
Don't be afraid of JSON. It's just lists and dictionaries. Start with a small example, then cover your results with tests one by one as you work up to the real problem. It takes time. Be patient.
– Sarit
Nov 14 '18 at 4:15
It kinda looks like YAML, so you could try making it look closer to YAML and parsing it with the relevant library. And yes, without trying or showing your attempt, there's only a small chance that you'll get an answer.
– vishes_shell
Nov 14 '18 at 4:26
edited Nov 14 '18 at 4:12 by cricket_007
asked Nov 14 '18 at 4:08 by Evan Hessler
2 Answers
You can use recursion with itertools.groupby:
import itertools
import json
import re

s = """
category1 : 0120391123123
- subcategory : 0120391123123
-- subsubcategory : 019301948109
--- subsubsubcategory : 013904123908
---- subsubsubsubcategory : 019341823908
- subcategory2 : 0934810923801
-- subsubcategory2 : 09341829308123
category2: 1309183912309
- subcategory : 10293182094
"""

# Drop the empty strings produced by the leading and trailing newlines.
data = list(filter(None, s.split('\n')))

def group_data(d):
    # Base case: a single "key : value" line becomes a one-item dict.
    if len(d) == 1:
        return [dict([re.split(r'\s*:\s*', d[0])])]
    # Alternating runs: lines without a leading '-' (parents), then lines with one (children).
    grouped = [[a, list(b)] for a, b in itertools.groupby(d, key=lambda x: not x.startswith('-'))]
    _group = [[grouped[i][-1], grouped[i+1][-1]] for i in range(0, len(grouped), 2)]
    # Strip one '-' from each child line and recurse on the children.
    _c = [[dict([re.split(r'\s*:\s*', i) for i in a]), group_data([c[1:] for c in b])] for a, b in _group]
    return [i for b in _c for i in b]

print(json.dumps(group_data(data), indent=4))
Output:
[
    {
        "category1": "0120391123123"
    },
    [
        {
            " subcategory": "0120391123123"
        },
        [
            {
                " subsubcategory": "019301948109"
            },
            [
                {
                    " subsubsubcategory": "013904123908"
                },
                [
                    {
                        " subsubsubsubcategory": "019341823908"
                    }
                ]
            ]
        ],
        {
            " subcategory2": "0934810923801"
        },
        [
            {
                " subsubcategory2": "09341829308123"
            }
        ]
    ],
    {
        "category2": "1309183912309"
    },
    [
        {
            " subcategory": "10293182094"
        }
    ]
]
Note: this answer assumes that your final output should have "category2" at the same level as "category1", since neither has a leading "-".
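As an alternative to recursion, the same nesting can be built in one pass by counting each line's leading hyphens and keeping a stack of the currently open lists. This is only a sketch and is not taken from either answer: the function name parse_hyphen_list is hypothetical, and it assumes every line has the form "key : value" with leading hyphens as the only depth marker. Unlike the groupby version above, it strips surrounding whitespace from the keys.

```python
import json


def parse_hyphen_list(text):
    """Hypothetical helper: build nested lists from 'key : value' lines.

    The number of leading hyphens on a line is taken as its depth.
    """
    root = []
    stack = [root]  # stack[d] is the list that collects items of depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        body = line.lstrip('-')
        depth = len(line) - len(body)
        key, _, value = body.partition(':')
        # Going deeper: open a child list inside the current list.
        while depth > len(stack) - 1:
            child = []
            stack[-1].append(child)
            stack.append(child)
        # Going back up: close any lists deeper than this line.
        del stack[depth + 1:]
        # Values stay strings, so leading zeros are preserved.
        stack[-1].append({key.strip(): value.strip()})
    return root


sample = """category1 : 0120391123123
- subcategory : 0120391123123
-- subsubcategory : 019301948109
- subcategory2 : 0934810923801
category2: 1309183912309
"""
result = parse_hyphen_list(sample)
print(json.dumps(result, indent=4))
```

The stack avoids re-slicing the line list at every level, at the cost of mutable state. Like the answer above, it places "category2" at the same level as "category1".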
Use a recursive function to split the file's contents into chunks, divide-and-conquer style:
from pprint import pprint

with open('temp.txt', 'r') as f:
    # Skip blank lines (e.g. from a trailing newline); indexing [0] on '' would raise IndexError.
    content = [line for line in f.read().split('\n') if line]

def foo(splitcontent):
    index = 0
    reqlist = []
    while index < len(splitcontent):
        if splitcontent[index][0] != '-':
            # A line without a leading '-' is an item at the current level.
            key, value = splitcontent[index].split(':')
            reqlist.append({key.strip(): value.strip()})
            index += 1
            # Collect the following '-' lines as its children, with one '-' stripped.
            templist = []
            while index < len(splitcontent) and splitcontent[index][0] == '-':
                templist.append(splitcontent[index][1:])
                index += 1
            intermediatelist = foo(templist)
            if intermediatelist:
                reqlist.append(intermediatelist)
    return reqlist

pprint(foo(content))
OUTPUT
[{'category1': '0120391123123'},
 [{'subcategory': '0120391123123'},
  [{'subsubcategory': '019301948109'},
   [{'subsubsubcategory': '013904123908'},
    [{'subsubsubsubcategory': '019341823908'}]]],
  {'subcategory2': '0934810923801'},
  [{'subsubcategory2': '09341829308123'}]],
 {'category2': '1309183912309'},
 [{'subcategory': '10293182094'}]]
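Note that pprint shows a Python repr (single quotes), which is not valid JSON. A minimal sketch of the last step, assuming the parser's result is a nested list of one-item dicts like the output above; the variable nested below is a hypothetical stand-in for that result:

```python
import json

# Hypothetical stand-in for the nested list returned by the parser.
nested = [
    {"category1": "0120391123123"},
    [{"subcategory": "0120391123123"}],
    {"category2": "1309183912309"},
]

# json.dumps serializes lists and dicts to a JSON string with double quotes.
as_json = json.dumps(nested, indent=4)
print(as_json)

# Parsing the string back gives an equal structure, so nothing was lost.
round_tripped = json.loads(as_json)
```

To write to a file instead of a string, json.dump(nested, f) with an open file object f does the same serialization.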
itertools.groupby answer: answered Nov 14 '18 at 4:30 by Ajax1234, edited Nov 14 '18 at 4:40
Recursive-function answer: answered Nov 14 '18 at 5:01 by Albin Paul, edited Nov 14 '18 at 5:11