Is there a way I can speed up this code or is this the fastest it'll go?





















I have a function running in Django that is supposed to calculate the distance from the user's location to each branch location. It works, but I feel my current implementation may be lacking in performance: it tends to take quite a bit of time. Here's the code:



# geodesic is assumed to come from geopy; Establishment is the poster's model (import not shown).
from geopy.distance import geodesic

def resolve_near_by_branches(self, info, **kwargs):
    ul_raw = kwargs.get('user_location')
    ul_l = ul_raw.split(',')
    user_location = (float(ul_l[0]), float(ul_l[1]))

    final_b = []

    if kwargs.get('category') is None:
        es = Establishment.objects.distinct().all()
    else:
        es = Establishment.objects.distinct().filter(
            category__code_name__exact=kwargs.get('category'),
        )

    for e in es:
        for branch in e.branches.all():
            b_l = (float(branch.location.latitude.replace(' ', "")), float(branch.location.longitude.replace(' ', "")))
            # if geodesic(user_location, b_l).km < 9000000:
            final_b.append((geodesic(user_location, b_l).m, branch))

    final_data = sorted(final_b, key=lambda x: x[0])
    print(final_data)
    # print([i[1] for i in final_b])

    return [i[1] for i in final_data]


If you have any suggestions on how I can speed this up, please contribute.





























  • If you have working code, consider posting your question on codereview.stackexchange.com instead. You might need to provide a Minimal, Complete, and Verifiable example, though.
    – Mike Scotty
    Nov 11 at 17:06











  • How should we know whether there is a faster way to implement your function when we don't know what it is doing? We can't tell what it is doing because the indents are wrong (first line), and they are important in Python. Please fix.
    – quant
    Nov 11 at 17:12










  • There are several things there that look like they will cause unnecessary queries, but without seeing your models it's hard to help you properly.
    – Daniel Roseman
    Nov 11 at 17:13










  • You probably don't need to sort the full list. If you only need the top N nearby, use the heapq module to track the top N instead.
    – Martijn Pieters
    Nov 11 at 17:24
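
A minimal sketch of the heapq suggestion above, assuming only the N closest branches are needed. The `nearest` helper and the sample data are hypothetical; the `(distance, item)` pairs mirror the `final_b` list built in the question, with strings standing in for branch objects.

```python
import heapq
from operator import itemgetter


def nearest(pairs, n):
    """Return the n items with the smallest distance, closest first.

    pairs: iterable of (distance, item) tuples, like final_b in the question.
    """
    # nsmallest tracks only n candidates at a time instead of sorting everything.
    # The key compares distances only, so the items never need to be orderable.
    closest = heapq.nsmallest(n, pairs, key=itemgetter(0))
    return [item for _, item in closest]


# Hypothetical distances in metres; strings stand in for Branch objects.
final_b = [(5200.0, "airport"), (310.5, "mall"), (1200.0, "station"), (75.0, "corner")]
print(nearest(final_b, 2))  # prints ['corner', 'mall']
```

This only helps when N is much smaller than the number of branches, and it does not reduce the number of database queries, so it is worth measuring before and after.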





















python django














asked Nov 11 at 17:05 by Richard Nsama
edited Nov 11 at 17:26 by rassar



















1 Answer













One obvious improvement, since you appear to be accessing a reverse relationship on each iteration, is to use prefetch_related(). This tells Django to run one extra database query when the queryset is evaluated to fetch the related objects, rather than one query each time the relationship is accessed, which results in far fewer queries.



# geodesic is assumed to come from geopy; Establishment is the poster's model (import not shown).
from geopy.distance import geodesic

def resolve_near_by_branches(self, info, **kwargs):
    ul_raw = kwargs.get('user_location')
    ul_l = ul_raw.split(',')
    user_location = (float(ul_l[0]), float(ul_l[1]))

    final_b = []

    if kwargs.get('category') is None:
        es = Establishment.objects.distinct().all()
    else:
        es = Establishment.objects.distinct().filter(
            category__code_name__exact=kwargs.get('category'),
        )

    # Fetch all branches in one extra query instead of one query per establishment.
    for e in es.prefetch_related('branches'):
        for branch in e.branches.all():
            b_l = (float(branch.location.latitude.replace(' ', "")), float(branch.location.longitude.replace(' ', "")))
            # if geodesic(user_location, b_l).km < 9000000:
            final_b.append((geodesic(user_location, b_l).m, branch))

    final_data = sorted(final_b, key=lambda x: x[0])
    print(final_data)
    # print([i[1] for i in final_b])

    return [i[1] for i in final_data]


I made a Django ORM optimization cheat sheet recently that you may find helpful when looking for quick optimizations.
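
To make the query counts concrete, here is a toy model in plain Python rather than Django. `FakeDB`, its methods, and the sample data are all hypothetical, and each method call stands in for one database round trip. It only illustrates the access pattern that prefetch_related() replaces, not how Django implements it.

```python
class FakeDB:
    """Toy stand-in for a database; every method call counts as one query."""

    def __init__(self, branches_by_establishment):
        self._data = branches_by_establishment
        self.queries = 0

    def establishments(self):
        self.queries += 1
        return list(self._data)

    def branches_for(self, establishment):
        self.queries += 1
        return list(self._data[establishment])

    def branches_for_many(self, establishments):
        # One query for all parents, similar in spirit to an "IN (...)" lookup.
        self.queries += 1
        return {e: list(self._data[e]) for e in establishments}


def without_prefetch(db):
    """One query for the parents plus one per parent (the N+1 pattern)."""
    result = []
    for e in db.establishments():
        result.extend(db.branches_for(e))
    return result


def with_prefetch(db):
    """One query for the parents plus one for all their children."""
    establishments = db.establishments()
    grouped = db.branches_for_many(establishments)
    result = []
    for e in establishments:
        result.extend(grouped[e])
    return result


data = {"cafe": ["cafe-1", "cafe-2"], "bank": ["bank-1"], "gym": ["gym-1", "gym-2"]}

slow, fast = FakeDB(data), FakeDB(data)
assert without_prefetch(slow) == with_prefetch(fast)
print(slow.queries, fast.queries)  # prints "4 2"
```

With three establishments the loop version issues 4 queries and the batched version 2, and the gap grows by one query for each additional establishment. In the real code, accessing `branch.location` may cost a further query per branch if `location` is a related model (the models are not shown, so this is a guess); prefetch_related() accepts nested lookups such as `'branches__location'` for that case.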




























  • Thanks for the feedback. But won't asking Django to make one extra query slow the function down?
    – Richard Nsama
    Nov 11 at 18:10






  • Sorry, I was unclear. When you iterate through a queryset (for e in es), Django makes one database query to evaluate it. When you use prefetch_related(), it does another database query at this point and joins the reverse relationship in Python. If you don't use prefetch_related(), it doesn't do this extra query and instead makes a query each time you access the reverse relationship (e.branches.all()), which results in a lot more queries.
    – Levi Payne
    Nov 11 at 18:15






  • There are tools you can use to profile your Django application during development. They help you identify and analyze performance bottlenecks. Check out django-debug-toolbar or django-silk: djangopackages.org/grids/g/developer-tools
    – Håken Lid
    Nov 11 at 18:22
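
Those tools are Django-specific. As a quick check that needs no extra packages, the standard library's cProfile can show where a function spends its time. This is a swapped-in technique, not one of the tools named above, and `slow_distance_sort` and `profile_call` are hypothetical stand-ins rather than the poster's resolver.

```python
import cProfile
import io
import math
import pstats


def slow_distance_sort(points, origin):
    """Hypothetical stand-in for the resolver: sort points by distance to origin."""
    return sorted(points, key=lambda p: math.dist(p, origin))


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


points = [(float(i % 90), float(i % 180)) for i in range(10_000)]
result, report = profile_call(slow_distance_sort, points, (0.0, 0.0))
print(report)
```

cProfile reports time spent in Python calls; it does not count SQL queries, so for the query-count problem discussed in the answer the Django-specific tools remain the better fit.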


















answered Nov 11 at 17:53 by Levi Payne
edited Nov 11 at 18:16

















