Aggregation Of An Annotation In GROUP BY In Django


Answer :

Update: Since Django 2.1, everything works out of the box. No workarounds needed and the produced query is correct.

This is maybe a bit too late, but I have found the solution (tested with Django 1.11.1).

The problem is that the call to .values('publisher'), which is required for grouping, removes every annotation that is not listed in its fields argument.

And we can't add dbl_price to the fields argument, because that would add another column to the GROUP BY clause.

The solution is to first perform every aggregation that depends on annotated fields, then call .values() and include those aggregations in its fields argument (this won't add anything to GROUP BY, because they are aggregations). Then call .annotate() with ANY expression: this makes Django add a GROUP BY clause to the SQL query using the only non-aggregate field in the query, publisher.
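To see why an extra grouping column matters, here is a minimal sketch that uses Python's built-in sqlite3 module instead of the Django ORM. The title table, its three rows, and the hand-written SQL are assumptions made for illustration; they approximate the kind of query described here, not the exact SQL Django emits.

```python
import sqlite3

# Hypothetical stand-in for the Title model's table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE title (id INTEGER PRIMARY KEY, publisher TEXT, price INTEGER)"
)
conn.executemany(
    "INSERT INTO title (publisher, price) VALUES (?, ?)",
    [("A", 10), ("A", 20), ("B", 5)],
)

# Grouping by publisher only: one row per publisher.
wanted = conn.execute(
    "SELECT publisher, SUM(2 * price) FROM title "
    "GROUP BY publisher ORDER BY publisher"
).fetchall()

# Grouping by publisher AND the doubled price: rows with different
# prices land in different groups, so nothing is really summed up.
over_grouped = conn.execute(
    "SELECT publisher, SUM(2 * price) FROM title "
    "GROUP BY publisher, 2 * price ORDER BY publisher, 2 * price"
).fetchall()

print(wanted)        # [('A', 60), ('B', 10)]
print(over_grouped)  # [('A', 20), ('A', 40), ('B', 10)]
```

With the doubled price in the grouping, publisher A appears twice, which is the wrong result the steps in this answer are meant to avoid.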

    from django.db.models import Count, F, Sum

    (
        Title.objects
        .annotate(dbl_price=2 * F('price'))
        .annotate(sum_of_prices=Sum('dbl_price'))
        .values('publisher', 'sum_of_prices')
        .annotate(titles_count=Count('id'))
    )

The only downside of this approach: if you don't need any aggregation other than the one on the annotated field, you have to include some anyway. Without the last call to .annotate() (and it must include at least one expression!), Django will not add GROUP BY to the SQL query. One way to deal with this is to create a copy of your field:

    (
        Title.objects
        .annotate(dbl_price=2 * F('price'))
        .annotate(_sum_of_prices=Sum('dbl_price'))  # note the underscore!
        .values('publisher', '_sum_of_prices')
        .annotate(sum_of_prices=F('_sum_of_prices'))
    )

Also note that you should be careful with QuerySet ordering. It is best to call .order_by() either without parameters, to clear the ordering, or with your GROUP BY field. If the resulting query is ordered by any other field, the grouping will be wrong. See https://docs.djangoproject.com/en/1.11/topics/db/aggregation/#interaction-with-default-ordering-or-order-by

Finally, you might want to remove that fake annotation from your output, so call .values() again. The final code looks like:

    (
        Title.objects
        .annotate(dbl_price=2 * F('price'))
        .annotate(_sum_of_prices=Sum('dbl_price'))
        .values('publisher', '_sum_of_prices')
        .annotate(sum_of_prices=F('_sum_of_prices'))
        .values('publisher', 'sum_of_prices')
        .order_by('publisher')
    )

This is expected from the way grouping works in Django: all annotated fields are added to the GROUP BY clause. However, I am unable to comment on why it was written this way.

You can get your query to work like this:

    from django.db.models import F, Sum

    (
        Title.objects
        .values('publisher')
        .annotate(total_dbl_price=Sum(2 * F('price')))
    )

which produces the following SQL:

    SELECT publisher, SUM((2 * price)) AS total_dbl_price
    FROM title
    GROUP BY publisher

which just happens to work in your case.
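As a quick sanity check, the SQL statement quoted in this answer can be run against a throwaway SQLite database with Python's built-in sqlite3 module. This is only a sketch: the title table layout and the sample rows are made up for illustration, and the rows are converted to dictionaries simply to resemble what .values() returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.execute(
    "CREATE TABLE title (id INTEGER PRIMARY KEY, publisher TEXT, price INTEGER)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO title (publisher, price) VALUES (?, ?)",
    [("Apress", 30), ("Apress", 45), ("OReilly", 50)],
)

# The SQL statement quoted in the answer.
sql = (
    "SELECT publisher, SUM((2 * price)) AS total_dbl_price "
    "FROM title GROUP BY publisher"
)

# Sort in Python because the statement itself has no ORDER BY.
rows = sorted(
    (dict(row) for row in conn.execute(sql)),
    key=lambda row: row["publisher"],
)
print(rows)
# [{'publisher': 'Apress', 'total_dbl_price': 150},
#  {'publisher': 'OReilly', 'total_dbl_price': 100}]
```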

I understand this might not be the complete solution you were looking for, but even some complex annotations can be accommodated this way by using CombinedExpression objects (I hope!).


Your problem comes from calling values() followed by annotate(). The order is important. This is explained in the documentation about the [order of annotate and values clauses](https://docs.djangoproject.com/en/1.10/topics/db/aggregation/#order-of-annotate-and-values-clauses).

.values('pub_id') limits the queryset to the pub_id field, so you can't annotate on income afterwards. As the documentation says:

The values() method takes optional positional arguments, *fields, which specify field names to which the SELECT should be limited.

