Django ORM Performance Optimization

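---
tags: Python, Django
---

# Django ORM Performance Optimization

## 1. QuerySet information

+ If you don't need the QuerySet's data and only need to know whether it contains any rows, use `exists()`.
+ Don't put a QuerySet expression directly in an `if` condition: the whole result set is fetched just to test truthiness and is then thrown away.

```python=
# Don't waste a query if you are going to use the queryset anyway
books = Book.objects.filter(...)
if books:
    do_stuff_with_books(books)

# If you aren't using the queryset's data, use exists()
books = Book.objects.filter(...)
if books.exists():
    do_some_stuff()

# But never do this
if Book.objects.filter(...):
    do_some_stuff()
```

+ To get only the number of rows, use `queryset.count()`. If you also need the QuerySet's data, use `len()`. Don't pass a QuerySet expression directly to `len()`.

```python=
# Don't waste a query if you are going to use the queryset anyway
books = Book.objects.filter(...)
if len(books) > 5:
    do_stuff_with_books(books)

# If you aren't using the queryset's data, use count()
books = Book.objects.filter(...)
if books.count() > 5:
    do_some_stuff()

# But never do this
if len(Book.objects.filter(...)) > 5:
    do_some_stuff()
```

## 2. Fetch only what you need

+ `queryset.values()` returns the specified fields as dictionaries
+ `queryset.values_list()` returns the specified fields as tuples

```python
# Retrieve values as a dictionary
>>> Book.objects.values('title', 'author__name')
<QuerySet [{'author__name': u'Nikolai Gogol', 'title': u'The Overcoat'}, {'author__name': u'Leo Tolstoy', 'title': u'War and Peace'}]>

# Retrieve values as a tuple
>>> Book.objects.values_list('title', 'author__name')
<QuerySet [(u'The Overcoat', u'Nikolai Gogol'), (u'War and Peace', u'Leo Tolstoy')]>
>>> Book.objects.values_list('title')
<QuerySet [(u'The Overcoat',), (u'War and Peace',)]>

# With one value, it is easier to flatten the list
>>> Book.objects.values_list('title', flat=True)
<QuerySet [u'The Overcoat', u'War and Peace']>
```

## 3. Iteration

```python
# Slower for large tables: every row is loaded into memory at once
# and cached on the queryset
for book in Book.objects.all():
    do_stuff(book)

# Better for large tables: rows are fetched as the loop consumes them
# and are not cached on the queryset
for book in Book.objects.all().iterator():
    do_stuff(book)
```

## 4. Relationships

+ `select_related()` fetches related objects ahead of time by following foreign keys, combining the data with a SQL JOIN at the database level. It applies to one-to-one and ForeignKey relations.
+ `select_related()` can be combined with `filter()`. All of the data must live in the same database.
+ `prefetch_related()` runs a separate query for each relation and joins the results in Python, so it does not have that restriction and can also handle many-to-many relations.
+ For small datasets and one-to-one relations, use `select_related()`. For many-to-many relations or large datasets, use `prefetch_related()`.

Reference: https://kknews.cc/zh-tw/code/egge6vn.html

```python=
>>> Author.objects.count()
20
>>> Book.objects.count()
100

# This runs 101 queries in total:
# one for the books, plus one per book to look up its author
books = Book.objects.all()
for book in books:
    do_stuff(book.title, book.author.name)

# This runs 21 queries:
# one for the authors, plus one per author (20) to look up their books
authors = Author.objects.all()
for author in authors:
    do_stuff_with_books(author.name, author.books.all())

# This runs a single query:
# each book's author is fetched in the same query through a JOIN
books = Book.objects.select_related('author').all()
for book in books:
    do_stuff(book.title, book.author.name)

# This runs 2 queries:
# one for the authors and one that prefetches all of their books
authors = Author.objects.prefetch_related('books').all()
for author in authors:
    do_stuff_with_books(author.name, author.books.all())
```

If you call `filter()` on a prefetched relation (for example `author.books.filter(...)`), the prefetched data is ignored and a new query is issued. In that case, use a `Prefetch` object instead.

## 5. Add indexes

See the official documentation.

```python
from django.db import models


class Customer(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)

    class Meta:
        indexes = [
            # Composite index on (last_name, first_name), name auto-generated
            models.Index(fields=['last_name', 'first_name']),
            # Single-column index with an explicit name
            models.Index(fields=['first_name'], name='first_name_idx'),
        ]
```

Lookups on indexed fields are faster (reportedly about 15x).

Downside: indexes cost extra storage. An index is a separate structure, much like an extra table, that stores the specified fields, and every additional index adds another one. To keep the space cost down, index only the fields that are read frequently.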
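
To see what an index changes at the database level, here is a minimal sketch that uses the standard-library `sqlite3` module instead of Django, so it runs without a project. The `customer` table, the index name `customer_last_first_idx`, and the `query_plan` helper are made up for illustration, and the exact plan wording depends on the SQLite version. It only shows whether the planner switches from a full scan to the index; it does not measure any speedup.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


# Hypothetical table mirroring the Customer model above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer (first_name, last_name) VALUES (?, ?)",
    [(f"first{i}", f"last{i % 100}") for i in range(1000)],
)

lookup = "SELECT id FROM customer WHERE last_name = ?"

# Without an index, the planner has to scan the whole table.
plan_before = query_plan(conn, lookup, ("last7",))

# Same column order as models.Index(fields=['last_name', 'first_name']).
conn.execute(
    "CREATE INDEX customer_last_first_idx "
    "ON customer (last_name, first_name)"
)

# With the index in place, the planner can search it instead.
plan_after = query_plan(conn, lookup, ("last7",))

print("before:", plan_before)
print("after: ", plan_after)
```

In a Django project, `QuerySet.explain()` gives a similar view of the plan for a given queryset.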