Iterables, Iterators, and Generators in Python (Part 2)

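In the previous article, "Iterables, Iterators, and Generators in Python (Part 1)", we saw that an iterable's __iter__ method returns an iterator, and that calling the iterator's __next__ method yields the items of the sequence one at a time until StopIteration is raised. That article ended by noting that an iterator also has an __iter__ method, which returns the iterator itself, so an iterator is an iterable too. But why does an iterator have its own __iter__ method, and why does calling it return the iterator itself?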

The reason is simple: it lets us iterate over an iterator directly, like this:

>>> names = ['Mary', 'Jack', 'Emily', 'Lauren', 'Olivia']    # iterable
>>> names_iter = iter(names)    # iterator
>>> for name in names_iter:  
...     print(name)
... 
Mary  
Jack  
Emily 
Lauren
Olivia

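As we can see, putting an iterator directly into a for statement works just as well. The underlying reason is that an iterator, like any iterable, has an __iter__ method. When a for statement iterates over an iterable object, Python first calls its __iter__ method to get an iterator, then calls that iterator's __next__ method repeatedly until StopIteration is raised. Python handles this exception for us and cleans up the exhausted iterator. For an iterator, __iter__ simply returns the iterator itself, so exactly the same steps work. Because we assigned the iterator obtained from names to the variable names_iter, names_iter still exists after the for loop finishes, but nothing more can be iterated out of it: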

>>> for name in names_iter:
...     print(name)
... 
>>> next(names_iter)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
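The steps a for statement goes through, as described above, can be sketched in plain Python. This is only an approximation for illustration: manual_for is a hypothetical helper, not something Python provides, and the real for statement is implemented inside the interpreter.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does (illustrative only)."""
    iterator = iter(iterable)          # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)      # calls iterator.__next__()
        except StopIteration:
            break                      # the loop ends quietly
        body(item)


names = ['Mary', 'Jack', 'Emily']
names_iter = iter(names)

# An iterator's __iter__ returns the iterator itself; a list's does not.
same_object = iter(names_iter) is names_iter       # True
list_is_own_iterator = iter(names) is names        # False

collected = []
manual_for(names_iter, collected.append)     # first pass sees every name

second_pass = []
manual_for(names_iter, second_pass.append)   # the iterator is already exhausted

print(collected)     # ['Mary', 'Jack', 'Emily']
print(second_pass)   # []
```

Because iter(names_iter) hands back names_iter itself, the second pass starts from an iterator that has nothing left, which matches the empty for loop in the session above.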

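In fact, this example shows exactly why Python distinguishes iterables from iterators, and why the two are not merged into one even though an iterator can itself be iterated over: a single iterable can produce many iterator objects, which have nothing to do with one another and each iterate independently. For example, the code above has already exhausted the names_iter iterator, but that does not matter. Create another one with iter(names) and we can iterate from the start again: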

>>> names_iter2 = iter(names)
>>> next(names_iter2)
'Mary'
>>> next(names_iter2)
'Jack'

Want to iterate over names again before names_iter2 is exhausted? No problem, just create a new iterator with iter(names):

>>> names_iter3 = iter(names)
>>> next(names_iter3)
'Mary'

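So iterating over the same iterable in nested for statements is fine, because in each for statement Python calls the iterable's __iter__ method to produce a new iterator, and the iterators produced by different for statements are unrelated: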

>>> for name1 in names:
...     for name2 in names:
...         print(f'Outer loop: {name1.lower()}, inner loop: {name2.upper()}')
... 
Outer loop: mary, inner loop: MARY
Outer loop: mary, inner loop: JACK
Outer loop: mary, inner loop: EMILY
Outer loop: mary, inner loop: LAUREN
Outer loop: mary, inner loop: OLIVIA
Outer loop: jack, inner loop: MARY
Outer loop: jack, inner loop: JACK
Outer loop: jack, inner loop: EMILY
Outer loop: jack, inner loop: LAUREN
Outer loop: jack, inner loop: OLIVIA
Outer loop: emily, inner loop: MARY
Outer loop: emily, inner loop: JACK
Outer loop: emily, inner loop: EMILY
Outer loop: emily, inner loop: LAUREN
Outer loop: emily, inner loop: OLIVIA
Outer loop: lauren, inner loop: MARY
Outer loop: lauren, inner loop: JACK
Outer loop: lauren, inner loop: EMILY
Outer loop: lauren, inner loop: LAUREN
Outer loop: lauren, inner loop: OLIVIA
Outer loop: olivia, inner loop: MARY
Outer loop: olivia, inner loop: JACK
Outer loop: olivia, inner loop: EMILY
Outer loop: olivia, inner loop: LAUREN
Outer loop: olivia, inner loop: OLIVIA
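One way to check that every for statement asks the iterable for a fresh iterator is to count the calls to __iter__. The sketch below uses a hypothetical list subclass, CountingList, written only for this demonstration; the counter attribute iter_calls is likewise an assumption of the example.

```python
class CountingList(list):
    """A list that counts how often __iter__ is called (hypothetical demo class)."""

    def __init__(self, *args):
        super().__init__(*args)
        self.iter_calls = 0

    def __iter__(self):
        self.iter_calls += 1
        return super().__iter__()   # a brand-new list iterator each time


names = CountingList(['Mary', 'Jack', 'Emily', 'Lauren', 'Olivia'])

pairs = []
for name1 in names:          # one call to __iter__ for the outer loop
    for name2 in names:      # one call per outer pass for the inner loop
        pairs.append((name1, name2))

print(len(pairs))            # 25
print(names.iter_calls)      # 6: one outer call plus five inner calls
```

The outer loop requests one iterator and the inner loop requests a new one on each of the five outer passes, so all 25 combinations appear.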

If iterators and iterables were one and the same, we would be in real trouble. For example:

>>> names_iter = iter(names)
>>> for name1 in names_iter:
...     for name2 in names_iter:
...         print(f'Outer loop: {name1.lower()}, inner loop: {name2.upper()}')
... 
Outer loop: mary, inner loop: JACK
Outer loop: mary, inner loop: EMILY
Outer loop: mary, inner loop: LAUREN
Outer loop: mary, inner loop: OLIVIA

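As we can see, when an iterator is treated as an iterable in nested for loops, both levels share the same iterator. After the outer loop takes its first item, the inner loop exhausts names_iter completely, so when the outer loop asks for its next item it gets StopIteration and cannot continue.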
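The same split can be reproduced with hand-written classes. The following is a minimal sketch, assuming two hypothetical classes: NameCollection, an iterable whose __iter__ builds a new iterator, and NameIterator, whose __iter__ returns itself. Neither name comes from the article or from the standard library.

```python
class NameIterator:
    """One-shot iterator over a list of names (hypothetical demo class)."""

    def __init__(self, names):
        self._names = names
        self._index = 0

    def __iter__(self):
        return self                      # an iterator returns itself

    def __next__(self):
        if self._index >= len(self._names):
            raise StopIteration          # nothing left to hand out
        name = self._names[self._index]
        self._index += 1
        return name


class NameCollection:
    """Iterable that creates a fresh NameIterator on every request."""

    def __init__(self, names):
        self._names = list(names)

    def __iter__(self):
        return NameIterator(self._names)  # a new, independent iterator


collection = NameCollection(['Mary', 'Jack', 'Emily'])

# Nested iteration over the iterable: each level gets its own iterator.
pairs_from_iterable = [(a, b) for a in collection for b in collection]

# Nested iteration over one shared iterator: the inner level drains it.
shared = iter(collection)
pairs_from_iterator = [(a, b) for a in shared for b in shared]

print(len(pairs_from_iterable))   # 9
print(pairs_from_iterator)        # [('Mary', 'Jack'), ('Mary', 'Emily')]
```

Keeping the position counter in the iterator rather than in the collection is what lets several iterations over the same collection run without disturbing each other.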

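Summary:

  • An iterator has an __iter__ method so that we can iterate over the iterator directly.
  • Each iterator can be iterated over only once; to iterate again, create a new iterator from the iterable.
  • Different iterators work independently of one another, so the same iterable can be iterated over in nested loops: at each level of iteration, Python creates a new iterator from the iterable.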

We have written a lot about the relationship between iterables and iterators, but how do the two relate to generators? We will leave that for the next article.
