Python’s Yield

This week, I’ve been playing with implementing an HTTP client in Python. Why Python? It seemed like a straightforward language for this sort of thing, and the fact that I don’t know it at all is a bonus learning opportunity! In any case, I make no claim to be any kind of authority on the language. I am certain there are better ways to do all of these things, but I don’t know them yet!

After a bit of googling and reading and copying and pasting, I ended up with the following methods.

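import re

def getLine(s):
    # Read one character at a time until the '\r' that ends the line.
    line = ''
    for l in iter(lambda: s.recv(1), '\r'):
        # Skip the '\n' left over from the previous line's '\r\n'.
        if l != '\n':
            line += l
    return line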

def getHeader(s):
    # Yield header lines until the blank line that ends the header.
    for line in iter(lambda: getLine(s), ''):
        yield line

def getContentLength(header):
    for line in header:
        if re.match(r"Content-Length: \d+$", line):
            # Everything after ": " is the length value.
            return int(line[line.find(": ") + 2:])
    return -1

def fetchlines(s):
    header = getHeader(s)
    contentLength = getContentLength(header)

# other methods to get message

Assume fetchlines is the entry point to the above methods. First, we get the header lines up to the first empty line; then we pull the content length from those lines. In later code (not shown), I get the body of the message based on the content length. But I had a strange problem. Here’s the header I was receiving:

HTTP/1.1 200 OK
Date: Wed, 26 Jan 2011 06:36:39 GMT
Server: Apache/2.2.4 (Unix) mod_ssl/2.2.4 OpenSSL/0.9.8a DAV/2 PHP/5.2.1
Last-Modified: Tue, 14 Dec 2010 17:23:19 GMT
ETag: "1104fb7-1ea9-4976213a4cfc0"
Accept-Ranges: bytes
Content-Length: 7849
Vary: Accept-Encoding
Content-Type: text/html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

I was expecting to pop everything down through the Content-Type line into header, then get the content length value in contentLength, and then, socket reading being a one-way operation, the message reading code would start right on in at the DOCTYPE line. Instead, I kept having a strange problem: every time I started in on reading the message body, I ended up at the Vary line instead! I wasn’t sure why: I was specifically looking to read down to the first blank line, and there was no blank line before Vary. I looked for hidden \r characters; nothing.

After some reading, I discovered the problem: yield. From looking at a few examples, I’d been under the impression that yield was essentially a nice, terse loop-returning-collection construct; i.e. it would run through, collect up all the results, and return them as a collection all by itself. I did remember hearing something about yield being *weird* in C#, but I hadn’t actually ever used it, and couldn’t remember what the problem was. Besides, this was Python, and it seemed to be working, except for this little issue. How magical.

Uh... no.

Turns out this is a case where it may have sorta looked like a duck, but it was something else entirely... Though I was treating the function’s return value like the list I believed it to be, getHeader wasn’t returning a list. It was returning a generator: an iterator over the body of a generator function. When you ask that iterator for an item, the function runs through its body up to the yield, hands the value back to whoever asked, saves its state, and just hangs out until you ask for the next item, at which point it resumes right after the yield. So the loop in getHeader wasn’t actually run when the method was initially called. Instead, each pass through that loop happens once for each pass through getContentLength’s loop... and since getContentLength exits once it finds the content length, it never pulls the rest of the items from the iterator. The Vary line never gets pulled in getHeader, the blank-line condition is never even hit, and the Vary line is still waiting to be read when it’s time to read the message body. Not what we want!
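Here is a minimal sketch of that behavior with no socket involved. Everything in it is made up for illustration: a plain list iterator called source stands in for the socket, get_header and get_content_length are simplified stand-ins for the methods above, and log just records which lines the generator actually pulled.

```python
def get_header(lines, log):
    # Hypothetical stand-in for getHeader: 'lines' plays the role of the socket.
    for line in lines:
        if line == '':
            return
        log.append(line)  # record that this line really was pulled
        yield line

def get_content_length(header):
    # Simplified stand-in for getContentLength: stops at the first match.
    for line in header:
        if line.startswith('Content-Length: '):
            return int(line.split(': ')[1])
    return -1

source = iter([
    'HTTP/1.1 200 OK',
    'Content-Length: 7849',
    'Vary: Accept-Encoding',
    '',
    '<!DOCTYPE html>',
])
log = []

header = get_header(source, log)
# Calling the generator function ran none of its body yet.
pulled_before = list(log)

length = get_content_length(header)
# Only the lines up to and including Content-Length were pulled.
pulled_after = list(log)

# Whoever reads from the source next starts at the Vary line, not the body.
next_unread = next(source)
```

After this runs, pulled_before is empty, pulled_after holds just the first two lines, and next_unread is the Vary line, which is the same symptom described above.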

A quick fix? Take out the yield and use a regular list. It adds an extra line or two over the yield, but this way, I’m sure the full header gets read before pulling the message body.

def getHeader(s):
    header = []
    for line in iter(lambda: getLine(s), ''):
        # Collect every header line so the whole header is read up front.
        header.append(line)
    return header
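Another option, which is my own assumption and not something from the original code, is to keep the generator and wrap the call in list(). That drains the generator on the spot, so the whole header is read before anything else touches the source. In this sketch a list iterator named source stands in for the socket, and get_header is a hypothetical stand-in for the yield version of getHeader.

```python
def get_header(lines):
    # Hypothetical stand-in for the yield version of getHeader;
    # iter(callable, '') stops at the first blank line, as in the original.
    for line in iter(lambda: next(lines), ''):
        yield line

source = iter([
    'HTTP/1.1 200 OK',
    'Content-Length: 7849',
    'Vary: Accept-Encoding',
    'Content-Type: text/html',
    '',
    '<!DOCTYPE html>',
])

# list() pulls every item from the generator immediately.
header = list(get_header(source))

# The header is now an ordinary list and can be searched at leisure.
content_length = -1
for line in header:
    if line.startswith('Content-Length: '):
        content_length = int(line.split(': ')[1])

# The blank line has been consumed, so the next read is the message body.
next_line = next(source)
```

The trade-off is the same as with the hand-built list: the laziness is gone, but so is the surprise.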

Lessons to take from this: 1) yield does NOT generate a list; it gives you an iterator backed by a generator, which is entirely different, and on a broader note, 2) copying and pasting code that you don’t understand can result in behavior you don’t understand, or even worse, behavior that you only think you understand!

Posted 01-27-2011 12:40 AM by Anne Epstein



Dan S wrote re: Python’s Yield
on 01-27-2011 1:29 PM

for line in iter(lambda:getLine(s), '')
