SEGFAULT for parse-xml(http:read-text())

Bug #867154 reported by Sorin Marian Nasoi
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Zorba
Fix Released
Critical
Sorin Marian Nasoi

Bug Description

import module namespace http = \"http://www.zorba-xquery.com/modules/http-client\";
fn:parse-xml(http:get-text(\"http://api.whitepages.com/find_person/1.0/?name=Maria%20Lurdes;api_key=06ea2f21cc15602b6a3e242e3225a81a\")[2])

From the stack trace attached it seems that the problem is in parse-xml: if this is not the case please reassign accordingly.

Revision history for this message
Sorin Marian Nasoi (sorin.marian.nasoi) wrote :

The file LastDynamicAnalysis_20110728-0844.log.tar.gz was added: valgrind for the test

Revision history for this message
Chris Hillery (ceejatec) wrote :

The memory management of objects in the http-client modules was a mess. It's still a mess, but at least now it is a mess that runs valgrind-clean.

The HttpResponseParser class is at the center of all the confusion. It has two very distinct lifecycles depend on what happens when its parse() method is called:

1. If it creates a non-StreamableString Item, then it can free everything immediately (the istream it parses, and the curl readbuffer that the istream wraps). Also, the client which created the HttpResponseParser and called parse() may then immediately free the HttpResponseParser.

2. On the other hand, if it creates a StreamableString, then the istream and the curl readbuffer must last as long as the StreamableString itself. Not only that, but the curl readbuffer has a pointer to the HttpResponseParser itself, which means that the HttpResponseParser object must also not be freed until the StreamableString is destroyed. And the StreamableString has a longer lifecycle than anything in the http-client module: longer than the HttpResponseIterator, in particular.

So here's what we did:

A. HttpResponseParser has a new flag, "self-contained". When called after parse(), it specifies whether the caller may now free the HttpResponseParser or not. This flag defaults to true and is set false when a StreamableString is created.

B. Also when a StreamableString is created (this is all in HttpResponseParser::createTextItem()), the StreamableString is given ownership of the istream and its readbuffer. The readbuffer, in turn, is given memory ownership of the HttpResponseParser. Thus, all three objects will be freed when the StreamableString is destroyed, and not before.

All of this actually allowed us to simplify some of the other object relationships:

C. For instance, HttpResponseIterator no longer has a pointer to a HttpResponseParser; the HttpResponseParser's lifecycle is managed entirely within HttpClientModule::general_evaluate().

D. Likewise, HttpResponseIterator no longer has a pointer to an istream; the istream and readbuffer lifecycles are managed entirely within HttpResponseParser.

E. HttpResponseIterator also no longer has a map from istream* to HttpResponseIterator*, which was in fact completely unused.

F. HttpResponseParser no longer has a reference to a HttpResponseIterator, which was in fact also completely unused.

I tested with the above query, as well as this one which does not cause a StreamableString to be created:

  import module namespace http = "http://www.zorba-xquery.com/modules/http-client";
  http:get-node("http://api.whitepages.com/find_person/1.0/?name=Maria%20Lurdes;api_key=06ea2f21cc15602b6a3e242e3225a81a")

Both now run valgrind-clean with no leaks and no using free'd memory.

Revision history for this message
Sorin Marian Nasoi (sorin.marian.nasoi) wrote :

Checked in r11485.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.