Access to freed memory in timezone handling causes crash

Bug #956843 reported by Robie Basak
This bug affects 1 person
Affects           Status        Importance  Assigned to  Milestone
Libical           Unknown       Unknown
libical (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

When I start evolution and then click the button at the bottom of the left pane to switch to the calendar, evolution crashes. If I right-click on the evolution icon in Unity and click on "Calendar" to go straight to the calendar, evolution also crashes. This reproduced on my machine in 31 out of 32 attempts and produced a variety of backtraces (attached, summary below). For privacy reasons, I regret that I am not willing to post the core dumps.

I have searched through previous bugs and found a number that I believe are the same problem. For example: bug 900534, bug 951201, bug 952368 and bug 954220.

Although I am still on Oneiric, the existing bugs suggest that the same crash is also present in Precise.

The problem seems to be that calendar items have a builtin_timezone field set that is not initialised. I have not yet managed to figure out where it is supposed to be initialised. For example:

#5 0x00007f2f479925a6 in e_calendar_item_draw_day_numbers (cells_y=45,
    cells_x=7, start_weekday=3, month=2, year=2012, col=0, row=0,
    cr=0x7f2f456ec9e0, calitem=0x7f2f4b154cd0, width=<optimized out>,
    height=<optimized out>) at e-calendar-item.c:1485
1485 today_tm = (*calitem->time_callback) (calitem, calitem->time_callback_data);
(gdb) p ((GnomeCalendar *)(((ECalShellView *)calitem->time_callback_data)->priv->cal_shell_content->priv->calendar))->priv->model->priv->zone->builtin_timezone
$49 = (icaltimezone *) 0x2000000020

I've found this in modules/calendar/e-cal-shell-backend.c which I think may be related:

        /* XXX Pre-load all built-in timezones in libical.
         *
         * Built-in time zones in libical 0.43 are loaded on demand,
         * but not in a thread-safe manner, resulting in a race when
         * multiple threads call icaltimezone_load_builtin_timezone()
         * on the same time zone. Until built-in time zone loading
         * in libical is made thread-safe, work around the issue by
         * loading all built-in time zones now, so libical's internal
         * time zone array will be fully populated before any threads
         * are spawned.
         */

As this bug is generally so difficult to reproduce, and I can currently reproduce it reliably, I will try to get to the bottom of it. Any help would be appreciated.

Here are my 31 crash stack frames:

#0 0x00007f1c0ca611ad in icaltimezone_load_builtin_timezone (
#0 0x00007f4a000007e1 in ?? ()
#0 0x00007fb420a511ad in icaltimezone_load_builtin_timezone (
#0 0x00007fbc08d94ac7 in icaltimezone_get_utc_offset_of_utc_time (
#0 0x00007fc23a37eac7 in icaltimezone_get_utc_offset_of_utc_time (
#0 0x00007feb4a4401ad in icaltimezone_load_builtin_timezone (
#0 __strcmp_sse42 () at ../sysdeps/x86_64/multiarch/strcmp.S:259
#0 icalarray_free (array=0x7f2000000001)
#0 icalcomponent_get_first_component (c=0xc8000006f3000000,
#0 icalcomponent_get_first_component (c=0xd00000009, kind=ICAL_ANY_COMPONENT)
#0 icaltimezone_compare_change_fn (elem1=0x7fff75af4f60, elem2=0x2)
#0 icaltimezone_ensure_coverage (zone=0x1, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x20, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x21, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x36, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x4008000000000000, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x6, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x7f1500000004, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x7f3000000001, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x7f9400000003, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x7fda00000004, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x8ffecfbcaff6a5e, end_year=2012)
#0 icaltimezone_ensure_coverage (zone=0x900001100000000, end_year=2012)
#0 pvl_head (L=0x42555347e0300100)

Backtraces of all of these are attached.

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: evolution 3.2.2-0ubuntu0.1
ProcVersionSignature: Ubuntu 3.0.0-16.28-generic 3.0.17
Uname: Linux 3.0.0-16-generic x86_64
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
Date: Fri Mar 16 10:25:02 2012
ProcEnviron:
 LC_COLLATE=C
 PATH=(custom, user)
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
SourcePackage: evolution
UpgradeStatus: Upgraded to oneiric on 2011-09-03 (194 days ago)

Revision history for this message
Robie Basak (racb) wrote :

This is not a trivial bug. It looks like a memory corruption issue.

What I have found so far:

An important data structure is an ECalShellView *cal_shell_view. The specific instance I care about (I suspect there may be only one) is as it appears in the first argument when e-cal-shell-view-private.c:e_cal_shell_view_private_constructed is called.

The bug is being triggered some time after ((GnomeCalendar *)cal_shell_view->priv->cal_shell_content->priv->calendar)->priv->model->priv->zone->builtin_timezone is corrupted.

Setting a watch to detect the point when it is corrupted gives me the example backtrace attached. I can't think why malloc would overwrite this memory area unless it is treating that memory as freed. I have tried setting breakpoints to catch something freeing
((GnomeCalendar *)cal_shell_view->priv->cal_shell_content->priv->calendar)->priv->model->priv->zone but haven't had any success there.

Any ideas?

Revision history for this message
Robie Basak (racb) wrote :

The root of the problem seems to be that builtin_timezone entries are kept as pointers into an "icalarray". But the icalarray is "expanded" by being moved to a new location and the old location freed, which makes the previously issued builtin_timezone pointers invalid.

==4519== Invalid read of size 8
==4519== at 0xE6FEB46: icaltimezone_get_utc_offset_of_utc_time (icaltimezone.c:981)
==4519== by 0xE6FE652: icaltimezone_convert_time (icaltimezone.c:794)
==4519== by 0xE6F9EE0: icaltime_from_timet_with_zone (icaltime.c:224)
==4519== by 0x18810169: tag_calendar_cb (tag-calendar.c:120)
==4519== by 0x932B1E7: process_instances (e-cal-client.c:1961)
==4519== by 0x932B314: generate_instances_for_object_got_objects_cb (e-cal-client.c:1992)
==4519== by 0x932A799: got_objects_for_uid_cb (e-cal-client.c:1711)
==4519== by 0x626CC16: g_simple_async_result_complete (in /usr/lib/x86_64-linux-gnu/libgio-2.0.so.0.3000.0)
==4519== by 0x5536C5B: finish_async_op (e-client.c:2281)
==4519== by 0x5536F55: async_result_ready_cb (e-client.c:2318)
==4519== by 0x626CC16: g_simple_async_result_complete (in /usr/lib/x86_64-linux-gnu/libgio-2.0.so.0.3000.0)
==4519== by 0x626CD28: ??? (in /usr/lib/x86_64-linux-gnu/libgio-2.0.so.0.3000.0)
==4519== Address 0x1c11c8d8 is 29,928 bytes inside a block of size 29,952 free'd
==4519== at 0x4C282E0: free (vg_replace_malloc.c:366)
==4519== by 0xE6E8E5E: icalarray_expand (icalarray.c:159)
==4519== by 0xE6E8BE8: icalarray_append (icalarray.c:89)
==4519== by 0xE6FF54A: icaltimezone_get_builtin_timezone (icaltimezone.c:1414)
==4519== by 0xE6FF8A6: icaltimezone_get_builtin_timezone_from_tzid (icaltimezone.c:1525)
==4519== by 0xE6EC18F: icalcomponent_get_datetime (icalcomponent.c:1566)
==4519== by 0xE6EC28A: icalcomponent_get_dtstart (icalcomponent.c:1594)
==4519== by 0x187FB7EA: ensure_dates_are_in_default_zone (gnome-cal.c:744)
==4519== by 0x187FBA21: dn_client_view_objects_added_cb (gnome-cal.c:773)
==4519== by 0x65560A3: g_closure_invoke (in /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.3000.0)
==4519== by 0x6568029: ??? (in /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.3000.0)
==4519== by 0x65716B0: g_signal_emit_valist (in /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.3000.0)
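
To illustrate the mechanism, here is a minimal sketch in C. Note that toy_array and its functions are hypothetical stand-ins that mimic the allocate, copy and free pattern described above; this is not libical's actual code. The helper element_moves_on_growth() is also my own invention and only checks whether the address of the first element changes once the array has to grow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for icalarray; not libical's actual code. */
typedef struct {
    size_t element_size;    /* bytes per element */
    size_t increment_size;  /* slots added on each expansion */
    size_t num_elements;    /* slots currently in use */
    size_t space_allocated; /* slots currently available */
    char *data;
} toy_array;

toy_array *toy_array_new(size_t element_size, size_t increment_size)
{
    toy_array *a = calloc(1, sizeof *a);
    assert(a != NULL);
    assert(element_size > 0 && increment_size > 0);
    a->element_size = element_size;
    a->increment_size = increment_size;
    return a;
}

/* Grow by allocating a new block, copying, and freeing the old block.
 * Any pointer into the old block is dangling from this point on. */
void toy_array_expand(toy_array *a)
{
    size_t new_space = a->space_allocated + a->increment_size;
    char *new_data = malloc(new_space * a->element_size);
    assert(new_data != NULL);
    if (a->data != NULL) {
        memcpy(new_data, a->data, a->num_elements * a->element_size);
        free(a->data);
    }
    a->data = new_data;
    a->space_allocated = new_space;
}

void toy_array_append(toy_array *a, const void *element)
{
    if (a->num_elements >= a->space_allocated)
        toy_array_expand(a);
    memcpy(a->data + a->num_elements * a->element_size,
           element, a->element_size);
    a->num_elements++;
}

/* Returns 1 if the address of element 0 changed while a total of
 * `appends` elements were appended, 0 if it stayed put. Addresses are
 * compared as integers, so the stale pointer is never dereferenced. */
int element_moves_on_growth(size_t increment, size_t appends)
{
    toy_array *a = toy_array_new(sizeof(int), increment);
    int value = 0;

    toy_array_append(a, &value);
    uintptr_t before = (uintptr_t)a->data; /* address a caller would hold */

    for (size_t i = 1; i < appends; i++) {
        value = (int)i;
        toy_array_append(a, &value);
    }
    uintptr_t after = (uintptr_t)a->data;

    free(a->data);
    free(a);
    return before != after;
}
```

With an increment of 32, element_moves_on_growth(32, 33) reports that element 0 has moved, because the 33rd append allocates a new block before freeing the old one. With 32 appends, or with an increment of 1024, nothing moves. A caller still holding the earlier address would be reading freed memory, which matches the valgrind report.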

summary: - Race condition in timezone handling causes crash
+ Access to freed memory in timezone handling causes crash
Revision history for this message
Robie Basak (racb) wrote :

I now think that this bug is in libical. I've emailed upstream here: http://sourceforge.net/mailarchive/message.php?msg_id=29069293

The following workaround works for me:

--- libical-0.48.orig/src/libical/icaltimezone.c 2011-12-13 17:08:18.000000000 +0000
+++ libical-0.48/src/libical/icaltimezone.c 2012-04-01 12:15:00.836064296 +0000
@@ -1656,7 +1656,7 @@
     icalerror_assert (builtin_timezones == NULL,
         "Parsing zones.tab file multiple times");

- builtin_timezones = icalarray_new (sizeof (icaltimezone), 32);
+ builtin_timezones = icalarray_new (sizeof (icaltimezone), 1024);

 #ifndef USE_BUILTIN_TZDATA
     filename_len = strlen ((char *) icaltzutil_get_zone_directory()) + strlen (ZONES_TAB_SYSTEM_FILENAME)

affects: evolution (Ubuntu) → libical (Ubuntu)
Revision history for this message
Robie Basak (racb) wrote :

Summary:

This is a heap corruption bug in libical. This has been acknowledged on the libical upstream development mailing list. Fixing this is not trivial, as the problem is architectural and crosses an API boundary.

The problem occurs when an array grows, causing it to be moved, which the API does not consider possible. This invalidates previous pointers issued by the API and causes later heap corruption.

A simple workaround is to make the array bigger to start with. It contains only timezone entries, and only a limited number of those are expected anyway. The attached patch increases the default size from 32 to 1024, which should be more than enough. The extra memory this would take is negligible.
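
As a rough check on why the larger initial size helps, the sketch below counts how often the storage block would be replaced while the array fills up. It assumes the array starts empty and grows by a fixed increment each time it is full. The function name relocations_needed() and the figure of 400 entries used afterwards are illustrative assumptions, not values taken from libical.

```c
#include <stddef.h>

/* Number of times the storage block is replaced (invalidating any
 * previously issued element pointers) while appending `count` elements
 * to an initially empty array that grows by `increment` slots whenever
 * it is full. Hypothetical helper; assumes fixed-increment growth. */
size_t relocations_needed(size_t count, size_t increment)
{
    if (count == 0 || increment == 0)
        return 0;

    /* Total allocations is ceil(count / increment). */
    size_t allocations = (count + increment - 1) / increment;

    /* The first allocation has nothing to move. */
    return allocations - 1;
}
```

Under these assumptions, 400 entries with an increment of 32 would mean 12 relocations, each of which could leave stale pointers behind, whereas an increment of 1024 means none. The same arithmetic shows why this is only a workaround: the block would move again as soon as the number of entries exceeded 1024.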

Impact: this bug causes evolution calendar to crash on my machine, and I suspect that the instability I've seen in evolution's calendar over the past year or so stems from this root cause. I think that bug 900534, bug 951201, bug 952368 and bug 954220 are also caused by this same issue. Applying this workaround should provide a significant improvement to evolution's stability.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libical - 0.48-1ubuntu1

---------------
libical (0.48-1ubuntu1) precise; urgency=low

  * debian/patches/fix_timezone_crash.patch: workaround to avoid heap
    corruption until upstream have a complete fix (LP: #956843).
 -- Robie Basak <email address hidden> Wed, 04 Apr 2012 12:32:45 +0000

Changed in libical (Ubuntu):
status: New → Fix Released