DOLFIN

Code review comment for lp:~jobh/dolfin/fast-array

fast-array
Merge into trunk

Revision history for this message

Anders Logg (logg) wrote on 2012-02-27:

On Mon, Feb 27, 2012 at 07:52:27PM -0000, Garth Wells wrote:
> On 27 February 2012 17:01, Anders Logg <email address hidden> wrote:
> > On Mon, Feb 27, 2012 at 03:32:04PM -0000, Johan Hake wrote:
> >> On 02/27/2012 03:01 PM, Garth Wells wrote:
> >> > On 27 February 2012 13:40, Anders Logg<email address hidden> wrote:
> >> >> On Fri, Feb 24, 2012 at 07:01:57PM -0000, Joachim Haga wrote:
> >> >>> I'd opine that the easiest would be to make array never own the data,
> >> >>> rather than always own the data. In that way, it can wrap anything.
> >> >>>
> >> >>> * TimeSeries returns (owned) Arrays in two places. These could
> >> >>> return references to its own data (const vector<double>&), that
> >> >>> would save a copy (they are currently copied twice, once into an
> >> >>> owned Array, then again in swig into a numpy array). But maybe the
> >> >>> amount of data returned is small, in which case a vector<double>
> >> >>> will be fine.
> >> >>
> >> >> The return of arrays from TimeSeries is typically not performance
> >> >> critical so a copy is fine (which it already is).
> >> >>
> >> >>> * Can't support resize() without ownership, so everywhere resize is
> >> >>> used on a reference argument should pass a vector<double>&
> >> >>> instead. I think this is a few places only.
> >> >>>
> >> >>> At that point, Array will be a plain wrapper of others' data.
> >> >>>
> >> >>> As for your (1), it might well be possible (keep a scratch array in
> >> >>> the GenericFunction class perhaps). But the thing is, the shared
> >> >>> version was no more safe. You can't keep a reference to an Array
> >> >>> because you don't know if it's shared or just wrapping data that
> >> >>> will be deleted from under you. It was "shared or borrowed" rather
> >> >>> than "owned or borrowed", only without the flag to tell the cases
> >> >>> apart.
> >> >>
> >> >> I'm not very happy with our current use of Array. We use std::vector
> >> >> in some places and Array in other places. I don't remember what the
> >> >> original intention was with Array. Is the purpose just to be able to
> >> >> pass shared data as arrays back and forth to Python?
> >> >
> >> > Yes.
> >> >
> >> >> Or should it be a
> >> >> general array class used throughout DOLFIN C++?
> >> >>
> >> >
> >> > No - there is no point in re-inventing std::vector. We should use
> >> > std::vector where possible. It's more flexible and can be used with
> >> > STL algorithms.
> >>
> >> Agree.
> >>
> >> >> If it is just for Python-wrapping, I think we should make it never own
> >> >> data as Joachim suggests.
> >>
> >> Agree. But we need to go over the interface and check that it makes
> >> sense everywhere. TimeSeries could probably just return a
> >> std::vector which is copied in the Python interface.
> >
> > Yes, definitely for TimeSeries. Perhaps we could settle for that
> > everytime we want to pass output data to Python?
> >
> > Then Array input would essentially be limited to a few particular
> > callbacks that need to be made efficient (GenericFunction::eval).
> >
>
> In the case of GenericFunction::eval we can use std::vector if, in
> the wrapping process, we can prevent resizing on the Python side.

Don't know if that's possible. Johan will know but he's out traveling.

> Maybe we can remove Array and support (a) std::vectors via copy and
> (b) references to a std::vector with resizing prevented.

If that is possible, we should keep the Array class around, perhaps in
a new subdirectory 'deprecated' to provide backwards compatibility
with the book. We've changed the eval interface maybe 100 times
already so we should try not to break user code.

--
Anders

> Garth
>
>
> >
> >
> >> > Memory has to be created somewhere in order to be used, and it has
> > to
> >> > be cleaned up.
> >> >
> >> > There are two issues here:
> >> >
> >> > 1) Dynamic allocation in order to interface with UFC. Using plain
> >> > pointers rather than smart pointers is likely faster, but I think that
> >> > it's still not a solution. We should try to avoid these allocations in
> >> > loops.
> >> >
> >> > 2) We need to re-visit the NumPy interface. If we can create NumPy
> >> > arrays of fixed length, then we can just use std::vector, and let the
> >> > NumPy array wrap the pointer&x[0]. Maybe Johan Hake can comment on
> >> > this?
> >>
> >> The problem is to pass a NumPy array to the C++ interface when it wants
> >> a std::vector&. Then we need to copy the values. A light weight Array
> >> class fixes that issue.
> >>
> >> Wrapping a std::vector& into a Numpy array is possible without copying
> >> by just passing the pointer. This would be similar to what we do today
> >> with the Array.
> >>
> >> Johan
> >>
> >> > Garth
> >> >
> >> >>
> >> >>
> >> >>> -j.
> >> >>>
> >> >>> On 24 February 2012 19:37, Garth Wells<email address hidden> wrote:
> >> >>>
> >> >>>> I'm not keen on ownership flags - they were a real mess before and it's
> >> >>>> been a lot better since we go rid of them.
> >> >>>>
> >> >>>> Would be it be possible to (1) re-use the Array so we avoid dynamic
> >> >>>> allocation or (2) have an Array-like structure that always owns the data?
> >> >>>> The class Array was introduced to better interface with Python, but I
> >> >>>> recall that we later became aware that the Numpy interface has has a flag
> >> >>>> to prevent resizing, which would make typemaps more robust.
> >> >>>>
> >> >>>
> >> >>
> >> >
> >>
> >>
> >
>

On Mon, Feb 27, 2012 at 07:52:27PM -0000, Garth Wells wrote:
> On 27 February 2012 17:01, Anders Logg <logg@simula.no> wrote:
> > On Mon, Feb 27, 2012 at 03:32:04PM -0000, Johan Hake wrote:
> >> On 02/27/2012 03:01 PM, Garth Wells wrote:
> >> > On 27 February 2012 13:40, Anders Logg<logg@simula.no>  wrote:
> >> >> On Fri, Feb 24, 2012 at 07:01:57PM -0000, Joachim Haga wrote:
> >> >>> I'd opine that the easiest would be to make array never own the data,
> >> >>> rather than always own the data. In that way, it can wrap anything.
> >> >>>
> >> >>> * TimeSeries returns (owned) Arrays in two places. These could
> >> >>> return references to its own data (const vector<double>&), that
> >> >>> would save a copy (they are currently copied twice, once into an
> >> >>> owned Array, then again in swig into a numpy array). But maybe the
> >> >>> amount of data returned is small, in which case a vector<double>
> >> >>> will be fine.
> >> >>
> >> >> The return of arrays from TimeSeries is typically not performance
> >> >> critical so a copy is fine (which it already is).
> >> >>
> >> >>> * Can't support resize() without ownership, so everywhere resize is
> >> >>> used on a reference argument should pass a vector<double>&
> >> >>> instead. I think this is a few places only.
> >> >>>
> >> >>> At that point, Array will be a plain wrapper of others' data.
> >> >>>
> >> >>> As for your (1), it might well be possible (keep a scratch array in
> >> >>> the GenericFunction class perhaps). But the thing is, the shared
> >> >>> version was no more safe. You can't keep a reference to an Array
> >> >>> because you don't know if it's shared or just wrapping data that
> >> >>> will be deleted from under you. It was "shared or borrowed" rather
> >> >>> than "owned or borrowed", only without the flag to tell the cases
> >> >>> apart.
> >> >>
> >> >> I'm not very happy with our current use of Array. We use std::vector
> >> >> in some places and Array in other places. I don't remember what the
> >> >> original intention was with Array. Is the purpose just to be able to
> >> >> pass shared data as arrays back and forth to Python?
> >> >
> >> > Yes.
> >> >
> >> >> Or should it be a
> >> >> general array class used throughout DOLFIN C++?
> >> >>
> >> >
> >> > No - there is no point in re-inventing std::vector. We should use
> >> > std::vector where possible. It's more flexible and can be used with
> >> > STL algorithms.
> >>
> >> Agree.
> >>
> >> >> If it is just for Python-wrapping, I think we should make it never own
> >> >> data as Joachim suggests.
> >>
> >> Agree. But we need to go over the interface and check that it makes
> >> sense everywhere. TimeSeries could probably just return a
> >> std::vector which is copied in the Python interface.
> >
> > Yes, definitely for TimeSeries. Perhaps we could settle for that
> > everytime we want to pass output data to Python?
> >
> > Then Array input would essentially be limited to a few particular
> > callbacks that need to be made efficient (GenericFunction::eval).
> >
>
> In the case of GenericFunction::eval we can use std::vector if, in
> the wrapping process, we can prevent resizing on the Python side.

Don't know if that's possible. Johan will know but he's out traveling.

> Maybe we can remove Array and support (a) std::vectors via copy and
> (b) references to a std::vector with resizing prevented.

--
Anders

> Garth
>
>
> >
> >
> >> > Memory has to be created somewhere in order to be used, and it has
> > to
> >> > be cleaned up.
> >> >
> >> > There are two issues here:
> >> >
> >> > 1) Dynamic allocation in order to interface with UFC. Using plain
> >> > pointers rather than smart pointers is likely faster, but I think that
> >> > it's still not a solution. We should try to avoid these allocations in
> >> > loops.
> >> >
> >> > 2) We need to re-visit the NumPy interface. If we can create NumPy
> >> > arrays of fixed length, then we can just use std::vector, and let the
> >> > NumPy array wrap the pointer&x[0]. Maybe Johan Hake can comment on
> >> > this?
> >>
> >> The problem is to pass a NumPy array to the C++ interface when it wants
> >> a std::vector&. Then we need to copy the values. A light weight Array
> >> class fixes that issue.
> >>
> >> Wrapping a std::vector& into a Numpy array is possible without copying
> >> by just passing the pointer. This would be similar to what we do today
> >> with the Array.
> >>
> >> Johan
> >>
> >> > Garth
> >> >
> >> >>
> >> >>
> >> >>> -j.
> >> >>>
> >> >>> On 24 February 2012 19:37, Garth Wells<gnw20@cam.ac.uk>  wrote:
> >> >>>
> >> >>>> I'm not keen on ownership flags - they were a real mess before and it's
> >> >>>> been a lot better since we go rid of them.
> >> >>>>
> >> >>>> Would be it be possible to (1) re-use the Array so we avoid dynamic
> >> >>>> allocation or (2) have an Array-like structure that always owns the data?
> >> >>>> The class Array was introduced to better interface with Python, but I
> >> >>>> recall that we later became aware that the Numpy interface has has a flag
> >> >>>> to prevent resizing, which would make typemaps more robust.
> >> >>>>
> >> >>>
> >> >>
> >> >
> >>
> >>
> >
>

« Back to merge proposal