Merge lp:~jobh/dolfin/symmetric-assemble into lp:~fenics-core/dolfin/trunk

Proposed by Joachim Haga
Status: Merged
Merged at revision: 6625
Proposed branch: lp:~jobh/dolfin/symmetric-assemble
Merge into: lp:~fenics-core/dolfin/trunk
Diff against target: 1311 lines (+1080/-16)
12 files modified
dolfin/common/utils.h (+20/-1)
dolfin/fem/DirichletBC.cpp (+10/-1)
dolfin/fem/DirichletBC.h (+5/-1)
dolfin/fem/SymmetricAssembler.cpp (+588/-0)
dolfin/fem/SymmetricAssembler.h (+82/-0)
dolfin/fem/SystemAssembler.cpp (+1/-1)
dolfin/fem/assemble.cpp (+36/-1)
dolfin/fem/assemble.h (+33/-1)
dolfin/fem/dolfin_fem.h (+1/-0)
site-packages/dolfin/fem/assembling.py (+122/-9)
test/unit/fem/python/SymmetricAssembler.py (+181/-0)
test/unit/test.py (+1/-1)
To merge this branch: bzr merge lp:~jobh/dolfin/symmetric-assemble
Reviewer: Garth Wells (Approve)
Review via email: mp+91107@code.launchpad.net

Description of the change

Implement new symmetric assembler, for use with multiple RHS vectors.

This implementation returns two matrices: the symmetric part and the non-symmetric
part of A (the latter arising from the application of Dirichlet BCs).
Then A = S + N, and since the only non-zero columns of N are those of the BC dofs,
Ax = b implies Sx = b - Nb.

Simple example (python):
    A, An = symmetric_assemble(a, bcs=bc)
    b = assemble(L, bcs=bc, symmetric_mod=An)
    solve(A, x, b)
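As a sanity check of that identity, here is an illustrative NumPy sketch (not DOLFIN code; the 3x3 matrix K, the BC value g and the choice of dof 0 as the Dirichlet dof are all invented for the example):

```python
import numpy as np

# Invented symmetric stiffness matrix and RHS; Dirichlet BC x[0] = g.
K = np.array([[4., 1., 0.],
              [1., 4., 1.],
              [0., 1., 4.]])
f = np.array([1., 2., 3.])
g = 2.0

# Plain assembly with row-only BC application: BC row replaced by the
# identity row, BC value inserted into b.
A = K.copy()
A[0, :] = [1., 0., 0.]
b = f.copy()
b[0] = g

# Symmetric part S: BC row AND column zeroed, 1.0 on the diagonal.
S = K.copy()
S[0, :] = 0.0
S[:, 0] = 0.0
S[0, 0] = 1.0

# Non-symmetric part N; its only non-zero column is the BC column.
N = A - S

x_unsym = np.linalg.solve(A, b)        # solve with unsymmetric A
x_sym = np.linalg.solve(S, b - N @ b)  # solve with symmetric S
assert np.allclose(x_unsym, x_sym)     # A = S + N, Ax = b  =>  Sx = b - Nb
```

The simple identity A = S + N is what makes the result easy to verify in tests.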

I created several prototypes for this assembler before finally deciding on this interface. Since the user may (in an optimised solver) wish to create several partial coefficient matrices and combine them into the final LHS, it is advantageous to let N be a full-featured GenericMatrix rather than a specialised data structure. The simple identity A=S+N is also helpful in testing.

At the moment, the sparsity pattern of N is the same as for S. If this becomes a problem, it's possible to make N much sparser without changing the interface.

I'll propose this change for backport into the 1.0 branch later, since it should not impact current users. In trunk, it should be possible to remove some interfaces which this assembler can replace. At least DirichletBC::zero_column (I think cbc.block is the only user).

Revision history for this message
Garth Wells (garth-wells) wrote :

I'm not convinced that this is the best approach. I'm inclined towards supporting unassembled matrices, and using this to store the cell tensors.

With the code, I'd like to see more verbose variable naming. It's too terse for me.

I wouldn't backport this. It's a feature, not a bug. I would prefer to make new releases to get new functionality out.

Revision history for this message
Johan Hake (johan-hake) wrote :

On Wednesday February 1 2012 16:48:24 Garth Wells wrote:
> I'm not convinced that this is the best approach. I'm inclined towards
> supporting unassembled matrices, and using this to store the cell tensors.

What exactly do you mean by unassembled matrices, and how have you
envisioned the interface?

I think this is a useful approach. As Joachim says, it gives the user some
nice options. How is the performance compared to the present system assembler?

> With the code, I'd like to see more verbose variable naming. It's too terse
> for me.

To me it doesn't look much more terse than the existing code.

> I wouldn't backport this. It's a feature, not a bug. I would prefer
> to make new releases to get new functionality out.

I agree.

Johan

Revision history for this message
Kent-Andre Mardal (kent-and) wrote :

On 1 February 2012 16:48, Garth Wells <email address hidden> wrote:

> I'm not convinced that this is the best approach. I'm inclined towards
> supporting unassembled matrices, and using this to store the cell tensors.
>

I think the current design/code is nice in the sense that it is
user-friendly with std matrices for representing the boundary condition.
It can be used when combining different matrices for efficiency (like we
do in our flow computations). It is exactly what we need.

>
> With the code, I'd like to see more verbose variable naming. It's too
> terse for me.
>
> I wouldn't backport this. It's a feature, not a bug. I would prefer
> to make new releases to get new functionality out.
>

ok

> --
> https://code.launchpad.net/~jobh/dolfin/symmetric-assemble/+merge/91107
> Your team DOLFIN Core Team is requested to review the proposed merge of
> lp:~jobh/dolfin/symmetric-assemble into lp:dolfin.
>

Revision history for this message
Anders Logg (logg) wrote :

On a side note: unassembled matrices (for use with PETSc CUSP) are
implemented in Fredrik's GPU branch.

--
Anders

On Wed, Feb 01, 2012 at 03:48:24PM -0000, Garth Wells wrote:
> I'm not convinced that this is the best approach. I'm inclined towards supporting unassembled matrices, and using this to store the cell tensors.
>
> With the code, I'd like to see more verbose variable naming. It's too terse for me.
>
> I wouldn't backport this. It's a feature, not a bug. I would prefer to make new releases to get new functionality out.

Revision history for this message
Anders Logg (logg) wrote :

On Wed, Feb 01, 2012 at 03:48:24PM -0000, Garth Wells wrote:
> I'm not convinced that this is the best approach. I'm inclined towards supporting unassembled matrices, and using this to store the cell tensors.
>
> With the code, I'd like to see more verbose variable naming. It's too terse for me.
>
> I wouldn't backport this. It's a feature, not a bug. I would prefer to make new releases to get new functionality out.

I haven't looked at the patch yet so I can't comment on it.

--
Anders

Revision history for this message
Joachim Haga (jobh) wrote :

I don't know what advantages unassembled matrices will bring, so I can't
comment on that directly. The new interface is quite consistent with the
current ones though, and brings needed functionality.

With regards to variable naming, it can be changed once the interface is
decided on. However, it is now mostly duplicated from Assemble.cpp. It's
probably good to not diverge these two too much so that they're easy to
keep in sync.

-j.

Revision history for this message
Joachim Haga (jobh) wrote :

> How is the performance compared to the present system assembler?

Similar. The cost compared to regular assemble is slightly higher, owing to
the extra global tensor and the extra matvec product. At a guess, up to 15%
for simple forms without tensor reuse. I have earlier measured
assemble_system to be a little bit slower than regular assemble, so also
comparable.

-j.

Revision history for this message
Garth Wells (garth-wells) wrote :

On 1 February 2012 18:19, Joachim Haga <email address hidden> wrote:
> I don't know what advantages unassembled matrices will bring, so I can't
> comment on that directly. The new interface is quite consistent with the
> current ones though, and brings needed functionality.
>

Simplicity and efficiency. An assembled matrix is not required to
apply bcs. Only the matrix for a given cell is required to apply bcs
to the RHS. With an unassembled matrix, the code would be very simple
and would not require any extra parallel communication.

> With regards to variable naming, it can be changed once the interface is
> decided on. However, it is now mostly duplicated from Assemble.cpp. It's
> probably good to not diverge these two too much so that they're easy to
> keep in sync.
>

I don't like terse names like

  Impl, lrow_is_bc, t idx, lcol, n_entries (we generally use num_foo
in DOLFIN), etc

If the names are cleaned up and the code formatting is made consistent
with the DOLFIN style, I don't have any strong objection to merging,
but I would prefer to use unassembled matrices. A concern is that the
symmetric assembler code is in need of simplification, and we've made
a few limited attempts in this direction, but the patch is making it
more complex before it's been simplified as much as is reasonably
possible.

Garth


Revision history for this message
Joachim Haga (jobh) wrote :

On 1 Feb 2012 19:50, "Garth Wells" <email address hidden> wrote:
>
> I don't like terse names like
>
> Impl, lrow_is_bc, t idx, lcol, n_entries (we generally use num_foo
> in DOLFIN), etc
>
> If the names are cleaned up and the code formatting is made consistent
> with the DOLFIN style, I don't have any strong objection to merging,
> but I would prefer to use unassembled matrices. A concern is that the
> symmetric assembler code is in need of simplification, and we've made
> a few limited attempts in this direction, but the patch is making it
> more complex before it's been simplified as much as is reasonably
> possible.

Sounds good! I'll go through the parts that are new to improve naming. To
be clear, the old symmetric assemble code is not much used by the new code,
it's the standard assemble that's reused with the addition of a method to
set conditions on a cell matrix. So the old symmetric assembler can be
simplified or even removed if so desired.

It's possible to merge regular and new assembler too, with a flag for
symmetric assembly, to avoid code duplication, but it's not a completely
natural fit I think.

-j.

Revision history for this message
Anders Logg (logg) wrote :

On Wed, Feb 01, 2012 at 08:19:26PM -0000, Joachim Haga wrote:
> On 1 Feb 2012 19:50, "Garth Wells" <email address hidden> wrote:
> >
> > I don't like terse names like
> >
> > Impl, lrow_is_bc, t idx, lcol, n_entries (we generally use num_foo
> > in DOLFIN), etc
> >
> > If the names are cleaned up and the code formatting is made consistent
> > with the DOLFIN style, I don't have any strong objection to merging,
> > but I would prefer to use unassembled matrices. A concern is that the
> > symmetric assembler code is in need of simplification, and we've made
> > a few limited attempts in this direction, but the patch is making it
> > more complex before it's been simplified as much as is reasonably
> > possible.
>
> Sounds good! I'll go through the parts that are new to improve naming. To
> be clear, the old symmetric assemble code is not much used by the new code,
> it's the standard assemble that's reused with the addition of a method to
> set conditions on a cell matrix. So the old symmetric assembler can be
> simplified or even removed if so desired.
>
> It's possible to merge regular and new assembler too, with a flag for
> symmetric assembly, to avoid code duplication, but it's not a completely
> natural fit I think.

I think it would be very desirable to get rid of as much code
duplication as possible. Now we have 3 different assemblers: regular,
system and multicore. I'd like to have one single assembler. If for
some reason this becomes inefficient (bloated), we can look at ways to
handle that (templates, code generation).

--
Anders

Revision history for this message
Joachim Haga (jobh) wrote :

> > It's possible to merge regular and new assembler too, with a flag for
> > symmetric assembly, to avoid code duplication, but it's not a completely
> > natural fit I think.
>
> I think it would be very desirable to get rid of as much code
> duplication as possible. Now we have 3 different assemblers: regular,
> system and multicore. I'd like to have one single assembler. If for
> some reason this becomes inefficient (bloated), we can look at ways to
> handle that (templates, code generation).

Regular, system and new can become one without too much trouble, I think.
Not sure about multicore, but can have a look.

Mind if I merge the current one (after fixing style issues) first?

J.

Revision history for this message
Johan Hake (johan-hake) wrote :

On Wednesday February 1 2012 21:40:29 Joachim Haga wrote:
> > > It's possible to merge regular and new assembler too, with a flag for
> > > symmetric assembly, to avoid code duplication, but it's not a
> > > completely natural fit I think.
> >
> > I think it would be very desirable to get rid of as much code
> > duplication as possible. Now we have 3 different assemblers: regular,
> > system and multicore. I'd like to have one single assembler. If for
> > some reason this becomes inefficient (bloated), we can look at ways to
> > handle that (templates, code generation).

Agree!

> Regular, system and new can become one without too much trouble,

Good!

> I think.
> Not sure about multicore, but can have a look.

This is similar to the present AssembleSystem, in that it essentially iterates
over cells and then facets in each cell.

AFAIK, assembly over interior facets has not got as much love as the other
integrals. There are also some code duplications which can be removed (I
think). We have assemble_cells and assemble_cells_and_exterior_facets. One can
probably just have the latter.

> Mind if I merge the current one (after fixing style issues) first?

Not for me.

Johan

> J.

Revision history for this message
Joachim Haga (jobh) wrote :

>> Not sure about multicore, but can have a look.
>
> This is similar to the present AssembleSystem, in that it essentially iterates
> over cells and then facets in each cell.
>
> AFAIK, assemble over interior faces have not got as much love as the other
> integrals. There are also some code dublications which can be removed (I
> think). We have assemble_cells and assemble_cells_and_exterior_facets. One can
> probably just have the latter.

Ok. I assume there's a reason that multicore does it in this way,
meaning that if they are combined then it's the multicore version that
"wins". Other than that, I guess it's just a matter of specifying a
single thread and (if necessary) shorting out the mesh coloring in the
single-thread case.

But OpenMPAssembler hasn't replaced Assembler, so I guess there are
problems with this approach. Performance?

-j.

Revision history for this message
Garth Wells (garth-wells) wrote :

On 2 February 2012 10:01, Joachim Haga <email address hidden> wrote:
>>> Not sure about multicore, but can have a look.
>>
>> This is similar to the present AssembleSystem, in that it essentially iterates
>> over cells and then facets in each cell.
>>
>> AFAIK, assemble over interior faces have not got as much love as the other
>> integrals. There are also some code dublications which can be removed (I
>> think). We have assemble_cells and assemble_cells_and_exterior_facets. One can
>> probably just have the latter.
>
> Ok. I assume there's a reason that multicore does it in this way,
> meaning that if they are combined then it's the multicore version that
> "wins". Other than that, I guess it's just a matter of specifying a
> single thread and (if necessary) shorting out the mesh coloring in the
> single-thread case.
>
> But OpenMPAssembler hasn't replaced Assembler, so I guess there are
> problems with this approach. Performance?
>

I don't believe that it's possible to have one Assembler without
compromising on performance.

OpenMPAssembler is slower than Assembler for one thread because it
requires a somewhat different loop over cells. Also, at least when I
last worked on OpenMPAssembler, it didn't support as many cases as
Assembler. Johan H has probably bridged most/all of the gap in the
meantime.

OpenMPAssembler needs more testing before removing the 'experimental' tag.

I think that the performance focus should be on SystemAssembler (with
the possibility of just assembling the LHS or RHS). Assembler and
OpenMPAssembler could be merged for now. The assembler code would be
simpler if a number of the 'if' statements could be removed. Perhaps
the sub-domains code should be moved to the domain (i.e., the Mesh),
and the assemblers can just loop over sub-domains.

Garth


lp:~jobh/dolfin/symmetric-assemble updated
6538. By Joachim Haga

Style issues following merge request

Revision history for this message
Joachim Haga (jobh) wrote :

I've fixed (or at least improved) the style issues, as requested.

I realise that the private-implementation (pImpl) style is not used much in dolfin, but I think it saves a lot of clutter in this case. Besides, the semi-public nature of methods like e.g. Assembler::assemble_exterior_facets forces a particular implementation and loop ordering, which complicates matters if the different implementations are to be merged.

Revision history for this message
Joachim Haga (jobh) wrote :

> OpenMPAssembler is slower than Assembler for one thread because it
> requires a somewhat different loop over cells. Also, at least when I
> last worked on OpenMPAssembler, it didn't support as many cases as
> Assembler. Johan H has probably bridged most/all of the gap in the
> mean time.
>
> OpenMPAssembler needs more testing before removing the 'experimental' tag.
>
> I think that the performance focus should be on SystemAssembler (with
> the possibility of just assembling the LHS or RHS). Assembler and
> OpenMPAssembler could be merged for now. The assembler code  would be
> simpler if a number of the 'if' statements could be removed. Perhaps
> the sub-domains code should be moved to the domain (i.e., the Mesh),
> and the assemblers can just loop over sub-domains.

I'm confused. I thought you said earlier that SystemAssembler was
overly complex (and hence not desirable to extend), and Johan
mentioned that SystemAssembler/OpenMPAssembler uses similar
loop-orderings (hence, potentially, with similar single-thread
performance).

Anyway, I'll have a look at it later. As I mentioned, SystemAssembler
is a bit slower now in my tests, but I guess that would change if
there are many facet integrals since then the per-cell assemble starts
to pay off.

-j.

Revision history for this message
Garth Wells (garth-wells) wrote :

On 2 February 2012 13:07, Joachim Haga <email address hidden> wrote:
>> OpenMPAssembler is slower than Assembler for one thread because it
>> requires a somewhat different loop over cells. Also, at least when I
>> last worked on OpenMPAssembler, it didn't support as many cases as
>> Assembler. Johan H has probably bridged most/all of the gap in the
>> mean time.
>>
>> OpenMPAssembler needs more testing before removing the 'experimental' tag.
>>
>> I think that the performance focus should be on SystemAssembler (with
>> the possibility of just assembling the LHS or RHS). Assembler and
>> OpenMPAssembler could be merged for now. The assembler code  would be
>> simpler if a number of the 'if' statements could be removed. Perhaps
>> the sub-domains code should be moved to the domain (i.e., the Mesh),
>> and the assemblers can just loop over sub-domains.
>
> I'm confused. I thought you said earlier that SystemAssembler was
> overly complex (and hence not desireable to extend),

I would like to see it simplified before being extended.

> and Johan
> mentioned that SystemAssembler/OpenMPAssembler uses similar
> loop-orderings (hence, potentially, with similar single-thread
> performance).
>

OpenMPAssembler has an outer loop over colours, and an inner loop over
cells. The other assemblers have one loop over cells.

> Anyway, I'll have a look at it later. As I mentioned, SystemAssembler
> is a bit slower now in my tests,

The difference was always very small in my tests. The original
SystemAssembler was faster, but making it more general has led to a
performance drop (not a huge drop). It may be possible to improve the
performance of the bc searches in the assembler.

Another issue with SystemAssembler is that it is not robust in
parallel with the faster bc methods. The problem is that a partition
can have vertices on a Dirichlet boundary but no facets. The result
is that bcs are not applied when they should be. I would like to see
this worked out before extending SystemAssembler.

Garth


Revision history for this message
Joachim Haga (jobh) wrote :

>
> > I'm confused. I thought you said earlier that SystemAssembler was
> > overly complex (and hence not desireable to extend),
>
> I would like to see it simplified before being extended.

I see, ok!

> OpenMPAssembler has an outer loop over colours, and an inner loop over
> cells. The other assemblers have one loop over cells.

Oh, the outer loop won't be a problem, all cells can be set to the same
color in sequential runs.

> Another issue with SystemAssembler is that it is not robust in
> parallel with the faster bc methods.
>

Noted! The only thing I'm missing now is... why, with all these problems,
do you still recommend SystemAssembler as the path forward? ;)

-j.

Revision history for this message
Garth Wells (garth-wells) wrote :

On 2 February 2012 13:46, Joachim Haga <email address hidden> wrote:
>>
>> > I'm confused. I thought you said earlier that SystemAssembler was
>> > overly complex (and hence not desireable to extend),
>>
>> I would like to see it simplified before being extended.
>
>
> I see, ok!
>
>
>> OpenMPAssembler has an outer loop over colours, and an inner loop over
>> cells. The other assemblers have one loop over cells.
>
>
> Oh, the outer loop won't be a problem, all cells can be set to the same
> color in sequential runs.
>

Yes, but there are some subtle issues that need to be taken care of.
Assembler uses a Mesh iterator, but we can't use the iterators in
OpenMPAssembler, so we loop with an integer, get the cell index and
then create a cell. We might want to integrate colouring more deeply
in Mesh, which would make things easier.

>
>> Another issue with SystemAssembler is that it is not robust in
>> parallel with the faster bc methods.
>>
>
> Noted! The only thing I'm missing now is... why, with all these problems,
> do you still recommend SystemAssembler as the path forward? ;)
>

Symmetry!

Garth


Revision history for this message
Joachim Haga (jobh) wrote :

>
> > Noted! The only thing I'm missing now is... why, with all these problems,
> > do you still recommend SystemAssembler as the path forward? ;)
>
> Symmetry!

Aha! I think you need to accept my merge request, so that we get symmetry
without SystemAssembler!

:-)

-j.

Revision history for this message
Anders Logg (logg) wrote :

On Thu, Feb 02, 2012 at 10:20:13AM -0000, Garth Wells wrote:
> On 2 February 2012 10:01, Joachim Haga <email address hidden> wrote:
> >>> Not sure about multicore, but can have a look.
> >>
> >> This is similar to the present AssembleSystem, in that it essentially iterates
> >> over cells and then facets in each cell.
> >>
> >> AFAIK, assemble over interior faces have not got as much love as the other
> >> integrals. There are also some code dublications which can be removed (I
> >> think). We have assemble_cells and assemble_cells_and_exterior_facets. One can
> >> probably just have the latter.
> >
> > Ok. I assume there's a reason that multicore does it in this way,
> > meaning that if they are combined then it's the multicore version that
> > "wins". Other than that, I guess it's just a matter of specifying a
> > single thread and (if necessary) shorting out the mesh coloring in the
> > single-thread case.
> >
> > But OpenMPAssembler hasn't replaced Assembler, so I guess there are
> > problems with this approach. Performance?
> >
>
> I don't believe that it's possible to have one Assembler without
> compromising on performance.

Perhaps not, but it should be possible to share much more of the code.

> OpenMPAssembler is slower than Assembler for one thread because it
> requires a somewhat different loop over cells. Also, at least when I
> last worked on OpenMPAssembler, it didn't support as many cases as
> Assembler. Johan H has probably bridged most/all of the gap in the
> mean time.
>
> OpenMPAssembler needs more testing before removing the 'experimental' tag.

Agree.

> I think that the performance focus should be on SystemAssembler (with
> the possibility of just assembling the LHS or RHS). Assembler and
> OpenMPAssembler could be merged for now. The assembler code would be
> simpler if a number of the 'if' statements could be removed. Perhaps
> the sub-domains code should be moved to the domain (i.e., the Mesh),
> and the assemblers can just loop over sub-domains.

Yes, it should be possible to move that logic elsewhere. It probably
should not go into the mesh, since it is possible to assemble with the
subdomain specification separate from the mesh (which is the reason
for the somewhat complex logic). It might be possible to just move it
to a separate utility function, or a new data structure may be needed.

--
Anders

Revision history for this message
Anders Logg (logg) wrote :

On Thu, Feb 02, 2012 at 02:01:22PM -0000, Garth Wells wrote:
> On 2 February 2012 13:46, Joachim Haga <email address hidden> wrote:
> >>
> >> > I'm confused. I thought you said earlier that SystemAssembler was
> >> > overly complex (and hence not desireable to extend),
> >>
> >> I would like to see it simplified before being extended.
> >
> >
> > I see, ok!
> >
> >
> >> OpenMPAssembler has an outer loop over colours, and an inner loop over
> >> cells. The other assemblers have one loop over cells.
> >
> >
> > Oh, the outer loop won't be a problem, all cells can be set to the same
> > color in sequential runs.
> >
>
> Yes, but there are some subtle issues that need to be taken care off.
> Assembler uses a Mesh iterator, but we can't use the iterators in
> OpenMPAssembler, so we loop with an integer, get the cell index and
> then create a cell. We might want to integrate colouring more deeply
> in Mesh, which would make things easier.

We could either make a CellIterator that takes an optional color
argument, or we could loop over integers also in the regular
assembler.

--
Anders


lp:~jobh/dolfin/symmetric-assemble updated
6539. By Joachim Haga

Accept arbitrary iterables in addition to lists for bcs argument to assemble

Revision history for this message
Joachim Haga (jobh) wrote :

On 2 February 2012 15:01, Garth Wells <email address hidden> wrote:

> On 2 February 2012 13:46, Joachim Haga <email address hidden> wrote:
> >> Another issue with SystemAssembler is that it is not robust in
> >> parallel with the faster bc methods.
> >
> > Noted!

 I've been thinking a bit about this, and as far as I can understand this
can only happen if DirichletBC::get_boundary_values() does not return all
(local) Dirichlet vertices for a given partition, is that right? If so,
maybe the warning that is printed by SystemAssemble should be moved there?

Anyway, it was a useful exercise, because I also figured out two problems
with the new assembler under similar circumstances. The first is serious
(wrong diagonal value), but also easy to fix -- I'll do it first thing on
Monday, and push to this branch.

The second, if I understand correctly the SystemAssemble problem you
mention, is that the new assembler will fail to fully symmetricise the
matrix when the local boundary-vertex records are incomplete. This will
require fixing DirichletBC::get_boundary_values, which is harder. However,
the consequences are not as bad as they are for SystemAssembler, since the
matrix is complete (just not totally symmetric).

-j.

lp:~jobh/dolfin/symmetric-assemble updated
6540. By Joachim Haga

More style changes for SymmetricAssembler

6541. By Joachim Haga

Fix for parallel SymmetricAssembler.

It now uses GenericMatrix::ident() to set BC rows, to ensure that the diagonal is 1.0.
If non-robust methods (topological etc) are used to compute BCs, the assembled matrix
may not be completely symmetric.

Revision history for this message
Joachim Haga (jobh) wrote :

I've pushed a change which should fix SymmetricAssembler in parallel. It uses GenericMatrix::ident to set the diagonal values, which should always work (it does, however, require a finalized matrix -- so it's not possible to return an unfinalized one now).

But maybe it would be better to just enforce a more robust boundary search. Even pointwise search should be significantly cheaper than assemble. (But creating a DirichletBC from a subdomain is expensive -- something like 10x the time of assemble.)

Revision history for this message
Garth Wells (garth-wells) wrote :

On 6 February 2012 10:24, Joachim Haga <email address hidden> wrote:
> I've pushed a change which should fix SymmetricAssembler in parallel. It uses GenericMatrix::ident to set the diagonal values, which should always work (it does, however, require a finalized matrix -- so it's not possible to return an unfinalized one now).
>

How can this always work? It can destroy symmetry, right??

Garth


Revision history for this message
Joachim Haga (jobh) wrote :

On 6 February 2012 13:06, Garth Wells <email address hidden> wrote:

> On 6 February 2012 10:24, Joachim Haga <email address hidden> wrote:
> > I've pushed a change which should fix SymmetricAssembler in parallel. It
> uses GenericMatrix::ident to set the diagonal values, which should always
> work (it does, however, require a finalized matrix -- so it's not possible
> to return an unfinalized one now).
> >
>
> How can this always work? It can destroy symmetry, right??
>

Yes. Depends on what you mean by "works", I guess. The remaining problem,
as I said, is that the boundary column is NOT zeroed when the boundary dof
is missing from DirichletBC::get_boundary_values(), which, as I understand
it, can happen with topological search in parallel.

In that case, ident() will still set the boundary row, but the boundary
column will not be completely symmetricised. Hence, the matrix is correct
in the sense that Ax=b gives the right answer for x, but A has a small
unsymmetric component. This is less severe than for SystemAssembler, which
will be symmetric but wrong.

Now, topological search should be easy enough to fix, but it will make it
heavier (parallel comms + a search through the dofs). Maybe it's not worth
it, because the cost of pointwise search is (usually?) low compared to
other overhead.

 -j.
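The situation described above can be illustrated with plain NumPy (a hedged sketch, not the DOLFIN API): applying a Dirichlet BC by setting only the boundary row to an identity row (as GenericMatrix::ident does) still produces the correct solution, but the untouched boundary column leaves the matrix with an unsymmetric component.

```python
import numpy as np

# Model 3x3 stiffness matrix and RHS (illustrative values).
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
bc_dof, bc_val = 0, 0.5   # Dirichlet BC: x[0] = 0.5

# Row-only application: replace the BC row with an identity row.
A_row = A.copy()
A_row[bc_dof, :] = 0.0
A_row[bc_dof, bc_dof] = 1.0
b_row = b.copy()
b_row[bc_dof] = bc_val

x = np.linalg.solve(A_row, b_row)
assert np.isclose(x[bc_dof], bc_val)    # the BC is satisfied, Ax=b is correct
assert not np.allclose(A_row, A_row.T)  # but the boundary column breaks symmetry
```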

Revision history for this message
Garth Wells (garth-wells) wrote :

On 6 February 2012 12:45, Joachim Haga <email address hidden> wrote:
> On 6 February 2012 13:06, Garth Wells <email address hidden> wrote:
>
>> On 6 February 2012 10:24, Joachim Haga <email address hidden> wrote:
>> > I've pushed a change which should fix SymmetricAssembler in parallel. It
>> uses GenericMatrix::ident to set the diagonal values, which should always
>> work (it does, however, require a finalized matrix -- so it's not possible
>> to return an unfinalized one now).
>> >
>>
>> How can this always work? It can destroy symmetry, right??
>>
>
> Yes. Depends on what you mean by "works", I guess. The remaining problem,
> as I said, is that the boundary column is NOT zeroed if the boundary dof is
> sometimes missing from DirichletBC::get_boundary_values(), as I understand
> can happen with topology-search in parallel.
>
> In that case, ident() will still set the boundary row, but the boundary
> column will not be completely symmetricised. Hence, the matrix is correct
> in the sense that Ax=b gives the right answer for x, but A has a small
> unsymmetric component. This is less severe than for SystemAssemble, which
> will be symmetric but wrong.
>

It's just as bad. It will produce the wrong answer in common cases,
e.g. Cholesky factorisation.

> Now, topological search should be easy enough to fix, but it will make it
> heavier (parallel comms + a search through the dofs). Maybe it's not worth
> it, because the cost of pointwise search is (usually?) low compared to
> other overhead.
>

Yes, the solution is to look at the DirichletBC implementation.

Garth
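Garth's objection can be demonstrated with NumPy (a sketch on an assumed 3x3 model problem, not DOLFIN code): numpy.linalg.cholesky performs no symmetry check and reads only the lower triangle, so handing it a matrix with a small unsymmetric component silently factors the wrong matrix and the solve gives a wrong answer.

```python
import numpy as np

# Matrix after row-only BC application: BC row is an identity row,
# but the BC column was not zeroed, so A is unsymmetric.
A = np.array([[1.0, 0.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([0.5, 2.0, 3.0])

x_direct = np.linalg.solve(A, b)  # LU handles the asymmetry; correct answer

# Cholesky uses only the lower triangle of A, i.e. it factors the
# symmetric matrix implied by it, which differs from A itself.
L = np.linalg.cholesky(A)
y = np.linalg.solve(L, b)
x_chol = np.linalg.solve(L.T, y)

assert np.isclose(x_direct[0], 0.5)       # direct solve satisfies the BC
assert not np.allclose(x_direct, x_chol)  # Cholesky-based solve is wrong
```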

>  -j.
>

Revision history for this message
Joachim Haga (jobh) wrote :

>
> > unsymmetric component. This is less severe than for SystemAssemble, which
> > will be symmetric but wrong.
> >
>
> It's just as bad. It will produce the wrong answer in common cases,
> e.g. Cholesky factorisation.

Right you are, thanks! I'll put a warning in (like in SystemAssembler).

> > Now, topological search should be easy enough to fix, but it will make it
> > heavier (parallel comms + a search through the dofs). Maybe it's not
> worth
> > it, because the cost of pointwise search is (usually?) low compared to
> > other overhead.
> >
>
> Yes, the solution is to look at the DirichletBC implementation.
>

Not sure what you mean, "look at" as in "study" or as in "fix"? I'm leaning
towards thinking that the actual boundary search is not so important
compared to other overhead, so it's completely acceptable to just say "use
pointwise" in parallel.

-j.

Revision history for this message
Garth Wells (garth-wells) wrote :

On 6 February 2012 13:28, Joachim Haga <email address hidden> wrote:
>>
>> > unsymmetric component. This is less severe than for SystemAssemble, which
>> > will be symmetric but wrong.
>> >
>>
>> It's just as bad. It will produce the wrong answer in common cases,
>> e.g. Cholesky factorisation.
>
>
> Right you are, thanks! I'll put a warning in (like in SystemAssembler).
>
>
>> > Now, topological search should be easy enough to fix, but it will make it
>> > heavier (parallel comms + a search through the dofs). Maybe it's not
>> worth
>> > it, because the cost of pointwise search is (usually?) low compared to
>> > other overhead.
>> >
>>
>> Yes, the solution is to look at the DirichletBC implementation.
>>
>
> Not sure what you mean, "look at" as in "study" or as in "fix"?

Make faster (and improve re-use/caching of data, if possible) the
"pointwise" case.

We could also communicate boundary condition facets that have vertices
that are not wholly owned by a given process. There should be
relatively few.

Garth

> I'm leaning
> towards thinking that the actual boundary search is not so important
> compared to other overhead, so it's completely acceptable to just say "use
> pointwise" in parallel.
>
> -j.
>

lp:~jobh/dolfin/symmetric-assemble updated
6542. By Joachim Haga

Add timers to DirichletBC

6543. By Joachim Haga

SymmetricAssembler: Require pointwise DirichletBC in parallel, add back finalize_tensor to interface.

Revision history for this message
Joachim Haga (jobh) wrote :

OK, ready for review (again) now! It's back to not using ident() but rather setting the diagonal element-wise, as it did before. Pointwise boundary search is required in parallel.

Tested on a 256x256 unit-square Poisson problem (sequential): symmetric_assemble() is 1% slower than regular assemble(), while assemble_system() is 30% slower.

I've added some timers to DirichletBC as well (and removed one in AssemblerTools, which didn't measure anything).
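The S/N split from the proposal description (A = S + N, so Ax = b implies Sx = b - Nb) can be sketched in plain NumPy. This is illustrative only: the real assembler works cell-wise on GenericMatrix objects, and the variable names here are made up.

```python
import numpy as np

# Model problem with one Dirichlet dof (illustrative values).
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
bc_dof, bc_val = 0, 0.5

# Apply the Dirichlet row, then move the remaining BC column into N,
# so that S = A_bc - N is symmetric and A_bc = S + N.
A_bc = A.copy()
A_bc[bc_dof, :] = 0.0
A_bc[bc_dof, bc_dof] = 1.0
b[bc_dof] = bc_val

N = np.zeros_like(A_bc)
N[:, bc_dof] = A_bc[:, bc_dof]
N[bc_dof, bc_dof] = 0.0   # keep the unit diagonal entry in S
S = A_bc - N

assert np.allclose(S, S.T)       # S is symmetric
assert np.allclose(S + N, A_bc)  # A = S + N
x = np.linalg.solve(S, b - N @ b)            # Sx = b - Nb
assert np.allclose(x, np.linalg.solve(A_bc, b))
```

In the Python interface this corresponds to assembling S and N with symmetric_assemble and passing N as symmetric_mod when assembling the RHS, as in the example from the proposal description.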

Revision history for this message
Joachim Haga (jobh) wrote :

Are there any objections to merging this symmetric assembler? In particular, objections to the interface?

If not, I'll prepare a branch for 1.1 as well.

Anything else (merging the different assemblers, fixing DirichletBC pointwise performance / parallel correctness, etc) can be dealt with separately later.

lp:~jobh/dolfin/symmetric-assemble updated
6544. By Joachim Haga

Add copyright and modification information.

Revision history for this message
Anders Logg (logg) wrote :

On Tue, Feb 14, 2012 at 09:05:23AM -0000, Joachim Haga wrote:
> Are there any objections to merging this symmetric assembler? In particular, objections to the interface?

Not from me.

--
Anders

> If not, I'll prepare a branch for 1.1 as well.
>
> Anything else (merging the different assemblers, fixing DirichletBC pointwise performance / parallel correctness, etc) can be dealt with separately later.

Revision history for this message
Johan Hake (johan-hake) wrote :

On 02/16/2012 11:37 PM, Anders Logg wrote:
> On Tue, Feb 14, 2012 at 09:05:23AM -0000, Joachim Haga wrote:
>> Are there any objections to merging this symmetric assembler? In particular, objections to the interface?
> Not from me.

Neither from me.

>> If not, I'll prepare a branch for 1.1 as well.

Not sure what you mean by this. Are you going to include it in the 1.0.x
branch? Even if it is a feature which will not, for now, break any
code, it will most probably trigger some iterations on the interface
(read: merging of the assemblers), which a stable interface should be
spared from. But that is my opinion.

Johan

>> Anything else (merging the different assemblers, fixing DirichletBC pointwise performance / parallel correctness, etc) can be dealt with separately later.
>

Revision history for this message
Joachim Haga (jobh) wrote :

Den 17. feb. 2012 08.10 skrev "Johan Hake" <email address hidden>:
> >> If not, I'll prepare a branch for 1.1 as well.
>
> Not sure what you mean with this. Are you going to include it in 1.0.x
> branch?

No, I mixed up with the 1.1 series, I thought there was a branch as well.
Now I see there isn't.

-j.

Revision history for this message
Johan Hake (johan-hake) wrote :

On 02/17/2012 08:31 AM, Joachim Haga wrote:
> Den 17. feb. 2012 08.10 skrev "Johan Hake"<email address hidden>:
>>>> If not, I'll prepare a branch for 1.1 as well.
>> Not sure what you mean with this. Are you going to include it in 1.0.x
>> branch?
> No, I mixed up with the 1.1 series, I thought there was a branch as well.
> Now I see there isn't.

Ok!

Johan

>
> -j.
>

Revision history for this message
Anders Logg (logg) wrote :

On Fri, Feb 17, 2012 at 07:10:21AM -0000, Johan Hake wrote:
> On 02/16/2012 11:37 PM, Anders Logg wrote:
> > On Tue, Feb 14, 2012 at 09:05:23AM -0000, Joachim Haga wrote:
> >> Are there any objections to merging this symmetric assembler? In particular, objections to the interface?
> > Not from me.
>
> Neither from me.
>
> >> If not, I'll prepare a branch for 1.1 as well.
>
> Not sure what you mean with this. Are you going to include it in 1.0.x
> branch? Even if it is a a feature which will, not for now break any
> code, it will most probably trigger some iterations on the interface,
> read merging of Assemblers, which a stable interface should be spared
> from. But that is my opinion.

This should *not* be merged into 1.0. We should only do bug fixes to
the stable branch.

--
Anders

> Johan
>
> >> Anything else (merging the different assemblers, fixing DirichletBC pointwise performance / parallel correctness, etc) can be dealt with separately later.
> >
>
>

Revision history for this message
Joachim Haga (jobh) wrote :

On 17 February 2012 08:43, Anders Logg <email address hidden> wrote:
> This should *not* be merged into 1.0. We should only do bug fixes to
> the stable branch.

Don't worry, I just thought (erroneously) that there was a 1.1 branch
for the 1.1 series.

-j.

Revision history for this message
Anders Logg (logg) wrote :

On Fri, Feb 17, 2012 at 07:55:21AM -0000, Joachim Haga wrote:
> On 17 February 2012 08:43, Anders Logg <email address hidden> wrote:
> > This should *not* be merged into 1.0. We should only do bug fixes to
> > the stable branch.
>
> Don't worry, I just thought (erroneously) that there was a 1.1 branch
> for the 1.1 series.

ok. :-)

--
Anders

Revision history for this message
Joachim Haga (jobh) wrote :

I'm sorry if I nag, but since there are no objections, could this be merged? I'd like to finish this and move on.

Revision history for this message
Johan Hake (johan-hake) wrote :

On 02/24/2012 12:10 AM, Joachim Haga wrote:
> I'm sorry if I nag, but since there are no objections, could this be
> merged? I'd like to finish this and move on.

I am travelling from now on, so I won't be able to apply it right now.

Johan

Revision history for this message
Anders Logg (logg) wrote :

On Thu, Feb 23, 2012 at 11:31:17PM -0000, Johan Hake wrote:
> On 02/24/2012 12:10 AM, Joachim Haga wrote:
> > I'm sorry if I nag, but since there are no objections, could this be
> > merged? I'd like to finish this and move on.
>
> I am on a travel from now on so I wont be able to apply it right now.

I can merge it but Garth needs to comment first. He had objections
before.

--
Anders

Revision history for this message
Garth Wells (garth-wells) wrote :

If it passes all the tests and symmetry preservation is *guaranteed*, go ahead.

Bear in mind that it may be changed in the future once we support unassembled matrices.

review: Approve
Revision history for this message
Joachim Haga (jobh) wrote :

Thanks! Just hold on a few minutes while I check -Wall -Werror
-pedantic compilation.

-j.

On 27 February 2012 15:09, Garth Wells <email address hidden> wrote:
> Review: Approve
>
> If it passes all the tests and symmetry preservation is *guaranteed*, go ahead.
>
> Bear in mind that it may be changed in the future once we support unassembled matrices.

Revision history for this message
Joachim Haga (jobh) wrote :

Ok, compiles fine now.

lp:~jobh/dolfin/symmetric-assemble updated
6545. By Joachim Haga

Fix errors from -Wall -Werror -pedantic

Revision history for this message
Anders Logg (logg) wrote :

On Mon, Feb 27, 2012 at 03:16:23PM -0000, Joachim Haga wrote:
> Ok, compiles fine now.

Great. I'll merge it into my branch then push when my buildbot is
green. I have a couple of changesets lined up for merge.

--
Anders

lp:~jobh/dolfin/symmetric-assemble updated
6546. By Joachim Haga

Revert accidental change to GenericMatrix

6547. By Joachim Haga

Merge with trunk

6548. By Joachim Haga

Merge with trunk

6549. By Joachim Haga

Make symmetric assembler safe with topological dirichlet bcs in parallel

6550. By Joachim Haga

Enable SymmetricAssembler unit test

Preview Diff

1=== modified file 'dolfin/common/utils.h'
2--- dolfin/common/utils.h 2011-06-02 19:26:59 +0000
3+++ dolfin/common/utils.h 2012-02-29 22:48:20 +0000
4@@ -15,13 +15,18 @@
5 // You should have received a copy of the GNU Lesser General Public License
6 // along with DOLFIN. If not, see <http://www.gnu.org/licenses/>.
7 //
8+// Modified by Joachim B. Haga, 2012.
9+//
10 // First added: 2009-08-09
11-// Last changed: 2010-11-18
12+// Last changed: 2012-02-01
13
14 #ifndef __UTILS_H
15 #define __UTILS_H
16
17 #include <string>
18+#include <cstring>
19+#include <limits>
20+#include <vector>
21 #include "types.h"
22
23 namespace dolfin
24@@ -39,6 +44,20 @@
25 /// Return simple hash for given signature string
26 dolfin::uint hash(std::string signature);
27
28+ /// Fast zero-fill of numeric vectors / blocks.
29+ template <class T> inline void zerofill(T* arr, uint n)
30+ {
31+ if (std::numeric_limits<T>::is_integer || std::numeric_limits<T>::is_iec559)
32+ std::memset(arr, 0, n*sizeof(T));
33+ else
34+ // should never happen in practice
35+ std::fill_n(arr, n, T(0));
36+ }
37+
38+ template <class T> inline void zerofill(std::vector<T> &vec)
39+ {
40+ zerofill(&vec[0], vec.size());
41+ }
42 }
43
44 #endif
45
46=== modified file 'dolfin/fem/DirichletBC.cpp'
47--- dolfin/fem/DirichletBC.cpp 2012-02-29 13:22:33 +0000
48+++ dolfin/fem/DirichletBC.cpp 2012-02-29 22:48:20 +0000
49@@ -18,7 +18,7 @@
50 // Modified by Kristian Oelgaard, 2008
51 // Modified by Martin Sandve Alnes, 2008
52 // Modified by Johan Hake, 2009
53-// Modified by Joachim B Haga, 2009
54+// Modified by Joachim B. Haga, 2012
55 //
56 // First added: 2007-04-10
57 // Last changed: 2012-02-29
58@@ -28,6 +28,7 @@
59 #include <boost/assign/list_of.hpp>
60 #include <boost/serialization/utility.hpp>
61
62+#include <dolfin/common/Timer.h>
63 #include <dolfin/common/constants.h>
64 #include <dolfin/common/Array.h>
65 #include <dolfin/common/NoDeleter.h>
66@@ -205,6 +206,8 @@
67 //-----------------------------------------------------------------------------
68 void DirichletBC::gather(Map& boundary_values) const
69 {
70+ Timer timer("DirichletBC gather");
71+
72 typedef std::vector<std::pair<uint, double> > bv_vec_type;
73 typedef std::map<uint, bv_vec_type> map_type;
74
75@@ -480,6 +483,8 @@
76 GenericVector* b,
77 const GenericVector* x) const
78 {
79+ Timer timer("DirichletBC apply");
80+
81 // Check arguments
82 check_arguments(A, b, x);
83
84@@ -605,6 +610,8 @@
85 //-----------------------------------------------------------------------------
86 void DirichletBC::init_facets() const
87 {
88+ Timer timer("DirichletBC init facets");
89+
90 if (facets.size() > 0)
91 return;
92
93@@ -705,6 +712,8 @@
94 BoundaryCondition::LocalData& data,
95 std::string method) const
96 {
97+ Timer timer("DirichletBC compute bc");
98+
99+ // Set method if default
100 if (method == "default")
101 method = _method;
102
103=== modified file 'dolfin/fem/DirichletBC.h'
104--- dolfin/fem/DirichletBC.h 2012-02-29 13:22:33 +0000
105+++ dolfin/fem/DirichletBC.h 2012-02-29 22:48:20 +0000
106@@ -289,7 +289,11 @@
107 void apply(GenericMatrix& A, GenericVector& b,
108 const GenericVector& x) const;
109
110- /// Get Dirichlet dofs and values
111+ /// Get Dirichlet dofs and values. If a method other than 'pointwise' is
112+ /// used in parallel, the map may not be complete for local vertices since
113+ /// a vertex can have a bc applied, but the partition might not have a
114+ /// facet on the boundary. To ensure all local boundary dofs are marked,
115+ /// it is necessary to call gather() on the returned boundary values.
116 ///
117 /// *Arguments*
118 /// boundary_values (boost::unordered_map<uint, double>)
119
120=== added file 'dolfin/fem/SymmetricAssembler.cpp'
121--- dolfin/fem/SymmetricAssembler.cpp 1970-01-01 00:00:00 +0000
122+++ dolfin/fem/SymmetricAssembler.cpp 2012-02-29 22:48:20 +0000
123@@ -0,0 +1,588 @@
124+// Copyright (C) 2007-2011 Anders Logg
125+// Copyright (C) 2012 Joachim B. Haga
126+//
127+// This file is part of DOLFIN.
128+//
129+// DOLFIN is free software: you can redistribute it and/or modify
130+// it under the terms of the GNU Lesser General Public License as published by
131+// the Free Software Foundation, either version 3 of the License, or
132+// (at your option) any later version.
133+//
134+// DOLFIN is distributed in the hope that it will be useful,
135+// but WITHOUT ANY WARRANTY; without even the implied warranty of
136+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
137+// GNU Lesser General Public License for more details.
138+//
139+// You should have received a copy of the GNU Lesser General Public License
140+// along with DOLFIN. If not, see <http://www.gnu.org/licenses/>.
141+//
142+// First added: 2012-02-01 (modified from Assembler.cpp by jobh@simula.no)
143+
144+#include <boost/scoped_ptr.hpp>
145+
146+#include <dolfin/log/dolfin_log.h>
147+#include <dolfin/common/Timer.h>
148+#include <dolfin/common/utils.h>
149+#include <dolfin/parameter/GlobalParameters.h>
150+#include <dolfin/la/GenericMatrix.h>
151+#include <dolfin/mesh/Mesh.h>
152+#include <dolfin/mesh/Cell.h>
153+#include <dolfin/mesh/Facet.h>
154+#include <dolfin/mesh/MeshData.h>
155+#include <dolfin/mesh/MeshFunction.h>
156+#include <dolfin/mesh/SubDomain.h>
157+#include <dolfin/function/GenericFunction.h>
158+#include <dolfin/function/FunctionSpace.h>
159+#include "GenericDofMap.h"
160+#include "Form.h"
161+#include "UFC.h"
162+#include "FiniteElement.h"
163+#include "AssemblerTools.h"
164+#include "SymmetricAssembler.h"
165+
166+using namespace dolfin;
167+
168+/// The private implementation class. It holds all relevant parameters for a
169+/// single assemble, the implementation, and some scratch variables. Its
170+/// lifetime is never longer than the assemble itself, so it's safe to keep
171+/// references to parameters.
172+class SymmetricAssembler::PImpl
173+{
174+public:
175+ // User-provided parameters
176+ GenericMatrix& A;
177+ GenericMatrix& A_asymm;
178+ const Form& a;
179+ const std::vector<const DirichletBC*>& row_bcs;
180+ const std::vector<const DirichletBC*>& col_bcs;
181+ const MeshFunction<uint>* cell_domains;
182+ const MeshFunction<uint>* exterior_facet_domains;
183+ const MeshFunction<uint>* interior_facet_domains;
184+ bool reset_sparsity, add_values, finalize_tensor;
185+
186+ PImpl(GenericMatrix& _A, GenericMatrix& _A_asymm,
187+ const Form& _a,
188+ const std::vector<const DirichletBC*>& _row_bcs,
189+ const std::vector<const DirichletBC*>& _col_bcs,
190+ const MeshFunction<uint>* _cell_domains,
191+ const MeshFunction<uint>* _exterior_facet_domains,
192+ const MeshFunction<uint>* _interior_facet_domains,
193+ bool _reset_sparsity, bool _add_values, bool _finalize_tensor)
194+ : A(_A), A_asymm(_A_asymm), a(_a),
195+ row_bcs(_row_bcs), col_bcs(_col_bcs),
196+ cell_domains(_cell_domains),
197+ exterior_facet_domains(_exterior_facet_domains),
198+ interior_facet_domains(_interior_facet_domains),
199+ reset_sparsity(_reset_sparsity),
200+ add_values(_add_values),
201+ finalize_tensor(_finalize_tensor),
202+ mesh(_a.mesh()), ufc(_a), ufc_asymm(_a)
203+ {
204+ }
205+
206+ void assemble();
207+
208+private:
209+ void assemble_cells();
210+ void assemble_exterior_facets();
211+ void assemble_interior_facets();
212+
213+ // Adjust the columns of the local element tensor so that it becomes
214+ // symmetric once BCs have been applied to the rows. Returns true if any
215+ // columns have been moved to _asymm.
216+ bool make_bc_symmetric(std::vector<double>& elm_A, std::vector<double>& elm_A_asymm,
217+ const std::vector<const std::vector<uint>*>& dofs);
218+
219+ // These are derived from the variables above:
220+ const Mesh& mesh; // = Mesh(a)
221+ UFC ufc; // = UFC(a)
222+ UFC ufc_asymm; // = UFC(a), used for scratch local tensors
223+ bool matching_bcs; // true if row_bcs==col_bcs
224+ DirichletBC::Map row_bc_values; // derived from row_bcs
225+ DirichletBC::Map col_bc_values; // derived from col_bcs, but empty if matching_bcs
226+
227+ // These are used to keep track of which diagonals have been set:
228+ std::pair<uint,uint> processor_dof_range;
229+ std::set<uint> inserted_diagonals;
230+
231+ // Scratch variables
232+ std::vector<bool> local_row_is_bc;
233+};
234+//-----------------------------------------------------------------------------
235+void SymmetricAssembler::assemble(GenericMatrix& A,
236+ GenericMatrix& A_asymm,
237+ const Form& a,
238+ const std::vector<const DirichletBC*>& row_bcs,
239+ const std::vector<const DirichletBC*>& col_bcs,
240+ const MeshFunction<uint>* cell_domains,
241+ const MeshFunction<uint>* exterior_facet_domains,
242+ const MeshFunction<uint>* interior_facet_domains,
243+ bool reset_sparsity,
244+ bool add_values,
245+ bool finalize_tensor)
246+{
247+ PImpl pImpl(A, A_asymm, a, row_bcs, col_bcs,
248+ cell_domains, exterior_facet_domains, interior_facet_domains,
249+ reset_sparsity, add_values, finalize_tensor);
250+ pImpl.assemble();
251+}
252+//-----------------------------------------------------------------------------
253+void SymmetricAssembler::PImpl::assemble()
254+{
255+ // All assembler functions above end up calling this function, which
256+ // in turn calls the assembler functions below to assemble over
257+ // cells, exterior and interior facets.
258+ //
259+ // Important notes:
260+ //
261+ // 1. Note the importance of treating empty mesh functions as null
262+ // pointers for the PyDOLFIN interface.
263+ //
264+ // 2. Note that subdomains given as input to this function override
265+ // subdomains attached to forms, which in turn override subdomains
266+ // stored as part of the mesh.
267+
268+ // If the bcs match (which is the usual case), we are assembling a normal
269+ // square matrix which contains the diagonal (and the dofmaps should match,
270+ // too).
271+ matching_bcs = (row_bcs == col_bcs);
272+
273+ // Get Dirichlet dofs rows and values for local mesh
274+ for (uint i = 0; i < row_bcs.size(); ++i)
275+ {
276+ row_bcs[i]->get_boundary_values(row_bc_values);
277+ if (MPI::num_processes() > 1 && row_bcs[i]->method() != "pointwise")
278+ row_bcs[i]->gather(row_bc_values);
279+ }
280+ if (!matching_bcs)
281+ {
282+ // Get Dirichlet dofs columns and values for local mesh
283+ for (uint i = 0; i < col_bcs.size(); ++i)
284+ {
285+ col_bcs[i]->get_boundary_values(col_bc_values);
286+ if (MPI::num_processes() > 1 && col_bcs[i]->method() != "pointwise")
287+ col_bcs[i]->gather(col_bc_values);
288+ }
289+ }
290+
291+ dolfin_assert(a.rank() == 2);
292+
293+ // Get cell domains
294+ if (!cell_domains || cell_domains->size() == 0)
295+ {
296+ cell_domains = a.cell_domains_shared_ptr().get();
297+ if (!cell_domains)
298+ cell_domains = a.mesh().domains().cell_domains(a.mesh()).get();
299+ }
300+
301+ // Get exterior facet domains
302+ if (!exterior_facet_domains || exterior_facet_domains->size() == 0)
303+ {
304+ exterior_facet_domains = a.exterior_facet_domains_shared_ptr().get();
305+ if (!exterior_facet_domains)
306+ exterior_facet_domains = a.mesh().domains().facet_domains(a.mesh()).get();
307+ }
308+
309+ // Get interior facet domains
310+ if (!interior_facet_domains || interior_facet_domains->size() == 0)
311+ {
312+ interior_facet_domains = a.interior_facet_domains_shared_ptr().get();
313+ if (!interior_facet_domains)
314+ interior_facet_domains = a.mesh().domains().facet_domains(a.mesh()).get();
315+ }
316+
317+ // Check form
318+ AssemblerTools::check(a);
319+
320+ // Gather off-process coefficients
321+ const std::vector<boost::shared_ptr<const GenericFunction> >
322+ coefficients = a.coefficients();
323+ for (uint i = 0; i < coefficients.size(); ++i)
324+ coefficients[i]->gather();
325+
326+ // Initialize global tensors
327+ AssemblerTools::init_global_tensor(A, a, reset_sparsity, add_values);
328+ AssemblerTools::init_global_tensor(A_asymm, a, reset_sparsity, add_values);
329+
330+ // Get dofs that are local to this processor
331+ processor_dof_range = A.local_range(0);
332+
333+ // Assemble over cells
334+ assemble_cells();
335+
336+ // Assemble over exterior facets
337+ assemble_exterior_facets();
338+
339+ // Assemble over interior facets
340+ assemble_interior_facets();
341+
342+ // Finalize assembly of global tensor
343+ if (finalize_tensor)
344+ {
345+ A.apply("add");
346+ A_asymm.apply("add");
347+ }
348+}
349+//-----------------------------------------------------------------------------
350+void SymmetricAssembler::PImpl::assemble_cells()
351+{
352+ // Skip assembly if there are no cell integrals
353+ if (ufc.form.num_cell_domains() == 0)
354+ return;
355+
356+ // Set timer
357+ Timer timer("Assemble cells");
358+
359+ // Form rank
360+ const uint form_rank = ufc.form.rank();
361+
362+ // Collect pointers to dof maps
363+ std::vector<const GenericDofMap*> dofmaps;
364+ for (uint i = 0; i < form_rank; ++i)
365+ dofmaps.push_back(a.function_space(i)->dofmap().get());
366+
367+ // Vector to hold dof map for a cell
368+ std::vector<const std::vector<uint>* > dofs(form_rank);
369+
370+ // Cell integral
371+ dolfin_assert(ufc.cell_integrals.size() > 0);
372+ ufc::cell_integral* integral = ufc.cell_integrals[0].get();
373+
374+ // Assemble over cells
375+ Progress p(AssemblerTools::progress_message(A.rank(), "cells"), mesh.num_cells());
376+ for (CellIterator cell(mesh); !cell.end(); ++cell)
377+ {
378+ // Get integral for sub domain (if any)
379+ if (cell_domains && cell_domains->size() > 0)
380+ {
381+ const uint domain = (*cell_domains)[*cell];
382+ if (domain < ufc.form.num_cell_domains())
383+ integral = ufc.cell_integrals[domain].get();
384+ else
385+ continue;
386+ }
387+
388+ // Skip integral if zero
389+ if (!integral)
390+ continue;
391+
392+ // Update to current cell
393+ ufc.update(*cell);
394+
395+ // Get local-to-global dof maps for cell
396+ for (uint i = 0; i < form_rank; ++i)
397+ dofs[i] = &(dofmaps[i]->cell_dofs(cell->index()));
398+
399+ // Tabulate cell tensor
400+ integral->tabulate_tensor(&ufc.A[0], ufc.w(), ufc.cell);
401+
402+ // Apply boundary conditions
403+ const bool asymm_changed = make_bc_symmetric(ufc.A, ufc_asymm.A, dofs);
404+
405+ // Add entries to global tensor.
406+ A.add(&ufc.A[0], dofs);
407+ if (asymm_changed)
408+ A_asymm.add(&ufc_asymm.A[0], dofs);
409+
410+ p++;
411+ }
412+}
413+//-----------------------------------------------------------------------------
414+void SymmetricAssembler::PImpl::assemble_exterior_facets()
415+{
416+ // Skip assembly if there are no exterior facet integrals
417+ if (ufc.form.num_exterior_facet_domains() == 0)
418+ return;
419+ Timer timer("Assemble exterior facets");
420+
421+ // Extract mesh
422+ const Mesh& mesh = a.mesh();
423+
424+ // Form rank
425+ const uint form_rank = ufc.form.rank();
426+
427+ // Collect pointers to dof maps
428+ std::vector<const GenericDofMap*> dofmaps;
429+ for (uint i = 0; i < form_rank; ++i)
430+ dofmaps.push_back(a.function_space(i)->dofmap().get());
431+
432+ // Vector to hold dof map for a cell
433+ std::vector<const std::vector<uint>* > dofs(form_rank);
434+
435+ // Exterior facet integral
436+ dolfin_assert(ufc.exterior_facet_integrals.size() > 0);
437+ const ufc::exterior_facet_integral*
438+ integral = ufc.exterior_facet_integrals[0].get();
439+
440+ // Compute facets and facet - cell connectivity if not already computed
441+ const uint D = mesh.topology().dim();
442+ mesh.init(D - 1);
443+ mesh.init(D - 1, D);
444+ dolfin_assert(mesh.ordered());
445+
446+ // Assemble over exterior facets (the cells of the boundary)
447+ Progress p(AssemblerTools::progress_message(A.rank(), "exterior facets"),
448+ mesh.num_facets());
449+ for (FacetIterator facet(mesh); !facet.end(); ++facet)
450+ {
451+ // Only consider exterior facets
452+ if (!facet->exterior())
453+ {
454+ p++;
455+ continue;
456+ }
457+
458+ // Get integral for sub domain (if any)
459+ if (exterior_facet_domains && exterior_facet_domains->size() > 0)
460+ {
461+ const uint domain = (*exterior_facet_domains)[*facet];
462+ if (domain < ufc.form.num_exterior_facet_domains())
463+ integral = ufc.exterior_facet_integrals[domain].get();
464+ else
465+ continue;
466+ }
467+
468+ // Skip integral if zero
469+ if (!integral)
470+ continue;
471+
472+ // Get mesh cell to which mesh facet belongs (pick first, there is only one)
473+ dolfin_assert(facet->num_entities(D) == 1);
474+ Cell mesh_cell(mesh, facet->entities(D)[0]);
475+
476+ // Get local index of facet with respect to the cell
477+ const uint local_facet = mesh_cell.index(*facet);
478+
479+ // Update to current cell
480+ ufc.update(mesh_cell, local_facet);
481+
482+ // Get local-to-global dof maps for cell
483+ for (uint i = 0; i < form_rank; ++i)
484+ dofs[i] = &(dofmaps[i]->cell_dofs(mesh_cell.index()));
485+
486+ // Tabulate exterior facet tensor
487+ integral->tabulate_tensor(&ufc.A[0], ufc.w(), ufc.cell, local_facet);
488+
489+ // Apply boundary conditions
490+ const bool asymm_changed = make_bc_symmetric(ufc.A, ufc_asymm.A, dofs);
491+
492+ // Add entries to global tensor
493+ A.add(&ufc.A[0], dofs);
494+ if (asymm_changed)
495+ A_asymm.add(&ufc_asymm.A[0], dofs);
496+
497+ p++;
498+ }
499+}
500+//-----------------------------------------------------------------------------
501+void SymmetricAssembler::PImpl::assemble_interior_facets()
502+{
503+ // Skip assembly if there are no interior facet integrals
504+ if (ufc.form.num_interior_facet_domains() == 0)
505+ return;
506+
507+ not_working_in_parallel("Assembly over interior facets");
508+
509+ Timer timer("Assemble interior facets");
510+
511+ // Extract mesh and coefficients
512+ const Mesh& mesh = a.mesh();
513+
514+ // Form rank
515+ const uint form_rank = ufc.form.rank();
516+
517+ // Collect pointers to dof maps
518+ std::vector<const GenericDofMap*> dofmaps;
519+ for (uint i = 0; i < form_rank; ++i)
520+ dofmaps.push_back(a.function_space(i)->dofmap().get());
521+
522+ // Vector to hold dofs for cells
523+ std::vector<std::vector<uint> > macro_dofs(form_rank);
524+ std::vector<const std::vector<uint>*> macro_dof_ptrs(form_rank);
525+ for (uint i = 0; i < form_rank; i++)
526+ macro_dof_ptrs[i] = &macro_dofs[i];
527+
528+ // Interior facet integral
529+ dolfin_assert(ufc.interior_facet_integrals.size() > 0);
530+ const ufc::interior_facet_integral*
531+ integral = ufc.interior_facet_integrals[0].get();
532+
533+ // Compute facets and facet - cell connectivity if not already computed
534+ const uint D = mesh.topology().dim();
535+ mesh.init(D - 1);
536+ mesh.init(D - 1, D);
537+ dolfin_assert(mesh.ordered());
538+
539+ // Get interior facet directions (if any)
540+ boost::shared_ptr<MeshFunction<unsigned int> >
541+ facet_orientation = mesh.data().mesh_function("facet_orientation");
542+ if (facet_orientation && facet_orientation->dim() != D - 1)
543+ {
544+ dolfin_error("Assembler.cpp",
545+ "assemble form over interior facets",
546+ "Expecting facet orientation to be defined on facets (not dimension %d)",
547+ facet_orientation->dim());
548+ }
549+
550+ // Assemble over interior facets (the facets of the mesh)
551+ Progress p(AssemblerTools::progress_message(A.rank(), "interior facets"),
552+ mesh.num_facets());
553+ for (FacetIterator facet(mesh); !facet.end(); ++facet)
554+ {
555+ // Only consider interior facets
556+ if (facet->exterior())
557+ {
558+ p++;
559+ continue;
560+ }
561+
562+ // Get integral for sub domain (if any)
563+ if (interior_facet_domains && interior_facet_domains->size() > 0)
564+ {
565+ const uint domain = (*interior_facet_domains)[*facet];
566+ if (domain < ufc.form.num_interior_facet_domains())
567+ integral = ufc.interior_facet_integrals[domain].get();
568+ else
569+ continue;
570+ }
571+
572+ // Skip integral if zero
573+ if (!integral)
574+ continue;
575+
576+ // Get cells incident with facet
577+ std::pair<const Cell, const Cell>
578+ cells = facet->adjacent_cells(facet_orientation.get());
579+ const Cell& cell0 = cells.first;
580+ const Cell& cell1 = cells.second;
581+
582+ // Get local index of facet with respect to each cell
583+ uint local_facet0 = cell0.index(*facet);
584+ uint local_facet1 = cell1.index(*facet);
585+
586+ // Update to current pair of cells
587+ ufc.update(cell0, local_facet0, cell1, local_facet1);
588+
589+ // Tabulate dofs for each dimension on macro element
590+ for (uint i = 0; i < form_rank; ++i)
591+ {
592+ // Get dofs for each cell
593+ const std::vector<uint>& cell_dofs0 = dofmaps[i]->cell_dofs(cell0.index());
594+ const std::vector<uint>& cell_dofs1 = dofmaps[i]->cell_dofs(cell1.index());
595+
596+ // Create space in macro dof vector
597+ macro_dofs[i].resize(cell_dofs0.size() + cell_dofs1.size());
598+
599+ // Copy cell dofs into macro dof vector
600+ std::copy(cell_dofs0.begin(), cell_dofs0.end(),
601+ macro_dofs[i].begin());
602+ std::copy(cell_dofs1.begin(), cell_dofs1.end(),
603+ macro_dofs[i].begin() + cell_dofs0.size());
604+ }
605+
606+    // Tabulate interior facet tensor on macro element
607+ integral->tabulate_tensor(&ufc.macro_A[0], ufc.macro_w(), ufc.cell0, ufc.cell1,
608+ local_facet0, local_facet1);
609+
610+ // Apply boundary conditions
611+ const bool asymm_changed = make_bc_symmetric(ufc.macro_A, ufc_asymm.macro_A, macro_dof_ptrs);
612+
613+ // Add entries to global tensor
614+ A.add(&ufc.macro_A[0], macro_dofs);
615+ if (asymm_changed)
616+ A_asymm.add(&ufc_asymm.macro_A[0], macro_dofs);
617+
618+ p++;
619+ }
620+}
621+//-----------------------------------------------------------------------------
622+bool SymmetricAssembler::PImpl::make_bc_symmetric(std::vector<double>& local_A,
623+ std::vector<double>& local_A_asymm,
624+ const std::vector<const std::vector<uint>*>& dofs)
625+{
626+ // Get local dimensions
627+ const uint num_local_rows = dofs[0]->size();
628+ const uint num_local_cols = dofs[1]->size();
629+
630+ // Return value, true if columns have been moved to _asymm
631+ bool columns_moved = false;
632+
633+ // Convenience aliases
634+ const std::vector<uint>& row_dofs = *dofs[0];
635+ const std::vector<uint>& col_dofs = *dofs[1];
636+
637+ if (matching_bcs && row_dofs!=col_dofs)
638+ dolfin_error("SymmetricAssembler.cpp",
639+ "make_bc_symmetric",
640+ "Same BCs are used for rows and columns, but dofmaps don't match");
641+
642+ // Store the local boundary conditions, to avoid multiple searches in the
643+ // (common) case of matching_bcs
644+ local_row_is_bc.resize(num_local_rows);
645+ for (uint row = 0; row < num_local_rows; ++row)
646+ {
647+ DirichletBC::Map::const_iterator bc_item = row_bc_values.find(row_dofs[row]);
648+ local_row_is_bc[row] = (bc_item != row_bc_values.end());
649+ }
650+
651+ // Clear matrix rows belonging to BCs. These modifications destroy symmetry.
652+ for (uint row = 0; row < num_local_rows; ++row)
653+ {
654+ // Do nothing if row is not affected by BCs
655+ if (!local_row_is_bc[row])
656+ continue;
657+
658+ // Zero out the row
659+ zerofill(&local_A[row*num_local_cols], num_local_cols);
660+
661+ // Set the diagonal if we're in a diagonal block
662+ if (matching_bcs)
663+ {
664+ // ...but only set it on the owning processor
665+ const uint dof = row_dofs[row];
666+ if (dof >= processor_dof_range.first && dof < processor_dof_range.second)
667+ {
668+ // ...and only once.
669+ const bool already_inserted = !inserted_diagonals.insert(dof).second;
670+ if (!already_inserted)
671+ local_A[row + row*num_local_cols] = 1.0;
672+ }
673+ }
674+ }
675+
676+ // Modify matrix columns belonging to BCs. These modifications restore
677+ // symmetry, but the entries must be moved to the asymm matrix instead of
678+ // just cleared.
679+ for (uint col = 0; col < num_local_cols; ++col)
680+ {
681+ // Do nothing if column is not affected by BCs
682+ if (matching_bcs) {
683+ if (!local_row_is_bc[col])
684+ continue;
685+ }
686+ else
687+ {
688+ DirichletBC::Map::const_iterator bc_item = col_bc_values.find(col_dofs[col]);
689+ if (bc_item == col_bc_values.end())
690+ continue;
691+ }
692+
693+ // Zero the asymmetric part before use
694+ if (!columns_moved)
695+ {
696+ zerofill(local_A_asymm);
697+ columns_moved = true;
698+ }
699+
700+ // Move the column to A_asymm, zero it in A
701+ for (uint row = 0; row < num_local_rows; ++row)
702+ if (!local_row_is_bc[row])
703+ {
704+ const uint entry = col + row*num_local_cols;
705+ local_A_asymm[entry] = local_A[entry];
706+ local_A[entry] = 0.0;
707+ }
708+ }
709+
710+ return columns_moved;
711+}
712
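The column-moving logic in `make_bc_symmetric` above can be illustrated on a dense local element matrix. The following is a standalone NumPy model of the same steps (a sketch, not DOLFIN API): first zero the BC rows and set a unit diagonal, which destroys symmetry; then move the remaining BC-column entries into N, which restores it. It also checks the identity from the merge description: with BCs applied to b, (S + N) x = b implies S x = b - N b.

```python
import numpy as np

def make_bc_symmetric(A, bc):
    """NumPy model (not DOLFIN API) of SymmetricAssembler's local logic.

    A  : (n, n) local element matrix
    bc : boolean mask, True where a dof carries a Dirichlet BC
    Returns (S, N) with S symmetric (when A is) and S + N equal to A
    after the usual BC row-zeroing with unit diagonal.
    """
    S = A.astype(float).copy()
    N = np.zeros_like(S)
    # Clear BC rows and set the diagonal; this destroys symmetry
    S[bc, :] = 0.0
    S[bc, bc] = 1.0
    # Move BC columns (in non-BC rows) into N; this restores symmetry
    free = ~bc
    N[np.ix_(free, bc)] = S[np.ix_(free, bc)]
    S[np.ix_(free, bc)] = 0.0
    return S, N

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
bc = np.array([True, False, False])
S, N = make_bc_symmetric(A, bc)

# S is symmetric; S + N is A with the BC row replaced by an identity row
assert np.allclose(S, S.T)

# With the Dirichlet value already in b, solving (S + N) x = b
# gives the same x as solving S x = b - N b
b = np.array([2.0, 5.0, 3.0])      # b[0] is the Dirichlet value
x = np.linalg.solve(S + N, b)
assert np.allclose(S.dot(x), b - N.dot(b))
```

In the assembler the same operations run per element on the flat `local_A` array, with `inserted_diagonals` guarding against setting a diagonal entry more than once across elements.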
713=== added file 'dolfin/fem/SymmetricAssembler.h'
714--- dolfin/fem/SymmetricAssembler.h 1970-01-01 00:00:00 +0000
715+++ dolfin/fem/SymmetricAssembler.h 2012-02-29 22:48:20 +0000
716@@ -0,0 +1,82 @@
717+// Copyright (C) 2012 Joachim B. Haga
718+//
719+// This file is part of DOLFIN.
720+//
721+// DOLFIN is free software: you can redistribute it and/or modify
722+// it under the terms of the GNU Lesser General Public License as published by
723+// the Free Software Foundation, either version 3 of the License, or
724+// (at your option) any later version.
725+//
726+// DOLFIN is distributed in the hope that it will be useful,
727+// but WITHOUT ANY WARRANTY; without even the implied warranty of
728+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
729+// GNU Lesser General Public License for more details.
730+//
731+// You should have received a copy of the GNU Lesser General Public License
732+// along with DOLFIN. If not, see <http://www.gnu.org/licenses/>.
733+//
734+// First added: 2012-01-26 (jobh@simula.no)
735+
736+#ifndef __SYMMETRIC_ASSEMBLER_H
737+#define __SYMMETRIC_ASSEMBLER_H
738+
739+#include <map>
740+#include <vector>
741+#include <boost/scoped_ptr.hpp>
742+#include <dolfin/common/types.h>
743+#include "Form.h"
744+#include "DirichletBC.h"
745+
746+namespace dolfin
747+{
748+  /// This class implements an assembler for systems of the form Ax = b.
749+  /// Its assembly algorithm is similar to SystemAssembler's, but it saves
750+  /// the matrix modifications into a separate tensor so that it can later
751+  /// apply the symmetric modifications to a RHS vector.
752+
753+ /// The non-symmetric part is only nonzero in BC columns, and is zero in all BC
754+ /// rows, so that [(A_s+A_n) x = b] implies [A_s x = b - A_n b], IF b has
755+ /// boundary conditions applied. (If the final A is composed from a sum of
756+ /// A_s matrices, BCs must be reapplied to make the diagonal value for BC
757+ /// dofs 1.0. The matrix will remain symmetric since only the diagonal is
758+ /// changed.)
759+ ///
760+ /// *Example*
761+ ///
762+ /// .. code-block:: c++
763+ ///
764+  ///        std::vector<const DirichletBC*> bcs = {&bc};
765+ /// SymmetricAssembler::assemble(A, A_n, a, bcs, bcs);
766+ /// Assembler::assemble(b, L);
767+  ///        bc.apply(b);
768+ /// A_n.mult(b, b_mod);
769+ /// b -= b_mod;
770+
771+ class SymmetricAssembler
772+ {
773+ public:
774+
775+ /// Assemble A and apply Dirichlet boundary conditions. Returns two
776+ /// matrices, where the second contains the symmetric modifications
777+ /// suitable for modifying RHS vectors.
778+ ///
779+ /// Note: row_bcs and col_bcs will normally be the same, but are different
780+ /// for e.g. off-diagonal block matrices in a mixed PDE.
781+ static void assemble(GenericMatrix &A,
782+ GenericMatrix &A_nonsymm,
783+ const Form &a,
784+ const std::vector<const DirichletBC*> &row_bcs,
785+ const std::vector<const DirichletBC*> &col_bcs,
786+ const MeshFunction<uint> *cell_domains=NULL,
787+ const MeshFunction<uint> *exterior_facet_domains=NULL,
788+ const MeshFunction<uint> *interior_facet_domains=NULL,
789+ bool reset_sparsity=true,
790+ bool add_values=false,
791+ bool finalize_tensor=true);
792+
793+ private:
794+ class PImpl;
795+ };
796+}
797+
798+#endif
799
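The header notes that `row_bcs` and `col_bcs` differ for off-diagonal blocks of a mixed PDE. A standalone NumPy sketch (hypothetical names, not DOLFIN API) of what that means for a rectangular coupling block: rows hit by a row-BC are cleared outright (no unit diagonal is inserted, since the block is off the diagonal), and column-BC entries are moved into the non-symmetric part.

```python
import numpy as np

def split_coupling_block(B, row_bc, col_bc):
    """Sketch of symmetric BC handling for an off-diagonal (coupling)
    block, where row and column BCs belong to different spaces.
    No diagonal entry is inserted for an off-diagonal block."""
    S = B.astype(float).copy()
    N = np.zeros_like(S)
    S[row_bc, :] = 0.0                                  # clear row-BC rows
    free = ~row_bc
    N[np.ix_(free, col_bc)] = S[np.ix_(free, col_bc)]   # move col-BC columns
    S[np.ix_(free, col_bc)] = 0.0
    return S, N

B = np.arange(12.0).reshape(3, 4)
row_bc = np.array([True, False, False])
col_bc = np.array([False, True, False, False])
S, N = split_coupling_block(B, row_bc, col_bc)

# S has nothing left in BC rows or BC columns
assert not S[row_bc, :].any() and not S[:, col_bc].any()
# N is nonzero only in col-BC columns
assert not N[:, ~col_bc].any()
```

As in the square case, S + N reproduces the block after the usual row-zeroing, so several such blocks can be summed before the final BC reapplication mentioned in the class documentation.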
800=== modified file 'dolfin/fem/SystemAssembler.cpp'
801--- dolfin/fem/SystemAssembler.cpp 2012-02-29 13:22:33 +0000
802+++ dolfin/fem/SystemAssembler.cpp 2012-02-29 22:48:20 +0000
803@@ -673,7 +673,7 @@
804
805 // Resize dof vector
806 a_macro_dofs[0].resize(a0_dofs0.size() + a0_dofs1.size());
807- a_macro_dofs[1].resize(a0_dofs1.size() + a1_dofs1.size());
808+ a_macro_dofs[1].resize(a1_dofs0.size() + a1_dofs1.size());
809 L_macro_dofs[0].resize(L_dofs0.size() + L_dofs1.size());
810
811 // Tabulate dofs for each dimension on macro element
812
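The one-line SystemAssembler fix above matters because, on an interior facet, the macro element concatenates the dofs of the two adjacent cells separately for each tensor dimension; the column vector must therefore be sized from a1's dof lists, not a0's, since rows and columns may come from spaces of different size. A minimal sketch with hypothetical dof lists (not DOLFIN API):

```python
def macro_dofs(dofs_cell0, dofs_cell1):
    """Concatenate per-cell dofs into macro-element dofs.

    Each argument holds one dof list per tensor dimension
    (e.g. [rows, cols]); each dimension is sized from its own
    two cell dof lists, which may differ in length."""
    return [list(d0) + list(d1)
            for d0, d1 in zip(dofs_cell0, dofs_cell1)]

# Rows from one space (3 dofs/cell), columns from another (2 dofs/cell)
macro = macro_dofs([[0, 1, 2], [10, 11]], [[2, 3, 4], [11, 12]])
assert macro == [[0, 1, 2, 2, 3, 4], [10, 11, 11, 12]]
```

With the pre-fix sizing, the column macro vector would have been resized to len(a0 cell1 dofs) + len(a1 cell1 dofs), which is wrong whenever the two forms' column spaces differ.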
813=== modified file 'dolfin/fem/assemble.cpp'
814--- dolfin/fem/assemble.cpp 2011-11-14 18:20:22 +0000
815+++ dolfin/fem/assemble.cpp 2012-02-29 22:48:20 +0000
816@@ -17,14 +17,16 @@
817 //
818 // Modified by Garth N. Wells, 2008.
819 // Modified by Johan Hake, 2009.
820+// Modified by Joachim B. Haga, 2012.
821 //
822 // First added: 2007-01-17
823-// Last changed: 2011-11-13
824+// Last changed: 2012-02-01
825
826 #include <dolfin/la/Scalar.h>
827 #include "Form.h"
828 #include "Assembler.h"
829 #include "SystemAssembler.h"
830+#include "SymmetricAssembler.h"
831 #include "assemble.h"
832
833 using namespace dolfin;
834@@ -122,6 +124,39 @@
835 reset_sparsity, add_values, finalize_tensor);
836 }
837 //-----------------------------------------------------------------------------
838+void dolfin::symmetric_assemble(GenericMatrix& As,
839+ GenericMatrix& An,
840+ const Form& a,
841+ const std::vector<const DirichletBC*>& bcs,
842+ const MeshFunction<unsigned int>* cell_domains,
843+ const MeshFunction<unsigned int>* exterior_facet_domains,
844+ const MeshFunction<unsigned int>* interior_facet_domains,
845+ bool reset_sparsity,
846+ bool add_values,
847+ bool finalize_tensor)
848+{
849+ SymmetricAssembler::assemble(As, An, a, bcs, bcs,
850+ cell_domains, exterior_facet_domains, interior_facet_domains,
851+ reset_sparsity, add_values, finalize_tensor);
852+}
853+//-----------------------------------------------------------------------------
854+void dolfin::symmetric_assemble(GenericMatrix& As,
855+ GenericMatrix& An,
856+ const Form& a,
857+ const std::vector<const DirichletBC*>& row_bcs,
858+ const std::vector<const DirichletBC*>& col_bcs,
859+ const MeshFunction<unsigned int>* cell_domains,
860+ const MeshFunction<unsigned int>* exterior_facet_domains,
861+ const MeshFunction<unsigned int>* interior_facet_domains,
862+ bool reset_sparsity,
863+ bool add_values,
864+ bool finalize_tensor)
865+{
866+ SymmetricAssembler::assemble(As, An, a, row_bcs, col_bcs,
867+ cell_domains, exterior_facet_domains, interior_facet_domains,
868+ reset_sparsity, add_values, finalize_tensor);
869+}
870+//-----------------------------------------------------------------------------
871 double dolfin::assemble(const Form& a,
872 bool reset_sparsity,
873 bool add_values,
874
875=== modified file 'dolfin/fem/assemble.h'
876--- dolfin/fem/assemble.h 2011-10-03 13:19:12 +0000
877+++ dolfin/fem/assemble.h 2012-02-29 22:48:20 +0000
878@@ -17,9 +17,10 @@
879 //
880 // Modified by Garth N. Wells, 2008, 2009.
881 // Modified by Johan Hake, 2009.
882+// Modified by Joachim B. Haga, 2012.
883 //
884 // First added: 2007-01-17
885-// Last changed: 2011-09-29
886+// Last changed: 2012-02-01
887 //
888 // This file duplicates the Assembler::assemble* and SystemAssembler::assemble*
889 // functions in namespace dolfin, and adds special versions returning the value
890@@ -113,6 +114,37 @@
891 bool add_values=false,
892 bool finalize_tensor=true);
893
894+ /// Symmetric assembly of As, storing the modifications in An. To create
895+ /// matching RHS, assemble and apply bcs normally, then subtract An*b.
896+ /// In this variant of symmetric_assemble, rows and columns use the same BCs.
897+ void symmetric_assemble(GenericMatrix& As,
898+ GenericMatrix& An,
899+ const Form& a,
900+ const std::vector<const DirichletBC*>& bcs,
901+ const MeshFunction<unsigned int>* cell_domains=NULL,
902+ const MeshFunction<unsigned int>* exterior_facet_domains=NULL,
903+ const MeshFunction<unsigned int>* interior_facet_domains=NULL,
904+ bool reset_sparsity=true,
905+ bool add_values=false,
906+ bool finalize_tensor=true);
907+
908+ /// Symmetric assembly of As, storing the modifications in An. To create
909+ /// matching RHS, assemble and apply bcs normally, then subtract An*b.
910+ /// In this variant of symmetric_assemble, rows and columns use (potentially)
911+ /// different BCs. The BCs will be different for example in coupling
912+ /// (i.e., off-diagonal) blocks of a block matrix.
913+ void symmetric_assemble(GenericMatrix& As,
914+ GenericMatrix& An,
915+ const Form& a,
916+ const std::vector<const DirichletBC*>& row_bcs,
917+ const std::vector<const DirichletBC*>& col_bcs,
918+ const MeshFunction<unsigned int>* cell_domains=NULL,
919+ const MeshFunction<unsigned int>* exterior_facet_domains=NULL,
920+ const MeshFunction<unsigned int>* interior_facet_domains=NULL,
921+ bool reset_sparsity=true,
922+ bool add_values=false,
923+ bool finalize_tensor=true);
924+
925 //--- Specialized versions for scalars ---
926
927 /// Assemble scalar
928
929=== modified file 'dolfin/fem/dolfin_fem.h'
930--- dolfin/fem/dolfin_fem.h 2011-06-30 22:15:54 +0000
931+++ dolfin/fem/dolfin_fem.h 2012-02-29 22:48:20 +0000
932@@ -17,6 +17,7 @@
933 #include <dolfin/fem/Form.h>
934 #include <dolfin/fem/Assembler.h>
935 #include <dolfin/fem/SparsityPatternBuilder.h>
936+#include <dolfin/fem/SymmetricAssembler.h>
937 #include <dolfin/fem/SystemAssembler.h>
938 #include <dolfin/fem/LinearVariationalProblem.h>
939 #include <dolfin/fem/LinearVariationalSolver.h>
940
941=== modified file 'site-packages/dolfin/fem/assembling.py'
942--- site-packages/dolfin/fem/assembling.py 2011-11-14 21:54:12 +0000
943+++ site-packages/dolfin/fem/assembling.py 2012-02-29 22:48:20 +0000
944@@ -29,11 +29,12 @@
945 # Modified by Martin Sandve Alnaes, 2008.
946 # Modified by Johan Hake, 2008-2009.
947 # Modified by Garth N. Wells, 2008-2009.
948+# Modified by Joachim B. Haga, 2012.
949 #
950 # First added: 2007-08-15
951-# Last changed: 2010-11-04
952+# Last changed: 2012-02-01
953
954-__all__ = ["assemble", "assemble_system"]
955+__all__ = ["assemble", "assemble_system", "symmetric_assemble"]
956
957 import types
958
959@@ -59,7 +60,9 @@
960 add_values=False,
961 finalize_tensor=True,
962 backend=None,
963- form_compiler_parameters=None):
964+ form_compiler_parameters=None,
965+ bcs=None,
966+ symmetric_mod=None):
967 """
968 Assemble the given form and return the corresponding tensor.
969
970@@ -182,6 +185,14 @@
971 if dolfin_form.rank() == 0:
972 tensor = tensor.getval()
973
974+ # Apply (possibly list of) boundary conditions
975+ for bc in _wrap_in_list(bcs, 'bcs', cpp.DirichletBC):
976+ bc.apply(tensor)
977+
978+ # Apply symmetric modification
979+ if symmetric_mod:
980+ tensor -= symmetric_mod*tensor
981+
982 # Return value
983 return tensor
984
985@@ -261,12 +272,7 @@
986 interior_facet_domains)
987
988 # Check bcs
989- if not isinstance(bcs,(types.NoneType,list,cpp.DirichletBC)):
990- raise TypeError, "expected a 'list', or a 'DirichletBC' as bcs argument"
991- if bcs is None:
992- bcs = []
993- elif isinstance(bcs,cpp.DirichletBC):
994- bcs = [bcs]
995+ bcs = _wrap_in_list(bcs, 'bcs', cpp.DirichletBC)
996
997 # Call C++ assemble function
998 cpp.assemble_system(A_tensor,
999@@ -284,6 +290,113 @@
1000
1001 return A_tensor, b_tensor
1002
1003+# JIT symmetric assembler
1004+def symmetric_assemble(A_form,
1005+ bcs=None,
1006+ row_bcs=None,
1007+ col_bcs=None,
1008+ A_coefficients=None,
1009+ A_function_spaces=None,
1010+ cell_domains=None,
1011+ exterior_facet_domains=None,
1012+ interior_facet_domains=None,
1013+ reset_sparsity=True,
1014+ add_values=False,
1015+ finalize_tensor=True,
1016+ As_tensor=None,
1017+ An_tensor=None,
1018+ backend=None,
1019+ form_compiler_parameters=None,
1020+ bc_diagonal_value=1.0):
1021+ """
1022+ Assemble form(s) and apply any given boundary conditions in a
1023+ symmetric fashion and return tensor(s).
1024+
1025+ The standard application of boundary conditions does not
1026+ necessarily preserve the symmetry of the assembled matrix.
1027+
1028+ *Examples of usage*
1029+
1030+ For instance, the statements
1031+
1032+ .. code-block:: python
1033+
1034+ A = assemble(a)
1035+ b = assemble(L)
1036+ bc.apply(A, b)
1037+
1038+ can alternatively be carried out by
1039+
1040+ .. code-block:: python
1041+
1042+        A, A_n = symmetric_assemble(a, bcs=bc)
1043+        b = assemble(L, bcs=bc)
1044+        b -= A_n*b
1045+        # equivalently: b = assemble(L, bcs=bc, symmetric_mod=A_n)
1046+
1047+ The statement above is valid even if ``bc`` is a list of
1048+ :py:class:`DirichletBC <dolfin.fem.bcs.DirichletBC>`
1049+ instances. For more info and options, see :py:func:`assemble
1050+ <dolfin.fem.assembling.assemble>`.
1051+
1052+ """
1053+
1054+ # Extract subdomains
1055+ subdomains = { "cell": cell_domains,
1056+ "exterior_facet": exterior_facet_domains,
1057+ "interior_facet": interior_facet_domains}
1058+
1059+ # Wrap forms
1060+ A_dolfin_form = Form(A_form, A_function_spaces, A_coefficients,
1061+ subdomains, form_compiler_parameters)
1062+
1063+ # Create tensors
1064+ As_tensor = _create_tensor(A_form, A_dolfin_form.rank(), backend, As_tensor)
1065+ An_tensor = _create_tensor(A_form, A_dolfin_form.rank(), backend, An_tensor)
1066+
1067+ # Extract domains
1068+ cell_domains, exterior_facet_domains, interior_facet_domains = \
1069+ _extract_domains(A_dolfin_form.mesh(),
1070+ cell_domains,
1071+ exterior_facet_domains,
1072+ interior_facet_domains)
1073+
1074+ # Check bcs
1075+ if bcs is None:
1076+ row_bcs = _wrap_in_list(row_bcs, 'row_bcs', cpp.DirichletBC)
1077+ col_bcs = _wrap_in_list(col_bcs, 'col_bcs', cpp.DirichletBC)
1078+ else:
1079+ if row_bcs is not None or col_bcs is not None:
1080+ raise TypeError("supply 'bcs' or 'row_bcs'/'col_bcs', not both")
1081+ row_bcs = col_bcs = _wrap_in_list(bcs, 'bcs', cpp.DirichletBC)
1082+
1083+ # Call C++ assemble function
1084+ cpp.symmetric_assemble(As_tensor,
1085+ An_tensor,
1086+ A_dolfin_form,
1087+ row_bcs,
1088+ col_bcs,
1089+ cell_domains,
1090+ exterior_facet_domains,
1091+ interior_facet_domains,
1092+ reset_sparsity,
1093+ add_values,
1094+ finalize_tensor)
1095+
1096+ return As_tensor, An_tensor
1097+
1098+def _wrap_in_list(obj, name, types=type):
1099+ if obj is None:
1100+ lst = []
1101+ elif hasattr(obj, '__iter__'):
1102+ lst = list(obj)
1103+ else:
1104+ lst = [obj]
1105+ for obj in lst:
1106+ if not isinstance(obj, types):
1107+ raise TypeError("expected a (list of) %s as '%s' argument" % (str(types),name))
1108+ return lst
1109+
1110 def _create_tensor(form, rank, backend, tensor):
1111 "Create tensor for form"
1112
1113
1114=== added file 'test/unit/fem/python/SymmetricAssembler.py'
1115--- test/unit/fem/python/SymmetricAssembler.py 1970-01-01 00:00:00 +0000
1116+++ test/unit/fem/python/SymmetricAssembler.py 2012-02-29 22:48:20 +0000
1117@@ -0,0 +1,181 @@
1118+"""Unit tests for class SymmetricAssembler"""
1119+
1120+# Copyright (C) 2011 Garth N. Wells
1121+# Copyright (C) 2012 Joachim B. Haga
1122+#
1123+# This file is part of DOLFIN.
1124+#
1125+# DOLFIN is free software: you can redistribute it and/or modify
1126+# it under the terms of the GNU Lesser General Public License as published by
1127+# the Free Software Foundation, either version 3 of the License, or
1128+# (at your option) any later version.
1129+#
1130+# DOLFIN is distributed in the hope that it will be useful,
1131+# but WITHOUT ANY WARRANTY; without even the implied warranty of
1132+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1133+# GNU Lesser General Public License for more details.
1134+#
1135+# You should have received a copy of the GNU Lesser General Public License
1136+# along with DOLFIN. If not, see <http://www.gnu.org/licenses/>.
1137+#
1138+# First added: 2012-01-01 (modified from SystemAssembler.py by jobh@simula.no)
1139+
1140+import unittest
1141+import numpy
1142+from dolfin import *
1143+
1144+class TestSymmetricAssembler(unittest.TestCase):
1145+
1146+ def _check_against_reference(self, a, L, bc):
1147+
1148+ # Assemble LHS using symmetric assembler
1149+ A, A_n = symmetric_assemble(a, bcs=bc)
1150+
1151+ # Assemble LHS using regular assembler
1152+ A_ref = assemble(a, bcs=bc)
1153+
1154+        # Check that the symmetric assembly matches the reference
1155+ N = A + A_n - A_ref
1156+ self.assertAlmostEqual(N.norm("frobenius"), 0.0, 10)
1157+
1158+ # Check that A is symmetric
1159+ X = assemble(L) # just to get the size
1160+ X.set_local(numpy.random.random(X.local_size()))
1161+ AT_X = Vector()
1162+ A.transpmult(X, AT_X)
1163+ N = A*X - AT_X
1164+ self.assertAlmostEqual(N.norm("l2"), 0.0, 10)
1165+
1166+ def test_cell_assembly(self):
1167+
1168+ mesh = UnitCube(4, 4, 4)
1169+ V = VectorFunctionSpace(mesh, "CG", 1)
1170+
1171+ v = TestFunction(V)
1172+ u = TrialFunction(V)
1173+ f = Constant((10, 20, 30))
1174+
1175+ def epsilon(v):
1176+ return 0.5*(grad(v) + grad(v).T)
1177+
1178+ a = inner(epsilon(v), epsilon(u))*dx
1179+ L = inner(v, f)*dx
1180+
1181+ # Define boundary condition
1182+ def boundary(x):
1183+ return near(x[0], 0.0) or near(x[0], 1.0)
1184+ u0 = Constant((1.0, 2.0, 3.0))
1185+ bc = DirichletBC(V, u0, boundary)
1186+
1187+ self._check_against_reference(a, L, bc)
1188+
1189+ def test_facet_assembly(self):
1190+
1191+ if MPI.num_processes() > 1:
1192+ print "FIXME: This unit test does not work in parallel, skipping"
1193+ return
1194+
1195+ mesh = UnitSquare(24, 24)
1196+ V = FunctionSpace(mesh, "CG", 1)
1197+
1198+ # Define test and trial functions
1199+ v = TestFunction(V)
1200+ u = TrialFunction(V)
1201+
1202+ # Define normal component, mesh size and right-hand side
1203+ n = V.cell().n
1204+ h = CellSize(mesh)
1205+ h_avg = (h('+')+h('-'))/2
1206+ f = Expression("500.0*exp(-(pow(x[0] - 0.5, 2) + pow(x[1] - 0.5, 2)) / 0.02)", degree=1)
1207+
1208+ # Define bilinear form
1209+ a = dot(grad(v), grad(u))*dx \
1210+ - dot(avg(grad(v)), jump(u, n))*dS \
1211+ - dot(jump(v, n), avg(grad(u)))*dS \
1212+ + 4.0/h_avg*dot(jump(v, n), jump(u, n))*dS \
1213+ - dot(grad(v), u*n)*ds \
1214+ - dot(v*n, grad(u))*ds \
1215+ + 8.0/h*v*u*ds
1216+
1217+ # Define linear form
1218+ L = v*f*dx
1219+
1220+ # Define boundary condition
1221+ def boundary(x):
1222+ return near(x[0], 0.0) or near(x[0], 1.0)
1223+ u0 = Constant(1.0)
1224+ bc = DirichletBC(V, u0, boundary, method="pointwise")
1225+
1226+ self._check_against_reference(a, L, bc)
1227+
1228+ def test_subdomain_assembly_meshdomains(self):
1229+ "Test assembly over subdomains with markers stored as part of mesh"
1230+
1231+ # Create a mesh of the unit cube
1232+ mesh = UnitCube(4, 4, 4)
1233+
1234+ # Define subdomains for 3 faces of the unit cube
1235+ class F0(SubDomain):
1236+            def inside(self, x, on_boundary):
1237+ return near(x[0], 0.0)
1238+ class F1(SubDomain):
1239+            def inside(self, x, on_boundary):
1240+ return near(x[1], 0.0)
1241+ class F2(SubDomain):
1242+            def inside(self, x, on_boundary):
1243+ return near(x[2], 0.0)
1244+
1245+ # Define subdomain for left of x = 0.5
1246+ class S0(SubDomain):
1247+            def inside(self, x, on_boundary):
1248+ return x[0] < 0.5 + DOLFIN_EPS
1249+
1250+ # Define subdomain for right of x = 0.5
1251+ class S1(SubDomain):
1252+            def inside(self, x, on_boundary):
1253+ return x[0] >= 0.5 + DOLFIN_EPS
1254+
1255+ # Mark mesh
1256+ f0 = F0()
1257+ f1 = F1()
1258+ f2 = F2()
1259+ s0 = S0()
1260+ s1 = S1()
1261+ f0.mark_facets(mesh, 0)
1262+ f1.mark_facets(mesh, 1)
1263+ f2.mark_facets(mesh, 2)
1264+ s0.mark_cells(mesh, 0)
1265+ s1.mark_cells(mesh, 1)
1266+
1267+ # Define test and trial functions
1268+ V = FunctionSpace(mesh, "CG", 1)
1269+ u = TrialFunction(V)
1270+ v = TestFunction(V)
1271+
1272+ # FIXME: If the Z terms are not present, PETSc will claim:
1273+ # Object is in wrong state!
1274+ # Matrix is missing diagonal entry in row 124!
1275+ Z = Constant(0.0)
1276+
1277+ # Define forms on marked subdomains
1278+ a0 = 1*u*v*dx(0) + 2*u*v*ds(0) + 3*u*v*ds(1) + 4*u*v*ds(2) + Z*u*v*dx(1)
1279+ L0 = 1*v*dx(0) + 2*v*ds(0) + 3*v*ds(1) + 4*v*ds(2)
1280+
1281+        # Define forms on unmarked subdomains (should be zero)
1282+ a1 = 1*u*v*dx(2) + 2*u*v*ds(3) + Z*u*v*dx(0) + Z*u*v*dx(1)
1283+ L1 = 1*v*dx(2) + 2*v*ds(3)
1284+
1285+ # Define boundary condition
1286+ def boundary(x):
1287+ return near(x[0], 0.0) or near(x[0], 1.0)
1288+ u0 = Constant(1.0)
1289+ bc = DirichletBC(V, u0, boundary, method="pointwise")
1290+
1291+ self._check_against_reference(a0, L0, bc)
1292+ self._check_against_reference(a1, L1, bc)
1293+
1294+if __name__ == "__main__":
1295+ print ""
1296+ print "Testing class SymmetricAssembler"
1297+    print "--------------------------------"
1298+ unittest.main()
1299
1300=== modified file 'test/unit/test.py'
1301--- test/unit/test.py 2012-02-22 12:31:32 +0000
1302+++ test/unit/test.py 2012-02-29 22:48:20 +0000
1303@@ -35,7 +35,7 @@
1304 "adaptivity": ["errorcontrol", "TimeSeries"],
1305 "book": ["chapter_1", "chapter_10"],
1306 "fem": ["solving", "Assembler", "DirichletBC", "DofMap",
1307- "FiniteElement", "SystemAssembler", "Form"],
1308+ "FiniteElement", "SystemAssembler", "Form", "SymmetricAssembler"],
1309 "function": ["Constant", "Expression", "Function", "FunctionSpace",
1310 "SpecialFunctions"],
1311 "io": ["vtk", "XMLMeshFunction", "XMLMesh",

Subscribers

People subscribed via source and target branches