Merge lp:~maddevelopers/mg5amcnlo/smart_zeros into lp:~maddevelopers/mg5amcnlo/2.3.4

Proposed by Valentin Hirschi
Status: Merged
Merged at revision: 393
Proposed branch: lp:~maddevelopers/mg5amcnlo/smart_zeros
Merge into: lp:~maddevelopers/mg5amcnlo/2.3.4
Diff against target: 782 lines (+306/-115) (has conflicts)
13 files modified
Template/NLO/SubProcesses/makefile_loop.inc (+13/-2)
Template/loop_material/StandAlone/SubProcesses/makefile (+13/-2)
madgraph/iolibs/export_v4.py (+4/-4)
madgraph/iolibs/file_writers.py (+1/-0)
madgraph/iolibs/template_files/loop_optimized/helas_calls_split.inc (+5/-3)
madgraph/iolibs/template_files/loop_optimized/loop_matrix_standalone.inc (+7/-5)
madgraph/iolibs/template_files/loop_optimized/mp_compute_loop_coefs.inc (+9/-3)
madgraph/iolibs/template_files/loop_optimized/mp_helas_calls_split.inc (+3/-4)
madgraph/iolibs/template_files/loop_optimized/polynomial.inc (+6/-8)
madgraph/loop/loop_exporters.py (+46/-21)
madgraph/various/process_checks.py (+4/-0)
madgraph/various/q_polynomial.py (+194/-63)
tests/time_db (+1/-0)
Text conflict in madgraph/various/process_checks.py
To merge this branch: bzr merge lp:~maddevelopers/mg5amcnlo/smart_zeros
Reviewer: Hua-Sheng Shao
Review status: Approve
Review via email: mp+287741@code.launchpad.net

Description of the change

This branch brings a small but significant improvement to the computation of the loop polynomial coefficients in MadLoop.

Profiling shows that the vast majority of the computation time for these coefficients is spent in the 'update_wl' functions, which implement the tensor product of the 'loop wavefunction polynomials' with the 'vertex polynomial updaters' provided by aloha.
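
For a sense of scale, a comment in q_polynomial.py (visible in the preview diff) estimates the cost of one such update as MAXLWFSIZE**3 * NCoef(r_1) * NCoef(r_2). The Python sketch below reproduces that estimate. It assumes MAXLWFSIZE = 4 and that NCoef(r) counts the independent components of the symmetric tensors of rank 0 to r in 4 dimensions; both assumptions are consistent with the figure of 22'400 quoted in that comment, but the function names are illustrative and are not taken from the MadLoop code.

```python
from math import comb


def n_coefs_for_rank(rank, dim=4):
    """Number of coefficients of a polynomial in q of the given rank.

    Assumption: one coefficient per independent component of the fully
    symmetric tensors of rank 0..rank in `dim` dimensions, i.e. C(rank+dim, dim).
    """
    if rank < 0:
        return 0
    return comb(rank + dim, dim)


def update_cost(loop_rank, updater_rank, max_lwf_size=4):
    """Rough operation count of one UPDATE_WL call (no zero filtering)."""
    return (max_lwf_size ** 3
            * n_coefs_for_rank(loop_rank)
            * n_coefs_for_rank(updater_rank))


if __name__ == "__main__":
    # Rank-4 loop wavefunction updated with a rank-1 vertex polynomial.
    print(update_cost(4, 1))
```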

Basically, I identified that within the current framework there are three ways of implementing this tensor product:

a) Loop only over the wavefunction/updater indices and write out the expanded tensor product explicitly (i.e. expand explicitly over the loop wf and updater coefs).
This was the only technique used before this branch.

b) Perform all 5 do-loops (3 over the loop wf + updater indices and 2 over the polynomial coefficients), but looping first over the updater coefficients and filtering out those which are zero.

c) Same as b), i.e. all 5 do-loops, but this time looping first over the loop wavefunction coefficients, with a filter on the loop wf coefs which are zero.
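
To make the zero-filtering idea of strategy b) concrete, here is a simplified Python sketch. It is not the generated Fortran: it drops the three wavefunction/updater indices and keeps only the two loops over polynomial coefficients, and the monomial ordering (by rank, then lexicographic) is an assumption made for illustration. The lookup table plays the role of the COMB_COEF_POS map introduced in this branch.

```python
from itertools import combinations_with_replacement


def monomials(max_rank, dim=4):
    """Monomials in q up to max_rank, as sorted index tuples ordered by rank."""
    result = []
    for rank in range(max_rank + 1):
        result.extend(combinations_with_replacement(range(dim), rank))
    return result


def build_comb_coef_pos(loop_rank, updater_rank):
    """table[a][b] = position of the product of monomials number a and b."""
    all_monomials = monomials(loop_rank + updater_rank)
    position = {m: i for i, m in enumerate(all_monomials)}
    n_loop = len(monomials(loop_rank))
    n_updater = len(monomials(updater_rank))
    return [[position[tuple(sorted(all_monomials[a] + all_monomials[b]))]
             for b in range(n_updater)]
            for a in range(n_loop)]


def update_filtered(loop_coefs, updater_coefs, table, n_out):
    """Polynomial product, outer loop over the updater coefficients,
    skipping those that vanish (the filter of strategy b)."""
    out = [0j] * n_out
    for b, updater_coef in enumerate(updater_coefs):
        if updater_coef == 0:
            continue  # a zero updater coefficient contributes nothing
        for a, loop_coef in enumerate(loop_coefs):
            out[table[a][b]] += loop_coef * updater_coef
    return out


if __name__ == "__main__":
    table = build_comb_coef_pos(1, 1)
    loop_poly = [1, 1, 0, 0, 0]      # 1 + q0
    updater_poly = [0, 0, 2, 0, 0]   # 2*q1
    print(update_filtered(loop_poly, updater_poly, table, len(monomials(2))))
```

Strategy c) would swap the two loops and test the loop wavefunction coefficient instead.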

So a gain can be obtained by wisely choosing which strategy to use for the different combinations of loop wavefunction rank and updater rank.
The following choice is made, based on several empirical profiling runs.

-------------------------------

if ( loop_wf rank == 0 ) or (updater rank == 0 ) or (loop_wf rank == updater rank == 1)
-> Keep the original strategy a), which seems to be faster

else if ( loop_wf rank ) >= ( updater rank )
-> Use the strategy b) which exploits the fact that the loop polynomial is high rank in comparison to the vertex polynomial.

else
-> Use the strategy c) which exploits the fact that the vertex polynomial is high rank in comparison to the loop wf polynomial.
This is typically not a very relevant change as it is basically used only for the combination
(loop wf rank =1 , updater rank =2) in effective theories.

-------------------------------
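
The selection rule above can be summarised by the following Python sketch. It mirrors the if/elif/else added to loop_exporters.py in this branch; the function name and the returned labels are hypothetical and serve only as illustration.

```python
def choose_update_strategy(loop_rank, updater_rank):
    """Return which of the three strategies would be used for UPDATE_WL.

    'a': fully expanded tensor product (original behaviour)
    'b': compact do-loops, filtering zeros of the updater coefficients first
    'c': compact do-loops, filtering zeros of the loop wf coefficients first
    """
    if loop_rank == 0 or updater_rank == 0 or (loop_rank == updater_rank == 1):
        return 'a'
    elif loop_rank >= updater_rank:
        return 'b'
    else:
        return 'c'


if __name__ == "__main__":
    for ranks in [(0, 1), (1, 1), (2, 1), (4, 1), (1, 2)]:
        print(ranks, choose_update_strategy(*ranks))
```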

So the introduction of strategy b) is really what brings the improvement. However, for this improvement to be large, polynomial.f must be compiled without '-fbounds-check' and preferably with '-O3'.
I have therefore slightly altered the makefiles so that this source file in particular enforces the above, irrespective of what is in make_opts.
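
As an illustration of what the new makefile rule does to the compiler flags, the Python sketch below emulates it word by word: any '-O...' flag and '-fbounds-check' are dropped from FFLAGS, then the values of POLYNOMIAL_OPTIMIZATION and POLYNOMIAL_BOUNDS_CHECK are appended. This is only an approximation of the make text functions used in the rule (it works on whole words), and the function name is hypothetical.

```python
def polynomial_fflags(fflags, optimization="-O3", bounds_check=""):
    """Approximate the flags used to compile polynomial.f.

    fflags       -- the generic FFLAGS string (e.g. from make_opts)
    optimization -- stands in for POLYNOMIAL_OPTIMIZATION
    bounds_check -- stands in for POLYNOMIAL_BOUNDS_CHECK (empty by default)
    """
    kept = [flag for flag in fflags.split()
            if flag != "-fbounds-check" and not flag.startswith("-O")]
    extra = [flag for flag in (optimization, bounds_check) if flag]
    return " ".join(kept + extra)


if __name__ == "__main__":
    print(polynomial_fflags("-O -fPIC -fbounds-check -ffixed-line-length-132"))
```

Passing bounds_check="-fbounds-check" corresponds to re-enabling the check for this file only.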

The improvements obtained are not a game-changer, but are still welcome.
Note that even though the implementation is such that the gain should be larger for more complicated processes, this is not guaranteed, as it also depends on the sparsity of the updater polynomial coefficients.
The gains below are relative to the loop polynomial coefficient computation only (i.e. not relative to the timing incl. loop reduction):

u d~ > e+ ve -> No gain, code completely identical in this case
g g > t t~ -> -25%
g g > t t~ g -> -18%
g g > t t~ g g -> -9%
u u~ > d d~ s s~ -> -27%
u u~ > d d~ s s~ g -> -23%
g g > x0 g (HEFT process) -> -40%
g g > x0 g g (HEFT process) -> -19%
g g > y2 g (y2 = massive spin-2 boson) -> -43%
g g > y2 g g (y2 = massive spin-2 boson) -> -31%
g g > h h -> -58%
g g > h h h -> -60%
g g > h h h h -> -64%
g g > z z -> -41%
g g > z z z -> -43%

The improvement is better for loop-induced processes, but unfortunately this is also where we are already dominated by the reduction time anyway, so it doesn't matter much :(.

Finally, one crazy idea would be to dynamically choose the optimal method among the three above for each UPDATE_WL call, with a training session. But OK, let's not go there...
And of course another avenue of optimization is to properly choose the optimal l-cut location of each loop so as to maximize the number of loop wavefunctions recycled.
But this was my one improvement of the year on the loop polynomial computation; anything more will wait for 2017 at least.

(Originally the hope was to get even larger gains by having aloha specify which coefficients can be zero and by keeping track overall of the list of non-zero coefficients.
But after 2 long days of testing and trying hard, the full-fledged tracking of zero coefs seems to cost more than it saves, except when done with a partial filtering like above. Well... at least I tried.)

Anyway, as far as the review goes (I picked you, Olivier, because we already discussed this a bit, but others who are reading this are welcome to give their opinion), there is not much to be done here:
Just make sure that things go smoothly for a couple of runs, double-check a couple of the timing improvements above (with the check timing -reuse command), and give me a green light (pretty please).

350. By Valentin Hirschi

1. Returned 'make_it_quick' internal option of check timing to the default false.

351. By Valentin Hirschi

1. Removed some useless comments in the code

Rikkert Frederix (frederix) wrote :

Hi Valentin,

If it doesn't compile with '-fbounds-check', doesn't that signify that some arrays go out of bounds, which might lead to compiler-dependent problems?

Cheers,
Rik

Valentin Hirschi (valentin-hirschi) wrote :

Hi Rik,

> Hi Valentin,
>
> If it doesn't compile with '-fbounds-check', doesn't that signify that some
> arrays go out of bounds, which might lead to compiler-dependent problems?

Well first of all, it is only the file 'polynomial.f' for which I vetoed the compiler flag '-fbounds-check' and forced the optimization '-O3'.

The bounds check can still be turned back on (and the optimization level changed) by editing the following variables in the MadLoop makefile:

POLYNOMIAL_OPTIMIZATION = -O3
POLYNOMIAL_BOUNDS_CHECK =

(defined as above by default)

Anyway, the fact that '-fbounds-check' is absent by default does indeed mean that if an array goes out of bounds the behavior can be anything, ranging from incorrect values retrieved from memory to segmentation faults (system- and compiler-dependent behavior indeed).
But that's OK because it should not happen in this very generic file 'polynomial.f', which I have now tested for many processes *with* the bounds checks.
It is the usual concept: one should test with bounds checks but then remove them for production.
We don't necessarily do this for other files because the gain is marginal, but this is no longer the case for polynomial.f.

Also, if some out-of-bounds access happens, then there is no way MadLoop's answer will be stable (no matter what happens with the memory access). So I think it is safe to assume that if incorrect bounds are present in polynomial.f, we would immediately notice it in any use of MadLoop that monitors its stability (as it should).

Cheers,

> Cheers,
> Rik

352. By Valentin Hirschi

1. Removed all comments and unnecessary now-dummy code.

Olivier Mattelaer (olivier-mattelaer) wrote :

Hi,

Is this merged in 2.3.4?
I plan to release 2.3.4 extremely soon.
So what is the status of this branch?
Should it wait for the next version? If not, please finish this review asap.

Cheers,

Olivier

Valentin Hirschi (valentin-hirschi) wrote :

Not yet. Huasheng, could you accept the merge? I'll carry on with it then.

Olivier, please wait until at least Wednesday for the release, as we should have the arXiv number for Ninja by then.

Cheers

Hua-Sheng Shao (erdissshaw) :
review: Approve

Preview Diff

=== modified file 'Template/NLO/SubProcesses/makefile_loop.inc'
--- Template/NLO/SubProcesses/makefile_loop.inc 2014-09-30 06:56:29 +0000
+++ Template/NLO/SubProcesses/makefile_loop.inc 2016-03-03 23:14:51 +0000
@@ -3,11 +3,16 @@
 LIBDIR = ../../../lib/
 LOOPLIB= libMadLoop.a

+# For the compilation of the MadLoop file polynomial.f it makes a big difference to use -O3 and
+# to turn off the bounds check. These can however be modified here if really necessary.
+POLYNOMIAL_OPTIMIZATION = -O3
+POLYNOMIAL_BOUNDS_CHECK =
+
 LINKLIBS = -L$(LIBDIR) -lcts -ldhelas -lmodel %(link_tir_libs)s
 LIBS = $(LIBDIR)libcts.$(libext) $(LIBDIR)libdhelas.$(libext) \
 $(LIBDIR)libmodel.$(libext) %(tir_libs)s
-PROCESS= loop_matrix.o improve_ps.o born_matrix.o loop_num.o CT_interface.o MadLoopCommons.o \
- $(patsubst %(dotf)s,%(doto)s,$(wildcard polynomial.f)) \
+PROCESS= $(patsubst %(dotf)s,%(doto)s,$(wildcard polynomial.f)) \
+ loop_matrix.o improve_ps.o born_matrix.o loop_num.o CT_interface.o MadLoopCommons.o \
 $(patsubst %(dotf)s,%(doto)s,$(wildcard MadLoopParamReader.f)) \
 $(patsubst %(dotf)s,%(doto)s,$(wildcard helas_calls*.f)) \
 $(patsubst %(dotf)s,%(doto)s,$(wildcard jamp?_calls_*.f)) \
@@ -21,6 +26,12 @@
 $(patsubst %(dotf)s,%(doto)s,$(wildcard GOLEM_interface.f)) \
 $(patsubst %(dotf)s,%(doto)s,$(wildcard compute_color_flows.f))

+# This is the core of madloop computationally wise, so make sure to turn optimizations on and bound checks off.
+# We use %%olynomial.o and not directly polynomial.o because we want it to match when both doing make check here
+# or make OLP one directory above
+%%olynomial.o : %%olynomial.f
+ $(FC) $(patsubst -O%%,, $(subst -fbounds-check,,$(FFLAGS))) $(POLYNOMIAL_OPTIMIZATION) $(POLYNOMIAL_BOUNDS_CHECK) -c $< -o $@ $(LOOP_INCLUDE)
+
 %(doto)s : %(dotf)s
 $(FC) $(FFLAGS) -c $< %(tir_include)s


=== modified file 'Template/loop_material/StandAlone/SubProcesses/makefile'
--- Template/loop_material/StandAlone/SubProcesses/makefile 2016-02-12 01:48:16 +0000
+++ Template/loop_material/StandAlone/SubProcesses/makefile 2016-03-03 23:14:51 +0000
@@ -9,6 +9,11 @@
 ROOT = ..
 endif

+# For the compilation of the MadLoop file polynomial.f it makes a big difference to use -O3 and
+# to turn off the bounds check. These can however be modified here if really necessary.
+POLYNOMIAL_OPTIMIZATION = -O3
+POLYNOMIAL_BOUNDS_CHECK =
+
 include $(ROOT)/Source/make_opts
 include $(ROOT)/SubProcesses/MadLoop_makefile_definitions
 SHELL = /bin/bash
@@ -23,12 +28,12 @@
 LIBS = $(LIBDIR)libdhelas.$(libext) $(LIBDIR)libmodel.$(libext) $(LOOP_LIBS)

 PROCESS= MadLoopParamReader.o MadLoopCommons.o \
+ $(patsubst $(DOTF),$(DOTO),$(wildcard polynomial.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard loop_matrix.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard improve_ps.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard born_matrix.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard CT_interface.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard loop_num.f)) \
- $(patsubst $(DOTF),$(DOTO),$(wildcard polynomial.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard helas_calls*.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard jamp?_calls_*.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard mp_born_amps_and_wfs.f)) \
@@ -42,12 +47,12 @@
 $(patsubst $(DOTF),$(DOTO),$(wildcard compute_color_flows.f))

 OLP_PROCESS= MadLoopParamReader.o MadLoopCommons.o \
+ $(patsubst $(DOTF),$(DOTO),$(wildcard $(LOOP_PREFIX)*/polynomial.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard $(LOOP_PREFIX)*/loop_matrix.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard $(LOOP_PREFIX)*/improve_ps.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard $(LOOP_PREFIX)*/born_matrix.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard $(LOOP_PREFIX)*/CT_interface.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard $(LOOP_PREFIX)*/loop_num.f)) \
- $(patsubst $(DOTF),$(DOTO),$(wildcard $(LOOP_PREFIX)*/polynomial.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard $(LOOP_PREFIX)*/helas_calls*.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard $(LOOP_PREFIX)*/jamp?_calls_*.f)) \
 $(patsubst $(DOTF),$(DOTO),$(wildcard $(LOOP_PREFIX)*/mp_born_amps_and_wfs.f)) \
@@ -70,6 +75,12 @@
 $(CHECK_SA_BORN_SPLITORDERS): check_sa_born_splitOrders.o $(patsubst $(DOTF),$(DOTO),$(wildcard *born_matrix.f)) makefile $(LIBDIR)libdhelas.$(libext) $(LIBDIR)libmodel.$(libext)
 $(FC) $(FFLAGS) -o $(CHECK_SA_BORN_SPLITORDERS) check_sa_born_splitOrders.o $(patsubst $(DOTF),$(DOTO),$(wildcard *born_matrix.f)) -L$(LIBDIR) -ldhelas -lmodel

+# This is the core of madloop computationally wise, so make sure to turn optimizations on and bound checks off.
+# We use %olynomial.o and not directly polynomial.o because we want it to match when both doing make check here
+# or make OLP one directory above
+%olynomial.o : %olynomial.f
+ $(FC) $(patsubst -O%,, $(subst -fbounds-check,,$(FFLAGS))) $(POLYNOMIAL_OPTIMIZATION) $(POLYNOMIAL_BOUNDS_CHECK) -c $< -o $@ $(LOOP_INCLUDE)
+
 $(DOTO) : $(DOTF)
 $(FC) $(FFLAGS) -c $< -o $@ $(LOOP_INCLUDE)


=== modified file 'madgraph/iolibs/export_v4.py'
--- madgraph/iolibs/export_v4.py 2016-03-02 04:03:52 +0000
+++ madgraph/iolibs/export_v4.py 2016-03-03 23:14:51 +0000
@@ -786,10 +786,12 @@
 #copy Helas Template
 cp(MG5DIR + '/aloha/template_files/Makefile_F', write_dir+'/makefile')
 if any([any(['L' in tag for tag in d[1]]) for d in wanted_lorentz]):
- cp(MG5DIR + '/aloha/template_files/aloha_functions_loop.f', write_dir+'/aloha_functions.f')
+ cp(MG5DIR + '/aloha/template_files/aloha_functions_loop.f',
+ write_dir+'/aloha_functions.f')
 aloha_model.loop_mode = False
 else:
- cp(MG5DIR + '/aloha/template_files/aloha_functions.f', write_dir+'/aloha_functions.f')
+ cp(MG5DIR + '/aloha/template_files/aloha_functions.f',
+ write_dir+'/aloha_functions.f')
 create_aloha.write_aloha_file_inc(write_dir, '.f', '.o')

 # Make final link in the Process
@@ -5217,7 +5219,6 @@
 if self.opt['mp']:
 self.create_intparam_def(dp=False,mp=True)

-
 # definition of the coupling.
 self.create_actualize_mp_ext_param_inc()
 self.create_coupl_inc()
@@ -5245,7 +5246,6 @@

 def copy_standard_file(self):
 """Copy the standard files for the fortran model."""
-

 #copy the library files
 file_to_link = ['formats.inc','printout.f', \

=== modified file 'madgraph/iolibs/file_writers.py'
--- madgraph/iolibs/file_writers.py 2015-10-01 16:00:08 +0000
+++ madgraph/iolibs/file_writers.py 2016-03-03 23:14:51 +0000
@@ -177,6 +177,7 @@
 '^type(?!\s*\()\s*.+\s*$': ('^endtype', 2),
 '^do(?!\s+\d+)\s+': ('^enddo\s*$', 2),
 '^subroutine': ('^end\s*$', 0),
+ '^module': ('^end\s*$', 0),
 'function': ('^end\s*$', 0)}
 single_indents = {'^else\s*$':-2,
 '^else\s*if.+then\s*$':-2}

=== modified file 'madgraph/iolibs/template_files/loop_optimized/helas_calls_split.inc'
--- madgraph/iolibs/template_files/loop_optimized/helas_calls_split.inc 2016-02-23 19:44:10 +0000
+++ madgraph/iolibs/template_files/loop_optimized/helas_calls_split.inc 2016-03-03 23:14:51 +0000
@@ -1,5 +1,9 @@
 SUBROUTINE %(proc_prefix)s%(bunch_name)s_%(bunch_number)d(P,NHEL,H,IC)
-C
+C
+C Modules
+C
+ use %(proc_prefix)sPOLYNOMIAL_CONSTANTS
+C
 IMPLICIT NONE
 C
 C CONSTANTS
@@ -18,8 +22,6 @@
 PARAMETER (NLOOPAMPS=%(nloopamps)d)
 INTEGER NWAVEFUNCS,NLOOPWAVEFUNCS
 PARAMETER (NWAVEFUNCS=%(nwavefuncs)d,NLOOPWAVEFUNCS=%(nloopwavefuncs)d)
- include 'loop_max_coefs.inc'
- include 'coef_specs.inc'
 %(real_dp_format)s ZERO
 PARAMETER (ZERO=0D0)
 %(real_mp_format)s MP__ZERO

=== modified file 'madgraph/iolibs/template_files/loop_optimized/loop_matrix_standalone.inc'
--- madgraph/iolibs/template_files/loop_optimized/loop_matrix_standalone.inc 2016-02-23 19:44:10 +0000
+++ madgraph/iolibs/template_files/loop_optimized/loop_matrix_standalone.inc 2016-03-03 23:14:51 +0000
@@ -11,7 +11,11 @@
 c and external lines W(0:6,NEXTERNAL)
 C
 %(process_lines)s
-C
+C
+C Modules
+C
+ use %(proc_prefix)sPOLYNOMIAL_CONSTANTS
+C
 IMPLICIT NONE
 C
 C USER CUSTOMIZABLE OPTIONS
@@ -74,8 +78,6 @@
 PARAMETER (NEXTERNAL=%(nexternal)d)
 INTEGER NWAVEFUNCS,NLOOPWAVEFUNCS
 PARAMETER (NWAVEFUNCS=%(nwavefuncs)d,NLOOPWAVEFUNCS=%(nloopwavefuncs)d)
- include 'loop_max_coefs.inc'
- include 'coef_specs.inc'
 INTEGER NCOMB
 PARAMETER (NCOMB=%(ncomb)d)
 %(real_dp_format)s ZERO
@@ -189,7 +191,7 @@
 ## if(ComputeColorFlows) {
 %(real_dp_format)s BUFFRES(0:3,0:NSQUAREDSO)
 ## }
- %(complex_dp_format)s COEFS(MAXLWFSIZE,0:VERTEXMAXCOEFS-1,MAXLWFSIZE)
+ %(complex_dp_format)s COEFS(MAXLWFSIZE,0:VERTEXMAXCOEFS-1,MAXLWFSIZE)
 %(complex_dp_format)s CFTOT
 LOGICAL FOUNDHELFILTER,FOUNDLOOPFILTER
 DATA FOUNDHELFILTER/.TRUE./
@@ -561,7 +563,7 @@
 C SETUP OF THE COMMON STARTING EXTERNAL LOOP WAVEFUNCTION
 C IT IS ALSO PS POINT INDEPENDENT, SO IT CAN BE DONE HERE.
 DO I=0,3
- PL(I,0)=(0.0d0,0.0d0)
+ PL(I,0)=DCMPLX(0.0d0,0.0d0)
 ENDDO
 DO I=1,MAXLWFSIZE
 DO J=0,LOOPMAXCOEFS-1

=== modified file 'madgraph/iolibs/template_files/loop_optimized/mp_compute_loop_coefs.inc'
--- madgraph/iolibs/template_files/loop_optimized/mp_compute_loop_coefs.inc 2016-02-25 19:31:10 +0000
+++ madgraph/iolibs/template_files/loop_optimized/mp_compute_loop_coefs.inc 2016-03-03 23:14:51 +0000
@@ -8,6 +8,10 @@
 C
 %(process_lines)s
 C
+C Modules
+C
+ use %(proc_prefix)sPOLYNOMIAL_CONSTANTS
+C
 IMPLICIT NONE
 C
 C CONSTANTS
@@ -28,8 +32,6 @@
 PARAMETER (NEXTERNAL=%(nexternal)d)
 INTEGER NWAVEFUNCS,NLOOPWAVEFUNCS
 PARAMETER (NWAVEFUNCS=%(nwavefuncs)d,NLOOPWAVEFUNCS=%(nloopwavefuncs)d)
- include 'loop_max_coefs.inc'
- include 'coef_specs.inc'
 INTEGER NCOMB
 PARAMETER (NCOMB=%(ncomb)d)
 %(real_mp_format)s ZERO
@@ -206,8 +208,12 @@
 ENDDO

 DO I=0,3
- PL(I,0)=(ZERO,ZERO)
+ PL(I,0)=CMPLX(ZERO,ZERO,KIND=16)
+ IF (.NOT.COMPUTE_INTEGRAND_IN_QP) THEN
+ DP_PL(I,0)=DCMPLX(0.0d0,0.0d0)
+ ENDIF
 ENDDO
+
 ## if(AmplitudeReduction){
 IF (.NOT.SKIP_LOOPNUM_COEFS_CONSTRUCTION) THEN
 ## }

=== modified file 'madgraph/iolibs/template_files/loop_optimized/mp_helas_calls_split.inc'
--- madgraph/iolibs/template_files/loop_optimized/mp_helas_calls_split.inc 2016-02-23 19:44:10 +0000
+++ madgraph/iolibs/template_files/loop_optimized/mp_helas_calls_split.inc 2016-03-03 23:14:51 +0000
@@ -1,5 +1,6 @@
 SUBROUTINE %(proc_prefix)s%(bunch_name)s_%(bunch_number)d(P,NHEL,H,IC)
-C
+C
+ use %(proc_prefix)sPOLYNOMIAL_CONSTANTS
 IMPLICIT NONE
 C
 C CONSTANTS
@@ -19,8 +20,6 @@
 PARAMETER (NLOOPAMPS=%(nloopamps)d)
 INTEGER NWAVEFUNCS,NLOOPWAVEFUNCS
 PARAMETER (NWAVEFUNCS=%(nwavefuncs)d,NLOOPWAVEFUNCS=%(nloopwavefuncs)d)
- include 'loop_max_coefs.inc'
- include 'coef_specs.inc'
 %(real_mp_format)s ZERO
 PARAMETER (ZERO=0.0e0_16)
 %(complex_mp_format)s IZERO
@@ -38,7 +37,7 @@
 C LOCAL VARIABLES
 C
 INTEGER I,J,K
- %(complex_mp_format)s COEFS(MAXLWFSIZE,0:VERTEXMAXCOEFS-1,MAXLWFSIZE)
+ %(complex_mp_format)s COEFS(MAXLWFSIZE,0:VERTEXMAXCOEFS-1,MAXLWFSIZE)
 C
 C GLOBAL VARIABLES
 C

=== modified file 'madgraph/iolibs/template_files/loop_optimized/polynomial.inc'
--- madgraph/iolibs/template_files/loop_optimized/polynomial.inc 2016-02-23 19:44:10 +0000
+++ madgraph/iolibs/template_files/loop_optimized/polynomial.inc 2016-03-03 23:14:51 +0000
@@ -3,6 +3,7 @@
 C MULTIPLY BY THE BORN

 SUBROUTINE %(mp_prefix)s%(proc_prefix)sCREATE_LOOP_COEFS(LOOP_WF,RANK,LCUT_SIZE,LOOP_GROUP_NUMBER,SYMFACT,MULTIPLIER,COLOR_ID,HELCONFIG)
+ USE %(proc_prefix)sPOLYNOMIAL_CONSTANTS
 implicit none
 C
 C CONSTANTS
@@ -17,8 +18,6 @@
 PARAMETER (IMAG1=(ZERO,ONE))
 %(complex_format)s CMPLX_ZERO
 PARAMETER (CMPLX_ZERO=(ZERO,ZERO))
- include 'loop_max_coefs.inc'
- include 'coef_specs.inc'
 INTEGER NCOLORROWS
 PARAMETER (NCOLORROWS=%(nloopamps)d)
 INTEGER NLOOPGROUPS
@@ -97,6 +96,7 @@
 C amplitude level so that no multiplication is performed.

 SUBROUTINE %(mp_prefix)s%(proc_prefix)sCREATE_LOOP_COEFS(LOOP_WF,RANK,LCUT_SIZE,LOOP_GROUP_NUMBER,SYMFACT,MULTIPLIER)
+ USE %(proc_prefix)sPOLYNOMIAL_CONSTANTS
 implicit none
 C
 C CONSTANTS
@@ -107,8 +107,6 @@
 PARAMETER (IMAG1=(ZERO,ONE))
 %(complex_format)s CMPLX_ZERO
 PARAMETER (CMPLX_ZERO=(ZERO,ZERO))
- include 'loop_max_coefs.inc'
- include 'coef_specs.inc'
 INTEGER NLOOPGROUPS
 PARAMETER (NLOOPGROUPS=%(nloop_groups)d)
 INTEGER NCOMB
@@ -143,16 +141,13 @@
 C Just a handy subroutine to modify the coefficients for the
 C tranformation q_loop -> -q_loop
 C It is only used for the NINJA interface
+ USE %(proc_prefix)sPOLYNOMIAL_CONSTANTS
 IMPLICIT NONE

- include 'loop_max_coefs.inc'
 INTEGER I, NCOEFS

 %(complex_format)s POLYNOMIAL(0:NCOEFS-1)

- INTEGER COEFTORANK_MAP(0:LOOPMAXCOEFS-1)
- %(coef_to_rank_map_definition)s
-
 DO I=0,NCOEFS-1
 IF (MOD(COEFTORANK_MAP(I),2).eq.1) then
 POLYNOMIAL(I)=-POLYNOMIAL(I)
@@ -161,3 +156,6 @@

 END
 ## }
+
+C Now the routines to update the wavefunctions
+

=== modified file 'madgraph/loop/loop_exporters.py'
--- madgraph/loop/loop_exporters.py 2016-02-24 13:54:17 +0000
+++ madgraph/loop/loop_exporters.py 2016-03-03 23:14:51 +0000
@@ -2070,18 +2070,7 @@

 # Start from the routine in the template
 replace_dict = copy.copy(matrix_element.rep_dict)
-
- # Write the definition of the coef_to_rank_map
- coef_to_rank_map_definition = []
- for rank in range(replace_dict['maxrank']+1):
- start = q_polynomial.get_number_of_coefs_for_rank(rank-1)
- end = q_polynomial.get_number_of_coefs_for_rank(rank)-1
- coef_to_rank_map_definition.append(
-'DATA (COEFTORANK_MAP(I),I=%(start)d,%(end)d)/%(n_entries)d*%(rank)d/'%
-{'start': start,'end': end,'n_entries': end-start+1,'rank': rank})
- replace_dict['coef_to_rank_map_definition']=\
- '\n'.join(coef_to_rank_map_definition)
-
+
 dp_routine = open(os.path.join(self.template_dir,'polynomial.inc')).read()
 mp_routine = open(os.path.join(self.template_dir,'polynomial.inc')).read()
 # The double precision version of the basic polynomial routines, such as
@@ -2106,11 +2095,19 @@

 # Initialize the polynomial routine writer
 poly_writer=q_polynomial.FortranPolynomialRoutines(
- matrix_element.get_max_loop_rank(),
- sub_prefix=replace_dict['proc_prefix'])
+ matrix_element.get_max_loop_rank(),
+ updater_max_rank = matrix_element.get_max_loop_vertex_rank(),
+ sub_prefix=replace_dict['proc_prefix'],
+ proc_prefix=replace_dict['proc_prefix'],
+ mp_prefix='')
+ # Write the polynomial constant module common to all
+ writer.writelines(poly_writer.write_polynomial_constant_module()+'\n')
+
 mp_poly_writer=q_polynomial.FortranPolynomialRoutines(
- matrix_element.get_max_loop_rank(),coef_format='complex*32',
- sub_prefix='MP_'+replace_dict['proc_prefix'])
+ matrix_element.get_max_loop_rank(),
+ updater_max_rank = matrix_element.get_max_loop_vertex_rank(),
+ coef_format='complex*32', sub_prefix='MP_'+replace_dict['proc_prefix'],
+ proc_prefix=replace_dict['proc_prefix'], mp_prefix='MP_')
 # The eval subroutine
 subroutines.append(poly_writer.write_polynomial_evaluator())
 subroutines.append(mp_poly_writer.write_polynomial_evaluator())
@@ -2120,12 +2117,40 @@
 # The merging one for creating the loop coefficients
 subroutines.append(poly_writer.write_wl_merger())
 subroutines.append(mp_poly_writer.write_wl_merger())
- # Now the udpate subroutines
 for wl_update in matrix_element.get_used_wl_updates():
- subroutines.append(poly_writer.write_wl_updater(\
- wl_update[0],wl_update[1]))
- subroutines.append(mp_poly_writer.write_wl_updater(\
- wl_update[0],wl_update[1]))
+ # We pick here the most appropriate way of computing the
+ # tensor product depending on the rank of the two tensors.
+ # The various choices below come out from a careful comparison of
+ # the different methods using the valgrind profiler
+ if wl_update[0]==wl_update[1]==1 or wl_update[0]==0 or wl_update[1]==0:
+ # If any of the rank is 0, or if they are both equal to 1,
+ # then we are better off using the full expanded polynomial,
+ # and let the compiler optimize it.
+ subroutines.append(poly_writer.write_expanded_wl_updater(\
+ wl_update[0],wl_update[1]))
+ subroutines.append(mp_poly_writer.write_expanded_wl_updater(\
+ wl_update[0],wl_update[1]))
+ elif wl_update[0] >= wl_update[1]:
+ # If the loop polynomial is larger then we will filter and loop
+ # over the vertex coefficients first. The smallest product for
+ # which the routines below could be used is then
+ # loop_rank_2 x vertex_rank_1
+ subroutines.append(poly_writer.write_compact_wl_updater(\
+ wl_update[0],wl_update[1],loop_over_vertex_coefs_first=True))
+ subroutines.append(mp_poly_writer.write_compact_wl_updater(\
+ wl_update[0],wl_update[1],loop_over_vertex_coefs_first=True))
+ else:
+ # This happens only when the rank of the updater (vertex coef)
+ # is larger than the one of the loop coef and none of them is
+ # zero. This never happens in renormalizable theories but it
+ # can happen in the HEFT ones or other effective ones. In this
+ # case the typicaly use of this routine if for the product
+ # loop_rank_1 x vertex_rank_2
+ subroutines.append(poly_writer.write_compact_wl_updater(\
+ wl_update[0],wl_update[1],loop_over_vertex_coefs_first=False))
+ subroutines.append(mp_poly_writer.write_compact_wl_updater(\
+ wl_update[0],wl_update[1],loop_over_vertex_coefs_first=False))
+
 writer.writelines('\n\n'.join(subroutines),
 context=self.get_context(matrix_element))


=== modified file 'madgraph/various/process_checks.py'
--- madgraph/various/process_checks.py 2016-03-02 05:56:15 +0000
+++ madgraph/various/process_checks.py 2016-03-03 23:14:51 +0000
@@ -1315,7 +1315,11 @@
 if not make_it_quick:
 target_pspoints_number = max(int(30.0/time_per_ps_estimate)+1,50)
 else:
+<<<<<<< TREE
 target_pspoints_number = 10
+=======
+ target_pspoints_number = 1000
+>>>>>>> MERGE-SOURCE

 logger.info("Checking timing for process %s "%proc_name+\
 "with %d PS points."%target_pspoints_number)

=== modified file 'madgraph/various/q_polynomial.py'
--- madgraph/various/q_polynomial.py 2016-02-23 19:44:10 +0000
+++ madgraph/various/q_polynomial.py 2016-03-03 23:14:51 +0000
@@ -118,11 +118,23 @@
 class PolynomialRoutines(object):
 """ The mother class to output the polynomial subroutines """

- def __init__(self, max_rank, coef_format='complex*16', sub_prefix=''
- ,line_split=30):
+ def __init__(self, max_rank, updater_max_rank=None,
+ coef_format='complex*16', sub_prefix='',
+ proc_prefix='',mp_prefix='',
+ line_split=30):

 self.coef_format=coef_format
 self.sub_prefix=sub_prefix
+ self.proc_prefix=proc_prefix
+ self.mp_prefix=mp_prefix
+ if updater_max_rank is None:
+ self.updater_max_rank = max_rank
+ else:
+ if updater_max_rank > max_rank:
+ raise PolynomialError, "The updater max rank must be at most"+\
+ " equal to the overall max rank"
+ else:
+ self.updater_max_rank = updater_max_rank
 if coef_format=='complex*16':
 self.rzero='0.0d0'
 self.czero='(0.0d0,0.0d0)'
@@ -138,10 +150,70 @@
 "The rank of a q-polynomial should be 0 or positive"
 self.max_rank=max_rank
 self.pq=Polynomial(max_rank)
+
+ # A useful replacement dictionary
+ self.rep_dict = {'sub_prefix':self.sub_prefix,
+ 'proc_prefix':self.proc_prefix,
+ 'mp_prefix':self.mp_prefix,
+ 'coef_format':self.coef_format}

 class FortranPolynomialRoutines(PolynomialRoutines):
 """ A daughter class to output the subroutine in the fortran format"""

+ def write_polynomial_constant_module(self):
+ """ Writes a fortran90 module that defined polynomial constants objects."""
+
+ # Start with the polynomial constants module header
+ polynomial_constant_lines = []
+ polynomial_constant_lines.append(
+"""MODULE %sPOLYNOMIAL_CONSTANTS
+implicit none
+include 'coef_specs.inc'
+include 'loop_max_coefs.inc'
+"""%self.sub_prefix)
+ # Add the N coef for rank
+ polynomial_constant_lines.append(
+'C Map associating a rank to each coefficient position')
+ polynomial_constant_lines.append(
+ 'INTEGER COEFTORANK_MAP(0:LOOPMAXCOEFS-1)')
+ for rank in range(self.max_rank+1):
+ start = get_number_of_coefs_for_rank(rank-1)
+ end = get_number_of_coefs_for_rank(rank)-1
+ polynomial_constant_lines.append(
+'DATA COEFTORANK_MAP(%(start)d:%(end)d)/%(n_entries)d*%(rank)d/'%
+{'start': start,'end': end,'n_entries': end-start+1,'rank': rank})
+
+ polynomial_constant_lines.append(
+'\nC Map defining the number of coefficients for a symmetric tensor of a given rank')
+ polynomial_constant_lines.append(
+"""INTEGER NCOEF_R(0:%(max_rank)d)
+DATA NCOEF_R/%(ranks)s/"""%{'max_rank':self.max_rank,'ranks':','.join([
+ str(get_number_of_coefs_for_rank(r)) for r in range(0,self.max_rank+1)])})
+ polynomial_constant_lines.append(
+'\nC Map defining the coef position resulting from the multiplication of two lower rank coefs.')
+ mult_matrix = [[
+ self.pq.get_coef_position(self.pq.get_coef_at_position(coef_a)+
+ self.pq.get_coef_at_position(coef_b))
+ for coef_b in range(0,get_number_of_coefs_for_rank(self.updater_max_rank))]
+ for coef_a in range(0,get_number_of_coefs_for_rank(self.max_rank))]
+
+ polynomial_constant_lines.append(
+'INTEGER COMB_COEF_POS(0:LOOPMAXCOEFS-1,0:%(max_updater_rank)d)'\
+%{'max_updater_rank':(get_number_of_coefs_for_rank(self.updater_max_rank)-1)})
+
+ for j, line in enumerate(mult_matrix):
+ chunk_size = 20
+ for k in xrange(0, len(line), chunk_size):
+ polynomial_constant_lines.append(
+ "DATA COMB_COEF_POS(%3r,%3r:%3r) /%s/" % \
+ (j, k, min(k + chunk_size, len(line))-1,
+ ','.join(["%3r" % i for i in line[k:k + chunk_size]])))
+
+ polynomial_constant_lines.append(
+ "\nEND MODULE %sPOLYNOMIAL_CONSTANTS\n"%self.sub_prefix)
+
+ return '\n'.join(polynomial_constant_lines)
+

 def write_pjfry_mapping(self):
 """ Returns a fortran subroutine which fills in the array of integral reduction
@@ -383,34 +455,112 @@
 subroutines.append('\n'.join(lines+['end']))

 return '\n\n'.join(subroutines)
-
- def write_wl_updater(self,r_1,r_2):
- """ Give out the subroutine to update a polynomial of rank r_1 with
- one of rank r_2 """
-
- # The update is basically given by
- # OUT(j,coef,i) = A(k,*,i) x B(j,*,k)
- # with k a summed index and the 'x' operation is equivalent to
- # putting together two regular polynomial in q with scalar coefficients
- # The complexity of this subroutine is therefore
- # MAXLWFSIZE**3 * NCoef(r_1) * NCoef(r_2)
- # Which is for example 22'400 when updating a rank 4 loop wavefunction
- # with a rank 1 updater.
-
- lines=[]
-
- # Start by writing out the header:
- lines.append(
- """SUBROUTINE %(sub_prefix)sUPDATE_WL_%(r_1)d_%(r_2)d(A,LCUT_SIZE,B,IN_SIZE,OUT_SIZE,OUT)
- include 'coef_specs.inc'
- include 'loop_max_coefs.inc'
- INTEGER I,J,K
- %(coef_format)s A(MAXLWFSIZE,0:LOOPMAXCOEFS-1,MAXLWFSIZE)
- %(coef_format)s B(MAXLWFSIZE,0:VERTEXMAXCOEFS-1,MAXLWFSIZE)
- %(coef_format)s OUT(MAXLWFSIZE,0:LOOPMAXCOEFS-1,MAXLWFSIZE)
- INTEGER LCUT_SIZE,IN_SIZE,OUT_SIZE
- """%{'sub_prefix':self.sub_prefix,'r_1':r_1,'r_2':r_2,
- 'coef_format':self.coef_format})
+
+ def write_compact_wl_updater(self,r_1,r_2,loop_over_vertex_coefs_first=True):
+ """ Give out the subroutine to update a polynomial of rank r_1 with
+ one of rank r_2 """
+
+ # The update is basically given by
+ # OUT(j,coef,i) = A(k,*,i) x B(j,*,k)
+ # with k a summed index and the 'x' operation is equivalent to
+ # putting together two regular polynomial in q with scalar coefficients
+ # The complexity of this subroutine is therefore
+ # MAXLWFSIZE**3 * NCoef(r_1) * NCoef(r_2)
+ # Which is for example 22'400 when updating a rank 4 loop wavefunction
+ # with a rank 1 updater.
+ # The situation is slightly improved by a smarter handling of the
+ # coefficients equal to zero
+
+ lines=[]
+
+ # Start by writing out the header:
+ lines.append(
+ """SUBROUTINE %(sub_prefix)sUPDATE_WL_%(r_1)d_%(r_2)d(A,LCUT_SIZE,B,IN_SIZE,OUT_SIZE,OUT)
+ USE %(proc_prefix)sPOLYNOMIAL_CONSTANTS
+ implicit none
+ INTEGER I,J,K,L,M
+ %(coef_format)s A(MAXLWFSIZE,0:LOOPMAXCOEFS-1,MAXLWFSIZE)
+ %(coef_format)s B(MAXLWFSIZE,0:VERTEXMAXCOEFS-1,MAXLWFSIZE)
608+ %(coef_format)s OUT(MAXLWFSIZE,0:LOOPMAXCOEFS-1,MAXLWFSIZE)
609+ INTEGER LCUT_SIZE,IN_SIZE,OUT_SIZE
610+ INTEGER NEW_POSITION
611+ %(coef_format)s UPDATER_COEF
612+"""%{'sub_prefix':self.sub_prefix,'proc_prefix':self.proc_prefix,
613+ 'r_1':r_1,'r_2':r_2,'coef_format':self.coef_format})
614+
615+ # Start the loop on the elements i,j of the vector OUT(i,coef,j)
616+ lines.append("C Welcome to the computational heart of MadLoop...")
617+ if loop_over_vertex_coefs_first:
618+ lines.append("OUT(:,:,:)=%s"%self.czero)
619+ lines.append(
620+ """DO J=1,OUT_SIZE
621+ DO M=0,%d
622+ DO K=1,IN_SIZE
623+ UPDATER_COEF = B(J,M,K)
624+ IF (UPDATER_COEF.EQ.%s) CYCLE
625+ DO L=0,%d
626+ NEW_POSITION = COMB_COEF_POS(L,M)
627+ DO I=1,LCUT_SIZE
628+ OUT(J,NEW_POSITION,I)=OUT(J,NEW_POSITION,I) + A(K,L,I)*UPDATER_COEF
629+ ENDDO
630+ ENDDO
631+ ENDDO
632+ ENDDO
633+ ENDDO
634+ """%(get_number_of_coefs_for_rank(r_2)-1,
635+ self.czero,
636+ get_number_of_coefs_for_rank(r_1)-1))
637+ else:
638+ lines.append("OUT(:,:,:)=%s"%self.czero)
639+ lines.append(
640+ """DO I=1,LCUT_SIZE
641+ DO L=0,%d
642+ DO K=1,IN_SIZE
643+ UPDATER_COEF = A(K,L,I)
644+ IF (UPDATER_COEF.EQ.%s) CYCLE
645+ DO M=0,%d
646+ NEW_POSITION = COMB_COEF_POS(L,M)
647+ DO J=1,OUT_SIZE
648+ OUT(J,NEW_POSITION,I)=OUT(J,NEW_POSITION,I) + UPDATER_COEF*B(J,M,K)
649+ ENDDO
650+ ENDDO
651+ ENDDO
652+ ENDDO
653+ ENDDO
654+ """%(get_number_of_coefs_for_rank(r_1)-1,
655+ self.czero,
656+ get_number_of_coefs_for_rank(r_2)-1))
657+
658+ lines.append("END")
659+ # return the subroutine
660+ return '\n'.join(lines)
661+
662+ def write_expanded_wl_updater(self,r_1,r_2):
663+ """ Give out the subroutine to update a polynomial of rank r_1 with
664+ one of rank r_2 """
665+
666+ # The update is basically given by
667+ # OUT(j,coef,i) = A(k,*,i) x B(j,*,k)
668+ # with k a summed index and the 'x' operation is equivalent to
669+ # putting together two regular polynomial in q with scalar coefficients
670+ # The complexity of this subroutine is therefore
671+ # MAXLWFSIZE**3 * NCoef(r_1) * NCoef(r_2)
672+ # Which is for example 22'400 when updating a rank 4 loop wavefunction
673+ # with a rank 1 updater.
674+
675+ lines=[]
676+
677+ # Start by writing out the header:
678+ lines.append(
679+ """SUBROUTINE %(sub_prefix)sUPDATE_WL_%(r_1)d_%(r_2)d(A,LCUT_SIZE,B,IN_SIZE,OUT_SIZE,OUT)
680+ USE %(proc_prefix)sPOLYNOMIAL_CONSTANTS
681+ INTEGER I,J,K
682+ %(coef_format)s A(MAXLWFSIZE,0:LOOPMAXCOEFS-1,MAXLWFSIZE)
683+ %(coef_format)s B(MAXLWFSIZE,0:VERTEXMAXCOEFS-1,MAXLWFSIZE)
684+ %(coef_format)s OUT(MAXLWFSIZE,0:LOOPMAXCOEFS-1,MAXLWFSIZE)
685+ INTEGER LCUT_SIZE,IN_SIZE,OUT_SIZE
686+"""%{'sub_prefix':self.sub_prefix,'proc_prefix':self.proc_prefix,
687+ 'r_1':r_1,'r_2':r_2,'coef_format':self.coef_format})
688
689 # Start the loop on the elements i,j of the vector OUT(i,coef,j)
690 lines.append("DO I=1,LCUT_SIZE")
691@@ -460,14 +610,12 @@
692
693 # Start by writing out the header:
694 lines.append("""SUBROUTINE %(sub_prefix)sEVAL_POLY(C,R,Q,OUT)
695- include 'coef_specs.inc'
696- include 'loop_max_coefs.inc'
697+ USE %(proc_prefix)sPOLYNOMIAL_CONSTANTS
698 %(coef_format)s C(0:LOOPMAXCOEFS-1)
699 INTEGER R
700 %(coef_format)s Q(0:3)
701 %(coef_format)s OUT
702- """%{'sub_prefix':self.sub_prefix,
703- 'coef_format':self.coef_format})
704+ """%self.rep_dict)
705
706 # Start by the trivial coefficient of order 0.
707 lines.append("OUT=C(0)")
708@@ -497,28 +645,20 @@
709 lines=[]
710
711 # Start by writing out the header:
712- lines.append("""SUBROUTINE %(sub_prefix)sMERGE_WL(WL,R,LCUT_SIZE,CONST,OUT)
713- include 'coef_specs.inc'
714- include 'loop_max_coefs.inc'
715- INTEGER I,J
716- %(coef_format)s WL(MAXLWFSIZE,0:LOOPMAXCOEFS-1,MAXLWFSIZE)
717- INTEGER R,LCUT_SIZE
718- %(coef_format)s CONST
719- %(coef_format)s OUT(0:LOOPMAXCOEFS-1)
720- """%{'sub_prefix':self.sub_prefix,
721- 'coef_format':self.coef_format})
722-
723- # Add an array specifying how many coefs there are for given ranks
724- lines.append("""INTEGER NCOEF_R(0:%(max_rank)d)
725- DATA NCOEF_R/%(ranks)s/
726- """%{'max_rank':self.max_rank,'ranks':','.join([
727- str(get_number_of_coefs_for_rank(r)) for r in
728- range(0,self.max_rank+1)])})
729+ lines.append(
730+"""SUBROUTINE %(sub_prefix)sMERGE_WL(WL,R,LCUT_SIZE,CONST,OUT)
731+ USE %(proc_prefix)sPOLYNOMIAL_CONSTANTS
732+ INTEGER I,J
733+ %(coef_format)s WL(MAXLWFSIZE,0:LOOPMAXCOEFS-1,MAXLWFSIZE)
734+ INTEGER R,LCUT_SIZE
735+ %(coef_format)s CONST
736+ %(coef_format)s OUT(0:LOOPMAXCOEFS-1)
737+"""%self.rep_dict)
738
739 # Now scan them all progressively
740 lines.append("DO I=1,LCUT_SIZE")
741 lines.append(" DO J=0,NCOEF_R(R)-1")
742- lines.append(" OUT(J)=OUT(J)+WL(I,J,I)*CONST")
743+ lines.append(" OUT(J)=OUT(J)+WL(I,J,I)*CONST")
744 lines.append(" ENDDO")
745 lines.append("ENDDO")
746 lines.append("END")
747@@ -533,21 +673,12 @@
748
749 # Start by writing out the header:
750 lines.append("""SUBROUTINE %(sub_prefix)sADD_COEFS(A,RA,B,RB)
751- include 'coef_specs.inc'
752- include 'loop_max_coefs.inc'
753+ USE %(proc_prefix)sPOLYNOMIAL_CONSTANTS
754 INTEGER I
755 %(coef_format)s A(0:LOOPMAXCOEFS-1),B(0:LOOPMAXCOEFS-1)
756 INTEGER RA,RB
757- """%{'sub_prefix':self.sub_prefix,
758- 'coef_format':self.coef_format})
759+ """%self.rep_dict)
760
761- # Add an array specifying how many coefs there are for given ranks
762- lines.append("""INTEGER NCOEF_R(0:%(max_rank)d)
763- DATA NCOEF_R/%(ranks)s/
764- """%{'max_rank':self.max_rank,'ranks':','.join([
765- str(get_number_of_coefs_for_rank(r)) for r in
766- range(0,self.max_rank+1)])})
767-
768 # Now scan them all progressively
769 lines.append("DO I=0,NCOEF_R(RB)-1")
770 lines.append(" A(I)=A(I)+B(I)")
771
772=== modified file 'tests/time_db'
773--- tests/time_db 2016-03-03 16:04:04 +0000
774+++ tests/time_db 2016-03-03 23:14:51 +0000
775@@ -52,6 +52,7 @@
776 <__main__.TestSuiteModified tests=[<tests.unit_tests.iolibs.test_export_v4.FullHelasOutputTest testMethod=test_four_fermion_vertex_normal_fermion_flow>]> 0.0366899967194
777 <__main__.TestSuiteModified tests=[<tests.unit_tests.iolibs.test_export_v4.FullHelasOutputTest testMethod=test_generate_helas_diagrams_epem_elpelmepem>]> 0.0915629863739
778 <__main__.TestSuiteModified tests=[<tests.unit_tests.various.test_aloha.test_aloha_creation testMethod=test_aloha_FFVC>]> 0.0635468959808
779+<__main__.TestSuiteModified tests=[<tests.unit_tests.various.test_usermod.Test_ADDON_UFO testMethod=test_identify_particle>]> 0.0015971660614
780 <__main__.TestSuiteModified tests=[<tests.unit_tests.various.test_decay.Test_DecayAmplitude testMethod=test_group_channels2amplitudes>]> 0.346367835999
781 <__main__.TestSuiteModified tests=[<tests.unit_tests.various.test_import_ufo.TestRestrictModel testMethod=test_detect_special_parameters>]> 0.0848360061646
782 <__main__.TestSuiteModified tests=[<tests.unit_tests.iolibs.test_file_writers.FortranWriterTest testMethod=test_write_fortran_error>]> 0.000140190124512
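
For readers following the diff without the rest of q_polynomial.py, here is a minimal Python sketch of what the compact updater (write_compact_wl_updater with loop_over_vertex_coefs_first=True) computes. It is an illustration under stated assumptions, not the MadLoop code: the monomial ordering is a hypothetical graded ordering that need not match what get_coef_position returns, the helper names are invented for this note, and LCUT_SIZE, IN_SIZE and OUT_SIZE are all taken equal to a single `size`. The closed form comb(r + 4, 4) for the coefficient count is assumed from counting monomials of degree <= r in the four components of q; it does reproduce the 22'400 quoted in the comments (4**3 * 70 * 5).

```python
from itertools import combinations_with_replacement
from math import comb


def ncoefs_up_to_rank(r):
    """Number of monomials of total degree <= r in the 4 components of q."""
    return comb(r + 4, 4)


def monomials_up_to_rank(r):
    """Exponent 4-tuples sorted by total degree (illustrative ordering only)."""
    monos = []
    for deg in range(r + 1):
        for idx in combinations_with_replacement(range(4), deg):
            monos.append(tuple(idx.count(mu) for mu in range(4)))
    return monos


def build_comb_coef_pos(r_1, r_2):
    """table[l][m] = position of monomial(l) * monomial(m), as in COMB_COEF_POS."""
    monos = monomials_up_to_rank(r_1 + r_2)
    pos = {mono: p for p, mono in enumerate(monos)}
    n_1, n_2 = ncoefs_up_to_rank(r_1), ncoefs_up_to_rank(r_2)
    return [[pos[tuple(a + b for a, b in zip(monos[l], monos[m]))]
             for m in range(n_2)] for l in range(n_1)]


def update_wl(A, B, r_1, r_2, size, table):
    """OUT[j][p][i] = sum over k, l, m of A[k][l][i] * B[j][m][k], p = table[l][m].

    Zero updater coefficients are skipped before the inner loops start,
    which mirrors the CYCLE statement in the generated Fortran.
    """
    n_1, n_2 = ncoefs_up_to_rank(r_1), ncoefs_up_to_rank(r_2)
    n_out = ncoefs_up_to_rank(r_1 + r_2)
    out = [[[0j] * size for _ in range(n_out)] for _ in range(size)]
    for j in range(size):
        for m in range(n_2):
            for k in range(size):
                updater_coef = B[j][m][k]
                if updater_coef == 0:
                    continue  # nothing to accumulate for this (j, m, k)
                for l in range(n_1):
                    new_position = table[l][m]
                    for i in range(size):
                        out[j][new_position][i] += A[k][l][i] * updater_coef
    return out
```

With this layout the cost of one call scales with the number of non-zero entries of B rather than with its full size, which is the point of the zero check. The else branch of the diff applies the same idea with the roles of A and B exchanged.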
