Merge lp:~maddevelopers/mg5amcnlo/PY8_parallelization into lp:~maddevelopers/mg5amcnlo/2.5.1

Proposed by Valentin Hirschi
Status: Superseded
Proposed branch: lp:~maddevelopers/mg5amcnlo/PY8_parallelization
Merge into: lp:~maddevelopers/mg5amcnlo/2.5.1
Diff against target: 848 lines (+446/-149)
5 files modified
madgraph/interface/common_run_interface.py (+13/-1)
madgraph/interface/madevent_interface.py (+368/-81)
madgraph/various/banner.py (+13/-2)
madgraph/various/histograms.py (+42/-62)
madgraph/various/lhe_parser.py (+10/-3)
To merge this branch: bzr merge lp:~maddevelopers/mg5amcnlo/PY8_parallelization
Reviewer: Olivier Mattelaer (status: Pending)
Review via email: mp+304843@code.launchpad.net

This proposal has been superseded by a proposal from 2016-09-03.

Description of the change

This branch implements PY8 LO parallelization.

It is fully functional and has been tested on a few cases (not exhaustively yet).

The remaining issues are:

1) [Not related to this branch] The systematics parallelization crashes when using a cluster_temp_directory.

2) [Not related to this branch] The PY8 HTML output is broken past the first run/tag.

3) The .lhe splitting is slow. It would be nice to have a dedicated function for this in lhe_parser.py that bypasses the full parsing of the event files.
Obtaining the number of events in an event file is also slow; an optimized static method that likewise bypasses the full parsing would be welcome (see the first sketch after this list).

4) The merging of the split HEPMC files is done very efficiently in this branch (important given their size). However, it relies on two system calls that are not safe against injection; they need to be secured (see the second sketch after this list).

5) More testing is needed, in particular a comparison between parallel and sequential runs of the merged cross-sections, HwU plots and HEPMC event files, so as to guarantee the correctness of the implementation.

6) The new bits of code in do_pythia8() could be refactored further. In particular, the two parts handling the parallel submission and the merging of the split results could be factored out into dedicated functions.
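
To illustrate item 3, here is a minimal sketch of the kind of optimized helpers that could go into lhe_parser.py. The function names (fast_count_events, fast_split) are hypothetical, and the sketch assumes a plain-text .lhe file (gzip handling omitted); it simply scans for the event tags instead of building Event objects for every record.

    # Hypothetical helpers illustrating item 3; not part of lhe_parser.py.
    # They scan for the <event>/</event> tags instead of fully parsing each event.

    def fast_count_events(path):
        """Count events by counting closing </event> tags, line by line."""
        n_events = 0
        with open(path, 'r') as stream:
            for line in stream:
                if line.startswith('</event>'):
                    n_events += 1
        return n_events

    def fast_split(path, partition):
        """Copy raw event blocks into len(partition) files without parsing them."""
        with open(path, 'r') as stream:
            # Copy the banner verbatim, up to and including the closing </init> tag.
            header = []
            for line in stream:
                header.append(line)
                if line.startswith('</init>'):
                    break
            for i, n_wanted in enumerate(partition):
                out_name = '%s_%d.lhe' % (path.rsplit('.lhe', 1)[0], i)
                with open(out_name, 'w') as out:
                    out.writelines(header)
                    written = 0
                    for line in stream:
                        if line.startswith('</LesHouchesEvents>'):
                            break
                        out.write(line)
                        if line.startswith('</event>'):
                            written += 1
                            if written == n_wanted:
                                break
                    out.write('</LesHouchesEvents>\n')
        return len(partition)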
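
Regarding item 4, below is a sketch of how the same sed/cat merging could be done through subprocess with explicit argument lists, so that no shell is involved and file names cannot be interpreted as shell code. The function name merge_hepmc_files is hypothetical, and the sed invocation keeps the BSD-style -i '' used in the branch (GNU sed takes -i without the empty suffix); this is an assumption about how the calls could be sandboxed, not the branch's final code.

    import subprocess

    def merge_hepmc_files(header_path, hepmc_files, tail_path, output_path, n_head):
        """Strip the per-file HEPMC header/footer lines, then concatenate everything."""
        # Delete the first n_head lines and the last line of each split file in place.
        sed_script = ';'.join(['%dd' % (i + 1) for i in range(n_head)] + ['$d'])
        for hepmc_file in hepmc_files:
            # BSD-style in-place edit, as in the branch; GNU sed would drop the ''.
            subprocess.check_call(['sed', '-i', '', sed_script, hepmc_file])
        # Concatenate header + stripped split files + tail into the final output,
        # redirecting stdout to the output file instead of using a shell '>'.
        with open(output_path, 'w') as out:
            subprocess.check_call(['cat', header_path] + hepmc_files + [tail_path],
                                  stdout=out)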

Olivier, could you review this and already fix what you can?
If you manage to clean it all up, then don't hesitate to merge this directly into 2.5.1 (or even 2.5.0, since it is a nice feature and introduces some important bug fixes).
If something still needs to be discussed with me, we will only be able to do so on Tuesday, since I will be mostly unavailable until then.

Thanks,

300. By Valentin Hirschi

1. Fixed a typo in an MA5 option.
2. Updated the parallelization of PY8 in the UpdateNotes.

Revision history for this message
Valentin Hirschi (valentin-hirschi) wrote:

To test the above on a Condor cluster, one must re-install MG5aMC_PY8_interface, because I modified its installation so that it links *statically* against HEPMC2, which therefore no longer needs to be available on the worker nodes at run time.

301. By Olivier Mattelaer

faster parsing for splitting event/get number of events

302. By Olivier Mattelaer

also apply the bypass of parsing for systematics

303. By Valentin Hirschi

1. Fixed the sanity check of PY8 log file which was not ok with the parallelization.

304. By Valentin Hirschi

1. fixed an issue with the warning about failing PY8 log. (needed to close the log stream).
2. Sandboxed the HEPMC merging syscalls.

305. By Valentin Hirschi

1. Merged with latest version of 2.5.1

Unmerged revisions

305. By Valentin Hirschi

1. Merged with latest version of 2.5.1

304. By Valentin Hirschi

1. fixed an issue with the warning about failing PY8 log. (needed to close the log stream).
2. Sandboxed the HEPMC merging syscalls.

303. By Valentin Hirschi

1. Fixed the sanity check of PY8 log file which was not ok with the parallelization.

302. By Olivier Mattelaer

also apply the bypass of parsing for systematics

301. By Olivier Mattelaer

faster parsing for splitting event/get number of events

Preview Diff

1=== modified file 'madgraph/interface/common_run_interface.py'
2--- madgraph/interface/common_run_interface.py 2016-09-02 02:17:02 +0000
3+++ madgraph/interface/common_run_interface.py 2016-09-03 09:10:54 +0000
4@@ -3860,6 +3860,17 @@
5 'ilc': ['run_card lpp1 0', 'run_card lpp2 0', 'run_card ebeam1 %(0)s/2', 'run_card ebeam2 %(0)s/2'],
6 'lcc':['run_card lpp1 1', 'run_card lpp2 1', 'run_card ebeam1 %(0)s*1000/2', 'run_card ebeam2 %(0)s*1000/2'],
7 'fixed_scale': ['run_card fixed_fac_scale T', 'run_card fixed_ren_scale T', 'run_card scale %(0)s', 'run_card dsqrt_q2fact1 %(0)s' ,'run_card dsqrt_q2fact2 %(0)s'],
8+ 'simplepy8':['pythia8_card hadronlevel:all False',
9+ 'pythia8_card partonlevel:mpi False',
10+ 'pythia8_card BeamRemnants:primordialKT False',
11+ 'pythia8_card PartonLevel:Remnants False',
12+ 'pythia8_card Check:event False',
13+ 'pythia8_card TimeShower:QEDshowerByQ False',
14+ 'pythia8_card TimeShower:QEDshowerByL False',
15+ 'pythia8_card SpaceShower:QEDshowerByQ False',
16+ 'pythia8_card SpaceShower:QEDshowerByL False',
17+ 'pythia8_card PartonLevel:FSRinResonances False',
18+ 'pythia8_card ProcessLevel:resonanceDecays False']
19 }
20
21 special_shortcut_help = {
22@@ -3873,7 +3884,8 @@
23 ' 3 : means PDF for elastic photon emited from an electron',
24 'lhc' : 'syntax: set lhc VALUE:\n Set for a proton-proton collision with that given center of mass energy (in TeV)',
25 'lep' : 'syntax: set lep VALUE:\n Set for a electron-positron collision with that given center of mass energy (in GeV)',
26- 'fixed_scale' : 'syntax: set fixed_scale VALUE:\n Set all scales to the give value (in GeV)',
27+ 'fixed_scale' : 'syntax: set fixed_scale VALUE:\n Set all scales to the give value (in GeV)',
28+ 'simplepy8' : 'syntax: Turn off non-perturbative slow features of PY8.'
29 }
30
31 def load_default(self):
32
33=== modified file 'madgraph/interface/madevent_interface.py'
34--- madgraph/interface/madevent_interface.py 2016-09-01 20:55:35 +0000
35+++ madgraph/interface/madevent_interface.py 2016-09-03 09:10:54 +0000
36@@ -38,6 +38,7 @@
37 import tarfile
38 import StringIO
39 import shutil
40+import copy
41 import xml.dom.minidom as minidom
42
43 try:
44@@ -77,6 +78,8 @@
45 import internal.sum_html as sum_html
46 import internal.combine_runs as combine_runs
47 import internal.lhe_parser as lhe_parser
48+ import internal.histograms as histograms
49+ from internal.files import ln
50 else:
51 # import from madgraph directory
52 MADEVENT = False
53@@ -92,8 +95,10 @@
54 import madgraph.various.misc as misc
55 import madgraph.madevent.combine_runs as combine_runs
56 import madgraph.various.lhe_parser as lhe_parser
57+ import madgraph.various.histograms as histograms
58
59- import models.check_param_card as check_param_card
60+ import models.check_param_card as check_param_card
61+ from madgraph.iolibs.files import ln
62 from madgraph import InvalidCmd, MadGraph5Error, MG5DIR, ReadWrite
63
64
65@@ -3502,7 +3507,7 @@
66 """ Setup the Pythia8 Run environment and card. In particular all the process and run specific parameters
67 of the card are automatically set here. This function returns the path where HEPMC events will be output,
68 if any."""
69-
70+
71 HepMC_event_output = None
72 tag = self.run_tag
73
74@@ -3554,7 +3559,7 @@
75 # only if it is not already user_set.
76 if PY8_Card['JetMatching:qCut']==-1.0:
77 PY8_Card.MadGraphSet('JetMatching:qCut',1.5*self.run_card['xqcut'])
78-
79+
80 if PY8_Card['JetMatching:qCut']<(1.5*self.run_card['xqcut']):
81 logger.error(
82 'The MLM merging qCut parameter you chose (%f) is less than'%PY8_Card['JetMatching:qCut']+
83@@ -3568,6 +3573,7 @@
84 # Automatically set qWeed to xqcut if not defined by the user.
85 if PY8_Card['SysCalc:qWeed']==-1.0:
86 PY8_Card.MadGraphSet('SysCalc:qWeed',self.run_card['xqcut'])
87+
88 if PY8_Card['SysCalc:qCutList']=='auto':
89 if self.run_card['use_syst']:
90 if self.run_card['sys_matchscale']=='auto':
91@@ -3580,7 +3586,7 @@
92 if PY8_Card['JetMatching:qCut'] not in qCutList:
93 qCutList.append(PY8_Card['JetMatching:qCut'])
94 PY8_Card.MadGraphSet('SysCalc:qCutList', qCutList)
95-
96+
97 for scale in PY8_Card['SysCalc:qCutList']:
98 if scale<(1.5*self.run_card['xqcut']):
99 logger.error(
100@@ -3728,12 +3734,11 @@
101 self.options['automatic_html_opening'] = False
102
103 if self.run_card['event_norm'] not in ['unit','average']:
104-
105 logger.critical("Pythia8 does not support normalization to the sum. Not running Pythia8")
106 return
107- #\n"+\
108- #"The normalisation of the hepmc output file will be wrong (i.e. non-standard).\n"+\
109- #"Please use 'event_norm = average' in the run_card to avoid this problem.")
110+ #\n"+\
111+ #"The normalisation of the hepmc output file will be wrong (i.e. non-standard).\n"+\
112+ #"Please use 'event_norm = average' in the run_card to avoid this problem.")
113
114 # Update the banner with the pythia card
115 if not self.banner or len(self.banner) <=1:
116@@ -3837,6 +3842,11 @@
117 ( os.path.exists(HepMC_event_output) and \
118 stat.S_ISFIFO(os.stat(HepMC_event_output).st_mode))
119 startPY8timer = time.time()
120+
121+ # Information that will be extracted from this PY8 run
122+ PY8_extracted_information={ 'sigma_m':None, 'Nacc':None, 'Ntry':None,
123+ 'cross_sections':{} }
124+
125 if is_HepMC_output_fifo:
126 logger.info(
127 """Pythia8 is set to output HEPMC events to to a fifo file.
128@@ -3853,17 +3863,294 @@
129 %HepMC_event_output,'$MG:color:GREEN')
130 return
131 else:
132- logger.info('Follow Pythia8 shower by running the '+
133- 'following command (in a separate terminal):\n tail -f %s'%pythia_log)
134+ if self.options['run_mode']!=0:
135+ # Start a parallelization instance (stored in self.cluster)
136+ self.configure_run_mode(self.options['run_mode'])
137+ if self.options['run_mode']==1:
138+ n_cores = max(self.options['cluster_size'],1)
139+ elif self.options['run_mode']==2:
140+ n_cores = max(self.cluster.nb_core,1)
141+
142+ lhe_file_name = os.path.basename(PY8_Card.subruns[0]['Beams:LHEF'])
143+ lhe_file = lhe_parser.EventFile(pjoin(self.me_dir,'Events',
144+ self.run_name,PY8_Card.subruns[0]['Beams:LHEF']))
145+ n_events = len(lhe_file)
146
147- ret_code = self.cluster.launch_and_wait(wrapper_path,
148- argument= [], stdout= pythia_log, stderr=subprocess.STDOUT,
149- cwd=pjoin(self.me_dir,'Events',self.run_name))
150- if ret_code != 0:
151- raise self.InvalidCmd, 'Pythia8 shower interrupted with return'+\
152- ' code %d.\n'%ret_code+\
153- 'You can find more information in this log file:\n%s'%pythia_log
154+ # Implement a security to insure a minimum numbe of events per job
155+ if self.options['run_mode']==2:
156+ min_n_events_per_job = 100
157+ elif self.options['run_mode']==1:
158+ min_n_events_per_job = 1000
159+ min_n_core = n_events//min_n_events_per_job
160+ n_cores = min(min_n_core,n_cores)
161+
162+ if self.options['run_mode']==0 or (self.options['run_mode']==2 and self.options['nb_core']==1):
163+ # No need for parallelization anymore
164+ self.cluster = None
165+ logger.info('Follow Pythia8 shower by running the '+
166+ 'following command (in a separate terminal):\n tail -f %s'%pythia_log)
167
168+ ret_code = self.cluster.launch_and_wait(wrapper_path,
169+ argument= [], stdout= pythia_log, stderr=subprocess.STDOUT,
170+ cwd=pjoin(self.me_dir,'Events',self.run_name))
171+ if ret_code != 0:
172+ raise self.InvalidCmd, 'Pythia8 shower interrupted with return'+\
173+ ' code %d.\n'%ret_code+\
174+ 'You can find more information in this log file:\n%s'%pythia_log
175+ else:
176+ if self.run_card['event_norm']=='sum':
177+ logger.error("")
178+ logger.error("Either run in single core or change event_norm to 'average'.")
179+ raise InvalidCmd("Pythia8 parallelization with event_norm set to 'sum' is not supported."
180+ "Either run in single core or change event_norm to 'average'.")
181+
182+ # Create the parallelization folder
183+ parallelization_dir = pjoin(self.me_dir,'Events',self.run_name,'PY8_parallelization')
184+ if os.path.isdir(parallelization_dir):
185+ shutil.rmtree(parallelization_dir)
186+ os.mkdir(parallelization_dir)
187+ # Copy what should be the now standalone executable for PY8
188+ shutil.copy(pythia_main,parallelization_dir)
189+ # Add a safe card in parallelization
190+ ParallelPY8Card = copy.copy(PY8_Card)
191+ # Normalize the name of the HEPMCouput and lhe input
192+ if HepMC_event_output:
193+ ParallelPY8Card['HEPMCoutput:file']='events.hepmc'
194+ else:
195+ ParallelPY8Card['HEPMCoutput:file']='/dev/null'
196+
197+ ParallelPY8Card.subruns[0].systemSet('Beams:LHEF','events.lhe.gz')
198+ ParallelPY8Card.write(pjoin(parallelization_dir,'PY8Card.dat'),
199+ pjoin(self.me_dir,'Cards','pythia8_card_default.dat'),
200+ direct_pythia_input=True)
201+ # Write the wrapper
202+ wrapper_path = pjoin(parallelization_dir,'run_PY8.sh')
203+ wrapper = open(wrapper_path,'w')
204+ if self.options['cluster_temp_path'] is None:
205+ exe_cmd = \
206+"""#!%s
207+./%s >& PY8_log.txt
208+"""
209+ else:
210+ exe_cmd = \
211+"""#!%s
212+ln -s ./events_$1.lhe.gz ./events.lhe.gz
213+./%s >& PY8_log.txt
214+mkdir split_$1
215+if [ -f ./events.hepmc ];
216+then
217+ mv ./events.hepmc ./split_$1/
218+fi
219+if [ -f ./pts.dat ];
220+then
221+ mv ./pts.dat ./split_$1/
222+fi
223+if [ -f ./djrs.dat ];
224+then
225+ mv ./djrs.dat ./split_$1/
226+fi
227+if [ -f ./PY8_log.txt ];
228+then
229+ mv ./PY8_log.txt ./split_$1/
230+fi
231+tar -czf split_$1.tar.gz split_$1
232+"""
233+ exe_cmd = exe_cmd%(shell_exe,' '.join([os.path.basename(pythia_main),'PY8Card.dat']))
234+ wrapper.write(exe_cmd)
235+ wrapper.close()
236+ # Set it as executable
237+ st = os.stat(wrapper_path)
238+ os.chmod(wrapper_path, st.st_mode | stat.S_IEXEC)
239+
240+ # Split the .lhe event file, create event partition
241+ partition=[n_events//n_cores]*n_cores
242+ for i in range(n_events%n_cores):
243+ partition[i] += 1
244+
245+ logger.info('Splitting .lhe event file for PY8 parallelization...')
246+ n_splits = lhe_file.split(partition=partition, cwd=parallelization_dir, zip=True)
247+
248+ # Distribute the split events
249+ split_files = []
250+ split_dirs = []
251+ for split_id in range(n_splits):
252+ split_files.append('events_%s.lhe.gz'%split_id)
253+ split_dirs.append(pjoin(parallelization_dir,'split_%d'%split_id))
254+ # Add the necessary run content
255+ shutil.move(pjoin(parallelization_dir,lhe_file.name+'_%d.lhe.gz'%split_id),
256+ pjoin(parallelization_dir,split_files[-1]))
257+
258+ logger.info('Submitting Pythia8 jobs...')
259+ for i, split_file in enumerate(split_files):
260+ in_files = [pjoin(parallelization_dir,os.path.basename(pythia_main)),
261+ pjoin(parallelization_dir,'PY8Card.dat'),
262+ pjoin(parallelization_dir,split_file)]
263+ if self.options['cluster_temp_path'] is None:
264+ out_files = []
265+ os.mkdir(pjoin(parallelization_dir,'split_%d'%i))
266+ selected_cwd = pjoin(parallelization_dir,'split_%d'%i)
267+ for in_file in in_files+[pjoin(parallelization_dir,'run_PY8.sh')]:
268+ # Make sure to rename the split_file link from events_<x>.lhe.gz to events.lhe.gz
269+ if os.path.basename(in_file)==split_file:
270+ ln(in_file,selected_cwd,name='events.lhe.gz')
271+ else:
272+ ln(in_file,selected_cwd)
273+ in_files = []
274+ else:
275+ out_files = ['split_%d.tar.gz'%i]
276+ selected_cwd = parallelization_dir
277+ self.cluster.submit2(wrapper_path,
278+ argument=[str(i)], cwd=selected_cwd,
279+ input_files=in_files,
280+ output_files=out_files,
281+ required_output=out_files)
282+
283+ def wait_monitoring(Idle, Running, Done):
284+ if Idle+Running+Done == 0:
285+ return
286+ logger.info('Pythia8 shower jobs: %d Idle, %d Running, %d Done [%s]'\
287+ %(Idle, Running, Done, misc.format_time(time.time() - startPY8timer)))
288+ self.cluster.wait(parallelization_dir,wait_monitoring)
289+
290+ logger.info('Merging results from the split PY8 runs...')
291+ if self.options['cluster_temp_path']:
292+ # Decompressing the output
293+ for i, split_file in enumerate(split_files):
294+ misc.call(['tar','-xzf','split_%d.tar.gz'%i],cwd=parallelization_dir)
295+ os.remove(pjoin(parallelization_dir,'split_%d.tar.gz'%i))
296+
297+ # Now merge logs
298+ pythia_log_file = open(pythia_log,'w')
299+
300+ n_added = 0
301+ for split_dir in split_dirs:
302+ log_file = pjoin(split_dir,'PY8_log.txt')
303+ pythia_log_file.write('='*35+'\n')
304+ pythia_log_file.write(' -> Pythia8 log file for run %d <-'%i+'\n')
305+ pythia_log_file.write('='*35+'\n')
306+ pythia_log_file.write(open(log_file,'r').read()+'\n')
307+ if run_type in merged_run_types:
308+ sigma_m, Nacc, Ntry = self.parse_PY8_log_file(log_file)
309+ if any(elem is None for elem in [sigma_m, Nacc, Ntry]):
310+ continue
311+ n_added += 1
312+ if PY8_extracted_information['sigma_m'] is None:
313+ PY8_extracted_information['sigma_m'] = sigma_m
314+ else:
315+ PY8_extracted_information['sigma_m'] += sigma_m
316+ if PY8_extracted_information['Nacc'] is None:
317+ PY8_extracted_information['Nacc'] = Nacc
318+ else:
319+ PY8_extracted_information['Nacc'] += Nacc
320+ if PY8_extracted_information['Ntry'] is None:
321+ PY8_extracted_information['Ntry'] = Ntry
322+ else:
323+ PY8_extracted_information['Ntry'] += Ntry
324+ # Normalize the values added
325+ if n_added>0:
326+ PY8_extracted_information['sigma_m'] /= float(n_added)
327+
328+ # djr plots
329+ djr_HwU = None
330+ n_added = 0
331+ for split_dir in split_dirs:
332+ djr_file = pjoin(split_dir,'djrs.dat')
333+ if not os.path.isfile(djr_file):
334+ continue
335+ xsecs = self.extract_cross_sections_from_DJR(djr_file)
336+ if len(xsecs)>0:
337+ n_added += 1
338+ if len(PY8_extracted_information['cross_sections'])==0:
339+ PY8_extracted_information['cross_sections'] = xsecs
340+ # Square the error term
341+ for key in PY8_extracted_information['cross_sections']:
342+ PY8_extracted_information['cross_sections'][key][1] = \
343+ PY8_extracted_information['cross_sections'][key][1]**2
344+ else:
345+ for key, value in xsecs.items():
346+ PY8_extracted_information['cross_sections'][key][0] += value[0]
347+ # Add error in quadrature
348+ PY8_extracted_information['cross_sections'][key][1] += value[1]**2
349+ new_djr_HwU = histograms.HwUList(djr_file,run_id=0)
350+ if djr_HwU is None:
351+ djr_HwU = new_djr_HwU
352+ else:
353+ for i, hist in enumerate(djr_HwU):
354+ djr_HwU[i] = hist + new_djr_HwU[i]
355+ if not djr_HwU is None:
356+ djr_HwU.output(pjoin(self.me_dir,'Events',self.run_name,'djrs'),format='HwU')
357+ shutil.move(pjoin(self.me_dir,'Events',self.run_name,'djrs.HwU'),
358+ pjoin(self.me_dir,'Events',self.run_name,'%s_djrs.dat'%tag))
359+ if n_added>0:
360+ for key in PY8_extracted_information['cross_sections']:
361+ PY8_extracted_information['cross_sections'][key][0] /= float(n_added)
362+ PY8_extracted_information['cross_sections'][key][1] = \
363+ math.sqrt(PY8_extracted_information['cross_sections'][key][1])/float(n_added)
364+
365+ # pts plots
366+ pts_HwU = None
367+ for split_dir in split_dirs:
368+ pts_file = pjoin(split_dir,'pts.dat')
369+ if not os.path.isfile(pts_file):
370+ continue
371+ new_pts_HwU = histograms.HwUList(pts_file,run_id=0)
372+ if pts_HwU is None:
373+ pts_HwU = new_pts_HwU
374+ else:
375+ for i, hist in enumerate(pts_HwU):
376+ pts_HwU[i] = hist + new_pts_HwU[i]
377+ if not pts_HwU is None:
378+ pts_HwU.output(pjoin(self.me_dir,'Events',self.run_name,'pts'),format='HwU')
379+ shutil.move(pjoin(self.me_dir,'Events',self.run_name,'pts.HwU'),
380+ pjoin(self.me_dir,'Events',self.run_name,'%s_pts.dat'%tag))
381+
382+ # HepMC events now.
383+ all_hepmc_files = []
384+ for split_dir in split_dirs:
385+ hepmc_file = pjoin(split_dir,'events.hepmc')
386+ if not os.path.isfile(hepmc_file):
387+ continue
388+ all_hepmc_files.append(hepmc_file)
389+
390+ if len(all_hepmc_files)>0:
391+ hepmc_output = pjoin(self.me_dir,'Events',self.run_name,HepMC_event_output)
392+ with misc.TMP_directory() as tmp_dir:
393+ # Use system calls to quickly put these together
394+ header = open(pjoin(tmp_dir,'header.hepmc'),'w')
395+ n_head = 0
396+ for line in open(all_hepmc_files[0],'r'):
397+ if not line.startswith('E'):
398+ n_head += 1
399+ header.write(line)
400+ else:
401+ break
402+ header.close()
403+ tail = open(pjoin(tmp_dir,'tail.hepmc'),'w')
404+ n_tail = 0
405+ for line in misc.BackRead(all_hepmc_files[-1]):
406+ if line.startswith('HepMC::'):
407+ n_tail += 1
408+ tail.write(line)
409+ else:
410+ break
411+ tail.close()
412+ if n_tail>1:
413+ raise MadGraph5Error,'HEPMC files should only have one trailing command.'
414+ ######################################################################
415+ # This is the most efficient way of putting together HEPMC's, *BUT* #
416+ # WARNING: NEED TO RENDER THE CODE BELOW SAFE TOWARDS INJECTION #
417+ ######################################################################
418+ for hepmc_file in all_hepmc_files:
419+ # Remove in an efficient way the starting and trailing HEPMC tags
420+ os.system(' '.join(['sed','-i',"''","'%s;$d'"%
421+ (';'.join('%id'%(i+1) for i in range(n_head))),hepmc_file]))
422+ os.system(' '.join(['cat',pjoin(tmp_dir,'header.hepmc')]+all_hepmc_files+
423+ [pjoin(tmp_dir,'tail.hepmc'),'>',hepmc_output]))
424+
425+ # We are done with the parallelization directory. Clean it.
426+ if os.path.isdir(parallelization_dir):
427+ shutil.rmtree(parallelization_dir)
428+
429 # Properly rename the djr and pts output if present.
430 djr_output = pjoin(self.me_dir,'Events', self.run_name, 'djrs.dat')
431 if os.path.isfile(djr_output):
432@@ -3875,7 +4162,7 @@
433 self.run_name, '%s_pts.dat' % tag))
434
435 if not os.path.isfile(pythia_log) or \
436- 'PYTHIA Abort' in '\n'.join(open(pythia_log,'r').readlines()[-20]):
437+ 'PYTHIA Abort' in '\n'.join(open(pythia_log,'r').readlines()[:-20]):
438 logger.warning('Fail to produce a pythia8 output. More info in \n %s'%pythia_log)
439 return
440
441@@ -3888,70 +4175,34 @@
442
443 # Study matched cross-sections
444 if run_type in merged_run_types:
445- #####
446 # From the log file
447- #####
448- # read the line from the bottom of the file
449- pythia_log = misc.BackRead(pjoin(self.me_dir,'Events', self.run_name,
450- '%s_pythia8.log' % tag))
451- # The main89 driver should be modified so as to allow for easier parsing
452- pythiare = re.compile("Les Houches User Process\(es\)\s*\d+\s*\|\s*(?P<tried>\d+)\s*(?P<selected>\d+)\s*(?P<generated>\d+)\s*\|\s*(?P<xsec>[\d\.e\-\+]+)\s*(?P<xsec_error>[\d\.e\-\+]+)")
453- for line in pythia_log:
454- info = pythiare.search(line)
455- if not info:
456- continue
457- try:
458- # Pythia cross section in mb, we want pb
459- sigma_m = float(info.group('xsec')) *1e9
460- Nacc = int(info.group('generated'))
461- Ntry = int(info.group('tried'))
462- if Nacc==0:
463- raise self.InvalidCmd, 'Pythia8 shower failed since it'+\
464- ' did not accept any event from the MG5aMC event file.'+\
465- 'You can find more information in this log file:\n%s'%pythia_log
466-
467- except ValueError:
468- # xsec is not float - this should not happen
469- self.results.add_detail('cross_pythia', 0)
470- self.results.add_detail('nb_event_pythia', 0)
471- self.results.add_detail('error_pythia', 0)
472- else:
473- self.results.add_detail('cross_pythia', sigma_m)
474- self.results.add_detail('nb_event_pythia', Nacc)
475- #compute pythia error
476- error = self.results[self.run_name].return_tag(self.run_tag)['error']
477- try:
478- error_m = math.sqrt((error * Nacc/Ntry)**2 + sigma_m**2 *(1-Nacc/Ntry)/Nacc)
479- except ZeroDivisionError:
480- # Cannot compute error
481- error_m = -1.0
482- # works both for fixed number of generated events and fixed accepted events
483- self.results.add_detail('error_pythia', error_m)
484- break
485- pythia_log.close()
486+ if all(PY8_extracted_information[_] is None for _ in ['sigma_m','Nacc','Ntry']):
487+ PY8_extracted_information['sigma_m'],PY8_extracted_information['Nacc'],\
488+ PY8_extracted_information['Ntry'] = self.parse_PY8_log_file(
489+ pjoin(self.me_dir,'Events', self.run_name,'%s_pythia8.log' % tag))
490+
491+ if not any(PY8_extracted_information[_] is None for _ in ['sigma_m','Nacc','Ntry']):
492+ self.results.add_detail('cross_pythia', PY8_extracted_information['sigma_m'])
493+ self.results.add_detail('nb_event_pythia', PY8_extracted_information['Nacc'])
494+ # Compute pythia error
495+ error = self.results[self.run_name].return_tag(self.run_tag)['error']
496+ try:
497+ error_m = math.sqrt((error * Nacc/Ntry)**2 + sigma_m**2 *(1-Nacc/Ntry)/Nacc)
498+ except ZeroDivisionError:
499+ # Cannot compute error
500+ error_m = -1.0
501+ # works both for fixed number of generated events and fixed accepted events
502+ self.results.add_detail('error_pythia', error_m)
503+
504 if self.run_card['use_syst']:
505 self.results.add_detail('cross_pythia', -1)
506 self.results.add_detail('error_pythia', 0)
507- #####
508+
509 # From the djr file generated
510- #####
511 djr_output = pjoin(self.me_dir,'Events',self.run_name,'%s_djrs.dat'%tag)
512- cross_sections = None
513- if os.path.isfile(djr_output):
514- run_nodes = minidom.parse(djr_output).getElementsByTagName("run")
515- all_nodes = dict((int(node.getAttribute('id')),node) for
516- node in run_nodes)
517- try:
518- selected_run_node = all_nodes[0]
519- except:
520- selected_run_node = None
521- if selected_run_node:
522- xsections = selected_run_node.getElementsByTagName("xsection")
523- # We need to translate PY8's output in mb into pb
524- cross_sections = dict((xsec.getAttribute('name'),
525- (float(xsec.childNodes[0].data.split()[0])*1e9,
526- float(xsec.childNodes[0].data.split()[1])*1e9))
527- for xsec in xsections)
528+ if os.path.isfile(djr_output) and len(PY8_extracted_information['cross_sections'])==0:
529+ PY8_extracted_information['cross_sections'] = self.extract_cross_sections_from_DJR(djr_output)
530+ cross_sections = PY8_extracted_information['cross_sections']
531 if cross_sections:
532 # Filter the cross_sections specified an keep only the ones
533 # with central parameters and a different merging scale
534@@ -3969,11 +4220,10 @@
535 self.results.add_detail('cross_pythia8', cross_sections[central_scale][0])
536 self.results.add_detail('error_pythia8', cross_sections[central_scale][1])
537
538- #if len(cross_sections)>0:
539- # logger.info('Pythia8 merged cross-sections are:')
540- # for scale in sorted(cross_sections.keys()):
541- # logger.info(' > Merging scale = %-6.4g : %-11.5g +/- %-7.2g [pb]'%\
542- # (scale,cross_sections[scale][0],cross_sections[scale][1]))
543+ #logger.info('Pythia8 merged cross-sections are:')
544+ #for scale in sorted(cross_sections.keys()):
545+ # logger.info(' > Merging scale = %-6.4g : %-11.5g +/- %-7.2g [pb]'%\
546+ # (scale,cross_sections[scale][0],cross_sections[scale][1]))
547
548 xsecs_file = open(pjoin(self.me_dir,'Events',self.run_name,
549 '%s_merged_xsecs.txt'%tag),'w')
550@@ -4007,6 +4257,43 @@
551 self.exec_cmd('delphes --no_default', postcmd=False, printcmd=False)
552 self.print_results_in_shell(self.results.current)
553
554+ def parse_PY8_log_file(self, log_file_path):
555+ """ Parse a log file to extract number of event and cross-section. """
556+ pythiare = re.compile("Les Houches User Process\(es\)\s*\d+\s*\|\s*(?P<tried>\d+)\s*(?P<selected>\d+)\s*(?P<generated>\d+)\s*\|\s*(?P<xsec>[\d\.e\-\+]+)\s*(?P<xsec_error>[\d\.e\-\+]+)")
557+ for line in misc.BackRead(log_file_path):
558+ info = pythiare.search(line)
559+ if not info:
560+ continue
561+ try:
562+ # Pythia cross section in mb, we want pb
563+ sigma_m = float(info.group('xsec')) *1e9
564+ Nacc = int(info.group('generated'))
565+ Ntry = int(info.group('tried'))
566+ if Nacc==0:
567+ raise self.InvalidCmd, 'Pythia8 shower failed since it'+\
568+ ' did not accept any event from the MG5aMC event file.'
569+ return sigma_m, Nacc, Ntry
570+ except ValueError:
571+ return None,None,None
572+ raise self.InvalidCmd, "Could not find cross-section and event number information "+\
573+ "in Pythia8 log\n '%s'."%log_file_path
574+
575+ def extract_cross_sections_from_DJR(self,djr_output):
576+ """Extract cross-sections from a djr XML output."""
577+ run_nodes = minidom.parse(djr_output).getElementsByTagName("run")
578+ all_nodes = dict((int(node.getAttribute('id')),node) for
579+ node in run_nodes)
580+ try:
581+ selected_run_node = all_nodes[0]
582+ except:
583+ return {}
584+ xsections = selected_run_node.getElementsByTagName("xsection")
585+ # We need to translate PY8's output in mb into pb
586+ return dict((xsec.getAttribute('name'),
587+ [float(xsec.childNodes[0].data.split()[0])*1e9,
588+ float(xsec.childNodes[0].data.split()[1])*1e9])
589+ for xsec in xsections)
590+
591 def do_pythia(self, line):
592 """launch pythia"""
593
594
595=== modified file 'madgraph/various/banner.py'
596--- madgraph/various/banner.py 2016-09-02 02:17:02 +0000
597+++ madgraph/various/banner.py 2016-09-03 09:10:54 +0000
598@@ -1483,6 +1483,17 @@
599 self.add_param("Merging:Dparameter", 0.4, hidden=True, always_write_to_card=False)
600 self.add_param("Merging:doPTLundMerging", False, hidden=True, always_write_to_card=False)
601
602+ # Special Pythia8 paremeters useful to simplify the shower.
603+ self.add_param("BeamRemnants:primordialKT", False, hidden=True, always_write_to_card=True)
604+ self.add_param("PartonLevel:Remnants", False, hidden=True, always_write_to_card=True)
605+ self.add_param("Check:event", False, hidden=True, always_write_to_card=True)
606+ self.add_param("TimeShower:QEDshowerByQ", False, hidden=True, always_write_to_card=True)
607+ self.add_param("TimeShower:QEDshowerByL", False, hidden=True, always_write_to_card=True)
608+ self.add_param("SpaceShower:QEDshowerByQ", False, hidden=True, always_write_to_card=True)
609+ self.add_param("SpaceShower:QEDshowerByL", False, hidden=True, always_write_to_card=True)
610+ self.add_param("PartonLevel:FSRinResonances", False, hidden=True, always_write_to_card=True)
611+ self.add_param("ProcessLevel:resonanceDecays", False, hidden=True, always_write_to_card=True)
612+
613 # Add parameters controlling the subruns execution flow.
614 # These parameters should not be part of PY8SubRun daughter.
615 self.add_default_subruns('parameters')
616@@ -1613,9 +1624,9 @@
617 return "%s" % value
618 elif formatv == 'list':
619 if len(value) and isinstance(value[0],float):
620- return ', '.join([PY8Card.pythia8_formatting(arg, 'shortfloat') for arg in value])
621+ return ','.join([PY8Card.pythia8_formatting(arg, 'shortfloat') for arg in value])
622 else:
623- return ', '.join([PY8Card.pythia8_formatting(arg) for arg in value])
624+ return ','.join([PY8Card.pythia8_formatting(arg) for arg in value])
625
626
627 def write(self, output_file, template, read_subrun=False,
628
629=== modified file 'madgraph/various/histograms.py'
630--- madgraph/various/histograms.py 2016-08-31 03:10:04 +0000
631+++ madgraph/various/histograms.py 2016-09-03 09:10:54 +0000
632@@ -13,7 +13,6 @@
633 # For more information, visit madgraph.phys.ucl.ac.be and amcatnlo.web.cern.ch
634 #
635 ################################################################################
636-
637 """Module for the handling of histograms, including Monte-Carlo error per bin
638 and scale/PDF uncertainties."""
639
640@@ -681,7 +680,7 @@
641
642
643 def __init__(self, file_path=None, weight_header=None,
644- raw_labels=False, consider_reweights='ALL', **opts):
645+ raw_labels=False, consider_reweights='ALL', selected_central_weight=None, **opts):
646 """ Read one plot from a file_path or a stream. Notice that this
647 constructor only reads one, and the first one, of the plots specified.
648 If file_path was a path in argument, it would then close the opened stream.
649@@ -711,7 +710,9 @@
650 weight_header = HwU.parse_weight_header(stream, raw_labels=raw_labels)
651
652 if not self.parse_one_histo_from_stream(stream, weight_header,
653- consider_reweights=consider_reweights, raw_labels=raw_labels):
654+ consider_reweights=consider_reweights,
655+ selected_central_weight=selected_central_weight,
656+ raw_labels=raw_labels):
657 # Indicate that the initialization of the histogram was unsuccessful
658 # by setting the BinList property to None.
659 super(Histogram,self).__setattr__('bins',None)
660@@ -1092,7 +1093,7 @@
661 return ' '.join(res)
662
663 def parse_one_histo_from_stream(self, stream, all_weight_header,
664- consider_reweights='ALL', raw_labels=False):
665+ consider_reweights='ALL', raw_labels=False, selected_central_weight=None):
666 """ Reads *one* histogram from a stream, with the mandatory specification
667 of the ordered list of weight names. Return True or False depending
668 on whether the starting definition of a new plot could be found in this
669@@ -1136,10 +1137,14 @@
670 if j == len(all_weight_header):
671 raise HwU.ParseError, "There is more bin weights"+\
672 " specified than expected (%i)"%len(weight_header)
673+ if selected_central_weight == all_weight_header[j]:
674+ bin_weights['central'] = float(weight.group('weight'))
675 if all_weight_header[j] == 'boundary_xmin':
676 boundaries[0] = float(weight.group('weight'))
677 elif all_weight_header[j] == 'boundary_xmax':
678- boundaries[1] = float(weight.group('weight'))
679+ boundaries[1] = float(weight.group('weight'))
680+ elif all_weight_header[j] == 'central' and not selected_central_weight is None:
681+ continue
682 elif all_weight_header[j] in weight_header:
683 bin_weights[all_weight_header[j]] = \
684 float(weight.group('weight'))
685@@ -1439,6 +1444,14 @@
686 # of the two new weight label added.
687 return (position,labels)
688
689+ def select_central_weight(self, selected_label):
690+ """ Select a specific merging scale for the central value of this Histogram. """
691+ if selected_label not in self.bins.weight_labels:
692+ raise MadGraph5Error, "Selected weight label '%s' could not be found in this HwU."%selected_label
693+
694+ for bin in self.bins:
695+ bin.wgts['central']=bin.wgts[selected_label]
696+
697 def rebin(self, n_rebin):
698 """ Rebin the x-axis so as to merge n_rebin consecutive bins into a
699 single one. """
700@@ -1602,7 +1615,7 @@
701
702 def __init__(self, file_path, weight_header=None, run_id=None,
703 merging_scale=None, accepted_types_order=[], consider_reweights='ALL',
704- raw_labels=False, **opts):
705+ raw_labels=False, **opts):
706 """ Read one plot from a file_path or a stream.
707 This constructor reads all plots specified in target file.
708 File_path can be a path or a stream in the argument.
709@@ -1630,7 +1643,7 @@
710 self.parse_histos_from_PY8_XML_stream(stream, run_id,
711 merging_scale, accepted_types_order,
712 consider_reweights=consider_reweights,
713- raw_labels=raw_labels)
714+ raw_labels=raw_labels)
715 except XMLParsingError:
716 # Rewinding the stream
717 stream.seek(0)
718@@ -1638,19 +1651,34 @@
719 if not weight_header:
720 weight_header = HwU.parse_weight_header(stream,raw_labels=raw_labels)
721
722+ # Select a specific merging scale if asked for:
723+ selected_label = None
724+ if not merging_scale is None:
725+ for label in weight_header:
726+ if HwU.get_HwU_wgt_label_type(label)=='merging_scale':
727+ if float(label[1])==merging_scale:
728+ selected_label = label
729+ break
730+ if selected_label is None:
731+ raise MadGraph5Error, "No weight could be found in the input HwU "+\
732+ "for the selected merging scale '%4.2f'."%merging_scale
733+
734 new_histo = HwU(stream, weight_header,raw_labels=raw_labels,
735- consider_reweights=consider_reweights)
736+ consider_reweights=consider_reweights,
737+ selected_central_weight=selected_label)
738+# new_histo.select_central_weight(selected_label)
739 while not new_histo.bins is None:
740 if accepted_types_order==[] or \
741 new_histo.type in accepted_types_order:
742 self.append(new_histo)
743 new_histo = HwU(stream, weight_header, raw_labels=raw_labels,
744- consider_reweights=consider_reweights)
745-
746- if not run_id is None:
747- logger.info("The run_id '%s' was specified, but "%run_id+
748- "format of the HwU plot source is the MG5aMC"+
749- " so that the run_id information is ignored.")
750+ consider_reweights=consider_reweights,
751+ selected_central_weight=selected_label)
752+
753+ # if not run_id is None:
754+ # logger.debug("The run_id '%s' was specified, but "%run_id+
755+ # "format of the HwU plot source is the MG5aMC"+
756+ # " so that the run_id information is ignored.")
757
758 # Order the histograms according to their type.
759 titles_order = [h.title for h in self]
760@@ -3229,54 +3257,6 @@
761 ################################################################################
762 ## matplotlib related function
763 ################################################################################
764-######## Routine from https://gist.github.com/thriveth/8352565
765-######## To fill for histograms data in matplotlib
766-def fill_between_steps(x, y1, y2=0, h_align='right', ax=None, **kwargs):
767- ''' Fills a hole in matplotlib: fill_between for step plots.
768- Parameters :
769- ------------
770- x : array-like
771- Array/vector of index values. These are assumed to be equally-spaced.
772- If not, the result will probably look weird...
773- y1 : array-like
774- Array/vector of values to be filled under.
775- y2 : array-Like
776- Array/vector or bottom values for filled area. Default is 0.
777- **kwargs will be passed to the matplotlib fill_between() function.
778- '''
779- # If no Axes opject given, grab the current one:
780- if ax is None:
781- ax = plt.gca()
782-
783-
784- # First, duplicate the x values
785- #duplicate the info # xx = numpy.repeat(2)[1:]
786- xx= []; [(xx.append(d),xx.append(d)) for d in x]; xx = xx[1:]
787- # Now: the average x binwidth
788- xstep = x[1] -x[0]
789- # Now: add one step at end of row.
790- xx.append(xx[-1] + xstep)
791-
792- # Make it possible to change step alignment.
793- if h_align == 'mid':
794- xx = [X-xstep/2. for X in xx]
795- elif h_align == 'right':
796- xx = [X-xstep for X in xx]
797-
798- # Also, duplicate each y coordinate in both arrays
799- yy1 = []; [(yy1.append(d),yy1.append(d)) for d in y1]
800- if isinstance(y1, list):
801- yy2 = []; [(yy2.append(d),yy2.append(d)) for d in y2]
802- else:
803- yy2=y2
804-
805- # now to the plotting part:
806- ax.fill_between(xx, yy1, y2=yy2, **kwargs)
807-
808- return ax
809-######## end routine from https://gist.github.com/thriveth/835256
810-
811-
812 def plot_ratio_from_HWU(path, ax, hwu_variable, hwu_numerator, hwu_denominator, *args, **opts):
813 """INPUT:
814 - path can be a path to HwU or an HwUList instance
815
816=== modified file 'madgraph/various/lhe_parser.py'
817--- madgraph/various/lhe_parser.py 2016-09-01 09:14:58 +0000
818+++ madgraph/various/lhe_parser.py 2016-09-03 09:10:54 +0000
819@@ -513,19 +513,26 @@
820 else:
821 return out
822
823- def split(self, nb_event=0):
824+ def split(self, nb_event=0, partition=None, cwd=os.path.curdir, zip=False):
825 """split the file in multiple file. Do not change the weight!"""
826
827 nb_file = -1
828 for i, event in enumerate(self):
829- if i % nb_event == 0:
830+ if (not (partition is None) and i==sum(partition[:nb_file+1])) or \
831+ (partition is None and i % nb_event == 0):
832 if i:
833 #close previous file
834 current.write('</LesHouchesEvent>\n')
835 current.close()
836 # create the new file
837 nb_file +=1
838- current = open('%s_%s.lhe' % (self.name, nb_file),'w')
839+ # If end of partition then finish writing events here.
840+ if not partition is None and (nb_file+1>len(partition)):
841+ return nb_file+1
842+ if zip:
843+ current = EventFile(pjoin(cwd,'%s_%s.lhe.gz' % (self.name, nb_file)),'w')
844+ else:
845+ current = open(pjoin(cwd,'%s_%s.lhe' % (self.name, nb_file)),'w')
846 current.write(self.banner)
847 current.write(str(event))
848 if i!=0:
