>I am not sure how interesting it is to run our tests on a debian system on deploy. Do you think we should check that in? That is very late in the cycle of our product to catch bugs.

(I'm not quite sure how to interpret this: are you thinking about the branch/build/package/deploy/run-tests cycle, or the release stable version-work-work-work-new release candidate cycle? Let me know if I'm responding to something completely different.)

My starting point for this experiment was the PPA builders, which check out and build Widelands in a clean environment, so that felt like a natural place to add the regression test suite to the process. I didn't know exactly how to go about it, but the idea was to run the tests either on a freshly built binary just before it gets packaged up, or on the freshly made package. I couldn't get the former to work. I did add the package tests and got them running locally, but unfortunately the PPA builders won't run them.

So the idea was to run these on each development build, but that seems tricky at least with Launchpad's current setup. However, support for running the tests has been added to the package so there is nothing blocking on our side.

As a small digression, my master plan for the Debian packaging is that it should be trivial to take our next release candidate together with our updated packaging and have the first set of packages ready on day 1. The majority of the changes needed going from build18 to build19 are adding and removing build dependencies and adjusting the cmake invocation. The Debian maintainer would have needed to do this at some point anyway, but since we have continuously updated the packaging throughout the development cycle, that work is already done. With the exception of Debian policies/guidelines we haven't heard of, that means most of the work needed to get the next Widelands release into Debian/Ubuntu/Linux Mint/etc. is already done.

Now, why is this relevant? Because if a package has tests, those tests will be run. I've triaged bugs in Ubuntu/Debian for some years now, and I've seen a surprisingly, even embarrassingly high number of packages where the main program crashes either on startup or at the first sign of user interaction. (Mostly smaller, lesser-known programs, but still...) So I think even a smoketest which verifies that the program starts and runs can be a useful gain.
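
To make the smoketest idea a bit more concrete, here is a minimal sketch in Python. It is not what the package tests in this branch actually do, just an illustration. The helper name `smoke_test`, the timeout, and the rule "exit status 0 or still running after the timeout counts as OK" are all my own assumptions.

```python
import subprocess
import sys


def smoke_test(cmd, timeout=10.0):
    """Return True if cmd starts and either exits with status 0
    or is still running after `timeout` seconds.

    Hypothetical helper: the pass/fail rule is an assumption.
    """
    try:
        proc = subprocess.Popen(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError:
        # Binary is missing or cannot be executed at all.
        return False
    try:
        returncode = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Still alive after the timeout: it started fine, so clean up.
        proc.kill()
        proc.wait()
        return True
    # The process ended on its own; only a clean exit counts as a pass.
    return returncode == 0


# Harmless demonstration: the Python interpreter stands in for the real binary.
ok = smoke_test([sys.executable, "-c", "pass"])
```

In a real package test the command would be the installed game binary, with whatever arguments make it exit on its own; I have deliberately left those out here.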

More importantly, it can be used to verify the program still works after other changes in the system (this might be addressing your "on deploy" comment). Debian's CI system [1] states that the tests for a package will be run:
* when any package in the dependency chain of its binary packages changes;
* when the package itself changes;
* when 1 month is passed since the test suite was run for the last time.
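
Read as a rule, the three conditions above boil down to a simple "any of these" check. The sketch below is only my paraphrase of that list, not code from Debian's CI. The function name is hypothetical, and treating "1 month" as 30 days is an assumption.

```python
from datetime import datetime, timedelta

# Assumption: "1 month" is approximated as 30 days.
MAX_AGE = timedelta(days=30)


def should_run_tests(dependency_changed, package_changed, last_run, now):
    """Return True if any of the three listed trigger conditions holds."""
    too_old = (now - last_run) >= MAX_AGE
    return dependency_changed or package_changed or too_old
```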

As an example, let's say a new version of boost or SDL is packaged. In this case any dependent package which has tests will run them against the new version of the library. If Widelands for some reason fails with the newer version, the problem can be discovered, pinpointed and fixed immediately, rather than the current situation where someone would report "hey, how come Widelands doesn't start anymore?" and then hopefully someone would be able to track down that one of the latest library updates broke it. (This example is somewhat contrived, since a change in boost or SDL would probably trigger a rebuild of all dependent packages, but you get the idea.)

>As said, I am on the fence if this is worth merging. You have a better understanding of the implications, so I think you should decide.

Biased as I am, I would say that this still adds value. Unfortunately it won't help us all that much in day-to-day development and won't benefit Widelands directly as much as I had hoped when I started out. But I believe it can strengthen the quality of, and confidence in, the official Widelands packages on Debian and Ubuntu. And that is the version I believe the majority of our users on those platforms will play.

> we cannot have automated nightly testing from our bzr branches
We sorta can, but would need to manually configure it for each branch first. I do see that GitHub's ecosystem offers some nice possibilities.


[1] https://ci.debian.net/doc/

PS. Yes, I'll add a note to the document. Maybe not today, but hey, I finally wrote a reply here. Is there a tentative deadline for adding thoughts and comments to that document, btw?