Gotta release fast
-
I love the discussion that this has garnered. I think we can take all of the points to bring together a better plan soon.
The criteria for moving from rc to stable needs to be clearer or stricter. I suggest that any new issues on rc (i.e. ones that don't also exist on stable) should default to critical-rc unless there is an explicit group decision otherwise.
Good choice. It would require some very on-the-ball triagers keeping track of bugs and being able to mark anything new in this RC as a regression. I'm not sure GitHub or GitLab automation would have the tools we need to make that automatic.
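To make the "new on rc means critical-rc by default" rule concrete, here is a minimal sketch of what a triage helper could do. It assumes issues are identified by plain integer IDs and that someone has already recorded which issues reproduce on each channel; the function name and the "existing" bucket are made up for illustration and are not anything GitHub or GitLab provide.

```python
def classify_rc_issues(rc_issues, stable_issues):
    """Split issues seen on rc into regressions and pre-existing issues.

    An issue that reproduces on rc but not on stable is treated as a
    regression and gets the critical-rc label by default.
    """
    stable_ids = set(stable_issues)
    rc_ids = set(rc_issues)
    return {
        "critical-rc": sorted(rc_ids - stable_ids),  # new in this rc
        "existing": sorted(rc_ids & stable_ids),     # already on stable
    }


# Hypothetical issue numbers, purely for illustration.
labels = classify_rc_issues(rc_issues=[101, 102, 103], stable_issues=[102, 200])
print(labels)
```

The hard part remains the human one: someone still has to confirm whether each issue reproduces on stable before this comparison means anything.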
Maybe there could be a place where they can sign off that they have "used it for 24 hours on XXX device" to feed into the release-to-stable decision?
Another good choice. Something as simple as a form where they enter their device, the version they're using, and say "yes" or "no" could work. Having trusted testers in our QA group with ultimate sign-off powers is what I'd come back with, though.
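As a rough sketch of how such a form could feed the release decision: each entry holds the device, the version, and a yes/no answer, and trusted QA testers get the final say. The field names and the exact rule (at least one trusted "yes" and no trusted "no" for the version) are my assumptions, not an agreed process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignOff:
    tester: str
    device: str
    version: str
    works: bool            # the "yes" or "no" from the form
    trusted: bool = False  # True for members of the QA group


def ready_for_stable(signoffs, version):
    """Return True if trusted testers have approved this version.

    Assumed rule: at least one trusted tester said "yes" and no trusted
    tester said "no". Community answers are informational only.
    """
    trusted = [s for s in signoffs if s.version == version and s.trusted]
    return bool(trusted) and all(s.works for s in trusted)


# Hypothetical testers, devices, and version string.
reports = [
    SignOff("alice", "device-a", "rc-42", works=True, trusted=True),
    SignOff("bob", "device-b", "rc-42", works=False),
    SignOff("carol", "device-c", "rc-42", works=True, trusted=True),
]
print(ready_for_stable(reports, "rc-42"))
```

A community "no" does not block the release here, but it would be a strong hint for the trusted testers to look at that device before signing off.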
Could we perhaps document where we differ now from what processes and features were in place at that [Canonical] time, so we can perhaps bring back some of those in a manner suitable for UBports, and then go from there?
Hmm. I don't have a clear understanding of what those processes were, to be honest. I was just getting started with UBports when Canonical stopped releasing updates altogether. Could you give a clearer picture of what you're looking for, maybe some wiki.ubuntu.com pages?
This post is my first try for getting our processes compiled into a coherent place. Later, a fully fleshed out release document would be placed on docs.ubports.com.
Once a month for stable channel is plenty fast enough a cadence, and we could do interim security fix releases if necessary.
A good plan on its face, but the rolling development nature might make the "Interim security releases" part sticky. What if there's an in-image security fix at week 2 but we merged in a feature at week 1 that is still undergoing full QA (since we have a month, right?) prior to release? I think this situation would be fairly common and I wouldn't be comfortable pulling the release trigger in that case.
the tests situation is really bad now with UBports, as many packages just have tests disabled during the deb builds, autopilot tests aren't being run anywhere, and we don't have the infrastructure set up to be building with coverage enabled.
Correct on all counts. While Autopilot might be a problem (no one is maintaining it), autopkgtests should definitely be running.
System updates being released too often on stable channel just makes the product seem immature, while not enough updates makes it look unmaintained. We need to find the happy medium there.
Maybe if not weekly, then every 2 weeks. My primary concern is discouraging a mindset of "bring it in for QA, the release is 4 weeks away" and encouraging "QA, can you look at this a lot before we bring it in?" (this applies to @3arn0wl's quote from @elastic, too)
I think the point is to pick up almost every UT user, since there are not many of them. There are still users using Canonical version out there.
Not really. The only way we'll reach the people on Canonical builds is with marketing and word of mouth. The release cadence doesn't relate to that.
In some projects we used more layers to solve that kind of problems. In UT world the approach could be to introduce one more release channel
Who is your ideal user for the fourth channel? Right now, the split is fairly clear:
- Developers who are merging things into the image should have devel on at least one device which they update extremely often.
- Users who want faster updates and QA testers should have rc on all their devices.
- Users who want only the most tested software should have stable on all their devices.
I'm not sure where "testing" would fit in here.
But there should be a way to deliver critical bug fixes on any channel instantly.
The rolling release model shoots this as far as I can see. That's why I'd prefer a faster, but scheduled, release.
-
@unisuperbox said in Gotta release fast:
Good choice. It would require some very on-the-ball triagers keeping track of bugs and being able to mark anything new in this RC as a regression. I'm not sure GitHub or GitLab automation would have the tools we need to make that automatic.
UBports has an unpaid (or even paying!) community who want to get involved but may not have the skills or resources to help with writing the code. Reviewing issue reports is an essential task with a modest learning curve and can be done with limited resources such as time.
All that is required is to:
- Ensure the report describes a problem
- Ensure there is enough information to reproduce the problem
- Ensure the affected device(s), channel(s) and version(s) are mentioned
- Mark the issue as "checked"
It should be clear (but probably isn't) that this is something anyone with "5 minutes" can help with. It amounts to putting the issue into this outline:
- Title
- How to reproduce
- Expected result
- Actual result
- Devices, channels & versions
- Confirmed by
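The outline above is simple enough that a first pass could be checked mechanically. A minimal sketch, assuming a report has already been parsed into a dictionary keyed by section name (the parsing itself and the function name are hypothetical):

```python
REQUIRED_SECTIONS = (
    "Title",
    "How to reproduce",
    "Expected result",
    "Actual result",
    "Devices, channels & versions",
    "Confirmed by",
)


def missing_sections(report):
    """Return the outline sections that are absent or empty in a report."""
    return [
        name
        for name in REQUIRED_SECTIONS
        if not str(report.get(name, "")).strip()
    ]


# Hypothetical report, for illustration only.
report = {
    "Title": "Keyboard does not appear in the browser",
    "How to reproduce": "Open the browser and tap the address bar",
    "Expected result": "The keyboard appears",
    "Actual result": "Nothing happens",
    "Devices, channels & versions": "",
}
print(missing_sections(report))
```

An issue with an empty result could be marked "checked"; anything else tells the reviewer exactly what to ask the reporter for.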
The above has to happen before the additional steps, which require more knowledge or skills, can be taken:
- Ensure the bug is reported against the right project
- Assess the impact
- Prioritize the work to fix
- Do the work to fix
-
@unisuperbox said in Gotta release fast:
Who is your ideal user for the fourth channel? Right now, the split is fairly clear:
Let me try to clarify my point of view by mapping the suggested channels to the channels we currently have:
devel -> current devel
testing -> current rc
rc -> current stable
stable -> kind of LTS release; for users who don't want to download the entire rootfs image every week, but want to receive security updates
It is not relevant ATM, since most users are kind of beta testers. But it could be important when the community grows.
-
I don't know how much effort is needed for every stable release, but perhaps we can have a regular OTA and some kind of mini-OTA midway. This mini-OTA would include small bug fixes that most likely won't create regressions. An example of this is the layout issues in core apps. The regular OTA would be for bigger fixes and new features.
-
@kugiigi said in Gotta release fast:
I don't know how much effort is needed for every stable release, but perhaps we can have a regular OTA and some kind of mini-OTA midway. This mini-OTA would include small bug fixes that most likely won't create regressions. An example of this is the layout issues in core apps. The regular OTA would be for bigger fixes and new features.
That sounds plausible, but I've worked (or consulted) on a lot of software projects and this is hard to make work. It is very difficult to manage updating both a released version and a development version without more than doubling the workload. It turns out there can be unexpected interactions between apparently small, safe changes.
It is better to be in a position where there's no need to manage two sets of changes. This does have risks (which is why devel breaks) and we have to mitigate those risks. One way is to release often so that the number of changes in each release is small, another is to have easy ways to revert to a previous good release, another is to release to a small group before rollout.
With the proposed use of rc a group of "canaries" try changes on rc first to identify unexpected issues.
Which reminds me: one essential test is that rc can revert back to stable in the event of problems.
-
@alan_g One should always be able to revert to the previous image in the same channel as well. The main place where this becomes problematic, is for things which are shipped as clicks, rather than as the more immutable system image. Reverting to previous clicks is nigh impossible (without already having the old click), and we don't have stable/testing/devel channels for things in the app store.
-
@dobey said in Gotta release fast:
@alan_g One should always be able to revert to the previous image in the same channel as well. The main place where this becomes problematic, is for things which are shipped as clicks, rather than as the more immutable system image. Reverting to previous clicks is nigh impossible (without already having the old click), and we don't have stable/testing/devel channels for things in the app store.
That is good both for the user and for the channel owner. Is that supported?
-
@alan_g said in Gotta release fast:
That is good both for the user and for the channel owner. Is that supported?
I would say so, though there isn't UI for it at the moment. We could probably make some changes to the UI to make it easier to select any of the available images in a channel, to switch to (assuming we only keep N images in a channel, rather than all images ever built).
Currently one needs to do the revert either using ubuntu-device-flash or by using system-image-cli directly on the device via an adb/ssh shell.
-
@dobey Would be good to have an easy UI rather than having to resort to terminal etc. Only I don't know how difficult it would be to include.
-
For clicks:
The OpenStore can (and does) downgrade Clicks when it sees fit, but there is currently no way in the UI to select an older app version to download and install.
For images:
There is currently no way to select which image in a channel you wish to use. You have the latest, always.
-
@unisuperbox said in Gotta release fast:
There is currently no way to select which image in a channel you wish to use. You have the latest, always.
You can specify which revision of an image to install, and from which channel, with the two tools I mentioned. I presume the installer always uses the latest image and has no way to specify a build number, but I do not know with certainty, so I didn't mention it.
I think it would be pretty easy to add something to the System Settings updates/channels panel though.
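As a sketch of the selection logic such a panel would need (not of how system-image actually stores or lists images), assume the channel retains a handful of recent image numbers and the device knows its current one. The function name and the default of three retained images are made up for illustration.

```python
def rollback_candidates(available, current, keep=3):
    """List the newest `keep` images older than the current one, newest first.

    `available` is the set of image numbers still retained in the channel.
    """
    if keep <= 0:
        return []
    older = sorted(v for v in set(available) if v < current)
    return older[-keep:][::-1]


# Hypothetical image numbers in one channel.
choices = rollback_candidates(available=[10, 11, 12, 13, 14], current=14)
print(choices)
```

The UI would then only need to show this short list and hand the chosen number to whichever tool performs the actual install.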
-
As we near the OTA-6 release, I would like to gather what I believe are the most important points from this post:
- Everyone wants "release faster", without a doubt.
- "Release faster" depends on having enough confidence that software being released to stable is well-tested, which we do not have. Some things which would translate into more release confidence include:
  - Automated tests on all system components
  - Integration tests between system components
  - Automated full-device testing, such as Canonical's fabled "Frankenstein device lab"
  - Formalized manual testing by users who can confirm stability
- Even with full release confidence (but especially without), an automatic release model requires the ability to roll back to a previous version of apps and the full system image
  - Aside: I want to experiment with one of the new atomic release models like OSTree which allows agnostic packaging formats to be installed on top. From my quick foray, I see:
    - Benefits: Automatic and manual rollbacks, automatic diffing, and switchable roots
    - Drawbacks that make this a long-term or impossible idea for our existing devices
      - Probably requires newer kernel versions than 3.4 or 3.10
      - We don't have enough engineering power to concoct a solution which would allow converting a system-image system to OSTree in-flight, so we'd probably require users to manually switch.
I think that bringing up the edge channel as a place to do large migrations out-of-band with the normal release cadence is going to have a huge impact on how quickly we can release in the future. Previously, @mariogrip's work on bringing us to upstream Libhybris would have had to wait until after OTA-6 landed, and then it would have delayed OTA-7 until we had proper confidence in it. With edge, we can have people help us test these huge changes with the ability to roll back and without disrupting current users.
To respond to my earlier hindsight: OTA-6 was not (yet) taken away by TDS. We did not hit the original deadline set, but we will still be within our 6-8 week cycle when it releases. This includes our testing and release admin stage, which we are currently in. All of this while I'm not sweating bullets from pushing a release on ourselves before we're sure we're ready.
We have ended up in "Gotta release slower" in this cycle without a doubt. I hope to make the improvements we've identified here so "gotta release fast" can become a reality. I also hope to tighten our release management so that we hit 2-4 weeks of development plus 2 weeks of testing, a 4-6 week cycle instead of the current 6-8. This means we'll get faster releases, but we still won't hit fast releases.
Important note for anyone interested in contributing to... basically anything I've said here: I'll be holding an OTA-7 development meeting at the start of the cycle where we'll discuss how we want to improve during the OTA-7 cycle and take bugs from the tracker for the release. Only assigned tickets will be added to this milestone, so if you have a pet bug that you want fixed, now is the time to get involved and help fix it! Subscribe to this GitLab issue for more information as the day draws nearer.