How documentation is done in OpenStack
We use git heavily for documentation in OpenStack so that we can "treat docs like code" – and I see this trend in a lot of places, especially in the Write the Docs community.
Check out the talks from the Write the Docs 2014 event in Portland.
In OpenStack, we have documentation workflows that mimic our code collaboration. We submit patches for people to review, we review other people's patches (similar to a pull request on GitHub), and we have build and test automation for each doc patch. The idea is to use the collaboration of the GitHub pull request workflow for documents as well as code. We're all responsible for relevant, accurate documentation of about 25 OpenStack projects written in Python across 130 git repositories, so we have to work together. I get questions from writers new to these kinds of workflows, so I wanted to collect some of the best practices we've found and learn more.
How do you deal with so many doc patches?
OpenStack is a popular open source project, so the documentation must scale to many contributions and contributors. We have systems in place that let us merge up to 50 doc patches a day, although it's usually around 15. Since OpenStack uses Gerrit, some of my advice is specific to that web-based code review tool rather than GitHub, but the guidelines should apply universally. For further reading, I also highly recommend "How Project Owners Use Pull Requests on GitHub," which provides survey results and extracts themes about how integrators use pull requests.
We deal with many patch requests (in OpenStack these are patches, not technically pull requests) using these tools and processes:
If I'm missing your favorite helper, please comment below!
We have "gate tests" that automate much of the initial quality checking for all of our community's contributions. Our gate tests verify these things:
- "Do the docs in this patch build?"
- "Is the syntax correct in all patched files?"
- "Are all the links working?"
- "Does the patch remove files used by other deliverables?"
These four tests must pass before a patch can merge, so they are a true gatekeeper for us. For efficiency, a test only runs when it is relevant to the patch: if a patch deletes no files, the deletion test is skipped. This saves humans from having to check out each patch locally and run the tests by hand. There are other tests that report results but don't actually block a patch at the gate, such as "Do translations still build with this patch?" Continuous integration (CI) for documentation is a game-changer. I can't stress this enough, but this article focuses on scaling doc reviews; if you'd like to learn more about OpenStack's CI systems, see ci.openstack.org.
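The "run a check only when it is relevant" idea can be sketched in a few lines of Python. This is a hypothetical, simplified illustration, not the actual OpenStack CI tooling; the function names and the `patch` structure are invented stand-ins:

```python
# Hypothetical sketch (not the real OpenStack CI): four gate checks,
# where a check only runs when the patch makes it relevant, so
# irrelevant checks are skipped entirely.

def docs_build(patch):
    # Stand-in for "do the docs in this patch build?"
    return "broken.rst" not in patch["changed_files"]

def syntax_ok(patch):
    # Stand-in for a markup-syntax check of every patched file
    return all(f.endswith((".rst", ".xml")) for f in patch["changed_files"])

def links_ok(patch):
    # Stand-in for a link checker
    return not patch.get("dead_links")

def deletions_safe(patch):
    # Stand-in for "does the patch remove files other deliverables use?"
    return not set(patch["deleted_files"]) & set(patch["shared_files"])

def run_gate(patch):
    """Run only the gate checks relevant to this patch."""
    checks = [
        ("build", True, docs_build),
        ("syntax", True, syntax_ok),
        ("links", True, links_ok),
        # The deletion check is skipped when the patch deletes nothing.
        ("deletion", bool(patch["deleted_files"]), deletions_safe),
    ]
    return {name: check(patch) for name, relevant, check in checks if relevant}

patch = {
    "changed_files": ["install-guide.rst"],
    "deleted_files": [],
    "shared_files": ["common/glossary.rst"],
}
results = run_gate(patch)
print(results)  # the deletion check does not even appear in the results
```

The point of the structure is that a skipped check costs nothing, which is what keeps the gate fast when dozens of patches land per day.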
Testing the technical accuracy of a patch is the time-consuming part, and it is difficult to predict how long a documentation review will take. It is essential that a human verifies the technical correctness of every doc contribution, and we rely on our reviewers here. Having environments set up that let you test user actions is part of the reviewer role and will save you time. DevStack is a collection of scripts for running a configured development environment, suitable for local setup. TryStack is a free public cloud running on donated hardware and resources. For example, with a working DevStack environment built from a stable branch, I can run client commands against a known version of OpenStack, and I have administrator access to that environment because I manage it myself. I also have a TryStack account, which gives me free cloud resources through CLI or API commands, and a Rackspace Cloud account that lets me test API calls.
We have a shortcut system that lets a contributor put a line like "Closes-Bug: nnnnnn" in the commit message, where nnnnnn comes from Launchpad, our bug tracker. This link between a doc bug and the patch itself is handy for checking whether the patch fixes the issues described in the recorded bug. I will often start a review by clicking through to the bug, reading the comments, and then seeing whether the patch fixes what was broken. You also want your process to ensure contributors know whether a bug has been accepted as a bug; in our system, it should be set to Confirmed. On GitHub, you would use issue labels and link to the issue from your pull request.
In OpenStack, we can merge up to 70 doc-only patches in a day, so consistent commit messages make it easier to scan for patches you can review with your base knowledge. It sounds picky, but knowing you might look at 50 in a day helps explain the rationale for the thoroughness. Here is a summary of our standard:
- Provide a brief description of the change in the first line.
- Insert a single blank line after the first line.
- Provide a detailed description of the change in the following lines, breaking it into paragraphs where needed.
- The first line should be limited to 50 characters and must not end with a period (first lines longer than 72 characters are rejected at the gate).
- Subsequent lines should be wrapped at 72 characters.
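Putting those rules together with the Closes-Bug shortcut, a commit message might look like this (summary under 50 characters, body wrapped at 72; the bug number is invented for illustration):

```
Fix the networking install steps for Ubuntu

The install guide told readers to edit a config file that was
renamed in the latest release. Point to the new file and update
the sample values to match.

Closes-Bug: #1234567
```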
All reviewers know that we use an agreed set of conventions for OpenStack documents, and we can point to them when reviewing a patch. This set of standards helps us verify consistency of service names, correct capitalization, and structural conventions such as "no stacked headings." We publish them on the OpenStack wiki, and any drastic change is discussed on the openstack-docs mailing list first. When the wiki page doesn't answer a question, we use the IBM Style Guide as the final arbiter for all matters of style or convention.
With Gerrit, we can run specific searches for relevant patches to review. For example, I created a dashboard for all OpenStack document repositories (link requires login). You can also have a dashboard only for patches affecting APIs. Really, with the search feature you can customize your dashboard to look at whatever you want and prioritize however you like.
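A Gerrit dashboard is essentially a saved search. As an illustrative example, a query like this one lists the open patches on the master branch of one docs repository, using Gerrit's standard search operators:

```
status:open project:openstack/openstack-manuals branch:master
```

Combining operators like these is how you narrow a dashboard down to exactly the patches you want to prioritize.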
With many meetings on our calendars, it's a good idea to set aside time for reviews. Some people like to review at the end of their workday, or first thing, to get a batch of them done. It's really up to you how you prioritize reviews in your workday. I use a calendar item as a reminder and to block time for reviews.
What is the expected turnaround time for a review?
In reality, it depends on the size of the patch or pull request, as well as the number of reviewers who know enough to review it. We use data analytics to measure our review turnaround time in OpenStack. Over the last 30 days or so, our review throughput looked like this:
- Total reviews: 2102 (70.1 per day)
- Total reviewers: 113 (0.6 per reviewer per day)
- Total reviews by the core team: 1570 (52.3 per day)
- Core team size: 21 (2.5 per core reviewer per day)
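The per-day rates imply a roughly 30-day measurement window, which a quick arithmetic check confirms:

```python
# Verify the per-day review rates against a 30-day window.
days = 30
total_reviews = 2102
reviewers = 113
core_reviews = 1570
core_team = 21

print(round(total_reviews / days, 1))              # 70.1 per day
print(round(total_reviews / reviewers / days, 1))  # 0.6 per reviewer per day
print(round(core_reviews / days, 1))               # 52.3 per day
print(round(core_reviews / core_team / days, 1))   # 2.5 per core reviewer per day
```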
Honestly, we have one reviewer (Andreas Jaeger) who is superhuman, and our active core team size is more like 15 than 21. In OpenStack, core reviewers are the ones who can publish to the site; anyone can review a doc patch. I aim to do at least 2-10 reviews a day, and in a good review week I can get through about 60. So the expectation I've set for our doc contributors is about 3 to 5 days to get feedback from reviewers that they can respond to.
Writers may be reluctant to share deliverables with others, and developers may feel they don't have much to contribute to documentation. I'd argue that no one can know everything, so distribute the workload by writing together, just as you collaborate on code. I hope these tips help you map your docs workflow onto git and GitHub workflows so you can scale up your collaborative writing, especially in open source.