Upcoming New Developer Documentation Platform - Replacing Wiki

I’m a bit confused, I thought we were talking about documentation in general and not just release notes?
If this is only about release notes, I don’t think pulling in the whole doc repo is a good idea.
We could do what some other projects do and have concise, text-only release notes that are automatically bundled when creating a tag for a release.
(These can also be generated automatically with either a whitelist or a blacklist.)

Then that can act as a good checklist to see if all relevant changes have been added to the more fancy release notes in the docs.
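To illustrate, here is a minimal sketch of what such a generator could look like (the prefixes and tag names are only placeholders, not a proposal for the actual lists):

```python
# Hypothetical sketch: collect commit subjects between two tags and keep only
# the ones that match an allow list of prefixes and are not on a deny list.
import subprocess

ALLOW_PREFIXES = ("Geometry Nodes:", "Animation:", "Sculpt:")  # placeholder list
DENY_PREFIXES = ("Cleanup:", "Merge")                          # placeholder list

def release_notes(prev_tag: str, new_tag: str) -> str:
    subjects = subprocess.run(
        ["git", "log", "--format=%s", f"{prev_tag}..{new_tag}"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    kept = [
        s for s in subjects
        if s.startswith(ALLOW_PREFIXES) and not s.startswith(DENY_PREFIXES)
    ]
    return "\n".join(f"- {s}" for s in kept)

if __name__ == "__main__":
    print(release_notes("v4.0", "v4.1"))  # placeholder tag names
```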

While it is a bit more work to create pull requests in other repos, I don’t think it is that much to ask, especially now that forking and creating PRs are very easy with Gitea. If we were still on Phabricator I would agree, but now I don’t really see this as an issue.

On the other hand, I don’t think it is realistic for us to force people to clone a huge repo when most of the time they only want to work on a subset of it.

Especially the tests and the manual can balloon in size quite quickly. Currently the code repo is around 1.3 GB. If we add in the manual and test repo, I can see it easily taking up tens of gigabytes.

I don’t think it is realistic to ask someone to clone tens of gigabytes that are useless to them just because we wanted to save a small subset of people a few clicks when forking and merging.

We are talking about the developer docs, what you can see on Blender Developer Documentation. Not the user manual.

In my experience those things are unreadable for end users unless someone edits them, especially for a project the size of Blender. Another problem is that we want users to be able to see these things when testing features before the release is out.

Really disagree with this, juggling that many branches and pull requests is not easy. Even for me, as someone who is experienced with this kind of thing, it’s a bunch of work.

The plan is to host the images and videos outside the repo. That means it will be about 4 MB of markdown files, that compress down to 1.2 MB. A fresh Blender code repo clone is 719 MB.

I’d prefer a single repo that contains the code as well as the developer docs. The main reason for me is that it would feel much worse to change code but not the corresponding docs, compared to when the docs are in a separate repo where I have to commit them separately anyway. This should make it more likely that I keep my docs up to date.

I’m not concerned about the size of the repo if drive.blender.org becomes a thing.

I’m not concerned about the number of commits. I don’t think it would significantly change how long it takes to scan the commit log, assuming the commit titles follow the existing guidelines. One thing I wonder is whether commits that only change docs should have Docs: in the commit title. Currently, I think that titles like Cleanup: Docs: typo, Geometry Nodes: Docs: clarify lazy evaluation, Animation: Docs: describe new animation system would be good. Using Docs: Geometry Nodes: ... could work too, but I think the other way around is better because it allows people to filter by topic more easily.
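As a toy illustration of the filtering argument (the titles are just examples): with the topic first, a plain prefix match picks up both code and docs commits for a module.

```python
# Toy example: topic-first titles let a simple prefix filter collect all
# commits for one module, whether they touch code or docs.
subjects = [
    "Geometry Nodes: Docs: clarify lazy evaluation",
    "Geometry Nodes: speed up field evaluation",
    "Docs: Geometry Nodes: fix typo",  # topic no longer at the start
]
geometry_nodes = [s for s in subjects if s.startswith("Geometry Nodes:")]
print(geometry_nodes)  # only the first two entries
```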

Overall, it feels much easier to solve the issues with a mono-repo than a multi-repo with some extra tooling and documentation.


I have two questions:

  • How do we handle proposals that we used to put on the wiki sometimes? I assume we shouldn’t do this anymore and should put them on devtalk or in a Gitea issue instead?

  • Is it feasible to put the docs even closer to the code, i.e. interleave it with the code or code files? My main motivation here is to deduplicate documentation in some cases. For example, lazy_function_graph_executor.cc has a big comment block at the top that could (with some changes) also be part of the developer docs, where it could also use images etc. However, it would be somewhat annoying to keep both places in sync. It would still be ok to have all the docs in a separate folder, but then it would be useful if we could reference those docs from the code. Not sure what kind of link we should use for that. Ideally, the link would be something that you can ctrl-click on in an editor to get to the docs.


These are the proposals that got migrated; all of them were written by either the geometry nodes team or me.

I don’t feel strongly one way or the other. I’d be fine using Gitea issues like other developers are doing. Hopefully we can make it a habit of updating them and moving them into the developer docs once the implementation lands, which should at least be a bit easier as both are markdown.

It’s a good question. I guess doxygen could also be involved, and that has some markdown support. But I don’t immediately know how it would all fit together, maybe someone can investigate what’s possible or practical here.

But this also means that you would still have to juggle files from different sources (which is what you wanted to avoid in the first place).
It also means that if a user clones the doc repo, they will not have all the data needed to properly view the docs. Which to me is very bad.

I get that this is a workaround to cut down on the repo sizes, but I don’t think this is a good workflow.
Specifically because it makes the availability of the files uncertain. (Users can take down or edit their contributions without any sign of it in the commit logs.)

It also seems very counterintuitive, as your goal was to make the process easy. By splitting out the file hosting, we now require users to upload the files separately instead of just including them in the commit for easy testing and review.

I don’t see why we couldn’t use git-lfs for this as we are doing for the manual and some of our other repos as well.

All of this feels to me like we are trying to make concessions that will make everything more complex to try to justify creating a big and bloated repo.

If the goal is to move the dev docs closer to the code, then perhaps actually having it in the code files as Jacques suggested would be better? Especially because that is probably where most people will look in the first place. (And it makes it a bit easier to keep them up to date.)


The downside of drive.blender.org would be when you don’t have an internet connection, which is no different than wiki.blender.org now.

But in other cases the files should be always available. The images would show when reviewing. There is no juggling of files. You upload the file to drive.blender.org, put the link in markdown and from then on it is just there for everyone.

Users should not be taking down files. Maybe we allow it in the first 24 hours to fix mistakes, and after that only an admin can do it? Uploaded files would be immutable, with a unique URL based on the hash. It should not be possible to change them without a sign in the commit logs.
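As a rough sketch of what a hash-based, immutable URL could look like (the path layout and file name are placeholders, not a decided scheme):

```python
# Minimal sketch of content-addressed upload URLs: the file's SHA-256 hash
# becomes part of its URL, so changed content always gets a new URL and
# existing links keep pointing at the original bytes.
import hashlib
from pathlib import Path

def content_url(path: str) -> str:
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return f"https://drive.blender.org/files/{digest}{Path(path).suffix}"

print(content_url("node_tree_overview.png"))  # placeholder file name
```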

The developer docs consist of the developer handbook, code documentation and release notes. If there is a convenient way to interleave the code documentation part with the code then we could do it. The handbook and release notes would still be in some specific directory.

Hi,
I have some questions and concerns about merging the code and documentation repositories.

  1. What happens with the various credit generators that we have (for the credits page, top committers posts, etc.)? Would these include documentation commits, or would we want to filter them out? If we include them, we may want to include translations and user manual commits too, for fairness?

We already see some muscle memory issues with the regular stabilizing workflow (people forget to commit fixes to the branch, forget to merge to main…); adding the same for release notes is something I’d rather avoid. For the docs I’d prefer one main branch and not worry about release branches or merges.

+1

Contributing to documentation should be as easy as possible. We already have the situation that we cannot give isolated rights for the buildbot without granting full commit access; making the same true for documentation contributions seems like a step back.

All in all, I would prefer to complete the migration first, say goodbye to the wiki and then give it a few weeks/months for the new manual to solidify. Merging the repositories can always happen later if this is requested; untangling it later will be more difficult.


Did not do a deep check, but overall both ‘handbook’ and ‘feature’ sections look nice to me.

  • Handbook layout looks good to me; I would not try to make the top sections list shorter, as I think that would lead to confusing content.
    • For the ‘Translating’ section of the handbook, I would put the language-specific page(s) into a dedicated sub-section. Right now we only have a French one, but I can imagine other teams following these steps. Maybe call it Language Teams?
  • Regarding GSoC, my first reaction was to put it in the ‘contributing code’ section of the handbook, but indeed it does not fit there. While it is a bit weird to give it so much emphasis, I do not really see any other place where it would belong, so I would vote to keep it as-is for now.

I would also like to see the technical documentation closer to the code, so I think trying to merge it into the main code repository would be nice to try, despite all the listed challenges.


I do not really understand the concerns about commit noise. I would not expect that many commits only affecting MD files, and I’d rather see the change in code and the related changes to the docs happen in a single commit. Similarly, credits do not seem to be an issue either; we can maybe filter out typo fixes, but again, I would expect most commits to the docs to be considered on the same level as commits to code.

And the fact that having everything in the same repo would make the review process way more streamlined is a big plus for me (just like we can already ask PR authors to add tests, it would become trivial to ask them to edit the relevant docs and/or release notes).


Am a bit more dubious about the idea of moving the ‘features’ content directly into code files, for several reasons:

  • It would make it very hard to tell apart commits that need a Blender rebuild vs. a doc rebuild.
  • It would add a significant amount of ‘used screen space’ in code files, which could become annoying (I already find some of our long, detailed ‘architectural comments’ in some files annoying, tbh).

I think however that it would be nice to add links to relevant MD doc files in source code comments. Hopefully some IDEs can even detect these and allow direct opening of the doc file.
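As a sketch of how such links could be kept from going stale (the doc: marker and directory layout are made up for this example), a small script could scan source comments for doc references and report missing targets:

```python
# Hypothetical sketch: scan source files for "doc:" references and verify
# that the referenced markdown files exist. The marker and paths are
# assumptions for illustration, not an agreed-upon format.
import re
from pathlib import Path

DOC_ROOT = Path("docs/developer")            # placeholder docs location
LINK_RE = re.compile(r"doc:([\w/.-]+\.md)")  # e.g. "doc:features/geometry_nodes.md"

def broken_doc_links(source_root: str) -> list[str]:
    broken = []
    for src in Path(source_root).rglob("*.cc"):
        for match in LINK_RE.finditer(src.read_text(errors="ignore")):
            target = DOC_ROOT / match.group(1)
            if not target.exists():
                broken.append(f"{src}: missing {target}")
    return broken

if __name__ == "__main__":
    for line in broken_doc_links("source/blender"):
        print(line)
```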

An alternative idea could be to put the release notes in the Blender manual repository. Reduce the number of required pull requests that way, while encouraging developers to update the manual immediately. Not sure it’s right, and raises its own set of questions, just thinking out loud.

Yes, that is the plan anyway.

The manual has release branches as well though, which results in the same sort of issues. Every modification or tweak (which can also happen after releases are done) would need to be merged/backported between branches. I really want to avoid that.

By just looking at the name “drive.blender.org”, I get the impression that it is intended to be similar to “Google Drive”. That is, users can upload, share, edit and delete their files with other people.
Introducing restrictions on this just to make it work for hosting doc files seems a bit weird to me.
(It will also mean that we have another place we have to police for spam and malicious actors.)

However, let’s stay on topic and just say that it is called doc_file_host.blender.org and that all files uploaded are strictly for the dev docs.
For the current wiki, I have already wasted quite a lot of disk space because I had to upload multiple files as I made mistakes.

Even with a 24-hour limit, people will waste quite a bit of disk space if they have to upload additional revisions of media without being able to delete or edit their previous files, especially if this is intended to be used during the review process of a PR.

It seems to me like we are imposing restrictions like these to try to mimic version control. Why not skip all the complexity and caveats by just biting the bullet and having the media files version controlled as well?

If we do that, then:

  1. People don’t have to manually upload any files. They also don’t have to change anything between local tests and the final merge.
  2. No risk that what was accepted and merged goes out of sync with what is actually displayed to the end user (both online and offline). This also means that preservation and backup are much easier, as people will have all the files they need when cloning the repo. They won’t need to crawl and download the images by scraping the website to get a local copy.
  3. We don’t deviate from how we host files compared to some of our other repos (manual, studio-pipeline, etc.), making it easier for people to jump between repos without having to keep in mind any special caveats for each repo.

I also think that it will be easier for outsiders coming from GitHub, GitLab or other Gitea instances, as in my experience git-lfs is how quite a few projects host their media files for documentation or manuals nowadays.

Some of the things you mention are non-issues already addressed before I think, but I don’t want to make this discussion too long.

Consider what happens when we use Git LFS for the docs.

  • Adding Git LFS to an existing repository without LFS causes all kinds of problems, which means we can’t do it for the Blender code repo. Therefore developer docs will have to be outside the Blender code repo, and we don’t get to simplify the workflow of updating them.
  • If we want contributors to update the docs, they have to download a repo that is likely to grow to many gigabytes in size. Even then there will be limitations on using large video and blend files to avoid the developer docs growing even bigger than that.

If other developers think adding those restrictions is fine, then we could choose to use Git LFS. To me, the idea of drive.blender.org is about lifting those restrictions.

And it would actually be nice to do the same for the user manual, in my opinion.

If the current workflow takes too much time and overhead to manage, then the above seems to simply create a different form of time and overhead. One can imagine that after several months, the drive will be filled with images/etc. that aren’t used at all. Once that number approaches several hundred images, it will be impossible for an admin to act as the janitor.

Several hundred images is nothing, we can have that amount added every day and not worry about it.

I am not exactly sure why a potential drive.blender.org needs to be public for everyone with a Blender ID. If we limit uploads to trusted people (committers, documentation writers…), we keep all the benefits of a small git repo while also having a trusted media platform to upload images, blend files and videos.

We want anyone to be able to contribute changes to docs, not just trusted people.

People will have to either have commit access or go via a PR anyway to contribute to the docs, so there is already a form of access restriction. I’d rather give people permissions on the drive than moderate unwanted content. But that’s something we can still decide later. I just wanted to raise the point that it doesn’t necessarily need to be public in order for this to work.

I’ll also stop now, as I think we have gone through the pros and cons in quite a bit of detail.
There are just a few things I want to respond to.

To me it seems like you want to simplify things, but to achieve that simplification you introduce another complex step, so in the end it isn’t much simpler.
If you require people to use a file host, then I personally think it is not much different from having a separate repo. But I guess this is the point that we disagree on.

We probably want to impose restrictions regardless of whether we go with a separate file host or not, no?
For example, we probably don’t want our videos to be 120 fps HDR 4K videos encoded in a lossless format. It is probably easier to detect when people are uploading huge files if we host them in the repo as well.

If you want to simply clone the repo without any of the binary files, that is possible with git-lfs as well. By default, git-lfs also only downloads the binary files for the current checkout.
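For example, a minimal sketch of such a partial clone (the repository URL is a placeholder):

```python
# Sketch: clone a repo that uses Git LFS without downloading the large files.
# GIT_LFS_SKIP_SMUDGE=1 tells the LFS smudge filter to leave pointer files in
# the working tree; the actual media can be fetched later with `git lfs pull`.
import os
import subprocess

env = dict(os.environ, GIT_LFS_SKIP_SMUDGE="1")
subprocess.run(
    ["git", "clone", "https://example.org/developer-docs.git"],  # placeholder URL
    env=env,
    check=True,
)
```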

Eeehhhh. I think that would be a regression in the current workflow. Now you are forcing people to keep track of files in two different places rather than just one, which is what you wanted to avoid in the first place.

My experience at a big tech company is that the idea of keeping developer docs in the same directories and source control system doesn’t work out that well in practice (at least in the part of the company that I can see). In practice, what happens is this:

  • Detailed usage documentation goes into large comments in the .h file for the API in question. Pretty much everything you need to understand the capabilities, restrictions, and performance of API functions is documented in a very large comment block there.
  • “Design Documents” are usually just Google Docs. They represent a point in time about the design, and not necessarily the current state of the implemented software.

This is not ideal of course, and the workflow that Brecht et al. are hoping for seems better. We have the capability (and even official encouragement) to follow something like that workflow, and yet we don’t. Why? Here are a couple of my thoughts:

The main point is that “design” goes through many stages:

  • Idea of the functionality wanted and one or more ideas of the technical approach to be used.
  • Choices about the API and/or user interface.
  • Initial choices about data structures and some primitive helper functionality to be implemented.
  • Starting to implement. Discovering that some algorithm development is needed, and working out the math for that.
  • Discovering bugs, functionality lacks, performance problems in the initial code. Iterate on some of the previous stages until a satisfactory MVP is achieved.
  • Release. Get feedback and bug reports. Iterate on some of the previous stages again.

At what point does one have “developer docs” that are ready for committing to a repository? You could say “all of them”, to fit the proposed workflow. But in my experience, it is hard to keep a living, well-organized document up to date while going through the above stages. It is also difficult to do that and keep a record of the alternatives tried that didn’t work out. It can be done, but it is a lot of work. What I personally do on large projects (like Bevel, or Exact Boolean) is to write a running Google doc that is kind of stream of consciousness - notes to myself, and sometimes to a close collaborator. I could do this in markdown and submit it with each commit to my private fork, but I would feel slightly embarrassed about doing so.

The other point is that something like a Google doc has much less friction for adding pictures and collaborating with others via side comment threads. I would encourage us here to try to find the most frictionless way possible to add pictures – ideally, a drag-and-drop or a simple “insert picture” dialog where the user just picks a .png file. (Even better: something like Google’s built-in Drawing insertion tool, for diagrams that need vector art with labels.)
