GSOC 2025 Draft - Pitch Correction for Sound Playback in Sequencer

NOTE: This is currently a draft and subject to change. It will probably need many changes, especially to the schedule and the technical details of the project.

Feel free to leave any feedback or questions in the thread below or on the Google Doc here, and I’ll do my best to answer!

Project Title:

Pitch Correction for Sound Playback in Sequencer

Name:

Kacey La

Contact:

Email: [email protected]

Blender: TheKaceFiles

Synopsis:

Blender comes with a built-in video sequence editor (VSE) that allows users to do basic to intermediate video editing tasks. While the editor supports retiming video and audio through retiming keys, one feature that is missing for audio is the ability to preserve the original pitch when the audio is sped up or slowed down.

Thus, this project will focus on adding a toggle option to preserve pitch in the speed intervals between the retiming keys.

Benefits:

Pitch correction is important in video editing software because it lets users change the timing and duration of audio clips while retaining the natural quality of voices, music, and other sound effects. Integrating pitch correction into Blender’s VSE makes Blender a stronger open-source alternative to paid video editing software. It also fits the workflow of users who already rely on Blender’s VSE: it eliminates the need to adjust audio pitch in an external program and lets them stay within Blender’s packaged 3D modeling and video editing suite.

Deliverables:

  1. Investigation document - explore research papers, voice and music
  2. Isolated implementation of pitch-preserving algorithm outside of Blender
  3. Integration of pitch preserving algorithm in Blender
  4. End-user documentation
  5. Super Stretch Goal - Start a framework for a pitch shifter, which will allow users to shift the audio up or down by a specified number of semitones

Project Details:

First, I will explore audio papers and 3rd party libraries that go into depth about pitch-shifting. I have so far compiled the following papers and libraries for approaching pitch shifting.

Papers:

  • See potential papers on Google doc

Libraries:

After exploring the research papers, 3rd party libraries, and how others have approached this problem, I will compile my findings into a document where, if possible, I will compare the benefits and tradeoffs of the different approaches and choose the one that best fits Blender’s needs. If we decide to implement the correction algorithm ourselves rather than use a library, time will be spent implementing it outside of the Blender codebase first. After receiving approval from my mentor and other developers, the algorithm will be integrated into Blender’s audio library, Audaspace, which is then used by Blender’s main codebase.

Blender Integration:

In Blender, retiming keys can be added with the shortcut I (Add Retiming Key) and are used to adjust the speed of strips. Repositioning these keys speeds up or slows down the audio, as indicated by the audio speed percentage. However, changing the playback speed also changes the pitch, which distorts the natural tonal quality of the audio.
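To put a number on that distortion, here is a small worked example. It assumes the retimed audio is simply played back faster or slower (plain resampling), in which case the pitch moves by 12·log2(speed) semitones. The function names are only illustrative and are not part of Blender or Audaspace.

```python
import math


def pitch_shift_semitones(speed: float) -> float:
    """Pitch change, in semitones, caused by plain playback at `speed`.

    `speed` is the playback factor (1.0 = 100%, 2.0 = 200%, ...).
    Assumes simple resampling with no pitch correction.
    """
    if speed <= 0.0:
        raise ValueError("speed must be positive")
    return 12.0 * math.log2(speed)


def speed_for_semitones(semitones: float) -> float:
    """Inverse relation: playback factor that shifts pitch by `semitones`."""
    return 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    for percent in (50, 100, 150, 200):
        shift = pitch_shift_semitones(percent / 100.0)
        print(f"{percent}% speed -> {shift:+.2f} semitones")
```

A strip retimed to 200% therefore sounds a full octave higher, and one at 150% sounds roughly seven semitones higher. A Preserve Pitch option would need to cancel exactly this shift.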

I propose adding a Preserve Pitch toggle option under the Sound tab for each audio strip instance (including the ones that are created by the “Split Strip” operation) as demonstrated below.

The pitch correction algorithm itself will be implemented in Blender’s high-level audio library, Audaspace, where it will need to handle any audio speed percentage. It will then be exposed as a binding somewhere in extern/audaspace/bindings and wrapped by a function in blenkernel (sound.cpp) for use by the video sequencer code. Further details will be worked out with my mentor and other developers.
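As a rough illustration of what "preserving pitch at any speed" involves, below is a minimal sketch of plain overlap-add (OLA) time stretching in pure Python. This is a stand-in technique chosen for brevity, not the algorithm this project will settle on: plain OLA produces audible phase artifacts on tonal material, which is why refinements such as WSOLA or the phase vocoder exist. The function names, frame size, and hop size are all assumptions made for this example.

```python
import math
from typing import List


def hann(n: int) -> List[float]:
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def ola_time_stretch(samples: List[float], speed: float,
                     frame: int = 1024, hop_out: int = 256) -> List[float]:
    """Change duration by 1/speed while keeping each frame's content as-is.

    Frames are read every `hop_out * speed` input samples and written
    every `hop_out` output samples, so the audio gets shorter or longer
    without being resampled (and therefore without the pitch moving).
    """
    if speed <= 0.0:
        raise ValueError("speed must be positive")
    if len(samples) < frame:
        raise ValueError("input must contain at least one full frame")

    window = hann(frame)
    hop_in = hop_out * speed  # analysis hop, may be fractional
    n_frames = int((len(samples) - frame) / hop_in) + 1
    out_len = (n_frames - 1) * hop_out + frame

    out = [0.0] * out_len
    norm = [0.0] * out_len  # summed window weight per output sample

    for k in range(n_frames):
        src = int(round(k * hop_in))  # where this frame starts in the input
        dst = k * hop_out             # where it lands in the output
        for i in range(frame):
            w = window[i]
            out[dst + i] += samples[src + i] * w
            norm[dst + i] += w

    # Divide by the summed window weight so overlapping frames add to unity gain.
    return [o / n if n > 1e-8 else 0.0 for o, n in zip(out, norm)]
```

For a strip retimed to 200%, `ola_time_stretch(samples, 2.0)` returns roughly half as many samples as the input, which can then be played at the original sample rate. A real implementation inside Audaspace would also have to work on streamed buffers and handle speed changes between retiming keys, which this sketch does not attempt.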

As for the UI, this will require defining an RNA property in the function rna_def_sound() in rna_sequencer.cc, which will correspond to a newly defined DNA flag in DNA_sequence_types.h. Additionally, the UI for the Preserve Pitch toggle will need to be added to the draw() method of the class SEQUENCER_PT_adjust_sound in space_sequencer.py. By default, the Preserve Pitch option will be turned on. While the option is on, the strip’s audio will be passed through the pitch correction function in blenkernel. If the toggle is off, the audio will play as it currently does in Blender.
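The toggle behaviour described above can be sketched in a few lines of plain Python. This only models the control flow; it is not Blender code. The flag name `PRESERVE_PITCH`, the `SoundStrip` class, and the two callables are hypothetical placeholders for the DNA flag, the strip data, and the blenkernel functions that would actually be involved.

```python
from dataclasses import dataclass
from enum import IntFlag
from typing import Callable, List


class SoundStripFlag(IntFlag):
    """Hypothetical stand-in for a DNA flag bit; not Blender's real name."""
    NONE = 0
    PRESERVE_PITCH = 1 << 0


@dataclass
class SoundStrip:
    """Hypothetical, heavily simplified model of a sound strip."""
    name: str
    # The proposal has the option turned on by default.
    flag: SoundStripFlag = SoundStripFlag.PRESERVE_PITCH


# A retimer takes (samples, speed) and returns the retimed samples.
Retimer = Callable[[List[float], float], List[float]]


def retime_strip_audio(strip: SoundStrip, samples: List[float], speed: float,
                       plain: Retimer, pitch_preserving: Retimer) -> List[float]:
    """Pick the playback path based on the strip's Preserve Pitch flag."""
    if strip.flag & SoundStripFlag.PRESERVE_PITCH:
        return pitch_preserving(samples, speed)
    # Toggle off: fall back to the existing behaviour, pitch change included.
    return plain(samples, speed)
```

Keeping the decision in one small function like this means the existing playback path stays untouched when the toggle is off, which should make regressions easier to rule out.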

Project Schedule:

This is a large project (350 hours) with a predicted completion time frame of 17-18 weeks. I expect to be working part-time over the summer and to commit at least ~20 hours per week to this project after my final semester ends on May 17th. I will probably be on vacation for one week during the summer. Regardless of whether the project is accepted to GSoC, I still intend to explore different approaches to pitch correction over the summer and refresh my digital signal processing knowledge.

  • Week 1-4

    • Read and explore different approaches; look at research papers, 3rd party libraries, or what others have implemented
    • Create an investigation document listing benefits and tradeoffs for each approach (Deliverable #1)
    • Research Blender’s codebase further and solidify details with mentor
  • Week 5-7

    • Continue experimenting with implementations of pitch correction algorithms
    • Finalize an isolated implementation of pitch correction in either Python or C++ (Deliverable #2)
    • If we’re using a 3rd party library, these weeks can become additional buffer weeks or be spent integrating the external library into Blender
  • Week 7-9

    • Begin implementing pitch correction algorithm into Audaspace

    • Add bindings to extern/audaspace/bindings and integrate into blenkernel

    • Ask for feedback and fix any issues from the community

  • Week 10-13

    • Implement UI changes for pitch correction toggle

    • Finish integrating pitch correction toggle functionality (Deliverable #3)

    • Continue to ask for feedback and fix any issues from the community relating to pitch correction functionality

  • Week 14-16

    • Prepare for final submission

    • Make sure the code and functionality are well-optimized and memory-efficient (i.e., no major bottlenecks or delays when audio is previewed with Preserve Pitch enabled)

    • Clean up code and add test cases (if needed)

    • Finalize user and developer documentation (Deliverable #4)

  • Week 17-18

    • Buffer weeks: start thinking about the pitch shifter stretch goal if enough time remains, or fix other bugs in the VSE

Bio:

My name is Kacey, and I’m currently a senior at a small college called Ursinus College studying Computer Science and Math, with an interest in computer graphics, game development, and a bit of digital signal processing. I am a Blender beginner: I have used Blender’s VSE to render a small video, and I have intermediate experience with Python and C++. In my free time, I’m a hobbyist game developer. I frequently contribute to a roguelike library called RogueEssence/PMDC, help modders troubleshoot bugs, and write developer guides in the wiki.

I took a course called Digital Music Processing in 2023, which gave a broad overview of how to represent, analyze, and morph/transform digital musical audio. Topics covered included the Fourier Transform, beat tracking, spectrograms, and autocorrelation, and we implemented a few audio algorithms from papers such as Let it Bee - Towards NMF-inspired Audio Mosaicing.

I’ve been meaning to contribute to a large open-source project for a long time. Almost all the tools that I love and use frequently are open source (Obsidian, Typst, Godot, and of course, RogueEssence!), and I feel that Blender best fits my interests and skills. I have lots to learn, but I hope I can become a lifelong contributor to a project such as Blender and deepen my understanding of its codebase.

My contributions so far (note that one PR has not yet been reviewed as of 3/30/2025):

#136348: VSE: Fix delete retiming key and description

#133747: Improved UI for File Output node panel

Lastly, something very minor: I reported a Blender issue on Mac where the playhead was inconsistently resetting back to the beginning, as well as a small UI issue with the retiming keys.
