GSoC 2024: Sample Sound Node

Greetings! I am Lleu Yang, a GSoC 2024 student who will be working on the Sample Sound node.

This summer, I plan to add a Sample Sound node that reads audio from sound files and provides their frequency response over time for use in Geometry Nodes. The full proposal can be viewed here.

There is also a dedicated feedback thread. Please feel free to share your opinions on this project!

I will post regular progress reports here once the coding period has begun.

Community Bonding Period & Week 1

During this period, I:

  • Implemented the Sound socket (a prerequisite for all future work) and the Sound input node,
  • Created a pull request containing all work possibly related to this project,
  • Learned the basics of the Audaspace APIs by reading BKE_sound.c and Audaspace Device’s implementation, and
  • Discussed with my mentor how to implement the actual Sample Sound node.

Most of the work mentioned above was finished during the Community Bonding Period. I did not contribute much in the first week, since I was busy with school assignments. Thankfully, I have much more time available from now on.

Next week, I plan to:

  • Create a naïve Sample Sound node with access to the time domain, using aud::ReadDevice, and
  • Get familiar with aud::FFTPlan, and try to provide initial support (without worrying about caching for now) for getting frequency-domain information in the node.

Always remember to use ASan, unless you want to ruin your afternoon by accidentally overwriting your handles with 7,350 floating-point audio samples.

Week 2

This week, I implemented the naïve node, which can get the amplitude from a Sound, as shown below:


The project file is attached in a comment on PR #122228. I made the audio myself, and it is licensed under CC0.
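
As a rough illustration of what reading an amplitude from a block of samples can look like, here is a minimal sketch that computes the RMS level of a range of mono float samples. It does not use Audaspace. The name `rms_amplitude` and the choice of RMS (rather than, say, peak level) are assumptions made for this example and are not necessarily what the node does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Blender or Audaspace): root-mean-square
// level of the samples in [start, start + length), clamped to the buffer.
float rms_amplitude(const std::vector<float> &samples,
                    const std::size_t start,
                    const std::size_t length)
{
  if (start >= samples.size() || length == 0) {
    return 0.0f;  // Nothing to sample: treat as silence.
  }
  const std::size_t end = std::min(samples.size(), start + length);
  assert(end > start);
  double sum_of_squares = 0.0;
  for (std::size_t i = start; i < end; i++) {
    sum_of_squares += double(samples[i]) * double(samples[i]);
  }
  return float(std::sqrt(sum_of_squares / double(end - start)));
}
```

Calling `rms_amplitude(samples, first_sample_of_frame, samples_per_frame)` once per animation frame would give one amplitude value per frame.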

Known issues at the moment:

  • The current temporal smoothing method is simple, inflexible, and inefficient; it needs to be replaced by a faster, more widely accepted algorithm,
  • While easy to use, aud::FFTPlan has no external binding for Blender, which means that either the Audaspace code base must be modified or the build system needs some adjustment, and
  • There are lots of TODOs in the current code, which I expect to clean up over time.

Next week, I plan to:

  • Discuss with my mentor how to integrate aud::FFTPlan into Blender itself, and provide a basic ability to get frequency information from a Sound, and
  • Learn more about the properties of sound (intensity, pressure, etc.) and provide better temporal smoothing algorithms. The previous Sound Spectrum node should be a great reference.

The reason the previous smoothing algorithm didn’t work (when it should have), and how I fixed it:

-                      const bool frame_rate)
+                      const double frame_rate)

Uh-oh.
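
To show why that one-word type change matters, here is a small self-contained sketch. The function bodies are hypothetical (the post only shows the changed parameter line); the point is that C++ silently converts any non-zero `double` argument to `true` when the parameter is a `bool`, so the function sees a frame rate of 1.

```cpp
#include <cassert>

// Hypothetical helper: how many audio samples fall into one animation frame.
// Buggy version: the parameter type is `bool`, so any non-zero frame rate
// passed by the caller is implicitly converted to `true`, which behaves as 1.
double samples_per_frame_buggy(const double sample_rate, const bool frame_rate)
{
  return sample_rate / frame_rate;
}

// Fixed version: same body, only the parameter type differs.
double samples_per_frame_fixed(const double sample_rate, const double frame_rate)
{
  assert(frame_rate > 0.0);
  return sample_rate / frame_rate;
}
```

With a 48000 Hz sound at 24 fps, the buggy version returns 48000 samples per frame and the fixed one returns 2000. Compilers typically accept the buggy call without complaint unless conversion warnings are enabled.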

Week 3

This week, I:

  • Added basic FFT functionality to the node, and
  • Implemented a very simple (and incomplete) caching mechanism.

The project file is attached in a comment on my working PR. I made the packed audio myself, and it is licensed under CC0.

Known issues (i.e. FIXMEs):

  • The cache is not thread-safe for now. Although it is hard to make it crash in typical usage (such as what the video shows), it may be unable to handle heavy workloads.
  • Floating-point arithmetic error can cause the sample position (time) to jump back and forth between frames, producing visible jitter in the animation.
  • An assertion failure occurs when a new sound is specified while the current scene is still playing back.
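
For the floating-point jitter in particular, one possible mitigation (an assumption on my part, not necessarily the fix that ends up in the PR) is to derive the sample position directly from the integer frame number and round exactly once, instead of passing through an intermediate time in seconds. The helper name `frame_to_sample_index` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: index of the first sample that belongs to `frame`.
// The position is computed from the integer frame number in double precision
// and rounded once, so a given frame always maps to the same sample index.
std::int64_t frame_to_sample_index(const std::int64_t frame,
                                   const double sample_rate,
                                   const double frame_rate)
{
  assert(sample_rate > 0.0 && frame_rate > 0.0);
  return std::llround(double(frame) * sample_rate / frame_rate);
}
```

Because the result is an integer index, comparing positions between evaluations no longer depends on tiny differences in a floating-point time value.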

Next week, I plan to:

  • Support various window functions for FFT,
  • Support EMA (exponential moving average) for smoother attack/release, and
  • Add a Sound Info node for conveniently getting important information (e.g. channel count and sample rate).

Week 4

This week, I:

  • Added support for custom smoothness (sample length) in the node,
  • Added support for applying window functions when doing FFT, and
  • Rewrote the LRU cache using Blender’s BLI libraries.
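
As an example of what applying a window function before an FFT involves, here is a small sketch using the Hann window. Which windows the node actually offers is not stated here, so Hann is only a common representative, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Periodic Hann window of length n: w[i] = 0.5 * (1 - cos(2*pi*i / n)).
std::vector<float> hann_window(const std::size_t n)
{
  assert(n > 0);
  const double two_pi = 2.0 * std::acos(-1.0);
  std::vector<float> window(n);
  for (std::size_t i = 0; i < n; i++) {
    window[i] = float(0.5 * (1.0 - std::cos(two_pi * double(i) / double(n))));
  }
  return window;
}

// Multiply a block of samples by the window, element by element.
void apply_window(std::vector<float> &block, const std::vector<float> &window)
{
  assert(block.size() == window.size());
  for (std::size_t i = 0; i < block.size(); i++) {
    block[i] *= window[i];
  }
}
```

The windowed block is what would then be handed to the FFT; tapering the block edges toward zero reduces spectral leakage from the abrupt cut at the block boundaries.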

The project file is attached in a comment on my working PR. I made the packed audio myself, and it is licensed under CC0.

Known issues:

  • The overall amplitude (of FFT results) varies significantly with smoothness. This issue might be related to some properties of FFT itself and needs further research.
  • fftwf_execute() fails to plan when multiple layers containing this node are present.
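
One well-known FFT property that can produce this kind of amplitude variation (offered as a possibility, not as a diagnosis of the node) is that the raw magnitude of an unnormalized DFT bin grows with the number of samples in the transform. A sine of amplitude A that falls exactly on a bin yields a raw magnitude of A·N/2. The sketch below demonstrates this with a direct single-bin DFT; it uses no FFT library, and both helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Raw (unnormalized) magnitude of DFT bin `k` for a real signal, computed
// directly from the definition. O(n) per bin, which is fine for a demo.
double dft_bin_magnitude(const std::vector<double> &signal, const std::size_t k)
{
  assert(!signal.empty());
  const double two_pi = 2.0 * std::acos(-1.0);
  const std::size_t n = signal.size();
  std::complex<double> sum(0.0, 0.0);
  for (std::size_t i = 0; i < n; i++) {
    const double angle = -two_pi * double(k) * double(i) / double(n);
    sum += signal[i] * std::complex<double>(std::cos(angle), std::sin(angle));
  }
  return std::abs(sum);
}

// Sine of the given amplitude completing exactly `cycles` periods in n samples.
std::vector<double> make_sine(const std::size_t n,
                              const std::size_t cycles,
                              const double amplitude)
{
  const double two_pi = 2.0 * std::acos(-1.0);
  std::vector<double> signal(n);
  for (std::size_t i = 0; i < n; i++) {
    signal[i] = amplitude * std::sin(two_pi * double(cycles) * double(i) / double(n));
  }
  return signal;
}
```

For a unit-amplitude sine, the raw magnitude is about 128 with 256 samples and about 512 with 1024 samples. Dividing by N/2 (or, when a window is applied, by half the sum of the window coefficients) makes results from different block lengths comparable.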

I will be inactive next week, since I have to prepare for the final exam at school (which ends on July 5). However, the next steps could be:

  • Carefully design the previously mentioned Sound Info node to get more than just sample rate and channel count from a Sound,
  • Try to eliminate some of the crash scenarios, and
  • Prepare for the mid-term evaluation of GSoC.

Week 5

As mentioned before, I have been preparing for the upcoming exam at my school, so no new code was written this week.

However, based on the last code review from my mentor, a proper caching strategy for FFT has been found. Once I finish the exam, I will immediately return to coding and implement the idea.

Week 6

This week, I redesigned the whole caching structure (throwing the LRU cache away and keeping it simple), following my mentor’s guidance. It allows users to choose the FFT size themselves, and is much more efficient than the previous solution. The difference between FFT sizes is shown below (no smoothing is applied):
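
As background on why the FFT size is worth exposing: for a given sample rate, it fixes both the frequency resolution and the length of audio each transform covers. The sketch below states these two standard relations in code; the helper names are hypothetical and not taken from the node.

```cpp
#include <cassert>
#include <cstddef>

// Width of one FFT bin in Hz: sample_rate / fft_size.
double bin_width_hz(const double sample_rate, const std::size_t fft_size)
{
  assert(sample_rate > 0.0 && fft_size > 0);
  return sample_rate / double(fft_size);
}

// Duration in seconds of the block of samples that one FFT covers.
double block_duration_seconds(const double sample_rate, const std::size_t fft_size)
{
  assert(sample_rate > 0.0 && fft_size > 0);
  return double(fft_size) / sample_rate;
}
```

At 48000 Hz, an FFT size of 1024 gives bins 46.875 Hz wide over roughly 21 ms of audio, while 4096 gives bins about 11.7 Hz wide over roughly 85 ms. Larger sizes resolve frequency more finely but respond more slowly to changes over time.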

The project file is attached in a comment on my working PR. I made the packed audio myself, and it is licensed under CC0.

Due to the exam, progress this week was still a bit slow. The node with this new caching structure currently lacks temporal smoothing support. Thankfully, the solution to the hardest part has become clear, and the remaining steps (new features and improvements) should be simple and straightforward.

Week 7

Glad that I passed the midterm evaluation! :wink:

This week, I mainly focused on fixing the issues raised in last week’s code review. The outcomes include a caching system that needs far fewer manual memory management operations, a simplified yet more intuitive user interface (still inspired by Sound Spectrum), some refactoring, and bug fixes.

The screenshot above shows the new design of the node, which replaces absolute frequency values with relative factors, adds units to parameters, and moves rarely used settings into the sidebar.

After the last code review, more of the design goals have been clarified. Hopefully I will finish the caching system and be able to continue adding new features in the coming week.

Week 8

This week, I:

  • Implemented a Sound Info node for getting important information from a Sound, and
  • Made the Sample Sound node able to sample a specific channel of a given Sound.
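
To illustrate what sampling a specific channel involves at the buffer level, here is a minimal sketch that pulls one channel out of interleaved audio data. The interleaved layout and the helper name `extract_channel` are assumptions for this example; the node's internal handling may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy one channel out of an interleaved buffer laid out
// as [L0, R0, L1, R1, ...] for stereo (frame by frame, channel by channel).
std::vector<float> extract_channel(const std::vector<float> &interleaved,
                                   const std::size_t channel_count,
                                   const std::size_t channel)
{
  assert(channel_count > 0 && channel < channel_count);
  assert(interleaved.size() % channel_count == 0);
  std::vector<float> result;
  result.reserve(interleaved.size() / channel_count);
  // Start at the requested channel and step over one full frame each time.
  for (std::size_t i = channel; i < interleaved.size(); i += channel_count) {
    result.push_back(interleaved[i]);
  }
  return result;
}
```

For a stereo buffer, `extract_channel(buffer, 2, 0)` would return the left channel and `extract_channel(buffer, 2, 1)` the right.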

One of the possible use cases of the node is shown below:

Note that after last week’s discussion with my mentor, we’re back to using absolute frequency, combined with a Map Range node to achieve an experience similar to relative frequency while keeping the node’s parameters as obvious as possible. This technique will later be included in the user manual.
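
The arithmetic behind that technique is a plain linear remap. Here is a minimal sketch, assuming a factor in [0, 1] is mapped onto an illustrative 20 Hz to 20000 Hz range; the range and the function name `map_range_linear` are hypothetical and not taken from the node.

```cpp
#include <cassert>

// Linear remap of `value` from [from_min, from_max] to [to_min, to_max],
// without clamping.
float map_range_linear(const float value,
                       const float from_min,
                       const float from_max,
                       const float to_min,
                       const float to_max)
{
  assert(from_max != from_min);
  const float factor = (value - from_min) / (from_max - from_min);
  return to_min + factor * (to_max - to_min);
}
```

With these example bounds, a factor of 0.5 maps to 10010 Hz, which would then be fed to the node as an absolute frequency.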

As of now, nearly all primary deliverables of the node have been finished. The next steps will be design refinement, bug fixes, optimization, and documentation.
