GSoC 2024: Sample Sound Node

Greetings! I am Lleu Yang, a GSoC 2024 student working on the Sample Sound node.

This summer, I plan to add a Sample Sound node that retrieves audio from sound files and provides its frequency response over time for use in Geometry Nodes. The full proposal can be viewed here.

There is also a dedicated feedback thread. Please feel free to share your opinions on this project!

I will post regular progress reports here once the coding period begins.

Community Bonding Period & Week 1

During this period, I:

  • Implemented the Sound socket (a prerequisite for all future work) and the Sound input node,
  • Created a pull request that will contain all work related to this project,
  • Learned the basic usage of the Audaspace APIs by reading BKE_sound.c and the implementation of Audaspace's Device, and
  • Discussed with my mentor how to implement the actual Sample Sound node.

Most of the work mentioned above was finished during the Community Bonding Period. I did not contribute much in the first week, since I was busy with school assignments. Thankfully, I am much more available from now on.

Next week, I plan to:

  • Create a naïve Sample Sound node with access to the time domain, using aud::ReadDevice, and
  • Get familiar with aud::FFTPlan, and try to provide initial support (without worrying about caching for now) for getting frequency-domain information in the node.

Always remember to use ASan, unless you want to ruin your afternoon by accidentally overwriting your handles with 7,350 floating-point audio samples.

Week 2

This week, I implemented the naïve node, which can get the amplitude from a Sound.

The project file is attached in a comment on PR #122228. I made the audio myself and it is licensed under CC0.

Known issues at the moment:

  • The current temporal smoothing method is simple, inflexible, and inefficient. It needs to be replaced by a faster, more widely accepted algorithm,
  • While easy to use, aud::FFTPlan has no external binding for Blender, so either the Audaspace code base has to be modified or the build system has to be adjusted a little, and
  • The current code contains lots of TODOs, which I expect to clean up over time.

Next week, I plan to:

  • Discuss with my mentor how to integrate aud::FFTPlan into Blender itself, and provide a basic ability to get frequency information from a Sound, and
  • Learn more about the properties of sound (intensity, pressure, etc.) and provide better temporal smoothing algorithms. The previous Sound Spectrum node should be a great reference.

The reason the previous smoothing algorithm didn't work (when it should have), and how I fixed it:

-                      const bool frame_rate)
+                      const double frame_rate)

Uh-oh.
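
For anyone wondering why that one-word change matters, here is a minimal sketch of the effect. The helper names are hypothetical and are not the actual functions from the PR; the only thing taken from the diff above is the parameter type. A `double` passed to a `bool` parameter converts implicitly, so any non-zero frame rate becomes `true`, which then behaves as 1 in arithmetic.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not the code from the PR): seconds -> frame index.
// Buggy signature: a non-zero frame rate is implicitly converted to `true`,
// which is promoted to 1 in the multiplication below.
inline long frame_at_buggy(double seconds, const bool frame_rate)
{
  return std::lround(seconds * frame_rate);
}

// Fixed signature: the frame rate keeps its real value.
inline long frame_at_fixed(double seconds, const double frame_rate)
{
  assert(frame_rate > 0.0);
  return std::lround(seconds * frame_rate);
}
```

With a frame rate of 24, two seconds map to frame 2 in the buggy version and to frame 48 in the fixed one. Compilers usually accept the buggy call without complaint unless conversion warnings are turned on.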

Week 3

This week, I:

  • Added basic FFT functionality to the node, and
  • Implemented a very simple (and incomplete) caching mechanism.

The project file is attached in a comment on my working PR. I made the packed audio myself and it is licensed under CC0.

Known issues, i.e. FIXMEs:

  • The cache is not thread-safe for now. Although it is hard to make it crash in typical usage (as in the video), it may be unable to handle heavy workloads.
  • Floating-point arithmetic error can cause the sample position (time) to jump back and forth between frames, producing odd jitter in the animation.
  • An assertion failure occurs when a new sound is specified while the current scene is still playing back.
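
To illustrate the second issue, here is a minimal sketch of one way a frame/time round trip can land on the wrong frame. It is an assumption about the kind of error involved, not a diagnosis of the node's code, and all helper names are hypothetical. Converting a frame to seconds and back can yield a value a hair below the original frame number, so truncating may pick the previous frame while rounding to nearest does not.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helpers, not Blender or Audaspace API.
inline double frame_to_seconds(std::int64_t frame, double fps)
{
  assert(fps > 0.0);
  return static_cast<double>(frame) / fps;
}

// Truncating: a product such as 41.99999999999999 ends up on frame 41.
inline std::int64_t seconds_to_frame_floor(double seconds, double fps)
{
  return static_cast<std::int64_t>(std::floor(seconds * fps));
}

// Rounding to nearest absorbs errors far smaller than half a frame.
inline std::int64_t seconds_to_frame_round(double seconds, double fps)
{
  return static_cast<std::int64_t>(std::llround(seconds * fps));
}

// Counts the frames in [0, count) that do not survive the round trip.
inline std::int64_t count_round_trip_errors(std::int64_t count,
                                            double fps,
                                            std::int64_t (*to_frame)(double, double))
{
  std::int64_t errors = 0;
  for (std::int64_t frame = 0; frame < count; frame++) {
    if (to_frame(frame_to_seconds(frame, fps), fps) != frame) {
      errors++;
    }
  }
  return errors;
}
```

Calling `count_round_trip_errors` with `seconds_to_frame_floor` and with `seconds_to_frame_round` for a given frame rate shows how many frames each conversion gets wrong.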

In the next week, I plan to:

  • Support various window functions for FFT,
  • Support EMA for smoother attack/release, and
  • Add a Sound Info node for conveniently getting important information (e.g. channel count and sample rate).
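
The EMA item in the plan above can be sketched as follows. This is a generic exponential moving average with separate attack and release factors, not the implementation planned for the node; the struct, function, and parameter names are all hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical envelope follower: an exponential moving average that uses
// one smoothing factor while the input rises and another while it falls.
struct EnvelopeFollower {
  float attack;   // factor in (0, 1], applied when input > current value
  float release;  // factor in (0, 1], applied otherwise
  float value;

  EnvelopeFollower(float attack_factor, float release_factor)
      : attack(attack_factor), release(release_factor), value(0.0f)
  {
    assert(attack > 0.0f && attack <= 1.0f);
    assert(release > 0.0f && release <= 1.0f);
  }

  // Moves the current value a fraction of the way towards the input.
  float step(float input)
  {
    const float alpha = (input > value) ? attack : release;
    value += alpha * (input - value);
    return value;
  }
};

// Smooths a whole sequence of amplitudes, starting from zero.
inline std::vector<float> smooth(const std::vector<float> &input, float attack, float release)
{
  EnvelopeFollower follower(attack, release);
  std::vector<float> output;
  output.reserve(input.size());
  for (const float sample : input) {
    output.push_back(follower.step(sample));
  }
  return output;
}
```

A larger attack factor makes the value follow rising input quickly, while a smaller release factor makes it decay slowly.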

Week 4

This week, I:

  • Added support for custom smoothness (sample length) in the node,
  • Added support for applying window functions when doing FFT, and
  • Rewrote the LRU cache using Blender's BLI libraries.
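
As an illustration of the window functions mentioned above, here is a minimal sketch of a symmetric Hann window applied to a block of samples before a transform. It is a generic textbook form, not the code in the PR, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: symmetric Hann window, w[i] = 0.5 * (1 - cos(2*pi*i / (n - 1))).
inline std::vector<float> hann_window(std::size_t n)
{
  assert(n > 1);
  const double pi = 3.14159265358979323846;
  std::vector<float> window(n);
  for (std::size_t i = 0; i < n; i++) {
    window[i] = static_cast<float>(0.5 * (1.0 - std::cos(2.0 * pi * i / (n - 1))));
  }
  return window;
}

// Multiplies the samples by the window in place, tapering both ends to zero.
inline void apply_window(std::vector<float> &samples, const std::vector<float> &window)
{
  assert(samples.size() == window.size());
  for (std::size_t i = 0; i < samples.size(); i++) {
    samples[i] *= window[i];
  }
}
```

Tapering the block reduces spectral leakage caused by cutting the signal at arbitrary points, at the cost of lowering the overall amplitude of the result.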

The project file is attached in a comment on my working PR. I made the packed audio myself and it is licensed under CC0.

Known issues:

  • The overall amplitude of the FFT results varies significantly with smoothness. This issue might be related to properties of the FFT itself and needs further research.
  • fftwf_execute() fails to plan when multiple layers containing this node are present.
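
One property that may be relevant to the first issue: an unnormalized DFT, which is what FFTW computes, produces bin magnitudes that grow with the number of samples. The sketch below uses a naive single-bin DFT rather than FFTW, with hypothetical helper names, to show that a sine of fixed amplitude gives a raw magnitude of amplitude × N / 2, and that scaling by 2 / N recovers the amplitude regardless of length. Whether this explains the behaviour seen in the node is an assumption that still has to be checked.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

static const double kPi = 3.14159265358979323846;

// Hypothetical helper: a sine with a whole number of cycles in n samples.
inline std::vector<double> make_sine(std::size_t n, std::size_t cycles, double amplitude)
{
  std::vector<double> x(n);
  for (std::size_t i = 0; i < n; i++) {
    x[i] = amplitude * std::sin(2.0 * kPi * cycles * i / n);
  }
  return x;
}

// Magnitude of bin k of the unnormalized DFT, computed directly.
inline double dft_bin_magnitude(const std::vector<double> &x, std::size_t k)
{
  assert(!x.empty());
  double re = 0.0;
  double im = 0.0;
  for (std::size_t i = 0; i < x.size(); i++) {
    const double angle = -2.0 * kPi * k * i / x.size();
    re += x[i] * std::cos(angle);
    im += x[i] * std::sin(angle);
  }
  return std::hypot(re, im);
}

// Scales the raw magnitude so that it no longer depends on the block length.
inline double normalized_amplitude(const std::vector<double> &x, std::size_t k)
{
  return 2.0 * dft_bin_magnitude(x, k) / x.size();
}
```

When a window function is applied, dividing by the sum of the window coefficients instead of N plays the same role.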

I will be inactive next week, since I have to prepare for my final exam at school (which ends on July 5). However, the next steps could be:

  • Carefully design the previously mentioned Sound Info node to get more than just the sample rate and channel count from a Sound,
  • Try to eliminate some of the crash scenarios, and
  • Prepare for the mid-term evaluation of GSoC.

Week 5

As mentioned before, I have been preparing for the upcoming exam at school, so no new code was written this week.

However, based on the last code review from my mentor, a proper caching strategy for FFT has been found. Once I finish the exam, I will return to coding right away and implement the idea.
