BirdCLEF 2022, My Most Voted Golden Medal Kaggle Notebook đŸ„‡, Noise Reduction

by Hasan Basri Akçay | DataDrivenInvestor | Aug 2022


This competition was about identifying bird species by sound to protect endangered Hawaiian birds.

Hawaii has lost some of its bird species, and the consequences could harm entire food chains. For this reason, a Kaggle competition was hosted. Because physical monitoring is complex, scientists have turned to sound recordings to study how native birds react to environmental changes and conservation efforts.

The task was to use machine learning to identify bird species by sound. Within the scope of the competition, I did Exploratory Data Analysis (EDA) and created a noise reduction function.

The best notebooks from the competition could be used to help protect endangered Hawaiian birds. That is why this was one of my favorite competitions: it was about saving lives.

You can access the competition here [2].

In the competition, the organizers provided metadata datasets alongside the audio, which helps reduce memory usage. The train CSV file contains specific information about each recording, such as the type of sound, the primary_label, the path of the file, etc. The sound data are in the train_audio folder. You can access more information about the datasets here [2].

The training dataset has 13 columns and 14852 rows. The first five rows of the training dataset are shown below.
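As a minimal sketch (not the exact notebook code), the metadata can be loaded with pandas. The ../input/birdclef-2022/train_metadata.csv path and file name are assumptions based on the standard Kaggle layout:

```python
import pandas as pd

# Assumed location of the competition metadata; adjust to your environment.
train = pd.read_csv("../input/birdclef-2022/train_metadata.csv")

print(train.shape)  # expected: (14852, 13)
train.head()        # first five rows
```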

I used Deepnote to share the code. Deepnote is a well-designed, Jupyter-like, web-based notebook environment that supports multi-user development.

In the EDA part, I first checked the distributions of the data. The distributions show how useful each feature might be.

Note: The data cleaning and some of the data distribution parts were skipped to make the article more readable. You can access the complete code here [1].

The first distribution belongs to common_name. In the common_name feature, Northern Cardinal, Common Sandpiper, Mallard, Eurasian Skylark, Barn Owl, and House Sparrow are the most frequent species in the dataset, and they appear in roughly equal numbers.
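A quick way to reproduce this kind of plot — a sketch that assumes the train DataFrame from the snippet above:

```python
import matplotlib.pyplot as plt

# Top 20 most frequent species by common_name.
counts = train["common_name"].value_counts().head(20)
counts.plot(kind="barh", figsize=(8, 6), title="Most frequent common_name values")
plt.gca().invert_yaxis()  # largest bar on top
plt.tight_layout()
plt.show()
```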

The rating feature indicates the quality rating on Xeno-canto, which reflects both the recording quality and the number of background species, where 5.0 is the highest and 1.0 is the lowest; 0.0 means the recording has no user rating yet. Low-rated recordings can mislead the machine learning model, so this feature can be important for predictions.
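The rating distribution can be checked with a simple bar plot — again a sketch building on the same train DataFrame and matplotlib import:

```python
# Distribution of the Xeno-canto quality rating (0.0 = not yet rated).
train["rating"].value_counts().sort_index().plot(
    kind="bar", figsize=(8, 4), title="rating distribution"
)
plt.tight_layout()
plt.show()
```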

A recording can contain more than one bird species. The secondary_labels are the background species annotated by the recordist. According to the plot below, most rows contain only one bird species.

In addition to secondary_labels, the type feature can also hold more than one value. The type feature describes the sound type of the bird, such as flight call, call, song, etc. We can see in the plot below that many recordings have more than one type.
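Counting the entries of both columns is straightforward once the values are parsed — a sketch that assumes secondary_labels and type are stored as string representations of Python lists (e.g. "['call', 'song']"):

```python
import ast

# Number of background species and number of sound types per recording.
n_secondary = train["secondary_labels"].apply(lambda s: len(ast.literal_eval(s)))
n_types = train["type"].apply(lambda s: len(ast.literal_eval(s)))

# Total species per recording = 1 primary label + background species.
(n_secondary + 1).value_counts().sort_index().plot(kind="bar", title="Species per recording")
plt.show()

n_types.value_counts().sort_index().plot(kind="bar", title="Sound types per recording")
plt.show()
```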

Time-series signals can also be represented in the frequency domain [5]. The difference between the two is that the frequency domain shows the periodic components of the signal, while the time domain shows how the signal changes over time. You can see below how they look on a graph.
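A toy example makes the difference concrete — a sketch using a synthetic signal rather than the competition audio:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.fft import rfft, rfftfreq

# A 440 Hz tone plus random noise, sampled at 32 kHz for one second.
sr = 32000
t = np.linspace(0, 1.0, sr, endpoint=False)
signal = np.sin(2 * np.pi * 440 * t) + 0.3 * np.random.randn(sr)

spectrum = rfft(signal)                  # frequency-domain representation
freqs = rfftfreq(len(signal), d=1 / sr)  # frequency (Hz) of each bin

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 5))
ax1.plot(t[:1000], signal[:1000]); ax1.set_title("time domain")
ax2.plot(freqs, np.abs(spectrum)); ax2.set_title("frequency domain")
plt.tight_layout()
plt.show()
```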

In the frequency domain, the high-frequency components are generally noise because they do not correspond to patterns that repeat throughout the recording. Therefore, for noise reduction, we can apply a low-pass filter, which removes these high-frequency components.

You can see the noise reduction function below. The function has three parameters: the first one (y) is the signal, the second one (plot) is a boolean that controls whether the signal is plotted before and after filtering, and the last one (th) controls the size of the filter, i.e. how much of the spectrum is kept.

SciPy's FFT functions [4] were used to convert the signal from the time domain to the frequency domain (and back).
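The exact implementation is in the linked notebook [1]; below is a minimal sketch of such a routine, assuming th is the fraction of the lowest-frequency bins to keep and everything above that cutoff is zeroed out:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.fft import rfft, irfft


def noise_reduction(y, plot=False, th=0.1):
    """Low-pass filter in the frequency domain.

    th is assumed to be the fraction of low-frequency bins that are kept;
    all higher-frequency bins are zeroed out before converting back.
    """
    spectrum = rfft(y)
    cutoff = int(len(spectrum) * th)        # highest bin index we keep
    filtered = spectrum.copy()
    filtered[cutoff:] = 0                   # remove the high-frequency components
    y_filtered = irfft(filtered, n=len(y))  # back to the time domain

    if plot:
        fig, axes = plt.subplots(2, 2, figsize=(12, 6))
        axes[0, 0].plot(y);                axes[0, 0].set_title("time domain (before)")
        axes[0, 1].plot(np.abs(spectrum)); axes[0, 1].set_title("frequency domain (before)")
        axes[1, 0].plot(y_filtered);       axes[1, 0].set_title("time domain (after)")
        axes[1, 1].plot(np.abs(filtered)); axes[1, 1].set_title("frequency domain (after)")
        plt.tight_layout()
        plt.show()

    return y_filtered
```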

Librosa [3] was used for loading the audio data. After loading the audio, we called the noise_reduction function with the plot parameter set to True. The function returns the filtered audio and plots two graphs: the first shows the audio in the time and frequency domains before filtering, and the second shows it after filtering.
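A sketch of that call — the file path below is a placeholder, not a file name from the article:

```python
import librosa

# Placeholder path; any recording from the train_audio folder works the same way.
audio_path = "../input/birdclef-2022/train_audio/<species>/<recording>.ogg"
y, sr = librosa.load(audio_path, sr=None)  # keep the native sampling rate

y_clean = noise_reduction(y, plot=True, th=0.1)
```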

As you can see, after filtering, the high-frequency values were removed from the audio's spectrum.

You can listen to both audio clips below. The first one is the original sound from the dataset, and the second one is the filtered sound. The background noise in the filtered audio is much lower than in the original, so the bird sounds are much cleaner.
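In a notebook, both versions can be played back for comparison — a sketch using IPython's audio widget, with y, y_clean, and sr coming from the snippets above:

```python
from IPython.display import Audio, display

display(Audio(y, rate=sr))        # original recording
display(Audio(y_clean, rate=sr))  # filtered recording
```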

The competition dataset was very noisy: besides the bird calls, there are many other sounds such as wind, water, etc. Therefore, the noise reduction function was very important for accurate predictions. A good noise reduction function can also serve as a building block for the Test Time Augmentation (TTA) technique. TTA may be the subject of another article.

👋 Thanks for reading. I hope you enjoyed the article. You can access the complete Python code here [1]. If you enjoy my work, don't forget to clap 👏 and follow me on Medium and LinkedIn. It will motivate me to offer more content to the Medium community! 😊

[1]: https://www.kaggle.com/code/hasanbasriakcay/birdclef22-eda-noise-reduction#Noise-Reduction
[2]: https://www.kaggle.com/competitions/birdclef-2022/overview
[3]: https://librosa.org/doc/latest/index.html
[4]: https://docs.scipy.org/doc/scipy/tutorial/fft.html
[5]: https://en.wikipedia.org/wiki/Frequency_domain


Data Scientist | đŸ„ˆ Kaggle Master | Master's Degree in AI | https://www.linkedin.com/in/hasan-basri-akcay/