Audio Signal Classification Using Deep Learning

Uma Mahesh RN; Deepak Chakrasali; Suhas Chandra Thejasvi N; Manoj Kumar C; Srivathsa D Bharadwaj

doi:10.61927/igmin345

Engineering Group; Research Article; Article ID: igmin345

Artificial Intelligence; doi:10.61927/igmin345

Affiliation

Department of CSE (AI&ML), ATME College of Engineering, Mysore, Karnataka, India

Abstract

Audio signal classification plays a significant role in real-world applications such as speech recognition, environmental sound analysis, and music genre identification. Traditional approaches often depend on manually extracted features, which may not capture the full complexity of audio data. This paper presents a deep learning-based method for automatic classification of audio signals using a One-Dimensional Convolutional Neural Network (1D-CNN) and a Recurrent Neural Network (RNN). The CNN extracts spatial features from spectrogram representations, while the RNN captures temporal dependencies within the audio sequences. Both models were trained and evaluated on a labelled dataset, and their performance was compared using accuracy, precision, probability of detection (POD, also known as recall), and F1-score. The experimental results show that the CNN achieved higher classification accuracy than the RNN, with the CNN excelling at spatial feature extraction and the RNN providing temporal feature learning. These results indicate that deep learning models can improve the performance and reliability of audio signal classification systems.
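The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes POD is computed as recall, TP / (TP + FN), and treats one class at a time as the positive class; the label names and predictions below are hypothetical and are not taken from the paper's dataset or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, positive):
    """Precision, POD (recall), and F1 with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    pod = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * pod / (precision + pod) if precision + pod else 0.0
    return precision, pod, f1


# Hypothetical ground-truth and predicted labels for six audio clips.
y_true = ["speech", "music", "noise", "speech", "music", "noise"]
y_pred = ["speech", "music", "speech", "speech", "noise", "noise"]

acc = accuracy(y_true, y_pred)
precision, pod, f1 = per_class_metrics(y_true, y_pred, positive="speech")
print(f"accuracy={acc:.3f} precision={precision:.3f} POD={pod:.3f} F1={f1:.3f}")
```

For a multi-class comparison such as CNN versus RNN, the per-class values would typically be averaged across classes; which averaging scheme the authors used is not stated in the abstract.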
