Design de som e IA: um mapeamento categórico de modelos generativos de áudio

Laura Suzin Araujo; Alinne Balduino Pires Fernandes

doi:10.5151/pluraldesig2025-16

Blucher Design Proceedings

October 2025, vol. 1, no. 14 - Plural Design Digital

Article - Open Access

Sound Design and AI: A Categorical Mapping of Generative Audio Models

Abstract:

This article presents a categorical mapping of generative artificial intelligence models for sound generation and manipulation developed up to 2025. Focusing on open-source models documented in the scientific literature, the study organizes them by year, developer, functional category, and bibliographic reference. The analysis highlights technical and creative trends, as well as the expanding expressive possibilities of these tools. By providing a panoramic and critical view of these technologies, the work contributes to understanding the role of AI in the transformation of contemporary sound design and raises questions about authorship and algorithmic ethics.

Keywords: sound design, generative artificial intelligence, sound effect, text-to-audio, text-to-sound, text-to-music

How to cite:

Araujo, Laura Suzin; Fernandes, Alinne Balduino Pires; "Design de som e IA: um mapeamento categórico de modelos generativos de áudio", p. 125-140. In: PluralDesign2025. São Paulo: Blucher, 2025.
ISSN 2318-6968, DOI 10.5151/pluraldesig2025-16
