DD Noticias: Your daily source of inspiration
Technology & Gadgets

Google Opens Access to Gemini 2.5 Native Audio Dialog and Controllable Speech Generation in Preview

By Jane Austen, June 4, 2025

Google introduced new audio generation capabilities with the Gemini 2.5 models at Google I/O 2025. The Mountain View-based tech giant is now letting developers and individuals test these features on its platform. The two new capabilities are native audio dialog and controllable text-to-speech (TTS) with Gemini 2.5 Flash preview. The former natively generates human-like audio in response to user prompts, while the latter converts any script into conversational speech. These features are currently not available to developers via application programming interfaces (APIs).

Google Showcases Gemini 2.5 Flash’s Audio Output Capabilities

In a blog post, the tech giant detailed the features of these two audio generation modes, highlighting how developers can use them to build new experiences for people. Currently, native audio dialog can be tried out in Google AI Studio’s stream tab, whereas the TTS feature can be tested in the generate media tab within AI Studio.

Native audio dialog with Gemini 2.5 Flash preview is designed for real-time conversations between a human user and the AI. The user can either type a prompt or speak it, and the AI responds verbally. This process directly generates audio, instead of first generating text and then converting it into speech.

This approach has several advantages. It supports affective dialog, meaning Gemini 2.5 Flash can recognise the emotion in the user's tone of voice, not just the words spoken. It can tell when the user sounds scared, angry, or surprised and respond accordingly.

Apart from this, the audio generation feature can express emotions when speaking, adopt different accents and linguistic styles, access tools such as Google Search, and converse in more than 24 languages.
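The difference between generating audio directly and going through text first can be pictured with a small sketch. The Python below is purely illustrative and does not call any Google service: `SpokenInput`, `cascaded_reply`, and `native_audio_reply` are hypothetical names invented here, and the "models" are stubs. It only shows why a text-first pipeline tends to lose the user's tone of voice, assuming the transcript carries words alone, while a native audio path can still see it.

```python
from dataclasses import dataclass


@dataclass
class SpokenInput:
    """Hypothetical stand-in for a user's spoken prompt."""
    words: str  # what the user said
    tone: str   # how they said it, e.g. "scared" or "angry"


def cascaded_reply(user: SpokenInput) -> dict:
    """Text-first pipeline (stub): speech -> text -> text reply -> speech."""
    # Assumption: the transcript keeps the words and drops the tone.
    transcript = user.words
    return {
        "stages": ["transcribe", "generate_text", "synthesize_speech"],
        "tone_seen_by_model": None,
        "reply": f"Reply to: {transcript}",
    }


def native_audio_reply(user: SpokenInput) -> dict:
    """Native audio pipeline (stub): audio in, audio out, no text hop."""
    return {
        "stages": ["generate_audio"],
        "tone_seen_by_model": user.tone,
        "reply": f"Reply to: {user.words}",
    }


if __name__ == "__main__":
    utterance = SpokenInput(words="Is someone at the door?", tone="scared")
    for pipeline in (cascaded_reply, native_audio_reply):
        result = pipeline(utterance)
        print(pipeline.__name__, result["stages"], result["tone_seen_by_model"])
```

In this toy version the only difference is whether the tone survives to the step that produces the reply, which is the property the affective dialog feature depends on.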

The controllable TTS feature, meanwhile, offers multi-speaker dialogue generation, can produce emotions and accents while narrating a script, lets users control delivery speed and emphasise pronunciation, and supports language mixing across the same set of more than 24 languages.

Google says these capabilities were assessed for potential risks throughout the development process. The company used both internal mechanisms and red teaming to find and fix vulnerabilities. It also highlighted that all audio outputs from these models are embedded with SynthID, its watermarking technology.


