Google Gemini
The Google Gemini integration adds a conversation agent, speech-to-text, and text-to-speech entities powered by Google Gemini to Home Assistant. The conversation agent can optionally be allowed to control Home Assistant.
To control Home Assistant, the AI is given access to the Assist API of Home Assistant. You can choose which devices and entities it can access on the exposed entities page. The AI can then provide information about your devices and control them.
This integration does not integrate with sentence triggers.
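This integration requires an API key, which you can generate as described in the next section, and you must be in one of the available regions.
This integration can be configured through the UI. Go to Settings > Devices & services to add it.
Generate an API key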
The API key is used to authenticate requests to the Google Gemini API. To generate an API key, take the following steps:
- Visit the API Keys page to retrieve the API key you'll use to configure the integration.
On the same page, you can see your plan: free of charge if the associated Google Cloud project doesn't have billing enabled, or pay-as-you-go if it does. A comparison of the plans is available on the pricing page. The major differences are that the free of charge plan is rate limited, and that free prompts and responses are used for product improvement.
Options
To define options for Google Gemini, follow these steps:
- In Home Assistant, go to Settings > Devices & services.
- If multiple instances of Google Gemini are configured, choose the instance you want to configure.
- On the card, select the cogwheel.
  - If the card does not have a cogwheel, the integration does not support options for this device.
- Edit the options, then select Submit to save the changes.
If you choose to not use the recommended settings, you can configure the following options:
Google Search
Due to an API limitation, the Google Search tool cannot be enabled together with other tools. If both are enabled, the request fails with 400 INVALID_ARGUMENT: {'error': {'code': 400, 'message': 'Tool use with function calling is unsupported', 'status': 'INVALID_ARGUMENT'}}.
As a workaround, you can expose a script to voice assistants. The script calls a Google Gemini conversation agent that only has the Google Search tool enabled.
Google Search tool workaround
- Add a second Google Gemini conversation agent.
- Select Configure
- In the Control Home Assistant section, uncheck Assist and any other options.
- Uncheck Recommended model settings
- Select Submit
- Check Enable Google Search tool
- Increase Maximum tokens to return in response
- Select Submit
- Create a script (Settings > Automations & scenes > Scripts > Create script)
- Select 3 dots > Edit in YAML and enter the following (edit conversation.google_generative_ai_2 to match the entity created in the first step):
- Select Save script
- Select 3 dots > Settings > Voice assistants
- Check Expose Assist
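The script body referred to in the steps above is not shown here. As a rough sketch only, a script along these lines would forward a prompt to the second conversation agent and return its spoken reply. The alias, description, and the field name `prompt` are illustrative assumptions, and `conversation.google_generative_ai_2` must be replaced with your own entity:

```yaml
# Sketch of a search script; names and descriptions are illustrative.
alias: Google Search
description: Search Google for current information and return the answer.
fields:
  prompt:
    name: Prompt
    description: The question to search for.
    required: true
    selector:
      text: null
sequence:
  # Send the prompt to the agent that only has the Google Search tool enabled.
  - action: conversation.process
    data:
      agent_id: conversation.google_generative_ai_2  # replace with your entity
      text: "{{ prompt }}"
    response_variable: result
  # Extract the plain-text answer from the conversation response.
  - variables:
      response:
        text: "{{ result.response.speech.plain.speech }}"
  # Return the answer to the caller (for example, the main Assist agent).
  - stop: ""
    response_variable: response
```

A clear description helps the main conversation agent decide when to call the script.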
Talking to Super Mario
You can use this integration to talk to Super Mario and, if you want, have him control devices in your home.
The tutorial uses OpenAI, but the same can be done with the Google Gemini integration.
Actions
Generate content
:::tip
This action isn't tied to any integration entry, so it won't use the model, prompt, or any of the other settings in your options. If you only want to pass text, you should use the conversation.process action.
:::
Allows you to ask Gemini Pro or Gemini Pro Vision to generate content from a prompt consisting of text and optionally attachments (images, PDFs, etc.). This action populates response data with the generated content.
The response data field text will contain the generated content.
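As a minimal sketch of how the response data might be used, the sequence below asks for a description of a single image and shows the `text` field in a persistent notification. It assumes the action is named `google_generative_ai_conversation.generate_content` and accepts `prompt` and `filenames`; verify both in Developer tools > Actions. The file path and notification title are hypothetical:

```yaml
# Sketch only: action name, field names, and file path are assumptions.
sequence:
  - action: google_generative_ai_conversation.generate_content
    data:
      prompt: Describe what you see in this image in one short sentence.
      filenames:
        - /tmp/doorbell_snapshot.jpg  # hypothetical path
    response_variable: generated_content
  # The generated content is available in the "text" field of the response.
  - action: persistent_notification.create
    data:
      title: Doorbell
      message: "{{ generated_content.text }}"
```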
Another example with multiple images:
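A sketch of such a call, under the same assumptions about the action name and the `filenames` field, and with hypothetical snapshot paths:

```yaml
# Sketch only: verify the action and field names in Developer tools > Actions.
action: google_generative_ai_conversation.generate_content
data:
  prompt: >-
    Briefly describe what happened in the following sequence of images
    from my driveway camera.
  filenames:
    - /tmp/driveway_snapshot1.jpg  # hypothetical paths
    - /tmp/driveway_snapshot2.jpg
    - /tmp/driveway_snapshot3.jpg
response_variable: generated_content
```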
Speak
The tts.speak action is the modern way to use TTS. Add the speak action, select the Google Gemini TTS entity, select the media player entity or group to send the TTS audio to, and enter the message to speak.
Text-to-speech (TTS) generation is controllable, meaning you can use natural language to structure interactions and guide the style, accent, pace, and tone of the audio. You can change the way the text is spoken directly in the message, for example by entering "Say cheerfully: Have a wonderful day" instead of just "Have a wonderful day".
For more options about speak, see the Speak section on the main TTS building block page.
In YAML, your action will look like this:
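The following is a sketch rather than the exact snippet from your setup. Both entity IDs are hypothetical placeholders, so substitute the Google Gemini TTS entity and the media player from your own installation:

```yaml
action: tts.speak
target:
  entity_id: tts.google_generative_ai_tts  # hypothetical TTS entity ID
data:
  media_player_entity_id: media_player.living_room_speaker  # hypothetical
  # The style instruction is part of the message itself.
  message: "Say cheerfully: Have a wonderful day"
```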
You can configure the following options:
The input language is detected automatically. Check the Google AI documentation for the supported languages.
Video tutorial
This video tutorial explains how Google Gemini can be set up, how you can send an AI-generated message to your smart speaker when you arrive home, and how you can analyze an image taken from your doorbell camera as soon as someone rings the doorbell.
Troubleshooting
- To aid in diagnosing issues, it may help to enable verbose logging by adding the following to your configuration.yaml:
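A logging configuration along these lines is one possibility. It assumes the integration's domain is `google_generative_ai_conversation`, which matches the entity names used elsewhere on this page; adjust the logger names if your installation differs:

```yaml
logger:
  logs:
    # Debug output for the conversation platform and this integration.
    homeassistant.components.conversation: debug
    homeassistant.components.google_generative_ai_conversation: debug
```

Restart Home Assistant after editing configuration.yaml for the change to take effect.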
Removing the integration
To remove an integration instance from Home Assistant
- Go to Settings > Devices & services and select the integration card.
- From the list of devices, select the integration instance you want to remove.
- Next to the entry, select the three-dot menu. Then, select Delete.

