`langchain_google_community.google_speech_to_text`.SpeechToTextLoader

class langchain_google_community.google_speech_to_text.SpeechToTextLoader(project_id: str, file_path: str, location: str = 'us-central1', recognizer_id: str = '_', config: Optional[RecognitionConfig] = None, config_mask: Optional[FieldMask] = None, is_long: bool = False)

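Loader for Google Cloud Speech-to-Text audio transcripts.

It transcribes audio files with the Google Cloud Speech-to-Text API and loads the transcribed text into one or more Documents, depending on the specified format.

To use it, you should have the google-cloud-speech Python package installed.

Audio files can be specified by a Google Cloud Storage URI or a local file path.

For details on Google Cloud Speech-to-Text, refer to the product documentation: https://cloud.google.com/speech-to-text

Initializes the GoogleSpeechToTextLoader.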

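Parameters

project_id (str) – Google Cloud project ID.
file_path (str) – A Google Cloud Storage URI or a local file path.
location (str) – Speech-to-Text recognizer location.
recognizer_id (str) – Speech-to-Text recognizer ID.
config (Optional[RecognitionConfig]) – Recognition options and features. For more information: https://cloud.google.com/python/docs/reference/speech/latest/google.cloud.speech_v2.types.RecognitionConfig
config_mask (Optional[FieldMask]) – The list of fields in config that override the values in the recognizer's default_recognition_config for this recognition request. For more information: https://cloud.google.com/python/docs/reference/speech/latest/google.cloud.speech_v2.types.RecognizeRequest
is_long (bool) – Whether to use asynchronous Cloud Speech recognition, intended mainly for long audio files. For more information: https://cloud.google.com/speech-to-text/v2/docs/batch-recognize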
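
As a minimal sketch of how these parameters fit together, the function below builds a loader and returns the transcript text of each Document. It assumes the google-cloud-speech and langchain-google-community packages are installed and that Google Cloud credentials are configured in the environment. `transcribe` is a hypothetical helper name, not part of the library, and the project ID and file path are supplied by the caller.

```python
from typing import List


def transcribe(project_id: str, file_path: str, is_long: bool = False) -> List[str]:
    """Transcribe one audio file and return the text of each Document.

    Illustrative helper only; it is not part of langchain_google_community.
    """
    # Imported lazily so that defining this function does not require the
    # optional dependency to be installed.
    from langchain_google_community.google_speech_to_text import SpeechToTextLoader

    loader = SpeechToTextLoader(
        project_id=project_id,
        file_path=file_path,  # a Cloud Storage URI or a local file path
        is_long=is_long,  # request asynchronous recognition for long audio
    )
    # location and recognizer_id keep their defaults ('us-central1' and '_').
    docs = loader.load()  # blocks until transcription is finished
    return [doc.page_content for doc in docs]
```

A call would look like `transcribe("my-project", "gs://my-bucket/audio.wav")`, where both values are placeholders.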

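Methods

`__init__`(project_id, file_path[, location, ...])	Initializes the GoogleSpeechToTextLoader.
`alazy_load`()	A lazy loader for Documents.
`aload`()	Load data into Document objects.
`lazy_load`()	A lazy loader for Documents.
`load`()	Transcribes the audio file and loads the transcript into Documents.
`load_and_split`([text_splitter])	Load Documents and split them into chunks.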

__init__(project_id: str, file_path: str, location: str = 'us-central1', recognizer_id: str = '_', config: Optional[RecognitionConfig] = None, config_mask: Optional[FieldMask] = None, is_long: bool = False)

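Initializes the GoogleSpeechToTextLoader.

Parameters

project_id (str) – Google Cloud project ID.
file_path (str) – A Google Cloud Storage URI or a local file path.
location (str) – Speech-to-Text recognizer location.
recognizer_id (str) – Speech-to-Text recognizer ID.
config (Optional[RecognitionConfig]) – Recognition options and features. For more information: https://cloud.google.com/python/docs/reference/speech/latest/google.cloud.speech_v2.types.RecognitionConfig
config_mask (Optional[FieldMask]) – The list of fields in config that override the values in the recognizer's default_recognition_config for this recognition request. For more information: https://cloud.google.com/python/docs/reference/speech/latest/google.cloud.speech_v2.types.RecognizeRequest
is_long (bool) – Whether to use asynchronous Cloud Speech recognition, intended mainly for long audio files. For more information: https://cloud.google.com/speech-to-text/v2/docs/batch-recognize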
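
The `config` and `config_mask` arguments can be sketched as follows. This is an illustration under assumptions, not the library's own default configuration: `make_configured_loader` and `mask_paths` are hypothetical names, the model name and language codes are left to the caller because valid values depend on the chosen location, and enabling automatic punctuation is just one example feature.

```python
from typing import Any, List, Optional


def make_configured_loader(
    project_id: str,
    file_path: str,
    model: str,
    language_codes: List[str],
    mask_paths: Optional[List[str]] = None,
) -> Any:
    """Build a SpeechToTextLoader with an explicit RecognitionConfig.

    Illustrative helper only; it is not part of langchain_google_community.
    """
    # Imported lazily: these need google-cloud-speech (which brings protobuf)
    # and langchain-google-community to be installed.
    from google.cloud.speech_v2 import (
        AutoDetectDecodingConfig,
        RecognitionConfig,
        RecognitionFeatures,
    )
    from google.protobuf.field_mask_pb2 import FieldMask
    from langchain_google_community.google_speech_to_text import SpeechToTextLoader

    config = RecognitionConfig(
        auto_decoding_config=AutoDetectDecodingConfig(),  # let the API detect the encoding
        language_codes=language_codes,
        model=model,  # caller-supplied; availability varies by location
        features=RecognitionFeatures(enable_automatic_punctuation=True),
    )
    # Build a mask only when the caller names the config fields that should
    # override the recognizer's default_recognition_config.
    config_mask = FieldMask(paths=mask_paths) if mask_paths else None

    return SpeechToTextLoader(
        project_id=project_id,
        file_path=file_path,
        config=config,
        config_mask=config_mask,
    )
```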

async alazy_load() → AsyncIterator[Document]

A lazy loader for Documents.

Return type: AsyncIterator[Document]

async aload() → List[Document]

Load data into Document objects.

Return type: List[Document]
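
The two asynchronous methods could be consumed roughly as shown below. This is a sketch that relies only on the signatures listed above: `aload()` is awaited and `alazy_load()` is iterated with `async for`. The helper names are hypothetical, and `loader` stands for an already constructed SpeechToTextLoader.

```python
import asyncio
from typing import Any, List


async def collect_transcripts(loader: Any) -> List[str]:
    """Await aload() and return the text of every Document."""
    docs = await loader.aload()
    return [doc.page_content for doc in docs]


async def stream_transcripts(loader: Any) -> List[str]:
    """Consume alazy_load() one Document at a time."""
    transcripts: List[str] = []
    async for doc in loader.alazy_load():
        transcripts.append(doc.page_content)
    return transcripts


def run_collect(loader: Any) -> List[str]:
    """Synchronous entry point for scripts without a running event loop."""
    return asyncio.run(collect_transcripts(loader))
```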

lazy_load() → Iterator[Document]

A lazy loader for Documents.

Return type: Iterator[Document]
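
Because `lazy_load()` returns an iterator, a caller can stop after the first few Documents. The sketch below uses a hypothetical helper, `first_transcripts`, to do that. Whether any transcription work is actually deferred depends on the loader implementation, so this shows the iteration pattern only and makes no claim about cost savings.

```python
from itertools import islice
from typing import Any, List


def first_transcripts(loader: Any, limit: int) -> List[str]:
    """Return the text of at most `limit` Documents from lazy_load()."""
    # islice stops pulling from the iterator once `limit` items are taken.
    return [doc.page_content for doc in islice(loader.lazy_load(), limit)]
```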

load() → List[Document]

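Transcribes the audio file and loads the transcript into Documents.

It uses the Google Cloud Speech-to-Text API to transcribe the audio file and blocks until the transcription is finished.

Return type: List[Document]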

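load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]

Load Documents and split them into chunks. Chunks are returned as Documents.

Do not override this method. It should be considered to be deprecated!

Parameters: text_splitter (Optional[TextSplitter]) – The TextSplitter instance to use for splitting Documents. Defaults to RecursiveCharacterTextSplitter.
Returns: A list of Documents.
Return type: List[Document]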
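
A minimal sketch of passing an explicit splitter, assuming the langchain-text-splitters package is installed. `load_chunks` is a hypothetical helper name, and the chunk size and overlap are arbitrary illustrative values rather than recommendations.

```python
from typing import Any, List


def load_chunks(loader: Any, chunk_size: int = 1000, chunk_overlap: int = 100) -> List[Any]:
    """Load Documents from `loader` and split them into chunks.

    Illustrative helper only; the default sizes here are arbitrary.
    """
    # Imported lazily so that defining this function does not require the
    # optional dependency to be installed.
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
    )
    return loader.load_and_split(text_splitter=splitter)
```

Since the method is flagged as a deprecation candidate, calling `splitter.split_documents(loader.load())` is an alternative that avoids depending on it.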
