Deepgram Go SDK Tutorial: Getting Started with Audio Transcription and Live Streaming

Explore the Deepgram Go SDK with examples for pre-recorded audio transcription, live streaming, and management APIs. Learn how to get started with the SDK and integrate it into your Go applications.

October 18, 2024 at 11:43

Deepgram Go SDK Tour

Introduction

The Deepgram Go SDK can be found at github.com/deepgram/deepgram-go-sdk. The README has been rewritten to match the new release and to provide a smooth onboarding experience for users.

Getting Started

To get started with the Deepgram Go SDK, sign up on the Deepgram website to get a free API key. Links to documentation and quick starts are available in the README. The SDK provides examples to help users get started with different functionality.

Examples Folder

The examples folder contains small applications and utilities to demonstrate the SDK's functionality. The examples are categorized into three main areas: pre-recorded, live streaming, and management APIs. Each category has multiple examples that demonstrate different usage scenarios.

Pre-recorded Category

The pre-recorded category includes examples for:

  • Transcribing audio from a file
  • Transcribing audio from a stream
  • Transcribing audio from a URL

The examples can be run by cloning the repository and navigating to the examples folder.

URL Example

The URL example takes a URL that points to an audio file and sends it to the Deepgram platform for pre-recorded transcription. It demonstrates how to use the SDK to transcribe audio hosted at a URL endpoint. To run the code, use the command go run main.go; the transcription result is printed to the console.

The code is straightforward: the main application sets the transcription options and creates a new pre-recorded client. The client takes a URL as input and transcribes the pre-recorded audio found at that URL. The transcription comes back as a structured result, which is marshaled to JSON and printed to the screen.
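As a rough illustration of what happens underneath, the sketch below builds the same kind of request with only the Go standard library, assuming Deepgram's REST endpoint for pre-recorded audio (POST to /v1/listen with a JSON body containing the URL and a Token authorization header). It does not use the SDK; the function name newURLRequest, the placeholder API key, and the example audio URL are hypothetical, and the request is built but never sent.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// listenEndpoint is the REST endpoint assumed here for pre-recorded audio.
const listenEndpoint = "https://api.deepgram.com/v1/listen"

// newURLRequest builds (but does not send) a transcription request for a
// hosted audio file. The name is hypothetical; it is not an SDK function.
func newURLRequest(apiKey, audioURL string) (*http.Request, error) {
	// The audio location travels in a small JSON body.
	body, err := json.Marshal(map[string]string{"url": audioURL})
	if err != nil {
		return nil, err
	}
	// Transcription options are passed as query parameters.
	req, err := http.NewRequest(http.MethodPost, listenEndpoint+"?punctuate=true", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Token "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder key and URL: replace both before sending a real request.
	req, err := newURLRequest("YOUR_API_KEY", "https://example.com/audio.wav")
	if err != nil {
		panic(err)
	}
	payload, err := io.ReadAll(req.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(string(payload))
}
```

Sending the request with http.DefaultClient.Do(req) and decoding the response body would complete the round trip; the SDK example wraps those steps in its pre-recorded client.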

Live Streaming Category

Another category of examples is live streaming, which can be found in the examples folder under streaming. There are three examples of live streaming: one from an HTTP endpoint, one from a microphone, and one that replays an audio file to live transcription.

To run the live transcription example that uses a microphone, you need to have the PortAudio library installed. The example captures input from the microphone attached to your machine, forwards the audio data to the Deepgram platform, and receives transcription messages back in real time, which are delivered to the console or main application.

To run this example, go to the examples directory and navigate to the streaming folder, then run the code with the command go run main.go. The example code for live transcription is a simple main application that demonstrates how to use live transcription with the API.

Real-Time Audio Transcription Example

The example demonstrates real-time audio transcription using the microphone and the Deepgram platform. To run the example, execute go run main.go and speak into the microphone. Transcription happens in real time, and each result is sent to the console application as it arrives.
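To make the shape of those notifications concrete, here is a minimal sketch that decodes one live transcription message with encoding/json. It assumes the JSON layout of Deepgram's streaming results (a type field, an is_final flag, and a channel holding a list of alternatives); the struct liveResult, the helper transcriptOf, and the sample message are hypothetical and are not taken from the SDK.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// liveResult mirrors only the fields this sketch needs from a streaming
// transcription message. The struct is hypothetical, not an SDK type.
type liveResult struct {
	Type    string `json:"type"`
	IsFinal bool   `json:"is_final"`
	Channel struct {
		Alternatives []struct {
			Transcript string  `json:"transcript"`
			Confidence float64 `json:"confidence"`
		} `json:"alternatives"`
	} `json:"channel"`
}

// transcriptOf returns the top transcript and whether it is final.
// Messages that are not transcription results yield an empty transcript.
func transcriptOf(raw []byte) (string, bool, error) {
	var msg liveResult
	if err := json.Unmarshal(raw, &msg); err != nil {
		return "", false, err
	}
	if msg.Type != "Results" || len(msg.Channel.Alternatives) == 0 {
		return "", false, nil
	}
	return msg.Channel.Alternatives[0].Transcript, msg.IsFinal, nil
}

func main() {
	// A hand-written sample message, not captured from a real session.
	sample := []byte(`{"type":"Results","is_final":true,"channel":{"alternatives":[{"transcript":"hello world","confidence":0.98}]}}`)
	text, final, err := transcriptOf(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("final=%v transcript=%q\n", final, text)
}
```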

The code is simpler than previous examples, with a reduced number of lines and improved organization.

Initializing the Transcription

The main function initializes a Go context, sets the transcription options, and connects to the Deepgram WebSocket.
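The setup steps can be sketched with the standard library alone. The example below creates a context that is cancelled on Ctrl+C and turns a set of options into the query string of a WebSocket URL, assuming the streaming endpoint wss://api.deepgram.com/v1/listen and the query parameters model, language, punctuate, encoding, and sample_rate. The liveOptions struct and its websocketURL method are hypothetical stand-ins for the SDK's own option types, and the option values are examples only. No connection is opened.

```go
package main

import (
	"context"
	"fmt"
	"net/url"
	"os"
	"os/signal"
	"strconv"
)

// liveOptions is a hypothetical stand-in for the SDK's transcription options.
type liveOptions struct {
	Model      string
	Language   string
	Punctuate  bool
	Encoding   string
	SampleRate int
}

// websocketURL encodes the options as query parameters on the assumed
// streaming endpoint. url.Values.Encode sorts the parameters by key.
func (o liveOptions) websocketURL() string {
	q := url.Values{}
	q.Set("model", o.Model)
	q.Set("language", o.Language)
	q.Set("punctuate", strconv.FormatBool(o.Punctuate))
	q.Set("encoding", o.Encoding)
	q.Set("sample_rate", strconv.Itoa(o.SampleRate))
	u := url.URL{Scheme: "wss", Host: "api.deepgram.com", Path: "/v1/listen", RawQuery: q.Encode()}
	return u.String()
}

func main() {
	// The context is cancelled when the process receives an interrupt,
	// which gives a streaming loop a clean way to shut down.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
	defer stop()

	opts := liveOptions{
		Model:      "nova-2",
		Language:   "en-US",
		Punctuate:  true,
		Encoding:   "linear16",
		SampleRate: 16000,
	}
	fmt.Println("context active:", ctx.Err() == nil)
	fmt.Println(opts.websocketURL())
}
```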

Starting the Microphone

The liveStream function is used to create a new live stream client, connect to the WebSocket, and start the microphone. The microphone function is used to allocate a microphone and start recording audio.

Streaming Audio to Deepgram

The stream function is used to take the microphone stream and pass it into the Deepgram client. The microphone and client use standard Go interfaces (io.Reader and io.Writer), making it easy to pass the client into the microphone stream.

Simplified Code

The code simplifies the process of connecting the microphone to the Deepgram platform.

Management APIs in the Go SDK

The Go SDK also provides the management APIs, which expose CRUD (create, read, update, delete) operations on account resources such as projects and invitations. The examples directory contains utility applications that demonstrate the usage of these APIs.

Each utility application exercises the full set of CRUD operations for its API, such as:

  • Creating an invitation
  • Listing or reading all invitations
  • Updating an invitation
  • Deleting an invitation

Example Code for Invitations API

Located in the manage folder within the examples directory, the code is a simple "hello world" style application that demonstrates the usage of the Invitations API.

The application:

  • Lists out all projects (invitations are project-based)
  • Lists out all invitations (none found initially)
  • Sends an invitation
  • Reads the invitation to ensure it was created successfully
  • Deletes the invitation
  • Lists out all invitations again to confirm the invitation was successfully deleted
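The sequence of steps above can be sketched as a small program. Everything in it is hypothetical: the inviteManager interface, the in-memory memoryInvites implementation, and the project ID and email address are illustrative stand-ins, not the SDK's management client. The sketch only shows the list, send, verify, delete, and list-again flow, with the in-memory store taking the place of the Deepgram API.

```go
package main

import (
	"errors"
	"fmt"
)

// inviteManager is a hypothetical interface covering the operations used
// in the walkthrough. It does not mirror the SDK's actual method names.
type inviteManager interface {
	List(projectID string) []string
	Send(projectID, email string) error
	Delete(projectID, email string) error
}

// memoryInvites keeps invitations in memory, keyed by project.
type memoryInvites struct {
	byProject map[string][]string
}

func newMemoryInvites() *memoryInvites {
	return &memoryInvites{byProject: map[string][]string{}}
}

// List returns a copy of the pending invitations for a project.
func (m *memoryInvites) List(projectID string) []string {
	return append([]string(nil), m.byProject[projectID]...)
}

// Send records a new invitation and rejects duplicates.
func (m *memoryInvites) Send(projectID, email string) error {
	for _, e := range m.byProject[projectID] {
		if e == email {
			return errors.New("invitation already exists")
		}
	}
	m.byProject[projectID] = append(m.byProject[projectID], email)
	return nil
}

// Delete removes an invitation or reports that it was not found.
func (m *memoryInvites) Delete(projectID, email string) error {
	list := m.byProject[projectID]
	for i, e := range list {
		if e == email {
			m.byProject[projectID] = append(list[:i], list[i+1:]...)
			return nil
		}
	}
	return errors.New("invitation not found")
}

// walkthrough returns the invitation count before sending, after sending,
// and after deleting.
func walkthrough(mgr inviteManager, projectID, email string) ([3]int, error) {
	var counts [3]int
	counts[0] = len(mgr.List(projectID))
	if err := mgr.Send(projectID, email); err != nil {
		return counts, err
	}
	counts[1] = len(mgr.List(projectID))
	if err := mgr.Delete(projectID, email); err != nil {
		return counts, err
	}
	counts[2] = len(mgr.List(projectID))
	return counts, nil
}

func main() {
	counts, err := walkthrough(newMemoryInvites(), "example-project", "teammate@example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println("invitations before, after send, after delete:", counts)
}
```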

Releasing the Go SDK

The Go SDK is now available, and examples can be found within the repo. If you have any questions or need assistance, feel free to ping the team on Discord. If you find any bugs, have suggestions for enhancements, or feedback, please file an issue in the repo.