Hackers Realm

Jul 14, 2022 · 2 min

Convert Speech to Text using Python | Speech Recognition | Machine Learning Project Tutorial

Updated: May 31, 2023

Unlock the power of speech-to-text conversion with Python! This comprehensive tutorial explores speech recognition techniques and machine learning. Learn to transcribe spoken words into written text using cutting-edge algorithms and models. Enhance your skills in natural language processing and optimize your applications with this hands-on project tutorial. #SpeechToText #Python #SpeechRecognition #MachineLearning #NLP

Convert Speech to Text using Speech Recognition

In this project tutorial, we will install the Google Speech Recognition module, convert real-time audio to text, and convert an audio file to text data.


Project Information

The objective of the project is to convert speech to text in real time and to convert an audio file to text. It uses the Google Speech API to convert the audio to text.

Libraries

  • speech_recognition

  • Google Speech API

We install the speech recognition module, along with PyAudio, which is required for microphone input

# install the modules
!pip install speechrecognition
!conda install pyaudio

Requirement already satisfied: speechrecognition in c:\programdata\anaconda3\lib\site-packages (3.8.1)
Collecting PyAudio
Using cached PyAudio-0.2.11.tar.gz (37 kB)
Building wheels for collected packages: PyAudio
Building wheel for PyAudio (setup.py): started
Building wheel for PyAudio (setup.py): finished with status 'error'
Running setup.py clean for PyAudio
Failed to build PyAudio
Installing collected packages: PyAudio
Running setup.py install for PyAudio: started
Running setup.py install for PyAudio: finished with status 'error'
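
The log above ends with PyAudio failing to build, so it can help to confirm what is actually importable before going further. Here is a minimal sketch that uses only the standard library. The helper name missing_modules is made up for illustration and is not part of any package, and the check assumes the import names speech_recognition and pyaudio.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# speech_recognition is the import name used later in this tutorial;
# pyaudio is only needed for microphone input.
print("Missing:", missing_modules(["speech_recognition", "pyaudio"]))
```

An empty list means both modules were found. If pyaudio is listed, the audio file example can still run, but the microphone example cannot.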

Now we import the module

# import the module
import speech_recognition as sr

We initialize the recognizer

# initialize the recognizer
r = sr.Recognizer()

Convert Speech to Text in Real time

We will convert real time audio from a microphone into text

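while True:
    with sr.Microphone() as source:
        # calibrate for background noise
        r.adjust_for_ambient_noise(source, duration=0.3)

        print("Speak now")
        # capture the audio
        audio = r.listen(source)

        try:
            text = r.recognize_google(audio)
            print("Speaker:", text)
            if text == 'quit':
                break
        except sr.UnknownValueError:
            # the speech could not be understood
            print('Please say again!!!')
        except sr.RequestError as e:
            # the API could not be reached
            print('Request failed:', e)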

Speak now
Speaker: welcome to the channel
Speak now
Speaker: testing speech recognition
Speak now
Speaker: quit

  • Microphone() - Receive audio input from microphone

  • adjust_for_ambient_noise(source, duration=0.3) - Listen to the source for 0.3 seconds and calibrate the recognizer to the background noise level of the real time input

  • listen(source) - Capture the audio from the source

  • recognize_google(audio) - Google Speech recognition function to convert audio into text

  • text == 'quit' - Condition to quit the while loop
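
The check text == 'quit' only matches that exact string. As a minimal sketch, assuming the recognizer might return different capitalization or stray spaces, the transcript can be normalized before comparing. The helper names is_stop_command and collect_until_stop are hypothetical, and the hard-coded list stands in for transcripts that recognize_google would return, with None marking an utterance that was not understood.

```python
def is_stop_command(text, stop_word="quit"):
    """Return True if the transcript equals the stop word, ignoring case and outer whitespace."""
    return text.strip().lower() == stop_word.lower()


def collect_until_stop(transcripts, stop_word="quit"):
    """Yield transcripts in order until the stop word is heard.

    `transcripts` is any iterable of strings or None. None stands for an
    utterance that could not be recognized and is skipped.
    """
    for text in transcripts:
        if text is None:
            continue
        if is_stop_command(text, stop_word):
            break
        yield text


# Stand-in data instead of live microphone input (assumption for illustration).
heard = ["welcome to the channel", None, "testing speech recognition", " Quit ", "ignored"]
print(list(collect_until_stop(heard)))
```

In the loop above, the same idea would mean replacing the exact comparison with a call such as is_stop_command(text).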

Convert Audio to Text

Now we will process and convert an audio file into text

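with sr.AudioFile('test.wav') as source:
    print("listening to audio")
    # capture the audio
    audio = r.listen(source)

    try:
        text = r.recognize_google(audio)
        print("Audio:", text)
    except sr.UnknownValueError:
        # the speech could not be understood
        print('Error: could not understand the audio')
    except sr.RequestError as e:
        # the API could not be reached
        print('Error:', e)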

listening to audio
Audio: welcome to speech recognition

  • Displayed text is the same as the speech in the audio file

  • For larger audio files, you need to split them into smaller segments for better processing
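
One way to split a larger file into smaller segments is sketched below, using only Python's built-in wave module. This is an illustration under assumptions and not the method used in the tutorial: the function name split_wav, the chunk file naming, and the fixed-length cut are all made up here, and it only handles uncompressed WAV files.

```python
import os
import wave


def split_wav(path, chunk_seconds, out_dir):
    """Split a WAV file into consecutive chunks of at most `chunk_seconds`.

    Returns the list of chunk file paths in playback order.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        if frames_per_chunk <= 0:
            raise ValueError("chunk_seconds is too small for this sample rate")
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the source file
            chunk_path = os.path.join(out_dir, f"chunk_{index:03d}.wav")
            with wave.open(chunk_path, "wb") as dst:
                # Copy channels, sample width and rate; the frame count in
                # the header is corrected when the chunk is written.
                dst.setparams(params)
                dst.writeframes(frames)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths


# Hypothetical usage with the recognizer `r` from this tutorial:
# for chunk in split_wav("long.wav", 30, "chunks"):
#     with sr.AudioFile(chunk) as source:
#         print(r.recognize_google(r.record(source)))
```

Cutting at fixed intervals can split a word across two chunks, so the transcript near the boundaries may suffer. The commented usage reads each chunk with r.record(source), which captures the whole file, rather than r.listen(source), which stops at the first pause.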

Final Thoughts

  • Very useful tool for converting real time recordings into text which can help in chats, interviews, narration, captions, etc.

  • You can also use this process for emotional speech recognition and then analyze the resulting text for sentiment.

  • Google Speech Recognition transcribed the examples above accurately, but you may use any other module to convert speech into text as per your preference.

In this project tutorial, we explored converting speech to text using the Google Speech Recognition module. We installed the module, then converted both a real-time audio recording and an audio file into text data.

Get the project notebook from here

Thanks for reading the article!!!

Check out more project videos from the YouTube channel Hackers Realm
