Text To Speech (TTS) With Voice Customization using HTML, CSS, & JavaScript

By Bytewebster - May 5, 2023

Welcome to the ByteWebster JavaScript projects series. In this post we build a Text To Speech (TTS) generator with voice customization using HTML, CSS, and JavaScript. The project turns any written text into spoken words and offers several customization options.

Working

This text-to-speech generator turns written text into spoken words. Its customization options let the user choose from different voices, adjust the pitch and speed of the speech, and control the volume. After entering the text and selecting the preferred options, the user clicks a button to hear the text spoken aloud.

Detailed Overview of Project

This Text To Speech project has several potential uses. For example, language learners can use it to practice pronunciation and listening skills by hearing their written work read back to them.

Creating this project is a great way to gain hands-on experience in JavaScript, as it is simple to build and requires minimal code. Let's start with the HTML structure of the text-to-speech generator.

HTML Structure

This is the HTML code for our JavaScript Text To Speech generator. Let's break it down. The code starts with a div element with a class of "wrapper", which groups the entire TTS application together.

The first element inside the wrapper div is a label element with a "for" attribute set to "text". This label acts as a heading for the element that follows it, a textarea with the id "text". The textarea's content is set to the default text.

After the textarea, there is a div element with a class of "properties". This div contains the controls for customizing the voice. There are four labels, each associated with a corresponding control: a select element for the voice and range inputs for pitch, rate, and volume.

<div class="wrapper">
    <label for="text">JavaScript Text to speak:</label>
    <textarea id="text">JavaScript Text To Speech (TTS) With Voice Customization By ByteWebster</textarea>
    <div class="properties">
        <label for="voice">Voice:</label>
        <select id="voice"></select>
        <div></div> <!-- empty cell: fills the third grid column in the voice row -->

        <label for="pitch">Pitch:</label>
        <input id="pitch" type="range" min="0.1" max="2" step="0.1" value="1" />
        <output for="pitch">1</output>

        <label for="rate">Rate:</label>
        <input id="rate" type="range" min="0.1" max="2" step="0.1" value="1" />
        <output for="rate">1</output>

        <label for="volume">Volume:</label>
        <input id="volume" type="range" min="0" max="1" step="0.1" value="1" />
        <output for="volume">1</output>
    </div>
    <button id="speak"><i class="bi bi-megaphone"></i> Speak Text</button>
</div>

The pitch and rate sliders run from 0.1 to 2 and the volume slider from 0 to 1, all in steps of 0.1 with a starting value of 1. The select element for choosing a voice is empty in the markup and is filled in by JavaScript. The current value of each range input is shown in the output element linked to that input.
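
As a rough illustration, the min, max, and step attributes on the three sliders can be mirrored in a small JavaScript table. The SLIDER_LIMITS object and the snapToSlider() helper below are hypothetical names and not part of the project. They only sketch how a value would be clamped and rounded to the same 0.1 grid the range inputs use.

```javascript
// Hypothetical mirror of the min/max/step attributes from the HTML above.
const SLIDER_LIMITS = {
  pitch:  { min: 0.1, max: 2, step: 0.1 },
  rate:   { min: 0.1, max: 2, step: 0.1 },
  volume: { min: 0,   max: 1, step: 0.1 },
};

// Clamp a value into a slider's range and round it to the nearest step.
function snapToSlider(name, value) {
  const { min, max, step } = SLIDER_LIMITS[name];
  const clamped = Math.min(max, Math.max(min, value));
  // Count whole steps from min, then round away floating-point noise.
  // toFixed(1) is enough here because every step in the table is 0.1.
  const steps = Math.round((clamped - min) / step);
  return Number((min + steps * step).toFixed(1));
}

console.log(snapToSlider('pitch', 3));     // above max, clamped to 2
console.log(snapToSlider('volume', 0.47)); // rounded to the nearest step, 0.5
```

The browser already enforces these limits on the sliders themselves, so a helper like this would only matter if the values came from somewhere else, such as saved settings.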

The final element is a button with the text "Speak Text" and the id "speak". This button triggers the text-to-speech (TTS) feature, which reads the text from the textarea using the chosen voice and the selected pitch, rate, and volume.

Now that we have seen the HTML structure, let's take a look at the CSS code that will help us style this text-to-speech generator.

Styling With CSS

CSS is used to style the HTML elements that make up this project. Let's go over each rule and see what it does. First we style the wrapper: .wrapper is a class selector that targets the wrapper div element and uses display: grid to create a grid layout.

The gap property adds space between the grid items. The width property sets the width of the grid container to 650px, and max-width caps it at the viewport width minus 40px (calc(100vw - 40px)) so that the container shrinks on narrow screens.

After that, #text is an ID selector used to style the textarea element. It sets display: block and a height of 200px, and resize: vertical lets the user change only the height. The box-shadow property creates a subtle shadow effect for the element.

.wrapper {
    display: grid;
    gap: 20px;
    width: 650px;
    max-width: calc(100vw - 40px);
    padding: 30px;
    border-radius: 10px;
    background-color: #0003;
}

#text {
    display: block;
    height: 200px;
    box-shadow: rgba(0, 0, 0, 0.1) 0px 1px 3px 0px, rgba(0, 0, 0, 0.06) 0px 1px 2px 0px;
    padding: 20px;
    border: none;
    font-size: inherit;
    font-family: inherit;
    resize: vertical;
    border-radius: 20px;
}

.properties {
    display: grid;
    grid-template-columns: max-content minmax(0, auto) 40px;
    gap: 20px;
    padding: 20px;
    border-radius: 4px;
    background-color: #0003;
}

#voice {
    border: 2px solid #ccc;
    border-radius: 5px;
    padding: 5px;
    font-size: 16px;
    width: 430px;
}

#speak {
    padding: 10px;
    border: 1px solid #fff;
    border-radius: 20px;
    color: #fff;
    background-color: #0009;
    font-size: inherit;
    font-family: inherit;
    cursor: pointer;
    appearance: none;
}
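
To make the width and max-width pairing on .wrapper concrete, the arithmetic can be worked through in a few lines of JavaScript. The wrapperWidth() function is a hypothetical helper for illustration only. It ignores padding and box-sizing, which the snippet above does not specify.

```javascript
// Used width of .wrapper for a given viewport width, both in CSS pixels.
function wrapperWidth(viewportWidth) {
  const preferred = 650;                        // width: 650px
  const cap = Math.max(0, viewportWidth - 40);  // max-width: calc(100vw - 40px)
  return Math.min(preferred, cap);              // max-width wins when it is smaller
}

console.log(wrapperWidth(1280)); // 650
console.log(wrapperWidth(375));  // 335
```

On a wide desktop viewport the wrapper keeps its 650px width, while on a 375px phone viewport the cap of 335px takes over.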

Next, .properties is a class selector used to style the div element containing the voice customization controls. It uses display: grid to create a three-column layout: one column sized to the labels (max-content), one flexible column for the controls, and a 40px column for the output values. The empty div after the select in the HTML fills that third column in the voice row.

The #voice ID selector styles the select element for choosing a voice. It sets a border and border-radius to create a rounded border around the element, and padding to add some space between the text and the edges of the element. Finally, the #speak ID selector styles the "Speak Text" button. It sets padding to add some space between the text and the edges of the button.

JavaScript Explanation

Now we come to the most important part, the JavaScript. Without it the page cannot speak at all, so let's look at the role JavaScript plays in this project.

This JavaScript code is responsible for handling the text-to-speech functionality of the web page. It uses the Web Speech API to convert text into speech.
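
Not every environment provides the speech synthesis part of the Web Speech API, so a page may want to check for it before wiring anything up. The project itself does not include such a check. The sketch below is one possible way to do it, and supportsSpeechSynthesis() is a hypothetical helper name.

```javascript
// Returns true when the given global object exposes speech synthesis.
function supportsSpeechSynthesis(globalObj) {
  return typeof globalObj === 'object'
    && globalObj !== null
    && 'speechSynthesis' in globalObj
    && typeof globalObj.SpeechSynthesisUtterance === 'function';
}

// In a browser, globalThis is the window object.
if (!supportsSpeechSynthesis(globalThis)) {
  console.log('Speech synthesis is not available in this environment.');
}
```

Passing the global object in as a parameter keeps the check easy to try out with plain objects, without needing a real browser.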

The code starts by getting references to various HTML elements using the getElementById() and querySelector() methods. It then sets up event listeners for changes to the pitch, rate, and volume input elements using the addEventListener() method.

The updateOutputs() function is called whenever one of these inputs changes, and it updates the text content of each output element with the current value of its input.

// Form controls
const textEl = document.getElementById('text');
const voiceInEl = document.getElementById('voice');
const pitchInEl = document.getElementById('pitch');
const rateInEl = document.getElementById('rate');
const volumeInEl = document.getElementById('volume');
// <output> elements that display the slider values
const pitchOutEl = document.querySelector('output[for="pitch"]');
const rateOutEl = document.querySelector('output[for="rate"]');
const volumeOutEl = document.querySelector('output[for="volume"]');
const speakEl = document.getElementById('speak');

// 'input' fires while a slider is being dragged, so the outputs update live
pitchInEl.addEventListener('input', updateOutputs);
rateInEl.addEventListener('input', updateOutputs);
volumeInEl.addEventListener('input', updateOutputs);
speakEl.addEventListener('click', speakText);

// Fill the voice list now, and again if the browser loads voices later
updateVoices();
window.speechSynthesis.onvoiceschanged = updateVoices;

// Mirror each slider's current value into its <output> element
function updateOutputs() {
  pitchOutEl.textContent = pitchInEl.value;
  rateOutEl.textContent = rateInEl.value;
  volumeOutEl.textContent = volumeInEl.value;
}

// Add any voices that are not yet in the <select>, keyed by voiceURI
function updateVoices() {
  window.speechSynthesis.getVoices().forEach(voice => {
    const isAlreadyAdded = [...voiceInEl.options].some(option => option.value === voice.voiceURI);
    if (!isAlreadyAdded) {
      // The browser's default voice is pre-selected
      const option = new Option(voice.name, voice.voiceURI, voice.default, voice.default);
      voiceInEl.add(option);
    }
  });
}

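function speakText() {
  // Stop any speech that is still in progress
  window.speechSynthesis.cancel();

  const text = textEl.value;
  const utterance = new SpeechSynthesisUtterance(text);
  // Fall back to the browser's default voice if no match is found
  utterance.voice = window.speechSynthesis.getVoices().find(voice => voice.voiceURI === voiceInEl.value) || null;
  // Input values are strings; convert them to numbers explicitly
  utterance.pitch = Number(pitchInEl.value);
  utterance.rate = Number(rateInEl.value);
  utterance.volume = Number(volumeInEl.value);

  window.speechSynthesis.speak(utterance);
}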
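
The updateOutputs() function above rewrites all three outputs on every event. As an alternative sketch, each slider could be paired with its own output through the output's "for" attribute. The syncOutput() and bindOutputs() names are hypothetical, and formatting to one decimal place is an assumption made here so that "1" displays as "1.0".

```javascript
// Copy a slider's value into its paired <output>, formatted to one decimal.
// Works with any objects that have `value` and `textContent` properties.
function syncOutput(inputEl, outputEl) {
  outputEl.textContent = Number(inputEl.value).toFixed(1);
}

// Wire every range input to the <output> whose `for` attribute names it.
function bindOutputs(root) {
  root.querySelectorAll('input[type="range"]').forEach(inputEl => {
    const outputEl = root.querySelector(`output[for="${inputEl.id}"]`);
    if (!outputEl) return; // no matching <output>, nothing to update
    inputEl.addEventListener('input', () => syncOutput(inputEl, outputEl));
    syncOutput(inputEl, outputEl); // show the initial value
  });
}

// Only touch the DOM when one exists (for example, not under Node).
if (typeof document !== 'undefined') {
  bindOutputs(document);
}
```

With this approach, adding a fourth slider to the HTML would need no JavaScript changes, as long as it has a matching output element.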

The updateVoices() function is called when the page loads and again whenever the available voices change (the voiceschanged event), because some browsers load the voice list asynchronously. It retrieves the list of available voices using the getVoices() method and iterates over them, adding each voice that is not already listed to the select element used to choose the voice for text-to-speech.
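
The de-duplication step can be separated from the DOM and tried out with plain objects. The sketch below is an illustration and not the project's code: newVoiceOptions() is a hypothetical helper, the sample voices are made up, and adding the language tag to the label is an optional extra.

```javascript
// Return option descriptors for voices that are not listed yet.
// `voices` is an array of objects shaped like SpeechSynthesisVoice;
// `existingValues` holds the voiceURI strings already in the <select>.
function newVoiceOptions(voices, existingValues) {
  const seen = new Set(existingValues);
  const fresh = [];
  for (const voice of voices) {
    if (seen.has(voice.voiceURI)) continue; // already listed or duplicated
    seen.add(voice.voiceURI);
    fresh.push({
      label: `${voice.name} (${voice.lang})`,
      value: voice.voiceURI,
      selected: Boolean(voice.default),
    });
  }
  return fresh;
}

// Made-up sample data for illustration only.
const sampleVoices = [
  { name: 'Voice A', lang: 'en-US', voiceURI: 'voice-a', default: true },
  { name: 'Voice B', lang: 'de-DE', voiceURI: 'voice-b', default: false },
  { name: 'Voice A', lang: 'en-US', voiceURI: 'voice-a', default: true },
];

// 'voice-b' is already listed, so only one descriptor (for 'voice-a') is logged.
console.log(newVoiceOptions(sampleVoices, ['voice-b']));
```

In the browser, each descriptor could then be turned into an element with new Option(label, value, selected, selected), as the project's updateVoices() does.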

Finally, the speakText() function is called when the "Speak Text" button is clicked. It first cancels any speech still in progress, then reads the text from the textarea, creates a new SpeechSynthesisUtterance object with that text, and sets the voice, pitch, rate, and volume from the form controls before passing the utterance to speechSynthesis.speak().
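
Form controls report their values as strings, so one possible refinement is to parse them into numbers before they reach the utterance. The sketch below assumes a hypothetical utteranceSettings() helper and a hypothetical `form` object holding the raw values. The fallback of 1 matches the starting value of the sliders in the HTML.

```javascript
// Build plain settings for an utterance from raw form values (strings).
function utteranceSettings(form, voices) {
  const toNumber = (raw, fallback) => {
    const n = Number.parseFloat(raw);
    return Number.isFinite(n) ? n : fallback;
  };
  return {
    // null lets the browser fall back to its default voice.
    voice: voices.find(v => v.voiceURI === form.voiceURI) || null,
    pitch: toNumber(form.pitch, 1),
    rate: toNumber(form.rate, 1),
    volume: toNumber(form.volume, 1),
  };
}

// Browser-only wiring; returns false where speech synthesis is missing.
function speak(text, form) {
  if (typeof speechSynthesis === 'undefined') return false;
  const settings = utteranceSettings(form, speechSynthesis.getVoices());
  const utterance = Object.assign(new SpeechSynthesisUtterance(text), settings);
  speechSynthesis.cancel(); // stop anything still being spoken
  speechSynthesis.speak(utterance);
  return true;
}
```

Keeping the parsing in its own function means the number handling can be checked with plain objects, while speak() stays a thin layer over the browser API.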

We are grateful for your time and attention, and we trust that you have found the project to be interesting.

Download Source Code Files

The source code files for this JavaScript Text To Speech project are available for download.
If you are just starting out in web development, these snippets will be useful. We would appreciate it if you shared our blog posts with other like-minded people.
