DESIGN AND IMPLEMENTATION OF TEXT-TO-SPEECH/AUDIO SYSTEM

  • Type: Project
  • Department: Computer Science
  • Project ID: CPU0233
  • Access Fee: ₦5,000 ($14)
  • Chapters: 5 Chapters
  • Pages: 58 Pages
  • Methodology: Nil
  • Reference: YES
  • Format: Microsoft Word
  • Views: 2.7K

ABSTRACT

Nearly everyone uses a computer for one reason or another. For people with poor eyesight, reading text on a screen is a persistent problem, especially when the font size is small. This has motivated the design of a text-to-speech system capable of converting written text to speech. Speech synthesis systems are often called text-to-speech (TTS) systems in reference to their ability to convert text into speech. However, systems exist that instead render symbolic linguistic representations, such as phonetic transcriptions, into speech. A text-to-speech system is composed of two parts: a front-end and a back-end. Broadly, the front-end takes input in the form of text and outputs a symbolic linguistic representation. The back-end takes the symbolic linguistic representation as input and outputs the synthesized speech waveform. TTS software can "read" text from a document, Web page or e-book, generating synthesized speech through a computer's speakers. TTS can also convert text files into MP3 audio files that can then be transferred to a portable MP3 player or CD-ROM. This can save time by allowing the user to listen to reports or background materials while performing other tasks. TTS makes a critical difference to those with disabilities such as poor vision or visual dyslexia. People with speech loss can use specialized TTS programs to turn typed words into vocalization. TTS programs are also a valuable aid for learning new languages. This thesis was implemented using the Java programming language for front-end design and MySQL for data storage.

CHAPTER ONE

1.1 BACKGROUND OF THE STUDY

Language is the ability to express one’s thoughts by means of a set of signs (text), gestures, and sounds. It is a distinctive feature of human beings, who are the only creatures to use such a system. Speech is the oldest means of communication between people and it is also the most widely used. ‘Speech synthesis’, also called ‘text-to-speech synthesis’, is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer and can be implemented in software. A text-to-speech (TTS) system simply converts text to speech. Many computer operating systems have included speech synthesizers since the early 1990s. Recent progress in speech synthesis has produced synthesizers with very high intelligibility, but sound quality and naturalness remain a major problem. However, the quality of present products has reached an adequate level for several applications, such as multimedia and telecommunications. This thesis presents a brief overview of the main text-to-speech synthesis problems and the initial work done in building a TTS system for English.

At first sight, this task does not look too hard to perform. After all, we all have a deep knowledge of the reading rules of our mother tongue. They were taught to us, in a simplified form, at primary school, and we improved them year after year. But in the context of TTS synthesis, it is impossible to record and store all the words of a language, so some other method has to be used. The quality of a speech synthesizer is judged by its similarity to the human voice and by its ability to be understood. A text-to-speech synthesizer allows people with visual impairments and reading disabilities to listen to written works on a home computer. Astrophysicist Stephen Hawking, who was almost completely paralyzed, gave his lectures using a TTS system.

Text-to-speech (TTS) synthesis is the automatic conversion of a text into speech that resembles, as closely as possible, a native speaker of the language reading that text. A text-to-speech/audio system is the technology that lets a computer speak to you. The TTS system takes text as input; a computer algorithm called the TTS engine then analyses the text, pre-processes it, and synthesizes the speech using mathematical models. The TTS engine usually generates sound data in an audio format as the output. The TTS synthesis procedure consists of two main phases. The first is text analysis, where the input text is transcribed into a phonetic or some other linguistic representation. The second is the generation of speech waveforms, where the output is produced from this phonetic and prosodic information. These two phases are usually called high-level and low-level synthesis. The input text might be, for example, data from a word processor, standard ASCII from e-mail, a mobile text message, or scanned text from a newspaper. The character string is pre-processed and analyzed into a phonetic representation, usually a string of phonemes with additional information for correct intonation, duration, and stress. Finally, the low-level synthesizer generates the speech sound from the information supplied by the high-level one. The artificial production of speech-like sounds has a long history, with documented mechanical attempts dating to the eighteenth century [O'Shaughnessy 2004].
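
The two phases described above can be illustrated with a minimal Python sketch. It is not the thesis's Java implementation. The toy lexicon (ARPAbet-style symbols with stress markers omitted), the Phone record, the fixed 80 ms duration, the 120 Hz base pitch, and the sine-tone back-end are all assumptions made for illustration; a real low-level synthesizer would produce speech sounds, not tones.

```python
from dataclasses import dataclass
import math

# Hypothetical toy lexicon: word -> phoneme symbols (illustrative only).
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


@dataclass
class Phone:
    symbol: str        # phoneme label
    duration_ms: int   # assumed duration of the phone
    pitch_hz: float    # assumed target pitch of the phone


def high_level_synthesis(text, base_pitch_hz=120.0):
    """Text analysis: turn raw text into phonemes with very simple prosody."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    phones = []
    for word in (w for w in words if w):
        symbols = LEXICON.get(word)
        if symbols is None:
            raise KeyError(f"word not in toy lexicon: {word!r}")
        phones.extend(Phone(s, 80, base_pitch_hz) for s in symbols)
    if phones:
        # Crude prosody rule: lengthen and lower the final phone.
        phones[-1].duration_ms *= 2
        phones[-1].pitch_hz *= 0.8
    return phones


def low_level_synthesis(phones, sample_rate=8000):
    """Waveform generation: placeholder that emits one sine tone per phone."""
    samples = []
    for phone in phones:
        n_samples = sample_rate * phone.duration_ms // 1000
        for i in range(n_samples):
            angle = 2 * math.pi * phone.pitch_hz * i / sample_rate
            samples.append(math.sin(angle))
    return samples


phones = high_level_synthesis("Hello, world.")
waveform = low_level_synthesis(phones)
print([p.symbol for p in phones])
print(len(waveform), "samples")
```

The point of the split is that the high-level stage knows nothing about audio and the low-level stage knows nothing about spelling; the list of Phone records is the only interface between them.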

Speech synthesis can be described as the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware. A text-to-speech (TTS) system converts normal language text into speech. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output.
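
As a rough illustration of concatenative synthesis, the Python sketch below joins stored units end to end and wraps the result in a WAV container. The unit names, the constant-valued stand-in "recordings", and the 8 kHz sample rate are hypothetical; a real unit database would hold recorded speech, and a real system would smooth the joins between units.

```python
import io
import struct
import wave

SAMPLE_RATE = 8000  # assumed sample rate in Hz


def fake_unit(level, n_samples=400):
    """Stand-in for a recorded unit: constant 16-bit samples (not real speech)."""
    return [level] * n_samples


# Hypothetical unit database: unit name -> recorded samples.
UNIT_DB = {
    "h-e": fake_unit(1000),
    "e-l": fake_unit(2000),
    "l-o": fake_unit(3000),
}


def concatenate(unit_names, db=UNIT_DB):
    """Join the stored units in order, with no smoothing at the boundaries."""
    samples = []
    for name in unit_names:
        samples.extend(db[name])
    return samples


def to_wav_bytes(samples, sample_rate=SAMPLE_RATE):
    """Pack 16-bit mono samples into an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


speech = concatenate(["h-e", "e-l", "l-o"])
wav_bytes = to_wav_bytes(speech)
print(len(speech), "samples,", len(wav_bytes), "bytes of WAV data")
```

Smaller units such as diphones keep the database small and can cover any word, while whole-word or whole-sentence units need far more storage but leave fewer audible joins, which matches the trade-off described above.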

1.2 STATEMENT OF THE PROBLEM

The importance of text cannot be overemphasized. Hardly anyone can pass on a message without including text in one form or another. This is a problem for the visually impaired, who find it hard to read text, especially when the font size is small. This has led to the development of a text-to-speech conversion system. People with learning disabilities, and some with low literacy levels, often get frustrated trying to browse the internet because so much of it is in text form.

The problem area in speech synthesis is also very wide, even in already developed speech synthesizers. Text pre-processing alone raises several problems, such as the handling of numerals, abbreviations, and acronyms. This system will help solve these problems by using a well-written synthesis algorithm for the conversion.
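
To make the pre-processing problems concrete, here is a minimal Python sketch of text normalization. It is only one possible approach and not the algorithm used in this thesis. The abbreviation table, the 0 to 999 number range, and the rule that any all-capitals token is spelled letter by letter are assumptions chosen for illustration.

```python
import re

# Hypothetical abbreviation table; a real system would need a much larger one.
ABBREVIATIONS = {"dr.": "doctor", "mr.": "mister", "etc.": "et cetera"}

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 0 to 999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"
    hundreds, rest = divmod(n, 100)
    words = f"{ONES[hundreds]} hundred"
    return words if rest == 0 else f"{words} and {number_to_words(rest)}"


def normalize(text):
    """Expand abbreviations, numerals, and acronyms into speakable words."""
    out = []
    for token in text.split():
        lower = token.lower()
        if lower in ABBREVIATIONS:
            out.append(ABBREVIATIONS[lower])
        elif re.fullmatch(r"\d{1,3}", token):
            out.append(number_to_words(int(token)))
        elif re.fullmatch(r"[A-Z]{2,}", token):
            out.append(" ".join(token))  # spell acronyms letter by letter
        else:
            out.append(lower)
    return " ".join(out)


print(normalize("Dr. Obi sent 42 SMS messages"))
# prints: doctor obi sent forty two S M S messages
```

The sketch also shows why the problem is wide: a token with trailing punctuation such as "42," falls through to the last branch unchanged, and an acronym that is normally read as a word would be spelled out incorrectly, so each rule needs context that a simple table cannot supply.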

Even for people who are able to read, the process can often cause too much strain to be of any use or enjoyment. With text-to-speech, people with visual impairment can take in all manner of content in comfort rather than under strain.

1.3 OBJECTIVES OF STUDY

The main objective of this study is to design and implement a text-to-speech/audio system. The system focuses on the following specific objectives:

- To design and implement a speech synthesizer that converts text to audio.

- To design and implement a system that can read out text at any frequency the user specifies.

- To design and implement a speech synthesizer that can read out text in both female and male voices.

1.4 SIGNIFICANCE OF THE STUDY

The significance of this study is as follows:

1. The application will provide a platform to aid people with disabilities, especially reading disabilities, and help them get information easily and without stress.

2. The project could also help children learn how to pronounce words and how to read.

3. The study will serve as a foundation and guide for other research students interested in researching text-to-speech systems.
