Java and C++ Video Tutorial – Using the Java Native Interface to create a Text-to-Speech application

This chart describes what I want to create in this video. It is a simple program consisting of two applications. The first application is written in Java. Thanks to the Java Native Interface (JNI), it integrates with an application written in C++. The second application, the C++ one, uses the Component Object Model (COM) and the Microsoft Speech API. It instructs the TTS engine to speak with chosen voices. This C++ application is a Dynamic Link Library (DLL). To integrate the two applications I have to generate a header file with the javah.exe utility. This tool creates it from the Java class in which the native methods are declared.

(Diagram: Java application → JNI → C++ DLL → SAPI TTS engine)

I start by creating the native methods in Java. I open NetBeans and create a Java application. I create the Tts class and declare one native method which requires two parameters: the language to use and the text to speak. I initially wrote boolean as the result type, but instead I create a special class called SayResult. It has two fields, success and message. I generate setters and getters and compile the project.

package net.keinesorgen.tts;

public class Tts {

    public native SayResult say(String language, String text) throws Exception;
}

package net.keinesorgen.tts;

public class SayResult {

    private boolean success;
    private String message;

    public SayResult() {
    }

    @Override
    public String toString() {
        return success + ", " + message;
    }

    /**
     * @return the success
     */
    public boolean isSuccess() {
        return success;
    }

    /**
     * @param success the success to set
     */
    public void setSuccess(boolean success) {
        this.success = success;
    }

    /**
     * @return the message
     */
    public String getMessage() {
        return message;
    }

    /**
     * @param message the message to set
     */
    public void setMessage(String message) {
        this.message = message;
    }
}

package net.keinesorgen.tts;

public class Start {

    public static void main(String[] args) throws Exception {

        System.loadLibrary("SayLibrary");
        Tts tts = new Tts();

        try {
            SayResult en = tts.say("Vendor=IVONA Software Sp. z o. o.;Language=809", "This is an example message");
            System.out.println("en=" + en);
        } catch (Exception ex) {
            System.err.println("Error " + ex);
        }
        try {
            SayResult de = tts.say("Vendor=IVONA Software Sp. z o. o.;Language=407", "Guten Tag. Ich heiße Martin. Ich habe die einfache Applikation gemacht.");
            System.out.println("de=" + de);
        } catch (Exception ex) {
            System.err.println("Error " + ex);
        }
        try {
            SayResult pl = tts.say("Vendor=IVONA Software Sp. z o. o.;Language=415", "Cześć. To jest prosta aplikacja.");
            System.out.println("pl=" + pl);
        } catch (Exception ex) {
            System.err.println("Error " + ex);
        }
    }
}

I'm going to generate the header file for this class. First I create the C++ application, so I open the Visual Studio C++ environment. I choose the DLL application type and check the "Export symbols" option. I can see one variable and one function which are exported by this library; they were generated automatically and I will replace them later. Now I open the Visual Studio Command Prompt and go to the project's directory. I use the javah utility; its options are listed here. I have to know where the compiled Java class is, so I point at the "classes" directory and enter the package and class path. I generate the file and can see the resulting JTTS.h header. I add this existing header file to the Visual Studio project. The environment tells me that it doesn't understand some types in this header file because it doesn't know where jni.h is. I have to indicate the directory containing the JNI header files: I go to Configuration Properties > C/C++ > General > Additional Include Directories and add the path to the include directory of my JDK. Now the environment understands everything.

/* DO NOT EDIT THIS FILE - it is machine generated */
#include <jni.h>
/* Header for class net_keinesorgen_tts_Tts */

#ifndef _Included_net_keinesorgen_tts_Tts
#define _Included_net_keinesorgen_tts_Tts
#ifdef __cplusplus
extern "C" {
#endif
/*
 * Class: net_keinesorgen_tts_Tts
 * Method: say
 * Signature: (Ljava/lang/String;Ljava/lang/String;)Lnet/keinesorgen/tts/SayResult;
 */
JNIEXPORT jobject JNICALL Java_net_keinesorgen_tts_Tts_say
 (JNIEnv *, jobject, jstring, jstring);

#ifdef __cplusplus
}
#endif
#endif

I'm going to write the code which uses the Microsoft Speech API and the voices installed on my computer. I go back to Visual Studio and remove the auto-generated exported class, function and variable. I declare the "say" function here. It is exported by the DLL, so other applications, written in C++ or another language, will be able to call it. However, this function can't be called directly through the Java Native Interface; I will call it from the implementation of the JTTS.h header file later. The result type should be similar to the type in the Java application, so I define the SayResult structure. The types in C++ differ from the types in Java, so it is necessary to convert the data between them.

#ifdef SAYLIBRARY_EXPORTS
#define SAYLIBRARY_API __declspec(dllexport)
#else
#define SAYLIBRARY_API __declspec(dllimport)
#endif

struct SayResult {
	bool success;
	const char* message; // points at a string literal set by say()
};


SAYLIBRARY_API SayResult say(const wchar_t *language, const wchar_t *text);

I define the body of the "say" function; this is the result type. I use the Component Object Model and the Microsoft Speech API here. In my previous videos I wrote code that does the same job, so visit my GitHub profile and my blog; you will find the code and an explanation of each method I'm writing now. I add the necessary header files to the common header file, so I will be able to use COM and SAPI. I declare the voice and the token. I initialise the Component Object Model. If something goes wrong I return a failure message. I code the release block and clean up.

#include "stdafx.h"
#include "SayLibrary.h"
#include <sapi.h>
#include <sphelper.h>

SAYLIBRARY_API SayResult say(const wchar_t * language, const wchar_t * text)
{
	SayResult result;

	// using SAPI 5
	ISpVoice *pVoice(NULL);
	ISpObjectToken *cpToken(NULL);

	// COM initialize
	if (SUCCEEDED(::CoInitialize(NULL))) {

		if (SUCCEEDED(CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL, IID_ISpVoice, (void **)&pVoice))) {

			// find voice
			if (SUCCEEDED(SpFindBestToken(SPCAT_VOICES, language, L"", &cpToken))) {

				// set the voice
				if (SUCCEEDED(pVoice->SetVoice(cpToken))) {

					// speak it
					if (SUCCEEDED(pVoice->Speak(text, 0, NULL))) {
						result.success = true;
						result.message = "Success";
					}
					else {
						result.success = false;
						result.message = "I can not speak this text with this voice";
					}
				}
				else {
					result.success = false;
					result.message = "I can not set this voice";
				}
			}
			else {
				result.success = false;
				result.message = "I can not find this voice";
			}
		}
		else {
			result.success = false;
			result.message = "I can not create Voice instance";
		}

		// COM uninitialize and resources releasing
		if (NULL != pVoice) {
			pVoice->Release();
			pVoice = NULL;
		}
		if (NULL != cpToken) {
			cpToken->Release();
			cpToken = NULL;
		}

		::CoUninitialize();
	}
	else {
		result.success = false;
		result.message = "I can not initialize COM";
	}

	return result;
}

I create the instance of ISpVoice. I need to know its class ID; I found it in the SAPI documentation. The address of the created instance is stored in the "pVoice" pointer. I handle the error scenario. Next I get the best-matched voice token. I pass this attribute string into the "say" function because I'm going to define the voice attributes in the Java application; it belongs in configuration because it strongly depends on the user's environment. I set the voice here and code the error scenario. If everything goes right, I call the "Speak" function and pass the text. I handle the error scenario again. If execution reaches this point, the "Speak" function succeeded.

Next, I have to implement the body of the function generated by the javah utility, declared in the JTTS.h header file. I can see the "jobject" result type here instead of the SayResult type. This is how JNI works: the data types are different, and I have to use something like reflection. I use the JNIEnv pointer to create the SayResult instance. If something goes wrong here I want to throw an exception to the Java application, so I code that case. I use the constructor of the SayResult class to create the instance and throw the exception on failure. Once I have the object, I set the "success" and "message" fields. The "()V" and "(Z)V" signatures are described in the JNI documentation: "Z" means the "boolean" type and "V" means "void". I write a "TODO" comment here, then create a new function dedicated to building this result. I copy and paste the code, call the "createResult" function and pass the parameters.

I convert Java strings into C++ strings. To do that I code a new function, using the standard I/O stream library. I call the "say" function here. I create the converting function; it needs the JNI environment too. I compile the application and can see that the DLL file has appeared. I use the DumpBin utility to print the exports of this library. I can see two exported functions. The first function is general purpose; I can use it in any C++ application. The second is designed to be called by Java applications.

#include "stdafx.h"
#include "JTTS.h"
#include "SayLibrary.h"
#include <string>
#include <iostream>

using namespace std;

void javaException(JNIEnv * env, const char* message) {
	env->ThrowNew(env->FindClass("java/lang/Exception"), message);
}

jobject createResult(JNIEnv * env, bool success, const char* message) {
	// create net.keinesorgen.tts.SayResult instance
	jobject result(NULL);
	jclass resultClass = env->FindClass("net/keinesorgen/tts/SayResult");
	if (NULL != resultClass) {
		jmethodID resultClassConstructor = env->GetMethodID(resultClass, "<init>", "()V");
		if (NULL != resultClassConstructor) {
			result = env->NewObject(resultClass, resultClassConstructor);
			if (NULL != result) {
				env->CallVoidMethod(result, env->GetMethodID(resultClass, "setSuccess", "(Z)V"), success);
				jstring jMessage = env->NewStringUTF(message);
				env->CallVoidMethod(result, env->GetMethodID(resultClass, "setMessage", "(Ljava/lang/String;)V"), jMessage);
			}
			else {
				javaException(env, "I can not create result instance");
			}
		}
		else {
			javaException(env, "I can not find constructor for result instance");
		}
	}
	else {
		javaException(env, "I can not find result type");
	}
	return result;
}

wstring convertJString(JNIEnv * env, jstring candidate) {
	const jchar *raw = env->GetStringChars(candidate, 0);
	jsize len = env->GetStringLength(candidate);
	wstring result;
	result.assign(raw, raw + len);
	env->ReleaseStringChars(candidate, raw);
	return result;
}

JNIEXPORT jobject JNICALL Java_net_keinesorgen_tts_Tts_say(
	JNIEnv * env, jobject jo, jstring language, jstring text)
{
	// convert jstring into const wchar_t * and say
	wstring sayLanguage = convertJString(env, language);
	wstring sayText = convertJString(env, text);

	SayResult result = say(sayLanguage.c_str(), sayText.c_str());
	return createResult(env, result.success, result.message);
}

I open the NetBeans project and the library I have just created. I call System.loadLibrary, create the Tts object and try to say something with the "say()" method. It is very important to run the Java application with the "java.library.path" parameter (or LD_LIBRARY_PATH); you have to tell the JVM where your library is. I compile the Java application and see the first execution error: "Can't load IA 32-bit .dll on a AMD 64-bit platform". I had forgotten that I need to compile the DLL for the 64-bit platform. I go back to Visual Studio, change the platform and compile the C++ library again. The library is compiled into another directory, so I have to change this path in NetBeans.
I print the result of the "Tts.say()" method and run the application. I can see the error scenario has just happened: I got the error "I can not find this voice". That is expected, because I didn't define the right voice attributes. In my GitHub profile you can find example language attributes; they depend on the user's environment. Watch my previous videos to learn how to create a valid language string. I copy these voice attributes, run the application and it works. I use the other voices on my computer; the application says something in English, German and Polish.

C++ Video Tutorial : Loading a DLL and calling a function

In my previous video I described what a Dynamic Link Library is. Now I will try using one simple library and calling an easy function. There are three important functions for that: LoadLibrary, GetProcAddress and FreeLibrary. They are declared in the <windows.h> header file.

Open your Visual Studio C++ environment and create a console application. I include the Windows header file and copy the syntax of the LoadLibrary function. If the application can't load the library, I will print information about it. I include the I/O stream header file and use the standard namespace. I have to know the function names, so I open Dependency Walker and analyse my library.

I want to get the address of the entry point of a function, so I copy the syntax of the GetProcAddress function. I know only the name of the function, but if I want to call it I need to know its parameters and its result type. That information is not visible in the library; I have to find the prototype of the function, for example in documentation published on the Internet. Let's find the avcodec header file. Google recommended a couple of results.

I define the function pointer types to which the GetProcAddress result will be cast. I compile and run the application. I get an error: the application wasn't able to load the library because it isn't on the path. I copy the library into the right place, compile and run again; it works. I need to free the library too, so I copy the syntax of the FreeLibrary function. I compile and run the application once more. It works.

I want to show you one more thing. I divide the execution of the application into parts; after each part I press Enter to continue. Thanks to this I can show you how the memory consumption changes. I compile and run the application, open the Task Manager and find my application. We can see that after loading the library the memory consumption increases, and after calling FreeLibrary it decreases.

#include "stdafx.h"
#include <Windows.h>
#include <iostream>

using namespace std;

// unsigned avcodec_version(void);
typedef UINT(CALLBACK* AvCodecVersion)(void);
// const char *avcodec_license(void); 
typedef char*(CALLBACK* AvLicense)(void);

int _tmain(int argc, _TCHAR* argv[])
{
	HMODULE dll = LoadLibrary(L"avcodec-55.dll");
	if (NULL != dll) {		

		// version
		{
			AvCodecVersion procedure = (AvCodecVersion)GetProcAddress(dll, "avcodec_version");
			if (NULL != procedure) {
				UINT response = procedure();
				cout << "Version : " << response << endl;
			}
			else {
				cout << "Procedure av version not found" << endl;
			}
		}

		// license
		{
			AvLicense procedure = (AvLicense)GetProcAddress(dll, "avcodec_license");
			if (NULL != procedure) {
				char* response = procedure();
				cout << "License : " << response << endl;
			}
			else {
				cout << "Procedure av licence not found" << endl;
			}
		}

		BOOL freeResponse = FreeLibrary(dll);
		if (freeResponse==0) {
			cout << "Error during library releasing";
		}
	}
	else {
		cout << "There is no DLL library" << endl;
	}
	return 0;
}

C++ Tutorial : What is a DLL?

As programmers, we often don't need to program everything ourselves. A lot of complicated algorithms and solutions have already been developed and tested by others; we have a big open source community. Today you are often expected to use existing solutions and connect them with your code instead of reinventing the wheel. As a result, big software, your product, can appear in a short time.

In every operating system and programming language there is a concept of a library. Today I'm talking about Dynamic Link Libraries (DLLs for short), which exist in the Microsoft Windows operating system and can be developed by you in your C++ environment, such as Visual Studio C++.

I explain the DLL acronym this way. Library means you can isolate pieces of your code from one another and divide your application into many problem domains; thanks to this you can share or even sell them. Other advantages of using libraries are versioning and swapping: many applications are updated automatically, which usually means some older libraries are exchanged for their new versions without a reinstallation of the whole application. Link means your library lives outside your application: its logic, resources and implementation are compiled separately and aren't included directly in your application. And finally, Dynamic means your library can be loaded on demand. It need not be loaded into memory if you don't want to use it right now; you decide when to load and use it, and as a result you save memory. It's also very important that one DLL can be shared between many applications. The most popular Microsoft libraries are used by almost all your applications. Shared means that in the memory of your operating system there is only one instance of the loaded library, and each application uses it.

There are two types of libraries in a Microsoft system. First, there are the libraries delivered with your operating system and with software development kits (SDKs for short). These libraries are located, for example, in the "C:\Windows\System32" directory. The most popular and crucial system libraries are cached as KnownDLLs and are enumerated in the system registry under the "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\KnownDLLs" node. Here we have such important libraries as user32.dll and kernel32.dll.

The second type consists of the libraries that are part of applications you have downloaded and installed. These libraries may be written by the vendor of the application, for example because the vendor wants to separate them from the application in order to update them one day, or to load them into memory only when you use a specific part of the application. These libraries are also often separated because they were written by the open source community or by another vendor; they are usually called third-party libraries.

Here we have an application called Audacity, software for sound processing. You can use it for recording and processing your voice and music. This application uses third-party libraries; I show them here, in this folder, and here, in this one. You can see the application uses avcodec-55.dll, a library available in the open source space which everyone can use for encoding or decoding media. Audacity is of course great software, but like many other applications you are using right now, it partly owes its success to programmers from the open source community who developed these libraries and published them.

Dynamic link libraries are a kind of container for your compiled code and resources, for example bitmaps or strings, so they are often used to solve localisation problems and make your software available in many languages.

How do you use and inspect libraries? Let me present some tools.

The rundll32.exe tool is available in your operating system. With it I can run a particular function from a DLL from the command line. The syntax is described at this website. I demonstrate calling some functions from crucial system libraries; I have prepared a list of example commands, which you can find at this website. Here I am opening the Control Panel window.

RunDll32.exe shell32.dll,Control_RunDLL

This function is called Control_RunDLL and is located in shell32.dll. By calling the SetSuspendState function in powrprof.dll I order the system to hibernate.

RunDll32.exe powrprof.dll,SetSuspendState

What’s more, this command is used for locking your screen.

RunDll32.exe user32.dll,LockWorkStation

And this one can be used for opening the system window for unplugging and ejecting hardware.

RunDll32.exe shell32.dll,Control_RunDLL hotplug.dll

These functions are declared in Windows.h (the Windows header file) and you can call them in your source code after including it.

The other important tool is Dependency Walker; you can download it from this website. Let's drag and drop a dynamic link library into its window. This utility provides a lot of information about the DLL. The left pane represents the dependency tree: here you can see which libraries your library depends on. The upper right pane shows the functions which are imported into your library from other libraries. The middle right pane lists all the functions your DLL exports for use by others. Finally, the bottom pane lists all the DLLs loaded in memory, with additional information like file timestamp, link timestamp, file size, version, etc. Here we can see the functions we called with the rundll32 utility a couple of minutes ago. Here we can see that avcodec-55.dll depends on the kernel32.dll, user32.dll and avutil-52.dll libraries; it requires them to work. If you don't copy avutil-52.dll along with avcodec-55.dll, it won't work. Here we can see which functions from avutil-52.dll are called by avcodec-55.dll, and here all the functions available in avcodec-55.dll.

The next utility I'll show you is the dumpbin.exe tool. It is available from the Visual Studio Command Prompt: I open my Visual Studio environment, click Tools and click Visual Studio Command Prompt. The description of the DumpBin utility is published at this website. With this command I can, for example, display all exported functions. This is the Relative Virtual Address (RVA for short) of the entry point of a function, and this is the name of the function.

Let's go back to the "FFmpeg for Audacity" directory. I'll try using one library and calling some functions. I open this library in the Dependency Walker utility and choose a simple function. I'm going to write a simple C++ application which calls it. The function avcodec_version should be easy. Of course this is only the name of the function; to call it I need to know its parameters and its result type. That information is not visible in the library. You have to find the declaration of the function, for example in documentation published on the Internet. Let's find the avcodec header file. Google recommended a couple of results.

unsigned avcodec_version(void);

const char *avcodec_configuration(void);

const char *avcodec_license(void);

I choose this one and find the avcodec_version function. OK, I've got it: "unsigned avcodec_version(void);"

I can see the function avcodec_license, "const char *avcodec_license(void);", is easy too. I use both functions.

Here is the source code of my simple application:

#include "stdafx.h"
#include <Windows.h>
#include <iostream>

using namespace std;

// unsigned avcodec_version(void);
typedef UINT(CALLBACK* AvCodecVersion)(void);
// const char *avcodec_license(void); 
typedef char*(CALLBACK* AvLicense)(void);

int _tmain(int argc, _TCHAR* argv[])
{
	HMODULE dll = LoadLibrary(L"avcodec-55.dll");
	if (NULL != dll) {		

		// version
		{
			AvCodecVersion procedure = (AvCodecVersion)GetProcAddress(dll, "avcodec_version");
			if (NULL != procedure) {
				UINT response = procedure();
				cout << "Version : " << response << endl;
			}
			else {
				cout << "Procedure av version not found" << endl;
			}
		}

		// license
		{
			AvLicense procedure = (AvLicense)GetProcAddress(dll, "avcodec_license");
			if (NULL != procedure) {
				char* response = procedure();
				cout << "License : " << response << endl;
			}
			else {
				cout << "Procedure av licence not found" << endl;
			}
		}

		BOOL freeResponse = FreeLibrary(dll);
		if (freeResponse==0) {
			cout << "Error during library releasing";
		}
	}
	else {
		cout << "There is no DLL library" << endl;
	}
	return 0;
}



An easy way to make a TTS (text-to-speech) application

This is the first video I have published on YouTube, and I made it in English, so please be nice and indulgent when you rate it.

I have been looking for the best speech synthesiser for a couple of days, one which I could use in my home-made applications. I program mostly in Java, but I haven't found any solution which is complete and has enough quality to satisfy me. I found the free library FreeTTS, which is written entirely in Java; I will talk about it some day. This is the link to the library's website. If you look at the release notes page, you will read that development of this library finished in 2005. FreeTTS speaks only English. On the left side you can see links to sound files; listen to how it sounds. It sounds like a robot. Would you like to have your e-book read aloud like this? I think you would be tired after listening to a couple of paragraphs.

I have found a good speech synthesiser, called IVONA. It speaks over 18 languages, including Polish, English and German. At this website you can see all the available IVONA voices; some languages have several voices. For example, English has the Amy, Brian and Emma voices. Let me present Brian's voice to you. It is quite pleasant to hear something like this, knowing it is software. As you can see, the Brian voice costs $45. You can download a 30-day trial version; I did, and I'm still testing these voices. Importantly, the voices of this synthesiser support the SAPI 5 interface. SAPI 5 is the native Microsoft Speech API (SAPI); its documentation is available in the MSDN Library.

SAPI 5 is not directly accessible from the Java Virtual Machine; it has to be programmed in C++, C# or Visual Basic. My simple application is written in C++. If I want to use SAPI from an application written in Java, I can, for example, build the C++ code as a DLL (Dynamic Link Library) and take advantage of the Java Native Interface; thanks to this I can use the C++ application from my Java application. There is also one more solution, a little dirty: I can execute an EXE application with the proper command-line arguments from Java. To do that, I would use the ProcessBuilder class or the Runtime.getRuntime().exec method.


// tts_experiments.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include "language.h"
#include "speak.h"

using namespace std;

int wmain(int argc, wchar_t* argv[])
{
	if (FAILED(::CoInitialize(NULL)))
		return FALSE;

	if (argc == 3) {
		try {
			wchar_t * voice = getLanguage(argv[1]);
			wcout << "Chosen voice " << voice << endl;

			bool spoken = speak(argv[2], voice);
			wcout << "Result " << spoken << " returned by speaking process." << endl;
		}
		catch (wchar_t* ex) {
			wcout << "ERROR " << ex << endl;
		}
	}
	else {
		wcout << "ERROR " << " You made a mistake in arguments." << endl;
	}

	::CoUninitialize();

	return 0;
}

// speak.h

#pragma once
#include "stdafx.h"

bool speak(wchar_t * text, wchar_t * pszReqAttribs);

// language.h

#pragma once
#include "stdafx.h"

wchar_t * getLanguage(wchar_t * languageShortcut) throw (wchar_t*);

// speak.cpp

#include "stdafx.h"
#include "speak.h"

using namespace std;

bool speak(wchar_t * text, wchar_t * pszReqAttribs)
{
	ISpVoice * pVoice = NULL;
	HRESULT stInitializing = CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL, IID_ISpVoice, (void **)&pVoice);
	if (SUCCEEDED(stInitializing))
	{
		ISpObjectToken* cpToken(NULL);
		HRESULT stTokenFinding = SpFindBestToken(SPCAT_VOICES, pszReqAttribs, L"", &cpToken);
		if (SUCCEEDED(stTokenFinding))
		{
			HRESULT stVoiceSetting = pVoice->SetVoice(cpToken);
			if (SUCCEEDED(stVoiceSetting))
			{
				HRESULT stSpoken = pVoice->Speak(text, 0, NULL);
				if (SUCCEEDED(stSpoken))
				{
					cpToken->Release();
					cpToken = NULL;

					pVoice->Release();
					pVoice = NULL;

					return true;
				}
				else
				{
					cpToken->Release();
					cpToken = NULL;

					pVoice->Release();
					pVoice = NULL;

					wcout << "Error, I couldn't play this text " << text << endl;
					return false;
				}
			}
			else
			{
				cpToken->Release();
				cpToken = NULL;

				pVoice->Release();
				pVoice = NULL;

				wcout << "Error, I can't set this voice " << pszReqAttribs << endl;
				return false;
			}
		}
		else
		{
			pVoice->Release();
			pVoice = NULL;

			wcout << "Error, I can't find this voice " << pszReqAttribs << endl;
			return false;
		}
	}
	else {
		wcout << "Error, I can't create Voice instance" << endl;
		return false;
	}
}
// language.cpp

//#pragma once 
#include "stdafx.h"
#include "language.h"

wchar_t * getLanguage(wchar_t * languageShortcut) throw (wchar_t*)
{	
	if (wcscmp(languageShortcut, L"EN") == 0) {
		return L"Vendor=IVONA Software Sp. z o. o.;Language=809";
	}
	else if(wcscmp(languageShortcut, L"DE") == 0){
		return L"Vendor=IVONA Software Sp. z o. o.;Language=407";
	}
	else if (wcscmp(languageShortcut, L"PL") == 0) {
		return L"Vendor=IVONA Software Sp. z o. o.;Language=415";
	}
	else {
		throw L"I don't understand your language";
	}
}

Before I describe the "speak(…)" function, look at the "main(…)" function. When I call "speak(…)" I pass the text to be read aloud and the description of the voice to be used. This attribute string is used by the "SpFindBestToken" function to select the speech synthesiser and the voice. How do you know what to write here? Open the Windows registry (the regedit command) and go to the node "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices". There is a list of all the voices you can use. You can select the voice you are interested in by passing attributes like Age, Gender, Language, Name and Vendor.

It's interesting that the Speech API can be used not only for speech synthesis but also for speech recognition. If I had installed an application for speech recognition which supported SAPI 5, I could see it in the node "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Recognizers".

In my simple application I use three functions which come from the Component Object Model (COM) library. They are:

CoInitialize – Initializes the COM library on the current thread. You need to initialize the COM library on a thread before you call any of the library functions.

CoUninitialize – Closes the COM library on the current thread, unloads all DLLs loaded by the thread, frees any other resources that the thread maintains, and forces all RPC connections on the thread to close.

CoCreateInstance – Creates a single uninitialized object of the class associated with a specified CLSID.

I pass "CLSID_SpVoice" here, which is connected with the "ISpVoice" class and described in the "sapi.h" header file; that file holds the identifiers of all the Speech API classes. Now I can use the "pVoice" pointer. Here I choose the voice, and here I order the speech synthesiser to read my text. Finally, I release the resources. Of course this chunk of code could be refactored, because I do the same things three times.

In my next presentation I will show you a solution for integrating a Java application with this simple text-to-speech application. I will also describe other text-to-speech platforms and application programming interfaces (APIs).

Visit my GitHub profile.

Knowledge

Ivona voices

Environment and IDE

API

Component Object Model (COM)

Speech

Windows Regedit

  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices

Livecoding tv

A great platform for software developers. Choose your favourite programming language and technology, connect to a TV channel, watch how the software is made and try to affect the development process. However, I must admit that when I watch some solutions written in C/C++, I recall why I chose Java as my language 🙂

Visit livecoding.tv
