Experiences of large scale implementation of speech analyzing tools in learning Swedish as second language

Håkan Larson

Larson Education i Lerum AB


Presented at the ESCA/SOCRATES Tutorial and Research Workshop


Method and Tool Innovations for Speech Science Education

University College London, U.K.

16-17 April 1999



Speech analysis tools within the Lingus system have been used on a large scale by adults and children learning Swedish as a second language. Several thousand students have had the opportunity to, for instance, compare their own pitch curves with those of native speakers. The attitudes and experience of students and teachers have been investigated, in a first stage through a number of interviews and in a second stage (still in progress) more statistically through a questionnaire. A third stage, with quantitative measurements to demonstrate the power of speech analysis, is yet to be defined.

The results so far are encouraging. They vary between students, but a great majority seem to find the new techniques stimulating. The tools are used in various ways depending on the students' personal preferences.

Teachers seem to appreciate the value of the tools but need more training to understand them better in order to feel more comfortable in the class. They also need better support to handle rather trivial technical problems with headsets or microphones.

The integration of the computer work with other training methods is important. Some examples will be given.

1. Introduction

Development in speech technology, and in computer science in general, is rapid. Thorough evaluation of techniques, methods and products tends to lag behind, a disadvantage to both users and developers. Although an evaluation by an independent body would have been preferable, we as producers of a Computer-Aided Language Learning system have initiated an evaluation project.

The study is made in three stages. The first consists of rather informal interviews with users, both students and teachers. The second consists of one questionnaire to students and another to their teachers. The third is a quantitative assessment of the development of pronunciation proficiency when using the system. The second stage is slightly delayed, so this report mainly covers the first stage. For the third stage we are looking for an independent phonetics institution to co-operate with.

2. Difficulties of spoken communication

Spoken communication in a foreign language is a very subtle task. The sound patterns produced and perceived do not, and should not, contain only neutral factual information; they also carry information about the speaker's attitudes and feelings. The sound pattern can reflect the speaker's origin and social situation. The sound may be combined with expressions in the face and body of the speaker.

It is very difficult to train spoken communication efficiently by conventional methods, or at least it requires very large resources in time and support. The result has been that in theoretical studies, spoken communication often remains at a fairly basic level. It is normal, for instance, for politicians and businessmen speaking a foreign language to reveal their mother tongue very clearly. Living in a foreign country and communicating daily in the foreign language will certainly improve your ability, but it is normal to keep some of your accent forever.

Since improvements would require much effort, the situation is regarded as normal.

3. Requirements on speech analysis for large scale use

Speech analysis has a potential in spoken language training, both for listening comprehension and for pronunciation. If the tools are designed in such a way that they

they could have a major effect on the learning process.

4. Learning Swedish as second language

A main obstacle to the successful integration of immigrants into Swedish society is the language problem. The Swedish language has more vowels than most other languages, and intonation and accent can be used to completely change the meaning of a sentence or a word. Intonation is also very important for expressing feelings and attitudes. Swedish could therefore be considered a difficult language to learn, especially as an adult.

Speech analysis tools have been applied on a relatively large scale, mainly through the Lingus system, which has been installed on some 3000 computers. A rough estimate indicates that at least 30000 students of this category have used the system. The majority have been adults from Iran, Turkey, Somalia, Bosnia and Kosovo.

5. The Lingus system

General description

Lingus is a general platform for language training comprising the student's training tools, authoring tools to supply the student with training material, and a communication system to transport, via local networks or the Internet, material to the student and progress reports to the teacher. It is described in ref [1].

Pronunciation training without speech analysis

The student's own work

Lingus provides the student with a simple listening and recording facility. The master sound can be listened to as a whole, line by line or (if the exercise has been designed using this facility) word by word. The student can record his own voice and compare it to the master voice. He can repeat the recording until he is satisfied. He can also use the method of simultaneous repetitive imitation. If the sound card can play and record simultaneously (duplex), the student can record a mix of his own and the master's voice, a particularly amusing feature if the exercise is based on music.

The support from the teacher

The teacher can, via a local network, collect samples of the students' recordings for analysis. He can then give advice, give links to suitable existing exercises or design a new exercise which could help the student to get over his difficulties.

The Linguascope tool

The Linguascope add-on program analyses the master's and the student's voice with respect to pitch and intensity variations with time, thus giving a description of the prosody components intonation, stress, and tempo. The program can be used in a number of ways as described below.
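As an illustration of what analysing one short stretch of speech for pitch and intensity can involve, the following is a minimal sketch in Python. It is not the method used in Linguascope, which this paper does not describe; it assumes a simple autocorrelation pitch estimator and an RMS level in decibels, and the function names and the synthetic test tone are hypothetical.

```python
import math

def rms_intensity_db(frame):
    """Return the RMS level of one frame in decibels relative to full scale (1.0)."""
    mean_square = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(mean_square + 1e-12)  # small offset avoids log(0) on silence

def estimate_pitch_hz(frame, sample_rate, fmin=60.0, fmax=500.0):
    """Estimate pitch by picking the lag with the highest autocorrelation.

    The default search range of 60-500 Hz is an assumption borrowed from the
    range of the pitch graph. Returns None if no positive correlation is found.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

# Synthetic 200 Hz tone standing in for one 50 ms frame of recorded speech.
sample_rate = 8000
frame = [0.5 * math.sin(2.0 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
print(estimate_pitch_hz(frame, sample_rate), rms_intensity_db(frame))
```

Repeating this frame by frame over a recording would give one pitch curve and one intensity curve per voice, which is the kind of data the two graphs display.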

The Sonoscope tool

The Sonoscope add-on program analyses the formants of the voice and compares them with different standard sound patterns using neural networks. So far the system has been taught to recognise Swedish vowels, which are more numerous than in most other languages.

6. The aim of the evaluation

The aim of the study is to find out

7. Integration of Lingus in different educational systems

How the system is being used varies widely from school to school depending on priorities, available computer resources, pedagogical ideas, preferences of the teachers etc. It varies between intensive studies where the system plays a central role to more conventional studies where the system is used more as a break in the regular training. Some examples will be given.

A model for intensive six-hours-a-day training developed by the Hogia Institute in Gothenburg employs two hours with Lingus, two hours with other computer programs, mainly word processing, and two hours with free communication. During 10 weeks about 100 hours are spent with Lingus. The Linguascope tool is used especially in the beginning to make the students aware of differences and to define problem areas.

In another model, used by Folkuniversitetet in Stockholm, the system is used 2 hours a week during a 20-week course. Special importance is given to the use of the speech analysis system (Linguascope).

A third model is used by SIDA, the Swedish development aid agency, where the system is used by students outside the regular school hours on their own initiative.

In the normal secondary school system the priority normally lies in getting students to pass a written exam. Although pronunciation is considered important, it is not as central as in the models described above. Normally the computer rooms with the installed software are scheduled for all students and languages alike, i.e. the number of occasions per student per year is relatively low, maybe 8-10 lessons.

8. The teachers' attitudes to and experience of the system

The attitudes of the teachers vary from person to person but also between the different organisation models described above. Schools where the computer system is to play a central part can recruit their teachers with this in mind. In the normal schools some teachers have accepted the system while others have been more reluctant. In those schools the teachers can also expect to have less support from technicians to handle the technical difficulties.

It is our opinion that so far relatively trivial technical problems have been the most serious ones. Problems with headsets, volume control etc. especially tend to scare language teachers away from the computer rooms. Another factor is that many teachers do not feel confident letting their students have access to advanced tools which they themselves have not had a chance to get acquainted with. These problems could be solved with better training and more experience. Very few teachers express any basic doubts about the use of speech analysis in pronunciation training. There is some fear, however, that teacher hours could be replaced by computer hours. On the other hand there is also a challenge, since the system is built as a framework with authoring tools giving teachers the chance to produce their own training material.

9. The students' attitudes to and experience of the system

The employment situation for immigrants from outside Western Europe is quite bad in Sweden. People who have already passed through a number of less successful training courses and are offered another one to fill out their unemployment will not be well motivated. The introduction of computers in a course will in itself affect the atmosphere in a very positive way. A good training method will multiply this effect.

Interviews have shown that a high percentage of the students find the computer-aided training very stimulating. An example of the degree of satisfaction is that after long training at school many students purchase the system to continue to use it on their own home computers. The questionnaire, which is in progress, will give some more statistical evidence about the situation.

An analysis of the different ways of using the tools will be given below.

Normally a training session should not be longer than one hour, since the work is very intensive.

In some groups it might be a little embarrassing to work with pronunciation while other students do other kinds of training. This situation has to be handled by the teacher.

10. The Linguascope user interface

In addition to visualising the prosody of the speech the student can use this tool in several different ways depending on his preferences and the type of problem he is addressing. The tools give possibilities to use eyes and ears, analytical skill and intuition in many different ways.

The user interface can be seen in fig 1

Figure 1. The Linguascope user interface. Only the master sound is shown

The upper graph shows the variation of sound intensity with time. If both sounds are present, the master sound has a blue graph and the student sound has a red graph. Intensity is split into a low-frequency part (dark blue/dark red) and a high-frequency part (light blue/light red) in order to allow training of voiced and unvoiced consonants, e.g. s-z.
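Why a two-band intensity display helps with pairs such as s-z can be sketched in a few lines of Python. Both sounds have high-frequency friction noise, but only z adds low-frequency energy from voicing, so the low band separates them. The filter below is a simple one-pole low-pass chosen for illustration; the cutoff of 1000 Hz and the function names are assumptions, since the paper does not state how Linguascope splits the bands.

```python
import math

def split_bands(samples, sample_rate, cutoff_hz=1000.0):
    """Split a signal into low and high bands with a one-pole low-pass filter.

    cutoff_hz is a hypothetical value; the paper does not state the one used.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    low, high, state = [], [], 0.0
    for x in samples:
        state += alpha * (x - state)  # smoothed signal: the low-frequency part
        low.append(state)
        high.append(x - state)        # what the smoothing removed: the high-frequency part
    return low, high

def rms(values):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def tone(freq_hz, sample_rate=16000, n=1600):
    """Pure test tone used here in place of recorded speech."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

# A 150 Hz tone stands in for voicing, a 5000 Hz tone for friction noise.
voiced_low, voiced_high = split_bands(tone(150.0), 16000)
hiss_low, hiss_high = split_bands(tone(5000.0), 16000)
print(rms(voiced_low) > rms(voiced_high), rms(hiss_high) > rms(hiss_low))
```

With real recordings the two band levels would be computed frame by frame and drawn as the dark and light curves.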

The lower graph shows the intonation curve (pitch variations with time) over the range of approximately 60 Hz to 500 Hz. The distance between two horizontal scale lines is one octave.
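An octave corresponds to a doubling of frequency, so a scale with one line per octave is logarithmic. The short Python sketch below works this out for the stated range. It assumes that the first scale line sits at the lower limit of 60 Hz, which the paper does not say; the function names are hypothetical.

```python
import math

def octaves_above(freq_hz, base_hz=60.0):
    """Return how many octaves freq_hz lies above base_hz (log2 of the ratio)."""
    return math.log2(freq_hz / base_hz)

def octave_gridlines(low_hz=60.0, high_hz=500.0):
    """List the frequencies of scale lines one octave apart, starting at low_hz."""
    lines = []
    f = low_hz
    while f <= high_hz:
        lines.append(f)
        f *= 2.0  # one octave up doubles the frequency
    return lines

print(octave_gridlines())    # [60.0, 120.0, 240.0, 480.0]
print(octaves_above(500.0))  # about 3.06, so the graph spans roughly three octaves
```

On such a scale the same musical interval has the same height whether the speaker's voice is low or high, which makes curves from different voices easier to compare.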

From the Linguascope window you can listen to the master (pre-recorded) sound by clicking the blue loudspeaker. You can select and listen to only a part of the sound, select automatic restart, listen to the sound at half speed or translate it to a musical tone in order to enhance the ear's perception of the intonation. There is also a zooming possibility.

If you record your own voice you'll get a red curve with the same options as with the master sound.
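One way to turn an intonation curve into a tone is to drive a sine oscillator with the measured pitch, so that the listener hears the melody without the words. The Python sketch below shows this idea under stated assumptions: the paper does not describe how Linguascope produces its tone, and the frame length, sample rate and function name are all hypothetical.

```python
import math

def contour_to_tone(pitch_hz, frame_seconds=0.01, sample_rate=16000, amplitude=0.5):
    """Turn a per-frame pitch contour into a sine tone that follows it.

    pitch_hz holds one value per frame; None marks an unvoiced frame,
    which is rendered as silence.
    """
    samples_per_frame = int(round(frame_seconds * sample_rate))
    phase = 0.0
    out = []
    for f0 in pitch_hz:
        for _ in range(samples_per_frame):
            if f0 is None:
                out.append(0.0)
            else:
                # Accumulating phase keeps the tone continuous when the pitch changes.
                phase += 2.0 * math.pi * f0 / sample_rate
                out.append(amplitude * math.sin(phase))
    return out

# A falling contour with a short unvoiced gap in the middle.
samples = contour_to_tone([200.0, None, 180.0])
print(len(samples))
```

The resulting list of samples could be written to a sound file or sent to the sound card for playback.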

11. The Linguascope modes of operation

After having studied the different ways students use the Linguascope tool the following instructions have been worked out.

There is a multitude of ways to operate the Linguascope tool, some more directed to the ear, some to the eye. Some ways have a more analytical approach while some have a more intuitive one. Different persons have different preferences and will after some time develop their own schemes.


12. The effect of speech analysis on the teacher-student relation

A very interesting observation is that the technological tools can affect the communication between the student and the teacher. The total result of the training can be affected in a very positive way even if the amount of time spent with the tools is relatively small. This is related to the fact that your voice is an important part of your personality and cannot easily be regarded in a neutral way. This can lead to rather tense situations where the student doesn't accept that there is a difference in his way of saying a word. By looking at two curves which differ, it is easy to agree on this fact. The correction of these differences can then be discussed in technical terms, like making the curve drop at the end, rather than in terms of changing a voice. The teacher works together with the student to identify the problem and correct it. Psychologically this seems to be very favourable.

13. The best time to introduce speech analysis

This question has been discussed with students and teachers. The conclusion is that an early introduction could help in getting it right from the start, thus avoiding a laborious process of correcting errors later. On the other hand it might be an absolutely indispensable tool to motivate and assist a student who didn't learn it right from the beginning. So the answer is: any time.

14. Conclusions

Speech analysis already plays an important part in learning Swedish as a second language. The evaluation so far indicates that the system has won wide acceptance among students and teachers. Further growth can be expected.


[1] Håkan Larson (1998) Lingus - a general purpose computer aided language learning system which could serve as a platform for the implementation of speech analysis tools, Proc Speech Technology in Language Learning, May 98, Marviken