Developing an Alternative Math Assessment Tool Using Speech Recognition

From Asian Research Index - Social Sciences & Humanities
Bibliographic Information
Journal: Conference Proceedings of Educational Paradigm, Systems and Strategies
Title: Developing an Alternative Math Assessment Tool Using Speech Recognition
Author(s): Raboy, Love Jhoye, April Mae Ablon, Irish Jane Cabulay, Recy Mae Mendoza, Guilda Marie Nanaman
Volume: 1
Issue: 1
Year: 2014
Pages: 90-96
DOI: 10.21016/ICEPSS.2014.14035
Keywords: Math Assessment Tool, Speech Recognition, Preschooler, review test
Chicago 16th: Raboy, Love Jhoye, April Mae Ablon, Irish Jane Cabulay, Recy Mae Mendoza, Guilda Marie Nanaman. "Developing an Alternative Math Assessment Tool Using Speech Recognition." Conference Proceedings of Educational Paradigm, Systems and Strategies 1, no. 1 (2014).
APA 6th: Raboy, L. J., Ablon, A. M., Cabulay, I. J., Mendoza, R. M., Nanaman, G. M. (2014). Developing an Alternative Math Assessment Tool Using Speech Recognition. Conference Proceedings of Educational Paradigm, Systems and Strategies, 1(1).
MHRA: Raboy, Love Jhoye, April Mae Ablon, Irish Jane Cabulay, Recy Mae Mendoza, Guilda Marie Nanaman. 2014. 'Developing an Alternative Math Assessment Tool Using Speech Recognition', Conference Proceedings of Educational Paradigm, Systems and Strategies, 1.
MLA: Raboy, Love Jhoye, April Mae Ablon, Irish Jane Cabulay, Recy Mae Mendoza, Guilda Marie Nanaman. "Developing an Alternative Math Assessment Tool Using Speech Recognition." Conference Proceedings of Educational Paradigm, Systems and Strategies 1.1 (2014). Print.
Harvard: RABOY, L. J., ABLON, A. M., CABULAY, I. J., MENDOZA, R. M., NANAMAN, G. M. 2014. Developing an Alternative Math Assessment Tool Using Speech Recognition. Conference Proceedings of Educational Paradigm, Systems and Strategies, 1.

Abstract

The world is now in the era of information technology, and ICT-based processes are increasingly visible in school assessment. Project LISTEN (Literacy Innovation that Speech Technology ENables) is an interdisciplinary research project at Carnegie Mellon University to develop a novel tool to improve literacy: an automated Reading Tutor that displays stories on a computer screen and listens to children read aloud. This study does not provide right or wrong answers; instead, it lets the user evaluate the answer. The main objective of this study is to develop an Alternative Math Assessment Tool for preschoolers using speech recognition. The software aims to assist teachers in reviewing Math lessons with preschoolers. The development of the system follows the System Development Cycle approach, which includes data gathering to identify the system’s expected functionalities, designing the system using a Use-Case Diagram, integrating JSAPI for voice recognition, using synthesizer software to read the questions out loud, a graphical display representing the teacher, and a graphical display for every question in the review. The system was implemented in the Java programming language and uses a MySQL database to store preschooler data, review questions and text answers. Conducting a review requires a digital microphone and a speaker. The developed system can create questions for a particular review, activate a review for the preschooler to take, and record the preschooler’s score at the end of each review. During a review, the system reads each question out loud and gives the pupil a 5-second time span to answer. The system then listens and displays the correctly uttered answer as feedback. User testing indicates that the system responded correctly to 83% of the preschoolers’ correctly uttered answers.

Introduction

Learning is fun and productive. A child's language skills develop at an early age. The four main language skills are listening, speaking, reading and writing (Steil, 2012). Reading and writing are often given much more attention in preparing a child for school (Mendelsohn, 1994). Fluent reading develops cognitive and comprehension skills (Bowers, 1993). Thus, listening receives less attention than the other skills (Jones, 1988). This is because listening comprehension is thought to improve through experience, as in the Audiolingual Method, where students simply listen to the target language day by day. It is believed that listening comprehension can be learned without help: exposing preschoolers to the spoken language is considered enough to improve their listening comprehension.

Preschoolers nowadays have a hard time reading. If an exam is written mainly in plain text, preschoolers will have a hard time answering it, since they either cannot read at all or, if they can, cannot visualize or imagine what they are reading (Ehrlich, 1993; Stanovich, 1991). This is a burden not only for the preschoolers but for the teachers too. With the increasing number of preschoolers in a section, teachers find it hard to conduct a one-to-one review with every preschooler in preparation for the actual exam.

Objectives of the Study

General Objective:

The main objective of the study is to develop an alternative Math Assessment Tool for preschoolers using speech recognition, to be used by teachers as an alternative Math review test for pupils.

Specific Objectives:

a. To design an alternative Math Assessment Tool for preschoolers that integrates the Java Speech Application Programming Interface to handle the speech recognition.

b. To implement the Math Assessment Tool for preschoolers using speech recognition.

c. To test the correctness of the system’s response to the preschooler’s correctly uttered answer.

Review Of Related Literature

Speech Recognition

Speech recognition basically means talking to a computer, having it recognize what we are saying, and doing this in real time (Bansal and Bajety, 2008). A speech recognizer functions fundamentally as a pipeline that converts PCM (Pulse Code Modulation) digital audio from a sound card into recognized speech (Rabiner, 1989). The process starts with the uttered speech, which is then analyzed further. Each sound is identified by matching it to its closest entry in a database of such graphs, producing a number, called the “feature number”, that describes the sound. Lexical decoding constrains the unit-matching system to follow only those search paths whose speech units are present in a word dictionary. A grammar must also be applied so that the speech recognizer knows what phonemes to expect.

Java Speech Application Programming Interface (JSAPI)

The idea of machines that speak and understand human speech has long been a fascination of application users and application builders. With advances in speech technology, this concept has now become a reality. Research projects have evolved and refined speech technology, making it feasible to develop applications that use speech technology to enhance the user's experience.

There are two main speech technology concepts: speech synthesis and speech recognition. Speech synthesis, often referred to as text-to-speech technology, is the process of generating human speech in a specific language from written text produced by an application, an applet, or a user. Speech recognition is the reverse process of converting human speech to words or commands. The converted text can be used or interpreted in different ways. A speech-enabled application does not directly interact with the audio hardware of the machine on which it runs. Instead, a common application, termed the Speech Engine, provides speech capability and mediates between the audio hardware and the speech-enabled application (Java Speech API, ND).

Rule Grammar

A grammar defines what a recognizer should listen for in incoming speech. Any grammar defines the set of tokens a user can say (a token is typically a single word) and the patterns in which those words are spoken. A rule grammar is provided by an application to a recognizer to define a set of rules that indicate what a user may say. Rules are defined by tokens, by references to other rules, and by logical combinations of tokens and rule references. The references may refer to other rules defined in the same rule grammar or to rules imported from other grammars. Rule grammars can be defined to capture a wide range of spoken input from users by the progressive combination of simple grammars and rules (Java™ Speech Grammar Format (JSGF) Specification, ND). Figure 1 below shows the rule grammar of this study. The grammar contains the words or phrases that the user may utter; if an utterance matches a rule, the sound is converted to text.

Figure 1: Alternative Math Assessment Tool Rule Grammar.
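As a rough illustration of what a rule grammar of this kind can look like, the sketch below builds JSGF text with a single public rule whose alternatives are the expected answers. The class name, the grammar name, the rule name `<answer>` and the word list are all hypothetical; they are not taken from the study's actual grammar in Figure 1.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the study): builds a JSGF rule grammar
// whose single public rule lists every expected answer as an alternative.
class AnswerGrammar {

    // Returns JSGF text: header line, grammar declaration, one public rule.
    static String toJsgf(String grammarName, List<String> answers) {
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n");
        sb.append("grammar ").append(grammarName).append(";\n");
        sb.append("public <answer> = ")
          .append(String.join(" | ", answers))
          .append(";\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative words only; the study's real vocabulary is not shown here.
        List<String> words = Arrays.asList("one", "two", "three", "circle", "square", "red", "blue");
        System.out.print(toJsgf("mathreview", words));
    }
}
```

Keeping the grammar to a flat list of alternatives reflects the description above, where each token is typically a single word the user may utter.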

Methodology

System Overview


Figure 2. An Overview of Alternative Math Assessment Tool for Preschooler

The Alternative Math Assessment Tool for Preschooler serves both the teacher and the preschooler. The teacher creates and activates a test for the preschooler. The questions in each test are created in text form, so the teacher does not need to pre-record them as voice. The text questions and their respective correct answers are stored in a database. Once a test is activated, each text question is converted to sound through JSAPI and the Speech Synthesizer, so the system reads the question out loud for the preschooler. The preschooler, on the other hand, uses the system as a review test. A question is read out loud for the preschooler, who then utters the answer. The uttered answer passes through Windows Speech Recognition, and JSAPI then extracts the text of the correctly uttered answer.
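The flow described above can be sketched roughly as follows. This is only an outline under stated assumptions: the interfaces `QuestionReader` and `AnswerListener` are hypothetical stand-ins for the speech synthesizer and the speech recognizer, and the sketch does not call JSAPI itself. The sample questions are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the review flow; every name here is hypothetical, not the study's code.
class ReviewSession {

    // Stands in for the speech synthesizer that reads a question aloud.
    interface QuestionReader { void readAloud(String questionText); }

    // Stands in for the speech recognizer; returns recognized text, or null if nothing matched.
    interface AnswerListener { String listen(); }

    // A question stored as text together with its correct answer.
    static class Question {
        final String text;
        final String correctAnswer;
        Question(String text, String correctAnswer) {
            this.text = text;
            this.correctAnswer = correctAnswer;
        }
    }

    // Reads each question, listens for an answer, and returns the number of correct answers.
    static int run(List<Question> questions, QuestionReader reader, AnswerListener listener) {
        int score = 0;
        for (Question q : questions) {
            reader.readAloud(q.text);
            String heard = listener.listen();
            if (heard != null && heard.trim().equalsIgnoreCase(q.correctAnswer)) {
                score++;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        List<Question> questions = Arrays.asList(
                new Question("How many apples are there?", "three"),
                new Question("What shape is this?", "circle"));
        // Stub listener that replays canned answers instead of using a microphone.
        List<String> canned = new ArrayList<>(Arrays.asList("three", "square"));
        int score = run(questions, System.out::println, () -> canned.remove(0));
        System.out.println("Score: " + score + "/" + questions.size());
    }
}
```

Hiding the synthesizer and recognizer behind small interfaces lets the review loop be exercised with stubs, without a microphone or a speech engine installed.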

Key Components in the Implementation of the Alternative Math Assessment Tool for Preschoolers

Speech Recognizer: A speech recognizer is a speech engine that captures the uttered speech. It interacts with the audio hardware or microphone and is located in the operating system. In Windows XP and Windows 7, a speech engine is already built in; otherwise, it can be downloaded from the internet.

Java Speech Application Programming Interface (JSAPI): The JSAPI is an application programming interface for cross-platform support of command-and-control recognizers, dictation systems, and speech synthesizers. Since the speech recognizer alone is not enough to convert speech to text, JSAPI serves as the bridge between the Speech Recognizer and the proponents’ application, allowing the application to capture the uttered speech and convert it into text. JSAPI has two core technologies: Speech Synthesis, which converts text to speech, and Speech Recognition, which converts speech to text (Chitnis and Ananthamurthy, 2003).

Hardware and Software Requirements. The system requires a high-quality close-talk (headset) microphone or digital microphone and a set of speakers. A 1.2 GHz CPU and 1 GB of memory are enough, but for better performance the proponents prefer 2 GB or more of memory. The basic software requirement is the Cloud Garden implementation of JSAPI, which contains support for Speech Synthesis and Speech Recognition. The programming language used is Java, with MySQL as the database.

System Testing

System testing was conducted to test the correctness of the system’s response to the preschooler’s correctly uttered answer. Table 3.1 below is the form used to tabulate the correctness of the system response to the correctly uttered answer of each pupil. The test categories were Numbers, Shapes, Colors and Built-in. P1 to P5 represent the pupils; the proponents picked 5 preschoolers to test the system per category. A value of 1 under Uttered Correct Answer means the preschooler’s uttered answer was correct, and a value of 1 under System Response means the system responded correctly. A value of 0 means that the uttered answer, or the system response, was wrong. The averages are then computed to get the overall performance of the system’s response to the uttered answers of the preschoolers. A summary table contains an entry for each category of the test.

Table 3.1. Tabulation of Correctly Uttered Answer and the System Response

              Question 1        Question 2        Question 3        Question 4        Question 5
Preschooler   Uttered  System   Uttered  System   Uttered  System   Uttered  System   Uttered  System
P1            1        1        1        1        1        1        0        1        1        1
P2            1        0        0        1        1        0        1        1        1        1
P3            1        0        1        1        1        0        1        1        1        1
P4            1        1        1        1        1        1        1        1        1        1
P5            1        1        1        1        1        1        0        1        1        1
Total Score   5        3        4        5        5        3        3        5        5        5
Average       60%               100%              60%               100%              100%
Uttered = Uttered Correct Answer; System = System Response
Category Number Overall Average: 84%
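The averages in Table 3.1 can be reproduced with a short calculation. The sketch below assumes, as the Total Score row suggests, that each question's average is the share of 1s in its System Response column and that the overall average is the mean of the five per-question averages. The class and method names are illustrative only.

```java
// Recomputes the per-question and overall averages from the
// System Response columns of Table 3.1 (names are illustrative).
class SystemResponseAverage {

    // Percentage of 1s in one question's System Response column.
    static double percentCorrect(int[] responses) {
        int total = 0;
        for (int r : responses) {
            total += r;
        }
        return 100.0 * total / responses.length;
    }

    // Mean of the per-question percentages.
    static double overall(int[][] responsesByQuestion) {
        double sum = 0;
        for (int[] q : responsesByQuestion) {
            sum += percentCorrect(q);
        }
        return sum / responsesByQuestion.length;
    }

    public static void main(String[] args) {
        // Rows are Questions 1 to 5; columns are the System Response values for P1 to P5.
        int[][] systemResponse = {
            {1, 0, 0, 1, 1},
            {1, 1, 1, 1, 1},
            {1, 0, 0, 1, 1},
            {1, 1, 1, 1, 1},
            {1, 1, 1, 1, 1}
        };
        for (int q = 0; q < systemResponse.length; q++) {
            System.out.println("Question " + (q + 1) + ": " + percentCorrect(systemResponse[q]) + "%");
        }
        System.out.println("Overall: " + overall(systemResponse) + "%"); // 84.0%
    }
}
```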

Results And Discussions

The Alternative Math Assessment Tool: Teacher’s Interface

Figure 3. Teacher’s Create Test Panel

The Teacher’s Create Test Panel contains buttons and menus for creating a test for preschoolers. The drawing objects used in the test were created beforehand. The Create Test tab lets the teacher create and add questions for the exam. Each question is in text form. The exam consists of four categories: Numbers, Shapes, Colors, and Built-in questions. Built-in questions are already created in the system. To create a test, the teacher first picks a category, selects an image, and adds the question to the newly created test by clicking the Add to Exam button. The teacher then activates the newly created test.

The Alternative Math Assessment Tool: Pupil’s Interface

Figure 4. Preschooler’s Question Display Window

Figure 4 shows the interface for the preschooler taking the test. The system displays the questions of the activated exam, the pupil utters his or her answer, and the text box area displays the answer if it matches the grammar provided. The instruction “Please state your answer after the beep” signals the preschooler when to answer.

Figure 5. Preschooler’s Listening Window

After the beep, the listening window appears, as shown in Figure 5. The beep prompts the preschooler to utter an answer. The system then checks the preschooler’s answer against the correct answer stored in the database. The system only recognizes the correct answer, which is printed as text in the text box; any other answer is discarded and not converted into text.
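The accept-or-discard behaviour described here can be sketched as a small function. The sketch assumes that matching is done on trimmed, lower-cased text; the paper does not state how the comparison is normalized, and the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch: decides what, if anything, is shown in the text box.
class AnswerDisplay {

    // Assumed normalization: trim whitespace and ignore letter case.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the text to display when the recognized answer matches the
    // stored correct answer; returns empty when the answer is discarded.
    static Optional<String> textToDisplay(String recognized, String correctAnswer) {
        if (recognized == null) {
            return Optional.empty();
        }
        String heard = normalize(recognized);
        return heard.equals(normalize(correctAnswer)) ? Optional.of(heard) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(textToDisplay("Circle", "circle").orElse("(discarded)"));
        System.out.println(textToDisplay("square", "circle").orElse("(discarded)"));
    }
}
```

Returning an empty result instead of the wrong text mirrors the behaviour above, where only a matching answer ever reaches the text box.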

Summary of Test Results

Table 4.1 System’s Response Test Results

Category System Response
Numbers 96%
Shapes 68%
Colors 84%
Built-in 84%
Overall System Performance: 83%

Table 4.1 shows the results of capturing the system’s response to the correctly uttered answers of the preschoolers. Five preschoolers took the test. Four tests were given to the preschoolers, one for each of the four categories: Numbers, Shapes, Colors and Built-in. For the Numbers test, the system responded correctly 96% of the time. The result for Shapes is 68%, which may be due to noise in the environment and the way the preschoolers pronounced the answers. The other categories each resulted in an 84% correct response. The overall system performance, the average over the four tests, is (96 + 68 + 84 + 84) / 4 = 83%. According to the grading system in the MUST Student Handbook, this indicates that the system performs fairly. The remaining error may be due to a noisy environment, which can distort the system’s capture of the preschooler’s uttered speech. Other factors include the pronunciation or intonation of the preschooler.

Conclusion And Recommendations

Developing an alternative Math Assessment Tool is not an easy task. The motivation for developing this tool is to help teachers review preschoolers in their Math subject. Personal experience with the preschool child of one of the researchers also contributed to the formulation of this study. The child’s returned test paper showed that part of the test involved recitation by the child, with the teacher manually writing down the child’s answers.

The results show that the system can perform the basic CRUD functions: Create, Read, Update and Delete data. The system also has a listening capability: it captures the correctly uttered answer of the preschooler and compares the resulting text answer to the correct answer stored in the database. In essence, the system can create a test, use a speech synthesizer to read the questions out loud, activate a test, and record the preschooler’s test results. The system responded correctly to 83% of the preschoolers’ correctly uttered answers. This implies that the system performs fairly, based on the grade descriptions in the MUST Student Handbook (Undergrad Student Handbook, 2011).

The integration of speech technology and a database in an assessment tool for preschool Math introduces a new technique for conducting oral tests or recitations with preschoolers. This in turn may help teachers through the immediate capture and storage of test data. Preschoolers, on the other hand, may use the tool for review. The system also includes some multimedia effects, such as sounds, photos and colors, which may help hold their attention on the test.

To further improve this study, the following enhancements may be included:

  1. Noise cancellation may be used.
  2. Matching of pronunciation differences may be conducted as a separate study. An English word may be uttered differently by people of different races, and the system should recognize different pronunciations of the same word.
  3. The system may be redesigned as a client-server implementation, which would make online oral recitation possible.

References

“A Reading Tutor that Listens.” (2013, June 6). http://www.cs.cmu.edu/~listen/

Chitnis, M. & Ananthamurthy, L. (2003, Aug. 06). The Java Speech API, Part 1. http://www.onjava.com/pub/a/onjava/2003/08/06/jsapi.html

Donaldson, C. (2009, May 1). Teaching Kid Why Math Matters. http://www.education.com/magazine/article/math-matters/

http://www.mensaforkids.org/lessons/shapes/mfklessons-shapes-all.pdf

http://cseweb.ucsd.edu/classes/fa06/cse237a/finalproj/ruchi.pdf

Java™ Speech API Programmer's Guide. (1997-1998). http://www.ling.helsinki.fi/~gwilcock/Tartu-2003/L7-Speech/JSAPI/Introduction.html

Java™ Speech API Programmer's Guide. (1998, Oct. 26). http://java.coe.psu.ac.th/Extension/JavaSpeech1.0/jsapi-guide.pdf

Java™ Speech API Programmer's Guide. (1997-1998). http://www.ling.helsinki.fi/~gwilcock/Tartu-2003/L7-Speech/JSAPI/Recognition.html

“Printable Kindergarten Worksheet.” (ND). www.kidslearningstation.com

“MUST Undergraduate Student Handbook.” 2011 Edition. March 25, 2014. http://www1.must.edu.ph/wp-content/uploads/2013/04/Undergraduate-Student-Handbook.pdf

“Super Teacher Worksheet.” (ND). www.superteacherworksheets.com/counting.html

“The Java™ Speech Grammar Format (JSGF) Specification.” (ND). http://java.sun.com/products/java-media/speech/forDevelopers/JSGF/

