AT for Communication

Session Chair: Akila S, National Institute of Speech & Hearing (NISH), Trivandrum

Transcripts

AT for Communication

 

Aishwarya: The session chair for this session is Dr Akila Surendran. Just to introduce Dr Akila before I hand it over to her for the rest of the session: she is a senior engineer at the Center for Assistive Technology and Innovation (CATI) at the National Institute of Speech and Hearing, Trivandrum. She is interested in issues of last-mile implementation of AT, and in using free and open-source technologies for AT implementation. Over to you, Akila, for the rest of the session.

 

Dr Akila: Thank you, Aishwarya. Good to see you and hear you after a long time. I should say she was one of our AT course participants, and a very active one. Coming back to our session, the topic is AT for communication. As we all know, communication is a basic need of human beings in the society that we live in. We have been running this session for the last two editions of Empower, and the earlier sessions had papers on augmentative and alternative communication as well. But this time the focus has shifted more to the hearing impaired community, and interestingly there are three papers around Indian Sign Language. Working at NISH, we find that the interest among researchers in tackling the problem of ISL translation has been huge, but it is a very difficult and complex problem to solve. Let's see what these researchers have to present for us. The first presenter is Amal Jude Ashwin, and he is going to talk about a fingerspelling system for Indic scripts. Over to you.

 

Amal Jude Ashwin: Good afternoon everyone, and thank you for attending my presentation, titled "Empowering the Speech and Hearing Impaired Using a Fingerspelling System for Indic Scripts". This project is carried out at the Computational Neuroscience Laboratory, Indian Institute of Technology Madras, in collaboration with TCS Research and Innovation. First, a small fact check: the normal hearing range of a healthy individual spans roughly 0 to 140 decibels, and hearing loss is classified according to the lowest threshold below which you are unable to hear. Up to a certain number of decibels, the hearing loss can be restored using hearing aids.

 

But if a person is profoundly deaf, then he or she prefers sign language for communication. Now, let's focus on the Indian population. In India, 6.3 crore people have significant hearing impairment, and at least 50 lakh of them are children. Just as the inner voice of normal-hearing individuals speaks to them in a spoken language, for Deaf people the native language is sign language and they tend to think in sign language; their inner voice, if they are not exposed to any spoken language, will talk to them in sign language.

 

Now, the bridge for communication between the hearing impaired people, that is the deaf community, and the hearing people is still under construction. We do have sign language interpreters; for example, in this session we have Aniket and Preeti as sign language interpreters. And we have also been developing AI-based tools that can convert sign language to text, and so on. So the bridge is still under construction.

 

I've been talking a lot about sign language. So what is sign language? It is a system of communication used by the deaf community using hand and facial gestures. The first established sign language was French Sign Language, and just as spoken languages spread throughout the world, we also have different sign languages spread throughout the world. In India, we use Indian Sign Language. Now, sign language can be used to convey words that have meaning. But what about names? For example, how will I convey my name to another person?

 

In that case, I'll use something called fingerspelling, which is the representation of alphabets using only the hands. My name is Amal, so I'll use A, M, A and L. Fingerspelling is used by the Deaf community to convey proper nouns that don't have a proper sign for them. Now, let's focus on Indian Sign Language: in Indian Sign Language, the fingerspelling convention is based on the English alphabet and not on any Indian language. Why is that? The main reason is the geometrical complexity of Indian characters.

 

Now, it's easy for me to gesture 'A' in English, but it's pretty hard for me to gesture 'Aa' in Hindi; I would have to be very flexible to do that. Also, English has just 26 letters, but Indian scripts have a humongous inventory: consonants, vowels, and combinations of consonant and vowel. And even if I overcome these two problems and develop a fingerspelling system based on any one Indian language, it could be used only in one part of India and not all over the country, because of the cultural diversity in India. So, in order to address the problem of developing a fingerspelling convention for Indian Sign Language based on Indic scripts, we developed something called Mudrabharati. Mudra means sign, and Bharati is a unified name for India. Mudrabharati is developed using phonetics, that is the sounds of the characters, and not the geometry. Although Indian scripts are geometrically different, they are all built on a common set of around 40 sounds that can be classified into consonants and vowels. A few of the characters in Indian languages are combinations of a consonant and a vowel put together.

 

For example, 'A' is a vowel and 'Aa' is also a vowel, but it is a longer form of the shorter vowel 'A'. Indian Sign Language uses two hands for communication, and so does Mudrabharati. The right hand is used to gesture consonants while the left hand is used to gesture vowels. Consonants of similar sounds are put together in one position, so I have nine different positions to gesture consonants and only one position to gesture vowels. For example, the similar-sounding consonants 'ka', 'kha', 'ga' and 'gha' are put together in position one, as you can see in Table A. Vowels are few, therefore I have a unique gesture for each and every vowel, and vowels of similar sounds have similar gestures.

 

Now, in addition to consonants and vowels, I have also included a table of punctuation marks that can be used for text entry systems. What if I have to gesture a consonant-vowel combination? In that case, I perform the corresponding gesture for the consonant using the right hand and the corresponding gesture for the vowel using the left hand, and together they are interpreted as the consonant-vowel combination. For example, if I have to gesture 'ki', I go to position one and make the consonant gesture according to Table B, which gives 'k', and then gesture 'ee' with the left hand. So 'k' + 'ee' gives me 'ki'.
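To make the composition rule concrete, here is a minimal sketch of the idea in Python. It is only an illustration of the consonant-position plus vowel-gesture scheme described above; the position groupings and gesture names are assumptions, not the actual Mudrabharati tables.

```python
# Illustrative sketch of Mudrabharati-style consonant-vowel composition.
# Right hand: one of nine consonant positions (grouping below is assumed).
# Left hand: a vowel gesture. Together they form one syllable.
RIGHT_HAND_POSITIONS = {
    1: ["k", "kh", "g", "gh"],       # similar-sounding consonants grouped in position 1
    5: ["p", "ph", "b", "bh", "m"],  # assumed grouping for position 5
}
LEFT_HAND_VOWELS = {"thumb_out": "i", "open_palm": "aa", "fist": "a"}  # assumed gesture names

def compose_syllable(consonant: str, vowel_gesture: str) -> str:
    """Combine a right-hand consonant with a left-hand vowel gesture into a syllable."""
    return consonant + LEFT_HAND_VOWELS[vowel_gesture]

print(compose_syllable("k", "thumb_out"))  # ki
print(compose_syllable("m", "open_palm"))  # maa
```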

 

Here is an example: 'imli', which is Hindi for tamarind. 'Imli' can be split into 'i', 'ma' and 'li'. 'I' is a vowel gestured with the thumb out; 'ma' is the position-five gesture for 'm' combined with the vowel 'aa', so 'm' + 'aa' gives me 'ma'; and finally I do 'li'. Similarly, any word can be constructed using the Mudrabharati convention. So Mudrabharati teaches alphabets, especially Indian alphabets, to the Deaf in a way that they understand, that is using signs and the space in front of them. And the beauty of Indian languages is that they are constructed using phonetics, the sounds. The Deaf feel empowered with the knowledge of sound, and it provides a platform to understand words, especially Indian words, at the alphabet level.

 

For example, take the words psychology, science and cycle. The first syllable of these three words sounds the same, but spelling-wise they are different from each other. Since Mudrabharati is in essence based on syllables and phonetics, it can provide a platform to understand words at the alphabet and syllable levels, and it can also enhance reading and writing ability, because learners come to know that a given group of letters sounds a particular way.

Since Mudrabharati has so many advantages, it can be used to establish communication between the Deaf and hearing populations, especially in India. In India, not many people know the English script, so if I gesture using English alphabets, they might not be able to get it. Having a fingerspelling convention based on Indic scripts would therefore be useful in bridging the communication gap. And although the native tongue of the Deaf is sign language, the spoken language of their home (say they are born to hearing parents) can be understood better by the Deaf when they learn something like Mudrabharati.

 

It can also open up more knowledge resources to the deaf community, because they can understand words at the alphabet level, and they can reach out to more people in the hearing community as well, because Mudrabharati takes us one step closer to the hearing community. Thereby it can also open up new job opportunities. In addition to the convention, we have also developed technology tools for Mudrabharati. One is an AI-based detection system: I just move my hands around in front of a camera, the frames are captured, and the corresponding alphabet is rendered. This system has been developed for Telugu, Hindi and Tamil so far, and it can be extended to any other Indian language as well. Here is a small example: based on my face position and size, the nine consonant key points and the two vowel key points are placed. This is me trying to gesture 'Mera Bharat Mahan'.

 

In addition to that, we also have a text-to-Mudrabharati converter: the previous detection system converts Mudrabharati to text, and similarly we have a converter from text to Mudrabharati. This software has also been developed for Tamil, Telugu and Hindi. Here I'm using Google Translate: I type 'My India is great', which translates to 'Mera Bharat Mahan', and once I enter the text I hit the Translate button. So this can be used to convert text to Mudrabharati.

 

In addition to these, we also have educational aids for people to learn Mudrabharati. We have primers, which are booklets with a pictorial representation of a word and the corresponding Mudrabharati gesture, and these can be used to learn Mudrabharati quickly; there are a few examples of that here. Other applications can also be used for people to familiarize themselves with Mudrabharati. So far this is the first phase of the project, where we have developed a convention for an Indian-language fingerspelling system, and we are developing additional tools around it. In the next phase of the project, we are trying to reach out to deaf schools, teach them Mudrabharati and get feedback from them. For that, we have tied up with the Ali Yavar Jung National Institute of Speech and Hearing Disabilities, with centres in Hyderabad and Mumbai; that is the main reason why we focused on developing the primers in Telugu as well as Hindi. If you are interested to know more about our project, or if you work at a deaf school, we are more than happy to collaborate with you and share our work with you as well. Feel free to reach out to me; I have also mentioned my email ID here. Dr. V. Srinivasa Chakravarthy is the principal investigator of the project and the lab. Thank you.

 

Akila: Thank you. We have time for a few questions. Any questions for Amal from the audience? While we wait, maybe I'll start. From my understanding of the deaf community and how sign language evolved, they create their own signs, right? That is how the language grows. But here the process is reversed. What do you think about that?

 

Amal Jude Ashwin: So normally the deaf community doesn't have an idea of how sounds work, and we are proposing this system to them so that they can also understand Indian characters better. Normally the deaf community sticks with fingerspelling based on the English alphabet and doesn't often come to Indian alphabets. But in India it's essential to learn at least one of the Indian scripts to establish good communication, and also to understand our culture and the literature of India. So this will bring the deaf community closer to the speaking population. We are proposing this system to the deaf community; it's up to them to modify or accept the system based on their needs.

 

Akila: There is a group from my own institute, NISH Trivandrum, that has developed a fingerspelling system for Malayalam, and this was done by a group of deaf individuals. Have you looked at other projects like this? There are so many different systems being developed, and which one gets standardized is another question; I guess ISLRTC or some central body has to take that call.

 

Amal Jude Ashwin: We tried to search for a few other projects on the same lines, but we couldn't find anything concrete that is in practice right now. Since you mention that NISH has already developed such a system for Malayalam, we'd be happy to collaborate with them; we can exchange ideas, and that could be good for both our projects.

 

Akila: Thank you, Amal. I think we can move on to the next presenter. The next presenter would be Ankit, Mahesh or Sparsh; I'm not sure who is representing the team. Their topic is improving the efficacy of an Indian Sign Language translation model. It's about a rule-based English-to-ISL machine translation framework for a virtual Indian Sign Language interpreter, which is an animated avatar.

 

Ankit Jindal: Hi, this is Ankit Jindal, good to connect with you on this forum, and thank you for having us on this platform. Our team at Friends for Inclusion is very happy and proud to be here. We are essentially going to talk about how we took forward the research we have been doing with IIIT and how we have improved the translator model. As many of you would be aware, we were engaged with IIIT Bangalore last year on a research project with Professor Dinesh, where we created a basic prototype. When we showed it to the users and did intensive user testing, one piece of feedback we got was to improve the grammar part of it, and in the last few months we have made several improvements. In today's conversation my colleagues Sparsh and Mahesh will show you a few of the changes we have brought about and also show a few examples to the audience. On that note, I'll let Sparsh take over, and we'll pick it up from there.

 

Sparsh Nagpal: Yeah, so: improving the efficacy of the virtual Indian Sign Language interpreter. First, a word about Indian Sign Language, as discussed in the previous talk. It was initially known as Indo-Pakistani Sign Language, and it is one of the most used sign languages in the world. It consists of various features like gestures and facial expressions that run through the language, and it also has its own unique set of grammar rules. The purpose behind the research was the lack of awareness in India about sign language, especially amongst the Deaf community.

 

According to the WHO, there are around 63 million deaf people in the country, and when it comes to certified sign language interpreters, the number given by the ISLRTC is only about 325. So there is a huge need to bridge the communication gap between Deaf people and the rest of the country.

 

Now, the different approaches. One of the main approaches used initially was Signed Exact English (SEE). SEE basically translates an English or Hindi sentence directly, word by word, using the animation of the sign for each word. Currently, we are working on an ISL grammar model, which uses rule-based machine translation. What it basically does is apply a set of grammatical rules at different levels: it works on the whole sentence, applies the rules to convert it into Indian Sign Language grammar, and then each sign is represented word by word. So let's talk about a few of the rules we worked on. One is POS, that is parts of speech: it identifies the grammatical role of each word in the sentence, and there is a sorted order that ISL follows.

 

It basically starts with the time and location, then the second person or the object, then the main subject, then the verb and adverb; if there is a negative word like 'not' in the sentence it comes next, and if it's a question, the question word comes towards the end. You can see this flow in the graph of the whole grammar we talked about. Another rule is eliminating unused words, which are called stop words. In a common English sentence like 'I am going to school', the terms 'am' and 'to' won't actually be represented in ISL, so it ends up becoming 'school me go'; you don't eventually show the whole sentence with each and every word in it. Articles like 'a', 'an' and 'the' are eliminated, and words like 'is' and 'are' are actually used to determine the tense of the sentence rather than being signed as words.
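As a rough illustration of the reordering and stop-word rules just described, here is a minimal Python sketch using NLTK for POS tagging. It is not the team's implementation: the bucket order, stop-word list, question words and subject heuristic are all simplifying assumptions.

```python
# Illustrative rule-based reordering to an ISL-style gloss order:
# object/time/place -> subject -> verb -> negation -> question word at the end.
import nltk  # needs: nltk.download("punkt"), nltk.download("averaged_perceptron_tagger")

STOP_WORDS = {"a", "an", "the", "am", "is", "are", "was", "were", "to", "of"}  # assumed list
QUESTION_WORDS = {"what", "where", "when", "who", "why", "how"}

def english_to_isl_gloss(sentence: str) -> list[str]:
    tokens = nltk.word_tokenize(sentence.lower())
    tagged = nltk.pos_tag(tokens)
    buckets = {"object": [], "subject": [], "verb": [], "negation": [], "question": []}
    for word, tag in tagged:
        if word in STOP_WORDS:
            continue                                 # drop articles and auxiliaries
        elif word in QUESTION_WORDS:
            buckets["question"].append(word)         # question word moves to the end
        elif word in {"not", "no", "never"}:
            buckets["negation"].append(word)
        elif tag.startswith("VB"):
            buckets["verb"].append(word)
        elif tag.startswith("PRP") or word == "i":
            buckets["subject"].append(word)          # very crude subject detection
        else:
            buckets["object"].append(word)           # nouns, adjectives, time/place words
    return (buckets["object"] + buckets["subject"] + buckets["verb"]
            + buckets["negation"] + buckets["question"])

print(english_to_isl_gloss("What is the price of iPhone"))  # roughly: ['price', 'iphone', 'what']
print(english_to_isl_gloss("I am going to school"))         # roughly: ['school', 'i', 'going']
```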

 

When it comes to tense, I feel it's important to represent the tense of the sentence as well. For example, 'I am going to school' and 'I was going to school' mean two different things, and the translation should reflect that. So we have tried to bring that in as well, where the model incorporates the tense of the sentence.

 

Next, we have spelling out of words. There are a lot of words that ISL's vocabulary, which is very limited, does not include, especially proper nouns. When it comes to a person's name, we end up splitting the word so that we can show it character by character, and each character is then signed. For example, if my name is Diana, 'Diana' does not exist in the ISL vocabulary, so it will be shown as D-I-A-N-A; each character will be signed, and then the rest of the sentence will be shown. The same goes for numbers: ISL has signs for numbers, but they are very limited. When it comes to a large number, say 10567, we can't actually show a single sign for the number, so we end up presenting each individual digit. If the sentence is 'I have 10 candies', it becomes 'candies one zero I have'; we present '10' as 'one zero'.
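A minimal sketch of this out-of-vocabulary handling might look like the following; the small vocabulary set is an illustrative assumption, not the team's actual word bank.

```python
# Words not in the sign dictionary are fingerspelled letter by letter;
# numbers are split into individual digits.
ISL_VOCAB = {"i", "have", "candy", "strong", "eat", "vegetable"}  # assumed tiny vocabulary

def expand_token(token: str) -> list[str]:
    if token.isdigit():
        return list(token)              # "10" -> ["1", "0"]
    if token.lower() in ISL_VOCAB:
        return [token.lower()]          # known sign: keep as a single gloss
    return list(token.upper())          # unknown word: fingerspell, "Krish" -> K, R, I, S, H

print(expand_token("Krish"))  # ['K', 'R', 'I', 'S', 'H']
print(expand_token("10"))     # ['1', '0']
print(expand_token("have"))   # ['have']
```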

 

Now, subordinate conjunctions. This basically means recognizing when two sentences are combined with words like 'if', 'since', 'as' or 'because', where the first part of the sentence answers the second part. For example, 'He went to a restaurant because he was hungry'. In ISL this ends up becoming two sentences clubbed together, for example 'restaurant he go, why, he hungry'. So the whole sentence is broken down into parts and then shown. Next is limited vocabulary.

 

The main idea behind ISL translation is to get the context of the whole sentence rather than the exact English meaning, because at the end of the day English and ISL are two very different things. Words like 'good', 'nice' and 'well' end up meaning the same thing in a sentence: in 'I am feeling well' or 'Have a nice day', 'well' and 'nice' end up meaning just 'good'. So we have worked on interpreting all such words, connecting them to one common word and representing that in the model so that we are able to sign them.

 

Another factor is relations, for example a relation between two people or an association with a place or community. ISL actually breaks such words down into two signs: 'brother' is represented as 'sibling' plus 'man', and 'Indian' will be shown as 'India' plus 'person' in terms of signing. So these words are recognized, broken down into two different words, and then signed.
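The synonym-mapping and relation-decomposition rules above can be pictured with a small lookup-table sketch like this; both tables are illustrative assumptions, not the team's data.

```python
# Collapse synonyms onto one canonical sign, then split relation words
# into their two component signs.
SYNONYM_TO_CANONICAL = {"nice": "good", "well": "good", "fine": "good"}   # assumed entries
COMPOUND_SIGNS = {"brother": ["sibling", "man"],
                  "sister": ["sibling", "woman"],
                  "indian": ["india", "person"]}

def normalise_gloss(word: str) -> list[str]:
    word = SYNONYM_TO_CANONICAL.get(word.lower(), word.lower())
    return COMPOUND_SIGNS.get(word, [word])

print(normalise_gloss("nice"))     # ['good']
print(normalise_gloss("brother"))  # ['sibling', 'man']
print(normalise_gloss("school"))   # ['school']
```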

 

Homographs are words with the same spelling and sound but different meanings, and you come across many of these in English. For sentences like 'I am always right', 'I took a right turn' or 'These are my rights', all of them contain the word 'right', but while translating, the ISL model can have difficulty figuring out what exactly 'right' means there. This is another one of the problems we have been able to tackle in our program. Now I'd like my colleague Mahesh to show a working demo.

 

Mahesh: So I'll type the inputs here. 'What is the price of iPhone?' Can you guys see? It gives 'iPhone price what'. Similarly, another example is 'The Prime Minister went to Amritsar', and it gives 'Amritsar Prime Minister go'. So like this, over here we type the input and we get the output. Let me go for the next input, which has a name: 'Krish is strong, he eats vegetables'. Now the words 'Krish' and 'strong' are being spelled out, and it gives 'why vegetables he eat'. If the audience have any input they would like to test, feel free to drop it in the comments and we could try that as well.

 

I'll go to the next input, 'We have 30 dogs', and the output is 'dog three zero we have'. I'll continue with two more examples: 'I have two brothers and one sister'. Oh, I've typed the wrong spelling, let me just redo it. So this is the right spelling: 'I have two brothers and one sister', and it gives 'brother two sister one I have'.

 

So I do see some messages. Okay, I shall try that one: 'You could use the Rename option to do this if required.'

 

So we have trained particular words, and words that have not been trained are usually spelled out; and if the sentence is grammatically incorrect, we have to correct it for the translation to work properly. I removed a part of it, and now you can see it is spelling out 'option', that is O-P-T-I-O-N, and another word, 'rename', is also being spelled out because we don't have an animation for 'rename' in the data set. Now it gives 'you do this habit'; I believe 'habit' came up as a synonym there. So that's how it works: basically, if we don't have a particular word, it spells it out, and the sentence has to be grammatically correct.

 

And really, as we expand the word bank (right now the model is based on a word bank of a certain fixed size), you will see the model performing a lot more correctly.

 

Let me try one last example: 'India got a gold medal in Japan in 2021'. It gives 'two zero two one, Japan, gold medal, India, get'. So that's the presentation. Would you like to try?

 

Akila: Here's a question: what about Indian-specific words, like food items? Are they included in your database?

 

Mahesh: We do. Like 'jalebi', I think we do have it, along with a few other ones as well.

 

Akila: And one more question from the chat: did you train the model on the names of different social networking platforms?

 

Mahesh: We haven't added all of those names yet, but over time we will be increasing the vocabulary range of the model, and we will start including even the social media names and checking how exactly they are supposed to be represented. The vocabulary keeps getting expanded with time.

 

Akila: So I have a question, or rather an observation. We tried that complex sentence, right? 'You could use the rename option to do this if required.' It is essentially complex sentences like these that are difficult for a sign language user to understand if they are not very literate in English; for simple sentences they don't really need the translator. So I think that is the limitation that this and other technologies haven't yet cracked. For example, take the sentence I gave, 'You could use the rename option to do this, if required.' As a teacher, or as a person who regularly interacts with Deaf people, I would actually explain the meaning of the words for which the vocabulary is not available in Indian Sign Language. I would give a lot of examples, and I would explain it to them, and somehow I would find a way so that they understand what that sentence means.

 

Ankit: So yeah, we'll be happy to take inputs. I think you and I have started some conversation, so we will be more than happy to stay in touch with you and improve this model. Of course, we do understand that the tech has its limitations, but the team is extremely passionate and committed to doing this, and I think we've come very far from where we started last year. So yes, we'll be more than happy to collaborate with you and see if we can incorporate some of the suggestions you mentioned. By the way, we've also been in touch with the deaf community and regularly taking inputs; in fact, all these rules have been validated by the community. I myself am a person with a vision disability, so I do understand the pains of people with disabilities when they don't have access. That's why we have been very strongly interlinked with the community and have been validating this at various stages as we take it forward.

 

Akila: I think there are a couple of raised hands.

 

Raheel: So my question is: this is basically a reductionist approach of reducing the sentence to a basic form and converting it into sign language, if I'm not wrong. So what about sentences which have an emotional stance to them, like a sarcastic sentence? When we reduce it, do we lose the sarcasm or the emotion?

 

Ankit: That's a great question. Our team is building on that piece and we've made some progress; however, it wasn't ready enough for us to show at this forum, so it's too premature for us to comment on whether we will be successful or not. But yes, we are aware of that and we're working towards it.

 

Prof Bala: Good presentation. I haven't seen this amount of animation for ISL till now; whatever you are doing is very good. What is the sustainability of your project? Because the work involved in actually coming up with a product like this is huge. Maybe I missed that part in the beginning, in the introduction: what is the team size, and how are you supported?

 

Ankit: We started by collaborating with IIIT Bangalore, and that helped us in the initial months; we are still partnered with them. In addition, our team size is currently about five people working on this, and there are various freelancers and part-timers who support us. We do have a productization strategy, and I'll be happy to discuss it with you offline. Good to see you, sir, after many years; you may remember we've been on some panels together in the past.

 

Akila: Thank you for the presentation. We’ll be glad to take this forward with you.

 

Unknown: Thank you. Thank you, Akila, and thank you, Dr. Bala. I think we'll reach out to you offline and take it forward.

 

Deepthi: Good morning to one and all. I am Deepthi, representing Mar Baselios College of Engineering and Technology, and we are here to present a speech therapy system for children with cleft lip and palate. We are guided by Lani Rachel Mathew and co-guided by Ms Amrita BJ. The project, as I said, is a speech therapy system for children with cleft lip and palate. The main objective is to create a visual feedback module which is child-friendly as well as cost-effective, to enhance the therapeutic experience of children who have a cleft lip or have undergone cleft lip or palate surgery.

 

The motivation for this project is that around 2,20,000 infants are born with cleft lip or palate each year, and almost a million of them are left untreated, because of lack of awareness, the complexity of the procedures, or the lack of proper facilities. Functional restoration of this defect requires multiple surgeries followed by different therapies, including speech therapy. The contribution of the defect to infant mortality and morbidity, as recognized by the WHO, is high, and this fuels international efforts at improving the quality of health care.

 

Stepping into the project: it is basically a speech therapy module, as I said earlier, and it goes through several procedures. The first step is speech acquisition, followed by pre-processing and then feature extraction. Then the trained model is matched against the data: the system compares the real-time voice with the data we trained on, and if it matches, it gives positive visual feedback, that is, when the child articulates the word correctly; and it gives negative feedback when the child articulates the word wrongly.

 

The main procedure involved is model training, and we have used a support vector machine (SVM) model, which is used for classification as well as regression problems. The goal of the SVM algorithm is to create the best line or decision boundary that can segregate an n-dimensional space into classes. We have also used a graphical user interface for displaying the results in an interactive way. We used almost 3000 samples for training the model and achieved an accuracy of about 85 percent. The system was tested with children having the defect as well as with typically developing children, and both test results are comparatively high, at almost 80 to 85 percent. Thank you so much.
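As a rough sketch of the pipeline described here (speech acquisition, feature extraction, SVM classification, visual feedback), something like the following could be put together in Python. It is not the team's code: MFCC features via librosa, the folder layout and the feedback strings are all illustrative assumptions.

```python
# Illustrative speech-therapy classifier: MFCC features + SVM, with a simple
# correct/incorrect feedback step at the end.
import numpy as np
import librosa                                   # audio loading + MFCC extraction
from pathlib import Path
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def extract_features(wav_path: str, sr: int = 16000) -> np.ndarray:
    audio, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
    return mfcc.mean(axis=1)                     # one fixed-length vector per utterance

# Hypothetical layout: recordings/correct/*.wav and recordings/incorrect/*.wav
paths, labels = [], []
for label, folder in [(1, "recordings/correct"), (0, "recordings/incorrect")]:
    for wav in sorted(Path(folder).glob("*.wav")):
        paths.append(str(wav))
        labels.append(label)

X = np.stack([extract_features(p) for p in paths])
y = np.array(labels)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf").fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))

def give_feedback(wav_path: str) -> str:
    """Positive visual feedback if the live utterance is classified as correct."""
    correct = clf.predict([extract_features(wav_path)])[0] == 1
    return "correct: show happy animation" if correct else "try again: show encouraging animation"
```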

 

Akila: I think we can move on to the next presenter. Thank you Deepthi

 

[Pre-recorded video plays]: Hello everyone, our research paper, titled 'Voice to Indian Sign Language Translator' and authored by Prasanna J. Shete, Pranjali S. Jadhav, Dikshita Jain and Pearl A. Kotak, has the following problem statement. As you know, sign language is a natural way of communication for people with hearing and speaking disabilities. This is a project that aims at making use of videos of specific words to translate voice into Indian Sign Language. Our main objective is to assist people who have hearing disabilities and to enhance communication with them. Our aim is to convert English speech into Indian Sign Language with facial expressions, hand gestures and hand movements to make it as real as possible, and to widen the scope of Indian Sign Language.

 

Our project scope aims to encompass the domain of Indian Sign Language, which is roughly about 1800 words, and improve communication with people having hearing disabilities. The main objective is also to help people with hearing disabilities at railway stations, bus stations, banks and hospitals, and to have better communication with them and help them out. The source language would be recorded voice or English speech, and the target language would be an Indian Sign Language video, or ISL English.

 

The system architecture is as follows. Speech or English input is given to the system and converted to English text; the text is parsed, and then sentence reordering and ISL grammar rules are applied to get a list of words. Certain words which are not part of the ISL dictionary are removed using a stop word eliminator; then a lemmatization process is performed to reduce the words to their lemma form; then these words are converted to videos, the videos are combined to form the sentence, and the video translation is shown to the user.

 

Part A consists of converting English voice to text using the Web Speech API, while Part B consists of converting English text to a video of ISL. The voice-to-text module we have made uses the Web Speech API; this API is provided by Google and it uses the SpeechRecognition interface, which helps in translating English voice to English text.

 

Part B consists of the following steps. The first step is the collection of video clips. We have used two stored dictionaries: IndianSignLanguage.org by FDMSE, Coimbatore, and the ISLRTC dictionary provided by the Government of India. These two dictionaries were used for the collection of video clips, and we have taken the videos and stored them as key-value pairs, where the key is the word and the value is the video .mp4 file.

 

The second part of Part B is the part-of-speech tagger. What the POS tagger does is tag each word of the sentence with its form: whether it is a noun, an adjective or a verb, it tags each word. The method we have used is a probabilistic method which uses the CRF machine learning algorithm, and its accuracy score is 93.2%. For example, the input 'I am going to university' is tagged as I: pronoun, am: auxiliary verb, going: verb, and so on. This is how it goes.

 

The next step is the sentence reordering module. The English language follows subject-verb-object order, but ISL follows subject-object-verb order, so we change the form of the sentence to subject-object-verb, as we have done here.

 

Next, we use the Stop Word Eliminator to eliminate words that are not part of the ISL dictionary; words such as "am" and "the" do not have a translation in sign language.

 

Next is lemmatization, the task of reducing each word to its root or lemma form. We have used two methods, corpus-based and rule-based, to increase efficiency. The corpus-based method uses two dictionaries, the Universal Dependencies Treebank corpus and the British National Corpus, which together cover approximately 35,000 words and help make the model more efficient. For example, 'living' as an adjective stays 'living', while 'living' as a verb gets converted to 'live'. But we encountered difficulty with plural nouns like 'leaves', which did not get converted to 'leaf'. So to improve the accuracy we made a set of rules to extract the lemma of plural nouns: 'leaves' then converts to 'leaf', 'watches' to 'watch', 'fairies' to 'fairy'. Combining both methods, we got the proper output.
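A minimal sketch of this two-stage lemmatization, a dictionary lookup first and simple plural-noun suffix rules as a fallback, could look like this; the tiny lemma dictionary stands in for the roughly 35,000-word corpus mentioned above.

```python
# Corpus-based lookup first, then rule-based fallback for plural nouns.
LEMMA_DICT = {("living", "VERB"): "live", ("going", "VERB"): "go"}   # illustrative entries

def lemmatize(word: str, pos: str) -> str:
    word = word.lower()
    if (word, pos) in LEMMA_DICT:               # corpus-based lookup
        return LEMMA_DICT[(word, pos)]
    if pos == "NOUN":                           # rule-based fallback for plurals
        if word.endswith("ies"):
            return word[:-3] + "y"              # fairies -> fairy
        if word.endswith(("ches", "shes", "sses", "xes")):
            return word[:-2]                    # watches -> watch
        if word.endswith("ves"):
            return word[:-3] + "f"              # leaves -> leaf
        if word.endswith("s"):
            return word[:-1]                    # dogs -> dog
    return word

print(lemmatize("leaves", "NOUN"))    # leaf
print(lemmatize("watches", "NOUN"))   # watch
print(lemmatize("fairies", "NOUN"))   # fairy
print(lemmatize("living", "VERB"))    # live
```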

 

For the last part, the video conversion module, we used the MoviePy package of Python. The video for each word was fetched from the dictionary we collected, and these videos were then concatenated to form a combined video, which is shown to the user. For 'I am going to the university', these videos were fetched. We deployed the app on Heroku, and I will show you a demonstration of it. This is our app: if I tap the mic button and speak into it, the input is captured; then I hit the Submit button, the translation begins, and a video is shown.
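The concatenation step could be sketched with MoviePy roughly as follows; the word-to-clip mapping and file names are illustrative assumptions, not the project's actual dictionary files.

```python
# Stitch the per-word sign clips into one combined translation video.
from moviepy.editor import VideoFileClip, concatenate_videoclips

SIGN_CLIPS = {"friend": "clips/friend.mp4",          # hypothetical gloss -> clip mapping
              "clever": "clips/clever.mp4",
              "university": "clips/university.mp4"}

def glosses_to_video(glosses, out_path="translation.mp4"):
    clips = [VideoFileClip(SIGN_CLIPS[g]) for g in glosses if g in SIGN_CLIPS]
    final = concatenate_videoclips(clips, method="compose")   # play the signs in order
    final.write_videofile(out_path)
    return out_path

glosses_to_video(["friend", "clever"])   # e.g. the output for "Friend is clever"
```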

 

As you can see, the output is given. The speech we gave was 'Friend is clever'; it was converted to the text 'friend is clever', and that text was then converted to ISL English. The word 'is', which is a stop word, is eliminated, the reordered form is produced here, and the video is shown. These videos are from IndianSignLanguage.org.

 

There is a subtitle here as well, for the person to see, and then I'll try one more sentence.

 

“Museum is beautiful”

 

So 'Museum is beautiful' got converted to Indian Sign Language.

 

That's how our app works. Coming to the conclusion: we have tried to improve communication with people having hearing disabilities and to make the translation as real as possible. The project is based on Indian Sign Language, on which not much progress has been made, and the dictionary is limited to 1800 words. Our model successfully converts the entire voice input into a single video and gives the video a more realistic and lively appeal, since actual people are enacting the words. Thank you.

 

Akila: Raheel has a question. I think there's a team member here.

 

Raheel: Yeah, I had a question. How efficient would it be for names and so on? For example, if it is a name like 'Disha', since it's a visual platform, instead of signing each and every letter, why not display the text? And for local words, for example gulab jamun and rasgulla, the signs would look very similar, so instead of trying to find a way to sign them, why not show a pictorial representation of gulab jamun and rasgulla for certain local words? How efficient would this be?

 

Akila: So, yeah, I don’t think they are here.

 

That's the end of the session on AT for Communication. We had a lot of discussion around Indian Sign Language translation and its use cases. It's good to see the growing interest; it has always been there, I guess because Indian Sign Language has the right amount of complexity for machine learning models, and hence for student projects, but it is a very complex area to crack. Hope to see this work go ahead and more progress made before the next Empower.
