
As mentioned before, current hardware-based systems are very costly and prevent the signer from signing freely. On the other hand, hardware-based systems can measure 3D movements and differentiate precisely among these movements, while vision-based systems use 2D imaging, which loses the 3D information. Thus a vision-based system is not as effective as a hardware-based one, but its simplicity makes it highly desirable.

The SVBiComm system aims to provide new e-services that improve communication. It does so through a new vision-based tool that translates sign language (a sequence of images) to text and then to speech in one direction, and translates spoken words to text and then to signs performed by a 3D model in the other direction.

The first direction: As a preprocessing phase, a video stream is captured by the deaf user's camera. Frames of interest are extracted and passed to the server for processing.
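As an illustration of this capture step, the sketch below grabs frames from a webcam and keeps only those that differ noticeably from the last kept frame. The paper does not specify the selection criterion, so the motion-based threshold, the OpenCV tooling, and all parameter values are assumptions.

```python
# Sketch: capturing frames of interest from the deaf user's camera.
# The selection rule here (keep a frame when it differs enough from
# the previously kept one) is an illustrative assumption.
import cv2

def capture_frames_of_interest(camera_index=0, diff_threshold=25.0, max_frames=50):
    cap = cv2.VideoCapture(camera_index)
    kept, previous = [], None
    while cap.isOpened() and len(kept) < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Keep the first frame, or any frame differing enough from the last kept one.
        if previous is None or cv2.absdiff(gray, previous).mean() > diff_threshold:
            kept.append(frame)
            previous = gray
    cap.release()
    return kept  # frames to be sent to the server for processing
```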
The input can also be text, which is sent directly to the server using a special-purpose deaf keyboard. The server then applies image-processing techniques to the passed frames. First, a skin-detection mechanism is applied to each image to obtain a black-and-white image in which the white parts are the detected skin regions.
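A minimal sketch of such a skin-detection step follows. The paper does not give the exact mechanism, so the YCrCb color space and the threshold bounds used here are assumptions, with OpenCV standing in for the unnamed image-processing library.

```python
# Sketch: skin detection producing a black-and-white image where white
# pixels are detected skin. Thresholding in YCrCb is a common approach,
# assumed here; the bounds are typical values, not taken from the paper.
import cv2
import numpy as np

def skin_mask(bgr_frame):
    ycrcb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, 133, 77], dtype=np.uint8)    # assumed Cr/Cb skin bounds
    upper = np.array([255, 173, 127], dtype=np.uint8)
    return cv2.inRange(ycrcb, lower, upper)           # 255 = skin, 0 = background
```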
The next step passes the image through a median filter to remove any possible noise. The image is then partitioned into pieces, and the pieces are passed to a scale-down function that makes them all one size. The pieces are then merged back into one image in a specific order, starting from the top left.
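The sketch below strings these steps together under assumed parameters: the median-filter kernel size, the grid used for partitioning, and the common piece size are illustrative choices, not values from the paper.

```python
# Sketch: median filtering, partitioning the mask into pieces, scaling
# each piece to one size, and merging the pieces back into one image in
# row-major order starting from the top left. Grid and sizes are assumed.
import cv2
import numpy as np

def normalize_mask(mask, grid=(4, 4), piece_size=(32, 32)):
    mask = cv2.medianBlur(mask, 5)          # remove salt-and-pepper noise
    rows, cols = grid
    h, w = mask.shape
    ph, pw = h // rows, w // cols
    pieces = []
    for r in range(rows):                   # top-left to bottom-right order
        for c in range(cols):
            piece = mask[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
            pieces.append(cv2.resize(piece, piece_size))
    # Merge the uniformly sized pieces back into one image, row by row.
    merged_rows = [np.hstack(pieces[r * cols:(r + 1) * cols]) for r in range(rows)]
    return np.vstack(merged_rows)
```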
The last step feeds the image into a well-trained neural network, which recognizes the sign and returns the corresponding text.
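The paper does not describe the network, so the sketch below assumes a minimal one-hidden-layer classifier whose trained weights (W1, b1, W2, b2) and label set are hypothetical placeholders; it only illustrates the recognize-image-to-text step.

```python
# Sketch: classifying the preprocessed image with a trained neural network.
# The architecture, weights, and label set are hypothetical placeholders.
import numpy as np

SIGN_LABELS = ["hello", "thanks", "yes"]  # hypothetical label set

def recognize(image, W1, b1, W2, b2):
    x = image.astype(np.float32).ravel() / 255.0   # flatten and normalize
    h = np.tanh(x @ W1 + b1)                        # hidden layer
    scores = h @ W2 + b2                            # one score per sign
    return SIGN_LABELS[int(np.argmax(scores))]      # most likely sign as text
```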
Finally, the produced text is converted to speech using a TTS tool, and this speech is played on the normal/blind client's machine.
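The paper does not name the TTS tool; as one possible realization, the sketch below uses the offline pyttsx3 engine to play the recognized text on the client's machine.

```python
# Sketch: converting the recognized text to speech on the normal/blind
# client's machine; pyttsx3 is one option, assumed here.
import pyttsx3

def speak(text):
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()  # blocks until the speech has been played
```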
The second direction: The system records an audio stream from the normal/blind client and sends it to the server to be recognized. The speech is filtered before recognition, and a feature-extraction function is then applied. The speech is recognized using Dynamic Time Warping (DTW), and the corresponding text is returned. The text is then passed to the 3D graphical model on the deaf user's machine, which animates the text as visual signs.
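A minimal DTW-based recognizer is sketched below: each stored template word is compared to the input feature sequence (e.g., MFCC frames, an assumed feature choice) and the closest template wins. The implementation is the classic dynamic-programming recurrence; names and shapes are illustrative.

```python
# Sketch: Dynamic Time Warping (DTW) between a spoken word's feature
# sequence and stored templates; the template with the smallest DTW
# distance gives the recognized word.
import numpy as np

def dtw_distance(a, b):
    """a, b: 2-D arrays of shape (frames, features)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])  # local frame distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def recognize_word(features, templates):
    """templates: dict mapping word -> template feature sequence."""
    return min(templates, key=lambda w: dtw_distance(features, templates[w]))
```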
The SVBiComm system, with its video- and voice-capturing devices, uses an object-oriented programming language to implement the image-processing algorithms. The image can be captured from any distance, against a black or non-black background, as long as the scene contains no skin-colored objects; speech should be recorded in a quiet space with minimum noise. The proposed vision-based hand-tracking system does not require any special markers or gloves and can operate on a commodity PC with low-cost cameras.
The SVBiComm system provides the user with: static sign translation; isolated-word animation; isolated-word translation; continuous-sentence translation; continuous-sentence playing by voice; and input from the deaf keyboard. The system analyses video and voice streams and their content in real time, given a high-speed network connection and high-performance computing capabilities.