Sign Language Analysis Surveys

Definitions:
  1. SL - sign language
  2. SLR - sign language recognition (analysis)
  3. NMS - non-manual sign(s)
Common main points from survey papers on SLR:
  1. Generally, SLR aims at classifying signs and understanding their meaning. More specifically, understanding SL involves recognition (classification) of facial expressions and gestures, tracking (if a tracking-based method is used), and motion analysis
  2. SL is multimodal and performed in parallel (NMS in particular can only be performed in parallel with manual signs). Therefore, SLR requires simultaneous observation of separate body articulators. Authors propose Parallel HMMs instead of the classic HMM (see the Parallel HMM sketch after this list)
  3. Research focusses on feature extraction, classification, and scaling to large vocabularies
  4. Signs are affected by: a) depicting actions that unfold over time, b) inter-signer variation, c) emphasis of a particular element, d) one sign influencing another (coarticulation), e) transitions from one sign to the next
  5. Distinguish tracking and non-tracking methods
  6. Identify importance of NMS
  7. Very small corpora/datasets (especially for the Kinect)
  8. Two approaches to classification: a) classifying the sign as a whole (single classification), b) classifying its (simultaneous) components, which requires integrating/combining the extracted features to describe a sign
  9. Distinguish techniques based on what body part is being detected/classified/recognised: hand, fingers, NMS (temporal, appearance, encoding, positional dimensions), body, and head
  10. Data acquisition with data gloves (advantages: precision; disadvantages: price, cumbersome for the signer, resulting in unnatural signing), attached accelerometers (same as data gloves), cameras (with/without coloured gloves) (advantages: price, expressiveness; disadvantages: pre-processing, motion blur, no depth information), and Kinect (advantages: depth information; disadvantages: price)
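
To make point 2 above concrete, below is a minimal sketch of the Parallel HMM idea: one HMM per articulation channel (e.g. right hand, left hand, NMS) per sign, with classification by summing per-channel log-likelihoods under an independence assumption. The channel names, feature dimensions, synthetic data, and the use of the hmmlearn package are illustrative assumptions, not details taken from the surveys.

```python
# Minimal Parallel HMM sketch (not taken from the surveys): one HMM per
# articulation channel per sign; classification sums per-channel log-likelihoods.
# Assumes the hmmlearn package; channels, dimensions and data are placeholders.
import numpy as np
from hmmlearn import hmm

CHANNELS = {"right_hand": 6, "left_hand": 6, "nms": 4}   # feature dims per channel
SIGNS = ["HELLO", "THANKS"]

def train_parallel_hmm(training_data, n_states=3):
    """training_data[sign][channel] -> list of (T_i, dim) observation sequences."""
    models = {}
    for sign in SIGNS:
        models[sign] = {}
        for channel, seqs in training_data[sign].items():
            X = np.vstack(seqs)                      # concatenate sequences
            lengths = [len(s) for s in seqs]         # per-sequence lengths
            m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                                n_iter=20, random_state=0)
            m.fit(X, lengths)
            models[sign][channel] = m
    return models

def classify(models, observation):
    """observation[channel] -> (T, dim) sequence; the independence assumption
    lets us sum log-likelihoods across channels (the core Parallel HMM idea)."""
    scores = {}
    for sign, channel_models in models.items():
        scores[sign] = sum(channel_models[ch].score(observation[ch])
                           for ch in CHANNELS)
    return max(scores, key=scores.get), scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in data: 5 training sequences of 30 frames per sign/channel.
    data = {s: {ch: [rng.normal(i, 1.0, size=(30, d)) for _ in range(5)]
                for ch, d in CHANNELS.items()}
            for i, s in enumerate(SIGNS)}
    models = train_parallel_hmm(data)
    test = {ch: rng.normal(0.0, 1.0, size=(30, d)) for ch, d in CHANNELS.items()}
    print(classify(models, test)[0])   # expected to favour "HELLO" (mean 0)
```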
Specific additional points from Ong et al. on SLR:
  1. Many works incorrectly treat SL as mere gestures, which leads to inappropriate techniques and methods.
  2. Gestures are different; although SL is more complicated (yet more constrained), many gesture recognition techniques are applicable to SL.
  3. Although SL is a natural language just as speech is, many speech recognition techniques are not suitable for SLR.
  4. Kinect improves classification accuracy
  5. Tracking is hard because conversational signing is fast, producing blurred images and occlusions
  6. Research frontiers: continuous SLR through epenthesis modelling rather than simple segmentation alone, signer independence, fusion of multimodal data, use of linguistic theories to improve recognition, and generalisation to more complex corpora
Specific additional points from Cooper et al. on SLR:
  1. Distinguishes between simple, not-so-simple, and robust tracking methods
  2. Simple methods for tracking and occlusion handling are not satisfactory without coloured gloves or data gloves
  3. An advantage of separating feature-level and sign-level classification is that only a finite, comparatively small number of classes needs to be distinguished at the feature level.
  4. Approaches that classify whole signs directly from features do not scale to larger vocabularies, whereas two-tier approaches with a sign-level classification stage on top do: the feature-level classifiers do not need to be retrained when new signs are added (see the two-tier sketch after this list)
  5. Success of a classifier (either vision-based or glove-based) is measured by: a) classification accuracy, b) handling of continuous signing (general approaches: segmentation, boundary detection with automatically learned features, or epenthesis modelling, which has been shown to benefit classification accuracy), c) handling of grammatical processes in sign languages (do not 'squeeze' a sign into a fixed window; tolerate variable timing), d) signer independence
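
As a companion to points 3-4 above, here is a minimal sketch of two-tier classification: feature-level classifiers choose from small, fixed sub-unit vocabularies (handshape, location, movement), and the sign level merely combines their labels, so adding a new sign only extends the sign-level lexicon. The sub-unit names, centroids, and lexicon entries are invented placeholders, and a nearest-centroid rule stands in for whatever feature-level classifier a real system would use.

```python
# Minimal two-tier (feature-level -> sign-level) classification sketch, not from
# the survey: feature-level classifiers pick from small, fixed sub-unit sets;
# the sign level is a lexicon lookup, so a new sign only extends the lexicon.
# All centroids, sub-unit names and lexicon entries are illustrative placeholders.
import numpy as np

SUBUNITS = {
    "handshape": {"flat": [0.0, 0.0], "fist": [1.0, 1.0]},
    "location":  {"chin": [0.0, 1.0], "chest": [1.0, 0.0]},
    "movement":  {"away": [0.5, 0.0], "circular": [0.0, 0.5]},
}

# Sign-level lexicon: a sign is a combination of sub-unit labels.
LEXICON = {
    ("flat", "chin", "away"):      "THANK-YOU",
    ("fist", "chest", "circular"): "SORRY",
}

def classify_subunit(kind, features):
    """Feature-level step: nearest-centroid over a small, finite label set."""
    centroids = SUBUNITS[kind]
    return min(centroids, key=lambda lbl: np.linalg.norm(
        np.asarray(features) - np.asarray(centroids[lbl])))

def classify_sign(handshape_feat, location_feat, movement_feat):
    """Sign-level step: combine sub-unit labels and look them up in the lexicon."""
    key = (classify_subunit("handshape", handshape_feat),
           classify_subunit("location", location_feat),
           classify_subunit("movement", movement_feat))
    return LEXICON.get(key, "UNKNOWN"), key

if __name__ == "__main__":
    print(classify_sign([0.1, -0.1], [0.2, 0.9], [0.4, 0.1]))  # -> THANK-YOU
    # Adding a new sign touches only the lexicon; the sub-unit classifiers stay fixed.
    LEXICON[("flat", "chest", "circular")] = "PLEASE"
```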