Introduction to this dataset
- Our dataset consists of 32 static hand-gesture classes performed by 20 subjects (10 males and 10 females). The subjects stand about 2 to 5 meters away from the front camera and perform the same gesture with both hands.
- To ensure the variability of the dataset, we collected data under various external factors such as clothing, background, and indoor or outdoor shooting location. To avoid unnecessary noise, the background must not contain other individuals or human body parts. The dataset is intended for mobile-related development, which is why we used ordinary phone cameras. Photos are taken from four sides (left, right, front, and top), in both horizontal and vertical orientation for each side, to make the data spatially diverse.
- Our hand gesture dataset focuses only on meaningful gestures. Some gestures, such as the lucky flower and the Kung Fu salute, have specific cultural meanings in some countries. This dataset may be used in the context of human communication or online chatting.
- Below is the general folder structure for each section. The last file will be specifically mentioned in each section.
- Data
  - Gesture (01, 02, 03, ..., 32)
    - Subject
      - Side (front, left, right, top)
        - File
- Side (front, left, right, top)
- Subject
- Gesture (01, 02, 03,...,32)
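As a rough illustration, the Data / Gesture / Subject / Side / File nesting listed above can be traversed with a few lines of Python. This is a minimal sketch, not part of the dataset tooling: the function name `count_images_per_gesture` is hypothetical, and it assumes that every image sits exactly four levels below the root folder you pass in and that images use the extensions .jpg, .jpeg, .png, or .heic.

```python
from collections import Counter
from pathlib import Path

# Image extensions to count (assumption: compared case-insensitively).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic"}


def count_images_per_gesture(root):
    """Count image files per gesture folder.

    Assumes the layout root/<gesture>/<subject>/<side>/<file>.
    """
    root = Path(root)
    counts = Counter()
    # Four wildcard levels: gesture, subject, side, file.
    for path in root.glob("*/*/*/*"):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            gesture = path.relative_to(root).parts[0]
            counts[gesture] += 1
    return counts
```

Calling `count_images_per_gesture("Data")` returns a `Counter` keyed by gesture folder name, which can be used to check how many images each gesture class contains on disk.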
Each bounding-box annotation file has the same name as its corresponding image and is stored in PascalVOC format (.xml). Most images have one or two bounding boxes, depending on the gesture. An image with no gesture may have no bounding box.
Each XML file contains the main tags filename, size, and object. The object tag contains the class name and the bounding-box coordinates (xmin, ymin, xmax, ymax).
The data was labeled with the LabelImg tool.
For segmentation, we provide the masks in JSON format. The JSON file is a dictionary in which each key-value pair stores the information for a single image.
The specific structure is as follows:
- key: name of the image
- value: a dictionary with 2 key-value pairs:
  - size: size of the image
  - contour: a list containing the coordinates of each pixel that lies on the contour of the segmentation mask.
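Both annotation types described in this section can be read with the Python standard library. The following is a minimal sketch under stated assumptions, not an official loader: the function names `read_voc_boxes` and `read_masks` are hypothetical, the XML files are assumed to follow the usual LabelImg PascalVOC layout in which each object tag holds a name tag and a bndbox tag with xmin, ymin, xmax, and ymax, and the exact layout of the size and contour values in the JSON is not specified here, so they are returned unchanged.

```python
import json
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_path):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples.

    Assumes the LabelImg PascalVOC layout: <object><name/><bndbox/></object>.
    An image without a gesture yields an empty list.
    """
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.findall("object"):
        name = obj.findtext("name")
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((name, *coords))
    return boxes


def read_masks(json_path):
    """Return {image_name: (size, contour)} from a mask JSON file.

    size and contour are passed through as stored in the file.
    """
    with open(json_path, encoding="utf-8") as f:
        masks = json.load(f)
    return {name: (entry["size"], entry["contour"]) for name, entry in masks.items()}
```

For an image with two hands in view, `read_voc_boxes` would typically return two tuples, one per bounding box.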
Data Statistics

- ✓ 32 static hand-gesture classes.
- ✓ 33,471 samples of RGB images
- ✓ 1,069 images for each gesture
- ✓ File extensions: .jpg, .jpeg, .png, .heic.
- ✓ High resolution of 4128 × 3096.
- ✓ Subjects: 10 males and 10 females
Copyright Notes
The database is released for research and educational purposes. We hold no liability for any undesirable consequences of using the database. All rights to the Hand Gesture Dataset are reserved.
If you use our dataset, please cite the following papers:
- citation paper 1
- citation paper 2
- citation paper 3