The SCUT-Ego-Gesture dataset is an extension of the SCUT-Ego-Finger dataset. It contains 59,111 egocentric RGB images covering 16 hand gestures (11 single-hand gestures and 5 two-hand gestures). The 11 single-hand gestures are SingleOne (3374 frames), SingleTwo (3763 frames), SingleThree (3768 frames), SingleFour (3767 frames), SingleFive (3755 frames), SingleSix (3757 frames), SingleSeven (3773 frames), SingleEight (3380 frames), SingleNine (3769 frames), SingleBad (3761 frames) and SingleGood (3769 frames). The 5 two-hand gestures are PairSix (3681 frames), PairSeven (3707 frames), PairEight (3653 frames), PairNine (3653 frames) and PairTen (3536 frames).
We collected the data under challenging conditions, including complex backgrounds, illumination changes, different users' hands and hand directions, and skin-like backgrounds. To cover a wide range of situations, we collected the data in 7 scenes: 4 outdoor and 3 indoor.
 Wenbin Wu, Chenyang Li, Zhuo Cheng, Xin Zhang and Lianwen Jin. "YOLSE: Egocentric Fingertip Detection from Single RGB Images." Proceedings of the IEEE International Conference on Computer Vision Workshops. 2017.
There are 16 folders (from SingleOne to PairTen) in the "imgJpg" folder. Each has a corresponding txt file in the "label" folder. All images are named "AAA_BBB_CCC_color_DDD.jpg", where "AAA" is the scene (such as ChuangyeguBusstop or ChuangyeguLab), "BBB" is "Single" or "Pair", "CCC" is "One", "Two", "Three", and so on, and "DDD" is the serial number of the image within scene "AAA".
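The naming scheme can be taken apart with a regular expression. The following is a minimal sketch, not part of the dataset tooling: the function name parse_image_name is hypothetical, and the "folder" field is inferred from the example path shown in this document rather than stated as a rule.

```python
import re

# Pattern for "AAA_BBB_CCC_color_DDD.jpg".
# Assumptions: the gesture part is alphabetic and the serial is a decimal integer.
FILENAME_RE = re.compile(
    r"^(?P<scene>.+)_(?P<hands>Single|Pair)_(?P<gesture>[A-Za-z]+)"
    r"_color_(?P<serial>\d+)\.jpg$"
)


def parse_image_name(name):
    """Split an image file name into scene, hands, gesture and serial number."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    fields = match.groupdict()
    fields["serial"] = int(fields["serial"])
    # Inferred, not documented: the gesture folder looks like "BBB" + "CCC".
    fields["folder"] = fields["hands"] + fields["gesture"]
    return fields


info = parse_image_name("ChuangyeguLab_Single_Five_color_0.jpg")
print(info)
```

A file name that does not follow the pattern raises ValueError, which makes stray files easy to spot when scanning a folder.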
We label all 16 gestures by saving the normalized top-left and bottom-right coordinates of the hand bounding box, together with the normalized fingertip and finger joint coordinates of the outstretched fingers. Figure 1 shows representative samples of each gesture.
Figure 1. Some representative samples of each gesture
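Because the labels are normalized, they have to be scaled back before drawing on an image. The sketch below assumes that x is normalized by the image width and y by the image height, which this document does not state explicitly. The 640x480 size is also an assumption used only for illustration, and to_pixels is a hypothetical helper name. The sample values are the bounding box corners from the SingleFive.txt example row in this document.

```python
def to_pixels(x_norm, y_norm, width, height):
    """Convert one normalized (x, y) point to integer pixel coordinates."""
    return round(x_norm * width), round(y_norm * height)


# Assumed image size for illustration only; read the real size from each image.
WIDTH, HEIGHT = 640, 480

# Bounding box corners (tlx, tly) and (brx, bry) from the SingleFive.txt example row.
top_left = to_pixels(0.2625, 0.2, WIDTH, HEIGHT)
bottom_right = to_pixels(0.517187, 0.78125, WIDTH, HEIGHT)
print(top_left, bottom_right)
```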
Each txt file is arranged according to the following principles:
1 Each row is the path of one image followed by the normalized coordinates of its key points (a series of floating-point numbers);
2 The bounding box coordinates come first, then the coordinates of the hand key points;
3 The hand key points are saved in the order: thumb, index, middle, ring, pinkie;
4 For each finger, the fingertip coordinates come first, then the finger joint coordinates;
5 For each point, the X coordinate comes first, then the Y coordinate;
6 If the gesture uses both hands (i.e. “Pair”), the coordinates of the right hand come first, then those of the left hand.
Take SingleFive.txt as a first example. Its first row is:
Dataset/Img/SingleFive/ChuangyeguLab_Single_Five_color_0.jpg 0.2625 0.2 0.517187 0.78125 0.275 0.589583 0.33125 0.658333 0.2875 0.327083 0.353125 0.510417 0.353125 0.229167 0.392188 0.464583 0.43125 0.245833 0.428125 0.472917 0.492188 0.339583 0.4625 0.50625
Here, “Dataset/Img/SingleFive/ChuangyeguLab_Single_Five_color_0.jpg” is the path of the image ChuangyeguLab_Single_Five_color_0.jpg. According to the principles above, the floating-point numbers represent: tlx, tly, brx, bry, ttx, tty, tjx, tjy, itx, ity, ijx, ijy, mtx, mty, mjx, mjy, rtx, rty, rjx, rjy, ptx, pty, pjx, pjy (where “tlx” is the x coordinate of the top-left point of the bounding box, “bry” is the y coordinate of the bottom-right point of the bounding box, “ttx” is the x coordinate of the thumb tip, “tjy” is the y coordinate of the thumb joint, and so on).
Next, take SingleTwo.txt. Because we only label the outstretched fingers (index and middle for the SingleTwo gesture), the floating-point numbers after the path represent, in order: tlx, tly, brx, bry, itx, ity, ijx, ijy, mtx, mty, mjx, mjy.
Finally, for two-hand gestures, we save all the coordinates associated with the right hand first, and then those of the left hand.
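As an illustration, the row above can be mapped onto the 24 field names. This is a minimal sketch that assumes the image path contains no whitespace; parse_single_five is a hypothetical helper name, not part of the dataset.

```python
# First row of SingleFive.txt, copied from the example above.
ROW = (
    "Dataset/Img/SingleFive/ChuangyeguLab_Single_Five_color_0.jpg "
    "0.2625 0.2 0.517187 0.78125 "
    "0.275 0.589583 0.33125 0.658333 "
    "0.2875 0.327083 0.353125 0.510417 "
    "0.353125 0.229167 0.392188 0.464583 "
    "0.43125 0.245833 0.428125 0.472917 "
    "0.492188 0.339583 0.4625 0.50625"
)

# tlx, tly, brx, bry, then ttx, tty, tjx, tjy, itx, ... pjy.
FIELD_NAMES = ["tlx", "tly", "brx", "bry"] + [
    finger + point + axis
    for finger in "timrp"  # thumb, index, middle, ring, pinkie
    for point in "tj"      # tip, then joint
    for axis in "xy"       # x, then y
]


def parse_single_five(row):
    """Return (image path, {field name: value}) for one SingleFive row."""
    path, *values = row.split()  # assumes no whitespace inside the path
    if len(values) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} values, got {len(values)}")
    return path, dict(zip(FIELD_NAMES, map(float, values)))


path, labels = parse_single_five(ROW)
print(path)
print(labels["tlx"], labels["tly"], labels["brx"], labels["bry"])
```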
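Since the number of values depends on which fingers are outstretched, a parser needs the finger list for each gesture. The sketch below takes that list as an argument. The function name parse_hand, the output layout, and the sample row (including its path and values) are made up for illustration and are not taken from the dataset.

```python
FINGER_ORDER = ("thumb", "index", "middle", "ring", "pinkie")


def parse_hand(values, fingers):
    """Parse one hand: a bounding box, then tip and joint for each labeled finger."""
    # Labels follow the fixed order thumb, index, middle, ring, pinkie.
    fingers = sorted(fingers, key=FINGER_ORDER.index)
    expected = 4 + 4 * len(fingers)
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    hand = {"bbox": tuple(values[:4]), "fingers": {}}
    for i, finger in enumerate(fingers):
        tx, ty, jx, jy = values[4 + 4 * i : 8 + 4 * i]
        hand["fingers"][finger] = {"tip": (tx, ty), "joint": (jx, jy)}
    return hand


# Made-up SingleTwo-style row; the path and values are illustrative only.
row = "Dataset/Img/SingleTwo/example.jpg 0.25 0.25 0.5 0.75 0.3 0.3 0.35 0.5 0.4 0.3 0.4 0.5"
path, *raw = row.split()
hand = parse_hand([float(v) for v in raw], ["index", "middle"])
print(hand)
```

Checking the value count against the finger list catches rows that are parsed with the wrong gesture definition.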
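For a two-hand row, the values can be split into a right-hand part and a left-hand part before further parsing. The sketch below rests on an assumption that this document does not confirm: each hand's part consists of its own bounding box (4 values) followed by 4 values per labeled finger. The function name split_pair_values and the sample values are hypothetical. If the actual files store the boxes differently, the split lengths need to be adjusted.

```python
def split_pair_values(values, right_finger_count, left_finger_count):
    """Split the values of a "Pair" row into (right hand, left hand) parts.

    Assumed layout per hand: 4 bounding box values, then 4 values per finger.
    """
    right_len = 4 + 4 * right_finger_count
    left_len = 4 + 4 * left_finger_count
    if len(values) != right_len + left_len:
        raise ValueError(
            f"expected {right_len + left_len} values, got {len(values)}"
        )
    # The right hand is stored first, then the left hand.
    return values[:right_len], values[right_len:]


# Made-up values for a row with one labeled finger on each hand.
values = [i / 100 for i in range(16)]
right, left = split_pair_values(values, right_finger_count=1, left_finger_count=1)
print(len(right), len(left))
```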
Please fill in the APPLICATION FORM, follow the note inside and send an email
If the dataset link is unavailable due to a network problem or server error, please contact Chenyang Li: firstname.lastname@example.org for alternative access.
Chenyang Li: email@example.com