How gesture recognition can be implemented as an aid for electroacoustic composition:
with a specific focus on the Leap Motion Device
Jonathan Higgins
Submitted in partial fulfilment of the requirements for the degree of BMus
Department of Music
University of Sheffield
England
August 2015
Acknowledgements
Firstly I would like to thank my dissertation supervisor Adrian Moore for all the excellent help and
advice he has provided whilst I have been working on this dissertation (as well as most of my
other work). The research areas that he suggested have significantly shaped the direction of this
project and I am extremely grateful for all of his help.
I would also like to thank my partner Mabel, for her help, patience and cups of tea; my friends Alex
and Jay, for their help with troubleshooting bugs in the jh.leap tools; and finally, my parents, for
their support and regular gifts of Domino's pizza.
Contents
1 Introduction 4
1.1 Gesture Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Input and Computer Vision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Musical Applications of Gesture Recognition 7
2.1 Early Applications of Gesture Recognition . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Contemporary Applications of Gesture Recognition . . . . . . . . . . . . . . . . . . 9
2.2.1 The Wiimote . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.2 The Kinect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3 The Leap Motion Device 12
3.1 Construction and Vision Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2 Accuracy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3 Musical Applications of the Leap Motion Device . . . . . . . . . . . . . . . . . . . . 15
4 jh.leap 17
4.1 The Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.1.1 jh.leap main . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.1.2 jh.leap sample player . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.1.3 jh.leap reverb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.1.4 jh.leap tremolo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.1.5 jh.leap pan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.2 The Utilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5 Conclusion 23
Bibliography 25
Chapter 1
Introduction
There are countless peripherals available to musicians that offer methods of human-computer inter-
action beyond the mouse and keyboard; however, few of these are available to consumers and fewer
still attain widespread adoption (Collins 2010, p. 198). As well as peripherals specifically designed
for musical purposes, such as MIDI keyboards and drum pads, many other peripherals either offer
a more general method of interaction, for example the Leap Motion Device (Garber 2013), or are
repurposed for use within a musical environment; examples of this include the PlayStation Move
(Pearse 2011) and the Wacom Graphics Tablet (Moore 2008).
Emmerson (2000, p. 209) explains that approaches to human-computer interaction within musical
applications can be placed into two main groups: “devices which track and measure physical ac-
tion”; and devices “which analyse the sound produced in performance [...] for the control of sound
production or processing”. The first of these groups has been extensively implemented since the
first developments in electronic music (ibid., p. 209). Examples of this vary from the simple MIDI
keyboard through to complex data gloves (Fischman 2013). The second of Emmerson’s groups
requires significantly higher processing power and more complex software in order to be successful.
Because of this its use is significantly less widespread; one example of this method of interaction is
the OMax system developed at IRCAM (Assayag, Bloch, and Chemillier 2006).
1.1 Gesture Recognition
Gesture recognition is a method of human-computer interaction which fits into the first of Em-
merson’s approaches to interaction. As the name suggests, gestural recognition utilises sensors to
identify and track the gestures and motions of the user. These can be full body movements or
be limited to movements of a hand or finger. For this paper I will be focusing on hand gesture
recognition. A gesture can be defined as a movement that imparts meaning; gestures differ from purely
functional movements in that they also carry information. As Badi and Hussein (2014, p. 871) explain,
the motion of steering a car and the motion of describing a circular object with your hands are extremely similar.
Steering a car serves only to fulfil a functional purpose whereas motions describing a circular object
contain information about the size of the object.
The variety of gestures the human hand can create is vast; these gestures can be sorted into four main
categories: conversational gestures (gestures which function alongside speech), controlling gestures
(gestures such as pointing or the orientation of hands in a 3D space), manipulative gestures (gestures
which interact with real or virtual objects) and communicative gestures (such as sign language) (Wu
and Huang 1999). Musical applications of gestural recognition usually focus on utilising controlling
and manipulative gestures as these are both the easiest to detect and the most consistent between
users; the applications of pairing these two groups of gestures have been widely explored and are
increasingly expanding (Al-Rajab 2008, p. 12). By being less user dependent than conversational
and communicative gestures, controlling and manipulative gestures are not as prone to variation and
co-articulation, issues that often make gestural recognition difficult (Dix et al. 2004, p. 383).
Dix et al. (ibid., p. 88) suggested in 2004 that the “rich multi-dimensional input” provided by
gestural recognition devices was a solution in search of a problem, as most users do not require such
a comprehensive form of data input and those that do cannot afford it. However, with advances
in technology, consumer-grade electronics capable of detailed gesture recognition are becoming
increasingly affordable and widespread. Because of this, software designers are beginning to
take advantage of the new levels of interaction available to them (Zeng 2012, p. 4).
1.2 Input and Computer Vision
There are two main hardware approaches to relaying information about hand movement to a com-
puter. The first utilises a peripheral that is worn on the hand called a dataglove. There are several
approaches to the construction of a dataglove and each approach employs different sensors to detect
motion. The most common method uses fibre-optic cables attached to the fingers of a lycra glove.
When a finger bends, light leaks from the bend in the fibre-optic cable; the glove detects these
fluctuations in light intensity and relays this information back to the computer. The computer uses
this information to map the movement of the hand (one example application of this approach is the
Lady’s Glove developed for Laetitia Sonami (Bongers 2000, p. 482)) (Dix et al. 2004, p. 88). The
second approach uses computer vision.
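The fibre-optic sensing principle described above can be illustrated with a toy calibration. This is only a sketch: the linear intensity-to-bend mapping and the calibration values are assumptions for illustration, not the behaviour of any real glove.

```python
def bend_angle(intensity, straight=1.0, full_bend=0.35, max_angle=90.0):
    """Estimate a finger's bend angle (degrees) from received light intensity.

    Assumes, purely for illustration, that intensity falls linearly from
    `straight` (finger flat) to `full_bend` (finger fully curled).
    """
    frac = (straight - intensity) / (straight - full_bend)
    frac = min(max(frac, 0.0), 1.0)  # clamp to the calibrated range
    return frac * max_angle
```

A real glove would calibrate the flat and fully-bent intensity values per finger and per user before mapping readings to joint angles.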
Computer vision is the “acquisition and processing of visual information” by a computer; this infor-
mation can then be used by the computer for a variety of different applications (Badi and Hussein
2014, p. 876). Approaches to detecting hands through computer vision are numerous. They can
vary from obvious methods such as detecting infrared, through to the more obscure, such as detect-
ing which pixels of a video-feed are skin by analysing the colour of each pixel (Forsyth and Ponce
2003, p. 591). As well as the ability to detect hands, computer vision also allows us to track the
hand’s movements in space. Depth can be tracked in a variety of ways. The most common method
uses the same process our eyes and brain use to detect depth. Using two cameras mounted next to
each other, the computer can compare the two image streams. By knowing the physical relationship
between the two cameras (i.e. how far apart they are), the computer can analyse the difference in
the two images to create a strong sense of depth; this method is known as stereoscopic vision (or
simply stereo vision) (Forsyth and Ponce 2003, p. 321). Another method of measuring depth is an
approach called structured light sensing. Structured light sensors project a known pattern onto an
unknown surface; by analysing the deformation of the pattern the sensor is able to determine the
three dimensional shape of the unknown surface (Weichert et al. 2013, p. 6381).
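The stereo vision principle reduces to a simple relation for rectified cameras: depth is inversely proportional to the disparity between the two images. The following is a minimal sketch under a pinhole-camera assumption; the numbers in the test values are illustrative, not taken from any real device.

```python
def stereo_depth(focal_px, baseline_mm, x_left_px, x_right_px):
    """Depth from the horizontal disparity between two rectified images:
    Z = f * B / d, where f is the focal length in pixels, B the distance
    between the two cameras, and d the disparity in pixels."""
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("point must appear further left in the left image")
    return focal_px * baseline_mm / disparity
```

The further away a point is, the smaller the disparity, which is why stereo systems lose precision with distance.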
Chapter 2
Musical Applications of Gesture
Recognition
In this chapter I will be examining previous applications of gesture recognition within a musical
environment. I will be looking at how approaches to gestural input and the technology required to
successfully carry it out have developed over the last century and will present two case studies of
contemporary applications of gesture recognition.
2.1 Early Applications of Gesture Recognition
Although often considered cutting edge technology, gesture recognition has a rich history within electronic music; the Theremin, patented in 1928, is the earliest example of gesture being used as a means of interaction within electronic music (Bongers 2000, p. 481). The Theremin works using a pair of slightly detuned oscillators that broadcast extremely high frequencies (>1,000,000 Hz) over a radio antenna (Theremin and Petrishev 1996, p. 50). Moving a hand nearer to the antenna causes a change in capacitance as the hand and performer ground some of the broadcast signal. The Theremin measures these changes and uses them to control pitch and volume (ibid., p. 50). Whilst developing his device Leon Theremin stated:
“I believe that the movement of the hand in space, unencumbered by pressing a string or bow, is capable of performing in an extremely delicate manner. If the instrument were able to produce sounds by responding readily to the free movement of the hands in space, it would have an advantage over traditional instruments.” (ibid., pp. 49-50)
This approach to instrument design was unique for its time and inspires many contemporary developments
in gestural control. Although basic in its ability to recognise gesture, the system was nonetheless
revolutionary, both for being the first implementation of gestural control and for being a
proof of concept for the worth of gesture recognition.
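The heterodyning principle behind the Theremin can be sketched numerically: the hand's capacitance detunes one of two LC oscillators, and the audible pitch is the difference between them. The component values used in testing are invented for illustration; f = 1/(2π√(LC)) is the standard LC resonance formula.

```python
import math

def oscillator_freq(inductance, capacitance):
    """Resonant frequency of an LC oscillator: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))

def audible_pitch(f_fixed, inductance, c_circuit, c_hand):
    """Heterodyne output: the hand adds capacitance in parallel with the
    variable oscillator's tank circuit, lowering its frequency; the audible
    pitch is the difference between the two oscillator frequencies."""
    f_variable = oscillator_freq(inductance, c_circuit + c_hand)
    return abs(f_fixed - f_variable)
```

Because both oscillators run above 1 MHz, even the tiny capacitance of an approaching hand shifts the difference tone across the whole audible range.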
After the release of the Theremin in the 1920s it wasn’t until the late 1970s that gesture recognition
began to be explored further. Development of the Sayre Glove in 1979 kick-started the field of gesture
recognition as a means of human-computer interaction (Sturman and Zelter 1994, p. 32). As this field
began to expand, musicians inevitably began to adopt and develop gesture recognition technology for
their own means, producing pioneering research in computer music and human-computer interaction
(for example: Buxton et al. 1979). Michel Waisvisz’s The Hands was one of a number of dataglove
systems designed specifically for musical applications in the 1980s (Roads 1996, pp. 630-635).
Developed at STEIM, The Hands was the result of Waisvisz’s experiments with instrument design for
the improved performance of electronic music (Bongers 2000, p. 482). The Hands contained several
different methods of interaction, some were common in many non-wearable musical peripherals -
such as buttons and pressure sensors - however, The Hands expanded on this existing technology by
incorporating mercury tilt switches and ultrasonic sensors to enable aspects of gesture recognition
(Bongers 2007, pp. 11-12). Other dataglove devices that were developed for musical applications in
the 1980s and early 1990s include the Lady’s Glove (discussed in section 1.2) and modified versions
of the Mattel Power Glove, a dataglove released as a game controller for the Nintendo Entertainment
System (ibid., pp. 12-13).
Alongside the developments in dataglove technology, during the 1980s and 1990s musicians were also
looking at alternative input methods for gesture recognition. One of these methods was a peripheral
called an electronic conductor’s baton. Several different electronic conductor’s batons were produced
during this time, employing a variety of different sensors to detect the motions of a conductor. These
batons allowed real time control of a synthesised or acousmatic composition (Roads 1996, p. 654).
Some batons like the various MIDI Batons developed at Queen’s University, Canada, employed
hardware sensors like accelerometers within the baton to detect the movements of the conductor.
This information could then be relayed to a computer for analysis and detection of gestural content
(Keane and Gross 1989; Keane and Wood 1991). Another method for tracking the baton employed
rudimentary computer vision systems that allowed the computer to watch the gestures produced by
the conductor. Many of the computer vision systems utilised cameras fitted with infrared filters to
track an infrared light on the end of the baton (Morita, Hashimoto, and Ohteru 1991, p. 47; Marrin
et al. 1999). By pairing the baton with a dataglove, the system developed by Morita, Hashimoto, and
Ohteru (1991) allowed the conductor to relay further information to the computer, such as changes
in dynamic. Conducting falls into the category of communicative gestures (more: section 1.1). Like
spoken language, communicative gestures are user dependent; as voice recognition software often
struggles to decipher various accents, so too does gesture recognition software struggle to decipher
each user’s ‘gestural accent’. Many of these batons were hampered by their inability to work precisely
when changing between conductors, and because of this many systems faced a trade-off between
stability and sensitivity (Keane and Gross 1989, p. 153; Morita, Hashimoto, and Ohteru 1991,
p. 52). This inconsistency, coupled with the expense of producing a baton, is possibly why these
electronic batons have become less popular in recent years.
Advancements in computer vision in the 1980s and 1990s were beginning to allow musicians to interact
with computers without the need to hold or wear peripherals. The Infrared-based MIDI Event
Generator designed by Genovese et al. (1991) was a gestural controller that operated using infrared
to track objects moving in space above the controller. As we will see in chapter 3, this device
utilises a similar hardware approach to computer vision as the Leap Motion device. By using four
groups of infrared transmitters and receivers mounted in a 25cm square, the Infrared-based MIDI
Event Generator could track movements in a pyramid space above the device (Genovese et al. 1991,
pp. 2-3). By transmitting infrared light in a pyramid, any object that moved into the pyramid would
reflect the infrared light back towards the device. These reflections were then picked up by the
infrared receivers, which allowed the device to ‘see’ (ibid., p. 3). This data was then analysed and
output as MIDI allowing the device to control any MIDI capable hardware (ibid., pp. 5-6). Another
early approach to computer vision used sonar. By utilising ultrasonic signals, Chabot (1990, p. 20)
was able to create a live electronic music performance system capable of computer-vision-based
gesture recognition. Interestingly, Chabot (ibid., p. 27) emphasised the importance of good software
to accompany new hardware. He states: “We have seen too many quick hacks jeopardizing the
use of great hardware gesture controllers”. This emphasis on software is missing from many papers
written on musical human-computer interaction in this era. As we move into the next section we will
explore applications of consumer-grade electronics and how innovation has begun to shift towards
software developments.
2.2 Contemporary Applications of Gesture Recognition
With gesture recognition technology becoming more affordable and widespread, applications of ges-
ture recognition are gradually becoming increasingly popular amongst musicians. To see this, one
need only look at the number of papers submitted to NIME (New Interfaces for Musical Expres-
sion) on gesture recognition in recent years (NIME 2015). Although extensive research into the
development of new gestural devices specifically for musical applications still exists, there has been
a significant shift in recent years towards utilising consumer grade gesture recognition devices for
musical applications. This has been prompted by the widespread adoption of gesture recognition
within console gaming as a means of interaction through peripherals such as the Wiimote and Kinect
(Collins 2010, p. 197).
2.2.1 The Wiimote
The Nintendo Wii games console was released in 2006 and to date has sold over 101 million units
worldwide (Nintendo 2015, p. 2). The controller that came with the Nintendo Wii (the Wiimote)
was a wireless hand-held device that tracked its position in space, as well as acceleration on the X,
Y, and Z axes, and the controller’s rotation, pitch and yaw (Wingrave et al. 2010, p. 72). One of
the two primary sensors that the Wiimote employs utilises a separate sensor bar which contains a
number of infrared LEDs at fixed widths (ibid., p. 74). The Wiimote contains an infrared camera
which when pointed at the sensor bar allows the Wiimote to calculate its relative position in space
from the sensor bar (ibid., p. 74). The second primary sensor is a three axis accelerometer similar
to those found in mobile phones (ibid., p. 75). The Wiimote is not without its drawbacks, however.
By utilising an accelerometer rather than a gyroscope, the controller has no gravitational bearing
and because of this its measurements are notoriously approximate (Pearse 2011, p. 126).
One of the most popular applications of the Wiimote is to trigger sound files like a virtual drumstick
(Collins 2010, p. 197). By utilising a peak detection algorithm on the output of the accelerometer it
is possible to determine when a drumming-like motion has been made (Kiefer, Collins, and Fitzpatrick
2008). By utilising both the sensor bar and the Wiimote, Miller and Hammond (2010) were able to
create a virtual instrument that mimicked the violin or cello. The performer plays the instrument
by pressing buttons on the Wiimote to determine finger positions and uses the sensor bar as a
bow (ibid.). This approach was interesting as it required the performer to hold the Wiimote still
and to move the sensor bar rather than the intended configuration where the sensor bar stays in a
fixed position (ibid., p. 497). Peng and Gerhard (2009) used the Wiimote to create an electronic
conductor’s baton by attaching an infrared LED to the end of a conventional conductor’s baton.
Using the infrared camera in the Wiimote to track the motion of the baton they were able to create
a low cost alternative to other computer-based conducting systems. This system works using the
same principle as the batons developed by Morita, Hashimoto, and Ohteru (1991) and Marrin et al.
(1999) which were discussed in section 2.1.
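The peak-detection step described above can be approximated by a very small algorithm: flag samples that exceed a threshold and are also local maxima of the acceleration magnitude. This is a generic sketch, not the implementation used by Kiefer, Collins, and Fitzpatrick; the threshold value is illustrative.

```python
def detect_hits(accel, threshold=2.0):
    """Return the indices of drum 'hits' in a stream of acceleration
    magnitudes: samples above `threshold` that are also local maxima."""
    hits = []
    for i in range(1, len(accel) - 1):
        if accel[i] >= threshold and accel[i] > accel[i - 1] and accel[i] >= accel[i + 1]:
            hits.append(i)
    return hits
```

Each returned index would trigger a sample; a real implementation would also debounce the output so that a single strike cannot fire twice.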
The Wiimote introduced many people to gesture recognition and its adoption by consumers has
had a profound effect on the field of human-computer interaction (Wingrave et al. 2010, p. 71).
However, the Wiimote as a musical controller is limited by its hardware and the approximate data
that it outputs. This means that very few useful applications have been developed for it. Rapid
advancements in gesture recognition technology since the Wiimote’s release mean that consumers
have access to more accurate hardware, such as the Kinect. Because of this the Wiimote has been
all but forgotten by most musicians in the last few years.
2.2.2 The Kinect
Another game controller which allows gestural input is the Kinect sensor. Released by Microsoft in
2010, the Kinect was designed for use with the Xbox 360 and Xbox One games consoles. By 2013
Microsoft had sold in excess of 24 million Kinect sensors (Microsoft 2013). The Kinect features a
full RGB camera as well as an infrared projector, infrared camera and a four-microphone array (Zeng
2012, p. 4). To detect depth the Kinect sensor utilises structured light sensing (Weichert et al.
2013, p. 6381). By projecting a constant pattern of infrared laser dots onto a scene and recording
the results with the infrared camera, the device is able to compare the resulting pattern of dots to
a reference pattern. By analysing the displacement between the two images the Kinect device can
use this information to estimate depth (Khoshelham and Elberink 2012, p. 1438). Using either the
drivers provided by the OpenKinect project or the Kinect SDK it is possible to connect the Kinect
sensor to a computer and utilise its data output for functions other than gaming (OpenKinect 2015;
Microsoft 2015).
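The depth estimation described above can be written as a single relation between a dot's measured displacement and its depth. The formula below follows Khoshelham and Elberink's derivation as I read it; treat it as a sketch, and the parameter values in the example as placeholders rather than calibrated Kinect constants.

```python
def kinect_depth(z_ref_mm, focal_px, baseline_mm, disparity_px):
    """Depth of a projected dot from its displacement against the
    reference pattern: Z = Z0 / (1 + (Z0 / (f * b)) * d), where Z0 is the
    reference-plane distance, f the IR camera focal length, b the
    projector-camera baseline and d the measured disparity."""
    return z_ref_mm / (1.0 + (z_ref_mm / (focal_px * baseline_mm)) * disparity_px)
```

A dot with zero displacement lies on the reference plane; the larger the displacement, the further the surface deviates from it.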
Since its release there have been many projects that utilised the Kinect sensor for gesture controlled
musical applications. One such project is the Crossole system developed by Senturk et al. (2012).
Crossole is a meta-instrument which utilises gesture recognition as its main method of interaction. A
meta-instrument implements a “one-to-many mapping between a musician’s gestures and the sound
so that a musician may perform music in a high level instead of playing note by note” (ibid., p. 1).
Using the Kinect’s depth data the system tracks the position of both the user’s hands. The user can
then perform a variety of controlling gestures (pointing, swiping) and manipulative gestures (drag
and drop) (ibid., p. 3). These gestures allow the user to control a myriad of parameters including
tempo and dynamic, as well as controlling melody sequencing and chord changes (ibid., pp. 2-3).
Another musical application of the Kinect sensor is the USSS-Kinect environment and the USSS-
KDiffuse tool designed by Pearse (2011). The USSS-Kinect environment allows the user to place
virtual spheres that can be touched, intersected or moved to control a musical parameter. What
this controls can be defined by the user as the data is sent out via Open Sound Control (OSC), a
data transfer protocol with advantages over MIDI control messages (ibid., p. 128). OSC employs
networking technology to transfer data for sound control between devices; unlike MIDI the data
values and naming scheme are user defined allowing for complete control over the output (Wright
2002). One example application of how the USSS-Kinect environment can be employed is the
USSS-KDiffuse. A four-by-eight matrix of spheres in the USSS-Kinect environment controls a four-
in eight-out diffusion system: each row of the matrix represents a sound source and
each column represents a different speaker routing (Pearse 2011, pp. 128-129). The system allows
the user to interact with diffusion in a unique way and produce spatial effects that would have been
extremely difficult, if not impossible, with traditional diffusion setups.
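OSC's encoding is simple enough to construct by hand, which illustrates why its named, user-defined addresses are more flexible than MIDI's fixed controller numbers. The sketch below packs a single-float OSC 1.0 message; the address `/pan` is an arbitrary example, not an address used by USSS-KDiffuse.

```python
import struct

def osc_message(address, value):
    """Pack a one-float OSC message: the address pattern and the type-tag
    string (",f") are NUL-terminated and padded to 4-byte boundaries, then
    the float is appended as big-endian IEEE 754 (OSC 1.0 encoding)."""
    def padded(s):
        b = s.encode("ascii") + b"\x00"
        return b + b"\x00" * (-len(b) % 4)
    return padded(address) + padded(",f") + struct.pack(">f", value)
```

The resulting bytes would normally be sent over UDP; the receiver routes the message purely by its address string, so both ends are free to agree on any naming scheme.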
The Kinect sensor allows for significantly more precise input than the Wiimote but costs nearly four
times as much. The wealth of depth information output by the Kinect allows for detailed interaction
that before its release was unheard of in consumer grade electronics. The Kinect’s skeletal tracking
is limited and is unable to track fingers. The release of the v2 Kinect and SDK in late 2014
introduced the ability to track more joints, including thumbs, which allows for more detailed methods
of gestural interaction. It will be interesting to see how in the coming years musicians utilise these
new features.
Chapter 3
The Leap Motion Device
In the past decade console gaming has caused a significant rise in the popularity of gesture recogni-
tion. Because of this, manufacturers have begun to produce gesture recognition hardware designed
specifically for interfacing with computers. Released in July 2013, the Leap Motion was one of the
first of these peripherals to come to market. In this chapter I will be looking at how the Leap Motion
device functions and how it has been utilised in musical applications.
3.1 Construction and Vision Processes
Figure 3.1: Construction of the Leap Motion Device (Weichert et al. 2013, p. 6382)
There is no official documentation on how the Leap Motion functions. However, by analysing the
construction of the device and patents filed by Leap Motion, Inc. and David Holz (the original
inventor of the Leap Motion device), we can theorise on how the Leap Motion operates. As can be
seen from figure 3.1 the Leap Motion’s hardware is surprisingly simple, formed only of three infrared
LEDs and a pair of infrared cameras. The peripheral is a little larger than a matchbox and is
designed to sit in front of a computer. It connects to the computer via a micro-USB 3.0 cable. The
Leap Motion’s effective range is approximately 25 to 600 millimetres above the device and the field
of view is an inverted pyramid centred on the device (Guna et al. 2014, p. 3705). It is probable that
the Leap Motion utilises edge detection to distinguish between objects and their background. Edge
detection is prone to inaccuracies due to changes in lighting conditions. The three infrared LEDs
likely serve to remedy this as they can be used to increase the contrast of the image as outlined by
D. Holz and Yang (2014).
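Edge detection in its simplest form marks pixels where intensity changes sharply between neighbours, which is why contrast matters: brightening the subject with IR illumination widens exactly these gradients. The following is a deliberately minimal sketch with an illustrative threshold, not the Leap Motion's actual algorithm.

```python
def edge_mask(image, threshold=50):
    """Mark pixels whose horizontal intensity gradient exceeds `threshold`.
    `image` is a list of rows of greyscale values (0-255); a real system
    would use a 2D operator such as Sobel rather than this 1D difference."""
    return [
        [abs(row[x + 1] - row[x]) > threshold for x in range(len(row) - 1)]
        for x_row, row in enumerate(image)
        for row in [row]
    ]
```

Under dim or uneven lighting the gradient between hand and background shrinks below the threshold, which is exactly the failure mode the infrared LEDs mitigate.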
Because the Leap Motion contains two cameras, Guna et al. (2014, p. 3705) theorise that the device
utilises the stereo vision principle to detect motion and depth; although this is a reasonable theory,
patents filed by Leap Motion, Inc. suggest that this is incorrect. “Systems and methods for capturing
motion in three-dimensional space” is a patent granted to David Holz and Leap Motion, Inc. for
a method of detecting the shape and motion of a three dimensional object, utilising two or more
cameras as well as one or more light sources (David Holz 2014b). The patent suggests that once
the Leap Motion’s software has applied edge detection to locate objects it then analyses the objects
for changes in light intensity and uses this to create a series of ellipse-shaped 2D cross-sections of
the object. These 2D cross-sections can then be pieced together to create a 3D image of the object
(figure 3.2) (ibid., p. 3). By using two cameras instead of one, an object can be viewed from at
least four different vantage points. The increased vantage points help to capture a full picture of
the changes in light intensity across the surface of the object. This in turn increases the accuracy
of each 2D cross-section (ibid., pp. 5-6). It is discussed in the patent that the accuracy of this
process can be increased by using multiple light sources each tuned to emit different wavelengths.
Each wavelength of light can be identified individually on the object to speed up the process of
determining spatial position (ibid., p. 2). As the Leap Motion employs three different infrared LEDs
it is possible that this process is being implemented.
Figure 3.2: Graphic illustration of hand model generated using 2D cross-sections (David Holz 2014b)
Looking closely at figure 3.1, small pieces of plastic can be seen partially obscuring the outer two
IR LEDs. It is possible that these allow the device to determine the orientation of an object in a
manner similar to the process employed by the Kinect sensor (structured light sensing). David Holz’s
system for determining the orientation of an object in space utilises a partially obscured light source
to create a shadow line (D. Holz 2014). The system captures images of objects intersecting this line
and by analysing how the shadow displays on the object, information about the orientation of the
object can be established (ibid., p. 4). This is another process that is probably being implemented.
This assumption is supported by the fact that the Leap Motion informs the user of fingerprints
or smudges on the glass, a process that is outlined in the patent as a useful byproduct of utilising
shadow lines for motion tracking (ibid., p. 1).
Another process developed by Holz (also probably implemented in the Leap Motion) utilises inter-
lacing and alternating light sources to remove noise from the image and reduce latency (David Holz
2014a). By alternating between two light sources for each video frame captured by a motion sensor
it is possible to compare the two images and remove noise from the image (ibid., pp. 1-2). The two
light sources available to the Leap Motion device are the infrared LEDs and ambient room lighting.
Interlacing can be used to reduce the latency caused by having to constantly perform analysis on
alternating images. By only transporting half the lines of an image (alternating between odd and
even lines) to the readout circuitry it is possible to increase the frame rate of a video at the expense
of halving the resolution of each image (ibid., p. 1). Despite utilising USB 3.0 technology the Leap
Motion is backwards compatible with USB 2.0 devices. If the Leap Motion does utilise interlacing
it would be particularly useful when dealing with the relatively low bandwidth of USB 2.0. When in
use the Leap Motion visualiser reports that the device outputs in excess of 100 frames per second.
This high frame rate is partially responsible for the Leap Motion’s precise computer vision.
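Interlacing as described in the patent can be sketched in a few lines: each captured frame contributes only alternate rows (even rows one frame, odd rows the next), halving the data per frame so the frame rate can double over the same bandwidth. This is a generic illustration, not the Leap Motion's actual readout circuitry.

```python
def interlace(frames):
    """Reduce each frame to a half-height 'field': even-numbered rows from
    even frames, odd-numbered rows from odd frames. Each field carries half
    the data of a full frame, trading vertical resolution for frame rate."""
    return [
        [row for j, row in enumerate(frame) if j % 2 == i % 2]
        for i, frame in enumerate(frames)
    ]
```

Pairs of consecutive fields can later be recombined (or interpolated) to approximate full-resolution frames, which is the trade-off discussed above for USB 2.0 bandwidth.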
Although relatively rudimentary individually, in combination these software processes allow the
device to gather a large amount of data from its field of view. These innovations enable the Leap
Motion to produce accurate results with a simple hardware set-up.
3.2 Accuracy
Testing the accuracy of the Leap Motion device is difficult as it is designed to specifically recognise
hands and therefore conventional sensor testing methods for accuracy need to be modified in order
to be utilised. Despite this, some studies into the accuracy and robustness of the device’s motion
sensing capabilities have been carried out. Upon its release the Leap Motion was purported to have
sub-millimetre accuracy down to 0.01mm (Garber 2013, p. 23). However, proof of these impressive
claims was not provided by Leap Motion, Inc. Guna et al. (2014, p. 3202) found that the Leap
Motion is indeed capable of sub-millimetre accuracy, but noted that the performance of the device
was inconsistent particularly at distances of more than 250mm above the controller. Weichert et al.
(2013, p. 6391) found the Leap Motion to have an average accuracy of 0.7mm for static discrete data
input and an average accuracy of 1.2mm for continuous data input. Weichert et al. (ibid., p. 6387)
also found the device to be inconsistent across the axes. The x axis significantly outperformed the y
and z axes in all of their accuracy and repeatability tests. Although these findings are not consistent
with the purported 0.01mm accuracy, it is important to note that the human hand is generally only
capable of a maximum accuracy of between 0.2mm and 1.1mm, so an average of 0.7mm still allows
for extremely detailed input (Weichert et al. 2013, p. 6383).
3.3 Musical Applications of the Leap Motion Device
Since the release of the Leap Motion device many musical applications have been created for it.
These applications vary from note based compositional environments such as The BigBang Rubette,
through to versatile MIDI and OSC generators such as GECO (Tormoen, Thalmann, and Mazzola
2014; GECO 2015). This section will analyse various existing approaches for utilising the Leap
Motion device in musical applications, as well as identifying the advantages and drawbacks of these
implementations of gestural control.
Arguably one of the most popular musical applications available for the Leap Motion is GECO
(ibid.). By converting simple gestures into discrete MIDI or OSC messages, GECO allows the user
to control any software or hardware that accepts these protocols. Despite not directly manipulating
or generating sound, GECO can be extremely useful to musicians looking to incorporate the rich
multidimensional input of the Leap Motion device into their existing workflow. Its simple interface
and setup allow musicians to utilise the Leap Motion device easily, without needing to understand
the complexities of programming gesture recognition. However, this simplicity of use has drawbacks:
GECO recognises only very simple gestures, and these mostly consist of plain motion tracking. By
focusing on the position of the hands rather than interpreting what the hands are doing, the software
utilises only the bare minimum of the data that gesture recognition is capable of producing. Despite
its rudimentary gesture recognition capabilities, musicians have begun to develop tools designed
specifically to work with GECO. Such development is aided by GECO's ability to save custom
presets into a file that developers can distribute, allowing fast setup for the end user. The Greap
project developed by Konstantinos Vasilakos (2015) utilises GECO as a link between a SuperCollider
patch and the Leap Motion device, as there is currently no way to interact natively with the Leap
Motion from within SuperCollider. GECO provides a convenient workaround, allowing musicians to
utilise software with which they are familiar whilst gaining the benefits of gestural input.
Tekh Tonic, developed by Ethno Tekh, is also a MIDI and OSC generator. However, unlike GECO it
utilises manipulative gestures and physics simulations to create a sophisticated gestural interaction
environment (EthnoTekh 2015). Despite the more advanced gesture recognition provided by Tekh
Tonic, it has not achieved the same level of success as GECO. This can likely be attributed to its
more complex setup, as well as to flaws in the design of the software. The majority of physics
simulations utilised by Tekh Tonic take place inside a cuboid space. However, as the field of view of
the Leap Motion is an inverted pyramid, interaction in the extremities of the simulation is unreliable
and at times impossible (Guna et al. 2014, p. 3705). Additionally, many of the simulations rely heavily
on the y-axis for interaction and, as discussed in section 3.2, the y-axis is the most inaccurate and
unreliable (Weichert et al. 2013, p. 6387). This reliance on the y-axis, coupled with poor interaction
at the extremities of the simulation, means that Tekh Tonic often produces undesired results. This
is unfortunate, as when the physics simulations work successfully they provide a unique and intuitive
method of interaction with many creative possibilities.
Hantrakul and Kaczmarek (2014, p. 648) discuss a variety of musical applications for the Leap Motion
device in their paper “Implementations of the Leap Motion in sound synthesis, effects modulation
and assistive performance tools”. Their research focuses on utilising the Leap Motion for the live
performance of electronic music. In all of their projects, Hantrakul and Kaczmarek (ibid., p. 649)
used the aka.leapmotion object for Max MSP to interface with the Leap Motion and built a patch
to extract and interpret the desired data from the device (Akamatsu 2014; Cycling74 2015). From
this patch Hantrakul and Kaczmarek (2014) developed several tools for live performance, including
a system similar to GECO that interfaces with Ableton Live. Of particular note is their granular
synthesis tool, which utilises a mixture of controlling and manipulative gestures.
Grains are triggered by depressing a virtual piano key with the left hand, whilst other parameters
are mapped to the x, y and z positions of both of the user's hands. This allows the user to manipulate
a multitude of parameters at once in a manner that would be extremely difficult, if not impossible,
with conventional hardware (Hantrakul 2014). Unfortunately the software's hand identification
is extremely rudimentary: the hand with the smallest x coordinate is assigned as the left hand and,
conversely, the hand with the largest x coordinate as the right hand (Hantrakul and Kaczmarek 2014,
p. 649). Due to this unsophisticated method of hand detection, certain combinations of parameters
within the software are impossible to produce, as they require the left and right hands to cross over,
whereupon the software immediately reassigns them. However, the authors do discuss the possibility
of implementing the new Skeletal Tracking API in the future. It will be interesting to see how their
projects develop once improved hand detection is in place.
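The x-ordering scheme described above, and the crossover flaw it introduces, can be sketched in a few lines of Python. The hand-data dictionaries here are a hypothetical simplification for illustration, not the authors' Max patch:

```python
def assign_hands(hands):
    """Naive left/right assignment: the hand with the smaller x
    coordinate is labelled left, the one with the larger x, right."""
    ordered = sorted(hands, key=lambda h: h["x"])
    return {"left": ordered[0], "right": ordered[-1]}

# Two hands in their usual positions:
a = {"id": 1, "x": -80.0}   # physically the user's left hand
b = {"id": 2, "x": 120.0}   # physically the user's right hand
assert assign_hands([a, b])["left"]["id"] == 1

# Once the hands cross over, the labels swap immediately,
# which is why crossed-hand parameter combinations are impossible:
a["x"], b["x"] = 120.0, -80.0
assert assign_hands([a, b])["left"]["id"] == 2
```

The Skeletal Tracking API avoids this problem because it maintains a persistent identity for each hand rather than re-deriving handedness from position every frame.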
BigBang Rubette is an extension for Rubato Composer, a visual music programming environment
that utilises various mathematical theories for the composition and analysis of music. Designed to
bring real-time interaction and gestural control to Rubato Composer, BigBang Rubette was originally
controlled with a mouse (Thalmann and Mazzola 2008). It is currently being expanded to include
gestural input from the Leap Motion device (Tormoen, Thalmann, and Mazzola 2014). Working
with synthesised sound, BigBang Rubette allows the user to compose music by “creating and
manipulating Score-based denotators” (similar to MIDI data) in a virtual three-dimensional space
(ibid., p. 208). Unfortunately BigBang Rubette only allows for composition with synthesised sound,
and this limits its use as a compositional tool for many musicians.
Chapter 4
jh.leap
jh.leap is a suite of bespoke compositional tools that utilise the Leap Motion device. Built in
the visual music programming software Pure Data, the tools use the leapmotion object by
Chikashi Miyama (2013) to interface with the Leap Motion device. Designed for the manipulation
of audio, jh.leap provides a variety of audio processing tools, as well as utilities for implementing the
Leap Motion device in any Pure Data patch. These tools were built as an aid to my composition
and in support of this dissertation.
The tools can be used individually or connected to other tools (Pure Data objects/patches) to
create an effects chain. All of the audio processing tools contain motion capture systems that allow
the user to record and automate gestural input, making it possible to utilise more than one tool
at once. Only a very basic understanding of Pure Data is required to use jh.leap, and every tool
within the suite comes with a detailed help file explaining how it works, making it accessible to both
beginners and advanced users. jh.leap utilises a variety of controlling and manipulative gestures,
creating a rich multidimensional environment for sound manipulation that is designed to be intuitive
and to promote creativity. Where possible, gestures have been specifically chosen to mimic the sounds
produced, thus creating a sense of causation. The tools can be downloaded from Appendix 1;
alternatively, the most up-to-date version can be found on GitHub (Higgins 2015).
4.1 The Tools
4.1.1 jh.leap main
To function correctly, all of the jh.leap tools require jh.leap main to be running. jh.leap main does not
process any audio; instead it acts as a bridge between the Leap Motion device and the rest of the
tools, allowing them to communicate with each other. It also presents some useful information to the
user regarding the Leap Motion's field of view and the CPU load of Pure Data. A visualiser of the
Leap Motion's field of view can be toggled from jh.leap main; it opens in a new GEM (Graphics
Environment for Multimedia) window (figure 4.2).
Figure 4.1: jh.leap main and its options panel
Visual feedback allows the user increased precision
when interacting with the Leap Motion, as it is clear where the user's hands are within its field of view.
The visualiser is customisable and its options are displayed with the “Opt” toggle. The ability to
change the window size, as well as what is rendered, means the visualiser can be used effectively on
almost any size of screen and run on computers with low processing power.
Figure 4.2: jh.leap main’s visualiser displaying two hands
4.1.2 jh.leap sample player
jh.leap sample player is the main method of sound file playback provided by the tools. Inspired by
the granular synthesis tool discussed in section 3.3, jh.leap sample player utilises the manipulative
gesture of pressing a virtual piano key to trigger a sound file (Hantrakul and Kaczmarek 2014). The
motion of depressing a finger is detected by measuring velocity: once a finger's negative velocity
along the y-axis crosses a set threshold, a “bang” is output (a “bang” is the message Pure Data
uses as a trigger). A sound file can be loaded for each finger by clicking the corresponding “Open sf” button.
Figure 4.3: jh.leap sample player tool
Once all sound files are loaded, the user can play back a sound file by depressing a finger in the air.
Although similar in movement to playing a physical MIDI keyboard, this gesture is much freer; because
of this it is significantly easier to create natural-sounding rhythms as well as rapid passages. The
sample player works particularly well with micro-sounds, which can be played back rapidly to create
various compound gestures. The tool also works well with longer drone-based sounds (particularly
pitched material): different layers can be overlapped subtly over time to create gradually evolving
textures (when working with long textures the “Stop all” button is particularly useful when you
accidentally trigger a 40-minute sound file). The record and loop functions (which also feature on
all the other tools) allow gestures to be recorded and played back. This can be used to create a
loop, or to change the sounds whilst keeping the gestures the same.
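The velocity-threshold detection behind the virtual key press can be sketched as follows. The threshold value and the per-frame velocities are illustrative assumptions, not the values used in the patch:

```python
def press_detector(threshold=-300.0):
    """Return a per-finger detector in the spirit of jh.leap sample
    player: a 'bang' (True) is emitted on the frame where the finger's
    downward (negative y) velocity first crosses the threshold."""
    below = False
    def step(vy):
        nonlocal below
        fired = vy < threshold and not below   # fire once per crossing
        below = vy < threshold
        return fired
    return step

detect = press_detector()
# y-axis velocities (mm/s) for one finger over successive frames:
bangs = [detect(vy) for vy in [-50, -200, -400, -500, -100, -350]]
# bangs == [False, False, True, False, False, True]
```

Latching on the threshold crossing, rather than outputting on every frame below it, prevents a single key press from retriggering the sound file while the finger is still moving down.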
4.1.3 jh.leap reverb
Figure 4.4: jh.leap reverb tool
jh.leap reverb utilises a variety of manipulative gestures to change the parameters of the effect. To
control the reverb, users manipulate a virtual space in which their hands are the room and the Leap
Motion device is the sound source. Room size is controlled by how far apart the user's hands are on
the x-axis; wet/dry is controlled by how far the user's hands are from the Leap Motion device along
the y-axis; damping is mapped to the z-axis; and finally, freeze is toggled on or off by clenching a fist,
as if to catch the sound. The ability to control so many parameters at once allows the composer to
focus on the composition itself, rather than on automating a variety of parameters so that
multiple events can happen at the same time. jh.leap reverb is also capable of motion capture. This
can be toggled for all parameters or for selected parameters, giving the composer complete control
over the manipulation.
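A minimal sketch of this two-handed mapping, assuming illustrative ranges for the Leap Motion's millimetre coordinates (the patch's actual scaling is not documented here):

```python
def clamp01(v):
    """Constrain a value to the normalised 0..1 parameter range."""
    return max(0.0, min(1.0, v))

def reverb_params(left, right, fist_closed):
    """Map two hand positions (mm) to reverb parameters in the spirit
    of jh.leap reverb. The divisors and offsets are assumptions chosen
    to normalise a plausible interaction volume."""
    # x separation of the hands -> room size
    room_size = clamp01(abs(right["x"] - left["x"]) / 400.0)
    # mean height above the device (y) -> wet/dry balance
    wet_dry = clamp01(((left["y"] + right["y"]) / 2.0) / 500.0)
    # mean depth (z) -> damping
    damping = clamp01((((left["z"] + right["z"]) / 2.0) + 200.0) / 400.0)
    return {"room_size": room_size, "wet_dry": wet_dry,
            "damping": damping, "freeze": bool(fist_closed)}

# Hands 300mm apart, 250mm above the device:
params = reverb_params({"x": -150.0, "y": 250.0, "z": 0.0},
                       {"x": 150.0, "y": 250.0, "z": 0.0}, False)
```

Because every parameter is derived from the same pair of hand positions, one continuous movement changes several parameters simultaneously, which is what lets multiple events happen at once without drawn automation.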
4.1.4 jh.leap tremolo
Figure 4.5: jh.leap tremolo tool
Similar to the reverb, jh.leap tremolo utilises manipulative gestures that mimic the process taking
place in order to control the parameters of the tool. The user shapes a virtual sine wave to vary the
amount of modulation the effect applies to the input sound. The depth of the tremolo is controlled
by how far apart the hands are on the y-axis; frequency is controlled by how far apart they are on
the x-axis; and the smoothness of the wave is shaped by rotating the hands (a flat palm being closer
to a square wave and a 45-degree angle closer to a sine wave). Palm rotation is not measured directly
by the Leap Motion device; instead it is calculated by measuring the difference along the y-axis between
the thumb and the little finger. The ability to control the tremolo in this way makes it an extremely
expressive tool, capable of both subtle undulations and harsh rhythmic cuts, as well as rapid, organic
transitions between the two. Like jh.leap reverb, this tool is capable of
motion capture on individual parameters.
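The thumb-to-little-finger estimate of palm rotation can be sketched as follows; the assumed hand span and the square-to-sine blend are illustrative, not the patch's exact mapping:

```python
import math

def palm_rotation(thumb_y, pinky_y, hand_span=80.0):
    """Estimate palm roll (degrees) from the y difference between the
    thumb and little finger, as jh.leap tremolo does in place of a
    direct rotation reading. hand_span (mm) is an assumed value."""
    return math.degrees(math.atan2(thumb_y - pinky_y, hand_span))

def wave_shape(rotation_deg):
    """Blend factor between square (flat palm, 0 degrees) and sine
    (45-degree tilt), suitable for crossfading two oscillator tables."""
    return min(abs(rotation_deg), 45.0) / 45.0  # 0.0 = square, 1.0 = sine

# Flat palm -> square-like wave; thumb 80mm higher -> 45 degrees, sine-like:
flat = wave_shape(palm_rotation(200.0, 200.0))   # 0.0
tilted = wave_shape(palm_rotation(280.0, 200.0)) # 1.0
```

Deriving rotation from two fingertip heights is a pragmatic workaround for the lack of a palm-orientation reading in the pre-skeletal API, at the cost of failing when either finger is occluded.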
4.1.5 jh.leap pan
The final tool currently in the jh.leap suite is jh.leap pan, a multichannel panner with cus-
tomisable speaker layouts. jh.leap pan has presets for stereo, quadraphonic, 5.1 and eight-channel
setups; however, it is possible to use the tool with any speaker setup of up to eight channels by
moving speakers to custom positions with the “Speaker X/Y” sliders. Unlike the other tools in the
suite, jh.leap pan uses only one hand as input. It tracks the position of the hand along the x- and
z-axes to control panning position, and the user can clench a fist to make the sound wiggle along
the x-axis.
Figure 4.6: jh.leap pan tool set up for eight-channel output
Currently the tool only accepts a mono input; plans to expand it to accept stereo
input and utilise both of the user's hands are in development. Like the other tools, it is
also possible to record automation. This is particularly useful for panning patterns that can
loop seamlessly, such as circular motions. The tool also has a toggle for random panning, which
makes a sound jump around in space. This is a quick way to create interesting spatialisation whilst
composing active gestural passages of music.
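One plausible way to realise such a panner is inverse-distance amplitude panning over arbitrary speaker positions. This sketch is an assumption about the approach, not jh.leap pan's actual pan law:

```python
import math

def pan_gains(hand_x, hand_z, speakers):
    """Compute one gain per speaker from a hand position on the x/z
    plane. Speakers nearer the hand receive more level; gains are
    normalised for constant power across any layout of up to 8 channels."""
    weights = []
    for sx, sz in speakers:
        d = math.hypot(hand_x - sx, hand_z - sz)
        weights.append(1.0 / max(d, 1e-6))        # nearer speaker = louder
    norm = math.sqrt(sum(w * w for w in weights))  # equal-power normalisation
    return [w / norm for w in weights]

# Quadraphonic layout (x, z), hand centred between all four speakers:
quad = [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (1.0, 1.0)]
gains = pan_gains(0.0, 0.0, quad)  # all four gains equal
```

Because the gains depend only on the speaker coordinates, the same function serves the stereo, quadraphonic, 5.1 and custom layouts simply by passing a different list of positions.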
4.2 The Utilities
As well as the audio processing tools listed above, the jh.leap suite also includes a variety of practical
tools for interfacing with the Leap Motion device. These allow the user to quickly implement
gestural control into any Pure Data patch. The utilities recognise a variety of gestures: from
simple controlling gestures, such as tracking hand positions (jh.leap hand); through to more complex
gestures such as clenching a fist (jh.leap fists), swiping a hand (jh.leap swipe) or depressing a finger
(jh.leap keyboard). Each utility comes with a detailed help file (see figure 4.7). These follow a
similar layout to the default Pure Data help files, providing information regarding what the object
does as well as the inlets, outlets and creation arguments of the utility. The ability to quickly view
this information helps to speed up the process of implementing the Leap Motion into a patch, making
it accessible to users who have never worked with gesture recognition before.
Chapter 5
Conclusion
This dissertation has explored the rich history of gesture recognition within electronic music. The
Theremin (1928) is the earliest example of gestural control being implemented within electronic
music (Bongers 2000, p. 481). Since then, advances in computer technology have allowed musicians
to explore the possibilities of gesture recognition in depth, developing a variety of tools to track
gesture. Due to constraints in computer vision technology, the majority of early musical applications
of gesture recognition relied on wearable or handheld peripherals to track motion. Examples
of these devices include data gloves, such as Michel Waisvisz's The Hands and the Lady's Glove,
as well as electronic conductor's batons, a variety of which were developed at Queen's University,
Canada (Roads 1996, pp. 630–635; Bongers 2000, p. 482; Keane and Gross 1989). Advances in
computer vision in the late 1990s and early 2000s began to allow gesture recognition to move
beyond wearable peripherals. However, gesture recognition long remained too expensive to
receive widespread adoption.
Innovations in console gaming over the last decade have brought gesture recognition into the main-
stream. Gaming companies produced low-cost peripherals for gestural control that were unparalleled
by previous consumer-grade electronics. Peripherals such as the Wiimote and the Kinect have been
repurposed in a variety of ways to produce cost-effective musical applications of gesture recognition.
Systems utilising these peripherals, such as the USSS-Kinect, provide methods of interaction with
musical applications that until recent years were unattainable for many outside of research
institutions (Pearse 2011).
Released in 2013, the Leap Motion device was designed specifically to interface with a computer.
Capable of tracking hands to within approximately 0.7mm, the Leap Motion brought unprecedented
accuracy to the consumer price bracket (Weichert et al. 2013). Rather than relying on expensive
hardware, the Leap Motion likely achieves such accurate results through several innovative software
processes. Several musical applications have been developed for the Leap Motion, including the popular
GECO software, which converts gestures into MIDI and OSC messages (GECO 2015). Throughout
this dissertation I have attempted to identify the processes employed by the Leap Motion device,
providing the first detailed analysis of the patents filed by Leap Motion, Inc.
The jh.leap tool suite provides a variety of compositional tools that utilise gesture recognition.
Although currently limited in the number of tools available, the suite serves as a useful proof of
concept for the merits of gesture recognition as an aid to electroacoustic composition. I have
utilised the tools myself when composing, most notably in my piece Digital Spaces. When using
the jh.leap suite, I have found that the tools allow me to quickly create interesting and organic gestural
material, particularly when working with micro-sounds. The ability to manipulate and shape audio
using only your hands brings an element of tangibility to the compositional process; this sense of
physically manipulating sound is often absent when composing using only a mouse and keyboard.
Although some other compositional tools that utilise the Leap Motion do exist, jh.leap is, to the
author's knowledge, the first compositional environment for the manipulation of audio to utilise the
Leap Motion device.
I intend to continue the development of the jh.leap suite and currently have plans to implement a
granulation tool, as well as a third-order ambisonic panner. To increase the accuracy of the tools, I
am currently in the process of updating Chikashi Miyama's leapmotion object to work with the new
Skeletal Tracking API (Miyama 2013). When complete, this will allow the tools to better keep track
of finger locations (even when they are occluded by the rest of the hand), as well as to identify which
hand is the left and which the right. Beyond this, I would like to research how other peripherals for
gestural control could be incorporated into the suite, and to compare the effect each peripheral has on the
compositional process. Finally, research into providing mid-air haptic feedback for gestural control
using the Leap Motion, called Ultrahaptics, is currently being developed at the University of Bristol
(Carter et al. 2013). When this hardware becomes available, it will be interesting to investigate how
haptic feedback affects gesture recognition as a compositional tool.
Bibliography
Akamatsu, Masayuki (2014). aka.leapmotion. url: http://akamatsu.org/aka/max/objects/.
Assayag, G, G Bloch, and M Chemillier (2006). “OMAX-OFON”. In: Sound and Music Computing
(SMC) 2006. Marseille.
Badi, Haitham Sabah and Sabah Hussein (2014). “Hand posture and gesture recognition technol-
ogy”. In: Neural Computing and Applications 25.3-4, pp. 871–878.
Bongers, A.J (2000). “Interaction in multimedia art”. In: Knowledge-Based Systems 13.7-8, pp. 479–
485.
– (2007). “Electronic Musical Instruments: Experiences of a New Luthier”. In: Leonardo Music
Journal 17, pp. 9–16.
Buxton, William et al. (1979). “The Evolution of the SSSP Score Editing Tools”. In: Computer
Music Journal 3.4, pp. 14–25.
Carter, Tom et al. (2013). “Ultrahaptics: Multi-Point Mid-Air Haptic Feedback for Touch Surfaces”.
In: UIST’13.
Chabot, Xavier (1990). “Gesture Interfaces and a Software Toolkit for Performance with Electronics”.
In: Computer Music Journal 14.2, pp. 15–27.
Collins, Nick (2010). Introduction to Computer Music. Hoboken: John Wiley & Sons Inc.
Cycling74 (2015). Cycling 74 MAX MSP. url: https://cycling74.com/.
Dix, Alan et al. (2004). Human-Computer Interaction (3rd Edition). Essex, England: Pearson Prentice-
Hall.
Emmerson, Simon (2000). Music, Electronic Media and Culture. Aldershot: Ashgate.
EthnoTekh (2015). Tekh Tonic. url: http://www.ethnotekh.com/software/tekh-tonic/.
Fischman, Rajmil (2013). “A Manual Actions Expressive System (MAES)”. In: Organised Sound
18.03.
Forsyth, David and Jean Ponce (2003). Computer Vision: a modern approach. Upper Saddle River,
N.J.: Prentice Hall.
Garber, Lee (2013). “Gestural Technology: Moving Interfaces in a New Direction [Technology
News]”. In: Computer 46.10.
GECO (2015). url: http://uwyn.com/geco/.
Genovese, V et al. (1991). “Infrared-Based MIDI Event Generator”. In: Proceedings of the Interna-
tional Workshop on Man-Machine Interaction in Live Performance. Computer Music Department
of CNUCE/CNR. Pisa, pp. 1–8.
Guna, Joze et al. (2014). “An Analysis of the Precision and Reliability of the Leap Motion Sensor
and Its Suitability for Static and Dynamic Tracking”. In: Sensors 14.2.
Hantrakul, Lamtharn (2014). Linked Media For ICMC 2014. url: http://lh-hantrakul.com/
2014/04/15/linked-media-for-icmc-2014/.
Hantrakul, Lamtharn and Konrad Kaczmarek (2014). “Implementations of the Leap Motion in sound
synthesis, effects modulation and assistive performance tools”. In: Proceedings ICMC SMC 2014.
Athens, Greece, pp. 648–653.
Higgins, Jonathan (2015). jh.leap tools. url: https://github.com/j-p-higgins/jh.leap_
tools.
Holz, D. (2014). Determining the orientation of objects in space. US Patent App. 14/094,645. url:
https://www.google.com/patents/US20140267774.
Holz, D. and H. Yang (2014). Enhanced contrast for object detection and characterization by optical
imaging. US Patent 8,693,731. url: https://www.google.com/patents/US8693731.
Holz, David (2014a). Object detection and tracking with reduced error due to background illumina-
tion. US Patent App. 14/075,927. url: https://www.google.com/patents/US20140125815.
– (2014b). Systems and methods for capturing motion in three-dimensional space. US Patent
8,638,989. url: https://www.google.com/patents/US8638989.
Keane, David and Peter Gross (1989). “The MIDI Baton”. In: Proceedings of the International
Computer Music Conference 1989, pp. 151–154.
Keane, David and Kevin Wood (1991). “The MIDI Baton III”. In: Proceedings of the International
Computer Music Conference 1991, pp. 541–544.
Khoshelham, Kourosh and Sander Oude Elberink (2012). “Accuracy and Resolution of Kinect Depth
Data for Indoor Mapping Applications”. In: Sensors 12.12.
Kiefer, Chris, Nick Collins, and Geraldine Fitzpatrick (2008). “Evaluating the Wiimote as a Musical
Controller”. In: Proceedings of the International Computer Music Conference 2008. Belfast.
Marrin, T. et al. (1999). Apparatus for controlling continuous behavior through hand and arm
gestures. US Patent 5,875,257. url: http://www.google.co.uk/patents/US5875257.
Microsoft (2013). Xbox Execs Talk Momentum and the Future of TV. url: http : / / news .
microsoft.com/2013/02/11/xbox-execs-talk-momentum-and-the-future-of-tv/.
– (2015). Kinect SDK. url: http://www.microsoft.com/en-us/kinectforwindows/.
Miller, Jace and Tracy Hammond (2010). “Wiiolin: a virtual instrument using the Wii remote”. In:
Proceedings of the 2010 Conference on New Interfaces for Musical Expression. Sydney.
Miyama, Chikashi (2013). Leapmotion PD Object. url: http://puredatajapan.info/?page_
id=1514.
Moore, Adrian (2008). “Fracturing the Acousmatic: Merging Improvisation with Disassembled Acous-
matic Music”.
Morita, Hideyuki, Shuji Hashimoto, and Sadamu Ohteru (1991). “A Computer Music System that
Follows a Human Conductor”. In: IEEE Computer 24.7, pp. 44–53.
NIME (2015). url: http://www.nime.org/?s=gesture+recognition.
Nintendo (2015). Consolidated Sales Transition by Region 2015. Sales Report.
OpenKinect (2015). url: https://github.com/OpenKinect/libfreenect.
Pearse, Stephen (2011). “Gestural Mappings: Towards the Creation of a Three Dimensional Com-
positon Environment”. In: Proceedings of the International Computer Music Conference 2011.
University of Huddersfield. Huddersfield, UK, pp. 126–129.
Peng, Lijuan and David Gerhard (2009). “A Wii-based gestural interface for computer conducting
systems”. In: Proceedings of the 2009 Conference on New Interfaces for Musical Expression.
Pittsburgh, PA, United States.
Al-Rajab, Moaath (2008). “Hand Gesture Recognition for Multimedia Applications”. PhD thesis.
University of Leeds.
Roads, Curtis (1996). The Computer Music Tutorial. United States: MIT Press.
Senturk, Sertan et al. (2012). “Crossole: A Gestural Interface for Composition, Improvisation and
Performance using Kinect”. In: Proceedings of the 2012 Conference on New Interfaces for Musical
Expression. Ann Arbor, Michigan.
Sturman, David and David Zelter (1994). “A survey of glove-based input”. In: Computer Graphics
and Applications, IEEE 14.1, pp. 30–39.
Thalmann, Florian and Guerino Mazzola (2008). “The Bigbang Rubette: Gestural Music Composition
With Rubato Composer”. In: Proceedings of the International Computer Music Conference 2008.
Belfast.
Theremin, Leon and Oleg Petrishev (1996). “The Design of a Musical Instrument Based on Cathode
Relays”. In: Leonardo Music Journal 6, pp. 49–50.
Tormoen, Daniel, Florian Thalmann, and Guerino Mazzola (2014). “The Composing Hand: Musical
Creation with Leap Motion and the BigBang Rubette”. In: Proceedings of the International Confer-
ence on New Interfaces for Musical Expression. London, United Kingdom: Goldsmiths, University
of London, pp. 207–212.
Vasilakos, Konstantinos (2015). Greap 1.0v. url: https://github.com/KonVas/Greap.
Weichert, Frank et al. (2013). “Analysis of the Accuracy and Robustness of the Leap Motion Con-
troller”. In: Sensors 13.5. Images reproduced under the terms and conditions of the Creative
Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
Wingrave, Chadwick et al. (2010). “The Wiimote and Beyond: Spatially Convenient Devices for 3D
User Interfaces”. In: IEEE Computer Graphics and Applications 30.2.
Wright, Matthew (2002). Open Sound Control 1.0 Specification. url: http://opensoundcontrol.
org/spec-1_0.
Wu, Ying and Thomas Huang (1999). “Human hand modeling, analysis and animation in the context
of HCI”. In: Image Processing, 1999. ICIP 99. Proceedings. Vol. 3. Kobe, pp. 6–10.
Zeng, Wenjun (2012). “Microsoft Kinect Sensor and Its Effect”. In: IEEE Multimedia 19.2.