
Terra - interactive exhibit with gesture control

An interactive exhibit that allows the user to navigate through multiple layers of information. The user takes control with the Myo armband, developed by Thalmic Labs, and navigates with the movement of his arm and a couple of gestures.

Process

The task in the course “Interactive Communication Systems 1” was to develop an exhibit for a fictional museum based on a topic of choice. The topic we chose was the ongoing deforestation of the rainforest and the species endangered by it. Pretty early in the process it became clear that a lot of different questions had to be answered, varying in magnitude: the initial “big questions”, such as “Which rainforest do you want to learn more about?”, highly detailed questions, such as “What does the living environment of this species look like?”, and a whole spectrum of questions in between. Due to this large range of information the user might expect from such a system, structuring the information quickly became a crucial part of the project.

The information was clustered into three layers based on the questions users would ask in the process of using the application. The first layer contains the three areas (content-wise, as they each include different species, as well as geographically, since they represent the three big rainforests) from which the user can select one. The second layer contains all options the user can select inside this rainforest, i.e. the different plant and animal species. The third layer then contains detailed information about the previously selected species. All in all, the layered approach is driven by content and user expectations.
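To make the layered structure more tangible, the following sketch models it as a simple nested data structure. The type and field names are illustrative assumptions, not the project’s actual data schema.

```typescript
// A minimal sketch of the three-layer content model described above.
// Names and fields are illustrative assumptions.

interface DetailView {
  kind: "gallery" | "highlights" | "statistics";
  title: string;
}

interface Species {
  name: string;
  description: string;          // layer 3: detailed information
  detailViews: DetailView[];    // sub-points inside layer 3
}

interface Rainforest {
  name: string;                 // e.g. "Amazon"
  region: string;               // geographical location shown later on the globe
  species: Species[];           // layer 2: selectable plants and animals
}

// Layer 1: the three rainforests the user can choose from.
const rainforests: Rainforest[] = [
  { name: "Amazon", region: "South America", species: [] },
  { name: "Congo Basin", region: "Central Africa", species: [] },
  { name: "Southeast Asian rainforest", region: "Southeast Asia", species: [] },
];
```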

Concept

Due to the visually rich appearance of the rainforest, it became a key goal to build an application that supports this appearance through a strong visual sense of its own. And since the exhibit is not meant for daily tasks that demand a particularly high level of productivity, but rather for a playful exploration of content, it became clear that the opportunities could exceed a standardised desktop or mobile application. To enable high-quality interactive visuals, as well as to support the playful, exploratory character, the choice fell on an interactive 3D application.

Another key decision was the type of input the user would use to interact with the application and therefore the content. With the simple layered information model described above, the main interaction is to switch between layers by selecting elements in the individual layers, so the main interaction is a selection: 1 out of x options. Additionally, the possibility to return to the previous layer is necessary, a back button of some sort. Once the few mandatory interactions became apparent, the door to more experimental types of interaction was opened and new questions popped up: “How could a system like this be controlled via voice, gestures, movement, bring-your-own-device, or a combination of all of those?” As immersiveness was a key goal in the development process, options like voice input seemed too distant, as they would most likely involve a virtual assistant of some sort. The vision of directly jumping into the rainforest and reaching out for interesting-looking things was a lot more appealing, hence the decision to build a gesture-controlled system.

As there is always a fixed state of the application and a certain set of further options, multi-user support was not intended, so it became necessary to clarify which user is currently in control of the software. Optical solutions for gesture control, such as the Xbox Kinect, offer the advantage that no specific setup process is necessary, but would make it difficult for the user to know whether he is currently in control or not; another solution had to be found that would provide a clear signal of control. This solution was found in the Myo armband, developed by Thalmic Labs in Canada. It is a wearable armband that communicates with a computer via Bluetooth. The armband includes an accelerometer, which is used to detect the rotation of the arm, and multiple sensors that pick up the electrical signals sent through the muscles. Since the armband can only be worn by one person at a time, the multi-user problem is thereby solved, and on top of that the process of actually putting on the armband (and wearing it) clearly communicates that this user is now in control of the application.

From there, the detailed navigation of the application had to be worked out. With the three main interactions being “Select”, “Back” and “Navigate”, fitting gestures had to be developed that convey the meaning of these functions. Being the most variable form of interaction (in that the number of potential options can vary widely), navigation was tackled first. The number of options in the first layer, the selection of the rainforest, is limited to three. The aim was to incorporate as much information as possible into the process of navigating (and finally selecting) an option.
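The small interaction set can be modelled as a tiny state machine: a select action descends one layer, a back action returns to the previous one. The following sketch assumes this structure; the type and function names are illustrative, not taken from the actual project code.

```typescript
// A minimal sketch of the layer navigation: "Select" descends into the
// next layer, "Back" returns to the previous one. Names are illustrative.

type Action = "select" | "back";

interface NavigationState {
  layer: 1 | 2 | 3;          // rainforests -> species -> details
  selection: number | null;  // currently hovered option, set by "Navigate"
}

function navigateAction(state: NavigationState, action: Action): NavigationState {
  if (action === "select" && state.selection !== null && state.layer < 3) {
    return { layer: (state.layer + 1) as 2 | 3, selection: null };
  }
  if (action === "back" && state.layer > 1) {
    return { layer: (state.layer - 1) as 1 | 2, selection: null };
  }
  return state; // ignore actions that make no sense in the current state
}
```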
In the case of the first layer, this especially meant conveying the geographical distance and size of the individual rainforests. Of course this could have been done in a strictly typographic fashion, but since geographical location and size can also be described visually, this option was chosen for the display of the rainforests. A visual representation of geographical data is historically either a two-dimensional map or a three-dimensional globe. The globe was picked for two reasons. The first is the distance between the rainforests themselves: as they are spread across the entire world, a map could not have been cropped in any way, and a smaller excerpt of a map, which might communicate more detail, would have been the main argument for using one. The second reason is grounded in the usability of the application. The user must interact in some fashion to indicate his intention (this corresponds to hovering on traditional desktop devices), and as the Myo is not intended to be used as a pointing device, another way had to be found to indicate interest in the different options. The solution that worked for this layer was to map the rotation of the arm to the rotation of the virtual globe; depending on the resulting rotation of the globe, a different option is “hovered”. To illustrate whether an option is hovered or not, an indicator was designed that is used throughout the application. Since the user typically moves his arm in some way or another, he immediately sees the reaction of the application (the response is instant) and can therefore learn the correlation between arm rotation and globe rotation. The rotation of the globe is limited to rotation around the y-axis, as this is enough to distinctly differentiate between the possible options.

Confirming a selection had to be a gesture that is definitely not performed by accident and should in some way symbolise the act of grabbing something, hence the fist gesture of the Myo gesture set was chosen for this task. Finding an appropriate gesture for returning to the previous layer turned out to be a bigger challenge, as the corresponding contrary action to grabbing something would be placing it back in its original position. But since the entire view has changed as the user went deeper into the system, he has no reference for where the object he reached for would have to be placed. However, the actual motion of performing the grab can be seen as the opposite of a releasing motion, so a releasing motion (described in the Myo gesture set as “fingers spread”) can be used to cover the back function. Since these gestures have to be performed explicitly (as opposed to the navigation gesture, which the user will perform even without intending to), they require additional explanation in the form of a short instruction.

In the second layer, the species selection, the user can again move his arm to perform a rotating action, but in this case he does not rotate an object he is viewing; he rotates the camera he is viewing the environment through. The overall principle of interaction does not change though: objects are still selected by focussing on them (rotating towards the object) and performing the select gesture. The rotation of the camera in the second layer is limited to two axes, x and y, because rotation around the z-axis proved to be unnecessary, as it did not provide any benefit for the user.
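The following sketch shows how the first-layer interaction could fit together: the arm’s rotation drives the globe’s y-rotation, the resulting rotation determines the hovered rainforest, and the fist and fingers-spread gestures trigger select and back. The orientation fields, sensitivity factor, option angles and hover tolerance are assumptions for illustration, not the values used in the project.

```typescript
// A sketch of the layer-one interaction under the assumption that the
// armband reports its orientation as yaw/pitch/roll in radians.
// All names and constants are illustrative assumptions.

interface ArmOrientation { yaw: number; pitch: number; roll: number; }

const OPTION_ANGLES = [0, (2 * Math.PI) / 3, (4 * Math.PI) / 3]; // three rainforests
const HOVER_TOLERANCE = Math.PI / 6;

// Only the y-axis of the globe follows the arm; the response is immediate,
// which is what lets the user learn the arm-to-globe correlation.
function globeRotationY(arm: ArmOrientation, sensitivity = 2.0): number {
  return arm.yaw * sensitivity;
}

// An option counts as "hovered" when the globe's rotation brings it close
// enough to the front; the on-screen indicator is driven by this value.
function hoveredRainforest(rotationY: number): number | null {
  const full = 2 * Math.PI;
  const normalized = ((rotationY % full) + full) % full;
  for (let i = 0; i < OPTION_ANGLES.length; i++) {
    const delta = Math.abs(normalized - OPTION_ANGLES[i]);
    if (Math.min(delta, full - delta) < HOVER_TOLERANCE) return i;
  }
  return null;
}

// The two explicit gestures map directly onto the navigation actions.
function actionForGesture(gesture: "fist" | "fingersSpread"): "select" | "back" {
  return gesture === "fist" ? "select" : "back";
}
```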
The third layer, the detailed information concerning a specific species, again enables the user to rotate the camera by moving his arm, this time limited to rotation around the y-axis. A special requirement for the third layer is the ability to further interact with elements while staying inside the layer. This is used for image galleries and for interacting with statistics. To start interacting with these elements, the select gesture was used again, this time leading to a slightly different view inside the current layer. Even though the same gesture is used to select a sub-point inside a layer as to switch from one layer to another, it can be argued that a layer change can also be seen as a view change (the user can interact with something different in the new view), so the difference is not apparent to the user and will not lead to confusion. There are three sub-points inside the third layer, which shall be called detail-views. For all three of these detail-views a new set of gestures is introduced: left and right swipe gestures. The first detail-view is an image gallery of the selected species. If the user selects this detail-view, the camera moves to the first image. He can then use left and right swipe gestures to move the camera between the images. To close the detail-view and thereby return to the third layer, he uses the already familiar fingers-spread gesture. The second detail-view highlights different parts of the selected species. Again, the swipe gesture is used to move the camera from one highlight to the next. In the third detail-view, he can view statistics concerning the selected species. Here the camera is static and the swipe gesture is used to trigger the display of the individual statistics.
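One way to picture the swipe handling inside the three detail-views is to let each view track an index into its items (gallery images, highlighted parts, or statistics), as in the sketch below. The names are illustrative assumptions, not the project’s actual implementation.

```typescript
// A minimal sketch of swipe handling inside a detail-view: a swipe moves
// the current index left or right, clamped to the available items.
// Names are illustrative assumptions.

type Swipe = "left" | "right";

interface DetailViewState {
  kind: "gallery" | "highlights" | "statistics";
  index: number;      // which item the camera (or display) currently shows
  itemCount: number;
}

function onSwipe(view: DetailViewState, swipe: Swipe): DetailViewState {
  const step = swipe === "right" ? 1 : -1;
  const index = Math.min(Math.max(view.index + step, 0), view.itemCount - 1);
  return { ...view, index };
}

// The fingers-spread ("Back") gesture closes the detail-view and returns
// the user to the regular third-layer view, mirroring its role elsewhere.
```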

Self-criticism

Terra can be seen as a content-driven experimental exhibit, as transporting content (and therefore knowledge) was a key focus from the beginning. One could argue that by using the rather unusual (and for most users unknown) interaction pattern, this goal was not fully met; a simple website or a table-top installation (to stay in the context of an exhibition) with known interaction patterns could have made the content more accessible to a wider range of users, and this is definitely a valid point. On the other hand, one could also argue that this interaction is in some form more “interesting” than known patterns and could therefore encourage users to interact with the system, users that would not have shown interest in the content if it were presented in a more traditional fashion. In my opinion there is no definitive solution for this struggle, but during the time of “Terra” I was not as aware of the difficulties in balancing these contrasting poles. During the process, the focus somewhat shifted from building a system that transports content and knowledge in the most efficient way possible to building “something cool”. This “something cool” that the project now is may very well transport content, but I believe that if I were to do it again today, it could be better balanced.

Facts