
On the Attempt to Implement Social Addressability within a Robotic System

  • Philipp Graf is a Master's student in Technology Studies and works as part of the junior research group on human-robot interaction at the Technical University Berlin. His research interests focus on social theory (especially network theory and social systems theory), mixed-method approaches for the evaluation of human-robot interactions, and the qualitative evaluation of robotic design. Further research interests include network analysis and network visualisation as well as new religious movements.


    Manuela Marquardt is a Master's student in Sociology and worked as part of the junior research group on human-robot interaction at the Technical University Berlin. Her HRI research focused on the mixed-method empirical investigation of human-robot interactions in the context of public science events and the application of sociological theory to robot-related research. As part of her theoretical work, she engaged with the phenomenon of anthropomorphisation. Further research interests include mobility, quantitative empirical research, and sequence analysis of social science data.


    Diego Compagna is a senior research fellow (post-doctorate) in the Control Systems Group, part of the Department of Electrical Engineering and Computer Science at the Technical University Berlin. His research interests in the field of Science and Technology Studies focus on the area of theory-building, sociological actor-models, methodology for the evaluation of human-robot encounters as well as interaction and the politics of innovation strategies for special target groups.

Published/Copyright: August 10, 2017

Abstract

We conducted a Human-Robot Interaction (HRI) study during a science event, using a mixed-method experimental approach with quantitative and qualitative data (an adapted version of the Godspeed Questionnaire and audio-visual material analysed videographically). The main purpose of the research was to gather insight into the relevance of the so-called “point of interaction” for a successful and user-friendly interaction with a non-anthropomorphic robot. We elaborate on this concept with reference to sociological theories under the headings of “addressability” and “social address” and generate hypotheses informed by former research and theoretical reflections. We implemented an interface on our robot system, comprising two LEDs that indicate the status of the robot and of the interaction, which might serve as a basal form of embodied social address. In one experimental condition, the movements were accompanied by a light choreography; the other was conducted without the LEDs. Our findings suggest a potential relevance of social address in providing the interaction partner with additional information, especially if the situation is a contingent one. Nevertheless, the overall ratings on the Godspeed scales showed no significant differences between the light conditions. Several possible reasons for this are discussed. Limitations and advantages are pointed out in the conclusion.

1 Introduction

The need for embedded and embodied robot entities [4] is often emphasised in the literature with regard to the question of how to construct acceptable and sociable robotic systems. Underlying this need are questions about how to build a robot entity with whom a user would like to engage socially, and which qualities of the robot’s appearance or shape are necessary with regard to the task the robot is to perform. Historically, the design of social robots has been oriented heuristically towards the human form; the assumption that social robots – or, as Neil Frude puts it, intimate machines – need a human face in order to enhance the human-robot relationship and interaction is therefore an old one (see Halbach [10]: 77). Meanwhile, the notion of face has been displaced by more technical concepts such as interface or point of interaction, and more recent functional approaches suggest that the materialised form of a robot should correspond to its function in order to improve interaction (“matching hypothesis”).[1]

We assume that the abstract functional properties for social interactions do not necessarily require a human-like form, and we see a lack of theoretical description of these functional preconditions for an intimate or socially interactive robot in the HRI literature. Because human-human interaction serves as the reference model for social interactions in general – especially for most of the constructors and engineers in the field of HRI – we see an unexploited potential in the use of social and sociological theory and methodology to think about and beyond HRI. We therefore give an overview of existing concepts and approaches that address this issue in Chapter 2, and subsequently emphasise the theoretical notion of social address.

As the interaction with a robot poses its own requirements, we differentiate the term “social address” from terms like “point of interaction” or “interface.” We therefore point out the characteristics of social address and addressability (3.2), but also the in situ construction of interaction sequences (3.1), where, in an ethnomethodological approach, it is necessary to constantly provide information to the interaction partner. On the basis of these assumptions, in combination with findings from a study conducted in 2015, we propose five hypotheses (3.3) on how the interaction with the robot could become smoother. As a result of the interdisciplinary discourse on these hypotheses, our research team implemented new LEDs on the non-humanoid robot BROMMI:TAK, which were to serve as a basal form of such an address. During the Lange Nacht der Wissenschaften 2016, a science event that took place at the Technical University of Berlin, we had the opportunity to test the modified robot in a handover-interaction with a large number of visitors to the event (4), and we thus tried to find answers to our hypotheses (5). Hence this paper pursues two objectives: the abstract thinking about interaction and – building on these thoughts – the specific experimental interaction between a human and a robot (HRI). In Chapter 5 we discuss the findings regarding our hypotheses. As the hypotheses are inspired by abstract social theory, not all of them can be answered sufficiently, but the data shows the general relevance of such an address for HRI. We can show that it bears the potential to make the interaction with a robot smoother if the robot is able to indicate its general status and, most notably, the further actions that will take place.

2 Point of Interaction and the Social Gaze in HRI

HRI research focuses on the interdisciplinary question of how to design a robot that can deal with the complexity of an interaction with a human – one can describe this as the robot-centred view [3]. Additionally, it is important to consider the perspective of the human interaction partner, as his or her perception of the interaction sequence is directly linked to his or her (re-)actions and therefore also to the question of whether the sequence as a whole is successful or not. The literature here refers to the term human-centred HRI [3] and is mostly oriented towards the well-known human-human interaction in order to achieve greater acceptance on the part of the human user.

Our research is concerned with what is often unspecifically called a “point of interaction” (e.g. [18]: 55; [16]), which can be a particular interface or a particular action, and which therefore shows a proximity in content to HCI research. There are also articles emphasising the importance of the gaze of a robot’s “eyes” for an HRI [19], or articles which reflect on the anthropomorphic design of robots’ heads [5]. These articles did not relate the phenomenon to existing theoretical approaches but rather discussed single aspects of an address, in this case the use of gazing or anthropomorphic designs. Although we are aware of extensive research in this area from a psychological perspective, we have consciously chosen a sociological approach. The main focus of the paper presented here lies in suggesting an innovative sociological perspective which is grounded in social theory, instead of comparing psychological and sociological insights. Such a comparison must take place systematically at a later point in time. In our conception, the point of interaction is crucial, as every interaction between a robotic system and a human user is centred around and oriented towards it.

We will underpin this theoretically with the notion of an address,[2] which is a more general concept to describe possible points of interaction between a human and a robot entity. A social address in a human-centred view on HRI entails the human user receiving information on what the robotic system is doing and will do next, which enables the human user to adjust his or her actions in response. We thereby hope to be able to generalise from the specific interaction-forms of a robot entity.

Both former and current research in this field related to the point of interaction (PoI) is primarily concerned with the use of gaze and its importance in typical HRI settings. Most of this research has come to the conclusion that social gaze has a favourable impact. Fischer et al. [7], for instance, summarised their research as follows:

“Our qualitative and quantitative analyses show that people in the social gaze condition are significantly more quick to engage the robot, smile significantly more often, and can better account for where the robot is looking. In addition, we find people in the social gaze condition to feel more responsible for the task performance. We conclude that social gaze in assembly scenarios fulfils floor management functions and provides an indicator for the robot’s affordance, yet that it does not influence likability, mutual interest and suspected competence of the robot.” (204)

The social gaze of the robot was tested in a typical tutor setting, i.e. the human had to teach the robot how to perform a cooperative task. Moon et al. [14] as well as Zheng et al. [20], [19] came to very similar conclusions in a similar HRI test setting regarding handover situations. They were able to “provide empirical evidence that using humanlike gaze cues during human-robot handovers [...] the timing and perceived quality of the handover event” (Moon [14]: 2) can be improved. In almost all the recent gaze research that has been conducted in HRI, the models were the analogous human-human interactions; the tested setting can be seen as a blueprint of the corresponding human-human interaction situation. In addition to work cooperation and handover situations, the function of gaze as feedback acknowledging content-based exchange was also tested [15]. The overall results were similarly positive: “We argue that a robot – when using adequate online feedback strategies – has at its disposal an important resource with which it could pro-actively shape the tutor’s presentation and help generate the input from which it would benefit most.” (Pitsch et al. [15]: 268) Recent research on the topic of gaze in HRI has so far come to the shared conclusion that the consideration and proper implementation of social gaze is of paramount importance for a successful outcome in terms of a satisfying, effective, and efficient interaction between humans and robots.

In a former study conducted during the “Lange Nacht der Wissenschaften 2015”[3] we observed the coordinative mechanisms during a handover-interaction between a robot and a human interaction partner along several dimensions (i.e. spatio-temporal, material, social), with a robot that was not able to react to the human’s exact position but instead had fixed positions for the handover. In other words, the robot worked according to scripts and was not able to act reciprocally – a fact that was clearly perceived by the human interaction partners. We discovered that the human adjusts to the movements of the robot in order to bring the interaction to a successful conclusion. As the situation took place in front of an audience, it is important to note that the participants had an even higher motivation to succeed in the experimental interaction than in a laboratory setting. The data therefore showed a strong need to implement accountable movements, communicative acts, or symbols to provide the human with more information about the robot’s further actions.

3 Theoretical Concepts

Although it is unusual in common HRI research, we do work with sociological theories and concepts. We propose this theoretically informed perspective with two main purposes in mind: First, we use ethnomethodological assumptions to interpret video data, and second, we would like to emphasise the concept of (social) addressability to achieve a proper general description of an HRI.[4]

3.1 Accounts and Sequentiality

Ethnomethodology follows the basic assumption that social reality is not predetermined by societal structures, but is instead created in situ between the interacting actors, who act on the basis of methods. Each communicative action of ego needs to be interpreted by alter to be understandable and compatible with follow-up communication. Therefore, ethnomethodology assumes that meaning does not lie in communication itself, as it is always vague and preliminary and pervaded by indexical expressions (e.g. “this”, “here”), but emerges in the social context between the interactors. To be accountable (reportable, observable), they use methods which serve to make each other understand how the performed action should be understood. These ethnomethods are reflexive in nature; or, as Garfinkel puts it, “the activities, whereby members produce and manage settings of organized everyday affairs are identical with members’ procedures for making those settings ‘accountable’” (Garfinkel [9]: 1).

Another important concept is sequentiality: with this term, we account for the fact that it is anything but random when – i.e. at what point in the interaction – something happens. The sequential order is to be taken seriously, and actions can be interpreted by formulating hypotheses and taking the following action as a verification or falsification (sequential interpretation). These theoretical assumptions are also inherent in videography, which will be introduced in the methodological section.

3.2 Social Addressability

Social addressability refers to an abstract prerequisite for communication in general, which has to be manifested and materialised as an embodied form of a social address in order to enable communication and social situations. We develop this by drawing on the notion of communicative address by Braun-Thürmann [2] and on addressability as a basal concept as set forth in Peter Fuchs’ sociological theory (1997). As follow-up questions, it should be determined which specific forms of an address can be theoretically described and which functions they have to fulfil.

An example derived from science-fiction culture can help to demonstrate the relevance of this issue: the HAL9000 computer in Stanley Kubrick’s “2001: A Space Odyssey”. HAL9000 is described as a conscious computer system that controls a spacecraft and interacts with the human crew members. He does not have a definable body but can instead be seen as integrated into the spaceship itself, which he also controls. In the movie, his (hardware) address is depicted as a camera lens containing a red/white dot in the centre and a speaker-and-microphone unit below. Furthermore (and only in some scenes), the lettering “HAL9000” can be seen above the lens and indicates the entity’s name. The example suggests the following possible qualities of an address:

  1. its ability to speak,

  2. its ability to hear,

  3. its ability to see,

  4. or, in general: to notice its (social) environment (as potentially social),

  5. symbolised address: to display these abilities through socialised and generalised attributes (e.g. a lens and speakers),

  6. materialised address: to display these abilities,

  7. or, in general: to be noticeable by the (social) environment as potentially social.

One can note that, in the case of technical systems, the medium of the address can take a much wider variety of forms than the human body, which is limited to its biological formation. Additionally, a technical system can have more than one address to which a human could direct his or her communication. Addressability in this sense is a prerequisite for specific tasks such as spatio-temporal coordination via the accounting of gaze behaviour, but also for more elaborate forms of social interaction such as the ascription of perceptive faculty or emotions. We therefore refer to the concept as social addressability.

The concept refers to the reciprocal and multifunctional properties of an address in every act of communication. An address becomes a social one in performing a communicative act and qua ascription. This act of ascribing addressability to an object is socially and culturally shaped, but it is always the object itself that serves as a mediator for the communicative act. And even when a materialised social address is not presently mediating communication, it bears the potential to do so. Just as Braun-Thürmann [2] described the status of a (virtual) address as an entity between two worlds, we see the social address of a robot as ranging between the technical concept of a (user) interface and the more natural form of a humanlike face.

While faces are socialised and institutionalised (social) addresses in everyday human-human interactions, the (user) interface (according to Halbach [10]) is a mediating instance between a human user and a technological object, which can guarantee communication in a technical sense (i.e. the reciprocal exchange of messages and also the adequate coding and decoding of these messages, which enables the user to attribute meaning; Halbach [10]: 13–14). From a (system-)theoretical point of view, addressability in general is seen as a crucial precondition for communication and – building on this – as a precondition for the emergence of social systems. Whether the social system processes meaning or hybrid variations of it such as proto-meaning [12], addressability fundamentally underlies communication as a foil, and in either case needs to be objectified or materialised to enable the attribution of meaning or action to an entity.[5]

In the words of Peter Fuchs, one could also ask about the form of the address itself: what it differentiates and how it could be differentiated (Fuchs [8]: 62). The first question, then, is: What can serve as an address? Fuchs answers this question:

“What are addressable are occurrences in the world that […] can be elaborated as an address. The essential prerequisite for this is the fact that these addressable occurrences (i.e.: humans, trees, computers) maintain a self-relationship, or in other words: that one can ascribe them self-reference.” (Fuchs [8]: 62–63, own translation)

But what does it mean for an entity to be described as self-referential? The concept of self-reference is widely used in disciplines like philosophy, cybernetics, and systems theory to describe the ability of an entity to maintain its borders on different levels – the concept therefore covers, for example, not just the reproduction of skin, but also the reproduction of a coherent self-describing psychic or social system. The crucial point here is not the existence of a conscious robot entity in a literal sense, but the ascription of the ability to differentiate between its body and its environment. In a gradual concept of self-reference, this distinction is still relatively low-level, but it enables the entity to react to its environment. We know from psychological studies that geometric shapes reacting to their environment are ascribed life by a human observer (see Heider & Simmel [11]). With respect to state-of-the-art robotics, one has to note that there is still no robotic system that can meet the high demands for communication as articulated by social constructivists like Luhmann. But there are empirical hints that the operations of social systems do not rely on full understanding[6] by all (psychic or electronic) systems involved. Rather, it can also be sufficient if the human interaction partner is aware of the lack of consciousness or of the capability to understand. Theoretically, the structure of communication in this case has been described as processing proto-meaning [12] or as the “unilateral awareness in the surroundings of an unconscious address” (Fuchs [8]: 65).

3.3 Hypotheses

A concrete concern in the HRI research community is the construction of an interaction that can be qualified as smooth, intuitive, and pleasant for the human interaction partner. We define a smooth Human-Robot Interaction as an interaction that is not irritating for participants with regard to their social role in the process (social dimension), with regard to the spatial and temporal sequence of the concrete actions that have to take place in the process (spatio-temporal dimension), and with regard to the meaning of the whole interaction (dimension of meaning/purpose). Additionally, a smooth HRI has to be performed without any help from bystanders (in our case: spectators or presenters). If an irritation becomes too strong, the interaction could collapse – following Garfinkel, we refer to such an irritated sequence as a crisis in the interaction.

For an interaction to qualify as intuitive, it has to function by referring to familiar knowledge without reasoning. For social robotics and social HRI, the stock of knowledge derived from human-human interactions is at stake and influences the intuitiveness perceived by the human interaction partner.

As a result of our former research in 2015, we were able to formulate a few general hypotheses on the implementation of a smooth and intuitive HRI. They were used by our team to implement new features for BROMMI:TAK in order to improve the quality of a handover-interaction. The hypotheses are:

  1. H1: An HRI is more intuitive if the robot is able to indicate its general ability to communicate via a social address.

    1. H1.1: An HRI is more intuitive if the robot is able to indicate its general status via a glowing light, which serves as a basal form of an address and which indicates the status of the robot.

    2. H1.2: A handover-interaction with a robot is evaluated more positively on the Godspeed scales in the light-condition compared to the without-light-condition.

  2. H2: A handover-interaction with a robot is smoother if the robot can indicate where and when the handover will take place. The robot could fulfil this by:

    1. H2.1: following a direct pathway which can be anticipated by the human participant.

    2. H2.2: indicating a specific status or progress in the interaction process, which can serve as an account for the required follow-up action of the human.

4 Method / Implementation

4.1 Study Design

The study was conducted during the “Lange Nacht der Wissenschaften” (Long Night of the Sciences) – an annual science event in which Berlin’s universities and research institutions open their doors for visitors and present a multifaceted programme. In this context, the presentation “BROMMI:TAK – the biomimetic robot trunk” took place for the third consecutive year – for the first time, however, in the FabLab of our research group. BROMMI:TAK is a pneumatically operated, non-humanoid and biomimetically designed robotic system which conducted a basic interaction (handing over an apple) with the visitors in a Wizard-of-Oz setup.

We conducted further preliminary studies in our FabLab to determine possible positions for the information-providing interface and its implementation as a basal form of address. Although the implementation of the LEDs might appear quite specific, there are in fact not that many possibilities due to the limiting but efficient construction of the robotic arm. As the robotic arm has neither many rigid parts nor the capacity to carry much weight, we decided to use small and lightweight LEDs. With respect to familiar forms of status indication on machines, we chose two bigger LED lights on the trunk as general status indicators and smaller LEDs on the gripper as an account of the progress of the interaction sequence (see Fig. 1). While the status LEDs were to provide information about the general status and serve as an account of animacy, the gripper LED was to provide more specific information on whether the robot is going to open or close the gripper. Therefore, the LEDs on the trunk were active throughout the whole experiment, while the LEDs on the gripper were only active if the robot was holding an object or using (opening/closing) the gripper.

For this study, the form of the implemented address was chosen with respect to a working scenario in which one may not be able to use audio but only visual accounts. We also aimed to implement the address in the form and position (on the manipulator and on the trunk) that is most advantageous for the interacting user, considering the context of use.

Figure 1: BROMMI:TAK with the basal light address.

The study was conducted with two experimental conditions in a mixed-method design, using a videographic approach (see 4.3.3) combined with quantitative and qualitative questionnaire data. As we had to deal with a come-and-go audience, we decided to test the robot system half the time with the LED address and half the time without. The advantage of using such an event for conducting a study on HRI lies in reaching a broader range of participants in a less artificial context than in regular laboratory experiments, generating an HRI situation that is more closely related to an everyday context. Despite the convenient conditions for empirical inquiry in the FabLab, however, disruptive factors are not as controllable as in strict laboratory experiments, and one has to take into account the social factor of a (more or less) present audience during the HRI.

4.2 Procedure

The setting was open for visitors to come and go, and the whole evening was videotaped, which was announced via a sign outside the FabLab. A presenter would usually start by explaining some details about the robot’s construction and the work of the junior research group, and when a sufficient number of visitors had gathered, the presenter would initiate the handing-over of the apple. The robot was driven into a position beyond its vertical axis, took an apple from the presenter, and handed it over in its interaction space facing the audience. In one of the experimental conditions, the movements were accompanied by a light choreography of two different LED lights – one in the trunk and one around the gripper. The LED in the trunk pulsated in green when the robot reached a rest or end position and shone in orange during the transfer movements, serving as a general status indication. The LED around the gripper was supposed to shine in blue when holding an object and to start blinking before grasping or releasing the object, which was to function as an account of the interaction sequence. The second experimental condition was conducted without both LED lights. If the interaction process became irritatingly slow because of the slow movement of the robotic system, the participants were asked to be patient with the robot. We could do this as we were not investigating the influence of the speed of the movements, but rather the influence of a symbolic address. When leaving the room, all visitors were asked to participate in the study by filling in a questionnaire with standardised and open-ended questions. As mentioned above, we tested the first condition (with light) in the first half and the second condition (without light) in the second half of the event, providing a different questionnaire for each condition.
In order to avoid questionnaires being filled in by persons who had not observed the experiment, we first asked whether they had observed a handover situation – we were thereby able to control our sample.
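The light choreography described above can be summarised as a simple mapping from robot and gripper states to LED signals. The following sketch is purely illustrative – the state names and the mapping function are our assumptions for presentation here, not the actual (unpublished) controller code of BROMMI:TAK:

```python
# Illustrative sketch of the LED choreography; state names are hypothetical.

# Trunk LED: general status indication, active throughout the experiment.
TRUNK_LED = {
    "rest_position": ("green", "pulsating"),   # rest or end position reached
    "end_position":  ("green", "pulsating"),
    "moving":        ("orange", "steady"),     # transfer movements
}

# Gripper LED: account of the interaction sequence, active only around grasping.
GRIPPER_LED = {
    "holding":          ("blue", "steady"),    # an object (the apple) is held
    "about_to_grasp":   ("blue", "blinking"),  # announces the next gripper action
    "about_to_release": ("blue", "blinking"),
    "idle":             (None, "off"),
}

def led_signal(robot_state, gripper_state):
    """Return the (colour, mode) pairs for the trunk and gripper LEDs."""
    trunk = TRUNK_LED.get(robot_state, ("orange", "steady"))
    gripper = GRIPPER_LED.get(gripper_state, (None, "off"))
    return trunk, gripper
```

For example, while the robot carries the apple towards a visitor, `led_signal("moving", "holding")` yields a steady orange trunk light and a steady blue gripper light, matching the choreography described above.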

4.3 Measures

4.3.1 Godspeed Questionnaire

To evaluate the interaction in general, we used a quantitative measurement instrument designed for HRI. Although an adequate measurement instrument for the comparative evaluation of human-robot interactions is still lacking, the Godspeed Questionnaire (GSQ) is the most popular and frequently used measure in the community [17]. We translated and adapted the Godspeed Questionnaire by Bartneck et al. [1] for our purposes and used an 11-point semantic differential scale. Where the German translation proposed on Bartneck’s website seemed inappropriate, we either found another adjective that would serve as a better antonym or tried to depict another interesting subdimension of the scale. The scales measured were “Anthropomorphism”, “Animacy”, “Likeability”, and “Perceived Intelligence”. Additionally, we replaced the “Perceived Safety” scale completely and used items to measure a “Trustworthiness” scale instead. The participants were asked to rate their impression of the robot intuitively. The opposing adjectives were presented as semantic anchors of an 11-point scale with zero as the middle category and numbers from 1 to 5 on the side of each adjective; no negative numbers were used. The adjective pairs were presented in random order. We additionally collected information on participants’ age, gender, occupation, and interest in robots and robotics.
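To make the response format concrete, the following sketch shows one plausible way of recoding such an 11-point differential (0 as the middle category, 1–5 towards each adjective pole) onto a single axis and averaging the items of a scale. This is an illustration of the described format, not the authors’ actual analysis code; the item values are invented:

```python
def recode(side, value):
    """Map a rating of 0-5 towards the 'left' or 'right' adjective pole
    onto a single -5..+5 axis (0 = middle category)."""
    if not 0 <= value <= 5:
        raise ValueError("ratings range from 0 (middle) to 5 (pole)")
    return value if side == "right" else -value

def scale_mean(item_scores):
    """Average the recoded item scores of one Godspeed-style scale."""
    return sum(item_scores) / len(item_scores)

# Invented example: three 'Animacy' items, mostly rated towards the positive pole.
animacy = [recode("right", 4), recode("right", 2), recode("left", 1)]
print(scale_mean(animacy))  # (4 + 2 - 1) / 3
```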

4.3.2 Open-Ended Questions

Furthermore, we asked the participants in the light condition whether they recognised any colours during the presented interaction and which meaning they would attribute to them. As we used open-ended questions, the participants had to mention both the colour and its attributed meaning, so that we could avoid merely evaluating socialised knowledge of colour attributes. As most people who filled out the questionnaire looked at an ongoing interaction again in order to determine the usage of the colours, the data cannot be used to answer questions about whether the colours were already consciously perceived during the initial interaction. We would nonetheless suggest that it can show whether or not the light concept, with regard to the attributed meaning in general, was working well. Additionally, we asked everyone who had participated as an interaction partner of BROMMI:TAK themselves for a description of their experience as well as for noticeable characteristics of the situation. By these means, we tried to complement the rating of BROMMI:TAK itself with a subjective evaluation of the handover situation.

4.3.3 Videography

Apart from the questionnaire data, we collected audio-visual material using a social-science method called videography. Videography is an interpretative method that combines a focused ethnographic approach with video interaction analysis. It is used for the empirical investigation of “natural” situations, which means that the events are videotaped at the moment they happen, without building reconstructive artifacts. The advantage – compared to mere observational methods – is the possibility of playing the video material over and over again, at increased or decreased speeds, forward and backward, allowing in-depth insights into “natural” situations that would not have been possible with the naked eye, as in classical ethnographic approaches. Although this point may appear trivial, it is important to emphasise that the benefit of videography lies in the use of these technical possibilities in order to take the sequentiality of the interaction process into account. This means that what happens at what point in time is anything but coincidental. The method is highly suitable for an ethnomethodological theoretical approach to the data (see 3.1).

Using videography, one has to bear in mind the selectivity of the camera frame and hence choose the camera positions carefully. Another important aspect is responsiveness (people act differently when they are aware of being filmed), which turned out to be manageable, as long as the researcher didn’t stand behind the camera.

For the analysis, we categorised the numerous hand-over interaction sequences (n=93). The categories were related to the hand-over process, characteristics of the interaction partner, and specific observable actions (e.g. gazes).

5 Findings

Using videography, it is possible to gain information about the relevance of the two LED-light addresses, which were implemented in the trunk and the gripper. As the gripper holds the apple, it is hard to say whether a gaze is directed towards the apple or the LED-light on the gripper itself – however it is easy to differentiate those from gazes on the LED-light in the trunk.

The analysis of the video data suggests a general relevance of the implemented LED light on the trunk, as it was perceived by the interaction partners in 50 out of 93 interactions (54%). Differentiated by the point in the interaction at which the gazing occurred (several gazes were possible), we observed a lower interest in the LED light before the interaction started (19 gazes, 20%), a high interest while the interaction was going on (32 gazes, 34%), and a medium interest once the apple had been successfully handed over (26 gazes, 28%).

The categorisation of the video data shows a relation between gazes on the address and the number of interactions the human interaction partner had previously observed. This makes sense intuitively, as observing others interacting with the robot is an important source for gathering knowledge about the robot and the interaction. The gathered knowledge could include the action that is expected of the participant (his or her own role), the sequential order of the interaction, the dimensioning of the relevant space, the purpose of the interaction and – possibly most importantly – the competence that can be expected from the robot interaction partner. The data shows that most of the participants who interacted with the robot tried to get further information before entering the interaction space of BROMMI:TAK – they tended to act more cautiously in front of an audience. In contrast, participants who had not observed any former HRI had to gather all information during the interaction itself by searching for accounts more actively and interpreting them – these situations are of special interest for us, as they contain data about the potential relevance of a social address in a more contingent HRI.

Table 1

Interaction Sequence 1 – unsuccessful handover (apple falls). Time designations in brackets (starting with 0 sec. in Frame 1).

1. (t=14 sec.): P. focuses on the gripper.
2. (t=19 sec.): P. observes the second blink.
3. (t=21 sec.): The gripper blinks for the fourth time.
4. (t=22 sec.): The gripper opens and the apple falls to the ground.

In total, our data contains 15 such participants who interacted with BROMMI:TAK without previous knowledge about the situation and in 14 (93%) of those situations we could observe a consultation of the LED-light on the trunk. In contrast, we count 72 situations in which the participants had more information and we observed such a consultation in only 34 (47%) of those.
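The contrast between first encounters (14 of 15 participants consulting the trunk LED) and participants with prior observations (34 of 72) can be checked post hoc with a chi-square test of independence. The sketch below is our own illustration, not an analysis from the study; the counts are reconstructed from the figures above, and plain Python is used throughout.

```python
import math

# Counts reconstructed from the text: gazed at trunk LED vs. did not,
# split by whether the participant had observed prior interactions.
table = [[14, 1],    # no prior observation: gazed / did not gaze
         [34, 38]]   # prior observation:    gazed / did not gaze

def chi_square_2x2(t):
    """Pearson chi-square statistic and df=1 p-value for a 2x2 table."""
    (a, b), (c, d) = t
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    chi2 = 0.0
    for i, obs_row in enumerate(t):
        for j, obs in enumerate(obs_row):
            exp = rows[i] * cols[j] / n          # expected count under independence
            chi2 += (obs - exp) ** 2 / exp
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

chi2, p = chi_square_2x2(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

On these counts the statistic comes out well above the 5% critical value of 3.84, consistent with the qualitative claim that first-encounter participants consulted the address far more often (though one cell count is small, so the approximation should be read with caution).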

In the following paragraphs, we provide two exemplary sequential interpretations of handovers in our video material. The first case shows an unsuccessful interaction sequence, in which the apple falls to the ground after a misunderstanding about the timing of the handover; the second is an example of a smooth and intuitive handover followed by the initiation of a giving-back sequence, which takes on ambiguous characteristics.

In sequence one (Table 1), the interaction partner makes visual contact, recognises the status change in the light address in the trunk, and focuses on the gripper (image 1). After the gripper reaches its final position, it blinks twice, until the interaction partner moves his hand towards the apple (image 2). Meanwhile a third blink occurs. But the anticipated handover takes too long: after the fourth blink, the interaction partner pulls back his hand (image 3); shortly afterwards, the gripper opens and the apple falls to the ground (image 4). The behaviour of the interaction partner can be interpreted as correctly reading the first blink as an affordance to take the apple, whereupon the subsequent blinks unsettled this interpretation and led the participant to draw back his hand. Additionally, there was a long delay between BROMMI:TAK reaching its final position and the opening of the gripper, which is not in accordance with an intuitive and smooth spatio-temporal order of a handover.

In the second sequence (Table 2), one can observe the man on the left taking the apple offered by the robot. He fully anticipates the purpose of the situation and his own role in it, reaching the specific place for the handover at the exact moment the gripper starts blinking (image 1). It is important to note that this handover of the apple was very smooth regarding the time the participant had to wait for the gripper to open. The gripper stays open (in relation to the smooth handover: perhaps too long) and the trunk moves back and forth a little because of the loss of the apple's weight. The small movements, combined with the open gripper, were interpreted by P. as an affordance to offer the apple back to the robot in the next sequence. But just before the gripper closes, he suddenly pulls back his hand.[7] The gripper opens again and the man places the apple in the gripper (image 2). After the gripper closes, there are no further movements until the LEDs on the gripper blink again. The man moves his hand slowly to the gripper, but BROMMI:TAK starts moving to the right when the man reaches the handover position (image 3). The man acts embarrassed, pulling back his hand and leaving the situation by turning away (image 4). We interpret the participant's behaviour as correctly anticipating the blinking of the gripper, which can be seen in his reaction to the second blinking after BROMMI:TAK has received the apple again: the man expects the gripper to open another time because it starts to blink. But the LED address was never meant to communicatively handle the return of the apple, and so it blinked misleadingly instead of simply shining. It can also be seen that the forms of communication lack complexity, as the participant is possibly confused by the gripper staying open and by the small movements.
We surmise that the man reacts sensibly to the temporal structure of the situation (when the gripper stays open) because of the fluent initial handover he experienced (images 2–4). We therefore conclude that the complexity of the communicative potential was not sufficient to handle the situation, and that the control of the LEDs was not precise enough to give the accounts in an appropriate window of time. The example not only shows the functional effects of a basal light address, but also the need for its accurate implementation and coherent usage. Otherwise the path dependency of a situation will dominate the interpretation of any new user of a robotic system and can lead to crises in the interaction.

Table 2

Interaction Sequence 2 – smooth handover with mistakable giving-back sequence. Time designations in brackets (starting with 0 sec. in Frame 1).

1. (t=8 sec.): P. reaches out to take the apple while the gripper is still closed.
2. (t=17 sec.): Second try to give the apple back.
3. (t=28.5 sec.): Second blink; P. moves his hand towards the handover destination.
4. (t=30 sec.): B. moves to the right; P. pulls back his hand completely.

With respect to the open-ended questions asking about the colour concept, we conclude that the colour associations were correct in most cases. The colour green was mostly associated with “Everything OK”, “Affordance”, or “Readiness”. Blue was associated with an active meaning, e.g. an “Action”, “Readiness”, “Expectation”, or “Detection.” Red was mostly interpreted as “Error” or “Warning” and yellow was in most of the cases linked to a “Movement” category.

Figure 2

Semantic differential of the adapted and reduced Godspeed Questionnaire (GQS). Adapted and factor-analytically reduced version of the Godspeed Questionnaire; survey at „Lange Nacht der Wissenschaften 2016“, n=100; own translation (German survey) with reverse translation for the English publication; 11-point scale, no negative value labels in the questionnaire, random item order, some items with reversed polarity; light condition: n=65, without-light condition: n=35.

Our sample consisted of n=135 respondents between the ages of 6 and 73, with 37% being female. As younger respondents tended to choose extreme categories more often and might have had difficulties understanding some of the adjective pairs, we dropped them from the sample, resulting in n=100 cases (age range 18–73).

To evaluate the general effect of the LED light address, we first considered the overall rating on the semantic differential scale for the two experimental conditions (with and without light). A factor analysis of our adaption of the Godspeed scales led to a reduction of the original item pool and generated the factors

  1. “anthropomorphism” (corresponding items: unnatural-natural, mechanical-human/humanlike, artificial-natural/real; Cronbach’s Alpha 0.7),

  2. “animacy” (corresponding items: inanimate-animate, inert-vivid, not interactive-interactive, indifferent/apathetic-responsive, inflexible-adaptable; Cronbach’s Alpha 0.6),

  3. “intelligence” (corresponding items: incompetent-competent, irresponsible-responsible, tactless/inappropriate-tactful/appropriate, unintelligent-intelligent; Cronbach’s Alpha 0.8),

  4. “sympathy” (corresponding items: unsympathetic-sympathetic, impolite-polite, unpleasant-pleasant, unfriendly-friendly; Cronbach’s Alpha 0.8) and

  5. “trustworthiness” (corresponding items: deceitful-trustworthy, threatening-harmless, unreliable-reliable; Cronbach’s Alpha 0.7).
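The alpha values above summarise each subscale's internal consistency. As an illustration of how that coefficient is computed, here is a minimal plain-Python sketch; the ratings are made up for demonstration and are not the study's data:

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: one list of respondent scores per item.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores)
    respondents = list(zip(*item_scores))          # one tuple per respondent
    total_scores = [sum(r) for r in respondents]   # scale sum per respondent
    item_var_sum = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_var_sum / variance(total_scores))

# Hypothetical ratings for a three-item subscale, five respondents:
items = [[7, 5, 9, 4, 8],
         [6, 5, 8, 5, 7],
         [8, 4, 9, 5, 9]]
print(f"alpha = {cronbach_alpha(items):.2f}")
```

A useful sanity check on the formula: items that are perfectly correlated (e.g. three identical copies of one item) yield an alpha of exactly 1, while uncorrelated items push alpha towards 0.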

Figure 2 shows a comparison of the means for both conditions. There were no significant differences in the ratings, except for the adjective pairs “mechanical-organic” and “threatening-harmless” (p<0.05), where participants in the condition without the light rated BROMMI:TAK as less mechanical and as more harmless than those in the light condition.

In the aggregated subscales of the Godspeed, there are no significant differences, although the differences on the sympathy scale approach statistical significance (p=0.055) (see Fig. 3). Participants in the condition without light rated BROMMI:TAK as more sympathetic than in the light condition.
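Comparisons of this kind – mean differences on an 11-point scale between two unequal groups (n=65 vs. n=35) – are typically made with Welch's t-test. The sketch below is our own plain-Python illustration on hypothetical ratings (not the study's data); the p-value uses the normal approximation to the t distribution, which is reasonable at these sample sizes.

```python
import math
from statistics import mean, variance

def welch_t_test(a, b):
    """Welch's t statistic and a two-sided p-value (normal approximation)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Two-sided p under the normal approximation (adequate for n around 35+).
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p

# Hypothetical sympathy ratings (11-point scale) for the two conditions:
light    = [5, 6, 4, 7, 5, 6, 5, 4, 6, 5]
no_light = [7, 8, 6, 7, 8, 6, 7, 8, 7, 6]
t, p = welch_t_test(light, no_light)
print(f"t = {t:.2f}, p = {p:.4f}")
```

Welch's variant is preferred over Student's t here because it does not assume equal variances in the two conditions, which cannot be taken for granted with unequal group sizes.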

Figure 3

Comparison of means of the adapted and reduced Godspeed scales. Adapted and factor-analytically reduced version of the Godspeed Questionnaire scales; survey at „Lange Nacht der Wissenschaften 2016“, n=100; 11-point scale; light condition: n=65, without-light condition: n=35.

6 Discussion

Regarding hypothesis H1 (relevance of a social address for the quality of an HRI), our data does not suffice to provide an adequate answer. Nonetheless, our data suggests that H1.1 (relevance of the LED light as status address for the quality of the HRI) can be confirmed, especially for first encounters. This data allows us to draw conclusions about H1 indirectly.

The categorisation of the video data with regard to gazes directed at the LEDs in the trunk revealed interesting findings depending on the number of previously observed interactions. In first encounters, participants looked at the LED address in almost all cases, whereas only about half of the participants looked when they had already observed prior interactions. We therefore draw the conclusion (see also H1) that the address is more important in HRI when the situation is more contingent. In other words, participants who lack prior knowledge are more eager to request further information from a potential address on the robot. As the LED lights in the trunk did not convey much information, we could not expect an increase in communication or reciprocal exchange of information, but the finding nevertheless shows a potential need on the part of the participants to obtain further information from the robot before and during the interaction. As we were testing a specific form of non-anthropomorphic robot arm, the findings presented here should only be generalised to this specific class of robotic arms. There are clearly several other possible ways of implementing a social address on a robot, which may have different effects on the user experience – we chose our concrete design with the initial scenario for BROMMI:TAK in mind, namely a working scenario in which simple visual accounts would work better than, for example, auditive forms. Given the enormous variety of robots, almost all HRI findings have to be interpreted relative to the concrete form. If a socially acting robot is supposed to be able to interact with every human, whether knowledgeable or not, our data shows the need for such a social address. It should be clear, however, that further studies will be necessary before the findings, and the concept of social addressability, can be generalised to other robotic systems.

H1.2 (more positive evaluation on the Godspeed scales in the light condition) has to be rejected as the statistical analysis shows no significant effects. There are several possible reasons as to why our data didn’t show significant differences between the two experimental conditions:

Firstly, the measurement instrument might not have been sensitive enough to trace the effects on the quality of the interaction that would be ascribed to the implemented light address (the experimental manipulation). This argument points to the general lack of adequate quantitative measures for evaluating an HRI. Effects of the light address might have had a greater impact on subdimensions other than those captured by our modified Godspeed scales (anthropomorphism, animacy, intelligence, sympathy, trustworthiness). Beyond this, the Godspeed questionnaire might not be as appropriate for non-humanoid robots as for humanoids or androids.[10]

And secondly, if we assume that the results of the Godspeed scales are reliable and valid, there are several possible reasons why the implementation of the address could have been disadvantageous. The information displayed by the LED lights might have been too unspecific and possibly did not carry the intended communicative relevance. This is supported qualitatively, as we observed many interaction sequences in which the blinking of the LEDs went on much too long before the apple was released by the gripper, which calls the informational value of the LEDs into question. In the implementation of such an address, there is no room for ambiguities if the effect is supposed to manifest itself in an overall quantitative rating. The qualitative video data shows that the implemented address elements have to be coherently synchronised in order to serve as unambiguous accounts for the interaction; otherwise the interaction quickly becomes disturbed. Another possible explanation, in line with the near-significant difference in the sympathy scale ratings between the two conditions, is that the light colour of the LEDs is generally too cold. This could explain why participants in the condition without light rated BROMMI:TAK as slightly more sympathetic than participants in the light condition.

We can generally confirm hypothesis H2 (H2.1 and H2.2 – a smoother handover interaction if the robot indicates where and when the handover will take place by following a direct pathway or indicating progress in the interaction process) with reference to exemplary cases in our video material. We observed very smooth HRIs when BROMMI:TAK followed a direct pathway to address the potential interaction partner. Moreover, there are several instances in our video material in which the participants react to the blinking of the LEDs on the gripper and read it as an account for the following action of releasing the object. This only functions well if the object is released shortly after the blinking starts; otherwise it becomes confusing or is interpreted falsely. We therefore conclude that any address implemented on a robotic entity has to be programmed to correspond strictly to the actions undertaken by that entity in order to provide sufficient information about the robot or the interaction sequence. A non-coherently implemented address does more harm than good.

7 Conclusion

The basic idea was to implement social addressability via an LED interface on a robotic system. This implementation of a social address had only limited success, as the LEDs had restricted informational and no reciprocal communicative capabilities. Nevertheless, they served as accountable instances in some interaction sequences, which promoted smooth and intuitive HRIs. The robot in the WoZ setup was able to maintain an illusion of interactivity, reciprocity, and even sociality in some interaction sequences. Notwithstanding, our data shows that the interpretation of a materialised address as social poses significant challenges. A refined measurement tool or experimental setup is needed to account for the gradual differences that distinguish what we introduced as a social address from merely mechanical behaviours or reactions.

Another challenge our study faced was the use of a non-humanoid robot to test the relevance of an addressable interface. We chose the concept of a social address to also include more abstract forms than faces or clearly anthropomorphically designed surrogates. Our implementation possibly missed the display of some sort of perceptive faculty (e.g. via a camera lens or other sensors), which might have maintained the illusion of an addressable interface more efficiently. Another plausible assumption is that – with regard to the smoothness of the interaction with non-humanoid robots – other parameters might play a more important role than social addressability (e.g. direct pathway, adequate speed of the interaction in general). Social addressability, however, could absorb other deficiencies in performance of the robotic system, if the address is able to display possible reasons for them.

Despite all the limitations of our study, we achieved an atypically high number of cases compared to other HRI studies. Our sample contains a wider range of ages, levels of education, and social contexts (e.g. families, singles, couples) than studies that typically rely on university students. Because of the public character of the event, we were able to analyse a situation containing a variety of interaction forms, which is much closer to the everyday settings in which robots will be introduced in the near future. The social component of other persons being present is of particular importance, especially concerning learning effects.

Our theoretical perspective proved very fruitful in gathering interesting insights from the large amount of (primarily audio-visual) data. We were still able to preserve analytic openness and impartiality with our qualitatively oriented approach. Videography as a method is particularly suitable for this type of experiment, as it takes the in situ construction of the situation seriously. It poses the question: how do the persons involved create the HRI situation sequentially? The assumption is that what is observable is anything but coincidental. Beyond that, we also see the necessity for standardised quantitative measurement tools for HRI in order to measure and compare the quality of an HRI. With respect to our findings, we see the need for further research under stricter laboratory conditions in order to validate the outcomes, especially regarding the relevance of a more elaborate social address. This relevance was already indicated qualitatively, especially for first-encounter interactions in which the human interaction partner has no specific knowledge about robotic systems.

Referencing sociological theory in order to reformulate the phenomenon, and acquiring an abstract understanding of the situation, helped us formulate hypotheses and thereby investigate the categorised video data. The notion of a social address was introduced as an embodied form so that it can be recognised as a distinct form (or: body/entity), which can be ascribed the ability to recognise a social environment and to give accounts of what will happen next in the interaction. By giving accounts, an entity can be ascribed the ability to differentiate between a self and an environment and therefore has to be able to process some sort of self-reference. Being able to represent these abilities is the crucial quality of a social address we would like to emphasise. In this regard, BROMMI:TAK might be seen as a robot that behaves with self-referentiality, but with limited potential for complex reciprocal communication. An ethnomethodologically informed perspective indicates the strong need for accountable actions or communication in an interaction with an unfamiliar artificial interaction partner. We conclude that if the interaction partner has learned to differentiate the robot's accountable actions (e.g. LEDs, gazes, or speech) from the merely necessary ones (e.g. technically induced movements, noise), the use of additional behavioural forms of communication (or the use of an address) makes the interaction significantly smoother and thus faster. As a precondition for this, the temporal implementation and the use of accountable behaviour have to be strictly coherent in order to avoid confusion.

References

[1] Bartneck, C., Kulic, D., Croft, E., & Zoghbi, S. Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots. International Journal of Social Robotics 2009;1:71–81. doi:10.1007/s12369-008-0001-3

[2] Braun-Thürmann, H. Künstliche Interaktion. In: Christaller, T. & Wehner, J., editors. Autonome Maschinen. Wiesbaden: Westdeutscher Verlag, 2003: 221–243.

[3] Dautenhahn, K. Socially intelligent robots: Dimensions of human–robot interaction. Philosophical Transactions of the Royal Society B: Biological Sciences 2007;362(1480): 679–704. doi:10.1098/rstb.2006.2004

[4] Dautenhahn, K., Ogden, B., & Quick, T. From embodied to socially embedded agents – Implications for interaction-aware robots. Cognitive Systems Research 2002;3(3): 397–428. doi:10.1016/S1389-0417(02)00050-5

[5] Duffy, B.R. Anthropomorphism and the social robot. Robotics and Autonomous Systems 2003;42(3): 177–190. doi:10.1016/S0921-8890(02)00374-3

[6] Echterhoff, G., Bohner, G., & Siebler, F. “Social Robotics” und Mensch-Maschine-Interaktion. Zeitschrift für Sozialpsychologie 2006;37(4): 219–231. doi:10.1024/0044-3514.37.4.219

[7] Fischer, K., Jensen, L.C., Kirstein, F., Stabinger, S., Erkent, Ö., Shukla, D., & Piater, J. The effects of social gaze in human-robot collaborative assembly. ICSR 2015. Berlin: Springer, 2015: 204–213. doi:10.1007/978-3-319-25554-5_21

[8] Fuchs, P. Adressabilität als Grundbegriff der soziologischen Systemtheorie. Soziale Systeme 1997;3(1): 57–79.

[9] Garfinkel, H. Studies in Ethnomethodology. Upper Saddle River: Prentice Hall, 1967.

[10] Halbach, W.R. Interfaces – Medien- und kommunikationstheoretische Elemente einer Interface-Theorie. München: Wilhelm Fink Verlag, 1994.

[11] Heider, F., & Simmel, M. An experimental study of apparent behaviour. The American Journal of Psychology 1944;57(2): 243–259. doi:10.2307/1416950

[12] Lorentzen, K.F. Luhmann goes Latour – Zur Soziologie hybrider Beziehungen. In: Rammert, W. & Schulz-Schaeffer, I., editors. Können Maschinen handeln? Frankfurt am Main: Campus Verlag, 2002: 221–243.

[13] Luhmann, N. Soziale Systeme. Frankfurt am Main: Suhrkamp, 1984.

[14] Moon, Aj., Troniak, D.M., Gleeson, B., Pan, M.K., Zheng, M., Blumer, B.A., … Croft, E.A. Meet me where I’m gazing: How shared attention gaze affects human-robot handover timing. Proceedings of the 2014 ACM/IEEE International Conference on Human-Robot Interaction 2014: 334–341. doi:10.1145/2559636.2559656

[15] Pitsch, K., Vollmer, A.-L., & Mühlig, M. Robot feedback shapes the tutor’s presentation: How a robot’s online gaze strategies lead to micro-adaptation of the human’s conduct. Interaction Studies 2013;14(2): 268–296. doi:10.1075/is.14.2.06pit

[16] Rakotonirainy, A. Human-computer interactions: Research challenges for in-vehicle technology. Proceedings of the Road Safety Research Policing and Education Conference, September 2003.

[17] Weiss, A., & Bartneck, C. Meta analysis of the usage of the Godspeed questionnaire series. 24th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) 2015: 381–388. doi:10.1109/ROMAN.2015.7333568

[18] Young, J.E., Sung, J., Voida, A., Sharlin, E., Igarashi, T., Christensen, H.I., & Grinter, R.E. Evaluating human-robot interaction. International Journal of Social Robotics 2011;3(1): 53–67. doi:10.1007/s12369-010-0081-8

[19] Zheng, M., Moon, Aj., Croft, E.A., & Meng, M.Q.-H. Impacts of robot head gaze on robot-to-human handovers. International Journal of Social Robotics 2015;7(5): 1–16. doi:10.1007/s12369-015-0305-z

[20] Zheng, M., Moon, Aj., Gleeson, B., Troniak, D.M., Pan, M.K., Blumer, B.A., … Croft, E.A. Human behavioural responses to robot head gaze during robot-to-human handovers. Proceedings of the 2014 IEEE International Conference on Robotics and Biomimetics. Bali: IEEE, 2014. doi:10.1109/ROBIO.2014.7090357

Published Online: 2017-08-10
Published in Print: 2017-08-28

© 2017 Walter de Gruyter GmbH, Berlin/Boston
