GOOD SAMARITANISM:

AN UNDERGROUND PHENOMENON?

IRVING M. PILIAVIN JUDITH RODIN

University of Pennsylvania Columbia University

AND JANE ALLYN PILIAVIN

University of Pennsylvania

A field experiment was performed to investigate the effect of several variables on helping behaviour, using the express trains of the New York 8th Avenue Independent Subway as a laboratory on wheels. Four teams of students, each one made up of a victim, model, and two observers, staged standard collapses in which type of victim (drunk or ill), race of victim (black or white) and presence or absence of a model were varied. Data recorded by observers included number and race of observers, latency of the helping response and race of helper, number of helpers, movement out of the "critical area," and spontaneous comments. Major findings of the study were that (a) an apparently ill person is more likely to receive aid than is one who appears to be drunk, (b) race of victim has little effect on race of helper except when the victim is drunk, (c) the longer the emergency continues without help being offered, the more likely it is that someone will leave the area of the emergency, and (d) the expected decrease in speed of responding as group size increases, the "diffusion of responsibility effect" found by Darley and Latane, does not occur in this situation. Implications of this difference between laboratory and field results are discussed, and a brief model for the prediction of behaviour in emergency situations is presented.

Since the murder of Kitty Genovese in Queens, a rapidly increasing number of social scientists have turned their attentions to the study of the good Samaritan's act and an associated phenomenon, the evaluation of victims by bystanders and agents. Some of the findings of this research have been provocative and nonobvious. For example, there is evidence that agents, and even bystanders, will sometimes derogate the character of the victims of misfortune, instead of feeling compassion (Berscheid & Walster, 1967; Lerner & Simmons, 1966). Furthermore, recent findings indicate that under certain circumstances there is not "safety in numbers," but rather "diffusion of responsibility." Darley and Latane (1968) have reported that among bystanders hearing an epileptic seizure over earphones, those who believed other witnesses were present were less likely to seek assistance for the victim than were bystanders who believed they were alone. Subsequent research by Latane and Rodin (1969) on response to the victim of a fall confirmed this finding and suggested further that assistance from a group of bystanders was less likely to come if the group members were strangers than if they were prior acquaintances. The field experiments of Bryan and Test (1967), on the other hand, provide interesting findings that fit common sense expectations; namely, one is more likely to be a good Samaritan if one has just observed another individual performing a helpful act.

Much of the work on victimization to date has been performed in the laboratory. It is commonly argued that the ideal research strategy over the long haul is to move back and forth between the laboratory, with its advantage of greater control, and the field, with its advantage of greater reality. The present study was designed to provide more information from the latter setting.

The primary focus of the study was on the effect of type of victim (drunk or ill) and race of victim (black or white) on speed of responding, frequency of responding, and the race of the helper. On the basis of the large body of research on similarity and liking as well as that on race and social distance, it was assumed that an individual would be more inclined to help someone of his race than a person of another race. The expectation regarding type of victim was that help would be accorded more frequently and rapidly to the apparently ill victim. This expectation was derived from two considerations. First, it was assumed that people who are regarded as partly responsible for their plight would receive less sympathy and consequently less help than people seen as not responsible for their circumstances (Schopler & Matthews, 1965).

Secondly, it was assumed that whatever sympathy individuals may experience when they observe a drunk collapse, their inclination to help him will be dampened by the realization that the victim may become disgusting, embarrassing, and/or violent. This realization may, in fact, not only constrain helping but also lead observers to turn away from the victim, that is, to leave the scene of the emergency.

Aside from examining the effects of race and type of victim, the present research sought to investigate the impact of modelling in emergency situations. Several investigators have found that an individual's actions in a given situation lead others in that situation to engage in similar actions. This modelling phenomenon has been observed in a variety of contexts including those involving good Samaritanism (Bryan & Test, 1967). It was expected that the phenomenon would be observed as well in the present study. A final concern of the study was to examine the relationship between size of group and frequency and latency of the helping response, with a victim who was both seen and heard. In previous laboratory studies (Darley & Latane, 1968; Latane & Rodin, 1969) increases in group size led to decreases in frequency and increases in latency of responding. In these studies, however, the emergency was only heard, not seen. Since visual cues are likely to make an emergency much more arousing for the observer, it is not clear that, given these cues, such considerations as crowd size will be relevant determinants of the observer's response to the emergency. Visual cues also provide clear information as to whether anyone has yet helped the victim or if he has been able to help himself. Thus, in the laboratory studies, observers lacking visual cues could rationalize not helping by assuming assistance was no longer needed when the victim ceased calling for help. Staging emergencies in full view of observers eliminates the possibility of such rationalization.

To conduct a field investigation of the above questions under the desired conditions required a setting that would allow the repeated staging of emergencies in the midst of reasonably large groups that remained fairly similar in composition from incident to incident. It was also desirable that each group retain the same composition over the course of the incident and that a reasonable amount of time be available after the emergency occurred for good Samaritans to act. To meet these requirements, the emergencies were staged during the approximately 7½-minute express run between the 59th Street and 125th Street stations of the Eighth Avenue Independent (IND) branch of the New York subways.

METHOD

Subjects

About 4,450 men and women who travelled on the 8th Avenue IND in New York City, weekdays between the hours of 11:00 A.M. and 3:00 P.M. during the period from April 15 to June 26, 1968, were the unsolicited participants in this study. The racial composition of a typical train, which travels through Harlem to the Bronx, was about 45% black and 55% white. The mean number of people per car during these hours was 43; the mean number of people in the "critical area," in which the staged incident took place, was 8.5.

Field situation. The A and D trains of the 8th Avenue IND were selected because they make no stops between 59th Street and 125th Street. Thus, for about 7½ minutes there was a captive audience who, after the first 70 seconds of their ride, became bystanders to an emergency situation. A single trial was a non-stop ride between 59th and 125th Streets, going in either direction. All trials were run only on the old New York subway cars which serviced the 8th Avenue line, since they had two-person seats in group arrangement rather than extended seats. The designated experimental or critical area was that end section of any car whose doors led to the next car. There are 13 seats and some standing room in this area on all trains (see Figure 1).

Procedure

On each trial a team of four Columbia General Studies students, two males and two females, boarded the train using different doors. Four different teams, whose members always worked together, were used to collect data for 103 trials. Each team varied the location of the experimental car from trial to trial. The female confederates took seats outside the critical area and recorded data as unobtrusively as possible for the duration of the ride, while the male model and victim remained standing. The victim always stood next to a pole in the centre of the critical area (see Figure 1). As the train passed the first station (approximately 70 seconds after departing) the victim staggered forward and collapsed. Until receiving help, the victim remained supine on the floor looking at the ceiling. If the victim received no assistance by the time the train slowed to a stop, the model helped him to his feet. At the stop, the team disembarked and waited separately until other riders had left the station. They then proceeded to another platform to board a train going in the opposite direction for the next trial. From 6 to 8 trials were run on a given day. All trials on a given day were in the same "victim condition."

Victim. The four victims (one from each team) were males between the ages of 26 and 35. Three were white and one was black. All were identically dressed in Eisenhower jackets, old slacks, and no tie. On 38 trials the victims smelled of liquor and carried a liquor bottle wrapped tightly in a brown bag (drunk condition), while on the remaining 65 trials they appeared sober and carried a black cane (cane condition). In all other aspects, victims dressed and behaved identically in the two conditions. Each victim participated in drunk and cane trials. [It will be noted later that not only were there more cane trials than drunk trials, they were also distributed unevenly across black and white victims. The reason for this is easier to explain than to correct. Teams 1 and 2 (both white victims) started the first day in the cane condition. Teams 3 (black) and 4 (white) began in the drunk condition. Teams were told to alternate the conditions across days. They arranged their running days to fit their schedules. On their fourth day, Team 2 violated the instruction and ran cane trials when they should have run drunk trials; the victim "didn't like" playing the drunk! Then the Columbia student strike occurred, the teams disbanded, and the study of necessity was over. At this point, Teams 1 and 3 had run on only 3 days each, while 2 and 4 had run on 4 days each.]

Model. Four white males between the ages of 24 and 29 assumed the roles of model in each team. All models wore informal clothes, although they were not identically attired. There were four different model conditions used across both victim conditions (drunk or cane).

1. Critical area-early. Model stood in critical area and waited until passing fourth station to assist victim (approximately 70 seconds after collapse).

2. Critical area-late. Model stood in critical area and waited until passing sixth station to assist victim (approximately 150 seconds after collapse).

3. Adjacent area-early. Model stood in middle of car in area adjacent to critical area and waited until passing fourth station.

4. Adjacent area-late. Model stood in adjacent area and waited until passing sixth station.

When the model provided assistance, he raised the victim to a sitting position and stayed with him for the remainder of the trial. An equal number of trials in the no-model condition and in each of the four model conditions were pre-programmed by a random number table and assigned to each team.

TABLE 1

PERCENTAGE OF TRIALS ON WHICH HELP WAS GIVEN, BY RACE AND CONDITION OF VICTIM, AND TOTAL NUMBER OF TRIALS RUN IN EACH CONDITION

Note. Distribution of model trials for the drunk was as follows: critical area: early, 4; late, 4; adjacent area: early, 5; late, 3. The three model trials completed for the cane victim were all early, with 2 from the critical area and 1 from the adjacent area.

Measures. On each trial one observer noted the race, sex, and location of every rider seated or standing in the critical area. In addition, she counted the total number of individuals in the car and the total number of individuals who came to the victim's assistance. She also recorded the race, sex, and location of every helper. A second observer coded the race, sex, and location of all persons in the adjacent area. She also recorded the latency of the first helper's arrival after the victim had fallen and on appropriate trials, the latency of the first helper's arrival after the programmed model had arrived. Both observers recorded comments spontaneously made by nearby passengers and attempted to elicit comments from a rider sitting next to them.

RESULTS AND DISCUSSION

As can be seen in Table 1, the frequency of help received by the victims was impressive, at least as compared to earlier laboratory results. The victim with the cane received spontaneous help, that is, before the model acted, on 62 of the 65 trials. Even the drunk received spontaneous help on 19 of 38 trials. The difference is not explicable on the basis of gross differences in the numbers of potential helpers in the cars. (Mean number of passengers in the car on cane trials was 45; on drunk trials, 40. Total range was 15-120.)

On the basis of past research, relatively long latencies of spontaneous helping were expected; thus, it was assumed that models would have time to help, and their effects could be assessed. However, in all but three of the cane trials planned to be model trials, the victim received help before the model was scheduled to offer assistance. This was less likely to happen with the drunk victim. In many cases, the early model was able to intervene, and in a few, even the delayed model could act (see Table 1 for frequencies).

A direct comparison between the latency of response in the drunk and cane conditions might be misleading, since on model trials one does not know how long it might have taken for a helper to arrive without the stimulus of the model. Omitting the model trials, however, would reduce the number of drunk trials drastically. In order to get around these problems the trials have been dichotomised into a group in which someone helped before 70 seconds (the time at which the early model was programmed to help) and a group in which no one had helped by this time. The second group includes some trials in which people helped the model and a very few in which no one helped at all. If a comparison of latencies is made between cane and drunk nonmodel trials only, the median latency for cane trials is 5 seconds and the median for drunk trials is 109 seconds (assigning 400 seconds as the latency for nonrespondents). The Mann-Whitney U for this comparison is significant at p < .0001.
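The latency comparison described above can be sketched in a few lines. This is a minimal illustration rather than the authors' computation: the latency lists are made-up placeholders, the function names are hypothetical, and only the 400-second value for nonrespondents comes from the text.

```python
from statistics import median

# Latency assigned to trials on which no one helped (value taken from the text).
NONRESPONSE_LATENCY = 400


def fill_nonrespondents(latencies):
    """Replace None (no helper arrived) with the assigned 400-second latency."""
    return [NONRESPONSE_LATENCY if t is None else t for t in latencies]


def mann_whitney_u(first, second):
    """Return the smaller of the two Mann-Whitney U values (ties count 0.5)."""
    wins = 0.0
    for x in first:
        for y in second:
            if x < y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return min(wins, len(first) * len(second) - wins)


# Hypothetical latencies in seconds; NOT the study's data.
cane = fill_nonrespondents([5, 3, 10, 7])
drunk = fill_nonrespondents([109, 60, None, 200])

print(median(cane), median(drunk), mann_whitney_u(cane, drunk))
```

A U near zero means the two samples barely overlap; converting U to a significance level would additionally require tables or a normal approximation, which this sketch leaves out.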

It is quite clear from the first section of Table 2 that there was more immediate, spontaneous helping of the victim with the cane than of the drunk. The effect seems to be essentially the same for the black victim and for the white victims. Among the white victim teams, the data from Team 2 differ to some extent from those for Teams 1 and 4. All of the cane-after 70 seconds trials are accounted for by Team 2, as are 4 of the 5 drunk-before 70 seconds trials. Median latency for cane trials is longer for Team 2 than for the other teams; for drunk trials, shorter. This is the same team that violated the "alternate days" instruction. It would appear that this team is being rather less careful, that the victim may be getting out of his role. The data from this team have been included in the analysis although they tend to reduce the relationships that were found.

What of the total number of people who helped? On 60% of the 81 trials on which the victim received help, he received it not from one good Samaritan but from two, three, or even more. The data from the model trials are not included in this analysis because the model was programmed to behave rather differently from the way in which most real helpers behaved. That is, his role was to raise the victim to a sitting position and then appear to need assistance. Most real helpers managed to drag the victim to a seat or to a standing position on their own. Thus the programmed model received somewhat more help than did real first helpers.

There are no significant differences between black and white victims, or between cane and drunk victims, in the number of helpers subsequent to the first who came to his aid. Seemingly, then, the presence of the first helper has important implications that override whatever cognitive and emotional differences were initially engendered among observers by the characteristics of the victim. It may be that the victim's uniformly passive response to the individual trying to assist him reduced observers' fear about possible unpleasantness in the drunk conditions. Another possibility is that the key factor in the decisions of second and third helpers to offer assistance was the first helper. That is, perhaps assistance was being offered primarily to him rather than to the victim. Unfortunately the data do not permit adequate assessment of these or other possible explanations.

Characteristics of Spontaneous First Helpers

Having discovered that people do, in fact, help with rather high frequency, the next question is, "Who helps?" The effect of two variables, sex and race, can be examined. On the average, 60% of the people in the critical area were males. Yet, of the 81 spontaneous first helpers, 90% were males. In this situation, then, men are considerably more likely to help than are women (p < .001).

Turning now to the race variable, of the 81 first helpers, 64% were white. This percentage does not differ significantly from the expected percentage of 55% based on racial distribution in the cars. Since both black and white victims were used, it is also possible to see whether blacks and whites are more likely to help a member of their own race. On the 65 trials on which spontaneous help was offered to the white victims, 68% of the helpers were white. This proportion differs from the expected 55% at the .05 level. On the 16 trials on which spontaneous help was offered to the black victim, half of the first helpers were white. While this proportion does not differ from chance expectation, we again see a slight tendency toward "same race" helping.
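The sex and race comparisons above set an observed proportion of helpers against the proportion expected from the composition of the cars. The source does not say which test was used; the sketch below assumes a one-sided exact binomial test, and the helper counts are reconstructed by rounding the reported percentages, which is also an assumption. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, trials, expected):
    """P(X >= successes) when X ~ Binomial(trials, expected)."""
    return sum(
        comb(trials, k) * expected**k * (1 - expected) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Counts reconstructed by rounding the reported percentages (an assumption).
male_helpers = round(0.90 * 81)   # 90% of the 81 first helpers
white_helpers = round(0.64 * 81)  # 64% of the 81 first helpers

p_sex = binomial_upper_tail(male_helpers, 81, 0.60)    # expected 60% male
p_race = binomial_upper_tail(white_helpers, 81, 0.55)  # expected 55% white
print(f"sex: p = {p_sex:.2g}; race: p = {p_race:.2g}")
```

The same function can be applied to the white-victim and black-victim trials separately by substituting the relevant counts and the 55% expectation.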

When race of helper is examined separately for cane and drunk victims, an interesting although nonsignificant trend emerges (see Table 3). With both the black and white cane victims, the proportion of helpers of each race was in accord with the expected 55%-45% split. With the drunk, on the other hand, it was mainly members of his own race who came to his aid.

This interesting tendency toward same-race helping only in the case of the drunk victim may reflect more empathy, sympathy, and trust toward victims of one's own racial group. In the case of an innocent victim (e.g., the cane victim), when sympathy, though differentially experienced, is relatively uncomplicated by other emotions, assistance can readily cut across group lines. In the case of the drunk (and potentially dangerous) victim, complications are present, probably blame, fear, and disgust. When the victim is a member of one's own group, so that the conditions for empathy and trust are more favourable, assistance is more likely to be offered. As we have seen, however, this does not happen without the passing of time to think things over. Recent findings of Black and Reiss (1967) in a study of the behaviour of white police officers towards apprehended persons offer an interesting parallel. Observers in this study recorded very little evidence of prejudice toward sober individuals, whether white or black. There was a large increase in prejudice expressed towards drunks of both races, but the increase in prejudice towards blacks was more than twice that towards whites.

Modelling Effects

No extensive analysis of the response to the programmed model could be made, since there were too few cases for analysis. Two analyses were, however, performed on the effects of adjacent area versus critical area models and of early versus late models within the drunk condition. The data are presented in Table 4. While the area variable has no effect, the early model elicited help significantly more than did the late model.

Other Responses to the Incident

What other responses do observers make to the incident? Do the passengers leave the car, move out of the area, make comments about the incident? No one left the car on any of the trials. However, on 21 of the 103 trials, a total of 34 people did leave the critical area. The second section of Table 2 presents the percentage of trials on which someone left the critical area as a function of three variables: type of victim, race of victim, and time to receipt of help (before or after 70 seconds). People left the area on a higher proportion of trials with the drunk than with the cane victim. They also were far more likely to leave on trials on which help was not offered by 70 seconds, as compared to trials on which help was received before that time. The frequencies are too small to make comparisons with each of the variables held constant.

Each observer spoke to the person seated next to her after the incident took place. She also noted spontaneous comments and actions by those around her. A content analysis of these data was performed, with little in the way of interesting findings. The distribution of number of comments over different sorts of trials, however, did prove interesting (see Section 3 of Table 2). Far more comments were obtained on drunk trials than on cane trials. Similarly, most of the comments were obtained on trials in which no one helped until after 70 seconds. The discomfort observers felt in sitting inactive in the presence of the victim may have led them to talk about the incident, perhaps hoping others would confirm the fact that inaction was appropriate. Many women, for example, made comments such as, "It's for men to help him," or "I wish I could help him; I'm not strong enough," "I never saw this kind of thing before; I don't know where to look," "You feel so bad that you don't know what to do."

A Test of the Diffusion of Responsibility Hypothesis

In the Darley and Latane experiment it was predicted and found that as the number of bystanders increased, the likelihood that any individual would help decreased and the latency of response increased. Their study involved bystanders who could not see each other or the victim. In the Latane and Rodin study, the effect was again found, with bystanders who were face to face, but with the victim still only heard. In the present study, bystanders saw both the victim and each other. Will the diffusion of responsibility finding still occur in this situation?

In order to check this hypothesis, two analyses were performed. First, all nonmodel trials were separated into three groups according to the number of males in the critical area (the assumed reference group for spontaneous first helpers). Mean and median latencies of response were then calculated for each group, separately by type and race of victim. The results are presented in Table 5. There is no evidence in these data for diffusion of responsibility; in fact, response times, using either measure, are consistently faster for the 7 or more groups compared to the 1 to 3 groups.
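The first analysis amounts to binning trials by the number of males in the critical area and summarising latency within each bin. The following is a minimal sketch under stated assumptions: the text names only the "1 to 3" and "7 or more" groups, so the middle bin of 4 to 6 is inferred; the trial values are invented for illustration; and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def size_bin(n_males):
    """Map the number of males in the critical area to one of three bins.

    The 4-6 bin is inferred; trials with zero males are not treated specially.
    """
    if n_males <= 3:
        return "1-3"
    if n_males <= 6:
        return "4-6"
    return "7 or more"


def latency_by_group_size(trials):
    """trials: iterable of (n_males, latency_seconds) pairs.

    Returns {bin label: (mean latency, median latency)}.
    """
    bins = defaultdict(list)
    for n_males, latency in trials:
        bins[size_bin(n_males)].append(latency)
    return {label: (mean(v), median(v)) for label, v in bins.items()}


# Hypothetical trials; NOT the study's data.
example = [(2, 30), (3, 10), (5, 12), (8, 4), (9, 6)]
print(latency_by_group_size(example))
```

In the study this summary was computed separately by type and race of victim, which would correspond to calling the function once per subset of trials.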

As Darley and Latane pointed out, however, different-size real groups cannot be meaningfully compared to one another, since as group size increases the likelihood that one or more persons will help also increases. A second analysis as similar as possible to that used by those authors was therefore performed, comparing latencies actually obtained for each size group with a base line of hypothetical groups of the same size made up by combining smaller groups. In order to have as much control as possible the analysis was confined to cane trials with white victims and male first helpers coming from the critical area. Within this set of trials, the most frequently occurring natural groups (of males in the critical area) were those of sizes 3 (n = 6) and 7 (n = 5). Hypothetical groups of 3 (n = 4) and 7 (n = 25) were composed of all combinations of smaller sized groups. For example, to obtain the hypothetical latencies for groups of 7, combinations were made of (a) all real size 6 groups with all real size 1 groups, plus (b) all real size 5 groups with all real size 2 groups, etc. The latency assigned to each of these hypothetical groups was that recorded for the faster of the two real groups of which it was composed. Cumulative response curves for real and hypothetical groups of 3 and 7 are presented in Figure 2.
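The construction of hypothetical groups can be made concrete with a short sketch. It follows the description above (pair every real group of one size with every real group of the complementary size and keep the faster latency), but the handling of two equal-sized halves is an assumption, the latencies are invented, and the function names are hypothetical.

```python
from itertools import combinations, product


def hypothetical_latencies(real, size):
    """Latencies for hypothetical groups of `size` built from two smaller real groups.

    real: dict mapping real group size -> list of observed latencies (seconds).
    Each hypothetical group is assigned the faster (minimum) of its two latencies.
    """
    out = []
    for small in range(1, size // 2 + 1):
        large = size - small
        if large == small:
            # Assumption: pair distinct groups of equal size, without repetition.
            pairs = combinations(real.get(small, []), 2)
        else:
            pairs = product(real.get(large, []), real.get(small, []))
        out.extend(min(a, b) for a, b in pairs)
    return out


def cumulative_proportion(latencies, t):
    """Proportion of groups in which help had arrived by time t."""
    return sum(1 for x in latencies if x <= t) / len(latencies)


# Hypothetical latencies; NOT the study's data.
real = {1: [10, 50], 2: [20]}
hyp3 = hypothetical_latencies(real, 3)
print(hyp3, cumulative_proportion(hyp3, 15))
```

Evaluating `cumulative_proportion` over a range of times for both the real and the hypothetical latencies gives the two curves that are being compared.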

As can be seen in the figure, the cumulative helping response curves for the hypothetical groups of both sizes are lower than those for the corresponding real groups. That is, members of real groups responded more rapidly than would be expected on the basis of the faster of the two scores obtained from the combined smaller groups. While these results together with those summarized in Table 5 do not necessarily contradict the diffusion of responsibility hypothesis, they do not follow the pattern of findings obtained by Darley and Latane and are clearly at variance with the tentative conclusion of those investigators that "a victim may be more likely to receive help . . . the fewer people there are to take action [Latane & Darley, 1968, p. 221]."

Two explanations can be suggested to account for the disparity between the findings of Table 5 and Figure 2 and those of Darley and Latane and Latane and Rodin. As indicated earlier in this paper, the conditions of the present study were quite different from those in previous investigations. First, the fact that observers in the present study could see the victim may not only have constrained observers' abilities to conclude there was no emergency, but may also have overwhelmed with other considerations any tendency to diffuse responsibility. Second, the present findings may indicate that even if diffusion of responsibility is experienced by people who can actually see an emergency, when groups are larger than two the increment in deterrence to action resulting from increasing the number of observers may be less than the increase in probability that within a given time interval at least one of the observers will take action to assist the victim. Clearly, more work is needed in both natural and laboratory settings before an understanding is reached of the conditions under which diffusion of responsibility will or will not occur.

CONCLUSIONS

In this field study a personal emergency occurred in which escape for the bystander was virtually impossible. It was a public, face-to-face situation, and in this respect differed from previous lab studies. Moreover, since generalizations from field studies to lab research must be made with caution, few comparisons will be drawn. However, several conclusions may be put forth:

1. An individual who appears to be ill is more likely to receive aid than is one who appears to be drunk, even when the immediate help needed is of the same kind.

2. Given mixed groups of men and women, and a male victim, men are more likely to help than are women.

3. Given mixed racial groups, there is some tendency for same-race helping to be more frequent. This tendency is increased when the victim is drunk as compared to apparently ill.

4. There is no strong relationship between number of bystanders and speed of helping; the expected increased "diffusion of responsibility" with a greater number of bystanders was not obtained for groups of these sizes. That is, help is not less frequent or slower in coming from larger as compared to smaller groups of bystanders; what effect there is, is in the opposite direction.

5. The longer the emergency continues without help being offered (a) the less impact a model has on the helping behaviour of observers; (b) the more likely it is that individuals will leave the immediate area; that is, they appear to move purposively to another area in order to avoid the situation; (c) the more likely it is that observers will discuss the incident and its implications for their behaviour.

The authors are currently developing a model of response to emergency situations consistent with the previous findings. It is briefly presented here as a possible heuristic device. The model includes the following assumptions: Observation of an emergency creates an emotional arousal state in the bystander. This state will be differently interpreted in different situations (Schachter, 1964) as fear, disgust, sympathy, etc., and possibly a combination of these. This state of arousal is higher (a) the more one can empathize with the victim (i.e., the more one can see oneself in his situation; Stotland, 1966), (b) the closer one is to the emergency, and (c) the longer the state of emergency continues without the intervention of a helper. It can be reduced by one of a number of possible responses: (a) helping directly, (b) going to get help, (c) leaving the scene of the emergency, and (d) rejecting the victim as undeserving of help (Lerner & Simmons, 1966). The response that will be chosen is a function of a cost-reward matrix that includes costs associated with helping (e.g., effort, embarrassment, possible disgusting or distasteful experiences, possible physical harm, etc.), costs associated with not helping (mainly self-blame and perceived censure from others), rewards associated with helping (mainly praise from self, victim, and others), and rewards associated with not helping (mainly those stemming from continuation of other activities). Note that the major motivation implied in the model is not a ‘positive altruistic’ one, but rather a selfish desire to rid oneself of an unpleasant emotional state.

In terms of this model, the following after-the-fact interpretations can be made of the findings obtained:

1. The drunk is helped less because costs for helping are higher (greater disgust) and costs for not helping are lower (less self-blame and censure because he is in part responsible for his own victimization).

2. Women help less because costs for helping are higher in this situation (effort, mainly) and costs for not helping are lower (less censure from others; it is not her role).

3. Same-race helping, particularly of the drunk, can be explained by differential costs for not helping (less censure if one is of opposite race) and, with the drunk, differential costs for helping (more fear if of different race).

4. Diffusion of responsibility is not found on cane trials because costs for helping in general are low and costs for not helping are high (more self-blame because of possible severity of problem). That is, the suggestion is made that the diffusion of responsibility effect will increase as costs for helping increase and costs for not helping decrease. This interpretation is consistent with the well-known public incidents, in which possible bodily harm to a helper is almost always involved, and thus costs for helping are very high, and also with previous research done with nonvisible victims in which either (a) it was easy to assume someone had already helped and thus costs for not helping were reduced (Darley & Latane) or (b) it was possible to think that the emergency was minor, which also reduces the costs for not helping (Latane & Rodin).

5. All of the effects of time are also consistent with the model. The longer the emergency continues, the more likely it is that observers will be aroused and therefore will have chosen among the possible responses. Thus, (a) a late model will elicit less helping, since people have already reduced their arousal by one of the other methods; (b) unless arousal is reduced by other methods, people will leave more as time goes on, because arousal is still increasing; and (c) observers will discuss the incident in an attempt to reduce self-blame and arrive at the fourth resolution, namely a justification for not helping based on rejection of the victim.

Quite obviously, the model was derived from these data, along with data of other studies in the area. Needless to say, further work is being planned by the authors to test the implications of the model systematically.