Framing Attention in Japanese and American Comics: Cross-Cultural Differences in Attentional Structure

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.

Research on visual attention has shown that Americans tend to focus more on focal objects of a scene while Asians attend to the surrounding environment. The panels of comic books – the narrative frames in sequential images – highlight aspects of a scene comparably to how attention becomes focused on parts of a spatial array. Thus, we compared panels from American and Japanese comics to explore cross-cultural cognition beyond behavioral experimentation by looking at the expressive mediums produced by individuals from these cultures. This study compared the panels of two genres of American comics (Independent and Mainstream comics) with mainstream Japanese “manga” to examine how different cultures and genres direct attention through the framing of figures and scenes in comic panels. Both genres of American comics focused on whole scenes as much as individual characters, while Japanese manga individuated characters and parts of scenes. We argue that this framing of space in American and Japanese comic books simulates a viewer’s integration of a visual scene, and is consistent with the research showing cross-cultural differences in the direction of attention.

Introduction

Cross-cultural research shows that Asians and Americans differ in their direction of attention (Nisbett, 2003; Nisbett and Miyamoto, 2005). Beyond studying attention through perception, cognition can also be compared through cultural production (Morling and Lamoreaux, 2008), as in artistic expression (Masuda et al., 2008). Comic books provide an ideal place to analyze the direction of attention, because panels act like windows onto a scene (Cohn, 2007). Thus, analysis of panels in Asian and American comics provides a place to look for cultural differences in cognition through creative expression.

Cross-cultural differences in attention have been consistent across numerous behavioral paradigms. After viewing video scenes, Americans mostly describe the salient objects, while Asians describe significantly more aspects of the surrounding context (Masuda and Nisbett, 2001). Americans also tend to notice changes to focal objects in animations that feature slight changes to a single scene, while Asians pick up on changes to the broader environment and relations between objects (Masuda and Nisbett, 2006). When recalling scenes where the background is changed from its original context, Americans are unaffected while Asians’ memory appears impaired (Masuda and Nisbett, 2001), and Americans’ eye movements fixate sooner and longer on focal objects, while Asians make more saccades to elements of the background (Chua et al., 2005). Additionally, when viewing photographs of objects, fMRI studies show that Americans have stronger activation than Asians in brain regions associated with the storing of semantic information about object properties (Gutchess et al., 2006). All of this work supports the view that Americans focus more on focal objects while Asians attend more to aspects of environments and relationships.

Research has also suggested that preferences for attention permeate into artistic representations. Masuda et al. (2008) looked at a corpus of artwork, and found that “Western” paintings emphasized the focal objects and figures, while Asian paintings emphasized the broader context and environment. This trend was reinforced in drawings and photographs of figures and scenes produced by individuals from these cultures. Thus, these cognitive preferences for attention extend into artistic expression, and other contemporary media produced by these cultures might be expected to show further evidence of these trends.

Comic books are an ideal place to examine the focus of attention in artistic expression. Because comic panels act as a window on a visual story, they can simulate a “spotlight of attention” for a reader’s perception of a fictitious scene (a similar argument for film shots is made by Levin and Simons, 2000). Importantly, unlike the isolated images of photos and drawings, comic panels are meant to be read (and are created) in a sequence. Individual photos often include the whole field of vision of a single scene, and attention can be directed to parts of this scene in different ways. In contrast, sequential images serve as a window on unfolding events, with panels potentially simulating the way that attention might be directed on a scene. Indeed, recognition that the panel serves as a window appears to require some degree of exposure and practice: children’s drawings from Japan (where comics and visual representations are highly prevalent) use this windowing quality of panels to occlude parts of images far more often than those from Egypt, which has less prevalent cultural pictorial representation (Wilson and Wilson, 1987).

With this view in mind, Cohn (2007) described comic panels as “attention units” that highlight parts of a scene in different ways. Within a sequence of images, a scene may have two types of meaningful elements: Active entities are those that repeat across panels by engaging in the actions and events of the sequence, while inactive entities are elements of the background. Panels can be categorized according to the ways that they depict these meaningful elements (see Figure A):

Macro – depict multiple active entities
Mono – depict single active entities
Micro – depict less than one active entity (as in a close up)
Amorphic – depict no active entities (i.e., only inactive entities)
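The four categories above can be read as a simple decision rule over the number of active entities in a panel. The following is a minimal illustrative sketch, not the coding procedure used in the study: the `Panel` record, its field names, and the `classify` function are hypothetical, and it assumes a coder has already decided how many active entities a panel shows and whether a single entity is shown only in part.

```python
from dataclasses import dataclass
from enum import Enum


class PanelType(Enum):
    MACRO = "Macro"        # multiple active entities
    MONO = "Mono"          # a single active entity
    MICRO = "Micro"        # less than one active entity (e.g., a close up)
    AMORPHIC = "Amorphic"  # no active entities, only inactive ones


@dataclass
class Panel:
    # Hypothetical coding record for one panel (names are assumptions).
    active_entities: int        # active entities shown, whole or in part
    partial_only: bool = False  # True if a lone entity is shown only in part


def classify(panel: Panel) -> PanelType:
    """Assign an attentional category from the coded entity counts."""
    if panel.active_entities == 0:
        return PanelType.AMORPHIC
    if panel.active_entities >= 2:
        return PanelType.MACRO
    # Exactly one active entity: whole (Mono) or only a part of it (Micro).
    return PanelType.MICRO if panel.partial_only else PanelType.MONO


if __name__ == "__main__":
    for example in (Panel(3), Panel(1), Panel(1, partial_only=True), Panel(0)):
        print(example, "->", classify(example).value)
```

In this sketch the category depends only on how much active information is windowed, not on how the content is drawn or framed.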

These categories are distinguished by the amount of information they contain, which decreases successively: Macros contain more active information than Monos, which show more than Micros, which in turn show more than Amorphic panels. These ways of highlighting attention are similar to types of film shots, though ultimately they differ in important ways. Thus, it is worth addressing these differences.

Film tradition has developed various conventional ways to frame figures and scenes based on what is being shown (Arijon, 1976; Bordwell and Thompson, 1997; Brown, 2002). There are many variations on the ways to frame figures and scenes; however, the main categories can be broadly defined as:

Long shot – figures are prominent in the frame, but the background dominates
Full shot – frames all of an entity or object (for example, a whole person or a whole car)
Medium shot – frames less than a whole entity, object, or scene (for example, when depicting a single person, Medium shots show the body from the knees or waist up)
Close shot – frames slightly more than a particular part of an entity or object, though less than a Medium shot (as in a person’s torso and up)
Close up – zoom in on a particular part of an entity or object (as in a person’s head or closer)

These divisions create framing for various aspects of scenes and people, as depicted in Figure B. Unlike the attentional categories, filmic shots frame the presentation of objects, as opposed to dividing the amount of information shown. In essence, attentional categories outline the framing of meaningful elements of a scene, while film shots describe the presentation of those meaningful elements. For example, a Mono panel shows only one character, as in the Gunman in Figure A. However, that character can be presented in various ways, including Full, Medium, and Close shots, as in Figure B. These are all ways in which to present the same meaning.

Nevertheless, the overlap between attentional categories and film shots should immediately be apparent, and prototypical correspondences may exist between them. For example, a Macro may typically involve a Long shot to capture the most information possible, as in the Long shot in Figure B, but it could tighten on just the specific multiple characters involved in the action, like the Macro in Figure A. This would be a Medium shot. Also, a panel showing only the hands of individuals exchanging a piece of paper would be a Macro that uses a Close up shot, because it involves multiple characters. Along these lines, Close ups may prototypically be Micros, but this varies based on how much information they window. By contrast, Amorphics have no equivalent category in film shots, since they show a non-active element of the narrative, which can be framed in any number of ways.

With the growing influx of Japanese manga (“comics”) into the United States over the past several decades (Goldberg, 2010; Wong, 2010), much comparison has been made between the techniques of Japanese and American authors (McCloud, 1993, 1996; Rommens, 2000; Cohn, 2010, 2011). Japanese manga come from a different cultural context than that of American comics. While comics in the USA have historically appealed to a particular subculture, manga in Japan are treated much the same as movies, television, or textual books. Manga are widely read by all ages, have many genres, and, in fact, are so popular that they constitute nearly one-third of all printed material (Schodt, 1983, 1996; Gravett, 2004). Though Japanese manga were influenced by American authors early in their historical development (Gravett, 2004), they developed largely in isolation over the past 60 years. With increased importation of manga into America starting in the 1980s, the differences between narrative techniques that emerged from these separate traditions have become quite salient to readers, authors, and scholars of comics in America.

In one of the first comparisons of American and Japanese comics, McCloud (1993) coded the semantic relationships between juxtaposed panels. He found that American authors primarily used transitions showing actions with clear temporal change, followed by shifts between characters (one character to another) and scenes (as in a shift from one whole spatial location to another). Manga similarly showed shifts in actions, characters, and scenes. However, unlike American books, manga also transitioned to different aspects within a scene, such as using panels to solely depict parts of the surrounding environment (i.e., Amorphic panels). The Amorphic panels give the sense of a “wandering eye” across the scene, and were introduced into manga from the influence of Japanese cinema in the 1950s (Shamoon, 2011). McCloud attributed the differences between cultures’ panels to an “artistic culture” of Japan that focused on “being there over getting there.” However, these findings are similar to the attention research: American comics focus on actions and figures while Japanese manga also include information about the surrounding environment.

McCloud (1993, 1996) has also proposed that manga allow a reader to take more of a “subjective” viewpoint on a story than American comics – meaning that manga use techniques that immerse the reader in the narrative as if it were perceived through their own viewpoint, instead of an omniscient “objective” perspective. Such a distinction is especially important if panels are thought to be units of attention, since that framing would then take on the “subjective” point of view of a reader that might differ between cultures. McCloud based his cross-cultural comparison on several factors. First, manga place a greater focus on environmental aspects in storytelling, which reflects a “wandering eye” across the scene. Second, manga use more “subjective” types of motion lines where a reader appears to move at the same pace as a moving object, thereby seeing the object as solid while the background is blurred (Figure B), as opposed to seeing it move in front of them, where the path itself becomes a blur (as a motion line), as is found in American comics (Figure A). Finally, manga were said to use more subjective viewpoints in panels, which show the viewpoint of a character in the narrative (Figure C). In order to test this broad claim directly, Cohn (2011) coded a corpus of comics and manga for this last type of subjectivity, where panels depict the viewpoint of a character in the narrative. More subjective panels were used in Japanese manga than in American comics. This provided evidence that manga do indeed use more subjective viewpoints, at least along one measurable dimension.

Cohn’s (2011) study also examined the attentional types of panels described above. Nearly 60% of American panels were Macros, with only 35% Monos and 5% Micros (Amorphics were not yet theorized as a category, and were likely mixed in with Monos and Micros). However, Japanese manga used almost as many Macros (47%) as Monos (43%), and more Micros (10%) than American comics. Because manga featured less than the whole scene in over half of all panels, this implies that Japanese authors are as interested in the component parts of a scene as in the whole scene. These results also suggest that the narrative structure of manga demands the inferential construction of whole scenes more than that of American comics (Cohn, 2010). These findings of more Micros in Japanese manga are also consistent with claims by Toku (2001, 2002) that manga influence Japanese children’s drawings. She found that Japanese children draw far more variable viewpoints than American children, particularly “exaggerated” close ups.
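The claim that manga show less than the whole scene in over half of their panels follows from adding the Mono and Micro shares. A minimal sketch of that arithmetic is given below, using the rounded percentages quoted above; treating “nearly 60%” as exactly 60 is an assumption made only for illustration, and the names `reported` and `partial_scene_share` are hypothetical.

```python
# Rounded percentages as quoted in the text for Cohn (2011).
# Treating "nearly 60%" as 60 is an assumption for illustration.
reported = {
    "American comics": {"Macro": 60, "Mono": 35, "Micro": 5},
    "Japanese manga": {"Macro": 47, "Mono": 43, "Micro": 10},
}


def partial_scene_share(distribution):
    """Percentage of panels showing less than the whole scene (Monos plus Micros)."""
    return distribution["Mono"] + distribution["Micro"]


if __name__ == "__main__":
    for group, distribution in reported.items():
        print(f"{group}: {partial_scene_share(distribution)}% of panels are Monos or Micros")
```

With these rounded figures, the combined share is 40% for American comics and 53% for Japanese manga.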

These studies suggest a difference overall between panels in American comics and Japanese manga that could be construed as reflecting the differences in cross-cultural windowing of attention. As in attention, readers track only the most important aspects of a sequence to establish the continuity of the narrative. Non-relevant information may then go unattended by the “spotlight of attention” across panels, as happens in change blindness paradigms (Levin and Simons, 2000). There are thus two strategies comic authors can use when creating a comic. One option is to show a full scene (Macro) and rely on the reader’s attentional intuitions to discern the most important parts. In this “objective” method, a reader’s personal spotlight of attention selects the relevant information in a scene (Figure A). Alternatively, authors can use panels to highlight salient parts directly, omitting what is unimportant altogether. This use of panels would heighten panels’ ability to depict a “subjective viewpoint,” since the panels would become the spotlight of attention to focus only on important parts of a scene (Figure B).

The previous research suggests that American comics more consistently use the first option: authors provide an objective viewpoint on a scene, letting the reader direct their own attention across panels to find the most relevant aspects of continuity, while less important elements simply go unattended. This is suggested by the larger amounts of Macros found in American comics. In contrast, Japanese manga do more to simulate the perception of a reader’s attention, evident in greater use of Monos and Micros. This “subjective” strategy of Japanese manga is consistent with McCloud’s (1993, 1996) claim that manga allow a reader to take more of a subjective viewpoint on a story. It also is supported by previous corpus analysis showing that “subjective viewpoints” are more plentiful in Japanese manga than American comics (Cohn, 2011). Thus, rather than Japanese panels showing large scenes (Macros) that include aspects of the background (as in the study of art and photographs by Masuda et al., 2008), panels in manga directly depict these elements of a scene (Monos, Micros) because of the way that panels simulate the window of attention.

Thus, this previous research could support the idea that comic panels serve to simulate attention on a fictitious scene in ways that are consistent with cross-cultural differences in attention. However, an alternative possibility is that these differences merely arise because of separate narrative conventions between American and Japanese comic authors. Indeed, while these studies have shown that comic panels vary between cultures, panels may also differ within cultures. Obvious variability can be found in the diversity of American graphic styles compared to the far more uniform drawing style in manga. Graphic styles are particularly pronounced between genres, such as between the more “serious” Independent graphic novels, which range from more straightforward and realistic styles to cartoony styles, and mainstream comics, which have the bombastic style of muscular heroic figures (Duncan and Smith, 2009). Styles in genres of Japanese comics also vary (Schodt, 1983, 1996; Gravett, 2004), but primarily conform to the stereotypical style of big eyes, pointy chins and noses, and big hair. The diverse styles used in American comics have been likened to types of “dialects,” compared with “accents” in manga genres, which feature variations on a common schema (Cohn, 2010).

Some research suggests that variation between genres extends to the level of panels, and can thereby inform about the framing of attention. In an early study, Neff (1977) found that panels from various genres of American comics use film shots differently. Wide shots (Long and Medium) far outnumbered Close shots (Close and Close ups) in panels for all genres. However, there were far fewer Close shots in Adventure and Romance comic panels than in Mystery and Alien Beings comics. These findings imply that various genres of American books do highlight diverse aspects of a visual scene. However, the sample size in this study was somewhat limited in scope – only two pamphlet-sized comics were analyzed per genre – making the results hard to generalize.

If differences between panels in American comics and Japanese manga are a reflection of cross-cultural differences in attention, these trends should transcend differences between genres within those cultures. Indeed, Cohn’s (2011) study mixed together various genres within the overall samples of American comics and Japanese manga. Thus, the present study sought to examine comic panels both within and between cultures by comparing the panels of “mainstream” Japanese manga with the two major genres of American comics: Mainstream and Independent (“Indy”) books. Mainstream books from both the United States and Japan were chosen because they are the most popular and most stereotypical instances of their respective comic cultures. American Indy books were chosen because they represent a different artistic movement in the USA that contrasts with the Mainstream genre (discussed below). Thus, if variation occurs between the structures of comics from within the United States, we may expect it between Mainstream and Indy comics.

If the differences between American comics and Japanese manga are merely an artifact of different narrative conventions, we would expect that Indy books would also have their own unique conventions of storytelling that distinguish them from Mainstream American comics. Under this view, we predicted that attentional categories from all three groups would differ. Because no previous studies have examined Indy comics in this way, it is difficult to predict how their trends in attentional categories may differ from those of Mainstream comics.

Nevertheless, if American genres show no variation, yet are both different from Japanese manga, it would imply cultural differences beyond the contexts of genre. To this end, we would predict the results to replicate the study by Cohn (2011) that showed a greater focus on whole scenes (Macros) by panels in American comics, and a greater focus on parts of scenes (Monos, Micros) by Japanese manga. Furthermore, consistent with McCloud’s (1993) findings that manga focus more on surrounding aspects of a scene, we would expect Japanese manga to use more Amorphic panels than American comics. Such results would provide additional support for the view that the attentional windowing in comic panels reflects broader trends in cross-cultural cognition in attention.