

Information Architecture for the World Wide Web

Chapter 3. Organizing Information

The beginning of all understanding is classification.

--Hayden White

Our understanding of the world is largely determined by our ability to organize information. Where do you live? What do you do? Who are you? Our answers reveal the systems of classification that form the very foundations of our understanding. We live in towns within states within countries. We work in departments in companies in industries. We are parents, children, and siblings, each an integral part of a family tree.

We organize to understand, to explain, and to control. Our classification systems inherently reflect social and political perspectives and objectives. We live in the first world. They live in the third world. She is a freedom fighter. He is a terrorist. The way we organize, label, and relate information influences the way people comprehend that information.

As information architects, we organize information so that people can find the right answers to their questions. We strive to support casual browsing and directed searching. Our aim is to apply organization and labeling systems that make sense to users.

The Web provides us with a wonderfully flexible environment in which to organize. We can apply multiple organization systems to the same content and escape the physical limitations of the print world. So why are many large web sites so difficult to navigate? Why can't the people who design these sites make it easy to find information? These common questions focus attention on the very real challenge of organizing information.

3.1. Organizational Challenges

In recent years, increasing attention has been focused on the challenge of organizing information. Yet, this challenge is not new. People have struggled with the difficulties of information organization for centuries. The field of librarianship has been largely devoted to the task of organizing and providing access to information. So why all the fuss now?

Believe it or not, we're all becoming librarians. This quiet yet powerful revolution is driven by the decentralizing force of the global Internet. Not long ago, the responsibility for labeling, organizing, and providing access to information fell squarely in the laps of librarians. These librarians spoke in strange languages about Dewey Decimal Classification and the Anglo-American Cataloging Rules. They classified, cataloged, and helped us find the information we needed.

The Internet is forcing the responsibility for organizing information on more of us each day. How many corporate web sites exist today? How many personal home pages? What about tomorrow? As the Internet provides us all with the freedom to publish information, it quietly burdens us with the responsibility to organize that information.

As we struggle to meet that challenge, we unknowingly adopt the language of librarians. How should we label that content? Is there an existing classification system we can borrow? Who's going to catalog all of that information?

We're moving towards a world where tremendous numbers of people publish and organize their own information. As we do so, the challenges inherent in organizing that information become more recognized and more important. Let's explore some of the reasons why organizing information in useful ways is so difficult.

3.1.1. Ambiguity

Classification systems are built upon the foundation of language, and language is often ambiguous. That is, words are capable of being understood in two or more possible ways. Think about the word pitch. When you say pitch, what do I hear? There are actually more than 15 definitions, including:

  • A throw, fling, or toss.

  • A black, sticky substance used for waterproofing.

  • The rising and falling of the bow and stern of a ship in a rough sea.

  • A salesman's persuasive line of talk.

  • An element of sound determined by the frequency of vibration.

This ambiguity results in a shaky foundation for our classification systems. When we use words as labels for our categories, we run the risk that users will miss our meaning. This is a serious problem. See Chapter 5, "Labeling Systems", for more on this issue.

It gets worse. Not only do we need to agree on the labels and their definitions, we also need to agree on which documents to place in which categories. Consider the common tomato. According to Webster's dictionary, a tomato is a red or yellowish fruit with a juicy pulp, used as a vegetable: botanically it is a berry. Now I'm confused. Is it a fruit or a vegetable or a berry?[3]

[3] "The tomato is technically a berry and thus a fruit, despite an 1893 U.S. Supreme Court decision that declared it a vegetable. (John Nix, an importer of West Indies tomatoes, had brought suit to lift a 10 percent tariff, mandated by Congress, on imported vegetables. Nix argued that the tomato is a fruit. The Court held that since a tomato was consumed as a vegetable rather than as a dessert like fruit, it was a vegetable.)" "Best Bite of Summer" by Denise Grady, Self, July 1997, Vol. 19 (7), pp. 124-125.

If we have such problems classifying the common tomato, consider the challenges involved in classifying web site content. Classification is particularly difficult when you're organizing abstract concepts such as subjects, topics, or functions. For example, what is meant by alternative healing and should it be cataloged under philosophy or religion or health and medicine or all of the above? The organization of words and phrases, taking into account their inherent ambiguity, presents a very real and substantial challenge.

3.1.2. Heterogeneity

Heterogeneity refers to an object or collection of objects composed of unrelated or unlike parts. You might refer to grandma's homemade broth with its assortment of vegetables, meats, and other mysterious leftovers as heterogeneous. At the other end of the scale, homogeneous refers to something composed of similar or identical elements. For example, Oreo cookies are homogeneous. Every cookie looks and tastes the same.

An old-fashioned library card catalog is relatively homogeneous. It organizes and provides access to books. It does not provide access to chapters in books or collections of books. It may not provide access to magazines or videos. This homogeneity allows for a structured classification system. Each book has a record in the catalog. Each record contains the same fields: author, title, and subject. It is a high-level, single-medium system, and works fairly well.
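As a rough illustration of what that homogeneity buys you, the card catalog can be sketched as a list of identical records. This is only a sketch: the class name `CatalogRecord`, the helper `titles_by_subject`, and the two sample books are hypothetical, chosen for illustration, and only the three fields come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogRecord:
    """One card in the catalog. Every record carries the same three fields."""
    author: str
    title: str
    subject: str


# Hypothetical sample cards, for illustration only.
catalog = [
    CatalogRecord("Melville, Herman", "Moby-Dick", "Whaling"),
    CatalogRecord("Darwin, Charles", "On the Origin of Species", "Evolution"),
]


def titles_by_subject(records, subject):
    """Because every record has a subject field, one lookup serves the whole catalog."""
    return [r.title for r in records if r.subject == subject]


print(titles_by_subject(catalog, "Evolution"))
```

Because every item is a book described by the same fields, a single lookup routine is enough for the entire collection.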

Most web sites, on the other hand, are highly heterogeneous in two respects. First, web sites often provide access to documents and their components at varying levels of granularity. A web site might present articles and journals and journal databases side by side. Links might lead to pages, sections of pages, or to other web sites. Second, web sites typically provide access to documents in multiple formats. You might find financial news, product descriptions, employee home pages, image archives, and software files. Dynamic news content shares space with static human resources information. Textual information shares space with video, audio, and interactive applications. The web site is a great multimedia melting pot, where you are challenged to reconcile the cataloging of the broad and the detailed across many media.

The heterogeneous nature of web sites makes it difficult to impose highly structured organization systems on the content. It doesn't make sense to classify documents at varying levels of granularity side by side. An article and a magazine should be treated differently. Similarly, it may not make sense to handle varying formats the same way. Each format will have uniquely important characteristics. For example, we need to know certain things about images such as file format (GIF, TIFF, etc.) and resolution (640x480, 1024x768, etc.). It is difficult and often misguided to attempt a one-size-fits-all approach to the organization of heterogeneous web site content.
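One way to picture the point about format-specific characteristics is to give each kind of content its own record type instead of forcing everything into a single shape. The sketch below is an assumption made for illustration, not a prescribed design: the `Article` and `Image` types, their fields, and the `describe` helper are all hypothetical, and only the image attributes (file format and resolution) come from the text above.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """A text document: what matters is who wrote it."""
    title: str
    author: str


@dataclass
class Image:
    """An image: what matters is its file format and resolution."""
    title: str
    file_format: str  # e.g. "GIF" or "TIFF"
    width: int        # resolution in pixels, e.g. 640
    height: int       # e.g. 480


def describe(item):
    """Build a catalog line using the fields that matter for each format."""
    if isinstance(item, Image):
        return f"{item.title} ({item.file_format}, {item.width}x{item.height})"
    if isinstance(item, Article):
        return f"{item.title}, by {item.author}"
    raise TypeError(f"no cataloging rule for {type(item).__name__}")


print(describe(Image("Company logo", "GIF", 640, 480)))
print(describe(Article("Quarterly results", "J. Smith")))
```

Each new format needs its own rule, which is exactly why a one-size-fits-all record breaks down on a heterogeneous site.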

3.1.3. Differences in Perspectives

Have you ever tried to find a file on a coworker's desktop computer? Perhaps you had permission. Perhaps you were engaged in low-grade corporate espionage. In any case, you needed that file. In some cases, you may have found the file immediately. In others, you may have searched for hours. The ways people organize and name files and directories on their computers can be maddeningly illogical. When questioned, they will often claim that their organization system makes perfect sense. "But it's obvious! I put current proposals in the folder labeled /office/clients/red and old proposals in /office/clients/blue. I don't understand why you couldn't find them!"

The fact is that labeling and organization systems are intensely affected by their creators' perspectives. We see this at the corporate level with web sites organized according to internal divisions or org charts. In these web sites, we see groupings such as marketing, sales, customer support, human resources, and information systems. How does a customer visiting this web site know where to go for technical information about a product they just purchased? To design usable organization systems, we need to escape from our own mental models of content labeling and organization.

You must put yourself into the shoes of the intended user. How do they see the information? What types of labels would they use? This challenge is further complicated by the fact that web sites are designed for multiple users, and all users will have different perspectives or ways of understanding the information. Their levels of familiarity with your company and your web site will vary. For these reasons, it is impossible to create a perfect organization system. One site does not fit all! However, by recognizing the importance of perspective and striving to understand the intended audiences, you can do a better job of organizing information for public consumption than your coworker on his or her desktop computer.




Copyright © 2002 O'Reilly & Associates. All rights reserved.