By Ian Lamont
February 28, 2007
The following is an expanded essay based on the author's original presentation at the Media Grid Boston Siggraph Summit - July 2006
Over the past 15 years, the world has experienced a rapid and profound period of mass media evolution. Digital technologies ranging from computer animation to the World Wide Web have transformed the way billions across the globe are entertained and informed. While many people and organizations are still adjusting to this new media environment, the transformation has really only just begun.
This essay will argue that a second wave of media evolution is about to wash over the globe. This second wave will unfold over the next 15 years, and will be driven by four factors: A generation of young consumers weaned on the Internet and sophisticated video games; content-creation software based on three-dimensional gaming technologies; a new many-to-many mass media paradigm that emphasizes user-generated contributions and personalization options; and sophisticated consumer devices, Internet applications, and Internet infrastructure capable of collecting and distributing rich media content to billions of network nodes.
The Media Divide

Several observers have described the generational shift that is impacting media creation and consumption trends. In 2005, Rupert Murdoch, chairman and CEO of News Corp., described the challenges faced by "digital immigrants" — people who have grown up in a traditional media environment, and are now adapting to a networked media world.
Many, perhaps most, readers of this essay will identify with this group. Not only are we digital immigrants, we are also media dinosaurs. We enjoy thumbing through glossy magazines, and maybe still subscribe to a daily newspaper. We schedule at least one evening per week around a favorite TV program, created by one of the major television or cable networks. We can name at least one local or national news anchor. And scattered around our homes and offices are veritable graveyards of physical media — old tapes, vinyl records, floppy disks, and magazines — that we insist on keeping, even though we'll probably never use them again.
For another segment of the population, however, physical media play an increasingly marginal role in day-to-day life. This group includes the children, teenagers, and young adults whose early years were spent using computers, video games, and, more recently, the Internet.
The video game generation includes hundreds of millions of people worldwide who have grown up with video games from childhood. This generation is familiar with gaming conventions relating to movement, exploration, cooperation, competition, and communication. Additionally, interaction with video games from an early age has created a foundation of familiarity and interest in computing technologies. This, in turn, has helped usher in successive waves of computer-related revolutions.
The older members of the video game generation — men and women now in their late 30s — cut their computing teeth on simple arcade games like Space Invaders, Pong, and Pac-Man, and inexpensive game consoles for home use, such as the Atari 2600. While their parents were often wary of "home computers," the first wave of the video game generation embraced the TRS-80, Apple II, Commodore VIC-20, and early PCs with gusto, helping kick-start the personal computing revolution.
A slightly younger segment of the video game generation, consisting of adults now in their early 30s, grew up with more advanced arcade, console, and PC games in the 1980s. As college students in the early 1990s, they helped propel the World Wide Web into mainstream use.
The younger members of the video game generation — those born in the '90s and later — deserve special attention. This group has grown up with the Internet, not to mention video games that incorporate 3D imagery and multiplayer functionality over local networks and the Internet. These youths have proved to be sophisticated users of computing technology, and have also developed a different set of habits, expectations, and preferences when it comes to consuming media. For this group, physical media are increasingly irrelevant. Browsing for CDs at a store, or subscribing to a monthly magazine, seems strange to them — why not download or share music using the Internet, or browse the Internet for the information and interactions they need?
While television is still a major part of the lives of the video game generation, loyalty to television networks is not. According to a survey published by Bolt Media in 2006, one in three people under the age of 34 can't name any of the big four terrestrial TV networks. And no wonder — this is a generation that has grown up with cable television; for the younger members of the video game generation, having hundreds of specialty channels to choose from is the norm. In this type of television landscape, it is easy for the networks — and network programming — to get lost in the mix.
The video game generation has different preferences when it comes to television programming, too. For instance, televised news is attracting fewer young viewers. A 2006 study by the Pew Internet and American Life Project found that just over half of people under the age of 36 watched local television news on a typical day, compared to approximately 65% for both the 36-49 and 50+ age groups. National television news programming had fewer younger viewers: Less than 40% for the under 36 group, compared to about half of the 36-49 age group, and over 60% of those 50 and up.
3D Modding, 3D Media

As members of the video game generation grow older, they may become more interested in televised news. But what type of news will they want to see, and how will they want to have it presented, 15 years from now? Probably not in the way it is delivered today by most television stations and networks — anchors and reporters delivering news that they and their editors decide is important. Text-based news, whether printed on paper, or delivered via the Internet, may also be a hard sell with the video game generation. However, we may see this group embrace news — and, for that matter, other types of programming — delivered via cutting-edge technologies now coming out of the gaming and Internet Petri dishes.
Let's put aside news for a moment, and examine a new form of entertainment programming based on "modding." This is a video game technology that exploded in the 1990s as a way for players of 3D game titles such as Doom and Quake to modify characters and environments to suit personal preferences, or make gameplay more exciting. Players could create custom weapons, monsters, or game levels, and share them with other players via the 'Net. Game studios and amateur programmers developed the modding tools, and made them available for download on official game websites and fan forums.
In recent years, modding has expanded beyond the gaming community to various entertainment media. For instance, MTV's Video Mods and Yahoo's Artist Mods have adapted 3D modding to a programming genre that is already popular with the video game generation: music videos. The producer of both programs, Big Bear Entertainment, takes songs by popular artists, and using professional 3D modding tools, creates fantastic videos based on popular gaming characters and environments. A heavy metal band might be represented by demons and skeletons playing on a spooky mountaintop. A rapper can have monsters bopping along to the beat, in a foreboding cityscape that would be far too expensive to replicate in real life.
Machinima, short for "machine cinema" or "machine animation", is similar in concept to the Video Mods formula. Instead of music, creators use 3D modding tools to make, among other things, dramas and simulations of historical and current events. There is another significant difference between machinima and Video Mods: Whereas a professional design studio and major entertainment companies are behind the Video Mods/Artist Mods initiatives, machinima tends to be a grassroots effort — the artists are mostly amateurs, many of them working with off-the-shelf modding software and freeware.
For instance, The French Democracy, available from Machinima.com, has a plot that gives an unflinching look at the causes of the French riots of 2005. A 20-something French designer named Alex Chan produced the machinima in late 2005. Machinima are often made using standard video game engines such as Quake and Half-Life 2, but Chan used a $70 Activision/Lionhead game title called "The Movies." The animation for The French Democracy may appear crude when compared with digital animation created by Hollywood studios, and the audience may only number in the tens of thousands worldwide, but it is only a matter of time before machinima gain a wider following, as the tools and storytelling devices improve, and more members of the video game generation see machinima as a legitimate entertainment genre.
Machinima has an additional selling point with the video game generation, thanks to the flexibility that modding and other digital technologies provide: personalization. Imagine dramas in which you choose the actors' "skins," the locations, and even the languages the actors are speaking. Why should Brad Pitt be the star, when you prefer Kiefer Sutherland, or a relatively unknown talent? Why not feature a virtual "actor" that you design on your own? What about fine-tuning plot elements, such as the levels of violence or romance? Or having the drama take place in Tokyo, or Toledo?
Corporate media emphasizes centralized control of the medium and the message. Modding and machinima go against this ideal — what N.Y. or L.A. media titan would want a Days Of Our Lives clone set in Japan or the Midwest, featuring characters that the user chooses? On the other hand, there are also opportunities for media companies to participate in content creation, leverage existing brands, and make money in new ways. Consider how a Hollywood talent agency or movie studio might syndicate actors' "skins", voices, or characters. There are also numerous marketing opportunities, using targeted advertising and product placement. These types of activities are already taking place in Second Life, an online world where participants can customize everything from their own 3D "avatars" to houses, shops, and personal articles. Second Life citizens can interact with each other in a variety of interesting, and sometimes profitable, ways — Second Life features a virtual economy and currency that lets them buy and sell custom vehicles, clothing, real estate, branding opportunities, and even virtual sex.
Modding technologies can also be applied to televised news. Forget Katie Couric — why not have Lara Croft reading the news? Or Abraham Lincoln? Or a photorealistic, 3D simulation of your dad, your mom, or even yourself? There have already been several experiments involving 3D news mods. Recently, graduate students at Northwestern University created an application called "News At Seven," which incorporates modding and XML technologies to create semi-automated newscasts featuring a 3D anchor. The anchor, a female character from the Half-Life 2 game engine, reads scripts gathered from blog posts, news RSS feeds, and footage scraped from the 'Net to present a simple newscast. Admittedly, the format feels strange — the video game character has limited expressions, and speaks with a mechanical, computer-generated voice. Eventually the quality may improve to a point where it is a viable alternative to live anchors. This will require major improvements in 3D and voice-synthesis technologies, which will in turn depend on advances in processing power and network connectivity. When personalization options are thrown into the mix — changing the appearance of the anchors, and the types of news that are presented — live news anchors may find themselves out of a job. The idea of paying Barbie or Ken a lot of money to sit in front of a camera at 7 am, 6 pm, and 11 pm to simply read scripts from teleprompters will seem very old-fashioned to users, and very expensive to media companies.
The New Gatekeepers

Demographic trends and sophisticated computer technologies will help usher in the second wave of media evolution. But the previous examples point to another important factor: User-created content. Unlike most broadcast and cable programming, which is planned and created by corporations and broadcast professionals, modding and machinima have been grassroots phenomena, driven by enthusiastic users, inexpensive and free software tools, and the Internet.
User-generated content is transforming the media industry in many ways, besides 3D modding. Witness the popularity of discussion forums, blogs, YouTube, and MySpace, which let users post and share text, photos, and video. The popularity of such sites are undermining traditional, top-down media models, in which a variety of gatekeepers — journalists, producers, editors, advertising firms, entertainment executives, etc. — determine what people should know, see, hear, and enjoy, and even when and how they receive media content. Traditional models have governed media programming for decades, and have led to the rise of billion-dollar media conglomerates.
Developments that have taken place during the first wave of new media evolution have given many media companies pause for thought. Besides a slew of new upstart competitors to contend with, the once-docile audience has begun to exhibit behaviors that challenge traditional media models. People, especially younger people, are turning away from traditional media content. They spend more and more time surfing the Internet or playing videogames. Some prefer watching home-made videos and old advertisements on YouTube to "real" television shows. Even when users do watch television shows, devices such as TiVo and the iPod let them enjoy programming when they want, and how they want. The audience has also challenged the media establishment by being active creators of content that increasingly competes with traditional mass media products. Machinima is currently a blip on their radar screens, but other user-created content such as blogs, Wikipedia, and YouTube videos have become overnight success stories in the past few years. These sites are addictive, and impossible for traditional media companies to ignore.
Old media companies will be further challenged in the next 15 years, as a new wave of user-generated content washes over the Internet, thanks to the increasing availability and affordability of portable, digital-based electronic devices. The cameraphones that seemed like such novelties just a few years ago will be in everyone's purse and pocket a few years from now. Get used to the idea of a significant portion of the population walking around with high-speed Internet connections on their person, with sophisticated video cameras built in. They will be shooting all kinds of events all the time. Crime. Crashes. Speeches. Sports. And the footage won't be the short, sanitized, and safe versions we usually see on television, courtesy of the old media gatekeepers. The user-generated pictures and video will be raw and real. It will be disturbing, yet illuminating. And it will be shared over the 'Net almost as it happens, and available for everyone to see. The cameraphone video of Saddam Hussein at the gallows is just the beginning.
How will this wave of user-generated photographs and video impact the news landscape? More importantly, how will this wave of content impact the public's understanding of the world around them? Let's consider a real-world event that was defined by news imagery: The Tiananmen Square demonstrations and their most enduring image, a man standing in front of a column of tanks. The still image and video clip of this scene were both taken by professional photojournalists. Imagine if the consumer and 'Net technology of today were available in 1989. Suppose just five percent of the tens of thousands of people in Tiananmen Square at that time had portable phones, digital cameras, and video cameras, and the content from 10 percent of those devices had been uploaded and spread via the 'Net? There wouldn't be just one iconic image of the events — a courageous, solitary figure defying the might of the People's Liberation Army. There would be dozens, hundreds, even thousands of images for the world to consider. And the government wouldn't just have to put out fires in Beijing and a few other big cities — there would be anger in practically every city and town in China where there are people with 'Net connections.
Not every mobile phone has video capabilities or 'Net connections, but we are already starting to see the impact of these devices. The December 2004 Indian Ocean Tsunami was a watershed event in this respect. For the first time, global understanding of a major news event was shaped by thousands of photographs and video clips taken by ordinary people — often footage taken by people on holiday pointing their lenses out of hotel windows. In the coming years, we will see more and more of this uncensored and immediate user-generated imagery popping into the public eye after major and minor news events, and shaping public opinion.
Nodes on the 'Net

The coming flood of digital video and pictures will supplant the text-heavy formula that currently defines most information-oriented websites. The ancient practice of reading dark letters on a white background is still standard behavior for visitors to news portals, review sites, forums, and blogs. Text is also how users find news and information, through the use of links and menus, and by entering terms into search engines. As more video, photographs, and 3D content are uploaded to the 'Net, a new set of technologies will be needed to help manage and sort the incoming content, and help users find what they want in the heaving sea of information and rich media content. The underlying Internet infrastructure will require a massive upgrade, in order to handle exponential increases in the amount of rich content being uploaded and distributed via the World Wide Web, and support a range of media applications that users will employ to keep themselves entertained and informed.
One of the building blocks of ongoing media transformation has been Extensible Markup Language (XML). This language structures data in a way that helps applications process and display information. An application of XML known as RSS ("Really Simple Syndication" or "Rich Site Summary") is already used by millions of consumers to access news articles, blog entries, shopping deals, and even audio and video programs. RSS feeds can be accessed from cascading menus on a web browser, or via software programs that let users organize and display content in whatever way they choose. They can also be used to personalize and automatically present data. Google's aggregator for news articles, Google News, provides RSS feeds for search results that can be plugged into your browser for easy access to new information on the Web. Let's say you are interested in keeping tabs on the price of real estate in your city. You can search for "real estate" and the name of your city in Google News, and then bookmark the RSS feed; accessing the bookmark every few days will display new headlines from a variety of sources relating to these search terms.
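To make the mechanics concrete, here is a minimal sketch of how an application might pull headlines out of an RSS 2.0 feed. The feed content, URLs, and story titles below are invented for illustration — they are not from any real Google News result:

```python
import xml.etree.ElementTree as ET

# A hypothetical RSS 2.0 feed, like one a "real estate" search might return.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Real estate news</title>
    <item>
      <title>Home prices dip in Boston</title>
      <link>http://example.com/story1</link>
      <pubDate>Sun, 11 Feb 2007 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Condo market heats up</title>
      <link>http://example.com/story2</link>
      <pubDate>Mon, 12 Feb 2007 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

def headlines(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

for title, link in headlines(FEED):
    print(title, "->", link)
```

Because RSS is just structured XML, the same handful of lines works whether the feed carries news headlines, blog entries, or pointers to audio and video enclosures — which is precisely why it has become such a versatile building block.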
In the future, personalization technologies based on XML will be instrumental in helping individual users find and pick the information they need. But powerful devices, databases, servers, and network connections will also be needed to categorize, store, and distribute the constant gush of new video, pictures, and 3D content. XML will also be able to help on the back-end, to let systems more efficiently communicate with each other and deliver content. Another key factor will be data identifiers assigned to each piece of uploaded content that describe and categorize it so users can more easily find it.
Some metadata are already automatically created for new pieces of digital content. For instance, a Web page will usually have metadata that identifies the software application that generated it (such as Dreamweaver, or Microsoft Word). A digital photograph will contain metadata identifying the camera model, size in pixels and bytes, and timestamp (e.g., Fujifilm S7000, 1200x800 px, 2.1 MB, 11:03 am, February 11, 2007). But people have to manually create other metadata or text "tags" that describe the appearance or subject matter more fully for other computer users and search engines. This human-created information might be the metadata keywords listing the main topics of a news webpage ("fire, apartment, 24 Pleasant Street, Boston"), or the text tags that identify the themes of an uploaded photo on Flickr ("Mt. Washington, June, clouds, nature").
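The two layers of metadata described above — machine-generated fields embedded at capture time, and human-added descriptive tags — can be sketched as a simple record. The field names and layout here are purely illustrative, not any real file format:

```python
# A hypothetical record for one uploaded photo, combining metadata the
# camera embedded automatically with tags a person added later.
photo = {
    "machine": {
        "camera": "Fujifilm S7000",
        "pixels": (1200, 800),
        "bytes": 2_100_000,
        "timestamp": "2007-02-11T11:03:00",
    },
    "human_tags": ["Mt. Washington", "June", "clouds", "nature"],
}

def describe(p):
    """One-line summary drawing on both layers of metadata."""
    m = p["machine"]
    return "%s, %dx%d px, tags: %s" % (
        m["camera"], m["pixels"][0], m["pixels"][1],
        ", ".join(p["human_tags"]))

print(describe(photo))
```

The machine layer is cheap and automatic but says nothing about subject matter; the human layer captures meaning but requires effort — which is exactly the gap the autotagging technologies discussed next must close.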
In the second wave of new media evolution, content creators and other 'Net users will not be able to manually tag the billions of new images and video clips uploaded to the 'Net. New hardware and software technologies will need to automatically apply descriptive metadata and tags at the point of creation, or after the content is uploaded to the 'Net. For instance, GPS-enabled cameras that embed spatial metadata in digital images and video will help users find address- and time-specific content, once the content is made available on the 'Net. A user may instruct his news-fetching application to display all public photographs on the 'Net taken between 12 am and 12:01 am on January 1, 2017, in a one-block radius of Times Square, to get an idea of what the 2017 New Year's celebrations were like in that area. Manufacturers have already designed and brought to market cameras with GPS capabilities, but few people own them, and there are no news applications on the 'Net that can process and leverage location metadata — yet.
Other types of descriptive tags may be applied after the content is uploaded to the 'Net, depending on the objects or scenes that appear in user-submitted video, photographs, or 3D simulations. Two Penn State researchers, Jia Li and James Wang, have developed software that performs limited auto-tagging of digital photographs through the Automatic Linguistic Indexing of Pictures project. In the years to come, autotagging technology will be developed to the point where powerful back-end processing resources will categorize massive amounts of user-generated content as it is uploaded to the 'Net. Programming logic might tag a video clip as "violence," "car," "Matt Damon," or all three. Using the New Year's example above, a reader may instruct his news-fetching application to narrow down the collection of Times Square photographs and video to display only those autotagged items that include people wearing party hats.
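The Times Square query imagined above boils down to filtering uploaded content on three axes: embedded GPS position, capture time, and back-end autotags. A sketch of such a news-fetching filter follows; the record layout, coordinates, tag names, and distance formula are all assumptions made for illustration:

```python
import math
from datetime import datetime

# Hypothetical uploaded-photo records: GPS position embedded at capture,
# plus tags applied later by an autotagging back end.
PHOTOS = [
    {"lat": 40.7580, "lon": -73.9855,
     "taken": datetime(2017, 1, 1, 0, 0, 30),
     "autotags": {"crowd", "party hats", "confetti"}},
    {"lat": 40.7580, "lon": -73.9855,
     "taken": datetime(2017, 1, 1, 0, 0, 45),
     "autotags": {"crowd", "fireworks"}},
    {"lat": 40.7128, "lon": -74.0060,   # downtown, well outside the radius
     "taken": datetime(2017, 1, 1, 0, 0, 10),
     "autotags": {"party hats"}},
]

TIMES_SQUARE = (40.7580, -73.9855)

def meters_apart(lat1, lon1, lat2, lon2):
    """Approximate distance via equirectangular projection (fine at city scale)."""
    dx = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    dy = math.radians(lat2 - lat1)
    return math.hypot(dx, dy) * 6_371_000  # mean Earth radius in meters

def find(photos, center, radius_m, start, end, required_tag=None):
    """Photos within radius_m of center, taken in [start, end), optionally tagged."""
    hits = []
    for p in photos:
        if not (start <= p["taken"] < end):
            continue
        if meters_apart(p["lat"], p["lon"], center[0], center[1]) > radius_m:
            continue
        if required_tag and required_tag not in p["autotags"]:
            continue
        hits.append(p)
    return hits

window = (datetime(2017, 1, 1, 0, 0), datetime(2017, 1, 1, 0, 1))
nearby = find(PHOTOS, TIMES_SQUARE, 200, *window)
hats = find(PHOTOS, TIMES_SQUARE, 200, *window, required_tag="party hats")
print(len(nearby), len(hats))  # the downtown photo is filtered out both times
```

The point of the sketch is the layering: the spatial and temporal filters rely only on metadata the device embeds automatically, while the "party hats" refinement depends on autotags that must be computed on the back end — which is why the essay treats autotagging as the harder, still-unsolved half of the problem.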
Another future technology that will enable better news and information-gathering applications is computerized, real-time interpretation of foreign-language content. Reliable, high-quality software translation has been the holy grail of computer scientists for decades, but products have generally been hard to use, unreliable, or ill-suited to specialist fields. In ten years many of the kinks will have been worked out, and text-based translation software that utilizes massive computing grids can be applied to spoken languages. The output may be text, subtitles, or overdubs, and it will enable users to easily access foreign news and information sources. This will have a profound impact on the public's understanding of world events and societies, as traditional gatekeepers — journalists, diplomats, "experts", and other people who are able to understand or translate international events — will not be the only sources available. For instance, if war engulfs the Middle East 15 years from now, you will not have to depend solely on English-language reports or accounts. You will be able to understand Arabic, Farsi, and Hebrew sources, or Japanese views of the situation, if you so choose.
At this point, you may be asking whether the Internet will be able to support the accompanying flood of rich media content and associated applications, not to mention billions of users with high-speed connections. Anyone who has attempted to access the 'Net after a major news event knows that too many users can bring a powerful website to its knees. In fact, the high processing, storage, and network demands associated with streaming video force websites that feature user-submitted video to limit the size of files or degrade video quality, in order to serve more viewers.
In the second wave of new media evolution, how will the Internet cope with billions of users simultaneously accessing video, 3D worlds, and sophisticated media applications at peak times? Where will the processing power for these types of applications come from? And, with petabytes of new video and rich media content being uploaded every second, where will the data be stored? Moore's Law suggests that processing technologies will advance significantly over the next fifteen years. Cutting-edge storage formats with massive capacity and high availability are also emerging from R&D labs. But will it be enough to scale to meet the demand created by the media applications and usage trends described above? Distributed computing technologies — Internet-based processing grids and peer-to-peer software — could be the answer, but a great deal of research, testing, and cooperation on a global scale would be required to make it work. The Media Grid and Open Grid Forum are two initiatives that are working on these issues, and such efforts will be key to realizing some of the advanced applications and usage patterns outlined in this essay.
The degree of change in media distribution and consumption patterns in the past 15 years — the first wave of new media evolution — has been amazing. It has been a period of experimentation and development for both media consumers and content creators. In the next 15 years, the pace of change will be astounding, as the second wave of media evolution unfolds. The thought of watching personalized 3D dramas or newscasts based completely on user-generated content may seem alien now, but the demographic changes and technological advances now underway are paving the way for these and other types of new media experiences.
© 2007 Ian Lamont