CQR Big Data and Privacy - SAGE edge

Big Data and Privacy
Should the use of personal information be restricted?

Big data, the collection and analysis of enormous amounts of information by supercomputers, is leading to huge advances in such fields as astrophysics, medicine, social science, business and crime fighting. And big data is growing exponentially: According to IBM, 90 percent of the world’s data has been generated within just the past two years. But the use of big data (including Tweets, Facebook images and email addresses) is controversial because of its potential to erode individual privacy, especially by governments conducting surveillance operations and companies marketing products. Some civil liberties advocates want to control the use of big data, and others think companies should pay to use people’s online information. But some proponents of big data say the benefits outweigh the risks and that privacy is an outdated concept.

Cover photo: A refrigerator that can communicate with a homeowner’s smartphone is shown at the 2013 International Consumer Electronics Show in Las Vegas. Big data experts say the amount of information processed and analyzed will increase exponentially in the future. Huge amounts of data will be generated by what many are calling the “Internet of Things”: the online linking of sensors installed on more and more inanimate objects, such as home appliances.

CQ Researcher • Oct. 25, 2013 • www.cqresearcher.com
Volume 23, Number 38 • Pages 909-932

RECIPIENT OF SOCIETY OF PROFESSIONAL JOURNALISTS AWARD FOR EXCELLENCE • AMERICAN BAR ASSOCIATION SILVER GAVEL AWARD

INSIDE THIS REPORT

THE ISSUES 911
BACKGROUND 918
CHRONOLOGY 919
CURRENT SITUATION 924
AT ISSUE 925
OUTLOOK 926
BIBLIOGRAPHY 930
THE NEXT STEP 931

Published by CQ Press, an Imprint of SAGE Publications, Inc. www.cqresearcher.com



THE ISSUES

911
• Do big data’s benefits outweigh the risks?
• Is the use of big data speeding the erosion of privacy?
• Should the federal government strengthen its regulation of big data?

BACKGROUND

918 Data Overload
Technologies emerged to address the “information explosion.”

921 Birth of Big Data
Sociologist Charles Tilly coined the term “big data” in 1980.

922 Data and Politics
Sen. Barack Obama’s 2008 presidential campaign succeeded in part because it used big data.

CURRENT SITUATION

924 Hunting Criminals
Law enforcement uses big data to track lawbreakers and terrorists.

924 ‘Reclaim Your Name’
Some companies oppose allowing Internet users to block use of their data.

OUTLOOK

926 ‘Internet of Things’
Billions of Internet-linked sensors in inanimate objects will generate huge amounts of data.

SIDEBARS AND GRAPHICS

912 Big Data Reveals Sources of Racist Tweets
Racist “hot spots” were often in rural areas.

913 Most Americans View Big Data Negatively
Fewer than 40 percent see it as “mostly positive.”

916 Health Care Providers, Employers Most Trusted to Protect Data
Political parties and the media are considered least trustworthy.

919 Chronology
Key events since 1967.

920 Using Big Data to Test Historical ‘Facts’
“Big data allows us to ask questions that were not possible before.”

922 From Yottabytes to Googolplexes: Big Data Explained
For some, it’s “writing zeroes until you get tired.”

925 At Issue:
Are new laws needed to prevent organizations from collecting online personal data?

FOR FURTHER RESEARCH

929 For More Information
Organizations to contact.

930 Bibliography
Selected sources used.

931 The Next Step
Additional articles.

931 Citing CQ Researcher
Sample bibliography formats.


Cover: Getty Images/David Becker

MANAGING EDITOR: Thomas J. [email protected]

ASSISTANT MANAGING EDITORS: Lyn Garrity, Kathy Koch

SENIOR CONTRIBUTING EDITOR: Thomas J. Colin

CONTRIBUTING WRITERS: Sarah Glazer, Peter Katel, Reed Karaim, Barbara Mantel, Tom Price, Jennifer Weeks

SENIOR PROJECT EDITOR: Olu B. Davis

FACT CHECKER: Michelle Harris

An Imprint of SAGE Publications, Inc.

VICE PRESIDENT AND EDITORIAL DIRECTOR, HIGHER EDUCATION GROUP: Michele Sordi

EXECUTIVE DIRECTOR, ONLINE LIBRARY AND REFERENCE PUBLISHING: Todd Baldwin

Copyright © 2013 CQ Press, an Imprint of SAGE Publications, Inc. SAGE reserves all copyright and other rights herein, unless previously specified in writing. No part of this publication may be reproduced electronically or otherwise, without prior written permission. Unauthorized reproduction or transmission of SAGE copyrighted material is a violation of federal law carrying civil fines of up to $100,000.

CQ Press is a registered trademark of Congressional Quarterly Inc.

CQ Researcher (ISSN 1056-2036) is printed on acid-free paper. Published weekly, except: (March wk. 5) (May wk. 4) (July wk. 1) (Aug. wks. 3, 4) (Nov. wk. 4) and (Dec. wks. 3, 4). Published by SAGE Publications, Inc., 2455 Teller Rd., Thousand Oaks, CA 91320. Annual full-service subscriptions start at $1,054. For pricing, call 1-800-818-7243. To purchase a CQ Researcher report in print or electronic format (PDF), visit www.cqpress.com or call 866-427-7737. Single reports start at $15. Bulk purchase discounts and electronic-rights licensing are also available. Periodicals postage paid at Thousand Oaks, California, and at additional mailing offices. POSTMASTER: Send address changes to CQ Researcher, 2300 N St., N.W., Suite 800, Washington, DC 20037.


Big Data and Privacy

BY TOM PRICE

THE ISSUES

When Peter Higgs and François Englert won the Nobel Prize for physics this month, they were honored for a theory that they published nearly a half-century ago but that was not confirmed until March.

Higgs, of Scotland’s University of Edinburgh, and Englert, of Belgium’s Université Libre de Bruxelles (Free University of Brussels), had independently theorized that matter obtains mass from an unknown energy field that permeates the universe. Higgs suggested it is composed of an undiscovered subatomic particle, which became known as the Higgs boson. To confirm the theory, scientists needed to find that particle.

They finally succeeded largely with the help of so-called big data: the collection and analysis of enormous amounts of information by supercomputers, often in real time. Physicists analyzed trillions of subatomic explosions produced at the European Organization for Nuclear Research’s Large Hadron Collider, a 17-mile circular underground tunnel that crosses the border between Switzerland and France. There, protons are fired at each other at nearly the speed of light, shattering them into other subatomic particles, including the Higgs. 1

To analyze the results of those collisions, scientists needed computing capability that was “of larger scale and faster than ever before,” says Joe Incandela, a University of California, Santa Barbara, physics professor who led one of the two collider teams searching for the Higgs. The collider can generate up to 600 million collisions per second, and the teams’ servers can handle 10 gigabytes* of data a second. 2

* A gigabyte is 1 billion bytes. A byte is eight bits. A bit is one action of a computer switch. A byte commonly is equivalent to a single alphanumeric character.

The search for the Higgs boson is just one of a vast array of discoveries, innovations and uses made possible by the compilation and manipulation of big data. The explosively emerging field could radically advance science, medicine, social science, crime-fighting and corporate business practices. But big data is controversial because of its potential to erode individual privacy, especially in the wake of recent revelations that the National Security Agency (NSA) is collecting massive amounts of personal information about Americans and others around the world. Critics want privacy controls on big data’s use, while proponents say its benefits outweigh its risks.

Big data has led to cutting-edge medical discoveries and scientific breakthroughs that would have been impossible in the past: links between genetic traits and medical conditions; correlations among illnesses, their causes and potential cures; and the mapping of the human genome.

Big data also is a boon to businesses, which use it to conduct consumer marketing, figure out when machines will break down and reduce energy consumption, among other purposes. Governments mine big data to improve public services, fight crime and track down terrorists. And pollsters and political scientists use it to analyze billions of social media posts for insights into public opinion. (See graphic, p. 913.)

Even humanities scholars are embracing big data. Historians tap big data to gain new perspectives on historical figures. (See sidebar, p. 920.) Other scholars use it to study literature. In the past, researchers investigating literary trends might read 10 books and conclude that “these books from this era show us how literature is different” from another era, says Brett Bobley, director of the National Endowment for the Humanities Office of Digital Humanities. “Today, a researcher could study thousands of books and look at how language changes over time, how the use of gender changes, how spelling changes.”

Photo: Actor Kunal Nayyar wears Google glasses to the 65th Annual Primetime Emmy Awards in Los Angeles on Sept. 22, 2013. Part of the big data revolution, the glasses contain a computer that can take pictures, respond to voice commands, search the Internet and perform other functions. Along with the nonstop collection of data come concerns about loss of privacy. (Getty Images/Frazer Harrison)

Uses of big data involve businesses, governments, political organizations and other groups vacuuming up massive amounts of personal information from cell phones, GPS devices, bank accounts, credit-card transactions, retail purchases and other digital activities. The data often are gleaned from search engines, social networks, email services and other online sources. The mountains of information compiled in this manner, and the way they are analyzed and put to use, generate much of the criticism of big data, particularly after former NSA computer specialist Edward Snowden revealed in June that the agency has been collecting Americans’ telephone and email records, and The New York Times revealed that the Drug Enforcement Administration had ordered AT&T to hand over vast amounts of records about its customers’ telephone and computer usage. 3

“I don’t really want to live in a total surveillance state where big brother knows everything I do and has all that information at its fingertips,” says John Simpson, privacy project director for Consumer Watchdog, a consumer advocacy organization in Santa Monica, Calif. Sen. Jay Rockefeller, D-W.Va., has introduced a measure to allow Internet users to prohibit websites from collecting any of their personal information except when needed to provide a requested service to that person. Even then, the identity of the user would have to be kept secret or the information would have to be deleted after the service was performed. The bill awaits action in the Senate Committee on Commerce, Science and Transportation. 4

Big data differs from old-fashioned data partly in the massive volume of the information being collected. But other distinctions exist as well. Big data computers often crunch vast amounts of information in real time. And they can reach beyond structured databases, such as a company’s customer list, to make sense of unstructured data, including Twitter feeds, Facebook posts, Google searches, surveillance-camera images and customer browsing sessions on Amazon.com.

As a result, Internet platforms, such as Facebook or Google, act as big data “sensors,” gathering information about people just as a thermometer gathers temperature information. In addition, Internet-connected optical, electronic and mechanical sensors are being installed in ever-increasing numbers and locations.

Viktor Mayer-Schonberger and Kenneth Cukier offer striking illustrations of the size of the data being collected, and the speed with which the volume of that information is growing, in their 2013 book Big Data: A Revolution that Will Transform How We Live, Work, and Think. If all the world’s data were distributed among everyone on Earth, they write, each person would have 320 times more information than existed in third-century Egypt’s Alexandria Library. If today’s information were stored on stacks of compact disks, the stacks would stretch from Earth to the moon five times.

In 2000, only one-fourth of the world’s information was stored digitally. Today 98 percent is, and the amount of digital data is doubling every two to three years, according to Mayer-Schonberger, a professor at Oxford University’s Internet Institute, and Cukier, data editor of The Economist. 5 According to IBM, 90 percent of the world’s data has been created in the last two years, and that information is growing by 2.5 quintillion bytes each day. 6 (A quintillion is 1 followed by 18 zeroes. See sidebar, p. 922.)
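To put the quantities quoted so far on one scale, the short Python sketch below converts them to bytes per day, using the decimal definitions given in this report (a gigabyte is 1 billion bytes; a quintillion is 1 followed by 18 zeroes). It is illustrative arithmetic only. The constant and function names are invented here, and the sketch assumes the collider servers' rate of 10 gigabytes a second is sustained for a full day, which the report does not claim.

```python
# Illustrative arithmetic only: names are hypothetical, figures are those quoted in the report.
GIGABYTE = 10**9        # bytes (a gigabyte is 1 billion bytes)
QUINTILLION = 10**18    # 1 followed by 18 zeroes
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(bytes_per_second: int) -> int:
    """Scale a per-second data rate to a full day, assuming the rate is sustained."""
    return bytes_per_second * SECONDS_PER_DAY


# Collider teams' servers: 10 gigabytes a second (assumed sustained all day).
collider_bytes_per_day = bytes_per_day(10 * GIGABYTE)

# IBM's figure for worldwide growth: 2.5 quintillion bytes each day.
world_bytes_per_day = 25 * QUINTILLION // 10

# How many days of collider-scale throughput equal one day of worldwide growth.
ratio = world_bytes_per_day / collider_bytes_per_day

if __name__ == "__main__":
    print(f"Collider servers: {collider_bytes_per_day / 10**12:,.0f} terabytes per day")
    print(f"Worldwide growth: {world_bytes_per_day / 10**18:.1f} quintillion bytes per day")
    print(f"Ratio: about {ratio:,.0f} to 1")
```

Under these assumptions the servers' rate works out to 864 terabytes a day (taking a terabyte as 1 trillion bytes), and IBM's worldwide figure is roughly 2,900 times larger.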

For example, rapidly improving telescopes linked to powerful computers and the Internet are doubling the world’s compilation of astronomical data each year and making it available to astronomers around the globe. Until it suffered a major malfunction in May, the Kepler Space Telescope, for instance, measured the light of 170,000 stars every 30 minutes in search of changes indicating the presence of planets, according to Alberto Conti, a scientist for the Space Telescope Science Institute at Johns Hopkins University. 7

Big Data Reveals Sources of Racist Tweets

Geographers at Humboldt State University in Arcata, Calif., used big data to create an interactive “Geography of Hate” map revealing where racist Twitter traffic originated. Hot spots predominated in the Midwest, South and Northeast, often in rural areas.

[Map: Where Most Racist Tweets Originated, June 2012-April 2013. Legend: Most hate; Some hate.]

To protect privacy, actual tweet locations have been aggregated to the county level and normalized by number of tweets.

Source: Monica Stephens, “Geography of Hate,” Humboldt State University, http://users.humboldt.edu/mstephens/hate/hate_map.html#
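The note on the “Geography of Hate” map describes two steps: aggregating tweet locations to the county level and normalizing by the number of tweets. A minimal Python sketch of that idea follows. It assumes that "normalized" means the share of each county's tweets that were flagged; the report does not spell out the researchers' exact method, and the function name, record layout and sample counties are invented for illustration.

```python
from collections import Counter


def flagged_share_by_county(tweets):
    """Aggregate (county, is_flagged) records and return flagged / total per county.

    Hypothetical helper: individual tweet locations are reduced to a county
    label, and counts are divided by each county's total tweet volume.
    """
    totals = Counter()
    flagged = Counter()
    for county, is_flagged in tweets:
        totals[county] += 1
        if is_flagged:
            flagged[county] += 1
    return {county: flagged[county] / totals[county] for county in totals}


# Invented sample data: County A has more tweets overall, County B a higher share.
sample = [
    ("County A", True), ("County A", False), ("County A", False), ("County A", False),
    ("County B", True), ("County B", True),
]
shares = flagged_share_by_county(sample)

if __name__ == "__main__":
    for county, share in sorted(shares.items()):
        print(f"{county}: {share:.0%} of tweets flagged")
```

Dividing by total tweet volume keeps populous counties from standing out simply because they produce more tweets of every kind, and reporting only county-level shares avoids exposing any individual tweet's location.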

As advances in technology relentlessly expand big data’s capabilities, here are some questions that scientists, business executives, privacy advocates and government officials are debating:

Do big data’s benefits outweigh the risks?

Mario Costeja was shocked when a Google search on his name returned an 11-year-old Spanish newspaper notice about a financial transgression. Costeja, a Spaniard, had become delinquent on social security contributions, creating a debt that landed him in legal trouble in 1998 but that he had long since paid.

In a case still pending, Costeja asked a court to order the newspaper to remove the notice from its online archives and Google to block the notice from search results. The court ruled in the newspaper’s favor, but the Google ruling remains undecided. 8

Costeja’s plight reveals a key drawback to big data: Once something enters an online database, it can be impossible to erase. Before digital databases and powerful search engines, the newspaper notice that Costeja encountered in 2009 would have existed only as a yellowed clipping in the newspaper’s “morgue,” or perhaps been buried in microfilm or old newspapers stored in a few libraries.

But with today’s search engines capable of instantly exploring every database connected to the Internet, old transgressions can be found by anyone with a computer connection. And polls indicate that people are concerned about losing control of information once it gets collected and stored.

Most Americans View Data Collection Negatively

More than half of Americans polled in May and June said the massive collection and use of personal information by government and business have a “mostly negative” impact on privacy, liberty and personal and financial security.* The polling was done just before former National Security Agency computer specialist Edward Snowden in June revealed widespread NSA domestic spying. Fewer than 40 percent of the respondents see the use of big data as “mostly positive.” Fewer than half trust how the government uses their personal data, and nearly 90 percent support the so-called “right to be forgotten.”

What is your viewpoint on . . .

Collection and Use of Big Data?
• Mostly negative, because big data puts privacy, safety, financial security and liberty at risk: 55%
• Mostly positive, because big data can improve the economy, grow businesses, increase public safety and provide better service: 38%

A New Federal Privacy Law?
Would you support a federal law that would require online companies to permanently delete personal data or activity if requested by an individual (the so-called “right to be forgotten”)?
• Support, strongly or somewhat: 89%
• Oppose, strongly or somewhat: 8%

Social Media?
• Being able to connect with people all over the world and access information about any subject is worth the potential privacy tradeoffs: 47%
• The ease of communicating and locating information online has made it too easy for personal information to be shared and is not worth the risks: 47%

Note: Totals do not add to 100 because don’t know/refused to answer responses are not included.

* Pollsters contacted 1,000 people by telephone between May 29-June 2, 2013.

Source: “Allstate/National Journal/Heartland Monitor Poll XVII,” June 2013, www.theheartlandvoice.com/wp-content/uploads/2013/06/HeartlandTopline-Results.pdf


Of 1,000 Americans surveyed this year, 55 percent perceive a “mostly negative” impact from the collection and use of personal information. (See graph, p. 913.) Two-thirds complain they have little or no control over information collected about them. Three-fifths believe they can’t correct erroneous data. Conducted for the Allstate insurance company and National Journal, and just before the NSA domestic spying became public, the poll found that only 48 percent of Americans trust how governments, cell-phone companies and Internet providers use that information. 9

Simpson of Consumer Watchdog says big data can harm individuals, especially if the information is erroneous or used to draw incorrect conclusions. Organizations can collect information from a variety of sources, then use it to “put together a profile about you that potentially could be used against you,” he says. Analysis might conclude, for example, that “you’re probably a health risk because you go to all these sites about losing weight, and we can see you’re buying too much booze.

“Similarly, you can slice and dice the data you get from the Web and other sources and produce something that might be used to determine whether you should get a job, whether you should get a loan, or what rate you have to pay if you get the loan.”

With sophisticated enough analysis, organizations might “red-line” people, just as neighborhoods once were red-lined, suggests Jules Polonetsky, executive director and co-chair of the Future of Privacy Forum, a Washington-based think tank. Such “Web-lining,” as some are calling it, can cause people to pay more for health insurance if they have traits associated with certain illnesses, he explains, or a person’s credit rating could be affected by his Facebook friends’ financial histories.

Those are “probabilistic predictions that punish us not for what we have done, but what we are predicted to do,” said Mayer-Schonberger, the Oxford professor. 10

But Jeff Jarvis, a City University of New York journalism professor who writes frequently about big data, says critics focus too much on the negatives while ignoring the positives. For example, while many deplore facial recognition technology as an invasion of privacy that can empower stalkers and other predators, Jarvis explains, “facial recognition technology also can be used to find lost children or people on Alzheimer’s alert or criminals or terrorists.”

And despite the negative attitudes toward big data uncovered by the Allstate/National Journal survey, some poll respondents saw value in the technology. Nearly half said the privacy tradeoffs are a fair price for being able to connect with people around the world and access information about almost any subject within seconds. More than two-thirds said the collection and analysis of information about them would likely lead them to receive more information about interesting products and services and better warnings about health risks. Nearly two in five said big data could enable government and business officials to make better decisions about expanding businesses, improving the economy, providing better services and increasing public safety. 11

Scientists in Boston, for example, are using probabilistic predictions to try to prevent suicide. Funded by the federal government’s Defense Advanced Research Projects Agency, the scientists are monitoring social media and text postings by volunteers who are active-duty military personnel or veterans. The scientists’ computers watch for keywords associated with suicide in an effort to create an automated system that will alert caregivers, relatives or friends when suicide-linked expressions are observed. 12

Photo: Police at New York City’s counterterrorism center monitor more than 4,000 security cameras and license plate readers in the Financial District and surrounding parts of Lower Manhattan. Vast amounts of data are collected and stored by police departments and other government agencies. For example, law enforcement agencies store photos taken by cameras mounted on patrol cars and on stationary objects. Governments store photos made for driver’s licenses and other identification documents. Many images are stored indefinitely and can now be searched with facial-recognition technology across multiple databases. (Getty Images/John Moore)

The Financial Industry Regulatory Authority, a self-regulating body for the securities industry, has turned to big data to catch up with the rapid technological changes occurring in financial markets. In August 2012, the agency employed software that scans trading activity for patterns that indicate suspicious practices. Results led the agency to launch 280 investigations by July 2013. 13

Is big data speeding the erosion of privacy?

Security technologist Bruce Schneier, who has declared that “the Internet is a surveillance state,” is doubtful about protecting privacy in the era of big data.

“If the director of the CIA can’t maintain his privacy on the Internet,” Schneier said, “we’ve got no hope.” 14

Schneier, a fellow at Harvard Law School’s Berkman Center for Internet and Society, was referring to David Petraeus, who stepped down as director of the spy agency last November as his extramarital affair with former military intelligence officer Paula Broadwell was becoming public. The couple, presumably experts in covert actions, had taken steps to conceal their relationship. To avoid blazing email trails, they left messages to each other in the draft folder of a shared Gmail account. And when Broadwell sent threatening emails to a woman she thought was a rival for Petraeus’ attention, she used a fake identity and accessed the Internet from hotel networks rather than her home. But big data tripped them up. 15

Analyzing email metadata (addresses and information about the computers and networks used by the senders, but not message content), the FBI traced Broadwell’s threatening messages to several hotels. From hotel records, they found one guest who had stayed at those hotels on the dates the emails were sent: Broadwell. 16
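The cross-referencing step described above amounts to intersecting lists: take the guest list for each hotel and date from which a message was sent, and keep only the names that appear on every list. The Python sketch below shows that logic under stated assumptions. It is not a description of the FBI's actual tools; the function name and the guest data are invented for illustration.

```python
def guests_present_every_time(guest_lists):
    """Return the names that appear in every guest list.

    Each element of guest_lists is the collection of guests registered at one
    hotel on one date of interest. Hypothetical helper for illustration only.
    """
    sets = [set(names) for names in guest_lists]
    if not sets:
        return set()
    return set.intersection(*sets)


# Invented data: three (hotel, date) guest lists tied to three traced messages.
stays = [
    {"Guest A", "Guest B", "Guest C"},
    {"Guest B", "Guest D"},
    {"Guest B", "Guest C", "Guest E"},
]
matches = guests_present_every_time(stays)

if __name__ == "__main__":
    print(sorted(matches))
```

Each added list can only shrink the set of candidates, which is why a handful of locations and dates can be enough to single out one person even when no individual record names the sender.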

Combining Internet data with offline data occurs constantly across the globe, as companies build dossiers on their customers to offer personalized products and services and to target advertising or sell the information to others.

“This is ubiquitous surveillance,” Schneier said. “All of us [are] being watched, all the time, and that data [are] being stored forever. . . . It’s efficient beyond the wildest dreams of George Orwell,” whose novel 1984 introduced the world to the all-seeing “Big Brother.” 17

Internet users voluntarily post photos of themselves and their friends on social media. Surveillance cameras take photos and videos of people in public and private places. Police departments store photos taken by cameras mounted on patrol cars and on stationary objects. Governments store photos made for driver’s licenses and other identification documents. Some police cruisers are equipped with devices that enable remote searches of photo databases. Many of those images are stored indefinitely and can now be searched with facial-recognition technology across multiple databases.

Phone companies store call records, and mobile phone companies store records of where phones were used. Analysis of those records can reveal where a phone was at any time, down to a specific floor in a specific building. 18

Big-data companies say they protect the identities of individuals behind the data. But scientists say even “anonymized” phone records (those with users’ names removed) can reveal identities.

Many phone numbers are listed in public directories. Researchers at the Massachusetts Institute of Technology (MIT) and Belgium’s Université Catholique de Louvain discovered that they could identify anonymized cell phone users 95 percent of the time if they knew just four instances of when and where a phone was used. 19 Companies’ privacy policies also can be meaningless if they accidentally release information or if their business partners don’t honor the policies.

For example, due to a computer bug, Facebook allowed unauthorized people to access the phone numbers and email addresses of 6 million users and nonusers between mid-2012 and mid-2013. The social network admitted that it had been secretly compiling data about users from sources beyond Facebook and had been gathering information about nonusers as well. 20

In 2010, the network had admitted that many third-party applications available through Facebook were collecting data on users and that the apps’ makers were passing the information on to advertisers and tracking companies, contrary to Facebook’s stated policy. Some also were collecting and distributing information about the app-users’ friends. 21
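The earlier finding that four known times and places can single out an “anonymized” phone user can be illustrated with a toy uniqueness check: given a few (time, place) points known about a person, count how many records in the data set contain all of them. The Python sketch below is a simplified illustration, not the MIT and Louvain researchers' method or data. The function name, user labels, times and tower names are all invented.

```python
def matching_users(traces, known_points):
    """Return the IDs of users whose trace contains every known (time, place) point.

    traces maps an anonymized ID to a set of (time, place) tuples.
    Hypothetical helper for illustration only.
    """
    known = set(known_points)
    return [user_id for user_id, points in traces.items() if known <= set(points)]


# Invented traces: names have been removed, but movement patterns remain.
traces = {
    "user_1": {("Mon 09:00", "tower_3"), ("Mon 18:00", "tower_7"),
               ("Tue 09:00", "tower_3"), ("Tue 20:00", "tower_5")},
    "user_2": {("Mon 09:00", "tower_3"), ("Mon 18:00", "tower_2"),
               ("Tue 09:00", "tower_3"), ("Tue 20:00", "tower_5")},
    "user_3": {("Mon 09:00", "tower_1"), ("Mon 18:00", "tower_7"),
               ("Tue 09:00", "tower_1"), ("Tue 20:00", "tower_4")},
}

# One known point leaves two candidates; four known points leave only one.
one_point = matching_users(traces, [("Mon 09:00", "tower_3")])
four_points = matching_users(traces, [
    ("Mon 09:00", "tower_3"), ("Mon 18:00", "tower_7"),
    ("Tue 09:00", "tower_3"), ("Tue 20:00", "tower_5"),
])

if __name__ == "__main__":
    print("Candidates with one known point:", one_point)
    print("Candidates with four known points:", four_points)
```

When only one record matches, removing the name has not protected the user: anyone who knows those few times and places can recover the rest of that person's movements from the record.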

Such revelations have left many Americans worried about their privacy. In the Allstate/National Journal poll, 90 percent of Americans said they have less privacy than previous generations, and 93 percent expect succeeding generations to have even less. 22 An 11-nation survey by Ovum, a London-based business and technology consulting firm, found that two-thirds of respondents would like to prevent others from tracking their online activities. 23

“Everything we do now might be leaving a digital trail, so privacy is clearly on the way out,” says David Pritchard, a physics professor at MIT who studies teaching effectiveness by mining data gathered during massive open online courses (MOOCs), which MIT and other universities offer for free over the Internet. 24

By recording and analyzing students’ keystrokes, Pritchard can see what page of an online textbook they viewed, for how long and what they did before and after that. He can also see who participated in the course’s online forum. Comparing such study habits with performance on tests will enable him to “figure out what tools students use to get problems right or wrong,” he says. But he is quick to note that he obtains students’ permission to monitor their keystrokes for the study.

While many people value their personal privacy, he says, “we’re going to have to work a lot harder to maintain some aspects of it.”

Others say concerns about big data speeding the loss of privacy are overblown.

Patrick Hopkins, a philosopher who focuses on technology, advocates a pragmatic approach. “We need to get realistic about privacy in the information age. This notion of privacy being an inalienable right is recent,” says Hopkins, who chairs the Millsaps College Philosophy Department in Jackson, Miss., and is an affiliate scholar with the Institute for Ethics and Emerging Technologies, an international organization based in Hartford, Conn., that studies technology’s impact on society. “It’s not in the Constitution. It’s not in the French Declaration of Rights,” he says, referring to the “Declaration of the Rights of Man and of the Citizen,” adopted by the French National Assembly in 1789. 25 “I’m only worried about privacy if there’s something that’s going to harm me.”

“These days, I know if I drive down a street I will be monitored,” says Hopkins. “That does not bother me. There’s no harm to me, and I might benefit” if surveillance catches a criminal.

Similarly, he’s not bothered by the government mining big data in search of terrorists. “Asking if you’re willing to give up your privacy for security is a false dilemma,” Hopkins says. “It’s not like you have 100 percent of one and zero percent of the other. I would be willing to give up a bit of privacy for security.”

Citing surveys showing that young people are less worried about privacy than their elders, Mike Zaneis, senior vice president for public policy at the Interactive Advertising Bureau, a New York-based trade association for the online advertising industry, says, “It’s not because they don’t care about privacy. It’s that they understand the value of the exchange, and they are willing to give up more information as long as they are receiving some benefit.”

Andreas Weigend, former chief scientist at Amazon.com who now directs Stanford University’s Social Data Lab, says many people’s notion of privacy is “romantic” and “belongs in the Romantic Age. We need to understand a 21st-century notion of privacy, and it can’t be what some people wish was the case.”

Should the federal government strengthen its regulation of big data?

Privacy advocates and big data businesses both view Europe as instructive in debates about government regulation — but in different ways. Many privacy advocates want the United States to adopt Europe’s stronger regulatory approach, and business advocates want Europe to become more like the less-regulated U.S.

The United States has laws that protect certain kinds of data, such as financial and medical records, while all European countries have broad right-to-privacy laws. For example, many European countries require organizations to obtain permission before collecting information about an individual and allow those individuals to review and correct the data. 26 Last year, the United Kingdom forbade


Health Care Providers, Employers Most Trusted to Protect Data

Consumers said they trusted doctors, hospitals and employers the most to handle individuals’ personal data responsibly, according to a poll taken in late May and early June. Political parties, the media and social media websites were trusted the least. The poll was conducted just before Edward Snowden’s revelations that the National Security Agency has been spying on private citizens.

Note: Totals do not add to 100 because don’t know/refused to answer responses are not included.

Source: “Allstate/National Journal/Heartland Monitor Poll XVII,” June 2013, www.theheartlandvoice.com/wp-content/uploads/2013/06/HeartlandTopline-Results.pdf

How much do you trust these groups or people to use information about you responsibly?

Group                            A great deal/some    Not much/not at all
Health care providers            80%                  20%
Employers                        79%                  19%
Law enforcement                  71%                  28%
Insurance companies              63%                  35%
Government                       48%                  51%
Cell phone/Internet providers    48%                  50%
Political parties                37%                  61%
Media                            29%                  69%
Social media sites               25%                  70%


online organizations from tracking individuals’ web activities without consent. And the European Union (EU) is considering regulations that would apply to all 28 member states. 27

Zaneis, of the Interactive Advertising Bureau, says U.S. residents’ privacy is guarded adequately by the laws regulating data gathered for specific purposes, including protecting children’s online privacy.

The Federal Trade Commission is charged with protecting Americans’ privacy, Zaneis says, and the marketing industry has an effective self-regulatory regime. For instance, he says, more than 95 percent of online advertising companies participate in a program that enables consumers to prevent their information from being sent to third parties for marketing purposes.

That’s not good enough for many privacy advocates, however. Individuals should not have to opt out of data-collection programs, they argue. Instead, they contend, companies should have to ask individuals to opt in. Critics also maintain that self-regulation is inadequate.

“This isn’t something the free market can fix,” argued Schneier, of Harvard’s Berkman Center for Internet and Society. “There are simply too many ways to be tracked” by electronic devices and services. “And it’s fanciful to expect people to simply refuse to use [online services] just because they don’t like the spying.” 28

Supporting Schneier’s contention, Forrester Research reported that just 18 percent of Internet users set their browsers to ask organizations not to track their online activities — a request the organizations can, however, legally ignore. 29 Zaneis says most people don’t change their default settings — whether to opt in or opt out.

Privacy advocates want to do much more than block information to third parties. In the United States, Consumer Watchdog supports Sen. Rockefeller’s proposal to enable individuals to ban Internet sites from collecting information about them online except when the collection is necessary to provide a requested service to the individual. 30

Americans “should be empowered to make their own decisions about whether their information can be tracked and used online,” Rockefeller said when he introduced the legislation in March. 31 Consumer Watchdog’s Simpson would extend that right to all data, not just online data. “You as a consumer should have the right to control what data companies collect from you and how to use it, and if you don’t want them to collect it you ought to be able to say ‘no,’ ” he argues. “Right now, we have this insidious ecosystem that’s based on spying on people without them really knowing who’s looking and what they’re doing and that they’re putting together giant dossiers on you.”

Americans also want the right to remove online data that they don’t like. Fully 89 percent of respondents in the Allstate/National Journal survey supported this so-called “right to be forgotten.” 32

Computer scientist Jaron Lanier — an early developer of virtual reality technology who currently works at Microsoft Research — suggests that people should be paid for their data. The value of that personal information probably runs to $1,000 or more a year, he estimates. Companies already compensate consumers for the information they give up when using club cards at grocery stores and other retailers, by giving them “member” discounts, he notes.

But it’s not just the value of online advertising that counts, Lanier says. “What if someone’s medical records are used to make medicine?”

Zaneis argues that Internet users already are compensated via the services they receive free on the web. “Ninety-nine percent of the content and services that we consume online are offered for free to the consumer,” he says. “They’re paid for by the consumer giving something. Sometimes it’s data. Sometimes it’s their eyeballs on advertising.” If many people blocked access to their data, he says, it could kill the Internet as we know it.

Jeff Jarvis, a City University of New York journalism professor, says critics focus too much on the negatives of big data while ignoring the positives. While many deplore facial recognition as an invasion of privacy that can empower stalkers, the technology also can be used to find lost children, as well as criminals or terrorists, he says.

Getty Images/Burda Media/Sean Gallup


Stanford’s Weigend calls that “the underlying concept [of the Internet]: Give to get.”

Jim Harper, director of information policy studies at the Cato Institute, a libertarian think tank in Washington, says government should stay out of the debate. “I put the burden on people to maintain their own privacy,” he says. “It’s the responsibility of web users to refuse interaction with sites they don’t want to share information with, to decline to share cookies,” which are small files that a website places in a computer’s browser to identify the browser on future visits. “Don’t post online things you don’t want posted online.

“When you walk outside your house, people can see you and gather what they learn and use it any way they want. I think the online world has to work the same.”

Jarvis, the journalism professor, argues that Internet users have an obligation to share their information with the websites they use. “You have an impact on the sustainability of the properties whose content you’re getting,” he says. By blocking your information, “you are choosing to make yourself less valuable to that property. You’ve made this moral choice not to support these sites.” News media, which are struggling to survive as it is, would be especially damaged if readers withheld information needed to make ads more valuable, he adds.

Jarvis does not argue against all privacy laws; he accepts those protecting medical records, for instance. And he favors extending first-class mail’s privacy guarantee to all private communication. But “we should not be legislating according to technology, because [technology] can be used for good as well as bad,” he continues. “We shouldn’t regulate the gathering of information” online because that would be restricting what people can learn and know. “We should regulate behavior we find wrong.”

Jarvis also calls for respecting “publicness” as well as privacy. “The right to be forgotten has clear implications for the right to speak,” he says. “If I take a picture of us together in public, and you, say, force me to take that picture down [from the Internet], you’re affecting my speech.”

Deciding how to regulate big data is “quite complex,” Hopkins, of Millsaps College, says. “We need to recognize what the costs would be.”

BACKGROUND

Data Overload

In mid-1945, as World War II slogged toward its end, Vannevar Bush, director of the U.S. Office of Scientific Research and Development and later a key advocate for creation of the National Science Foundation, conjured up the future of big data and the so-called Information Age.

In a lengthy essay in The Atlantic entitled “As We May Think,” Bush wrote:

“The summation of human experience is being expanded at a prodigious rate, and the means we use for threading through the consequent maze . . . is the same as was used in the days of square-rigged ships. . . . The investigator is staggered by the findings and conclusions of thousands of other workers — conclusions which he cannot find time to grasp, much less to remember.” 33

Technology probably will solve the problem, Bush suggested, hinting at future inventions such as desktop computers, search engines and even Google’s new Glass headgear, eyeglasses that contain a computer that can take pictures, respond to voice commands, search the Internet and perform other functions as wearers move from place to place. Bush envisioned a miniature automatic camera that a researcher would wear on his forehead to make microfilm photographs as he worked. He also predicted that one day a microfilmed Encyclopaedia Britannica could be kept in a matchbox and a million books in part of a desk.

Perhaps most striking, he anticipated the personal computer and the search engine in a desk-sized contraption he dubbed the “memex.” “On the top are slanting translucent screens, on which material can be projected for convenient reading,” he wrote. “There is a keyboard, and sets of buttons and levers.” Inside are all those compressed documents, which the memex user will call to a screen the way the brain remembers — through “association of thoughts” rather than old-fashioned indexing, as search engines do today.

Bush knew a great deal about data, having invented the differential analyzer — a mechanical computer for working complex equations — in 1931. But he was far from the first to bewail data overload and to seek ways of addressing it. The Old Testament book of Ecclesiastes laments: “Of making many books there is no end; and much study is a weariness of the flesh.” 34 The first-century AD philosopher Lucius Annaeus Seneca grumbled that “the abundance of books is distraction.” 35 By the 15th century, Europeans felt crushed under information overload after German printer Johannes Gutenberg invented mechanical printing with moveable type — the printing press.

“Suddenly, there were far more books than any single person could master, and no end in sight,” Harvard University historian Ann Blair observed of Europe after Gutenberg’s invention around 1453.

Printers “fill the world with pamphlets and books that are foolish, ignorant, malignant, libelous, mad, impious and subversive,” the Renaissance humanist Erasmus wrote in the early 16th century, “and such is the flood that even things that might have done some good lose all their goodness. . . . Is there anywhere on earth exempt from these swarms of new books?” 36


Chronology

1960s-1995
Development of Internet opens new ways to create and gather data.

1967
Defense Department’s Advanced Research Projects Agency (ARPA) funds research that leads to creation of the Internet.

1969
UCLA and Stanford Research Institute establish the first ARPANET link.

1980
Computer scientist I. A. Tjomsland declares that data expand to fill the available storage space. . . . Phrase “big data” is first used in print.

1989
Internet opens to general public.

1991
Commercial activities are allowed on Internet for first time. . . . Point-and-click navigation invented. . . . World Wide Web name coined.

1992
Bill Clinton-Al Gore presidential/vice presidential campaign makes Internet an effective political tool.

1994
First White House website launched.

1995
Library of Congress puts legislative information online.

1997-Present
Big data becomes major force through use of search engines, social media and powerful computers.

1998
Google search engine goes online. . . . American information scientist Michael Lesk predicts increase in storage capacity will mean no data will have to be discarded.

2000
A quarter of the world’s stored data has been digitized.

2002
LinkedIn business-networking site launched.

2003
More data created this year than in all previous human history.

2004
“Thefacebook” is launched for Harvard undergraduates. . . . Flickr photo-sharing platform goes online. . . . Walmart is storing 460 terabytes (460 trillion bytes) of data about its customers. Other large retailers also are storing large amounts of data.

2005
YouTube video-sharing site created.

2006
Twitter goes online. . . . Facebook opens to anyone age 13 or over.

2007
AT&T begins giving police access to U.S. phone records dating back to 1987 for use in criminal investigations.

2008
Sen. Barack Obama uses big data to win presidential election.

2010
Super-secret National Security Agency (NSA) processes 1.7 billion intercepted communications daily. . . . Google discloses that cars taking street-level photos for Google Maps also captured information from unprotected Wi-Fi networks. . . . Conservative activists turn Tea Party into political force using social media.

2012
Google fined $22.5 million for bypassing browser privacy controls. . . . U.S. government launches $200 million Big Data Research and Development Initiative to advance technologies needed to analyze and share huge quantities of data. . . . Financial Industry Regulatory Authority taps big data to uncover suspicious securities practices. . . . NSA begins building $2 billion facility for processing big data. . . . Federal Trade Commission (FTC) begins inquiry into data brokers’ activities. . . . Social media activism helps prolong Republican presidential nominating process. . . . Obama intensifies use of big data to win re-election.

2013
After strong negative customer reaction, Facebook temporarily backs away from plan to utilize users’ pictures and postings in ads. . . . Google fights lawsuit over reading Gmail content. . . . Survey reveals 55 percent of Americans see “mostly negative” impact from collection of big data about people. . . . FTC official says people should be able to restrict use of their data in commercial databases. . . . Acxiom data broker lets people correct and delete information in its database. . . . Tapping Medicare’s big data, U.S. government reports what 3,000 U.S. hospitals charge for 100 different treatments. . . . Ninety percent of world’s data have been created in the last two years, and data are growing by 2.5 quintillion bytes daily. . . . Ninety-eight percent of world’s data are stored digitally.


The products rolling off the presses did not overwhelm scholars for long, however. New technologies were developed to address the challenges the presses had created: public libraries, large bibliographies, bigger encyclopedias, compendiums of quotations, outlines, indexes — predecessors of Reader’s Digest, Bartlett’s Familiar Quotations and Internet aggregators, Blair noted.

That always happens, the late science historian Derek Price said, because of the “law of exponential increase,” which he propounded while


Big data can be used for more than plumbing databases for consumer preferences about dish soap or spying on citizens’ phone calls. Sometimes supported by grants from the National Endowment for the Humanities, scholars in history and literature, for example, are digging into new databases of old documents to reexamine common assumptions about authors and important historical figures.

Researchers at Stanford University, for instance, have discovered that England was much less important to French philosopher Voltaire and the French Enlightenment than was commonly thought.

Tapping into Oxford University’s digitized collection of 64,000 letters written by 8,000 historical figures from the early 17th to mid-19th centuries, the Stanford team analyzed the letters’ metadata — the names, addresses and dates, but not their content — and discovered little communication between Voltaire and the English.

Using big data “allows us to ask questions that were simply not possible to be asked in a systematic manner before, and to analyze our data in ways that would have been impossible or incredibly difficult to do,” explains Dan Edelstein, a Stanford associate professor of French and history. The analysis of the Oxford papers “allowed me to see Voltaire’s correspondence network in a new light and to formulate hypotheses that I probably would not have hit upon.”

Similarly, Caroline Winterer, a Stanford history professor and director of the Stanford Humanities Center, and her colleagues probed the papers of Benjamin Franklin, collected by Yale University scholars beginning in 1954 and digitized with support from the Packard Humanities Institute, in Los Altos, Calif. Their findings challenged the perception of Franklin as a cosmopolitan figure, at least before 1763, when he returned to America after a six-year stay in England. Almost all of the letters Franklin received while in England were from England or America. 1

Living in London made Franklin “a typical Anglo-Atlantic figure” rather than “an international man of mystery,” Edelstein explains, because “the American colonies and England really had shared a culture at this time.”

Analyzing big data, however, does not replace traditional scholarship, Edelstein says. “To find out if you’re on the right track, you need to contextualize your results and to interpret them, and that means being very familiar with your data and with the scholarship of your period. When you see England doesn’t seem to be that important in Voltaire’s map [of correspondence], that only means something if you already are familiar with certain scholarship about Voltaire that establishes England as an important place for him.

“What does it mean that you don’t see a lot of letters there?” he asks. “It could mean he didn’t care about England. It could mean he had other sources of information about England. Could it be we just lost the letters to England? You have to spend time in rare-books collections figuring out these questions.” The researchers concluded that “there’s not really a good reason why more of the English letters would have been lost than his [other] letters.”

Brett Bobley, who runs the national endowment’s Office of Digital Humanities, says big data techniques greatly expand what humanities scholars can accomplish.

“Humanities scholars study books, music, art — and those very objects that they study are increasingly in digital format,” he explains. “If you study the Civil War and you read old newspapers, now you can digitally access thousands and thousands of newspaper pages from the Civil War era, and you can use digital tools to analyze the data in those newspapers far more than you could possibly read in your lifetime.”

— Tom Price

1 Claire Rydell and Caroline Winterer, “Benjamin Franklin’s Correspondence Network, 1757-1763,” Mapping the Republic of Letters Project, Stanford University, October 2012.

Crunching Data Sheds New Light on History

“Big data allows us to ask questions that were not possible before.”

Computer analysis of letters received by Benjamin Franklin during a six-year stay in England challenged perceptions of the famed founding father as a cosmopolitan figure.

AFP/Getty Images/Paul J. Richards


teaching at Yale University in 1961. Scientific knowledge grows exponentially because “each advance generates a new series of advances,” he said. 37

Before the Internet and powerful computers made today’s big data possible, some scholars saw it coming. In 1980, computer scientist I. A. Tjomsland proclaimed a corollary to Parkinson’s First Law (work expands so as to fill the time available for its completion). “Data expands to fill the space available,” he said, because “users have no way of identifying obsolete data,” and “the penalties for storing obsolete data are less apparent than are the penalties for discarding potentially useful data.” 38

In 1997, while working on an electronic library project called CORE (Chemical Online Retrieval Experiment), information scientist Michael Lesk predicted that advances in storage capacity would create unlimited space for data.

“In only a few years, we will be able [to] save everything — no information will have to be thrown out, and the typical piece of information will never be looked at by a human being,” said Lesk, who now is a professor of library and information science at Rutgers University. 39

Birth of Big Data

Technological advances often come about through government initiatives. For instance, the 1890 U.S. Census used machine-read punch cards for the first time to tabulate the results, enabling the complete census to be reported in one year instead of eight. That, according to Big Data co-authors Cukier and Mayer-Schonberger, was the beginning of automated data processing. 40

In Vannevar Bush’s time, government demand drove the development of computing machines in efforts to crack enemy codes during World War II. After the war, U.S. spy agencies sought ever-more-powerful technology to collect, store and process the fruits of their espionage.

By 2010, the NSA was processing 1.7 billion intercepted communications every day. In 2012, it began construction of a $2 billion facility in Utah designed to deal with the huge amounts of data it collects. 41

U.S. national security concerns also drove the creation and initial development of the Internet. The Defense Advanced Research Projects Agency funded the efforts in order to facilitate information exchange among military research facilities. The first Internet link occurred when a University of California, Los Angeles, computer logged onto a Stanford Research Institute computer 360 miles away on Oct. 29, 1969.

Although the concept of an “information explosion” had been around since 1941, the phrase “big data” was used first by the late University of Michigan sociologist Charles Tilly in a 1980 paper for the university’s Center for Research on Social Organization, according to the Oxford English Dictionary. “None of the big [social history] questions has actually yielded to the bludgeoning of the big-data people,”

In the days before electronic computers, keypunch operators entered data on machine-readable cards. The U.S. government used punch cards in 1890 to tabulate census results, enabling the complete results to be reported in one year instead of eight, which some scholars mark as the beginning of automated data processing.

Getty Images/Marvin Lazarus


Tilly wrote of historians using computers and statistics. 42

The Internet didn’t open to the general public until the late 1980s, when MCI Mail and CompuServe offered email services. The public’s first full-service Internet access came through a provider called “The World” in 1989.

The early ’90s saw the beginnings of commercial activity on the Internet and the coining of the term World Wide Web. The additions of point-and-click browsing and graphics increased the Internet’s popularity. 43

Data and Politics

During the 1992 presidential campaign, former Arkansas Gov. Bill Clinton and his running mate, Sen. Al Gore of Tennessee, turned the web into a campaign tool, then launched the first White House website in 1994. In 1995, the Library of Congress started an online legislative information website — Thomas.gov — named after former President Thomas Jefferson, who sold his personal book collection to the library after the British burned Congress’ original holdings during the War of 1812.

In 2003, Vermont Gov. Howard Dean scaled new heights in Internet campaigning by organizing supporters, publicizing campaign events and raising funds online. The efforts made him the most successful fundraiser in the history of the Democratic Party and enabled him to take an early lead in the campaign for the 2004 Democratic presidential nomination.

Dean’s quest for the nomination failed. But four years later, Illinois Sen. Barack Obama took a quantum leap over Dean, winning the White House after raising a record $500 million online and making the Internet and big data an integral part of his campaign. Obama compiled more than 13 million email addresses, attracted a million-member audience for text messages, and created a 90-worker digital staff to contribute to communication, fundraising and grassroots organizing. He campaigned on Facebook, Twitter and YouTube, and even bought ads in video games. 44

Mark McKinnon, an adviser to George W. Bush’s 2000 and 2004 campaigns, described the Obama run as a “seminal, transformative race” that “leveraged the Internet in ways never imagined.” 45

Two years later, conservative activists employed online social media to turn the Tea Party into a political force that helped Republicans take control of the U.S. House. In 2012, social media ac-


The term big data showed up this year in an old-school database — the Oxford English Dictionary, which defined it as “data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges; [also] the branch of computing involving such data.” 1

Big data’s information databases are so massive they require powerful supercomputers to analyze them, which increasingly is occurring in real time. And big data is growing exponentially: The amount of all data worldwide is expanding by 2.5 quintillion bytes per day.

In addition to analyzing large, traditional databases, big-data practitioners collect, store and analyze data from unstructured databases that couldn’t be used in the past — such as email messages, Internet searches, social media postings, photographs and videos. The process often produces insights that couldn’t be acquired in any other way.

Analyses of big data can uncover correlations — relationships between things that may or may not establish cause and effect. Some analysts say the lack of clear cause and effect is a weakness, but others say mere correlations often are sufficient to inform decisions.

“Correlations are fine for many, many things,” says Jeff Hawkins, cofounder of Grok, a company that makes software for real-time data analysis. “You don’t really need to know the why . . . as long as you can act on it.” For instance, he continues, “You don’t really need to understand exactly why people consume more energy at 2 in the afternoon” in order to decide whether to buy or sell it at that time of the day. “If I’m just trying to make a better pricing strategy,” the correlation is enough.
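Hawkins’ point, that a decision can rest on a correlation alone, can be illustrated with a short calculation. The Python sketch below does not come from Grok or from any source cited here: the demand and price figures are invented purely for illustration, and `pearson` is a hypothetical helper that computes the standard Pearson correlation coefficient. A value near 1 means the two series rise and fall together, which may be enough to guide a pricing decision even though it says nothing about why they move together.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, each left unnormalized (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical readings, invented for illustration only:
# energy demand (megawatts) and spot price (dollars) at 2 p.m. on six days.
demand_mw = [310, 330, 355, 380, 400, 430]
price_usd = [41, 44, 46, 50, 53, 57]

r = pearson(demand_mw, price_usd)
print(f"correlation between 2 p.m. demand and price: {r:.3f}")
```

With these made-up numbers the coefficient comes out close to 1. A trader could act on that pattern without any model of what drives afternoon consumption, which is the trade-off the analysts quoted above are debating.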

Databases are measured in “bytes,” the basic units of digital information. A byte is eight bits. A bit is a single binary value, either 0 or 1, the state of one computer switch. A byte commonly is equivalent to a single alphanumeric character.

Discussing big data requires huge numbers with unfamiliar names, such as:

• Quadrillion: 1 followed by 15 zeroes.

• Quintillion: 1 followed by 18 zeroes.

• Sextillion: 1 followed by 21 zeroes.

• Septillion: 1 followed by 24 zeroes.

• Terabyte: 1 trillion bytes.

• Yottabyte: 1 septillion bytes.

• Googol: 1 followed by 100 zeroes (the inspiration for the Google search engine’s name).

• Googolplex: 1 followed by a googol zeroes — that is, 10 raised to the power of a googol.

Googol and googolplex exist primarily for mathematical amusement. The 9-year-old nephew of Columbia University mathematician Edward Kasner, Milton Sirotta, coined the terms. The boy defined googolplex as a 1 followed by “writing zeroes

From Yottabytes to Googolplexes: Big Data Explained

For some, it’s “writing zeroes until you get tired.”


tivism helped dark-horse candidates stay alive to prolong the 2012 Republican presidential nominating contest. But Obama — adding a robust social media presence to his big-data mix of information-collection, data-processing and Internet communication — won a second term.

Social media — now a major source of big data — emerged in the early 2000s, with the LinkedIn business networking site launching in 2003 and Facebook (first called Thefacebook) beginning to serve Harvard undergraduates in 2004. The Flickr photo-sharing platform went online in 2004 and the YouTube video-sharing site in 2005. Twitter launched in 2006, and Facebook opened to anyone 13 or older that same year. 46

A decade earlier, Stanford University graduate students Larry Page and Sergey Brin began to build a search engine. They went online in 1997 with the name Google, a play on the number googol, which is 1 followed by 100 zeroes. Page and Brin chose the brand to represent the massive challenge of trying to search the entire Internet. 47 Google became the most profitable Internet company on the Fortune 500 list by continually innovating and adding new products and services, such as the Google+ social media platform, Google Maps and the street-level and satellite photos that accompany the maps. 48

It also stumbled into some significant controversies, such as when it was discovered that Google cars taking street-level photos in 2010 also were collecting information from unprotected Wi-Fi networks, including email addresses and passwords. The Federal Trade Commission (FTC) fined the company $7 million for improper data collection that Google said was unintentional. 49 Last year, the FTC fined Google another $22.5 million for bypassing privacy controls on Apple’s Safari browser. 50

While online businesses such as Google and Facebook are famous — or infamous — for their use of big data today, brick-and-mortar retailers have been mining customer data since before some online companies were born. By 2004, for example, Walmart had loaded 460 terabytes of customer data onto its computers — more than twice as much information as was then available on the Internet and enough to begin predicting consumer behavior “instead of waiting for it to happen,” according to Linda Dillman, the company’s chief information officer. 51

Target, too, has collected vast amounts of information about its customers for years and uses it to focus its marketing.

until you get tired,” Kasner related in the 1940 book Mathematics and the Imagination, which he wrote with fellow mathematician James Newman. 2
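The magnitudes named in this sidebar are easier to compare with a little arithmetic. The following Python sketch is an illustration only, not something drawn from the sources cited here. It assumes decimal (power-of-ten) units, as the definitions in this sidebar do, and it assumes, unrealistically, that the cited growth of 2.5 quintillion bytes a day stays constant. All variable names are invented for the example.

```python
# Decimal (power-of-ten) units, matching the definitions in this sidebar.
TERABYTE = 10**12        # 1 trillion bytes
QUINTILLION = 10**18     # 1 followed by 18 zeroes
SEPTILLION = 10**24      # 1 followed by 24 zeroes
YOTTABYTE = SEPTILLION   # 1 septillion bytes
GOOGOL = 10**100         # 1 followed by 100 zeroes

# Daily growth of the world's data as cited in this report:
# 2.5 quintillion bytes, kept as an exact integer.
daily_growth_bytes = 25 * QUINTILLION // 10

# The same daily growth expressed in terabytes.
daily_growth_terabytes = daily_growth_bytes // TERABYTE
print(daily_growth_terabytes)   # 2500000

# Days needed to accumulate one yottabyte, assuming the rate never changes.
days_to_yottabyte = YOTTABYTE // daily_growth_bytes
print(days_to_yottabyte)        # 400000

# A googol written out has 101 digits: a 1 and then 100 zeroes.
print(len(str(GOOGOL)))         # 101

# A googolplex (10 ** GOOGOL) is deliberately not computed here:
# its digits could not be stored on any computer.
```

At a constant 2.5 quintillion bytes a day, a yottabyte would take 400,000 days, or more than 1,000 years, to accumulate. The report describes the growth as exponential, so the constant-rate figure is best read as a sense of scale rather than a forecast.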

Other big data jargon includes:

• Metadata: Information about information. With email, for instance, metadata include the addresses, time sent, networks used and types of computers used, but not the message content.

Many who collect metadata — including companies and government investigative agencies — say they don’t invade privacy because they don’t collect the content or identify specific individuals. But critics say metadata can reveal a great deal about an individual. Analyzing records of phone calls between top business executives could suggest a pending merger, calls to medical offices could hint at a serious illness and records of reporters’ communications might reveal the identity of confidential sources, according to mathematician Susan Landau, a former Sun Microsystems engineer and author of Surveillance or Security? The Risks Posed by New Wiretapping Technologies. 3

• Cookie: A small file, inserted into a computer’s browser by a website, which identifies the browser. Different kinds of cookies perform different functions. They can relieve users of the need to provide the same identification information each time they visit a website. Cookies also can track the browser’s activities across the Internet. A visited website can allow another, unvisited website to place a cookie — which is called a third-party cookie — into a computer without the web surfer’s knowledge.

• Do not track: A request a browser can send to a visited website asking that the site not set a tracking cookie. The site can ignore the request.

• The cloud: Computing and computer storage capabilities that exist beyond the user’s computer, often accessed through the Internet.

• Opt in and opt out: Telling a website that you do or don’t want to do something. Online companies tend to prefer the opt-out system, because the web surfer specifically has to request that a site not set a cookie or track a browser. Privacy advocates tend to prefer the opt-in system.

• Analytics: The process of finding insights in data.
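
The glossary's distinction between metadata and content, and the voluntary nature of a do-not-track request, can be illustrated with a short Python sketch. It is an illustration only: the sample message, the function names `split_metadata` and `wants_no_tracking`, and the choice of which headers count as metadata are assumptions made for this example, not anything described in the report. `DNT: 1` is the request header browsers send when do-not-track is switched on.

```python
from email import message_from_string

# A made-up message used only for illustration.
RAW_MESSAGE = """\
From: alice@example.com
To: bob@example.com
Date: Fri, 25 Oct 2013 09:30:00 -0400
Subject: Lunch

See you at noon.
"""


def split_metadata(raw):
    """Return (metadata, content) for a raw email message.

    Assumption: addresses and time sent are treated as metadata,
    and the message body is treated as content.
    """
    msg = message_from_string(raw)
    metadata = {name: msg[name] for name in ("From", "To", "Date")}
    content = msg.get_payload()
    return metadata, content


def wants_no_tracking(headers):
    """True when the request carried the do-not-track header 'DNT: 1'.

    Simplification: real HTTP header names are case-insensitive;
    a plain dict lookup is used here to keep the sketch short.
    """
    return headers.get("DNT") == "1"


metadata, content = split_metadata(RAW_MESSAGE)
print(metadata)   # who wrote to whom, and when
print(content)    # what was actually said
print(wants_no_tracking({"DNT": "1"}))
```

Nothing in the second function obliges a site to act on the answer. Honoring the request is left to the site, which is the complaint privacy advocates raise about do-not-track.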

— Tom Price

1 Oxford English Dictionary (2013), www.oed.com/view/Entry/18833#eid301162177.
2 Edward Kasner and James Newman, Mathematics and the Imagination (2001), p. 23; Eric Weisstein, “Edward Kasner (1878-1955),” Eric Weisstein’s World of Science, Wolfram Research, http://scienceworld.wolfram.com/biography/Kasner.html.
3 Jane Mayer, “What’s the Matter with Metadata?” The New Yorker, June 6, 2013, www.newyorker.com/online/blogs/newsdesk/2013/06/verizon-nsa-metadata-surveillance-problem.html.


For example, having determined what products pregnant women frequently buy at which stage of pregnancy, the retailer can predict which of its customers are pregnant and send promotions to them precisely when they’d likely be interested in certain products.

The process can have unintended consequences, however, such as when an angry father demanded to know why Target was sending coupons for baby clothes to his teenage daughter, only to learn later that she was expecting. 52

CURRENT SITUATION

Hunting Criminals

After a summer-long public focus on the National Security Agency’s surveillance practices, September brought additional revelations about government spying — this time by domestic law enforcement agencies employing big data strategies, primarily to hunt for criminals.

The New York Times reported that since 2007 AT&T had given federal and local police access to U.S. phone records — including phone numbers, locations of callers and date, time and duration of the calls — dating back to 1987. 53

Under the federally funded program, called the Hemisphere Project, four AT&T employees are embedded in joint federal-local law enforcement offices in Atlanta, Houston and Los Angeles, the newspaper said. They tap into AT&T’s database whenever they receive “administrative subpoenas” from the Drug Enforcement Administration. The data are used to investigate a variety of crimes, however, not just drug cases.

The AT&T database contains records of every call that passes through the company’s switches, including those made by customers of other phone companies. AT&T’s database may be larger than the NSA’s — which stores records of phone numbers, date, time and duration of almost all U.S. calls but deletes them after five years — and it grows by about 4 billion calls a day.

Justice Department spokesman Brian Fallon said the project “simply streamlines the process of serving the subpoena to the phone company so law enforcement can quickly keep up with drug dealers when they switch phone numbers to try to avoid detection.”

Officials said the phone records help them find suspects who use hard-to-trace throwaway cell phones. Daniel Richman, a former federal prosecutor who teaches law at Columbia University, described the project as “a desperate effort by the government to catch up” with advancing cell-phone technology.

September also brought disclosures that the NSA surveillance looked at more Americans’ records than had previously been reported. Documents released in a lawsuit revealed that, from May 2006 to January 2009, the NSA improperly revealed phone numbers, date, time and call duration information about Americans who were not under investigation. The NSA gave the results of database queries to more than 200 analysts from other federal agencies without adequately shielding the identities of the Americans whose records were revealed.

The agency also tracked phone numbers without establishing “reasonable and articulable suspicion” that the numbers were tied to terrorists, as was required by a federal court order that allowed the NSA data collection. 54

‘Reclaim Your Name’

Just as the Justice Department is trying to catch up with criminals using cell-phone technology, regulatory agencies are trying to keep up with the technology employed by industries they oversee. The Federal Trade Commission, for instance, is investigating how data brokers operate. Such companies collect information about people, then sell it to others.

The commission asked nine companies about the kind of information they gather, the sources of the information, what they do with it and whether they allow individuals to view and correct the information or to opt out of having their information collected and sold. 55

In August, FTC Commissioner Julie Brill called on the data broker industry to adopt a “Reclaim Your Name” program, which would enable people to see their information stored in companies’ databases, learn how the information is gathered and used, prevent companies from selling it for marketing purposes and correct errors.

The proposal would not allow individuals to demand that the companies erase the information. Nevertheless, Brill said, it would address “the fundamental challenge to consumer privacy in the online marketplace: our loss of control over our most private and sensitive information.” 56

One data broker is opening the curtain on some of its practices — perhaps to its detriment. Scott Howe, chief executive of Arkansas-based Acxiom Corp., announced in late August that people can visit a new website, aboutthedata.com, to see and even change some of the information Acxiom has collected about them. Site visitors can discover the sources of the data, correct errors, delete some information and even tell the company to stop collecting and storing data about them, though to do so they first must provide Acxiom with personal information, including their address, birth date and last four digits of their Social Security number.

Acxiom collects a wide range of information about some 700 million individuals worldwide, including contact and demographic information, types of products purchased, type and value of residence and motor vehicles and recreational interests. 57 It gathers the


At Issue: Are new laws needed to prevent organizations from collecting online personal data?

YES

JOHN M. SIMPSON
DIRECTOR, PRIVACY PROJECT, CONSUMER WATCHDOG

WRITTEN FOR CQ RESEARCHER, OCTOBER 2013

Privacy laws simply have not kept pace with the digital age and must be updated to protect us as we surf the web or use our mobile devices. People must be able to control what, or even whether, organizations collect data about them online.

Suppose you went to the library and someone followed you around, noting each book you browsed. When you went to a store, they recorded every item you examined and what you purchased. In the brick-and-mortar world, this would be stalking, and an obvious invasion of privacy.

On the Internet such snooping is business as usual, but that is no justification. Nearly everything we do online is tracked, often without our knowledge or consent, and by companies with whom we have no relationship. Digital dossiers may help target advertising, but they can also be used to make assumptions about people in connection with employment, housing, insurance and financial services — and for government surveillance.

Americans increasingly understand the problem. The Pew Research Center released a poll in September that found 86 percent of Internet users have taken steps online to remove or mask their digital footprints. Some 68 percent of respondents said current laws are not good enough to protect our online privacy.

So, what can be done? In February 2012 President Obama issued a report, Consumer Privacy in a Networked World, that called for a Consumer Privacy Bill of Rights that includes the right of consumers “to exercise control over what personal data companies collect from them and how they use it.” He called for legislation to enact those rights. It is long past time to introduce and pass it.

All four major browsers can now send a “do not track” message to sites their users visit. However, companies are not required to comply, and most do not.

Sen. Jay Rockefeller, D-W.Va., has introduced a “do not track” bill that would require compliance. But with Congress mired in partisan gridlock, state-level “do not track” legislation — perhaps enacted through the ballot initiative process in a progressive state such as California — is another option.

Stopping Internet companies from tracking users online will not end online advertising or break the Internet, but it will force advertisers to honor our personal boundaries. A “do not track” mechanism would give consumers better control and help restore everyone’s confidence in the Internet. That’s a win-win for consumers and businesses alike.

NO

JIM HARPER
DIRECTOR, INFORMATION POLICY STUDIES, THE CATO INSTITUTE

WRITTEN FOR CQ RESEARCHER, OCTOBER 2013

There is no question that Internet users should be able to stop organizations from collecting data about them. The question is how.

Most people think that new sets of legal rights or rules are the way to put Internet users in the driver’s seat. Ideas range from mandatory privacy notices, to government regulation of data collection, to a radical new legal regime in which people own all data about them. But these ideas founder when it comes to administration, and their proponents misunderstand what privacy is and how people protect it.

We can learn about online privacy from offline privacy. In the real world, people protect privacy by controlling information about themselves. We have ornate customs and habits around how we dress; when and where we speak; what we say, write or type; the design of houses and public buildings; and much more. All of these things mesh our privacy desires with the availability of information about us. Laws, such as contract and property laws, back up the decisions we make to protect privacy.

Protecting privacy is harder in the online world. Many people do not understand how information moves online. And we don’t know what consequences information sharing will have in the future. What’s more, privacy interests are changing. (Make no mistake: People — even the young — still care about privacy.)

The way to protect privacy online is by using established legal principles. When Internet service providers, websites or email services promise privacy, the law should recognize that as a contract. The law should recognize that a person’s data — contact information in a phone or driving data in a car — belong to that person.

Crucially, the government should respect the contracts and property concepts that are emerging in the online environment. This is not an entirely new idea. In a 1929 Fourth Amendment case, U.S. Supreme Court Justice Pierce Butler weighed in against wiretapping, saying, “The contracts between telephone companies and users contemplate the private use of the facilities employed in the service. The communications belong to the parties between whom they pass.”

When Internet users learn what matters to them and how to protect it, the laws and the government should protect and respect their decisions. This is the best way to let Internet users stop organizations from collecting data about them.


information from public records, directories, Internet activity, questionnaires and from other data-collection companies. Acxiom sells the information to companies, nonprofits and political organizations that use it in marketing, fundraising, customer service, constituent services and outreach efforts.

Howe acknowledged Acxiom is taking a risk by letting people review their information. They could falsify information about themselves, for instance. And, if a significant number asked to be deleted from the database, “it would be devastating for our business,” he said. “But I feel it’s the right thing to do.”

Aboutthedata.com also could lead to new business opportunities, Howe added. In the future, the site might invite visitors to volunteer more information about themselves. And Acxiom could make its database more valuable by allowing visitors to specify what kinds of advertising they’d like to see. 58

Others in the marketing industry disagree. Adopting Brill’s proposals “would lead to more fraud and limit the efficacy of [marketing] companies,” according to Linda Woolley, CEO and president of the Direct Marketing Association, a trade organization for the direct marketing industry. 59

Big data companies also are clashing with web browser producers who make it easier for Internet surfers to keep their information secret and prevent tracking of their online activities. Microsoft now ships Internet Explorer with its do-not-track feature turned on. Mozilla plans to ship new versions of its Firefox browser with a default setting to block third-party cookies, but it’s still working out the details. Internet Explorer, Firefox and other browsers allow users to set other restrictions on cookies. Randall Rothenberg, president and CEO of the Interactive Advertising Bureau (IAB), called Microsoft’s do-not-track decision “a step backwards in consumer choice.” Because it is set by default, he said, it doesn’t represent a consumer’s decision. 60

The digital advertising industry’s code of conduct currently does not require companies to honor do-not-track signals, but the Digital Advertising Alliance established a committee in October to formulate a policy on do-not-track settings by early next year. The alliance is a consortium of six advertising industry trade groups. 61

The IAB also charged that Mozilla’s plan to block third-party cookies would disrupt targeted advertising. “There are billions and billions of dollars and tens of thousands of jobs at stake in this supply chain,” Rothenberg said. 62 Blocking the cookies would be especially damaging to small online publishers that depend on the cookies to sell ads to niche audiences, he added. “These small businesses can’t afford to hire large advertising sales teams [and] can’t afford the time to make individual buys across thousands of websites.” 63

Brendan Eich, Mozilla’s chief technology officer, defended the planned cookie-blocking. “We believe in putting users in control of their online experience, and we want a healthy, thriving web ecosystem, [and] we do not see a contradiction,” he said. 64

Privacy advocates applauded. “This is a really good move for online consumers,” said Jeff Chester, executive director of the Center for Digital Democracy, a consumer-privacy advocacy group in Washington. 65

While spying and advertising by big data companies stir up controversies, many big-data accomplishments go unnoticed by the public. Because of super-speedy computers, enormous databases and powerful sensors, scientists, technicians, business managers and entrepreneurs daily push out the frontiers of science, business and daily life. Some are even pushing past databases — employing computers, sensors and software to collect and analyze data simultaneously.

Maximizing big data’s benefits requires real-time analysis of information streamed from sensors, says Jeff Hawkins, cofounder of Grok, a company that has developed software to do just that. The data explosion is happening in part because of an increase in data sources, he says, noting that “pretty much everything in the world is becoming a sensor.” Sensors are collecting data from buildings, motor vehicles, streets and all kinds of machinery, Hawkins says, and much of it needs to be analyzed immediately to be worthwhile.

“Data coming off a windmill might get (analyzed) every minute,” Hawkins explains. “Server data might be every 10 seconds. Trying to predict pricing in some marketplace or demand for energy use in a building might be hourly or every 12 to 16 minutes.”

Grok never runs into privacy issues because “we don’t save data,” Hawkins says. “We look at data, we act upon it and we throw away the data.”
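
Hawkins's description of looking at each reading, acting on it and discarding it can be sketched as a running calculation that keeps only a few summary numbers. The Python below is a hypothetical illustration built on Welford's running mean and variance method. It is not Grok's software, and the class name `StreamingMonitor`, the sample readings and the three-standard-deviation threshold are all assumptions made for this example.

```python
class StreamingMonitor:
    """Flags unusual sensor readings without storing any of them."""

    def __init__(self, threshold=3.0):
        self.count = 0        # number of readings seen so far
        self.mean = 0.0       # running mean of the readings
        self.m2 = 0.0         # running sum of squared deviations
        self.threshold = threshold

    def observe(self, value):
        """Judge one reading against history, then fold it into the summary."""
        unusual = False
        if self.count >= 2:
            std = (self.m2 / (self.count - 1)) ** 0.5
            unusual = std > 0 and abs(value - self.mean) > self.threshold * std
        # Welford's update: only three numbers change; the reading is dropped.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return unusual


monitor = StreamingMonitor()
for reading in [10.0, 10.2, 9.8, 10.1, 9.9, 50.0]:
    if monitor.observe(reading):
        print("unusual reading:", reading)
```

The monitor holds a count, a mean and a spread figure, so no individual reading can be recovered from it later. That is the property behind the "we throw away the data" claim in this simplified form.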

OUTLOOK

‘Internet of Things’

People familiar with big data agree that the amount of information processed and the speed and sophistication with which it can be analyzed will increase exponentially. Huge amounts of data will be generated by what many are calling the “Internet of Things” — the online linking of sensors installed on more and more inanimate objects.

“There are going to be billions of sensors,” Grok’s Hawkins says, “maybe hundreds of billions. We are going to see the world become more efficient, more reliable. In the future, your refrigerator should be able to say: I should precool myself by about 3 degrees so, when the price of electricity goes up in the afternoon, I don’t have to run.”

Google Chairman Eric Schmidt predicts that everyone on Earth will be online by the end of the decade, but he concedes that won’t be all good.


In a book written with Jared Cohen, director of Google Ideas, the company’s think tank, Schmidt acknowledges that threats to privacy and reputation will become stronger. 66

Noting the old advice to not write down anything you don’t want to read on a newspaper’s front page, Schmidt and Cohen broaden it to “the websites you visit, who you include in your online network, what you ‘like,’ and what others who are connected to you say and share.” They foresee privacy classes joining sex-education classes in schools and parents sitting children down for “the privacy-and-security talk even before the sex talk.” 67

The Cato Institute’s Harper expects the collection of personal data to increase indefinitely, but “there still remain huge amounts of information that people keep to themselves, and that’s not going to change. From every thought that crossed your mind at breakfast to your reasons for watching the television programs you did tonight — all that personal information that’s part of what gives your life meaning — nobody knows that unless you tell them.”

Hopkins from Millsaps College says, “it will become easier and cheaper to find information out about people in ways that our laws and ancestors never conceived of,” noting the threat to privacy that would occur if nefarious individuals [such as peeping Toms or thieves “casing a joint”] could own drones. 68

“Courts and public policymakers are going to have to decide if we’re going to try to limit technology to things you could have done in 1950 or 1750, or are we going to have to give up the notion that we could have the same kinds of privacy expectations that we had a century ago,” Hopkins says. People probably will accept less privacy, he says, “because of the incredible ease of use and personal benefits” that come from sharing the data. “I suspect we’ll just get used to it.”

Zaneis, of the Interactive Advertising Bureau, also doesn’t expect tighter data controls, because government can’t keep up with the speed of technological advances. “There’s going to be evolution and change at such a rapid pace that legislators and regulators can’t keep pace,” he says, contending that industry self-regulation will fill the gap.

Consumer Watchdog’s Simpson concedes that privacy advocates won’t get everything they want. “But I think we’re going to see some serious protections put in place,” he says, because the NSA revelations have generated “tremendous pushback” against privacy invasion.

But Polonetsky, of the Future of Privacy Forum, says, “Technology probably is going to solve this before policymakers do,” citing how browser manufacturers are developing privacy protections for their users.

Similarly, Mark Little, the principal consumer analyst for the Ovum consulting firm, warns big data companies that they could run into “hurricane-force disruptions” of their data collection. “Marketers should not be surprised if more and more consumers look to alternative privacy ecosystems to control, secure and even benefit from their own data,” he said. 69

Notes

1 For background, see Tom Price, “Globalizing Science,” CQ Global Researcher, Feb. 1, 2011, pp. 53-78.
2 “Computing,” European Organization for Nuclear Research, http://home.web.cern.ch/about/computing.
3 For background, see Chuck McCutcheon, “Government Surveillance,” CQ Researcher, Aug. 30, 2013, pp. 717-740.
4 “Bill Summary & Status, 113th Congress (2013-2014), S.418, All Congressional Actions,” Library of Congress, http://thomas.loc.gov/cgi-bin/bdquery/D?d113:1:./temp/~bd9qwL:@@@X|/home/LegislativeData.php.
5 Viktor Mayer-Schonberger and Kenneth Cukier, Big Data: A Revolution that Will Transform How We Live, Work, and Think (2013); James Risen and Eric Lichtblau, “How the U.S. Uses Technology to Mine More Data More Quickly,” The New York Times, June 9, 2013, p. 1, www.nytimes.com/2013/06/09/us/revelations-give-look-at-spy-agencys-wider-reach.html?_r=0.
6 Risen and Lichtblau, ibid.
7 Ross Andersen, “How Big Data Is Changing Astronomy (Again),” The Atlantic, April 19, 2012, www.theatlantic.com/technology/archive/2012/04/how-big-data-is-changing-astronomy-again/255917.
8 Juliette Garside, “Google does not have to delete sensitive information, says European court,” The Guardian, June 25, 2013, www.guardian.co.uk/technology/2013/jun/25/google-not-delete-sensitive-information-court; Katherine Jacobsen, “Should Google be accountable for what its search engine unearths?” The Christian Science Monitor, June 25, 2013, www.csmonitor.com/Innovation/2013/0625/Should-Google-be-accountable-for-what-its-search-engine-unearths.
9 “New Poll Shows Americans Anxious About Privacy,” Allstate/National Journal/The Atlantic/Heartland Monitor Poll, June 13, 2013, www.magnetmail.net/actions/email_web_version.cfm?recipient_id=615749555&message_id=2731423&user_id=NJG_NJmED&group_id=526964&jobid=14264751.
10 Farah Stockman, “Big Data’s big deal,” The Boston Globe, June 18, 2013, p. 11, www.bostonglobe.com/opinion/2013/06/18/after-snowden-defense-big-data/7zH3HKrXm4o3L1HIm7cfSK/story.html.
11 “New Poll Shows Americans Anxious About Privacy,” op. cit.
12 Vignesh Ramachandran, “Social Media Project Monitors Keywords to Prevent Suicide,” Mashable, Aug. 20, 2013, http://mashable.com/2013/08/20/durkheim-project-social-media-suicide.
13 Dina ElBoghdady, “Wall Street regulators turn to better technology to monitor markets,” The Washington Post, July 12, 2013, www.washingtonpost.com/business/economy/wall-street-regulators-turn-to-better-technology-to-monitor-markets/2013/07/12/ab188abc-eb25-11e2-aa9f-c03a72e2d342_story.html.
14 Bruce Schneier, “The Internet Is a Surveillance State,” CNN, March 16, 2013, www.schneier.com/essay-418.html.
15 Max Fisher, “Here’s the e-mail trick Petraeus and Broadwell used to communicate,” The Washington Post, Nov. 12, 2012, www.washingtonpost.com/blogs/worldviews/wp/2012/11/12/heres-the-e-mail-trick-petraeus-and-broadwell-used-to-communicate.
16 Schneier, op. cit.
17 Ibid.
18 Risen and Lichtblau, op. cit.
19 Niraj Chokshi and Matt Berman, “The NSA Doesn’t Need Much Phone Data to Know You’re You,” National Journal, June 6, 2013, www.nationaljournal.com/nationalsecurity/the-nsa-doesn-t-need-much-phone-data-to-know-you-re-you-20130605.
20 Gerry Shih, “Facebook admits year-long data breach exposed 6 million users,” Reuters, June 21, 2013, www.reuters.com/article/2013/06/21/net-us-facebook-security-idUSBRE95K18Y20130621; Violet Blue, “Firm: Facebook’s shadow profiles are ‘frightening’ dossiers on everyone,” ZDNet, June 24, 2013, www.zdnet.com/firm-facebooks-shadow-fprofiles-are-frightening-dossiers-on-everyone-7000017199.
21 Emily Steel and Geoffrey A. Fowler, “Facebook in Privacy Breach,” The Wall Street Journal, Oct. 17, 2010, http://online.wsj.com/article/SB10001424052702304772804575558484075236968.html.
22 “New Poll Shows Americans Anxious About Privacy,” op. cit.
23 Mark Little, “ ‘Little data’: Big Data’s new battleground,” Ovum, Jan. 29, 2013, http://ovum.com/2013/01/29/little-data-big-datas-new-battleground.
24 For background, see Robert Kiener, “Future of the Public Universities,” CQ Researcher, Jan. 18, 2013, pp. 53-80.
25 For background, see “Declaration of the Rights of Man and of the Citizens,” “Lectures in Modern European Intellectual History,” The History Guide, www.historyguide.org/intellect/declaration.html.
26 Bob Sullivan, “ ‘La difference’ is stark in EU, U.S. privacy laws,” MSNBC, Oct. 19, 2006, www.nbcnews.com/id/15221111/ns/technology_and_science-privacy_lost/t/la-difference-stark-eu-us-privacy-laws/#.UlxnkhDhDpc.
27 Thor Olavsrud, “EU data protection regulation and cookie law — Are you ready?” ComputerworldUK, May 24, 2012, www.computerworlduk.com/in-depth/security/3359574/eu-data-protection-regulation-and-cookie-law--are-you-ready.
28 Schneier, op. cit.
29 Natasha Singer, “A Data Broker Offers a Peek Behind the Curtain,” The New York Times, Aug. 31, 2013, www.nytimes.com/2013/09/01/business/a-data-broker-offers-a-peek-behind-the-curtain.html?adxnnl=1&ref=natashasinger&adxnnlx=1378241984-ftuN93sEdSPgKXyXIEdZZw.
30 “Section-by-Section Summary of the Do-Not-Track Online Act of 2013,” U.S. Senate Committee on Commerce, Science and Transportation, www.commerce.senate.gov/public/?a=files.Serve&file_id=95e2bf27-aa9b-4023-a99c-766ad6e4cf12.
31 “Rockefeller Introduces Do-Not-Track Bill to Protect Consumers Online,” Office of Sen. Jay Rockefeller, March 1, 2013, www.rockefeller.senate.gov/public/index.cfm/press-releases?ID=deb2f396-104e-46a0-b206-8c3d47e2eed3.
32 “New Poll Shows Americans Anxious About Privacy,” op. cit.
33 Except when noted otherwise, information for this historical section was drawn from Uri Friedman, “Big Data: A Short History,” Foreign Policy, November 2012, www.foreignpolicy.com/articles/2012/10/08/big_data; Gil Press, “A Very Short History of Big Data,” Forbes, May 9, 2013, www.forbes.com/sites/gilpress/2013/05/09/a-very-short-history-of-big-data; Ann Blair, “Information overload, the early years,” The Boston Globe, Nov. 28, 2010, www.boston.com/bostonglobe/ideas/articles/2010/11/28/information_overload_the_early_years/?page=1.


About the Author

Tom Price, a Washington-based freelance journalist and a contributing writer for CQ Researcher, focuses on the public affairs aspects of science, technology, education and business. Previously, Price was a correspondent in the Cox Newspapers Washington Bureau and chief politics writer for the Dayton Daily News and The (Dayton) Journal Herald. He is author or coauthor of five books, including Changing The Face of Hunger and, most recently, Washington, DC, Free & Dirt Cheap with his wife Susan Crites Price.


34 Ecclesiastes 12:12, Holy Bible: King James Version, University of Michigan Library Digital Collections, http://quod.lib.umich.edu/cgi/k/kjv/kjv-idx?type=DIv1&byte=2546945.
35 Blair, op. cit.
36 Ibid.
37 Press, op. cit.
38 Ibid.
39 Michael Lesk, “How Much Information Is There In the World?” 1997, www.lesk.com/mlesk/ksg97/ksg.html.
40 Mayer-Schonberger and Cukier, op. cit., p. 22.
41 Jessica Van Sack, “Nothing to fear now, but soon . . .” The Boston Herald, June 11, 2013, p. 2.
42 Oxford English Dictionary (2013), www.oed.com/view/Entry/18833#eid301162177. Gil Press, “Big Data News: A Revolution Indeed,” Forbes, June 18, 2013, www.forbes.com/sites/gilpress/2013/06/18/big-data-news-a-revolution-indeed.
43 For background, see Marcia Clemmitt, “Internet Regulation,” CQ Researcher, April 13, 2012, pp. 325-348.
44 Tom Price, “Social Media and Politics,” CQ Researcher, Oct. 12, 2012, p. 922.
45 For background, see Tom Price, “Social Media and Politics,” CQ Researcher, Oct. 12, 2012, pp. 865-888.
46 For background, see Marcia Clemmitt, “Social Media Explosion,” CQ Researcher, Jan. 25, 2013, pp. 81-104.
47 “Our history in depth,” Google, www.google.com/about/company/history. For background, see David Hatch, “Google’s Dominance,” CQ Researcher, Nov. 11, 2011, pp. 953-976.
48 “Fortune 500, 2013,” CNN Money, http://money.cnn.com/magazines/fortune/fortune500/2013/full_list/index.html?iid=f500_sp_full.
49 Brian Fung, “Google Street View’s covert collection of consumers’ Wi-Fi data in 2010,” National Journal, June 21, 2013, www.nationaljournal.com/tech/you-know-who-else-inadvertently-gathered-your-electronic-data-20130621.
50 Phil Rosenthal, “Deeper into data mine we go, privacy plundered,” Chicago Tribune, Aug. 12, 2012, p. 1, http://articles.chicagotribune.com/2012-08-12/business/ct-biz-0812-phil-privacy--20120812_1_apple-s-safari-google-privacy.
51 Constance L. Hays, “What Wal-Mart Knows About Customers’ Habits,” The New York Times, Nov. 14, 2004, www.nytimes.com/2004/11/14/business/yourmoney/14wal.html?_r=1&pagewanted=all&position=.
52 Mayer-Schonberger and Cukier, op. cit., p. 57.
53 Scott Shane and Colin Moynihan, “Drug Agents Use Vast Phone Trove, Eclipsing N.S.A.’s,” The New York Times, Sept. 1, 2013, www.nytimes.com/2013/09/02/us/drug-agents-use-vast-phone-trove-eclipsing-nsas.html?_r=1&.
54 Ellen Nakashima, Julie Tate and Carol Leonnig, “NSA broke privacy rules for 3 years, documents say,” The Washington Post, Sept. 11, 2013, p. 1.
55 “FTC to Study Data Broker Industry’s Collection and Use of Consumer Data,” Federal Trade Commission, Dec. 18, 2012, www.ftc.gov/opa/2012/12/databrokers.shtm.
56 Julie Brill, “Demanding transparency from data brokers,” The Washington Post, Aug. 15, 2013, www.washingtonpost.com/opinions/demanding-transparency-from-data-brokers/2013/08/15/00609680-0382-11e3-9259-e2aafe5a5f84_print.html.
57 Adam Tanner, “Finally You’ll Get To See The Secret Consumer Dossier They Have On You,” Forbes, June 25, 2013, www.forbes.com/sites/adamtanner/2013/06/25/finally-youll-get-to-see-the-secret-consumer-dossier-they-have-on-you/.
58 Singer, op. cit.
59 Linda Woolley, letter to FTC Commissioner Julie Brill, Aug. 19, 2013, http://blog.thedma.org/2013/08/19/dma-responds-to-op-ed-attacking-commercial-data-use.
60 “ ‘Do Not Track’ Set to ‘On’ By Default in Internet Explorer 10 — IAB Response,” Interactive Advertising Bureau, May 31, 2012, www.iab.net/InternetExplorer.
61 Al Urbanski, “New Do Not Track Group Makes Progress in San Francisco,” Direct Marketing News, Oct. 11, 2013, www.dmnews.com/new-do-not-track-group-makes-progress-in-san-francisco/article/315995.
62 Craig Timberg, “Firefox moves forward with tracking blocker,” The Washington Post, June 20, 2013, p. 14.
63 “IAB Accuses Mozilla of Undermining American Small Business & Consumers’ Control of Their Privacy with Proposed Changes to Firefox,” Interactive Advertising Bureau, March 12, 2013, www.iab.net/about_the_iab/recent_press_releases/press_release_archive/press_release/pr-031213?gko=2dce8.
64 Brendan Eich, “C is for Cookie,” Mozilla, May 16, 2013, https://brendaneich.com/2013/05/c-is-for-cookie.
65 Timberg, op. cit.
66 Eric Schmidt and Jared Cohen, The New Digital Age: Reshaping the Future of People, Nations and Business (2013).
67 Doug Gross, “Google chairman: 6 predictions for our digital future,” CNN, April 23, 2013, www.cnn.com/2013/04/23/tech/web/eric-schmidt-google-book.
68 For background, see Daniel McGlynn, “Domestic Drones,” CQ Researcher, Oct. 18, 2013, pp. 885-908.
69 Little, op. cit.

FOR MORE INFORMATIONBerkman Center for Internet and Society at Harvard University, 23 Everett St.,2nd floor, Cambridge, mA 02138; 617-495-7547; www.cyber.law.harvard.edu. Researchcenter that studies aspects of the Internet, including commerce, governance, edu-cation, law, privacy, intellectual property and antitrust issues.

Consumer Watchdog, 2701 Ocean Park Blvd., Suite 112, Santa monica, CA 90405;310-392-0522; www.consumerwatchdog.org. National consumer advocacy organizationthat promotes privacy of personal data.

Future of Privacy Forum, 919 18th St., N.w., Suite 901, washington, DC 20006;202-642-9142; www.futureofprivacy.org. Think tank that convenes discussions,publishes papers and advocates policies to protect privacy of data on and offline.

Institute for Ethics and Emerging Technologies, Williams 119, Trinity College, 300 Summit St., Hartford, CT 06106; 860-297-2376; www.ieet.org. Think tank that studies and debates ethical implications of new technologies.

Interactive Advertising Bureau, 116 East 27th St., 7th floor, New York, NY 10016; 212-380-4700; www.iab.net. Trade association for companies that sell online advertising; promulgates an industry code of conduct.

Pew Internet & American Life Project, 1615 L St., N.W., Suite 700, Washington, DC 20036; 202-419-4500; www.PewInternet.org. Research center that conducts frequent public opinion surveys as it studies how the Internet affects Americans.



Bibliography

Selected Sources

Books

Jarvis, Jeff, Public Parts, Simon & Schuster, 2011.
A journalism professor who writes frequently about big data and the Internet makes the case for sharing, which he calls “publicness,” and putting aside certain notions of privacy.

Lanier, Jaron, Who Owns the Future? Simon & Schuster, 2013.
A Microsoft Research computer scientist argues that people should be paid for the information they share online and off.

Mayer-Schonberger, Viktor, and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think, Houghton Mifflin Harcourt, 2013.
An Oxford professor (Mayer-Schonberger) and The Economist’s data editor examine big data’s impact on the world, highlighting both its opportunities and threats.

Morozov, Evgeny, To Save Everything, Click Here: The Folly of Technological Solutionism, Public Affairs, 2013.
A self-described “digital heretic” challenges the utopian visions of “digital evangelists” and warns that many fruits of cutting-edge technology come with high costs.

Smolan, Rick, and Jennifer Erwitt, eds., The Human Face of Big Data, Against All Odds Productions, 2012.
Combining photographs, illustrations and essays, this remarkable coffee-table book shows big data at work in the real world, from obtaining new insights into health care costs to compiling detailed statistics from every Major League Baseball pitch.

Articles

Blair, Ann, “Information overload, the early years,” The Boston Globe, Nov. 28, 2010, www.boston.com/bostonglobe/ideas/articles/2010/11/28/information_overload_the_early_years/?page=1.
A Harvard historian explains how since biblical times humankind has felt oppressed by too much data, then learned to cope.

Egan, Erin, “Proposed Updates to our Governing Documents,” Facebook, Aug. 29, 2013, www.facebook.com/notes/facebook-site-governance/proposed-updates-to-our-governing-documents/10153167395945301.
Facebook officials announce plans for a revised privacy policy that says users automatically permit their posts and photos to be used in ads. Thousands of users express outrage.

Scola, Nancy, “Obama, the ‘big data’ president,” The Washington Post, June 14, 2013, www.washingtonpost.com/opinions/obama-the-big-data-president/2013/06/14/1d71fe2e-d391-11e2-b05f-3ea3f0e7bb5a_story.html.
A journalist who covers technology and politics documents President Barack Obama’s embrace of big data in both his campaign and his administration.

Stahl, Lesley, “A Face in the Crowd: Say goodbye to anonymity,” CBS News, May 19, 2013, www.cbsnews.com/2102-18560_162-57599773.html.
In a report on facial-recognition surveillance, the veteran CBS correspondent walks into a store, is recognized by a surveillance camera and, moments later, her cell phone receives a text with a special deal offered by the store, based on her shopping history and Facebook “likes.”

Reports and Studies

“Big Data, Big Impact: New Possibilities for International Development,” World Economic Forum, 2012, www3.weforum.org/docs/WEF_TC_MFS_BigDataBigImpact_Briefing_2012.pdf.
An organization best known for annually convening international leaders in Davos, Switzerland, explores how analyzing big data could predict crises, identify needs and help devise ways to aid the poor.

“Kenneth Cukier and Michael Flowers on ‘Big Data,’ ” Foreign Affairs Media Conference, Council on Foreign Relations, May 9, 2013, www.cfr.org/health-science-and-technology/foreign-affairs-media-conference-call-kenneth-cukier-michael-flowers-big-data/p30695.
The Economist’s data editor (Cukier) and New York City’s chief analytics officer discuss how the city uses big data to improve services, as well as some downsides to the massive collection and analysis of information.

“Self-Regulatory Principles for Online Behavioral Advertising,” American Association of Advertising Agencies, American Advertising Federation, Association of National Advertisers, Direct Marketing Association, Interactive Advertising Bureau, July 2009, www.aboutads.info/resource/download/seven-principles-07-01-09.pdf.
Five advertising industry trade associations join together to issue “consumer-friendly standards” for collecting information online and using it to target advertising to individuals. The organizations said they would report uncorrected violations to government agencies and the public.

Rainie, Lee, Sara Kiesler, Ruogu Kang and Mary Madden, “Anonymity, Privacy, and Security Online,” Pew Research Center, Sept. 5, 2013, www.pewinternet.org/Reports/2013/Anonymity-online.aspx.
A survey by the nonpartisan think tank finds that most Americans want the ability to be anonymous online, but many don’t think it’s possible.


Business

Arthur, Lisa, “What is Big Data?” Forbes.com, Aug. 15, 2013, www.forbes.com/sites/lisaarthur/2013/08/15/what-is-big-data/.
Businesses often lack consensus in defining big data, but they need to understand it to realize its potential.

Gurski, Laura, “Big Data Not Living Up To Its Promise? Change the Way You Work,” Bloomberg Business Week, Aug. 21, 2013, www.businessweek.com/articles/2013-08-21/big-data-not-living-up-to-its-promise-change-the-way-you-work.
Companies could benefit from work flows that help them integrate insights from data analysis into their decision-making.

Hicken, Melanie, “Find out what Big Data knows about you (it may be very wrong),” CNN Money, Sept. 5, 2013, http://money.cnn.com/2013/09/05/pf/acxiom-consumer-data/index.html.
One of the country’s largest data brokers unveiled a website that allows consumers to see how much information is gathered about them — and how much of it is inaccurate.

Ravindranath, Mohana, “Brooks Brothers, national retailers analyze ‘big data’ from sales to adjust marketing,” The Washington Post, Sept. 22, 2013, http://articles.washingtonpost.com/2013-09-22/business/42299416_1_brooks-brothers-analytics-sales-data.
Retailers are turning to data analysis, performed either by outside firms or internally, to fine-tune their marketing.

Economic Impact

Glanz, James, “Is Big Data an Economic Big Dud?” The New York Times, Aug. 17, 2013, www.nytimes.com/2013/08/18/sunday-review/is-big-data-an-economic-big-dud.html?pagewanted=all.
Some economists question whether big data will have the economic impact that the data industry predicted.

Yerak, Becky, “In growing field of big data, jobs go unfilled,” The Chicago Tribune, Aug. 25, 2013, http://articles.chicagotribune.com/2013-08-25/business/ct-biz-0825-big-data-20130825_1_big-data-data-sources-data-analysts.
The demand for professionals with quantitative skills to support big data is outpacing the supply of workers.

Metadata

Cukier, Kenneth Neil, and Viktor Mayer-Schoenberger, “The Rise of Big Data: How It’s Changing the Way We Think About the World,” Foreign Affairs, May/June 2013, www.foreignaffairs.com/articles/139104/kenneth-neil-cukier-and-viktor-mayer-schoenberger/the-rise-of-big-data.
The authors detail the ways in which big data is a transformative technological trend.

White, Martha C., “Why It’s Nearly Impossible To Stay Off Big Data’s Radar,” Time, Aug. 7, 2013, http://business.time.com/2013/08/07/why-its-nearly-impossible-to-stay-off-big-datas-radar/.
Private data collectors and aggregators constantly are tracking consumers to create digital dossiers that will help them sell products and services.

Education

Fletcher, Seth, “How Big Data Is Taking Teachers Out of the Lecturing Business,” Scientific American, July 29, 2013, www.scientificamerican.com/article.cfm?id=how-big-data-taking-teachers-out-lecturing-business.
The education system increasingly is incorporating data-driven, personalized computer learning into curricula.

Guthrie, Doug, “The Coming Big Data Education Revolution,” U.S. News and World Report, Aug. 15, 2013, www.usnews.com/opinion/articles/2013/08/15/why-big-data-not-moocs-will-revolutionize-education.
Big data, more than MOOCs (massive open online courses), promises to transform online learning and higher education, says a columnist.

Schleicher, Andreas, “Big Data and PISA,” The Huffington Post, July 22, 2013, www.huffingtonpost.com/andreas-schleicher/big-data-and-pisa_b_3633558.html.
Big data gleaned from a global survey is helping schools see how they compare to other schools around the world.

The Next Step: Additional Articles from Current Periodicals

CITING CQ RESEARCHER

Sample formats for citing these reports in a bibliography include the ones listed below. Preferred styles and formats vary, so please check with your instructor or professor.

MLA STYLE
Jost, Kenneth. “Remembering 9/11.” CQ Researcher 2 Sept. 2011: 701-732.

APA STYLE
Jost, K. (2011, September 2). Remembering 9/11. CQ Researcher, 9, 701-732.

CHICAGO STYLE
Jost, Kenneth. “Remembering 9/11.” CQ Researcher, September 2, 2011, 701-732.
