The Digital Deluge - Systems and Computer Engineering

The Digital Deluge
Lecture 5
Learning in Retirement
David Coll
Professor Emeritus
Department of Systems and Computer
Engineering
Winter 2009
The Youth Ball Welcomes Obama with a Sea of Digital Cameras
January 22, 2009
The World is Aware
• International Herald Tribune, May 30, 2005
– Facing a digital world
– Famous makers from yesteryear run to catch up
• The New York Times, July 28, 1990
– CONSUMER'S WORLD: Coping With Digital Audio
Tape
• Forbes.com: Tales From The Marketing Wars
Coping In A Changing World, March 16, 2007
– A recent article caught my attention.
– It was entitled "Kodak takes Hit in Film and Digital."
– It discussed, in some detail, how analysts had begun
to question just what future the firm that has been
synonymous with film and pictures might have in the
consumer photography market.
• Justin Thorpe’s Web 2.0 BlogTheater: “This Digital Life:
Basic Instructions for Coping with the 21st Century”, July 21,
2007
• “I wasn’t really sure what to expect when my
friend Joseph Price invited me to a play he wrote,
“This Digital Life: Basic Instructions for Coping
with the 21st Century.” I had a hard time imagining
a play about technology.
• The description is as follows:
• Sometimes, late at night, do you Google yourself?
Have you ever sent yourself an email from the
future? Three short plays explore life, death, and
infamy in the age of Second Life and Wikipedia.”
• Praning’s Shoutrout, September 29, 2008
• I am still learning how to do things in digital way; like
digital scrapping and slide shows.
• Good thing, I found this site, Roxio Extreme Digital
Makeover, which is primarily set up to help people,
like me, ease the burden in making digital
presentations, which can be time consuming for
beginners.
• I watched a video, the Wedding Day Crunch, to see
a sample of their work and it’s really amazing how
they made simple photographs of a couple, who’s
planning to get married soon, turn into an interesting
slide show that the couple will present on their
wedding day.
Coping when everything is digital? Digital Documents
and Issues in Document Retention, 2004 - 68-page “white
paper”
• Will you cope when everything is digital?
• Recent legal and business developments mean
renewed attention is being directed into
corporate computer systems in Australia, the
US and around the world. Questions such as
the following are becoming common:
– How can digital documents be used, and when
can they be destroyed?
– What happens if you ignore them after they are
no longer ’useful’?
– Will you be able to rely on them when you need
them?
– In a court case, could you prove they mean what
they say?
AIIM Association for Information and Image Management.
• AIIM is the ECM community that provides
education, research, and best practices to
help organizations find, control, and
optimize their information.
• Few in the information industries would
contest the forehead-slappingly obvious
observation that the operational tempo of
work has picked up.
• Nor would anyone contest the statement
that we are now expected to observe,
orient, decide, and act upon a global, 24-7,
non-stop flow of information.
• It is not “news” that Society demands more
information and more information
processing.
• Like it or not, we live in a world where real-time analysis and response is expected.
This is the hyper-connected “infocosm.”
• What do we do about all this “always-onness”?
• Evolutionary psychologists, cognitive
scientists, and no less a behavioral
authority than our mothers tell us that we
humans were not designed to run round-the-clock.
• How you choose to cope with the design
disconnect between the seasonal and
diurnal savannah which shaped us and the
ubiquitously connected, Internet-powered,
wi-fi-tethered infocosm which currently
surrounds us will impact your career and
your health.
The Digital Delusionals
• These folks actually think they can use hand-held
devices to keep up.
• IDC defines a “hyperconnected” user as someone
who
– uses at least seven communication devices (landline
phone, cell phone, PC, etc.) and nine communication
applications (IM, Web conferencing, social networks,
etc.).
– Some 16 percent of workers surveyed fall into IDC’s
definition of hyper-connected. Within five years, IDC
predicts that 40 percent of workers will be hyperconnected.
• This group should not be confused with the “thumb
tribe” of young Japanese whose social life revolves
around and identity is defined by their use of
technology.
The WebEmersonians.
• When asked to sum up his work, Ralph Waldo
Emerson said his central doctrine was “the
infinitude of the private man.”
• Admirable in their intentions, this self-reliant but
tragically wrong-headed group of hard workers
believes they can keep up assisted by Web
resources.
• These are the folks who populate the landing
pages of the very sizable “how to” category of the
Web.
• They believe they can, via brute force, sleep
deprivation, and a mastery of search, learn what
they need to know and do what they need to do.
The Delegationals
• They know that there is way too much work and
information for any one human to manage.
• It takes a village to survive the requirements of the
infocosm. Such “villages” typically comprise the
headman, a direct report or two, several
contractors, and a relatively new arrival to the
workspace - the virtual assistant.
• These resources - accessible via the Web - bid to
do your bidding.
• You might connect to
– DoMyStuff.com, TasksEveryday.com,
VirtualAssistants.com
– International Virtual Assistants Association,
Virtual Market Support, and/or Executive
Secretarial Services.
The Cyber-Sailors
• These folks are defined by acceptance,
believing that in today’s world, as
individuals, we are largely powerless.
• These folks are convinced that the outside
world will continue to turn without their
constant attention.
• According to one blogger, the mantra of the
cyber-sailors is: “We accept that we can’t
change the wind but can adjust the sails.”
The Boundarials.
• This group aggressively manages “work-life
balance”, refusing to neglect other important
areas of their lives such as family, friends,
and hobbies in favor of work-related chores
and goals.
• They pre-announce their “rules-of-workplace-engagement” to colleagues and
bosses.
The Neo-Utopians.
• William Gibson, author of Neuromancer,
once commented
• “One of the things our grandchildren will
find quaintest about us is that we
distinguish the digital from the real, the
virtual from the real.
• In the future, that will become literally
impossible. The distinction between
cyberspace and that which isn’t cyberspace
is going to be unimaginable.”
• “Neo-Utopians … believe that the current
disconnect between human design specs
and the world of work we live in today
– which was to a very large extent precipitated by
the rapid evolution of technology [as opposed to
the slower evolution of humanity],
• will right itself through several mechanisms
– massive automation of time-consuming low-end
tasks
– intensified focus on strategic tasks [Things That
Matter]
– extended life spans [we will have more time to
do what we need to do]
– and personal robots.
http://billstarnaud.blogspot.com/ or http://greenbroadband.blogspot.com/
• How Web 2.0 tools are transforming
science.
• The 2 projects mentioned have been
funded by CANARIE in the latest NEP
program amongst a total of 11 similar
projects.
• For more examples of how web 2.0 is
revolutionizing science please see my
Citizen Science Blog.
Oceans 2.0
• Described as an extension of the internet
under the ocean, the Venus Coastal
Observatory off Canada's west coast
provides oceanographers with a continuous
stream of undersea data once accessible
only through costly marine expeditions.
When its sister facility Neptune Canada
launches next summer, the observatories'
eight nodes will provide ocean scientists
with an unprecedented wealth of
information.
• Sifting through all that data, however, can
be quite a task. So the observatories, with
the help of CANARIE Inc., operator of
Canada's advanced research network, are
developing a set of tools they call Oceans
2.0 to simplify access to the data and help
researchers work with it in new ways. Some
of their ideas look a lot like such popular
consumer websites as Facebook, Flickr,
Wikipedia and Digg.
• And they're not alone. This set of online
interaction technologies called Web 2.0 is
finding its way into the scientific community.
• Michael Nielsen, a Waterloo, Ont., physicist
who is working on a book on the future of
science, says online tools could change
science to an extent that hasn't happened
since the late 17th century, when scientists
started publishing their research in scientific
journals.
• One way to manage the data boom will
involve tagging data, much as users of
websites like Flickr tag images or readers of
blogs and web pages can "Digg" articles
they approve.
• On Oceans 2.0, researchers might attach
tags to images or video streams from
undersea cameras, identifying sightings of
little-known organisms or examples of rare
phenomena.
• The Canadian Space Science Data Portal
(CSSDP), based at the University of
Alberta, is also working on online
collaboration tools. Robert Rankin, a
University of Alberta physics professor and
CSSDP principal investigator, foresees
scientists attaching tags to specific data
items containing occurrences of a particular
process or phenomenon in which
researchers are interested.
• "You've essentially got a database that has
been developed using this tagging
process," he says.
• If data tagging is analogous to Flickr or
Digg, other initiatives look a bit like
Facebook.
• Pirenne envisions Oceans 2.0 including a
Facebook-like social networking site where
researchers could create profiles showing
what sort of work they do and what
expertise they have.
• When a scientist is working on a project and
needs specific expertise — experience in
data mining and statistical analysis of
oceanographic data, for example — he or
she could turn to this facility to find likely
collaborators.
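The tagging idea described above can be sketched in a few lines. This is only an illustration, not the Oceans 2.0 or CSSDP software: the `TagIndex` class, the item IDs, and the tag names are all made up for the example. It shows how attaching tags to data items builds up, as a side effect, a small database that can be searched by tag.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical index mapping tags to the data items that carry them."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)
        self._tags_by_item = defaultdict(set)

    def tag(self, item_id, *tags):
        """Attach one or more tags to a data item (e.g. a video clip ID)."""
        for t in tags:
            key = t.strip().lower()  # normalise so "Squid" and "squid" match
            self._items_by_tag[key].add(item_id)
            self._tags_by_item[item_id].add(key)

    def find(self, *tags):
        """Return the sorted IDs of items that carry every tag given."""
        keys = [t.strip().lower() for t in tags]
        if not keys:
            return []
        matches = set.intersection(
            *(self._items_by_tag.get(k, set()) for k in keys)
        )
        return sorted(matches)


# Made-up item IDs and tags, for illustration only.
index = TagIndex()
index.tag("video-0017", "squid", "night")
index.tag("video-0042", "Squid", "hydrothermal vent")
index.tag("image-0003", "hydrothermal vent")

squid_items = index.find("squid")
vent_squid_items = index.find("squid", "hydrothermal vent")
print(squid_items)
print(vent_squid_items)
```

Searching on several tags at once narrows the result to items carrying all of them, which is the sense in which the tags themselves become the database.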
Coping
• There are a number of ways of coping
with the Digital Deluge
• Increasing your information processing
capacity
– Acquiring
– Processing
– Storage
– Communicating
Coping
• Decreasing the actual amount of
information that must be stored.
• More powerful compression algorithms
• Avoiding unnecessary replication
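Both ideas on this slide can be tried with Python's standard library. The sketch below is an assumption-laden toy, not a storage product: the `DedupStore` class and the file names are invented here. It keeps one compressed copy of each distinct payload (identified by its SHA-256 digest), so a replica costs nothing extra to store.

```python
import hashlib
import zlib


class DedupStore:
    """Hypothetical store: each distinct payload is kept once, compressed."""

    def __init__(self):
        self._blobs = {}   # digest -> compressed bytes
        self._names = {}   # name -> digest

    def put(self, name, payload: bytes):
        digest = hashlib.sha256(payload).hexdigest()
        if digest not in self._blobs:          # skip replicas
            self._blobs[digest] = zlib.compress(payload, 9)
        self._names[name] = digest

    def get(self, name) -> bytes:
        return zlib.decompress(self._blobs[self._names[name]])

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self._blobs.values())


# Made-up documents, for illustration only.
report = b"Quarterly sales were flat. " * 200
note = b"Call the supplier on Monday."
store = DedupStore()
store.put("report.txt", report)
store.put("copy of report.txt", report)   # identical content, stored once
store.put("notes.txt", note)

raw_bytes = 2 * len(report) + len(note)
print(raw_bytes, store.stored_bytes())
```

How much is saved depends entirely on the data: repetitive text such as the made-up report compresses very well, while a short note may not shrink at all.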
Acquiring
• Improved Technology - Machines and
Procedures
– Faster
– More Comprehensive
– More Observant, Intuitive
• Search Engines
• Subscription Services
• User Profilers
• Subject Observers
• Observation/Seeking Strategies
Processing
• Representation
– Efficient Sampling
– Compression
– Uniform representation: an analytical concept,
referring to a process which allows information from
several realms or disciplines to be displayed and
worked with as if it came from the same realm or
discipline. The term is also applied when taking
information from a number of sources, which may
have used different methodologies and metrics in
their data collection, and building a single large
collection of information, where some records may be
more complete than others across all fields of data.
• Curation
• “Digital curation is the curation, preservation, maintenance,
and collection and archiving of digital assets.
• Digital curation is the process of establishing and developing
long term repositories of digital assets for current and future
reference by researchers, scientists, historians, and
scholars generally.” http://en.wikipedia.org/wiki/Digital_curation
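Uniform representation, as defined on this slide, can be illustrated with a small sketch. Everything in it is hypothetical: the two sources, their field names, and their units were invented for the example. Each source is converted into one common record type, and fields a source never recorded are simply left empty, so some records are more complete than others.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """The common schema; fields a source did not record stay None."""
    station: str
    temperature_c: Optional[float] = None
    depth_m: Optional[float] = None


def from_source_a(row: dict) -> Observation:
    # Hypothetical source A reports Fahrenheit and feet.
    return Observation(
        station=row["site"],
        temperature_c=round((row["temp_f"] - 32) * 5 / 9, 2),
        depth_m=round(row["depth_ft"] * 0.3048, 2),
    )


def from_source_b(row: dict) -> Observation:
    # Hypothetical source B reports Celsius and records no depth.
    return Observation(station=row["station_id"], temperature_c=row["temp_c"])


collection = [
    from_source_a({"site": "A1", "temp_f": 50.0, "depth_ft": 100.0}),
    from_source_b({"station_id": "B7", "temp_c": 8.5}),
]
for obs in collection:
    print(obs)
```

Once every record has the same shape and the same units, the combined collection can be searched and analysed as if it had come from a single source.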
Analysis
• Data Mining: “the process of extracting hidden
patterns from data”.
• As more data is gathered, with the amount of data
doubling every three years, data mining is
becoming an increasingly important tool to
transform this data into knowledge.
• It is commonly used in a wide range of
applications, such as marketing, fraud detection
and scientific discovery.
• Data mining can be applied to data sets of any
size, and while it can be used to uncover hidden
patterns, it cannot uncover patterns which are not
already present in the data set.”
– http://en.wikipedia.org/wiki/Data_mining
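To get a feel for what "doubling every three years" implies, the arithmetic can be worked out directly. The sketch below takes the three-year doubling period from the quotation at face value and assumes it stays constant, which is a simplification; the function name is made up for the example.

```python
def growth_factor(years: float, doubling_period: float = 3.0) -> float:
    """Factor by which the data volume grows after `years`,
    assuming it doubles once every `doubling_period` years."""
    return 2 ** (years / doubling_period)


for years in (3, 6, 9, 30):
    print(years, "years ->", growth_factor(years), "times as much data")
```

Under that assumption the volume grows roughly a thousandfold over thirty years (ten doublings).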
• Data mining is the process of using
computing power to apply methodologies,
including new techniques for knowledge
discovery, to data.
• Data mining identifies trends within data
that go beyond simple data analysis.
Through the use of sophisticated
algorithms, non-statistician users have the
opportunity to identify key attributes of
processes and target opportunities.
• For many years, businesses and
governments have used increasingly
powerful computers to sift through volumes
of data such as airline passenger trip
records, census data and supermarket
scanner data to produce market research
reports.
• Continuous innovations in computer
processing power, disk storage, data
capture technology, algorithms,
methodologies and analysis software have
dramatically increased the accuracy and
usefulness of the extracted information.
• The term data mining is often used to apply
to the two separate processes of knowledge
discovery and prediction.
• Knowledge discovery provides explicit
information about the characteristics of the
collected data, using a number of
techniques (e.g., association rule mining)
• Forecasting and predictive modeling
provide predictions of future events, and the
processes may range from the transparent
(e.g., rule-based approaches) through to
the opaque (e.g., neural networks).
• Since the availability of affordable computer
processing power in the last quarter of the
20th century, organizations have been
accumulating vast and ever growing
amounts of data, including, for example:
– operational and transactional data
• such as sales, cost, inventory, payroll and
accounting data
– nonoperational data
• such as forecasts and macro economic data
– meta data
• data about the data itself, such as logical
database design, data dictionary definitions,
and executive summaries and scientific
abstracts.
Tasks
• Classification - Arranges the data into
predefined groups.
– For example, an email program might attempt
to classify an email as legitimate or spam.
• Clustering - Is like classification but the
groups are not predefined, so the algorithm
will try to group similar items together.
• Regression - Attempts to find a function
which models the data with the least error.
• Association rule learning - Searches for
relationships between variables.
– For example, a supermarket might gather
data on what each customer buys. Using
association rule learning, the supermarket
can work out what products are frequently
bought together, which is useful for
marketing purposes. This is sometimes
referred to as "market basket analysis".
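Two of the tasks above, regression and association rule learning, are small enough to sketch with nothing but the standard library. The function names, the data points, and the shopping baskets below are made up for illustration; real tools use far more refined algorithms. The first function fits a straight line by least squares; the second counts which pairs of products appear together in at least a given share of baskets, which is the simplest form of market basket analysis.

```python
from collections import Counter
from itertools import combinations


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def frequent_pairs(baskets, min_support):
    """Pairs of products found together in at least `min_support`
    (a fraction between 0 and 1) of all baskets."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(set(basket)), 2))
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Made-up points lying exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Made-up shopping baskets.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]
pairs = frequent_pairs(baskets, min_support=0.5)
print(slope, intercept)
print(pairs)
```

In this toy data only bread and butter are bought together often enough to pass the 50 percent threshold; they appear together in three of the four baskets.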
Applications
• Combating terrorism
• It has been suggested that both the Central
Intelligence Agency and the Canadian
Security Intelligence Service have
employed [data mining]
• Previous U.S. government data mining
programs intended to stop terrorism include
• the Total Information Awareness (TIA)
program,
• the Computer-Assisted Passenger
Prescreening System (CAPPS II),
• Analysis, Dissemination, Visualization,
Insight, Semantic Enhancement (ADVISE),
• the Multistate Anti-Terrorism Information
Exchange (MATRIX),
• and the Secure Flight program.
• These programs have been discontinued
due to controversy over whether they
violate the US Constitution's Fourth
Amendment, although many programs that
were formed under them continue to be
funded by different organizations, or under
different names, to this day.
• Two plausible data mining techniques in the
context of combatting terrorism include
"pattern mining" and "subject-based data
mining".
• An example of a probable application to
national security monitoring would be the
ability for government analysts to define a
pattern of interest as "all individuals
traveling from the United States to the
Middle East in the next six months" and
have the ADVISE tool provide an alert
whenever this pattern emerges in the data.
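The notion of defining a "pattern of interest" and being alerted when it appears can be shown with a toy filter. This is not how ADVISE or any real system worked; the `Trip` record, the `make_pattern` and `alerts` functions, and the travel data are all invented for the example, and the six-month window is approximated as 183 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Trip:
    """Hypothetical travel record."""
    traveller: str
    origin: str
    destination_region: str
    departure: date


def make_pattern(origin, destination_region, start, window_days):
    """Return a test that is True for trips matching the pattern."""
    end = start + timedelta(days=window_days)

    def matches(trip: Trip) -> bool:
        return (
            trip.origin == origin
            and trip.destination_region == destination_region
            and start <= trip.departure <= end
        )

    return matches


def alerts(trips, pattern):
    """Yield an alert string for every trip that matches the pattern."""
    for trip in trips:
        if pattern(trip):
            yield f"ALERT: {trip.traveller} departs {trip.departure.isoformat()}"


# Made-up records, for illustration only.
today = date(2009, 1, 22)
pattern = make_pattern("United States", "Middle East", today, window_days=183)
trips = [
    Trip("traveller-1", "United States", "Middle East", date(2009, 3, 1)),
    Trip("traveller-2", "United States", "Europe", date(2009, 3, 1)),
    Trip("traveller-3", "United States", "Middle East", date(2010, 1, 5)),
]
raised = list(alerts(trips, pattern))
print(raised)
```

Only the first record raises an alert: the second goes to a different region and the third falls outside the time window. The example also makes plain why such tools are controversial, since every record must be examined to find the few that match.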