The need for structure in data storage


ICC Module 3 Lesson 3 – Storage
© 2015 Ph. Janson
Information, Computing & Communication
Storage – Clip 1 – Principles
School of Computer Science & Communications
Ph. Janson
Outline
►Clip 1 – Technology reminder & basic principle
►Clip 2 – Storage structures
►Clip 3 – Addressing and naming
►Clip 4 – File systems
►Clip 5 – Databases
Technology reminder

Medium          | Latency     | Throughput | Cost ($/GB)  | Size        | Retention | Access
RAM             | 1 - 100 ns  | GB/s       | 10           | MB - GB     | No        | Random
Flash           | µs          | GB/s       | 0.5          | GB - TB     | Yes       | Random
Hard disks      | ms          | 100s MB/s  | 0.05         | > TB        | Yes       | Random with delay
Magnetic tapes  | Yet slower! | 100s MB/s  | Yet cheaper! | Yet bigger! | Yes       | Sequential

Flash: random access like RAM, but by pages like hard disks
Hard disks: latency from rotation + arm positioning
Magnetic tapes: strictly sequential access => un-/rewinding latency

[Figure: hard disk spinning at 10K RPM, with labels: head, tracks, sector, arm, arm movement]
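
As a rough sketch of where the millisecond latency of hard disks comes from: assuming the head waits on average half a rotation for the wanted sector, the 10K RPM figure gives the delay below. Arm positioning time is ignored here, and the function name is made up for illustration.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the wanted sector: half a rotation, in milliseconds."""
    ms_per_rotation = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_rotation / 2.0


latency_ms = average_rotational_latency_ms(10_000)
print(f"{latency_ms:.1f} ms")  # 3.0 ms
```

This is about a million times the few nanoseconds that a RAM access takes, which is why the table shows disk latency in ms.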
The need for structure in data storage
Unstructured data
= dumped without order or classification
►Easy to dump, store, transport
►But hard to retrieve, explore, exploit
►In the ocean of unstructured web data, retrieving information on a statistics professor called Michael Jordan is a real challenge

Structured data
= stored into lists, piles, hierarchies, tables, etc.
►Easy to retrieve, explore, exploit
►Hard to maintain, store, transport
►Retrieving information on statistics professor Michael Jordan in the databases of organizations that he is involved in is trivial
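
The contrast can be sketched in a few lines of Python. The sample texts, the staff table and both function names are invented for illustration; the point is only that a text scan returns every mention of the name, while a keyed record is found directly.

```python
# Unstructured: free text dumped without order or classification (made-up sample).
documents = [
    "Michael Jordan scored 40 points last night.",
    "Prof. Michael Jordan gave a lecture on statistics.",
    "Michael Jordan sneakers are on sale.",
]

# Structured: records with named fields (hypothetical staff table).
staff = {
    ("Michael Jordan", "statistics"): {"role": "professor", "office": "room 101"},
    ("Jane Doe", "physics"): {"role": "lecturer", "office": "room 202"},
}


def search_text(name):
    """Scan every document; all mentions match, whoever they are about."""
    return [doc for doc in documents if name in doc]


def lookup_record(name, department):
    """Direct lookup by key: no scan and no ambiguity."""
    return staff.get((name, department))


hits = search_text("Michael Jordan")
record = lookup_record("Michael Jordan", "statistics")
print(len(hits))       # 3
print(record["role"])  # professor
```

The scan still leaves the reader to sort out which of the three hits concern the professor; the structured lookup answers that question by construction.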
The need for structure in data storage
►Imagine a hard disk without any structure
►How can one know whether it is full or empty? How can one find information on it?
• One might as well look for
  - A needle in a haystack
  - A song in a music library without a catalog
  - The works of a composer in a library classified by performers
  - The works of an unidentified performer of whom one only knows the title of one song
  - A song of which one only knows a few words and notes but neither the title nor the author
►Even Google needs structure to produce answers to our queries!
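
The library example can be made concrete with a small sketch. The performer names and the helper `build_catalog` are hypothetical; the sketch assumes a catalog is simply a mapping from one chosen field to the matching titles.

```python
from collections import defaultdict

# Sample recordings: (title, composer, performer). Performer names are made up.
recordings = [
    ("Moonlight Sonata", "Beethoven", "Performer A"),
    ("Fidelio", "Beethoven", "Performer B"),
    ("Clair de Lune", "Debussy", "Performer A"),
]


def build_catalog(records, field):
    """Group titles by one field of the record: 1 = composer, 2 = performer."""
    catalog = defaultdict(list)
    for record in records:
        catalog[record[field]].append(record[0])
    return dict(catalog)


by_performer = build_catalog(recordings, 2)
by_composer = build_catalog(recordings, 1)

# A catalog by performer does not help here: every record must be inspected.
scan = [title for title, composer, _ in recordings if composer == "Beethoven"]

# With a catalog by composer, the same answer is a single lookup.
print(by_composer["Beethoven"])  # ['Moonlight Sonata', 'Fidelio']
```

A structure only helps for the questions it was built to answer, which is why a library classified by performers is of little use when searching by composer.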
Basic principle for storing structured data
Catalogs / directories for storing structural relations (= “metadata”)
Area for storing the “data” itself, without further logical structure
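
A minimal sketch of this principle, assuming the catalog only needs to record where each item starts and how long it is. The class `TinyStore` and its methods are invented for illustration and do not correspond to any real file system.

```python
class TinyStore:
    """A flat data area plus a catalog recording where each item lives."""

    def __init__(self):
        self.data = bytearray()  # the data area: raw bytes, no logical structure
        self.catalog = {}        # the metadata: name -> (offset, length)

    def put(self, name, payload: bytes):
        """Append the payload to the data area and record it in the catalog."""
        offset = len(self.data)
        self.data += payload
        self.catalog[name] = (offset, len(payload))

    def get(self, name) -> bytes:
        """Use the catalog to find the item, then read it from the data area."""
        offset, length = self.catalog[name]
        return bytes(self.data[offset:offset + length])

    def used_bytes(self):
        """How full the store is: answerable from the catalog alone."""
        return sum(length for _, length in self.catalog.values())


store = TinyStore()
store.put("song.txt", b"la la la")
store.put("note.txt", b"hello")
print(store.get("note.txt"))  # b'hello'
print(store.catalog)          # {'song.txt': (0, 8), 'note.txt': (8, 5)}
```

Without the catalog, the data area is just 13 bytes with no way to tell where one item ends and the next begins, or how much of the area is in use.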