Powerpoint slides - Dynamic Connectome Lab

Computing Environments
CSC8304
Marcus Kaiser
http://www.biological-networks.org
About this module

Module Leader: Dr Marcus Kaiser

Today’s lecture – Introduction to Computing Environments

Lectures for the next 4 weeks cover the need for database systems,
statistical packages, security and poll-based/event-driven software

Reading Week then follows – no lecture

Remaining lectures cover programming languages, scripting and Perl
 Lectures are on Tuesdays, 9:30-11am, CLT.602
 Lecture notes can be found on http://www.biological-networks.org/
(Training > Computing Environments for Bioinformatics )

Practical classes start in DAYSH.821 and are on Wednesdays, 11am-12 noon
Assessment
Coursework
 Databases/SQL: 5 Nov deadline, 15% of final mark
 Scripting/Perl: 10 Dec deadline, 15% of final mark
Exam
 January, 70% of final mark
A computing environment
Enabling presentation of services to users and interaction with such services by users
[Figure: a world of supporting services reached through many kinds of device: monitors, tactile feedback devices, robots, mobile devices (e.g., mobile phones), video game consoles, printers. Diagram labels: Requirements, Services.]
A heterogeneous environment




A modern day computing environment is made up
from many different types of enabling technologies.
An enabling technology refers to the mechanism that
is used to implement a service.
Many technologies share the same ultimate goal
(e.g., sending a message from one computer to
another).
However, such technologies may attempt to achieve
the same goal in different ways (e.g., Microsoft Windows and
Linux operating systems).
Standards



There are instances when vendors must adhere to some
standard to ensure integration (the Internet protocols exemplify
this).
Standards play a crucial role in computer system development.
There are two types of standard:
 Provided by an organisation that is recognised by the
community as assuming the role of identifying standards
(members of such an organisation are usually drawn from
different vendors).
 Provided by a vendor (or group of vendors) and deployed
without international recognition (however, such recognition
may occur at a later date).
Computer technology evolution
(ENIAC: http://en.wikipedia.org/wiki/Eniac)

             1945: ENIAC      Change      2004: Pentium
Complexity   18,000 valves    x 10^3      42 M transistors
Size         200 m^3          x 10^-8     6 cm^3
Speed        150 ops/s        x 10^6      1.6 x 10^9 ops/s
Consumption  10 kW            x 10^-3     68 W
Cost         $10,000,000      x 10^-4     <£1000
Reliability  Hours            x 1000      Years
What if cars improved in a similar fashion?!!

             Today       Change                   Improved
Speed        70 mph      x 10^6                   about 31,000 km/s
Fuel         50 mpg      x 10^-3 (consumption)    50,000 mpg
Cost         £10,000     x 10^-4                  £1
Reliability  1 year      x 1000                   1000 years
Weight       1 ton       x 10^-8                  10 mg
Conceptual Levels of Computers







A digital computer is capable of computation, communication and
storage.
Computation is performed by carrying out instructions; a
sequence of instructions that solves a problem is called a program
Instructions are recognised and executed by the electronic
circuits in the computer
Instructions permit only simple operations. Typically, they can:
 Add two numbers
 Check if a number is zero
 Move data around in memory
This set of instructions is called the machine language
Different types of computers usually have different machine
languages
The machine language is said to be the interface between the
software and the hardware.
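To give a feel for how little a single instruction does, here is a minimal sketch of a made-up machine with three instructions matching the list above (move, add, test for zero). The instruction names (`MOV`, `ADD`, `JZ`), the tuple encoding, the memory layout and the `run` function are all hypothetical; they do not correspond to any real machine language.

```python
# Toy machine: memory is a list of numbers, a program is a list of tuples.
# The instruction set is invented for illustration only.

def run(program, memory):
    """Execute the program and return the final memory contents."""
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        if op == "MOV":        # MOV dst, src: copy memory[src] into memory[dst]
            dst, src = args
            memory[dst] = memory[src]
        elif op == "ADD":      # ADD dst, a, b: memory[dst] = memory[a] + memory[b]
            dst, a, b = args
            memory[dst] = memory[a] + memory[b]
        elif op == "JZ":       # JZ addr, target: jump to target if memory[addr] is zero
            addr, target = args
            if memory[addr] == 0:
                pc = target
                continue
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
    return memory

# Multiply 3 by 4 using repeated addition.
# Cells: 0 = counter, 1 = value, 2 = running total, 3 = constant -1, 4 = constant 0
MULTIPLY = [
    ("JZ", 0, 4),      # 0: counter is zero, so we are done: go to instruction 4
    ("ADD", 2, 2, 1),  # 1: total = total + value
    ("ADD", 0, 0, 3),  # 2: counter = counter - 1
    ("JZ", 4, 0),      # 3: cell 4 is always zero, so always jump back to 0
    ("MOV", 1, 2),     # 4: copy the total into cell 1
]

print(run(MULTIPLY, [3, 4, 0, -1, 0]))  # [0, 12, 12, -1, 0]
```

Even multiplication has to be spelled out as a loop of simpler steps, which is why most programmers work at a higher level.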
Conceptual Levels of Computers (contd.)

Most users do not use machine
language
 Use high level language
 e.g. Java
 The high level language is
translated to machine
language by a compiler
 Computers can be thought of as
having different levels, each with
its own language.
 Each level carries out tasks on
behalf of the level above it.
 Helps to cope with understanding
the complexity of computing
systems
[Diagram: levels of a computer, from top to bottom, with who works at each level]
 Application software (anybody)
 High level language (Java programmer)
 Operating system level (programmer)
 Assembly language level (assembly programmer)
 Conventional machine level (hardware designer)
 Integrated circuit level (VLSI designer)
 Transistor level (physical designer)
 Silicon + electronics level (chemical engineer)
The upper levels are software, the lowest levels are hardware, and the levels in between may be software and/or hardware.
Data Representation (1)









Humans count in base 10, using 10 digits
Difficult to represent electronically
Machines count in base 2
Two-state devices are easy to make (transistors)
Only two digits used (0 and 1) called binary digits or bits
Electrically represented by 0 volts and 5 volts
Each bit occupies one memory cell or wire
The basic working unit usually consists of 8 bits, called a byte
The basic memory unit is a whole number of bytes, e.g.
 2 bytes = 16 bits
 4 bytes = 32 bits
 8 bytes = 64 bits
 The size of the basic memory unit is called the word length
Data Representation (2)




All bytes in the memory are numbered, or addressable
The first byte is numbered 0, the second 1, and so on
Memory size is usually expressed in terms of (Mega)bytes
It is common practice to:





Write the least significant bit (LSB) on the right
Write the most significant bit (MSB) on the left
Start counting from zero
All data held within the computer is represented by a number of
bits or bytes
All high level objects, such as a Java or C++ class, must be
translated into bits
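The addressing and bit-numbering conventions above can be sketched in a few lines. This is only an illustration: the three byte values and the helper name `get_bit` are made up for the example.

```python
# Three bytes of "memory"; their addresses are 0, 1 and 2.
memory = bytearray([0x48, 0x69, 0x21])

def get_bit(value, position):
    """Return the bit at `position`, counting from 0 at the least significant bit."""
    return (value >> position) & 1

first_byte = memory[0]  # address 0 holds 0x48, i.e. 01001000 in binary

# Write the bits the conventional way: MSB (bit 7) on the left, LSB (bit 0) on the right.
bits_msb_first = [get_bit(first_byte, p) for p in range(7, -1, -1)]
print(bits_msb_first)   # [0, 1, 0, 0, 1, 0, 0, 0]
```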
Data Representation (3)

Data comes in many forms
 Booleans
 Characters (i.e. text)
 Integers, both positive and negative; e.g. -230, -1, 0, 45319
 Real numbers, also called floating point numbers, e.g. 3.0, log(13), sin(π/4), 22/7

Structured data types defined by programming languages e.g.




Arrays
Strings
Classes
Each type is represented by one or more bits
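As a rough illustration of "each type is represented by one or more bits", the sketch below uses Python's standard `struct` module to show one possible bit pattern for a Boolean, a character, an integer and a real number. The chosen sizes (1, 1, 2 and 4 bytes), the big-endian byte order and the helper `to_bits` are assumptions made for the example; real languages and machines differ, and the two's complement encoding of the negative integer is a detail these slides do not cover.

```python
import struct

def to_bits(raw):
    """Show a sequence of bytes as groups of eight binary digits."""
    return " ".join(f"{byte:08b}" for byte in raw)

encodings = {
    "Boolean True": to_bits(struct.pack("?", True)),    # one byte
    "Character 'A'": to_bits("A".encode("ascii")),      # one byte
    "Integer -230": to_bits(struct.pack(">h", -230)),   # 16-bit signed integer
    "Real 3.0": to_bits(struct.pack(">f", 3.0)),        # 32-bit IEEE 754 float
}

for label, bits in encodings.items():
    print(f"{label:14} {bits}")
```

The same bit pattern can mean different things depending on the type it is read as: 01000001 is the character 'A' and also the small integer 65.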
Bits, Bytes, and Buzzwords
Terms used to describe file size or memory size:





 Byte = 8 bits
 Kilobyte (KB) = 1024 (2^10) bytes
 Megabyte (MB) = 2^20 bytes
 Gigabyte (GB) = 2^30, or about a billion, bytes
 Terabyte (TB) = 2^40, or about a trillion, bytes
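A small sketch of how these units relate, assuming the power-of-1024 definitions given above. The function name `human_size` and the one-decimal formatting are choices made for this example only.

```python
UNITS = ["Bytes", "KB", "MB", "GB", "TB"]

def human_size(num_bytes):
    """Format a byte count using the power-of-1024 units listed above."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the number is below 1024,
        # or at TB, the largest unit in the list.
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024

print(2 ** 30)               # 1073741824, "about a billion"
print(human_size(1536))      # 1.5 KB
print(human_size(2 ** 30))   # 1.0 GB
```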
Integer data






Integer numbers don’t allow fractions
Humans use the decimal number system. There are 10 digits, 0 –
9.
Each place within a decimal number represents a power of 10.
For example
236 =
2 * 10^2 +
3 * 10^1 +
6 * 10^0
10 is not a ‘natural’ base (it is an anatomical accident: humans have ten fingers!)
Computers work more naturally with base 2 because transistors
have two states
In base 2, only digits 0 and 1 are used. Greatly simplifies the
arithmetic and makes it much faster
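The place-value idea works the same way in any base. The following sketch splits a number into its digits and the power each one multiplies; the function name `place_values` is hypothetical.

```python
def place_values(number, base=10):
    """Return (digit, power) pairs for a non-negative integer, most significant first."""
    digits = []
    while number > 0:
        number, digit = divmod(number, base)  # peel off the lowest digit
        digits.append(digit)
    digits = digits or [0]                    # the number 0 has the single digit 0
    return [(d, p) for p, d in reversed(list(enumerate(digits)))]

print(place_values(236))     # [(2, 2), (3, 1), (6, 0)]  i.e. 2*10^2 + 3*10^1 + 6*10^0
print(place_values(5, 2))    # [(1, 2), (0, 1), (1, 0)]  i.e. 1*2^2 + 0*2^1 + 1*2^0
```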
Binary Numbers




Each place within a binary number represents a power of 2.
e.g. binary 101 =
 1 x 2^2 +
 0 x 2^1 +
 1 x 2^0
(equals five in decimal)
Electrical representation: three wires
ON (5V)   OFF (0V)   ON (5V)
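The binary 101 example can be checked with a short sketch. The helper name `binary_to_decimal` is made up here; Python's built-in `int(text, 2)` performs the same conversion.

```python
def binary_to_decimal(bits):
    """Add up digit * 2^place for each place, counting places from the right."""
    total = 0
    for place, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** place
    return total

print(binary_to_decimal("101"))   # 5
print(int("101", 2))              # 5, using the built-in conversion
print(format(5, "b"))             # 101, converting back to binary

# One wire per bit: a 1 is ON (5V), a 0 is OFF (0V).
wires = ["ON" if bit == "1" else "OFF" for bit in "101"]
print(wires)                      # ['ON', 'OFF', 'ON']
```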
Binary Arithmetic

Humans perform decimal addition by:








Memorising all single-digit additions
Writing the numbers to be added down right-aligned, one above the
other
Starting at the right and working towards the left
Adding the digits, writing down the result and propagating any carry
to the next column
Subtraction works much the same way except that you must borrow
from the next column
Multiplication with a single-digit number works much the same way
too
Multiplication with a multi-digit number is treated as a series of
separate single digit multiplications, the results of which are added
together
Binary addition, subtraction and multiplication can be treated in exactly the
same way, except that only the digits 0 and 1 are used.
Basic Binary Arithmetic examples
  1 0 0 1 0 1 0          0 0 1 1 0 1 0
+ 0 0 1 1 1 0 1        - 0 0 0 1 0 0 1
  -------------          -------------
  1 1 0 0 1 1 1          0 0 1 0 0 0 1
(74 + 29 = 103)        (26 - 9 = 17)
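The column-by-column addition method described above can be sketched as follows. The function name `add_binary` is hypothetical, and the code works on text strings of 0s and 1s purely to mirror the pencil-and-paper method; it is not how hardware or Python actually adds numbers.

```python
def add_binary(a, b):
    """Add two binary strings column by column, right to left, carrying as needed."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)   # right-align by padding with leading zeros
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))       # digit written down in this column
        carry = total // 2                  # carry propagated to the next column
    if carry:
        result.append("1")                  # a final carry adds one more digit
    return "".join(reversed(result))

print(add_binary("1001010", "0011101"))     # 1100111

# The subtraction example can be checked via decimal: 26 - 9 = 17.
print(int("0011010", 2) - int("0001001", 2) == int("0010001", 2))  # True
```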
Hexadecimal Numbers








Problem with binary arithmetic – Long strings are fine for
machines but awkward for humans
e.g. What is the binary number 0100101011100011 ??
Guess then work it out!
We (humans) therefore often use hexadecimal numbers (or hex
for short).
This uses base 16.
There are 16 “digits” (0 1 2 3 4 5 6 7 8 9 A B C D E F)
Each place represents a power of 16:
e.g. 29F = 2 * 16^2 + 9 * 16^1 + F * 16^0 (= 671 in decimal)
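Hex is convenient because each hex digit corresponds to exactly four bits. As a sketch (the function name `binary_to_hex` is made up for this example), the long binary number from this slide can be converted group by group:

```python
def binary_to_hex(bits):
    """Convert a binary string to hex by translating each group of four bits."""
    bits = bits.zfill((len(bits) + 3) // 4 * 4)   # pad on the left to a multiple of 4
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join("0123456789ABCDEF"[int(group, 2)] for group in groups)

print(binary_to_hex("0100101011100011"))   # 4AE3  (0100 1010 1110 0011)
print(int("4AE3", 16))                     # 19171
print(int("29F", 16))                      # 671
```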
Integer Representation


For the sake of economy, different hardware representations are
used to implement different integer ranges
The following are commonly found:
Name   Bits   Range signed         Range unsigned
Byte   8      -128 .. 127          0 .. 255
Word   16     -32768 .. 32767      0 .. 65535
Long   32     -2^31 .. 2^31-1      0 .. 2^32-1
Quad   64     -2^63 .. 2^63-1      0 .. 2^64-1
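The ranges above follow directly from the number of bits. A minimal sketch, assuming signed values use the common two's complement scheme (which matches the -128 .. 127 range shown for a byte); the function name `integer_range` is hypothetical.

```python
def integer_range(bits, signed):
    """Return (smallest, largest) value representable in the given number of bits."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

for name, bits in [("Byte", 8), ("Word", 16), ("Long", 32), ("Quad", 64)]:
    print(name, bits, integer_range(bits, signed=True), integer_range(bits, signed=False))
```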
Integer overflow

It is possible that the result of an integer calculation is bigger than the allowed maximum (both
positive and negative)

Look at the following 8-bit addition

               Binary      Unsigned     Signed
               11001000    200          -56
             + 10010110    150          -106
               --------
            (1)01011110    (256+) 94    (-256+) 94
The final carry “disappears” because there is no hardware provision for it. The problem is
called overflow (or underflow)

Is this serious? Would you like this to happen to your bank account?

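Overflow is a serious problem. It usually indicates the presence of a bug in your program. The
hardware can detect overflow (typically by setting a status flag); depending on the language and runtime, the result may silently wrap around or the program may be stopped with an error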

Overflow occurred in the European Space Agency’s Ariane 5 rocket (Flight 501, 1996) when the on-board
software attempted to convert a 64-bit floating point number into a 16-bit signed integer. This did indeed cause the program to
crash...
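The 8-bit addition above can be reproduced in a short sketch. Python's own integers never overflow, so the 8-bit limit is imitated here by masking; the function names `add_8bit` and `as_signed` are hypothetical, and the signed interpretation assumes two's complement.

```python
def add_8bit(a, b):
    """Add two 8-bit patterns; return (8-bit result, carry out of the top bit)."""
    total = a + b
    return total & 0xFF, total >> 8   # keep the low 8 bits; the 9th bit is the lost carry

def as_signed(value):
    """Interpret an 8-bit pattern as a two's complement signed number."""
    return value - 256 if value >= 128 else value

result, carry = add_8bit(0b11001000, 0b10010110)
print(result, carry)                  # 94 1   (200 + 150 = 350 = 256 + 94)
print(as_signed(0b11001000))          # -56
print(as_signed(0b10010110))          # -106
print(as_signed(result))              # 94, although -56 + -106 should be -162
```

Read as unsigned or as signed, the stored answer is 94, which is wrong in both cases: that is the overflow.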
Floating Point data






The range of possible values using 32 bits to represent a number,
positive or negative, is large
However, bigger number representations are needed.
 e.g. numbers to allow fractions and powers as required by
many scientific applications
To represent fractions using integers, you would need two of
them
 One for the numerator and one for the denominator
 Would be a major nuisance – not computationally amenable
The way to do this is to use floating point numbers.
Floating point data types allow a much greater range of possible
values
They are represented in floating point notation
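To make the contrast concrete, the sketch below stores 22/7 first as a pair of integers and then as a single floating point value, using Python's standard `fractions` and `sys` modules. It assumes Python's `float` is a 64-bit floating point type, which holds on common platforms.

```python
from fractions import Fraction
import sys

ratio = Fraction(22, 7)     # exact, but needs two integers (numerator and denominator)
approx = 22 / 7             # one floating point value, stored approximately

print(ratio.numerator, ratio.denominator)   # 22 7
print(approx)                               # roughly 3.142857
print(2 ** 31 - 1)                          # 2147483647, largest 32-bit signed integer
print(sys.float_info.max)                   # largest finite float, far beyond the integer above
```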
Floating Point Notation

Details of how floating point values are represented
vary from one machine to another.

The IEEE 754 standard is the most widely used floating
point representation

More info at
http://www.cs.uaf.edu/~cs301/notes/Chapter4/node13.html
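As a sketch of what such a representation looks like, the code below takes a number stored as a 32-bit IEEE 754 value and splits it into its three fields: sign, exponent and fraction. The function name `float_fields` is made up for this example, and the field widths (1, 8 and 23 bits, exponent bias 127) are those of the IEEE 754 single precision format, which the slides themselves do not spell out.

```python
import struct

def float_fields(x):
    """Split x, stored as a 32-bit IEEE 754 float, into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))  # reinterpret the 4 bytes as an integer
    sign = bits >> 31                   # 1 bit: 0 = positive, 1 = negative
    exponent = (bits >> 23) & 0xFF      # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF          # 23 bits following an implicit leading 1
    return sign, exponent, fraction

sign, exponent, fraction = float_fields(3.0)
print(sign, exponent, fraction)         # 0 128 4194304

# Rebuild the value: (1 + fraction / 2^23) * 2^(exponent - 127) = 1.5 * 2 = 3.0
print((-1) ** sign * (1 + fraction / 2 ** 23) * 2 ** (exponent - 127))   # 3.0
```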
Character Data

Used for textual data, but can represent small integers
 Usually held in a byte although commonly only 7 bits are needed
 There are two major character sets:
 EBCDIC (on IBM mainframe machines)
 ASCII (on all other machines)

We concentrate on ASCII (American Standard Code for Information
Interchange)
 It has been standardised by ISO (International Organization for
Standardization)
 ASCII was actually designed for use with teletypes and so the
descriptions are somewhat obscure
 Often ‘text’ documents are referred to as in ‘ASCII’ format – easier for
document interchange
ASCII




The characters are classed as
 Graphic characters (printable or displayable symbols)
 Control characters (intended to be used for various control
functions, such as vertical motion and data communications)
The basic ASCII set uses 7 bits for each character, giving it a
total of 128 unique symbols.
The extended ASCII character set uses 8 bits, which gives it an
additional 128 characters.
The extra characters represent characters from foreign languages
and special symbols for drawing pictures.
More info @ http://www.jimprice.com/jim-asc.htm
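A brief sketch of the ASCII character classes, using Python's built-in `ord` and `chr`. The function name `classify_ascii` is hypothetical, and treating the space character as graphic is a simplification made for this example.

```python
def classify_ascii(ch):
    """Classify a character as 'control', 'graphic' or 'not 7-bit ASCII'."""
    code = ord(ch)
    if code > 127:
        return "not 7-bit ASCII"      # needs more than 7 bits
    if code < 32 or code == 127:
        return "control"              # codes 0-31 and 127 (DEL)
    return "graphic"                  # printable or displayable symbols

print(ord("A"), f"{ord('A'):07b}")    # 65 1000001  (fits in 7 bits)
print(chr(97))                        # a
print(classify_ascii("A"))            # graphic
print(classify_ascii("\n"))           # control (line feed, a vertical motion code)
print(classify_ascii("é"))            # not 7-bit ASCII
```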
Unicode

Unicode is a new system to standardise character
representation
 Unicode provides a unique number for every
character, independent of platform, program, or
language
 Adopted by such industry leaders as Apple, HP, IBM,
JustSystem, Microsoft, Oracle, SAP, Sun, Sybase,
Unisys.
 Required by modern standards such as XML, Java,
ECMAScript (JavaScript), LDAP, CORBA 3.0, WML
 An implementation of the ISO/IEC 10646 standard
 Enables Internationalization
Unicode




How Unicode Works
It defines a large (and steadily growing) number of characters (>
110,000).
 Each character gets a name and a number, e.g. LATIN
CAPITAL LETTER A is 65 and TIBETAN SYLLABLE OM is
3840.
 Includes a table of useful character properties such as "this is
lower case" or "this is a number" or "this is a punctuation
mark".
The Unicode standard also includes a large volume of helpful
rules and explanations about how to display these characters
properly, do line-breaking and hyphenation and sorting
Unicode is important – do some extra reading! – try a Google
search
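As a starting point for that extra reading, the number, name and properties of a character can be looked up with Python's standard `unicodedata` module. The helper name `describe` is made up for this example; the two-letter category codes it returns (such as `Lu` for an upper case letter or `Nd` for a decimal digit) are how the Unicode character database labels properties like "this is lower case" or "this is a number".

```python
import unicodedata

def describe(ch):
    """Return (number, name, general category) for a single character."""
    return ord(ch), unicodedata.name(ch), unicodedata.category(ch)

print(describe("A"))        # (65, 'LATIN CAPITAL LETTER A', 'Lu')
print(describe("\u0F00"))   # (3840, 'TIBETAN SYLLABLE OM', 'Lo')
print(describe("a")[2])     # Ll: lower case letter
print(describe("7")[2])     # Nd: decimal digit
print(describe("!")[2])     # Po: punctuation
```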
Summary

A modern day computing environment is made up from many
different types of enabling technologies

Standards are used to permit interoperability

Computers can be thought of as a number of different levels,
ranging from the application software that we all use, right
through to the electronic circuits within the computer

Computers count in binary

Various ways of representing numeric and character data