COMP9321 Web Application Engineering Semester 2, 2015

Dr. Amin Beheshti
Service Oriented Computing Group, CSE, UNSW Australia
Week 6
http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411
COMP9321, 15s2, Week 6
We are Generating Vast Amounts of Data!
• Remote patient monitoring (healthcare)
• Social media
• Product sensors
• Real-time location data (location-based services)
• Retail
• Manufacturing
• Digitalization of artefacts (books, music, videos, etc.)
• Airbus A380:
o generates 10 TB every 30 min
• Twitter:
o generates approximately 12 TB of data per day
• Facebook:
o data grows by over 500 TB daily
• New York Stock Exchange:
o 1 TB of data every day
Challenge
• How do we store and access this data over the web ?
E-Commerce website
• Data operations are mainly transactions (reads and writes)
• Operations are mostly on-line
• Response time should be quick, but it is important to maintain the security and reliability of transactions
• ACID properties are important
Image serving website
• Data operations are mainly fetching large files (Reads)
• ACID requirements can be relaxed
• Operations are mainly on-line
• High bandwidth requirement
Search Website
• Data operations are mainly reading index files for answering
queries (Reads)
• ACID requirements can be relaxed
• Index compilation is performed off-line due to the large size
of source data (the entire Web)
• Response times must be as fast as possible.
Persistence
(Hibernate, pp.5-29)
• Persistence is:
“the continuance of an effect after its cause is removed”
• In the context of storing data in a computer system, this means that:
“the data survives after the process with which
it was created has ended”
• In other words, for a data store to be considered persistent:
“it must write to non-volatile storage”
(Hibernate, pp.5-29)
• Persistence is a fundamental concept in application
development.
• In an object-oriented application, persistence allows an
object to outlive the process that created it.
• The state of the object may be stored to disk and an object
with the same state re-created at some point in the future.
• Sometimes entire graphs of interconnected objects may be
made persistent and later re-created in a new process.
(Hibernate, pp.5-29)
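As a rough illustration of an object outliving the code that created it, the sketch below writes an object's state to a file and re-creates an object with the same state from that file. It uses Java's built-in serialization only because it needs no database; the `Student` class and its fields are hypothetical and not taken from the course material.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

class PersistenceSketch {
    // Hypothetical domain class; Serializable marks it as storable.
    static class Student implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int id;

        Student(String name, int id) {
            this.name = name;
            this.id = id;
        }
    }

    // Write the object's state to non-volatile storage.
    static void save(Student student, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(student);
        }
    }

    // Re-create an object with the same state, possibly in a later process.
    static Student load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Student) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("student", ".ser");
        save(new Student("Alice", 42), file);
        Student copy = load(file);   // a new object with the stored state
        System.out.println(copy.name + " " + copy.id);
        Files.delete(file);
    }
}
```

Serialization is only one way to persist state; the rest of this lecture concentrates on relational databases instead.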
• Not all objects are persistent:
o some (transient objects) have a limited lifetime that
is bounded by the life of the process that instantiated them.
• Almost all Java applications contain a mix of persistent and
transient objects.
• This means we need a subsystem that manages our persistent
objects.
(Hibernate, pp.5-29)
Data Persistence
(Hibernate, pp.5-29)
• When we talk about persistence in Java, we normally mean
storing data in a relational database using SQL.
• Relational technology is a common denominator for many
disparate systems and technology platforms.
• Relational technology provides a way of sharing data
across different applications or technologies that form part of
the same application.
• The relational data model is often the common enterprise-wide
representation of business entities.
(Hibernate, pp.5-29)
• When you work with a relational database in a Java
application, the Java code issues SQL statements to the
database via the JDBC API.
• The Java Database Connectivity (JDBC) API provides
universal data access from the Java programming language.
• Using the JDBC API, you can access virtually any data
source, from relational databases to spreadsheets and flat
files.
• The JDBC API comprises two packages:
• java.sql
• javax.sql
(Hibernate, pp.5-29)
Relational Databases
(Hibernate, pp.5-29)
• Data is stored as a collection of tuples that group attributes,
e.g. (student-id, name, birthdate, courses).
• Data is visualized as tables, where the tuples are the rows
and the attributes form the columns.
• Tables can be related to each other through specific columns.
• Each row in a table is uniquely identified by a key of one or more attributes.
(Hibernate, pp.5-29)
Structured Query Language (SQL)
Database Concepts
Accessing DB from an Application
(JDBC)
Java DataBase Connectivity
JDBC Concepts
• When developers use JDBC, they construct SQL statements that can be executed. A
template-like query string:
SELECT name FROM employee WHERE age = ?
• can be combined with local data structures so that regular Java objects can be
bound to the placeholders in the string. For example, a java.lang.Integer object with
the value 42 can be bound:
SELECT name FROM employee WHERE age = 42
• The results of execution, if any, are combined in a set returned to the caller. For
example, the query may return a set of employee names.
We can browse this result set as necessary.
(Barish, p.310)
JDBC Interfaces
Typical JDBC Scenario
PreparedStatement object
• A more realistic case is that the same kind of SQL statement is executed over and
over with different values (rather than a single static SQL statement).
• In a PreparedStatement, each placeholder (?) is bound to an incoming value before
execution, so the statement does not have to be recompiled.
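The binding step can be sketched with the employee query used earlier in these slides. This is a minimal sketch, not the course's own code: the class and method names are hypothetical, and it assumes the caller supplies an open Connection to a database that has an employee table with name and age columns.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class EmployeeQueries {
    // Template query with one placeholder.
    static final String NAMES_BY_AGE = "SELECT name FROM employee WHERE age = ?";

    // Assumption: conn is an open connection to a database that
    // contains an employee(name, age) table.
    static List<String> namesByAge(Connection conn, int age) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(NAMES_BY_AGE)) {
            ps.setInt(1, age);                  // bind the first (and only) placeholder
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {             // browse the result set row by row
                    names.add(rs.getString("name"));
                }
            }
        }
        return names;
    }
}
```

Calling `namesByAge(conn, 42)` corresponds to the bound query `SELECT name FROM employee WHERE age = 42`. Binding values through setter methods instead of concatenating them into the SQL string also protects against SQL injection.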
Transaction Management
• By default, a JDBC connection is in auto-commit mode: each update is committed when you call executeUpdate().
• Committing after each update can be suboptimal in terms of performance.
• It is also not suitable if you want to manage a series of operations as a single logical
operation (i.e., a transaction).
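Grouping several updates into one transaction can be sketched as follows. This is a minimal sketch under stated assumptions: the account table, its columns, and the `transfer` method are hypothetical, and the caller is assumed to supply an open Connection.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class TransferSketch {
    // Assumption: conn is open and the database has an account(id, balance) table.
    static void transfer(Connection conn, int from, int to, long amount) throws SQLException {
        boolean previous = conn.getAutoCommit();
        conn.setAutoCommit(false);              // stop committing after every update
        try (PreparedStatement debit = conn.prepareStatement(
                 "UPDATE account SET balance = balance - ? WHERE id = ?");
             PreparedStatement credit = conn.prepareStatement(
                 "UPDATE account SET balance = balance + ? WHERE id = ?")) {
            debit.setLong(1, amount);
            debit.setInt(2, from);
            debit.executeUpdate();
            credit.setLong(1, amount);
            credit.setInt(2, to);
            credit.executeUpdate();
            conn.commit();                      // both updates take effect together
        } catch (SQLException e) {
            conn.rollback();                    // undo any partial work
            throw e;
        } finally {
            conn.setAutoCommit(previous);       // restore the caller's setting
        }
    }
}
```

Either both updates are committed or neither is, which is the behaviour auto-commit mode cannot provide.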
Data Access Objects (DAO)
Example: Cars Database
DTO (Data Transfer Object) carries the actual data ...
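The code from the original Cars slides is not reproduced in this transcript, so the following is only a sketch of the general DAO/DTO shape. Every name in it (`CarDTO`, `CarDAO`, `InMemoryCarDAO`, and the fields) is hypothetical, and a map stands in for the database so that the example runs without a JDBC driver.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CarDaoSketch {
    // DTO: a plain object that only carries data between layers.
    static class CarDTO {
        final int id;
        final String make;
        final String model;

        CarDTO(int id, String make, String model) {
            this.id = id;
            this.make = make;
            this.model = model;
        }
    }

    // DAO interface: the rest of the application depends only on this.
    interface CarDAO {
        void insert(CarDTO car);
        CarDTO findById(int id);
        List<CarDTO> findAll();
    }

    // Stand-in implementation backed by a map. A JDBC-based class would
    // implement the same interface by issuing SQL statements instead.
    static class InMemoryCarDAO implements CarDAO {
        private final Map<Integer, CarDTO> rows = new LinkedHashMap<>();

        public void insert(CarDTO car) { rows.put(car.id, car); }
        public CarDTO findById(int id) { return rows.get(id); }
        public List<CarDTO> findAll() { return new ArrayList<>(rows.values()); }
    }

    public static void main(String[] args) {
        CarDAO dao = new InMemoryCarDAO();
        dao.insert(new CarDTO(1, "Toyota", "Corolla"));
        dao.insert(new CarDTO(2, "Mazda", "3"));
        for (CarDTO car : dao.findAll()) {
            System.out.println(car.id + ": " + car.make + " " + car.model);
        }
    }
}
```

Because callers see only the `CarDAO` interface, the storage technology behind it can be swapped without touching them.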
Object-Relational Impedance Mismatch Problems
Impedance (or Paradigm) Mismatch Problem
Granularity
Observation:
• Classes in your OO-based model come in a range of different levels of granularity
(coarse-grained entity classes like User, finer-grained classes like Address, and simple
values such as a String postcode)
• A relational database has just two levels of granularity: tables, and columns with scalar
types (i.e., not as flexible as the Java type system)
• Sometimes one ends up forcing the less flexible representation upon the object model
(e.g., User class with properties like postcode, state).
(Hibernate, pp.5-29) The problem of granularity
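The granularity gap can be sketched by flattening a two-level object model into a single row of scalar columns. The class names follow the User and Address examples mentioned above, but the fields and the column names are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GranularitySketch {
    // Fine-grained value class (hypothetical fields).
    static class Address {
        final String street;
        final String state;
        final String postcode;

        Address(String street, String state, String postcode) {
            this.street = street;
            this.state = state;
            this.postcode = postcode;
        }
    }

    // Coarse-grained entity class that owns an Address.
    static class User {
        final String username;
        final Address address;

        User(String username, Address address) {
            this.username = username;
            this.address = address;
        }
    }

    // Flatten the nested object into one row of scalar columns,
    // which is all a single table can hold.
    static Map<String, String> toRow(User user) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("username", user.username);
        row.put("address_street", user.address.street);
        row.put("address_state", user.address.state);
        row.put("address_postcode", user.address.postcode);
        return row;
    }

    public static void main(String[] args) {
        User user = new User("jsmith", new Address("1 High St", "NSW", "2052"));
        System.out.println(toRow(user));
    }
}
```

The Address object disappears in the row form: its fields survive only as prefixed columns of the user row.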
Subtypes
(Hibernate, pp.5-29) The problem of subtypes
Identity
• While on the subject of identity: modern object persistence solutions recommend
using a surrogate key.
• A surrogate key in a database is a unique identifier for either an entity in the
modelled world or an object in the database.
• The surrogate key is not derived from application data, unlike a natural (or
business) key which is derived from application data.
(Hibernate, pp.5-29) The problem of identity
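A small sketch may help separate the notions involved: Java object identity (==), Java equality (equals), and database identity (the primary key). The `User` class is hypothetical, and an in-memory counter stands in for a database-generated surrogate key.

```java
import java.util.concurrent.atomic.AtomicLong;

class IdentitySketch {
    // Stands in for a database sequence that generates surrogate keys.
    private static final AtomicLong SEQUENCE = new AtomicLong(1);

    static class User {
        final long id;   // surrogate key: carries no business meaning
        String email;    // natural-key candidate: application data that may change

        User(long id, String email) {
            this.id = id;
            this.email = email;
        }

        // Two objects are equal when they represent the same database row.
        @Override
        public boolean equals(Object other) {
            return other instanceof User && ((User) other).id == id;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(id);
        }
    }

    static User create(String email) {
        return new User(SEQUENCE.getAndIncrement(), email);
    }

    public static void main(String[] args) {
        User a = create("ann@example.com");
        User b = new User(a.id, a.email);   // the same row loaded a second time
        System.out.println(a == b);         // false: two distinct Java objects
        System.out.println(a.equals(b));    // true: same surrogate key
        b.email = "ann@new.example.com";    // the business data changes
        System.out.println(a.equals(b));    // true: the key is unaffected
    }
}
```

Because the surrogate key is not derived from application data, changing the email address does not change which row the object represents.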
Association
(Hibernate, pp.5-29) The problem of association
Object Graph Navigation
Considering the following example:
(Hibernate, pp.5-29) The problem of object graph navigation
N+1 selects problem:
The N+1 query problem is a common performance issue. Suppose load_cats() boils down to
a single query that loads every cat, and load_hats_for_cat($cat) boils down to one query
that loads the hats of a single cat. Code that calls load_cats() and then calls
load_hats_for_cat($cat) for each cat in a loop issues "N+1" queries when it executes,
where N is the number of cats.
https://secure.phabricator.com/book/phabcontrib/article/n_plus_one/
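The query counts can be made concrete with a small simulation. This is a sketch, not the code from the cited article: an in-memory map stands in for the database, a counter is incremented wherever a real implementation would issue a query, and the batched method `loadHatsForCats` is a hypothetical addition that shows the usual remedy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NPlusOneSketch {
    static int queryCount = 0;

    // In-memory stand-in for the database: cat id -> that cat's hats.
    static final Map<Integer, List<String>> HATS_BY_CAT = new LinkedHashMap<>();
    static {
        HATS_BY_CAT.put(1, Arrays.asList("fez"));
        HATS_BY_CAT.put(2, Arrays.asList("beret", "cap"));
        HATS_BY_CAT.put(3, new ArrayList<String>());
    }

    // One query that loads every cat.
    static List<Integer> loadCats() {
        queryCount++;
        return new ArrayList<>(HATS_BY_CAT.keySet());
    }

    // One query per call that loads the hats of a single cat.
    static List<String> loadHatsForCat(int catId) {
        queryCount++;
        return HATS_BY_CAT.get(catId);
    }

    // Hypothetical batched variant: one query for the hats of all given cats.
    static Map<Integer, List<String>> loadHatsForCats(List<Integer> catIds) {
        queryCount++;
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        for (int catId : catIds) {
            result.put(catId, HATS_BY_CAT.get(catId));
        }
        return result;
    }

    static int naiveQueryCount() {
        queryCount = 0;
        for (int catId : loadCats()) {
            loadHatsForCat(catId);              // one extra query per cat
        }
        return queryCount;
    }

    static int batchedQueryCount() {
        queryCount = 0;
        loadHatsForCats(loadCats());            // one extra query in total
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println("naive: " + naiveQueryCount());      // naive: 4
        System.out.println("batched: " + batchedQueryCount());  // batched: 2
    }
}
```

With three cats the loop issues 1 + 3 = 4 queries, while the batched version issues 2 regardless of how many cats there are.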
The cost of mismatch problems:
The DAO pattern helps isolate the mismatch problems by separating the
interfaces from implementation, but someone (usually application developers)
still has to provide the implementation classes !!
(Hibernate, pp.5-29) The cost of mismatch problems
Object-Relational Mapping (ORM)
Hibernate
Continuing with the Cars example ...
To use Hibernate, you need:
• Hibernate packages (hibernate*.jar)
• A set of mapping (between a table and an object) files
• A Hibernate configuration file (e.g., database connection details)
Hibernate Example
• See course material, week 6
NoSQL
What is NoSQL?
• Stands for No-SQL or Not Only SQL??
• Class of non-relational data storage systems
• E.g. BigTable, Dynamo, PNUTS/Sherpa, ..
• Usually do not require a fixed table schema nor do
they use the concept of joins
• Distributed data storage systems
• All NoSQL offerings relax one or more of the ACID
properties (will talk about the CAP theorem)
Chapter 19: Distributed Databases
NoSQL Data Storage
Classification:
• Uninterpreted key/value or ‘the big hash table’
o Amazon S3 (Dynamo)
• Flexible schema
o BigTable, Cassandra, HBase (ordered keys, semi-structured data)
o Sherpa/PNUTS (unordered keys, JSON)
o MongoDB (based on JSON)
o CouchDB (name/value in text)
CAP Theorem
Three properties of a system:
• Consistency (all copies have the same value)
• Availability (the system can run even if parts have failed), achieved via replication
• Partitions (the network can break into two or more parts, each with active systems that can’t talk to the other parts)
Brewer’s CAP “Theorem”: you can have at most two of these three properties for any system.
Very large systems will partition at some point.
Why NoSQL?
• NoSQL data storage systems make sense for applications
that need to deal with very large amounts of semi-structured data,
e.g. social networking feeds
share, comment, review, crowdsource, etc.
Examples
NoSQL databases:
• Employ less constrained consistency models.
• Offer simple retrieval and appending operations.
• Offer significant performance benefits.
Examples:
• Key–value store
• Document store
• Graph database
• …
Graph Database
[Figure: graphs arising in social networks (users), collaborative filtering (Netflix users and movies), text analysis (Wiki docs and words), and probabilistic analysis]
Graph Stores
• Use a graph structure
– Labeled, directed, attributed multi-graph
o Label for each edge
o Directed edges
o Multiple attributes per node
o Multiple edges between nodes
– Relational DBs can model graphs, but an edge
requires a join, which is expensive
• Example: Neo4j
– neo4j.com/
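The data model listed above (labeled, directed edges, attributes on nodes, several edges between the same pair of nodes) can be sketched in plain Java. This only illustrates the structure; it is not how Neo4j or any other graph store is implemented, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PropertyGraphSketch {
    static class Node {
        final String id;
        final Map<String, Object> attributes = new HashMap<>();  // multiple attributes per node

        Node(String id) {
            this.id = id;
        }
    }

    static class Edge {
        final Node from;      // directed: from -> to
        final String label;   // label for each edge
        final Node to;

        Edge(Node from, String label, Node to) {
            this.from = from;
            this.label = label;
            this.to = to;
        }
    }

    private final Map<String, Node> nodes = new LinkedHashMap<>();
    // A list rather than a set, so several edges may join the same two nodes.
    private final List<Edge> edges = new ArrayList<>();

    // Return the node with this id, creating it on first use.
    Node node(String id) {
        return nodes.computeIfAbsent(id, Node::new);
    }

    void addEdge(String from, String label, String to) {
        edges.add(new Edge(node(from), label, node(to)));
    }

    // Follow outgoing edges with the given label; no join is involved.
    List<String> neighbours(String from, String label) {
        List<String> result = new ArrayList<>();
        for (Edge edge : edges) {
            if (edge.from.id.equals(from) && edge.label.equals(label)) {
                result.add(edge.to.id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        PropertyGraphSketch graph = new PropertyGraphSketch();
        graph.node("alice").attributes.put("age", 30);
        graph.addEdge("alice", "FOLLOWS", "bob");
        graph.addEdge("alice", "MESSAGED", "bob");   // second edge between the same nodes
        graph.addEdge("alice", "RATED", "matrix");
        System.out.println(graph.neighbours("alice", "FOLLOWS"));
    }
}
```

The sketch scans every edge on each lookup for brevity; a real graph store would keep per-node adjacency structures so that traversal cost depends on a node's degree rather than on the size of the graph.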
Advantages of NoSQL
• Cheap, easy to implement
• Data are replicated and can be partitioned
• Easy to distribute
• Don't require a schema
• Can scale up and down
• Quickly process large amounts of data
• Relax the data consistency requirement (CAP)
• Can handle web-scale data, whereas Relational DBs cannot
Disadvantages of NoSQL
• New and sometimes buggy
• Data is generally duplicated, with potential for inconsistency
• No standardized schema
• No standard format for queries
• No standard language
• Difficult to impose complicated structures
• Depend on the application layer to enforce data integrity
• No guarantee of support
• Too many options: which one (or ones) to pick?
References
• (Hibernate) Hibernate in Action, Christian Bauer and Gavin King, Manning Publications
• (HibernateDOC) http://www.hibernate.org/hib_docs/reference/en/html/
• Some examples originated from Dr. David Edmond, School of Information
Systems, QUT, Brisbane, and S. Sudarshan, IIT Bombay.