Introducing Enterprise NoSQL


MarkLogic 8
Overview of New Features
© COPYRIGHT 2015 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
MarkLogic / Enterprise NoSQL Database Platform
POWERFUL
AGILE
TRUSTED
Native
JSON
Store
Native
XML
Store
Scalable
and Elastic
Cloud
Ready
(AWS)
Performance
at scale
LDAP and
Kerberos
Security
Native RDF
Triple Store
Geospatial
Support
Hadoop
and HDFS
REST
API
Security
Certifications
Monitoring and
Management
Full-text
Search
Flexible
Indexes
SQL
Support
Multi-OS
Support
Configuration
Management
24/7
Engineering
Support
Bitemporal
Real-time
Alerting
Schema
Agnostic
Samplestack
ACID
Transactions
Flexible
Replication
Semantic
Inference
Tiered
Storage
MarkLogic
Content Pump
XA
Transactions
Customizable
Backup
Server-Side
JavaScript
Fully
Transactional
Ad-hoc
Queries
Index Across
Data Types
Point-in-time
Recovery
Customizable
Failover
Atomic
Forests
MarkLogic 8 / More Powerful, Easier to Use
Developer Experience
Semantics
MarkLogic 8 is more powerful than ever,
but remarkably easy to use
Enterprise triple store,
document store, database
combined
Bitemporal
JSON
Unified indexing
and query for
today’s web and
SOA data
Node.js
Client API
Java Client
API
Server-Side
JavaScript
Enterprise NoSQL
database for Node.js
applications
NoSQL agility in a
pure Java
interface
JavaScript
runtime inside
MarkLogic using
Google’s V8
Track information along two
dimensions of time
MarkLogic 8 / More Powerful, Easier to Use
Additional New Features
Management
API
Incremental
Backup
Flexible
Replication
Enhanced
HTTP Server
REST-based
Management API
to manage all
MarkLogic
capabilities,
providing more
programmatic
control than ever
before
Faster backups that
use less storage,
only backing up
changes since the
previous incremental
or full backup
Customizable
information
sharing between
systems, allowing
for the easy and
secure distribution
of data
Simple and fast
client-server
interactions out-of-the-box with a
single interface
Learn MarkLogic
with an end-to-end three-tiered
sample app
[Diagram: Front End + Middle Tier + Database]
JSON
Unified indexing and query for today’s web and SOA data
• Speed up development with powerful
built-in search, transformation, and
alerting capabilities designed for JSON
• Reduce lost fidelity and functionality from
data model translations and brittle ETL
• Simplify architecture with data, metadata,
and relationships managed consistently
and securely together
• Ease modern, end-to-end JavaScript
development
{
  "_id": 1,
  "name": "MarkLogic",
  "supports": [
    {
      "datatype": "XML",
      "year": 2003
    },
    {
      "datatype": "JSON",
      "year": 2015
    }
  ]
}
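As a rough illustration of the document model (not MarkLogic's own API), the sketch below parses a document shaped like the sample on this slide with Python's standard `json` module and filters its nested `supports` array. Writing `name` as a plain string is an assumption about what the slide intended, and `datatypes_since` is a hypothetical helper.

```python
import json

# Sample document modeled on the slide; "name" as a plain string is an assumption.
RAW = """
{
  "_id": 1,
  "name": "MarkLogic",
  "supports": [
    {"datatype": "XML", "year": 2003},
    {"datatype": "JSON", "year": 2015}
  ]
}
"""


def datatypes_since(doc, year):
    """Return the datatypes whose 'year' is >= the given year (hypothetical helper)."""
    return [s["datatype"] for s in doc["supports"] if s["year"] >= year]


doc = json.loads(RAW)
print(datatypes_since(doc, 2010))  # ['JSON']
```

The nested array is read in place, with no flattening into rows first, which is the kind of translation step the bullets above aim to avoid.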
Node.js Client API
Enterprise NoSQL database for Node.js applications

Focus on application features rather than
plumbing with out-of-the-box search,
transactions, aggregates, alerting, geospatial,
and more

Move faster to production with proven reliability
at scale

Maximize performance and flexibility—bringing
code to the data

Enable modern end-to-end JavaScript
development

Always open source on GitHub
Participate.
Contribute.
Fork it.
Java Client API
NoSQL agility in a pure Java interface

Faster development and less custom code with
out-of-the-box data management, search, and
alerting

Pure Java query builder and conveniences for
POJOs, JSON, XML, and binary I/O

Built-in extensibility for moving performance-critical code to the database

Always open source and developed on GitHub
Participate.
Contribute.
Fork it.
Server-Side JavaScript
JavaScript runtime inside MarkLogic using Google’s V8
• Run code near the data for unparalleled
power, efficiency
• Build applications faster from a growing pool
of skills, tools
• Reduce risk with proven performance and
reliability
• Decrease brittle ETL and lost fidelity and
functionality from JSON data conversions
• Pair with Node.js to ease full-stack JavaScript
development
[Diagram: Front End, Middle Tier + Database Layer]
Samplestack
An end-to-end three-tiered application in Java and Node.js

Encapsulates best practices and
introduces key MarkLogic concepts

Use sample code as a model for
building applications more quickly

Modern technology stack shows where
MarkLogic fits in your environment
[Diagram: Front End, Middle Tier, Database Layer]
Participate.
Contribute.
Fork it.
Semantics
Enterprise triple store, document store, database combined

Store and query billions of facts and relationships;
infer new facts

Facts and relationships provide context for better
search

Flexible data modeling—integrate and link data from
different sources

Standards-based for ease of use and integration
– RDF, SPARQL, and standard REST interfaces

New in MarkLogic 8: SPARQL 1.1, graph
traversal, automatic inference using rule sets, and
SPARQL from Server-Side JavaScript and Node.js
Bitemporal
Timing is everything


Rewind the information “as it actually was” in
combination with “as it was recorded” at some
point in time
Provides increased insight into your business
and mission
Capture evolving schema as the shape of the
data changes with changing time, a capability
that has prevented relational bitemporal
offerings from being widely adopted

Critical for anyone in regulated industries

Even better because of Tiered Storage and
Semantics
[Figure: Events 1, 2, and 3 plotted against Valid Time and System Time axes]
Valid Time – Real-world
time, information “as it
actually was”
System Time – Time it
was recorded to the
database
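The two timelines can be illustrated with a small in-memory model. This is a minimal sketch, not MarkLogic's bitemporal API: the `Version` class, the `as_of` function, the use of plain years as timestamps, and the address data are all assumptions made for illustration.

```python
from dataclasses import dataclass

FOREVER = 9999  # sentinel "end of time"; years stand in for timestamps (assumption)


@dataclass(frozen=True)
class Version:
    value: str
    valid_from: int   # when the fact became true in the real world
    valid_to: int     # exclusive end of real-world validity
    system_from: int  # when this version was recorded in the database
    system_to: int    # exclusive; when this version was superseded


def as_of(versions, valid_at, system_at):
    """Return the value believed at system_at to be true at valid_at, or None."""
    for v in versions:
        if (v.valid_from <= valid_at < v.valid_to
                and v.system_from <= system_at < v.system_to):
            return v.value
    return None


# An address recorded in 2010, then corrected in 2013: the move really happened in 2012.
history = [
    Version("Oak St", 2010, FOREVER, 2010, 2013),  # original belief, superseded in 2013
    Version("Oak St", 2010, 2012, 2013, FOREVER),  # corrected: Oak St only until 2012
    Version("Elm St", 2012, FOREVER, 2013, FOREVER),
]

print(as_of(history, valid_at=2012, system_at=2012))  # Oak St (as it was recorded then)
print(as_of(history, valid_at=2012, system_at=2014))  # Elm St (as it actually was)
```

Nothing is overwritten when the correction arrives: the old version is closed on the system axis, so both answers remain queryable.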
Management API
REST-based API to manage all MarkLogic capabilities

Increase efficiency and agility by automating time-consuming repetitive tasks across production,
testing and development

Reduce setup time and admin error by
orchestrating multi-step configurations and
deployments

Fit more seamlessly into IT environments by using
REST interfaces rather than CLIs or proprietary APIs

Perform automated testing and monitor
performance using market tools that support
REST

Even better with Client REST API, Elasticity
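A minimal sketch of scripting against a REST-based management interface with Python's standard library only. The port (8002), the `/manage/v2/databases` path, and HTTP digest authentication are assumptions about a default installation and are not stated on the slide; `manage_url` and `fetch` are hypothetical helper names, and the host and credentials are placeholders.

```python
import json
import urllib.request


def manage_url(host, resource, port=8002):
    """Build a management URL; port 8002 and the /manage/v2 prefix are assumed defaults."""
    return f"http://{host}:{port}/manage/v2/{resource}?format=json"


def fetch(url, user, password):
    """GET a resource using HTTP digest authentication and decode the JSON body."""
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(passwords)
    )
    with opener.open(url) as response:
        return json.load(response)


# Building the URL needs no server; calling fetch() requires a reachable instance.
print(manage_url("localhost", "databases"))
```

Because the interface is plain HTTP, the same call can be issued from any scripting or monitoring tool that speaks REST, which is the point the bullets above make.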
Incremental Backup
Faster backups while using less storage

Store only changes since the previous full or incremental backup

Consume less storage for backup copies

Reduce backup window

Improve availability with multiple daily backups

Work with Log Archiving to enable fine-grained point-in-time recovery
[Figure: weekly timeline, Sunday through the following Sunday, showing FULL backups and INCREMENTAL BACKUP (differential)]
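The idea of storing only changes since the previous backup can be sketched with an in-memory dictionary standing in for a database. This is an illustrative model under stated assumptions, not MarkLogic's backup mechanism: `take_backup` and `restore` are hypothetical names, integers stand in for timestamps, and deletions are ignored for brevity.

```python
def take_backup(database, last_backup_time, now):
    """Copy only documents modified after last_backup_time (None means a full backup)."""
    changed = {
        uri: (content, modified)
        for uri, (content, modified) in database.items()
        if last_backup_time is None or modified > last_backup_time
    }
    return {"time": now, "docs": changed}


def restore(backups):
    """Rebuild state by applying the full backup, then each incremental in time order."""
    restored = {}
    for backup in sorted(backups, key=lambda b: b["time"]):
        restored.update(backup["docs"])
    return restored


# Each document is stored as (content, last-modified time).
db = {"/a.json": ("v1", 1), "/b.json": ("v1", 1)}
full = take_backup(db, None, now=2)

db["/a.json"] = ("v2", 3)  # only /a.json changes after the full backup
incr = take_backup(db, full["time"], now=4)

print(len(full["docs"]), len(incr["docs"]))  # 2 1
print(restore([full, incr]) == db)           # True
```

The incremental copy holds one document instead of two, which is where the storage and backup-window savings come from.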
Flexible Replication
Customizable information sharing between systems

Enable content collaboration across
numerous systems

Support directly connected or mobile
users

Provide data that users need using
simple configurable parameters or
queries

Ensure data consistency and
security with simple workflows

Even better with Bitemporal and
Management API
Enhanced HTTP Server
Simple and fast client-server interactions out-of-the-box

Use a single interface when employing the REST
API, custom HTTP, or XCC/XDBC to connect to any
database

Deliver ease of use by removing the need to create
extra ports

Simplify the out-of-the-box interaction and
potentially improve the performance of client and server

Provide an improved and more efficient
developer experience with MarkLogic
APPENDIX
Continuous Innovation
Cerisent XQE Server 1
ACID Transactions
Text Based Search
Backup and Restore
Linux Support
Web-based Protocols
HTTP and XDBC
XQuery
2003
2004
MarkLogic Server 4
MarkLogic Server 3
Alerting
Entity Enrichment
Geospatial
Analytics (co-occur.,
value lexicons,
bucketing)
Modular documents
Security auditing
HA: forest-level failover
Advanced Search
Features
Content processing
(including PDF, Word,
Excel, PPT)
HTTP calls
Failover
Support for Linux,
Windows Server, .NET
2005
MarkLogic Server 2
Clustering
Role-based security
w/BASIC authentication
Document Collections
Enhanced Search
(stemming, thesaurus,
wildcard)
WebDAV support
Document locking
Enhanced XDBC support
2006
MarkLogic Server 5
2008
2010
MarkLogic 7
Complete Enterprise
Roadmap
Database Replication
Multi-statement and
distributed transactions
Point-in-time recovery
Start Hadoop Roadmap
Hadoop Connector
2011
MarkLogic Server 3.1
MarkLogic Server 4.1-2
Advanced Search
Features
Wildcard queries
Directories
Forward Compatibility
Support for Sun Solaris
XML Contentbase
Connector (XCC)
Replication
Failover
Database Rollback
Compartment Security
Search Optimizations, API
Information Studio
Application Builder
REST capabilities
SSL support
Schema Validation
Japanese added
2012
MarkLogic 6
Accessibility
SQL/BI
Java/REST/JSON
UDFs/Analytics
mlcp
Hadoop Distributions
HDFS Tech Preview
Semantics Foundation
Next-gen Infrastructure
Support
Elasticity
Tiered Storage
Continue Hadoop
Roadmap
Run on HDFS
2013
2015
MarkLogic 8
JSON Storage
Server-side JavaScript
Semantics
Bitemporal
Samplestack
Java Client API
Node.js Client API
Management API
Incremental Backup
Flexible Replication
Enhanced HTTP Server
MarkLogic / Enterprise NoSQL Database Platform
POWERFUL
AGILE
TRUSTED
Better answers
from today’s data
Adaptive to
every environment
Hardened, proven
platform
MarkLogic is built to find answers in
documents, relationships, and metadata
MarkLogic runs well everywhere, while
preserving the option to change
hardware, data, and scale later
MarkLogic has a proven track record of
performance under all enterprise
conditions
Simpler data
integration
Uncompromised
data resiliency
MarkLogic accelerates and simplifies data
sharing across silos, cutting down on ETL
and making agile development possible
MarkLogic will keep your data safe and
whole—no matter what happens in your
application or at your data center
The intelligent
data layer
An intelligent data layer powers intelligent
applications—and makes them faster and
more flexible than any alternative
// POWERFUL / Deliver more value, build better apps
Native
JSON
Store
Store and manage data natively as
JSON documents, speeding up
development and reducing data
transformation with a simplified
architecture for end-to-end
JavaScript development.
Native
XML
Store
Store and manage data natively as
XML documents, a hierarchical self-describing data type that is ideal for
a wide variety of applications.
Native RDF
Triple Store
Geospatial
Support
Store RDF triples and query them
using SPARQL—providing context
to your data and better search with
a database that can handle a
combination of documents, data,
and triples.
Store geospatial data such as GML,
KML, and GeoRSS and do complex
queries on the data or in
combination with other data types.
Also integrate with ESRI ArcGIS and
Google Maps for visualization.
Full-text
Search
Flexible
Indexes
Bitemporal
Real-time
Alerting
Built-in, lightning-fast search and
query capabilities across hundreds of
billions of documents, plus a full-featured
UX with type-ahead suggestions, facets,
snippeting, relevance ranking, and
language support.
Rely on over 30 sophisticated,
composable indexes including a
universal index, range index,
geospatial index, and triple index—all
designed so that developers can ask
harder questions and get faster
responses.
Handle historical data along two
different timelines, making it possible
to rewind the information “as it
actually was” in combination with “as
it was recorded” at some point in
time.
Create an unlimited number of real-time alerts by email or text using the
alerting API and reverse indexes.
Whenever a document is loaded that
matches a specific query, you’ll know.
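Reverse indexing turns the usual lookup around: stored queries are indexed, and each incoming document is matched against them. The sketch below is a toy model of that idea, not MarkLogic's alerting API. The `AlertRegistry` class, the "all terms must appear" query semantics, and the whitespace tokenizer are assumptions.

```python
from collections import defaultdict


class AlertRegistry:
    """Toy reverse index: maps each term to the alerts whose query requires it."""

    def __init__(self):
        self.required = {}               # alert name -> set of required terms
        self.by_term = defaultdict(set)  # term -> names of alerts mentioning it

    def register(self, name, terms):
        terms = {t.lower() for t in terms}
        self.required[name] = terms
        for t in terms:
            self.by_term[t].add(name)

    def matches(self, text):
        """Return the sorted names of alerts whose terms all occur in text."""
        words = set(text.lower().split())
        candidates = set()
        for w in words:
            # Only alerts sharing at least one term with the document are examined.
            candidates |= self.by_term.get(w, set())
        return sorted(n for n in candidates if self.required[n] <= words)


registry = AlertRegistry()
registry.register("storm-watch", ["storm", "london"])
registry.register("any-london", ["london"])

print(registry.matches("Severe storm expected in London tonight"))
# ['any-london', 'storm-watch']
```

The cost of matching depends on the terms in the new document rather than on the total number of stored alerts, which is what makes a very large number of alerts practical.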
Semantic
Inference
Tiered
Storage
Server-Side
JavaScript
Work with new data that didn’t exist
before. For example, if John lives in
London, and London is in England,
then MarkLogic can infer that “John
lives in England” and then add that
new fact to your semantic search.
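The John and London example can be written as a single rule applied to a set of triples. This is a minimal sketch of forward-chaining inference, not MarkLogic's rule-set mechanism; the predicate names `livesIn` and `isIn` and the function `infer_lives_in` are hypothetical.

```python
def infer_lives_in(triples):
    """Apply one rule until no new facts appear:
    (x livesIn y) and (y isIn z)  =>  (x livesIn z)."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(facts):
            if p != "livesIn":
                continue
            for (s2, p2, o2) in list(facts):
                if p2 == "isIn" and s2 == o:
                    new_fact = (s, "livesIn", o2)
                    if new_fact not in facts:
                        facts.add(new_fact)
                        changed = True
    return facts


base = {("John", "livesIn", "London"), ("London", "isIn", "England")}
inferred = infer_lives_in(base) - base

print(inferred)  # {('John', 'livesIn', 'England')}
```

The loop repeats until nothing new is added, so longer chains (for example, England being in a larger region) are followed as well.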
Store and manage data in different
tiers based on cost and performance
trade-offs, and easily migrate between
tiers without any ETL, additional
software, or expensive infrastructure
changes.
Live in JavaScript. Run JavaScript
near the data for unparalleled power
and efficiency with a high performance
JavaScript runtime inside MarkLogic
using Google’s V8.
Fully
Transactional
Run complex distributed transactions
across multiple documents and
collections with no performance drop-offs
at scale. Production applications
run tens of thousands of transactions
per second for tens of thousands of
users.
// AGILE / Prepare for and respond to change
Scalable
and Elastic
Handle petabytes of data without
over-provisioning, over-spending, or
experiencing downtime,
inconsistency, or risk of data loss.
SQL
Support
Cloud
Ready
(AWS)
Use MarkLogic’s cloud templates to
get up and running quickly on AWS
or other cloud environments,
starting with a three node cluster or
a large cluster with over a hundred
nodes.
Hadoop
and HDFS
Make Hadoop better by connecting
it to MarkLogic and using it as part
of an infrastructure to handle both
operational and analytic workloads.
REST API
Configure and administer MarkLogic
with a single REST-based API. This
provides more programmatic control
than ever before—giving DBAs the
power and flexibility necessary to
run a modern data center.
Multi-OS
Support
Schema
Agnostic
Samplestack
Use a relational SQL data model
within MarkLogic, connecting to SQL-based tools using the ODBC driver,
or execute SQL commands against
relational databases using the
MLSAM open-source XQuery library.
Run MarkLogic on Windows, Linux,
Solaris, and OS X. MarkLogic runs easily
and is easy to set up in your
environment, whether in the cloud,
virtualized, or on premises.
Only use schema when you need it.
Ingest all your data as-is, whether
structured or unstructured, using the
NoSQL document model rather than
being forced to use a predefined
schema.
Get going fast on MarkLogic with
Samplestack, an end-to-end
three-tiered sample application designed to
show developers how to implement a
reference architecture using key
MarkLogic concepts and sample
code.
MarkLogic
Content Pump
XA
Transactions
Ad-hoc
Queries
Index Across
Data Types
MLCP makes it easy to quickly import
or export documents and metadata
from MarkLogic, or to copy from one
database to another using a
command-line tool.
Run distributed transactions across a
cluster using the XA (eXtended
Architecture) standard, which ensures
ACID properties for global transaction
processing.
Don’t plan your queries in advance of
ingesting your data. MarkLogic is
designed for search and discovery so
that you can run any query at any time
and get real-time results.
Use multiple indexes in concert
across multiple data types—giving you
the power to search and query all of
your data.
// TRUSTED / Enterprise-ready for mission-critical uses
Performance
at scale
LDAP and
Kerberos
Security
Security
Certifications
Monitoring and
Management
Scales easily to handle hundreds
of terabytes using shared-nothing
architecture in which data
partitions are completely
independent of each other and
can act independently.
Use third party authentication from
LDAP or Kerberos, making the
most secure NoSQL database
easier to manage.
Secure your data with
government-grade security.
MarkLogic has certified, granular
security for modern data
governance and to handle the
increased complexity of today’s
cyber threats.
Use the Management API for
cluster management, process
automation, access controls,
database cloning, audit trails, and
connections to third-party
interfaces.
Configuration
Management
24/7
Engineering
Support
ACID
Transactions
Flexible
Replication
View and manage the configuration
settings for MarkLogic databases,
forests, application servers, groups
or hosts—and easily propagate
changes across the entire cluster.
Rely on support from the 24/7, all-engineer
support staff, whether you need
answers fast or just want
some friendly tips on saving a few
milliseconds on performance.
Don’t settle for a BASE-ic database.
Use ACID transactions to ensure you
don’t run the risk of encountering
data corruption, stale reads, and
inconsistent data—all of which are
unacceptable.
Enable customizable information
sharing between systems, allowing
for the easy and secure distribution
of portions of data even across
disconnected, intermittent, and
latent networks.
Customizable
Backup
Customizable
Failover
Point-in-time
Recovery
Atomic
Forests
Restore the database quickly with
minimal downtime, relying on full
and consistent backups, hot
configuration changes, and
automatic index optimization without
shutting down the system.
Have confidence that your data is
always available, reducing risk and
avoiding interruptions with
automated local- or shared-disk
failover made possible with shared-nothing architecture.
Roll back to a specified point in time
by replaying journal archives, an
additional feature to ensure disaster
recovery and ease of management.
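Replaying a journal up to a chosen moment can be sketched as follows. This is an illustrative model only: the journal entry layout `(time, op, uri, content)`, the `put` and `delete` operations, and the function `recover_to` are assumptions, not MarkLogic's journal format.

```python
def recover_to(backup_state, journal, target_time):
    """Replay journal entries with time <= target_time on top of a copy of the backup."""
    state = dict(backup_state)  # never mutate the backup itself
    for time, op, uri, content in sorted(journal, key=lambda e: e[0]):
        if time > target_time:
            break  # everything after the target point is ignored
        if op == "put":
            state[uri] = content
        elif op == "delete":
            state.pop(uri, None)
    return state


backup = {"/a.json": "v1"}
journal = [
    (10, "put", "/b.json", "v1"),
    (20, "put", "/a.json", "v2"),
    (30, "delete", "/b.json", None),
]

print(recover_to(backup, journal, target_time=20))
# {'/a.json': 'v2', '/b.json': 'v1'}
```

Choosing `target_time=20` recovers the state just before `/b.json` was deleted, which is the kind of fine-grained rewind that a backup alone cannot provide.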
Manage data in collections of
documents similar to partitions,
called forests, that exist
independently and enable scalability
and elasticity, rebalancing, efficient
operations, and easier data
governance.