Hands-on Node
Welcome!
Install MarkLogic
Install Node.js v4, NPM
Install git client
Hands-on Node
– git clone https://github.com/dmcassel/handson-node
– cd hands-on-node
– npm install
– cd ..
Geophoto
– git clone https://github.com/marklogic/geophoto
– cd geophoto
– npm install -g bower
– npm install && bower install
– cd ml-setup
– node setup.js bootstrap
– cd ..
– cd import
– node import.js ../data/photos
SLIDE: 1
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
MarkLogic: Hands-on with Node.js
David Cassel @dmcassel
© COPYRIGHT 2015 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Agenda
MarkLogic
Demo: Geophoto
Feature walk-through
Hands-on with Node
SLIDE: 3
MarkLogic / Enterprise NoSQL Database Platform
POWERFUL – Native JSON Store, Native XML Store, Native RDF Triple Store, Geospatial Support, Full-text Search, Flexible Indexes, Bitemporal, Real-time Alerting, Semantic Inference, Tiered Storage, Server-side JavaScript, Fully Transactional
AGILE – Scalable and Elastic, Cloud Ready (AWS), Hadoop and HDFS, REST API, SQL Support, Multi-OS Support, Schema Agnostic, Samplestack, MarkLogic Content Pump, XA Transactions, Ad-hoc Queries, Index Across Data Types
TRUSTED – Performance at scale, LDAP and Kerberos Security, Security Certifications, Monitoring and Management, Configuration Management, 24/7 Engineering Support, ACID Transactions, Flexible Replication, Customizable Backup, Customizable Failover, Point-in-time Recovery, Atomic Forests
Enterprise NoSQL Database Platform
Flexible Data Model – Store and manage JSON, XML, RDF, and Geospatial data with a document-centric, schema-agnostic database
Search and Query – Lightning fast, sophisticated, sub-second search and query across all of your data
Semantics – Store and query linked data as RDF and SPARQL
Scalability and Elasticity – Scale to petabytes of data without over-provisioning or over-spending
ACID Transactions – Avoid data loss, data corruption, and stale reads, even at speed and scale
Certified Security – Government-grade, granular, role-based security
Hadoop Integration – Make your Hadoop better by connecting it to MarkLogic
Flexible Data Model
Store and manage JSON, XML, RDF, and Geospatial data with a
document-centric, schema-agnostic database
JSON, XML, RDF, Geospatial data, and also
large binaries—all stored and managed on a
single unified platform
Document-centric and schema-agnostic for
agility, reducing lost fidelity and functionality
from data conversion and brittle ETL
Use the data format that makes the most sense,
keeping the data in its most readable form
SLIDE: 6
Flexible Data Model
Schema-agnostic, structure-aware
<report>
  <title>Suspicious vehicle near airport</title>
  <date>2012-11-12Z</date>
  <type>observation/surveillance</type>
  <threat>
    <type>suspicious activity</type>
    <category>suspicious vehicle</category>
  </threat>
  <location>
    <lat>37.497075</lat>
    <long>-122.363319</long>
  </location>
  <description>A blue van with license plate ABC 123 was observed parked behind the airport sign…
    <triple><subject>IRIID</subject><predicate>isa</predicate><object>license-plate</object></triple>
    <triple><subject>IRIID</subject><predicate>value</predicate><object>ABC 123</object></triple>
  </description>
</report>
SLIDE: 7
Search and Query
Built-in search to find answers in documents, relationships, and metadata
In MarkLogic, a search is a query, and a query is a search
JavaScript, XQuery, SPARQL
Full-text Search, Rich Query Capability, Geospatial Search, In-database MapReduce, Semantic Search
Ingest your data as-is and rely on over 30 sophisticated indexes to get better answers from today’s data
Lightning fast, sub-second search across hundreds of terabytes of data and billions of documents
Powerful, agile development providing complex query capability across heterogeneous data
Full-featured UX with full-text search, type-ahead suggestions, facets, snippeting, highlighted search terms, proximity boosting, relevance ranking, and language support
SLIDE: 8
Universal Index
Which vetted reports contain the phrase blue van?
Term                                Term List
“blue”                              123, 125, 129, 130, 152, 344, …
“van”                               123, 125, 126, 129, 130, 152, …
“observed”                          125, 152, 516, 522, 765, 890, …
“blue van”                          123, 125, 129, 130, 152, 486, …
STEM “observe”                      125, 152, 516, 522, 765, 890, …
<report>                            …
<report>/<location>                 …
<threat>/<category>                 …
<type>suspicious activity</type>    …
<date>2012-11-12Z</date>            …
Collection:Vetted                   …
Role:Analyst + Action:Read          …
Document References: 125, 516, 890, …
MarkLogic indexes…
– Words
– Phrases
– Stemmed words and phrases
– Structure
– Words and phrases in the context of structure
– Values
– Collections
– Security Permissions
SLIDE: 9
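The way term lists answer the question on the Universal Index slide can be sketched in a few lines of JavaScript. This is a minimal illustration under stated assumptions, not MarkLogic's implementation: the helper names `buildIndex` and `intersect`, the `word:`/`phrase:`/`collection:` key prefixes, and the sample documents are all made up for the example. Each term maps to a sorted list of document IDs, and the query is resolved by intersecting two lists.

```javascript
// Toy universal index: every term maps to a sorted list of document IDs.
// Hypothetical sample data; the IDs and text are illustrative only.
const docs = [
  { id: 123, collections: ["Vetted"], text: "a blue van was parked" },
  { id: 125, collections: ["Vetted"], text: "blue van observed near sign" },
  { id: 126, collections: [], text: "a white van and a blue car" },
  { id: 344, collections: ["Vetted"], text: "blue sedan observed" },
];

function buildIndex(documents) {
  const index = new Map();
  const add = (term, id) => {
    if (!index.has(term)) index.set(term, []);
    const list = index.get(term);
    if (list[list.length - 1] !== id) list.push(id); // keep IDs unique
  };
  // Process documents in ID order so every term list stays sorted.
  for (const doc of [...documents].sort((a, b) => a.id - b.id)) {
    const words = doc.text.toLowerCase().split(/\s+/);
    words.forEach((word, i) => {
      add(`word:${word}`, doc.id);
      // Two-word phrases get their own term, like "blue van" on the slide.
      if (i + 1 < words.length) add(`phrase:${word} ${words[i + 1]}`, doc.id);
    });
    doc.collections.forEach((name) => add(`collection:${name}`, doc.id));
  }
  return index;
}

// Intersect two sorted ID lists with a linear merge.
function intersect(a, b) {
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(a[i]);
      i++;
      j++;
    } else if (a[i] < b[j]) {
      i++;
    } else {
      j++;
    }
  }
  return out;
}

const index = buildIndex(docs);
// "Which vetted reports contain the phrase blue van?"
const hits = intersect(
  index.get("phrase:blue van") || [],
  index.get("collection:Vetted") || []
);
console.log(hits); // documents 123 and 125
```

Because every constraint resolves to a list of IDs, adding another condition (a role, an element path) is just one more intersection.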
Range Index
Which vetted reports containing the phrase blue van were submitted before 2015?
Term and Term List: same entries as on the Universal Index slide
Document References: 125, 516, 890, …
Range indexes map document IDs to values, and vice versa, in a compact in-memory representation.
SLIDE: 10
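A two-way mapping between document IDs and values, as described for range indexes, might look like the sketch below. The class name `RangeIndex`, its methods, and the submission dates are assumptions for illustration; the real in-memory layout is not described in the slides. ISO date strings are used so that plain string comparison orders them correctly.

```javascript
// Toy range index over one value per document. Hypothetical names and data.
class RangeIndex {
  constructor(entries) {
    // entries: [docId, value] pairs
    this.byDoc = new Map(entries); // document ID -> value
    this.byValue = [...entries].sort((a, b) =>
      a[1] < b[1] ? -1 : a[1] > b[1] ? 1 : a[0] - b[0]
    ); // pairs sorted by value, then by ID
  }

  // Document ID -> value lookup.
  valueOf(docId) {
    return this.byDoc.get(docId);
  }

  // Value -> document IDs: all documents whose value is strictly below `limit`.
  lessThan(limit) {
    let lo = 0;
    let hi = this.byValue.length;
    while (lo < hi) {
      // Binary search for the first entry with value >= limit.
      const mid = (lo + hi) >> 1;
      if (this.byValue[mid][1] < limit) lo = mid + 1;
      else hi = mid;
    }
    return this.byValue
      .slice(0, lo)
      .map(([id]) => id)
      .sort((a, b) => a - b);
  }
}

const submitted = new RangeIndex([
  [123, "2015-03-02"],
  [125, "2012-11-12"],
  [129, "2016-01-20"],
  [516, "2014-07-30"],
]);

// "... were submitted before 2015?"
const before2015 = submitted.lessThan("2015-01-01");
console.log(before2015); // documents 125 and 516
```

The resulting ID list can be intersected with term lists from the universal index to answer the combined question on the slide.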
Geospatial Index
Which vetted reports about a blue van from before 2015 have a location near the airport?
Term and Term List: same entries as on the Universal Index slide
Document References: 125, 516, 890, …
The Geospatial index is like a 2D range index, with built-in query support for point, box, circle, and complex polygons.
SLIDE: 11
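The circle and box queries mentioned for the Geospatial index can be illustrated with plain JavaScript. This sketch scans a small array instead of using an index, so it shows only what the queries mean, not how MarkLogic evaluates them. The function names, the airport coordinates, the 5 km radius, and the points for documents 516 and 890 are assumptions; document 125 reuses the latitude and longitude from the sample report.

```javascript
// Hypothetical report locations keyed by document ID.
const locations = [
  { id: 125, lat: 37.497075, lon: -122.363319 },
  { id: 516, lat: 37.51, lon: -122.35 },
  { id: 890, lat: 37.77, lon: -122.42 },
];

const EARTH_RADIUS_KM = 6371;
const toRad = (deg) => (deg * Math.PI) / 180;

// Great-circle distance between two points (haversine formula).
function haversineKm(a, b) {
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Circle query: IDs of points within `radiusKm` of `center`.
function withinCircle(points, center, radiusKm) {
  return points
    .filter((p) => haversineKm(p, center) <= radiusKm)
    .map((p) => p.id);
}

// Box query: IDs of points inside a latitude/longitude rectangle.
function withinBox(points, box) {
  return points
    .filter(
      (p) =>
        p.lat >= box.south &&
        p.lat <= box.north &&
        p.lon >= box.west &&
        p.lon <= box.east
    )
    .map((p) => p.id);
}

const airport = { lat: 37.5, lon: -122.36 }; // assumed location
const nearAirport = withinCircle(locations, airport, 5);
const inBox = withinBox(locations, {
  south: 37.49,
  north: 37.5,
  west: -122.37,
  east: -122.36,
});
console.log(nearAirport, inBox);
```

With these sample points, documents 125 and 516 fall inside the circle and only document 125 falls inside the box.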
Triple Index
Which vetted reports about a blue van from before 2015 in this location have a New England license plate?
Term and Term List: same entries as on the Universal Index slide
Document References: 125, 516, 890, …
The Triple index is an index of “facts” expressed as Semantic triples. It can efficiently query and join billions of “linked data” triples.
SLIDE: 12
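Querying and joining triples, as described for the Triple index, can be sketched as pattern matching over subject, predicate, object arrays. Everything in this example is an assumption for illustration: the `match` helper, the `issuedBy` and `partOf` predicates, and the facts themselves. A real triple index would not scan an array, and in MarkLogic such questions are normally expressed in SPARQL.

```javascript
// Hypothetical facts as [subject, predicate, object] triples.
const triples = [
  ["plate:ABC123", "isa", "license-plate"],
  ["plate:ABC123", "issuedBy", "Vermont"],
  ["plate:XYZ789", "isa", "license-plate"],
  ["plate:XYZ789", "issuedBy", "Ohio"],
  ["Vermont", "partOf", "New England"],
  ["Maine", "partOf", "New England"],
];

// Return every triple matching the pattern; null acts as a wildcard.
function match([s, p, o]) {
  return triples.filter(
    ([ts, tp, to]) =>
      (s === null || s === ts) &&
      (p === null || p === tp) &&
      (o === null || o === to)
  );
}

// Join two patterns: plates issued by a state that is part of New England.
const newEnglandStates = new Set(
  match([null, "partOf", "New England"]).map(([state]) => state)
);
const newEnglandPlates = match([null, "issuedBy", null])
  .filter(([, , state]) => newEnglandStates.has(state))
  .map(([plate]) => plate);

console.log(newEnglandPlates); // only plate:ABC123
```

The join works because the object of one pattern (`Vermont`) is the subject of another, which is what makes triples "linked data".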
ACID Transactions
Don’t settle for a BASE-ic database
Reads and writes are durably logged to disk, and
strongly isolated from other transactions
Prevents data corruption, stale reads, and inconsistent
data—common problems with databases that settle for
eventual consistency—and all of which are
unacceptable
No performance drop-offs at scale. Production
applications run tens of thousands of very complex
transactions per second for tens of thousands of users
Accomplished using MVCC (multi-version concurrency
control)
SLIDE: 13
ACID Transactions
Implemented Using MVCC
[Figure: two versions of /articles/doc1.xml drawn as trees of Document, Title, Author, Metadata, and Section nodes; each fragment is labeled with a Creation Timestamp and a Deletion Timestamp (values shown: 423, 628, ∞)]
MVCC Benefits:
– ACID transactions
– Zero-latency search indexing
– High throughput
– Lock-free reads
– Serial writes
– Point-in-time query
– Fast database rollback
SLIDE: 14
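The creation and deletion timestamps in the MVCC figure suggest the following minimal sketch. It is an illustration of the general technique, not MarkLogic's storage engine: the class `VersionedStore`, its methods, and the integer clock are assumptions. An update never overwrites data. It stamps the current version as deleted and appends a new one, so a reader pinned to an older timestamp still sees the older version without taking a lock.

```javascript
// Toy multi-version store. Hypothetical names; timestamps are simple counters.
class VersionedStore {
  constructor() {
    this.clock = 0; // advances once per write
    this.versions = []; // { uri, content, created, deleted }
  }

  // Writing marks the live version as deleted and appends a new version.
  write(uri, content) {
    const ts = ++this.clock;
    const live = this.versions.find(
      (v) => v.uri === uri && v.deleted === Infinity
    );
    if (live) live.deleted = ts;
    this.versions.push({ uri, content, created: ts, deleted: Infinity });
    return ts;
  }

  // Reading at timestamp `at` returns the version visible at that time:
  // created at or before `at` and not yet deleted at `at`.
  read(uri, at = this.clock) {
    const visible = this.versions.find(
      (v) => v.uri === uri && v.created <= at && at < v.deleted
    );
    return visible ? visible.content : undefined;
  }
}

const store = new VersionedStore();
const t1 = store.write("/articles/doc1.xml", "first draft");
const t2 = store.write("/articles/doc1.xml", "second draft");

console.log(store.read("/articles/doc1.xml")); // second draft
console.log(store.read("/articles/doc1.xml", t1)); // first draft
```

Reading at an earlier timestamp is the point-in-time query from the benefits list, and discarding versions created after a given timestamp is one way to picture a fast rollback.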
Government-Grade Security
Certified, granular security for modern data governance
Certified security – Higher security certifications than any other NoSQL database, carrying a Common Criteria Security Certification and certified to run in classified government systems
Granular security – Role-Based Access Control (RBAC) at the document level, and can also employ other models for cell-level security
Data Governance With MarkLogic: Security, Retention, Privacy, Continuity, Provenance, Compliance
SLIDE: 15
MarkLogic 8 / More Powerful, Easier to Use
MarkLogic 8 is more powerful than ever, but remarkably easy to use
Developer Experience
JSON – Unified indexing and query for today’s web and SOA data
Node.js Client API – Enterprise NoSQL database for Node.js applications
Java Client API – NoSQL agility in a pure Java interface
Server-side JavaScript – JavaScript runtime inside MarkLogic using Google’s V8
Semantics – Enterprise triple store, document store, database combined
Bitemporal – Track information along two dimensions of time
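The "two dimensions of time" in the Bitemporal item can be made concrete with a small sketch. It assumes a common bitemporal layout in which each row carries a valid-time range (when the fact was true) and a system-time range (when the database held that belief). The field names, the `asOf` helper, and the address history are hypothetical, and this is not the MarkLogic temporal API.

```javascript
// Open-ended ranges use a far-future date. ISO strings compare correctly as text.
const FOREVER = "9999-12-31";

// Hypothetical history: on 2015-02-01 we recorded that John lives in London.
// On 2015-03-10 we learned he had moved to Paris on 2015-02-15, so the old
// belief is closed in system time and two corrected rows are added.
const history = [
  { value: "London", validStart: "2015-01-01", validEnd: FOREVER,
    systemStart: "2015-02-01", systemEnd: "2015-03-10" },
  { value: "London", validStart: "2015-01-01", validEnd: "2015-02-15",
    systemStart: "2015-03-10", systemEnd: FOREVER },
  { value: "Paris", validStart: "2015-02-15", validEnd: FOREVER,
    systemStart: "2015-03-10", systemEnd: FOREVER },
];

// Half-open range test: start <= t < end.
const within = (start, t, end) => start <= t && t < end;

// What was true at `validAt`, according to what was recorded at `systemAt`?
function asOf(rows, validAt, systemAt) {
  const row = rows.find(
    (r) =>
      within(r.validStart, validAt, r.validEnd) &&
      within(r.systemStart, systemAt, r.systemEnd)
  );
  return row ? row.value : undefined;
}

// Where did John live on 2015-03-01?
const believedThen = asOf(history, "2015-03-01", "2015-03-05"); // as recorded before the correction
const knownNow = asOf(history, "2015-03-01", "2015-04-01"); // as recorded after the correction
console.log(believedThen, knownNow); // London Paris
```

Nothing is ever deleted from `history`, which is what allows both answers to be reproduced later.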
JSON
Unified indexing and query for today’s web and SOA data
Speed up development with powerful
built-in search, transformation, and
alerting capabilities designed for JSON
Reduce lost fidelity and functionality from
data model translations and brittle ETL
Simplify architecture with data, metadata,
and relationships managed consistently
and securely together
Ease modern, end-to-end JavaScript development
    {
      "_id": 1,
      "name": "MarkLogic",
      "supports": [
        {
          "datatype": "XML",
          "year": 2003
        },
        {
          "datatype": "JSON",
          "year": 2015
        }
      ]
    }
SLIDE: 17
Node.js Client API
Enterprise NoSQL database for Node.js applications
Focus on application features rather than
plumbing with out-of-the-box search,
transactions, aggregates, alerting, geospatial,
and more
Move faster to production with proven reliability
at scale
Maximize performance and flexibility—bringing
code to the data
Enable modern end-to-end JavaScript
development
SLIDE: 18
Always open source on GitHub
Participate.
Contribute.
Fork it.
Server-side JavaScript
JavaScript runtime inside MarkLogic using Google’s V8
Run code near the data for unparalleled power, efficiency
Build applications faster from a growing pool of skills, tools
Reduce risk with proven performance and reliability
Decrease brittle ETL and lost fidelity and functionality from JSON data conversions
Pair with Node.js to ease full-stack JavaScript development
[Figure labels: Front End; Middle Tier + Database Layer]
SLIDE: 19
Full-Stack JavaScript
[Figure labels: Angular; Execution: JavaScript, JavaScript, JavaScript; Data Model: JSON, JavaScript object graph, JSON]
SLIDE: 20
Application Logic
Application
{Node, Java} Client API
Extensions
HTTP Data Services
REST Client API
MarkLogic
{JavaScript, XQuery} Built-ins
User code
SLIDE: 21
Extensions
Framework code
After today…
MarkLogic University: instructor-led and on-demand videos
developer.marklogic.com: tutorials, blogs
docs.marklogic.com: function docs, Guides
Quick Start
SLIDE: 22
MarkLogic World 2016
San Francisco | May 9-12 | Park Central Hotel
Tokyo | June 2 | Tokyo Station Hotel
London | June 13-15 | The IET
world.marklogic.com
MarkLogic World 2016
May 9-12, Park Central Hotel, San Francisco
3 Keynotes | 3 Hands-On Technical Workshops
3 MarkLogic University Courses | 4 Days of Content
5 Tracks | 15+ Customer Speakers | 30+ Interactive Sessions
MarkLogic World 2016
May 9-12, Park Central Hotel, San Francisco
MONDAY
MAY 9
MarkLogic University:
Intro to MarkLogic
2-hour Course*
Hands-on Technical
Workshops*
TUESDAY
MAY 10
WEDNESDAY
MAY 11
THURSDAY
MAY 12
Opening Keynote
Breakouts
Sponsor Showcase
Expert 1:1’s
Keynote
Breakouts
Sponsor Showcase
Expert 1:1’s
Keynote
Breakouts
Partner Summit
Welcome Reception
*Advanced sign-up required during registration
MarkLogic World 2016
June 13-15, The IET, London
MONDAY
JUNE 13
TUESDAY
JUNE 14
WEDNESDAY
JUNE 15
MarkLogic University:
Intro to MarkLogic
2-hour Course*
Opening Keynote
Breakouts
Partner Summit
Sponsor Showcase
Expert 1:1’s
Sponsor Reception
Keynotes
Breakouts
Sponsor Showcase
Expert 1:1’s
Hands-on Technical
Workshops*
*Advanced sign-up required during registration
MarkLogic World 2016
June 2, Tokyo Station Hotel
Come learn the whys and hows of MarkLogic Enterprise NoSQL from data
architecture, implementation, advanced development, and best practices.
Hear from world-class customers and take away new ideas on how you can
use MarkLogic to power your organization’s mission-critical applications.
APPENDIX
MarkLogic / Enterprise NoSQL Database Platform
POWERFUL
AGILE
TRUSTED
Better answers from today’s data – MarkLogic is built to find answers in documents, relationships, and metadata
Adaptive to every environment – MarkLogic runs well everywhere, while preserving the option to change hardware, data, and scale later
Hardened, proven platform – MarkLogic has a proven track record of performance under all enterprise conditions
Simpler data integration – MarkLogic accelerates and simplifies data sharing across silos, cutting down on ETL and making agile development possible
Uncompromised data resiliency – MarkLogic will keep your data safe and whole, no matter what happens in your application or at your data center
The intelligent data layer – An intelligent data layer powers intelligent applications and makes them faster and more flexible than any alternative
// POWERFUL / Deliver more value, build better apps
Native JSON Store – Store and manage data natively as JSON documents, speeding up development and reducing data transformation with a simplified architecture for end-to-end JavaScript development.
Native XML Store – Store and manage data natively as XML documents, a hierarchical, self-describing data type that is ideal for a wide variety of applications.
Native RDF Triple Store – Store RDF triples and query them using SPARQL, providing context to your data and better search with a database that can handle a combination of documents, data, and triples.
Geospatial Support – Store geospatial data such as GML, KML, and GeoRSS and do complex queries on the data or in combination with other data types. Also integrate with ESRI ArcGIS and Google Maps for visualization.
Full-text Search – Built-in, lightning fast search and query capabilities across hundreds of billions of documents, plus a full-featured UX with type-ahead suggestions, facets, snippeting, relevance ranking, and language support.
Flexible Indexes – Rely on over 30 sophisticated, composable indexes including a universal index, range index, geospatial index, and triple index, all designed so that developers can ask harder questions and get faster responses.
Bitemporal – Handle historical data along two different timelines, making it possible to rewind the information “as it actually was” in combination with “as it was recorded” at some point in time.
Real-time Alerting – Create an unlimited number of real-time alerts by email or text using the alerting API and reverse indexes. Whenever a document is loaded that matches a specific query, you’ll know.
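The reverse-index idea behind Real-time Alerting (store the queries, then look them up by the content of each incoming document) can be sketched as follows. This is a simplified illustration, not the MarkLogic alerting API: `registerAlert`, `matchAlerts`, the alert names, and the rule that an alert fires when all of its words are present are assumptions.

```javascript
// Reverse index: word -> alerts that require that word. Hypothetical design.
const alertsByWord = new Map();

function registerAlert(name, words) {
  const alert = { name, required: new Set(words.map((w) => w.toLowerCase())) };
  for (const word of alert.required) {
    if (!alertsByWord.has(word)) alertsByWord.set(word, []);
    alertsByWord.get(word).push(alert);
  }
}

// Called when a document is loaded: returns the names of alerts that fire.
function matchAlerts(text) {
  const words = new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
  const found = new Map(); // alert -> how many of its required words appeared
  for (const word of words) {
    for (const alert of alertsByWord.get(word) || []) {
      found.set(alert, (found.get(alert) || 0) + 1);
    }
  }
  // An alert fires only when every one of its required words was seen.
  return [...found]
    .filter(([alert, count]) => count === alert.required.size)
    .map(([alert]) => alert.name)
    .sort();
}

registerAlert("blue-van", ["blue", "van"]);
registerAlert("airport", ["airport"]);
registerAlert("red-truck", ["red", "truck"]);

const fired = matchAlerts(
  "A blue van with license plate ABC 123 was observed parked behind the airport sign"
);
console.log(fired); // airport and blue-van
```

Only alerts that share at least one word with the document are ever examined, so the cost depends on the document rather than on the total number of stored alerts.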
Semantic Inference – Work with new data that didn’t exist before. For example, if John lives in London, and London is in England, then MarkLogic can infer that “John lives in England” and then add that new fact to your semantic search.
Tiered Storage – Store and manage data in different tiers based on cost and performance trade-offs, and easily migrate between tiers without any ETL, additional software, or expensive infrastructure changes.
Server-side JavaScript – Live in JavaScript. Run JavaScript near the data for unparalleled power and efficiency with a high performance JavaScript runtime inside MarkLogic using Google’s V8.
Fully Transactional – Run complex distributed transactions across multiple documents and collections with no performance drop-offs at scale. Production applications run tens of thousands of transactions per second for tens of thousands of users.
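The Semantic Inference example (John lives in London, London is in England) can be worked through with a tiny forward-chaining sketch. The predicate names `livesIn` and `isIn`, the single hard-coded rule, and the `infer` function are assumptions for illustration; MarkLogic's inference engine and rule sets are not shown in the slides.

```javascript
// Rule applied until no new facts appear:
//   (x livesIn y) and (y isIn z)  =>  (x livesIn z)
function infer(facts) {
  const known = new Set(facts.map((t) => t.join("|")));
  const all = [...facts];
  let added = true;
  while (added) {
    added = false;
    // Iterate over snapshots so new facts are picked up on the next pass.
    for (const [s, p, o] of [...all]) {
      if (p !== "livesIn") continue;
      for (const [s2, p2, o2] of [...all]) {
        if (p2 !== "isIn" || s2 !== o) continue;
        const key = [s, "livesIn", o2].join("|");
        if (!known.has(key)) {
          known.add(key);
          all.push([s, "livesIn", o2]);
          added = true;
        }
      }
    }
  }
  return all;
}

const facts = [
  ["John", "livesIn", "London"],
  ["London", "isIn", "England"],
];
const inferred = infer(facts);
console.log(inferred[inferred.length - 1]); // the new fact: John livesIn England
```

The input array is left unchanged, so the stored facts and the inferred facts stay distinguishable.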
// AGILE / Prepare for and respond to change
Scalable and Elastic – Handle petabytes of data without over-provisioning, over-spending, or experiencing downtime, inconsistency, or risk of data loss.
Cloud Ready (AWS) – Use MarkLogic’s cloud templates to get up and running quickly on AWS or other cloud environments, starting with a three-node cluster or a large cluster with over a hundred nodes.
Hadoop and HDFS – Make Hadoop better by connecting it to MarkLogic and using it as part of an infrastructure to handle both operational and analytic workloads.
REST API – Configure and administer MarkLogic with a single REST-based API. This provides more programmatic control than ever before, giving DBAs the power and flexibility necessary to run a modern data center.
SQL Support – Use a relational SQL data model within MarkLogic, connecting to SQL-based tools using the ODBC driver, or execute SQL commands against relational databases using the MLSAM open-source XQuery library.
Multi-OS Support – Run MarkLogic on Windows, Linux, Solaris, and OS X. MarkLogic runs easily and is easy to set up in your environment, whether in the cloud, virtualized, or on premises.
Schema Agnostic – Only use schema when you need it. Ingest all your data as-is, whether structured or unstructured, using the NoSQL document model rather than being forced to use a predefined schema.
Samplestack – Get going fast on MarkLogic with Samplestack, an end-to-end three-tiered sample application designed to show developers how to implement a reference architecture using key MarkLogic concepts and sample code.
MarkLogic Content Pump – MLCP makes it easy to quickly import or export documents and metadata from MarkLogic, or to copy from one database to another using a command-line tool.
XA Transactions – Run distributed transactions across a cluster using the XA (eXtended Architecture) standard, which ensures ACID properties for global transaction processing.
Ad-hoc Queries – Don’t plan your queries in advance of ingesting your data. MarkLogic is designed for search and discovery so that you can run any query at any time and get real-time results.
Index Across Data Types – Use multiple indexes in concert across multiple data types, giving you the power to search and query all of your data.
// TRUSTED / Enterprise-ready for mission-critical uses
Performance at scale – Scales easily to handle hundreds of terabytes using shared-nothing architecture in which data partitions are completely independent of each other and can act independently.
LDAP and Kerberos Security – Use third party authentication from LDAP or Kerberos, making the most secure NoSQL database easier to manage.
Security Certifications – Secure your data with government-grade security. MarkLogic has certified, granular security for modern data governance and to handle the increased complexity of today’s cyber threats.
Monitoring and Management – Use the Management API for cluster management, process automation, access controls, database cloning, audit trails, and connections to third-party interfaces.
Configuration Management – View and manage the configuration settings for MarkLogic databases, forests, application servers, groups, or hosts, and easily propagate changes across the entire cluster.
24/7 Engineering Support – Rely on the 24/7, all-engineer support staff, whether you need answers fast or just want some friendly tips on saving a few milliseconds on performance.
ACID Transactions – Don’t settle for a BASE-ic database. Use ACID transactions to ensure you don’t run the risk of encountering data corruption, stale reads, and inconsistent data, all of which are unacceptable.
Flexible Replication – Enable customizable information sharing between systems, allowing for the easy and secure distribution of portions of data even across disconnected, intermittent, and latent networks.
Customizable Backup – Restore the database quickly with minimal downtime, relying on full and consistent backups, hot configuration changes, and automatic index optimization without shutting down the system.
Customizable Failover – Have confidence that your data is always available, reducing risk and avoiding interruptions with automated local- or shared-disk failover made possible with shared-nothing architecture.
Point-in-time Recovery – Roll back to a specified point in time by replaying journal archives, an additional feature to ensure disaster recovery and ease of management.
Atomic Forests – Manage data in collections of documents similar to partitions, called forests, that exist independently and enable scalability and elasticity, rebalancing, efficient operations, and easier data governance.