Developing in Handcuffs - University of Minnesota

9th Annual CodeFreeze Symposium
Jeff Lemmerman
Matt Chimento
Medtronic Energy and Component Center
Medtronic
Medtronic Energy and Component Center
• MECC est. 1976
• MECC Components
    Batteries
    Defibrillation Capacitors
    Feedthroughs
    Glass/Metal Feedthroughs
    Precision Molding and Extrusion
• Census – 1200 Employees
• Plant Size – 190,000 Square Feet
    40,000 Manufacturing
    15,000 R&D Labs
    38,000 Office
    97,000 Common, Support, Warehouse
About MongoDB
• Background
    • Founded in 2007 as 10gen
    • First release of MongoDB in 2009
    • $223M+ in funding
• MongoDB
    • Core server
    • Native drivers
    • Version 2.4.9 released 1/10/14
    • Subscriptions, Consulting, Training
    • Monitoring (MMS)
RDBMS Strengths
• Data stored is very compact
• Rigid schemas have led to powerful query capabilities
• Data is optimized for joins and storage
• Robust ecosystem of tools, libraries, integrations
• 40+ years old!
Enter “Big Data”
• Gartner defines it with 3Vs
• Volume
    • Vast amounts of data being collected
• Variety
    • Evolving data
    • Uncontrolled formats, no single schema
    • Unknown at design time
• Velocity
    • Inbound data speed
    • Fast read/write operations
    • Low latency
Is this a BIG data problem?
Where stored?
Mapping Big Data to RDBMS
• Difficult to store uncontrolled data formats
• Scaling via big iron or custom data marts/partitioning schemes
• Schema must be known at design time
• Impedance mismatch with agile development and deployment techniques
• Doesn’t map well to native language constructs
Key Features
• Data stored as documents (JSON-like BSON)
• Flexible-schema
• In schema design, think about optimizing for read vs. storage
• Full CRUD support (Create, Read, Update, Delete)
• Atomic in-place updates
• Ad-hoc queries: Equality, RegEx, Ranges, Geospatial
• Secondary indexes
• Replication – redundancy, failover
• Sharding – partitioning for read/write scalability
Terminology
MongoDB term = RDBMS term
• Collection = Table
• Index = Index
• Document = Row
• Field = Column
• Embedding & Linking = Joining
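
To make the last mapping concrete, here is a minimal sketch of embedding versus linking, written as plain JavaScript objects of the kind the mongo shell accepts. The field names (serialNumber, tests, lotId) and all values are hypothetical and not taken from the talk.

```javascript
// Embedding: child data lives inside the parent document (one read, no join).
const componentEmbedded = {
  _id: 1,
  serialNumber: "C-1001",           // hypothetical field
  tests: [                           // embedded array of sub-documents
    { name: "capacity", result: "pass" },
    { name: "leak", result: "pass" }
  ]
};

// Linking: the parent stores only a reference to a document kept elsewhere.
const lots = [{ _id: "LOT-7", supplier: "internal" }];
const componentLinked = { _id: 2, serialNumber: "C-1002", lotId: "LOT-7" };

// With linking, the "join" is a second lookup done in application code.
function resolveLot(component, lotCollection) {
  return lotCollection.find((lot) => lot._id === component.lotId) || null;
}

const lot = resolveLot(componentLinked, lots);
console.log(componentEmbedded.tests.length, lot.supplier);
```

Embedding favors reads of the whole component; linking avoids duplicating data that many documents share.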
Our experience with MongoDB
• Consulting/Training has been excellent
• Support agreement has been under-utilized
• Emails for security updates etc. are prompt
• Release cycle is frequent
• MongoDB Monitoring Service (MMS)
    • Potential concerns storing db stats externally
• MongoDB Certification now available
• New course coming soon on Udacity
Building First C# Application
• CRUD operations for domain class “Component”
• Create new Visual Studio 2010/2012 project
• Install C# driver – currently 1.8.3
• Domain class annotations
• Authentication
• Replication
• Sharding
Medtronic Confidential
How is data retrieved?
Loading Data Into Central Repository
Download/Install MongoDB
mongodb.org/downloads
Install MongoDB as Windows Service
Create Default Data Directory
C:\data\db
Start Mongod
C:\MongoDB\bin\mongod.exe
MongoDB Shell
C:\MongoDB\bin\mongo.exe
Creating Components
• .insert() always tries to create a new document
• .save() updates the existing document if its _id already exists; otherwise it inserts
• If a document doesn’t have an _id field, one is added
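
The difference between the two calls can be sketched with a small in-memory model. This is not the MongoDB shell or driver: ToyCollection, its integer _id counter (a stand-in for ObjectId generation), and the sample documents are hypothetical and exist only to illustrate the semantics listed above.

```javascript
// In-memory model of a collection keyed by _id; NOT the MongoDB driver.
class ToyCollection {
  constructor() {
    this.docs = new Map();
    this.nextId = 1; // stand-in for ObjectId generation
  }

  // insert: always tries to create; a duplicate _id is an error.
  insert(doc) {
    const stored = { ...doc };
    if (stored._id === undefined) stored._id = this.nextId++; // _id is added
    if (this.docs.has(stored._id)) {
      throw new Error("duplicate key: _id " + stored._id);
    }
    this.docs.set(stored._id, stored);
    return stored._id;
  }

  // save: replaces the document when _id exists, otherwise inserts.
  save(doc) {
    if (doc._id !== undefined && this.docs.has(doc._id)) {
      this.docs.set(doc._id, { ...doc });
      return doc._id;
    }
    return this.insert(doc);
  }
}

const components = new ToyCollection();
const id = components.insert({ name: "battery" });      // creates, adds _id
components.save({ _id: id, name: "battery", rev: 2 });  // same _id: updates
let duplicateRejected = false;
try {
  components.insert({ _id: id, name: "capacitor" });    // same _id: rejected
} catch (e) {
  duplicateRejected = true;
}
console.log(components.docs.size, duplicateRejected);
```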
Reading Components
Reading Components
Returns Null
Updating Components
• The $set operator is used for partial updates
• Without $set, the entire document is replaced
• Pass {multi : true} to update multiple documents; by default only the first match is updated
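
A toy version of these three rules, assuming a plain array stands in for the collection. The update() and matches() helpers and the sample fields are hypothetical; they only mimic the behavior described above and are not the shell's implementation.

```javascript
// Toy update over an array of documents; NOT the MongoDB driver.
function matches(doc, query) {
  return Object.keys(query).every((key) => doc[key] === query[key]);
}

function update(docs, query, change, options = {}) {
  let modified = 0;
  for (let i = 0; i < docs.length; i++) {
    if (!matches(docs[i], query)) continue;
    if (change.$set) {
      Object.assign(docs[i], change.$set);          // partial update
    } else {
      docs[i] = { _id: docs[i]._id, ...change };    // whole-document replace
    }
    modified++;
    if (!options.multi) break;                      // default: first match only
  }
  return modified;
}

const parts = [
  { _id: 1, type: "battery", status: "new", lot: "A" },
  { _id: 2, type: "battery", status: "new", lot: "B" },
  { _id: 3, type: "capacitor", status: "new", lot: "C" }
];

const first = update(parts, { type: "battery" }, { $set: { status: "tested" } });
const all = update(parts, { type: "battery" }, { $set: { status: "shipped" } }, { multi: true });
update(parts, { _id: 3 }, { status: "scrapped" });  // no $set: type and lot are lost
console.log(first, all, parts[2]);
```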
Deleting Components
Works like .find()
Drops collection
Drops database
Creating Components - CompRepo
mongodb://localhost/database
Creating Components – Add()
Reading Components
Updating Components – Save()
Save sends entire document back to server
Updating Components – Update()
Update only sends changes
Deleting Components
Needed to add reference to Repo class
Automapping
Authentication
• Clients on localhost connect as admin by default
• Start mongod with config option to disable
• Create read-only user and a write user
• Start mongod with these config options
Replica Sets
Scaling Reads
Sharding
Key Points
• CHOOSE WISELY: SHARD KEY CANNOT BE CHANGED!
• All documents in sharded collection must include the shard key
• Shard key must be an indexed field
• Queries that sort by the shard key are much more efficient
• Mongos handles routing to the correct shard
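
A rough sketch of what routing by shard key means, assuming simple range-based chunks. The shard names, key ranges, and the serialNumber shard key are hypothetical, and the functions below are a toy model rather than how mongos is implemented. It illustrates two of the points above: a document without the shard key cannot be placed, and a query that includes the shard key can be sent to a single shard while any other query has to go to every shard.

```javascript
// Toy model of range-based routing; shard names and ranges are hypothetical.
const SHARD_KEY = "serialNumber";
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 2000, shard: "shardB" },
  { min: 2000, max: Infinity, shard: "shardC" }
];

// Find the chunk whose [min, max) range contains the key value.
function shardFor(keyValue) {
  return chunks.find((c) => keyValue >= c.min && keyValue < c.max).shard;
}

// Every document must carry the shard key so it can be placed.
function routeInsert(doc) {
  if (!(SHARD_KEY in doc)) {
    throw new Error("document is missing the shard key");
  }
  return shardFor(doc[SHARD_KEY]);
}

// Queries on the shard key hit one shard; all others are broadcast.
function targetShards(query) {
  if (SHARD_KEY in query) return [shardFor(query[SHARD_KEY])];
  return chunks.map((c) => c.shard);
}

console.log(routeInsert({ serialNumber: 1500 }));
console.log(targetShards({ serialNumber: 2500 }));
console.log(targetShards({ status: "new" }));
```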
Sharding
What makes a good shard key?
Candidate shard keys shown on the slide: _id field, date, hashed keys, month + username, timestamp, ascending index, geographic location, UUID
Key Learnings
• Working Set < Memory
• Use UTC for dates: ISODate("2012-09-25T03:00:23Z")
• Query values must match the stored data type (“string” vs. integer)
• Download and use other MongoDB tools (MongoVUE)
• Do not convert query results to List<>
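
The data-type and UTC points can be illustrated in plain JavaScript. This is a sketch under assumptions: the lotNumber field and findByLot() helper are hypothetical, and strict equality is used here as a stand-in for the way a string value and an integer value fail to match each other in a query.

```javascript
// Two documents that look alike but store the value with different types.
const stored = [
  { _id: 1, lotNumber: "42" },  // stored as a string
  { _id: 2, lotNumber: 42 }     // stored as a number
];

// Strict equality: the string "42" and the number 42 never match.
function findByLot(docs, value) {
  return docs.filter((doc) => doc.lotNumber === value);
}

// The trailing "Z" marks the timestamp as UTC, as in the ISODate above.
const asUtc = new Date("2012-09-25T03:00:23Z");

console.log(findByLot(stored, 42).length, findByLot(stored, "42").length);
console.log(asUtc.toISOString());
```

Storing timestamps in UTC and converting to local time only for display keeps values comparable across sites and time zones.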
Gaps
• Enterprise acceptance of “new” approach
• Integration with off-the-shelf reporting and analytics
• User interface for managing the database cluster
• Developer familiarity with JSON and MongoDB
• 21 CFR Part 11 Compliance
Questions?
docs.mongodb.org/manual
Medtronic
Collect and store raw data
RDBMS Optimized For Storage
Databases Are
Not ARDS
Waveform Data
ObjectId
• Special 12-byte BSON type that guarantees uniqueness within the collection. The ObjectId is generated from a timestamp, machine ID, process ID, and a process-local incremental counter. MongoDB uses ObjectId values as the default values for _id fields.
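
Because the timestamp comes first, the creation time can be read back out of an ObjectId. The sketch below splits the 24-character hex form into the four parts named above, assuming the 4-byte timestamp, 3-byte machine ID, 2-byte process ID, and 3-byte counter layout used by drivers of this era. The parseObjectId() helper is hypothetical and the sample value is an illustrative ObjectId, not one from the talk.

```javascript
// Split a 24-hex-character ObjectId string into its four components.
// Assumed layout: 4-byte timestamp | 3-byte machine | 2-byte pid | 3-byte counter
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  return {
    // Seconds since the Unix epoch, converted to a JavaScript Date (UTC).
    timestamp: new Date(parseInt(hex.slice(0, 8), 16) * 1000),
    machineId: hex.slice(8, 14),
    processId: parseInt(hex.slice(14, 18), 16),
    counter: parseInt(hex.slice(18, 24), 16)
  };
}

const parsed = parseObjectId("507f1f77bcf86cd799439011"); // illustrative value
console.log(parsed.timestamp.toISOString(), parsed.machineId, parsed.processId, parsed.counter);
```

One practical consequence is that sorting on _id roughly orders documents by creation time.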
Indexing
Without indexes, queries must scan every document in the collection (the equivalent of a table scan)
All collections have an index on the _id field
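
The cost difference can be sketched by counting how many documents each approach examines. This is an in-memory illustration only: the sample data and helper names are hypothetical, and a JavaScript Map stands in for the index (MongoDB's own indexes are B-tree structures, which this does not model).

```javascript
// Build 1000 sample documents with a hypothetical serialNumber field.
const collection = [];
for (let i = 0; i < 1000; i++) {
  collection.push({ _id: i, serialNumber: "C-" + i });
}

// No index: examine documents one by one until a match is found.
function scanFind(docs, field, value) {
  let examined = 0;
  let found = null;
  for (const doc of docs) {
    examined++;
    if (doc[field] === value) {
      found = doc;
      break;
    }
  }
  return { found, examined };
}

// Index: a lookup structure from field value to document, built once.
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) index.set(doc[field], doc);
  return index;
}

function indexFind(index, value) {
  const found = index.get(value) || null;
  return { found, examined: found ? 1 : 0 };
}

const serialIndex = buildIndex(collection, "serialNumber");
console.log(scanFind(collection, "serialNumber", "C-999").examined);
console.log(indexFind(serialIndex, "C-999").examined);
```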
Backup/Restore
One option is to use mongodump.exe / mongorestore.exe
Aggregation Framework
Write Concern