Orientation slides - Ohio State Computer Science and Engineering

High-End Computing Systems Group
An Overview
Gagan Agrawal
Christopher Stewart
Spyros Blanas
Huan Sun
Arnab Nandi
Radu Teodorescu
IEEE Fellow
D. K. Panda, Univ. Distinguished Scholar
Yang Wang
Srini Parthasarathy
Xiaodong Zhang ACM Fellow
Feng Qin
Yinqian Zhang
P. (Saday) Sadayappan IEEE Fellow
IEEE Fellow
Overview
What are we doing?
[Layered diagram, top to bottom]
Applications: Personal app, Scientific computation, Enterprise management, ……
Parallel computing, Database, Cloud, ……
Operating system (File system, Virtual Memory, …)
Virtual machine
Hardware: CPU, Memory, GPU, Ethernet, Hard disk, SSD, InfiniBand, ……
Many professors’ work crosses these boundaries.
Greatest Computing Challenges
• Efficiency: they should be fast
• Cost: they should be cheap
• Reliability: they should not fail
• Security: they should not be hacked
• Scalability: they should support big data
• Energy-efficiency, usability, flexibility, ……
• Users expect all of the above from system software
The world is changing
• Hardware is changing:
– Multi-core, GPU, SSD/NVRAM, RDMA, ……
• Applications are changing:
– Mobile app, medical app, cloud app, larger-scale
computation, ……
• They both call for new system software.
Efficiency
• Single core CPU => Multi-core, GPU, ……
– Today efficiency comes from parallelism.
– Multi-threaded programming is hard.
• Hard drive => SSD, NVRAM, ……
– Calls for new storage abstractions and designs
• 1Gb Ethernet => 10Gb, 40Gb, InfiniBand, ……
– New mechanisms (e.g. RDMA) require redesign
of existing systems
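A minimal sketch of why multi-threaded programming is hard, written for illustration and not taken from the slides. Several threads update one shared counter; the update is a read-modify-write, so it is guarded by a lock to keep updates from being lost. The function name `count_with_lock` and its parameters are hypothetical.

```python
import threading


def count_with_lock(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # counter += 1 is a read-modify-write. Without the lock, two
            # threads could read the same old value and one update would
            # be lost.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    # With the lock in place the total is always num_threads * increments.
    print(count_with_lock())
```

The lock makes the result correct but also serializes the increments, which hints at the tension between correctness and parallel speedup.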
Reliability
• Software has bugs. Hardware can fail.
– Today they can have disastrous consequences.
• It is a classic topic
– Solution 1: test as much as possible. Model checking,
….
– Solution 2: formal verification
– Solution 3: fault tolerance. Assume failures are
unavoidable.
– ……
Security
• Mobile and cloud are popular today
• Is the cloud a safe place to store your or your
company’s private data?
• Is it OK to give my personal information to
Google or Apple or some random mobile app?
Scalability
• We are in a Big Data era.
• We will need more machines to store and
process more data.
• Will adding more resources solve all
problems?
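One way to reason about the last question is Amdahl's law, which is brought in here as an illustration and is not named on the slide. If only a fraction of the work can run in parallel, the serial remainder caps the speedup no matter how many machines are added. The function name `amdahl_speedup` and the 95% parallel fraction used below are assumptions chosen for the example.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speedup predicted by Amdahl's law for a given number of workers.

    parallel_fraction: share of the work that can run in parallel (0 to 1).
    workers: number of machines or cores applied to the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed workload: 95% of the work parallelizes, 5% stays serial.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.95, n), 2))
```

Under this assumed 95% figure the speedup stays below 20x however many workers are added, so more resources alone do not solve every problem.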
Research Areas
Gagan Agrawal
• Interests: High Performance, Data-Intensive,
and Cloud Computing, Middleware/Compilers,
Data/Web Mining
• Recent work: “Exploiting recent SIMD
architectural advances for irregular
applications”, CGO 2016
Spyros Blanas
• Interests: Multi-core Database Systems, Main
Memory Data Processing, Scientific Data
Management
• Recent work: “Forecasting the cost of
processing multi-join queries via hashing for
main-memory databases”. SoCC 2015
Arnab Nandi
• Interests: Data Interaction, Database Usability,
Large-scale Analytics, Cloud data management
• Recent work: “FluxQuery: An Execution
Framework for Highly Interactive Query
Workloads”, SIGMOD 2016
Dhabaleswar K. (DK) Panda
• Interests: High-performance Networking,
Programming Models, Accelerators, Cloud
Computing, File systems and Storage, Power-Aware Designs, Big Data, and Deep Learning
– MVAPICH2 is used by more than 2600 organizations.
• Recent work: “Designing MPI Library with On-Demand
Paging (ODP) of InfiniBand: Challenges and Benefits”,
SC 2016
• Looking for 4 students.
Srinivasan Parthasarathy
• Interests: Data Mining, Parallel & Distributed
Computing Systems, Network Science at Scale
(Social and Biological Networks)
• Recent work: “What Links Alice and Bob?:
Matching and Ranking Semantic Patterns in
Heterogeneous Networks”, WWW 2016 (Best
of Selection)
• Looking for 1 student.
Feng Qin
• Interests: Operating Systems, System Security
and Dependability, Cloud Computing, and
Mobile Computing, Dependable storage
• Recent work: “Crash Consistency Validation
Made Easy”, FSE 2016
P. (Saday) Sadayappan
• Interests: Performance Optimization,
Compilers/Runtime Systems
• Recent work: “PolyCheck: Dynamic
Verification of Iteration Space Transformations
on Affine Programs”, POPL 2016
• Looking for new students.
Christopher Stewart
• Interests: Power management, Data-intensive
services, Empirical studies
• Recent work: “Blending On-Demand and Spot
Instances to Lower Costs for In-Memory
Storage”, INFOCOM 2016
• Looking for 2 students.
Huan Sun
• Interests: Discover, Represent, Search
knowledge
• Recent work: “On Generating Characteristic-rich Question Sets for QA Evaluation”, EMNLP
2016
Discover, Represent, Search Knowledge
[Diagram: Discover (knowledge discovery), Represent (KB construction), Search (question answering)]
What Knowledge to Discover/Represent/Search?
• Factoid questions (easy): What is the capital of California? Who first landed on the moon? Where was Google founded? …
– To answer, we need: answer detection
• Informational queries: Flu treatments; Bluescreen solution; How my customers think? …
– To answer, we need: answer detection & summarization
• Intelligence queries (hard): Treatments to this patient? Code answers for current project; Similar patent/lawsuit cases? …
– To answer, we need: deep analysis, human + machine
Radu Teodorescu
• Interests: computer architecture, nanoscale
technology scaling, reliability, variability and
power management.
• Recent work: “Core Tunneling: Variation-Aware Voltage Noise Mitigation in GPUs”,
HPCA 2016
Yang Wang
• Interests: distributed systems, fault tolerance,
concurrency, scalability, performance
measurement and debugging
• Recent work: “Cheap and Available State
Machine Replication”, USENIX ATC 2016
Xiaodong Zhang
• Interests: Distributed Systems, Operating
Systems, Networking, Cloud data
management, Databases, GPUs
• Recent work: “BCC: reducing false aborts in
optimistic concurrency control with low cost
for in-memory databases”, VLDB 2016
Yinqian Zhang
• Interests: system security
• Recent work: “One Bit Flips, One Cloud Flops:
Cross-VM Row Hammer Attacks and Privilege
Escalation”, USENIX Security 2016
Talk to professors
• Here I can only give a very brief overview.
• Professors’ interests change.
– Their websites may be outdated.
• Talk to them if you are interested.
– They are happy to meet you.
What do professors expect from you?
• Strong programming skills
– Ability to understand, design, implement, and
debug real systems
– Willingness to get hands dirty
• Strong reasoning, proving, and modeling skills
– Make them work on real systems
Achievements
National Leadership Indicators
• Significant funding from competitive sources
– NSF, DOE, DARPA, NIH
– Strong players in National initiatives (DOE FT, DOE Pmodels..,
NSF-SII)
• Conference & editorial leadership
– PC/General Chair: ICDCS, IPDPS, SC, Hot IC, ANCS, HiPC, WWW,
LCPC, ICPP, SIAM DM (at CMH 2010)
– VC/Area Chair: HPDC, IPDPS, Cluster, HiPC, CCGrid, SIGKDD,
ICDM
– IEEE TPDS (Associate EIC), Trans Computers, Micro; JPDC, IEEE
Intelligent Systems, IEEE TKDE, Data Mining and Knowledge
Discovery, Distributed and Parallel Databases
National Leadership Indicators
• Constant presence in top-tier conferences
– You can interact and network with renowned
researchers in your area
• Strong connections with industry
– You will work on problems that matter, and your
solutions could be widely adopted and used
• Conference & editorial leadership
– Your advisor is well-connected and can point you
to the next big problems in his/her area
Deployed Systems and Software (1/2)
• High Performance MPI over InfiniBand and High-Speed Ethernet:
MVAPICH (being used by 2600+ organizations in 81 countries, more
than 0.38 million downloads)
• NFS over RDMA (adopted by Sun)
• High Performance Spark, Hadoop and Memcached over InfiniBand
and RoCE: RDMA-Spark, RDMA-Hadoop and RDMA-Memcached
(being used by 185 organizations in 26 countries, more than 17,000
downloads)
• Tensor Contraction Engine (NWCHEM Comp. Chemistry suite)
• PLuTo Automatic Parallelizer (incorporated in IBM xlc compiler)
Deployed Systems and Software (2/2)
• Eclat Algorithm used in Statistical Analysis tools
• MotifMiner
• Tree and Graph Mining Suites
• CUBE BY feature in Apache Pig
• Clock-Pro page replacement algorithm in Linux
• RCFile storage format for Hadoop data
Your work could be on this slide in a few years!
Student Accomplishments
• Over 60 graduate students doing research in
the Systems area
– Over 55 students funded as RAs
• Many publications at premier conferences
– High performance computing conferences (e.g., SC, IPDPS, ICS)
– Data mining conferences (e.g., KDD, ICDM, SIAM DM)
– Database conferences (e.g., SIGMOD, VLDB, ICDE, PODS, SIGIR)
– ACM and USENIX systems conferences (e.g., SIGMETRICS,
PPOPP, PLDI, PACT, ASPLOS, Eurosys, SOSP/OSDI, ISCA, HPCA,
MICRO)
• 38 Best Paper awards/nominations at major
conferences.
Student Accomplishments, contd.
• Departmental Research awards
• Prestigious Fellowship awards
– 5 IBM Fellowships
– 1 Microsoft Research Fellowship
– 2 CRA Computing Innovation Fellowships
– 1 Presidential Fellowship
– 1 NSF Fellowship
– 1 USENIX Fellowship
– 1 University Fellowship
Student Accomplishments, contd.
• Ph.D. and Masters graduates in great demand
• Academia: Arizona State U., William & Mary,
Michigan State U., U. of Illinois at Chicago, UT
Arlington, New Mexico State U., New Jersey
Institute of Technology, Louisiana State U., U. of
Cincinnati / Cincinnati Children’s Hospital, San
Francisco State U., Stanford U., U. Iowa, Kent U.,
Nanchang U., Drexel U., SUNY Binghamton, U.
Houston, U. Alabama, …
Student Accomplishments, contd.
• Ph.D. and Masters graduates in great demand
• Industry & Government Labs: MSR, IBM TJ
Watson, IBM Almaden, HP Research, Argonne, Oak
Ridge, PNNL, …
• Industry: Google, Facebook, Twitter, Microsoft,
LinkedIn, Amazon, Oracle, Intel, NVidia, SGI,
Teradata, Compaq/Tandem, Lucent, Citrix, Dell,
Yahoo, Cray, Hortonworks, …
Research Opportunities
• Ph.D. Major area
– Extremely strong market demand
– Expect continuing demand due to impact on
foundations as well as technology advances
• Ph.D. Minor area
– Systems cross-cuts many research directions in AI,
Graphics, Networking, Software Engineering
• Masters Thesis
• Masters Project
Relevant Courses
[Course map by area: Parallel Computing, Operating Systems, Architecture, Databases/Data Mining, Languages/Compilers]
Courses shown: CSE 5241, 5242, 5243, 5245, 5343, 5433, 5441, 6341, 6421, 6422, 6431, 6441
• 5441 only in Fall
• 6422, 6441 only in Spring
Specialty Courses
• Special topics (letter graded) / Seminars (S/U graded)
• 2016 Fall or 2017 Spring:
– Huan Sun 5539 Question Answering
– Gagan Agrawal 5449 Topics in High Performance
Computing
– DK Panda 5194 Network-Based Computing for HPC,
Cloud, and Big Data
– Feng Qin 5194 SW Dependability and Security
– Xiaodong Zhang 5449 Big Data Analytics and
Management
Summary
• Addressing cutting-edge research challenges
with focus on multi-disciplinary applications
• Synergistic group with significant growth in
the last few years
• Many research opportunities for bright and
motivated students – as major area or minor
area for Ph.D., M.S. thesis, or M.S. project
https://cse.osu.edu/research/systems