Tuesday, April 19, 2016

Hadoop Summit Dublin, 2016, Sessions

The published Hadoop Summit Dublin 2016 agenda does not link to slides or videos, so here is my attempt to make them easier for folks to find.

Day 1: April 13th, 2016

Time Session Slides Video
11:30-12:10 Apache Hadoop YARN: Past, Present and Future
Varun Vasudev
? https://www.youtube.com/watch?v=oyoMY--kWFU

Running Spark in Production
Vinay Shukla, Hortonworks
Saisai Shao, Hortonworks
Running-spark-in-production https://youtu.be/OkyRdKahMpk

Is Your Enterprise Data Lake Metadata-Driven AND Secure?
Madhan Neethiraj, Hortonworks
? https://www.youtube.com/watch?v=FdO1anQS8dE

Batch is Back: Critical for Agile Application Adoption
Robby Dick, BMC Software
? https://www.youtube.com/watch?v=FoEBZAwnwKc

How To: A Beginners Guide to Becoming an Apache Contributor
Venkatesh Sellappa, Teradata UK Limited
? https://www.youtube.com/watch?v=FqYOTSClJeU

It's Not the Size of Your Cluster, It's How You Use It
David Darden, Big Fish Games
Don Smith, Big Fish Games
? https://www.youtube.com/watch?v=FKt4Grj5rsU

Taming the Elephant: Efficient and Effective Apache Hadoop Management
Paul Codding, Hortonworks
? https://www.youtube.com/watch?v=sCB6HmfdTZ4

12:20-13:00 Scale out Resource Management at Microsoft Using Apache YARN
Raghu Ramakrishnan, Microsoft Corporation
? https://www.youtube.com/watch?v=c_F_sNOqo1M

Using a Data Lake at the Core of a Life Assurance Business: Solutions for Data in one Place and Putting the Customer First
Raj Mukherjee, Zurich Insurance
Chris Murphy, Zurich Life Insurance, UK
? https://www.youtube.com/watch?v=MULvBPP-h2E

Unified Stream & Batch Processing with Apache Flink
Ufuk Celebi, data Artisans GmbH
? https://www.youtube.com/watch?v=8Uh3ycG3Wew

Tame That Beast: How to Bring Operations, Governance and Reliability to Hadoop
Stefan Radtke, EMC
? https://www.youtube.com/watch?v=9gWCES_Ntu8

Apache NiFi in the Hadoop Ecosystem
Bryan Bende, Hortonworks
? https://www.youtube.com/watch?v=V77M-8ABrdE

Querying the Internet of Things: Streaming SQL on Kafka/Samza and Storm/Trident
Julian Hyde, Hortonworks
? https://www.youtube.com/watch?v=EnzZX8voLcg

14:10-14:50 HDFS: Optimization, Stabilization and Supportability
Chris Nauroth, Hortonworks
https://t.co/7oAQ4Qkhnp https://www.youtube.com/watch?v=R-BjP1iQ5lU

Empower Data-Driven Organizations with HPE and Hadoop
Gilles Noisette, Hewlett Packard Enterprise
? https://www.youtube.com/watch?v=P5GXf9xU2kk

Ingest and Stream Processing - What Will You Choose?
Pat Patterson, StreamSets
Ted Malaska, Cloudera Inc
? https://www.youtube.com/watch?v=wdX_uvfVP0g

Advanced Execution Visualization of Spark Jobs
Marton Balassi, Hungarian Academy of Sciences
Zoltán Zvara, Hungarian Academy of Sciences
? https://www.youtube.com/watch?v=M3yNafTnZl4

Powering a Virtual Power Station with Big Data
Michael Bironneau, Open Energi
? https://www.youtube.com/watch?v=sMQ3NaDRXbg

Implementing the Business Catalog in the Modern Enterprise: Bridging Traditional EDW and Hadoop
Andrew Ahn, Hortonworks
? https://www.youtube.com/watch?v=BoduvxVapy8

15:00-15:40 Rocking the World of Big Data at Centrica
Reuben Banga, Centrica
? https://www.youtube.com/watch?v=mjqAXHWcsNE

On-Demand HDP Clusters using Cloudbreak and Ambari
Karthik Karuppaiya, Symantec
Narendra Bidari, Symantec
? https://www.youtube.com/watch?v=QIPghkoi7j0

The Future of Apache Storm
P. Taylor Goetz, Hortonworks
? https://www.youtube.com/watch?v=QkBTrvob1hk

HBase and Spark: Leveraging your Non-Relational Datastore in Batch and Streaming Applications
Ted Malaska, Cloudera Inc
Jonathan Hsieh, Cloudera


Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Guillaume Germaine, EDF R&D
Thomas Vial, Octo Technology


No Time to Waste: From Data Warehousing to Modern Data Architecture in 4 Easy Sprints
Hessel Miedema, Capgemini
Andrea Capodicasa, Capgemini


The Heterogeneous Data Lake: Analytics in the World of Specialized Datastores
Tomer Shiran, Dremio

16:10-16:50 The Past, Present, and Future of Hadoop at LinkedIn
Carl Steinbach, LinkedIn


Fast Distributed Online Classification and Clustering
Prasad Chalasani, Media Math


Real-World NoSQL Schema Design
Tugdual Grall, MapR Technologies


Large-Scale Stream Processing in the Hadoop Ecosystem
Gyula Fora, King
Marton Balassi, Hungarian Academy of Sciences


Machine Learning in Big Data – Look Forward or Be Left Behind
Bill Porto, RedPoint Global Inc.


Apache Atlas: Tracking Dataset Lineage Across Hadoop Components
Andrew Ahn, Hortonworks


Hadoop and Friends as Key Enabler of the IoE – Continental’s Dynamic eHorizon
Dr. Thomas Beer, Continental Automotive

17:00-17:40 TensorFlow: Large-Scale Deep Learning For Intelligent Computer Systems
Ram Ramanathan, Google


Hive on ACID
Alan Gates, Hortonworks


Outlier Analysis and Anomaly Detection for Sensors with Spark and Storm
Casey Stella, Hortonworks


Hadoop in the Cloud: Real World Lessons from Enterprise Customers
Rashim Gupta, Microsoft Corp.


Accelerating Apache Hadoop through High-Performance Networking and I/O Technologies
Dhabaleswar K (DK) Panda, The Ohio State University


Connecting Everything
Patrick de Vries, KPN


Cooperative Data Exploration with IPython Notebook
Piotr Lusakowski, deepsense.io


Migrating Hundreds of Pipelines in Docker Containers
Noa Resare, Spotify


The Hadoop Deployment Strategy at Renault Group
Kamélia Benchekroun, Renault

Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
Slim Baltagi, Capital One Financial Corporation

Securing Spark on Production Hadoop Clusters
Marcelo Vanzin, Cloudera


The Key to Unlocking the Value in the Internet of Things? Managing the Data!
Ron Bodkin, ThinkBig, a Teradata company


HBase on Steroids with In-Memory Compaction
Eshcar Hillel, Yahoo!


Day 2: April 14th, 2016

Scaling out to 10 Clusters, 1000 Users, and 10,000 Flows: The Dali Experience at LinkedIn
Carl Steinbach, LinkedIn

MLLeap: Or How to Productionize Data Science Workflows using Spark
Mikhail Semeniuk, Shift Technologies
Hollin Wilkins, Truecar

Bringing HBase Data Efficiently into Spark with DataFrame Support
Zhan Zhang, Hortonworks

Protecting Enterprise Data In Hadoop
Owen O'Malley, Hortonworks

Hadoop Helps Deliver High Quality, Low Cost Healthcare Services
Ranadip Chatterjee, Healtrix Ltd

Evolving HDFS to a Generalized Distributed Storage Subsystem
Sanjay Radia, Hortonworks

Securing Hadoop in an Enterprise Context
Hellmar Becker, ING

Tailored for Spark
John Scheibmeir, eBay
Petr Igrevski, eBay

Apache Hadoop YARN and the Docker Container Runtime
Sidharta Seethana, Hortonworks

Hadoop Platform at Yahoo: A Year in Review
Sumeet Singh, Yahoo!, Inc.

Deep Recurrent Neural Networks for Sequence Learning in Spark
Yves Mabiala, Thales

Telstra's Tale of Hadoop in the Enterprise
Chris Ottinger, Telstra

Production Grade Data Science for Hadoop
Villu Ruusmann, Openscoring OÜ

Why Big Data Management Requires Hierarchical Taxonomies
Andrew Ahn, Hortonworks

Apache Hive 2.0: SQL, Speed, Scale
Alan Gates, Hortonworks

Telematics with Hadoop and NiFi
Adam Morton, Admiral Insurance
Simon Elliston Ball, Hortonworks

Apache Zeppelin, Helium and Beyond
Moon soo Lee, NFLabs

Smart Data for a Predictive Bank
Bart Buter, ING
Alex Buijsman, ING Bank

Planning with Polyalgebra: Bringing Together Relational, Complex and Machine Learning Algebra
Julian Hyde, Hortonworks

Organising The Data Lake – Information Governance In A Big Data World

Big Data Application Architectures - IoT
Nishant Thacker, Microsoft

Hadoop and Other Animals
Matthew Aslett, 451 Research

Apache Eagle - Monitor Hadoop in Real-time
Yong Zhang, eBay
Arun Manoharan, eBay

Zeppelin + Livy: Bringing Multi Tenancy to Interactive Data Analysis
Prabhjyot Singh, Hortonworks
Jianfeng Zhang, Hortonworks

Practical Advice to Build a Data Driven Company
Simon Maby, Octo Technology

Using Natural Language Processing on Non-Textual Data with MLlib
Casey Stella, Hortonworks

Hadoop Everywhere: Geo-Distributed Storage for Big Data
Nikhil Joshi, EMC
Vishrut Shah, EMC

Log, I am Your Father. The Role of Machine Data in the IoT
James Hodge, Splunk

The Evolution of Apache Kylin: Realtime and Plugin Architecture in Kylin2
Yang Li, Kyligence Inc.

Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Enis Soztutar, Hortonworks

Surviving The Hadoop Revolution
Scott Gray, IBM
Adriana Zubiri, IBM

How Do You Decide Where Your Customer Was?
Burak Isikli, Turkcell

LLAP: Sub-Second Analytical Queries in Hive
Sergey Shelukhin, Hortonworks

Benefits of Hadoop as Platform as a Service
Aaron Call, Barcelona Supercomputing Center

Working with the Type Safe Scalding API
Justin Coffey, Criteo
Sofian Djamaa, Criteo

Detecting Persistent Threats Using Sequence Statistics
Ted Dunning, MapR Technologies

Wednesday, April 13, 2016

Running Spark in Production session at Hadoop Summit Dublin, 2016

It was a great experience presenting to 200+ people. Here are the slides and here is the YouTube video.

#hadoopsummit #HS16Dublin #apachespark #sparksecurity #ThankYou