HA Cluster Summit 2015

Brno, Czech Republic

February 4th and 5th

This summit will piggy-back on the Devconf conference in Brno, Czech Republic.

Success!

The HA Summit was a great success! Thanks to all who came!

HA Summit 2015 group picture. Thanks to Keisuke Mori for the photo!

Besides the group photos, there were also more important outcomes:

  • The #clusterlabs channel on freenode was created to cover the converged cluster stack and surrounding components.

Purpose

It's been six years since the very useful meeting of HA developers in 2008.

This summit aims to play a similar role by bringing HA developers and users together to discuss the next five years (or so) of development.

Topics Covered and Notes

Below is the collection of topics discussed. It is an evolving discussion, and nothing is yet "locked down".

Please feel free to continue the discussion on the new Cluster Labs Users mailing list!

NOTE: At this time, we are collecting our notes from the summit. There may well be errors in the discussion notes below. This notice will be removed only after we've reached consensus on these notes.

Attendees: If you remember any details differently, please either edit this wiki directly or post to the mailing list, and we will update this wiki as appropriate.

Community: If you were not an attendee, but you want to contribute your ideas and comments, please do!

Topic Requester Topic
Andrew Beekhof Scaling - with pacemaker-remoted and/or a new messaging/membership layer
  <details to be filled in>
Andrew Beekhof SBD (storage-based death) fencing.
  <details to be filled in>
Andrew Beekhof Containerization of services (cgroups, docker, virt)
  <details to be filled in>
Andrew Beekhof Degraded mode
  <details to be filled in>
David Vossel Resource agents (upstream releases, handling of pull requests, testing)
  <details to be filled in>
Madison Kelly Merging community spaces (IRC channels, mailing lists, etc).
  Attendees agreed in principle to create a new "Cluster Labs" community. Interested companies will contribute to funding this organization.

To avoid xkcd'ing the community, Andrew Beekhof kindly offered to transfer the domain 'clusterlabs.org' to this effort once the new organization takes shape.

The community will have two primary purposes:

  • Coordinate future development efforts and direction.
  • Promote the adoption of open-source clustering.

It was agreed that this new community, though fundamentally focused on HA, would not be restricted to HA. There is enough overlap in HA/HPC/LB clusters that it would be needlessly restrictive to exclude other cluster types.

Likewise, this new community will be functionally focused on Linux, but not exclusively so. It will be of interest to the community to grow adoption of clustering on any platform where a sufficiently interested community exists.

From a commercial perspective, sponsors will link to this new community and will be able to use their membership in the community in their marketing efforts. As much as possible, however, the community will be vendor-neutral.

Madison Kelly Making fencing mandatory (or require a special setting waiving rights to complain if disabled).
  <details to be filled in>
Madison Kelly Introduce new "scan core" alert system and "scan agents". Discuss possible use outside AN!.
  Scan Core is an open source effort to provide a mechanism for collecting and analysing software and sensor data and using it to make intelligent decisions about cluster configurations.

The primary comment here was to explore the possibility of having Scan Core support existing Nagios scan agents.

Michael Schwartzkopff Monitoring of clusters.
  The principal argument here was that the Pacemaker stack and associated tools should fully support SNMP to simplify management of large cluster environments.
Philipp Reisner DRBD 9 introduction and overview
  <details to be filled in>
Kristoffer Grönlund UI / ease of use, cooperation and sharing ideas on improving the interfaces
  OCF standard updates: We now have a process for proposing changes to the OCF standard via pull requests on GitHub. Marek Grac proposed adding features currently used by the fence-agents to the OCF standard, and I (krig) proposed some form of parameter validation to improve the feedback given by UIs for incorrect data in parameters.

Some examples of useful parameter types to validate would be IP address, file, directory, device (disk), integer ranges, email address, executable, boolean...
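
As a rough illustration of what such validation could look like on the UI side, here is a minimal Python sketch. It is not part of the OCF standard or of any existing tool: the function names, the schema format, and the example schema are all assumptions made up for this example, and the email check is deliberately crude.

```python
import ipaddress
import os
import re
import shutil


def is_ip_address(value):
    """True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def is_integer_in_range(value, low, high):
    """True if value is an integer string with low <= value <= high."""
    try:
        number = int(value)
    except ValueError:
        return False
    return low <= number <= high


def is_boolean(value):
    """True if value looks like a common boolean spelling."""
    return value.strip().lower() in {"true", "false", "yes", "no", "on", "off", "1", "0"}


def is_email(value):
    """Very loose shape check (something@something.tld), not RFC-complete."""
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", value) is not None


def is_file(value):
    return os.path.isfile(value)


def is_directory(value):
    return os.path.isdir(value)


def is_executable(value):
    """True if value is an executable path or a command found on PATH."""
    return shutil.which(value) is not None


# Hypothetical schema: parameter name -> check function.
# The parameter names are only illustrative.
EXAMPLE_SCHEMA = {
    "ip": is_ip_address,
    "cidr_netmask": lambda v: is_integer_in_range(v, 0, 32),
}


def validate(params, schema):
    """Return one error message per supplied parameter that fails its check."""
    errors = []
    for name, check in schema.items():
        if name in params and not check(params[name]):
            errors.append(f"{name}: invalid value {params[name]!r}")
    return errors
```

A UI could call `validate(user_input, EXAMPLE_SCHEMA)` before committing a resource definition and show the returned messages next to the offending fields, instead of waiting for the agent to fail at start time.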

Frank Danapfel Requirements for development of resource agents for 3rd party applications from ISVs (for example SAP)
  <details to be filled in>
Michael Adam Integration of CTDB/clustered Samba with pacemaker.
  <details to be filled in>
Dejan Muhamedagic Geo Clustering and booth.
  <details to be filled in>
<Round Table> Recent features (e.g. pacemaker-remoted, crm_resource --restart) and common deployment scenarios (e.g. NFS) that people get wrong

Roadmap: What to expect next.

Unification: Locking and fencing the RH style (cman) and the rest of the world.

Additional / future features in pacemaker

Cluster File Systems: Which one is usable for what application.

Please bring your personal ideas/stories/experiences to this discussion!

  <details to be filled in>

Schedule

The goal of the schedule is to provide guidance and to ensure anyone who has a particular idea or concern to address can do so.

Please note:

  • This is an informal summit! The 2008 summit was a success, in part, because of its loose structure and simple discussion. To maintain that atmosphere, the schedule will not be rigidly adhered to. If a good discussion is underway, it will not be stopped to meet an arbitrary deadline. If this pushes back another discussion, then so be it.
  • No slides! If you want to bring some hand-outs, that is fine. Beyond that, a white-board (or paper flip-board) will be available for fleshing out ideas.

Below are two calendars. If you want to present an idea or lead a discussion, please pick a one hour span. 45 minutes will be for presenting and then 15 minutes should be for direct Q&A. Any unused time will be unstructured "round-table" discussions.

Where possible, try to leave time between your talk and existing scheduled talks to space things out nicely.


This is not meant to be definitive, just a suggestion and a way of boiling down the topics already in the wiki into a set of subheadings. Names are those of the people who suggested the topic, not necessarily session leaders.

Feb 4th

Developer conference, day 1.

9:00 Informal "coffee and code" time
9:30 Introductions
10:00 Pacemaker: scaling, containers, degraded mode (A. Beekhof)
11:30 DRBD9 (P. Reisner)
12:30 Lunch
13:30 Fencing: help, ease of configuring (M. Kelly)
14:00 Heartbeat
15:00 Round Table discussions

  • Recent features
  • Unification (locking, fencing)

17:00 Gather for beer/food

An official (as official as it gets, anyway) dinner and drinks will be had after the summit ends for the day.

Feb 5th

Developer conference, day 2.

9:00 Informal "coffee and code" time
10:00 Monitoring (M. Schwartzkopff)

Scan Core, Scan Agents (M. Kelly)

11:30 Madison Kelly:

Managing/merging resources (IRC and Mailing lists, specifically)

David Vossel:

Resource agents, upstream, testing, submission

12:30 Lunch
13:30 Round Table discussions

  • Future features
  • Filesystems
  • Roadmap

15:00
16:30 Key-signing party (resources)
17:00 Gather for beer/food

For those staying in Brno for the night:

Dinner and sight-seeing will start around 18:00. Location to be determined, suggestions welcomed.

Location

Red Hat Czech s.r.o., Purkynova 99/71, Brno 61245, Czech Republic

The closest tram stop is 'Cervinkova'.

Due to recent building renovation (not yet visible in Google Street View), the reception has been moved. A video, starring a Panda and Eliska Malikova <emalikov@redhat.com>, shows the way from the Cervinkova tram stop to the reception.


Accommodations

The Devconf accommodations information applies. Please visit their website for hotel recommendations.

Participants

Please confirm your participation here, with full name and company (required to receive a guest badge to access the Red Hat Office).

A little blurb about yourself would be helpful for others to get to know you.

If you do not wish to create an account here, or simply don't like wiki formatting, feel free to email digimer and she will add you to the list.

Full Name Company Note
Madison Kelly Alteeve's Niche! Madison, aka 'digimer', builds the Anvil! HA platform for highly available KVM virtual servers. While not a programmer, she tries to contribute via tutorials, documentation, testing and community support. Her strength is in Red Hat's High Availability Add-On (cman + rgmanager on RHEL 6).
Andrew Beekhof Red Hat
Christine Caulfield Red Hat Corosync developer, maintainer of cman & associated cluster components, lover of strace and tcpdump logs, writer of white papers with silly jokes in.
Chris Feist Red Hat
Jan Friesse Red Hat
Ken Gaillot Red Hat Recent hire on the Pacemaker team
Marek Grác Red Hat
Ryan McCabe Red Hat
Jan Pokorný Red Hat Co-maintainer of the CMAN stack config layer, recently working on CMAN -> Pacemaker conversion
David Vossel Red Hat
Ondrej Mulár Red Hat
Tomáš Jelínek Red Hat
Matouš Ejem Red Hat
Fabio M. Di Nitto Red Hat Also known as 'fabbione' or 'The God Father', Fabio is the HA team overlord at Red Hat.
Andreas Grünbacher Red Hat Andreas has recently joined the GFS2 development team.
Kristoffer Grönlund SUSE Mainly works on crmsh, hawk and resource-agents, @krig on Github.
Michael Schwartzkopff sys4 AG Wrote a book about Linux Cluster.
Dejan Muhamedagic SUSE Works on booth, crmsh, resource-agents, and cluster-glue.
Yan Gao SUSE Mainly works on pacemaker, @gao-yan on Github.
Kengo Fujioka NTT Manager at NTT
Keisuke Mori NTT Promoting Pacemaker to the Linux-HA Japan community. Works on pacemaker-1.0/1.1, resource-agents, etc. @kskmori on Github.
Yusuke Iida NTT Japanese Pacemaker developer (and user). Belongs to Linux-HA Japan. @yuusuke on Github.
Frank Danapfel Red Hat Software engineer for Red Hat at the SAP LinuxLab focusing on HA solutions for SAP Products running on RHEL.
Philipp Reisner LINBIT DRBD developer
Lars Ellenberg LINBIT DRBD developer, maintainer of heartbeat
Alexander Hass SAP LinuxLab
Kai Dupke SUSE
Michael Adam Red Hat Samba/CTDB developer at Red Hat Storage
Günther Deschner Red Hat Red Hat - Samba developer / Storage Team
Kaleb Keithley Red Hat Red Hat - Gluster developer / Red Hat Storage Team