2012-05-30

NoSQL Matters 2012

A new NoSQL conference, NoSQL Matters, took place in Cologne this May. It ran for two days with three parallel tracks. I attended the following talks.
On Tuesday:
  • Scalable NoSQL - Past, Present and Future by Doug Judd: This keynote opened by showing the changes in hardware, software development, IT applications, research and corporate leadership over the last decades, comparing the pre-internet era with the current internet era. Moving on to NoSQL databases, he introduced three categories for the different database products: auto-sharding, distributed hashing and data consistency. The talk ended with an outlook on present and future developments, focusing on disk drive technology, networking and application trends.
  • NoSQL - A Technology for Real Time Enterprise Applications? by Dirk Bartels: This talk was about big data processing in general - it didn't go into much depth, but showed lots of market analysts' comparisons and mentioned basic theory like the CAP theorem. He also showed how the requirements for NoSQL databases differ between old-style enterprises and web enterprises, and finished with application examples from their customers.
  • Designing for Concurrency with Riak by Mathias Meyer: This talk, given by the author of the Riak Handbook, dealt with data consistency and concurrent writes. He started with per-document changelogs and vector clocks, then introduced some more Riak-specific features like secondary indexes, as well as some general distributed data structures like G-counters.
  • Hypertable - The Storage Infrastructure behind one of the World's Largest Email Services by Doug Judd: Hypertable was built following Google's BigTable architecture and focuses on horizontal scalability. It uses sparse tables and column families.
  • From Tables to Graph. Recommendation Systems, a Graph Database Use Case Analysis by Pere Urbón-Bayes: This talk gave a short introduction to recommendation systems (detecting similarities, measuring the "distance" between items based on their properties etc.), how they relate to graph processing and why graph databases may help. The following graph databases, graph processing frameworks and APIs were briefly introduced: Neo4j, OrientDB, Apache Giraph, Signal/Collect, Blueprints API.
  • Welcome to Redis 2.6 by Salvatore Sanfilippo: The main author of Redis gave an overview of the new features introduced in release 2.6: scripting (uses Lua; scripts are atomic and run within the server), more bit operations, millisecond key expiration, increments by floating-point numbers, serialization of values (dump and restore), append-only file improvements, improvements for small sets, hashes etc.
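To make the G-counter from the Riak talk a bit more tangible, here is a minimal sketch in Python. It is my own illustration of the general idea (one slot per node, merge by taking the maximum per slot), not code from the talk and not Riak's API; the class name `GCounter` and the node names are made up.

```python
class GCounter:
    """Grow-only counter: every node only ever increments its own slot."""

    def __init__(self, node_id):
        self.node_id = node_id   # hypothetical node name, e.g. "node-a"
        self.counts = {}         # node id -> count contributed by that node

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-counter can only grow")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        # The counter's value is the sum over all per-node slots.
        return sum(self.counts.values())

    def merge(self, other):
        # Taking the per-slot maximum makes merging commutative,
        # associative and idempotent, so replicas converge no matter
        # in which order (or how often) they exchange state.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two replicas accept concurrent writes and then exchange state.
a = GCounter("node-a")
b = GCounter("node-b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
```

After the exchange both replicas report the same total, and merging the same state again changes nothing, which is what makes this structure safe under concurrent writes.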
On Wednesday:
  • NoSQL Adoption - What's the Next Step? by Luca Garulli: This was an entertaining keynote about database history (starting with stone tablets and papyrus ...) and the three rules of NoSQL - one is: "If you only have a hammer, everything looks like a nail." ;-). He also mentioned a catalogue of criteria for choosing the right database product and showed some future developments and risks for NoSQL databases.
  • NoSQL - Not Only a Fairy Tale by Timo Derstappen and Sebastian Cohnen: This talk showed the evolution of an ad server's persistence layer (from Amazon S3 to CouchDB and back to Amazon S3 with Redis as a caching layer) and the lessons learned. In their scenario CouchDB didn't scale as needed due to replication and compaction overhead, and CouchDB's strengths like multi-master replication, MVCC and append-only storage weren't needed. Redis's performance impressed them.
  • The No-Marketing Bullshit Introduction to Couchbase Server 2.0 by Jan Lehnardt: This talk was about Couchbase, which offers auto-sharding by introducing so-called vBuckets (which remind me of consistent hashing), automatic failover, a Memcached-compatible API and SDKs for lots of programming languages. Release 2.0 also offers incremental MapReduce, replication across datacenters etc. He also presented a live demo.
  • Apache Cassandra: Real-World Scalability, Today by Jonathan Ellis: This talk started with an overview of Cassandra's high availability features (no single point of failure, multi-master and multi-datacenter awareness etc.). He then went into more detail on partitioning and replication based on consistent hashing (taking just the primary key into account) and on performance features like the log-structured storage engine, row-level isolation and built-in compression. After that he presented lots of examples of Cassandra use cases.
  • NoNoSQL@Google by Olaf Bachmann: This talk showed how Google's Ads Traffic Quality Team makes heavy use of big data analysis. Basically, data is stored as Protocol Buffers. There is heavy use of MapReduce, but it's hard to write, maintain and debug. You can counter this by using Sawzall, but that makes MapReduce inflexible. So the next attempt was Dremel, but that has limitations concerning intermediate and output data size. After that SqlMR (SQL on top of MapReduce) was given a try, but it lacks interactivity and is hard to debug. So at Google, SQL is still the data analysis language of choice, although there are different dialects of it.
  • Theoretical Aspects of Distributed Systems, Playfully Illustrated by Pavlo Baron: This entertaining talk used audience interaction to show several problems and possible solutions in distributed systems: time synchronization, vector clocks, re-hashing, consistent hashing, gossip architecture, hinted handoff, quorum, master election, failure detection, partition tolerance etc.
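Consistent hashing came up in several of these talks, so here is a small Python sketch of the idea as I understand it. It is a generic hash ring with virtual nodes, not the actual implementation used by Cassandra or Couchbase; the class name `HashRing`, the node names, the key format and the choice of SHA-256 are all assumptions for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Hash ring: a key belongs to the first node point at or after its hash."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # points per node; more points smooth the distribution
        self._ring = []        # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        # Any stable hash works; SHA-256 truncated to 64 bits is an assumption.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [point for point in self._ring if point[1] != node]

    def get_node(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # (h,) sorts before every (h, node), so this finds the first point >= h.
        idx = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[idx % len(self._ring)][1]   # wrap around the ring


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}

ring.add_node("node-d")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

The point of the exercise: when a node joins, only the keys that now fall to the new node change owner, whereas a plain `hash(key) % number_of_nodes` scheme would reshuffle most keys (the re-hashing problem).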
The baristas have a break ...

2012-05-27

Usedom

I went to Usedom with a very good friend and enjoyed the great weather - sunny, but not too warm - and the lovely sea air. At least my feet made it into the Baltic Sea. I wouldn't have thought the east coast was so hilly, though - some of the climbs on the bike tour from Heringsdorf via Bansin and Ückeritz to Loddin and back really wore me out ...
Here are a few impressions:
On the beach at Heringsdorf: plan B in case the IT job stops being fun
Classic seaside resort architecture ... although this one could probably do with a bit more work
An alder carr between Bansin and Ückeritz
The Anklamer Tor in Usedom (town)

2012-05-13

Balkanesque ...

Balkan Beat Box played at the Lido yesterday. I had seen them once before, some time ago, at the Festsaal Kreuzberg and was thrilled by them. This time was no less good: driving music, a full house, a great atmosphere. Here is an official video:
The only odd thing was that there were no more drinks after the concert: a Balkanbeats party run by a different promoter followed the concert, and the idea was to get people to leave first and then pay admission again (for the party) - we simply left and had another drink in one of the nearby pubs ...

2012-05-01

Mozart comes from Marzahn!

Over the Spree to Kreuzberg for MyFest. It was lovely once again. I had always wanted to see him live: Yok, a true punk poet - intelligent lyrics set to ukulele and accordion. Then Fil came on and ripped into the wannabe punks "Die Toten Hosen" (not that I don't like them, but punk is somehow different) and Prenzlauer Berg (that's pretty much the done thing these days), and enlightened us that Mozart actually comes from Marzahn.
Fil - today without Sharkey
Death Before Dishonor, a hardcore band from Boston, were full of energy on the stage in front of the Trinkteufel - very nice.
There was plenty going on in front of the stage at the Trinkteufel - "Männermucke" (music for the guys) was on
To finish, I enjoyed some of the delicious Turkish food that the residents sell at little stalls everywhere, and listened to the folk music on and in front of the stage at the Feuerwehrbrunnen.
Folk music on the stage at the Feuerwehrbrunnen