Showing posts with label Cloud.

2019-04-16

Google Cloud Next '19

Last week a colleague of mine and I had the chance to visit this year's Google Cloud Next in San Francisco for three days. We were just two of around 37,000 attendees!
The organisation was perfect, and all the talks I attended were very well prepared. There were lots of tracks running in parallel, so I might have missed some interesting ones. Here are the ones I attended:


2018-03-08

Google Cloud OnBoard

There are several dates for Google's Cloud OnBoard "roadshow". It was a nice full-day introduction to the possibilities of Google Cloud Platform, including a small lab on GKE. The event followed these slides quite closely.
A nice rooftop view towards the Alster
Between the topics, videos like the following were shown:

2017-07-17

First Humble Steps With Terraform

If you are interested in cloud provisioning and infrastructure as code, you will sooner or later come across Terraform. So did I, and I want to share my first experiences with it in conjunction with AWS. All code used in this post can be found here.

Installation

Installation is fairly easy: Terraform can be downloaded from its homepage as a single binary, and then you're ready to go. For convenience, you may want to put that binary into a location referenced by your PATH environment variable.

Authentication

Terraform can leverage the authentication scheme used by the AWS CLI. That is, maintain your ~/.aws/credentials as you would for the AWS CLI, and you can start using Terraform right away. The use of profiles is also supported, as shown in the following Terraform code snippet, which sets up the provider:

provider "aws" {
  region  = "${var.region}"
  profile = "${var.profile}"
}
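
The variables referenced here need to be declared somewhere; a minimal sketch (the file split matches the project layout shown later in this post, the actual values are hypothetical) could look like this:

# variables.tf
variable "region" {}
variable "profile" {}

# terraform.tfvars
region  = "ca-central-1"
profile = "staging"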

Dry run

With terraform plan you can always check what actions would be taken by Terraform. This is also very helpful during development.

Collaboration

By default, Terraform stores the current state in a local file called terraform.tfstate. If you work on provisioning in a team, you should set up a remote state with locking support. Terraform supports several backends, and one of those is S3. Locks are held in a DynamoDB table. The following snippet declares the usage of a remote state:

terraform {
  backend "s3" {
    bucket         = "my-first-humble-steps-with-terraform-staging"
    key            = "staging/terraform.tfstate"
    region         = "ca-central-1"
    encrypt        = "true"
    dynamodb_table = "my-first-humble-steps-with-terraform-staging-lock"
  }
}

Before you can initialize the remote state by terraform init, the referenced AWS resources (S3 bucket and DynamoDB table) have to exist. You can create these resources by point-and-click in the console or, guess what, by using Terraform itself:

provider "aws" {
  region  = "${var.region}"
  profile = "${var.profile}"
}

resource "aws_s3_bucket" "terraform_shared_state" {
  bucket = "my-first-humble-steps-with-terraform-${lower(var.profile)}"
  acl    = "private"

  versioning {
    enabled = true
  }

  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_dynamodb_table" "terraform_shared_state_lock" {
  name           = "my-first-humble-steps-with-terraform-${lower(var.profile)}-lock"
  read_capacity  = 5
  write_capacity = 5
  hash_key       = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

DRY

Modularization

What if you want to provision several environments in a similar way? Terraform provides a module concept. Simply bundle the shared resources into a module (here: resources) and put the environment-specific settings into a subdir per environment (here: staging):

.
├── resources
│   ├── outputs.tf
│   ├── resources_test.py
│   ├── resources.tf
│   └── variables.tf
└── staging
    ├── outputs.tf
    ├── remote_state.tf
    ├── staging.tf
    ├── terraform.tfvars
    └── variables.tf

The file resources/resources.tf provides the resources to be provisioned. The file staging/staging.tf just calls the module resources with the variables it expects:

module "resources" {
  source  = "../resources"

  region  = "${var.region}"
  profile = "${var.profile}"
  vpc_id  = "${var.vpc_id}"
  count   = "${var.count}"
}

If you use a module for the first time, you have to run terraform get within the environment subdir once.
A word about variables: resources/variables.tf defines the input expected by the module, and resources/outputs.tf defines the output handed back to the caller. So, this is kind of an interface definition of that module. The caller provides those input variables either by passing the actual values when calling the module (i.e. in staging/staging.tf) or by defining (see staging/variables.tf), setting (see staging/terraform.tfvars) and passing (see staging/staging.tf) the variables. The latter introduces more places to edit when you add a new variable, but on the other hand gives you a more visible interface definition of the environment itself. It's up to you to decide.
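As a rough sketch of such an interface (the exact contents, in particular the instance ID output, are my assumption and not necessarily what the linked repository contains), resources/variables.tf and resources/outputs.tf could look like this:

# resources/variables.tf - the input expected by the module
variable "region" {}
variable "profile" {}
variable "vpc_id" {}
variable "count" {}

# resources/outputs.tf - the output handed back to the caller
output "instance_ids" {
  # Hypothetical output: the IDs of all provisioned instances.
  value = ["${aws_instance.humblebee_instance.*.id}"]
}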

More of the same kind

You don't want to repeat yourself and add redundant code for several resources of the same kind. Therefore, use the resource's count attribute. Using the count.index attribute helps to pick a single resource from a collection of resources. In the following snippet n EBS volumes and n instances are created, plus a single volume attachment for each instance/volume pair:

resource "aws_ebs_volume" "humblebee_volume" {
  count             = "${var.count}"
  availability_zone = "..."
  size              = 8
  encrypted         = true
  tags {
    Name = "Humblebee"
  }
}

resource "aws_volume_attachment" "humblebee_attachment" {
  count       = "${var.count}"
  device_name = "/dev/sdz"
  volume_id   = "${aws_ebs_volume.humblebee_volume.*.id[count.index]}"
  instance_id = "${aws_instance.humblebee_instance.*.id[count.index]}"
}

resource "aws_instance" "humblebee_instance" {
  count                       = "${var.count}"
  ami                         = "..."
  instance_type               = "t2.micro"
  subnet_id                   = "..."
  tags {
    Name = "HumbleBee"
  }
}

By the way, use volume attachments instead of direct EBS block devices mapped to your instances - if you ever want to replace an instance, e.g. because of an AMI upgrade, only the instance and the attachment get replaced, but the EBS volume itself is left untouched, which avoids undesired data loss.
EDIT: I've updated the code snippet above and switched from element(...) to the indexing operator [...] to avoid unnecessary rebuilds of unaffected resources. See here for details.
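For contrast, here is a minimal sketch of the discouraged inline form (the resource name and the values are hypothetical): when such an instance is replaced, Terraform recreates the inline block device along with it.

# Discouraged: the volume's lifecycle is tied to the instance.
resource "aws_instance" "humblebee_inline_instance" {
  ami           = "..."
  instance_type = "t2.micro"
  subnet_id     = "..."

  ebs_block_device {
    device_name = "/dev/sdz"
    volume_size = 8
    encrypted   = true
  }

  tags {
    Name = "HumbleBee"
  }
}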

Data Sources

Besides resources, Terraform also provides data sources to access data outside of the current Terraform environment. For example, you can query the ID of a certain AMI using the aws_ami data source:

data "aws_ami" "amazon_linux" {
  most_recent = true

  filter {
    name   = "owner-alias"
    values = ["amazon"]
  }

  filter {
    name = "name"
    values = ["amzn-ami-hvm-*-gp2"]
  }

  filter {
    name = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }
}

Or you can query the subnets for a given VPC:

data "aws_subnet_ids" "humblebee_subnet_ids" {
  vpc_id = "${var.vpc_id}"
}

With that, you can provision an instance with the latest AMI - and if you provision more than one instance, the instances are spread across the available subnets:

resource "aws_instance" "humblebee_instance" {
  count                       = "${var.count}"
  ami                         = "${data.aws_ami.amazon_linux.id}"
  instance_type               = "t2.micro"
  subnet_id                   = "${element(data.aws_subnet_ids.humblebee_subnet_ids.ids, count.index)}"
  tags {
    Name = "HumbleBee"
  }
}

Testing

One advantage of infrastructure as code is the ability to run automated tests against that code. When it comes to infrastructure, I think of integration tests as tests against running resources. In the Terraform world, kitchen-terraform is such an integration test framework. But testing against running resources in most cases incurs costs.
Therefore, I decided to use unit tests to check what a Terraform run would perform. I came across Terraform Validate for this.

Setup

The setup is quite straightforward - I used virtualenv to set up a separate Python 3.5 environment for this:

$ git clone https://github.com/elmundio87/terraform_validate
$ virtualenv -p python3.5 tfenv
$ source tfenv/bin/activate
$ pip install -r terraform_validate/requirements.txt
$ cd terraform_validate
$ python setup.py install

Test runs

The following test suite checks that Name tags are set, that direct EBS block device mappings are not used, and that EBS volumes are encrypted:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Test suite for Terraform resources."""
import os
import unittest

import terraform_validate


class TestResources(unittest.TestCase):
    """Tests related to resources."""

    def setUp(self):
        self.path = os.path.join(os.path.dirname(
            os.path.realpath(__file__)), '.')
        self.validator = terraform_validate.Validator(self.path)

    def test_tags(self):
        """Checks resources for required tags."""
        tagged_resources = ['aws_ebs_volume', 'aws_instance']
        required_tags = ['Name']
        self.validator.error_if_property_missing()
        self.validator.resources(tagged_resources).property('tags'). \
            should_have_properties(required_tags)

    def test_ebs_block_device(self):
        """Checks instances for NOT having an EBS block device directly mapped."""
        self.validator.resources(['aws_instance']). \
            should_not_have_properties(['ebs_block_device'])

    def test_ebs_volume_encryption(self):
        """Checks EBS volumes for enabled encryption."""
        self.validator.error_if_property_missing()
        self.validator.resources(['aws_ebs_volume']).property('encrypted'). \
            should_equal(True)


if __name__ == '__main__':
    SUITE = unittest.TestLoader().loadTestsFromTestCase(TestResources)
    unittest.TextTestRunner(verbosity=0).run(SUITE)

# vim:ts=4:sw=4:expandtab

Terraform Validate only parses the Terraform code files, but does not trigger a terraform plan run to find out what actions would be performed. This has some implications:
  • *.tfvars files are not recognized, so variable expansion cannot be triggered. Therefore, running tests within the environment subdir is not meaningful, as actual variable expansion does not take place.
  • Variable expansion only takes default values (if defined) into account - see the sketch after this list.
  • Checks on properties that are derived from variables or data sources do not work.
  • Data sources cannot be checked.
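To illustrate the second point: given a (hypothetical) default on the count variable like the one below, a test run expands var.count to 1, no matter what the environment's terraform.tfvars sets.

variable "count" {
  # Terraform Validate only sees this default value; a value set in
  # staging/terraform.tfvars is ignored.
  default = 1
}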
So, unit tests do not substitute integration tests, but can complement them.

Cleanup

That's simple: terraform destroy removes all the resources controlled by Terraform's state file. All other resources are left untouched.

The end

It was fun dealing with Terraform. I hope this might be useful for others as well. If you have questions or suggestions, please leave a comment.

2015-11-26

DevOps Conference 2015

DevOps Conference 2015 in Munich - three days consisting of a workshop day and two conference days.
I've chosen The Docker Basics Workshop (in German) by Peter Roßbach. This workshop was very well prepared and led us through some typical tasks of building and running Docker containers with a good mixture of tips and tricks around managing those containers.
On the first conference day I attended the following sessions:
  1. A keynote entitled DevOps State of the Union by John Willis. He talked a lot about culture, CAMS and ICE. Here you can find a collection of his recent talks.
  2. Define Delivery Pipelines in Jenkins easily with Workflow DSL (in German) by Bernhard Cygan. It was an introduction to the Jenkins Workflow plugin, which provides a nice DSL for workflow definitions within Jenkins - but there seem to be some substantial differences between the commercial and the open source version ... not so nice.
  3. Running Docker on AWS by Jonathan Weiss - a good wrap-up of the application management possibilities within AWS including a live demo of the integration between AWS OpsWorks and Amazon ECS.
  4. Another keynote: Flexibility, how to make data science in banking work by Bart Buter. This was about how to enable data scientists with an agile working environment in the field of Big Data analysis.
  5. Microservices and DevOps Journey at Wix.com by Aviran Mordo. This talk was very entertaining! Aviran talked about their journey from a monolithic to a microservice architecture ... about what to do when, and even more important, about what not to do at the beginning. As the company has to deal with the clash of their company name in German, he also showed their funny German commercial.
  6. Completely Without Any Server? Yes, with AWS Lambda and API Gateway! (in German) by Andreas Mohrhard. AWS Lambda lets you run code without the need to explicitly set up an execution environment - Andreas presented a nice use case for this.
  7. Wake me up before you Go-Go: Lightweight, Fast and Maintainable Web Services in Go (in German) by Philipp Tarasiewicz. A nice intro into the key concepts of Go and how to easily implement a web service.
  8. Panel session: Building happier engineering teams with John Willis, Erkan Yanar, Peter Roßbach and Sebastian Meyen. To summarize it briefly: DevOps people want freedom, respect and challenges; they hate limitations, boring work and a blaming culture.
  9. Three Open Space sessions completed this first conference day.
The second conference day was shorter:
  1. Eight Things that make Continuous Delivery go Nuts by Eduards Sizovs - filled with war stories about how organizations kill CD, DevOps and a good team spirit by their processes and business decisions - very entertaining.
  2. High Throughput Logging with Kafka and Spark by Alexandru Dabija and Viktor Kubinec. They showed a kind of opinionated logging solution based on Kafka and Spark motivated by Elasticsearch not fulfilling their requirements.
  3. Reliable and Flexible Build Resources with Mesos, Docker and CoreOS (in German) by Georg Öttl - a good presentation of how they used Docker containers as Jenkins slaves.
  4. Keynote: Tools, Culture, and Aesthetics: The Art of DevOps by J. Paul Reed - the title says it all and perfectly chimes in with the discussion about the DevOps mindset and culture. 
  5. rkt: the container runtime by Iago Lopez Galeiras. A nice presentation of the core concepts and the architecture behind rkt, which takes a different road to application containers compared to Docker.
I skipped the final panel session.
My personal summary:

  • Docker seems to be everywhere. But we should be cautious and not just follow the hype - ask yourself: Does Docker help in my current use case? Do I understand the implications?
  • If you want to learn a new programming language, learn Go.
  • Culture and people are more important than technology - see CAMS.

2013-11-22

Dublin

I've spent some days in Dublin for the company I'm working for.
On my way to work: the formal garden at the Irish Museum of Modern Art
Good selection of decent burgers: Bobo's on Dame Street
Waiting for the bus opposite to the Bank of Ireland at College Green
I also used the opportunity to join a meetup of the AWS Ireland Usergroup hosted by eircom. About 30 people attended and listened to the following talks:
eircom provided free drinks and a buffet for the AWS Ireland Usergroup meetup

2013-02-20

2011-11-20

Google Developer Day 2011 Berlin

I've been to the Google Developer Day 2011 Berlin ... together with more than 2000 other people.
Right before the keynote ...
It's been a "Google brainwash", but nevertheless very cool ... at least because of the free drinks and food ;-). Maybe you find some video of the different talks around somewhere - here are the talks, I can recommend:




2011-05-17

Amazon Web Services

Today the Java User Group Berlin Brandenburg arranged an event about Amazon Web Services sponsored by adesso AG. There were three presentations followed by a Q&A session:
  1. "State of the Cloud" held by Dr. Werner Vogels, CTO Amazon.com gave a good introduction into how Amazon evolved from an e-commerce company to "the cloud company".
  2. In his talk Attila Narin, AWS Solutions Architect EMEA presented the "Amazon Web Services Architecture Lessions" learned - dealing with availability zones, replication, storage decisions and single point of failures.
  3. Finally, Carlos Conde, AWS Solutions Architect EMEA introduced AWS Elastic Beanstalk in his talk "Deploying Java Applications in the AWS Cloud".