Design options and trade-offs to consider when developing a log ingestion strategy with the Elastic stack


Imagine your team is responsible for business-critical IT services. Say you operate the shared DevOps platform that enables developers in your company to roll out new features and hotfixes. Think of the version control system, CI/CD tools, deployment environments, and so on. These services can run in multiple environments, and their number and type can change over time (e.g., by adding contract testing integration to your CI/CD pipelines). Part of your daily job is to keep a close eye on these services, monitoring their performance…


Benchmarking an Elasticsearch cluster is not for the faint of heart. Sure, benchmarking any distributed system is no joke. But when it comes to Elasticsearch, it is so easy to overlook details that could render your experiments uninformative at best and misleading at worst. This is a problem because empirical testing is the best way to validate your cluster architecture and index/query designs, and the only way to do meaningful capacity planning.

Over the years, I have gained some experience, read interesting articles, learned some lessons, and discussed ideas on how to benchmark Elasticsearch. I recently had the perfect opportunity…

It was a sunny August day when I claimed I would publish the conclusion to this blog series of Elasticsearch exercises “in a few weeks”. Time flies, doesn’t it? Sorry to keep you waiting.

Here is a quick recap: we started by operating and configuring a cluster, then we indexed some data into it, and, finally, we played with mappings and text analysis.

Today, we will practice with searches and aggregations, which are the last two Exam Objectives of the Elastic Certified Engineer exam to be covered.

DISCLAIMER: All exercises in this series are based on the Elasticsearch version currently…

In the previous posts (#1 and #2) of this series, I proposed several exercises in preparation for the Elastic Certified Engineer exam. So far, we have practised operating and indexing data into an Elasticsearch cluster. Today we will work on data mapping and text analysis.

DISCLAIMER: All exercises in this series are based on the Elasticsearch version currently used for the exam, that is, v6.5. Please always refer to the Elastic certification FAQs to check the latest exam version.

Mappings and Text Analysis

The mapping of an index describes the schema of its documents and how to search them. In practice, a mapping tells…

This is the second installment of a four-part series of exercises on how to prepare for the Elastic Certified Engineer exam. In the previous blog post, we practised our ability to deploy, configure, and operate an Elasticsearch cluster. Today we’ll start working with documents, covering the “Indexing Data” Exam Objective of the certification.

Before we begin, let’s get a few definitions out of the way. In Elasticsearch, the basic unit of information to persist data is a (JSON) document. Documents with shared purpose and characteristics can be collected into an index. For example, one might have an index for application…

As I began to prepare for the Elastic Certified Engineer exam, I felt overwhelmed by the number of resources available on the web for self-study. In addition to the monumental Elastic documentation, I could get lost in all sorts of articles, tutorials, open books, webinars, and course materials. Despite this wealth of information, what I missed most were exercises to practice the six objectives of the certification exam.

I briefly mentioned the shortage of training exercises in a previous article on tips and advice for Elastic Certified Engineer candidates. I also promised to address the problem in the near…

At kreuzwerker, we are passionate about emerging technologies. If a technique or a platform is mature enough to offer our clients the optimal solution, we don’t hold back just because it’s new. That explains why our experience with Elasticsearch dates from before its version one came out back in 2013. Fast-forward half a decade (and six major version releases), and Elasticsearch has become the de facto open-source standard for central logging and full-text search features. It was a winning choice.

In the last few years, we have helped many companies get the most out of the Elastic ecosystem: migrating clusters…

At idealo, we dramatically improved the caching efficiency of the ‘Offers of Product’ page by exploiting cache busting techniques and Akamai cloud services.

This is an enhanced and lengthened version of a previous post published on the Kreuzwerker blog.

The beating heart of idealo is the ‘Offers of Product’ page, which aggregates the offers from over 50,000 online shops to enable quick comparisons and purchases. The web application that supports these features is updated almost every day, in the quest for the best user experience. Many of these updates require changes to static assets like images, CSS, and JavaScript files, which are cached in the browser to improve the frontend performance. But, as any web developer will tell you, delivering fresh assets while still…

Guido Lena Cota

Senior IT Consultant / PhD Computer Science
