ELK Stack Introduction In this post I will tell you about the basics of the ELK stack. For each technology we will discuss its basic features and functions, show examples, and connect its features together. All examples will be done on Ubuntu Server 18.04 LTS with ELK 7.x for Long story short - the ELK Stack is a tool for centralizing logs from your infrastructure in one place, optimized for fast search.
At Cloudinfrastack we are very proud of our international team of local and remote employees. Not only does it create a positive diversity at the workplace, it also encourages effective cooperation with people of different nationalities. We, as a company, greatly benefit from our remote employees who live in different time zones, most notably in terms of the overall effectiveness and flexibility which it offers when tasks need to be completed around the clock.
Databases are often the most critical part of an infrastructure. Failure of a database server often leads to failure of other components because they depend on it to function. Everyone does backups, and it is the first thing you should do if the data stored in your database server is crucial, but backups are run weekly, daily, hourly… And sometimes even a few minutes of outage can cost a lot of money. That’s when high availability comes to mind.
It’s a no-brainer that being part of a rapidly growing and constantly changing industry such as IT requires a different approach in most areas, including the productivity and happiness of your employees. At Cloudinfrastack we believe that our business is not only about collective success, and we also firmly believe that our office is more than just a place where work takes place: it’s an environment that needs to be adjusted to the needs of the employees, as they are the ones who occupy the office space most often.
This is the question we asked ourselves with the goal of shortening response times on Skype and making our work more effective. Everything works great with Mattermost because of its webhooks and integration possibilities. We have deployed monitoring and created through our community a solution, called matterbridge, to connect Mattermost with other platforms like Slack, Rocket.Chat, Discord, and others. The unfortunate realization that it does not connect with Skype gave birth to an idea: to create a bridge between Skype and Mattermost.
Our company runs mainly on open source technologies. It’s no secret – we use OpenStack, Ceph, Nextcloud and many more open source projects. But what is so special about open source software, and why do people think you should prefer it over proprietary software? Firstly, by using open source software, you support the community and the growth of the software. Some companies, like us, also like to innovate on the software and explore its potential and existing features.
FFmpeg is a powerful open-source tool you can use to handle various multimedia files. In this tutorial we will outline the basic functionality of FFmpeg; keep in mind, however, that this is only a small portion of the FFmpeg project. FFmpeg can be downloaded from the official FFmpeg website or installed using a package manager such as APT. The multimedia files FFmpeg processes typically consist of a container with audio and video streams.
It has been a long time since HTTP/2 was introduced to the world. HTTP/2 promised higher speed, fewer packets, multiplexing, and much more. We know there is a profound difference between HTTP/1.1 and HTTP/2, but many problems remain unsolved. That is why the IETF, in collaboration with Google, started developing the newest version of HTTP, version 3, based on the QUIC protocol. HTTP/3 is completely different because it is based on UDP (User Datagram Protocol).
What exactly is Rsync? Rsync is an efficient utility most commonly used for synchronizing files and directories between separate hosts. A typical example of rsync usage would be the following: rsync -avz file root@remote-host:/home/ This command opens an SSH connection to the remote host and then runs rsync there, which works out which specific parts of the locally stored file need to be transferred so that the remotely stored file becomes identical to the locally stored file.
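The idea of transferring only the parts that differ can be illustrated with a minimal Python sketch. It is a simplification and not rsync's actual algorithm: real rsync uses a rolling checksum so it can find matching data at any offset, while this sketch only compares blocks at fixed, aligned positions. The function names, the tiny block size and the use of SHA-256 are assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, chosen so the demo is easy to follow


def block_checksums(data, block_size=BLOCK_SIZE):
    """Checksum every fixed-size block of the receiver's (remote) copy."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def blocks_to_send(local, remote_checksums, block_size=BLOCK_SIZE):
    """Return {block index: bytes} for local blocks the receiver does not have."""
    delta = {}
    for index, start in enumerate(range(0, len(local), block_size)):
        block = local[start:start + block_size]
        checksum = hashlib.sha256(block).hexdigest()
        if index >= len(remote_checksums) or remote_checksums[index] != checksum:
            delta[index] = block
    return delta


def apply_delta(remote, delta, total_length, block_size=BLOCK_SIZE):
    """Rebuild the file on the receiver from its old blocks plus the delta."""
    rebuilt = bytearray()
    for index, start in enumerate(range(0, total_length, block_size)):
        rebuilt += delta.get(index, remote[start:start + block_size])
    return bytes(rebuilt)[:total_length]


if __name__ == "__main__":
    local = b"hello brave new world"
    remote = b"hello cruel old world"
    delta = blocks_to_send(local, block_checksums(remote))
    # After applying the delta, the remote copy matches the local file.
    assert apply_delta(remote, delta, len(local)) == local
    print(f"sent {len(delta)} of {len(block_checksums(local))} blocks")
```

Only the checksums and the changed blocks cross the network in this model, which is why the approach pays off for large files with small changes.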
First, let’s remind ourselves what a master-master setup means. Master-master is a form of multi-master replication. In database replication you have 2 types of nodes – master and slave. In master-slave, changes identified by a group member must be submitted to the designated “master” node. This differs from master-master replication, in which data can be updated by any authorized contributor of the group. Although it is a bit unusual, it’s sometimes better to have your databases in a master-master setup, for example:
In previous blog posts, my colleagues have introduced Prometheus and explained in detail how it works. For those who did not read those posts: Prometheus is basically pull-style monitoring. You have a service which sends requests for metrics, and target nodes respond with them over HTTP or HTTPS. When using Prometheus, you get to a point where you wonder if there is a way to make metric collection redundant. Since there is a thing called Federation, you might think that this is the redundancy you want.
What is systemd? Systemd is a fairly recent part of the Linux operating system – Red Hat started developing systemd in 2010 and it was added to version 15 of Ubuntu Linux back in 2015. It is responsible for booting the system correctly and keeping all its services in their proper state. Basically, systemd is a successor to the init scripts used on older Linux systems. Why should we use systemd services?
Sometimes it is really helpful to use debug mode to see what really happens during the runtime of your code. That is the time for Delve, the Go code debugger. How to install Delve (if you have your Go path set):
    go get -u github.com/derekparker/delve/cmd/dlv
Now you should be able to run dlv version:
    dlv version
    Delve Debugger
    Version: 1.4.0
    Build: $Id: 1990ba12450cab9425a2ae62e6ab988725023d5c $
I have prepared this simple Go code
Our task: Let’s suppose we have a set of 100,000 files placed in 100,000 paths. We need to know the size of each and then make a list of the ones larger than n megabytes, with full paths, without spending ages on it. Simple methods like find and grep in bash are too slow, so in this article we will talk about how we can use Python’s multiprocessing library to process our files.
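A minimal sketch of how such a task could look with a multiprocessing pool. The helper names, the worker count and the chunk size are assumptions for illustration, not the code from the article; the demo builds its own throwaway files so it runs anywhere.

```python
import os
import tempfile
from multiprocessing import Pool

MEGABYTE = 1024 * 1024


def file_size(path):
    """Return (path, size in bytes), or (path, None) if the file is unreadable."""
    try:
        return path, os.path.getsize(path)
    except OSError:
        return path, None


def filter_large(sized, min_megabytes):
    """Keep entries strictly larger than min_megabytes, as sorted absolute paths."""
    threshold = min_megabytes * MEGABYTE
    return sorted((os.path.abspath(path), size) for path, size in sized
                  if size is not None and size > threshold)


def find_large_files(paths, min_megabytes, workers=4):
    """Stat all paths in a pool of worker processes and filter the results."""
    with Pool(processes=workers) as pool:
        # imap_unordered hands out paths in chunks and yields results as they finish.
        sized = list(pool.imap_unordered(file_size, paths, chunksize=256))
    return filter_large(sized, min_megabytes)


if __name__ == "__main__":
    # Hypothetical demo data: one small and one 2 MB file in a temp directory.
    with tempfile.TemporaryDirectory() as root:
        for name, size in (("small.bin", 10), ("big.bin", 2 * MEGABYTE)):
            with open(os.path.join(root, name), "wb") as handle:
                handle.write(b"\0" * size)
        paths = [os.path.join(root, name) for name in os.listdir(root)]
        for path, size in find_large_files(paths, min_megabytes=1):
            print(f"{size / MEGABYTE:.1f} MB  {path}")
```

Reading file sizes is mostly bound by the filesystem rather than the CPU, so the actual speed-up depends on the storage behind the paths; the worker count is worth tuning against real data.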
Sometimes we all need to do some quick and basic setup to monitor our key services. In these cases, this super simple cheatsheet comes into play. This guideline is meant for multi-node Elasticsearch clusters but can also be used to monitor almost anything – just replace the ES exporter with whatever suits your own taste and set up a nice informative dashboard. Setting up Node Exporters.
Cloudinfrastack has been challenged with designing a CDN, storing 200M images and serving them to the web with high-performance caching. In this talk we will present a complete solution, consisting of the Ceph object storage gateway RadosGW, frontend image serving and on-the-fly resizing, and essential tips for designing, running and optimizing such a demanding task with Ceph. Recording from the Cloud and DevOps Meetup on October 16th.
This technical talk covers aspects of designing a data-center fabric and creating it in the virtual space for testing and verification, including a roll-out based on open-source tools. Andreas is a System Engineer with more than 20 years of experience. He is currently working as a Pre-Sales System Engineer at Cumulus Networks. Recording from the Cloud and DevOps Meetup on October 16th.
In recent years, APIs have become an essential part of modern web applications. APIs have taken over most of the work historically done on the server side - authentication and authorization, combining data sources, calling 3rd party services, etc. This puts a lot of pressure on API stability and scalability, yet we still need a sustainable pace for feature development. Karel is a cloud enthusiast and has long-term experience with architecture on both AWS and Google Cloud Platform.
If we are working with monitoring systems, we usually want to know if there is some unusual behavior in our graphs, for example when a disk I/O graph briefly jumps. Such a brief increase is called a spike. But how can we catch spikes correctly if we use Prometheus in our infrastructure? Prometheus is a TSDB (time series database) that can export data to monitoring systems such as Grafana. Prometheus has 4 types of metrics:
When it comes to storing data in the cloud, most users and companies use the Google Drive platform. It is a good place to save your files online and therefore be able to reach them from anywhere. Of course, even Google needs to sustain its services, so some restrictions, like limited cloud space, are common. When you require more practical use in your business, you will reach the point of having two options.
Elasticsearch is a full-text search engine distributed for free under the Apache license. It has a RESTful interface and offers high availability, speed, and scalability. It is developed in Java and can be communicated with over HTTP. Elasticsearch is a schemaless database, so it is not necessary to define the database structure up front because it is created based on the inserted data. It can be included on the list of NoSQL databases.
On hot summer days when the heat is in the air, my mind starts to think about vacation and the time passing by, but business never stops and it’s nice to have everything prepared before you leave the office. Especially when you can use an OpenStack tool called Heat. So, let’s take a look at it. Heat is a very useful orchestration tool for OpenStack users as it provides a way to automate the creation of cloud components.
What is meant by “infrastructure as code”? Infrastructure as code is a way to maintain infrastructure through automated processes and minimize the human effort needed to configure anything from physical bare-metal machines to services running on many virtual hosts. There are 2 ways to do this. The first way is to have automation software running on every host and pulling configuration from servers which hold all the configuration ‘recipes’. The second way is for configuration servers to push configuration to the hosts (insecure – the configuration server needs to have access to all servers).
Since our infrastructure is powered by OpenStack, Cinder takes care of exposing our block devices to virtual machines. And because we value open source software, we use Ceph as the storage backend (as well as LVM in certain setups). In today’s article, I will give you an overview of the Ceph architecture, pinpoint its advantages and disadvantages, and show you a demo of Ceph snapshotting to demonstrate its power and ease of administration.
Let’s say you have just finished installing Prometheus. Full of enthusiasm, you want to take the next step: create the structure of exporters and sort out exactly which services you want to harvest metrics from. If you use it on a small scale, source code control is not your biggest concern, but when you want to collect metrics from your whole infrastructure, you definitely want to know the binaries you are running.
Golang, as a very ops/admin focused language, has a huge community and thus a lot of useful packages that can help us in everyday development around monitoring, graphing, and automation. I’m going to demonstrate a few that I use in most of my programs, either as a substitute for a default package with similar functionality or as totally new functionality that I consider a core need of modern ops/admin tool development.
Why Prometheus? Prometheus is an open-source system monitoring and alerting toolkit originally built at SoundCloud. Since its release in 2012, many companies and organizations have adopted Prometheus, and the project has a very active community. It is developed as an open project, independent of any company or organization. It is based on metrics and is designed to measure and visualise the overall health and performance of services. It is similar to tools like Grafana/Graphite, but offers a more robust and comprehensive feature set, including:
Our Infrastructure We are currently managing over 2000 virtual machines, hundreds of bare metals, tens of services, and tens of user accounts. You can imagine how difficult it was to add or change existing users (change permissions, access, SSH keys, and so on). The Pains of Locally Managed Users Previously, our users were deployed only with Puppet, which is great. However, searching for users in different git repositories and different branches wasn’t the right way.
In our infrastructure we manage mainly Linux hosts, but there are also a few Windows servers that meet clients’ requirements. The best way to manage cloud infrastructure is by automation, using Puppet or Ansible for example. Unfortunately, automation is only effective with a large number of hosts with similar features. We decided to manage all Windows hosts manually because in this case automation (Puppet, Ansible) would be more time consuming.
Our transition from an older Puppet version to Puppet 5 was (and still is) somewhat tricky. For us - managing more than 2000 hosts with Puppet - this task is really time consuming. Every host must be switched manually to make sure no critical changes are applied. Luckily, there are some steps that can be done to simplify this task, some of which I’m going to explain. This will allow you to switch to a newer version of Puppet without losing your precious data, and allow you to use other Puppet 5 features in the future.