Zabbix templates: Linux template

This repository provides a set of templates offered as an alternative to those supplied with Zabbix.

The master branch holds the last stable version of the templates. Please report any issues or bugs you find.

Development of the next version of the templates is done on the devel branch.

If you have changes for these templates, please submit a PR against devel.


The first version has been tagged in the git repository to mark the state of the templates and tools and to make changes easier to track with git commands.
Development of the next versions of the templates will continue on the devel branch. When all changes are ready, devel will be merged into master. This should work better for those who are interested in sufficiently tested templates.
Recent changes
- All templates:
  - change all graphs resolution to 1200×300
- ICMP
  - Screens:
    - new NET::ICMP
- MIB
  - changed the URLs of all OID tree descriptions to ones based on http://www.oidview.com/mibs/ (http://support.ipmonitor.com no longer seems to be available)
  - IF-MIB
    - Applications:
      - new application prototype IF-MIB::interfaces:: for all interface LLD prototype items
    - LLDs:
      - new interfaces LLD discovery[{#IFDESCR},IF-MIB::ifDescr,{#IFOPERSTATUS},IF-MIB::ifOperStatus], with a filter added to remove all interfaces in the ifOperStatus=Down state from the list; switched from {#SNMPVALUE} to {#IFDESCR} as the macro indexing all prototype items
    - Screens:
      - new:
        - IF-MIB::ifHCOctets
        - IF-MIB::ifOctets
  - SNMPv2-MIB
    - Applications:
      - rename mib-2.system to SNMPv2-MIB::system and mib-2.system.snmp to SNMPv2-MIB::snmp to match the MIB naming convention
    - Items:
      - changed update interval units from a number of seconds to m/h/d suffixes
    - Graphs:
      - new normal graph SNMPv2-MIB::snmpPkts with SNMPv2-MIB::snmpPkts OIDs presenting rate of SNMP requests/replies
- OS Linux
  - Items:
    - added MEM:: items descriptions
    - fixed the NET::segments retransmitted item, which now uses a new sed regexp: s/$ *$$.*$ segments retransmitted*/\2/ p/
  - Triggers:
    - fixed typo in name: s/SYS:uname changed/ SYS::uname changed /
    - rename «Lack of free memory» to MEM::free
    - use the diff()=1 function (instead of change() and str()) in triggers:
      - HW::devices list has been changed
      - HW::CPU info has changed
- OS Solaris
  - Triggers:
    - fixed typo in trigger name s/SYS:uname changed/ SYS::uname changed /
    - added MEM::free
- OS Windows
  - Triggers:
    - rename trigger to the same name as it is in other OS templates s/Host information was changed/ SYS:uname changed /
    - rename «Lack of free memory» to MEM::free
- Service MySQL
  - Applications:
    - new:
      - SVC::MySQL::cfg for all read configuration parameters
      - SVC::MySQL::Com for all Com_* metrics
      - SVC::MySQL::DB::{#DB} prototype for all per-database metrics
      - SVC::MySQL::innodb for all InnoDB storage engine metrics
      - SVC::MySQL::threads for all thread-related metrics
  - Graphs:
    - new SVC::MySQL::threads with Threads_cached , Threads_connected and Threads_running metrics
  - Items:
    - new max_allowed_packet: the maximum size of one packet or any generated/intermediate string
    - new show_compatibility_56: shows whether the MySQL 5.6 compatibility mode of the MySQL engine is ON or OFF
    - new Threads_cached: the number of threads in the thread cache
    - new Threads_running: the number of threads that are not sleeping
    - rewrite most of the item SQL queries to use uppercase SQL keywords and lowercase table and row names (this will cause problems when importing the new template, but I need to standardize this before the first officially announced release of the templates)
  - Screens:
    - new SVC::MySQL::threads which combines SVC::MySQL::threads graph and Connections simple graph
  - Triggers:
    - new SVC::MySQL::version has been changed (severity: Not classified)
    - new SVC::MySQL::cfg::show_compatibility_56=ON (severity: High, because this template requires show_compatibility_56=OFF)
- Service Zabbix Proxy
  - Applications:
    - new SVC::Zabbix Proxy::proc
  - Graphs:
    - updated:
      - SVC::zabbix_proxy::process busy %
      - SVC::zabbix_proxy::data gathering process busy %
  - Items:
    - new:
      - proc::busy::configuration syncer
      - proc::busy::data sender
      - proc::busy::heartbeat sender
      - proc::busy::ipmi manager
      - proc::busy::ipmi poller
      - proc::busy::java poller
      - proc::busy::snmp trapper
      - wcache::index::pfree
    - delete items that were copied by mistake from the Service Zabbix Server template:
      - wcache::text::free
      - wcache::text::total
      - wcache::text::used
    - move Processes:: items to SVC::Zabbix Proxy::proc Application
    - rename all Processes::$4::$2 to proc::$4::$2 and remove the quotes around the second key parameter of all those items (to allow easy migration from the standard «Template App Zabbix Proxy» template)
  - Triggers:
    - new:
      - SVC::zabbix_proxy::configuration syncer >=75% busy
      - SVC::zabbix_proxy::data sender >=75% busy
      - SVC::zabbix_proxy::heartbeat sender >=75% busy
      - SVC::zabbix_proxy::ipmi manager >=75% busy
      - SVC::zabbix_proxy::ipmi poller >=75% busy
      - SVC::zabbix_proxy::java poller >=75% busy
      - SVC::zabbix_proxy::snmp trapper >=75% busy
      - SVC::zabbix_proxy::vmware collector >=75% busy
- Service Zabbix Server
  - Applications:
    - new:
      - SVC::Zabbix Server::rcache::buffer
      - SVC::Zabbix Server::vcache::buffer
      - SVC::Zabbix Server::vcache::cache
      - SVC::Zabbix Server::wcache::history
      - SVC::Zabbix Server::wcache::trend
      - SVC::Zabbix Server::wcache::values
    - rename SVC::Zabbix Server::process::busy to SVC::Zabbix Server::proc
  - Graphs:
    - updated SVC::zabbix_server::process busy %
    - new SVC::zabbix_server::preprocessing queue
  - Items:
    - added all items descriptions
    - new:
      - proc::busy::alert manager %
      - proc::busy::escalator %
      - proc::busy::ipmi manager %
      - proc::busy::ipmi poller %
      - proc::busy::java poller %
      - proc::busy::preprocessing manager %
      - proc::busy::preprocessing worker %
      - proc::busy::proxy poller %
      - proc::busy::snmp trapper poller %
      - proc::busy::task manager %
      - proc::busy::timer %
      - proc::busy::vmware collector %
      - triggers
      - queue::preprocessing
    - remove items::queued (it duplicates information provided by queue::* items)
    - removed quotes around process names to make migration from the standard Zabbix template easier
    - rename all process::* items to proc::* (keep it in sync with proxy template)
    - rename Uptime to uptime
  - Triggers:
    - new:
      - SVC::zabbix_server::alert manager processes >=75% busy
      - SVC::zabbix_server::escalator processes >=75% busy
      - SVC::zabbix_server::ipmi manager processes >=75% busy
      - SVC::zabbix_server::ipmi poller processes >=75% busy
      - SVC::zabbix_server::java poller processes >=75% busy
      - SVC::zabbix_server::preprocessing manager processes >=75% busy
      - SVC::zabbix_server::preprocessing worker processes >=75% busy
      - SVC::zabbix_server::proxy poller processes >=75% busy
      - SVC::zabbix_server::snmp trapper processes >=75% busy
      - SVC::zabbix_server::task manager processes >=75% busy
      - SVC::zabbix_server::timer processes >=75% busy
      - SVC::zabbix_server::vmware collector processes >=75% busy
- Service Nginx
  - new template
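
The IF-MIB change listed above adds an LLD filter that drops interfaces whose ifOperStatus is Down. A minimal Python sketch of that filtering idea follows. It is not how Zabbix implements filters (those are configured on the discovery rule); the row layout and the function name `filter_down_interfaces` are assumptions made for illustration. The only facts relied on are that Zabbix LLD macros are written as `{#NAME}` and that IF-MIB defines ifOperStatus values up(1) and down(2).

```python
# Hypothetical discovery rows: one dict per interface, keyed by LLD macro.
IF_OPER_STATUS_DOWN = "2"  # IF-MIB::ifOperStatus: up(1), down(2)


def filter_down_interfaces(rows):
    """Keep only interfaces whose ifOperStatus is not down(2)."""
    return [row for row in rows if row.get("{#IFOPERSTATUS}") != IF_OPER_STATUS_DOWN]


discovered = [
    {"{#IFDESCR}": "eth0", "{#IFOPERSTATUS}": "1"},
    {"{#IFDESCR}": "eth1", "{#IFOPERSTATUS}": "2"},
    {"{#IFDESCR}": "lo", "{#IFOPERSTATUS}": "1"},
]
kept = filter_down_interfaces(discovered)
print([row["{#IFDESCR}"] for row in kept])
```

Item prototypes would then be created only for the interfaces that remain, each indexed by its `{#IFDESCR}` value.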

List of templates:

- ICMP
- MIB
  - F5-BIGIP-LOCAL-MIB
  - F5-BIGIP-SYSTEM-MIB
  - IF-MIB
  - IP-MIB
  - SNMP-MPD-MIB
  - SNMP-USER-BASED-SM-MIB
  - SNMPv2-MIB
  - SNMP-VIEW-BASED-ACM-MIB
  - UDP-MIB
- OS Linux
- OS Solaris
- OS Windows
- SNMP Devices
  - BIG-IP 5000
  - DSL-3782
- Service Apache
- Service MySQL
- Service Nginx
- Service Zabbix Agent
- Service Zabbix Proxy
- Service Zabbix Server

Notes and Guidelines:

Each template has its own version tag, which is a copy of the version tag of the whole zabbix-templates package in which its last changes were released.
The description field of each template holds the last modification date and the internal version.
Anything that needs to be done before using a template is described in that template's description notes.
Names of items, applications and triggers must follow a naming convention based on 2-4 letter abbreviations, such as the MEM::, NET::, SYS::, SVC:: and HW:: prefixes used in the change list above.

All templates use the same graph resolution, item types, SNMP protocol version and community name so that these settings are easy to change across all templates if someone needs to.
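
The naming convention mentioned in the notes above could be checked mechanically. The sketch below is only an illustration: it assumes that a conforming name starts with a 2-4 letter uppercase abbreviation followed by `::` (as in MEM::free or SVC::MySQL::threads), which is a guess based on the names in the change list, not a rule stated by the repository. The function name `follows_convention` is hypothetical.

```python
import re

# Assumption: a conforming name starts with a 2-4 letter uppercase
# abbreviation followed by "::", e.g. "MEM::free" or "SVC::MySQL::threads".
PREFIX_RE = re.compile(r"^[A-Z]{2,4}::\S")


def follows_convention(name):
    """Return True if the name starts with an abbreviation prefix and '::'."""
    return PREFIX_RE.match(name) is not None


names = ["MEM::free", "SYS::uname changed", "Lack of free memory", "SYS:uname changed"]
nonconforming = [n for n in names if not follows_convention(n)]
print(nonconforming)
```

The two names it flags mirror fixes recorded in the change list: a free-text trigger name and a prefix written with a single colon.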

This program is free software, distributed under the terms of the GNU General Public License Version 2.


Zabbix template guidelines

Disclaimer

This document should not be considered a strict set of rules that everybody must follow. Instead, it only reflects our current approach to template building, and any rule or best practice described here may evolve into something else or be abandoned in the future.

Current document status: Version 1.0

Introduction

Zabbix approach to monitoring. Resource vs service

Everything can be seen as a service or as a resource.

A service is something that your organization provides to the outside world, like an online retail store or a public email service. If your store is online and customers can successfully purchase something from it, your service is available. A service can also be something that your department provides to the rest of the company, like computation resources that you provide to other departments on demand. Such an internal service can be a dependency of your company's online store service. As you can see, service issues clearly affect the real world. Monitoring service availability and performance is what you should always do first. As Google suggests in its SRE book, service unavailability should be considered a red-hot situation and the responsible person must be paged immediately.

A resource is a component that helps to provide services. It can be a server, a virtual machine, a container, a database, a middleware app, a custom microservice app, a hardware controller, a network or anything else. You can break resources down into even smaller bits, like splitting a server into CPU, RAM and IO subsystem, or splitting a network link into layer 3 and layer 2 connectivity and the physical link present on both sides. A modern distributed system might be a complex set of different resources of a dynamic nature, where resources are added and removed on demand based on the service load, just like in a Kubernetes cluster or AWS. Resources require your monitoring attention too, but differently. Because resource unavailability doesn’t necessarily affect the real world, keep paging people on resource failures to a minimum. Create tickets instead that can be solved during working hours.

Service monitoring is considered project-level monitoring: it is not something you can get out of the box or as a template from share.zabbix.com. It is something you need to create yourself using Zabbix features. That is because all services are different, with different SLOs, different architectures and so on, so it’s hard to prepare a common blueprint for service monitoring. But consider the following approaches:

Most likely, situations such as a zero request rate (or a sudden drop in rate) or a high error ratio (HTTP 500 everywhere) are the ones that indicate serious service problems.
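
The symptoms named above (zero rate, sudden rate drop, high error ratio) can be expressed as simple checks. The following Python sketch is only an illustration under stated assumptions: the function name `service_symptoms`, the baseline comparison and both thresholds (`drop_factor`, `max_error_ratio`) are hypothetical and would have to be tuned to the SLOs of a real service. In Zabbix, the same logic would normally live in trigger expressions.

```python
def service_symptoms(rate, baseline_rate, errors, total,
                     drop_factor=0.5, max_error_ratio=0.05):
    """Return a list of symptom names for one observation window.

    rate and baseline_rate are requests per second; errors and total are
    request counts in the window. The thresholds are illustrative defaults.
    """
    symptoms = []
    if rate == 0:
        symptoms.append("zero rate")
    elif baseline_rate > 0 and rate < baseline_rate * drop_factor:
        symptoms.append("sudden rate drop")
    if total > 0 and errors / total > max_error_ratio:
        symptoms.append("high error ratio")
    return symptoms


print(service_symptoms(rate=0, baseline_rate=120, errors=0, total=0))
print(service_symptoms(rate=40, baseline_rate=120, errors=30, total=100))
print(service_symptoms(rate=115, baseline_rate=120, errors=1, total=100))
```

The first call reports a zero rate, the second reports both a rate drop and a high error ratio, and the third reports nothing because the service is behaving normally.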

But why do you need to monitor resources if service monitoring is set up? There are multiple reasons but the most important one is this: once you know your service is down (symptom) you need to isolate the root cause of the problem.

Resources are common and generic across many projects and architectures. That’s where templates can thrive. Seriously, do you really need to spend time creating your own monitoring solution for Linux, a MySQL database, a Cisco router, or a Docker host? Maybe you can spend your time more efficiently by preparing service-level monitoring instead.

What does it take to become a good resource template?

Let’s try to define some properties of what a good resource template is, some key principles we follow in Zabbix when building templates:

1. Flexible and reusable

In Zabbix, a template is a monitoring solution for some specific object. It is a sort of container used to transfer configuration, that is, a monitoring solution, between Zabbix server instances. A good-quality template is something that Zabbix users create, use for their own good, and then share with the Zabbix community, so that the next person can download it, reuse it, and update it with newer ideas and approaches, contributing to the common cause. So, the first thing that comes to mind when you ask what makes a good template in Zabbix is how flexible and reusable it is. If other Zabbix users can download it and use it without changing half of it, that’s a really good sign.

Here are a few rules of thumb on how to achieve it:

2. Knowledge and expertise

We also think that a good template is not just a set of metrics (items), thresholds (triggers), and dashboards bundled together. The most important ingredient of a good template is how much expertise and knowledge about the monitored object it contains. By expertise and knowledge we mean not the number of metrics someone knows how to collect, but knowing which metrics are useful and important and which are useless, and what thresholds should be used so that you are notified only about problems that matter, without too much noise.

A very minimalistic template may not provide all the information you need. On the other hand, bloated, oversized templates are bad as well, because users lose focus on the most important metrics and get overwhelmed by problem noise. So:

3. Modularity and scope

The last very important thing is the template scope. We have already talked about services and resources, so, generally, a good template is one whose scope is a single resource:

If you keep a template's scope within a single resource, it will be much easier to share such templates, and they will be useful to people who have the same building blocks in their architecture. Also, avoid merging resources of different layers: do not add metrics for Linux OS and PostgreSQL to a single template.

But what about the ‘inner’ scope, the scope of metrics? What types and classes of metrics should you be collecting? Surely you can do monitoring for various reasons, including collecting business indicators or looking for security breaches, but when creating a generic Zabbix resource template, try to adopt the following approach:

3.1 Always start with fault monitoring or availability monitoring. The most common and most important question people want monitoring to answer is: is my system up and running? So, try to address that in your template first.

That is, prefer a black-box monitoring approach here. Health checks, simple or not so simple, are essential and are the first thing you or any user wants to know about. Add items and triggers to your template that help you be sure that the thing you are monitoring is accessible and is up and running. Use ICMP ping, check that a TCP port is open, check that an API returns HTTP 200 OK, and so on.
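
Two of the black-box checks mentioned above, "TCP port is open" and "API returns HTTP 200 OK", can be sketched in a few lines of Python using only the standard library. This is an illustration of the idea, not a replacement for Zabbix's own simple checks or web scenarios; the function names `tcp_port_open` and `http_ok` and the default timeouts are assumptions. ICMP ping is left out because it usually requires raw sockets or an external tool.

```python
import http.client
import socket
import urllib.error
import urllib.request


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_ok(url, timeout=3.0):
    """Return True if a GET request to url is answered with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection failures, non-2xx responses raised as HTTPError,
        # protocol errors and malformed URLs all count as "not healthy".
        return False
```

A caller would use these as, for example, `tcp_port_open("db.example.org", 3306)` or `http_ok("https://api.example.org/health")`, where both host names are placeholders. Each function answers only one question, whether the thing is reachable and responding, which is exactly what availability monitoring needs first.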

The second problem that is addressed by fault monitoring is an imminent failure. For example,

Add items and triggers that will help you to intervene and prevent such a drastic outcome.

Also, if your monitored object can detect faults on its own, use that! Many systems can report faults directly through logs, SNMP traps and so on. That is the kind of expertise we talked about earlier, provided to you by the developers, vendors and authors of the system you want to control, and nobody knows it better than them. So just make sure your Zabbix template can relay the faults detected by the system itself.

3.2 Once your template can check the health of your system, proceed with performance monitoring. This is where you will need to open the box wide (white-box monitoring). There are really nice methods out there to help you choose which metrics to collect first: USE, the Four Golden Signals from Google, or RED for request-driven services. Just make sure you extend the template with items and triggers to help solve the following use cases:

3.3 Inventory and state control

While Zabbix is not an inventory system, it can still collect lots of information about a resource and, most importantly, detect changes, such as the system being restarted outside of a maintenance window, or the version being updated or out of date. So, make it part of your template checklist.
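
The kind of change detection described above boils down to comparing the latest collected value with the previous one. The Python sketch below illustrates this with two hypothetical inventory snapshots; the field names (`uptime`, `version`, `uname`) and the function `detect_changes` are assumptions chosen for the example. In a Zabbix template the same comparison would be written as trigger expressions on the corresponding items.

```python
def detect_changes(previous, current):
    """Compare two inventory snapshots and describe what changed."""
    events = []
    # Uptime going backwards means the system restarted between polls.
    if current["uptime"] < previous["uptime"]:
        events.append("restarted")
    # Any difference in a tracked text value counts as a change.
    for key in ("version", "uname"):
        if current[key] != previous[key]:
            events.append(f"{key} changed: {previous[key]} -> {current[key]}")
    return events


before = {"uptime": 86400, "version": "5.7.20", "uname": "Linux db1 4.14.0"}
after = {"uptime": 120, "version": "5.7.21", "uname": "Linux db1 4.14.0"}
print(detect_changes(before, after))
```

Whether a detected restart is a problem depends on context, for instance whether it happened inside a maintenance window, so a real trigger would usually combine this check with such conditions.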

3.4 Security

If you know how to properly detect security issues with the resource, for example:

Then consider adding such items and triggers to your template as well.

4. Follow guidelines

Finally, there is the style of the template. How should you name your items? Templates? Triggers? If we all followed the same style when creating Zabbix templates, it wouldn’t really matter who made a template (you, Zabbix, or another community member on the other side of the globe), because its contents and layout would be predictable.

Conclusion

Following the style guidelines and the core template principles means that we can reuse each other's templates as building blocks for our monitoring projects, saving time and gaining someone else's knowledge of the monitored object.

That concludes the introduction to the Zabbix template guidelines, a comprehensive set of rules for how we build templates in Zabbix.
