Server load is the share of hosting resources, namely CPU, RAM and disk space, consumed by the tasks that are currently running. Analyzing the server load helps you quickly find the causes of slow performance.

    The hardware of any server consists of 4 main components:

    • Processor
    • Memory
    • Disk
    • Network interface

    Analyzing server load means collecting and processing statistics for each of these components.

    Processor.

    First, check the processor.
    For example, you can use the top utility:

    ~# top

    top - 13:29:39 up 7 days, 1:10, 1 user, load average: 0.03, 0.03, 0.00
    Tasks: 104 total, 2 running, 102 sleeping, 0 stopped, 0 zombie
    %Cpu(s): 0.3 us, 1.0 sy, 0.0 ni, 97.0 id, 0.0 wa, 0.0 hi, 1.3 si, 0.3 st
    MiB Mem : 969.5 total, 68.8 free, 635.9 used, 264.8 buff/cache
    MiB Swap: 0.0 total, 0.0 free, 0.0 used. 106.7 avail Mem

        PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
        823 mysql     20   0 1852008 401812      0 S   1.0  40.5  81:24.20 mysqld
         13 root      20   0       0      0      0 R   0.3   0.0  26:11.00 rcu_sched
        695 redis     20   0   66776   4216   2100 S   0.3   0.4  19:55.21 redis-server
          1 root      20   0  166044   8396   5084 S   0.0   0.8   3:30.21 systemd
          2 root      20   0       0      0      0 S   0.0   0.0   0:00.09 kthreadd
          3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
          4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp
          5 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 netns
    

    Pay attention to the CPU figures, in particular the %Cpu(s) line: the CPU load should normally not exceed 10-20%.

    The following indicators are the most important for analysis:

    • us - time spent in user processes. A high value means that the application itself is loading the server.
    • id - idle CPU time. This value should be high (normal values are from 80 to 100).
    • wa - time spent waiting for I/O operations. A high value means that the processor waits a long time for responses from I/O devices, most often because of a large number of disk operations.
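
    The us, id and wa values shown by top are derived from the counters that the Linux kernel exposes in /proc/stat. As a rough sketch, the same percentages can be computed from two samples of the "cpu" summary line. The function name cpu_pct is hypothetical, and the sketch assumes the standard field order (user, nice, system, idle, iowait, irq, softirq, steal):

```shell
# cpu_pct: hypothetical helper, not part of top or sysstat.
# Arguments: two "cpu ..." summary lines taken from /proc/stat at different times.
# Prints the share of CPU time spent in user mode (us), idle (id) and
# I/O wait (wa) between the two samples.
cpu_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) prev[i] = $i }
        NR == 2 {
            # Fields 2..9: user nice system idle iowait irq softirq steal
            total = 0
            for (i = 2; i <= 9; i++) { d[i] = $i - prev[i]; total += d[i] }
            if (total <= 0) { print "no CPU time elapsed between samples"; exit 1 }
            printf "us=%.1f id=%.1f wa=%.1f\n", 100 * d[2] / total, 100 * d[5] / total, 100 * d[6] / total
        }'
}
```

    On a Linux host it could be called like this (the 5-second interval is arbitrary):

        a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
        cpu_pct "$a" "$b"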

    More detailed statistics are available from the mpstat utility in the sysstat package:

    apt-get install sysstat  
    mpstat -P ALL  
    

    View details of all processors on the server:

    ~# mpstat -P ALL
    Linux 5.15.0-46-generic (dsde1139-22869.fornex.org) 09/06/2022 _x86_64_ (1 CPU)

    02:37:21 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
    02:37:21 PM  all    0.90    0.00    0.74    0.13    0.00    0.96    0.32    0.00    0.00   96.95
    02:37:21 PM    0    0.90    0.00    0.74    0.13    0.00    0.96    0.32    0.00    0.00   96.95
    

    The htop utility shows the CPU load in a convenient visual form:

    apt-get install htop  
    htop  
    

    CPU load analysis.

    If the CPU load (us in top) exceeds 20%, evaluate whether the application can be optimized. If every feasible optimization has already been made, additional servers are needed.

    If the I/O wait value (wa in top) is high, the disk and network subsystems need further analysis (see below).
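
    The two rules above can be turned into a simple check. This is only a sketch: check_cpu is a hypothetical name, the values must be whole numbers (round the figures from top first), and the default limit of 10 for wa is an assumption, since no number is given for it here.

```shell
# check_cpu: hypothetical helper that applies the rules of thumb from the text.
# $1 = us percentage, $2 = wa percentage (whole numbers),
# $3 = wa limit (optional; the default of 10 is an assumption).
check_cpu() {
    us=$1
    wa=$2
    wa_limit=${3:-10}
    if [ "$us" -gt 20 ]; then
        echo "us=${us}%: review application optimization or add servers"
    fi
    if [ "$wa" -gt "$wa_limit" ]; then
        echo "wa=${wa}%: analyze the disk and network subsystems"
    fi
    if [ "$us" -le 20 ] && [ "$wa" -le "$wa_limit" ]; then
        echo "CPU load within the expected range"
    fi
}
```

    For the top output shown earlier, check_cpu 0 0 reports that the load is within the expected range.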

    Memory.

    Determine how much memory is used and how much is free:

    free  
    

    The free tool will show you data on memory usage:

    ~# free
                   total        used        free      shared  buff/cache   available
    Mem:          992724      655200       73968       86748      263556      104972
    Swap:              0           0           0
    

    Pay attention to the free value, which is the amount of unused memory, and to the available value, which estimates how much memory can be given to applications without swapping.
    Another very important parameter is Swap: disk space that is used when RAM is no longer sufficient.

    For more detailed information about RAM usage, run:

    cat /proc/meminfo  
    

    The output looks like this:

    ~# cat /proc/meminfo
    MemTotal: 992724 kB  
    MemFree: 73192 kB  
    MemAvailable: 104864 kB  
    Buffers: 10856 kB  
    Cached: 226868 kB  
    SwapCached: 0 kB  
    Active: 95644 kB  
    Inactive: 686204 kB  
    Active(anon):      29728 kB  
    Inactive(anon):   610212 kB  
    Active(file):      65916 kB  
    Inactive(file):    75992 kB  
    Unevictable: 27624 kB  
    Mlocked:           27624 kB
    SwapTotal: 0 kB  
    SwapFree: 0 kB  
    Dirty: 272 kB  
    Writeback: 0 kB  
    AnonPages: 571784 kB  
    Mapped: 99156 kB  
    Shmem: 86748 kB  
    KReclaimable: 26500 kB  
    Slab: 54816 kB  
    SReclaimable: 26500 kB  
    SUnreclaim: 28316 kB
    KernelStack: 2668 kB  
    PageTables: 5548 kB  
    NFS_Unstable: 0 kB  
    Bounce: 0 kB  
    WritebackTmp: 0 kB  
    CommitLimit: 496360 kB  
    Committed_AS: 7313844 kB  
    VmallocTotal: 34359738367 kB  
    VmallocUsed:       17592 kB  
    VmallocChunk: 0 kB  
    Percpu: 552 kB  
    HardwareCorrupted: 0 kB  
    AnonHugePages: 2048 kB  
    ShmemHugePages: 0 kB  
    ShmemPmdMapped: 0 kB  
    FileHugePages: 0 kB  
    FilePmdMapped: 0 kB  
    HugePages_Total: 0  
    HugePages_Free: 0  
    HugePages_Rsvd: 0  
    HugePages_Surp: 0  
    Hugepagesize: 2048 kB
    Hugetlb: 0 kB  
    DirectMap4k: 171884 kB  
    DirectMap2M: 876544 kB  
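
    Of all these fields, the ones that matter most for a quick check are MemTotal, MemAvailable (the kernel's estimate of memory that can be given to applications without swapping) and the difference between SwapTotal and SwapFree. A minimal sketch that extracts them is shown here; mem_summary is a hypothetical name, and the function reads text in /proc/meminfo format from standard input:

```shell
# mem_summary: hypothetical helper. Reads /proc/meminfo-formatted text on
# stdin and prints available memory as a whole percentage of the total,
# plus the amount of swap in use (SwapTotal - SwapFree) in kB.
mem_summary() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapTotal:"    { stotal = $2 }
        $1 == "SwapFree:"     { sfree = $2 }
        END {
            if (total <= 0) { print "MemTotal not found"; exit 1 }
            printf "available=%d%% swap_used_kB=%d\n", 100 * avail / total, stotal - sfree
        }'
}
```

    It could be run as mem_summary < /proc/meminfo. The percentage is truncated to a whole number.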
    

    Memory usage analysis.

    A small amount of free RAM is not a problem in itself, but it is a reason to monitor the server closely.

    If swap usage starts to grow, take urgent action:

    • Add RAM.
    • Acquire new servers and distribute the load between them.

    Disks.

    The disk subsystem comes under load when an application works with files. Database activity also loads the disks.

    Start the disk analysis by checking the free space:

    df -h  
    

    This will show results for all partitions:

    ~# df -h
    Filesystem      Size  Used Avail Use% Mounted on
    tmpfs            97M  732K   97M   1% /run
    /dev/vda1       9.8G  8.5G  846M  92% /
    tmpfs           485M     0  485M   0% /dev/shm
    tmpfs           5.0M     0  5.0M   0% /run/lock
    tmpfs            97M     0   97M   0% /run/user/0
    

    The Use% column shows the percentage of space used on each partition.
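
    To pick out nearly full partitions automatically, the Use% column can be filtered with awk. The sketch below is an illustration only: full_filesystems is a hypothetical name, the default limit of 90 is an assumption, and mount points that contain spaces are not handled.

```shell
# full_filesystems: hypothetical helper. Reads df output on stdin and prints
# the mount point and usage of every filesystem whose Use% is at or above
# the limit given in $1 (default 90, an assumed threshold).
full_filesystems() {
    awk -v limit="${1:-90}" '
        NR > 1 {
            use = $5
            sub(/%$/, "", use)
            if (use + 0 >= limit + 0) print $6, use "%"
        }'
}
```

    For example, df -P | full_filesystems 80 lists every filesystem that is at least 80% full. For the df output above, a limit of 90 would report only the root partition.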

    The iotop tool shows the disk load in more detail:

    apt-get install iotop  
    iotop  
    

    It also breaks the load down by the processes that access the disk:

    ~# iotop
    
    Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s  
    Current DISK READ: 0.00 B/s | Current DISK WRITE: 0.00 B/s  
        TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND                                                                          
          1 be/4 root 0.00 B/s 0.00 B/s ?unavailable? init
          2 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [kthreadd]
          3 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [rcu_gp]
          4 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [rcu_par_gp]
          5 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [netns]
          7 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [kworker/0:0H-events_highpri]
          9 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [mm_percpu_wq]
         10 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [rcu_tasks_rude_]
         11 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [rcu_tasks_trace]
         12 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [ksoftirqd/0]
         13 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [rcu_sched]
         14 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [migration/0]
         15 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [idle_inject/0]
         17 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [cpuhp/0]
         18 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [kdevtmpfs]
         19 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [inet_frag_wq]
         20 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [kauditd]
         21 be/4 root 0.00 B/s 0.00 B/s ?unavailable?  [khungtaskd]
    

    Analysis of the disk subsystem.

    If the disk is handling a large number of reads, take the following steps:

    • If most of the reads come from the application, enable APC caching.
    • In the case of a database, make sure that its parameters are properly configured.
    • If the reads are caused by requests to the web server, consider using an HTTP cache.

    A large number of writes to disk usually indicates the need to scale. Before scaling, check the common sources of writes:

    • Make sure all access and debug logs are disabled.
    • Most disk writes are likely to be generated by the database.
    • Uploaded files can also generate a large number of writes.
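
    To decide which of the two lists above applies, it helps to know whether reads or writes dominate on a device. One possible way, sketched here under the assumption of a Linux host, is to read the cumulative counters in /proc/diskstats. The name disk_totals is hypothetical; the function prints every device in the file, including partitions.

```shell
# disk_totals: hypothetical helper. Reads /proc/diskstats-formatted text on
# stdin and prints, per device, the data read and written since boot in MiB.
# Field 3 is the device name, field 6 the sectors read, field 10 the sectors
# written; sector counts in this file are in 512-byte units.
disk_totals() {
    awk '{ printf "%s read=%dMiB written=%dMiB\n", $3, $6 * 512 / 1048576, $10 * 512 / 1048576 }'
}
```

    It could be run as disk_totals < /proc/diskstats. Running it twice and comparing the figures shows how much was read and written in between.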

    Network.

    The cbm utility allows you to see network traffic in real time:

    apt-get install cbm  
    cbm  
    

    It shows the amount of traffic per second for each interface:

     Interface        Receive       Transmit          Total
            lo       0.00 B/s       0.00 B/s       0.00 B/s
          eth0     35.90 kB/s     758.75 B/s     36.65 kB/s
    

    High network traffic is not a problem in itself, but values close to the peak capacity of the channel indicate that you will need to scale in the near future.
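
    If cbm is not available, similar per-second figures can be derived from the byte counters in /proc/net/dev. The following is a minimal sketch for a Linux host; net_bytes and net_rate are hypothetical names, and the interface name eth0 in the usage example is taken from the output above.

```shell
# net_bytes: hypothetical helper. Reads /proc/net/dev-formatted text on stdin
# and prints the cumulative "rx_bytes tx_bytes" counters for interface $1.
net_bytes() {
    awk -v ifc="$1" '
        NR > 2 {
            line = $0
            sub(/^[ \t]+/, "", line)
            split(line, part, ":")
            if (part[1] == ifc) {
                # After the colon: 8 receive counters, then 8 transmit counters
                split(part[2], f, " ")
                print f[1], f[9]
            }
        }'
}

# net_rate: hypothetical helper. $1 and $2 are two "rx tx" samples from
# net_bytes, $3 is the number of seconds between them.
net_rate() {
    echo "$1 $2 $3" | awk '
        $5 <= 0 { print "interval must be positive"; exit 1 }
        { printf "rx=%dB/s tx=%dB/s\n", ($3 - $1) / $5, ($4 - $2) / $5 }'
}
```

    A possible invocation:

        a=$(net_bytes eth0 < /proc/net/dev); sleep 5; b=$(net_bytes eth0 < /proc/net/dev)
        net_rate "$a" "$b" 5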

    General statistics.

    The dstat utility will show you the overall real-time server statistics:

    apt-get install dstat  
    dstat  
    

    We will see system data at one-second intervals:

    ~# dstat
    You did not select any stats, using -cdngy by default.
    --total-cpu-usage-- -dsk/total- -net/total- ---paging-- ---system--
    usr sys idl wai stl| read  writ| recv  send|  in   out | int   csw
      2   1  97   0   0|  35k   29k|   0     0 |   0     0 | 683   702
      1   0  99   0   0| 124k    0 |  39k 1162B|   0     0 |1003   822
      3   4  86   6   1|3580k 8776k|  37k  522B|   0     0 |1161  1018
      2   2  95   1   0|3888k    0 |  37k 2808B|   0     0 |1054   995
      3   0  96   0   1|   0     0 |  34k  444B|   0     0 | 919   810
      4   0  96   0   0| 756k   72k|  31k  702B|   0     0 | 872   790
      5   2  93   0   0|   0     0 |  25k  624B|   0     0 | 739   724
      1   1  97   0   1|   0     0 |  22k  436B|   0     0 | 622   638
      1   1  98   0   0|   0     0 |  17k  770B|   0     0 | 520   599
      1   0  99   0   0|   0     0 |  13k  436B|   0     0 | 449   572
      1   0  99   0   0|   0     0 |9005B  504B|   0     0 | 376   533
      0   1  98   0   1|   0     0 |7293B  648B|   0     0 | 332   495
      3   1  95   1   0| 288k  244k|6697B  770B|   0     0 | 371   562
      2   0  98   0   0|   0     0 |6435B  350B|   0     0 | 349   520
      0   1  99   0   0|   0     0 |6971B  640B|   0     0 | 334   513
      3   1  96   0   0|   0     0 |  13k  342B|   0     0 | 498   625
      1   0  99   0   0|   0     0 |  22k  770B|   0     0 | 692   744
      2   1  96   0   1|   0     0 |  33k  598B|   0     0 | 900   810
    

    Pay attention to:

    • total-cpu-usage - CPU load
    • dsk/total - disk load
    • net/total - network load

    If you run into difficulties or have additional questions, you can always contact our support service via the ticket system.