
Nginx all in one: anti-hotlinking, dynamic/static separation, high availability, compression, cross-origin, caching, and more

This is a long, practical summary; you may want to bookmark it and work through it slowly.

preface

In the early days of single-server deployments there were two main problems:

(1) A monolithic, single-node deployment cannot carry ever-growing business traffic.

(2) When the backend node goes down, the whole system is paralyzed and the entire project becomes unavailable.

Hence the need to introduce load balancing:

  • High availability: with multiple servers, when one goes down, requests can quickly be shifted to the others.
  • High performance: multiple servers provide higher throughput at greater scale.
  • Scalability: nodes can be added or removed as the business grows or shrinks, giving great flexibility.

Load balancing can be done at the hardware level or the software level. Hardware load balancers are expensive, so most teams try to solve the problem in software.

1. The concept of Nginx

Nginx is a lightweight, high-performance HTTP and reverse proxy server. It can also act as a general-purpose proxy for other protocols, such as TCP, UDP, SMTP, and HTTPS.


Nginx is built on an I/O multiplexing event model, so it uses few resources and supports high concurrency.

Officially it can, in theory, handle 50,000 concurrent connections, though reaching that peak in production requires the hardware to keep up.


With Nginx as a proxy, client requests are distributed to backend servers for processing; the servers return their responses to Nginx, and Nginx returns the results to the clients.

Let's set up an environment and walk through Nginx's more advanced features: dynamic/static separation, resource compression, caching, IP blacklists, high availability, and so on.

2. Installing Nginx

Provisioning the server itself is omitted here; we go straight to Nginx.

❶ Create the Nginx directory on the server and enter it:

[root@localhost]# mkdir /soft && mkdir /soft/nginx/  
[root@localhost]# cd /soft/nginx/             

❷ Download the Nginx installation package

You can upload a downloaded package with a remote transfer tool, or download it directly on the server with wget:

[root@localhost]# wget https://nginx.org/download/nginx-1.21.6.tar.gz  

If the wget command is not available, install it with yum:

[root@localhost]# yum -y install wget             

❸ Extract the Nginx package:

[root@localhost]# tar -xvzf nginx-1.21.6.tar.gz             

❹ Download the dependency packages Nginx needs:

[root@localhost]# yum install --downloadonly --downloaddir=/soft/nginx/ gcc-c++  
[root@localhost]# yum install --downloadonly --downloaddir=/soft/nginx/ pcre pcre-devel  
[root@localhost]# yum install --downloadonly --downloaddir=/soft/nginx/ zlib zlib-devel  
[root@localhost]# yum install --downloadonly --downloaddir=/soft/nginx/ openssl openssl-devel            

Or install them all directly with yum in one step:

[root@localhost]# yum -y install gcc zlib zlib-devel pcre-devel openssl openssl-devel             

When that finishes, run ls to check that all the required packages are present.


Then install each dependency package in turn with the rpm command, or install them all at once:

[root@localhost]# rpm -ivh --nodeps *.rpm             

❺ cd into the nginx source directory and run the configure script to prepare the build (the default install prefix is /usr/local/nginx/, overridden here with --prefix):

[root@localhost]# cd nginx-1.21.6  
[root@localhost]# ./configure --prefix=/soft/nginx/             

❻ Execute the command to compile and install Nginx:

[root@localhost]# make && make install             

❼ Go back to the /soft/nginx/ directory, and use ls to see the files generated after installing nginx.

❽ Modify the nginx.conf in the conf directory after installation:

[root@localhost]# vi conf/nginx.conf  
Change the port:         listen    80;  
Change the server name:  server_name  <your machine's local IP (use a domain name in production)>;             

❾ Start Nginx with the specified configuration file:

[root@localhost]# sbin/nginx -c conf/nginx.conf  
[root@localhost]# ps aux | grep nginx             

Other Nginx operation commands:

sbin/nginx -t -c conf/nginx.conf # check whether the configuration file is valid  
sbin/nginx -s reload -c conf/nginx.conf # smooth reload after changing the configuration  
sbin/nginx -s quit # graceful shutdown: exit after finishing in-flight requests  
sbin/nginx -s stop # force-terminate Nginx regardless of in-flight requests             

❿ Open port 80 and reload the server firewall:

[root@localhost]# firewall-cmd --zone=public --add-port=80/tcp --permanent  
[root@localhost]# firewall-cmd --reload  
[root@localhost]# firewall-cmd --zone=public --list-ports             

⓫ Enter the Nginx server's IP or domain name in a browser:

If you see the Nginx welcome screen, then congratulations on the successful installation.

3. Nginx reverse proxy and load balancing

First build a simple web project with Spring Boot + FreeMarker; its controller looks like this:

@Controller  
public class IndexNginxController {  
    @Value("${server.port}")  
    private String port;  
  
    @RequestMapping("/")  
    public ModelAndView index(){  
        ModelAndView model = new ModelAndView();  
        model.addObject("port", port);  
        model.setViewName("index");  
        return model;  
    }  
}             

Front-end index.html source code:

<html>  
    <head>  
        <title>Nginx demo</title>  
        <link href="nginx_style.css" rel="stylesheet" type="text/css"/>  
    </head>  
    <body>  
        <div style="border: 2px solid red;margin: auto;width: 800px;text-align: center">  
            <div  id="nginx_title">  
                <h1>Welcome! I am Bamboo server ${port}!</h1>  
            </div>  
        </div>  
    </body>  
</html>             

With the preparation done, adjust nginx.conf:

upstream nginx_boot{  
   # if a node fails 2 health checks within 30s it is treated as down; requests are distributed at a 1:2 weight ratio  
   server 192.168.0.000:8080 weight=100 max_fails=2 fail_timeout=30s;   
   server 192.168.0.000:8090 weight=200 max_fails=2 fail_timeout=30s;  
   # replace these IPs with the machines actually running your web services  
}  
  
server {  
    location / {  
        root   html;  
        # configure the index files, appending index.ftl at the end  
        index  index.html index.htm index.jsp index.ftl;  
        proxy_set_header Host $host;  
        proxy_set_header X-Real-IP $remote_addr;  
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;  
        # hand the request to the upstream named nginx_boot  
        proxy_pass http://nginx_boot;  
    }  
}             

With the load-balancing configuration in place, restart Nginx and start two instances of the web service: one on port 8080 and the other on port 8090.

Visit the site in a browser to see the effect: with weighted round robin configured at a 1:2 ratio, port 8080 answers one request for every two answered by port 8090...

How Nginx distributes requests

The request flow is as follows:

  • Nginx listens on port 80 of the server, so requests reach Nginx first;
  • Nginx matches the request path / against its location rules and lands on the location / {} block;
  • The proxy_pass in that location points to the upstream named nginx_boot;
  • Based on the upstream configuration, the request is forwarded to one of the web servers (with multiple machines, Nginx distributes requests according to the weight ratio).

4. Nginx dynamic and static separation

First, think about why we separate dynamic and static content, and what it buys us.

Take the Taobao homepage as an example: open it and press F12, and you will see 100+ requests fired just to load the page. Static resources normally sit in the project's resources/static/ directory.

Without separation, every client loading the homepage sends those 100+ requests to the backend, which puts heavy pressure on the application servers.

It is easy to see that of those 100+ requests, at least 60+ are for static resources: *.js, *.css, *.html, *.jpg and so on. Most static resources rarely change, so what if they could be served before the request ever reaches the application server? After dynamic/static separation, the backend's concurrent load can easily drop by more than half.

How to achieve dynamic and static separation?

(1) Create a static_resources directory under the Nginx directory:

mkdir static_resources             

(2) Copy all of the project's static resources into this directory, remove them from the backend project, and repackage it.

(3) Then fine-tune the nginx.conf configuration file and add a location matching rule:

location ~ .*\.(html|htm|gif|jpg|jpeg|bmp|png|ico|txt|js|css){  
    root   /soft/nginx/static_resources;  
    expires 7d;  
}             

Restart Nginx and the web service and open the page again: the static resources still render correctly. Even though nginx_style.css has been removed from the project's static directory, the styling still works, because Nginx now serves it from /soft/nginx/static_resources.

Let's explain the location rule configuration:

location ~ .*\.(html|htm|gif|jpg|jpeg|bmp|png|ico|txt|js|css)           
  • ~ means the match is a case-sensitive regular expression
  • .* means any character may appear zero or more times (the resource name is unrestricted)
  • \. matches the literal dot before the file extension
  • (html|...|css) matches any of the extensions listed in the parentheses

"Friendly reminder: you can also move static resources to object storage (such as OSS) and point a location at it through a proxy; a sketch follows below."

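A minimal sketch of that idea, assuming a hypothetical bucket domain (my-bucket.oss-example.com is a placeholder; use the host your object-storage provider actually gives you):

location ~ .*\.(html|htm|gif|jpg|jpeg|bmp|png|ico|txt|js|css) {  
    # forward static requests to the object-storage bucket instead of the local disk  
    proxy_set_header Host my-bucket.oss-example.com;   # hypothetical bucket domain  
    proxy_pass https://my-bucket.oss-example.com;  
    expires 7d;  
}             
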
5. Nginx resource compression

Building on dynamic/static separation: the smaller a static resource is, the faster it transfers and the less bandwidth it consumes.

So when deploying static resources, we also let Nginx compress them in transit, which saves bandwidth, speeds up responses, and improves the overall throughput of the system.

Nginx provides three resource compression modules: ngx_http_gzip_module, ngx_http_gzip_static_module, and ngx_http_gunzip_module.

Of these, ngx_http_gzip_module is built into Nginx, so its compression directives can be used directly; the configuration that follows is based on it.
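
As an aside, ngx_http_gzip_static_module (only available if Nginx was built with --with-http_gzip_static_module) serves pre-compressed .gz files from disk instead of compressing on the fly. A minimal sketch, assuming the .gz files have been generated next to the originals:

location ~ .*\.(js|css) {  
    # if a pre-compressed "<file>.gz" exists and the client accepts gzip, serve it directly  
    gzip_static on;  
    root /soft/nginx/static_resources;  
}             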

The main compression parameters and directives, and a simple configuration that uses them:

http{
    # enable gzip compression
    gzip on;
    # file types to compress (add other types as needed)
    gzip_types text/plain application/javascript text/css application/xml text/javascript image/jpeg image/gif image/png;
    # compression level: higher levels compress better but cost more CPU
    gzip_comp_level 5;
    # add "Vary: Accept-Encoding" to response headers (recommended)
    gzip_vary on;
    # number and size of buffers used for compression
    gzip_buffers 16 8k;
    # do not compress for clients that cannot handle it
    gzip_disable "MSIE [1-6]\."; # old IE versions do not support gzip
    # minimum HTTP version required for compressed responses
    gzip_http_version 1.1;
    # minimum response size that triggers compression
    gzip_min_length 2k;
    # do not compress responses proxied from backend servers
    gzip_proxied off;
}           

gzip_proxied has several options, which you can choose based on your system's actual situation:

  • off: do not compress responses proxied from backend services.
  • expired: compress if the response headers contain an Expires field.
  • no-cache: compress if the response headers contain Cache-Control: no-cache.
  • no-store: compress if the response headers contain Cache-Control: no-store.
  • private: compress if the response headers contain Cache-Control: private.
  • no_last_modified: compress if the response headers do not contain Last-Modified.
  • no_etag: compress if the response headers do not contain ETag.
  • auth: compress if the request carries an Authorization header.
  • any: compress all proxied responses unconditionally.

After configuring compression, add a jquery-3.6.0.js file to the original index page to test it:

<script type="text/javascript" src="jquery-3.6.0.js"></script>             

Comparing before and after: the original js file is about 230KB; after configuring compression and restarting Nginx, the transferred size drops from 230KB to about 69KB. That is the difference compression makes!

Please note:

(1) Image and video data is already compressed by its own format, so gzip gains very little for it and it is usually not worth compressing those types.

(2) For .js files, specify the compression type as application/javascript rather than text/javascript or application/x-javascript.

6. Nginx buffering

When a request goes through Nginx, the path is client → Nginx → server, so there are two connections: client → Nginx and Nginx → server. The two connections can run at very different speeds, and that mismatch can make the user's experience poor.

Nginx's buffering mechanism exists mainly to solve this speed mismatch between the two connections: the proxy temporarily stores the backend's response and then feeds it to the client at the client's own pace.

Let's take a look at the configuration items about buffers:

  • proxy_buffering: whether to enable response buffering; the default is on.
  • client_body_buffer_size: the memory buffer size for client request bodies.
  • proxy_buffers: the number and size of buffers per request/connection; the default is 8 4k|8k (one memory page).
  • proxy_buffer_size: the buffer used for the first part of the response, where the response headers live.
  • proxy_busy_buffers_size: while the backend response is still being read, buffers in the busy state can already be sent to the client; this sets the total size of those busy buffers, proxy_buffer_size * 2 by default.
  • proxy_temp_path: when the memory buffers fill up, data can spill to disk temporarily; this sets the directory for those temporary files.
    • Syntax: proxy_temp_path path; where path is the temporary directory.
  • proxy_temp_file_write_size: the size limit for each write to a temporary file.
  • proxy_max_temp_file_size: the maximum disk space allowed for temporary buffer files.
  • Non-buffering parameters:
    • proxy_connect_timeout: timeout for establishing a connection to the backend server.
    • proxy_read_timeout: timeout for reading response data from the backend server.
    • proxy_send_timeout: timeout for sending request data to the backend server.

For specific nginx.conf configuration, refer to the following:

http{  
    proxy_connect_timeout 10;  
    proxy_read_timeout 120;  
    proxy_send_timeout 10;  
    proxy_buffering on;  
    client_body_buffer_size 512k;  
    proxy_buffers 4 64k;  
    proxy_buffer_size 16k;  
    proxy_busy_buffers_size 128k;  
    proxy_temp_file_write_size 128k;  
    proxy_temp_path /soft/nginx/temp_buffer;  
}             

Note that buffer space is allocated per request/connection; it is not a pool shared by all requests.

Nginx buffering can also somewhat smooth out the bandwidth spikes caused by pushing everything through instantly.

7. Nginx caching mechanism

Caching should be familiar: it is one of the most effective performance optimizations (there are client caches, proxy caches, server caches, and so on). Nginx's cache is a proxy cache.

Benefits of caching:

  • It reduces the bandwidth consumed by fetching resources from the backend or file server again.
  • It reduces the load on the backend servers and improves the system's overall throughput.
  • It shortens response times and makes pages load faster.

Let's first get familiar with Nginx's cache-related configuration:

proxy_cache_path: The path to the proxy cache.

proxy_cache_path path [levels=levels] [use_temp_path=on|off] keys_zone=name:size [inactive=time] [max_size=size] [manager_files=number] [manager_sleep=time] [manager_threshold=time] [loader_files=number] [loader_sleep=time] [loader_threshold=time] [purger=on|off] [purger_files=number] [purger_sleep=time] [purger_threshold=time];           

Explain the meaning of each parameter:

  • path: the path where the cache is stored on disk.
  • levels: the directory hierarchy for cached files; at most three levels are allowed.
  • use_temp_path: whether to use a temporary directory.
  • keys_zone: the shared memory zone that stores the hot keys (1MB can hold about 8000 keys).
  • inactive: how long an entry may go unaccessed before it is removed (ten minutes by default).
  • max_size: the maximum disk space the cache may use; beyond it, entries are evicted (roughly LRU) by the cache manager process that Nginx starts, or can be removed via purge.
  • manager_files: the maximum number of cached files the manager process removes in one iteration (100 by default).
  • manager_sleep: the pause between the manager's removal iterations (50ms by default).
  • manager_threshold: the maximum duration of one removal iteration (200ms by default).
  • loader_files: how many cached entries are loaded per iteration when Nginx restarts and reloads the cache (100 by default).
  • loader_sleep: the pause between loading iterations (50ms by default).
  • loader_threshold: the maximum duration of one loading iteration (200ms by default).
  • purger: whether to enable removal of cached data via the purger.
  • purger_files: the number of cached files removed per purge iteration.
  • purger_sleep: the pause between purge iterations.
  • purger_threshold: the maximum duration of one purge iteration.

proxy_cache: To enable or disable proxy caching, you need to specify a shared memory area when enabling it.

proxy_cache zone | off;           

zone represents the name of the memory region, which is the name of the setting keys_zone above.

proxy_cache_key: How to generate cached keys.

proxy_cache_key string;           

string is the rule of Key, for example: $scheme$proxy_host$request_uri.

proxy_cache_valid: The status code and expiration time of the cache.

proxy_cache_valid [code ...] time;           

Code is the status code, time is the effective time, and different cache times can be set according to the status code.

For example: proxy_cache_valid 200 302 30m;

proxy_cache_min_uses: How many times a resource must be requested before it is cached.

proxy_cache_min_uses number;           

number is the number of times, and the default is 1.

proxy_cache_use_stale: When an exception occurs in the backend, whether Nginx is allowed to return the cache as a response.

proxy_cache_use_stale error;           

error is the error type, configurable timeout|invalid_header|updating|http_500....

proxy_cache_lock: Whether, for identical requests, only one is allowed through to the backend at a time (the others wait for the cache to be populated).

proxy_cache_lock on | off;           

proxy_cache_lock_timeout: Configure the lock timeout mechanism to release the request after the specified time.

proxy_cache_lock_timeout time;           

proxy_cache_methods: Which HTTP request methods may be cached.

proxy_cache_methods method;           

method is the request method, for example GET or HEAD (the defaults).

"proxy_no_cache": Define the condition that the cache is not stored, and it will not be saved when it is met.

proxy_no_cache string...;           

string is a condition, such as $cookie_nocache $arg_nocache $arg_comment;

proxy_cache_bypass: Define the condition that the cache is not read, and it will not be read from the cache if it is met.

proxy_cache_bypass string...;           

Similar to the configuration of proxy_no_cache.

add_header: Add field information to the response header.

add_header fieldName fieldValue;           

$upstream_cache_status: records whether the cache was hit; its possible values are:

  • MISS: The request missed the cache.
  • HIT: Request hit cache.
  • EXPIRED: The cache was requested but the cache has expired.
  • STALE: The request hit a stale cache.
  • REVALIDATED: Nginx verified that the stale cached content is still valid.
  • UPDATING: The cache content of the hit is stale, but the cache is being updated.
  • BYPASS: The response result is obtained from the origin server.

Note: This is an Nginx built-in variable and not a parameter item.

Next, configure the Nginx proxy cache:

http{  
    # on-disk cache directory; the in-memory zone is named hot_cache and is 128m,  
    # entries not accessed for three days are removed, and the on-disk cache is capped at 2GB.  
    proxy_cache_path /soft/nginx/cache levels=1:2 keys_zone=hot_cache:128m inactive=3d max_size=2g;  
      
    server{  
        location / {  
            # use the cache zone named hot_cache  
            proxy_cache hot_cache;  
            # cache 200, 206, 304, 301 and 302 responses for 1 day  
            proxy_cache_valid 200 206 304 301 302 1d;  
            # cache everything else for 30 minutes  
            proxy_cache_valid any 30m;  
            # rule for building the cache key (host + uri + query string)  
            proxy_cache_key $host$uri$is_args$args;  
            # only cache a resource after it has been requested at least three times  
            proxy_cache_min_uses 3;  
            # for concurrent identical requests, let only one hit the backend; the rest read from the cache  
            proxy_cache_lock on;  
            # lock timeout of 3s: if no data arrives within 3s, the other requests go straight to the backend  
            proxy_cache_lock_timeout 3s;  
            # do not cache when the request parameters or cookies say not to  
            proxy_no_cache $cookie_nocache $arg_nocache $arg_comment;  
            # add a response header showing whether the cache was hit (handy for debugging)  
            add_header Cache-status $upstream_cache_status;  
        }  
    }  
}             

Test the effect and watch the Cache-status response header:

On the first visit there is nothing in the cache, so it is a miss. The second and third visits are still misses, because the configuration requires a resource to be requested at least three times before it is cached. Only from the fourth request onward is the cache hit. This avoids wasting cache space on rarely requested resources.

Cache cleanup

Where there is a cache there must be a way to clear it, just as a mobile app's cache eventually has to be cleaned out.

If the cache keeps growing and is never cleaned, it will eat up disk space. proxy_cache_path does offer a purger option that cleans the cache automatically, but it is only available in the paid commercial version.

Instead, we can use the third-party ngx_cache_purge module. Install the plugin first:

(1) Go to the Nginx installation directory and create a cache_purge directory:

[root@localhost]# mkdir cache_purge && cd cache_purge             

(2) Download the plugin compressed file from github via the wget command and unzip it:

[root@localhost]# wget https://github.com/FRiCKLE/ngx_cache_purge/archive/2.3.tar.gz  
[root@localhost]# tar -xvzf 2.3.tar.gz             

(3) Go back to the Nginx source directory:

[root@localhost]# cd /soft/nginx/nginx-1.21.6             

(4) Rebuild Nginx and add the plugin with the --add-module command:

[root@localhost]# ./configure --prefix=/soft/nginx/ --add-module=/soft/nginx/cache_purge/ngx_cache_purge-2.3/             

(5) Recompile the Nginx you just configured, but do not run make install:

[root@localhost]# make             

(6) Delete the old Nginx binary:

[root@localhost]# rm -rf /soft/nginx/sbin/nginx             

(7) Copy the newly built Nginx binary from the objs directory to the original location:

[root@localhost]# cp objs/nginx /soft/nginx/sbin/nginx             

The third-party cache-purge module ngx_cache_purge is now installed. Adjust nginx.conf and add a location rule:

location ~ /purge(/.*) {  
  # IPs allowed to perform purges (in production, restrict this to internal machines)  
  # allow 127.0.0.1; # only this machine  
  allow all; # allow any IP to purge the cache  
  proxy_cache_purge $host$1$is_args$args;  
}             

Restart Nginx, and you can then clear a cached resource by requesting http://xxx/purge/xx, where the path after /purge/ identifies the cached resource.

8. Nginx implements the IP blacklist and whitelist function

Nginx mainly implements the IP blacklist and whitelist through allow and deny configuration items:

allow xxx.xxx.xxx.xxx; # allow the specified IP to access; can be used to build a whitelist.  
deny xxx.xxx.xxx.xxx; # deny the specified IP; can be used to build a blacklist.             

When there are many IPs, piling them all into nginx.conf is unwieldy and redundant, so create two separate files, BlocksIP.conf and WhiteIP.conf:

# --------Blacklist: BlocksIP.conf---------  
deny 192.177.12.222; # block 192.177.12.222  
deny 192.177.44.201; # block 192.177.44.201  
deny 127.0.0.0/8; # block every IP in the 127.0.0.1-127.255.255.254 range  
  
# --------Whitelist: WhiteIP.conf---------  
allow 192.177.12.222; # allow 192.177.12.222  
allow 192.177.44.201; # allow 192.177.44.201  
allow 127.45.0.0/16; # allow every IP in the 127.45.0.1-127.45.255.254 range  
deny all; # deny every IP other than the ones above             

Add the IP addresses you want to disallow/open to the corresponding files, and then import these two files into nginx.conf:

http{  
    # block every IP listed in this file  
    include /soft/nginx/IP/BlocksIP.conf;   
 server{  
    location xxx {  
        # open this group of endpoints only to the IPs in the whitelist  
        include /soft/nginx/IP/WhiteIP.conf;   
    }  
 }  
}             

Where to include these files:

To block/allow for the whole site, include them in the http block;

To block/allow for a single domain name, include them in the server block;

To block/allow only for a certain group of endpoints, include them in the relevant location block.

You can also implement IP blacklists and whitelists with the ngx_http_geo_module and ngx_http_geoip_module third-party libraries (these can block by region or country and come with IP databases).

9. Nginx cross-domain configuration

With front-end/back-end separation, distributed architectures, or third-party SDKs, cross-origin problems are almost unavoidable.

How do cross-domain issues arise?

The root cause is the browser's same-origin policy. It exists to protect user information and stop malicious websites from stealing data: without it, cookies could be shared across sites and, since HTTP is stateless, a user's identity could easily be stolen.

"Same origin" means the same protocol, domain name, and port; two requests that match on all three are considered same-origin, and the policy restricts resource interaction between different origins.

Nginx addresses cross-domain

Adding the following to nginx.conf solves the cross-origin problem:

location / {  
    # origins allowed to make cross-origin requests; you can use $http_origin, and * means all  
    add_header 'Access-Control-Allow-Origin' *;  
    # allow requests to carry cookies  
    add_header 'Access-Control-Allow-Credentials' 'true';  
    # allowed cross-origin methods: GET, POST, OPTIONS, PUT  
    add_header 'Access-Control-Allow-Methods' 'GET,POST,OPTIONS,PUT';  
    # request headers allowed to be carried; * means all  
    add_header 'Access-Control-Allow-Headers' *;  
    # allow range requests for fetching resources in segments  
    add_header 'Access-Control-Expose-Headers' 'Content-Length,Content-Range';  
    # this part is required! Otherwise POST requests cannot cross origins!  
    # before a cross-origin POST, the browser sends an OPTIONS preflight; only if the server accepts it is the real request sent  
    if ($request_method = 'OPTIONS') {  
        add_header 'Access-Control-Max-Age' 1728000;  
        add_header 'Content-Type' 'text/plain; charset=utf-8';  
        add_header 'Content-Length' 0;  
        # return 204 for OPTIONS requests to signal that the cross-origin request is accepted  
        return 204;  
    }  
}             

After adding this to nginx.conf, reload the configuration and the cross-origin problem is solved.

In a distributed backend, RPC calls sometimes run into cross-origin issues too; those can be handled in the backend project by extending HandlerInterceptorAdapter, implementing the WebMvcConfigurer interface, or adding @CrossOrigin annotations.

10. Nginx's anti-hotlink design

What is hotlinking?

Hotlinking is when an external website embeds and displays resources that belong to the current website.

An example makes it clear: site A paid for its image assets, but site B simply embeds them with <img src="site-A/xxx.jpg" />, effectively lifting all of site A's pictures for free.

This is where anti-hotlinking comes in handy.

Nginx's anti-hotlink mechanism is related to a header field, Referer.

This header describes where the request came from. Nginx can read its value and decide whether the resource is being referenced from this site; if not, access is denied.

There is a configuration item valid_referers in Nginx that can just meet this requirement, and the usage is as follows:

valid_referers none | blocked | server_names | string ...;           
  • none: accept requests that carry no Referer header at all.
  • blocked: accept requests whose Referer exists but has been masked by a firewall or proxy, i.e. does not start with http:// or https://.
  • server_names: the whitelist of domains allowed to reference the resources.
  • string: a custom value; wildcards and regular expressions (prefixed with ~) are supported.

According to the syntax, it is implemented as follows:

# enable anti-hotlinking in the dynamic/static separation location  
location ~ .*\.(html|htm|gif|jpg|jpeg|bmp|png|ico|txt|js|css){  
    # before going live, change the last value to the domain you want to allow  
    valid_referers blocked 192.168.12.129;  
    if ($invalid_referer) {  
        # you can return a "no hotlinking" image instead  
        # rewrite   ^/ http://xx.xx.com/NO.jpg;  
        # or simply return 403  
        return   403;  
    }  
      
    root   /soft/nginx/static_resources;  
    expires 7d;  
}             

Restart Nginx and basic anti-hotlinking is in place.

For a more complete design, the anti-hotlinking mechanism can also be built with the third-party ngx_http_accesskey_module plugin.

Note that anti-hotlinking cannot stop crawlers that forge the Referer header to scrape data.

11. Nginx large file transfer configuration

In some business scenarios, large file transfers through Nginx fail because the file exceeds a size limit or the request times out; a few configuration tweaks solve this.

The configuration items involved boil down to a request-body size limit and a few proxy timeouts; tune these values according to your project (a typical set of directives, with illustrative values, is sketched below).

Only Nginx as the proxy/gateway layer is covered here: the goal is simply to raise its limits enough that it can accommodate large file transfers.
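
A hedged sketch of that tuning; the directive names are standard Nginx, but whether they match the original table exactly, and the values themselves, are assumptions rather than recommendations:

http {  
    client_max_body_size  1024m;  # maximum allowed size of a client request body (e.g. uploads)  
    client_header_timeout 120s;   # how long to wait for the client to send the request header  
    proxy_read_timeout    1200s;  # how long to wait for a response from the backend server  
    proxy_send_timeout    600s;   # how long to wait while sending the request to the backend server  
}             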

12. Nginx SSL certificate configuration

To serve the site over HTTPS, Nginx must listen on port 443, and the server must be configured with the corresponding SSL certificate to secure communication.

About the SSL certificate configuration process:

(1) Go to the CA or apply for the corresponding SSL certificate from the cloud console, and download the Nginx version of the certificate after approval.

(2) After downloading the certificate there are normally three files: .crt, .key and .pem:

  • .crt: the digital certificate file; .crt is essentially a .pem with a different extension, so some downloads may not include it.
  • .key: the server's private key, the private half of the asymmetric key pair, used to decrypt data encrypted with the public key.
  • .pem: the Base64-encoded certificate text; its extension can be changed as required.

(3) Create a certificate directory in the Nginx directory and upload the downloaded certificate/private key file to this directory.

(4) Finally, change the nginx.conf file as follows:

# ----------HTTPS configuration-----------  
server {  
    # listen on 443, the default HTTPS port  
    listen 443;  
    # your project's domain name  
    server_name www.xxx.com;  
    # enable SSL encrypted transport  
    ssl on;  
    # directory containing the home page files  
    root html;  
    # home page file names  
    index index.html index.htm index.jsp index.ftl;  
    # the downloaded digital certificate  
    ssl_certificate  certificate/xxx.pem;  
    # the downloaded server private key  
    ssl_certificate_key certificate/xxx.key;  
    # session cache lifetime: within this window no new key exchange is needed  
    ssl_session_timeout 5m;  
    # cipher suites the server uses during the TLS handshake  
    ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE:ECDH:AES:HIGH:!NULL:!aNULL:!MD5:!ADH:!RC4;  
    # TLS versions the server supports  
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;  
    # prefer the server's cipher suite order  
    ssl_prefer_server_ciphers on;  
  
    location / {  
        ....  
    }  
}  
  
# ---------redirect HTTP to HTTPS-------------  
server {  
    # listen on 80, the default HTTP port  
    listen 80;  
    # requests for this domain arriving on port 80  
    server_name www.xxx.com;  
    # rewrite them to HTTPS (use the domain you configured for HTTPS above)  
    rewrite ^(.*)$ https://www.xxx.com;  
}             

Once configured, your website can be accessed via https, and when the client uses http, it will be automatically rewritten as an HTTPS request.

13. High availability of Nginx

If only a single Nginx node is deployed in production, then because Nginx is the gateway taking all external traffic for the whole system, its going down eventually takes the entire system with it, which is a disaster in a production environment.

Therefore, it is also necessary to ensure the high availability of Nginx.

Nginx is made highly available through keepalived's VIP mechanism. VIP here stands for Virtual IP.

Keepalived was a commonly used high-availability technique in the monolithic era: MySQL, Redis, MQ, proxies, Tomcat and the like all used its VIP mechanism to make single-node applications highly available.

Building keepalived + a restart script + master/standby hot backup

(1) Create a directory, download keepalived onto the Linux machine, and extract it:

[root@localhost]# mkdir /soft/keepalived && cd /soft/keepalived  
[root@localhost]# wget https://www.keepalived.org/software/keepalived-2.2.4.tar.gz  
[root@localhost]# tar -zxvf keepalived-2.2.4.tar.gz             

(2) Enter the extracted keepalived directory, configure the build environment, then compile and install:

[root@localhost]# cd keepalived-2.2.4  
[root@localhost]# ./configure --prefix=/soft/keepalived/  
[root@localhost]# make && make install             

(3) Go to /soft/keepalived/etc/keepalived/ in the installation directory and edit the configuration file:

[root@localhost]# cd /soft/keepalived/etc/keepalived/  
[root@localhost]# vi keepalived.conf             

(4) Edit the master's core configuration file, keepalived.conf, as follows:

global_defs {  
    # built-in email notification; a dedicated monitoring system or third-party SMTP is recommended, but email alerts can also be configured here.  
    notification_email {  
        root@localhost  
    }  
    notification_email_from root@localhost  
    smtp_server localhost  
    smtp_connect_timeout 30  
    # identity of this host in the HA cluster (must be unique within the cluster; using the machine's IP is recommended)  
    router_id 192.168.12.129   
}  
  
# script to be run on a schedule  
vrrp_script check_nginx_pid_restart {  
    # location of the nginx restart script written later  
    script "/soft/scripts/keepalived/check_nginx_pid_restart.sh"   
    # run every 3 seconds  
    interval 3  
    # if the script's condition is met, lower this node's priority weight by 20  
    weight -20  
}  
  
# define the virtual router; VI_1 is its identifier (the name is up to you)  
vrrp_instance VI_1 {  
    # role of this node: MASTER for the master, BACKUP for the standby  
    state MASTER  
    # network interface the virtual IP binds to; match your machine's NIC name  
    interface ens33   
    # virtual router ID; must be identical on master and standby  
    virtual_router_id 121  
    # this machine's IP  
    mcast_src_ip 192.168.12.129  
    # node priority; the master must be higher than the standby  
    priority 100  
    # nopreempt on the higher-priority node avoids the split-brain caused by re-preempting the VIP after recovering from a failure  
    nopreempt  
    # advertisement interval; must be the same on both nodes, default 1s (works like a heartbeat)  
    advert_int 1  
    authentication {  
        auth_type PASS  
        auth_pass 1111  
    }  
    # attach the track_script block to this vrrp_instance  
    track_script {  
        # run the Nginx monitoring script  
        check_nginx_pid_restart  
    }  
  
    virtual_ipaddress {  
        # the virtual IP (VIP); more than one can be configured  
        192.168.12.111  
    }  
}             

(5) Clone the previous virtual machine as the standby, and edit its keepalived.conf as follows:

global_defs {  
    # built-in email notification; a dedicated monitoring system or third-party SMTP is recommended, but email alerts can also be configured here.  
    notification_email {  
        root@localhost  
    }  
    notification_email_from root@localhost  
    smtp_server localhost  
    smtp_connect_timeout 30  
    # identity of this host in the HA cluster (must be unique within the cluster; using the machine's IP is recommended)  
    router_id 192.168.12.130   
}  
  
# script to be run on a schedule  
vrrp_script check_nginx_pid_restart {  
    # location of the nginx restart script written later  
    script "/soft/scripts/keepalived/check_nginx_pid_restart.sh"   
    # run every 3 seconds  
    interval 3  
    # if the script's condition is met, lower this node's priority weight by 20  
    weight -20  
}  
  
# define the virtual router; VI_1 is its identifier (the name is up to you)  
vrrp_instance VI_1 {  
    # role of this node: MASTER for the master, BACKUP for the standby  
    state BACKUP  
    # network interface the virtual IP binds to; match your machine's NIC name  
    interface ens33   
    # virtual router ID; must be identical on master and standby  
    virtual_router_id 121  
    # this machine's IP  
    mcast_src_ip 192.168.12.130  
    # node priority; the master must be higher than the standby  
    priority 90  
    # nopreempt on the higher-priority node avoids the split-brain caused by re-preempting the VIP after recovering from a failure  
    nopreempt  
    # advertisement interval; must be the same on both nodes, default 1s (works like a heartbeat)  
    advert_int 1  
    authentication {  
        auth_type PASS  
        auth_pass 1111  
    }  
    # attach the track_script block to this vrrp_instance  
    track_script {  
        # run the Nginx monitoring script  
        check_nginx_pid_restart  
    }  
  
    virtual_ipaddress {  
        # the virtual IP (VIP); more than one can be configured  
        192.168.12.111  
    }  
}             

(6) Create a new scripts directory and write the Nginx restart script, check_nginx_pid_restart.sh:

[root@localhost]# mkdir /soft/scripts /soft/scripts/keepalived  
[root@localhost]# touch /soft/scripts/keepalived/check_nginx_pid_restart.sh  
[root@localhost]# vi /soft/scripts/keepalived/check_nginx_pid_restart.sh  
  
#!/bin/sh  
# count the nginx processes running in the background and store the number in nginx_number  
nginx_number=`ps -C nginx --no-header | wc -l`  
# check whether any Nginx process is still running  
if [ $nginx_number -eq 0 ];then  
    # if no nginx process can be found, run the restart command  
    /soft/nginx/sbin/nginx -c /soft/nginx/conf/nginx.conf  
    # wait 1s after restarting, then check the process count again  
    sleep 1  
    # if nginx still cannot be found after the restart  
    if [ `ps -C nginx --no-header | wc -l` -eq 0 ];then  
        # take keepalived on this host offline so the VIP fails over to the standby, which takes over Nginx  
        systemctl stop keepalived.service  
    fi  
fi             

(7) Change the encoding format of the script file and give execution permissions:

[root@localhost]# vi /soft/scripts/keepalived/check_nginx_pid_restart.sh  
  
:set fileformat=unix # run inside vi to switch the file to unix line endings  
:set ff # check the file format after the change  
  
[root@localhost]# chmod +x /soft/scripts/keepalived/check_nginx_pid_restart.sh             

(8) Because keepalived was installed to a custom location, a few files need to be copied into the system directories:

[root@localhost]# mkdir /etc/keepalived/  
[root@localhost]# cp /soft/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/  
[root@localhost]# cp /soft/keepalived/keepalived-2.2.4/keepalived/etc/init.d/keepalived /etc/init.d/  
[root@localhost]# cp /soft/keepalived/etc/sysconfig/keepalived /etc/sysconfig/             

(9) Register keepalived as a system service, enable it at boot, and check that it starts normally:

[root@localhost]# chkconfig keepalived on  
[root@localhost]# systemctl daemon-reload  
[root@localhost]# systemctl enable keepalived.service  
[root@localhost]# systemctl start keepalived.service             

Other commands:

systemctl disable keepalived.service # do not start at boot  
systemctl restart keepalived.service # restart keepalived  
systemctl stop keepalived.service # stop keepalived  
tail -f /var/log/messages # tail keepalived's runtime log             

(10) Check whether the VIP works by verifying that the virtual IP is bound on the machine:

[root@localhost]# ip addr             

You can see that the virtual IP is successfully bound on the master. The standby, 192.168.12.130, does not bind it; only when the master goes offline will the standby come online and take over the VIP.

Then test whether external machines can reach the VIP. If communication through the VIP works, the virtual IP is configured successfully.

Nginx high availability testing

keepalived's VIP mechanism has done several things for us:

  • First, a VIP is bound to the machine where Nginx is deployed.
  • Second, keepalived provides master/standby hot backup between the two machines.
  • Third, keepalived restarts Nginx automatically when it goes down.

To serve traffic on the VIP (or a domain mapped to it), adjust nginx.conf:

server {  
    listen    80;  
    # change this from the machine's local IP to the virtual IP  
    server_name 192.168.12.111;  
    # if a domain name is configured here, point the domain's DNS mapping at the virtual IP instead  
}             

Test the effect: start the keepalived and nginx services, then simulate an Nginx crash by stopping nginx manually. Query the background processes a moment later and nginx is running again.

As you can see, keepalived has given us automatic restart of Nginx after it goes down.

Next, simulate a whole-server failure: stop the keepalived service manually to mimic a power failure or hardware fault (in those situations the keepalived process on the master disappears), then query the machine's IP information again. The VIP is gone!

Now switch to the standby machine, 192.168.12.130, and check its IP information:

After the main machine is down, the VIP is automatically transferred from the master to the slave.

14. Nginx performance optimization

Finally, a brief look at the most effective Nginx performance optimizations.

Optimization 1: Enable persistent connection configuration

When Nginx proxies to an upstream, enable HTTP keep-alive connections to reduce handshakes and server overhead:

upstream xxx {  
    # number of idle keep-alive connections to cache per worker  
    keepalive 32;  
    # maximum number of requests served over one keep-alive connection  
    keepalive_requests 100;  
    # how long an idle keep-alive connection is kept open  
    keepalive_timeout 60s;  
}             
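
Note that for the upstream keep-alive connections to actually be reused, the Nginx documentation also requires the proxied location to use HTTP/1.1 and clear the Connection header; a minimal sketch:

location / {  
    proxy_http_version 1.1;          # keep-alive to the upstream requires HTTP/1.1  
    proxy_set_header Connection "";  # clear the Connection header so connections can be reused  
    proxy_pass http://xxx;           # the upstream block shown above  
}             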

Optimization 2: Enable zero-copy technology

Zero copy shows up in most good middleware, such as Kafka and Netty, and Nginx can enable it too:

sendfile on; # enable the zero-copy mechanism             

The difference between the zero-copy read path and the traditional read path:

  • Traditional path: hardware → kernel → user space → application → kernel (socket buffer) → network socket
  • Zero-copy path: hardware → kernel → kernel (socket buffer) → network socket

Optimization 3: Enable the mechanism of non-delay or multi-packet co-delivery

tcp_nodelay and tcp_nopush are two of Nginx's more important performance parameters; they are enabled as follows:

tcp_nodelay on;  
tcp_nopush on;             

TCP/IP uses the Nagle algorithm by default: packets are not sent out immediately but wait briefly, so the next few small packets can be merged into one before sending. This improves network throughput at the cost of real-time responsiveness.

So if your project is a highly interactive application, turn on tcp_nodelay so that every packet the application hands to the kernel is sent immediately. The trade-off is many more TCP headers and therefore more network overhead.

On the contrary, if some projects pursue higher throughput and do not have high real-time requirements, you can enable tcp_nopush configuration items. When this option is set, the kernel tries to concatenate small packets into one large packet (one MTU) and send them together.

Of course, if after a certain period of time (usually 200ms), the kernel still does not accumulate to an MTU amount, it must also send existing data, otherwise it will keep blocking.

tcp_nodelay and tcp_nopush are, in effect, "mutually exclusive" choices: one favors latency, the other throughput.

If you are looking for a responsive application, it is recommended to enable tcp_nodelay parameters, such as IM, finance, and other types of projects.

If you are pursuing throughput, we recommend that you enable tcp_nopush parameters, such as scheduling system and reporting system.

(1) tcp_nodelay is generally only meaningful when keep-alive connections are in use.

(2) tcp_nopush only takes effect when sendfile is enabled (see the sketch below).
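
Putting those two notes together, a typical http-level arrangement looks roughly like this (the values are illustrative):

http {  
    sendfile    on;   # tcp_nopush only works together with sendfile  
    tcp_nopush  on;   # batch headers and data into fuller packets for throughput  
    keepalive_timeout 65;  
    tcp_nodelay on;   # on keep-alive connections, send small packets immediately  
}             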

Optimization 4: Adjust the number of worker processes

By default Nginx starts a single worker process to handle client requests. Set the number of workers to match the machine's CPU cores to improve overall concurrency:

# automatically size the number of worker processes to the number of CPU cores  
worker_processes auto;             

Note: it is usually recommended to cap workers at 8; beyond that there is little further performance gain.

You can also slightly adjust the number of file handles that each worker process can open:

# file descriptors each worker may open: at least 10,000; 20,000-30,000 under heavier load  
worker_rlimit_nofile 20000;             

The OS kernel accesses files through file descriptors: opening, creating, reading and writing all go through a descriptor, so the larger this value, the more files a process can have open (it cannot exceed the kernel limit; about 38,000 is a commonly suggested upper bound).

Optimization 5: Enable the CPU affinity mechanism

Anyone familiar with concurrent programming knows that the number of processes/threads usually far exceeds the number of CPU cores, and the operating system schedules them by time-slicing: each core constantly switches between processes, and that switching carries a real performance cost.

The CPU affinity mechanism refers to binding each Nginx worker process to a fixed CPU core, thereby reducing the time overhead and resource consumption caused by CPU switching, and the opening method is as follows:

worker_cpu_affinity auto;             

Optimization 6: Enable the epoll model and adjust the number of concurrent connections

As mentioned at the beginning, Nginx (like Redis) is built on an I/O multiplexing model. The older select/poll model can only monitor up to 1024 connections, while epoll is an enhanced successor to the select/poll interface, so using it greatly improves the performance of a single worker:

events {  
    # use the epoll event model  
    use epoll;  
    # raise the ceiling on connections each worker can handle  
    worker_connections  10240;  
}             

It's finally over...