This software is free and open source under [MIT license](./LICENSE.txt).
If anyone demands you to download this software only from their webpage, telegram channel, forces you to delete links, videos, makes copyright claims, you are dealing with scammers.
You need to run them with the necessary parameters and redirect certain traffic with iptables or nftables.
## When it will not work
* If DNS server returns false responses. ISP can return false IP addresses or not return anything
when blocked domains are queried. If this is the case change DNS to public ones, such as 8.8.8.8 or 1.1.1.1.Sometimes ISP hijacks queries to any DNS server. Dnscrypt or dns-over-tls help.
* If blocking is done by IP.
* If a connection passes through a filter capable of reconstructing a TCP connection, and which
follows all standards. For example, we are routed to squid. Connection goes through the full OS tcpip stack, fragmentation disappears immediately as a means of circumvention. Squid is correct, it will find everything as it should, it is useless to deceive him. BUT. Only small providers can afford using squid, since it is very resource intensive. Large companies usually use DPI, which is designed for much greater bandwidth.
## nfqws
This program is a packet modifier and a NFQUEUE queue handler.
--dpi-desync-udplen-increment=<int> ; increase or decrease udp packet length by N bytes (default 2). negative values decrease length.
--dpi-desync-udplen-pattern=<filename>|0xHEX ; udp tail fill pattern
--dpi-desync-start=[n|d|s]N ; apply dpi desync only to packet numbers (n, default), data packet numbers (d), relative sequence (s) greater or equal than N
--dpi-desync-cutoff=[n|d|s]N ; apply dpi desync only to packet numbers (n, default), data packet numbers (d), relative sequence (s) less than N
--hostlist=<filename> ; apply dpi desync only to the listed hosts (one host per line, subdomains auto apply, gzip supported, multiple hostlists allowed)
--hostlist-exclude=<filename> ; do not apply dpi desync to the listed hosts (one host per line, subdomains auto apply, gzip supported, multiple hostlists allowed)
--hostlist-auto=<filename> ; detect DPI blocks and build hostlist automatically
--hostlist-auto-fail-threshold=<int> ; how many failed attempts cause hostname to be added to auto hostlist (default : 3)
--hostlist-auto-fail-time=<int> ; all failed attemps must be within these seconds (default : 60)
--hostlist-auto-retrans-threshold=<int> ; how many request retransmissions cause attempt to fail (default : 3)
--hostlist-auto-debug=<logfile> ; debug auto hostlist positives
--filter-tcp=[~]port1[-port2]|* ; TCP port filter. ~ means negation. setting tcp and not setting udp filter denies udp. comma separated list supported.
--filter-udp=[~]port1[-port2]|* ; UDP port filter. ~ means negation. setting udp and not setting tcp filter denies tcp. comma separated list supported.
Fakes are separate generated by nfqws packets carrying false information for DPI. They must either not reach the server or be rejected by it. Otherwise TCP connection or data stream would be broken. There're multiple ways to solve this task.
* **md5sig** does not work on all servers
* **badsum** doesn't work if your device is behind NAT which does not pass invalid packets.
* **method** - HTTP method start ('GET', 'POST', 'HEAD', ...). Method is usually always at position 0 but can shift because of `--methodeol` fooling. If fooled position can become 1 or 2.
* **host** - hostname start in a known protocol (http, TLS)
* **endhost** - the byte next to the last hostname's byte
* **sld** - second level domain start in the hostname
* **endsld** - the byte next to the last SLD byte
* **midsld** - middle of SLD
* **sniext** - start of the data field in the SNI TLS extension. Any extension has 2-byte type and length fields followed by data field.
Marker list example : `100,midsld,sniext+1,endhost-2,-10`.
When splitting all markers are resolved to absolute offsets. If a relative position is absent in the current protocol its dropped. Then all resolved offsets are normalized to the current packet offset in multi packet group (multi-packet TLS with kyber, for example). Positions outside of the current packet are dropped. Remaining positions are sorted and deduplicated.
`seqovl` adds to one of the original segment `seqovl` bytes to the beginning and decreases sequence number. For `split` - to the first segment, for `disorder` - to the beginning of the penultimate segment sent (second in the original sequence).
In `split` mode this creates partially in-window packet. OS receives only in-window part.
In `disorder` mode OS receives fake and real part of the second segment but does not pass received data to the socket until first segment is received. First segment overwrites fake part of the second segment. Then OS passes original data to the socket.
There are DPIs that analyze responses from the server, particularly the certificate from the ServerHello that contain domain name(s). The ClientHello delivery confirmation is an ACK packet from the server with ACK sequence number corresponding to the length of the ClientHello+1.
If the RST is after a full ACK after a delay of about ping to the server, then probably DPI acts on the server response. The DPI may be satisfied with good ClientHello and stop monitoring the TCP session without checking ServerHello. Then you were lucky. 'fake' option could work.
If it does not stop monitoring and persistently checks the ServerHello, --wssize parameter may help (see [CONNTRACK](#conntrack)).
Otherwise it is hardly possible to overcome this without the help of the server.
The best solution is to enable TLS 1.3 support on the server. TLS 1.3 sends the server certificate in encrypted form.
This is recommendation to all admins of blocked sites. Enable TLS 1.3. You will give more opportunities to overcome DPI.
### SYNDATA mode
Normally SYNs come without data payload. If it's present it's ignored by all major OS if TCP fast open (TFO) is not involved, but may not be ignored by DPI.
Original connections with TFO are not touched because otherwise they would be definitely broken.
* 0 phase modes work during the connection establishement : `synack`, `syndata``--wsize`, `--wssize`. [hostlist](((#multiple-strategies))) filters are not applicable.
* In the 1st phase fakes are sent before original data : `fake`, `rst`, `rstack`.
* In the 2nd phase original data is sent in a modified way (for example `fakedsplit` or `ipfrag2`).
There is special support for all tcp split options for multi segment TLS. Split position is treated as message-oriented, not packet oriented. For example, if your client sends TLS ClientHello with size 2000 and SNI is at 1700, desync mode is `fake,multisplit`, then fake is sent first, then original first segment and the last splitted segment. 3 segments total.
UDP attacks are limited. Its not possible to fragment UDP on transport level, only on network (ip) level.
Only desync modes `fake`,`hopbyhop`,`destopt`,`ipfrag1` and `ipfrag2` are applicable.
`fake`,`hopbyhop`,`destopt` can be used in combo with `ipfrag2`.
`fake` can be used in combo with `udplen` and `tamper`.
`udplen` increases udp payload size by `--dpi-desync-udplen-increment` bytes. Padding is filled with zeroes by default but can be overriden with a pattern.
This option can resist DPIs that track outgoing UDP packet sizes.
Requires that application protocol does not depend on udp payload size.
QUIC initial packets are recognized. Decryption and hostname extraction is supported so `--hostlist` parameter will work.
Wireguard handshake initiation and DHT packets are also recognized.
For other protocols desync use `--dpi-desync-any-protocol`.
Conntrack supports udp. `--dpi-desync-cutoff` will work. UDP conntrack timeout can be set in the 4th parameter of `--ctrack-timeouts`.
Fake attack is useful only for stateful DPI and useless for stateless dealing with each packet independently.
By default fake payload is 64 zeroes. Can be overriden using `--dpi-desync-fake-unknown-udp`.
### IP fragmentation
Modern network can be very hostile to IP fragmentation. Fragmented packets are often not delivered or refragmented/reassembled on the way.
Frag position is set independently for tcp and udp. By default 24 and 8, must be multiple of 8.
Offset starts from the transport header.
tcp fragments are almost always filtered. It's absolutely not suitable for arbitrary websites.
udp fragments have good chances to survive but not everywhere. It's good to assume success rate on QUIC between 50..75%.
Likely more with your VPS. Sometimes filtered by DDoS protection.
There are important nuances when working with fragments in Linux.
ipv4 : Linux allows to send ipv4 fragments but standard firewall rules in OUTPUT chain can cause raw send to fail.
ipv6 : There's no way for an application to reliably send fragments without defragmentation by conntrack.
Sometimes it works, sometimes system defragments packets.
Looks like kernels <4.16havenosimplewaytosolvethisproblem.Unloadingof`nf_conntrack`module
and its dependency `nf_defrag_ipv6` helps but this severely impacts functionality.
Kernels 4.16+ exclude from defragmentation untracked packets.
See `blockcheck.sh` code for example.
Sometimes it's required to load `ip6table_raw` kernel module with parameter `raw_before_defrag=1`.
In openwrt module parameters are specified after module names separated by space in files located in `/etc/modules.d`.
In traditional linux check whether `iptables-legacy` or `iptables-nft` is used. If legacy create the file
`/etc/modprobe.d/ip6table_raw.conf` with the following content :
```
options ip6table_raw raw_before_defrag=1
```
In some linux distros its possible to change current ip6tables using this command: `update-alternatives --config ip6tables`.
If you want to stay with `nftables-nft` you need to patch and recompile your version.
In `nft.c` find :
```
{
.name = "PREROUTING",
.type = "filter",
.prio = -300, /* NF_IP_PRI_RAW */
.hook = NF_INET_PRE_ROUTING,
},
{
.name = "OUTPUT",
.type = "filter",
.prio = -300, /* NF_IP_PRI_RAW */
.hook = NF_INET_LOCAL_OUT,
},
```
and replace -300 to -450.
It must be done manually, `blockcheck.sh` cannot auto fix this for you.
Or just move to `nftables`. You can create hooks with any priority there.
Looks like there's no way to do ipfrag using iptables for forwarded traffic if NAT is present.
`MASQUERADE` is terminating target, after it `NFQUEUE` does not work.
nfqws sees packets with internal network source address. If fragmented NAT does not process them.
This results in attempt to send packets to internet with internal IP address.
You need to use nftables instead with hook priority 101 or higher.
This variant works if DPI is stateful and does not track all packets separately in search for "bad requests". If it's stateless you have to redirect all outgoing plain http packets.
nft add rule inet ztest pre iifname $IFACE_WAN tcp sport "{80,443}" ct reply packets 1-3 queue num 200 bypass
```
To engage `datanoack` or `ipfrag` for passthrough traffic special POSTNAT configuration is required. Generated packets must be marked as **notrack** in the early stage to avoid being invalidated by linux conntrack.
If your device supports flow offloading (hardware acceleration) iptables and nftables may not work. With offloading enabled packets bypass standard netfilter flow. It must be either disabled or selectively controlled.
Newer linux kernels have software flow offloading (SFO). The story is the same with SFO.
In `iptables` flow offloading is controlled by openwrt proprietary extension `FLOWOFFLOAD`. Newer `nftables` implement built-in offloading support.
Flow offloading does not interfere with **tpws** and `OUTPUT` traffic. It only breaks nfqws that fools `FORWARD` traffic.
--socks ; implement socks4/5 proxy instead of transparent proxy
--local-rcvbuf=<bytes> ; SO_RCVBUF for local legs
--local-sndbuf=<bytes> ; SO_SNDBUF for local legs
--remote-rcvbuf=<bytes> ; SO_RCVBUF for remote legs
--remote-sndbuf=<bytes> ; SO_SNDBUF for remote legs
--nosplice ; do not use splice to transfer data between sockets
--skip-nodelay ; do not set TCP_NODELAY for outgoing connections. incompatible with split.
--local-tcp-user-timeout=<seconds> ; set tcp user timeout for local leg (default : 10, 0 = system default)
--remote-tcp-user-timeout=<seconds> ; set tcp user timeout for remote leg (default : 20, 0 = system default)
--fix-seg=<int> ; recover failed TCP segmentation at the cost of slowdown. wait up to N msec.
--no-resolve ; disable socks5 remote dns
--resolver-threads=<int> ; number of resolver worker threads
--maxconn=<max_connections> ; max number of local legs
--maxfiles=<max_open_files> ; max file descriptors (setrlimit). min requirement is (X*connections+16), where X=6 in tcp proxy mode, X=4 in tampering mode.
; its worth to make a reserve with 1.5 multiplier. by default maxfiles is (X*connections)*1.5+16
--max-orphan-time=<sec> ; if local leg sends something and closes and remote leg is still connecting then cancel connection attempt after N seconds
--hostspell ; exact spelling of "Host" header. must be 4 chars. default is "host"
--hostdot ; add "." after Host: name
--hosttab ; add tab after Host: name
--hostnospace ; remove space after Host:
--hostpad=<bytes> ; add dummy padding headers before Host:
--domcase ; mix domain case after Host: like this : TeSt.cOm
--methodspace ; add extra space after method
--methodeol ; add end-of-line before method
--unixeol ; replace 0D0A to 0A
--tlsrec=N|-N|marker+N|marker-N ; make 2 TLS records. split at specified logical part. don't split if SNI is not present.
--tlsrec-pos=<pos> ; make 2 TLS records. split at specified pos
--mss=<int> ; set client MSS. forces server to split messages but significantly decreases speed !
--tamper-start=[n]<pos> ; start tampering only from specified outbound stream position. byte pos or block number ('n'). default is 0.
--tamper-cutoff=[n]<pos> ; do not tamper anymore after specified outbound stream position. byte pos or block number ('n'). default is unlimited.
--daemon ; daemonize
--pidfile=<filename> ; write pid to file
--user=<username> ; drop root privs
--uid=uid[:gid] ; drop root privs
```
### TCP segmentation in tpws
**tpws** like **nfqws** supports multiple splits. Split [markers](#tcp-segmentation) are specified in `--split-pos` parameter.
On the socket level there's no guaranteed way to force OS to send pieces of data in separate packets. OS has a send buffer for each socket. If `TCP_NODELAY` socket option is enabled and send buffer is empty OS will likely send data immediately. If send buffer is not empty OS will coalesce it with new data and send in one packet if possible.
In practice outside of massive transmissions it's usually enough to enable `TCP_NODELAY` and use separate `send()` calls to force custom TCP segmentation. But if there're too many split segments Linux can combined some pieces and break desired behaviour. BSD and Windows are more predictable in this case. That's why it's not recommended to use too many splits. Tests revealed that 8+ can become problematic.
Since linux kernel 4.6 **tpws** can recognize TCP segmentation failures and warn about them. `--fix-seg` can fix segmentation failures at the cost of some slowdown. It waits for several msec until all previous data is sent. This breaks async processing model and slows down every other connection going through **tpws**. Thus it's not recommended on highly loaded systems. But can be compromise for home systems.
If you're attempting to split massive transmission with `--split-any-protocol` option it will definitely cause massive segmentation failures. Do not do that without `--tamper-start` and `--tamper-cutoff` limiters.
**tpws** works on socket level and receives in one shot long requests (TLS with kyber) that should normally require several TCP packets. It tampers entire received block without knowing how much packets it will take. OS will do additional segmenation to meet MTU.
`--disorder` sends every odd packet with TTL=1. Server receives even packets fastly. Then client OS retransmits odd packets with normal TTL and server receives them. In case of 6 segments server and DPI will see them in this order : `2 4 6 1 3 5`. This way of disorder causes some delays. Default retransmission timeout in Linux is 200 ms.
`--oob` sends one out-of-band byte in the end of the first split segment.
`--oob` and `--disorder` can be combined only in Linux. Others OS do not handle this correctly.
### TLSREC
`--tlsrec` allow to split TLS ClientHello into 2 TLS records in one TCP segment. It accepts single pos marker.
`--tlsrec` breaks significant number of sites. Crypto libraries on servers usually accept fine modified ClientHello but middleboxes such as CDNs and ddos guards - not always. Use of `--tlsrec` without filters is discouraged.
### MSS
`--mss` sets TCP_MAXSEG socket option. Client sets this value in MSS TCP option in the SYN packet.
Server replies with it's own MSS in SYN,ACK packet. Usually servers lower their packet sizes but they still don't fit to supplied MSS. The greater MSS client sets the bigger server's packets will be.
If it's enough to split TLS 1.2 ServerHello, it may fool DPI that checks certificate domain name.
This scheme may significantly lower speed. Hostlist filter is possible only in socks mode if client uses remote resolving (firefox `network.proxy.socks_remote_dns`).
`--mss` is not required for TLS1.3. If TLS1.3 is negotiable then MSS make things only worse. Use only if nothing better is available. Works only in Linux, not BSD or MacOS.
### Other tamper options
`--hostpad=<bytes>` adds padding headers before `Host:` with specified number of bytes. If `<bytes>` is too large headers are split by 2K. Padding more that 64K is not supported and not accepted by http servers.
It's useful against stateful DPI's that reassemble only limited amount of data. Increase padding `<bytes>` until website works. If minimum working `<bytes>` is close to MTU then it's likely DPI is not reassembling packets. Then it's better to use regular split instead of `--hostpad`.
### Supplementary options
**tpws** can bind to multiple interfaces and IP addresses (up to 32).
Parameters `--bind-iface*` and `--bind-addr` create new bind.
Other parameters `--bind-*` are related to the last bind.
link local ipv6 (`fe80::/8`) mode selection :
```
--bind-iface6 --bind-linklocal=no : first selects private address fc00::/7, then global address
--bind-iface6 --bind-linklocal=unwanted : first selects private address fc00::/7, then global address, then LL
--bind-iface6 --bind-linklocal=prefer : first selects LL, then private address fc00::/7, then global address
--bind-iface6 --bind-linklocal=force : select only LL
```
To bind to all ipv4 specify `--bind-addr "0.0.0.0"`, all ipv6 - `::`.
`--bind-addr=""` - mean bind to all ipv4 and ipv6.
If no binds are specified default bind to all ipv4 and ipv6 addresses is created.
To bind to a specific link local address do : `--bind-iface6=fe80::aaaa:bbbb:cccc:dddd%iface-name`
The `--bind-wait*` parameters can help in situations where you need to get IP from the interface, but it is not there yet, it is not raised
or not configured.
In different systems, ifup events are caught in different ways and do not guarantee that the interface has already received an IP address of a certain type.
In the general case, there is no single mechanism to hang oneself on an event of the type "link local address appeared on the X interface."
To bind to a specific ip when its interface may not be configured yet do : `--bind-addr=192.168.5.3 --bind-wait-ip=20`
It's possible to bind to any nonexistent address in transparent mode but in socks mode address must exist.
**tpws** like **nfqws** supports multiple strategies. They work mostly like with **nfqws** with minimal differences.
`filter-udp` is absent because **tpws** does not support udp. 0-phase desync methods (`--mss`) can work with hostlist in socks modes with remote hostname resolve.
This is the point where you have to plan profiles carefully. If you use `--mss` and hostlist filters, behaviour can be different depending on remote resolve feature enabled or not.
Use `--mss` both in hostlist profile and profile without hostlist.
Use `curl --socks5` and `curl --socks5-hostname` to issue two kinds of proxy queries.
It's also possible to use `-j REDIRECT --to-port 988` instead of DNAT but in the latter case transparent proxy must listen on all IP addresses or on a LAN interface address. It's not too good to listen on all IP and it's not trivial to get specific IP in a shell script. `route_localnet` has it's own security impact if not protected by additional rules. You open `127.0.0.0/8` subnet to the net.
This is how to open only single `127.0.0.127` address :
```
iptables -A INPUT ! -i lo -d 127.0.0.127 -j ACCEPT
iptables -A INPUT ! -i lo -d 127.0.0.0/8 -j DROP
```
Owner filter is required to avoid redirection loops. **tpws** must be run with `--user tpws` parameter.
ip6tables work almost the same with minor differences. ipv6 addresses should be enclosed in square brackets :
There's no `route_localnet` for ipv6. DNAT to localhost (`::1`) is possible only in **OUTPUT** chain. In **PREROUTING** chain DNAT is possible to any global address or link local address of the interface where packet came from.
donttouch : disable system flow offloading setting if selected mode is incompatible with it, dont touch it otherwise and dont configure selective flow offloading
none : always disable system flow offloading setting and dont configure selective flow offloading
software : always disable system flow offloading setting and configure selective software flow offloading
hardware : always disable system flow offloading setting and configure selective hardware flow offloading
```
`FLOWOFFLOAD=donttouch`
The GETLIST parameter tells the install_easy.sh installer which script to call
to update the list of blocked ip or hosts.
Its called via `get_config.sh` from scheduled tasks (crontab or systemd timer).
Put here the name of the script that you will use to update the lists.
If not, then the parameter should be commented out.
You can individually disable ipv4 or ipv6. If the parameter is commented out or not equal to "1",
use of the protocol is permitted.
```
#DISABLE_IPV4=1
DISABLE_IPV6=1
```
The number of threads for mdig multithreaded DNS resolver (1..100).
The more of them, the faster, but will your DNS server be offended by hammering ?
`MDIG_THREADS=30`
temp directory. Used by ipset/*.sh scripts for large lists processing.
/tmp by default. Can be reassigned if /tmp is tmpfs and RAM is low.
TMPDIR=/opt/zapret/tmp
ipset and nfset options :
```
SET_MAXELEM=262144
IPSET_OPT="hashsize 262144 maxelem 2097152
```
Kernel automatically increases hashsize if ipset is too large for the current hashsize.
This procedure requires internal reallocation and may require additional memory.
On low RAM systems it can cause errors.
Do not use too high hashsize. This way you waste your RAM. And dont use too low hashsize to avoid reallocs.
There's no `ipset` support unless you run custom kernel. In common case task of bringing up `ipset` on android is ranging from "not easy" to "almost impossible", unless you find working kernel image for your device.
**nfqws** should be executed with `--uid 1`. Otherwise on some devices or firmwares kernel may partially hang. Looks like processes with certain uids can be suspended. With buggy chineese cellular interface driver this can lead to device hang.