The changes made to date have significantly reduced our "compute cycles", but we are still way over the limit and the costs are increasing daily. You can see the effect of those changes in the graph:
https://www.mediafire.com/convkey/87ef/i1r78skw21rv1kb6g.jpg
Our compute cycles have dropped from 2500 a day to less than 600 a day and our bandwidth has dropped from 4000 to less than 800. That is from blocking some of the worst offenders.
The PROBLEM is that we are only allowed 3000 compute cycles for the WHOLE MONTH period, which runs from the 22nd of Month A to the 21st of Month B. Even at just under 600 a day, that works out to roughly 18,000 over a 30-day period, about six times the allowance. Additional charges for the period 16 November to 15 December were $360.26, so there will be another large charge for December-January. We normally pay less than that for the whole year!
A few of you know that I was making changes to increase the BOT BLOCKAGE today, because I made a typo and the server crashed. That was quickly fixed, and from that point on I tested all the files in a blank directory before setting them to load on the site here. That work is all done and the files are going to be loaded shortly. I do check that there are not too many people on the system when I make that change, but unfortunately I cannot account for someone who is logging on at the exact moment the files are uploading.
The new .htaccess file that carries all the new blocking code will be blocking a large number of IP addresses and ranges of IP addresses associated with what appear to be the main offenders. We have tried other less severe approaches and the BOTS are bypassing them easily. Neil found a way to just say "NO BOTS PLEASE" and they continued the attack. It could be that it takes a few days to work, but at the rate the costs are increasing we have to take every step possible.
IF YOU FIND that you are blocked, then I hope that you remembered to bookmark the site where the announcements are posted here:
http://cefresearch.blogspot.ca/
as I will post a copy of this message at that location. I will put in the e-mail address for the MATRIX in case you don't have it, so you can send me a message with your I.P. address and I can exempt it from the block.
Find your IP address: http://whoisip.ovh
I will also provide a list of blocked addresses to date at that location. Doing that here would just let the BOTS know who is being blocked.
I also need you to tell me if any of you are using these IP addresses. They appear somewhat normal, with origins in the USA or Canada, but are associated with what appear to be large data downloads:
Halifax 142.177.247.89
Winnipeg 142.161.238.42
Ottawa 131.137.88.71
Melbourne Australia 124.191.103.90 (might be the Diggers)
Saskatoon 128.233.6.93
Redmond Kansas 131.253.24.140
Calgary 137.186.55.136
Tatamagouche (N.S. or P.E.I?) 142.134.66.130
Brantford 50.100.61.223
Edmonton 50.64.149.31 or 50.68.152.77
Windsor 70.51.99.113 (that is now blocked as it does not appear to be HUMAN - if you are, call home!)
I will post this now and come back at 3:00 pm EST to upload the new blocks. You might want to finish any posts you are working on now prior to that time, JUST IN CASE! It should work cleanly but better safe than sorry.
Stay tuned to the other announcement site over the next week or until further notice. If this keeps up and the blocks DON'T WORK, then we will probably need to block the site, or we will go broke.
I have already sent in a submission to a VPS PROVIDER (http://vpsville.ca/) to determine what is involved (and the costs) in switching to a VPS (Virtual Private Server) system. I know nothing about those, other than what I have read, and I do not know if I would have the necessary skill set or time to operate such a system. I have not heard back as of this date. My reading tells me there are MANAGED VPS and STAND ALONE VPS, so if we can get a MANAGED system, it might be okay. Best of all is if we can stay where we are now, if we can beat the botters to death!
Fingers crossed, now also crossing toes,
Richard
ADDED TO THIS BLOG POST ONLY
Here is the "section" of the file that is now on the system to do the blocking. If you see a short version such as "Require not ip 124.", that means it is blocking any IP that starts with that series of numbers. If your IP starts with any of the number groups on the list, you have to let me know by e-mail at cefmatrix@gmail.com.
# from Register.ca for Semrushbot 46.229. and then RVL added many from log of 6 January 2017 using Excel spreadsheet to sort into large groups
# bingbot is 157.55.39.104 and then added just 157.55.39. as many others used
# mj12bot found in large group from 163.172.68.136
# Domain Re-Animator Bot 167.114.156.198
# BoogleBot is 52.90.230.103 - very large amount on 6 January 2017 from IP in Seattle USA
# profound.net/domainappender trying to read .htaccess file 54.147.153.234
# downloading wiki from London UK 62.210.148.247
# large downloads from this Windsor IP so temp block 70.51.99.113
# blocked any large groups from China, Russia, etc. also large group from Germany
# Apache 2.4 only accepts "Require not" lines inside a <RequireAll> container
<RequireAll>
Require all granted
Require not ip 46.229.
Require not ip 10.8.163.19
Require not ip 10.8.174.151
Require not ip 10.8.
Require not ip 104.140.
Require not ip 107.
Require not ip 108.
Require not ip 109.
Require not ip 110.
Require not ip 111.
Require not ip 112.
Require not ip 113.
Require not ip 114.
Require not ip 115.
Require not ip 116.
Require not ip 117.
Require not ip 118.
Require not ip 119.
Require not ip 12.106.
Require not ip 120.
Require not ip 121.
Require not ip 122.
Require not ip 123.
Require not ip 124.
Require not ip 125.
Require not ip 126.
Require not ip 127.
Require not ip 128.
Require not ip 129.
Require not ip 144.
Require not ip 146.
Require not ip 151.237.
Require not ip 151.80.
Require not ip 157.55.39.104
Require not ip 157.55.39.
Require not ip 163.172.68.136
Require not ip 165.231.
Require not ip 167.114.156.198
Require not ip 17.
Require not ip 18.
Require not ip 5.196.167.230
Require not ip 5.9.94.207
Require not ip 52.90.230.103
Require not ip 54.147.153.234
Require not ip 61.
Require not ip 62.
Require not ip 62.210.148.247
Require not ip 63.
Require not ip 64.
Require not ip 65.
Require not ip 66.
Require not ip 67.
Require not ip 68.
Require not ip 69.
Require not ip 70.51.99.113
Require not ip 77.248.252.113
</RequireAll>
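For anyone curious how an exemption for a single member could be written, here is a minimal sketch. It assumes Apache 2.4 and is not the code that is on the site. The Melbourne address from the list above is borrowed purely as an illustration of an address that falls inside a blocked range, and only one blocked range is shown. A block like this would have to replace the existing access rules rather than sit beside them, because Apache treats separate top-level rules as alternatives.

```apache
# Sketch only: let one known address through, block the rest of its range.
<RequireAny>
    # example address only (taken from the list in this post)
    Require ip 124.191.103.90
    <RequireAll>
        Require all granted
        # everything else starting with 124. stays blocked
        Require not ip 124.
    </RequireAll>
</RequireAny>
```

A visitor is let in if either part succeeds: the exempted address passes on the first line, and everyone else has to pass the blocking rules inside the <RequireAll> section.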
This may be duplication, but is also added to the .htaccess code as some can ignore the IP address blocks. These are the NAMES of the ones that I have found so far that are the main offenders:
# RVL added December 31, 2016 as Semrushbot is still invading the site
# found here: http://stackoverflow.com/questions/23631872/ban-robots-from-website
# they say "Also you can do this little trick, deny ANY ip address that has "SemrushBot" in user agent string"
Options +FollowSymlinks
RewriteEngine On
RewriteBase /
# no ^ at the start: most of these bots put their name in the middle of the
# user agent string (for example "Mozilla/5.0 (compatible; SemrushBot/..."), not at the front
SetEnvIfNoCase User-Agent "SemrushBot" bad_user
SetEnvIfNoCase User-Agent "Slurp" bad_user
SetEnvIfNoCase User-Agent "dotbot" bad_user
SetEnvIfNoCase User-Agent "Googlebot" bad_user
SetEnvIfNoCase User-Agent "bingbot" bad_user
SetEnvIfNoCase User-Agent "mj12bot" bad_user
SetEnvIfNoCase User-Agent "ToutiaoSpider" bad_user
SetEnvIfNoCase User-Agent "YandexBot" bad_user
SetEnvIfNoCase User-Agent "domainappender" bad_user
SetEnvIfNoCase User-Agent "bogglebot" bad_user
SetEnvIfNoCase User-Agent "WhateverElseBadUserAgentHere" bad_user
Deny from env=bad_user
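The "Deny from" line is the older Apache 2.2 way of writing this, while the IP blocks use the newer "Require" style. Below is a rough sketch of how the name check might look in the newer style. It assumes Apache 2.4, and it assumes the name check would share one <RequireAll> container with the IP blocks rather than being added as a second container. Only two bot names and one IP range are shown, as examples.

```apache
# Sketch only, not the file on the site.
# Flag any visitor whose user agent contains one of these names
SetEnvIfNoCase User-Agent "SemrushBot" bad_user
SetEnvIfNoCase User-Agent "mj12bot" bad_user

<RequireAll>
    Require all granted
    # the IP blocks would go here, for example:
    Require not ip 46.229.
    # refuse anyone flagged by the SetEnvIfNoCase lines above
    Require not env bad_user
</RequireAll>
```

Keeping both kinds of block in one container means a visitor has to pass every test, so a bot that changes its IP address is still caught by its name, and the other way round.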
In the event that you understand any of this, then I can tell you that the following code was also added to CACHE the files on the site so they are not downloaded from the server every time someone visits a post:
# ADDED from http://www.fastcomet.com/tutorials/phpbb3/performance-optimization 28-12-2016
## EXPIRES CACHING ##
ExpiresActive On
ExpiresByType image/jpg "access plus 1 year"
ExpiresByType image/jpeg "access plus 1 year"
ExpiresByType image/gif "access plus 1 year"
ExpiresByType image/png "access plus 1 year"
ExpiresByType text/css "access plus 1 month"
ExpiresByType application/pdf "access plus 1 month"
ExpiresByType text/x-javascript "access plus 1 month"
ExpiresByType application/x-shockwave-flash "access plus 1 month"
ExpiresByType image/x-icon "access plus 1 year"
ExpiresDefault "access plus 2 days"
## EXPIRES CACHING ##
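Two things about the caching lines may be worth checking. They only work if the server has the mod_expires module switched on, and many servers now label script files as text/javascript or application/javascript rather than text/x-javascript. Here is a hedged sketch, assuming Apache with mod_expires available. The two JavaScript types are an assumption about what the host sends and should be checked against the actual response headers.

```apache
# Sketch only: skipped quietly if mod_expires is not loaded
<IfModule mod_expires.c>
    ExpiresActive On
    # assumed MIME types for script files; confirm what the server really sends
    ExpiresByType text/javascript "access plus 1 month"
    ExpiresByType application/javascript "access plus 1 month"
</IfModule>
```

The <IfModule> wrapper means a missing module results in no caching headers instead of a server error.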
If you think I actually know what I am doing, think again! I go read about it on the web, try it out, and then sit back to see if it works. If anyone really knows how to do all this stuff, then you could be a big asset to the team! If you see a line of code that starts with a #, that means the line is just a note and is not read as code. There you will see I put in the URL of the information I found on the web that tells me what I should do to fix the problem. Very often I have no idea what they are talking about, so I have to learn that first.
Added later:
The file was uploaded at 3:02 pm EST and the site is still alive! Whew!
The blocks are now in effect.