Author Topic: Google crawler stealing all the bandwidth traffic..  (Read 13299 times)

Offline hartiberlin

  • Administrator
  • Hero Member
  • *****
  • Posts: 7931
    • free energy research OverUnity.com
Google crawler stealing all the bandwidth traffic..
« on: January 09, 2008, 11:10:48 AM »
Hi All,
although I am still on vacation, I was able to analyse the traffic on this forum, and unfortunately the Google bot crawler generates about 10 to 20 times more traffic than all the power users here.
I have already set the option in Google Webmaster Tools to crawl this site more slowly, but this did not help.
The Google bot also does not follow the Crawl-delay parameter in robots.txt.

Is there any other way to stop the Google bot from crawling so fast?
Maybe it is because of the AdSense ads?
Any help would be greatly appreciated.
Maybe the server could somehow return error 503 (temporarily unavailable) to the bot, so that the site is not thrown out of the index?
Many thanks.
Regards, Stefan.
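
The 503 idea in the post above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name crawler_response, the CRAWLER_MARKERS list and the one-hour default are all hypothetical, and how quickly any particular crawler backs off after a 503 is not established in this thread. Retry-After is the standard HTTP header for telling a client when to come back.

```python
from typing import Dict, Tuple

# Hypothetical list of substrings that identify crawlers in the User-Agent
# header; extend it with whatever bots show up in the server logs.
CRAWLER_MARKERS = ("googlebot",)


def crawler_response(user_agent: str, overloaded: bool,
                     retry_after_seconds: int = 3600) -> Tuple[int, Dict[str, str]]:
    """Return (status_code, extra_headers) for one incoming request.

    Crawlers get a 503 with a Retry-After header while the site is
    overloaded; every other request is served normally.
    """
    is_crawler = any(marker in user_agent.lower() for marker in CRAWLER_MARKERS)
    if is_crawler and overloaded:
        return 503, {"Retry-After": str(retry_after_seconds)}
    return 200, {}
```

The decision is kept as a pure function so it can be called from whatever serves the pages; normal visitors are never affected, and crawlers are only turned away while the overloaded flag is set.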

Free Energy | searching for free energy and discussing free energy


Offline helmut

  • Hero Member
  • *****
  • Posts: 720
    • in construction
Re: Google crawler stealing all the bandwidth traffic..
« Reply #1 on: January 09, 2008, 11:22:29 AM »
Hi Stefan,
Don't forget to enjoy your vacation.
The world will keep on turning.

helmut

Offline Earl

  • TPU-Elite
  • Sr. Member
  • *******
  • Posts: 435
Google crawler stealing all the bandwidth traffic..
« Reply #2 on: January 12, 2008, 10:30:03 PM »
If your web server is running under Linux, you could use a cron job to overwrite robots.txt so that for only x hours per night robots.txt says

User-agent: *
Allow: /

The rest of the day it is overwritten to show

User-agent: *
Disallow: /

For example, make a file called allow.txt, and the cron job would say
cat allow.txt > robots.txt

and a file called disallow.txt, and the cron job would say
cat disallow.txt > robots.txt

allow.txt and disallow.txt each contain the entire robots.txt and are identical except for that one line.

For your info:
Slurp (Yahoo/AV) and the MSFT bots obey Crawl-delay; Googlebot does not yet, but most likely will in 2.1+.
Less than 35 percent of servers have a robots.txt file.
This is crazy, but over 75,000 robots.txt files have pictures in them!

Regards, Earl
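
The nightly swap described in this post could also be done by a single script that cron runs once an hour. The following is a minimal sketch, assuming Python is available on the server; the function names and the 01:00 to 05:00 default window are hypothetical and stand in for the "x hours per night" above.

```python
import os
import tempfile

ALLOW_ALL = "User-agent: *\nAllow: /\n"
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"


def robots_for_hour(hour: int, open_from: int = 1, open_until: int = 5) -> str:
    """Return the robots.txt text for the given hour of the day (0-23).

    Crawling is allowed while open_from <= hour < open_until; a window
    such as 22 to 4 that wraps past midnight is handled as well.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if open_from <= open_until:
        in_window = open_from <= hour < open_until
    else:
        in_window = hour >= open_from or hour < open_until
    return ALLOW_ALL if in_window else DISALLOW_ALL


def write_robots(path: str, content: str) -> None:
    """Replace the file at path in one step, so that a crawler fetching
    robots.txt never sees a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file readable by its owner only
    os.replace(tmp_path, path)
```

An hourly cron entry would then call write_robots("robots.txt", robots_for_hour(current_hour)) with the current local hour, with the path adjusted to the web root. Writing to a temporary file and renaming it avoids the brief moment in which a plain redirect leaves robots.txt empty.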




Offline hartiberlin

  • Administrator
  • Hero Member
  • *****
  • Posts: 7931
    • free energy research OverUnity.com
Re: Google crawler stealing all the bandwidth traffic..
« Reply #3 on: January 12, 2008, 11:52:40 PM »
Hi Earl,
nice idea !
This sounds like an easy solution.
Many thanks for this tip.

I just wonder whether Google will try again after a few hours to access
my site once it has been blocked?

Offline amigo

  • Hero Member
  • *****
  • Posts: 545
Re: Google crawler stealing all the bandwidth traffic..
« Reply #4 on: January 13, 2008, 12:49:38 AM »
You could use .htaccess in the web root together with mod_rewrite (if this server runs Apache) to block Google outright, or to apply time-based rules tied to scripts that check the last visited time, etc.

It really depends on what the ultimate goal is, but mod_rewrite is pretty powerful, though it has a steep learning curve to begin with. :)
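
The "last visited time" check mentioned above is the part a helper script would have to do. Here is a rough Python sketch of that logic only; it is not mod_rewrite configuration, and the class name CrawlerThrottle and its parameters are made up for illustration.

```python
from typing import Dict


class CrawlerThrottle:
    """Let each crawler through at most once per min_interval_seconds."""

    def __init__(self, min_interval_seconds: float) -> None:
        self.min_interval = min_interval_seconds
        # Maps a crawler name to the time of its last allowed request.
        self._last_allowed: Dict[str, float] = {}

    def allow(self, crawler: str, now: float) -> bool:
        """Return True if this crawler may fetch a page at time `now`
        (in seconds), and record the visit when it is allowed."""
        last = self._last_allowed.get(crawler)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_allowed[crawler] = now
        return True
```

A request that is refused here is the natural place to answer with a temporary error instead of the page. Keeping the state in memory is the simplest choice for a sketch; a real setup would have to store the timestamps somewhere the web server processes can share.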


Offline DrStiffler

  • TPU-Elite
  • Hero Member
  • *******
  • Posts: 610
    • Stiffler Scientific
Re: Google crawler stealing all the bandwidth traffic..
« Reply #5 on: January 17, 2008, 11:35:31 PM »
Well, there are still some problems...

When I post a message it goes to never-never land and may or may not take. Loading is slow (too much is being downloaded to the local machine). If it takes 45 to 180 seconds to download a page, or longer to add a message, then what needs to be done?

Offline hartiberlin

  • Administrator
  • Hero Member
  • *****
  • Posts: 7931
    • free energy research OverUnity.com
Re: Google crawler stealing all the bandwidth traffic..
« Reply #6 on: January 18, 2008, 12:30:30 AM »
Hmm,
I have now blocked all the spiders, and from my location the site runs very well.
Please post a few traceroute results so we can see where the
bottlenecks are.
Many thanks.
P.S.: If you post new pictures, be sure that the picture name is
really new and has not already been used by another user,
so name the pics:
my_username_pic01.jpg
my_username_pic02.jpg
etc...

Also, it is wise to copy the written text to the Windows clipboard (Ctrl+C)
before hitting the Post button,
so that you still have a copy in case the server times out or does not accept the post.


Offline Paul-R

  • without_ads
  • Hero Member
  • *****
  • Posts: 1789
Re: Google crawler stealing all the bandwidth traffic..
« Reply #7 on: January 18, 2008, 04:06:31 PM »
Quote from: hartiberlin on January 18, 2008, 12:30:30 AM
Hmm,
I have now blocked all the spiders, and from my location the site runs very well.

On the subject of bandwidth, every post has a tick box labelled:

"Notify me of replies"

and this tick box defaults to "Yes". Why not set the
software so that either this service is omitted, or it defaults
to "No", and people have to actively change it to
get notifications?
Paul.

Offline gri

  • Jr. Member
  • **
  • Posts: 59
Re: Google crawler stealing all the bandwidth traffic..
« Reply #8 on: January 18, 2008, 04:47:11 PM »
Quote from: Paul-R on January 18, 2008, 04:06:31 PM
On the subject of bandwidth, every post has a tick box labelled:

"Notify me of replies"

and this tick box defaults to "Yes".

Why not set the software so that either this service is omitted,
or it defaults to "No",
and people have to actively change it to get notifications?
Paul.

Paul-R,

if a user never sees any visible result of a setting,
he generally does not know that it exists at all.

It is quite easy and natural to switch off
those settings which are _visible_ to a user.

For example, if a user does not get notifications by default,
he will think that the SMF developer monkey team
has not yet begun to develop the human notification system.

But they have already begun. They are just doing it too slowly,
at the tempo of Evolution.


Offline bolt

  • Hero Member
  • *****
  • Posts: 926
Re: Google crawler stealing all the bandwidth traffic..
« Reply #9 on: January 21, 2008, 08:00:05 AM »
This site has become way too slow with all the linked ads. For every page refresh a user makes, the main server has to go off and collect all those ad links as well as build the page. For end users it's not a pleasant experience to take time to browse and then be subjected to too many ads in the text.

My solution: use Firefox and click Add-ons. Install Adblock Plus and add the following zap lines to the Adblock filters; then the pages here load clean and fast, with no more stupid adverts!

Result: a squeaky clean page  ;D

http://www.shareasale.com/
http://shopcloud.chitika.net/
http://pagead2.googlesyndication.com/
http://mm.chitika.net/
http://bdv.bidvertiser.com/

If you spot any more, just right-click on the advert and click Adblock to zap it.

 
