About /robots.txt


belgeport

Read 13 times

In a nutshell

Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.

It works like this: a robot wants to visit a Web site URL, say http://www.example.com/welcome.html. Before it does so, it first checks for http://www.example.com/robots.txt, and finds:

User-agent: *
Disallow: /

The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site.
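
As a rough illustration of the check described above, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the robot name "ExampleBot" are assumptions made for this example; they are not part of the protocol. The rules are parsed from a string, so nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url.

    is_allowed is a hypothetical helper for this example only.
    """
    parser = RobotFileParser()
    # parse() takes the lines of a robots.txt file, so no HTTP request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The same rules as in the example above: every robot, whole site disallowed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# "ExampleBot" is a made-up robot name; it falls under the "*" section.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/welcome.html"))
# prints: False
```

A well-behaved robot would run a check like this before requesting each page, and skip the page when the result is False.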

There are two important considerations when using /robots.txt:

  • Robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities and email address harvesters used by spammers will pay no attention to it.
  • The /robots.txt file is publicly available. Anyone can see which sections of your server you don't want robots to use.

So don't try to use /robots.txt to hide information.
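
To see why the file hides nothing, consider the sketch below. It lists every path named in the Disallow lines of a robots.txt file, which anyone who downloads the file can do. The function name disallowed_paths and the sample paths /admin/ and /backup/ are hypothetical, chosen only for illustration.

```python
def disallowed_paths(robots_txt: str) -> list:
    """Collect the path from every non-empty Disallow line.

    disallowed_paths is a hypothetical helper for this example only.
    """
    paths = []
    for line in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical rules; the paths are invented for this example.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/
"""

print(disallowed_paths(SAMPLE))
# prints: ['/admin/', '/backup/']
```

Listing a sensitive directory in /robots.txt therefore advertises it to exactly the robots that ignore the rules.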
