About /robots.txt

A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced.

Web site owners use the /robots.txt file to give instructions about their site to web robots.

It works like this: a robot wants to visit a Web site URL, say http://www.example.com/welcome.html. Before it does so, it first checks for http://www.example.com/robots.txt, and finds:

User-agent: *
Disallow: /

The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site.
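
The check described above can be sketched with Python's standard urllib.robotparser module. This is only an illustration: the helper name robots_url and the robot name ExampleBot are assumptions, and the rules are supplied inline rather than downloaded from a real server.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url: str) -> str:
    """Return the /robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


page = "http://www.example.com/welcome.html"

# The two rules from the example above, given inline instead of fetched.
rules = ["User-agent: *", "Disallow: /"]

parser = RobotFileParser(robots_url(page))
parser.parse(rules)

# A well-behaved robot asks this question before requesting the page.
allowed = parser.can_fetch("ExampleBot", page)
print(robots_url(page), "->", "allowed" if allowed else "disallowed")
```

With "Disallow: /" in force, a compliant robot would skip welcome.html and every other page on the site.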

There are two important considerations when using /robots.txt:

a. Robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers, will pay no attention to it.
b. The /robots.txt file is publicly available. Anyone can see which sections of your server you don't want robots to use.

So don't try to use /robots.txt to hide information.

1. To exclude all robots from the entire server

User-agent: *
Disallow: /

2. To allow all robots complete access

User-agent: *
Disallow:

3. To exclude all robots from part of the server

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/

4. To exclude a single robot

User-agent: BadBot
Disallow: /

5. To allow a single robot

User-agent: Google
Disallow:

User-agent: *
Disallow: /

6. To prevent all robots from indexing a page on your site, place the following meta tag into the head section of your page:

<meta name="robots" content="noindex">

7. Submit your URL to be indexed via:
http://www.bing.com/docs/submit.aspx
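
The robots.txt recipes in items 1 to 5 above can be checked offline before they are deployed. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name can_fetch, the robot names AnyBot and OtherBot, and the example paths are illustrative assumptions. It shows how one parser interprets the rules, and real crawlers may match user-agent names differently.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt: str, agent: str, path: str) -> bool:
    """Parse robots_txt and report whether agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)


# Recipe 3: exclude all robots from part of the server.
PARTIAL = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""

# Recipe 5: allow a single robot and exclude everyone else.
SINGLE_ALLOWED = """\
User-agent: Google
Disallow:

User-agent: *
Disallow: /
"""

tmp_allowed = can_fetch(PARTIAL, "AnyBot", "/tmp/cache.html")
index_allowed = can_fetch(PARTIAL, "AnyBot", "/index.html")
google_allowed = can_fetch(SINGLE_ALLOWED, "Google", "/index.html")
other_allowed = can_fetch(SINGLE_ALLOWED, "OtherBot", "/index.html")

print(tmp_allowed, index_allowed, google_allowed, other_allowed)
```

An empty "Disallow:" line means nothing is disallowed, which is why the named robot in recipe 5 keeps full access while all others are blocked.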


SEO Details
1. Submissions to Search Engines and Directories:
Submit the website to the Dmoz, Google, Yahoo, Technorati, and MSN search engines, and submit it to directories manually.
http://addurl.altavista.com/addurl/default
http://www.dmoz.org/add.html
http://search.msn.com/docs/submit.aspx
http://technorati.com/
http://www.ask.com/
http://www.scrubtheweb.com/addurl.html
http://www.alexa.com/site/help/webmasters
http://artmam.net/add.php
http://www.niche-listings.com
http://www.blog-search.com/
http://www.exactseek.com
http://www.jayde.com
http://www.info-listings.com
http://www.pegasusdirectory.com
http://www.google.com/addurl/
http://www.submitexpress.com/submit.html
http://freewebsubmission.com
http://www.webworldindex.com
http://turnpike.net/directory.html
http://www.directoryvault.com
http://www.infotiger.com/addurl.html
http://www.nerdworld.com/nwadd.html
http://www.walhello.com/addlinkgl.html
http://www.fybersearch.com/add-url.php
2. Google Sitemap and Yahoo Sitemap:
A sitemap is a simple text document that lists all valid URLs on your website. It is designed to help the Yahoo and Google crawlers find them.
https://www.google.com/webmasters/tools/docs/en/about.html
- Submit sitemap file
- Specify your preferred domain
- Robots.txt file validation
- Validate and resolve crawl errors through Diagnostics
- Validate and resolve indexing violations
http://search.yahoo.com/info/submit.html
- Submit Website
- Submit Site Feed
http://sitemaps.org/
- XML Format Protocol
a. Create a Google sitemap XML file.
b. Upload the new sitemap to the website.
c. Submit the sitemap to Google.
d. Create a Yahoo sitemap XML file.
e. Upload the new sitemap to the website.
f. Submit the sitemap to Yahoo.
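
As a rough illustration of the "create a sitemap XML file" step, the sketch below builds a minimal file in the sitemaps.org XML format using Python's standard xml.etree.ElementTree module. The function name build_sitemap and the two example URLs are assumptions made for the example; optional elements such as lastmod are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/welcome.html",
])
print(sitemap_xml)
```

The resulting text would be saved as a file such as sitemap.xml, uploaded to the site, and then submitted through each search engine's webmaster tools.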
3. Ping (New Add):
Services like Pingomatic (there are numerous others too) will ping a variety of websites for you to notify them that you’ve updated your blog. In doing so you also let search engines know that you’ve updated, which will trigger their robots to visit your blog. I’d also suggest pinging Google’s blog search tool.
a. Search for ping websites and submit your information.
http://pingomatic.com/
http://technorati.com/ping
http://www.feedburner.com/fb/a/ping
http://www.ping.in/
http://blogsearch.google.com/ping
http://blogrolling.com/ping.phtml
http://kping.com/
http://pinger.blogflux.com/
http://www.icerocket.com/c?p=ping
http://blo.gs/ping.php
http://www.bloglines.com/ping
http://pingates.com/
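
Many ping services accept the XML-RPC weblogUpdates.ping call, which takes a blog name and a blog URL. The sketch below only builds the request body with Python's standard xmlrpc.client module and does not contact any server. The blog name, the blog URL, and the endpoint shown in the comment are assumptions and should be checked against the documentation of the ping service you use.

```python
import xmlrpc.client


def build_ping_request(blog_name: str, blog_url: str) -> str:
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps(
        (blog_name, blog_url), methodname="weblogUpdates.ping"
    )


payload = build_ping_request("Example Blog", "http://www.example.com/blog/")
print(payload)

# Sending the ping would look roughly like this (not executed here; the
# endpoint URL is an assumption, not taken from the list above):
# server = xmlrpc.client.ServerProxy("http://rpc.pingomatic.com/")
# server.weblogUpdates.ping("Example Blog", "http://www.example.com/blog/")
```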
Search Engine Optimization
Search Engine Optimization (SEO) is the process of improving the volume and quality of traffic to a web site from search engines via "natural" ("organic" or "algorithmic") search results for targeted keywords. Usually, the earlier a site is presented in the search results or the higher it "ranks", the more searchers will visit that site. SEO helps to ensure that a site is accessible to a search engine and improves the chances that the site will be found by the search engine.
Search Engine Optimization (SEO) includes the following steps:
1. Google Sitemap and Yahoo Sitemap:
A sitemap is a simple text document that lists all valid URLs on your website. It is designed to help the Yahoo and Google crawlers find them.
a. Create a Google sitemap XML file.
b. Upload the new sitemap to the website.
c. Submit the sitemap to Google.
d. Create a Yahoo sitemap XML file.
e. Upload the new sitemap to the website.
f. Submit the sitemap to Yahoo.
2. Submissions to Search Engines and Directories:
Submit the website to the Dmoz, Google, Yahoo, Technorati, and MSN search engines, and submit its information to all other important directories.
3. Reciprocal links: Reciprocal links can improve a website’s search relevance ranking, moving the website up the list of results on a search engine results page.
4. Create a Site Map: Include a detailed, text-based site map, with a link to every page and preferably a short description of what each page offers.
5. Title and Meta Tags: Use logical Title and META tags for each page.
Use logical Title and META tags on about 20 pages:
a. Title tag
b. Description Meta tag
c. Keyword Meta tag
6. Headlines: A search engine robot looks at the headlines of a web page to pick up the essential feature of that page. Put your main phrase in a headline and place it near the top of the page.
7. Internal Linking:
There are two main ways to ensure that our site gets well spidered. The first is to place text links at the bottom of the homepage to the main internal pages. The second is to create a sitemap of all our internal pages and link to it from the homepage. Do not put links in pictures, and use important keywords in the link text.
8. Home, Site Map and Contact Page Links: To ensure that they are spidered, place links to them near the top of your source code on every page of your web site.
9. Broken Links: Take care of broken links; it's important for search engines.
10. Emphasize Text, Not Graphics: Search engines should be able to find data on your site that they can put in their databases. Pages that are all Flash or images are not search engine friendly. If you use graphics, images or hyperlinks, include alternative (ALT) text.
11. Using Style Guidelines Effectively: If you are using CSS style commands, do not include them within your actual web page source code. You don't want search engine spiders to have to wade through 100 lines of unreadable code before they reach your actual content. Instead, place your style guidelines into a separate CSS file and call them with a single line of code (a <link rel="stylesheet"> tag) from within your <head> and </head> tags.
12. Keyword density:
A spider places importance on what it reads highest on the page and so beginning with a sentence that includes our targeted phrase only makes sense. The term "keyword density" refers to the percentage of our content that is made up of our targeted keywords. Add keywords, Description and Title in every webpage of the website.
13. Keyword Analysis (Overture/WordTracker):
We generally use WordTracker to find the search count of a keyword. Also find the INTITLE and INURL counts for the keyword (www.digitalpoint.com).
14. Google Analytics: Google Analytics shows you how people found your site, how they explored it, and how you can enhance their visitor experience.
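
Item 12 defines keyword density as the percentage of the content made up of the targeted keywords. One way to compute it is sketched below; the function name keyword_density, the simple word-splitting rule, and the sample sentence are assumptions for illustration, and SEO tools may count words differently.

```python
import re


def keyword_density(text: str, phrase: str) -> float:
    """Percentage of the words in text that belong to occurrences of phrase."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase appears as a run of words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return 100.0 * hits * n / len(words)


sample = "Cheap flights to Rome. Book cheap flights today and save."
density = keyword_density(sample, "cheap flights")
print(f"{density:.1f}%")  # 4 of the 10 words belong to the phrase: 40.0%
```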
Robot Files
1. To open your site to search engine crawlers, create a robots.txt file and place it in the server root with the following content:
User-agent: *
Disallow:
2. To remove your site from search engines and prevent all robots from crawling it in the future, place the following robots.txt file in your server root:
User-agent: *
Disallow: /
3. The Disallow line lists the pages you want to block. You can list a specific URL or a pattern. Each entry should begin with a forward slash (/).
* To block the entire site, use a forward slash.
Disallow: /
* To block a directory and everything in it, follow the directory name with a forward slash.
Disallow: /private_directory/
* To block a page, list the page.
Disallow: /private_file.html
4. Block or remove pages using meta tags
Rather than use a robots.txt file to block crawler access to pages, you can add a tag to an HTML page to tell robots not to index the page.
To prevent all robots from indexing a page on your site, you'd place the following meta tag into the <head> section of your page:
<meta name="robots" content="noindex">
5. To allow robots to index the page on your site but instruct them not to follow outgoing links, you'd use the following tag:
<meta name="robots" content="nofollow">

Source: Google.com
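
To confirm that a page actually carries the robots meta tag described in items 4 and 5 above, its HTML can be scanned for the directives. The sketch below uses Python's standard html.parser module; the class name RobotsMetaParser, the helper robots_directives, and the sample page are assumptions for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # The content attribute is a comma-separated list of directives.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = (
    '<html><head><meta name="robots" content="noindex, nofollow"></head>'
    "<body></body></html>"
)
directives = robots_directives(page)
print(sorted(directives))
```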

Search Engine Friendly Websites
1. Emphasize Text, Not Graphics: Search engines should be able to find data on your site that they can put in their databases. Pages that are all Flash or images are not search engine friendly. If you use graphics, images or hyperlinks, include alternative (ALT) text.

2. Create a Site Map: Include a detailed, text-based site map, with a link to every page and preferably a short description of what each page offers.

3. Readable Text: Use a standard font size (the Web standard is 12), and make sure whatever font color you pick is readable as well.

4. Page Loading Time: Your website should take no more than 8-15 seconds to load.

5. Broken Links: Take care of broken links; it's important for search engines.

6. JavaScript: If you really need to use JavaScript, you can safely use it by putting the code into a separate JS file and calling it with a single line of code (a <script> tag with a src attribute), which you place between your <head> and </head> tags within your web page.

7. Using Style Guidelines Effectively: If you are using CSS style commands, do not include them within your actual web page source code. You don't want search engine spiders to have to wade through 100 lines of unreadable code before they reach your actual content. Instead, place your style guidelines into a separate CSS file and call them with a single line of code (a <link rel="stylesheet"> tag) from within your <head> and </head> tags.

8. Spell Check: Check your spelling.

9. Consistent Design and Layout: Use the same general colour scheme and logo, a consistent navigation menu, and a header and footer in the same location on every page.

10. Navigation System: Avoid using JavaScript drop-downs or Flash navigation. If possible, use text links, because they are small in size and can also be crawled by search engines.
If you have a navigation system which uses JavaScript or images, then it is best to add an additional text link navigation bar at the bottom of the site.

11. Title and Meta Tags: Use logical Title and META tags for each page.

12. Browser Compatibility: Make your website browser compatible.

13. Headlines: A search engine robot looks at the headlines of a web page to pick up the essential feature of that page. Put your main phrase in a headline and place it near the top of the page. Your headline text should be enclosed in special header tags such as <h1>, <h2>, and <h3>.

14. Home, Site Map and Contact Page Links: To ensure that they are spidered, place links to them near the top of your source code on every page of your web site.
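
A few of the points above, such as ALT text on images (item 1) and a Title tag on each page (item 11), can be checked automatically. The sketch below is one possible approach using Python's standard html.parser module; the class name PageAudit and the sample markup are assumptions, and it checks only these two points.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Record images that lack alt text and whether a <title> is present."""

    def __init__(self):
        super().__init__()
        self.images_missing_alt = []
        self.has_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and not (attributes.get("alt") or "").strip():
            # Remember the image source so the page author can find it.
            self.images_missing_alt.append(attributes.get("src"))
        elif tag == "title":
            self.has_title = True


audit = PageAudit()
audit.feed(
    "<html><head><title>Welcome</title></head><body>"
    '<img src="logo.png" alt="Example Co. logo">'
    '<img src="banner.png">'
    "</body></html>"
)
print("Title present:", audit.has_title)
print("Images missing ALT text:", audit.images_missing_alt)
```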

Steps of Website Optimization

1. Keyword Analysis (Overture/WordTracker):
We generally use WordTracker to find the search count of a keyword. Also find the INTITLE and INURL counts for the keyword (www.digitalpoint.com).

2. Content of Website:
Before optimizing the website, we should get a good deal of new content down, to ensure that we know exactly what there is to optimize.

3. Site Structure:
We have to create a site that is easily spidered by the search engines and also attractive for visitors. We must "think like a spider". A search engine spider reads our web page like we would read a book. It starts at the top left, reads across, and then moves down.
Priority must therefore be given to what we place near the top of our page.

4. Optimization:
A spider places importance on what it reads highest on the page and so beginning with a sentence that includes our targeted phrase only makes sense. The term "keyword density" refers to the percentage of our content that is made up of our targeted keywords. Add keywords, Description and Title in every webpage of the website.

5. Internal Linking:
There are two main ways to ensure that our site gets well spidered. The first is to place text links at the bottom of the homepage to the main internal pages. The second is to create a sitemap of all our internal pages and link to it from the homepage. Do not put links in pictures, and use important keywords in the link text.

6. Google Sitemap and Yahoo Sitemap:
A sitemap is a simple text document that lists all valid URLs on your website. It is designed to help the Yahoo and Google crawlers find them.

7. Google Local and Yahoo Local:
Google Local and Yahoo Local are local search services that return area- or location-specific results. Submitting the website to them means it will also be shown in local search results. This is very important for a better listing from a location point of view.

8. Submissions to search Engines and Directories:
Submit the website to the Dmoz, Google, Yahoo, and MSN search engines, and submit it to directories manually.

9. Link Building:
Link Popularity is one of the most important features of any search engine optimization process and it is the "Off-Page" Optimization of our website. We search the internet for relevant high ranking category websites and get linked within them.
Miscellaneous

1. Forum:
http://www.webworkshop.net/seoforum/index.php
2. Blog SetUp:
http://weblogs.about.com/od/startaweblog/
3. Convert .Doc to Pdf:
http://www.gobcl.com/convert_pdf.asp
4. Search Engine Working:
http://www.seomoz.org/
5. SEO Tools:
http://www.seocompany.ca/tool/seo-tools.html
6. SEO Job Site:
http://www.jobsindia.org/