"Google dorking," also known as Google Hacking is a form of open source intelligence (OSINT) gathering that enables someone to find sensitive data or evidence of vulnerabilities by querying a search engine like Google or Bing.
The best defense here is, to borrow a line often attributed to Socrates, "know thyself." Search your own domain for sensitive information, using public databases of "Google dorks" like http://johnny.ihackstuff.com/ghdb and your own common sense. When you find something of concern, remove it from your site. Then use Google's webmaster tools (http://www.google.com/webmasters/tools/) to remove the information from Google's search results.
Here are a few things that you could start by searching for:
- site:mysite.com viagra
- site:mysite.com filetype:xls
- site:mysite.com -site:www.mysite.com
- site:pastebin.com mysite.com
Some of these (the spam-keyword and pastebin.com searches) can indicate that you have already been compromised. Others look for potentially sensitive information (filetype:xls) or unusual hostnames/subdomains (site:mysite.com -site:www.mysite.com). Be sure to use more than one search engine, since each one crawls and indexes your domain differently. If you have a site with a low volume of new content, consider setting a Google Alert (http://www.google.com/alerts) for "site:mysite.com" to keep tabs on new results for your site.
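
If you check more than one domain, or want to repeat the same checks across search engines, it can help to generate the queries rather than type them. The following is a minimal Python sketch, not a definitive tool: it fills the four queries listed above into templates and turns them into search URLs you can open in a browser. The function names (build_dorks, build_search_urls) are hypothetical, and the Google and Bing URL prefixes are assumptions about the engines' public search pages.

```python
from urllib.parse import quote_plus

# Query templates taken from the list above; {domain} is filled in per site.
DORK_TEMPLATES = [
    "site:{domain} viagra",
    "site:{domain} filetype:xls",
    "site:{domain} -site:www.{domain}",
    "site:pastebin.com {domain}",
]

# Assumed URL prefixes for each engine's public search page.
SEARCH_ENGINES = {
    "google": "https://www.google.com/search?q=",
    "bing": "https://www.bing.com/search?q=",
}


def build_dorks(domain):
    """Return the list of dork queries for one domain."""
    return [template.format(domain=domain) for template in DORK_TEMPLATES]


def build_search_urls(domain):
    """Map each engine name to a list of URL-encoded search links."""
    queries = build_dorks(domain)
    return {
        engine: [prefix + quote_plus(query) for query in queries]
        for engine, prefix in SEARCH_ENGINES.items()
    }


if __name__ == "__main__":
    # Print the links grouped by engine so they can be opened by hand.
    for engine, urls in build_search_urls("mysite.com").items():
        print(engine)
        for url in urls:
            print("  " + url)
```

The script only prints links; it does not fetch results. Adding more queries from a dork database is a matter of appending to DORK_TEMPLATES.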