How Search Engines Operate - answering your questions

 
 Discussion by GuardianGA with 6 Replies.
 Last Update: June 16, 2008, 11:17 am
 

When I took a computer class, they said that search engines use "spiders" to register your site so that it shows up when someone types a keyword. I'm pretty sure this is true, coming from a professor with a degree in Computer Science, but I'm wondering: what exactly are these spiders? What are they hosted on, and how do they work technically? If you know anything technical about how search engines like Google function, this is the thread for you. So, please - share what you know. Your honest input is greatly appreciated.







   Sat Sep 17, 2005    Reply         

Spiders run on the search engine's own servers. A spider goes to a website, checks all the links it contains, and indexes every valid link. It's like looking through a family tree and gathering information about each of the children. If no website links to you, the spiders will not find you, because they do not know of your existence.
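
To make the "family tree" idea concrete, here is a minimal sketch of link-following discovery. The link graph and the `.example` page names are made up for illustration, and a real spider would fetch pages over HTTP instead of reading a dictionary.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
    "orphan.example": ["a.example"],  # links out, but nothing links to it
}


def crawl(seed, links):
    """Breadth-first walk over links; returns pages in discovery order."""
    seen = {seed}
    order = [seed]
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


discovered = crawl("a.example", LINKS)
print(discovered)
print("orphan.example" in discovered)
```

Starting from `a.example`, the walk never reaches `orphan.example`, because no page that the spider already knows about links to it.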

You can put a file called robots.txt in the root of your site to keep spiders away from your site or to limit which parts of it they visit. Well-behaved spiders read this file before crawling.
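
As a rough illustration of how a spider consults such a file (the standard name for it is robots.txt), the sketch below uses Python's built-in `urllib.robotparser`. The rules, the bot name `ExampleBot`, and the `www.example.com` URLs are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every bot is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A polite spider asks before fetching each URL.
public_ok = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
private_ok = parser.can_fetch("ExampleBot", "http://www.example.com/private/notes.html")
print(public_ok, private_ok)
```

Note that the file is only a request: it limits spiders that choose to honor it, and it does not block access by itself.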

   Sat Sep 17, 2005    Reply         

Dear szupie,
I am thrilled to learn about the spiders. It has excited me a great deal, and I want to know more about them. How are these spiders built? Which languages are used? I have heard that they are related to the HTML meta tag. Is that true? Please answer my queries. I am eagerly waiting for your answer. Bye

   Mon Jul 24, 2006    Reply         


I thought they were called 'crawlers'? The bots that look through your <META> tags to rank your pages. Maybe there is something called spiders, if a professor talked about it. Sounds interesting. I'm new to learning how to get noticed by search engines, and 'crawlers' is what I have read about.

   Tue Jul 25, 2006    Reply         

Here is how a search engine works.
There are 6 parts that make up a search engine.

Spider - This program downloads web pages just like a web browser. The spider does not take images (or similar media) into account. It only downloads the HTML version.

Crawler - This program finds all links on each page. The crawler follows these links and tries to find documents not already known to the search engine.

Indexer - This component parses each page and analyzes the various elements, such as text, headers, structural or stylistic features, special HTML tags, etc.

Database - This is the storage area for the data that the search engine downloads and analyzes. Sometimes it is called the index of the search engine.

Results Engine - The results engine ranks pages. It determines which pages best match a user's query and in what order the pages should be listed.

Web server - The search engine web server usually contains a HTML page with an input field where the user can specify the search query he or she is interested in. The web server is also responsible for displaying search results to the user in the form of an HTML page.

That is how a search engine works. So in terms of SEO, what matters is the HTML of the page rather than what we see in the browser. A great book is SEO MINDSET. Find it and read it.
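
The parts listed above can be sketched in a few lines of code. This is a toy under stated assumptions, not how any real search engine is built: the two pages in `PAGES` are made up, the "database" is an in-memory dictionary, and ranking is a plain word count.

```python
from collections import defaultdict
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Indexer helper: collects the words and link targets of one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)  # what a crawler would follow

    def handle_data(self, data):
        self.words.extend(data.lower().split())


# Hypothetical pages standing in for what the spider downloaded.
PAGES = {
    "/home": '<h1>Search engines</h1><a href="/spiders">spiders</a>',
    "/spiders": "<p>Spiders download pages and spiders follow links</p>",
}


def parse_page(html):
    """Return (words, links) extracted from the HTML source."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    return parser.words, parser.links


def build_index(pages):
    """Database: word -> {url: number of occurrences}."""
    index = defaultdict(lambda: defaultdict(int))
    for url, html in pages.items():
        words, _links = parse_page(html)
        for word in words:
            index[word][url] += 1
    return index


def search(index, query):
    """Results engine: rank URLs by how often the query words occur."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for url, count in index.get(term, {}).items():
            scores[url] += count
    return sorted(scores, key=lambda url: (-scores[url], url))


index = build_index(PAGES)
home_links = parse_page(PAGES["/home"])[1]
results = search(index, "spiders")
print(home_links)
print(results)
```

Everything here works from the HTML source alone, which is the point made above: the indexer sees markup and text, not the rendered page.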

   Tue Jun 10, 2008    Reply         

I wonder how large the Google database is by this point? I'm guessing it's definitely in the terabytes by now, since it stores caches of pages, along with images, videos, and every other form of content. And also, what kinds of languages are the various search engines programmed in?

   Sun Jun 15, 2008    Reply         


QUOTE (LegallyHigh)

I wonder how large the Google database is by this point? I'm guessing it's definitely in the terabytes by now, since it stores caches of pages, along with images, videos, and every other form of content. And also, what kinds of languages are the various search engines programmed in?

From Wikipedia:

QUOTE

Servers are commodity-class x86 PCs running customized versions of Linux. Indeed, the goal is to purchase CPU generations that offer the best performance per dollar, not absolute performance. Estimates of the power required for over 450,000 servers range upwards of 20 megawatts, which could cost on the order of US$2 million per month in electricity charges.

Specifications:

* Upwards of 15,000 servers[4] ranging from a 533 MHz Intel Celeron to a dual 1.4 GHz Intel Pentium III (as of 2003); a 2005 estimate by Paul Strassmann has 200,000 servers,[6] while unspecified sources claimed this number to be upwards of 450,000 in 2006.[1]
* One or more 80GB hard disks per server (2003)
* 2–4 GB of memory per machine (2004)

The exact size and whereabouts of the data centers Google uses are unknown, and official figures remain intentionally vague. In a 2000 estimate, Google's server farm consisted of 6000 processors, 12,000 common IDE disks (2 per machine, and one processor per machine), at four sites: two in Silicon Valley, California and two in Virginia.[7] Each site had an OC-48 (2488 Mbit/s) internet connection and an OC-12 (622 Mbit/s) connection to other Google sites. The connections are eventually routed down to 4 x 1 Gbit/s lines connecting up to 64 racks, each rack holding 80 machines and two ethernet switches. The servers run custom server software called Google Web Server.

15,000 * 80 GB = 1.2 PB (petabytes) as a minimum amount of hard drive capacity. Sure, a fair chunk of it probably isn't used, and yet more is taken up by the OS and other necessary software, but damn me if Google would buy many more servers than they needed. Still, my estimate is hardly thought out or detailed. I took the liberty of doing a little more research into people who have made more accurate estimates.
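
The back-of-the-envelope figure above is easy to check in code. The sketch below assumes decimal units (1 PB = 1,000,000 GB), one 80 GB disk per server, and a 30-day month; the 450,000-server line reuses the 2003 disk size, which is purely an assumption for comparison.

```python
GB_PER_PB = 1_000_000    # decimal units: 1 PB = 1,000,000 GB
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month


def total_capacity_pb(servers, disk_gb_per_server):
    """Raw disk capacity in petabytes, ignoring OS and replication overhead."""
    return servers * disk_gb_per_server / GB_PER_PB


def implied_price_per_kwh(monthly_cost_usd, megawatts):
    """Electricity price implied by a monthly bill at constant power draw."""
    kwh_per_month = megawatts * 1_000 * HOURS_PER_MONTH
    return monthly_cost_usd / kwh_per_month


capacity_2003 = total_capacity_pb(15_000, 80)
capacity_2006 = total_capacity_pb(450_000, 80)  # assumes 2003-era disks
price = implied_price_per_kwh(2_000_000, 20)

print(capacity_2003, "PB")
print(capacity_2006, "PB")
print(round(price, 3), "USD per kWh")
```

The last figure is a sanity check on the quoted numbers: 20 megawatts for a month at US$2 million works out to roughly 14 cents per kWh.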

In short: the Google machine is big, scary, and makes most supercomputers cry themselves to sleep. :)

   Mon Jun 16, 2008    Reply         
