Surprised to hear that crawling still runs out of the basement. For a general-purpose search engine, what kind of bandwidth does that require? Or is it manageable only because DDG proxies so many searches to other search engines?
Also, doesn't it seem inefficient to have crawlers here and indexes there?
I run a couple of crawl-only boxes for Nuuton right out of my office. I do have a good internet connection, but it's a good way to save on infrastructure.
I don't suppose you have a blog or something for that? My hobby/interest is search engines and I love reading about people's experiences getting them running.
There is one, but I have been so busy building it that I have not posted anything yet. You can always get in touch through email. I like chatting with other hackers.
Thanks. I collect any that come up since it's really such a dark area. I should probably write a blog post about it, since I have dug up a few more. Glad you like the articles :) It was something I had been writing for weeks before I finally got my act together and finished it.