The Cellar

The Cellar (http://cellar.org/index.php)
-   The Internet (http://cellar.org/forumdisplay.php?f=8)
-   -   Wolf and Gwennie: attached at the hips? (http://cellar.org/showthread.php?t=8436)

Gwennie! 05-27-2005 03:40 PM

Wolf and Gwennie: attached at the hips?
 
http://search.msn.com/results.aspx?F...pe=0&q=Gwennie

The link to Wolf's profile doesn't even point to the right page. Her profile ranks higher than mine in a search for Gwennie? This looks like a bug in the brand spanking new search engine at MSN.

perth 05-27-2005 03:57 PM

Hrm. We've seen this before, haven't we? It changes based on the last poster.

lookout123 05-27-2005 04:37 PM

yeah, it is something like that perth.

for a while (maybe still) if i searched for lookout123 it came back with links to the bosque and my title as "closet democrat". i can't find that post anywhere, but it shows up on google.

Gwennie! 05-27-2005 06:21 PM

Quote:

Originally Posted by perth
Hrm. We've seen this before, haven't we? It changes based on the last poster.

Yea, that's the problem with generic spidering of executable URLs. But, what is weird in this case is that there is a version skew between the words that are in the search index and the content displayed in the results page. It's indexed with "Gwennie", displayed with "Wolf", and linked to the last poster.

I met with the search team at MSN last year; they're all focused on low-level details like having the web crawler make direct calls to WinSock and tuning the C code. Yet they have these serious algorithmic problems with URL normalization and index/display version skews. :smack: :D

When I interviewed there it was like oil and water; we didn't mix at all. I do high-level programming in Java, Tomcat, and Linux. That wasn't popular with them. The only reason I went up there was because their recruiters called me.

Glad to see they've launched MSN Search and shown us how 'great' it is.

Gwennie! 05-28-2005 04:55 PM

I've been thinking about how this bug could arise. Their spidering/indexing process would have to be fairly screwed up to produce a version skew in the content.

It is clear to me now that they index the text of the link to a page along with the content of the page. The spider grabs a forum index page whose last-poster link reads "Gwennie", but by the time the spider follows the link, it grabs Wolf's profile page and associates the link text with the wrong web page. Since Wolf's profile page outscored the Gwennie! profile page, it is apparent that they put too much weight on the link text. They have plenty of tuning to do.
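
A minimal sketch of how that kind of skew could arise, assuming a toy crawler that records the link text when it reads the index page and fetches the target later. The class, the function, and the URL are hypothetical and only illustrate the suspected sequence; nothing here is MSN's actual code.

```python
# Toy model of the suspected bug: the anchor text is captured when the index
# page is crawled, but the target page is fetched later, after the
# "last poster" link has started resolving to a different profile.
# All names and URLs here are hypothetical.
from collections import defaultdict


class ToySite:
    """A forum whose 'last poster' URL serves whoever posted most recently."""

    def __init__(self, last_poster):
        self.last_poster = last_poster

    def index_page_link(self):
        # The link text names the current last poster; the URL is generic.
        return {"text": self.last_poster, "url": "/member.php?lastposter"}

    def fetch(self, url):
        # The same URL returns a different profile once someone else posts.
        return f"View Profile: {self.last_poster}"


def crawl_with_skew(site, poster_after_delay):
    index = defaultdict(list)               # term -> [(url, page content)]
    link = site.index_page_link()           # step 1: anchor text captured
    site.last_poster = poster_after_delay   # step 2: a new post arrives
    content = site.fetch(link["url"])       # step 3: target fetched later
    index[link["text"].lower()].append((link["url"], content))
    return index


index = crawl_with_skew(ToySite("Gwennie"), "Wolf")
print(index["gwennie"])
```

In this toy run the term "gwennie" ends up pointing at a page whose content is Wolf's profile, which matches the symptom in the search results.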

Those of us who were search engineers before the Internet came along don't rely on crutches the way these newbies do. Pure linguistic algorithms don't use link information; they find documents that are related because of the phrases they share.

Sorry, folks, I'm just thinking out loud here, and I thought there might be some interest in Search in The Internet forum.

wolf 05-28-2005 06:10 PM

So, how do we get to be #1 for Whale Penis? What's the best strategy?

Gwennie! 05-29-2005 01:44 AM

Quote:

Originally Posted by wolf
So, how do we get to be #1 for Whale Penis? What's the best strategy?

I'm assuming y'all are trying to get the page http://whalepenis.org ranked highly. The keyword density for that page needs to be higher.

If the keyword density is too low, the page's relevance score for the keyword will be lower. But if it is too high, as in "Whale Penis, Whale Penis, Whale Penis, Whale Penis, Whale Penis, Whale Penis", the density is artificially high and the page will be rejected. Work the keywords of interest into the web page content as much as possible. Replacing pronouns with the keywords and repeating them in normal language works best.
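
To make "keyword density" concrete, here is a rough sketch that counts how much of a page's text is taken up by a phrase. The function names and the `low`/`high` cutoffs are assumptions chosen for illustration; no search engine publishes its real thresholds.

```python
import re


def keyword_density(text, phrase):
    """Fraction of the words in `text` that belong to occurrences of `phrase`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words)


def classify(density, low=0.02, high=0.5):
    # Hypothetical cutoffs: below `low` the phrase barely registers,
    # above `high` the page looks artificially stuffed.
    if density < low:
        return "too low"
    if density > high:
        return "stuffed"
    return "ok"


natural = "The Church of the Whale Penis wants the Whale Penis site to rank first."
stuffed = "Whale Penis, Whale Penis, Whale Penis"

for sample in (natural, stuffed):
    d = keyword_density(sample, "Whale Penis")
    print(f"{d:.2f} {classify(d)}")
```

The natural sentence works the phrase in twice among other words, while the stuffed string consists of nothing but the phrase, which is the case described above as artificially high.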

Follow the keyword density of The Cellar Profile pages.

The Cellar Go Back The Cellar > View Profile Reload this Page Gwennie!
User CP Register FAQ Members List Calendar New Posts Search Quick Links Log Out
Search Forums Advanced Search Quick Links New Posts Mark Forums Read
Open Buddy List User Control Panel Edit Signature Edit Profile Edit
Options Miscellaneous Private Messages Subscribed Threads My Profile
Who's Online View Profile: Gwennie!
Gwennie! I'm Just a Gwannabe Gwennie!'s picture Offline
Add Gwennie! to Your Buddy List Add Gwennie! to Your Ignore List
Signature: Only The Crumbliest Flakiest Gwennie!
Forum Info Contact Info Join Date: 12-13-2003 Posts Total Posts: 125 (0.23 posts per day)
Find all posts by Gwennie! Find all threads started by Gwennie!
Home Page: http://tragickingdom.net/
Email: Send a message via email to Gwennie!
Private Message: Send a private message to Gwennie!
Additional Information Group Memberships
Birthday: October 3, 1969 Biography: Location: Anaheim Interests: Occupation: software engineer
Gwennie! is not a member of any public groups


Here's a rewrite of the page that should rank it higher.

Whale Penis: Who We Are

The Church of the Whale Penis is a group of people that merely want to be the number one google site for "Whale Penis." Yes, we're serious. No, really.

How serious?

Well, we own the Whale Penis domain whalepenis.org. And there's this Whale Penis site. And The Church of the Whale Penis is giving out free whalepenis.org e-mail forwarders. Whale Penis friends, how's that for serious?

What is this Whale Penis stuff? Some porn BS?

No! Whale Penis is not about porn at all! Seriously. See "Whale Penis: Who We Are" above about The Church of the Whale Penis.

For the Whale Penis site, what is the current ranking on Google in searches for "Whale Penis"?

This Whale Penis website is #228 as of May 19, 2005 in searches for "Whale Penis".

Alright...how do I help make the Whale Penis rise?

Link to us! Friends of the Whale Penis, once you link to us, e-mail us at link(@)whalepenis.org (remove the parentheses around the at sign), and we'll link to you! And tell your family and friends about us!

Links related to this Whale Penis site

The Bosque
The Cellar Image of the Day

Gwennie! 05-29-2005 01:52 AM

Two more things. The posts in the thread on El Ciberbosque should have links to http://whalepenis.org/

If you want Google to visit that page more often, put Google AdSense at the bottom of the page. They spider AdSense pages more often than others. You could also put AdSense on El Ciberbosque to get those pages spidered more often.

Church of the Whale Penis

Gwennie! 05-29-2005 02:09 AM

Wow, MSN re-spidered these pages today. Now I'm attached to Troubleshooter.

Undertoad 05-29-2005 05:08 AM

Google page rank gives attention to sites that have heavy inbound linking so the first thing is to get a lot of sites to link to the site -- with the keywords.

Google assigns more priority to sites that change regularly so the next thing to do is to put dynamic content on the page.

wolf 05-29-2005 08:42 AM

Actually, syc's goal is to get the infamous "Whale Penis Thread" to be number one for whale penis, but would probably be satisfied with getting the church's webpage up there as well. Thanks for the analysis!

Gwennie! 05-31-2005 01:20 AM

Quote:

Originally Posted by Undertoad
Google page rank gives attention to sites that have heavy inbound linking so the first thing is to get a lot of sites to link to the site -- with the keywords.

This is true, but it's more like a tiebreaker. Relevance scores are a function of the keyword density. Many pages will have similar relevance scores; PageRank then boosts the linked-to pages among them. CoWP already has a linking program, but the home page text needed to be edited for search engine scoring.
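
A small sketch of the tiebreaker idea as described in this post: pages are grouped by similar relevance, and inbound links only decide the order within a group. The `bucket` width, the field names, and the example pages are assumptions for illustration; this is not Google's actual ranking formula.

```python
def rank(pages, bucket=0.05):
    """Order pages by relevance bucket, then by inbound links within a bucket."""
    def key(page):
        # Pages whose relevance falls in the same bucket count as "similar".
        relevance_bucket = round(page["relevance"] / bucket)
        return (-relevance_bucket, -page["inbound_links"], page["url"])
    return [p["url"] for p in sorted(pages, key=key)]


pages = [
    {"url": "a.example", "relevance": 0.41, "inbound_links": 3},
    {"url": "b.example", "relevance": 0.42, "inbound_links": 40},
    {"url": "c.example", "relevance": 0.70, "inbound_links": 1},
]
print(rank(pages))
```

Here c.example wins on relevance despite having a single inbound link, while a.example and b.example have similar relevance, so the link count decides between them.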

Quote:

Originally Posted by Undertoad
Google assigns more priority to sites that change regularly so the next thing to do is to put dynamic content on the page.

This is true up to a certain point. Google cites news pages and airline schedules as examples that change too frequently to index in the web-search index.

Quote:

Originally Posted by Wolf
Actually, syc's goal is to get the infamous "Whale Penis Thread" to be number one for whale penis, but would probably be satisfied with getting the church's webpage up there as well. Thanks for the analysis!

You're welcome. I'm your friend in the search business. :ivy:

The CoWP states search ranking as its goal. So that's why I focused on that site.

The default setting for the Guest user is 20 posts per page, so a search index will score each page of 20 posts separately. When you get past 20 posts, the relevance score of the first page of the thread won't change. You are better off starting a new thread, filling it with 20 posts with good keyword density, and linking to the page.

Adding new pages to a thread won't change the search ranking of previous pages of that thread.
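
The pagination point can be sketched like this, assuming a spider that sees the thread as a guest at 20 posts per page. The `paginate` helper and the post strings are hypothetical.

```python
def paginate(posts, per_page=20):
    """Split a thread into the pages a guest (and a spider) would see."""
    return [posts[i:i + per_page] for i in range(0, len(posts), per_page)]


thread = [f"post {n}" for n in range(1, 21)]       # a full first page
first_page_before = paginate(thread)[0]

thread += [f"post {n}" for n in range(21, 46)]     # 25 more posts arrive
pages = paginate(thread)
first_page_after = pages[0]

print(len(pages), first_page_before == first_page_after)
```

However many posts are appended, the text of page one stays the same, so anything scored from that page's content stays the same too.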

Troubleshooter 05-31-2005 07:41 AM

Quote:

Originally Posted by Gwennie!
Wow, MSN re-spidered these pages today. Now I'm attached to Troubleshooter.

Not anywhere that would require a BCS as far as I can tell.

elSicomoro 06-21-2005 05:57 PM

Finally...updated the COTWP webpage. Thanks for the help, RS!

jaguar 06-22-2005 05:08 AM

I seem to remember some sites had to block the old MSNBot because it saw each sessionID as a unique URL.....
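
For illustration, one way a crawler could avoid treating each session ID as a new URL is to drop session parameters before comparing addresses. This is a sketch only: the parameter names in `SESSION_PARAMS` and the example hash values are assumptions, and it does not describe how MSNBot or any other crawler actually works.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; real sites vary.
SESSION_PARAMS = {"s", "sid", "sessionid", "phpsessid"}


def normalize(url):
    """Return a canonical form of `url` with session parameters removed."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    query.sort()  # parameter order should not create a "new" URL either
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path,
         urlencode(query), "")
    )


a = normalize("http://cellar.org/showthread.php?s=abc123&t=8436")
b = normalize("http://Cellar.org/showthread.php?t=8436&s=def456")
print(a == b, a)
```

Both addresses collapse to the same canonical URL, so the thread would be fetched and indexed once rather than once per session.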


All times are GMT -5.