  #1  
Old 14 Nov 2009, 17:13
kontrabass
 
Join Date: Feb 2002
Search: How does VB4 "optimize" searching? Is Sphinx still required for big boards?

Just curious... I've got 7+ million posts currently and am at the mercy of Sphinx for searches in VB3...
  #2  
Old 14 Nov 2009, 17:36
Paul M
 
Join Date: Sep 2004
Real name: Paul M
No idea, I don't think anyone has tried it on a board that big.

The search is still undergoing a lot of changes atm, it was basically rewritten.
__________________
Former vBulletin.org Staff Member


Cable Forum
Please do not PM me about custom work - I no longer undertake any.
Note: I will not answer support questions via e-mail or PM - please use the relevant thread or forum.
  #3  
Old 24 Dec 2009, 04:10
RedWingFan
 
Join Date: Oct 2004
I'm waiting for any kind of Sphinx modification (plugin, product, whatever) in vB4 before we can upgrade our production forum over to it. I don't know enough about vB to code my own, or adapt the existing 3.x implementation for 4.x. (Although I may be forced to... )
__________________
-= N =-
  #4  
Old 24 Dec 2009, 16:20
kontrabass
 
Join Date: Feb 2002
I did a search for "sphinx" over on vb.com's 4.0 gold installation - it took 34 seconds of search time. Optimized? Heh.
  #5  
Old 02 Jan 2010, 14:15
eoc_Jason
 
Join Date: Dec 2001
There are a few discussions over there, but the last official answer that I read was that Sphinx was NOT implemented in vB 4.0, with no acknowledgment that it would be implemented any time soon (or ever). I'm starting to seriously consider IPB as it has Sphinx compatibility...

Searching is the one crucial feature of a forum (or heck, even the Internet... Google anyone?) and I am appalled at how the vB team has been putting search features so low on their priority list (if it has even made the list)... Once a forum gets to a certain size, the built-in search becomes totally unusable because of table locking... I've complained for years on behalf of people with large sites, and it seems to have fallen on deaf ears.

It doesn't take much to crash a large vB forum using the built-in search. When you do a search, depending on how you word it, it can take a long, long time. Sadly, if a result is not returned within a couple of seconds, people usually click the submit button again, and again, and again, which does nothing but queue up more queries in MySQL... Other people can still browse the forum just fine... UNTIL someone tries to submit a post/thread... When that happens, the submit is held in queue until the search finishes. Likewise, every other query to the post table is held in queue because now there is a lock on the table... If your site is busy then you will quickly run out of mysql connections and probably http connections because everyone is "put on hold" till the one freaking search finishes...
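
To make the queueing described above concrete, here is a rough Python sketch of a table-level lock. It is a simplified model, not MySQL's actual scheduler: it assumes reads share the lock, a write needs it exclusively, and requests are granted strictly in arrival order, so a waiting write holds up every read that arrives after it. The `Query` type, the `simulate` function, and the timings are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    kind: str      # "read" or "write"
    arrival: int   # second at which the query reaches the database
    duration: int  # seconds of work once the lock is granted


def simulate(queries):
    """Return {name: (start, finish)} under a simplified table-level lock.

    Assumed rules (illustrative only): reads share the lock, a write is
    exclusive, and queries are granted in arrival order, so a queued
    write blocks every later read.
    """
    results = {}
    readers_busy_until = 0  # when the longest-running granted read ends
    writer_busy_until = 0   # when the most recent granted write ends
    previous_start = 0      # enforces first-come, first-served granting
    for q in sorted(queries, key=lambda item: item.arrival):
        if q.kind == "read":
            # A read only has to wait for writes queued ahead of it.
            start = max(q.arrival, writer_busy_until, previous_start)
            readers_busy_until = max(readers_busy_until, start + q.duration)
        else:
            # A write waits for every read and write granted before it.
            start = max(q.arrival, writer_busy_until,
                        readers_busy_until, previous_start)
            writer_busy_until = start + q.duration
        previous_start = start
        results[q.name] = (start, start + q.duration)
    return results


timeline = simulate([
    Query("slow_search", "read", arrival=0, duration=40),
    Query("page_view_before_post", "read", arrival=1, duration=1),
    Query("new_post", "write", arrival=2, duration=1),
    Query("page_view_after_post", "read", arrival=3, duration=1),
])
for name, (start, finish) in timeline.items():
    print(f"{name}: starts at {start}s, finishes at {finish}s")
```

In this toy run the page view that arrives before the new post is served right away, but the post has to wait for the 40-second search, and the page view that arrives after the post is stuck behind it as well. That matches the "everyone is put on hold" effect described above.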

A user isn't going to wait minutes for a search result.... Likewise it is not very feasible for a webmaster to get a second server exclusively as a slave search DB. Sphinx on the other hand can return results in a fraction of a second, and the program is FREE... *sigh*...

I'm still stuck on vB 3.6.x because that is the best sphinx code that is available at the moment. I've taken Orban's code and modified / improved it some, but I really don't want to try and tackle all the extra stuff of the newer vB... I don't know why I keep renewing my license, I'm paying for a product that I can't / won't upgrade to... I *thought* sphinx was going to be implemented in vB 4, but that was all false hopes...
__________________
My Site: EXTREME Overclocking

Do not PM me with your iTrader problems or asking for the code. I will just delete your PM without reading it.
  #6  
Old 02 Jan 2010, 14:59
kontrabass
 
Join Date: Feb 2002
The "good" news is that table locking has been removed from the equation in 4.0. But still, as you said, no user on today's internet is going to sit there for close to a minute waiting for search results. Heck, it'd be a lot more than a minute on my forums, doing a fulltext search on 7 million posts!

I think VB is hesitant to use Sphinx because they don't have control over it. It's not *their* product. The thinking is probably that if Sphinx for some reason becomes unsupported or ceases to exist, VB is crippled.

But I say, if they don't want to incorporate Sphinx, fine - at least create a plugin that is easy to implement, so that if their big-boarding clients have no other option, at least they can upgrade to VB 4.0 instead of moving to IPB.

--------------- Added 02 Jan 2010 at 15:05 ---------------

I just checked out this official doc from IPB:

http://community.invisionpower.com/r...tml?record=181

Wow, that's what I'm looking for. Sure, the user needs to have root server access (probably another reason VB doesn't officially support Sphinx), but what big-boarder seriously is not going to have root access???? No board with 5+ million posts is going to be on a shared $10/month hosting plan.

IPB makes it so easy to use Sphinx, I'm seriously looking at converting even though I've been with VB for 10 years.

Last edited by kontrabass; 02 Jan 2010 at 15:06. Reason: Auto-Merged DoublePost
  #7  
Old 03 Jan 2010, 14:27
nitra1000
 
Join Date: Dec 2009
Real name: John Moss
Don't shoot me but you could always implement google custom search...
__________________
My Mods - Pm me for custom work
  #8  
Old 04 Jan 2010, 01:54
kontrabass
 
Join Date: Feb 2002
Originally Posted by nitra1000
Don't shoot me but you could always implement google custom search...
We've done that for years as well (in conjunction with sphinx). Unfortunately it's extremely feature limited. Search by username? Order results by thread date? last post date? Search a specific forum category? You're out of luck.
  #9  
Old 04 Jan 2010, 03:03
RedWingFan
 
Join Date: Oct 2004
Originally Posted by nitra1000
Don't shoot me but you could always implement google custom search...
Originally Posted by kontrabass
We've done that for years as well (in conjunction with sphinx). Unfortunately it's extremely feature limited. Search by username? Order results by thread date? last post date? Search a specific forum category? You're out of luck.
To add to what kontrabass writes: Google Search (any version) can't work properly, as it is not a real-time search directly on our database tables. Google can only search on the content it spiders over the course of time. That could be daily, weekly, whatever. And because of this, it can't support searches by username, thread title searches, date-based searches, etc.

Having said that, we DO offer a Google search option, as for some of the phrases our visitors search for in older posts, Google's results are far more accurate and usable. Google's search also gets around common stopwords: on a music forum that discusses rock music, NO built-in search via vB or MySQL can find the title of the album by The Who called "Who's Next", as both are stopwords (unless you custom-configure MySQL).
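
As a rough illustration of the stopword problem, the Python sketch below shows how a query can end up with nothing left to search for once stopwords are stripped. The `STOPWORDS` set is a tiny made-up sample based on the claim above that both words in "Who's Next" are stopwords; the real list depends on how the search backend is configured, and `index_terms` is a hypothetical helper, not anything from vB or MySQL.

```python
import re

# Illustrative sample only; the real stopword list comes from the
# search backend's configuration.
STOPWORDS = {"the", "a", "of", "who", "who's", "next"}


def index_terms(query, stopwords=STOPWORDS):
    """Lowercase the query, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    return [word for word in words if word not in stopwords]


print(index_terms("Who's Next"))             # nothing left to search for
print(index_terms("Quadrophenia remaster"))  # both words survive
```

With every word filtered out, the search engine has no terms to match, so the album title can never be found no matter how many posts mention it.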

And to help with server loads, we have asked members to give Google search a try. We have it in the search drop-down on the forum as an option. But, that still precludes using any of the search functions built into vB that we know (and...umm...love). It's a good additional tool for members, but it can't replace vB's search.

Originally Posted by kontrabass
The "good" news is that table locking has been removed from the equation in 4.0.
How did they pull that one off?

Originally Posted by kontrabass
I think VB is hesitant to use Sphinx because they don't have control over it. It's not *their* product. The thinking is probably that if Sphinx for some reason becomes unsupported or ceases to exist, VB is crippled.

But I say, if they don't want to incorporate Sphinx, fine - at least create a plugin that is easy to implement, so that if their big-boarding clients have no other option, at least they can upgrade to VB 4.0 instead of moving to IPB.

%< snip >%

....... Sure, the user needs to have root server access (probably another reason VB doesn't officially support Sphinx), but what big-boarder seriously is not going to have root access???? No board with 5+ million posts is going to be on a shared $10/month hosting plan.
We don't have root access on our dedicated server, BUT...I can still install quite a few packages myself, and I successfully compiled Sphinx and installed it on our box without needing a root login. I could do this on our shared accounts at the same host, but like you say...NO board with as many posts as ours would live long on a shared account! (We're around 4.5 million posts ourselves right now.)

And for that matter, vB doesn't control PHP and MySQL, yet they depend on them to make vB run. I agree with you: Sphinx support should be a vB-supplied plugin, or an option we can choose from admincp. Just like in vB 3.x, where we could choose GD or ImageMagick support based on what we had available to us. The only hiccup I can see is that Sphinx does require a bit of additional setup beyond installation, and you need access to cron (or similar mechanism) to run the indexing at regular intervals. Even there, though, admins with large forums like ours already run on dedicated hosts and know how to do this already.
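
For anyone wondering what "run the indexing at regular intervals" might look like, here is a minimal Python sketch of a wrapper that cron (or a similar scheduler) could call. It assumes Sphinx's `indexer` binary is on the PATH and that `searchd` is already running so that `--rotate` can swap in the fresh index. The function names and any config path or index names you pass in are hypothetical; adjust them to your own setup.

```python
import subprocess


def build_indexer_command(config_path, indexes=None):
    """Build the argument list for a Sphinx reindex run.

    With no index names given, every index in the config is rebuilt.
    """
    command = ["indexer", "--config", config_path, "--rotate"]
    command += list(indexes) if indexes else ["--all"]
    return command


def reindex(config_path, indexes=None):
    """Run the indexer and return its exit code (0 means success)."""
    completed = subprocess.run(build_indexer_command(config_path, indexes))
    return completed.returncode
```

A scheduled script would simply call `reindex()` with the path to its own Sphinx config. Keeping the command construction in a separate function makes it easy to check what would be run without actually touching the index.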

It's been said before that it's not even vB so much as it is MySQL that is limiting us. One of our fellow members here on vB.org rewrote parts of vB3 to use PostgreSQL and it ran far better. If the freebie forums like phpBB can offer multiple schemas, I don't see why the same couldn't have been done with vB when it was redesigned from the ground up for vB4. I would pay extra at our host to get PostgreSQL installed if I could use it with our forum.

Originally Posted by eoc_Jason
Searching is the one crucial feature of a forum (or heck, even the Internet... Google anyone?) and I am appalled at how the vB team has been putting search features so low on their priority list (if it has even made the list)... Once a forum gets to a certain size, the built-in search becomes totally unusable because of table locking...
We tried fulltext search and had to go back to MyISAM table types, and the table locking was killing us. Back to built-in vB search for us! Lesser of two evils at the time, until I decided to bite the bullet and put Sphinx on our production forum. Still not ideal, but the forum (and especially our search) became...usable.

I don't know if I'm wrong for saying this, but isn't Search the most intensive operation on the forum, and the one that brings most of our big boards to their knees? If so, then, why aren't we given more options to choose alternate search engines? Granted, us big board admins are a small minority of users, but this also can't look good to visitors who see a busy forum branded with "vB" and find that yet another vB forum drags on for minutes at a time during searches.

Originally Posted by eoc_Jason
Likewise it is not very feasible for a webmaster to get a second server exclusively as a slave search DB. Sphinx on the other hand can return results in a fraction of a second, and the program is FREE... *sigh*...
We run on donations only. We did try a second server for a database-only server, but the additional fee killed us and we almost had to shut down entirely. Fortunately a hardware upgrade saved us. But we can't afford to keep throwing additional hardware at the problem just because ONE function of the forum--search--is dragging us down, and a solution is already available and used by some big-board admins already. Sphinx just seems to solve SO much for us, and yet we still have to resort to using plugins and/or hacks that need a fair amount of tweaking to fit our individual needs.

I hate to keep posting about this, but maybe we just need to make enough noise so that Sphinx finally gets official support and easy integration with vB that we can trust.
  #10  
Old 07 Jan 2010, 13:27
eoc_Jason
 
Join Date: Dec 2001
Yeah, compiling & installing sphinx shouldn't require root access (though it would be preferred, to set up a proper init script, log rotation, etc...)

I'm attaching my stopword list that I use for sphinx, feel free to modify & use it as necessary. You can define a custom stopword list in MySQL too, but it would require re-indexing (ick).

Originally Posted by RedWingFan
I don't know if I'm wrong for saying this, but isn't Search the most intensive operation on the forum, and the one that brings most of our big boards to their knees? If so, then, why aren't we given more options to choose alternate search engines?
Yes, search IS the most intensive feature. Especially because vB is relying on MySQL to handle a job that it's really not designed to do (in such a capacity)! It's like MS Access... It's a great beginner database and for small projects, but you wouldn't try to run the Stock Exchange off of it would you?

Rather than adding blogging, personal pages, and all this other crap, vB needs to focus more on the CORE of their PRODUCT. People join a forum to EXCHANGE INFORMATION. If a person can't find that information then everything is for naught... Search speed & relevance should be fixed as their constant #1 priority... with every new vB update...

I have to agree, it doesn't have to be core code... vB likes to brag about their plugin/product system to allow "flexibility"... Well why not make some freaking plugins for various search products?!?!?!

I would love to see Sphinx (open source) & dt-search (commercial) as a bare minimum... Lucene and mnogosearch would also be popular choices.

Heck, I would almost bet money that vBulletin could work out a deal with dt-search for a special forum-only search engine product that would be affordable to most (and somehow work out a split on the package sales).... It would boost both companies' sales! dt-search could even continue that line with other forum packages and fulfill an ever growing market that really needs it!
Attached Files
File Type: txt stopwords.txt (4.0 KB, 14 views)
  #11  
Old 31 Jan 2010, 20:17
TosaInu
 
Join Date: Jul 2004
I'm starting to think the upgrade was a very bad move. Partly my own fault by how I do the upgrades, but the board doesn't see the old attachments either.

We dumped vBulletin's own search index years ago in favour of their (unofficial/official?) Fulltext search as our board is too big.

What was the table name used by vBulletin's search again? To confirm, that's the only method they offer now?
__________________
Ja mata
TosaInu
  #12  
Old 31 Jan 2010, 21:17
Paul M
 
Join Date: Sep 2004
Real name: Paul M
Please do not start the same conversation across multiple threads. You have already posted in two others. Stick to one of them.