PDA

View Full Version : Google sitemap for the vB Archives. Redirect human and robots.


lierduh
10 Aug 2005, 10:13
Release V1.2 (9 Nov 2005)
* Higher sitemap priority is given to threads with new posts, so Google can index fresh threads first.

* The original optional STEP 3 hack is no longer recommended. To avoid a potential Google penalty, my advice is to remove it.

Release V1.1a (12 Oct 2005)

* Bug fix only

Release V1.1 (9 Oct 2005)

* Can handle very large forums with more than 50,000 URLs per forum.
URLs are spread across multiple files for each large forum.

* Created a function to detect search engine crawlers. The vB built-in
search engine detector can only identify about 3 or 4 search engines;
my function detects over 20 search engine crawlers.

* Supports forums hosted on web servers without 'fix_pathinfo',
i.e. forums whose links look like 'archive/index.php?f-10.html'
instead of the usual 'archive/index.php/f-10.html'.

* Alerts about wrong directory permissions to help newbies.

* Automatically writes the index file to the archive directory if the php
script cannot write into the base vB directory.

* Bug fixes.


Objectives
==============

Create Google sitemap (http://www.google.com/webmasters/sitemaps/login) files and a sitemap index file for the vB archives, and submit them to Google via the Scheduled Tasks.
Use the vB Archive as a mirror of the actual threads.
Google loves the nature of the archive pages: they are static and do not contain duplicated content.
Google ranks pages heavily based on external links, so we redirect external thread links to the archive pages.
We often see vBulletin archives in Google search results, but then users land on the archive page instead of the actual thread. We automatically redirect human visitors from the archive to the actual threads; otherwise the visitor has to either re-click for the Full Version or read the dull archive contents.


Q and A
==============
Q. Would the sitemap contain the links for hidden forums?
A. No, forum permissions are consulted while generating the sitemap files.

Q. How often are the sitemap files generated?
A. You decide, and set it in the Scheduled Tasks. By default the script cannot be called by an external user, to prevent bored people from killing your server.

Q. Is the sitemap file compressed?
A. Yes, the multiple sitemap files are gzipped according to the Google sitemap standard to save bandwidth. The sitemap index file is not compressed; it is submitted as a normal xml file.
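For the curious, the uncompressed index file is just a small XML document listing the gzipped sitemap files. A minimal sketch of how such an index could be built (the 0.84 schema namespace reflects the Google sitemap standard of the time; the URLs and the helper name are mine, not the script's):

```python
# Build a minimal Google sitemap index document (sketch, not the hack's code).
HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'

def sitemap_index(sitemap_urls):
    """Return a sitemap index listing each (gzipped) sitemap file URL."""
    entries = "".join(
        "  <sitemap><loc>%s</loc></sitemap>\n" % url for url in sitemap_urls
    )
    return (HEADER
            + '<sitemapindex xmlns="http://www.google.com/schemas/sitemap/0.84">\n'
            + entries
            + "</sitemapindex>\n")

print(sitemap_index([
    "http://forums.mysite/archive/sitemap_1.gz",
    "http://forums.mysite/archive/sitemap_2.gz",
]))
```

Each `<loc>` entry points at one gzipped sitemap file; only this small index stays uncompressed.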

Q. Would the sitemaps include links for the normal threads? eg. showthread.php?t=1234...
A. No. Google is unlikely to index your entire site if you feed it every combination of showthread links. It is better to let Google go through the more static archives; this way you have a much better chance of getting more thread content indexed by Google.

Q. Why don't you go crazy with rewrite rules and do things like including the thread title in the URL?
A. I won't deny that having keywords in the URL is a good SEO strategy, but Google also does not like "over search engine optimized" web sites. Google has recently penalized a huge number of such sites, sending them from a PageRank of 5 or 6 down to 0.

Q. Does sitemap really help?
A. Definitely. Google has crawled over 60,000 pages since I submitted my sitemaps a few days ago. Yahoo bots were visiting more pages than Google before the sitemap; I expect total Google visits for this month to exceed Yahoo's within the next day or two.

What is involved?
==================
I have divided this hack into two steps. The first step involves uploading a php file. This enables the sitemap to be generated and submitted to Google.

The second step involves installing a Plugin using AdminCP. This sends all robots to the archive pages, preventing them from viewing the actual threads.

For example, when Google or another crawler follows an external link to visit:
http://forums.mysite/showthread.php?t=1234&page=2

It will be told this page is permanently relocated to:
http://forums.mysite/archive/index.php/t-1234-p-2

This way you don't lose page rank gain from external links.
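The redirect logic above can be sketched roughly like this (Python used for illustration rather than the hack's PHP; the crawler list and helper names are hypothetical -- the actual plugin recognizes 20+ crawlers):

```python
import re

# A few crawler user-agent substrings (illustrative; the real hack detects 20+).
CRAWLER_SIGNATURES = ("googlebot", "slurp", "msnbot", "teoma")

def is_crawler(user_agent):
    """Return True if the user agent looks like a search engine crawler."""
    ua = user_agent.lower()
    return any(sig in ua for sig in CRAWLER_SIGNATURES)

def archive_url(showthread_url):
    """Map a showthread link to its archive equivalent, as in the example above."""
    m = re.search(r"showthread\.php\?t=(\d+)(?:&page=(\d+))?", showthread_url)
    if not m:
        return None
    thread, page = m.group(1), m.group(2)
    path = "archive/index.php/t-%s" % thread
    if page:
        path += "-p-%s" % page
    return path

# A crawler requesting the thread would receive a 301 pointing at:
print(archive_url("http://forums.mysite/showthread.php?t=1234&page=2"))
```

A human visitor fails the `is_crawler()` test and is redirected the other way, from the archive page back to the full thread.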

Install
=========
To install, follow the readme file.
To let me know you have installed this (and so I can send you update information), please click INSTALL (http://www.vbulletin.org/forum/vborg_miscactions.php?do=installhack&threadid=93980).

Strategy
=========

It is unlikely Google or any other search engine will index your entire site, especially given the dynamic nature of vBulletin forums. An archive sitemap lets Google concentrate on the real contents of your forums -- the threads. If Google has to go through the endless member profile pages, it will get sick of it and simply tire out (sorry, perhaps robots cannot become tired :)). What we can do is disallow the crawling of unnecessary pages. My robots.txt contains:

#ALL BOTS
User-agent: *
Disallow: /admincp/
Disallow: /ajax.php
Disallow: /attachments/
Disallow: /clientscript/
Disallow: /cpstyles/
Disallow: /images/
Disallow: /includes/
Disallow: /install/
Disallow: /modcp/
Disallow: /subscriptions/
Disallow: /customavatars/
Disallow: /customprofilepics/
Disallow: /announcement.php
Disallow: /attachment.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /external.php
Disallow: /faq.php
Disallow: /frm_attach
Disallow: /image.php
#Disallow: /index.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /member.php?
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /payment_gateway.php
Disallow: /payments.php
Disallow: /poll.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /showpost.php
Disallow: /subscription.php
Disallow: /usercp.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /usernote.php

You have perhaps noticed I included index.php in there. Apparently Google regards http://forums.mysite/index.html as the same as http://forums.mysite/
...but http://forums.mysite/index.php as a different file. The default vB templates use index.php for internal links, which dilutes the PageRank of your home page! So you are better off not letting Google see this file.

If you have mod_rewrite installed, you could perhaps add this to the .htaccess file:

RewriteCond %{QUERY_STRING} ^$
RewriteRule ^index.php$ / [R=301,L]

(if your forums are under http://site/forums/, try: RewriteRule ^forums/index.php$ forums/ [R=301,L])

That will redirect /index.php to /, but only if no query string is present; i.e. /index.php?do=mymod will not be redirected.
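If you want to sanity-check which requests those two rules catch, the same condition can be expressed in Python (a sketch of the matching logic only, not of Apache itself; the function name is mine):

```python
from urllib.parse import urlsplit

def redirects_to_root(url):
    """True if the rewrite rule above would 301 this request to /."""
    parts = urlsplit(url)
    # RewriteCond %{QUERY_STRING} ^$ : only fire when the query string is empty.
    # RewriteRule ^index.php$ /      : only fire for exactly /index.php.
    return parts.query == "" and parts.path == "/index.php"

print(redirects_to_root("http://forums.mysite/index.php"))           # True
print(redirects_to_root("http://forums.mysite/index.php?do=mymod"))  # False
```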

jugo
10 Aug 2005, 12:25
Very sweet.

Thanks a lot.... Before you know it, my site will be in the #1 spot for Game Servers and Web hosting.

hotrod1
10 Aug 2005, 14:40
When I try to run the scheduled task I get this error:

Google Sitemap Submit

Fatal error: Call to a member function on a non-object in /home/explosiv/public_html/forum/archive/forums_sitemap.php on line 98

Fatal error: Call to a member function on a non-object in /home/explosiv/public_html/forum/includes/functions.php on line 4240

lierduh
10 Aug 2005, 15:07
There is a bug in the RC2 for running a Task by "Run Now".

You can fix this bug by editing "admincp/cronadmin.php"

Around line 60 find:
$db = null;

Replace it with:

unset($db);

hotrod1
10 Aug 2005, 15:35
Thanks alot, that fixed the problem.

hotrod1
10 Aug 2005, 20:50
There is a bug in the RC2 for running a Task by "Run Now".

You can fix this bug by editing "admincp/cronadmin.php"

Around line 60 find:
$db = null;

Replace it with:

unset($db);

How exactly do I apply the changes to the archive files? There are no directions for doing it manually, and I tried replacing the files but had no luck.

flaregun
10 Aug 2005, 21:23
I'm getting:

Warning: fopen(/forum/g_sitemap.xml): failed to open stream: Permission denied in /archive/forums_sitemap.php on line 245

Looks like it's trying to write to my forums dir and not my base dir...how can I change that?

lierduh
10 Aug 2005, 22:30
I'm getting:

Warning: fopen(/forum/g_sitemap.xml): failed to open stream: Permission denied in /archive/forums_sitemap.php on line 245

Looks like it's trying to write to my forums dir and not my base dir...how can I change that?

It will write the sitemap index file to the forums/vB base directory. The sitemap index is best located in the base directory! The base refers to the forum base, not your web root.

The index file (g_sitemap.xml) will be written to where forumdisplay.php and showthread.php are located.

flaregun
10 Aug 2005, 22:49
hmm, I changed permissions and it still won't work. Should I create a blank g_sitemap.xml file in the forums dir?

lierduh
10 Aug 2005, 23:09
How exactly do I apply the changes to the archive files? There is no directions for doing it manually and I tried replacing the files but had no luck.

I have provided the diff between the old file and the new file. Unfortunately I don't have time to write line-by-line instructions for the modifications.

I have attached the coloured diff from CVS. You should be able to follow it to change the files.

(Note: I cannot redistribute modified vB files, nor will I do that.)
PS: I do not take paid work. I just don't have the time and patience to write the instructions.

lierduh
10 Aug 2005, 23:13
hmm i changed permissions and It still wont work, should I create a blank g_sitemap.xml file in the forums dir?

What is the permission setting for your forums directory? At the base of the vB directory, do this:

#ls -l ../

and copy the result here.

flaregun
11 Aug 2005, 02:16
now I'm getting this. I changed the permissions on the archive folder, but I still get it.

Warning: gzopen(/public_html/forum/archive/sitemap_3.gz): failed to open stream: Permission denied in /archive/forums_sitemap.php on line 132

alkatraz
11 Aug 2005, 03:11
haven't upgraded to 3.5 yet but used your robots.txt file, thx!

MrNase
12 Aug 2005, 01:06
Nice but I can't use it.. My shared host doesn't allow me to change permissions for ./ :(

lierduh
12 Aug 2005, 01:13
Nice but I can't use it.. My shared host doesn't allow me to change permissions for ./ :(

Easy: find "/g_sitemap.xml" (3 places) and replace it with "/archive/g_sitemap.xml". The sitemap index file will then be written to the 'archive' directory.

lierduh
12 Aug 2005, 01:14
now I'm getting this, I change the persmissions in the archive folder, but I still get it.

Warning: gzopen(/public_html/forum/archive/sitemap_3.gz): failed to open stream: Permission denied in /archive/forums_sitemap.php on line 132

That means the permission is wrong. Please do a #ls -l and copy the result here.

jdingman
12 Aug 2005, 03:32
The install directions are very confusing. You may want to include very detailed instructions so anyone can install it, such as where to find the Scheduled Tasks and all that. I'm very new and had no idea where any of it was, therefore I have not gotten it installed.

flaregun
13 Aug 2005, 02:21
That means the permission is wrong. Please do a #ls -l and copy the result here.

Permissions in what folder? I followed the instructions as best I could. Could you list all dir that must be changed?

note, my forum is in a sub folder from the root, is that what is messing this script up?

thanks,

- R

lierduh
13 Aug 2005, 05:10
Permissions in what folder? I followed the instructions as best I could. Could you list all dir that must be changed?

note, my forum is in a sub folder from the root, is that what is messing this script up?

thanks,

- R

The readme.txt:
The sitemap files will be created in the archive directory. The index
file will be created in the base vBulletin directory. It is important
that a sitemap resides in the base directory: Google assumes you have
permission to submit a sitemap only when it is not located in a
sub directory.

It is perhaps easier if you post the part that you do not understand... and, most importantly, list your current directory structure and what you have done. You said you have changed permissions -- what exactly did you change? I have asked twice for you to copy the result of #ls -l. I don't know what to presume: can you not access a shell, or is your web server Windows based? I don't know... I think we have a serious communication problem; we do not understand each other, by the sound of it.:)

lierduh
13 Aug 2005, 05:13
The install directions are very confusing. You may want to include very detailed instructions so anyone can install it. Such as where to find the Scheduled task stuff and all that. I'm very new and I had no idea where any of it was, so therefor I have not gotten it installed.

Go to your AdminCP (Admin Control Panel), look down on the left hand side.

KarateKid
17 Aug 2005, 12:29
Go to your AdminCP (Admin Control Panel), look down on the left hand side.

does anyone run this hack already successfully with RC2?

Rich
17 Aug 2005, 13:37
Hello,

I am running on a unix machine, my forums are in the root directory and not a sub directory. I uploaded the forums_sitemap.php to the archive folder and then I chmod it to 775. (meaning the archive directory)

I then chmod my root directory (public_html) to 775.

Then I went in, and added the scheduled task.

Then I ran the scheduled task and got these errors:

Fatal error: Call to a member function on a non-object in /home/habitats/public_html/archive/forums_sitemap.php on line 98

which is: $forums = $vbulletin->db->query("

Fatal error: Call to a member function on a non-object in /home/habitats/public_html/includes/functions.php on line 4240

which is: $vbulletin->db->unlock_tables();

Did I miss a step? I believe I covered everything. (I have complete access to my db, including telnet authorization as well as query-running ability. All of my usernames and passwords are set correctly.)

lierduh
17 Aug 2005, 14:48
See the third post within this thread.

Hello,

I am running on a unix machine, my forums are in the root directory... Did I miss a step? I believe I covered everything.

flaregun
18 Aug 2005, 05:14
If I call the script directly: I get these errors:

Warning: array_keys(): The first argument should be an array in /includes/class_core.php on line 1375

Warning: Invalid argument supplied for foreach() in /includes/class_core.php on line 1375

Warning: array_keys(): The first argument should be an array in /includes/class_core.php on line 1390

Warning: Invalid argument supplied for foreach() in /includes/class_core.php on line 1390

Brandon Sheley
18 Aug 2005, 05:52
i don't understand this part of the install ? i'm very new to the new setup.. 2 nights now..lol

You need to change the directory permissions so that the script can write
to the base and archive directories. If your server runs apache, and apache
runs under the apache user and apache group,

I normally assign the permission this way:

#chown apache.MYUSER_GROUP archive
#chmod 775 archive

MYUSER_GROUP is the user group my login belongs to. #ls -l will show that.

775 will let apache (the script) and me (after I log in) add/change files.

Set the same permission to the base vB directory.


is this a file I chmod? :ermm: I'm lost

lierduh
19 Aug 2005, 01:31
Which version of vB do you use? The line numbers do not match. Otherwise, have you modified the class_core.php file, perhaps due to the installation of another hack?


If I call the script directly: I get these errors:

Warning: array_keys(): The first argument should be an array in /includes/class_core.php on line 1375

Warning: Invalid argument supplied for foreach() in /includes/class_core.php on line 1375

Warning: array_keys(): The first argument should be an array in /includes/class_core.php on line 1390

Warning: Invalid argument supplied for foreach() in /includes/class_core.php on line 1390

lierduh
19 Aug 2005, 02:14
i don't understand this part of the install ? i'm very new to the new setup.. 2 nights now..lol

You need to change the directory permission so that the script can write
to the base and archive directory. If your server runs apache and apache
is run under apache user and apacher group.

I normally assign the permission this way:

#chown apache.MYUSER_GROUP archive
#chmod 775 archive

MYUSER_GROUP is the user group my login belongs to. #ls -l will show that.

775 will let apache (the script) and me (after I log in) add/change files.

Set the same permission to the base vB directory.


is this a file i chmod :ermm: im lost

chmod and chown are Unix commands.

I did not realise many admins are newbies when it comes to system administration.

Basically you need to change the permissions of the directories so that the php script can write files (the sitemaps) to them. I understand some of your providers probably do not even provide shell access to the server, only some sort of control panel for admin purposes. I am afraid I cannot explain how to use these control panels, as I have not seen one; users of such ISP control panels may be able to provide more information. People who do not know what to do should say what sort of control panel they use, what is in it, and what they have tried.

I will try to explain in general:

When I refer to the base directory, I refer to the forum base directory. Some of you might have set up the forums this way:

http://www.mysite.com/forums/

That means http://www.mysite.com/index.html will be in the root directory of the domain. The base directory for the vB will be ./forums under the web root.

If your forums are set up as http://forums.mysite.com/
Then the base vB directory will be the root directory for the domain (forums.mysite.com)

This hack needs to write to
1) The base vB directory (where you find showthread.php file)
2) The archive directory (where you find the archive.css file)

So you need to make these two directories writable for the php script, OR world writable.

A little info on the Unix chmod command.

1: executable
2: writable
4: readable

1+4 = 5 means readable and executable. Directories should be at least executable (1); 5 on a directory means visitors can read (list) the directory contents, unless an index file is found.

php files only need to be readable: 4

If we need to write to the directory, then the permission needs to be:
1+4+2 = 7

There are three permission sets for each file/directory: 1) user, 2) group, 3) world/anyone.

User means the Unix user who owns the file, or the user the script runs as (web servers typically run as apache or nobody).

Group means the user group the user belongs to. Typically the apache server runs the script under the 'apache' or 'nobody' group.

World means the permission for everyone: any user who logs into the web server, which naturally includes the user the php script runs as.

If you do a
#chmod 777 a_directory
The first 7 means the user can read, execute, write to a_directory.
The second 7 means the user group can read, execute, write to a_directory.
The third 7 means anyone can do these tasks.

So a 777 permission will certainly let scripts write to the directory, but with less security.

chown is another Unix command, to change the ownership of a file/directory.

#chown myusername.mygroupname a_directory
will change the directory's owner to 'myusername', and make the directory belong to the 'mygroupname' group.

All of the above refers to Unix/Linux usage; Windows probably uses some mouse clicks, but the essence should be the same regarding user/group and permissions.
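The digit arithmetic can be verified with Python's octal file modes (a throwaway sketch on a temporary directory, not on the real forum tree):

```python
import os
import stat
import tempfile

# read(4) + write(2) + execute(1) = 7, so 775 = rwx user, rwx group, r-x world.
assert 4 + 2 + 1 == 7

d = tempfile.mkdtemp()        # temporary stand-in for the 'archive' directory
os.chmod(d, 0o775)            # same effect as: chmod 775 archive
mode = stat.S_IMODE(os.stat(d).st_mode)
print(oct(mode))              # 0o775 on a typical Unix filesystem
os.rmdir(d)
```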

Now that I have spent the time and effort to write this, I hope the people who ask questions will also take the time and effort to write their questions.:)

flaregun
19 Aug 2005, 04:26
Which version of vB do you use? The line numbers do not match. Otherwise, have you modified class_core.php file? perhaps due to the installation of another hack?

VB3.5 rc2, no mods to that file :(

Brandon Sheley
19 Aug 2005, 04:47
chmod and chown are Unix commands. ... Now that I have spent time and effort to write, I hope the people who ask questions can also take the time and effort to write questions.:)

I wouldn't go as far as to say I'm a newbie...
I know how to chmod;
as you noticed, my question is where the file is that I'm chmoding...

You said something about shell access? What does that mean? I'm my own hosting reseller; I have full access to my host.
If you would be so kind as to tell me where this file is, that's all I ask :)
btw its vb 3.5.0 rc2
the newest one as of this post

thank you,
-LM

lierduh
19 Aug 2005, 06:02
i wouldn't go as far as to say im a newbie...
i know how to chmod,
as you notice my question is where is the file if i'm chmoding...

you said something about shell access ? whats that mean, i'm my own hosting reseller, i have full access to my host.
If you woul be so kind to tell me where this file is,, thats all i ask :)
btw its vb 3.5.0 rc2
the newest one as of this post

thank you,
-LM

I spent all that time, and you probably didn't even read it?:(

You need to change the directory permission so that the script can write
to the base and archive directory.

Brandon Sheley
19 Aug 2005, 06:52
the public_html folder ?

Brandon Sheley
22 Aug 2005, 00:45
the public_html folder ?


so.....
do I assume that this is correct? To chmod the forum directory?
thx for the quick help :rolleyes:

turkforum
25 Aug 2005, 02:38
Thank you very much lierduh, it is a really great vB3.5 extension.
I will install it as soon as I can, but I have a question for you.

To keep my open threads in order, I had to have many subforums on my board -- over 1 million posts. There are no threads under the main forums; all threads are placed in the subforums. Even though my archive is PR 4 (main forum PR 5), Google just doesn't get into those subforums and is not indexing my threads. Please correct me if I am wrong: with this extension, can we solve this problem?

I use Vb 3.5 rc2
Linux rhel 4.0
my base directory is home/account/Public_html/
And archive at the public_html/archive/

Best regards..

lierduh
25 Aug 2005, 04:52
Yes, you will definitely have more pages indexed by Google by using sitemaps.

In my experience, if a web site's structure is too complicated, Google hesitates to go deeper. Sitemap files basically advise Google of the URLs you want it to crawl. Although Google claims their crawlers do not base their crawling habits entirely on sitemaps, in reality they use sitemaps extensively, crawling the URLs with higher "priority" first. My sitemaps give the forum list a higher priority, just after the home archive page, followed by the thread pages.

My script will only write one small sitemap index file to the base vBulletin directory (g_sitemap.xml); all the sitemap files are written to the 'archive' directory (sitemap_11.gz, sitemap_15.gz, ...).

Thank you very much lierduh, it is really great vB3.5 Extensions..
I will install it as soon as i can but i will have a question for you..

To keep my opened treads in order,I had to have so many subforums on my board, over then 1 million post..there is even no treads opened under main forums but all treads have placed in subforums of the forums..Even though my archive is pr 4 (main forum pr5), google just doesn't get in to those subforums and not indexing my treads..Pls, correct me if i am wrong..With this extension, can we solve this problem?

I use Vb 3.5 rc2
Linux rhel 4.0
my base directory is home/account/Public_html/
And archive at the public_html/archive/

On my base directory I see files like
core.1000, with sizes around 22,000KB.
I have maybe more than 30 of these.
But under archive I can not see a map file.
Still looking at the issue on my side.

My apache runs as nobody, and both directories are 777 for nobody.

Best regards..

turkforum
27 Aug 2005, 00:05
When I check your archive page I see that the threads are shown as

/archive/index.php/f-10.html

But mine is still

archive/index.php?f-372.html

How can I change the question mark to a /?

Where did I make a mistake? Please advise.

lierduh
27 Aug 2005, 00:46
When I check your arcive page i see that the threads are shown as

/archive/index.php/f-10.html

But mine is still

archive/index.php?f-372.html

How can i change the Questionmark with the /

Where did i make a mistake please advice

I am glad you mentioned this. You need to turn on Apache's AcceptPathInfo.
http://httpd.apache.org/docs/2.0/mod/core.html#acceptpathinfo

I will add this to the FAQ.
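For reference, the directive itself is a one-liner. If your host allows per-directory overrides (AllowOverride FileInfo), it can go in an .htaccess file in the archive directory rather than the main server config -- a sketch, assuming Apache 2.0+ and that your host permits overrides:

```
AcceptPathInfo On
```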

turkforum
27 Aug 2005, 09:42
I am glad you mention this. You need to turn on Apache's AcceptPathInfo.
http://httpd.apache.org/docs/2.0/mod/core.html#acceptpathinfo

I will add this to the FAQ.

Will not having this feature affect the Google indexing and caching results?
If yes -- you mentioned it was for Apache 2.0 -- is there any other way to make this happen?

thnx

Yorixz
03 Sep 2005, 18:41
This mod looked very interesting; the install went fine, but now when I try to run it I get these errors:
Warning: fopen(/home/ftpusers/otfans/html/forums/g_sitemap.xml) [function.fopen]: failed to open stream: Permission denied in /archive/forums_sitemap.php on line 245

Warning: fwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 247

Fatal error: Call to a member function query() on a non-object in /home/ftpusers/otfans/html/forums/archive/forums_sitemap.php on line 98

Fatal error: Call to a member function unlock_tables() on a non-object in /home/ftpusers/otfans/html/forums/includes/functions.php on line 4240

What am I doing wrong? I'll be very glad to hear it ;)

David_R
03 Sep 2005, 18:51
Hi,
does this hack support vB 3.x versions?
does this hack have a feature to capture the
title,
keywords, and
description as well?

thanks

Yorixz
03 Sep 2005, 19:03
Hi,
does this hack supports vb 3.x versions ?
does this hack has a feature to capture
title
Keywords
Description as well ?

thanks

It doesn't support vB 3.x, that's why it's in the vB 3.5 category ;)

David_R
03 Sep 2005, 21:36
It doesn't support vB 3.x, that's why it's in the vB 3.5 category ;)

hi,
something similar exists in the vB 3.5 extensions, but it supports 3.x as well. Can you compare the features of both hacks?

Yorixz
04 Sep 2005, 09:46
hi,
something similar exists in vb 3.5 extensions, but does supports 3.x as well, can you compare features of both these hacks ?

I'm afraid I can't, since I can't get this hack working on my forum right now. Once it's running here, I'll post my experience and such for you.

vauge
07 Sep 2005, 11:12
I really really like this idea, but is the below a concern? I do not want to do anything negative for my forums.

"Don't employ cloaking or sneaky redirects." -- http://www.google.com/webmasters/guidelines.html

lierduh
08 Sep 2005, 00:11
I really really like this idea, but is the below a concern? I do not want to do negative thing for my forums.

http://www.google.com/webmasters/guidelines.html

A 301 redirect is Google's preferred method. Google penalizes web sites with duplicated content, so if you have two URLs showing the same content, Google prefers that you redirect from one URL to the other. Having both will attract a penalty.

What we do here is not sneaky. We have the actual content; we just want Google to show one version of it. We do not want Google to give us a higher page rank than the pages are actually worth -- we just want Google to index the actual content instead of looping through endless internal links.

I moved my forums to a new domain a few weeks ago (just before I released this hack). Google has already indexed 150,000 pages; without a sitemap, Yahoo has only indexed just over 20,000 pages.

lierduh
08 Sep 2005, 00:14
This mod looked very interesting, install went fine but now when I try to run it I get this errors:
Warning: fopen(/home/ftpusers/otfans/html/forums/g_sitemap.xml) [function.fopen]: failed to open stream: Permission denied in /archive/forums_sitemap.php on line 245


What am I doing wrong? I'll be very glad to hear it ;)

As the error suggests, the permission is denied in the '/home/ftpusers/otfans/html/forums/' directory. That is the base vB directory, which I described very clearly as "the same directory where you find showthread.php etc.".

People keep asking what directory name to use. I do not know, because that can be anything. In your case it is 'forums'; for others it may be 'public_html'...
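A quick way to see whether the web server may write there is to list the directories' modes. A minimal sketch, using a throwaway directory name ('forums') standing in for your real vB base directory:

```shell
# create a stand-in for the real layout (use your actual vB base
# directory name, e.g. 'forums' or 'public_html', when checking for real)
mkdir -p forums/archive
# -d lists the directory itself rather than its contents; the leading
# 'drwx...' string shows who is allowed to write (w) into it
ls -ld forums forums/archive
```

The script runs as the web-server user, so the write (w) bit must apply to that user, its group, or everyone.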

KarateKid
12 Sep 2005, 05:49
hm, with vb 3.5 rc3, I get the following errors when accessing the forums_sitemap.php:


Warning: array_keys(): The first argument should be an array in /home/htdocs/web0/html/forum/includes/class_core.php on line 1438

Warning: Invalid argument supplied for foreach() in /home/htdocs/web0/html/forum/includes/class_core.php on line 1438

Warning: array_keys(): The first argument should be an array in /home/htdocs/web0/html/forum/includes/class_core.php on line 1453

Warning: Invalid argument supplied for foreach() in /home/htdocs/web0/html/forum/includes/class_core.php on line 1453



Unable to add cookies, header already sent.
File: /home/htdocs/web0/html/forum/includes/class_core.php
Line: 1438


Any ideas? :confused:

lierduh
13 Sep 2005, 02:32
No idea, I have upgraded to RC3. The hack works without any further modification.

If you have changed the php files, make sure your upgrade does not copy over them. I did not copy the new files to the 'archive' directory.

buro9
14 Sep 2005, 15:50
Hey lierduh,

Thanks for your hacks over the years, always fine jobs :)

I'm another one wanting instructions for the archive/index.php and archive/global.php

I've applied the rest of the instructions, and all appears to be working fine. And I do get the concept of removing the PDA crud, and redirecting humans out of the archive... but I'm a little lost with the diff output you've supplied... a more primitive ><+/- lines would've confused me less ;)

If you do ever find time to update the instructions, it will be very appreciated by quite a few of us. And I realise how much of a pain that is, as I'm supposed to be porting my own hacks and really don't like the idea much.

Cheers

DavidK

lierduh
14 Sep 2005, 23:26
Hey lierduh,

Thanks for your hacks over the years, always fine jobs :)

I'm another one wanting instructions for the archive/index.php and archive/global.php

I've applied the rest of the instructions, and all appears to be working fine. And I do get the concept of removing the PDA crud, and redirecting humans out of the archive... but I'm a little lost with the diff output you've supplied... a more primitive ><+/- lines would've confused me less ;)

If you do ever find time to update the instructions, it will be very appreciated by quite a few of us. And I realise how much of a pain that is, as I'm supposed to be porting my own hacks and really don't like the idea much.

Cheers

DavidK

Hello DavidK,

The reason it confused you is that my diff was based on the RC2 and modified RC2 files. I presume you now have the RC3 files.

I have attached the diff between the RC3 and modified RC2 files (this will confuse everyone else, but not you:)). Otherwise, if you still have the RC2 files, then the coloured diff will make a lot of sense.:)

buro9
15 Sep 2005, 07:32
Hello DavidK,

The reason it confused you is that my diff was based on the RC2 and modified RC2 files. I presume you now have the RC3 files.

I have attached the diff between the RC3 and modified RC2 files (this will confuse everyone else, but not you:)). Otherwise, if you still have the RC2 files, then the coloured diff will make a lot of sense.:)

Much better :D Thanks :)

And yes... it will confuse everyone now ;)

Brandon Sheley
15 Sep 2005, 09:37
i can't seem to get the files to be created :(

from the instructions, which were just about too much info for me:
I uploaded forums_sitemap.php to the archive directory, then chmodded the archive folder to 775,
then made the scheduled task, ran it, and the files should be made.
What part am I missing? Thank you.

At RC3 now.

Yorixz
18 Sep 2005, 12:01
Warning: array_keys() [function.array-keys]: The first argument should be an array in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1438

Warning: Invalid argument supplied for foreach() in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1438

Warning: array_keys() [function.array-keys]: The first argument should be an array in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1453

Warning: Invalid argument supplied for foreach() in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1453

I'm also still having those errors, weird =/ lierduh, could it be that you've tested it with PHP4 rather than PHP5?

lierduh
18 Sep 2005, 23:46
Warning: array_keys() [function.array-keys]: The first argument should be an array in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1438

Warning: Invalid argument supplied for foreach() in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1438

Warning: array_keys() [function.array-keys]: The first argument should be an array in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1453

Warning: Invalid argument supplied for foreach() in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1453

I'm also still having those errors, weird =/ lierduh, could it be that you've tested it with PHP4 rather than PHP5?

I do not have php5 to test.

Instead of calling the script directly, have you tried using the Scheduled Task's "Run Now" button?

thenetbox
19 Sep 2005, 00:32
thank you very much! I just started trying to do this myself but found this :D yay!

Yorixz
20 Sep 2005, 17:53
I do not have php5 to test.

Instead of calling the script directly, have you tried using the Scheduled Task's "Run Now" button?

Yes, that results in Warning: gzopen(/home/ftpusers/otfans/html/forums/archive/sitemap_11.gz) [function.gzopen]: failed to open stream: Permission denied in /archive/forums_sitemap.php on line 132

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 72

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 87

for like a thousand times.

Weird thing is that I'm 100% sure that I chmodded everything correctly. (It's on a debian host, if that is relevant)

lierduh
21 Sep 2005, 00:10
Yes, that results into Warning: gzopen(/home/ftpusers/otfans/html/forums/archive/sitemap_11.gz) [function.gzopen]: failed to open stream: Permission denied in /archive/forums_sitemap.php on line 132
for like a thousand times.

Weird thing is that I'm 100% sure that I chmodded everything correctly. (It's on a debian host, if that is relevant)

That means the script can't write to the archive directory. What is the permission for this directory?

Yorixz
21 Sep 2005, 06:04
That means the script can't write to the archive directory. What is the permission for this directory?

Right now it's 0777 and it's working. I thought I had already changed it some days ago; it was 0775 (which should be enough as far as I know).

Thanks for your support ;)

jribz
24 Sep 2005, 21:34
OK, I have this installed and it seems to be working as described. When viewing who's online I can see the search engines looking at threads with URLs similar to the following.

/archive/index.php/t-6044.html

When clicked by a human user they are redirected to

/showthread.php?t=6044

So I can only assume this works since all the spiders on the board are seeing the archived version, and when users click they are taken to the full version.

I do have a couple of questions, however, since I am not too familiar with Google Sitemaps. Does the script automatically upload the sitemap to Google without any further action aside from making the Scheduled Task? I have made the task in the manager and run it (every day at 1AM), and it has created the files (the [xml] in the forum root and the forum-specific [gz] files in the archive folder).

[upon further thinking, would I be correct in saying I need to let google know about the xml file in the root of the site?]

What is the effect on other search engines? I see Yahoo, MSN, Ask, and others viewing similar archives, so I assume the effect is similar to what is happening with Google, but they are not getting a map.

Last question, does this basically mean that other SEO hacks are not required, since the spiders will never see the rewritten urls anyhow?

A lot of assumptions up there. :ermm:

Oh, and one last thing: I do use mod_rewrite on my server for many sites and have had no issues, but the command you say to enter to resolve the index.php issue seems to bog the server, making any URLs that point directly to it, as in /index.php, not load. I suppose this could be a conflict within my .htaccess file, but I'm not too certain where to start looking. (However, I did try it with only the code you provided (and RewriteEngine on) and have the same problem.)

Thanks for your time and the hack.

lierduh
25 Sep 2005, 01:16
Each time the script is run.

1) It re-generates all the sitemaps. Makes sense because you have more threads/posts now.

2) It notifies Google about new sitemaps being available. You will notice Google fetches these files soon afterwards.

If you have the scheduled task logged, the end of the log is the response sent by Google. It should say:

======================
Sitemap Notification Received

Your Sitemap has been successfully added to our list of Sitemaps to crawl. If this is the first time you are notifying Google about this Sitemap, please add it via http://www.google.com/webmasters/sitemaps so you can track its status. Please note that we do not add all submitted URLs to our index, and we cannot make any predictions or guarantees about when or if they will appear.
======================

One thing to remember: under your Google sitemap account, the 'last submitted' date does not reflect the auto ping/submit. It only logs the manual submit you do by pushing the button in your Google sitemap account.

Other search engines do not accept sitemaps as far as I know, at least not using Google's sitemap format. The redirects, however, work for all the major search engines, which I believe benefits the indexing.

I do not recommend using SEO URL rewriting, at least for existing sites. The chances are Google has already indexed part of your forums using links like /showthread.php?t=12345. Now if you rewrite all the URLs, Google will have two copies of the same contents for each thread (one with the traditional URL, one from your new rewritten URL). This will lead to Google penalizing your site ranking.

Some smarter SEO schemes redirect your old URL to the new one and do not suffer this, but they become very complicated add-ons. They may break every time a major vB version is released, so I elect not to use such a scheme. For the record, I used URL-rewrite SEO back in the vB2 era. In my .htaccess, I still need to redirect my old rewritten vB2 URLs for fear of Google penalizing my site.

Basically, the vB archive is very static; it was designed for SEO in the first place anyway. Think about how many clickable links a normal showthread page brings you; it becomes a mess for search engines no matter how smart your SEO is.

For index.php redirect, my working version is:

RewriteEngine on

#...

RewriteCond %{QUERY_STRING} ^$
RewriteRule ^index.php$ / [R=301,L]

If it does not work for you, I would check the http logs. Failing that, log your rewrites! (You need to do this in your httpd.conf; consult the Apache manual for log level etc.:))
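For reference, rewrite logging in Apache 1.3/2.0/2.2 is enabled with the two directives below. This is a sketch for httpd.conf (the log path is hypothetical, and in Apache 2.4 these directives were replaced by LogLevel rewrite:traceN):

```apache
# httpd.conf (these directives are not allowed in .htaccess)
RewriteEngine on
# write every rewrite decision to this (hypothetical) file;
# level 3 is verbose and slows the server, so set it back to 0
# once the rule is debugged
RewriteLog "/var/log/apache/rewrite.log"
RewriteLogLevel 3
```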

jribz
25 Sep 2005, 01:40
Thanks for the reply, that clears up a lot... I had to verify site ownership via Google; the logs for the cron showed exactly that.

One thing I notice, however, while looking now, is that the Google spider is viewing a few regular threads, while the Google AdSense spider is viewing the archive; also viewing the archive are msnbot, Yahoo! Slurp and Ask Jeeves.

Wonder why google is seeing a regular thread now.

Going to look into the htaccess in a bit.

lierduh
25 Sep 2005, 02:12
Thanks for the reply, that clears up a lot... I had to verify site ownership via Google; the logs for the cron showed exactly that.

One thing I notice, however, while looking now, is that the Google spider is viewing a few regular threads, while the Google AdSense spider is viewing the archive; also viewing the archive are msnbot, Yahoo! Slurp and Ask Jeeves.

Wonder why google is seeing a regular thread now.

Going to look into the htaccess in a bit.

vB detects search engines by the following string:
google|msnbot|yahoo! slurp (in the init.php file)

Any user agent matching that will be detected as a search engine.
So check your http access log and see what user agent 'Google' uses. It should say "Googlebot".
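The same check vB applies can be reproduced against an access log with standard shell tools. A sketch, using a made-up log line in Apache 'combined' format (the IP, path, and byte count are hypothetical):

```shell
# one sample access-log line in Apache 'combined' format
line='66.249.66.1 - - [25/Sep/2005:02:00:00 +0000] "GET /archive/index.php/t-6044.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
# the user agent is the sixth double-quote-delimited field
ua=$(printf '%s\n' "$line" | awk -F'"' '{print $6}')
echo "$ua"
# the same case-insensitive substring match vB's init.php performs
case "$(printf '%s' "$ua" | tr 'A-Z' 'a-z')" in
  *google*|*msnbot*|*'yahoo! slurp'*) echo "search engine" ;;
  *) echo "human" ;;
esac
```

Running the pattern over your real log (e.g. with grep -i googlebot) shows exactly which Google user agents are hitting you.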

jribz
25 Sep 2005, 02:38
vB detects search engines by the following string:
google|msnbot|yahoo! slurp (in the init.php file)

Any user agent matching that will be detected as a search engine.
So check your http access log and see what user agent 'Google' uses. It should say "Googlebot".

Looking at the logs I have 2 versions of Google coming up...

Googlebot/2.1; +http://www.google.com/bot.html)"
Google/2.1"

The first one looks legit, but why is the second not sending an origin? They both have the same IP.

yessir
03 Oct 2005, 17:11
/me installed

Thank you!

thenetbox
03 Oct 2005, 21:29
I'm not very clear about how to complete step 3. It says:


This step involves changing code. There are a number of places you need
to change. I have included the diff result between the final files and the
original files. The two files involved are archive/global.php and archive/index.php.

Where do I find the instructions to make the changes? :)

DefenceTalk
04 Oct 2005, 16:48
Does this work with the latest version? Anyone?

Thanks

thenetbox
04 Oct 2005, 17:06
Does this work with the latest version? Anyone?

Thanks

Steps 1 and 2 worked great for me :D . I don't know how to do step 3 (optional)

jribz
04 Oct 2005, 20:02
Steps 1 and 2 worked great for me :D . I don't know how to do step 3 (optional)

Step 3 is actually pretty simple, but not with the instructions in the download; look at the attachments in this post here (http://www.vbulletin.org/forum/showpost.php?p=773324&postcount=49).

The lines preceded by << mean remove, or comment out, and replace with the following lines preceded with >>, or add after.

VaaKo
05 Oct 2005, 11:14
will this hack bring spiders to my forum?

dutchbb
05 Oct 2005, 13:23
OK, I will give this a try. We don't have many pages listed (642) and most of them are from pages with no real content (memberlist and such).

So I will install this and let you know what the results are in about 3-4 months.

Ty.

VaaKo
05 Oct 2005, 14:48
this is a very confusing hack :confused:
it's not working I guess; I ran the task and it did create the sitemap files from 36 to 81,
but after that it gives me loads of errors, saying it couldn't create g_sitemap.xml and something like that,
so I CHMODed the /html/ folder and tried again; this time it gave me an error, dunno what it is, so I refreshed

is it working or not?

VaaKo
05 Oct 2005, 14:50
this is the error:

Warning: fopen(http://www.google.com/webmasters/sitemaps/ping?sitemap=http%3A%2F%2Fwww.oneforum.org%2Fg_sitemap.xml): failed to open stream: Connection timed out in /archive/forums_sitemap.php on line 264

Warning: feof(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 268

Warning: fread(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 27

....

dutchbb
05 Oct 2005, 15:55
Hmz not easy....

I normally assign the permission this way:

#chown apache.MYUSER_GROUP archive
#chmod 775 archive

MYUSER_GROUP is the user group my login belongs to. #ls -l will show that.
I understand chmod 755 (it was actually already set to 777), but what do you mean by '#chown apache.MYUSER_GROUP archive' and 'MYUSER_GROUP is the user group my login belongs to. #ls -l will show that'? That's Chinese to me!

Please explain, thank you

lierduh
05 Oct 2005, 21:52
Hmz not easy....


I understand chmod 755 (it was actually already set to 777), but what do you mean by '#chown apache.MYUSER_GROUP archive' and 'MYUSER_GROUP is the user group my login belongs to. #ls -l will show that'? That's Chinese to me!

Please explain, thank you

The problem is no one wants to know/learn about the very basics of directory/file permissions.

If you have 777, it means everyone can write to it, so you won't need to bother with 'chown'. The example I provided was for someone who knows a bit more and wants the more secure way. I will rewrite the instructions at the next release and have one set of simple instructions for newbies. In the meantime, if you are still interested in directory permissions, please read:

http://www.vbulletin.org/forum/showpost.php?p=759106&postcount=27

VaaKo
05 Oct 2005, 22:14
what about my error?

lierduh
05 Oct 2005, 22:32
what about my error?

It means your web server could not connect to Google on port 80. It is possible Google was down at the time, or your web server's firewall prevents the script from doing so. I just tried manually:

http://www.google.com/webmasters/sitemaps/ping?sitemap=http%3A%2F%2Fwww.oneforum.org%2Fg_sitemap.xml

It returned:
=========
Sitemap Notification Received

Your Sitemap has been successfully added to our list of Sitemaps to crawl. If this is the first time you are notifying Google about this Sitemap, please add it via http://www.google.com/webmasters/sitemaps so you can track its status. Please note that we do not add all submitted URLs to our index, and we cannot make any predictions or guarantees about when or if they will appear.
===============

You can always comment out those lines in the script and manually submit the sitemaps.
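The ping itself is just an HTTP GET with the sitemap URL percent-encoded into the query string. A sketch of building that URL by hand (the minimal encoding below only escapes '%', ':' and '/', which is enough for a plain http URL like this one):

```shell
SITEMAP='http://www.oneforum.org/g_sitemap.xml'
# percent-encode the characters that matter here: '%' first,
# then ':' and '/'
ENCODED=$(printf '%s' "$SITEMAP" | sed -e 's|%|%25|g' -e 's|:|%3A|g' -e 's|/|%2F|g')
PING="http://www.google.com/webmasters/sitemaps/ping?sitemap=$ENCODED"
echo "$PING"
# fetch it manually instead of letting the script do it, e.g.:
#   wget -q -O - "$PING"
```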

buro9
06 Oct 2005, 05:28
To those unsure about this hack, I would say persevere.

My number of Google spiders hasn't increased dramatically, but they're being far more efficient.

A month ago the number of pages I had in Google was only 66,000, now I have over 846,000 pages indexed: http://www.google.co.uk/search?q=site%3Awww.bowlie.com

It really is worth it, although the code edits in archive/index.php can be a bugger to get your head around at first.

dutchbb
06 Oct 2005, 11:33
The problem is no one wants to know/learn about the very basics of directory/file permissions.
I do. That's why I replied:
Please explain, thank you
The more I can learn, the better :)

If you have 777, it means everyone can write to it, so you won't need to bother with 'chown'. The example I provided was for someone who knows a bit more and wants the more secure way. I will rewrite the instructions at the next release and have one set of simple instructions for newbies. In the meantime, if you are still interested in directory permissions, please read:

http://www.vbulletin.org/forum/showpost.php?p=759106&postcount=27
thank you, I will look into it and try to set up the hack

dutchbb
06 Oct 2005, 12:51
BTW I created a Google sitemap account after the task. Is this needed, or do these need to be joined or something? And how does the vB task send the sitemaps if the Google account wasn't even set up?

Also:
Check your web log and see if the Search Engine visits are being redirected.
You should see 301 (Permanent Redirect) for the actual thread visit and
then 200 (ok) for the archive pages.
What weblog is there in vBulletin? If you mean currently active users: I don't see it there.


A lot of newbie questions again :D

PixelFx
06 Oct 2005, 14:44
Ok I know chmod, and set my archive/internal files to 777 .. I still get



Warning: fwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 247

Warning: fwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 253

Warning: fclose(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 254



I also did my root as 750 ..

my directory setup is url/forum/index.php for forums; I changed your robots text to disallow: /forum/files/ etc

and I still get error above, I also setup my .htaccess etc,

I'm running vbulletin v3.5.0 gold.

any suggestions?

VaaKo
06 Oct 2005, 15:33
Ok I know chmod, and set my archive/internal files to 777 .. I still get



I also did my root as 750 ..

my directory setup is url/forum/index.php for forums; I changed your robots text to disallow: /forum/files/ etc

and I still get error above, I also setup my .htaccess etc,

I'm running vbulletin v3.5.0 gold.

any suggestions?

i'm getting the same error!

PixelFx
06 Oct 2005, 19:16
I've tried to do the compare html, but it's much harder than just making a txt file saying remove here, add here, place under this, place over that, etc. As far as I know this hack is incomplete or just not designed for 3.5.0... although I'm sure it's a good hack for RC3 users; on Gold I can't get it to work.

lierduh
06 Oct 2005, 22:16
Ok I know chmod, and set my archive/internal files to 777 .. I also did my root as 750 ..
any suggestions?

We are really having fun with permissions, aren't we?
Your permissions are wrong, and by the sound of it, you do not fully understand chmod.:) The good thing is at least you explained what you have done instead of saying: "I have the errors, tell me how to fix them".

Please list your directory permissions here (two #ls -l outputs). Let me guess: you changed the permissions for the files, not the DIRECTORIES as my instructions said. You said you changed your root? (I highly doubt it :)) Do you mean '/' or the base directory of your vB install? When you have 750, what is the ownership of the directory? User ownership and group ownership? It is very weird to set a web directory to 750; usually you would need at least 751, or 755.

=============
Instruction I will include in the next readme file:

The script will need to write files to two directories.

1) The base vB directory
2) the archive directory.

You will need to change the DIRECTORY permissions for these two directories.

Let's presume your directory structure is:

~public_html
('~' means it is under your user home directory; the actual directory should be something like: /home/your_isp_user_name/public_html)
~public_html/showthread.php
...
~public_html/archive/
~public_html/archive/index.php
...

You need to do:

1) change the permission for the vB base directory. So
#chmod 777 public_html
(or #chmod 777 ~public_html if you are not already under your home directory)

2) change the permission for the archive directory. Do
#chmod 777 public_html/archive
(or #chmod 777 ~public_html/archive
or #chmod 777 archive
if you use some kind of web based control panel)

=============
I really wish someone would write better instructions for changing directory permissions for me. 50% of the problems using this hack are permission related.
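The two chmod steps above can be sketched end-to-end like this, using a scratch copy of the hypothetical public_html layout (substitute your real directory names):

```shell
# scratch stand-in for the layout described above
mkdir -p public_html/archive
# open write access on the two directories the script writes into:
# the vB base directory and the archive directory
chmod 777 public_html public_html/archive
# verify: both lines should begin with drwxrwxrwx
ls -ld public_html public_html/archive
```

777 is the simple, newbie-safe setting; the chown-based 775 approach from the earlier post is tighter, but requires knowing which user the web server runs as.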

lierduh
06 Oct 2005, 22:38
What weblog is there in vbulletin, if you mean currently active users: i don't see it there.


I meant the web server log. Please ask your host about it: ask them where you can access the web log for the web site they host for you.

PixelFx
07 Oct 2005, 08:12
I have a cPanel system and was using WS_FTP to chmod the directories, as per your instructions. By root I meant public_html; it's a cPanel permission thing, as the base permission. Anyway, I did what you asked in your instructions (the readme, not what's above), but for some reason it was griefing. I've never had an issue with chmod until trying this script; I figured it was an RC3 issue, not vB 3.5.0. I'll give it a shot again with your info above :)

In the instructions above, the only thing I missed was the permissions on /forum/, from how you explained it. I'll try again and post results.

My setup is public_html/forum/archive, so you're saying, in your setup in relation to mine, to do /forum/ = 777 and /archive/ = 777, which is what I'll try next; currently I'm in sleep-typing mode.

dutchbb
07 Oct 2005, 11:55
To those unsure about this hack, I would say persevere.

My number of Google spiders hasn't increased dramatically, but they're being far more efficient.

A month ago the number of pages I had in Google was only 66,000, now I have over 846,000 pages indexed: http://www.google.co.uk/search?q=site%3Awww.bowlie.com

It really is worth it, although the code edits in archive/index.php can be a bugger to get your head around at first.
I second that!

It's been a few days now, guess what

http://www.google.be/search?hl=nl&q=site%3Awww.dutchbodybuilding.com&btnG=Google+zoeken&meta=

9.640 listed already, coming from 642 just 2 days ago!!!!!!

Man this hack rocks, you deserve to win hack of the month + hack of the year for all I care :D

ps: just need a better explanation for step 3, I couldn't do that one.

buro9
07 Oct 2005, 13:28
I have a problem though... you're making a sitemap gz for each forum, well, some of my forums are big:

Sitemap Errors

1. Too many URLs with Sitemap http://www.bowlie.com/forum/archive/sitemap_4.gz
* Your Sitemap contains too many URLs. Please create multiple Sitemaps with up to 50000 URLs each and submit all Sitemaps.


Could you add spanning?

So we'd start with:
http://www.bowlie.com/forum/archive/sitemap_4_1.gz

And when we passed an arbitrary value (make it a setting in the file in case Google change it later) we would move onto:
http://www.bowlie.com/forum/archive/sitemap_4_2.gz
http://www.bowlie.com/forum/archive/sitemap_4_3.gz
through
http://www.bowlie.com/forum/archive/sitemap_4_9999999.gz
etc

As it stands, Google is now refusing to pay attention to mine, as the one that exceeds the limit basically causes the whole thing to error.

buro9
07 Oct 2005, 13:30
I have a problem though... you're making a sitemap gz for each forum, well, some of my forums are big:


Could you add spanning?

So we'd start with:
http://www.bowlie.com/forum/archive/sitemap_4_1.gz

And when we passed an arbitrary value (make it a setting in the file in case Google change it later) we would move onto:
http://www.bowlie.com/forum/archive/sitemap_4_2.gz
http://www.bowlie.com/forum/archive/sitemap_4_3.gz
through
http://www.bowlie.com/forum/archive/sitemap_4_9999999.gz
etc

As it stands, Google is now refusing to pay attention to mine, as the one that exceeds the limit basically causes the whole thing to error.

Oh, and mine contains 50,214 URLs ;) That one forum :D

avexzero
07 Oct 2005, 18:06
Does this work on 3.5.0 Gold?

Unreal Player
07 Oct 2005, 18:20
What are we looking for when this script runs? I see a lot of zip files or whatever in the archive section but nothing in the other sections... is this right?

dutchbb
07 Oct 2005, 18:29
Does this work on 3.5.0 Gold?
yep up and running

lierduh
08 Oct 2005, 00:58
This is something I had in mind to implement.:) So the next version will certainly contain this feature.

I think I should be able to push a new version out this weekend, including better documentation for step 3. I have been waiting for vB Gold.

I have a problem though... you're making a sitemap gz for each forum, well, some of my forums are big:


Could you add spanning?

So we'd start with:
http://www.bowlie.com/forum/archive/sitemap_4_1.gz

And when we passed an arbitrary value (make it a setting in the file in case Google change it later) we would move onto:
http://www.bowlie.com/forum/archive/sitemap_4_2.gz
http://www.bowlie.com/forum/archive/sitemap_4_3.gz
through
http://www.bowlie.com/forum/archive/sitemap_4_9999999.gz
etc

As it stands, Google is now refusing to pay attention to mine, as the one that exceeds the limit basically causes the whole thing to error.

Unreal Player
08 Oct 2005, 02:23
Is it normal for my site to still be PENDING after 6 hours at Google? And how does my site know what account I'm using to resubmit it automatically?

dutchbb
08 Oct 2005, 19:06
This is something I had in mind to implement.:) So the next version will certainly contain this feature.

I think I should be able to push a new version out this weekend, including better documentation for step 3. I have been waiting for vB Gold.
HI

Google Spider still only looks at the normal threads in who's online?

Only the Yahoo! Slurp Spider looks at the archives?

falter
09 Oct 2005, 00:06
Hi there,
I'm very happy with the archive redirection. That's pretty slick stuff, and it seems to be working great. The sitemap submission to Google hasn't really taken effect quite yet, but it's only been 36 hours since submission (I imagine that these things can take some time). Yahoo is going bonkers on us, though!

Anyway, I've submitted a bug/feature request to vbulletin as a result of installing this mod. You can see it here:
http://www.vbulletin.com/forum/bugs35.php?do=view&bugid=1576

Specifically, it has to do with the way in which $show['search_engine'] is defined, which plays quite an important role in this particular mod.

Looking at the definition of $show['search_engine'] seemed important as I, like others, have noticed that sometimes googlebot doesn't want to get redirected from showthread to the archives.

(as seen in /includes/init.php)
$show['search_engine'] = ($vbulletin->superglobal_size['_COOKIE'] == 0 AND preg_match("#(google|msnbot|yahoo! slurp)#si", $_SERVER['HTTP_USER_AGENT']));

As you can see, vBulletin assumes that no search engine spider will ever use a cookie. I found the redirection to be more effective after removing the check for the absence of a cookie, which resulted in this:
$show['search_engine'] = (true AND preg_match("#(google|msnbot|yahoo! slurp)#si", $_SERVER['HTTP_USER_AGENT']));

Now, as you can see in my bug report, I'm not terribly satisfied with the way $show['search_engine'] is defined in the first place, but making the mod as seen above helped me out, some.

Hope this helps some of you guys...

~mike.

falter
09 Oct 2005, 00:12
HI

Google Spider still only looks at the normal threads in who's online?

Only the Yahoo! Slurp Spider looks at the archives?
Triple_T,
Just for clarity's sake, I was having the same problem you are having. Try my mod (in the post above this one), and see if that helps.

~mike

dutchbb
09 Oct 2005, 03:00
Triple_T,
Just for clarity's sake, I was having the same problem you are having. Try my mod (in the post above this one), and see if that helps.

~mike
I looked right after, and 1 x Google was in the archives. After that it was still also in the threads.

I noticed Google is much less effective in comparison:
- only 1 spider most of the time (Yahoo 10 or more)
- Yahoo is now always in the archives, Googlebot almost always not
- Googlebot still goes to pages like printthread and member.php, and that even with a robots.txt disallowing that to happen.

The MSN bot has not gone further than index.php, so it looks like Yahoo is just a better bot?

Now I have 2 questions regarding robots.txt:

- I have one in both the site root and the vBulletin root; is this needed? If not, what is the correct place? (From what I have read it should be the site root.)

- Is the .php extension needed for disallowing files? Some say it's best not to include it; I have not seen a difference so far.

jdingman
09 Oct 2005, 04:34
Looks great so far. One question about mod_rewrite

using
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^index.php$ / [R=301,L]
that redirects if you're using forums.domain.com. What about if you're using domain.com/forums/? What mod_rewrite rule would you use for that redirect?

(not exactly for me because I can probably get it working, but anyone else that might need this as well.)

falter
09 Oct 2005, 04:51
Now I have 2 questions regarding robots.txt:

- I have one in both the site root and the vBulletin root. Is this needed? If not, what is the correct place? (From what I have read it should be the site root.)

- Is the .php extension needed for disallowing files? Some say it's best not to include it; I have not seen a difference so far.
your robots.txt should be accessible at the root of your domain (http://www.mydomain.com/robots.txt). this is the only place that spiders know to check.

if you're trying to explicitly define specific files (ex. /forums/showthread.php), then you should define that entry in your robots.txt file. there's no point in not putting the ".php" at the end (ex. /forums/showthread), it doesn't buy you anything. it can actually have a negative impact if your entries aren't defined well. say you're trying to tell search engines to ignore "/forum/s.php" (this is just hypothetical). if you were to just put "/forum/s" in your robots.txt, then, in addition to blocking "/forum/s.php", you'd be blocking "/forum/showthread.php", "/forum/search.php", "/forum/showgroups.php", anything else where the url starts with "/forum/s" .... as you can see, it's important to be as specific as possible, otherwise you risk shutting spiders out of huge chunks of your site.
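The prefix matching falter describes can be sketched as a toy model (Python, purely illustrative; real crawlers implement more of the robots.txt rules than this, and the function name is invented):

```python
# Toy model of robots.txt Disallow matching: an entry blocks every URL
# path that starts with the given string.
def blocked_by(disallow, path):
    return path.startswith(disallow)

# The short entry "/forum/s" blocks far more than just "/forum/s.php":
for p in ["/forum/s.php", "/forum/showthread.php", "/forum/search.php"]:
    print(p, blocked_by("/forum/s", p))   # all True

# Spelling out the full filename keeps the rule narrow:
print(blocked_by("/forum/s.php", "/forum/showthread.php"))  # False
```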

falter
09 Oct 2005, 04:59
I looked right after and one Googlebot was in the archives. After that it was still in the threads as well.

I noticed Google is much less effective in comparison:
- only 1 spider most of the time (Yahoo has 10 or more)
- Yahoo is now always in the archives, Googlebot almost always not
- Googlebot still goes to pages like printthread and member.php, and that even with a robots.txt disallowing it.
i've thought about it some more.
301 code just tells the bot that the link has permanently moved. it would take a second request from the spider to actually jump to the archives. if the spider is slow (as googlebot and msnbot typically are), i can see how it would appear as though googlebot was sitting in showthread, instead of being directed to the archive....
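That two-request point can be modelled with a toy fetch loop (Python purely as illustration; the URLs and page table are invented):

```python
# Toy model: a 301 response only hands the crawler a new Location;
# actually reaching the archive page costs a second request.
pages = {
    "/forum/showthread.php?t=42": (301, "/forum/archive/index.php/t-42.html"),
    "/forum/archive/index.php/t-42.html": (200, "archived thread content"),
}

def fetch(url, requests_made=0):
    status, payload = pages[url]
    requests_made += 1
    if status == 301:
        # payload is the Location header; the crawler must come back later.
        return fetch(payload, requests_made)
    return payload, requests_made

body, hops = fetch("/forum/showthread.php?t=42")
print(body, hops)  # archived thread content 2
```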

lierduh
09 Oct 2005, 06:42
I have a new version ready to be released. If anyone wants, you can download this and try out before I put together the package.

I still need to do the documentation for the modifications of index.php and global.php files.

lierduh
09 Oct 2005, 06:46
Looks great so far. One question about mod_rewrite

using that rewrite rule redirects if you're using forums.domain.com. What about if you're using domain.com/forums/? What mod_rewrite rule would you use for that redirect?

(not exactly for me because I can probably get it working, but anyone else that might need this as well.)

Without testing, I think
RewriteRule ^forums/index.php$ forums/ [R=301,L]

Should do.
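Combining that with the query-string condition from jdingman's original rule, an untested sketch for a board under /forums/ (placed in the site-root .htaccess; paths are assumed, adjust to your setup) would be:

```apache
RewriteEngine On
# Only redirect when there is no query string at all
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^forums/index\.php$ /forums/ [R=301,L]
```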

falter
09 Oct 2005, 07:24
I have a new version ready to be released. If anyone wants, you can download this and try out before I put together the package.

I still need to do the documentation for the modifications of index.php and global.php files.
I don't know if this is due to any mods I have (which I'm pretty light on), but when I run your script directly (not using cron), I get the following output:


Warning: array_keys(): The first argument should be an array in /path/to/my/stuff/forums/includes/class_core.php on line 1453

Warning: Invalid argument supplied for foreach() in /path/to/my/stuff/forums/includes/class_core.php on line 1453

Warning: array_keys(): The first argument should be an array in /path/to/my/stuff/forums/includes/class_core.php on line 1472

Warning: Invalid argument supplied for foreach() in /path/to/my/stuff/forums/includes/class_core.php on line 1472

Unable to add cookies, header already sent.
File: /path/to/my/stuff/forums/archive/forums_sitemap.php
Line: 1



Removing the "unset($_COOKIE);" from line 56 helps get the script to run, but, since my cookies are still there, all my private forums get sitemapped, too. so, I just moved down the stuff in the block above, and everything works.

So, I go from this:
if (function_exists('log_cron_action'))
{
    global $index_zp;
    global $debug_log;
    global $max_url;
    unset($vbulletin->userinfo);
    $vbulletin->userinfo['userid'] = 0;
}
else
{
    if ($run_by_vb_Scheduled_Task_only)
    {
        exit("Script can only be run by vB Scheduled Tasks. Set \$run_by_vb_Scheduled_Task_only to 0 if you need to run manually");
    }

    unset($_COOKIE);
    $specialtemplates = array();
    require_once(CWD . '/includes/init.php');
}

to this

if (function_exists('log_cron_action'))
{
    global $index_zp;
    global $debug_log;
    global $max_url;
    unset($vbulletin->userinfo);
    $vbulletin->userinfo['userid'] = 0;
}
else
{
    if ($run_by_vb_Scheduled_Task_only)
    {
        exit("Script can only be run by vB Scheduled Tasks. Set \$run_by_vb_Scheduled_Task_only to 0 if you need to run manually");
    }

    $specialtemplates = array();
    require_once(CWD . '/includes/init.php');
    unset($vbulletin->userinfo);
    $vbulletin->userinfo['userid'] = 0;
}

lierduh
09 Oct 2005, 08:04
I remember now, someone else reported this as well. I think it might be PHP 5 related. I don't have PHP 5 to test, so I won't unset cookies then.:)

Thanks.

I don't know if this is due to any mods I have (which I'm pretty light on), but when I run your script directly (not using cron), I get the following output:



Removing the "unset($_COOKIE);" from line 56 helps get the script to run, but, since my cookies are still there, all my private forums get sitemapped, too. so, I just moved down the stuff in the block above, and everything works.

dutchbb
09 Oct 2005, 12:00
your robots.txt should be accessible at the root of your domain (http://www.mydomain.com/robots.txt). this is the only place that spiders know to check.

if you're trying to explicitly define specific files (ex. /forums/showthread.php), then you should define that entry in your robots.txt file. there's no point in not putting the ".php" at the end (ex. /forums/showthread), it doesn't buy you anything. it can actually have a negative impact if your entries aren't defined well. say you're trying to tell search engines to ignore "/forum/s.php" (this is just hypothetical). if you were to just put "/forum/s" in your robots.txt, then, in addition to blocking "/forum/s.php", you'd be blocking "/forum/showthread.php", "/forum/search.php", "/forum/showgroups.php", anything else where the url starts with "/forum/s" .... as you can see, it's important to be as specific as possible, otherwise you risk shutting spiders out of huge chunks of your site.
Thank you. I read it on this site, the guy seems to be some sort of guru about vbulletin SEO: http://forum.time2dine.co.nz/seo-vbulletin/search-engine-optimize-vbulletin-98.html

I have a few questions (also for the author of this thread: )

What does http://www.vbseo.com have that this hack doesn't provide. Is this worth buying, or is it basically the same?

What do you think about the tips/hack provided on this site: http://forum.time2dine.co.nz/seo-vbulletin/search-engine-optimize-vbulletin-98.html he has nr1 ranking on google for "vbulletin SEO" keywords.

lierduh
09 Oct 2005, 12:32
Basically my hack only lets Google index the real contents of the forums using vB archives. I do not think it is neccessary to let Google index both the full version threads and the archives. For more details and reasons, please read my open post.

Unreal Player
09 Oct 2005, 14:53
Ok, My site has been pending for almost 2 days. They say "several hours" wtf? anyone else get this?

jdingman
09 Oct 2005, 15:16
Is it crucial that I change permission for the root or my forum directory? I haven't changed them and it's been working fine. I did change my /archive/ to 755, but not ./

does it make that much of a difference?

trilljester
09 Oct 2005, 17:01
Is it crucial that I change permission for the root or my forum directory? I haven't changed them and it's been working fine. I did change my /archive/ to 755, but not ./

does it make that much of a difference?

Well, as long as the web server "user" process has access to write to the root forum directory and archive/ then 755 is fine, assuming that the user owns the directories. The 55 part will keep others from writing to those directories.

xtreme-mobile
09 Oct 2005, 17:50
ummm all is going well but what the hell do i have to do for step 3? it doesn't make any sense to me :(

any help would be fantastic :D

falter
09 Oct 2005, 17:54
hey lierduh,

I've been playing around a bit with the robot detection. I snagged a bunch of code from "online.php", hacked it up a bit, and came up with this (as a drop-in replacement for the "is_robot_visit" function). This one uses the spiders_vbulletin.xml file, which I recommend people update. The 3.5.0 gold version is fairly vanilla. I got an updated one from here: http://www.vbulletin.com/forum/showpost.php?p=565415&postcount=12

Anyway, here's the change to global.php (this is assuming that you have the very latest version of lierduh's code :) )

/**
 * Return true if visited by a robot.
 */
function is_robot_visit()
{
    require_once(DIR . '/includes/class_xml.php');
    $xmlobj = new XMLparser(false, DIR . '/includes/xml/spiders_vbulletin.xml');
    $spiderdata = $xmlobj->parse();

    if (is_array($spiderdata['spider']))
    {
        foreach ($spiderdata['spider'] AS $spiderling)
        {
            if (isset($_SERVER['HTTP_USER_AGENT']) AND preg_match("#" . preg_quote($spiderling['ident'], '#') . "#si", $_SERVER['HTTP_USER_AGENT']))
            {
                return true;
            }
        }
    }
    unset($spiderdata, $xmlobj);
    return false;
}


There's all sorts of extra markup in the xml for ip ranges and such, but I'm just going to match against the user-agents, for now.

falter
09 Oct 2005, 18:09
update: I went so far as moving the code out of "archive/global.php" into "includes/init.php", where $show['search_engine'] is defined.

I replaced:
$show['search_engine'] = ($vbulletin->superglobal_size['_COOKIE'] == 0 AND preg_match("#(google|msnbot|yahoo! slurp)#si", $_SERVER['HTTP_USER_AGENT']));
with
/**
 * Return true if visited by a robot.
 */
function is_robot_visit()
{
    require_once(DIR . '/includes/class_xml.php');
    $xmlobj = new XMLparser(false, DIR . '/includes/xml/spiders_vbulletin.xml');
    $spiderdata = $xmlobj->parse();

    if (is_array($spiderdata['spider']))
    {
        foreach ($spiderdata['spider'] AS $spiderling)
        {
            if (isset($_SERVER['HTTP_USER_AGENT']) AND preg_match("#" . preg_quote($spiderling['ident'], '#') . "#si", $_SERVER['HTTP_USER_AGENT']))
            {
                return true;
            }
        }
    }
    unset($spiderdata, $xmlobj);
    return false;
}

$show['search_engine'] = is_robot_visit();

Everything works great!

dutchbb
09 Oct 2005, 18:59
Have you guys had time for "the redirect to the actual thread for human visitors" instructions in step 3?

Or maybe you can just send the 2 files? The ones in the zip don't make any sense to me :(

falter
09 Oct 2005, 19:22
if lierduh could provide patchable diff's between the gold (3.5.0) and his modifications, that'd be awesome (I'd do it, but I've hacked up my files way too much, sorry).

lierduh, this page shows how: http://www2.linuxjournal.com/article/1237

it'll make modifying the vanilla files much easier.

xtreme-mobile
09 Oct 2005, 20:06
Have you guys had time for "the redirect to the actual thread for human visitors" instructions in step 3?

Or maybe you can just send the 2 files? The ones in the zip don't make any sense to me :(


same here i aint got a clue what to do :(

lierduh
09 Oct 2005, 23:36
Thanks falter. I will merge your code into my next release, along with Stadler's updated XML file. It will be better to use the XML file than my string. I should be able to make that into a plugin, so that no code needs to be changed for search engine crawler detection.

I also have an idea that threads with new posts should get a higher sitemap priority. So a new version is not far away.:)
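That recency idea could be sketched like this (my own illustration in Python; the function name and the priority buckets are invented, and the released script may compute priority differently):

```python
import time

def sitemap_priority(last_post_ts, now=None):
    """Map thread age to a sitemap <priority> so fresher threads rank higher.

    Illustrative only: the thresholds and values below are made up.
    """
    now = time.time() if now is None else now
    age_days = (now - last_post_ts) / 86400.0
    if age_days <= 1:
        return 0.9
    if age_days <= 7:
        return 0.7
    if age_days <= 30:
        return 0.5
    return 0.3

now = 1_000_000_000
print(sitemap_priority(now - 3600, now))        # 0.9  (an hour old)
print(sitemap_priority(now - 90 * 86400, now))  # 0.3  (three months old)
```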

It is good to see hackers work together creating better code. It makes it more interesting than explaining how to set permissions. :)

I have attached patch.diff for the two files.

hey lierduh,

I've been playing around a bit with the robot detection. I snagged a bunch of code from "online.php", hacked it up a bit, and came up with this (as a drop-in replacement for the "is_robot_visit" function). This one uses the spiders_vbulletin.xml file, which I recommend people update. The 3.5.0 gold version is fairly vanilla. I got an updated one from here: http://www.vbulletin.com/forum/showpost.php?p=565415&postcount=12

Anyway, here's the change to global.php (this is assuming that you have the very latest version of lierduh's code :) )

There's all sorts of extra markup in the xml for ip ranges and such, but I'm just going to match against the user-agents, for now.

hotrod1
10 Oct 2005, 01:28
Great hack, thanks alot!

falter
10 Oct 2005, 01:29
lierduh, are you sure that you're diffing a modified 3.5.0 (gold) against the original 3.5.0 (gold) ? I'm having problems running the diffs as patches against the 3.5.0 gold versions of the files (pulled straight out of the tarball).

regardless, perhaps the diffs are the wrong route? Maybe it would do better service to the less technical users if you had some instructions for code modifications similar to many of the other hacks? (ex. find [this block of code], add [this chunk of code] right after it, bla bla bla).

Brandon Sheley
10 Oct 2005, 01:32
I also have an idea that threads with new posts should get a higher sitemap priority. So a new version is not far away.:)


sounds great, keep up the good work ;)

Unreal Player
10 Oct 2005, 01:50
I do not get your two files at all.

Vtec44
10 Oct 2005, 02:15
Did you chmod 777 your home folder?

lierduh
10 Oct 2005, 02:31
lierduh, are you sure that you're diffing a modified 3.5.0 (gold) against the original 3.5.0 (gold) ? I'm having problems running the diffs as patches against the 3.5.0 gold versions of the files (pulled straight out of the tarball).

regardless, perhaps the diffs are the wrong route? Maybe it would do better service to the less technical users if you had some instructions for code modifications similar to many of the other hacks? (ex. find [this block of code], add [this chunk of code] right after it, bla bla bla).

The problem is probably caused by the individual licence number and download timestamp, so my Gold 3.5 files are slightly different from yours. I will PM you my files' header and footer for you to try the patch again. I used unified context format; you can force the patch with the appropriate flag if needed.

I don't really have the energy or time at the moment to write step-by-step instructions. I will be abroad in a few days, so I would rather spend the time on the actual code.:)

xtreme-mobile
10 Oct 2005, 17:34
ive managed to get this sort of working but dont understand step 3, it makes no sense whatsoever :(

anyway, it submitted my sitemap and i manually added it to google so i could track it, BUT when i click status in google sitemaps it shows this error:

Sitemap Errors
HTTP error

The server returned an error when we tried to access the URL provided. Please make sure the Sitemap URL is correct and resubmit your Sitemap.

whats happened here then?

dutchbb
11 Oct 2005, 00:48
Hi lierduh,

I can't seem to open my /archive directory anymore? It loads for a while and then it's a white screen forever.

I noticed there are a lot of zip files there, so what's the deal? Do I need to delete/clean them out or something? Is it getting too big? I can't access it anymore :(

lierduh
11 Oct 2005, 01:56
ive managed to get this sort of working but dont understand step 3, it makes no sense whatsoever :(

anyway, it submitted my sitemap and i manually added it to google so i could track it, BUT when i click status in google sitemaps it shows this error:

Sitemap Errors
HTTP error

The server returned an error when we tried to access the URL provided. Please make sure the Sitemap URL is correct and resubmit your Sitemap.

whats happened here then?

If you unzip diff_for_modified_vb_files.zip and put all the files in a folder on your PC, you should have files including: global_php - diff - 1_6.htm and index_php - diff - 1_5.htm. Use your browser to open these files. It will show you a graphic version of the differences you need to make. Check the legend for the colours at the bottom left of the screen.

Pink: removed from the old file
Yellow: changed lines
Green: added in the new file

This is very, very easy by the way.

What is the URL that Google said was wrong?

xtreme-mobile
11 Oct 2005, 07:19
If you unzip diff_for_modified_vb_files.zip and put all the files in a folder on your PC, you should have files including: global_php - diff - 1_6.htm and index_php - diff - 1_5.htm. Use your browser to open these files. It will show you a graphic version of the differences you need to make. Check the legend for the colours at the bottom left of the screen.

Pink: removed from the old file
Yellow: changed lines
Green: added in the new file

This is very, very easy by the way.

What is the URL that Google said was wrong?

i put in google this

http://www.talk-365.com/forum/archive/forums_sitemap.php

ill try what you said tonight :D

dutchbb
11 Oct 2005, 10:41
Hi lierduh,

I can't seem to open my /archive directory anymore? It loads for a while and then it's a white screen forever.

I noticed there are a lot of zip files there, so what's the deal? Do I need to delete/clean them out or something? Is it getting too big? I can't access it anymore :(
Can you reply please :)

xxskullxx
11 Oct 2005, 13:28
i put in google this

http://www.talk-365.com/forum/archive/forums_sitemap.php

ill try what you said tonight :D
You should be submitting http://www.talk-365.com/forum/g_sitemap.xml

xtreme-mobile
11 Oct 2005, 17:28
ok ive uploaded the sitemap to the forums directory and submitted that to google sitemaps admin, and will see if that returns anything. does this mean that the actual hack is failing then? cos it tells you to put the sitemap file in the archive folder, not in the forum folder??

im confused

also ive opened the files up in the browser for step 3 but what exactly do i do???

it says the differences from V1.4, what the hell is V1.4? do i add the extra code etc that is missing in 1.4 into my global files or what? this makes no sense :(

im thick

sHORTYWZ
11 Oct 2005, 17:49
Since upgrading I've had major problems with this.. I'm running two forums and have had identical problems on each.

I test out the script and it generates the indexes correctly, however sometime down the road, a few crons later, the script somehow ends up generating a .gz for EVERY thread on my forum, causing my /archive directory to contain thousands of .gz's and my index file to be almost a meg.

This seems to be the same problem that Triple_T is having.

Any ideas?

dutchbb
11 Oct 2005, 18:47
c'mon guys a little support here, this hack is too good to uninstall ;)

lierduh
11 Oct 2005, 20:49
Since upgrading I've had major problems with this.. I'm running two forums and have had identical problems on each.

I test out the script and it generates the indexes correctly, however sometime down the road, a few crons later, the script somehow ends up generating a .gz for EVERY thread on my forum, causing my /archive directory to contain thousands of .gz's and my index file to be almost a meg.

This seems to be the same problem that Triple_T is having.

Any ideas?

Ok, please list some examples of the file names and their contents. Without having the same problem, it can be hard for me to find the bug.

I honestly did not know what Triple_T was talking about, though it makes more sense now.

buro9
11 Oct 2005, 21:31
Ok, please list some examples of the file names and their contents. Without having the same problem, it can be hard for me to find the bug.

I honestly did not know what Triple_T was talking about, though it makes more sense now.

I actually saw this today.

I couldn't get it to do it when I ran the cron manually from the admincp, but the scheduled task being run by the schedule did just this.

I had somewhere in the region of half a hundred thousand files splattered around before I managed to kill the process (by nuking the MySQL process and also restarting httpd). It took about 15 minutes running "rm -f *" against batches of 5,000 *.gz files at a time to delete them all.

I thought it was just me!

Anyhow... my cron job log output for this task is attached. Beware that the single logged job expands to 7.68mb of log file!

Fun :)

lierduh
11 Oct 2005, 21:56
I have nailed down the bug.
Please download new version (V1.1a)!

buro9
11 Oct 2005, 22:10
I have nailed down the bug.
Please download new version (V1.1a)!

Fantastic response :)

I shall test it straight away :D

buro9
11 Oct 2005, 22:18
Yup, this one has fixed it.

Thanks very muchly... it's damn early there, so I'm guessing that was a pre-shower fix, much respect for it.

dutchbb
11 Oct 2005, 22:26
very fast,thank you

lierduh
11 Oct 2005, 22:27
a pre-shower fix.

Yes, it was a pre-shower release. :o :)

Suiko Jin
12 Oct 2005, 02:44
I am confused on step three. I have read the little diff sheets and the colour charts but I don't get a lot of it, like the yellow/changed line things... do I change the code altogether? Like on the index diff code chart on line 317, do I change that whole first yellow section on the left to the code on the right? Or do I add the code on the right with it? Bah... very confusing, since some of the code on other lines that it does this with seems to be exactly the same.

buro9
12 Oct 2005, 05:31
I am confused on step three. I have read the little diff sheets and the colour charts but I don't get a lot of it, like the yellow/changed line things... do I change the code altogether? Like on the index diff code chart on line 317, do I change that whole first yellow section on the left to the code on the right? Or do I add the code on the right with it? Bah... very confusing, since some of the code on other lines that it does this with seems to be exactly the same.

I use the text diff files.

And very basically:
Lines that start with '<' need removing from your version
Lines that start with '>' need adding to your version

I find text diffs much friendlier, and when you start changing the code you'll find it's easier than you think. Just look at the code and you'll see lierduh is removing the PDA stuff, redirecting real people to the threads and making things nicer for the spiders.
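For reference, a made-up hunk in that plain diff format (the line numbers and content here are invented, not from the actual patch):

```
42c42
< this is the old line, which you remove from your copy
---
> this is the new line, which you add in its place
```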

Brandon Sheley
12 Oct 2005, 06:27
I'll give this a shot again
and i get this error when i chmod the root and archive to 777 as the instructions say :|

Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator, webmaster@locoforum.com and inform them of the time the error occurred, and anything you might have done that may have caused the error.

More information about this error may be available in the server error log.


Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.


:ermm:

lierduh
12 Oct 2005, 07:22
Like on the index diff code chart on line 317, do I change that whole first yellow section on the left to the code on the right? Or do I add the code on the right with it? Bah... very confusing, since some of the code on other lines that it does this with seems to be exactly the same.

Yellow means change, why add?

The three lines before any coloured lines mean: find these lines, and change the lines that come after them.
If yellow, change the whole yellow block on the left into the block on the right.

If you see green on the right, that means add lines. Green never appears on the left.

Same goes for pink, which never appears on the right, because it means delete.

The coloured chart is even easier than step-by-step instructions, I think.:)

Funchiestz
12 Oct 2005, 11:31
Yellow means change, why add?

The three lines before any coloured lines mean: find these lines, and change the lines that come after them.
If yellow, change the whole yellow block on the left into the block on the right.

If you see green on the right, that means add lines. Green never appears on the left.

Same goes for pink, which never appears on the right, because it means delete.

The coloured chart is even easier than step-by-step instructions, I think.:)

somehow i just installed it like the instructions say, but i'm confused... do i need to submit manually, or will it submit to google automatically?

d2sector
14 Oct 2005, 11:23
The instructions gave me a headache :p But part 3 is optional anyhow, isn't it?

do i need to submit manually, or will it submit to google automatically?
The part in the readme that instructs you to add a scheduled task is what will automatically send the sitemap to Google at whatever frequency you set the task to.

MSC James
14 Oct 2005, 13:59
Hi, I installed this mod a couple of days ago (great mod!)

It was working fine for a while, then I just went to Google to track the status and it's telling me the index file is too big:

Your index file contains too many Sitemaps. Please create multiple index files with up to 1000 Sitemaps each and submit all index files.

Hope you can help :)

James.

d2sector
14 Oct 2005, 14:16
My forums seem to use:
/archive/index.php?t-10939.html
But the sitemaps created use
/archive/index.php/t-10939.html

How do I get it to generate them with a ? instead of a /?

lierduh
14 Oct 2005, 15:04
Looks like you guys are still running the old version.

I sent a message about the new version a few days ago; people who registered "Install" would have received it.

MSC James
14 Oct 2005, 22:42
Oops, I'll get it now, thanks :)

This is the only vB mod i've installed that I've forgotten to click install on so far :o clicked it now :)

Unreal Player
14 Oct 2005, 23:59
Looks like you guys are still running the old version.

I sent a message about the new version a few days ago; people who registered "Install" would have received it.
Can you send me the message, i forgot to click install?

Suiko Jin
15 Oct 2005, 03:07
Alright, I did everything and installed it. Then I submitted this link to the google sitemap.

http://guardiansanctuary.net/forums/g_sitemap.xml

So I wait a day and then it says that it is ok but when I check on the stats, it says that it has generated some HTTP Error.

lierduh
15 Oct 2005, 08:07
New beta script.

Threads containing newer posts will have higher sitemap priority with this new script.

c0d3x
15 Oct 2005, 10:27
hi, i have some "hidden" forums that i want to be shown in the archive, but not shown on the forums page!! what can i do??

buro9
15 Oct 2005, 10:43
hi, i have some "hidden" forums that i want to be shown in the archive, but not shown on the forums page!! what can i do??

Why on earth would you want this?

If you have them in the archives and Google indexes them (it will), then your hidden forums will be cached in Google and you will find it VERY hard to remove them from there.

If they're hidden from your members, why on earth would you want to publish them to the whole world and your members anyway?

This request does not parse.

c0d3x
15 Oct 2005, 12:35
no no !! they're not hidden to members, they don't appear on the forumhome but they're active!!

jdingman
15 Oct 2005, 15:54
I've been modifying the files manually each time there is an update. How would I automate the process so I don't have to manually modify the files each time, but rather just run something like the 'diff' script so that it is automated?

hotrod1
15 Oct 2005, 15:56
With the newest updates at least all you have to do is upload the forums_sitemap.php and that will overwrite the old one with the new code, that's it.

jdingman
15 Oct 2005, 16:02
Ok, but in general, is there a way to do something like that with plugins? Using a 'diff' type of method to implement code?

hotrod1
15 Oct 2005, 16:04
Yeah, with products that will do the file edits for you, but this mod doesn't support that yet and I don't know if it will, so for now you would have to do it manually.

c0d3x
15 Oct 2005, 18:04
hi, i have some "hidden" forums that i want to be shown in the archive, but not shown on the forums page!! what can i do??

bump :ninja:

007
15 Oct 2005, 20:14
So far so good for me. Now instead of bots being all over the place they seem to only be looking at threads and the main page. :)

I haven't had any trouble with submitting to Google though. Actually Google seems to crawl my forums quite a few times per week anyway, so maybe I don't need to bother submitting it..

buro9
16 Oct 2005, 09:23
I haven't had any trouble with submitting to Google though. Actually Google seems to crawl my forums quite a few times per week anyway, so maybe I don't need to bother submitting it..

You should submit the sitemap. I already had Google spiders visiting; it's not about attracting them, it's about getting them to be more efficient, to look at things they didn't know were there, to look deeper, and to get more of your content indexed.

Suiko Jin
16 Oct 2005, 22:33
Alright, I did everything and installed it. Then I submitted this link to the google sitemap.

http://guardiansanctuary.net/forums/g_sitemap.xml

So I wait a day and then it says that it is ok but when I check on the stats, it says that it has generated some HTTP Error.
Still can't figure this out...

exceem
17 Oct 2005, 15:28
all 3 steps installed :)

and all working fine (hopefully!!!)

Thanks for the hack :)

exceem
17 Oct 2005, 20:13
been getting a few php errors in my error log:

the error is
PHP Fatal error: Class 'vBulletinHook' not found in /home/trevor/public_html/forums/includes/functions.php on line 4322

I think it's referring to this bit of code

function exec_header_redirect($url)
{
    global $vbulletin;

    $url = create_full_url($url);

    if (class_exists('vBulletinHook'))
    {
        // this can be called when we don't have the hook class
        ($hook = vBulletinHook::fetch_hook('header_redirect')) ? eval($hook) : false;
    }

    $url = str_replace('&amp;', '&', $url); // prevent possible oddity

    if (SAPI_NAME == 'cgi' OR SAPI_NAME == 'cgi-fcgi')
    {
        header('Status: 301 Moved Permanently');
    }
    else
    {
        header('HTTP/1.1 301 Moved Permanently');
    }

    header("Location: $url");
    define('NOPMPOPUP', 1);
    if (defined('NOSHUTDOWNFUNC'))
    {
        exec_shut_down();
    }
    exit;
}


the redirect works for me: visiting an archived link takes me to a "proper" thread

im checking my logs now to see if it's not working the other way around. any ideas as to what's causing this error?

007
18 Oct 2005, 15:20
I got this error now:

A Sitemap Index may not directly or indirectly reference itself. Please fix your Sitemap Index before resubmitting.

How does this Google Sitemap hack compare to the other one on vB?

eoc_Jason
18 Oct 2005, 15:53
What is the other one? (URL?)

buro9
18 Oct 2005, 16:16
I got something similar but couldn't see anything wrong with the sitemap... so I posted on Google Groups about it:
http://groups.google.com/group/google-sitemaps/browse_thread/thread/fe7e0aca343fb8ec

Similar error:

Recursive Index

A Sitemap Index may not directly or indirectly reference itself.
Please fix your Sitemap Index before resubmitting.

eoc_Jason
18 Oct 2005, 16:44
Just checked today and got the same error as you! Hmm...

xtreme-mobile
18 Oct 2005, 18:44
when running sitemap.php i get this

Script can only be run by vB Scheduled Tasks. Set $run_by_vb_Scheduled_Task_only to 0 to call this script directly.

i dont understand this bit how do i change it to what its asking :)

dutchbb
18 Oct 2005, 19:33
when running sitemap.php i get this

Script can only be run by vB Scheduled Tasks. Set $run_by_vb_Scheduled_Task_only to 0 to call this script directly.

i dont understand this bit how do i change it to what its asking :)
Open the /archive/forums_sitemap.php file

change: $run_by_vb_Scheduled_Task_only = 1;

to: $run_by_vb_Scheduled_Task_only = 0;

Now you (and everyone else) can run the script directly.
You can also leave it at 1 and only run the script from your scheduled tasks manager; that way, you are the only person who can run it.

dutchbb
18 Oct 2005, 19:38
BTW the beta script is working fine AFAICS

New beta script.

Threads containing newer posts will have higher sitemap priority with this new script.
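For context, the "sitemap priority" being adjusted here is the optional <priority> element on each URL entry in the sitemap protocol: a value from 0.0 to 1.0, defaulting to 0.5. The beta presumably emits entries along these lines inside the <urlset> element (URLs and values are illustrative, not taken from the actual script):

```xml
<url>
  <loc>http://www.example.com/forums/archive/index.php/t-1234.html</loc>
  <priority>0.9</priority> <!-- thread with recent posts -->
</url>
<url>
  <loc>http://www.example.com/forums/archive/index.php/t-99.html</loc>
  <priority>0.3</priority> <!-- stale thread -->
</url>
```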

eoc_Jason
20 Oct 2005, 20:13
Where is the link to the beta script? Is it buried on one of the many pages?

BTW the beta script is working fine AFAICS

Also, I'm sure you probably already noticed, but you can also gzip the main XML file. I modified my file to do that; it wasn't much effort.

dutchbb
20 Oct 2005, 22:05
Yeah, the beta script is a few pages back.

falter
21 Oct 2005, 00:17
Those wanting better bot detection may want to try the mod I recommend in this post:
http://www.vbulletin.com/forum/showthread.php?p=993396#post993396

This will make use of the spiders XML file that so many people work so hard on.
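The kind of detection the hack (and the linked mod) performs boils down to matching the User-Agent string against a list of known crawler signatures. This is a simplified sketch, not the actual code from either; the function name and the signature list are illustrative:

```php
<?php
// Simplified sketch of user-agent based crawler detection.
// The function name and signature list are illustrative,
// not taken from the hack itself.
function is_search_engine($user_agent)
{
	$signatures = array(
		'googlebot', 'mediapartners-google', 'yahoo! slurp', 'msnbot',
		'teoma', 'ia_archiver', 'gigabot', 'scooter', 'fast-webcrawler',
	);
	$user_agent = strtolower($user_agent);
	foreach ($signatures as $sig)
	{
		// substring match is enough; crawler UAs embed a stable token
		if (strpos($user_agent, $sig) !== false)
		{
			return true;
		}
	}
	return false;
}
```

A hard-coded list like this goes stale quickly, which is the argument for reusing the community-maintained spiders XML file instead.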

dutchbb
21 Oct 2005, 00:41
I've written the instructions for step 3 in a step-by-step txt file.

This is an alternative to the coloured diff. It should only be used on untouched index.php and global.php files.

dutchbb
21 Oct 2005, 02:54
Lierduh,

Is it possible to exclude forums from the sitemap? I don't want my chat section listed on Google too much, for instance.

Thanks

PeterKjaer
21 Oct 2005, 15:10
Hi,

Does anybody know how to set the permissions on a Windows server running IIS?

It would of course be easy enough to give access to internet_user, but can you give the permission only for this job?

/Peter

falter
21 Oct 2005, 15:16
Hi,

Does anybody know how to set the permissions on a Windows server running IIS?

It would of course be easy enough to give access to internet_user, but can you give the permission only for this job?

/Peter
What kind of access do you have to the server?

PeterKjaer
22 Oct 2005, 04:31
What kind of access do you have to the server?

It's my own, so I have admin rights to the server.

/Peter

c0d3x
22 Oct 2005, 06:23
Hi, I have some "hidden" forums that I want to be shown in the archive, but not shown on the forum home page. What can I do?

They're not hidden to members; I just don't want to show them on the forum home.

up please!

Mr Chad
22 Oct 2005, 10:47
OK, it's probably a stupid thing, but all the spiders view the archives like this:

/forums/archive/index.php/t-1192.html

when it should be:

/forums/archive/index.php?t-1192.html

eoc_Jason
22 Oct 2005, 18:23
chatbum - actually the proper way is the first that you showed with slashes only. The reason is that it emulates directories and doesn't use a query string (making it look more static to spiders).

The reason your server might be using the query string method is because yours doesn't support the first method. (Check your /archive/global.php file where it checks for SLASH_METHOD.)

I *thought* he implemented the check and both options in the archive generation code, but maybe it might not be working properly. *shrug*
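To illustrate the difference being discussed: whether the archive emits PATH_INFO-style links or query-string links comes down to a single branch. The $slash_method flag below stands in for the SLASH_METHOD check performed in archive/global.php; the function itself is illustrative, not vBulletin code:

```php
<?php
// Illustrative sketch of the two archive URL styles discussed above.
// $slash_method is a stand-in for vBulletin's SLASH_METHOD check: true
// when the server handles PATH_INFO URLs (index.php/t-123.html) correctly.
function archive_thread_url($threadid, $slash_method)
{
	if ($slash_method)
	{
		// emulates a static path; preferred when the server supports it
		return "archive/index.php/t-$threadid.html";
	}
	// query-string fallback for servers without PATH_INFO support
	return "archive/index.php?t-$threadid.html";
}
```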

Mr Chad
23 Oct 2005, 00:02
chatbum - actually the proper way is the first that you showed with slashes only. The reason is that it emulates directories and doesn't use a query string (making it look more static to spiders).

The reason your server might be using the query string method is because yours doesn't support the first method. (Check your /archive/global.php file where it checks for SLASH_METHOD.)

I *thought* he implemented the check and both options in the archive generation code, but maybe it might not be working properly. *shrug*

When I use the slashes it won't take me to the thread; it just redirects to the index. But when it uses the '?' it works.

lierduh
24 Oct 2005, 09:17
OK, it's probably a stupid thing, but all the spiders view the archives like this:

/forums/archive/index.php/t-1192.html

when it should be:

/forums/archive/index.php?t-1192.html

This is implemented in the beta version. I announced this to all the people who clicked "Install".

lierduh
24 Oct 2005, 09:20
Lierduh,

Is it possible to exclude forums from the sitemap? I don't want my chat section listed on Google too much, for instance.

Thanks

No option for this at the moment. I doubt it is a sought-after feature. Google WILL index your chat section even if you don't include it in the sitemap.

007
24 Oct 2005, 22:52
Any idea on when this will get another update? It works for the most part, but that recursive index problem keeps happening.. :(

trackpads
25 Oct 2005, 00:06
Any idea on when this will get another update? It works for the most part, but that recursive index problem keeps happening.. :(

Me too. Thanks again, Jason.

lierduh
25 Oct 2005, 00:28
I don't know what the reason is; I do not have this problem so far. Two things to try:

1) Have you deleted the old sitemap entry from your Google Sitemaps account?
2) Try changing the sitemap index file name.

dutchbb
26 Oct 2005, 21:12
No option for this at the moment. I doubt it is a sought-after feature. Google WILL index your chat section even if you don't include it in the sitemap.
Yep, but a lot more if I send a sitemap. It's also a very big forum that has many links, and I don't want it to get any bigger by sending it more traffic.

However, I understand that this is a very personal request, so please contact me if you want to do a paid service for this.

thank you

GT1
27 Oct 2005, 01:45
Well, I have everything done except step 3. I am completely lost in editing my global and index .php files. Can anyone DO them for me or something? I would like to get this done ASAP if at all possible. It would be hugely appreciated.

dutchbb
27 Oct 2005, 15:01
Well, I have everything done except step 3. I am completely lost in editing my global and index .php files. Can anyone DO them for me or something? I would like to get this done ASAP if at all possible. It would be hugely appreciated.
Do you have a problem with the coloured diff or the text files in the original zip?

I posted a step-by-step instruction here (http://www.vbulletin.org/forum/showpost.php?p=800865&postcount=176)

Still problems? Send me your MSN and I'll help you out.

D|ver
27 Oct 2005, 23:14
I have a small question:
Is it possible to add additional pages to the sitemap?

buro9
28 Oct 2005, 14:07
I have a feature request: to dump a single text file, gzipped, of ALL of the URLs that go into the various sitemaps.

Basically, while we're in the loop creating the various sitemaps, additionally write a text file with just one full URL per line.

This would also be good for Yahoo and other spiders. Yahoo specifically asks for such a thing on their submit page:
http://submit.search.yahoo.com/free/request

You can also provide the location of a text file containing a list of URLs, one URL per line, say urllist.txt. We also recognize compressed versions of the file, say urllist.gz.


To me, the loop that creates the Google sitemap is the perfect low-overhead place to also dump the archive URLs into a text file for Yahoo and other spiders to feed from.
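A sketch of how this request could be served: collect each URL as it is written to the sitemap, then dump the list gzipped, one URL per line, in the urllist format Yahoo describes. The function and file names here are hypothetical, not part of the hack:

```php
<?php
// Hypothetical sketch of buro9's request: while writing sitemap
// entries, also collect one plain URL per line for Yahoo's urllist
// format, then write the list out gzipped.
function write_urllist($urls, $path)
{
	$fp = gzopen($path, 'w9'); // 'w9' = write with maximum compression
	if (!$fp)
	{
		return false;
	}
	foreach ($urls as $url)
	{
		gzwrite($fp, $url . "\n"); // one full URL per line
	}
	gzclose($fp);
	return true;
}
```

The resulting file (e.g. urllist.gz) could then be submitted at Yahoo's free-submit page alongside the Google sitemap.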

:Judge:
28 Oct 2005, 20:58
I don't know, maybe this is over my head :p

I went through and made changes 1 and 2, and that is all; I have no idea how to check whether it is working.

I have no errors, so that is a plus.

Do I have to sign up for Google Sitemap? - Forget I said that.

dutchbb
29 Oct 2005, 10:52
Just a warning here... I was reading about SEO, and step 3 might not be such a good idea. This is called cloaking, and it is a black-hat technique. It is clearly stated in the Google quality guidelines as a forbidden way to do SEO.

Actually, I found out that after installing this, my homepage's ranking dropped 4 pages for a very important keyword. I didn't do anything else that could be suspicious, so I removed this!

The sitemap is good, but optimizing pages just for search engines, making them look different from what your human visitors see, is NOT recommended, and you take a high risk of being penalized by Google or other search engines.

falter
30 Oct 2005, 00:13
Just a warning here... I was reading about SEO, and step 3 might not be such a good idea. This is called cloaking, and it is a black-hat technique. It is clearly stated in the Google quality guidelines as a forbidden way to do SEO.

Actually, I found out that after installing this, my homepage's ranking dropped 4 pages for a very important keyword. I didn't do anything else that could be suspicious, so I removed this!

The sitemap is good, but optimizing pages just for search engines, making them look different from what your human visitors see, is NOT recommended, and you take a high risk of being penalized by Google or other search engines.
If you implemented the robots.txt that is suggested, that is most likely the cause of your PR drop, not the cloaking.

There are many, many things you can do that are considered cloaking. I don't think Google would flip out over this, seeing as the content provided to the search engine spider is the same content provided to the human user. An example of an abuse of cloaking is where, say, a completely different set of content is given to the spider than is given to the human.

I'm going to pubcon (http://www.pubcon.com) in a couple of weeks; I'll ask around to see what some SEOs think about what we're doing here. I can even ask the guys at Yahoo and Google. Personally, I think it's fine, since the same core content is being given to the search engines and humans.

eoc_Jason
01 Nov 2005, 17:23
The problem is, what you think is the "same content" is different from what a spider thinks is the same. Yes, cloaking is a serious issue, and search engines do penalize sites for doing it. Some search engines (like Google) have spiders that look like a regular web browser so they can compare results between it and the actual spider results. If they don't match, well, you get the idea.

I expanded my robots.txt file to exclude a lot of the links that are listed in the notes. And I use the generator script to make the XML files for Google, but that's it. I do not believe in trying to redirect bots or users to various pages; that will only end with bad things happening.

lierduh
04 Nov 2005, 23:20
Whether this counts as cloaking is a long-debated topic. The general advice from the experts is not to cloak, due to the risk it involves. I would say do not install step 3 if you are concerned about this.

However, nowadays many major sites use cloaking, including Amazon and Google itself. Believe it or not, vBulletin also uses cloaking!

eoc_Jason
07 Nov 2005, 17:01
Yes, but Amazon is a much more reputable site than, say, Joe Bob's Bait Shack... Plus companies like that work directly with Google to enhance features for both sites.

falter
08 Nov 2005, 22:27
Well! I've backed out the cloaking after the number of my indexed pages on Google went from over 40,000 to just over 800. I'm assuming that we got penalized in some form. My PR is still a 5, but that doesn't mean much of anything at all.

I can honestly say that my opinion is reversed on the cloaking side of things. I do not recommend implementing step 3.

Citizen
09 Nov 2005, 00:22
What exactly was step 3 of the hack? I looked over the installation and didn't see a "step 3"

Robru
09 Nov 2005, 07:14
Step 3
=================================================================================

This step is to redirect human visitors to your actual threads.
When people do a Google/Web search and click on a search result
(note: search engines only index our archive pages now), instead of
being taken to the plain archive page, the visitor will be taken to
the actual forum thread!

This step involves changes to the code. There are a number of places
you need to change. I have included the diff result between the final
files and the original files. The two files involved are
archive/global.php and archive/index.php

You can use the coloured diff: diff_for_modified_vb_files.zip
or use the plain old diff: diff_global.php.txt and diff_index.php.txt
(It will be appreciated if someone can write a step-by-step instruction
to change these two files):)

For the archive/global.php file, I added output compression to save
bandwidth. I have also got rid of the PDA links.

Please post your questions to the thread and click Install.
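Reduced to its core, the step 3 logic decides per request: crawlers stay on the static archive page, humans get a 301 redirect to the real thread. A simplified sketch of that decision, not the actual diff (the function name and redirect target are illustrative):

```php
<?php
// Simplified sketch of the step 3 decision, not the actual diff.
// Returns null when the visitor should stay on the archive page,
// or the relative URL to 301-redirect a human visitor to.
function archive_redirect_target($threadid, $is_crawler)
{
	if ($is_crawler)
	{
		return null; // crawlers stay on the static archive page
	}
	// humans get sent to the full thread view
	return "showthread.php?t=$threadid";
}
```

This split treatment of crawlers and humans is precisely what the cloaking debate elsewhere in the thread is about.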

Robru
09 Nov 2005, 07:15
I notice this download (http://www.vbulletin.org/forum/attachment.php?attachmentid=37278) is only an upgrade?

dutchbb
09 Nov 2005, 15:24
Well! I've backed out the cloaking after the number of my indexed pages on Google went from over 40,000 to just over 800. I'm assuming that we got penalized in some form. My PR is still a 5, but that doesn't mean much of anything at all.

I can honestly say that my opinion is reversed on the cloaking side of things. I do not recommend implementing step 3.
Yes, Jagger could be the main cause.

It's just not worth the risk, and there are better ways to do it anyway. My archive now works like a sitemap, so visitors and spiders are redirected to the actual threads with a legit redirect. This actually works pretty well, as we can see: our pages in Google went from 600 to 15,000 in a few weeks.

You can do this yourself or use vbSEO like I did. It's not cheap though.

DrHUS
15 Nov 2005, 06:26
For security purposes, instead of giving 777 to the base vB directory,
you can upload an empty file (as sitemap.xml) to the base vB directory with permissions 666.

futuregizmo
19 Nov 2005, 20:11
Noob question, but is this hack exclusive to 3.5.1, or would it work in 3.5.0 too?

poprulz
19 Nov 2005, 23:20
Noob question, but is this hack exclusive to 3.5.1, or would it work in 3.5.0 too?

I've got the same question. Does it work with 3.5 Gold?

lierduh
20 Nov 2005, 23:44
It works for 3.5.0 too.

poprulz
01 Dec 2005, 14:32
I've got a problem here.
The forum URL is domain.com/forums/
and the archive is domain.com/forums/archive/

I changed the directories so that forums and archive are writable (777). But when I call up the script in the browser, I get the following error. What might be the reason?

"The directory containing /usr/local/psa/home/vhosts/mallupride.com/httpdocs/forums/archive/forums_sitemap.php cannot be world writable. "

JCD
03 Dec 2005, 15:06
Could it be that spiders are no longer redirected to the archive after upgrading to 3.5.1? Before upgrading it worked great, but now spiders are showing up crawling the regular threads.

poprulz
14 Dec 2005, 20:51
I installed the extension, and it's been 2 weeks; Google still crawls only the main page, and the search results are shown from the main page.
I am using vBadvanced and vB 3.5.0.
The sitemap is being properly generated and submitted on a daily basis. What could have gone wrong?
The Google result shows the following:
http://www.google.com/search?q=site:www.mallupride.com/&hl=en

Thanks in advance!

Totti
15 Dec 2005, 15:03
Hi guys, I am using vbSEO sitemap generation ... this plug-in for 3.5.x.

Actually my question is: which product is better?
Or should we possibly merge them by using the sitemap of vbSEO and the redirection of this one here?

Thanks for replies!

rlamego
07 Jan 2006, 22:05
Is there any way to run this without having to chmod my root dir to the most insecure values on earth?

eoc_Jason
11 Jan 2006, 18:46
Is there any way to run this without having to chmod my root dir to the most insecure values on earth?

If you create the initial file with the proper permissions, then you shouldn't have to chmod the whole directory, as the script just overwrites the file, I believe...

samu2
11 Jan 2006, 20:59
How do I install the plugin? What hook does it go under, and what title does it have? Thanks :)

EDIT: I know how to install a plugin; I just don't know what hook name to use or what I should call it.

samu2
11 Jan 2006, 22:07
I need to learn to read the install files ;)

OK, so I submitted it. It said it couldn't make a sitemap from a certain file, then it attempted to submit it from elsewhere, and it went through OK, I assume, as it said it had been submitted for crawling.

I installed the plugin. Since then, the bots on my site have been stuck viewing one thread for an hour. I guess I will see what they do tomorrow :) If I made an error, could it make the bots get stuck?

Eagle Creek
12 Jan 2006, 22:28
Looks nice!

Is there any possibility I will lose or decrease my hits on Google?

geezmo
20 Jan 2006, 07:56
Help, my plug-in is not sending the sitemap to Google! I've installed everything correctly, added the plug-in, and set a scheduled task for it. In my /forum/archive folder there are several .gz files, so I know sitemaps are being generated by the hack. The problem is that these are not being sent to Google.

In AdminCP > Scheduled Task Manager, I get a "Done" message every time I click "Run Now" to test the plugin. But if I check the Scheduled Task Log, there's no log entry for this plugin, not even one!

Can anyone tell me what's wrong?

Perhaps this could help: while installing this hack, I chmodded the /forum and /archive folders to 777 as per the instructions. However, when I tried to call the forums_sitemap.php file, I always got an Error 500 (Internal Server Error). Also, the forum itself is down and inaccessible while the folders are chmodded 777. I checked my error log, and it confirms that the error exists because the "directory is writable by others."

The point is, when I call the forums_sitemap.php file, I get a server error, so I basically can't make it run even for the first time. And because the forum goes down while the "directory is writable by others," I have to chmod the folders back to 755.

Does anyone know how to make the hack send the sitemaps to Google?

PS. I'm using ver. 3.5.3

geezmo
24 Jan 2006, 04:14
Any reply to my question above, anyone?

Anyway, I have another question. What I just did was manually submit to Google the sitemaps generated by this hack. As I said before, in the /forum/archive folder I have a lot of .gz files, which I assume are sitemaps.

My question is, how are the different .gz files different from one another? I have more than 50 sitemap files, from number 13 to number 98, but I am missing other files. What I mean is, I have a sitemap_13.gz file and sitemap_98.gz, but I don't have sitemap_21.gz, sitemap_22.gz, or sitemap_23.gz files, to name a few.

I only submitted sitemap_98.gz, and I'm wondering if the other (or all?) sitemap files need to be submitted to Google?

Zia
20 Feb 2006, 04:13
Hi guys, I am using vbSEO sitemap generation ... this plug-in for 3.5.x.
Actually my question is: which product is better?
Or should we possibly merge them by using the sitemap of vbSEO and the redirection of this one here?
Thanks for replies!

I have the same question, as we are not expert hands.
We are using the "vBSEO Google/Yahoo Sitemap Generator for vBulletin 3.5.x & vBulletin 3.0.x" located here (http://www.vbulletin.org/forum/showthread.php?t=100435&highlight=sitemap+generator)

What is the difference between these two? If you explain, it will help us make up our minds.
On the other hand, we would like your suggestion about a URL rewriter (not the official vbSEO one, which costs $150).

Does this sitemap cover RSS feeds too? (Maybe I can't ask the question properly.) I mean, using this sitemap, does Google discover the site's RSS?
BTW, in robots.txt, why do we need to disallow external.php?


Please let us know the details...

Thanks in advance.

stinger2
20 Apr 2006, 20:01
Is no one supporting this?

newmomsforum
21 Apr 2006, 17:38
Hmm, strange.

Everything appears to have worked well; the first manual run went fine.

I chmodded the archive directory and forum root to 777.

There are lots of nice sitemap_1.gz - sitemap_22.gz files in my archive, but no g_sitemap.xml in my forum root.

Any ideas?

Thanks all,

Claire :)

newmomsforum
21 Apr 2006, 17:40
Any reply to my question above, anyone?

Anyway, I have another question. What I just did was manually submit to Google the sitemaps generated by this hack. As I said before, in the /forum/archive folder I have a lot of .gz files, which I assume are sitemaps.

My question is, how are the different .gz files different from one another? I have more than 50 sitemap files, from number 13 to number 98, but I am missing other files. What I mean is, I have a sitemap_13.gz file and sitemap_98.gz, but I don't have sitemap_21.gz, sitemap_22.gz, or sitemap_23.gz files, to name a few.

I only submitted sitemap_98.gz, and I'm wondering if the other (or all?) sitemap files need to be submitted to Google?

As I understand it, a sitemap is created for each forum?

stinger2
23 Apr 2006, 17:41
any one tried this on 3.5.4?

stinger2
26 Apr 2006, 10:12
bump

filmking
29 Apr 2006, 20:07
Thanks for this great hack. My only problem during install was setting permissions for the base directory. I don't use a "forums" directory; all base files are in the public_html root. So, for whatever reason, just chmodding 777 on the root didn't allow the base sitemap index file to get created; the program kept putting the index file in the archive with the others. I finally overcame this problem by placing a blank "sitemap.xml" file in the base directory and then chmodding it to 777. I ran the program, it replaced the file with a new one, and I no longer received the error message about base permissions. Getting an immediate increase in Google listings. Thank you!

RichieBoy67
29 Apr 2006, 20:48
Can you please post what step 3 was, so I can remove it?

ShackMaster
11 May 2006, 06:36
Is this hack no longer supported?

I've tried submitting /forums/archive/forums_sitemap.php to Google and it will not let me add it. I noticed in an earlier post that /forums/g_sitemap.xml should be what is submitted... but I cannot find this file anywhere.

I did check out my server weblog and these are a few lines:
"GET /forums/archive/index.php/t-106-p-1.html HTTP/1.1" 200 10440 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

"GET /forums/showthread.php?goto=lastpost&t=128 HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

"GET /forums/showthread.php?p=520 HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

"GET /forums/archive/index.php/t-10.html HTTP/1.1" 200 11155 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

"GET /forums/showthread.php?p=642 HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

"GET /forums/archive/index.php/t-106-p-1.html HTTP/1.1" 200 10440 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

"GET /forums/showthread.php?goto=lastpost&t=89 HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"



Any help?

justim123
18 Oct 2006, 09:40
Now I'm getting this. I changed the permissions on the archive folder, but I still get it:

Warning: gzopen(/public_html/forum/archive/sitemap_3.gz): failed to open stream: Permission denied in /archive/forums_sitemap.php on line 132

Chowning the data folder to the apache user fixes it.

Viper007Bond
19 Nov 2006, 06:44
Is this available for v3.6.x? I'd LOVE this for it! :D

maxicep
04 Feb 2007, 13:38
rulaz.reserved