This command/function can be used in one of two ways.

1) You can use it to manually index any page, or even a CSV list of pages. Just supply the page name (or names) as the first parameter (or the value parameter). Each page will be indexed according to all indexing rules set up on the site.index page.

This may be useful if you turn off BoltWire's automatic indexing (set indexing: false in site.config) and then only index specific pages when you edit them. In such a case a simple line like this could be inserted into a normal edit form after an edit has taken place:

[command index {p}]

2) Or, you can set the rule parameter to one of the index rules in site.index. By default, it contains a "site" rule which indexes the basic text of most pages in your installation. But you can define any rule you wish, and as many of them as you wish.

Suppose you set the command to index the "forum" rule. The index function would read the list of pages saved for that rule, take a batch of pages whose size is set by the indexbatch value in site.config (25 by default), and index that batch. This could be useful if you want to create a form to manually index a large number of pages. It is also used in the index action, which can create a list of pages for any index rule, and then index them.

Indexing Rules

Indexing rules are defined on the site.index page. By default, three rules are defined: site, tags and links. The site rule looks like this:

site: mode=text exclude={systempages} type=-{zones}

It has three main parts: the rule name (in this case 'site'), the mode (in this case 'text'), and finally a number of parameters to exclude certain pages. Note that variables can be used in index rules. If I wanted to create a forum rule, it might look like this:

forum: mode=text group=forum.* type=number

This would only index pages that begin with forum and end with a number. Parameters that can be used to control which pages are to be indexed are basically the same as those used by the search function:

pages -- give a csv list of specific pages
group -- specify pages or groups of pages
dir -- limit pages to a given directory
pattern -- only groups that match a certain pattern expression
include -- add certain pages or groups of pages to the list
exclude -- remove certain pages or groups of pages from list
type -- allowed or disallowed page types
if -- test pages by some conditional or another
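To make the forum rule above concrete, here is a rough PHP sketch of the selection it describes: pages that begin with forum and end with a number. The function name exampleMatchesForumRule is made up for illustration, and this is not how BoltWire itself evaluates group= and type= parameters. It only mimics the outcome for this one rule.

```php
<?php
// Hypothetical helper: mimics what "group=forum.* type=number" selects.
// Not BoltWire's own matching code.
function exampleMatchesForumRule($page) {
    $parts = explode('.', $page);
    $last = end($parts);
    // Must have at least two segments, start with "forum",
    // and have a purely numeric final segment.
    return count($parts) > 1
        && $parts[0] === 'forum'
        && preg_match('/^\d+$/', $last) === 1;
}

foreach (array('forum.general.12', 'forum.welcome', 'blog.3') as $page) {
    echo $page . ': ' . (exampleMatchesForumRule($page) ? 'indexed' : 'skipped') . "\n";
}
```

Under these assumptions only forum.general.12 would be picked up by the rule.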

Indexing Modes

There are four different indexing modes: text, links, data, and tags.

text -- This strips out all the markup and punctuation in the page and leaves just the words. Duplicate words are omitted. This allows for text searching.

links -- This scans the page for any links and creates an index of any pages the page you are indexing is linked to. This allows for link searching.

data -- This requires you to set a legend specifying the data vars to index, and the order. Creates an info var from the data vars in pages you specify.

tags -- This extracts any tags in a page and stores them as an info var in the index. Tags can be put in the text of the page, or in data variables.

To see what kinds of indexes these modes create, try creating one, and then look at the index page: index.rule.
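As a rough idea of what a text-style index value looks like, the sketch below strips markup and punctuation and drops duplicate words. It is a simplified stand-in, not BoltWire's own text mode: the function name exampleTextIndex and the two markup shapes it removes (square-bracket markup and angle-bracket tags) are assumptions made for the example.

```php
<?php
// Hypothetical illustration of a text-style index; not BoltWire's code.
function exampleTextIndex($text) {
    $text = strtolower($text);
    // Assumed markup shapes: [ ... ] and < ... > are dropped whole.
    $text = preg_replace('/\[[^\]]*\]|<[^>]*>/', ' ', $text);
    // Everything that is not a letter or digit counts as punctuation.
    $text = preg_replace('/[^a-z0-9]+/', ' ', $text);
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    // Keep the first occurrence of each word only.
    return implode(' ', array_unique($words));
}

echo exampleTextIndex("The cat, the <b>hat</b>, and the [(search group=x)] cat!"), "\n";
// prints "the cat hat and"
```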

You can also create your own indexing modes for specialized types of indexes. Simply create a PHP function like the following:

function BOLTindexDoMode($page, $args) {
     //process page as desired
     //return index value for the page
     }

So if I wanted to create a special index that looked for fruit, I could call it BOLTindexDoFruit and then set the mode=fruit. The function would scan the contents of the specified pages, defined by any parameters in the indexing rule, and then return a simple list of any fruit my script recognized on that page. Look at the existing index mode functions in library.php for examples.
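Here is a minimal sketch of what such a fruit mode might look like. Only the BOLTindexDoFruit name and the ($page, $args) signature come from the description above. The exampleLoadPageText helper is a hypothetical stand-in for however your installation reads a page's text, the fruit list is illustrative, and the comma-separated return format is an assumption. Check the existing mode functions in library.php for the real conventions.

```php
<?php
// Hypothetical stand-in for reading a page's text. Replace it with the
// page-loading call your BoltWire version actually provides.
function exampleLoadPageText($page) {
    $pages = array(
        'recipes.salad' => 'Mix one apple, a ripe mango, and an Apple slice.',
        'recipes.bread' => 'Flour, water, yeast.',
    );
    return isset($pages[$page]) ? $pages[$page] : '';
}

// Custom handler for mode=fruit, following the BOLTindexDo<Mode> naming.
function BOLTindexDoFruit($page, $args) {
    $fruits = array('apple', 'banana', 'mango', 'pear'); // illustrative list
    $text = strtolower(exampleLoadPageText($page));
    $found = array();
    foreach ($fruits as $fruit) {
        // Whole-word match, so "pear" does not match inside "appear".
        if (preg_match('/\b' . preg_quote($fruit, '/') . '\b/', $text)) {
            $found[] = $fruit;
        }
    }
    // Assumed return format: a comma-separated list of the fruit found.
    return implode(',', $found);
}

echo BOLTindexDoFruit('recipes.salad', array()), "\n";
// prints "apple,mango"
```

A rule along the lines of the forum example, but with mode=fruit, would then call this function for each page the rule selects.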

Note: If you wish to specify a different location for the index page (rather than index.rule), add an index parameter to the index rule. IE:

Auto Indexing

The index function is not normally needed because BoltWire has built-in auto indexing. When indexing is turned on in site.config (indexing: true), any page that is modified gets automatically indexed according to any rules you have defined on site.index.

To reindex a site, or create an initial index of your site for a new rule, use the index action. There, you can select any of the index rules on site.index and generate a full list of pages to be indexed. Next refresh the page as many times as is necessary to complete the new index. Repeat the process to create another index.

Indexing by Cron

On a very large site with index files of several megabytes or more, even indexing a single page via BoltWire's autoindexing can impact performance. In this case we recommend setting "indexing: cron" in site.config. This is a special mode that suspends all automatic indexing and instead records all page edits on site.index.cron for later indexing.

Note: this will not catch pages uploaded via FTP or edited outside of BoltWire. To catch them, you will need to run the index action.

To get the listed pages indexed, set up a cron job on your server that calls a specific page (like cron.index) every few minutes. That page should contain the index function, e.g. [(index cron=5)]. Each time the cron job calls the page, the function works through the changed pages on site.index.cron, 5 pages at a time. It checks each page against each rule and does any needed indexing for you. It may take a few hours to get everything done, but you won't have to keep refreshing the action page. And once caught up, it should keep your site current all by itself.
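The batching behaviour can be pictured with a small PHP sketch. It is only an illustration of working through a queue a few pages per run. The exampleTakeBatch function and the in-memory queue are made up for the example, whereas BoltWire keeps its list on site.index.cron and does the work inside the index function.

```php
<?php
// Hypothetical helper: split a queue of page names into the batch to
// index now and the pages left for later runs.
function exampleTakeBatch(array $queue, $batchSize) {
    return array(
        array_slice($queue, 0, $batchSize),
        array_slice($queue, $batchSize),
    );
}

// Pretend twelve pages were edited since the last cron run.
$queue = array();
for ($i = 1; $i <= 12; $i++) {
    $queue[] = "forum.$i";
}

// Each loop pass stands for one cron call with a batch size of 5.
$run = 0;
while (count($queue) > 0) {
    list($now, $queue) = exampleTakeBatch($queue, 5);
    $run++;
    echo "run $run indexes " . count($now) . " pages\n";
}
```

With twelve queued pages and a batch size of 5, the queue empties on the third call, which is why a large backlog can take a number of cron runs to clear.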

The index function also allows you to index specific pages: [(index,anotherpage)]. And you can do [(index rule=myrule)]. This will process any pages on site.index.myrule. That list will need to be generated by the index action, of course, as it will not be populated with page names otherwise.

As of version 5.08, it is no longer required to use the cron parameter; you can just do [(index 5)]. If no number is specified, the indexbatch value in site.config (default 25) is used.

Contact your web hosting provider for information about how to set up a cron job. It only needs to call the page that does the cron indexing.

The Index Action

The index action is an interesting study. Basically, it generates a list of all your existing rules, and invites you to select one. When you submit the form, BoltWire retrieves the rule definition from site.index, does a search for page names matching those parameters, and then saves that list to site.index.rule.

On the next screen you are shown the pages on that index list, and offered two options: continue indexing, or create another index list. If you click continue, it begins indexing the pages in that list, a few at a time. (The number is set by indexbatch in site.config.) If you are using cron to index your site, you only have to generate the list. Eventually your entire site will reindex itself.