Frontier Tutorials / Indexing a Website / Build the Alphabetical Index
An alphabetical index and a topical index are two very different beasts. You might as well compare a unicorn to the 'orrible black beast of aaaarrrrrrgggghghghggghhhhhhhhh![1]
To create an alphabetical index, we'll use the same Indexer Suite script we used before: indexer.BuildPageIndex. But instead of indexing by keyword, we'll use the page titles. And to avoid polluting the topic index, we'll create a new index: titleAlpha.
The key function call we need to make to build this new index is
indexer.BuildPageIndex( @websites.mysite, @websites.mysite.["#indices"].["titleAlpha"], true, "title", "title" )
(Replace "websites.mysite" with your own site table, of course.)
Type this command into the Quick Script window and execute it, then examine the ["#indices"].titleAlpha subtable. It should contain an entry for each title.
But it's a pain to have to type in something like this whenever you want to rebuild the index, so we'll put it in a script, and call it BuildTitleAlphaIndex:
on BuildTitleAlphaIndex_TUT1( sourceAdr=@tutorials.indexsite, destTbl=@tutorials.indexsite.["#indices"].["titleAlpha"], inReplaceIndices=true )
Another trivial script. In this case, though, I'm looking ahead; as we'll see shortly, this is not going to be quite adequate.
Add this to the menu item script for the Update Indices command on your website menu. The menu item script should now look like this:
websites.mysite.["#tools"].BuildTopicsIndex()
websites.mysite.["#tools"].BuildTitleAlphaIndex()
Select the Update Indices command from your website menu.
Open the ["#indices"].titleAlpha subtable. Look at the names of the entries. If any of them start with "a", "the", "an", or other articles, they probably aren't sorted correctly. When the title of a paper or book begins with one of these words, the standard way of sorting is to ignore the meaningless leading word and sort alphabetically on the remainder. Other, related rules sort names beginning with "St." as "Saint", or "McAnything" as "MacAnything". And with the indexer.BuildPageIndex script, there's not much we can do about it.
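To make the sorting problem concrete, here is a minimal sketch of the rules just described. It is written in Python purely as an illustration, not as Frontier code; the function name `sort_key`, the article list, and the sample titles are all assumptions made for this example.

```python
import re

# Hypothetical rule set for illustration; a real index may need more entries.
LEADING_ARTICLES = ("a", "an", "the")

def sort_key(title):
    """Return a lowercase key that sorts a title the way a printed index would."""
    words = title.split()
    # Ignore one leading article, but never reduce a title to nothing.
    if len(words) > 1 and words[0].lower() in LEADING_ARTICLES:
        words = words[1:]
    key = " ".join(words)
    # Sort "St." as "Saint" and "McAnything" as "MacAnything".
    key = re.sub(r"^St\.\s*", "Saint ", key)
    key = re.sub(r"^Mc(?=[A-Z])", "Mac", key)
    return key.lower()

titles = [
    "The Importance of Being Earnest",
    "A Tale of Two Cities",
    "McTeague",
    "St. Joan",
    "Macbeth",
]

naive = sorted(titles)                 # plain string comparison
proper = sorted(titles, key=sort_key)  # index-style comparison
```

With plain string comparison, "A Tale of Two Cities" sorts first and "The Importance of Being Earnest" sorts last. With the key function, the titles fall under I, M, M, S, and T respectively.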
What we need to do is intercept the indexer and massage the article titles and/or keywords so they are entered the way we want them to be.
And guess what? The Indexer Suite lets us do that, with the indexer.BuildPageIndexGeneric script. It's a little more difficult to use than indexer.BuildPageIndex, but much more powerful.
indexer.BuildPageIndexGeneric constructs a keyword index of all pages in a specified website table or subtable. To determine whether to add a page to the index, and what keywords should be used, it calls a pair of callback functions.
Only entries in the source table (and its subtables) that the website framework will render into HTML pages will be indexed. All other entries are ignored, and will not appear in the index/indices.
Any page entry for which the test callback returns TRUE will be included in the index. Any entry for which the test callback does not return TRUE is ignored (i.e., it will not appear in the generated index).
BuildPageIndexGeneric( inSourceAdr, inDestTbl, keywordSpec, testCB, infoCB, inReplaceIndices=true, doExpandNestedKeywords=false )
- inSourceAdr
- The location from which pages should be indexed. To index your entire site, this would be the address of the website table (e.g., @websites.mysite). To index only a portion of a site, pass the address of the subsite table. To index just a single page (as from the #filters.finalFilter script), pass the address of the individual page.
- inDestTbl
- The address of the table in which the index information should be stored (e.g., @websites.mysite.["#indices"].["topic"]).
- keywordSpec
- An identifier for the keywords to be used by the callback functions. Because this parameter is interpreted by user-provided callback functions, the type and range of values of this parameter may vary widely depending on application.
- testCB
- The address of a callback function that BuildPageIndexGeneric will call to determine whether to index a particular table or to continue scanning.
  The test callback is expected to take two parameters:
  - entryAdr
  - The address of the entry (table or page) currently being examined by BuildPageIndexGeneric.
  - keywordSpec
  - The keywordSpec parameter that was passed to BuildPageIndexGeneric.
  The test callback must return either TRUE or FALSE. If the test callback returns TRUE, the info callback will be called and the table will be added to the index. If the test callback returns FALSE, the info callback will not be called and the table will not be added to the index.
- infoCB
- The address of a callback function that BuildPageIndexGeneric will call to get the indexing values for a given table. The info callback function will only be called if the test callback function returned TRUE.
  The info callback is expected to take two parameters:
  - tableAdr
  - The address of the entry (table or page) currently being examined by BuildPageIndexGeneric.
  - keywordSpec
  - The keywordSpec parameter that was passed to BuildPageIndexGeneric.
  The info callback is expected to return a table. The returned info table may contain the following elements:
  - keywords
  - [REQUIRED] A keyword specification, as described in suites.indexer.doc.Keywords.
  - entryAdr
  - [REQUIRED] The entry address to enter in the index. This may be the tableAdr that was passed in, or it may be any other address in the ODB.
  - entryName
  - [OPTIONAL] The name to use for the entry in the index. This forces the sort order of the index. Note that the index does not support multiple entries with the same name; if you return the same name for multiple entries, only the latest one will appear in the index.
  - errorMessage
  - [OPTIONAL] An error message generated by the info callback script. If errorMessage is set, all other fields in the returned info table will be ignored.
- inReplaceIndices
- Specifies whether to replace the index at inDestTbl^ or to simply add to it.
- doExpandNestedKeywords
- Specifies whether nested keywords should also be entered individually. If true, expanded keywords will be entered in the index as specified, and will also be split into individual keywords, and those keywords will be entered in the index.
For example, if doExpandNestedKeywords is true, "frontier:community" is equivalent to "frontier:community, frontier, community".
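The expansion in that example can be sketched as follows. This is an illustrative Python model only: it assumes keywords arrive as a comma-separated string with ":" marking nesting, and the function name `expand_nested_keywords` is hypothetical. How the real indexer treats deeper nesting or duplicates is not covered by the text above, so those choices are guesses.

```python
def expand_nested_keywords(keyword_spec):
    """Expand "a:b" style keywords into "a:b, a, b", keeping first-seen order."""
    result = []
    for keyword in keyword_spec.split(","):
        keyword = keyword.strip()
        if not keyword:
            continue
        # Keep the nested keyword as specified, then add each individual part.
        for candidate in [keyword] + keyword.split(":"):
            candidate = candidate.strip()
            # Assumption: duplicates are entered only once.
            if candidate and candidate not in result:
                result.append(candidate)
    return ", ".join(result)

expanded = expand_nested_keywords("frontier:community")
```

A keyword without a colon passes through unchanged, so turning the option on only affects nested keywords.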
We need to change the BuildTitleAlphaIndex script to call BuildPageIndexGeneric instead of BuildPageIndex, and create the necessary callback functions. The necessary BuildPageIndexGeneric call is:
indexer.BuildPageIndexGeneric( sourceAdr, destTbl, "title", @PageTestCB, @PageInfoCB, inReplaceIndices )
Replace the BuildPageIndex function call with this new call.
The Test Callback function, PageTestCB, must return TRUE if and only if the page whose address is passed to it should be included in the index. In this case, we simply want to include anything that is a renderable page. So insert this function in BuildTitleAlphaIndex just before the call to BuildPageIndexGeneric:
on PageTestCB( entryAdr, keywordSpec )
The callback first calls html.traversalSkip to determine whether the entry should even be considered as a page.
If the entry is a table, the callback tries to locate a renderTableWith directive, which would indicate that the table gets rendered as a page rather than a directory. If the directive is not found, the table is not a page.
Finally, anything left is assumed to be a page, and should be added to the index.
The info callback must return a table, as described above. It must fill in the following values:
entryAdr--This one's easy: it's the address of the page entry in the website table--the entry address that is passed into the function.
entryName--This forces the sort order in the index table, so we should omit the leading articles (the, an, a, etc.) here--or better yet, move them to the end, after a comma, as is normally done with articles and books. We do have to set this: if we don't, the sort order will be by ODB address, which is not particularly useful on a web page.
keywords--Another easy one: it's the first character of the title (as modified for entryName).
For example, if the page title is "The Importance of Being Earnest", we would store the following values in our returned info table:
entryName = "Importance of Being Earnest, The"
keywords = "I"
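The transformation in this example can be sketched in a few lines. The Python below is only an illustration of the string handling, not the UserTalk callback itself; the function name `title_index_info` and the article list are assumptions, and the returned dictionary stands in for the info table (entryAdr is omitted because it is simply the address passed in).

```python
# Hypothetical list of leading articles to move to the end of the title.
LEADING_ARTICLES = ("the", "an", "a")

def title_index_info(title):
    """Build the entryName and keywords values for a page title."""
    words = title.split()
    if len(words) > 1 and words[0].lower() in LEADING_ARTICLES:
        # Move the leading article to the end, after a comma.
        entry_name = " ".join(words[1:]) + ", " + words[0]
    else:
        entry_name = " ".join(words)
    # The keyword is the first character of the modified title.
    keyword = entry_name[:1].upper()
    return {"entryName": entry_name, "keywords": keyword}

info = title_index_info("The Importance of Being Earnest")
```

A title that consists of a single word, even if that word is an article, is left alone so that the entry name never ends up empty.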
When we put all these pieces together, we get the following PageInfoCB function:
on PageInfoCB( entryAdr, keywordSpec )
Insert this function in BuildTitleAlphaIndex just before the call to BuildPageIndexGeneric.
I'll save you the trouble of typing this all in. The final version of BuildTitleAlphaIndex is stored in this page in fatpage format. Save this page to disk, and open it with Frontier's File->Open... command to import it.
[1] This unpronounceable creature was a source of terror to the crusaders of Monty Python and the Holy Grail.